Preliminary Tests with multiple Database clients running over ATM

The system setup is shown below. The HP-UX system is a C200 running Objectivity v5.0. The data disk is a narrow SCSI device, rated at ~8 MBytes/sec. The ATM link runs over dedicated fibre (no other users) at 155 Mbits/sec.

[Figure: wpe1.jpg]

Test 1: FTP of a large number of large files from the Exemplar to the C200 disk.

[Figure: ftp1.gif]

Remarks:

  • The "Load Factor" is the value reported by the "uptime" command on the C200.
  • The disk rate is obtained by running the "du -ks" command on the C200 and dividing the change in reported usage by the elapsed time.

    The initial rate of ~12 MBytes/sec seen on both the ATM and the disk was confirmed by the ftp utility, which reported 11.5 MBytes/sec for the second file transferred.

    It is not understood why this exceeds the nominal maximum rate of the SCSI disk (~8 MBytes/sec).

    The oscillation in rates at the start of the transfers is also not understood.

  • The ATM rates are obtained by using the "atmmgr 0 show -c" command on the Exemplar to read the cell counts in and out of the adapter, multiplying the change in each count by a cell size of 47 bytes, and dividing by the elapsed time.
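The two rate calculations described in the remarks above can be sketched as follows. This is only an illustration: the function names are hypothetical, the counter values are assumed to have been read by hand from "du -ks" and "atmmgr 0 show -c" at the start and end of an interval, and "MBytes" is assumed to mean 10^6 bytes (the page does not say which convention was used).

```python
KBYTE = 1024        # "du -k" reports usage in 1024-byte units
MBYTE = 1_000_000   # assumption: 1 MByte = 10**6 bytes


def disk_rate(kb_start, kb_end, elapsed_sec):
    """Disk write rate in MBytes/sec from two "du -ks" readings (in KBytes)."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed time must be positive")
    return (kb_end - kb_start) * KBYTE / elapsed_sec / MBYTE


def atm_rate(cells_start, cells_end, elapsed_sec, bytes_per_cell=47):
    """ATM data rate in MBytes/sec from two cell-counter readings."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed time must be positive")
    return (cells_end - cells_start) * bytes_per_cell / elapsed_sec / MBYTE


if __name__ == "__main__":
    # Made-up readings, for illustration only (not measurements from these tests).
    print(disk_rate(0, 120_000, 10))
    print(atm_rate(0, 1_000_000, 4))
```

Both functions work on differences between two readings, so any constant offset in the counters cancels out.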

Comment by Hugh Matlock: "It seems to me that the anomaly could be explained if the measurement system was not reporting results quickly enough. The disk IO and ATM counts could be accurate (as verified with FTP) but not available immediately to the "du -ks" or "atmmgr" utilities. Imagine, for example, that a low priority O.S. software process was used to copy and empty interrupt-level event counters. If this process was scheduled on a 1 or 2 second interval, then the counts available to the utilities would show pronounced spikes. Hope this helps..."
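The effect suggested in this comment can be illustrated with a toy simulation. It is a sketch of the hypothesis only, not a model of the real HP-UX or ATM driver behaviour: the function name and all parameter values are hypothetical. A source transfers data at a perfectly steady rate, but the counter visible to the measuring utility is only refreshed every publish_interval seconds, while the utility samples it every sample_interval seconds.

```python
def sampled_rates(true_rate, publish_interval, sample_interval, n_samples):
    """Rates a polling tool would compute when the counter it reads is
    only refreshed every publish_interval seconds (integer seconds)."""
    rates = []
    last_seen = 0
    for k in range(1, n_samples + 1):
        t = k * sample_interval
        # The visible counter holds the true total as of the last refresh.
        last_refresh = (t // publish_interval) * publish_interval
        visible = true_rate * last_refresh
        rates.append((visible - last_seen) / sample_interval)
        last_seen = visible
    return rates


if __name__ == "__main__":
    # Steady 12 units/sec, counters refreshed every 2 s, sampled every 1 s.
    print(sampled_rates(12, 2, 1, 6))   # [0.0, 24.0, 0.0, 24.0, 0.0, 24.0]
```

In this toy case the sampled rate alternates between zero and twice the true rate, while the average over the whole run is still the true rate. That is the kind of spiky pattern the comment describes.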

 

The value of 47 bytes for the cell size was confirmed for this payload by measuring the cell count for the ftp of a file of known size in bytes.
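The calibration described here amounts to dividing the known file size by the number of cells counted during its transfer. A minimal sketch, with a hypothetical function name and made-up numbers chosen only to show the arithmetic (they are not the measurements behind the value of 47):

```python
def bytes_per_cell(file_size_bytes, cells_start, cells_end):
    """Effective data bytes carried per ATM cell for one file transfer."""
    cells = cells_end - cells_start
    if cells <= 0:
        raise ValueError("cell counter did not advance")
    return file_size_bytes / cells


if __name__ == "__main__":
    # Illustrative values only: a 47,000,000-byte file and 1,000,000 cells.
    print(bytes_per_cell(47_000_000, 1_000, 1_001_000))
```

Any other traffic on the adapter during the transfer would inflate the cell count, so a check like this is only meaningful on an otherwise idle link such as the dedicated fibre used here.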

Test 2: CMS Event Reconstruction in one processor.

[Figure: Reconstruction11.gif]

Remarks:

  • The traffic destined for the database is exclusively object data, and consists of "raw" event data (like hit maps), "reconstructed" objects (like tracks) and analysis objects (like jets), and also small "tag" objects.
  • Note that the Reconstruction program processes 12 separate events: hence the 12 yellow spikes in input ATM traffic in the graph.
  • Note that the write rate to the disk is always lower than the input rate from the ATM.
  • There is significant traffic towards the Reconstruction client, particularly when the client starts up.

    This is not understood.

  • There is a lot of traffic in both directions on the ATM, but only a small fraction of it ends up as objects in the database.

    It is not understood what the excess traffic consists of.

  • The Load Factor on the C200 seems to fluctuate oddly (compare with the very smooth behaviour during the ftp transfer).

    The cause of this is not understood. Perhaps either the AMS or LockServer ... requires further tests.

Test 3: CMS Event Reconstruction in 16 processors.

[Figure: Reconstruction161.gif]

Remarks:

  • Again, there are very high traffic rates towards the clients when they start up.
  • Note that the Load Factor is now ~1.5 on the machine (cf. ~1.0 for a single client).

 

Other graphs are available for 2 clients, 4 clients and 8 clients.

If you have comments on, or insights into, the above, please send me email.