Wednesday, July 20, 2005

AMBER 8 PMEMD Benchmarks on Xserve G5 Cluster

(2005-07-01 11:41:03 Update: The original benchmark did not use any code modifications for the G5. The new benchmark is faster.) (Update: files posted.)
The contributions made by Apple Computer are really nice! Thank you guys!
Recently I got the chance to access Xserve G5 clusters at UC Irvine and at Apple Computer. The Research Computing group at Apple Computer was kind enough to share their progress on optimizing PMEMD with me. The modifications look very good and also maintain good double-precision accuracy. Here is a tarball of the files that were modified during this test: pmemd_modification.tgz. The MD5 checksum and file list are below:
MD5 (pmemd_modification.tgz) = 4600a2f66e40995bb087e6eb21b8a360
pmemd/configure
pmemd/src/ew_direct_cit.f90
pmemd/src/short_ene_G5_opt.f90
pmemd/config_data/macosx_g4.xlf90.nopar
pmemd/config_data/macosx_g5.xlf90.lam
pmemd/config_data/macosx_g5.xlf90.mpich_gm
pmemd/config_data/macosx_g5.xlf90.nopar
The modified configure script can be invoked in the following ways:
./configure macosx_g4 xlf90 nopar
./configure macosx_g5 xlf90 nopar
./configure macosx_g5 xlf90 lam
./configure macosx_g5 xlf90 mpich_gm
*Important* Please make sure to back up the old source files before untarring this tarball.
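
A minimal Python sketch of that precaution: check the tarball against the MD5 above, copy any file that would be overwritten, then extract. The function names (md5_of, backup_then_extract, verify_and_extract) and the ".orig" backup suffix are made up for illustration; they are not part of AMBER or of the tarball.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path

# Checksum quoted in the post for pmemd_modification.tgz.
EXPECTED_MD5 = "4600a2f66e40995bb087e6eb21b8a360"


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_then_extract(tarball, dest=".", backup_suffix=".orig"):
    """Copy every existing file the tarball would overwrite, then extract."""
    dest = Path(dest)
    with tarfile.open(tarball) as tar:
        for member in tar.getmembers():
            target = dest / member.name
            if member.isfile() and target.exists():
                # Hypothetical naming scheme: configure -> configure.orig
                shutil.copy2(target, str(target) + backup_suffix)
        tar.extractall(dest)


def verify_and_extract(tarball, expected_md5, dest="."):
    """Refuse to extract if the checksum does not match."""
    actual = md5_of(tarball)
    if actual != expected_md5:
        raise ValueError(f"MD5 mismatch: got {actual}, expected {expected_md5}")
    backup_then_extract(tarball, dest)
```

Calling verify_and_extract("pmemd_modification.tgz", EXPECTED_MD5) from the directory that contains pmemd/ should then leave a ".orig" copy next to each replaced file.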

As for the AltiVec patch I proposed last year, I don't think it really matters because of the precision issue: AltiVec is a floating-point monster, but it only handles single precision, so it's not a double-precision animal.

This benchmark follows the procedure used on the AMBER 8 benchmark page. The default JAC benchmark simulates 1 picosecond, so the performance measure ppd (ps per day) is 86400 (the number of seconds in a day) divided by the wall-clock time, in seconds, consumed by the simulation.
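
The ppd figure is just a unit conversion, sketched here in Python. The function names and the simulated_ps parameter are illustrative assumptions; the default of 1 ps is taken from the JAC run described above.

```python
SECONDS_PER_DAY = 86400


def ps_per_day(wall_seconds, simulated_ps=1.0):
    """Picoseconds of simulation per day of wall-clock time."""
    return simulated_ps * SECONDS_PER_DAY / wall_seconds


def wall_seconds_for(ppd, simulated_ps=1.0):
    """Inverse conversion: wall-clock seconds implied by a ps-per-day figure."""
    return simulated_ps * SECONDS_PER_DAY / ppd
```

So 180 ps per day corresponds to 480 seconds of wall-clock time for the 1 ps run, and 2057 ps per day to about 42 seconds.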

JAC Benchmark on Xserve G5 Cluster
This plot shows that the code optimization for the G5 does improve performance. It also tells us that MPICH-GM over Myrinet scales noticeably better than LAM/MPI over Gigabit Ethernet: 2057 vs. 1516 ps per day on 16 CPUs.

In this table, NACS_UCI is the demo Xserve Dual G5 cluster hosted by Research Computing Support, NACS, UC Irvine, while RC_AAPL is the demo Xserve Dual G5 cluster kindly provided by the Research Computing group at Apple Computer.

  =============================================================================
  "jac" == Joint Amber/Charmm DHFR benchmark.  This is the protein DHFR,
  solvated with TIP3 water, in a periodic box.  There are 23,558 total atoms,
  and PME used with a direct space cutoff of 9 Ang.  This is the benchmark
  in benchmarks/jac subdirectory of the Amber 8 distribution.  Results
  here are for pmemd.

  --------------------------------------------------------------------------------
  name   date      CPU              OS         compiler  ncpu    ps per day
  --------------------------------------------------------------------------------
NACS_UCI  6/05 2.0 GHz G5        MacOS X 10.3   xlf8.1    2        336
               GbE, LAM/MPI
 RC_AAPL  7/05 2.0 GHz G5        MacOS X 10.3   xlf8.1    1        180
               GbE, LAM/MPI                               2        324
                                                          4        561
                                                          6        758
                                                          8       1005
                                                         10       1168
                                                         12       1290
                                                         14       1440
                                                         16       1516
 RC_AAPL  7/05 2.0 GHz G5        MacOS X 10.3   xlf8.1    1        180
               Myrinet, MPICH-GM                          2        325
                                                          4        596
                                                          6        873
                                                          8       1094
                                                         10       1371
                                                         12       1600
                                                         14       1878
                                                         16       2057
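
To put a number on how well the two interconnects scale, here is a rough Python sketch that turns the RC_AAPL rows above into speedup and parallel efficiency relative to the 1-CPU run. The figures are copied from the table; the function names and the choice of the 1-CPU result as the baseline are my own assumptions for illustration.

```python
# CPU counts and ps-per-day values copied from the RC_AAPL rows of the table.
CPUS = [1, 2, 4, 6, 8, 10, 12, 14, 16]
PPD = {
    "GbE, LAM/MPI": [180, 324, 561, 758, 1005, 1168, 1290, 1440, 1516],
    "Myrinet, MPICH-GM": [180, 325, 596, 873, 1094, 1371, 1600, 1878, 2057],
}


def speedup(ppd, ppd_single):
    """Throughput relative to the single-CPU run."""
    return ppd / ppd_single


def efficiency(ppd, ppd_single, ncpu):
    """Speedup divided by CPU count (1.0 would be perfect scaling)."""
    return speedup(ppd, ppd_single) / ncpu


def scaling_table(cpus, ppd_values):
    """Rows of (ncpu, ppd, speedup, efficiency); assumes the first entry is the 1-CPU run."""
    base = ppd_values[0]
    return [(n, p, speedup(p, base), efficiency(p, base, n))
            for n, p in zip(cpus, ppd_values)]


for name, values in PPD.items():
    print(name)
    for n, p, s, e in scaling_table(CPUS, values):
        print(f"  {n:2d} CPUs  {p:5d} ps/day  speedup {s:5.2f}  efficiency {e:6.1%}")
```

On 16 CPUs this works out to roughly 8.4x (about 53% efficiency) for LAM/MPI over GbE and roughly 11.4x (about 71% efficiency) for MPICH-GM over Myrinet.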