AMBER 8 PMEMD Benchmarks on Xserve G5 Cluster
The contributions made by Apple Computer are really nice! Thank you, guys!
Recently I had the chance to access Xserve G5 clusters at UC Irvine and at Apple Computer. The Research Computing group at Apple Computer was kind enough to share their progress on optimizing PMEMD with me. The modification seems to work very well while maintaining good double-precision accuracy. Here is a tarball of the files that were modified during this test: pmemd_modification.tgz. The MD5 checksum and the file list are:

MD5 (pmemd_modification.tgz) = 4600a2f66e40995bb087e6eb21b8a360

pmemd/configure
pmemd/src/ew_direct_cit.f90
pmemd/src/short_ene_G5_opt.f90
pmemd/config_data/macosx_g4.xlf90.nopar
pmemd/config_data/macosx_g5.xlf90.lam
pmemd/config_data/macosx_g5.xlf90.mpich_gm
pmemd/config_data/macosx_g5.xlf90.nopar

The modified configure script can be invoked in the following ways:
./configure macosx_g4 xlf90 nopar
./configure macosx_g5 xlf90 nopar
./configure macosx_g5 xlf90 lam
./configure macosx_g5 xlf90 mpich_gm

*Important* Please make sure to back up the old source files before untarring this tarball.
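The checksum and the backup advice above can be combined into one small helper. This is only a sketch in Python, not part of the AMBER or PMEMD distribution: the function names `md5_of_file` and `backup_then_extract` and the `.orig` backup suffix are my own assumptions, and it presumes the tarball is unpacked from the directory that contains `pmemd/`.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path

# Checksum quoted above for pmemd_modification.tgz.
EXPECTED_MD5 = "4600a2f66e40995bb087e6eb21b8a360"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_then_extract(tarball, dest=".", backup_suffix=".orig"):
    """Copy every file the tarball would overwrite to <name><suffix>, then extract.

    Hypothetical helper: returns the list of backup paths it created.
    """
    dest = Path(dest)
    backed_up = []
    with tarfile.open(tarball) as archive:
        for member in archive.getmembers():
            target = dest / member.name
            if member.isfile() and target.is_file():
                backup = target.with_name(target.name + backup_suffix)
                shutil.copy2(target, backup)
                backed_up.append(str(backup))
        archive.extractall(dest)
    return backed_up
```

A cautious workflow would first compare `md5_of_file("pmemd_modification.tgz")` with `EXPECTED_MD5` and call `backup_then_extract` only if the two match.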
As for the AltiVec patch I proposed last year, I don't think it really matters because of the precision issue: AltiVec is a floating-point monster, but it is not a double-precision animal at all.
This benchmark follows the procedure used on the AMBER 8 benchmark page. The default JAC benchmark simulates 1 picosecond, so the performance measure, ps per day (ppd), is 86400 (the number of seconds in a day) divided by the time in seconds consumed by the simulation.
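The arithmetic behind the ps-per-day figure is simple enough to write down. The sketch below is a minimal illustration; the function name `ps_per_day` is my own, and the 480-second run time in the usage note is a made-up input, not a measured value.

```python
SECONDS_PER_DAY = 86400


def ps_per_day(elapsed_seconds, simulated_ps=1.0):
    """Convert the time taken by a run into picoseconds simulated per day.

    simulated_ps defaults to 1.0 because the default JAC benchmark
    covers 1 ps of simulation time.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return simulated_ps * SECONDS_PER_DAY / elapsed_seconds
```

For example, a run that hypothetically took 480 seconds would score `ps_per_day(480)`, which is 180 ps per day.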

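This plot clearly shows that the code optimization for the G5 does improve the performance. It also tells us that MPICH-GM scales pretty well: at 16 CPUs it reaches 2057 ps per day over Myrinet, compared with 1516 ps per day for LAM/MPI over GbE.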
In the table below, NACS_UCI is the demo Xserve Dual G5 cluster hosted by Research Computing Support, NACS, UC Irvine, while RC_AAPL is the demo Xserve Dual G5 cluster kindly provided by the Research Computing group at Apple Computer.
=============================================================================
"jac" == Joint Amber/Charmm DHFR benchmark. This is the protein DHFR,
solvated with TIP3 water, in a periodic box. There are 23,558 total atoms,
and PME is used with a direct space cutoff of 9 Ang. This is the benchmark
in the benchmarks/jac subdirectory of the Amber 8 distribution. Results
here are for pmemd.
--------------------------------------------------------------------------------
name      date  CPU                OS            compiler  ncpu  ps per day
--------------------------------------------------------------------------------
NACS_UCI  6/05  2.0 GHz G5         MacOS X 10.3  xlf8.1       2         336
                GbE, LAM/MPI
RC_AAPL   7/05  2.0 GHz G5         MacOS X 10.3  xlf8.1       1         180
                GbE, LAM/MPI                                  2         324
                                                              4         561
                                                              6         758
                                                              8        1005
                                                             10        1168
                                                             12        1290
                                                             14        1440
                                                             16        1516
RC_AAPL   7/05  2.0 GHz G5         MacOS X 10.3  xlf8.1       1         180
                Myrinet, MPICH-GM                             2         325
                                                              4         596
                                                              6         873
                                                              8        1094
                                                             10        1371
                                                             12        1600
                                                             14        1878
                                                             16        2057
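
To put a number on how well each interconnect scales, the RC_AAPL results above can be turned into speedup and parallel efficiency relative to the single-CPU run. This is a rough sketch: the dictionary names and the `scaling` function are my own, and the data are copied from the table.

```python
# ps per day on RC_AAPL, keyed by CPU count (copied from the table above).
LAM_GBE = {1: 180, 2: 324, 4: 561, 6: 758, 8: 1005,
           10: 1168, 12: 1290, 14: 1440, 16: 1516}
MPICH_GM = {1: 180, 2: 325, 4: 596, 6: 873, 8: 1094,
            10: 1371, 12: 1600, 14: 1878, 16: 2057}


def scaling(results):
    """Map CPU count -> (speedup, efficiency) relative to the 1-CPU entry.

    speedup    = ps/day on n CPUs divided by ps/day on 1 CPU
    efficiency = speedup divided by n
    """
    base = results[1]
    return {n: (rate / base, rate / (base * n)) for n, rate in results.items()}


for label, data in (("GbE, LAM/MPI", LAM_GBE), ("Myrinet, MPICH-GM", MPICH_GM)):
    for ncpu, (speedup, efficiency) in sorted(scaling(data).items()):
        print(f"{label:18s} {ncpu:3d} CPUs  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

With these numbers, the parallel efficiency at 16 CPUs is about 0.71 for MPICH-GM over Myrinet and about 0.53 for LAM/MPI over GbE.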