re: Perf. of Alpha vs. R10k



 Greetings,
 I don't follow this forum, but a colleague forwarded Dr. van der Spoel's
 notes.  Was the Fortran code that was posted (routine FORLJC, with a test
 harness) the same code that produced a timing of 31 seconds on a 400 MHz
 Alpha?
 On my system, it runs in less than 4 seconds, either with the options
 Dr. van der Spoel gave or with just "f90 -fast -tune host -non_shared".
 (The simpler command is a wee bit better for the given code on my system,
 mostly because the rather speculative optimizations at the -O5 level are not
 attempted.  As the man page suggests, it's best to test -O5 to make sure it's
 an improvement.  That's why -O4 is the default.)
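
 For anyone who wants to make that -O4 versus -O5 check routinely, here is
 a minimal sketch in Python.  It assumes the same source has already been
 built twice, once at each level; the executable names in the comment
 (./forljc_o4 and ./forljc_o5) and the helper names time_command and
 faster_of are hypothetical, made up for illustration.  It reports
 wall-clock time only and takes the best of a few runs to damp noise.

```python
import subprocess
import time


def time_command(argv, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs of argv."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status,
        # so a crashing build is never mistaken for a fast one.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def faster_of(candidates, repeats=3):
    """Time each candidate and return (label_of_fastest, {label: seconds}).

    `candidates` maps a label to the argv list of an already built executable.
    """
    timings = {label: time_command(argv, repeats)
               for label, argv in candidates.items()}
    return min(timings, key=timings.get), timings


# Example call (hypothetical executable names, built beforehand):
#   best, timings = faster_of({"O4": ["./forljc_o4"], "O5": ["./forljc_o5"]})
#   print(best, timings)
```

 If the two timings come out within run-to-run noise of each other, the
 default level is the safer choice.
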
 I assume from the cache size that Dr. van der Spoel's system is an
 AlphaStation 500/400.  My system is an AlphaServer 4100 5/400, which has
 a 4 MB backup cache.  Both use an EV5-generation Alpha chip at the same clock
 speed, so the bcache and main memory system are the main differences that
 would affect the given code.  But I don't think that the larger bcache could
 account for the near order-of-magnitude difference between Dr. van der Spoel's
 timings and mine.  Either I'm timing something different, or some other
 interesting factor remains hidden.
 Regards,
 Chuck Schneider
 Digital High Performance Computing Expertise Center
 Maynard, Massachusetts
 (e-mail me directly for best response)