CCL: W:hardware for computational chemistry calculations



 Sent to CCL by: "Perry E. Metzger" [perry:piermont.com]
 --Replace strange characters with the "at" sign to recover email
 address--.
 "Luis Manuel Simon" writes:
 > Probably, there have been many discussions about hardware on this
 > list, but somehow I need some update about few questions:
 >
 > Which is the better performance/price choice for computational
 > chemistry (quantum chemistry and molecular dynamic simulations)?
 > Is it better a PC-cluster or a workstation? Several PC nodes in a
 > cluster or several processors in a board?
 > AMD 64, OPTERON, intel PIV, Xeon, SPARC? Are dual processors worth?
 > How important is microprocessor cache size?
 > How important is RAM size? And registered DDR modules?
 > Is it RAID worth? Does SCSI hard drive really perform better than SATA?
 > What about operating system, linux distribution, etc?
 > Is there any comparison, next to spec.org, about different configurations?
 From what I can tell, at the moment (and this could change at any
 time), the price/performance of AMD 64 equipment ("Opteron" is just a
 model of AMD 64) is by far the best. I know people who do a lot of
 number crunching who are buying AMD 64 whiteboxes by the fleet, and
 putting them in clusters.
 As to the rest, the Sparc is very very far behind on the
 price/performance curve at this point. I wouldn't even vaguely
 consider it.
 Cache size is of course quite important on modern machines, because
 main memory continues to lag far behind the speed of the cache. Taking
 a cache miss means your processor sits around twiddling its thumbs for
 a very long time, so you want as few cache misses as possible. That
 said, the precise size of cache that is important to you is totally
 dependent on the specific problem you are working on. Some problems
 fit in fairly small caches, some need very large caches. "It depends."
 Of course, since you can't buy cache separately from the processor in
 a modern machine, generally speaking this isn't something to do
 separately from the processor decision --  if you've benchmarked a
 particular processor and it works best, you know the cache it has is
 the best for your problem.
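
The benchmarking advice above could be approached with a small timing harness like the sketch below, run unchanged on each candidate machine. This is only an illustration: `sum_of_squares` is a hypothetical stand-in workload, and `time_workload` is a name invented here. Replace the workload with a scaled-down version of your real calculation, since a toy loop says little about cache behavior.

```python
import time


def sum_of_squares(n):
    """Hypothetical stand-in workload; swap in your real calculation."""
    return sum(i * i for i in range(n))


def time_workload(workload, sizes, repeats=3):
    """Return {size: best wall-clock seconds} for workload(size)."""
    results = {}
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload(n)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)  # best-of-N reduces timing noise
        results[n] = best
    return results


if __name__ == "__main__":
    for size, seconds in time_workload(sum_of_squares, [10_000, 100_000]).items():
        print(f"size={size}: {seconds:.6f} s")
```

Comparing the same sizes across machines gives a rough ranking; a sudden jump in time per element as the size grows is one hint that the working set has outgrown a cache level.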
 RAM size is also an issue. On a modern machine, you want to avoid ever
 getting page faults because paging something in is Very Very Very
 Slow. That means you want enough RAM that your entire working set fits
 in memory nicely. RAM is also used these days as buffer cache for file
 data, so the more RAM you have, the faster file i/o ends up being if
 you're doing lots of it. So, again, how much is enough depends a lot
 on your problem. Different sized problems will eat different amounts
 of RAM successfully. Luckily, these days RAM is dirt cheap. The
 biggest issue you will have is in dealing with problems that need very
 large memory spaces -- if you need an address space for your problem
 with more than a gig or two of memory, you need a 64 bit
 processor. (Technically, a 32 bit processor can handle 4G of address
 space, but remember that in most OSes, a big chunk (sometimes half) of
 the address space (not memory, address space!) is used by the OS, and
 mappings for the stack, shared libraries, etc., will make it
 impossible to truly use all the rest.) The 32 bit x86 processors can
 use more than 4G of RAM, but they can't devote it all to the address
 space of one process -- it is only really useful if you have multiple
 processes that can make use of it -- so again, if you have a problem
 that needs a big address space, you want AMD 64 or the Intel stuff
 with the same 64 bit extensions (but those processors aren't as fast).
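
As a rough illustration of the address space arithmetic above, the sketch below checks whether dense double-precision matrices would fit in a 32 bit process. The figures are assumptions, not from the post: 8-byte doubles, and about 2 GiB of usable user address space (the real limit varies by OS and by what else is mapped). The function names are made up for the example.

```python
# ASSUMPTIONS: 8-byte doubles; ~2 GiB of usable user address space
# in a 32 bit process. The real limit depends on the OS.
GIB = 1024 ** 3
BYTES_PER_DOUBLE = 8
USABLE_32BIT_BYTES = 2 * GIB


def dense_matrix_bytes(n):
    """Bytes needed for one n x n matrix of doubles."""
    return n * n * BYTES_PER_DOUBLE


def fits_in_32bit(n, copies=1):
    """True if `copies` n x n matrices fit in the assumed usable space."""
    return copies * dense_matrix_bytes(n) <= USABLE_32BIT_BYTES


if __name__ == "__main__":
    for n in (5_000, 10_000, 20_000):
        size_gib = dense_matrix_bytes(n) / GIB
        print(f"n={n}: {size_gib:.2f} GiB, fits: {fits_in_32bit(n)}")
```

Under these assumptions a single 10,000 x 10,000 matrix fits, but three of them, or one 20,000 x 20,000 matrix, do not, which is the point at which a 64 bit processor stops being optional.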
 On the question of clusters versus MP in a single machine, again,
 "that depends". Multiple processors or machines in general will only
 help you if your problem is easily parallelized. If it is easily
 parallelized but requires extremely tight coupling (i.e. you've got
 some small places where you could vectorize but you'll lose if you
 have to take a communications hit), even more than one processor won't
 help. If you have somewhat looser communications constraints but need
 shared RAM in order to operate effectively, you have no choice but to
 use an MP machine. Here again, though, keep in mind that you have
 strong limits on how far you will get that way -- price/performance of
 MP machines goes down sharply with the number of processors, and
 affordable MP machines rarely have more than 4. Even in clusters,
 though, sometimes dual processor machines make sense if you save
 enough on things like power supplies, cases, etc. that you've saved
 money over all. If your problem is embarrassingly parallel and you can
 put it over a cluster, by all means, use a cluster. Especially with
 gigabit Ethernet cards as cheap as they are now, you'll win big.
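
One common way to put numbers on "only helps if your problem is easily parallelized" is Amdahl's law, which is not named in the post but is sketched here for illustration. It assumes a fixed fraction of the run time parallelizes perfectly and ignores communication cost entirely, so real clusters will do worse than this bound.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when `parallel_fraction` of the run time
    is spread perfectly over `workers` processors (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    for fraction in (0.5, 0.9, 0.99):
        for workers in (2, 4, 64):
            s = amdahl_speedup(fraction, workers)
            print(f"parallel={fraction:.2f} workers={workers}: {s:.2f}x")
```

Even with 90% of the work parallel, the bound never reaches 10x no matter how many boxes go in the rack, which is why measuring the serial part of your job is worth doing before buying hardware.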
 On disk performance and controllers: in computational problems, it is
 very rare that you are actually remotely I/O bound. Most computational
 chemistry isn't moving 200G of data set in and out of RAM as fast as
 possible, it is loading a problem into memory and the problem then
 stays in memory. If your problem is a factor of 2 too big for memory,
 don't get a faster disk controller, get more RAM -- it will make far
 more of a difference. In other applications -- if you were building a
 database server, say, or you have a rare chem problem that actually is
 disk bound -- I'd give a different answer. In general, though, these
 days I go with SATA when I can -- the price/performance of SCSI
 doesn't justify it any more except at the very high end of I/O
 needs. RAID is nice on a server because it can save you when you have
 trouble and such, but really, RAID or SCSI on machines in your compute
 cluster would be silly, since they're not going to do much disk i/o if
 you can help it! If your cluster machine isn't doing much disk i/o,
 just use the SATA or IDE controller on the mother board and be done
 with it.
 On RAM, for large memories, you really *need* ECC. The odds are just
 too high that you'll get a single bit error somewhere and ruin days of
 number crunching with it. Skimping on fancy cases for your computers
 is one thing, skimping on ECC is another. Incidentally, not all chip
 sets actually pay attention to ECC! Make sure your motherboards do, or
 you'll be spending extra money for nothing.
 By the same token, incidentally, do not get crappy power supplies --
 they are by far and away (in my experience) the biggest source of
 "mysterious trouble" in modern machines. A nice ANTEC or other quality
 supply properly rated for the power consumption of your cluster member
 will make it a whole lot happier in the long run, which means less trouble
 for you. Similarly, clean AC power going in, and a nice clean machine
 room (google for "zinc whiskers" some time) will make you far happier.
 As for operating systems, that's a matter of taste. So long as you
 aren't running Windows, you'll be fine. And yes, I'm quite serious about "not
 Windows". You would imagine that since we're talking about something
 compute bound that barely cares about the kernel, all OSes would be
 the same, but you would be wrong.
 There are multiple reasons to stay away from Windows for compute
 clusters. First, there are dumb architectural mistakes in Windows that
 make it do too much I/O -- it pages too often, and doesn't use RAM as
 buffer cache efficiently. Linux and BSD also need tuning to minimize
 paging, but you at least *can* effectively do the tuning. On Windows,
 you'll be fiddling with the registry forever, trying to figure out why
 it is that you can't get the buffer cache large enough and can't keep
 your executable pages in RAM long enough even though there are gigs of
 free memory, and then you'll spend six hours on the phone with
 Redmond, and then you'll give up. Give up in advance -- you will be
 happier. Also, network I/O performance just can't be tuned up as far
 as you want on Windows. Lastly, and perhaps most significantly, you can
 set up Linux or NetBSD or FreeBSD so that you just shove another box
 into the rack, let it PXE boot, and walk away from it, never having to
 think about it again until the day your management script says you
 need to replace it because it died. You just can't do that with
 Windows -- it is simply not architected right to allow you to mass
 manage that way. I've set up systems for mass managing hundreds of
 machines in Unix clusters without human intervention, and I know of no
 one who has ever come close on that level of automation with
 Windows. Keep away. Even if you're managing one box, it is still too
 much of a pain. By the way, all this is even ignoring the needless
 expense of Windows licenses that could be better spent on faster
 processors or more RAM. I know hedge funds where money is no object
 and still none of them run Windows on compute clusters.
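
The "management script says you need to replace it" idea above might look something like the sketch below. Everything here is hypothetical: it assumes each node reports a heartbeat timestamp to some central place, and `dead_nodes`, the hostnames, and the 300 second threshold are invented for the example.

```python
import time


def dead_nodes(last_seen, now, max_silence=300.0):
    """Return sorted hostnames whose last heartbeat is older than
    `max_silence` seconds. `last_seen` maps hostname -> timestamp."""
    return sorted(
        host for host, stamp in last_seen.items()
        if now - stamp > max_silence
    )


if __name__ == "__main__":
    now = time.time()
    # Hypothetical heartbeat table; in practice this would be collected
    # from the nodes themselves.
    heartbeats = {"node01": now - 5, "node02": now - 4000, "node03": now - 20}
    for host in dead_nodes(heartbeats, now):
        print(f"replace {host}: no heartbeat in over 300 s")
```

Keeping the decision in a pure function like this makes it easy to run from cron and to test without touching the network.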
 Things are a bit different if you're just buying a personal
 workstation. In that case, I'd buy a Mac. :)
 All that said, if you are building a compute cluster, if you use a
 fairly late model Linux, NetBSD, FreeBSD, etc., you can make it work
 well. I'm a NetBSD bigot myself, but that is not for reasons that
 would impact computational clusters substantially. You can do fine
 with any of them if you're doing what is largely number crunching. The
 one to pick is the one that you or your admin staff feel most
 comfortable about managing -- and most importantly, automating the
 management so you don't ever have to log on to 10 or 50 or even 5
 boxes to fix something.
 A strong recommendation though that I'll bring up here because it is
 vaguely OS related -- do NOT use more threads than processors in your
 app if you know what is good for you. Thread context switching is NOT
 instant, and you do not want to burn up good computation cycles on
 useless thread switching. If you have one processor machines in your
 cluster, stick to one computing process/thread on it. If you have dual
 processor boxes, one process with two threads or two processes with
 one thread each make good sense, but 40 threads would not be a good use of your
 cycles. If you need to talk to 60 TCP sockets to share data between
 boxes, use event driven code, not 60 threads, if you want performance.
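
To illustrate the "event driven code, not 60 threads" advice, here is a minimal single-threaded sketch using Python's standard `selectors` module. It uses local `socket.socketpair()` connections as stand-ins for TCP links to other boxes; `collect_messages`, `demo`, and the message contents are made up for the example, and it assumes each peer sends one small message.

```python
import selectors
import socket
import time


def collect_messages(socks, expected, timeout=5.0):
    """Read from many sockets in ONE thread until `expected` messages
    have arrived, all peers have closed, or `timeout` seconds pass."""
    sel = selectors.DefaultSelector()
    for s in socks:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)
    received = []
    deadline = time.monotonic() + timeout
    try:
        while len(received) < expected and sel.get_map():
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            # Sleep until at least one socket is readable.
            for key, _events in sel.select(timeout=remaining):
                data = key.fileobj.recv(4096)
                if data:
                    received.append(data)
                else:
                    sel.unregister(key.fileobj)  # peer closed
    finally:
        sel.close()
    return received


def demo(n=60):
    """Simulate n peers, each sending one short message."""
    pairs = [socket.socketpair() for _ in range(n)]
    try:
        for i, (_local, peer) in enumerate(pairs):
            peer.sendall(f"node{i}".encode())
        return collect_messages([local for local, _peer in pairs], expected=n)
    finally:
        for local, peer in pairs:
            local.close()
            peer.close()


if __name__ == "__main__":
    print(len(demo()), "messages handled by a single thread")
```

The one thread only wakes when some socket actually has data, so no cycles go to switching between dozens of mostly idle threads.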
 A final recommendation about buying stuff.
 If you are buying one workstation for yourself, buy a Mac.
 If you are buying just a couple of compute servers to run an open
 source Unix, buy Dell -- for whatever reason, their prices are now
 totally insanely below what I can manage anywhere else in low
 quantities -- but be careful about which of the "small business" and
 "academic" discounts are lower at the moment. Often the academic price
 is insanely high for no obvious reason.
 If you are buying 50 and especially 500 machines, either spec and buy
 parts and assemble the things, or spec the parts and work with a
 reputable white box manufacturer to get them built for you. For large
 clusters, you want to get *exactly* what you want. You especially
 don't want to find out months down the line that half the machines are
 subtly different from the other half. Also, for large clusters, the
 maint. contracts you can get with things like Dells are useless
 expenses -- just buy spare machines and spare parts, they're cheaper
 -- and keep in mind that these days, the biggest single limiting
 factor you'll get in the average machine room for a large cluster is
 getting enough power in and enough heat out.
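
A back-of-the-envelope estimate of the power and heat budget can be sketched as follows. The per-node wattage and supply voltage are placeholder assumptions; measure real draw under load before sizing circuits or cooling. The conversion of roughly 3.412 BTU/hr per watt is standard.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # standard conversion factor (rounded)


def cluster_load(nodes, watts_per_node, volts=120.0):
    """Estimate total draw and heat output for `nodes` identical machines.
    `watts_per_node` and `volts` are assumptions supplied by the caller."""
    total_watts = nodes * watts_per_node
    return {
        "watts": total_watts,
        "amps": total_watts / volts,
        "btu_per_hour": total_watts * BTU_PER_HOUR_PER_WATT,
    }


if __name__ == "__main__":
    # Hypothetical example: 50 nodes at an assumed 250 W each.
    load = cluster_load(50, 250.0)
    print(f"{load['watts']:.0f} W, {load['amps']:.1f} A, "
          f"{load['btu_per_hour']:.0f} BTU/hr")
```

Essentially all of the electrical power ends up as heat in the room, so the same number drives both the circuit count and the air conditioning.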
 For truly huge clusters, you can start doing truly odd things to get
 better price/performance. Google does a couple of rather radical
 things -- for example they don't put their machines in real cases (it
 would cost more money and would lower the rate at which their racks
 can extract heat from the systems, not to mention that it slows down
 getting at the machines). You don't have to go that far, but even in
 smaller clusters thinking about mechanics can help. Nylon thumb screws
 instead of metal screws make getting stuff in and out faster. Velcro
 wraps instead of nylon tie wraps can make it far easier to pull things
 in and out. Properly planning your cabling (and labeling it!) so you
 can be sure that the connectors on the ethernet switch systematically
 correspond to the machines in the rack yields surprising benefits. Oh,
 and always remember, automate everything in the systems administration
 you can possibly automate.
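
The systematic cabling idea above can be sketched as a generated plan rather than a hand-kept list. The layout is an assumption made for the example: one switch per rack, with the switch port number equal to the slot number in that rack. The `node-rNN-sNN` and `sw-rNN` naming scheme is hypothetical.

```python
def cable_plan(racks, slots_per_rack):
    """Return a list of host/switch/port assignments in rack order.
    ASSUMED layout: one switch per rack, port number == slot number."""
    plan = []
    for rack in range(1, racks + 1):
        for slot in range(1, slots_per_rack + 1):
            plan.append({
                "host": f"node-r{rack:02d}-s{slot:02d}",
                "switch": f"sw-r{rack:02d}",
                "port": slot,
            })
    return plan


if __name__ == "__main__":
    for entry in cable_plan(racks=2, slots_per_rack=3):
        # Each line doubles as the text for a cable label.
        print(f"{entry['host']} <-> {entry['switch']} port {entry['port']}")
```

Because the mapping is a pure function of rack and slot, the same script can print cable labels and feed whatever automation later needs to find a machine from its switch port.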
 Perry