From owner-chemistry@ccl.net Mon Dec 9 10:17:00 2013
From: "Daniel Jana dfjana . gmail.com"
To: CCL
Subject: CCL:G: Share of experience, software and hardware
Message-Id: <-49414-131209095520-23137-+tfAF2FVONKn1f6SCfcotg]![server.ccl.net>
X-Original-From: Daniel Jana
Content-Type: text/plain; charset=ISO-8859-1
Date: Mon, 9 Dec 2013 15:55:13 +0100
MIME-Version: 1.0

Sent to CCL by: Daniel Jana [dfjana_-_gmail.com]

Hello,

Strategy 1 - I see no problem with having multiple users on the same computer. Of course, physically it is hard... you tend to have only one keyboard and screen, after all. However, once the computers are on the network, nothing prevents other users from connecting via SSH (I assume we are talking about Linux workstations) and launching calculations. Through a combination of job priority and an appropriate choice of the number of cores, the machine can still be used for regular work (checking journals on the web, reading/writing papers, the occasional video on YouTube because no one works 100% of the time) while running jobs.

The easiest way to manage the software is to have an NFS-shared partition with all the software installed (see the small sketch a few paragraphs down). This way you install the software in one place, rather than locally on every machine. In this scenario, users typically check which workstations are free and run their jobs directly on those machines. You can always go the extra mile and install a scheduler, so they submit jobs to a queue and each job runs on the first available machine, but that may be too much to learn in the beginning.

Strategy 2 - Obviously having a cluster is the ideal solution, but I am not sure you will go far with that budget (perhaps if you buy computers from a regular shop rather than rack-ready hardware, which makes it a bit cheaper). You will still have a lot of computers in the same room producing heat, so you have to at least consider the possibility that part of the money will go towards buying and installing air conditioning. If you go with strategy 1, the students will probably be spread over several rooms, so the problem becomes less obvious. Buying a rack-ready cluster also means buying a rack, and with such a small budget that may not be a negligible part of it.

Concerning Amber/Gaussian: those two codes scale very differently across many cores. My personal feeling is that Gaussian scales poorly beyond 8 cores and poorly over more than one machine, while Amber should scale more or less linearly up to at least a few hundred cores. This means that if you plan a cluster mainly for Gaussian there is not much need for InfiniBand, while for Amber it does make sense (running it over Ethernet hurts the performance substantially). Of course, with a total budget of 40 kUSD, talking about InfiniBand is probably a bit pointless.

I'd say that in the beginning strategy 1 makes more sense. You still need computers for the students anyway... there is no point having a cluster if users have nothing to connect to it from. You can also slowly start learning the tools needed to run a cluster (e.g. learning NFS at the beginning to share the software; later installing NIS to manage the users centrally, rather than creating every user on every workstation; later still installing a scheduler so that jobs can be submitted to remote machines automatically). It is true that managing a cluster is not trivial, so taking baby steps is probably the best way to go about it.
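To give you an idea of how little the NFS part actually involves, here is a rough sketch. The host name "master", the path /opt/software and the subnet below are made-up placeholders, not a recommendation of any particular layout:

    # On the machine that holds the software (here called "master"):
    # add one line to /etc/exports to share /opt/software read-only
    # with the lab subnet
    /opt/software 192.168.1.0/24(ro,sync,no_subtree_check)

    # then reload the export table
    exportfs -ra

    # On every workstation, add one line to /etc/fstab so the share
    # is mounted at boot...
    master:/opt/software  /opt/software  nfs  ro,hard,intr  0  0

    # ...or mount it by hand once, for testing
    mount -t nfs master:/opt/software /opt/software

After that, /opt/software/g09, /opt/software/amber and so on look identical on every machine, and there is only one copy to maintain.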
When you feel more comfortable with it, perhaps even with one or two students capable of handling all the tools that make up a cluster, you could then think about acquiring one, potentially together with a few other groups, so that the joint investment gives you a cluster with 30 or 40 nodes.

When it comes to software: I would avoid purchasing non-scientific software as much as possible. Why spend money on an operating system when Linux (provided you and your students have either the skills or the time to learn it) costs nothing and is probably the best solution anyway? Once your students are accustomed to the shell, they can start writing scripts that make their lives easier (e.g. parsing the output files and extracting only the relevant bits, rather than doing it all by hand). Linux and related tools will cover most of your needs: even if you go for a cluster, NFS, DHCP, SSH and NIS are all readily available and there is plenty of information on how to get them to work. And if you are considering a cluster anyway, chances are you will need to learn Linux regardless.

At least for a cluster you need a scheduler to manage the users' jobs (although, as I mentioned earlier, it may make sense for the workstations too). Lately I have been inclined to use SLURM. Torque feels a bit abandoned, and SGE split into so many forks after Sun was bought by Oracle that I do not even know which version to install. I could name a few others, but some of the ones I have tried simply did not feel solid enough to put into production. SLURM is a young project and has some quirks, but it seems a good bet for the near future (a minimal example of a batch script is included below, after the PS).

Compilers: in the beginning you can certainly work with the GNU compilers (gcc, gfortran, ...) that come with Linux; most of the codes you need to compile will work with them. You will definitely need BLAS and LAPACK. They may be available from the Linux distribution you choose, but it is best to compile them locally for optimal performance. FFTW 2 and 3 will also be important, but you will figure that out quickly. In the long run, however, consider purchasing the Intel compilers and MKL: the codes compiled with those are often faster than those built with the GNU compilers, and with a limited number of machines, efficiency may be the best upgrade you can get.

As for vendors, I feel I cannot give you a good answer. The best vendor in Egypt is certainly not the same as here (read "best" in whatever way you want, from the cheapest to the one with the best customer service).

I hope this helps,
Daniel

PS - Please consider a backup solution. You may go with strategy 1 for now, but it serves no purpose to have all those computers and then risk losing months of work because a hard drive died. Consider buying one machine with several disks and several times the capacity of the individual computers, and automating backups of the workstations to it (a small sketch of how that could look follows below). It can even be the machine where you install the software, to reduce costs. Bonus points if you manage to place it in a separate location (e.g. a server room on the other side of the campus); that way you do not lose both the backups and the workstations when a fire burns down your lab or someone steals some computers overnight. It may seem that you can think about this later but, from personal experience and anecdotal evidence, people only think about backups when it is too late, i.e. when they already need them.
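As promised, a minimal sketch of a SLURM batch script for a Gaussian job. The job name, core count, scratch path and input file are made-up examples; adapt them to however you end up installing things:

    #!/bin/bash
    #SBATCH --job-name=opt_test      # name shown by squeue
    #SBATCH --nodes=1                # keep Gaussian on a single node
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=8        # should match %NProcShared in the input
    #SBATCH --time=24:00:00          # wall-time limit
    #SBATCH --output=opt_test.slurm.log

    # scratch space on the local disk (placeholder path)
    export GAUSS_SCRDIR=/scratch/$USER/$SLURM_JOB_ID
    mkdir -p "$GAUSS_SCRDIR"

    # run the calculation; assumes g09 is already in the PATH
    # (e.g. from the NFS-shared software partition)
    g09 < opt_test.com > opt_test.out

    # clean up the scratch files
    rm -rf "$GAUSS_SCRDIR"

Submitting is then just "sbatch opt_test.sh", and "squeue" shows what is queued and running.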
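And since I insist on the backups: a very crude sketch of the automation, assuming the backup machine is reachable over SSH as "backup", that passwordless SSH keys are already set up, and that /backups/<hostname> already exists on it (all of these names are placeholders):

    #!/bin/bash
    # nightly mirror of /home to the backup machine;
    # --delete keeps an exact mirror, so add dated snapshots or
    # --backup-dir if you also want some history
    rsync -a --delete /home/ backup:/backups/$(hostname)/home/

    # run it from cron on every workstation, e.g. in "crontab -e":
    # 30 2 * * * /usr/local/bin/backup_home.sh >> /var/log/backup_home.log 2>&1

Even something this simple is far better than finding out, after a disk dies, that there was no backup at all.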
On 8 December 2013 20:55, Mahmoud A. A. Ibrahim m.ibrahim[A]compchem.net wrote:
>
> Sent to CCL by: "Mahmoud A. A. Ibrahim" [m.ibrahim^compchem.net]
>
> Dear Colleagues,
>
> We kindly ask you to share your experience with us. We are establishing a
> new computational chemistry lab and aiming to purchase some hardware. The
> budget is not high; it is around $40,000. We have two strategies:
>
> 1- Purchase good workstations with the available budget. The problem is
> that only one user will use each workstation, i.e. we need a workstation
> per student. If there is any way to let many users use the same
> workstation at the same time, please share your knowledge with us.
>
> 2- Purchase a small HPC cluster which can be upgraded in the near future
> (just add more processors and storage disks). I prefer this strategy,
> which would let us expand our facilities very easily in the future without
> getting rid of the old machines. But we do not have professional
> technicians here at the current time, and our colleagues say that it is
> not easy to manage a small HPC cluster to handle your jobs.
>
> We need your experience: if you were us, which one would you purchase
> (workstations or a small HPC cluster)? It would also be nice if you could
> let us know all the hardware and software you would purchase, from the
> operating system up to the software responsible for handling the jobs and
> the compilers. In the case of purchasing an HPC cluster or workstations,
> which company would you recommend?
>
> For your information, we are aiming to run Gaussian calculations and AMBER
> simulations at the current time.
>
> Finally, we thank you deeply in advance for your support.
>
> Sincerely;
> M. Ibrahim
>
> P.S. We read many posts on CCL regarding hardware but, because technology
> changes so quickly, we are afraid we have missed something. We apologize
> for any inconvenience caused.
>
> --
> Mahmoud A. A. Ibrahim
> Editor, Journal of Organic and Biomolecular Simulations (JOBS), Science
> Publications
> Group Leader, CompChem Lab, Chemistry Department,
> Faculty of Science, Minia University, Minia 61519, Egypt.
> Email: m.ibrahim()compchem.net
>        m.ibrahim()mu.edu.eg
> Website: www.compchem.net

From owner-chemistry@ccl.net Mon Dec 9 13:31:01 2013
From: "Heng Zhang chemzhh:163.com"
To: CCL
Subject: CCL: Spectroscopy of some Cu complexes
Message-Id: <-49415-131209132927-20398-d3BfejHNnC1KBJy5fhjHbg()server.ccl.net>
X-Original-From: "Heng Zhang"
Date: Mon, 9 Dec 2013 13:29:26 -0500

Sent to CCL by: "Heng Zhang" [chemzhh,163.com]

Dear all,

I want to compute the IR, UV-Vis and Raman spectra of some Cu complexes (containing C, H, N, O and Cl). Which DFT method and basis set should I choose in order to get more accurate results? Thanks in advance.

Yours,
Heng