From chemistry-request@ccl.net Wed Aug 12 03:42:47 1992
Date: Tue, 11 Aug 92 21:09:10 -0700
From: ross@cgl.ucsf.edu (Bill Ross)
To: chemistry@ccl.net
Subject: benchmarks
Status: R

Molecular Mechanics/Dynamics Benchmarks

The following benchmarks may be of interest, not only for the thrill of watching the price/performance competition, but also for insights into architectures and for clues about what the molecular modelling community might request of designers. It may be useful to construct a set of comp chem benchmarks, including cases such as these along with QM and semiempirical cases.

Two cases are considered: a "small," 274-atom solute in a large periodic bath of water molecules and ions; and a "large," ~4000-atom molecule in vacuum with a few inflated external ions. For simplicity, both systems are DNA. The code used is Amber 4.0.

Although the measurements were not taken under controlled conditions, the trials that were repeated yielded quite similar results, probably varying by less than 2%. Formal benchmarks would require a 'bare' machine and might well include wallclock times and running multiple copies of a benchmark simultaneously to force paging. In any case, the numbers that follow must be treated as anecdotal and informal.

There is certainly data to inspire thought on machine architectures and compilers. For example, the Fujitsu VP2200 is almost twice as fast as the Cray Y-MP on a single processor; the IBM rs6000 is faster in double precision than in single; and the Silicon Graphics -fastm(ath) library is as much as 2.3 times faster than the default one and yields the same energies after 100 steps of minimization followed by 100 steps of dynamics.

The form of dielectric constant has a surprising effect on performance: using the normal form (dependent on 1/r) exacts as much as a 30% penalty over the "distance-dependent" form (1/r^2) on architectures that have less support for the square root operation (most notably the rs6000).
Clearly this is one architectural feature that the comp chem community may want to lobby for.

Only one parallel architecture was tested as such: an old 8-processor Alliant, using only compiler optimization, obtained correct results on the more tested, older code. The speedup (2.3) is hard to evaluate given the uncontrolled conditions (I don't know how many processors were used).

The cases include energy minimization, dynamics, and a free energy calculation. Eventually I expect to run the same cases for the solvated system as for the vacuum one.

My thanks to George Seibel and David Case for helpful observations on architectures and factors affecting program speed.

Note: The minmd program contains the traditional energy minimization and molecular dynamics capabilities of Amber. Sander is essentially the same, as used here. (Both programs have significant other features which are not exercised by the benchmarks.) Gibbs is the Amber free energy perturbation program.

Amber 4.0 Benchmarks

These benchmarks are for larger systems than the other demo cases and are intended to compare machine performance on more realistic problems. The order is roughly that of performance for the fastest machine in a product line. All times are CPU seconds measured by system calls in the programs; wallclock times may not correspond. All results except for the Alliant are for a single processor. These are single observations.

dna/Run.bench
    DNA hexamer in periodic water box, constant volume.
    7682 atoms: 274 dna, 10 counterions, 2466 waters.
    All solute interactions; 8A cutoff otherwise.
    Constant dielectric.

dna/Run.bench2
    68 DNA base pairs in vacuum.
    4282 atoms, 10A cutoff on all nonbonded pairs.
    Distance-dependent dielectric.
Paired entries are single/double precision CPU seconds.

                        dna/Run.bench                         dna/Run.bench2
                        __________________________________    ______________________________
                        min         min+md      sander        sander      sander      gibbs
                                                              min         md
                        __________________________________    ______________________________
Fujitsu VP2200          52/54       62/64       63/64         26/24       26/25       92
Cray Y-MP               -/91        -/104       -/104         -/44        -/44        98
HP 730                  336/327     367/363     337/363       205/220     200/216     409
   720/50MHz            434/462     480/512     476/503
vax 9000 vector         365/468     399/520     390/524       229/311     219/296
         no vector      654/774     /865        948/917       462/497     454/794     789
convex c2               479/516     549/597     562/603       279/303     277/304     767
fps 500                 744/774     855/865     921/915
mips rc6280             723/1133    758/1191    731/1101      565/869     564/867     888
iris Crimson            369/482     397/517     377/566
     4d/410vgx          730/1129    779/1180    768/1156
     4d/310vgx          956/1618    1015/1704   993/1560      578/809     578/804     1244
     w/fastm**                                                271/345     272/340     582
     personal           1722/2830   1724/3371   1724/2846
     4d/80gt            1901/3542   1996/4006   2006/3201
rs6000 530              859/844     912/895     915/858       516/391     501/378     630
decstation 5000/200     1112/1585   1173/1657   1168/1638     670/884     663/871     1325
alliant FX/8*           1772/1876   2016/2160   2034/2184
        1-process       4270
IBM 3090 200J vector    -/1999      -/2051
         200J scalar    -/6059      -/6143
sun sparc2              1798/2145   1834/2312   1627/2299
    sparc 4/280         2528/3830   2700/4062   2708/4018

PROGRAM NOTES

Run.bench
    min:        100 steps minimization
    minmd:      20 steps min, 80 steps md
    sander:     100 steps gradual warming

Run.bench2
    sander/min: 100 steps minimization
    sander/md:  100 steps gradual warming
    gibbs:      100 steps of dynamic windows perturbation (double-wide sampling)
                note: gibbs4 does not have vectorization directives
                note: gibbs4 is double precision

One interesting thing that came to light when developing bench2 was that the distance-dependent (1/r^2) dielectric was significantly faster than the normal (1/r) one. This effect, attributed to the taking of the square root, was more pronounced when hardware arithmetic support was lacking.
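To make the square-root point concrete, here is a minimal Python sketch of the two electrostatic forms. It is not Amber's code: the function names, the conversion factor of 332, and the unit scale factor are assumptions chosen for illustration. It relies only on the common practice of carrying the squared distance in a nonbonded loop, so the constant-dielectric form needs a square root to recover r while the distance-dependent form can use r^2 directly.

```python
import math

# Assumed conversion factor (kcal*Angstrom/(mol*e^2)); illustrative only,
# not taken from the post or from the Amber source.
COULOMB = 332.0


def squared_distance(a, b):
    """Squared distance between two 3-D points; no square root is needed."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def elec_constant_dielectric(qi, qj, r2, eps=1.0):
    """Constant dielectric: E = C*qi*qj/(eps*r), so r = sqrt(r2) must be taken."""
    return COULOMB * qi * qj / (eps * math.sqrt(r2))


def elec_distance_dependent(qi, qj, r2, scale=1.0):
    """Distance-dependent dielectric eps(r) = scale*r: E = C*qi*qj/(scale*r^2)."""
    return COULOMB * qi * qj / (scale * r2)


if __name__ == "__main__":
    # Two opposite unit charges 2 Angstroms apart (hypothetical test pair).
    r2 = squared_distance((0.0, 0.0, 0.0), (0.0, 0.0, 2.0))
    print("constant dielectric:   ", elec_constant_dielectric(1.0, -1.0, r2))
    print("distance-dependent:    ", elec_distance_dependent(1.0, -1.0, r2))
```

The only arithmetic difference between the two pair functions is the call to math.sqrt, which is the operation the timings point to on machines with weak hardware support for it.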
Representative results (double precision sander minimization):

          sgi      mips     IBM      Convex   hp       Cray     Fujitsu
diel      310vgx   RC6280   530      C2       730      Y-MP     VP2200
1/r^2     807      616      391      303      220      44       24
1/r       1244     869      576      313      240      49       29
ratio     .649     .709     .679     .968     .917     .898     .828

MACHINE NOTES

The Fujitsu VP2200 is a 32-bit machine with 64-bit arithmetic.
The Cray Y-MP is a 64-bit machine, so single precision results are irrelevant.
The Convex C2 was running under IEEE Floating Point default mode.

-----
Cray hpm (hardware performance monitor) results

                 Run.bench   Run.bench2
MFlops           60.1        79.7
MIPS             35.7        35.5
M_Mem/sec        69.1        96.2
ClockCyc/Inst    4.67        4.71

-----
                 Memory (Mb)   Data Cache   Instruction Cache   Cache
c2               1024
fps500           128
iris4d/410vgx    128           64K          64K                 Sec Cache 1MB
iris4d/Crim      96            8K           8K                  Sec Cache 1MB
alliant          64                                             512K
mips             64
iris4d/310vgx    32            64K          64K
rs6000/530       16
irisPersonal     12            32K          64K
iris4d/80gt      8             32K          64K
dec5000/200      32

*Automatic parallelization directives were invoked in the Alliant compilation. The machine has 8 processors. I do not know what the parallel timings mean, but am impressed that correct results were obtained on all tests except polarization. -Bill Ross

**compiled with the -lfastm "fastmath" lib. Energy results were exactly the same after 100 steps min + 100 steps md. -BR

----
Bill Ross

---
Administrivia: This message is automatically appended by the mail exploder.
CHEMISTRY@ccl.net --- everybody; CHEMISTRY-REQUEST@ccl.net --- coordinator only
OSCPOST@ccl.net : send something from chemistry; FTP: www.ccl.net
---

From chemistry-request@ccl.net Wed Aug 12 04:51:54 1992
Date: Tue, 11 Aug 92 23:19:14 -0700
From: d3g359@rahman.pnl.gov
Subject: Graphics code for SGI
To: CHEMISTRY@ccl.net
Status: R

Hi everyone. I am interested in public domain, chemistry related, graphics software for SGIs. I already have MULTI, SCI-AN, and XMOL. I would like examples of code based on GL, MOTIF, and/or X. Any response is appreciated. I would also be interested in public domain code for the MAC. Thanks in advance.
John Nicholas
Pacific Northwest Laboratory
Richland, WA
(509) 375-6559
jb_nicholas@pnl.gov

From chemistry-request@ccl.net Wed Aug 12 20:38:28 1992
Date: Wed, 12 Aug 92 09:00:16 PDT
From: "Doug DeFrees"
To: chemistry@ccl.net
Status: R

Netters:

I was prompted by the recent note about an organic chemistry list to wonder what other lists or forums might exist on the internet which directly address chemistry or physics. This list for computational chemists is a good example, as is the organic chemistry list (though there seems to be some question about it) and the nascent macromodel list that was mentioned here recently. Is there a list of lists anywhere? (I doubt it.) If you know of a public list, forum, or newsgroup that directly addresses chemistry or physics, please send a note to the list.

Doug DeFrees
IBM Almaden Research Center

From chemistry-request@ccl.net Wed Aug 12 21:34:14 1992
Date: Wed, 12 Aug 92 13:58:24 -0400
From: nobody@Kodak.COM
To: "chemistry@ccl.net"@Kodak.COM
Subject: Neural nets and DNA/RNA. (LONG!)
Status: R

>From: NAME: Adi M. Treasurywala
       FUNC: Biophys. & Compu. Chem.
       TEL: (518)445-7042
To: "chemistry@ccl.net"@kodakr@mrgate@wpc

Folks,

Thanks for all the interest and suggested reading. As promised, here is a summary of the refs that I found. Thanks to Jacque Fetrow and Max Vasques in particular for their help. The refs follow:

 1) T.B.Schillen, Designing a Neural Net Simulator - the MENS Modelling Environment for Network Systems. 1. Computer Applications in Biosciences v7(4), 417-430 (1990)
 2) E.C.Uberbacher et al, An Artificial Intelligence approach to DNA sequence feature recognition. Trends in Biotechnol. v10(1-2), 66-9
 3) P.Arrigo et al, Identification of a New motif on nucleic acid sequence data using Kohonen's self organizing map. Comput. Appl. Biosc. v7(3), 353
 4) E.C.Uberbacher et al, Locating protein coding regions in human DNA sequences by a multiple sensor neural network approach. PNAS v88(24), 11261-5
 5) E.C.Uberbacher et al, A Neural network: Multiple sensor based method for recognition of gene coding segments in human DNA sequence data. Report ORNL/TM-11741, Order # DE91007462
 6) S.B.Patersen et al, Training neural nets to analyze biological sequences. Trends Biotechnol. v8 (110) 304
 7) S.Brunak et al, Prediction of human mRNA donor and acceptor sites from the DNA sequence. J. Mol. Biol. v220(1), 49
 8) G.von Heijne, Computer analysis of DNA and protein sequences. Eur. J. Biochem. v199(2), 253
 9) M.C.O'Neill, Training back-propagation neural nets to define and detect DNA-binding sites. Nucleic Acids Res. v19(2), 313
10) M.Kanehisa, Computer analysis of functional sites in proteins and nucleic acids. Med Philos v((6) 465
11) S.Brunak, Neural net detects errors in the assignment of mRNA splice sites. Nucleic Acids Res. v18(16), 4797
12) S.Brunak, Cleaning up the gene database. Nature v343(6254), 123
13) A.Lukashin et al, Neural network models of promoter recognition. J. Biomol. Str. Dyn. v6(6), 1123
14) E.C.Uberbacher et al, Pattern Recognition in DNA sequences: the intron-exon junction problem.
15) A.Lapedes et al, Determination of Eukaryotic Protein Coding Regions Using Neural Nets and Information Theory. J. Mol. Biol. v226, 471
16) B.Demeler et al, Neural nets optimization for E. coli promoter prediction. Nucleic Acids Res. v19(7), 1593
17) G.D.Stormo et al, Use of the 'Perceptron' algorithm to distinguish translational sites in E. coli. Nucleic Acids Res. v10(9), 2997

Hopefully some of you folks find all this great bedtime reading.
If there are typos they are entirely my own fault. Please accept my apologies in advance.

Adi Treasurywala. (adit@koidak.com)

From chemistry-request@ccl.net Wed Aug 12 23:59:10 1992
Date: 12 Aug 1992 16:15:55 -0400 (EDT)
From: "Joseph E. Bitar"
Subject: CASSCF (singlet vs. triplet)
To: chemistry@ccl.net
Status: R

From: UOFT02::DSMITH "DR. DOUGLAS A. SMITH, UNIVERSITY OF TOLEDO" 12-AUG-1992 15:58:27.20
To: JBITAR
CC:
Subj: question

I am trying to do CASSCF single point and geometry optimizations using G92 on a cyclic cation, C6H6N+. For the singlet, preoptimized at the RHF/3-21G level, I was able to run both the CASSCF jobs without a problem. For the triplet, preoptimized at the UHF/3-21G level, the single point job ran but the optimization died with the following error:

------------------------------------------------------------------------------
 **********
 CALCULATION TERMINATED
 ATTEMPT TO DO ORBITAL ROTATION GREATER THAN 45 DEGREES
 YOU HAVE CHOSEN THE WRONG STARTING ORBITALS
 **********
 Error termination in Lnk1e.
-------------------------------------------------------

My input files for the single point and optimization runs are identical with the exception of the OPT keyword. Both jobs used the UHF/3-21G orbitals and checkpoint geometry (identical copies of the checkpoint file were used for each job).

-------------------------------------------------------
%CHK=UHFCHK2
#P CAS(6,7)/3-21G OPT POP=FULL GUESS=(READ,ALTER) GEOM=CHECKPOINT

CAS full optimization on the UHF optimized structure of 1H-azepine

1 3

17,22
28,30
29,44
--------------------------------------------------------------

Why should the single point job finish but the optimization fail with an orbital rotation problem?
I don't have much experience with CASSCF calculations and I was wondering if I have overlooked something that should have been included or excluded in these calculations. Any suggestions would be greatly appreciated. Thanks in advance.

Joseph Bitar
Graduate Student
Chemistry Dept.
University of Toledo
e-mail: jbitar@uoft02.utoledo.edu

From chemistry-request@ccl.net Thu Aug 13 02:48:41 1992
Received: from ohstpw.mps.ohio-state.edu for jkl by oscsunb.ccl.net (5.65c+KVa/920330.1102)
    id AA13629; Thu, 13 Aug 1992 02:48:38 -0400
Received: from OHSTVMA.ACS.OHIO-STATE.EDU (MAILER@OHSTVMA) by MPS.OHIO-STATE.EDU (PMDF #12887)
    id <01GNIMTKIPSW8WVYX6@MPS.OHIO-STATE.EDU>; Thu, 13 Aug 1992 02:48 EDT
Received: from OHSTVMA by OHSTVMA.ACS.OHIO-STATE.EDU (Mailer R2.08 R208004) with BSMTP
    id 9284; Thu, 13 Aug 92 02:47:36 EDT
Received: from oscsunb.ccl.net by OHSTVMA.ACS.OHIO-STATE.EDU (IBM VM SMTP R1.2.1MX) with TCP;
    Thu, 13 Aug 92 02:47:35 EDT
Received: by oscsunb.ccl.net (5.65c+KVa/920330.1102)
    id AA07208; Wed, 12 Aug 1992 21:08:37 -0400
Received: from relay2.UU.NET by oscsunb.ccl.net (5.65c+KVa/920330.1102)
    id AA06863; Wed, 12 Aug 1992 20:48:12 -0400
Received: from world.std.com by relay2.UU.NET with SMTP (5.61/UUNET-internet-primary)
    id AA25907; Wed, 12 Aug 92 20:47:44 -0400
Received: by world.std.com (5.61+++/Spike-2.0)
    id AA27041; Wed, 12 Aug 92 20:47:28 -0400
Date: Wed, 12 Aug 92 20:47:28 -0400
From: jle@world.std.COM (Joe M Leonard)
Subject: Effect of Pi calculation
Sender: chemistry-request@ccl.net
To: chemistry@ccl.net, sybylreq@quant.chem.rpi.EDU
Errors-To: owner-chemistry@ccl.net
Message-Id: <9208130047.AA27041@world.std.com>
X-Envelope-To: jkl@ccl.net
Precedence: bulk
Status: R

Folks,

In my continuing struggle to understand the published MM3 literature base, I've noticed differences between published and internal calculations on Benzene (complex, huh!): the 2-2 stretching parameters are listed as 1.332A, k=7.50, but the results I'm comparing against seem to indicate a change to 1.397A, k=15+. Is this a result of the Pi calculation changing the values of the 2-2 constants to reflect the "actual" environment, or has the "official" parameter changed?

Thanks in advance,

Joe Leonard
jle@world.std.com