From ross@cgl.ucsf.EDU Sun Oct 11 18:13:53 1992 Date: Mon, 12 Oct 92 01:13:53 -0700 From: ross@cgl.ucsf.edu (Bill Ross) To: chemistry@ccl.net Subject: Water packing summary Thanks to all who responded. I asked about ways of solvating a molecule other than superimposing a solvent template and subtracting overlapping solvent. Some of the replies excerpted below were sent to the reflector. The immediate reply from Thomas Pierce, thpierce@rohmhaas.com gives the authoritative reference and a good summary: > During the development of modeling work at Rohm and Haas, Dr. Mario Blanco > developed a suite of methods and published them in an article titled, > "Molecular Silverware. I. General Solutions to Excluded Volume Constrained > Problems", J. Computational Chemistry, Vol. 12, #2, 237-247 (1991). > One of the methods works well for solvation problems. Basically this method > packs a molecule around another with rotations and reorientations until > the smallest radius of gyration is achieved. Then the method adds another > molecule (solvent) and packs it, etc. The software is available from Tripos. (The Blanco work was also also mentioned by Terry Stouch (STOUCH@dino.bms.com) and Chris Van Dyke (tripos!capella!vandyke@uunet.UU.NET.) jas@medinah.atc.ucarb.com (Jack Smith) adds: > I believe that Mario Blanco now works at Molecular Simulations > (Formerly Polygen, Molecular Simulations, Cambridge Molecular > Design,...) and the Silverware software is also, or soon will be, > part of their PolyGraf program. The amorphous growth methods > used in some of the commercial polymer modeling packages > seem like a reasonable way to approach the solvent packing issues, > especially for complex solvents or binary mixtures in general. And more detail on the Tripos solutions (including Blanco's) from Malcolm A. Cline > The SYBYL software, available from Tripos Associates, contains three > different algorithms for solvation. The first is the method you > mentioned, dumping a solute into a precomputed solvent box and getting > rid of overlaps. > > The second is a piece of software called Molecular Silverware, > developed at Rohm & Haas. > > Molecular Silverware copyright (c) 1990 Rohm and Haas Company, All Rights > Reserved. > > The molecular silverware algorithm is used to rapidly solvate a molecule > with any small molecule solvent. The solute molecule may be surrounded with > solvents to form a droplet cluster, or may be used to pack the solute with > solvents in a periodic box. > > The use of SILVERWARE has several advantages over precomputed boxes: > First, any solvent molecule may be used. The resultant energies are low > enough so that Dynamics can normally be used after only a brief > minimization. > Second, solvent molecules will be packed into any cavities that may be > present in the solute. This feature results in fast equilibration of the > system when using Dynamics. > Third, when solvating in a periodic box, the periodic nature of the box is > used when packing the solvents, avoiding the problem of high-energy "edge > effects" upon use of Molecular Mechanics or Dynamics. > > > The third method is called XFIT, which is an algorithm for finding best > fit between a cluster and a solvent molecule. This program can also > be used for very high-quality solvation of a target molecule, with or > without periodic boundary conditions. XFIT will in general provide > better results than SILVERWARE, but require more time. XFIT is similar > to SILVERWARE but checks more thoroughly than SILVERWARE for the best > solvent/cluster fit as it adds each new solvent molecule to the cluster. > I believe the Blanco method is O(N^2) in the number of solvent molecules (no analysis is performed in the paper), as is any solution that looks at the whole system accumulated so far before adding another element. Compared to the times that can be spent running dynamics, a one-time N^2 cost is probably negligible - Blanco mentions 5 hours on a Vax 8650 for packing 1000 chlorofluorocarbon molecules (15,000 atoms). This might take 2 hours on a current high-end workstation. Perhaps the algorithm could be improved. A more obviously linear approach is the grid-based method described by G. Ravishanker > 1. Create a 3-dimensional grid out of the simulation box, whose > size is the diameter of water. > > 2. Start placing waters in random orientations in those grids > where you do not have a solute atom (or more correctly those grids which > have a solute atom within a threshold of .5 A or so are exculed). > [here is where linearity is lost:] > > 3. Do a long Monte Carlo simulation on water (and ions if you have > them). > > 4. Start the MD by slowly heating the system to 300K over 10 ps or > so. Equilibrate and then run production. >... > What we planned to do from now on is basically take equilibrated boxes > from the strategy above from several molecules that we have studied and > replace the solute and pad any extra waters and remove the contact waters > and run a short MC. Since the water environment is not disturbed to a > great extent in the bulk of box, the MC length for convergence will be > much less than what you need by the strategy mentioned above. We strongly > believe that doing MC to get the waters in position pays off in the long > run in producing stable MD runs. Scott Flicker also suggested Monte Carlo methods for random adsorption analogous to studies of 2D surface adsorption by R.F. Ziff et.al. This might have the effect of combining the grid placement scheme described by Ravishankar with the subsequent Monte Carlo phase. Another method mentioned by Blanco in his paper (Theodorou & Suter, Macromolecules, 18, 1206 (1985)) involves resolving an initial structure generated by superimposition using scaled-down vdw radii at first and scaling them to normal while equilibrating. The ideal method for my own everyday purposes would give similar results to Blanco's method but be linear in performance. Failing a speedup of Blanco's algorithm, an alternative (for water at least) might be some form of irregular grid generation algorithm producing a grid whose points would be water oxygens which would fit the solute at close range and scale to fit a Monte Carlo bath as distance increased. This idea is analogous to the one proposed by Ravishankar above, except that the explicit Monte Carlo baths would be replaced by randomized packing rules. I am not familiar with work on grids so cannot tell if this idea even makes sense, let alone estimate the cost. If it worked, initial orientation of waters would be decided in a second pass by working outward from the solute and considering inner neighbors. Any grid-generation experts care to comment? Bill Ross From arne@mango.mef.ki.se Mon Oct 12 11:59:35 1992 Date: Mon, 12 Oct 92 10:59:35 +0100 From: arne@mango.mef.ki.se (Arne Elofsson) To: CHEMISTRY@ccl.net Subject: Cut_off distances. In Protein MD Dear Netters In Biochemistry:31 5856-60 writes Schreiber and Steinhauser that : "Cutoff size does strongly influence MD REsults on solvated polupeptides" They basically simulates a small helix forming peptide with a cutoff at 6, 10 and 14 A nonbonded cutoff and compares it with a ewald summation technique. The strange results is that the 10 A cut off is most similar (in structure) to the ewald summation simulation. This would mean that a 14 A cutoff is too short.!! There are several questions that can be asked about the simulation used (a lot of MD information is missing) but I would like to not discuss that here. (It is just doubts it can't really prove thet the simulations are wrong). I have two suggestions/ questions: 1. What earlier studies, on polypeptides, have implemented the use of a cutoff of 8 to 12 Angstrom ? Is it really just depending on used Box sizes or ? 2. Let's assume that there is something really strange about these simulations. What can we then do about it ? I have a suggestion that involves a collaboration all over the world. If every one spend some computing time on the same system, i.e. the same peptide but different forcefileds, box sizes .. Would we not be able to together solve this problem that is important to all of ous. For practical reasons maybe it is easier to limitate this kind of study to a few labs (five to ten). If there is an interest in this kind of study I would consider spending some time organize contacts. I think this list is the best forum to discuss what should be studied and how. arne Elofsson arne@mango.mef.ki.se From arne@mango.mef.ki.se Mon Oct 12 12:08:24 1992 Date: Mon, 12 Oct 92 11:08:24 +0100 From: arne@mango.mef.ki.se (Arne Elofsson) To: CHEMISTRY@ccl.net Subject: Long time motion in proteins. Dear Netters. I search for MD/MC protocols/parameters for simplified descriptions of motion. I am aware of the work done by Skolnick and all "protein folding" work, but is there any projects to study long time motions in proteins ?? sincerely yours arne@mango.mef.ki.se From bewilson@Kodak.COM Mon Oct 12 04:09:23 1992 Date: Mon, 12 Oct 92 08:09:23 -0400 From: bewilson@Kodak.COM (Bruce E. Wilson, ECCR-PA, B95-A, X8886 (bewilson@kodak.com)) To: "chemistry@ccl.net"@Kodak.COM Subject: Potential surface calculations Harald Lanig (harry@phys-chemie.uni-wuerzburg.dbp.de) asks about using GRID in Mopac to get at what I interpret as essentialy a phi-psi plot. I'm going to let my ignorance show here, but I thought that the GRID and reaction coordinate functions did not minimize the rest of the molecule, which makes the interpretation of the resulting calculations a bit problematic. Anyone care to confirm or correct this impression? Thanks: (Harry--Ihre English ist ganz besser als meine Deutsch! Ich wuerde nicht wissen, dass English Ihre Muttersprache nicht ist). Bruce Wilson (bewilson@kodak.com) (Disclaimer: Any opinions expressed are solely those of the author.) From PPANETH1@PLEARN.bitnet Mon Oct 12 09:14:39 1992 Date: Mon, 12 Oct 92 14:02:48 CET From: Piotr Subject: E-Conference To: CC-List Hi Everybody: I was out for a couple of days and I had to browse through the CCL postings afterwards. One major reflection I have now is that our discussions, apart from garbage mail, are restricted to: 1. problems with programs, 2. searches for (free) software. Does it mean that nobody has any SUCCESS with computatio- nal chemistry? If the List survives voting, how about 'organizing' an E-conference, where we'd present something that DID work? To prevent dumping of lots of work on Jan, I thought we could decide on one particular week and at the same time post 'posters' on our work. This will give us all the idea what other people are working on, what else can be done in our own field ect. We could also do it in 'sections' like; code devepolment, applications, commercial (?). Please, respond directly to me - there is enough of noncomputational traffic already on the List. If there is resonable interest in the idea I'll pass the information to Jan. Piotr ======================================================================== Dr. hab. Piotr Paneth Institute of Applied Radiation Chemistry Technical University of Lodz Zwirki 36, 90-924 Lodz, POLAND voice: (+48-42) 31-31-93 fax: (+48-42) 36-02-46 telex: 885452 ITR PL e-mail PPANETH1 \ PLEARN.bitnet (CMS) PPANETH1 \ zsku.p.lod.edu.pl (Unix) PPANETH1 \ LODZ1.DECnet (VMS) ======================================================================== From george@archimedes.cray.com Mon Oct 12 03:02:04 1992 Date: Mon, 12 Oct 92 08:02:04 CDT From: george@archimedes.cray.com (George Fitzgerald) To: chemistry@ccl.net Subject: Analytic energy 2nd deriv for ECP Netters, The information on GRADSCF and Polyatomics Research recently posted to the net was correct except for the e-mail address. The correct address for Dr. Andrew Komornicki is c0394@rhea.cray.com -george fitzgerald From berkley@wubs.wustl.edu Mon Oct 12 03:39:16 1992 Date: Mon, 12 Oct 92 08:39:16 -0500 From: berkley@wubs.wustl.edu (Berkley Shands) To: chemistry@ccl.net Subject: ACE with a twist Receptor (TM - Tripos Associates) Beta version V2.3 timings for the ACE series at four and two degree scans. Original configuration of molecules. ******************************************************************************* Benchmarks of the complete ACE series of 71 molecules at four degrees ******************************************************************************* Top CPU Top Elps Top Sys Chg CPU Chg Elps CPU Type and notes ======= ======== ======= ======= ======== ================== 763.58 774 197.78 496.18 704 DEC VS3520 32mb VMS 5.5-2+VIP 218.58 296 19.44 144.85 194 IRIS 4d/20gt (IP6) 16mb R2000A@16Mhz 126.18 135 4.03 70.33 76 SUN 4/c SPARCstation 2 28mb 103.53 127 2.67 54.95 64 IBM RS-6000-320H 32mb (AIX 3.2.1) 81.78 124 4.13 54.33 82 ESV M120 R3000A@25Mhz 73.46 90 6.29 48.45 54 IRIS 4d/380 (IP7) 128mb R3000A@35Mhz 64.84 80 4.98 42.93 44 IRIS Indigo (IP12) 48mb R3000A@35Mhz 62.19 77 1.59 33.00 45 IBM RS-6000-550 64mb (AIX 3.2.1) 51.70 62 1.52 27.14 36 IBM RS-6000-560 96mb (AIX 3.2.1) ******************************************************************************* Benchmarks of the complete ACE series of 71 molecules at two degrees ******************************************************************************* Top CPU Top Elps Top Sys Chg CPU Chg Elps CPU Type and notes ======= ======== ======= ======= ======== ================== 5968.33 3833 219.00 5644.28 5850 DEC VS3520 32mb VMS 5.5-2+VIP 1715.39 1852 26.14 1623.20 1728 IRIS 4d/20gt (IP6) 16mb R2000A@16Mhz 740.76 756 5.03 674.48 690 SUN 4/c SPARCstation 2 28mb 627.41 679 4.96 592.22 613 ESV M120 R3000A@25Mhz 632.19 653 3.13 574.15 586 IBM RS-6000-320H 32mb (AIX 3.2.1) 522.21 543 9.88 489.59 493 IRIS 4d/380S (IP7) 128mb 3/8 CPUs 485.48 544 9.75 455.54 492 IRIS Indigo (IP12) 48mb R3000A@35Mhz 383.34 402 1.98 348.64 357 IBM RS-6000-550 64mb (AIX 3.2.1) 319.20 332 1.48 290.15 294 IBM RS-6000-560 96mb (AIX 3.2.1) ******************************************************************************* Notes: "Top CPU" is the recorded total CPU for all forked processes, including overhead and I/O. "Top Elps" is the recorded elapsed time from command line parsing to the exit of the last child processes. "Top Sys" is the recorded UNIX system overhead from the times() function "Chg CPU" is the algorithmic chargable time from initial rotations to termination (child process CPU) "Chg Elps" is the elapsed time spent rotating and computing. ***************************************************************************** ACE molecule series after minimizing with a current minimizer. (No religous flames please). Scan Total Total System Child Child Points / Used Elapsed CPU Overhead Elapsed CPU Groups ==== ======= ===== ======== ======= ===== ======== 6 37 20.23 10.6 12 10.33 1/1 4 53 35.16 11.6 32 24.32 1/1 3 165 139.62 18.0 136 121.91 3/2 2 458 427.21 20.5 428 408.08 3/2 2+ 562 523.19 26.8 529 505.55 5/2 Results are for a 0.2A uniform grid size, 5 dimensions, no grid offset. SGI IRIS Indigo @35Mhz, 48MB, local disk. Absolute orientation, no angle offset, initial optimal order. All 71 ACE molecules were minimized with MAXIMIN, Conjugate Gradient method, 100 steps maximum. The search converged after 6 molecules. Running 1 degree uniform scanning did not change the solution. The previous set of runs were performed with the original dataset, which was minimized by Sybyl 3.2. The "2+" run used a search core epsilon of 0.05A, which resulted in scan factors of 2.75 down to 0.86 degrees being applied. The two clusters discovered by this experiment have a totally different geometry from the previous studies (gee... minimizers changed since 1980!). The first molecule was three rotatable bonds, 6.292e+06 maximum conformations. The largest molecule was nine rotatable bonds, 3.172e+19 maximum conformations. Identical numeric results were obtained on IBM RS/6000 (320H, 550, 560), SUN-4 (690/MP, SPARC2), SGI IRIS (4d/380, Indigo, 4d/20gt), DEC VAX (3520 under POSIX 1.1), E&S ESV (M120). VDW scale factors 0.95, 0.87, 0.65. Random subsets of the 71 molecule series also produced the same two clusters. Monte-carlo selected conformations from the five point result set will be analyzed with QSAR studies shortly to determine if this new formulation yeilds better predictive results than the previous studies. Questions or comments to "berkley@wubs.wustl.edu" From DSMITH@uoft02.utoledo.edu Fri Oct 12 05:15:15 1992 Date: 12 Oct 1992 09:15:15 -0400 (EDT) From: "DR. DOUGLAS A. SMITH, UNIVERSITY OF TOLEDO" Subject: Re: Potential Surfaces with MOPAC To: harry@phys-chemie.uni-wuerzburg.dbp.de >From: IN%"harry@phys-chemie.uni-wuerzburg.dbp.de" "Harald Lanig" 9-OCT-1992 23:36:44.97 > >I am working on 1,1' coupled isoquionlines. As you can imagine, these >molecules can rotate about the central bond from one minima (the two >rings are almost perpendicular) over a transttion state to the 2nd. >minimun, wich seems to be a mirror image of the 1st. one. > >I have calculated all three conformations (using MOPAC 6.0) and my idea >is to show this isomerisation by a 3-dim potential surface. This is >possiple using the GRID function of MOPAC, here I can define two angles etc. >to scan. If I am using a dihedral angle describing the position of the >two rings to each other and another angle (a-b-c) between the rings, I always >will get a jump in the energy, because the rings are hindering themselve >and, if the strain is great enough, the will snap! > >Also a one-dim reaction-path using only the dihedral between the rings will >show this snap. > >Can anybody tell me how to choose the right parameters to show this >reaction without jumps. Or is this the wrong way visualizing this motion? > I am not sure why your system will "snap", but that is not an inherent problem of the molecule, it is more likely a problem in your setup (e.g. some internal coordinates held constant that should be variables) or inherent to MOPAC (IMHO - please don't take offense, but without seeing your input deck these are just suggestions). If the two minima on either side of the TS are truly mirror images, then the potential surface is most likely symmetrical and you can run your grid from TS to one minima, then fold it over to get the rest of the surface. This might avoid the problem you mention. Also, whether you start from the TS or the minima may also change the occurance of this "snap." I also need to warn you that the rotational barrier is probably going to be incorrect and the shape of the potential surface (or the 2D slice through it generated from a 1D reaction path) is likely to be significantly different from an ab initio result. We have seen this in our own work using polyenes and it has been reported for ethylene glycol and similar systems containing two electronegative groups separated by only two carbons (see Fagerburg, J. Comp. Chem. 1991, 12, 742-5). Doug Smith Assistant Professor of Chemistry The University of Toledo Toledo, OH 43606-3390 voice 419-537-2116 fax 419-537-4033 email dsmith@uoft02.utoledo.edu From d3g359@rahman.pnl.gov Mon Oct 12 03:52:08 1992 Date: Mon, 12 Oct 92 10:52:08 -0700 From: d3g359@rahman.pnl.gov Subject: Comments of electrostatics in protein MD To: CHEMISTRY@ccl.net Some comments in repsonse to Arne Elofsson >Dear Netters >In Biochemistry:31 5856-60 writes >Schreiber and Steinhauser that : >"Cutoff size does strongly influence MD REsults on solvated >polupeptides" Not at all surprising!! >They basically simulates a small helix forming peptide with >a cutoff at 6, 10 and 14 A nonbonded cutoff and compares it >with a ewald summation technique. The strange results is that >the 10 A cut off is most similar (in structure) to the ewald >summation simulation. This would mean that a 14 A cutoff is too >short.!! 1. Why does the better agreement between 10 A and the Ewald imply that the 14A cutoff is too short? The Ewald corresponds to the infinite summation if done correctly. Most cutoffs are imposed to make the calculation tractable, not to give the most accurate electrostatic representation. 2. Why to you assume that the Ewald is the 'right' answer? The Ewald imposes an assumed periodicity in the structure of the system that is not really there, especially in a protein, so the finite cutoff might be a more realisitic representation if this implied periodicity has some peculiar effects on the simulation. In other words, a simulation with periodic boundary conditions, a finite cutoff, and a large enough box to minimize boundary affects might be 'better' than the Ewald. 3. Any point charge model (as I assume mosy everybody that is doing protein dynamics is using) is so poor (does not correspond to reality!) that it is difficult to say which functional form (finite cutoff, Ewald, reaction field, etc.) gives that best representation of an already flawed model of the electrostatics of the system. There is usually no account made of polarization, three-body terms, etc. in a protein simulation. Maybe limitations in the functional representation make up for the crudeness of the model? (Cancellation of errors?) 4. I have found very large differences in electrostatic behavior depending on the magnitudes of charges and the form of the electrostatic potential energy function. 5. The real problem is, what do you compare to? How do you decide which simulation is better? If you are working with solids, the reproduction of optical properties might give you some indication if the electrostatics are right. In liquids, comparison with experimental dielectric constants and structure (RDF, neutron functions, etc.) might also be used to verify the accuracy of the simulation. What data is available on solvated proteins to verify the correct electrostatic model? John Nicholas Pacific Nortwest Laboratory Richland, WA 99352 jb_nicholas@pnl.gov From states@wucs1.wustl.edu Mon Oct 12 12:32:21 1992 Date: Mon, 12 Oct 92 17:32:21 -0500 To: d3g359@rahman.pnl.gov From: states@wucs1.wustl.edu Subject: Re: Comments of electrostatics in protein MD >2. Why to you assume that the Ewald is the 'right' answer? The Ewald >imposes an assumed periodicity in the structure of the system that >is not really there, especially in a protein, >so the finite cutoff might be a more realisitic >representation if this implied periodicity has some peculiar effects >on the simulation. In other words, a simulation with periodic >boundary conditions, a finite cutoff, and a large enough box to >minimize boundary affects might be 'better' than the Ewald. Many of the chemical properties of proteins achieve near physiologic values at quite low hydrations (30-50% water by mass) so why not simply use a protein in a water droplet? It is a system which can be simulated exactly and is not horrendously expensive to compute. >3. Any point charge model (as I assume mosy everybody that is doing >protein dynamics is using) is so poor (does not correspond to reality!) >that it is difficult to say which functional form (finite >cutoff, Ewald, reaction field, etc.) gives that best representation of an >already flawed model of the electrostatics of the system. The multipole expansion around an atom converges quite quickly so the point charge model is not the worst approximation. Errors in estimating the magnitude of the charge are much worse by comparison. >There is usually no account made of polarization, three-body terms, etc. >in a protein simulation. Correct, and modeling a hydrogen bond as a charge or dipole induced dipole interaction can actually be a pretty good approximation. >Maybe limitations in the functional representation make >up for the crudeness of the model? (Cancellation of errors?) The black art of parameterization. >4. I have found very large differences in electrostatic behavior >depending on the magnitudes of charges and the form of the >electrostatic potential energy function. > >5. The real problem is, what do you compare to? How do you decide >which simulation is better? If you are working >with solids, the reproduction of optical properties might give you >some indication if the electrostatics are right. Neglect of atomic polarizability gives a factor of two uncertainty in most condensed matter calculations. Charges derived from in vacuo quantum calculations are applied in models that do not include any bulk polarization term which would translate to a bulk dielectric of roughly 2. >In liquids, comparison >with experimental dielectric constants and structure (RDF, neutron >functions, etc.) might also be used to verify the accuracy of the >simulation. What data is available on solvated proteins to verify >the correct electrostatic model? - ionization potential for titratable groups - redox potentials in some systems (eg. disulfides, cytochrome hemes, etc.) - pK shifts in amide proton exchange rates - NMR chemical shifts (sensitive to electric field rather than potential and confounded by other factors) Collecting the reference data is a pain in the neck but entirely feasible. It would help to address the major flaw in current potentials. >John Nicholas >jb_nicholas@pnl.gov David States Institute for Biomedical Computing / Washington University in St. Louis