From ross@cgl.ucsf.EDU  Sun Oct 11 18:13:53 1992
Date: Mon, 12 Oct 92 01:13:53 -0700
From: ross@cgl.ucsf.edu (Bill Ross)
To: chemistry@ccl.net
Subject: Water packing summary


Thanks to all who responded.

I asked about ways of solvating a molecule other than superimposing
a solvent template and subtracting overlapping solvent. Some of the 
replies excerpted below were sent to the reflector.

The immediate reply from Thomas Pierce, thpierce@rohmhaas.com gives
the authoritative reference and a good summary:
> During the development of modeling work at Rohm and Haas, Dr. Mario Blanco
> developed a suite of methods and published them in an article titled,
> "Molecular Silverware. I. General Solutions to Excluded Volume Constrained 
> Problems", J. Computational Chemistry, Vol. 12, #2, 237-247 (1991).
> One of the methods works well for solvation problems. Basically this method
> packs a molecule around another with rotations and reorientations until 
> the smallest radius of gyration is achieved. Then the method adds another 
> molecule (solvent) and packs it, etc. The software is available from  Tripos.

(The Blanco work was also also mentioned by Terry Stouch (STOUCH@dino.bms.com) 
and Chris Van Dyke (tripos!capella!vandyke@uunet.UU.NET.)

jas@medinah.atc.ucarb.com (Jack Smith) adds:
>    I believe that Mario Blanco now works at Molecular Simulations
> (Formerly Polygen, Molecular Simulations, Cambridge Molecular
> Design,...) and the Silverware software is also, or soon will be, 
> part of their PolyGraf program.  The amorphous growth methods
> used in some of the commercial polymer modeling packages
> seem like a reasonable way to approach the solvent packing issues,
> especially for complex solvents or binary mixtures in general.

And more detail on the Tripos solutions (including Blanco's) from 
Malcolm A. Cline <tripos!thebe!mac@uunet.UU.NET>
> The SYBYL software, available from Tripos Associates, contains three
> different algorithms for solvation.  The first is the method you
> mentioned, dumping a solute into a precomputed solvent box and getting
> rid of overlaps.
> 
> The second is a piece of software called Molecular Silverware,
> developed at Rohm & Haas.
> 
> Molecular Silverware copyright (c) 1990 Rohm and Haas Company, All Rights
> Reserved.
> 
> The molecular silverware algorithm is used to rapidly solvate a molecule
> with any small molecule solvent. The solute molecule may be surrounded with
> solvents to form a droplet cluster, or may be used to pack the solute with
> solvents in a periodic box. 
>  
> The use of SILVERWARE has several advantages over precomputed boxes:
> First, any solvent molecule may be used. The resultant energies are low
> enough so that Dynamics can normally be used after only a brief
> minimization. 
> Second, solvent molecules will be packed into any cavities that may be
> present in the solute. This feature results in fast equilibration of the
> system when using Dynamics. 
> Third, when solvating in a periodic box, the periodic nature of the box is
> used when packing the solvents, avoiding the problem of high-energy "edge
> effects" upon use of Molecular Mechanics or Dynamics. 
> 
> 
> The third method is called XFIT, which is an algorithm for finding best
> fit between a cluster and a solvent molecule.   This program can also
> be used for very high-quality solvation of a target molecule, with or
> without periodic boundary conditions.   XFIT will in general provide
> better results than SILVERWARE, but require more time.  XFIT is similar
> to SILVERWARE but checks more thoroughly than SILVERWARE for the best
> solvent/cluster fit as it adds each new solvent molecule to the cluster.   
> 

I believe the Blanco method is O(N^2) in the number of solvent 
molecules (no analysis is performed in the paper), as is any 
solution that looks at the whole system accumulated so far before 
adding another element.  Compared to the times that can be spent 
running dynamics, a one-time N^2 cost is probably negligible - Blanco 
mentions 5 hours on a Vax 8650 for packing 1000 chlorofluorocarbon 
molecules (15,000 atoms).  This might take 2 hours on a current high-end 
workstation. Perhaps the algorithm could be improved.

A more obviously linear approach is the grid-based method described by 
G. Ravishanker <ravishan@swan.wesleyan.edu>

> 	1. Create a 3-dimensional grid out of the simulation box, whose
> size is the diameter of water. 
> 
> 	2. Start placing waters in random orientations in those grids
> where you do not have a solute atom (or more correctly those grids which
> have a solute atom within a threshold of .5 A or so are exculed).
>
[here is where linearity is lost:]
> 
> 	3. Do a long Monte Carlo simulation on water (and ions if you have
> them). 
> 
> 	4. Start the MD by slowly heating the system to 300K over 10 ps or
> so. Equilibrate and then run production.
>... 
> What we planned to do from now on is basically take equilibrated boxes
> from the strategy above from several molecules that we have studied and
> replace the solute and pad any extra waters and remove the contact waters
> and run a short MC. Since the water environment is not disturbed to a
> great extent in the bulk of box, the MC length for convergence will be
> much less than what you need by the strategy mentioned above. We strongly
> believe that doing MC to get the waters in position pays off in the long
> run in producing stable MD runs.

Scott Flicker <flicker@engin.umich.edu> also suggested Monte Carlo methods 
for random adsorption analogous to studies of 2D surface adsorption by R.F. 
Ziff et.al.  This might have the effect of combining the grid placement
scheme described by Ravishankar with the subsequent Monte Carlo phase.

Another method mentioned by Blanco in his paper (Theodorou & Suter, 
Macromolecules, 18, 1206 (1985)) involves resolving an initial structure 
generated by superimposition using scaled-down vdw radii at first and 
scaling them to normal while equilibrating.

The ideal method for my own everyday purposes would give similar results
to Blanco's method but be linear in performance. Failing a speedup of
Blanco's algorithm, an alternative (for water at least) might be 
some form of irregular grid generation algorithm producing a grid
whose points would be water oxygens which would fit the solute at 
close range and scale to fit a Monte Carlo bath as distance increased. 
This idea is analogous to the one proposed by Ravishankar above, except 
that the explicit Monte Carlo baths would be replaced by randomized packing 
rules. I am not familiar with work on grids so cannot tell if this idea
even makes sense, let alone estimate the cost. If it worked, initial
orientation of waters would be decided in a second pass by working outward
from the solute and considering inner neighbors. Any grid-generation
experts care to comment?

Bill Ross

From arne@mango.mef.ki.se  Mon Oct 12 11:59:35 1992
Date: Mon, 12 Oct 92 10:59:35 +0100
From: arne@mango.mef.ki.se (Arne Elofsson)
To: CHEMISTRY@ccl.net
Subject: Cut_off distances. In Protein MD


Dear Netters
In Biochemistry:31 5856-60 writes
Schreiber and Steinhauser that :
"Cutoff size does strongly influence MD REsults on solvated
polupeptides"
They basically simulates a small helix forming peptide with
a cutoff at 6, 10 and 14 A nonbonded cutoff and compares it
with a ewald summation technique. The strange results is that 
the 10 A cut off is most similar (in structure) to the ewald
summation simulation. This would mean that a 14 A cutoff is too
short.!!

There are several questions that can be asked about the simulation
used (a lot of MD information is missing) but I would like to
not discuss that here. (It is just doubts it can't really 
prove thet the simulations are wrong). I have two suggestions/
questions:

1. What earlier studies, on polypeptides, have implemented
the use of a cutoff of 8 to 12 Angstrom ? Is it really just
depending on used Box sizes or ?

2. Let's assume that there is something really strange about
these simulations. What can we then do about it ? I have
a suggestion that involves a collaboration all over the world.
If every one spend some computing time on the same system,
i.e. the same peptide but different forcefileds, box sizes ..
Would we not be able to together solve this problem that is 
important to all of ous. For practical reasons maybe it is
easier to limitate this kind of study to a few labs (five to ten).
If there is an interest in this kind of study I would
consider spending some time organize contacts. I think this 
list is the best forum to discuss what should be studied and how.

arne Elofsson
arne@mango.mef.ki.se

From arne@mango.mef.ki.se  Mon Oct 12 12:08:24 1992
Date: Mon, 12 Oct 92 11:08:24 +0100
From: arne@mango.mef.ki.se (Arne Elofsson)
To: CHEMISTRY@ccl.net
Subject: Long time motion in proteins.


Dear Netters.
I search for MD/MC protocols/parameters for
simplified descriptions of motion. I am aware
of the work done by Skolnick and all "protein
folding" work, but is there any projects
to study long time motions in proteins ??
sincerely yours
arne@mango.mef.ki.se

From bewilson@Kodak.COM  Mon Oct 12 04:09:23 1992
Date: Mon, 12 Oct 92 08:09:23 -0400
From: bewilson@Kodak.COM (Bruce E. Wilson, ECCR-PA, B95-A, X8886 (bewilson@kodak.com))
To: "chemistry@ccl.net"@Kodak.COM
Subject: Potential surface calculations


Harald Lanig (harry@phys-chemie.uni-wuerzburg.dbp.de) asks
about using GRID in Mopac to get at what I interpret as essentialy
a phi-psi plot.  I'm going to let my ignorance show here, but I thought
that the GRID and reaction coordinate functions did not minimize the rest
of the molecule, which makes the interpretation of the resulting calculations
a bit problematic.

Anyone care to confirm or correct this impression?  Thanks:

(Harry--Ihre English ist ganz besser als meine Deutsch!  Ich wuerde nicht
wissen, dass English Ihre Muttersprache nicht ist).


		Bruce Wilson (bewilson@kodak.com)

(Disclaimer: Any opinions expressed are solely those of the author.)


From PPANETH1@PLEARN.bitnet  Mon Oct 12 09:14:39 1992
Date:         Mon, 12 Oct 92 14:02:48 CET
From: Piotr <PPANETH1%PLEARN@OHSTVMA.ACS.OHIO-STATE.EDU>
Subject:      E-Conference
To: CC-List <chemistry@ccl.net>


Hi Everybody:
I was out for a couple of days and I had to browse through the CCL postings
afterwards. One major reflection I have now is that our discussions, apart
from garbage mail, are restricted to: 1. problems with programs, 2. searches
for (free) software. Does it mean that nobody has any SUCCESS with computatio-
nal chemistry? If the List survives voting, how about 'organizing' an
E-conference, where we'd present something that DID work?
To prevent dumping of lots of work on Jan, I thought we could decide on one
particular week and at the same time post 'posters' on our work. This will
give us all the idea what other people are working on, what else can be done in
our own field ect. We could also do it in 'sections' like; code devepolment,
applications, commercial (?).
Please, respond directly to me - there is enough of noncomputational traffic
already on the List. If there is resonable interest in the idea I'll pass the
information to Jan.
Piotr

========================================================================
Dr. hab. Piotr Paneth
Institute of Applied Radiation Chemistry
Technical University of Lodz
Zwirki 36, 90-924 Lodz, POLAND
                                      voice: (+48-42) 31-31-93
                                        fax: (+48-42) 36-02-46
                                      telex: 885452 ITR PL
e-mail
       PPANETH1 \ PLEARN.bitnet        (CMS)
       PPANETH1 \ zsku.p.lod.edu.pl    (Unix)
       PPANETH1 \ LODZ1.DECnet         (VMS)
========================================================================

From george@archimedes.cray.com  Mon Oct 12 03:02:04 1992
Date: Mon, 12 Oct 92 08:02:04 CDT
From: george@archimedes.cray.com (George Fitzgerald)
To: chemistry@ccl.net
Subject: Analytic energy 2nd deriv for ECP


Netters, 

The information on GRADSCF and Polyatomics Research recently posted to the
net was correct except for the e-mail address.  The correct address for 
Dr. Andrew Komornicki is

c0394@rhea.cray.com


-george fitzgerald


From berkley@wubs.wustl.edu  Mon Oct 12 03:39:16 1992
Date: Mon, 12 Oct 92 08:39:16 -0500
From: berkley@wubs.wustl.edu (Berkley Shands)
To: chemistry@ccl.net
Subject: ACE with a twist


Receptor (TM - Tripos Associates) Beta version V2.3 timings for the ACE
series at four and two degree scans. Original configuration of molecules.

*******************************************************************************
Benchmarks of the complete ACE series of 71 molecules at four degrees
*******************************************************************************

Top CPU	 Top Elps Top Sys Chg CPU Chg Elps CPU Type and notes
=======  ======== ======= ======= ======== ==================
 763.58    774    197.78  496.18    704     DEC VS3520 32mb VMS 5.5-2+VIP
 218.58    296     19.44  144.85    194     IRIS 4d/20gt (IP6) 16mb R2000A@16Mhz

 126.18    135      4.03   70.33     76     SUN 4/c SPARCstation 2 28mb
 103.53    127	    2.67   54.95     64     IBM RS-6000-320H 32mb (AIX 3.2.1)
  81.78    124      4.13   54.33     82     ESV M120 R3000A@25Mhz
 
  73.46     90      6.29   48.45     54     IRIS 4d/380 (IP7) 128mb R3000A@35Mhz
  64.84     80	    4.98   42.93     44     IRIS Indigo (IP12) 48mb R3000A@35Mhz
  62.19     77      1.59   33.00     45     IBM RS-6000-550 64mb (AIX 3.2.1)
  51.70     62      1.52   27.14     36     IBM RS-6000-560 96mb (AIX 3.2.1)

*******************************************************************************
Benchmarks of the complete ACE series of 71 molecules at two degrees
*******************************************************************************

Top CPU	 Top Elps Top Sys Chg CPU Chg Elps CPU Type and notes
=======  ======== ======= ======= ======== ==================
5968.33   3833    219.00 5644.28   5850     DEC VS3520 32mb VMS 5.5-2+VIP
1715.39   1852     26.14 1623.20   1728     IRIS 4d/20gt (IP6) 16mb R2000A@16Mhz

 740.76    756      5.03  674.48    690     SUN 4/c SPARCstation 2 28mb
 627.41    679      4.96  592.22    613     ESV M120 R3000A@25Mhz
 632.19    653	    3.13  574.15    586     IBM RS-6000-320H 32mb (AIX 3.2.1)

 522.21    543	    9.88  489.59    493     IRIS 4d/380S (IP7) 128mb 3/8 CPUs
 485.48    544	    9.75  455.54    492     IRIS Indigo (IP12) 48mb R3000A@35Mhz
 383.34    402      1.98  348.64    357     IBM RS-6000-550 64mb (AIX 3.2.1)
 319.20    332      1.48  290.15    294     IBM RS-6000-560 96mb (AIX 3.2.1)

*******************************************************************************

Notes:

"Top CPU" is the recorded total CPU for all forked processes, including
	  overhead and I/O.
"Top Elps" is the recorded elapsed time from command line parsing to
	  the exit of the last child processes.
"Top Sys" is the recorded UNIX system overhead from the times() function
"Chg CPU" is the algorithmic chargable time from initial rotations
	  to termination (child process CPU)
"Chg Elps" is the elapsed time spent rotating and computing.

*****************************************************************************

ACE molecule series after minimizing with a current
minimizer. (No religous flames please).


Scan  Total  Total  System   Child  Child Points /
Used Elapsed  CPU  Overhead Elapsed  CPU  Groups
==== ======= ===== ======== ======= ===== ========

 6    37     20.23   10.6     12    10.33  1/1
 4    53     35.16   11.6     32    24.32  1/1
 3   165    139.62   18.0    136   121.91  3/2
 2   458    427.21   20.5    428   408.08  3/2
 2+  562    523.19   26.8    529   505.55  5/2

Results are for a 0.2A uniform grid size, 5 dimensions, no
grid offset. SGI IRIS Indigo @35Mhz, 48MB, local disk.
Absolute orientation, no angle offset, initial optimal order.
All 71 ACE molecules were minimized with MAXIMIN, Conjugate
Gradient method, 100 steps maximum. The search converged
after 6 molecules. Running 1 degree uniform scanning did
not change the solution. The previous set of runs were performed
with the original dataset, which was minimized by Sybyl 3.2.
The "2+" run used a search core epsilon of 0.05A, which resulted
in scan factors of 2.75 down to 0.86 degrees being applied.
The two clusters discovered by this experiment have a totally
different geometry from the previous studies (gee...
minimizers changed since 1980!). The first molecule
was three rotatable bonds, 6.292e+06 maximum conformations.
The largest molecule was nine rotatable bonds, 3.172e+19 maximum
conformations. Identical numeric results were obtained on
IBM RS/6000 (320H, 550, 560), SUN-4 (690/MP, SPARC2),
SGI IRIS (4d/380, Indigo, 4d/20gt), DEC VAX (3520 under POSIX 1.1),
E&S ESV (M120). VDW scale factors 0.95, 0.87, 0.65.

Random subsets of the 71 molecule series also produced the same
two clusters. Monte-carlo selected conformations from the
five point result set will be analyzed with QSAR studies shortly
to determine if this new formulation yeilds better predictive
results than the previous studies.


Questions or comments to "berkley@wubs.wustl.edu"

From DSMITH@uoft02.utoledo.edu  Fri Oct 12 05:15:15 1992
Date: 12 Oct 1992 09:15:15 -0400 (EDT)
From: "DR. DOUGLAS A. SMITH, UNIVERSITY OF TOLEDO" <DSMITH@uoft02.utoledo.edu>
Subject: Re: Potential Surfaces with MOPAC
To: harry@phys-chemie.uni-wuerzburg.dbp.de


>From:	IN%"harry@phys-chemie.uni-wuerzburg.dbp.de"  "Harald Lanig"  9-OCT-1992 23:36:44.97
>
>I am working on 1,1' coupled isoquionlines. As you can imagine, these
>molecules can rotate about the central bond from one minima (the two
>rings are almost perpendicular) over a transttion state to the 2nd.
>minimun, wich seems to be a mirror image of the 1st. one.
>
>I have calculated all three conformations (using MOPAC 6.0) and my idea
>is to show this isomerisation by a 3-dim potential surface. This is
>possiple using the GRID function of MOPAC, here I can define two angles etc.
>to scan. If I am using a dihedral angle describing the position of the
>two rings to each other and another angle (a-b-c) between the rings, I always
>will get a jump in the energy, because the rings are hindering themselve
>and, if the strain is great enough, the will snap!
>
>Also a one-dim reaction-path using only the dihedral between the rings will
>show this snap.
>
>Can anybody tell me how to choose the right parameters to show this 
>reaction without jumps. Or is this the wrong way visualizing this motion?
>

I am not sure why your system will "snap", but that is not an inherent 
problem of the molecule, it is more likely a problem in your setup (e.g.
some internal coordinates held constant that should be variables) or inherent
to MOPAC (IMHO - please don't take offense, but without seeing your input
deck these are just suggestions).

If the two minima on either side of the TS are truly mirror images, then
the potential surface is most likely symmetrical and you can run your grid
from TS to one minima, then fold it over to get the rest of the surface.
This might avoid the problem you mention.  Also, whether you start from the
TS or the minima may also change the occurance of this "snap."

I also need to warn you that the rotational barrier is probably going to be
incorrect and the shape of the potential surface (or the 2D slice through
it generated from a 1D reaction path) is likely to be significantly different
from an ab initio result.  We have seen this in our own work using polyenes
and it has been reported for ethylene glycol and similar systems containing
two electronegative groups separated by only two carbons (see Fagerburg, 
J. Comp. Chem. 1991, 12, 742-5).

Doug Smith
Assistant Professor of Chemistry
The University of Toledo
Toledo, OH  43606-3390

voice    419-537-2116
fax      419-537-4033
email    dsmith@uoft02.utoledo.edu


From d3g359@rahman.pnl.gov  Mon Oct 12 03:52:08 1992
Date: Mon, 12 Oct 92 10:52:08 -0700
From: d3g359@rahman.pnl.gov
Subject: Comments of electrostatics in protein MD
To: CHEMISTRY@ccl.net


Some comments in repsonse to Arne Elofsson

>Dear Netters
>In Biochemistry:31 5856-60 writes
>Schreiber and Steinhauser that :
>"Cutoff size does strongly influence MD REsults on solvated
>polupeptides"

Not at all surprising!!

>They basically simulates a small helix forming peptide with
>a cutoff at 6, 10 and 14 A nonbonded cutoff and compares it
>with a ewald summation technique. The strange results is that
>the 10 A cut off is most similar (in structure) to the ewald
>summation simulation. This would mean that a 14 A cutoff is too
>short.!!


1. Why does the better agreement between 10 A and the Ewald imply
that the 14A cutoff is too short? The Ewald corresponds to the
infinite summation if done correctly.
Most cutoffs are imposed to make the calculation tractable, not
to give the most accurate electrostatic representation.

2. Why to you assume that the Ewald is the 'right' answer? The Ewald
imposes an assumed periodicity in the structure of the system that
is not really there, especially in a protein,
so the finite cutoff might be a more realisitic
representation if this implied periodicity has some peculiar effects
on the simulation. In other words, a simulation with periodic
boundary conditions, a finite cutoff, and a large enough box to
minimize boundary affects might be 'better' than the Ewald.

3. Any point charge model (as I assume mosy everybody that is doing
protein dynamics is using) is so poor (does not correspond to reality!)
that it is difficult to say which functional form (finite
cutoff, Ewald, reaction field, etc.) gives that best representation of an
already flawed model of the electrostatics of the system. There is usually
no account made of polarization, three-body terms, etc. in a protein
simulation. Maybe limitations in the functional representation make 
up for the crudeness of the model? (Cancellation of errors?)  

4. I have found very large differences in electrostatic behavior
depending on the magnitudes of charges and the form of the 
electrostatic potential energy function. 

5. The real problem is, what do you compare to? How do you decide
which simulation is better? If you are working
with solids, the reproduction of optical properties might give you
some indication if the electrostatics are right. In liquids, comparison
with experimental dielectric constants and structure (RDF, neutron
functions, etc.) might also be used to verify the accuracy of the
simulation.  What data is available on solvated proteins to verify 
the correct electrostatic model?


John Nicholas
Pacific Nortwest Laboratory
Richland, WA 99352
jb_nicholas@pnl.gov


From states@wucs1.wustl.edu  Mon Oct 12 12:32:21 1992
Date: Mon, 12 Oct 92 17:32:21 -0500
To: d3g359@rahman.pnl.gov
From: states@wucs1.wustl.edu
Subject: Re: Comments of electrostatics in protein MD


>2. Why to you assume that the Ewald is the 'right' answer? The Ewald
>imposes an assumed periodicity in the structure of the system that
>is not really there, especially in a protein,
>so the finite cutoff might be a more realisitic
>representation if this implied periodicity has some peculiar effects
>on the simulation. In other words, a simulation with periodic
>boundary conditions, a finite cutoff, and a large enough box to
>minimize boundary affects might be 'better' than the Ewald.

Many of the chemical properties of proteins achieve near
physiologic values at quite low hydrations (30-50% water by mass) so
why not simply use a protein in a water droplet?  It is a system
which can be simulated exactly and is not horrendously expensive
to compute.

>3. Any point charge model (as I assume mosy everybody that is doing
>protein dynamics is using) is so poor (does not correspond to reality!)
>that it is difficult to say which functional form (finite
>cutoff, Ewald, reaction field, etc.) gives that best representation of an
>already flawed model of the electrostatics of the system. 

The multipole expansion around an atom converges quite quickly
so the point charge model is not the worst approximation.  Errors in
estimating the magnitude of the charge are much worse by comparison.

>There is usually no account made of polarization, three-body terms, etc. 
>in a protein simulation. 

Correct, and modeling a hydrogen bond as a charge or dipole induced
dipole interaction can actually be a pretty good approximation.

>Maybe limitations in the functional representation make 
>up for the crudeness of the model? (Cancellation of errors?)  

The black art of parameterization.
 
>4. I have found very large differences in electrostatic behavior
>depending on the magnitudes of charges and the form of the 
>electrostatic potential energy function. 
>
>5. The real problem is, what do you compare to? How do you decide
>which simulation is better? If you are working
>with solids, the reproduction of optical properties might give you
>some indication if the electrostatics are right. 

Neglect of atomic polarizability gives a factor of two uncertainty in
most condensed matter calculations.  Charges derived from in vacuo
quantum calculations are applied in models that do not include any
bulk polarization term which would translate to a bulk dielectric of
roughly 2.

>In liquids, comparison
>with experimental dielectric constants and structure (RDF, neutron
>functions, etc.) might also be used to verify the accuracy of the
>simulation.  What data is available on solvated proteins to verify 
>the correct electrostatic model?

- ionization potential for titratable groups
- redox potentials in some systems (eg. disulfides, cytochrome hemes, etc.)
- pK shifts in amide proton exchange rates
- NMR chemical shifts (sensitive to electric field rather than potential
        and confounded by other factors)

Collecting the reference data is a pain in the neck but entirely
feasible.  It would help to address the major flaw in current potentials.

>John Nicholas
>jb_nicholas@pnl.gov

David States
Institute for Biomedical Computing / Washington University in St. Louis