From owner-chemistry@ccl.net Tue Feb 23 09:34:01 2010
From: "Wolf-D. Ihlenfeldt wdi#%#xemistry.com"
To: CCL
Subject: CCL: AW: AW: can you recommend software to compare chemical databases and exclude duplicates pls
Message-Id: <-41320-100223085939-8856-a2w7CLyZGNSYvIJ4aplLeg() server.ccl.net>
X-Original-From: "Wolf-D. Ihlenfeldt"
Content-Language: de
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="iso-8859-1"
Date: Tue, 23 Feb 2010 14:08:40 +0100
MIME-Version: 1.0

Sent to CCL by: "Wolf-D. Ihlenfeldt" [wdi||xemistry.com]

> Hi Wolf,
>
> It is clear that the approach I am proposing is just a gross outline,
> and there might be some better-suited variants. The first of those might
> be to go for ScreeningAssistant. Other possibilities: simply add the
> InChI key to the MDL tags (or the name field, in addition to the
> supplier's name) and then do tag-based processing, or use canonicalized
> SMILES instead of InChI.
>
> But for the moment, would you please share your personal experience by
> documenting your criticisms precisely.
> Reproducible examples would be very constructive as they might:
> - prove I was, with some colleagues, incredibly lucky in our past use of
>   some such "non viable" approaches,
> - help the InChI and OpenBabel developers improve their code,
> - warn users of the possible pitfalls with these tools.
>
> In particular, I would be greatly interested in clear cases of issues
> with InChI decoding its own strings,

I have no example of complete failure, but this is the official IUPAC
statement:

http://www.iupac.org/web/ins/2004-039-1-800

  InChI reversal: InChI to structure conversion (connection table, bond
  orders, charges, stereochemical parities; the resultant structures have
  no coordinates; success rate on average 99.7%)

Examples where you get back a tautomer if you convert back and forth via an
InChI string, which in all likelihood is not the low-energy form and not
what you input, are not difficult to find.

Try for example

c1(nc2c(o1)cccc2C(=O)O)c4c(NC(c3c(O)cccn3)=O)c(ccc4)O

(obviously, do not use the non-standard "fixedh" flag for these
experiments)

> (Open)Babel failing despite input
> data complying to the file format specifications, and "blank
> coordinates" SDF files causing "problems" compared to "2D
> representations".

Well, that is a no-brainer. In SD files, double bond stereochemistry is tied
to atom coordinates, so without coordinates you lose all E/Z
stereochemistry. Wedges also make sense only in combination with
coordinates, so atom stereochemistry is also gone. Babel does not write
parities in the atom block, either (and even if it did, these are not to be
used any longer according to MDL guidelines), so not even this backup option
is present.

I do not think that any database preparation set-up which completely
obliterates stereochemistry can be considered state of the art.

>
> Thanks in advance
> VL
>
> On 21/02/10 12:59, Wolf-D. Ihlenfeldt wdi/./xemistry.com wrote:
> >
> > Sent to CCL by: "Wolf-D. Ihlenfeldt" [wdi[#]xemistry.com]
> >
> > Ahem - IMHO the outline below is really *NOT* a viable solution.
> >
> > Relying on the ability to decode InChI strings back into a graph is an
> > unsuitable approach. InChI was not designed for this.
> > The current
> > implementations of inverters fail on a significant number of strings, and
> > on others the results are often pretty weird tautomers if you have
> > mobile H.
> >
> > Also, last I checked OpenBabel did not generate 2D representations - a
> > regenerated SD file will have all-0 atomic coordinates, which can be a
> > problem later. And OpenBabel has in general a lot of problems with
> > stereochemistry - if you need stereochemistry in your project, my
> > personal advice is to stay away from it.
> >
> >> software to compare chemical databases
> >> and exclude duplicates pls
> >>
> >> Sent to CCL by: Vincent Leroux [vincent.leroux+*+loria.fr]
> >>
> >> Hi Andrew,
> >>
> >> There is ScreeningAssistant for such a task, and much more.
> >> http://hal.archives-ouvertes.fr/docs/00/07/97/12/PDF/monge_Molecular_Diversity_revised_2.pdf
> >> http://dx.doi.org/10.1007/s11030-006-9033-5
> >> http://dx.doi.org/10.2174/157340908785747410
> >>
> >> SA is really boxing in the heavyweight category of chemoinformatics
> >> tools so you might prefer to process data by yourself.
> >> My suggestions:
> >>
> >> - first "clean up" db1.sdf and db2.sdf: remove salts, hydrogens, etc.,
> >>   everything in the structure that might differ artificially from one
> >>   supplier to the other. (Open)Babel should do a good job here.
> >>
> >> - name all molecules in db1.sdf and db2.sdf using scripts. The name must
> >>   be on the 1st line in each molecule record and should not contain
> >>   spaces. You want a combination of the supplier name and the molecule
> >>   supplier's unique ID (already in place if you are lucky, else found in
> >>   some MDL tag). Make sure the name does not contain spaces, and that it
> >>   is put in the correct position (1st line).
> >>
> >> - use the InChI software for generating db1.inchi and db2.inchi, and
> >>   make sure you get text files with 1 line per molecule formatted like:
> >>
> >>
> >> - If the molecules are named Supplier1/ and Supplier2/ , then use the
> >>   following commands (have not tested them, but should work)
> >>
> >> # -f 1 makes uniq skip the first field (the name) when comparing lines
> >> cat db[12].inchi | sort -k2 | uniq -f 1 | grep '^Supplier1/' > db1_unique.inchi
> >> cat db[12].inchi | sort -k2 | uniq -f 1 | grep '^Supplier2/' > db2_unique.inchi
> >> # -D (GNU uniq) prints every line of a duplicate group so awk can pair them
> >> cat db[12].inchi | sort -k2 | uniq -D -f 1 | awk '{l=$1; getline; print l"_aka_"$0}' > db12_common.inchi
> >>
> >> This will generate db1_unique.inchi, db2_unique.inchi and
> >> db12_common.inchi - self-explanatory.
> >>
> >> - use OpenBabel (or equivalent software) for converting your InChI files
> >>   to SDF format/2D representation, then Corina (or equivalent) if you want
> >>   3D coordinates, e.g. for docking.
> >>
> >> Regards,
> >> VL
> >>
> >> On 19/02/10 18:45, Andrew Voronkov drugdesign:yandex.ru wrote:
> >>>
> >>> Sent to CCL by: Andrew Voronkov [drugdesign%yandex.ru]
> >>> Dear CCL users,
> >>> can you please recommend free-for-academics software or scripts (or
> >>> software with a good evaluation period) for comparing databases? For
> >>> example, I want to download two publicly available databases and unite
> >>> them, excluding duplicates, into a separate database. So in this case I
> >>> will get one united database and one database containing the duplicate
> >>> compounds (but not duplicated). Can you please suggest such software?
> >>>
> >>> Sincerely yours,
> >>> Andrew

From owner-chemistry@ccl.net Tue Feb 23 11:14:00 2010
From: "Andrew Voronkov drugdesign|,|yandex.ru"
To: CCL
Subject: CCL: Free analogs of GRID software (Molecular Discovery LTD )
Message-Id: <-41321-100223100010-27614-QNSjPFHTOevCmTcyLC0sdQ],[server.ccl.net>
X-Original-From: Andrew Voronkov
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
Date: Tue, 23 Feb 2010 17:59:57 +0300
MIME-Version: 1.0

Sent to CCL by: Andrew Voronkov [drugdesign!A!yandex.ru]
Dear CCL users,
since this company offers no free or evaluation license for academic users,
I would like to ask whether there are any free analogs that can be used for
profiling binding sites with positively charged, negatively charged, and
hydrophobic probes.

Sincerely yours,
Andrew

From owner-chemistry@ccl.net Tue Feb 23 14:04:01 2010
From: "Nir London nir---rosettadesigngroup.com"
To: CCL
Subject: CCL: RosettaCon 2010
Message-Id: <-41322-100223135848-20058-kOzC4J0BcAaMiIYkXvMrLA()server.ccl.net>
X-Original-From: "Nir London"
Date: Tue, 23 Feb 2010 13:58:44 -0500

Sent to CCL by: "Nir London" [nir,+,rosettadesigngroup.com]
As in previous years, we would like to invite representatives of the pharma
and biotech industries to participate in RosettaCon 2010.

Take part in the workshops and seminars, and have an opportunity to meet the
Rosetta Commons researchers. Get to know Rosetta from the inside.
RosettaCon will take place during August 3-6 at the beautiful Sleeping Lady
Mountain Retreat in Leavenworth, Washington, about 2.5 hours outside
Seattle, in dramatic mountain scenery, really the only fitting environment
for a scientific discussion.

For more details:
http://rosettadesigngroup.com/blog/586/rosettacon-2010/

Nir London
Rosetta Design Group
http://rosettadesigngroup.com

From owner-chemistry@ccl.net Tue Feb 23 17:58:01 2010
From: "David Gallagher gallagher.da=-=gmail.com"
To: CCL
Subject: CCL: Turbomole User Group Mtg - update
Message-Id: <-41323-100223173204-31209-ed08AOL+UMvUd2a7tvtIdQ/a\server.ccl.net>
X-Original-From: David Gallagher
Content-Type: multipart/alternative; boundary="=====================_12411578==.ALT"
Date: Tue, 23 Feb 2010 13:23:33 -0800
Mime-Version: 1.0

Sent to CCL by: David Gallagher [gallagher.da#gmail.com]
--=====================_12411578==.ALT
Content-Type: text/plain; charset="us-ascii"; format=flowed

Turbomole User Group Mtg. updated information:

Venue:      Room 110, Moscone Center (ACS meeting)
Time:       3.30 - 4.30 PM, Tuesday, 23 March 2010

More info:  http://cacheresearch.com/aiche.html#training

===============================================
At 11:44 AM 1/28/2010, David Gallagher wrote:
>Turbomole User Group (TUG) Meeting
>
>         Preliminary announcement:
>
>Venue:         San Francisco, Moscone Center (ACS meeting)
>
>Date:          Tuesday 23rd March 2010 (early PM)
>
>Draft agenda:  New 6.1 Turbomole Release
>               Q&A session
>               User submitted presentations
>
>For who?:      Current users and anyone interested in
>               learning more about Turbomole
>
>Fees:          None (free)
>
>Registration:  Registration required (name, institution, country)
>               to dgallagher(_)CACheResearch.com
>
>Presenters:    Please submit your abstract
>               to dgallagher(_)CACheResearch.com
>               (presentations should be 5 to 15 minutes)
>
>Further details will be posted as and when available at
>http://cacheresearch.com/aiche.html#training
>
>
>David Gallagher
>CAChe Research
--=====================_12411578==.ALT--

From owner-chemistry@ccl.net Tue Feb 23 20:50:02 2010
From: "Cu Phung cphung^^methodist.edu"
To: CCL
Subject: CCL: POSCAR - Free Software
Message-Id: <-41324-100223133313-18018-Shx9bntbhqPTdUCEgjUKdg#server.ccl.net>
X-Original-From: "Cu Phung"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Date: Tue, 23 Feb 2010 12:51:29 -0500
Mime-Version: 1.0

Sent to CCL by: "Cu Phung" [cphung[A]methodist.edu]
Check this site for help:

Useful tools for VASP users
http://tfy.tkk.fi/~job/

CGP

Dr. Cu G. Phung
Methodist University
5400 Ramsey St.
Fayetteville, NC 28311
Phone: 910-630-7137

>>> "Kaliappan Muthukumar muthukumar2k3_+_gmail.com" 2/22/2010 11:50 AM >>>
Dear Folks,

I am trying to create VASP POSCAR files for a structure with more than 100
atoms. I want a few of these atoms to be frozen, and I am able to do that
manually. Since editing becomes cumbersome with more than 100 atoms
(especially if I need to work on multiple files), I would ask any of you to
let me know if you are aware of any '*free*' software that could help me
make VASP POSCAR files (with the option to freeze some of the atoms).

I have information about goVasp and MedeA, but they are not free.

I look forward to hearing from you. Many thanks in advance.

Best regards,
Muthu

--
Dr. Muthukumar Kaliappan,
Post-doctoral Research Assistant
Dept. of Chem. and Mater. Engg.
University of Cincinnati, Cincinnati, Ohio, USA
Phone : 513 787 2720
Email : Kaliappan.Muthukumar^_^gmail.com
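
For the POSCAR question above, freezing atoms can also be scripted directly.
What follows is only a minimal Python sketch, not taken from any of the
tools mentioned in this thread. It relies on the POSCAR convention that a
"Selective dynamics" line placed before the coordinate-mode line switches on
per-atom T/F flags after each coordinate triple. The function name
freeze_atoms, the 1-based atom numbering, and the small test cell are
assumptions made for illustration. The sketch also assumes that element
symbols, if present on line 6, are purely alphabetic.

```python
def freeze_atoms(poscar_text, frozen):
    """Return POSCAR text with selective-dynamics flags added.

    frozen is a set of 1-based atom indices to fix (F F F); all other
    atoms stay free to move (T T T). Hypothetical helper for illustration.
    """
    lines = poscar_text.splitlines()
    # Line 6 holds element symbols in VASP 5 files and atom counts in VASP 4 files.
    counts_idx = 6 if lines[5].split()[0].isalpha() else 5
    natoms = sum(int(n) for n in lines[counts_idx].split())
    mode_idx = counts_idx + 1
    # Skip an existing "Selective dynamics" line; its old flags are replaced below.
    if lines[mode_idx].strip().lower().startswith("s"):
        mode_idx += 1
    first = mode_idx + 1
    out = lines[:counts_idx + 1] + ["Selective dynamics", lines[mode_idx]]
    for i, line in enumerate(lines[first:first + natoms], start=1):
        x, y, z = line.split()[:3]  # keep coordinates, drop any old flags
        flags = "F F F" if i in frozen else "T T T"
        out.append(f"{x} {y} {z} {flags}")
    out.extend(lines[first + natoms:])  # e.g. a velocity block, kept unchanged
    return "\n".join(out) + "\n"


# Made-up three-atom cell, used only to exercise the function.
example = """\
Cubic test cell
1.0
4.0 0.0 0.0
0.0 4.0 0.0
0.0 0.0 4.0
Si O
1 2
Direct
0.0 0.0 0.0
0.5 0.5 0.0
0.5 0.0 0.5
"""

if __name__ == "__main__":
    # Freeze atoms 1 and 3, leave atom 2 free.
    print(freeze_atoms(example, {1, 3}), end="")
```

For many files, the same function could be applied in a loop over the file
names, with the set of frozen indices chosen per structure (for example, all
atoms below a given height in a slab).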