From chemistry-request@ccl.net Mon Aug 19 06:50:21 1991
Date: Mon, 19 Aug 91 12:33:11 +0200
From: Anders Sundin
To: chemistry@ccl.net
Subject: PDB format
Status: R

I want to save molecular structures in PDB format.
What is the exact format of the HETATM and the CONECT records?

More specific questions:
Can I describe more than four ligands with CONECT records?
Can HETATM records describe more than element and position?
Grid uses H, C0, C1, C=, C1=, FE+3 and so on to describe various
elements, hybridizations, charges and connectivities.
Are these "atomtypes" standard?

-------------------------------------------------------------
| Anders Sundin        | e-mail: sundinKC@dna.lth.se        |
| University of Lund   | ok2aps@gemini.ldc.lu.se            |
| Organic Chemistry 2, | ok2aps@seldc52.bitnet              |
| P.O. Box 124         | phone: +46 46 108214               |
| S-22100 Lund, Sweden | fax: +46 46 108209                 |
-------------------------------------------------------------

From jkl@ccl.net Mon Aug 19 10:38:25 1991
Date: Mon, 19 Aug 91 10:16:34 -0400
From: jkl@ccl.net
To: sundinKC@dna.lth.se
Subject: Re: PDB format
Status: R

Dear Anders,

The "official" description of the PDB format is available from PDB
free of charge (I believe) in paper form. Ask them; they might have
it in machine-readable form too (but I doubt it), and maybe they will
send it to you by e-mail. You can contact them by e-mail:
PDB@BNLCHEM.BITNET

The document itself is more than 40 pages long, so understandably I am
reproducing only the parts which are of interest to you. Sorry for the
typos, but at least you will get a flavor of what is in the original
document.

Jan
jkl@ccl.net

The HETATM record (for coordinate records of non-standard groups) has
a format which is for the most part identical to that of ATOM records.
The HETATM format is:

Columns:
  1-6    HETATM
  7-11   Atom serial number (residues occur in order of their
         sequence numbers, which increase starting from the
         N-terminal residue for proteins and the 5'-terminal
         residue for nucleic acids.
         Within each residue the atoms are ordered as indicated
         in Appendix B (appendix to the PDB format description).
         If the residue sequence is known, certain atom serial
         numbers may be omitted to allow for future insertion of
         any missing atoms. If the residue sequence is not
         reliably known, these numbers are simply ordinals.)
 13-16   Atom name (names are in Appendix B)
 17      Alternate location indicator (alternate locations for
         atoms may be denoted here by A, B, C, etc.)
 18-20   Residue name (standard names are given in Appendix C;
         other components are defined in HET groups)
 22      Chain identifier, e.g., A for hemoglobin alpha chain
 23-26   Residue sequence number
 27      Code for insertions of residues, e.g., 66A, 66B, etc.
 31-38   X (orthogonal coordinates in Angstroms)
 39-46   Y
 47-54   Z
 55-60   Occupancy
 61-66   Temperature factor
 68-70   Footnote number

FORMAT(6A1,I5,1X,A4,A1,A3,1X,A1,I4,A1,3X,3F8.3,2F6.2,1X,I3)

The HET record has the format:

Columns:
  1-3    HET
  8-10   Non-standard group (heterogen) identifier
 13      Chain identifier
 14-17   Sequence number
 18      Insertion code
 21-25   Number of atoms in non-standard group
 31-70   Text

FORMAT(6A1,1X,A3,2X,A1,I4,A1,2X,I5,5X,40A1)

CONECT   Connectivity records

Columns:
  1-6    CONECT
  7-11   Serial number of an atom
 12-16   Covalent bond connectivity (serial numbers of bonded atoms)
 17-21   to the atom specified in 7-11
 22-26
 27-31
 32-36   Hydrogen bond (in which the atom specified in cols 7-11
         acts as a donor)
 37-41   Hydrogen bond
 42-46   Salt bridge (the atom specified in cols 7-11 has an excess
         of negative charge)
 47-51   Hydrogen bond in which the atom specified in cols 7-11
         acts as acceptor
 52-56   Hydrogen bond
 57-61   Salt bridge, the atom specified in 7-11 has an excess of
         positive charge

FORMAT(6A1,11I5)

Note: Serial numbers are identical to those in columns 7-11 of the
appropriate ATOM/HETATM records, and connectivity entries correspond
to these serial numbers. A second CONECT record with the same serial
number in columns 7-11 may be used if necessary.
Either all or none of the covalent connectivity of an atom must be
specified, and if hydrogen bonding is specified, the covalent
connectivity is included also.

The occurrence of a negative atom serial number on a CONECT record
denotes that a translationally equivalent copy (see TVECT records) of
the target atom specified is linked to the origin atom of the record.

Jan K. Labanowski, Ph.D., Senior Supercomputer Specialist
Ohio Supercomputer Center, 1224 Kinnear Rd, Columbus, OH 43212-1163
ph:(614)-292-9279, FAX:(614)-292-7168, E-mail: jkl@ccl.net
JKL@OHSTPY.BITNET
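
The column layouts quoted above map directly onto fixed-width string
formatting. What follows is only a minimal Python sketch under stated
assumptions, not part of the PDB document: the function names, the
default occupancy and temperature factor, and the sample iron atom are
all hypothetical. It also illustrates the answer to the "more than four
ligands" question: extra covalent partners go into a second CONECT
record with the same serial number, as the note permits.

```python
# Hypothetical helpers; column numbers follow the listing quoted in the
# message (1-based in the text, 0-based slices here).

def format_hetatm(serial, name, res_name, chain, res_seq, x, y, z,
                  occupancy=1.0, temp_factor=0.0):
    """Build one HETATM line through column 66.

    Alternate location, insertion code and footnote are left blank.
    The default occupancy and temperature factor are assumptions.
    """
    return ("HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, res_name, chain, res_seq,
               x, y, z, occupancy, temp_factor))


def parse_hetatm(line):
    """Slice an ATOM/HETATM line into fields by column position."""
    line = line.rstrip("\n").ljust(70)  # pad so short lines slice safely
    return {
        "record": line[0:6].strip(),         # cols 1-6
        "serial": int(line[6:11]),           # cols 7-11
        "name": line[12:16].strip(),         # cols 13-16
        "alt_loc": line[16].strip(),         # col 17
        "res_name": line[17:20].strip(),     # cols 18-20
        "chain": line[21].strip(),           # col 22
        "res_seq": int(line[22:26]),         # cols 23-26
        "i_code": line[26].strip(),          # col 27
        "x": float(line[30:38]),             # cols 31-38
        "y": float(line[38:46]),             # cols 39-46
        "z": float(line[46:54]),             # cols 47-54
        "occupancy": float(line[54:60]),     # cols 55-60
        "temp_factor": float(line[60:66]),   # cols 61-66
    }


def format_conect(serial, bonded):
    """Emit CONECT lines listing every covalent partner of one atom.

    Each record holds at most four partners (cols 12-31); further
    partners go into additional records with the same serial number.
    """
    lines = []
    for i in range(0, len(bonded), 4):
        chunk = bonded[i:i + 4]
        lines.append("CONECT%5d" % serial
                     + "".join("%5d" % b for b in chunk))
    return lines


def parse_conect(line):
    """Return (serial, covalent partners) from one CONECT line."""
    line = line.rstrip("\n").ljust(61)
    fields = [line[6 + 5 * i:11 + 5 * i].strip() for i in range(11)]
    return int(fields[0]), [int(f) for f in fields[1:5] if f]


# Made-up example: an iron atom with six covalent partners.
atom_line = format_hetatm(1, "FE", "HEM", "A", 1, 1.0, -2.5, 3.25)
conect_lines = format_conect(1, [2, 3, 4, 5, 6, 7])
```

The sketch covers only the covalent columns of CONECT; the hydrogen-bond
and salt-bridge fields (cols 32-61) would need the same slicing. Atom
names are written left-justified in cols 13-16, so a caller who wants a
different alignment within the field has to pass a pre-padded name.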