2.5 Data Structures
There are a number of data structures that CONGEN manipulates.
Many of these data structures are important for most operations;
others, which are less important, are described with the commands
that use them.
Much more specific information is available in the various common blocks
whose extension is .FCM in the source directories.
The important data structures are given below. Each data
structure name is followed by its abbreviation, which is used as its
name in commands.
- Residue Topology File (RTF).
The residue topology file stores the definitions of all
residues. The atoms, atomic properties, bonds, bond angles,
torsion angles, improper torsion angles, hydrogen bond donors
and acceptors and antecedents, and non-bonded exclusions are
all specified on a per residue basis. The RTF also specifies
the list of chemical types which are used in the parameter file.
This file is required for any calculations.
- The Parameters (PARA or PARM).
The parameters specify the force constants, equilibrium
geometries, van der Waals radii, and other such data needed
for calculating the energy. The list of atom type codes
comes from the RTF. The parameters are required for
any calculation, and they depend on the list of chemical
types provided in the RTF. The parameters must be consistent
with the topology file in that they must be designed together.
In addition, there must be one and only one non-bonded
parameter for each atom type specified in the topology file.
- Protein Structure File (PSF).
The protein structure file is the concatenation of the per-residue
information in the RTF. It specifies the information for the
entire structure. It has a hierarchical organization wherein
atoms are grouped into residues which are grouped into
segments which comprise the structure. Each atom is uniquely
identified within a residue by its IUPAC name; each residue
is uniquely identified in the segment by a residue identifier
which is the character form of the residue's position in the
segment; and each segment is identified by a segment
identifier specified by the user. This information is required
for any calculations.
- The Coordinates (COOR).
The coordinates are the Cartesian coordinates for all the
atoms in the PSF. There are two sets of coordinates provided.
The main set is the default used for all operations involving
the positions of the atoms. A comparison set (also called the
reference set) is provided for a variety of purposes, such as
a reference for rotation or operations which involve
differences between coordinates for a particular molecule.
- The Non-bonded List (NBON).
The non-bonded list contains the list of non-bonded
interactions to be used in calculating the energies as well
as optional information about the charge, dipole moment, and
quadrupole moments of the residues. This data structure
depends on the coordinates for its construction and must be
periodically updated if the coordinates are being modified.
- The Hydrogen Bond List (HBON).
The hydrogen bond list contains the list of hydrogen bonds.
Like the non-bonded list, this data structure depends on the
coordinates and must be periodically updated.
- The Constraints (CONS).
The constraints are harmonic potentials placed on selected
atomic positions or on dihedral angles. The purpose of these
constraints is to limit motion of those atoms or torsions or
to force the molecule to assume a particular conformation.
Normally, there are no constraints on the molecule. Note
that this data structure holds neither the constraints
related to SHAKE nor the motion constraints in which
atoms are prevented from moving entirely.
- The Internal Coordinates (IC).
The internal coordinates data structure contains information
concerning the relative positions of atoms within a structure.
This data structure is most commonly used to build or modify
Cartesian coordinates from known or desired internal coordinate
values. It is also used in conjunction with the analysis of
normal modes. Since there are complete editing facilities,
it can be used as a simple but powerful method of examining
or analyzing structures.
- The Images data structure (IMAGES).
The images data structure determines and defines the relative
positions and orientations of any symmetric image of the primary
molecule(s). The purpose of this data structure is to allow
the simulation of crystal symmetry or the use of periodic
boundary conditions. Also contained in this data structure is
information concerning all non-bonded, hydrogen bond, and ST2
interactions between primary and image atoms.
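
As a rough illustration of two of the structures above, the following
Python sketch models the PSF hierarchy (atoms grouped into residues,
which are grouped into segments) and a pair list that is built from
the coordinates. It is not CONGEN code. The class and function names
(Atom, Residue, Segment, add_residue, find_atom, build_pair_list), the
segment identifier PEP1, and the coordinates are all hypothetical, and
the pair list uses a plain distance cutoff that ignores the non-bonded
exclusions specified in the RTF.

```python
from dataclasses import dataclass, field
from itertools import combinations
from math import dist


@dataclass
class Atom:
    name: str    # IUPAC name, unique within its residue
    xyz: tuple   # main-set Cartesian coordinates


@dataclass
class Residue:
    resid: str   # character form of the residue's position in the segment
    resname: str
    atoms: dict = field(default_factory=dict)     # atom name -> Atom


@dataclass
class Segment:
    segid: str   # segment identifier chosen by the user
    residues: dict = field(default_factory=dict)  # resid -> Residue


def add_residue(segment, resname, atoms):
    """Append a residue; its identifier is its 1-based position as text."""
    resid = str(len(segment.residues) + 1)
    residue = Residue(resid, resname)
    for name, xyz in atoms:
        if name in residue.atoms:
            raise ValueError(f"duplicate atom name {name} in residue {resid}")
        residue.atoms[name] = Atom(name, xyz)
    segment.residues[resid] = residue
    return residue


def find_atom(structure, segid, resid, name):
    """Look an atom up by its (segment, residue, atom name) triple."""
    return structure[segid].residues[resid].atoms[name]


def build_pair_list(structure, cutoff):
    """Build a distance-based pair list from the current coordinates.

    Simplification: bonded pairs are not excluded here.
    """
    keyed = [
        ((segid, res.resid, atom.name), atom.xyz)
        for segid, seg in structure.items()
        for res in seg.residues.values()
        for atom in res.atoms.values()
    ]
    return [
        (a, b)
        for (a, xa), (b, xb) in combinations(keyed, 2)
        if dist(xa, xb) <= cutoff
    ]


# Hypothetical two-residue segment with made-up coordinates.
seg = Segment("PEP1")
add_residue(seg, "GLY", [("N", (0.0, 0.0, 0.0)), ("CA", (1.5, 0.0, 0.0))])
add_residue(seg, "ALA", [("N", (3.0, 0.0, 0.0)), ("CA", (9.0, 0.0, 0.0))])
structure = {seg.segid: seg}

# The list reflects the coordinates at the time it was built.
pairs = build_pair_list(structure, cutoff=2.0)
```

If an atom is then moved, for example by assigning a new xyz to
find_atom(structure, "PEP1", "2", "CA"), the stored pairs list no
longer matches the coordinates until build_pair_list is called again,
which is the reason the non-bonded and hydrogen bond lists must be
periodically updated.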