LAMMPS WWW Site - LAMMPS Documentation - LAMMPS Commands

read_data command

Syntax:

read_data file keyword args ... 

Examples:

read_data data.lj
read_data ../run7/data.polymer.gz
read_data data.protein fix mycmap crossterm CMAP 

Description:

Read in a data file containing information LAMMPS needs to run a simulation. The file can be ASCII text or a gzipped text file (detected by a .gz suffix). This is one of 3 ways to specify initial atom coordinates; see the read_restart and create_atoms commands for alternative methods.

The structure of the data file is important, though many settings and sections are optional or can come in any order. See the examples directory for sample data files for different problems.

A data file has a header and a body. The header appears first. The first line of the header is always skipped; it typically contains a description of the file. Then lines are read one at a time. Lines can have a trailing comment starting with '#' that is ignored. If the line is blank (only whitespace after comment is deleted), it is skipped. If the line contains a header keyword, the corresponding value(s) is read from the line. If it doesn't contain a header keyword, the line begins the body of the file.

The body of the file contains zero or more sections. The first line of a section has only a keyword. The next line is skipped. The remaining lines of the section contain values. The number of lines depends on the section keyword as described below. Zero or more blank lines can be used between sections. Sections can appear in any order, with a few exceptions as noted below.

If the keyword fix is used, it specifies a fix that will be used to process portions of the data file. Any header line containing header-string and any section with a name containing section-string will be passed to the fix. See the fix cmap command for an example of a fix that operates in this manner. The doc page for the fix defines the syntax of the header line(s) and section(s) that it reads from the data file.

The formatting of individual lines in the data file (indentation, spacing between words and numbers) is not important except that header and section keywords (e.g. atoms, xlo xhi, Masses, Bond Coeffs) must be capitalized as shown and can't have extra white space between their words - e.g. two spaces or a tab between the 2 words in "xlo xhi" or the 2 words in "Bond Coeffs", is not valid.


These are the recognized header keywords. Header lines can come in any order. The value(s) are read from the beginning of the line. Thus the keyword atoms should be in a line like "1000 atoms"; the keyword ylo yhi should be in a line like "-10.0 10.0 ylo yhi"; the keyword xy xz yz should be in a line like "0.0 5.0 6.0 xy xz yz". All these settings have a default value of 0, except the lo/hi box size defaults are -0.5 and 0.5. A line need only appear if the value is different than the default.

The initial simulation box size is determined by the lo/hi settings. In any dimension, the system may be periodic or non-periodic; see the boundary command.

If the xy xz yz line does not appear, LAMMPS will set up an axis-aligned (orthogonal) simulation box. If the line does appear, LAMMPS creates a non-orthogonal simulation domain shaped as a parallelepiped with triclinic symmetry. The parallelepiped has its "origin" at (xlo,ylo,zlo) and is defined by 3 edge vectors starting from the origin given by A = (xhi-xlo,0,0); B = (xy,yhi-ylo,0); C = (xz,yz,zhi-zlo). Xy,xz,yz can be 0.0 or positive or negative values and are called "tilt factors" because they are the amount of displacement applied to faces of an originally orthogonal box to transform it into the parallelepiped.

The tilt factors (xy,xz,yz) can not skew the box more than half the distance of the corresponding parallel box length. For example, if xlo = 2 and xhi = 12, then the x box length is 10 and the xy tilt factor must be between -5 and 5. Similarly, both xz and yz must be between -(xhi-xlo)/2 and +(yhi-ylo)/2. Note that this is not a limitation, since if the maximum tilt factor is 5 (as in this example), then configurations with tilt = ..., -15, -5, 5, 15, 25, ... are all geometrically equivalent.

See Section_howto 12 of the doc pages for a geometric description of triclinic boxes, as defined by LAMMPS, and how to transform these parameters to and from other commonly used triclinic representations.

When a triclinic system is used, the simulation domain must be periodic in any dimensions with a non-zero tilt factor, as defined by the boundary command. I.e. if the xy tilt factor is non-zero, then both the x and y dimensions must be periodic. Similarly, x and z must be periodic if xz is non-zero and y and z must be periodic if yz is non-zero. Also note that if your simulation will tilt the box, e.g. via the fix deform command, the simulation box must be defined as triclinic, even if the tilt factors are initially 0.0.

For 2d simulations, the zlo zhi values should be set to bound the z coords for atoms that appear in the file; the default of -0.5 0.5 is valid if all z coords are 0.0. For 2d triclinic simulations, the xz and yz tilt factors must be 0.0.

If the system is periodic (in a dimension), then atom coordinates can be outside the bounds (in that dimension); they will be remapped (in a periodic sense) back inside the box.

IMPORTANT NOTE: If the system is non-periodic (in a dimension), then all atoms in the data file must have coordinates (in that dimension) that are "greater than or equal to" the lo value and "less than or equal to" the hi value. If the non-periodic dimension is of style "fixed" (see the boundary command), then the atom coords must be strictly "less than" the hi value, due to the way LAMMPS assign atoms to processors. Note that you should not make the lo/hi values radically smaller/larger than the extent of the atoms. For example, if your atoms extend from 0 to 50, you should not specify the box bounds as -10000 and 10000. This is because LAMMPS uses the specified box size to layout the 3d grid of processors. A huge (mostly empty) box will be sub-optimal for performance when using "fixed" boundary conditions (see the boundary command). When using "shrink-wrap" boundary conditions (see the boundary command), a huge (mostly empty) box may cause a parallel simulation to lose atoms the first time that LAMMPS shrink-wraps the box around the atoms.

The "extra bond per atom" setting should be used if new bonds will be added to the system when a simulation runs, e.g. by using the fix bond/create command. This will pre-allocate space in LAMMPS data structures for storing the new bonds.

The "ellipsoids" and "lines" and "triangles" and "bodies" settings are only used with atom_style ellipsoid or line or tri or body and specify how many of the atoms are finite-size ellipsoids or lines or triangles or bodies; the remainder are point particles. See the discussion of ellipsoidflag and the Ellipsoids section below. See the discussion of lineflag and the Lines section below. See the discussion of triangleflag and the Triangles section below. See the discussion of bodyflag and the Bodies section below.


These are the section keywords for the body of the file.

Each section is listed below in alphabetic order. The format of each section is described including the number of lines it must contain and rules (if any) for where it can appear in the data file.

Any individual line in the various sections can have a trailing comment starting with "#" for annotation purposes. E.g. in the Atoms section:

10 1 17 -1.0 10.0 5.0 6.0   # salt ion 

Angle Coeffs section:

The number and meaning of the coefficients are specific to the defined angle style. See the angle_style and angle_coeff commands for details. Coefficients can also be set via the angle_coeff command in the input script.


AngleAngle Coeffs section:


AngleAngleTorsion Coeffs section:


Angles section:

The 3 atoms are ordered linearly within the angle. Thus the central atom (around which the angle is computed) is the atom2 in the list. E.g. H,O,H for a water molecule. The Angles section must appear after the Atoms section. All values in this section must be integers (1, not 1.0).


AngleTorsion Coeffs section:


Atoms section:

An Atoms section must appear in the data file if natoms > 0 in the header section. The atoms can be listed in any order. These are the line formats for each atom style in LAMMPS. As discussed below, each line can optionally have 3 flags (nx,ny,nz) appended to it, which indicate which image of a periodic simulation box the atom is in. These may be important to include for some kinds of analysis.

angle atom-ID molecule-ID atom-type x y z
atomic atom-ID atom-type x y z
body atom-ID atom-type bodyflag mass x y z
bond atom-ID molecule-ID atom-type x y z
charge atom-ID atom-type q x y z
dipole atom-ID atom-type q x y z mux muy muz
electron atom-ID atom-type q spin eradius x y z
ellipsoid atom-ID atom-type ellipsoidflag density x y z
full atom-ID molecule-ID atom-type q x y z
line atom-ID molecule-ID atom-type lineflag density x y z
meso atom-ID atom-type rho e cv x y z
molecular atom-ID molecule-ID atom-type x y z
peri atom-ID atom-type volume density x y z
sphere atom-ID atom-type diameter density x y z
tri atom-ID molecule-ID atom-type triangleflag density x y z
wavepacket atom-ID atom-type charge spin eradius etag cs_re cs_im x y z
hybrid atom-ID atom-type x y z sub-style1 sub-style2 ...

The keywords have these meanings:

The units for these quantities depend on the unit style; see the units command for details.

For 2d simulations specify z as 0.0, or a value within the zlo zhi setting in the data file header.

The atom-ID is used to identify the atom throughout the simulation and in dump files. Normally, it is a unique value from 1 to Natoms for each atom. Unique values larger than Natoms can be used, but they will cause extra memory to be allocated on each processor, if an atom map array is used (see the atom_modify command). If an atom map array is not used (e.g. an atomic system with no bonds), and velocities are not assigned in the data file, and you don't care if unique atom IDs appear in dump files, then the atom-IDs can all be set to 0.

The molecule ID is a 2nd identifier attached to an atom. Normally, it is a number from 1 to N, identifying which molecule the atom belongs to. It can be 0 if it is an unbonded atom or if you don't care to keep track of molecule assignments.

The diameter specifies the size of a finite-size spherical particle. It can be set to 0.0, which means that atom is a point particle.

The ellipsoidflag, lineflag, triangleflag, and bodyflag determine whether the particle is a finite-size ellipsoid or line or triangle or body of finite size, or whether the particle is a point particle. Additional attributes must be defined for each ellipsoid, line, triangle, or body in the corresponding Ellipsoids, Lines, Triangles, or Bodies section.

Some pair styles and fixes and computes that operate on finite-size particles allow for a mixture of finite-size and point particles. See the doc pages of individual commands for details.

For finite-size particles, the density is used in conjunction with the particle volume to set the mass of each particle as mass = density * volume. In this context, volume can be a 3d quantity (for spheres or ellipsoids), a 2d quantity (for triangles), or a 1d quantity (for line segments). If the volume is 0.0, meaning a point particle, then the density value is used as the mass. One exception is for the body atom style, in which case the mass of each particle (body or point particle) is specified explicitly. This is because the volume of the body is unknown.

For atom_style hybrid, following the 5 initial values (ID,type,x,y,z), specific values for each sub-style must be listed. The order of the sub-styles is the same as they were listed in the atom_style command. The sub-style specific values are those that are not the 5 standard ones (ID,type,x,y,z). For example, for the "charge" sub-style, a "q" value would appear. For the "full" sub-style, a "molecule-ID" and "q" would appear. These are listed in the same order they appear as listed above. Thus if

atom_style hybrid charge sphere 

were used in the input script, each atom line would have these fields:

atom-ID atom-type x y z q diameter density 

Note that if a non-standard value is defined by multiple sub-styles, it must appear mutliple times in the atom line. E.g. the atom line for atom_style hybrid dipole full would list "q" twice:

atom-ID atom-type x y z q mux muy myz molecule-ID q 

Atom lines (all lines or none of them) can optionally list 3 trailing integer values: nx,ny,nz. For periodic dimensions, they specify which image of the simulation box the atom is considered to be in. An image of 0 means it is inside the box as defined. A value of 2 means add 2 box lengths to get the true value. A value of -1 means subtract 1 box length to get the true value. LAMMPS updates these flags as atoms cross periodic boundaries during the simulation. The flags can be output with atom snapshots via the dump command.

If nx,ny,nz values are not set in the data file, LAMMPS initializes them to 0. If image information is needed for later analysis and they are not all initially 0, it's important to set them correctly in the data file. Also, if you plan to use the replicate command to generate a larger system, these flags must be listed correctly for bonded atoms when the bond crosses a periodic boundary. I.e. the values of the image flags should be different by 1 (in the appropriate dimension) for the two atoms in such a bond.

Atom velocities and other atom quantities not defined above are set to 0.0 when the Atoms section is read. Velocities can be set later by a Velocities section in the data file or by a velocity or set command in the input script.


Bodies section:

The Bodies section must appear if atom_style body is used and any atoms listed in the Atoms section have a bodyflag = 1. The number of bodies should be specified in the header section via the "bodies" keyword.

Each body can have a variable number of integer and/or floating-point values. The number and meaning of the values is defined by the body style, as described in the body doc page. The body style is given as an argument to the atom_style body command.

The ninteger and ndouble values determine how many integer and floating-point values are specified for this particle. Ninteger and ndouble can be as large as needed and can be different for every body. Integer values are then listed on subsequent lines, 10 values per line. Floating-point values follow on subsequent lines, again 10 per line. If the number of lines is not evenly divisible by 10, the last line in that group contains the remaining values, e.g. 4 values out of 14 in the last example above, for floating-point values. If there are no values of a particular type, no lines appear for that type, e.g. there are no integer lines in the last example above.

The Bodies section must appear after the Atoms section.


Bond Coeffs section:

The number and meaning of the coefficients are specific to the defined bond style. See the bond_style and bond_coeff commands for details. Coefficients can also be set via the bond_coeff command in the input script.


BondAngle Coeffs section:


BondBond Coeffs section:


BondBond13 Coeffs section:


Bonds section:

The Bonds section must appear after the Atoms section. All values in this section must be integers (1, not 1.0).


Dihedral Coeffs section:

The number and meaning of the coefficients are specific to the defined dihedral style. See the dihedral_style and dihedral_coeff commands for details. Coefficients can also be set via the dihedral_coeff command in the input script.


Dihedrals section:

The 4 atoms are ordered linearly within the dihedral. The Dihedrals section must appear after the Atoms section. All values in this section must be integers (1, not 1.0).


Ellipsoids section:

The Ellipsoids section must appear if atom_style ellipsoid is used and any atoms are listed in the Atoms section with an ellipsoidflag = 1. The number of ellipsoids should be specified in the header section via the "ellipsoids" keyword.

The 3 shape values specify the 3 diameters or aspect ratios of a finite-size ellipsoidal particle, when it is oriented along the 3 coordinate axes. They must all be non-zero values.

The values quatw, quati, quatj, and quatk set the orientation of the atom as a quaternion (4-vector). Note that the shape attributes specify the aspect ratios of an ellipsoidal particle, which is oriented by default with its x-axis along the simulation box's x-axis, and similarly for y and z. If this body is rotated (via the right-hand rule) by an angle theta around a unit vector (a,b,c), then the quaternion that represents its new orientation is given by (cos(theta/2), a*sin(theta/2), b*sin(theta/2), c*sin(theta/2)). These 4 components are quatw, quati, quatj, and quatk as specified above. LAMMPS normalizes each atom's quaternion in case (a,b,c) is not specified as a unit vector.

The Ellipsoids section must appear after the Atoms section.


EndBondTorsion Coeffs section:


Improper Coeffs section:

The number and meaning of the coefficients are specific to the defined improper style. See the improper_style and improper_coeff commands for details. Coefficients can also be set via the improper_coeff command in the input script.


Impropers section:

The ordering of the 4 atoms determines the definition of the improper angle used in the formula for each improper style. See the doc pages for individual styles for details.

The Impropers section must appear after the Atoms section. All values in this section must be integers (1, not 1.0).


Lines section:

The Lines section must appear if atom_style line is used and any atoms are listed in the Atoms section with a lineflag = 1. The number of lines should be specified in the header section via the "lines" keyword.

The 2 end points are the end points of the line segment. The ordering of the 2 points should be such that using a right-hand rule to cross the line segment with a unit vector in the +z direction, gives an "outward" normal vector perpendicular to the line segment. I.e. normal = (c2-c1) x (0,0,1). This orientation may be important for defining some interactions.

The Lines section must appear after the Atoms section.


Masses section:

This defines the mass of each atom type. This can also be set via the mass command in the input script. This section cannot be used for atom styles that define a mass for individual atoms - e.g. atom_style sphere.


MiddleBondTorsion Coeffs section:


Pair Coeffs section:

The number and meaning of the coefficients are specific to the defined pair style. See the pair_style and pair_coeff commands for details. Coefficients can also be set via the pair_coeff command in the input script.


Triangles section:

The Triangles section must appear if atom_style tri is used and any atoms are listed in the Atoms section with a triangleflag = 1. The number of lines should be specified in the header section via the "triangles" keyword.

The 3 corner points are the corner points of the triangle. The ordering of the 3 points should be such that using a right-hand rule to go from point1 to point2 to point3 gives an "outward" normal vector to the face of the triangle. I.e. normal = (c2-c1) x (c3-c1). This orientation may be important for defining some interactions.

The Triangles section must appear after the Atoms section.


Velocities section:

all styles except those listed atom-ID vx vy vz
electron atom-ID vx vy vz ervel
ellipsoid atom-ID vx vy vz lx ly lz
sphere atom-ID vx vy vz wx wy wz
hybrid atom-ID vx vy vz sub-style1 sub-style2 ...

where the keywords have these meanings:

vx,vy,vz = translational velocity of atom lx,ly,lz = angular momentum of aspherical atom wx,wy,wz = angular velocity of spherical atom ervel = electron radial velocity (0 for fixed-core):ul

The velocity lines can appear in any order. This section can only be used after an Atoms section. This is because the Atoms section must have assigned a unique atom ID to each atom so that velocities can be assigned to them.

Vx, vy, vz, and ervel are in units of velocity. Lx, ly, lz are in units of angular momentum (distance-velocity-mass). Wx, Wy, Wz are in units of angular velocity (radians/time).

For atom_style hybrid, following the 4 initial values (ID,vx,vy,vz), specific values for each sub-style must be listed. The order of the sub-styles is the same as they were listed in the atom_style command. The sub-style specific values are those that are not the 5 standard ones (ID,vx,vy,vz). For example, for the "sphere" sub-style, "wx", "wy", "wz" values would appear. These are listed in the same order they appear as listed above. Thus if

atom_style hybrid electron sphere 

were used in the input script, each velocity line would have these fields:

atom-ID vx vy vz ervel wx wy wz 

Translational velocities can also be set by the velocity command in the input script.


Restrictions:

To read gzipped data files, you must compile LAMMPS with the -DLAMMPS_GZIP option - see the Making LAMMPS section of the documentation.

Related commands:

read_dump, read_restart, create_atoms

Default: none