Quick Start
This section covers the fundamentals and the command-line interface (CLI) of Ta-dah!
For installation instructions see Installation.
If you are interested in using it as a C++ library, have a look at the API Examples and browse the API documentation. This section may still prove useful.
Overview
To train a model, a config file (see Config file) and a dataset (see Dataset format) are required.
For prediction, a potential (trained model) and a dataset are required. The potential file is an output of the training process and is technically the same as the config file.
Training
To train a model, run the following command in a terminal:
ta-dah train -c config.train
This will train a model using energies only. To also train on forces, add the -F flag; for stresses, add the -S flag. Alternatively, add the FORCE true and STRESS true keys to the config file. Note that flags take priority over config values.
Here config.train is a Config file.
Training dataset(s), descriptors, cutoffs, model and all parameters are specified in the config file.
The output of this command is a trained model, written to the pot.tadah file.
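When training is launched from a script, the flags described above can be assembled programmatically. The following is a minimal Python sketch, not part of Ta-dah! itself: the helper names build_train_command and run_training are hypothetical, and the sketch assumes the ta-dah executable is on PATH. It uses only the -c, -F and -S flags described in this section.

```python
import subprocess


def build_train_command(config, forces=False, stress=False):
    """Return the argument list for `ta-dah train` (hypothetical helper)."""
    cmd = ["ta-dah", "train", "-c", str(config)]
    if forces:
        cmd.append("-F")  # also train on forces
    if stress:
        cmd.append("-S")  # also train on stresses
    return cmd


def run_training(config, forces=False, stress=False):
    """Run training; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_train_command(config, forces, stress), check=True)
```

Because flags take priority over config values, passing forces=True here would override a FORCE false entry in the config file.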
Here is a minimal example of the config.train file:
DBFILE db.train
INIT2B true
RCUT2B 5.3
TYPE2B D2_LJ
RCTYPE2B Cut_Dummy
MODEL M_KRR Kern_Linear
For an explanation of the KEYS, see Config.
An example dataset can be downloaded from here.
Prediction
To predict energies using an existing pot.tadah model, run:
ta-dah predict -p pot.tadah -d db.predict
To also predict forces, add the -F flag; for stresses, add the -S flag.
Alternatively, a config file can be used to specify the prediction dataset(s) (DBFILE) and whether forces and stresses should be calculated (FORCE and STRESS keys).
Example config.predict:
DBFILE db.predict
FORCE false
STRESS true
To predict using the pot.tadah model and config.predict:
ta-dah predict -p pot.tadah -c config.predict
The predict subcommand outputs three files:
energy.pred
There are two columns. The first column lists the dataset energy per atom, the second the predicted energy per atom. The rows follow the order in which the datasets are given in the config file or with the -d flag: all energies from the first dataset are listed first, then those from the second, and so on.
forces.pred
Same idea as above, but forces are listed instead. The first row is the force on the first atom of the first dataset in the x-direction, the second row is the force on the same atom in the y-direction, and so on.
stress.pred
The first 6 rows list the components of the stress tensor of the first configuration, followed by the 6 components of the second configuration, and so on. The ordering is xx, xy, xz, yy, yz, zz.
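The layouts described above can be post-processed with a few lines of Python. The sketch below is illustrative only and all function names are hypothetical. It assumes the files are plain whitespace-separated numeric columns with the reference value first and the predicted value second; this two-column layout is stated above only for energy.pred, so for forces.pred and stress.pred it is an assumption to check against your own output.

```python
import math


def parse_two_columns(lines):
    """Split an iterable of text lines into (reference, predicted) lists."""
    ref, pred = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank or comment lines, if any
        ref.append(float(fields[0]))
        pred.append(float(fields[1]))
    return ref, pred


def rmse(ref, pred):
    """Root-mean-square error between two equally long sequences."""
    if len(ref) != len(pred) or not ref:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(ref, pred)) / len(ref))


def force_vectors(values):
    """Group a flat column of force rows into (fx, fy, fz) triples, one per atom."""
    if len(values) % 3:
        raise ValueError("number of rows must be a multiple of 3")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


def stress_tensors(values):
    """Group a flat column into symmetric 3x3 tensors, 6 rows each (xx, xy, xz, yy, yz, zz)."""
    if len(values) % 6:
        raise ValueError("number of rows must be a multiple of 6")
    tensors = []
    for i in range(0, len(values), 6):
        xx, xy, xz, yy, yz, zz = values[i:i + 6]
        tensors.append([[xx, xy, xz], [xy, yy, yz], [xz, yz, zz]])
    return tensors
```

For example, opening energy.pred and passing the file object to parse_two_columns, then calling rmse on the two returned lists, gives the energy error per atom in the units of the dataset.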
Built-in help
Ta-dah! provides some basic help. To see it, run the program with the -h flag:
ta-dah -h
To read more about a particular subcommand, try:
ta-dah train -h
Units and Ta-dah!
In principle, Ta-dah! works with any units. The units of a model are determined by the units of its training datasets. For example, if your dataset has energies in electronvolts and distances in Angstroms, the resulting model uses the same units, and forces must then be given in eV/A. The stress tensor has units of energy (pressure*volume). In other words, the unit of stress is the same as the unit of energy, and the unit of force is energy/distance.
The units selected in LAMMPS must be consistent with the model units. In this case, they correspond to the metal units in LAMMPS.
Note
Ta-dah! has been tested with units of eV and Angstrom. Unless you have a good reason to use different units, those are the recommended ones.
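Reference stresses are often reported as pressures (for example in GPa), whereas the dataset expects them in units of energy (pressure*volume). The Python sketch below shows one way to do the conversion for eV and Angstrom. It is an illustration under stated assumptions, not a Ta-dah! utility: the function names are hypothetical, the volume is assumed to be the cell volume, and the sign convention of your source data is not handled here and should be checked separately.

```python
# 1 eV/A^3 expressed in GPa: elementary charge [J] / (1e-10 m)^3 / 1e9
EV_PER_A3_IN_GPA = 1.602176634e-19 / 1e-30 / 1e9  # about 160.2177


def cell_volume(a, b, c):
    """Volume of the cell spanned by vectors a, b, c: |a . (b x c)|."""
    bxc = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return abs(a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2])


def stress_gpa_to_ev(stress_gpa, volume_a3):
    """Convert a stress component in GPa to pressure*volume in eV.

    Assumes volume_a3 is the cell volume in cubic Angstroms.
    """
    return stress_gpa / EV_PER_A3_IN_GPA * volume_a3
```

Applying stress_gpa_to_ev to each of the nine tensor components, with the volume computed from the three cell vectors, yields the stress rows in the same energy unit as the ENERGY entry.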
Config file
For a list and explanation of the supported KEY-VALUE(S) pairs, see Config.
Dataset format
Dataset(s) are included using the DBFILE key in a Config file. More than one dataset can be specified.
There is no restriction on the number of atoms in different structures, so it’s ok to have a structure with 12 atoms and another one with 24 atoms in the same dataset.
The dataset has the following structure:
Comment line
eweight fweight sweight
ENERGY
cell vector a
cell vector b
cell vector c
stress tensor row s_1
stress tensor row s_2
stress tensor row s_3
Element px py pz fx fy fz
...
<blank line>
Comment line
eweight fweight sweight
...
The first line is a comment line; it is used as a label for the structure.
eweight fweight sweight are (optional) weighting parameters used for training. If this line is omitted, the weights default to 1.0 1.0 1.0. Do not leave a blank line in its place.
Each cell vector contains 3 numbers.
Each stress tensor row contains 3 numbers.
The number of lines beginning with Element is equal to the number of atoms in the structure.
Element is an atom label, usually the chemical element symbol.
px, py and pz are Cartesian coordinates of the atom position.
fx, fy and fz are components of the force vector acting on the atom.
Configurations are separated from each other by a blank line.
If forces and/or stresses are not available, they can be set to zero to satisfy the parser.
See also Units and Ta-dah!
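The layout above is simple enough to generate from a script. The following Python sketch writes structures in that layout; it is an illustration based only on the description in this section, not a tool shipped with Ta-dah!, and the function names and argument layout are hypothetical.

```python
def format_structure(label, energy, cell, stress, atoms, weights=(1.0, 1.0, 1.0)):
    """Return one structure as text in the dataset layout.

    cell and stress are 3x3 nested sequences; atoms is a list of
    (element, (px, py, pz), (fx, fy, fz)) tuples.
    """
    lines = [label]  # comment line, used as the structure label
    lines.append(" ".join(str(w) for w in weights))  # eweight fweight sweight
    lines.append(str(energy))
    for row in list(cell) + list(stress):  # 3 cell vectors, then 3 stress rows
        lines.append(" ".join(str(x) for x in row))
    for element, pos, force in atoms:  # one line per atom
        lines.append(" ".join([element] + [str(x) for x in (*pos, *force)]))
    return "\n".join(lines) + "\n"


def format_dataset(structures):
    """Join structures (dicts of format_structure arguments) with one blank line between them."""
    return "\n".join(format_structure(**s) for s in structures)
```

If forces or stresses are unavailable, zeros can be passed for the corresponding entries, as noted above.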
Ta-dah! and LAMMPS
Once trained, the pot.tadah file can be used with LAMMPS like any other pair potential.
pair_style tadah/tadah
pair_coeff * * /path/to/pot.tadah ELEMENT
Here is an example LAMMPS script file and pot.file.
See Installing LAMMPS interface for interface installation instructions.
Support for multi-species systems
Ta-dah! is capable of generating machine-learned potentials for mono- and multi-component systems.