MD DataRecord

MD DataRecord: a brief overview

Molecular Dynamics (MD) simulations are notoriously time-consuming and computationally demanding. They also produce a large amount of data that needs to be stored and retrieved, such as atomic coordinates, atomic velocities, forces and energies, to name only a few; extracting and managing this data therefore becomes crucial. The MD DataRecord API simplifies access to the md data when programming MD floes for Orion. In the OpenEye Datarecord model, data is exchanged between cubes in the form of data records, where POD data, custom objects, JSON data etc. are stored and retrieved by using the associated field names and types. The MD Datarecord API is built on top of the OpenEye Datarecord: it standardizes the record content produced during MD runs in a well-structured format and provides an API entry point to access it.

MD DataRecord structure

The MD data produced during MD runs is structured in what is called the md record:

  • the md record contains a sub-record named md stages where md information is saved. This sub-record is a list of md stage records;

  • each md stage is itself a record with an associated name, type, log data, topology and MD State info. The latter is a custom object that stores the data needed to restart MD runs, such as atomic positions, velocities and box information;

  • the md record at the top level contains a Parmed object used to carry the whole system parametrization data;

  • the md record at the top level can also contain other data, such as the MDComponents object used to carry info related to the different parts of the md system (ligand, protein, solvent, cofactors etc.), or the starting ligand and protein with their names and unique identifiers such as the flask id, cofactor ids etc.
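
The nesting described above can be pictured with a small, purely illustrative model. The dataclasses below are hypothetical stand-ins written for this sketch: they are not MDOrion or OpenEye classes, and the real md record stores these items as typed fields on an OERecord. The stage names are placeholders as well. The sketch only mirrors the layout: a top-level record carrying the Parmed object, optional components and identifiers, and a list of stage records.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class StageSketch:
    """Hypothetical stand-in for one md stage record."""
    name: str
    stage_type: str                 # e.g. "SETUP", "MINIMIZATION", "NPT"
    log: Optional[str] = None       # log data of the stage
    topology: Any = None            # an OEMol in the real record
    md_state: Any = None            # positions, velocities and box info


@dataclass
class MDRecordSketch:
    """Hypothetical stand-in for the top-level md record."""
    parmed: Any = None              # whole system parametrization data
    md_components: Any = None       # ligand, protein, solvent, cofactors etc.
    flask_id: Optional[int] = None  # unique identifier of the flask
    stages: List[StageSketch] = field(default_factory=list)

    def stage_names(self) -> List[str]:
        """Return the names of the stored stages, in order."""
        return [stage.name for stage in self.stages]


# Placeholder content, only to show how the pieces nest
record_sketch = MDRecordSketch(flask_id=0)
record_sketch.stages.append(StageSketch("System Setup", "SETUP"))
record_sketch.stages.append(StageSketch("Minimization", "MINIMIZATION"))
print(record_sketch.stage_names())
```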

The following picture shows the md record structure and its main components

Structure of the MD Record

The md record is user accessible through the MDDataRecord API. To use it, the MDOrion package must be installed; installing the package also requires access to the OpenEye Magpie repository for some dependencies. The API has been designed to work transparently both locally and in Orion. To use the API, the user creates an MDDataRecord object starting from an OERecord object; getter and setter functions can then be used to access the md data (see the full MDDataRecord API Documentation).

Code snippets

The following code snippets give an idea of how to use the API.

Warning

In the following examples, record is an OpenEye Datarecord produced by running MD floes such as Solvate and Run MD or Solvate and Run Protein-Ligand MD. Starting from the MDOrion pkg v2.0.0, the datasets produced by the Short Trajectory MD with analysis floe are “ligand centric” and the MDDataRecord API cannot be applied directly to the produced records. However, the API still works on the conformer (pose) ligand sub-records, which are still md records.

from MDOrion.Standards.mdrecord import MDDataRecord

# To use the MD Datarecord API an MDDataRecord instance is built starting from an OERecord
md_record = MDDataRecord(record)

# At this point getters and setters can be used to
# extract/set info from/to the md record

# MD Stage names available
stage_names = md_record.get_stages_names

# Get a MD Record Stage
md_stage_production = md_record.get_stage_by_name(stage_names[2])

# Extract the logging info from a stage
info_stage = md_record.get_stage_info(stg_name=stage_names[2])

# Extract the MD State from a stage
md_set_up_state = md_record.get_stage_state(stg_name=stage_names[0])

# Extract the Parmed Structure from the md record and synchronize the
# positions, velocities and box data to the selected stage state
pmd_structure = md_record.get_parmed(sync_stage_name=stage_names[2])

# Extract the OEMol system flask from the record and its title
flask = md_record.get_flask
flask_title = md_record.get_title

# Extract the trajectory file name associated with a stage. In this
# case the stage trajectory is unpacked and the trajectory name
# can be used to load it in any md analysis pkg
trj_name = md_record.get_stage_trajectory(stg_name=stage_names[2])

# Add a new Stage to the md_stages record
md_record.add_new_stage(stage_name="New_Stage",
                        stage_type="NPT",
                        topology=flask,
                        mdstate=md_set_up_state,
                        data_fn="test.tar.gz")
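
Several getters raise an error when the requested item is missing, so it can be convenient to check the matching has_* attribute first. The helper below is a sketch and not part of MDOrion: summarize_md_record is a hypothetical name, and the code assumes that the has_* and get_* members are read without parentheses, as in the snippets above. The _StubRecord class exists only so that the sketch runs without MDOrion installed.

```python
def summarize_md_record(md_record):
    """Collect optional items from an MDDataRecord-like object.

    Hypothetical helper: it only reads has_*/get_* attributes and
    skips whatever is not present on the record.
    """
    summary = {}
    if md_record.has_title:
        summary["title"] = md_record.get_title
    if md_record.has_flask_id:
        summary["flask_id"] = md_record.get_flask_id
    if md_record.has_lig_id:
        summary["lig_id"] = md_record.get_lig_id
    if md_record.has_conf_id:
        summary["conf_id"] = md_record.get_conf_id
    if md_record.has_stages:
        summary["stages"] = list(md_record.get_stages_names)
    return summary


class _StubRecord:
    """Stand-in used only to exercise the helper without MDOrion."""
    has_title = True
    get_title = "protein_ligand"
    has_flask_id = True
    get_flask_id = 3
    has_lig_id = False
    has_conf_id = False
    has_stages = True
    get_stages_names = ["setup", "minimization", "production"]


stub_summary = summarize_md_record(_StubRecord())
print(stub_summary)
```

With a real record, the same call would be summarize_md_record(MDDataRecord(record)), provided the assumption about attribute access holds for the installed MDOrion version.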

MDDataRecord API Documentation

The API documentation follows.

class MDOrion.Standards.mdrecord.MDDataRecord(record, inplace=True)

This class implements the MD Datarecord API by using getter and setter functions.

The initialization function is used to create the MDDataRecord object.

Parameters:
  • record (OERecord object) – The OERecord used to create the MDDataRecord
  • inplace (Bool) – If True the record will be updated in place, otherwise a copy of the record will be made
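
The difference between the two inplace settings can be pictured with plain Python objects. The sketch below is only an analogy: WrapperSketch is a hypothetical class, a dict stands in for the OERecord, and copy.deepcopy is an assumption about what "a copy of the record" means, not a statement about how MDDataRecord copies records internally.

```python
import copy


class WrapperSketch:
    """Hypothetical wrapper mimicking the documented inplace flag."""

    def __init__(self, record, inplace=True):
        # inplace=True keeps a reference to the caller's record,
        # inplace=False works on an independent copy
        self.record = record if inplace else copy.deepcopy(record)

    def set_title(self, title):
        self.record["Title"] = title
        return True


# In place: the caller's record sees the change
original = {"Title": "old"}
WrapperSketch(original, inplace=True).set_title("new")
print(original["Title"])

# Copy: the caller's record is left untouched
untouched = {"Title": "old"}
copied = WrapperSketch(untouched, inplace=False)
copied.set_title("new")
print(untouched["Title"], copied.record["Title"])
```
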
add_new_stage(stage_name, stage_type, topology, mdstate, data_fn, append=True, log=None, trajectory_fn=None, trajectory_engine=None, trajectory_orion_ui='OrionFile')

This method adds a new MD stage to the MD stage record

Parameters:
  • stage_name (String) – The new MD stage name
  • stage_type (String) – The MD stage type e.g. SETUP, MINIMIZATION etc.
  • topology (OEMol) – The topology
  • mdstate (MDState) – The new mdstate made of state positions, velocities and box vectors
  • data_fn (String) – The data file name is used only locally and is linked to the MD data associated with the stage. In Orion the data file name is not used
  • append (Bool) – If the flag is set to True the stage will be appended to the MD stages, otherwise the last stage will be overwritten by the newly created MD stage
  • log (String or None) – Log info
  • trajectory_fn (String, Int or None) – The trajectory name for local run or id in Orion associated with the new MD stage
  • trajectory_engine (String or None) – The MD engine used to generate the new MD stage. Possible names: OpenMM or Gromacs
  • trajectory_orion_ui (String) – The trajectory string name to be displayed in the Orion UI
Returns:boolean – True if the MD stage creation was successful
Return type:Bool
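
The effect of the append flag can be modelled on a plain list. This is a hypothetical sketch and not the MDOrion implementation: add_stage_sketch is an invented name, stages are represented by their names only, and the behaviour for an empty stage list with append=False is an assumption made for the sketch.

```python
def add_stage_sketch(stages, new_stage, append=True):
    """Hypothetical model of the documented append flag on a plain list."""
    if append or not stages:
        # Assumption: with no existing stage there is nothing to
        # overwrite, so the new stage is simply added
        stages.append(new_stage)
    else:
        # append=False: the last stage is overwritten by the new one
        stages[-1] = new_stage
    return True


stages = ["setup", "minimization"]
add_stage_sketch(stages, "equilibration")              # appended at the end
add_stage_sketch(stages, "production", append=False)   # replaces "equilibration"
print(stages)
```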

create_collection(name)

This method sets a collection field on the record to be used in Orion

Parameters:name (String) – A string used to identify the collection in the Orion UI
Returns:boolean – True if the collection creation in Orion was successful otherwise False
Return type:Bool
delete_parmed

This method deletes the Parmed object from the record. True is returned if the deletion was successful.

Returns:boolean – True if Parmed object deletion was successful
Return type:Bool
delete_stages

This method deletes all the record stages

Returns:boolean – True if the deletion was successful
Return type:Bool
get_conf_id

This method returns the identification field CONF ID present on the record

Returns:conformer_id – The conformer id
Return type:Int
get_extra_data_tar

This method returns the name of the directory where the extra data tar file has been unpacked

Returns:directory file name – The name of the directory
Return type:String
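
Once the directory name is available, its content can be inspected with the standard library. The helper below is a sketch under assumptions: list_extra_files is a hypothetical name, and has_extra_data_tar and get_extra_data_tar are assumed to be read without parentheses. The _StubExtra class points at a temporary directory only so that the sketch runs without MDOrion.

```python
import os
import tempfile


def list_extra_files(md_record):
    """Hypothetical helper: list the files unpacked from the extra data tar."""
    if not md_record.has_extra_data_tar:
        return []
    extra_dir = md_record.get_extra_data_tar
    found = []
    for root, _dirs, files in os.walk(extra_dir):
        for name in files:
            # Store paths relative to the unpacked directory
            found.append(os.path.relpath(os.path.join(root, name), extra_dir))
    return sorted(found)


class _StubExtra:
    """Stand-in record used only to exercise the helper."""
    has_extra_data_tar = True

    def __init__(self, directory):
        self.get_extra_data_tar = directory


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create one empty placeholder file in the stand-in directory
    open(os.path.join(tmp_dir, "report.txt"), "w").close()
    extra_files = list_extra_files(_StubExtra(tmp_dir))

print(extra_files)
```
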
get_flask

This method returns the flask molecule present on the record

Returns:flask – The flask present on the record otherwise an error is raised
Return type:OEMol
get_flask_id

This method returns the integer value of the flask identification field present on the record

Returns:flask_id – The unique flask identifier
Return type:Int
get_lig_id

This method returns the ligand identification field present on the record

Returns:ligand_id – The ligand identification id number
Return type:Int
get_ligand

This method returns the ligand molecule present on the record

Returns:ligand – The ligand molecule if the ligand has been set on the record otherwise an error is raised
Return type:OEMol
get_md_components

This method returns the MD Components if present on the record

Returns:md_components – The MD Components object
Return type:MDComponents
get_parmed(sync_stage_name=None)

This method returns the Parmed object. An exception is raised if the Parmed object cannot be found. If sync_stage_name is not None the parmed structure positions, velocities and box vectors will be synchronized with the MD State selected by passing the MD stage name

Parameters:sync_stage_name (String or None) – The stage name that is used to synchronize the Parmed structure
Returns:parmed – The Parmed Structure object
Return type:Parmed Structure
get_primary

This method returns the primary molecule present on the record

Returns:record – The Primary Molecule
Return type:OEMol
get_protein

This method returns the protein molecule present on the record

Returns:protein – The protein molecule if the protein has been set on the record otherwise an error is raised
Return type:OEMol
get_protein_traj

This method returns the protein molecule where conformers have been set as trajectory frames

Returns:multi_conformer_protein – The multi conformer protein
Return type:OEMol
get_record

This method returns the record

Returns:record – The record to be passed with the cubes
Return type:OERecord
get_title

This method returns the title present on the record

Returns:title – The title string if present on the record otherwise an error is raised
Return type:String
has_conf_id

This method checks if the identification field CONF ID is present on the record

Returns:boolean – True if the conformer id field is present on the record otherwise False
Return type:Bool
has_extra_data_tar

This method returns True if an extra data file in tar format is attached to the record

Returns:boolean – True if the extra data file is present on the record otherwise False
Return type:Bool
has_flask_id

This method checks if the flask identification field is present on the record

Returns:boolean – True if the flask ID field is present on the record otherwise False
Return type:Bool
has_lig_id

This method checks if the ligand identification field is present on the record

Returns:boolean – True if the ligand identification field is present on the record otherwise False
Return type:Bool
has_ligand

This method returns True if the ligand molecule is present on the record

Returns:boolean – True if the ligand molecule is present on the record otherwise False
Return type:Bool
has_md_components

This method returns True if the MD Components object is present on the record

Returns:boolean – True if the md components object is present on the record otherwise False
Return type:Bool
has_parmed

This method checks if the Parmed object is on the record.

Returns:boolean – True if the Parmed object is on the record otherwise False
Return type:Bool
has_protein

This method returns True if the protein molecule is present on the record

Returns:boolean – True if the protein molecule is present on the record otherwise False
Return type:Bool
has_stages

This method returns True if the record has a list of MD stages otherwise False

Returns:boolean – True if the record has a list of MD stages otherwise False
Return type:Bool
has_title

This method checks if the Title field is present on the record

Returns:boolean – True if the Title field is present on the record otherwise False
Return type:Bool
set_conf_id(conf_id)

This method sets the identification field for the conformer on the record

Parameters:conf_id (Int) – An identification integer for the record
Returns:boolean – True if the conformer id has been set on the record otherwise an error is raised
Return type:Bool
set_extra_data_tar(tar_fn, shard_name='')

This method sets the extra data file on the record

Parameters:
  • tar_fn (String) – The compressed data file name
  • shard_name (String) – In Orion the shard will be named by using the shard_name
Returns:boolean – True if the setting was successful
Return type:Bool

set_flask(flask)

This method sets the flask molecule on the record

Parameters:flask (OEMol) – The flask molecule to set on the record
Returns:boolean – True if the flask molecule has been set on the record otherwise an error is raised
Return type:Bool
set_flask_id(id)

This method sets the integer value of the flask identification field on the record

Parameters:id (Int) – An integer value for the flask identification field
Returns:boolean – True if the flask identification ID has been set as an integer on the record
Return type:Bool
set_lig_id(sys_id)

This method sets the ligand identification field on the record

Parameters:sys_id (Int) – An integer value for the ligand identification field
Returns:boolean – True if the value for the ligand identification field was successfully set on the record
Return type:Bool
set_ligand(ligand)

This method sets the ligand molecule on the record

Returns:boolean – returns True if the ligand has been set on the record otherwise an error is raised
Return type:Bool
set_md_components(md_components)

This method sets the MD Components on the record

Parameters:md_components (MDComponents) – The MD Components instance
Returns:boolean – True if the md components field was successfully set on the record
Return type:Bool
set_parmed(pmd, sync_stage_name=None, shard_name='')

This method sets the Parmed object and returns True if the setting was successful. If sync_stage_name is not None, the parmed structure positions, velocities and box vectors will be synchronized with the MD State selected by passing the MD stage name

Parameters:
  • pmd (Parmed Structure object) – The Parmed Structure object to be set on the record
  • sync_stage_name (String or None) – The stage name that is used to synchronize the Parmed structure
  • shard_name (String) – In Orion the shard will be named by using the shard_name
Returns:boolean – True if the setting was successful
Return type:Bool

set_primary(primary_mol)

This method sets the primary molecule on the record

Parameters:primary_mol (OEMol) – The primary molecule to set on the record
Returns:boolean – True if the primary molecule has been set on the record
Return type:Bool
set_protein(protein)

This method sets the protein molecule on the record

Returns:boolean – returns True if the protein has been set on the record otherwise an error is raised
Return type:Bool
set_protein_traj(protein_conf, shard_name='')

This method sets the multi conformer protein trajectory on the record

Parameters:
  • protein_conf (OEChem) – The multi conformer protein trajectory
  • shard_name (String) – In Orion the shard will be named by using the shard_name
Returns:boolean – True if the setting was successful
Return type:Bool

set_title(title)

This method sets the system Title field on the record

Parameters:title (String) – A string used to identify the molecular system
Returns:boolean – True if the system Title has been set on the record
Return type:Bool
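
Since the setters return True on success, several of them can be grouped in one call and their results combined. The helper below is a sketch and not part of MDOrion: tag_md_record is a hypothetical name, and _StubSetters is a stand-in that only records the calls so that the code runs without MDOrion.

```python
def tag_md_record(md_record, title, flask_id, lig_id=None, conf_id=None):
    """Hypothetical helper: set the title and identification fields together."""
    ok = md_record.set_title(title) and md_record.set_flask_id(flask_id)
    # Optional identifiers are set only when a value is given
    if lig_id is not None:
        ok = md_record.set_lig_id(lig_id) and ok
    if conf_id is not None:
        ok = md_record.set_conf_id(conf_id) and ok
    return bool(ok)


class _StubSetters:
    """Stand-in that stores the values passed to each setter."""

    def __init__(self):
        self.values = {}

    def set_title(self, title):
        self.values["title"] = title
        return True

    def set_flask_id(self, id):
        self.values["flask_id"] = id
        return True

    def set_lig_id(self, sys_id):
        self.values["lig_id"] = sys_id
        return True

    def set_conf_id(self, conf_id):
        self.values["conf_id"] = conf_id
        return True


stub_setters = _StubSetters()
tagged = tag_md_record(stub_setters, "protein_ligand", flask_id=0, lig_id=5)
print(tagged, stub_setters.values)
```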