MD DataRecord
MD DataRecord: a brief overview
Molecular Dynamics (MD) simulations are notoriously time-consuming and computationally demanding. They also produce a large amount of data that must be stored and retrieved, such as atomic coordinates, atomic velocities, forces and energies, to name only a few, so extracting and managing this data becomes crucial. The MD DataRecord API simplifies access to MD data when programming MD floes for Orion. In the OpenEye Datarecord model, data is exchanged between cubes in the form of data records, where POD data, custom objects, JSON data etc. are stored and retrieved by using the associated field names and types. The MD DataRecord API is built on top of the OpenEye Datarecord: it standardizes the record content produced during MD runs in a well-structured format and provides an API entry point to access it.
MD DataRecord structure
The MD data produced during MD runs is structured in what is called the md record:
the md record contains a sub-record named md stages where md information is saved. This sub-record is a list of md stage records;
each md stage is itself a record with an associated name, type, log data, topology and MD State info. The latter is a custom object that stores the data needed to restart MD runs, such as atomic positions, velocities and box information;
the md record at the top level contains a Parmed object used to carry the whole system parametrization data;
the md record at the top level can also contain other data, such as the MDComponents object used to carry info related to the different parts of the md system (ligand, protein, solvent, cofactors etc.), or the starting ligand and protein with their names and unique identifiers such as the flask id, cofactor ids etc.
The following picture shows the md record structure and its main components
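In code terms, the layout described above can be pictured with plain Python dataclasses. This is only an illustrative sketch: `StageSketch` and `MDRecordSketch` are hypothetical stand-ins invented here, not MDOrion or OpenEye types, and the real record stores its content in OERecord fields.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StageSketch:
    """Hypothetical stand-in for one md stage record."""
    name: str
    stage_type: str
    log: str = ""
    topology: Any = None   # an OEMol in the real record
    md_state: Any = None   # positions, velocities and box information


@dataclass
class MDRecordSketch:
    """Hypothetical stand-in for the top-level md record."""
    parmed: Any = None          # whole-system parametrization data
    md_components: Any = None   # ligand, protein, solvent, cofactors etc.
    flask_id: Optional[int] = None
    md_stages: list = field(default_factory=list)  # list of stage records


record_sketch = MDRecordSketch(flask_id=0)
record_sketch.md_stages.append(
    StageSketch(name="System Minimization", stage_type="MINIMIZATION"))
record_sketch.md_stages.append(
    StageSketch(name="Production", stage_type="NPT"))

# The stage names are read from the md stages sub-record, in order
stage_names = [stage.name for stage in record_sketch.md_stages]
```

The point of the sketch is the nesting: system-wide data sits at the top level, while per-stage data lives inside the ordered list of stages.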
The md record is user accessible through the MDDataRecord API. To use it, the MDOrion package must be installed. Installing the package also requires access to the OpenEye Magpie repository for some dependencies. The API has been designed to work transparently both locally and in Orion. To use the API, create an MDDataRecord object starting from an OERecord object; getter and setter functions can then be used to access the md data (see the full MDDataRecord API Documentation).
Code snippets
The following code snippets give an idea of how to use the API.
Warning
In the following examples record is an OpenEye Datarecord produced by running MD floes such as Solvate and Run MD or Solvate and Run Protein-Ligand MD. Starting from MDOrion package v2.0.0, the datasets produced by the Short Trajectory MD with analysis floe are “ligand centric” and the MDDataRecord API cannot be applied directly to the produced records. However, the API still works on the conformer (pose) ligand sub-records, which are still md records.
from orionmdcore.mdrecord import MDDataRecord
# To use the MD Datarecord API an MDDataRecord instance is built starting from an OERecord
md_record = MDDataRecord(record)
# At this point getters and setters can be used to
# extract/set info from/to the md record
# MD Stage names available
stage_names = md_record.get_stages_names
# Get a MD Record Stage
md_stage_production = md_record.get_stage_by_name(stage_names[2])
# Extract the logging info from a stage
info_stage = md_record.get_stage_info(stg_name=stage_names[2])
# Extract the MD State from a stage
md_set_up_state = md_record.get_stage_state(stg_name=stage_names[0])
# Extract the Parmed Structure from the md record and synchronize the
# positions, velocities and box data to the selected stage state
pmd_structure = md_record.get_parmed(sync_stage_name=stage_names[2])
# Extract the OEMol system flask from the record and its title
flask = md_record.get_flask
flask_title = md_record.get_title
# Extract the trajectory file name associated with a stage. In this
# case the stage trajectory is unpacked and the trajectory name
# can be loaded by any md analysis package
trj_name = md_record.get_stage_trajectory(stg_name=stage_names[2])
# Add a new Stage to the md_stages record
md_record.add_new_stage(stage_name="New_Stage",
                        stage_type="NPT",
                        topology=flask,
                        mdstate=md_set_up_state,
                        data_fn="test.tar.gz")
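Several getters documented below raise an error when the requested data is missing, so it can help to check the matching `has_*` property first. The following is a minimal sketch of that pattern, assuming only the documented properties `has_title`, `get_title`, `has_stages` and `get_stages_names`. The helper `summarize_md_record` and the stub class are hypothetical and exist only so that the example runs without MDOrion installed.

```python
def summarize_md_record(md_record):
    """Collect a few fields from an MDDataRecord-like object, skipping absent ones."""
    summary = {"title": None, "stages": []}
    # has_* and get_* are properties, so they are read without parentheses
    if md_record.has_title:
        summary["title"] = md_record.get_title
    if md_record.has_stages:
        summary["stages"] = list(md_record.get_stages_names)
    return summary


class _StubMDRecord:
    """Hypothetical stand-in that mimics the documented property names."""
    has_title = True
    get_title = "protein-ligand flask"
    has_stages = True
    get_stages_names = ["Flask Setup", "Minimization", "Production"]


summary = summarize_md_record(_StubMDRecord())
```

With a real record, the same function would be called as `summarize_md_record(MDDataRecord(record))`.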
MDDataRecord API Documentation
The API documentation follows.
- class orionmdcore.mdrecord.mdrecord.MDDataRecord(record, inplace=True)
This class implements the MD Datarecord API by using getter and setter functions
- add_new_stage(stage_name, stage_type, topology, mdstate, data_fn, append=True, log=None, info=None, trajectory_fn=None, trajectory_engine=None, trajectory_orion_ui='OrionFile')
This method adds a new MD stage to the MD stage record
- Parameters
stage_name (String) – The new MD stage name
stage_type (String) – The MD stage type e.g. SETUP, MINIMIZATION etc.
topology (OEMol) – The topology
mdstate (MDState) – The new mdstate made of state positions, velocities and box vectors
data_fn (String) – The data file name is used only locally and is linked to the MD data associated with the stage. In Orion the data file name is not used
append (Bool) – If the flag is set to True the stage will be appended to the MD stages, otherwise the last stage will be overwritten by the newly created MD stage
log (String or None) – Log info
info (Python Dictionary or None) – Info Dictionary of Plain Data
trajectory_fn (String, Int or None) – The trajectory name for a local run, or the trajectory id in Orion, associated with the new MD stage
trajectory_engine (String or None) – The MD engine used to generate the new MD stage. Possible names: OpenMM or Gromacs
trajectory_orion_ui (String) – The trajectory string name to be displayed in the Orion UI
- Returns
boolean – True if the MD stage creation was successful
- Return type
Bool
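The effect of the append flag can be pictured on a plain Python list. The sketch below is only an illustration under stated assumptions: `add_stage` is a hypothetical helper, the stages are plain strings rather than OERecord objects, and the behavior when no stage exists yet is a guess made for this sketch, not taken from MDOrion.

```python
def add_stage(stages, new_stage, append=True):
    """Mimic the documented `append` flag on a plain list of stages."""
    if append or not stages:
        # Assumption of this sketch: with no stages yet, the stage is simply added
        stages.append(new_stage)
    else:
        # append=False: the last stage is overwritten by the new one
        stages[-1] = new_stage
    return True


stages = ["Flask Setup", "Minimization"]
add_stage(stages, "NPT Equilibration")          # appended as a third stage
add_stage(stages, "NPT Rerun", append=False)    # replaces "NPT Equilibration"
```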
- delete_stage_by_idx(idx)
This method deletes an MD stage selected by passing its index. If the stage index cannot be found an exception is raised.
- Parameters
idx (Int) – The MD stage index
- Returns
boolean – True if the deletion was successful
- Return type
Bool
- delete_stage_by_name(stg_name='last')
This method deletes an MD stage selected by passing the string name. If the string “last” is passed (default) the last MD stage is deleted. If no stage name has been found an exception is raised.
- Parameters
stg_name (String) – The MD stage name
- Returns
boolean – True if the deletion was successful
- Return type
Bool
- property delete_stages
This method deletes all the record stages
- Returns
boolean – True if the deletion was successful
- Return type
Bool
- property get_conf_id
This method returns the identification field CONF ID present on the record
- Returns
conformed_id – The conformer id
- Return type
Int
- property get_flask
This method returns the flask molecule present on the record
- Returns
flask – The flask present on the record; if it is missing an error is raised
- Return type
OEMol
- property get_flask_id
This method returns the integer value of the flask identification field present on the record
- Returns
flask_id – The unique flask identifier
- Return type
Int
- property get_last_stage
This method returns the last MD stage of the MD record stages
- Returns
record – The last stage of the MD record stages
- Return type
OERecord
- property get_lig_id
This method returns the ligand identification field present on the record
- Returns
ligand_id – The ligand identification id number
- Return type
Int
- property get_ligand
This method returns the ligand molecule present on the record
- Returns
ligand – The ligand molecule if the ligand has been set on the record, otherwise an error is raised
- Return type
OEMol
- property get_md_components
This method returns the MD Components if present on the record
- Returns
md_components – The MD Components object
- Return type
MDComponents
- property get_primary
This method returns the primary molecule present on the record
- Returns
record – The Primary Molecule
- Return type
OEMol
- property get_protein
This method returns the protein molecule present on the record
- Returns
protein – The protein molecule if the protein has been set on the record, otherwise an error is raised
- Return type
OEMol
- property get_record
This method returns the record
- Returns
record – The record to be passed with the cubes
- Return type
OERecord
- get_stage_by_idx(idx)
This method returns an MD stage selected by passing an index. If the stage is not found an exception is raised.
- Parameters
idx (Int) – The stage index to retrieve
- Returns
record – The MD stage selected by its index
- Return type
OERecord
- get_stage_by_name(stg_name='last')
This method returns an MD stage selected by passing its string name. If the string “last” is passed (default) the last MD stage is returned. If multiple stages have the same name the first occurrence is returned. If the stage name is not found an exception is raised.
- Parameters
stg_name (String) – The MD stage name
- Returns
record – The MD stage selected by its name
- Return type
OERecord
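The lookup rules described for get_stage_by_name can be sketched on a plain list of dictionaries. This is not the MDOrion implementation: `find_stage` is a hypothetical helper, the stages are dictionaries rather than OERecord objects, and `ValueError` is an assumed exception type, since the documentation only states that an exception is raised.

```python
def find_stage(stages, stg_name="last"):
    """Return a stage by name, following the documented lookup rules."""
    if not stages:
        raise ValueError("no MD stages available")
    if stg_name == "last":
        # Default: the last MD stage is returned
        return stages[-1]
    for stage in stages:
        if stage["name"] == stg_name:
            # With duplicated names the first occurrence wins
            return stage
    raise ValueError(f"MD stage not found: {stg_name}")


stages = [
    {"name": "Setup", "type": "SETUP"},
    {"name": "Equil", "type": "NVT"},
    {"name": "Equil", "type": "NPT"},
]
last_stage = find_stage(stages)            # the second "Equil" stage (NPT)
first_equil = find_stage(stages, "Equil")  # the first "Equil" stage (NVT)
```

Because the first match wins, giving each stage a unique name is the simplest way to keep name-based lookups unambiguous.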
- get_stage_info(stg_name='last')
This method returns the info related to the selected stage name. If no stage name is passed the last stage is selected.
- Parameters
stg_name (String) – The MD stage name
- Returns
info_string – The info associated with the selected MD stage otherwise None
- Return type
String
- get_stage_logs(stg_name='last')
This method returns the logs related to the selected stage name. If no stage name is passed the last stage is selected.
- Parameters
stg_name (String) – The MD stage name
- Returns
info_string – The logs associated with the selected MD stage
- Return type
String
- get_stage_state(stg_name='last')
This method returns the MD State of the selected stage name. If no stage name is passed the last stage is selected.
- Parameters
stg_name (String) – The MD stage name
- Returns
state – The MD state of the selected MD stage
- Return type
MDState
- get_stage_topology(stg_name='last')
This method returns the MD topology of the selected stage name. If no stage name is passed the last stage is selected.
- Parameters
stg_name (String) – The MD stage name
- Returns
topology – The topology of the selected MD stage
- Return type
OEMol
- get_stage_trajectory(stg_name='last')
This method returns the trajectory file name associated with the md data. If the trajectory is not found None is returned.
- Parameters
stg_name (String) – The MD stage name
- Returns
trajectory_filename – Trajectory file name if the process was successful otherwise None
- Return type
String or None
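Since get_stage_trajectory returns None rather than raising when no trajectory is found, callers may want to turn that case into an explicit error before handing the file name to an analysis package. The following is a minimal sketch, assuming only the documented `get_stage_trajectory(stg_name=...)` call. The helper `require_trajectory`, the stub class and the file name are hypothetical and only make the example runnable without MDOrion.

```python
def require_trajectory(md_record, stg_name="last"):
    """Return the stage trajectory file name or fail with a clear error."""
    trj_fn = md_record.get_stage_trajectory(stg_name=stg_name)
    if trj_fn is None:
        raise FileNotFoundError(f"no trajectory available for stage {stg_name!r}")
    return trj_fn


class _StubMDRecord:
    """Hypothetical stand-in exposing only get_stage_trajectory."""

    def __init__(self, trajectories):
        self._trajectories = trajectories

    def get_stage_trajectory(self, stg_name="last"):
        # Mirrors the documented contract: a file name or None
        return self._trajectories.get(stg_name)


stub = _StubMDRecord({"Production": "production_traj.h5"})
trj_name = require_trajectory(stub, "Production")
```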
- property get_stages
This method returns the MD stage record list with all the MD stages.
- Returns
record_list – The MD stages record list
- Return type
list
- property get_stages_names
This method returns the list of MD stage names.
- Returns
list – The MD stage name list
- Return type
list
- property get_title
This method returns the title present on the record
- Returns
title – The title string if present on the record, otherwise an error is raised
- Return type
String
- property has_conf_id
This method checks if the identification field CONF ID is present on the record
- Returns
boolean – True if the conformer id field is present on the record otherwise False
- Return type
Bool
- property has_extra_data_tar
This method returns True if an extra data file in tar format is attached to the record
- Returns
boolean – True if extra data file is present on the record otherwise False
- Return type
Bool
- property has_flask_id
This method checks if the flask identification field is present on the record
- Returns
boolean – True if the flask ID field is present on the record otherwise False
- Return type
Bool
- property has_lig_id
This method checks if the ligand identification field is present on the record
- Returns
boolean – True if the ligand identification field is present on the record otherwise False
- Return type
Bool
- property has_ligand
This method returns True if the ligand molecule is present on the record
- Returns
boolean – True if the ligand molecule is present on the record otherwise False
- Return type
Bool
- property has_ligand_traj
This method checks if the multi conformer ligand is on the record.
- Returns
boolean – True if the multi conformer ligand is on the record otherwise False
- Return type
Bool
- property has_md_components
This method returns True if the MD Components object is present on the record
- Returns
boolean – True if the md components object is present on the record otherwise False
- Return type
Bool
- property has_parmed
This method checks if the Parmed object is on the record.
- Returns
boolean – True if the Parmed object is on the record otherwise False
- Return type
Bool
- property has_protein
This method returns True if the protein molecule is present on the record
- Returns
boolean – True if the protein molecule is present on the record otherwise False
- Return type
Bool
- property has_protein_traj
This method checks if the multi conformer protein is on the record.
- Returns
boolean – True if the multi conformer protein is on the record otherwise False
- Return type
Bool
- has_stage_info(stg_name='last')
This method returns True if the MD stage selected by its string name has info present on the MD stage record, otherwise False.
- Parameters
stg_name (String) – The MD stage name
- Returns
boolean – True if the MD stage has info otherwise False
- Return type
Bool
- has_stage_name(stg_name)
This method returns True if the MD stage selected by its string name is present on the MD stage record, otherwise False.
- Parameters
stg_name (String) – The MD stage name
- Returns
boolean – True if the MD stage name is present on the MD stages record otherwise False
- Return type
Bool
- property has_stages
This method returns True if the record has an MD record list, otherwise False
- Returns
boolean – True if the record has a list of MD stages otherwise False
- Return type
Bool
- property has_title
This method checks if the Title field is present on the record
- Returns
boolean – True if the Title field is present on the record otherwise False
- Return type
Bool
- property has_water_traj
This method checks if the multi conformer water is on the record.
- Returns
boolean – True if the multi conformer water is on the record otherwise False
- Return type
Bool
- set_conf_id(conf_id)
This method sets the identification field for the conformer on the record
- Parameters
conf_id (Int) – An identification integer for the record
- Returns
boolean – True if the conformer id has been set on the record otherwise an error is raised
- Return type
Bool
- set_flask(flask)
This method sets the flask molecule on the record
- Parameters
flask (OEMol) – The flask molecule to set on the record
- Returns
record – True if the flask molecule has been set on the record otherwise an error is raised
- Return type
Bool
- set_flask_id(id)
This method sets the integer value of the flask identification field on the record
- Parameters
id (Int) – An integer value for the flask identification field
- Returns
boolean – True if the flask identification ID has been set as an integer on the record
- Return type
Bool
- set_lig_id(sys_id)
This method sets the ligand identification field on the record
- Parameters
sys_id (Int) – An integer value for the ligand identification field
- Returns
boolean – True if the value for the ligand identification field was successfully set on the record
- Return type
Bool
- set_ligand(ligand)
This method sets the ligand molecule on the record
- Returns
boolean – returns True if the ligand has been set on the record otherwise an error is raised
- Return type
Bool
- set_md_components(md_components)
This method sets the MD Components on the record
- Parameters
md_components (MDComponents) – The MD Components instance
- Returns
boolean – True if the md components field was successfully set on the record
- Return type
Bool
- set_primary(primary_mol)
This method sets the primary molecule on the record
- Parameters
primary_mol (OEMol) – The primary molecule to set on the record
- Returns
boolean – True if the primary molecule has been set on the record
- Return type
Bool
- set_protein(protein)
This method sets the protein molecule on the record
- Returns
boolean – returns True if the protein has been set on the record otherwise an error is raised
- Return type
Bool
- set_stage_info(info_dic, stg_name='last')
This method sets the stage info field on the selected stage by name
- Parameters
stg_name (String) – The MD Stage name
info_dic (Python dictionary) – The dictionary containing the Plain Data info to save
- Returns
boolean – True if the stage info has been set on the selected stage
- Return type
Bool
- set_title(title)
This method sets the system Title field on the record
- Parameters
title (String) – A string used to identify the molecular system
- Returns
boolean – True if the system Title has been set on the record
- Return type
Bool