Record Basics
An OEMolRecord is a container class for data that is often used by a Cube. OEMolRecord
handles the serialization and deserialization of data that is necessary to pass
records between cubes. A field identifies the name and type (e.g., integer or OEMol)
of a piece of data on the record, as well as optional metadata. A record can have any
number of fields, but the name of each field on the record must be unique. OEMolRecord
supports many built-in field types, which are available on datarecord.Types.
The following three methods on OEMolRecord provide core functionality.
record.has_value(field)
Returns True if the field has a value.
record.get_value(field)
Returns a copy of the field’s value.
record.set_value(field, value)
Sets the field’s value.
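The semantics of these three methods can be illustrated with a small pure-Python stand-in. This is only a sketch: ToyField and ToyRecord are hypothetical names invented for illustration and are not part of datarecord. The sketch assumes only what is stated above, namely that a field is located by its name and that get_value returns a copy.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyField:
    """Hypothetical stand-in for a field: a name plus a Python type."""
    name: str
    type_: type


class ToyRecord:
    """Hypothetical stand-in for a record; not the datarecord implementation."""

    def __init__(self):
        # Values are keyed by field name, so each name can appear only once.
        self._values = {}

    def has_value(self, field):
        return field.name in self._values

    def get_value(self, field):
        # Return a copy so callers cannot mutate the stored value in place.
        return copy.deepcopy(self._values[field.name])

    def set_value(self, field, value):
        self._values[field.name] = value


scores = ToyField("scores", list)
record = ToyRecord()
assert not record.has_value(scores)

record.set_value(scores, [1, 2, 3])
fetched = record.get_value(scores)
fetched.append(4)  # modifies the copy only

# The stored value is unchanged until set_value is called again.
assert record.get_value(scores) == [1, 2, 3]
record.set_value(scores, fetched)
assert record.get_value(scores) == [1, 2, 3, 4]
```

If get_value returns a copy as described, a value that is fetched and then modified has to be written back with set_value before the change appears on the record.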
Field Parameters
Cubes often read and write fields on the records they receive, so a cube must be told
the names of those fields (recall that a field is located on a record by its name).
This is accomplished by using a FieldParameter.
A FieldParameter is a StringParameter that is converted into an OEField when a cube is
executed, with the field’s name populated by the value of the parameter. A field_type
argument passed to the FieldParameter constructor specifies the corresponding field’s
type. The parameter type of a FieldParameter is always a string because it represents
the name of a field, not the type of a field.
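As a rough mental model of that conversion, the sketch below uses hypothetical stand-ins (ToyField, ToyFieldParameter and its to_field method), none of which belong to orionplatform or datarecord. It assumes only what is described above: the parameter's value is a string, the field type is fixed in the constructor, and the string becomes the field's name when the cube runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyField:
    """Hypothetical stand-in for a field: a name plus a type label."""
    name: str
    field_type: str


class ToyFieldParameter:
    """Hypothetical stand-in for a string parameter that names a field."""

    def __init__(self, field_type):
        # The field type is chosen by the cube author, not by the user.
        self.field_type = field_type

    def to_field(self, value):
        # The parameter's value is always a string: the field's name.
        if not isinstance(value, str):
            raise TypeError("a field parameter value must be a string")
        return ToyField(value, self.field_type)


param = ToyFieldParameter("Int")

first = param.to_field("int_field")
second = param.to_field("heavy_atom_count")

# Only the name changes with the parameter value; the type does not.
assert first == ToyField("int_field", "Int")
assert second == ToyField("heavy_atom_count", "Int")
```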
In the following example, the name of the integer field that the cube reads, squares,
and writes back to each record is set by the value of the int_field parameter.
from floe.api import ComputeCube
from orionplatform.mixins import RecordPortsMixin
from orionplatform.parameters import FieldParameter
from datarecord import Types


class SquareIntCube(RecordPortsMixin, ComputeCube):
    int_field = FieldParameter("int_field", Types.Int, required=True)

    def process(self, record, port):
        if record.has_value(self.args.int_field):
            x = record.get_value(self.args.int_field)
            record.set_value(self.args.int_field, x * x)
            self.success.emit(record)
        else:
            self.failure.emit(record)
Primary Molecule
An OEMolRecord supports the concept of a primary molecule. The primary molecule
is the ‘default’ molecule for a record. As a best practice, cubes that operate on
molecules should operate on the primary molecule by default. A cube may use the
PrimaryMolFieldParameter for interacting with a primary molecule. Cube authors may
also use the InputMolFieldParameter and OutputMolFieldParameter specializations of
PrimaryMolFieldParameter for reading and writing primary molecules, respectively.
By default, both of these parameters return the primary molecule field, but either
can be overridden at runtime to specify a different field.
The following example adds explicit hydrogens to a molecule. If the user does not specify either the input or output molecule field at runtime, this cube will read the molecule from the primary molecule field, modify it, and write it back to the primary molecule field (overwriting the input).
from floe.api import ComputeCube
from orionplatform.mixins import RecordPortsMixin
from orionplatform.parameters import PrimaryMolFieldParameter
# Note: oechem must be imported before OpenEye toolkits
from openeye.oechem import OEAddExplicitHydrogens


# RecordPortsMixin supplies the success and failure ports used below.
class AddHydrogensCube(RecordPortsMixin, ComputeCube):
    in_mol_field = PrimaryMolFieldParameter("in_mol_field", read_only=True)
    out_mol_field = PrimaryMolFieldParameter("out_mol_field")

    def process(self, record, port):
        if record.has_value(self.args.in_mol_field):
            # get_value returns a copy, so the result must be written back.
            mol = record.get_value(self.args.in_mol_field)
            OEAddExplicitHydrogens(mol)
            record.set_value(self.args.out_mol_field, mol)
            self.success.emit(record)
        else:
            self.failure.emit(record)
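The default-and-override behavior of the molecule field parameters can be pictured with the following sketch. Every name in it (PRIMARY_MOL_FIELD, resolve_mol_field, toy_process, the dict-backed record, and the string standing in for a molecule) is hypothetical and chosen for illustration; the actual resolution is performed by orionplatform and may differ in detail.

```python
# Hypothetical name for the record's default molecule field.
PRIMARY_MOL_FIELD = "primary_mol"


def resolve_mol_field(user_value=None):
    """Return the field name a cube should use for its molecule."""
    # Fall back to the primary molecule field unless overridden at runtime.
    return user_value if user_value else PRIMARY_MOL_FIELD


def toy_process(record, in_field=None, out_field=None):
    """Read from the input field, modify the value, write to the output field."""
    src = resolve_mol_field(in_field)
    dst = resolve_mol_field(out_field)
    if src not in record:
        return None  # corresponds to emitting on the failure port
    updated = dict(record)
    updated[dst] = record[src] + " +H"  # placeholder for a real modification
    return updated


record = {"primary_mol": "c1ccccc1"}

# No overrides: the primary molecule is read and then overwritten.
assert toy_process(record) == {"primary_mol": "c1ccccc1 +H"}

# Output overridden: the primary molecule is left untouched.
assert toy_process(record, out_field="mol_with_h") == {
    "primary_mol": "c1ccccc1",
    "mol_with_h": "c1ccccc1 +H",
}

# Input overridden to a field the record lacks: treated as a failure.
assert toy_process(record, in_field="missing") is None
```

Overriding only the output field is the case to reach for when the original molecule must be preserved alongside the modified one.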