OpenEye-drconvert API Reference

Conversion to Records

class drconvert.MolFileConverter(path, options=None, display_name=None)

Convert a molecule file to OERecords

__iter__()

Iterator of OERecords

class drconvert.MolConversionOptions(isomeric_conf_test=False, schema_limit=0, clear_mol_data=True, unique_values_limit=25, sample_percent=100.0, generic_tags=[], mol_title_field='', smiles_field_name='Original SMILES')

This class contains options for conversion from molecules to records

class drconvert.CSVConverter(path, options=None, display_name=None)

Convert a CSV file to OERecords

__iter__()

Iterator of OERecords

class drconvert.CSVConversionOptions(schema_limit=0, unique_values_limit=25, delimiter=None, sample_percent=100.0, smiles_field_name='Original SMILES')

This class contains options for converting .csv files to records
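As a rough illustration of how an options object might be paired with its converter, the sketch below builds a CSVConversionOptions from keyword arguments in the signature above and passes it to CSVConverter. The helper name convert_tab_separated and the 10 percent default are illustrative assumptions, and the reading of sample_percent as a sampling rate is inferred from its name only.

```python
def convert_tab_separated(path, sample_percent=10.0):
    """Yield records from a tab-separated file (hypothetical helper)."""
    # Imported lazily so this module can be loaded without drconvert installed.
    import drconvert

    # Keyword names are taken from the CSVConversionOptions signature;
    # what sample_percent does exactly is an assumption based on its name.
    options = drconvert.CSVConversionOptions(
        delimiter="\t",
        sample_percent=sample_percent,
    )
    converter = drconvert.CSVConverter(path, options=options)
    # Converters are documented as iterators of OERecords.
    yield from converter
```

Because the helper is a generator, nothing is read until the caller starts iterating.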

class drconvert.OEDUConverter(path, options=None, display_name=None)

Convert an OEDesignUnit (.oedu) file to OERecords

__iter__()

Iterator of OERecords

class drconvert.SmilesFileConverter(path, options=None, display_name=None)

Convert a SMILES file to OERecords

__iter__()

Iterator of OERecords

drconvert.get_converter(path: str, display_name: Optional[str] = None)

Returns the appropriate converter for a specified file.

Parameters
  • path (str) – The path to the file to be converted

  • display_name (str or None) – The display name for the file

Returns

An instance of a converter class
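A minimal sketch of how get_converter might be used to peek at the first few records of a file, assuming only what is documented above: the function returns a converter, and converters are iterable. The helper name preview_records and its limit parameter are hypothetical.

```python
from itertools import islice


def preview_records(path, limit=5, display_name=None):
    """Return up to `limit` records from the file at `path` (hypothetical helper)."""
    # Imported lazily so this module can be loaded without drconvert installed.
    import drconvert

    # get_converter chooses the converter class for the given file.
    converter = drconvert.get_converter(path, display_name=display_name)
    # islice stops iteration after `limit` records.
    return list(islice(converter, limit))
```

Letting get_converter choose the class keeps the calling code independent of the input format.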

Conversion from Records

class drconvert.RecordConvertToCSV(path, delimiter=',')

Convert a record file to CSV format

__iter__()

Iterator of CSV lines.

Returns header as the first item, followed by the rows of the CSV

Returns

Iterator of CSV Rows as strings
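Since the iterator yields the header followed by each row as a string, writing a CSV file could look roughly like the sketch below. The helper write_csv_lines is hypothetical and accepts any iterable of strings. Whether the yielded rows already end in a newline is not stated here, so the sketch normalises line endings as a precaution.

```python
def write_csv_lines(lines, out_path):
    """Write an iterable of CSV row strings to out_path; return rows written."""
    count = 0
    with open(out_path, "w", encoding="utf-8", newline="") as handle:
        for line in lines:
            # Whether rows carry their own line ending is not documented,
            # so normalise to exactly one newline per row.
            handle.write(line.rstrip("\r\n") + "\n")
            count += 1
    return count


# Intended use (requires drconvert; "input.oedb" is a placeholder path):
#   import drconvert
#   write_csv_lines(drconvert.RecordConvertToCSV("input.oedb"), "output.csv")
```

The returned count includes the header row.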

class drconvert.RecordConvertToMols(path)

Convert a record file to OEMols

__iter__()

Iterator of OEMols

class drconvert.RecordConvertToOEDU(path)

Convert a record file to OEDU

__iter__()

Iterator of OEDUs

class drconvert.ArchiveConverter(path, chunk_size=16777216)

Convert a tarball or zip file into OERecords

__iter__()

Iterator of OERecords

drconvert.record_to_mol(record)

Convert a single record to an OEMol

drconvert.record_to_du(record)

Convert a single record with a design unit to an OEDesignUnit

This function converts only the design unit on the record and does not preserve other field data, so the conversion cannot be round-tripped.

Conversion to Alternate Formats

DRConvert supports alternatives to OpenEye’s formats. The currently supported formats are pandas DataFrames and Apache Parquet.

These formats require extra libraries that DRConvert does not depend on. To use them, install pandas>=0.25.0,<0.26.0 and pyarrow>=0.14.1,<0.15.0.
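Because these libraries are optional, it can help to check for them before importing the pandas or Parquet helpers. The sketch below uses only the standard library. The function name missing_optional_packages is hypothetical, and the check covers whether a package can be found, not whether its version falls in the required range.

```python
from importlib.util import find_spec


def missing_optional_packages(names=("pandas", "pyarrow")):
    """Return the names from `names` that cannot be found for import."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_packages()
    if missing:
        print("Install before using the alternate formats:", ", ".join(missing))
```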

Pandas

drconvert.pandas.read_record_file_to_dataframe(path, serializable=False)

Reads a record file into a pandas DataFrame

Requires pandas. If pandas is not installed, importing will trigger an ImportError

Parameters
  • path (string) – Path to record file

  • serializable (bool) – If True, keep all non-POD (plain old data) types as bytes. Required to serialize the DataFrame in some cases

Returns

A pandas DataFrame representation of the record file

drconvert.pandas.record_to_dataframe(record, serializable=False)

Converts an OERecord to a pandas DataFrame.

Requires pandas. If pandas is not installed, importing will trigger an ImportError

Parameters
  • record (OERecord) – An OERecord

  • serializable (bool) – If True, keep all non-POD (plain old data) types as bytes. Required to serialize the DataFrame in some cases

Returns

A pandas DataFrame representation of the record
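To illustrate the serializable flag, the following sketch reads a record file with serializable=True and then pickles the result. Pickle is only a stand-in for whatever serialization is actually needed, and the helper name record_file_to_pickle is hypothetical. Which cases require the flag is not specified in this reference.

```python
import pickle


def record_file_to_pickle(record_path, pickle_path):
    """Read a record file into a DataFrame and pickle it (hypothetical helper)."""
    # Imported lazily; drconvert.pandas needs pandas to be installed.
    import drconvert.pandas as drpandas

    # serializable=True keeps non-POD values as bytes, which the reference
    # describes as required for serialization in some cases.
    frame = drpandas.read_record_file_to_dataframe(record_path, serializable=True)
    with open(pickle_path, "wb") as handle:
        pickle.dump(frame, handle)
    return frame
```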

Parquet

drconvert.parquet.parquet_to_dataframe(path)

Returns a pandas DataFrame read from a Parquet file

Requires pandas and pyarrow. If either is not installed, importing will trigger an ImportError

Parameters
  • path (string) – Path to parquet file

Returns

A pandas DataFrame read from the parquet file

drconvert.parquet.record_file_to_parquet(path, output_path, compression='snappy')

Reads a record file and writes out a parquet file

Requires pandas and pyarrow. If either is not installed, importing will trigger an ImportError

Parameters
  • path (string) – Path to record file

  • output_path (string) – Path to write parquet file

  • compression (string) – Compression format. Valid options are 'snappy', 'gzip', 'brotli', or None. Default is 'snappy'
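The two Parquet functions can be chained: write a record file out as Parquet, then load it back as a DataFrame. The sketch below assumes only the signatures documented above. The helper name record_file_to_frame_via_parquet and the up-front check of the compression value are illustrative additions, not part of DRConvert.

```python
VALID_COMPRESSION = ("snappy", "gzip", "brotli", None)


def record_file_to_frame_via_parquet(record_path, parquet_path, compression="snappy"):
    """Write a record file as Parquet, then load it as a DataFrame (hypothetical helper)."""
    # Reject unsupported values before any file is touched.
    if compression not in VALID_COMPRESSION:
        raise ValueError("unsupported compression: %r" % (compression,))

    # Imported lazily; drconvert.parquet needs pandas and pyarrow.
    import drconvert.parquet as drparquet

    drparquet.record_file_to_parquet(record_path, parquet_path, compression=compression)
    return drparquet.parquet_to_dataframe(parquet_path)
```

Checking the compression value first means a typo fails quickly instead of partway through a long conversion.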