Orion Programming Documentation

Working Programmatically with Molecule Search

Note

This material was developed with version 6.3 of Orion Platform.

Use the molecule search commands to create and work with molecule search databases and queries. These databases can then be used in Orion to perform molecule searches on the Molecule Search Page.

Search Commands

ocli molsearch query create is the base command to create molecule search queries remotely.

Five types of searches are available: 3D similarity, exact (2D), similarity (GraphSim 2D), substructure (2D), and title (2D). Each is listed with a short description in the output of ocli molsearch query create -h.

> ocli molsearch query create -h

Usage: ocli molsearch query create [OPTIONS] COMMAND [ARGS]...

  Create a molecule search query.

Options:
  -h, --help  Show this message and exit.

Commands:
  exact         2D exact search using a SMILES string.
  fastrocs      3D similarity search using a FastROCS query.
  graphsim      2D similarity search using a GraphSim query.
  help          Long format of help CMD, or the first level commands if...
  substructure  2D substructure search using an OEChem substructure query.
  title         Search a 2D database using a space separated list of...
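As a minimal sketch of how the five search types map onto the subcommands listed above, the following Python helper assembles an argument list for ocli molsearch query create. The helper name build_query_command and the short search-kind keys are hypothetical; only the subcommand names and the --name option come from the help output in this chapter.

```python
import shlex

# Keys are illustrative labels; values are the subcommands shown in the
# `ocli molsearch query create -h` output.
SUBCOMMAND_FOR_SEARCH = {
    "exact": "exact",                # 2D exact search (SMILES)
    "similarity": "graphsim",        # 2D similarity search (SMILES)
    "substructure": "substructure",  # 2D substructure search (file or SMARTS)
    "title": "title",                # 2D title search (one or more titles)
    "3d": "fastrocs",                # 3D similarity search (molecule file)
}


def build_query_command(search, database_id, *query_args, name=None):
    """Return the argument list for an `ocli molsearch query create` call."""
    try:
        subcommand = SUBCOMMAND_FOR_SEARCH[search.lower()]
    except KeyError:
        raise ValueError(f"Unknown search kind: {search!r}") from None
    args = ["ocli", "molsearch", "query", "create", subcommand]
    if name is not None:
        args += ["--name", name]
    # Positional arguments: the database ID first, then the query input(s).
    args += [str(database_id), *query_args]
    return args


cmd = build_query_command(
    "exact", 3070, "CC(C)Cc1ccc(cc1)C(C)C(=O)O", name="Search for ibuprofen"
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting mistakes with SMILES strings, which often contain parentheses and other shell metacharacters.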

Exact 2D Search

Exact 2D search will search a 2D Molecule Search database with a SMILES string as input. The output of the help command is shown below.

> ocli molsearch query create exact -h

Usage: ocli molsearch query create exact [OPTIONS] DATABASE SMILES

  2D exact search using a SMILES string. Example: ocli molsearch query
  create exact <database_id> "CC(=O)Oc1ccccc1C(=O)O"

Options:
  --search-type [ISM|ABS|ISOMORPH|UNCOLOR]
                                  How to search.
  --name TEXT
  --max-hits INTEGER
  --wait
  --project INTEGER
  -h, --help                      Show this message and exit.

In the following example, we search for an exact match of ibuprofen in a database with ID 3070 using OCLI.

Note

Database IDs shown in OCLI examples in this chapter are for illustration. They will not be the same on your system.

> ocli molsearch query create exact 3070 "CC(C)Cc1ccc(cc1)C(C)C(=O)O" --name "Search for ibuprofen"

In the following example, we search for an exact match of ibuprofen in a database with ID 3070 using the orionclient API.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db = find_first_available_db()

exact_query = MolsearchQuery.create_exact_query(
    database_id=db.id,
    search_type="ISM",
    smiles="CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    name="Search for ibuprofen",
    project=PROJECT,
    session=APISession,
)

To download this example, click here: exact query
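The database-selection step in the example above can be tried without a live Orion session. The sketch below isolates that logic in a hypothetical function, first_loaded, and feeds it stand-in objects; in real code the iterable would come from APISession.list_resources(MolsearchDatabase, search_filter) as shown earlier. The "NOT_LOADED" state string is a placeholder for any state other than "LOADED", not a documented Orion value.

```python
from types import SimpleNamespace


def first_loaded(databases):
    """Return the first item whose state is "LOADED", or raise ValueError."""
    for db in databases:
        if db.state == "LOADED":
            return db
    raise ValueError("No loaded database found")


# Stand-in objects for illustration only; they mimic the `id` and `state`
# attributes used by the example above.
fake_databases = [
    SimpleNamespace(id=1, state="NOT_LOADED"),
    SimpleNamespace(id=2, state="LOADED"),
    SimpleNamespace(id=3, state="LOADED"),
]

print(first_loaded(fake_databases).id)
```

Returning from inside the loop removes the need for the post-loop state check, and an empty listing raises the same error as a listing with no loaded database.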

Similarity (GraphSim) 2D Search

Similarity search will search a 2D Molecule Search database using GraphSim with a SMILES string as input. The output of the help command is shown below.

> ocli molsearch query create graphsim -h

Usage: ocli molsearch query create graphsim [OPTIONS] DATABASE SMILES

  2D similarity search using a GraphSim query. Example: ocli molsearch query
  create graphsim <database_id> "c1ccccc1"

Options:
  --fpname TEXT                   Fingerprint to use.
  --cutoff FLOAT                  Cutoff.
  --measure [Tanimoto|Tversky|Dice|Cosine]
                                  Similarity measure to use.
  --name TEXT
  --max-hits INTEGER
  --wait
  --project INTEGER
  -h, --help                      Show this message and exit.

The following examples demonstrate various GraphSim searches with different fingerprint types using OCLI:

> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 1"
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 2" --fpname circular
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 3" --fpname circularvs
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 4" --fpname treevs
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 5" --fpname path

The following example will launch a GraphSim search using the circularvs fingerprint with the orionclient API.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db = find_first_available_db()

graphsim_query = MolsearchQuery.create_graphsim_query(
    database_id=db.id,
    smiles="CC(C)Cc1cc(cnc1)C(=O)O",
    fingerprint_type="circularvs",
    max_hits=100,
    similarity_measure_type="tanimoto",
    name="my query 3",
    project=PROJECT,
    cutoff=0,
    session=APISession,
)

To download this example, click here: graphsim query
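To give a feel for what the --measure choices compute, the following sketch applies the standard textbook definitions of the Tanimoto, Dice, Cosine, and Tversky coefficients to two small sets of "on" bits. This is an illustration under stated assumptions, not GraphSim's implementation: real fingerprints are bit vectors generated by the toolkit, and the Tversky alpha and beta weights used here are arbitrary example values.

```python
import math


def tanimoto(a, b):
    """|A & B| / |A | B|"""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """2 * |A & B| / (|A| + |B|)"""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """|A & B| / sqrt(|A| * |B|)"""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def tversky(a, b, alpha, beta):
    """|A & B| / (|A & B| + alpha * |A - B| + beta * |B - A|)"""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Toy "fingerprints": the positions of the bits that are set.
query_bits = {1, 2, 3, 4}
target_bits = {3, 4, 5, 6}

scores = {
    "Tanimoto": tanimoto(query_bits, target_bits),
    "Dice": dice(query_bits, target_bits),
    "Cosine": cosine(query_bits, target_bits),
    "Tversky (alpha=1, beta=0)": tversky(query_bits, target_bits, 1.0, 0.0),
}
for measure, score in scores.items():
    print(f"{measure}: {score:.3f}")
```

With alpha and beta both set to 1, the Tversky coefficient reduces to the Tanimoto coefficient, which is one way to sanity-check an implementation.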

Substructure 2D Search

Substructure search will search a 2D Molecule Search database using an OEChem substructure query. This search type supports both molecule files (MDLJSON) and SMARTS patterns as query molecules. The output of the help command is shown below.

> ocli molsearch query create substructure -h

Usage: ocli molsearch query create substructure [OPTIONS] DATABASE INPUT_QUERY

  2D substructure search using an OEChem substructure query.

  Constraints only applicable for MDLJSON queries.

  Examples:
      ocli molsearch query create substructure 1 path/to/file
      ocli molsearch query create substructure --subsearch_query_type SMARTS 1 c1cocc1

Options:
  --subsearch_query_type [MDLJSON|SMARTS]
  --aliphatic-constraint          only applicable for MDLJSON queries
  --topology-constraint           only applicable for MDLJSON queries
  --stereo-constraint             only applicable for MDLJSON queries
  --isotope-constraint            only applicable for MDLJSON queries
  --name TEXT
  --max-hits INTEGER
  --wait
  --cancel-after INTEGER
  --project INTEGER
  -h, --help                      Show this message and exit.

The following example demonstrates a substructure search with 4-ethyltoluene as the query using OCLI. This query is an “MDLJSON” query type (an MDL file converted to JSON and used as input). MDLJSON is the default query type, so it is not specified in the command.

> ocli molsearch query create substructure 694 p_ethyltoluene.sdf

The following example demonstrates a substructure search with 4-ethyltoluene as the query using OCLI. This query is a “SMARTS” query type (SMARTS pattern as input).

> ocli molsearch query create substructure --subsearch_query_type SMARTS 694 "CCc1ccc(cc1)C" --name "p-ethyltoluene from SMARTS"

The following example demonstrates a substructure search with 4-ethyltoluene as the query using the orionclient API.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db = find_first_available_db()

ifs = oemolistream()
filename = "p_ethyltoluene.sdf"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")

mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()

subsearch_query = MolsearchQuery.create_subsearch_query(
    database_id=db.id,
    num_hits=100,
    mdlquery=mol,
    subsearch_query_type="MDLJSON",
    name="p_ethyltoluene subsearch",
    project=PROJECT,
    aliphatic_constraint=False,
    topology_constraint=False,
    stereo_constraint=False,
    isotope_constraint=False,
    session=APISession,
)

To download this example, click here: subsearch query
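The help text above states that the four constraint flags apply only to MDLJSON queries. A small wrapper can enforce that rule before a command is ever run. The sketch below is one possible approach; the function name build_substructure_args is hypothetical, and whether OCLI itself rejects or silently ignores constraints on a SMARTS query is not stated in this chapter, so the check here is a defensive assumption.

```python
# Constraint names derived from the flags in the help output:
# --aliphatic-constraint, --topology-constraint, --stereo-constraint,
# --isotope-constraint.
KNOWN_CONSTRAINTS = ("aliphatic", "topology", "stereo", "isotope")


def build_substructure_args(database_id, input_query, query_type="MDLJSON",
                            constraints=(), name=None):
    """Return the argument list for a substructure query, validating options."""
    if query_type not in ("MDLJSON", "SMARTS"):
        raise ValueError(f"Unsupported query type: {query_type!r}")
    unknown = [c for c in constraints if c not in KNOWN_CONSTRAINTS]
    if unknown:
        raise ValueError(f"Unknown constraints: {unknown}")
    if constraints and query_type != "MDLJSON":
        raise ValueError("Constraints are only applicable to MDLJSON queries")

    args = ["ocli", "molsearch", "query", "create", "substructure"]
    if query_type != "MDLJSON":  # MDLJSON is the default, so it can be omitted
        args += ["--subsearch_query_type", query_type]
    args += [f"--{c}-constraint" for c in constraints]
    if name is not None:
        args += ["--name", name]
    args += [str(database_id), input_query]
    return args


print(build_substructure_args(694, "p_ethyltoluene.sdf", constraints=["stereo"]))
print(build_substructure_args(694, "CCc1ccc(cc1)C", query_type="SMARTS"))
```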

Title 2D Search

Title search will search a 2D database using a space separated list of titles. The output of the help command is shown below.

> ocli molsearch query create title -h

Usage: ocli molsearch query create title [OPTIONS] DATABASE [TITLE]...

  Search a 2D database using a space separated list of titles Example: ocli
  molsearch query create title <database_id> <titles_list>
  --project=<project_id>

Options:
  --name TEXT
  --project INTEGER
  -h, --help         Show this message and exit.

The following example demonstrates a title search for ibuprofen and acetaminophen using OCLI.

> ocli molsearch query create title 3070 ibuprofen acetaminophen

The following example demonstrates a title search for ibuprofen and acetaminophen using the orionclient API.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db = find_first_available_db()

title_query = MolsearchQuery.create_title_query(
    name="Title query ibuprofen, acetaminophen",
    database_id=db.id,
    titles=["ibuprofen", "Acetaminophen"],
    project=PROJECT,
    session=APISession,
)

To download this example, click here: title query

3D Similarity Search

3D similarity search will search a 3D database using a FastROCS query. The output of the help command is shown below.

> ocli molsearch query create fastrocs -h

Usage: ocli molsearch query create fastrocs [OPTIONS] DATABASE
                                            QUERY_MOL_FILE_PATH

  3D similarity search using a fastrocs query

  query_mol_file_path can be any file supported by 3D search: .sdf, .mol,
  .mol2, .pdb, .ent, or .oeb Example: ocli molsearch query create fastrocs
  <database_id> <file>

Options:
  --name TEXT
  --max-hits INTEGER
  --shape-only
  --sim-type [tanimoto|tversky]
  --tversky-alpha FLOAT
  --orientation [inertial|inertialAtHeavyAtoms|inertialAtColorAtoms|subrocs|random]
  --random-starts INTEGER
  --wait
  --project INTEGER
  -h, --help                      Show this message and exit.

The following example demonstrates a 3D search for ibuprofen using OCLI.

> ocli molsearch query create fastrocs --name "Ibuprofen_3d_search" 1222 ibuprofen_3d.sdf
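Because the fastrocs help text lists the accepted query file types, a script can check a path before submitting it. The sketch below is a minimal pre-flight check under two assumptions that the help text does not confirm: that extensions are matched case-insensitively, and that only the final suffix matters (so a compressed name such as example.oeb.gz would be rejected here). The function name is hypothetical.

```python
from pathlib import Path

# File types listed in the `ocli molsearch query create fastrocs -h` output.
SUPPORTED_3D_QUERY_SUFFIXES = {".sdf", ".mol", ".mol2", ".pdb", ".ent", ".oeb"}


def is_supported_3d_query_file(path):
    """Return True if the file name ends in one of the listed extensions."""
    return Path(path).suffix.lower() in SUPPORTED_3D_QUERY_SUFFIXES


for candidate in ["ibuprofen_3d.sdf", "ligand.MOL2", "ibuprofen.sq"]:
    print(candidate, is_supported_3d_query_file(candidate))
```

A shape query file such as ibuprofen.sq fails this check, which is consistent with the list in the help text: it is not among the file types accepted by the command.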

The following example demonstrates a 3D search for ibuprofen using the orionclient API.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db3d = find_first_available_db(search_type='3D')

ifs = oemolistream()
filename = "ibuprofen.sdf"

if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")

mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()

fastrocs_query = MolsearchQuery.create_fastrocs_query(
    database_id=db3d.id,
    query_mol=mol,
    max_hits=100,
    name="Ibuprofen_3d_search",
    project=PROJECT,
    shape_only=False,
    sim_type="tanimoto",
    session=APISession,
)

To download this example, click here: ibuprofen search

The following example demonstrates a 3D search with the orionclient API, using a shape query built from ibuprofen.

from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oeifstream, OEThrow
from openeye.oeshape import OEShapeQuery, OEReadShapeQuery

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db


PROJECT = APISession.get_current_project().id

db3d = find_first_available_db(search_type='3D')

ifs = oeifstream()
filename = "ibuprofen.sq"

if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")

sq = OEShapeQuery()
OEReadShapeQuery(ifs, sq)
ifs.close()

mol = OEMol()
sq.GetCompositeMolecule(mol)

fastrocs_query = MolsearchQuery.create_fastrocs_query(
    database_id=db3d.id,
    query_mol=mol,
    max_hits=100,
    name="Ibuprofen_3d_search",
    project=PROJECT,
    shape_only=False,
    sim_type="tanimoto",
    session=APISession,
)

To download this example, click here: search with ibuprofen query

Commands for Information Gathering

The following ocli molsearch commands gather information: getting a molecule’s SMILES string from its name, getting information on a specific query, listing all queries, listing the results of a query, and inspecting a database and its sharing status.

Getting a SMILES String from a Name

The command ocli molsearch query smiles will return the SMILES string of a molecule given the name. The output of the help command is shown below.

> ocli molsearch query smiles -h

Usage: ocli molsearch query smiles [OPTIONS] TITLE

  Get SMILES string for the molecule with the given title.

  Searches all loaded 2D databases.

  Example: ocli molsearch query smiles aspirin

Options:
  -h, --help  Show this message and exit.

The following example demonstrates this capability for ibuprofen.

> ocli molsearch query smiles ibuprofen

CC(C)Cc1ccc(cc1)C(C)C(=O)O

Getting Information about a Specific Query

The command ocli molsearch query info will print information about a query, requiring the query ID as input. The output of the help command is shown below.

> ocli molsearch query info -h

Usage: ocli molsearch query info [OPTIONS] QUERY

  Information about a specific molecule search query. Example: ocli
  molsearch query info <query_id>

Options:
  -h, --help  Show this message and exit.

The information displayed upon execution of this command depends on the query type. For example, a GraphSim query will print the ID, owner, name, state, date/time of creation, database name, query type, cutoff, similarity measure type, SMILES string, and other information.

Listing All Queries

To list all queries, use ocli molsearch query list. The available options are shown in the output of the help command below.

> ocli molsearch query list -h

Usage: ocli molsearch query list [OPTIONS]

  List all MolSearch queries Example: ocli molsearch query list

Options:
  --offset INTEGER        Offset from the start of the list.
  --limit INTEGER         Number of objects to list.
  --query-type TEXT       Type of query to list.
  --project INTEGER
  --programmatic BOOLEAN  Filter queries created by ocli (True), UI (False),
                          or all queries (None).
  -h, --help              Show this message and exit.
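The --offset and --limit options suggest the usual paging pattern: request a fixed-size page, advance the offset by the page size, and stop when a page comes back short. The sketch below shows that loop in isolation. The fetch_page callable is hypothetical; in practice it might wrap a call to ocli molsearch query list with the matching options, and here it is replaced by a fake that slices an in-memory list.

```python
def iter_pages(fetch_page, limit=50):
    """Yield items page by page until a short or empty page is returned."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        yield from page
        if len(page) < limit:  # last page reached
            break
        offset += limit


# Fake data source standing in for a real listing call.
ALL_QUERY_IDS = list(range(10))


def fake_fetch(offset, limit):
    return ALL_QUERY_IDS[offset:offset + limit]


print(list(iter_pages(fake_fetch, limit=4)))
```

When the total number of items is an exact multiple of the page size, the loop makes one extra request that returns an empty page and then stops.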

Listing the Results of a Query

The command ocli molsearch query list-results will list the results of a query, requiring the query ID as input. The output of the help command is shown below.

> ocli molsearch query list-results -h

Usage: ocli molsearch query list-results [OPTIONS] QUERY

  List results for a single molecule search query. Example: ocli molsearch
  query list-results <query_id>

Options:
  --offset INTEGER  Offset from the start of the list.
  --limit INTEGER   Number of objects to list.
  -h, --help        Show this message and exit.

The output of this command depends on the query type, but in general each result (hit) is printed as a row, and each piece of information about that result is a column. The results are printed in descending order of similarity score.
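For long result lists, the --offset and --limit options can be combined to page through the hits. The following is a minimal sketch and not part of OCLI: the helper names page_offsets and list_results_command are hypothetical, the query ID and hit count are made up for illustration, and the total number of hits is assumed to be known in advance. The code only builds the command lines; it does not run them.

```python
def page_offsets(total, page_size):
    """Yield (offset, limit) pairs that cover `total` results page by page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total, page_size):
        # The last page may be shorter than page_size
        yield offset, min(page_size, total - offset)


def list_results_command(query_id, offset, limit):
    """Build the argument list for one page of 'ocli molsearch query list-results'."""
    return [
        "ocli", "molsearch", "query", "list-results",
        f"--offset={offset}", f"--limit={limit}", str(query_id),
    ]


if __name__ == "__main__":
    # Hypothetical query ID and hit count, used only for illustration
    for offset, limit in page_offsets(total=250, page_size=100):
        print(" ".join(list_results_command(246755, offset, limit)))
```

Each generated argument list could be passed to subprocess.run to execute the page request.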

Get Information about a Searchable Database

The command ocli molsearch db info will print information about the requested database, including but not limited to the user ID of the owner, the display name, the molecule count, and the search type (2D/3D). The output of the help command is shown below.

> ocli molsearch db info -h

Usage: ocli molsearch db info [OPTIONS] DATABASE

  Information about a specific molecule search database.

Options:
  -h, --help  Show this message and exit.

The following example demonstrates this.

> ocli molsearch db info 4588

  id    owner  state    database_name                database_version    display_name                 error      molecules_count  search_type    section
----  -------  -------  ---------------------------  ------------------  ---------------------------  -------  -----------------  -------------  -----------
4588     1255  LOADED   FDA Approved Drugs - ChEMBL                      FDA Approved Drugs - ChEMBL                        7885  2D             OE provided

Search Parameters:
  fingerprint_types: circular, circularvs, path, pathvs, tree, treevs
  substructure_search_types: MDL, SMARTS

Show the Sharing Status of a Custom Database

Databases that are “customer managed” can be shared with various users, projects, and organizations on Orion. The users, projects, and organizations that have access to a given database, along with their level of access (read, write, share, delete), can be seen via the following OCLI command.

> ocli molsearch db list-shares 495

For information on how to share a custom database with a user, project, or organization, see the database sharing subsection.

Query Manipulation Commands

For reference information on each of the commands, see the molecule search query commands.

Update a Molecule Search Query

The command ocli molsearch query update will update specified attributes of a molecule search query. Currently, only the query name and “saved” attributes are supported. Setting “saved” to True will pin the query, preventing its deletion after the 30-day grace period. Setting “saved” to False will unpin the query. The output of the help command is shown below.

> ocli molsearch query update -h

Usage: ocli molsearch query update [OPTIONS] QUERY

  Update a molecule search query.

  Example: ocli molsearch query update <query_id> --name=<query_name> --saved=<True|False>

Options:
  --name TEXT      Query name.
  --saved BOOLEAN  Save query.
  -h, --help       Show this message and exit.

Export Query Results to a Dataset

The command ocli molsearch query export will export the query results to a dataset on Orion. This command requires three arguments: the ID of the query, the ID of the output project, and the output path on Orion. The IDs of accessible projects can be seen via ocli projects list, where each ID appears adjacent to the name of the corresponding project on Orion. The path for the My Data folder in Orion is of the form /project/<project_id>/user/<username>/. Subfolders at any depth can be appended to the path in the standard Unix way (separated by forward slashes). The path for the Team Data folder in Orion is of the form /project/<project_id>/team. Optionally, only selected hits can be exported via the --ids= flag, which takes a comma-separated list of hit IDs. Hit IDs for a given query can be viewed via ocli molsearch query list-results <query_ID>.

The output of the help command is shown below.

> ocli molsearch query export -h

Usage: ocli molsearch query export [OPTIONS] QUERY PROJECT PATH

  Export search results to a dataset.

  Example:
      ocli molsearch query export <id> <project_id> /project/<project_id>/user/<username>/<path> --ids=1,2,3

Options:
  --name TEXT    Name of output dataset.
  --format TEXT  Output format.
  --background   Export in async mode?
  --ids TEXT     Comma-separated list of result IDs to export.
  -h, --help     Show this message and exit.
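The path and --ids conventions described above can be sketched in a few lines of Python. This is an illustration only: the helper names my_data_path, team_data_path, and ids_flag are hypothetical and not part of OCLI or orionclient, and the project ID, username, and hit IDs are made up. The My Data path is built without the trailing slash.

```python
def my_data_path(project_id, username, *subfolders):
    """Path of the My Data folder, optionally extended with subfolders."""
    parts = ["project", str(project_id), "user", username, *subfolders]
    return "/" + "/".join(parts)


def team_data_path(project_id):
    """Path of the Team Data folder for a project."""
    return f"/project/{project_id}/team"


def ids_flag(hit_ids):
    """Format hit IDs as the comma-separated value expected by --ids."""
    return "--ids=" + ",".join(str(hit_id) for hit_id in hit_ids)


if __name__ == "__main__":
    # Hypothetical project ID, username, and hit IDs
    print(my_data_path(42, "jdoe", "searches", "run1"))
    print(team_data_path(42))
    print(ids_flag([1, 2, 3]))
```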

Download Results and Query to Separate Files

The command ocli molsearch query download will download a query and its search results into two separate local files. There are optional arguments to download only selected hit IDs (--ids), as well as to exclude certain hit IDs (--exclude). The output filename for the results is a required positional argument, and the filename of the query molecule is derived from it, as shown in the output of the help command below.

> ocli molsearch query download -h

Usage: ocli molsearch query download [OPTIONS] QUERY FILE

  Download molsearch results and query to separate local files

  File format is determined by filename extension. Supported formats are csv, smi, sdf, oeb

  Downloaded query file will be named <result_file_name>-query.<ext>

  Example:
      ocli molsearch query download 1 search-results.csv --ids 1234,1236

Options:
  --ids TEXT  Comma-separated list of result IDs to export.
  --exclude   Exclude IDs from results.
  -h, --help  Show this message and exit.

The following example downloads the results of query 246755 to a file called my_search_results.sdf, requesting that only two of the hits be written to the file.

> ocli molsearch query download 246755 my_search_results.sdf --ids=25301210,25301209

This also creates an output file containing the query molecule, called my_search_results-query.sdf.
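The naming rule from the help text, <result_file_name>-query.<ext>, can be reproduced locally, for example to check which files a download will produce. The sketch below is an assumption-laden illustration: the function query_filename is hypothetical, the list of formats is copied from the help output, and how OCLI itself treats directories or unsupported extensions is not documented here.

```python
from pathlib import Path

# Formats listed in the help output of 'ocli molsearch query download'
SUPPORTED_FORMATS = {"csv", "smi", "sdf", "oeb"}


def query_filename(result_filename):
    """Derive the query file name from the results file name."""
    path = Path(result_filename)
    ext = path.suffix.lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {path.suffix!r}")
    # Insert '-query' between the stem and the extension
    return str(path.with_name(f"{path.stem}-query{path.suffix}"))


if __name__ == "__main__":
    print(query_filename("my_search_results.sdf"))
```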

Delete a Query

A query can be deleted via the command ocli molsearch query delete, requiring only the query ID as an argument. The output of the help command is shown below.

> ocli molsearch query delete -h

Usage: ocli molsearch query delete [OPTIONS] QUERY

  Delete a molecule search query.

  Example: ocli molsearch query delete <query_id>

Options:
  -h, --help  Show this message and exit.

Database Manipulation Commands

For reference information on each of the commands, see the molecule search database commands.

Create a Database from a Prepared Collection

The command ocli molsearch db create can be used to create a molecule search database from a prepared collection. The database will initially be in the QUEUED state before it reaches the LOADED state. The memory required for a 2D database can be estimated as 1.2 times the size of the collection. To estimate the memory needed for a 3D database, see the FastROCS Architecture. The search type (2D or 3D) is also a required argument. The example below is for a prepared 2D collection of 5 GB, so the estimated memory is 1.2 × 5 GB = 6 GB, which is passed as --memory-mb=6000.

> ocli collections info <collection_id>
> ocli molsearch db create --collection=145856 --memory-mb=6000 --search-type=2D
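The 2D memory estimate is simple arithmetic and can be scripted. The following sketch assumes 1 GB = 1000 MB, which matches the example above (5 GB giving 6000 MB); the function name estimate_2d_memory_mb is hypothetical. It does not apply to 3D databases.

```python
import math


def estimate_2d_memory_mb(collection_size_gb, factor=1.2):
    """Estimate the --memory-mb value for a 2D database.

    Assumes 1 GB = 1000 MB and the 1.2x rule of thumb for 2D databases.
    """
    if collection_size_gb <= 0:
        raise ValueError("collection_size_gb must be positive")
    # Round away floating-point noise before taking the ceiling
    return math.ceil(round(collection_size_gb * 1000 * factor, 6))


if __name__ == "__main__":
    memory_mb = estimate_2d_memory_mb(5)
    print(f"--memory-mb={memory_mb}")
```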

Load/Unload a Database

Custom databases that have been created must be loaded onto a compute instance before searches can be carried out against them. The OCLI command ocli molsearch db load <db_id> will load the database. A loaded database incurs on-demand pricing for AWS EC2 instances. Databases can be unloaded, which releases the compute instance, via ocli molsearch db unload <db_id>.

When a database is loaded, it will initially be in a QUEUED state until it receives a compute resource. When a database is unloaded, it will initially be in an UNLOADING state and, once unloading completes, in an UNLOADED state. The database cannot be loaded again until unloading completes. The example orionclient script below demonstrates these intermediate states: it unloads a loaded database, waits for the unloading to complete, loads the database again, and waits for the loading to complete, printing the state of the database every time it changes.

from orionclient.session import APISession
from orionclient.types import MolsearchDatabase
import time

def monitor_db_until_desired_state(db, desired_state):
    if desired_state not in ["LOADED", "UNLOADED"]:
        raise ValueError("Desired state must be either LOADED or UNLOADED")

    state = ''

    # Wait for completion
    while db.state != desired_state:
        APISession.refresh_resource(db)
        if state != db.state:
            print(f"DB state: {db.state}")
        time.sleep(1)

        state = db.state

    if state == '':
        print(f"DB state: {db.state}")

if __name__ == '__main__':
    db_id = <your_db_id>
    db = APISession.get_resource(MolsearchDatabase, db_id)

    db.unload()
    monitor_db_until_desired_state(db, 'UNLOADED')

    db.load()
    monitor_db_until_desired_state(db, 'LOADED')

To download this example, click here: test for loaded
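The monitoring loop in the script above polls until the desired state is reached, with no upper bound on the waiting time. A variant with a timeout is sketched below. It is a generic helper rather than orionclient code: the name wait_for_state and its parameters are hypothetical, and the state is read through a caller-supplied function so that the sketch runs without an Orion session.

```python
import time


def wait_for_state(get_state, desired_state, timeout_s=600.0, poll_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns desired_state.

    Prints each state change and returns the list of distinct states seen.
    Raises TimeoutError if desired_state is not reached within timeout_s.
    """
    deadline = clock() + timeout_s
    seen = []
    while True:
        state = get_state()
        if not seen or seen[-1] != state:
            print(f"DB state: {state}")
            seen.append(state)
        if state == desired_state:
            return seen
        if clock() >= deadline:
            raise TimeoutError(
                f"Still in state {state} after {timeout_s} seconds")
        sleep(poll_s)


if __name__ == "__main__":
    # Stand-in for a database whose state changes between polls
    states = iter(["UNLOADING", "UNLOADING", "UNLOADED"])
    wait_for_state(lambda: next(states), "UNLOADED", poll_s=0)
```

With orionclient, get_state could be a small function that calls APISession.refresh_resource(db) and returns db.state, as the script above does inside its loop.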

Share/Unshare a Database

Custom databases can be shared with users, projects, and organizations via the OCLI command ocli molsearch db share. The output of the help command is shown below.

> ocli molsearch db share -h

Usage: ocli molsearch db share [OPTIONS] IDENTIFIER
                               {user|project|organization} [SHARE_IDS]...

  Share a molecule search database.

Options:
  --write
  --delete
  --share
  -h, --help  Show this message and exit.

There are four permissions: read, write, delete, and share. The latter three must be specified with a flag, and all sharing grants read access.

ocli molsearch db unshare is similar to sharing, but without permissions flags. All permissions will be removed when a database is unshared.
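The mapping from permissions to flags can be made explicit with a small command builder. This is a sketch under stated assumptions: the function share_command is hypothetical, the argument order follows the usage line in the help output, and the database and project IDs are made up. Read access needs no flag because every share grants it.

```python
VALID_TARGETS = ("user", "project", "organization")
# Permissions that need an explicit flag; read access is always granted
OPTIONAL_PERMISSIONS = ("write", "delete", "share")


def share_command(db_id, target_type, share_ids, permissions=()):
    """Build the argument list for 'ocli molsearch db share'."""
    if target_type not in VALID_TARGETS:
        raise ValueError(f"target_type must be one of {VALID_TARGETS}")
    unknown = set(permissions) - set(OPTIONAL_PERMISSIONS) - {"read"}
    if unknown:
        raise ValueError(f"Unknown permissions: {sorted(unknown)}")
    # Emit flags in a fixed order so the output is reproducible
    flags = [f"--{perm}" for perm in OPTIONAL_PERMISSIONS if perm in permissions]
    return (["ocli", "molsearch", "db", "share", *flags, str(db_id), target_type]
            + [str(share_id) for share_id in share_ids])


if __name__ == "__main__":
    # Hypothetical database and project IDs
    print(" ".join(share_command(495, "project", [12, 13], ["read", "write"])))
```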

orionclient User Stories

This section contains a collection of complex, multistep tasks in the form of orionclient API scripts.

Molecule Input

3D queries submitted via orionclient accept an OEMol instance as the query molecule. The following examples demonstrate how to extract a molecule that can be passed in as an argument to query constructors. Each example has a different starting point: a record on a dataset, a molecule file, or a shape query file. Each example stores the final molecule in a variable called “mol”, which can be passed directly to a MolsearchQuery constructor.

"""
Example 1: Extract molecule from the first record on an Orion dataset
"""
from orionclient.session import APISession
from orionclient.types import Dataset
from openeye.oechem import OEPrimaryMolField

ds = APISession.get_resource(Dataset, <your_dataset_id>)

molfield = OEPrimaryMolField()

# Take the molecule from the first record that has a primary molecule field
mol = None
for rec in ds.records():
    if rec.has_field(molfield):
        mol = rec.get_value(molfield)
        break

if mol is None:
    raise ValueError("No record with a primary molecule field was found")

To download this example, click here: extract from first

"""
Example 2: Extract molecule from a molecule file
"""
from openeye.oechem import oemolistream, OEThrow, OEMol, OEReadMolecule

ifs = oemolistream()
filename = "<your_molecule_filename>"

if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")

mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()

To download this example, click here: extract from molecule

"""
Example 3: Extract shape query from a shape query file, and convert to
searchable molecule
"""
from openeye.oechem import oeifstream, OEThrow, OEMol
from openeye.oeshape import OEShapeQuery, OEReadShapeQuery

ifs = oeifstream()
filename = "<your_shape_query_filename>.sq"

if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")

sq = OEShapeQuery()
OEReadShapeQuery(ifs, sq)
ifs.close()

mol = OEMol()
sq.GetCompositeMolecule(mol)

To download this example, click here: extract from shape query

Query Round Trip

In some cases, it may be desirable for the search itself to be short-lived: the results are exported to a dataset, and the search is deleted immediately after the export. The following example demonstrates how to do this.

"""
'Query Round Trip' example script
Usage:
    python <script_name>.py <molecule_file_containing_query>
"""
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow
import time
import sys

def monitor_query_until_complete(query):
    state = ''

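    # Wait for completion, printing the state every time it changes
    while query.state != 'SUCCESS':
        APISession.refresh_resource(query)
        if state != query.state:
            print(f"Query state: {query.state}")
        # Stop rather than poll forever if the query leaves the expected states
        if query.state not in ['QUEUED', 'PROCESSING', 'SUCCESS']:
            raise SystemExit("Query was found to be in an unexpected state,"
                             f" {query.state}")
        time.sleep(1)

        state = query.state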

    if state == '':
        print(f"Query state: {query.state}")

def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")

    search_filter = {'search_type': search_type.upper()}

    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break

    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")

    return db

if __name__ == '__main__':
    PROJECT = APISession.get_current_project().id
    USER = APISession.get_user_profile()

    db = find_first_available_db()

    ifs = oemolistream()
    filename = sys.argv[1]
    if not ifs.open(filename):
        OEThrow.Fatal(f"Unable to open file {filename}")

    mol = OEMol()
    OEReadMolecule(ifs, mol)
    ifs.close()

    # Create molecule search query
    subsearch_query = MolsearchQuery.create_subsearch_query(
        database_id=db.id,
        num_hits=100,
        mdlquery=mol,
        subsearch_query_type="MDLJSON",
        name=f"{filename} subsearch",
        project=PROJECT,
        aliphatic_constraint=False,
        topology_constraint=False,
        stereo_constraint=False,
        isotope_constraint=False,
        session=APISession,
    )

    if subsearch_query.state in ['QUEUED', 'PROCESSING', 'SUCCESS']:
        monitor_query_until_complete(subsearch_query)
    else:
        raise SystemExit("Query was found to be in an unexpected state,"
                        f" {subsearch_query.state}")

    # Export query results to an Orion dataset
    ds_id = subsearch_query.export(
        name=subsearch_query.name,
        session=APISession,
        project=PROJECT,
        path=f"/project/{PROJECT}/user/{USER.username}",
    )

    print(f"Exported molecule search query to dataset with ID {ds_id}")

    # Delete molecule search query
    APISession.delete_resource(subsearch_query)

    print("Deleted the associated query")

To download this example, click here: query round trip
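The selection logic in find_first_available_db can also be expressed with next() and a generator, which avoids re-checking the loop variable after the loop. The sketch below shows only that logic: the function first_loaded is hypothetical, and the database objects are stand-ins built with SimpleNamespace, not orionclient resources.

```python
from types import SimpleNamespace


def first_loaded(databases):
    """Return the first database whose state is LOADED, or raise ValueError."""
    db = next((d for d in databases if d.state == "LOADED"), None)
    if db is None:
        raise ValueError("No loaded database found")
    return db


if __name__ == "__main__":
    # Stand-in objects with the two attributes the logic relies on
    candidates = [
        SimpleNamespace(id=1, state="UNLOADED"),
        SimpleNamespace(id=2, state="LOADED"),
        SimpleNamespace(id=3, state="LOADED"),
    ]
    print(first_loaded(candidates).id)
```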


© Copyright 2025, Cadence Design Systems, Inc.