Working Programmatically with Molecule Search
Note
This material was developed with version 6.3 of Orion Platform.
Use the molecule search commands to create and work with molecule search databases. These databases are suitable for use in Orion, to perform molecule searches on the Molecule Search Page.
Search Commands
ocli molsearch query create is the base command to create molecule search queries remotely. There are five types of searches one can carry out: 3D, exact (2D), similarity (GraphSim 2D), substructure (2D), and title (2D), spelled out in greater detail via ocli molsearch query create -h.
> ocli molsearch query create -h
Usage: ocli molsearch query create [OPTIONS] COMMAND [ARGS]...
Create a molecule search query.
Options:
-h, --help Show this message and exit.
Commands:
exact 2D exact search using a SMILES string.
fastrocs 3D similarity search using a FastROCS query.
graphsim 2D similarity search using a GraphSim query.
help Long format of help CMD, or the first level commands if...
substructure 2D substructure search using an OEChem substructure query.
title Search a 2D database using a space separated list of...
Exact 2D Search
Exact 2D search will search a 2D Molecule Search database with a SMILES string as input. The output of the help command is shown below.
> ocli molsearch query create exact -h
Usage: ocli molsearch query create exact [OPTIONS] DATABASE SMILES
2D exact search using a SMILES string. Example: ocli molsearch query
create exact <database_id> "CC(=O)Oc1ccccc1C(=O)O"
Options:
--search-type [ISM|ABS|ISOMORPH|UNCOLOR]
How to search.
--name TEXT
--max-hits INTEGER
--wait
--project INTEGER
-h, --help Show this message and exit.
In the following example, we search for an exact match of ibuprofen in a database with ID 3070 using OCLI.
Note
Database IDs shown in OCLI examples in this chapter are for illustration. They will not be the same on your system.
> ocli molsearch query create exact 3070 "CC(C)Cc1ccc(cc1)C(C)C(=O)O" --name "Search for ibuprofen"
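When OCLI is driven from a Python script, it can help to assemble the command as an argument list so that SMILES strings containing parentheses are quoted safely. The following is a minimal sketch using only the standard library. The helper name build_exact_query_command is hypothetical, and actually running the command assumes that ocli is installed and configured.

```python
import shlex


def build_exact_query_command(database_id, smiles, name=None, search_type=None):
    """Return the argument list for `ocli molsearch query create exact`.

    Hypothetical helper: it only assembles arguments taken from the help
    output above and does not contact Orion.
    """
    cmd = ["ocli", "molsearch", "query", "create", "exact",
           str(database_id), smiles]
    if search_type is not None:
        # One of ISM, ABS, ISOMORPH, UNCOLOR, per the help output.
        cmd += ["--search-type", search_type]
    if name is not None:
        cmd += ["--name", name]
    return cmd


cmd = build_exact_query_command(
    3070, "CC(C)Cc1ccc(cc1)C(C)C(=O)O", name="Search for ibuprofen"
)
# shlex.join produces a shell-safe, copy-pasteable form of the command.
print(shlex.join(cmd))
```

The argument list can be passed to subprocess.run(cmd, check=True) to execute the search without going through a shell.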
In the following example, we search for an exact match of ibuprofen in a database with ID 3070 using the orionclient API.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db = find_first_available_db()
exact_query = MolsearchQuery.create_exact_query(
    database_id=db.id,
    search_type="ISM",
    smiles="CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    name="Search for ibuprofen",
    project=PROJECT,
    session=APISession,
)
To download this example, click here: exact query
Similarity (GraphSim) 2D Search
Similarity search will search a 2D Molecule Search database using GraphSim with a SMILES string as input. The output of the help command is shown below.
> ocli molsearch query create graphsim -h
Usage: ocli molsearch query create graphsim [OPTIONS] DATABASE SMILES
2D similarity search using a GraphSim query. Example: ocli molsearch query
create graphsim <database_id> "c1ccccc1"
Options:
--fpname TEXT Fingerprint to use.
--cutoff FLOAT Cutoff.
--measure [Tanimoto|Tversky|Dice|Cosine]
Similarity measure to use.
--name TEXT
--max-hits INTEGER
--wait
--project INTEGER
-h, --help Show this message and exit.
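For orientation, the --measure choices correspond to well-known set-similarity coefficients. The sketch below computes them for fingerprints represented as Python sets of "on" bit indices. It illustrates the textbook definitions only and is not GraphSim's implementation; the function names and the default Tversky weights (alpha and beta) are assumptions chosen for this example.

```python
import math


def tanimoto(a, b):
    """Shared bits divided by bits set in either fingerprint."""
    either = len(a | b)
    return len(a & b) / either if either else 0.0


def dice(a, b):
    """Twice the shared bits divided by the total bits set in both."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """Shared bits divided by the geometric mean of the bit counts."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def tversky(a, b, alpha=0.5, beta=0.5):
    """Asymmetric measure; alpha and beta weight the unshared bits."""
    both = len(a & b)
    denom = both + alpha * len(a - b) + beta * len(b - a)
    return both / denom if denom else 0.0


# Two toy fingerprints sharing two of their three "on" bits.
fp_query, fp_hit = {1, 2, 3}, {2, 3, 4}
for measure in (tanimoto, dice, cosine, tversky):
    print(measure.__name__, round(measure(fp_query, fp_hit), 3))
```

With alpha = beta = 1 the Tversky measure reduces to Tanimoto, and with alpha = beta = 0.5 it reduces to Dice.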
The following examples demonstrate various GraphSim searches with different fingerprint types using OCLI:
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 1"
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 2" --fpname circular
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 3" --fpname circularvs
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 4" --fpname treevs
> ocli molsearch query create graphsim 3070 "CC(C)Cc1cc(cnc1)C(=O)O" --name "my query 5" --fpname path
The following example will launch a GraphSim search using the circularvs fingerprint with the orionclient API.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db = find_first_available_db()
graphsim_query = MolsearchQuery.create_graphsim_query(
    database_id=db.id,
    smiles="CC(C)Cc1cc(cnc1)C(=O)O",
    fingerprint_type="circularvs",
    max_hits=100,
    similarity_measure_type="tanimoto",
    name="my query 3",
    project=PROJECT,
    cutoff=0,
    session=APISession,
)
To download this example, click here: graphsim query
Substructure 2D Search
Substructure search will search a 2D Molecule Search database using an OEChem substructure query. This search type supports both molecule files (MDLJSON) and SMARTS patterns as query molecules. The output of the help command is shown below.
> ocli molsearch query create substructure -h
Usage: ocli molsearch query create substructure [OPTIONS] DATABASE INPUT_QUERY
2D substructure search using an OEChem substructure query.
Constraints only applicable for MDLJSON queries.
Examples:
ocli molsearch query create substructure 1 path/to/file
ocli molsearch query create substructure --subsearch_query_type SMARTS 1 c1cocc1
Options:
--subsearch_query_type [MDLJSON|SMARTS]
--aliphatic-constraint only applicable for MDLJSON queries
--topology-constraint only applicable for MDLJSON queries
--stereo-constraint only applicable for MDLJSON queries
--isotope-constraint only applicable for MDLJSON queries
--name TEXT
--max-hits INTEGER
--wait
--cancel-after INTEGER
--project INTEGER
-h, --help Show this message and exit.
The following example demonstrates a substructure search with 4-ethyltoluene as the query using OCLI. This query is an “MDLJSON” query type (an MDL file converted to JSON and used as input). MDLJSON is the default query type, so it is not specified in the command.
> ocli molsearch query create substructure 694 p_ethyltoluene.sdf
The following example demonstrates a substructure search with 4-ethyltoluene as the query using OCLI. This query is a “SMARTS” query type (SMARTS pattern as input).
> ocli molsearch query create substructure --subsearch_query_type SMARTS 694 "CCc1ccc(cc1)C" --name "p-ethyltoluene from SMARTS"
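The help output notes that the constraint flags only apply to MDLJSON queries. The sketch below assembles the substructure command as an argument list and rejects constraints for SMARTS queries before anything is sent. The helper name build_substructure_command is hypothetical, and raising an error for this combination is a choice made in this sketch, not documented OCLI behavior.

```python
CONSTRAINTS = ("aliphatic", "topology", "stereo", "isotope")


def build_substructure_command(database_id, input_query,
                               query_type="MDLJSON", constraints=()):
    """Return the argument list for `ocli molsearch query create substructure`.

    Hypothetical helper: flag names are taken from the help output above.
    """
    if query_type not in ("MDLJSON", "SMARTS"):
        raise ValueError("query_type must be 'MDLJSON' or 'SMARTS'")
    unknown = [c for c in constraints if c not in CONSTRAINTS]
    if unknown:
        raise ValueError(f"Unknown constraints: {unknown}")
    if constraints and query_type != "MDLJSON":
        # Assumption of this sketch: fail early rather than pass flags
        # that the help says are only applicable to MDLJSON queries.
        raise ValueError("Constraints only apply to MDLJSON queries")
    cmd = ["ocli", "molsearch", "query", "create", "substructure",
           "--subsearch_query_type", query_type]
    cmd += [f"--{c}-constraint" for c in constraints]
    cmd += [str(database_id), input_query]
    return cmd


print(build_substructure_command(694, "CCc1ccc(cc1)C", query_type="SMARTS"))
print(build_substructure_command(694, "p_ethyltoluene.sdf",
                                 constraints=("stereo",)))
```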
The following example demonstrates a substructure search with 4-ethyltoluene as the query using the orionclient API.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db = find_first_available_db()
ifs = oemolistream()
filename = "p_ethyltoluene.sdf"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")
mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()
subsearch_query = MolsearchQuery.create_subsearch_query(
    database_id=db.id,
    num_hits=100,
    mdlquery=mol,
    subsearch_query_type="MDLJSON",
    name="p_ethyltoluene subsearch",
    project=PROJECT,
    aliphatic_constraint=False,
    topology_constraint=False,
    stereo_constraint=False,
    isotope_constraint=False,
    session=APISession,
)
To download this example, click here: subsearch query
Title 2D Search
Title search will search a 2D database using a space separated list of titles. The output of the help command is shown below.
> ocli molsearch query create title -h
Usage: ocli molsearch query create title [OPTIONS] DATABASE [TITLE]...
Search a 2D database using a space separated list of titles Example: ocli
molsearch query create title <database_id> <titles_list>
--project=<project_id>
Options:
--name TEXT
--project INTEGER
-h, --help Show this message and exit.
The following example demonstrates a title search for ibuprofen and acetaminophen using OCLI.
> ocli molsearch query create title 3070 ibuprofen acetaminophen
The following example demonstrates a title search for ibuprofen and acetaminophen using the orionclient API.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db = find_first_available_db()
title_query = MolsearchQuery.create_title_query(
    name="Title query ibuprofen, acetaminophen",
    database_id=db.id,
    titles=["ibuprofen", "Acetaminophen"],
    project=PROJECT,
    session=APISession,
)
To download this example, click here: title query
3D Similarity Search
3D similarity search will search a 3D database using a FastROCS query. The output of the help command is shown below.
> ocli molsearch query create fastrocs -h
Usage: ocli molsearch query create fastrocs [OPTIONS] DATABASE
QUERY_MOL_FILE_PATH
3D similarity search using a fastrocs query
query_mol_file_path can be any file supported by 3D search: .sdf, .mol,
.mol2, .pdb, .ent, or .oeb Example: ocli molsearch query create fastrocs
<database_id> <file>
Options:
--name TEXT
--max-hits INTEGER
--shape-only
--sim-type [tanimoto|tversky]
--tversky-alpha FLOAT
--orientation [inertial|inertialAtHeavyAtoms|inertialAtColorAtoms|subrocs|random]
--random-starts INTEGER
--wait
--project INTEGER
-h, --help Show this message and exit.
The following example demonstrates a 3D search for ibuprofen using OCLI.
> ocli molsearch query create fastrocs --name "Ibuprofen_3d_search" 1222 ibuprofen_3d.sdf
The following example demonstrates a 3D search for ibuprofen using the orionclient API.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db3d = find_first_available_db(search_type='3D')
ifs = oemolistream()
filename = "ibuprofen.sdf"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")
mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()
fastrocs_query = MolsearchQuery.create_fastrocs_query(
    database_id=db3d.id,
    query_mol=mol,
    max_hits=100,
    name="Ibuprofen_3d_search",
    project=PROJECT,
    shape_only=False,
    sim_type="tanimoto",
    session=APISession,
)
To download this example, click here: ibuprofen search
The following example demonstrates a 3D search using the orionclient API, against a shape query built from ibuprofen.
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oeifstream, OEReadMolecule, OEThrow
from openeye.oeshape import OEShapeQuery, OEReadShapeQuery


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


PROJECT = APISession.get_current_project().id
db3d = find_first_available_db(search_type='3D')
ifs = oeifstream()
filename = "ibuprofen.sq"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")
sq = OEShapeQuery()
OEReadShapeQuery(ifs, sq)
ifs.close()
mol = OEMol()
sq.GetCompositeMolecule(mol)
fastrocs_query = MolsearchQuery.create_fastrocs_query(
    database_id=db3d.id,
    query_mol=mol,
    max_hits=100,
    name="Ibuprofen_3d_search",
    project=PROJECT,
    shape_only=False,
    sim_type="tanimoto",
    session=APISession,
)
To download this example, click here: search with ibuprofen query
Commands for Information Gathering
The following ocli molsearch commands can be used for gathering various information, including getting a molecule’s SMILES string given the name, getting information on a specific query, listing all queries, and listing results of a query.
Getting a SMILES String from a Name
The command ocli molsearch query smiles will return the SMILES string of a molecule given the name. The output of the help command is shown below.
> ocli molsearch query smiles -h
Usage: ocli molsearch query smiles [OPTIONS] TITLE
Get SMILES string for the molecule with the given title.
Searches all loaded 2D databases.
Example: ocli molsearch query smiles aspirin
Options:
-h, --help Show this message and exit.
The following example demonstrates this capability for ibuprofen.
> ocli molsearch query smiles ibuprofen
CC(C)Cc1ccc(cc1)C(C)C(=O)O
Getting Information about a Specific Query
The command ocli molsearch query info will print information about a query, requiring the query ID as input. The output of the help command is shown below.
> ocli molsearch query info -h
Usage: ocli molsearch query info [OPTIONS] QUERY
Information about a specific molecule search query. Example: ocli
molsearch query info <query_id>
Options:
-h, --help Show this message and exit.
The information displayed upon execution of this command depends on the query type. For example, a GraphSim query will print the ID, owner, name, state, date/time of creation, database name, query type, cutoff, similarity measure type, SMILES string, and other information.
Listing All Queries
To list all queries, use this command: ocli molsearch query list. There are various options that can be enabled for this list. These are shown in the output of the help command below.
> ocli molsearch query list -h
Usage: ocli molsearch query list [OPTIONS]
List all MolSearch queries Example: ocli molsearch query list
Options:
--offset INTEGER Offset from the start of the list.
--limit INTEGER Number of objects to list.
--query-type TEXT Type of query to list.
--project INTEGER
--programmatic BOOLEAN Filter queries created by ocli (True), UI (False),
or all queries (None).
-h, --help Show this message and exit.
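The --offset and --limit options support paging through long lists. The loop below sketches that paging logic in plain Python. Here fetch_page is a hypothetical callable standing in for whatever retrieves one page (for example, a wrapper around the OCLI command); it is not part of OCLI or orionclient.

```python
def iter_pages(fetch_page, limit=50):
    """Yield every item by requesting pages of `limit` items in turn.

    fetch_page(offset=..., limit=...) is a hypothetical callable that
    returns a list with at most `limit` items starting at `offset`.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        yield from page
        # A short (or empty) page means the end of the list was reached.
        if len(page) < limit:
            break
        offset += limit


# Stand-in data source used only to demonstrate the loop.
fake_queries = [f"query-{i}" for i in range(7)]


def fake_fetch(offset, limit):
    return fake_queries[offset:offset + limit]


print(list(iter_pages(fake_fetch, limit=3)))
```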
Listing the Results of a Query
The command ocli molsearch query list-results will list the results of a query, requiring the query ID as input. The output of the help command is shown below.
> ocli molsearch query list-results -h
Usage: ocli molsearch query list-results [OPTIONS] QUERY
List results for a single molecule search query. Example: ocli molsearch
query list-results <query_id>
Options:
--offset INTEGER Offset from the start of the list.
--limit INTEGER Number of objects to list.
-h, --help Show this message and exit.
The output of this command depends on the query type, but the general format is that each hit is a row, and each piece of information about that hit is a column. The hits are printed in descending order of similarity score.
Get Information about a Searchable Database
The command ocli molsearch db info will print information about the requested database, including but not limited to the user ID of the owner, the display name, the molecule count, and the search type (2D/3D). The output of the help command is shown below.
> ocli molsearch db info -h
Usage: ocli molsearch db info [OPTIONS] DATABASE
Information about a specific molecule search database.
Options:
-h, --help Show this message and exit.
The following example demonstrates this.
> ocli molsearch db info 4588
id    owner  state   database_name                database_version  display_name                 error  molecules_count  search_type  section
----  -----  ------  ---------------------------  ----------------  ---------------------------  -----  ---------------  -----------  -----------
4588  1255   LOADED  FDA Approved Drugs - ChEMBL                    FDA Approved Drugs - ChEMBL         7885             2D           OE provided
Search Parameters:
fingerprint_types: circular, circularvs, path, pathvs, tree, treevs
substructure_search_types: MDL, SMARTS
Show the Sharing Status of a Custom Database
Databases that are “customer managed” can be shared with various users, projects, and organizations on Orion. The users, projects, and organizations that have access to a given database, along with their level of access (read, write, share, delete), can be seen via the following OCLI command.
> ocli molsearch db list-shares 495
For information on how to share a custom database with a user, project, or organization, see the database sharing subsection.
Query Manipulation Commands
For reference information on each of the commands, see the molecule search query commands.
Update a Molecule Search Query
The command ocli molsearch query update will update specified attributes of a molecule search query. Currently, only the query name and “saved” attributes are supported. Setting “saved” to True will pin the query, preventing deletion after the 30-day grace period. Setting “saved” to False will unpin the query. The output of the help command is shown below.
> ocli molsearch query update -h
Usage: ocli molsearch query update [OPTIONS] QUERY
Update a molecule search query.
Example: ocli molsearch query update <query_id> --name=<query_name> --saved=<True|False>
Options:
--name TEXT Query name.
--saved BOOLEAN Save query.
-h, --help Show this message and exit.
Export Query Results to a Dataset
The command ocli molsearch query export will export the query results to a dataset on Orion. This command requires the ID of the query as an argument, as well as the desired output project ID and output path on Orion. The IDs of accessible projects can be seen via ocli projects list, where they appear next to their corresponding project names. The path for the My Data folder in Orion is of the form /project/<project_id>/user/<username>/. Subfolders at any depth can be appended to the path in the standard Unix way (separated by forward slashes). The path for the Team Data folder in Orion is of the form /project/<project_id>/team. Optionally, selected hits can be exported via the --ids= flag. The value passed to this flag should be a comma-separated list of hit IDs. Hit IDs for a given query can be viewed via ocli molsearch query list-results <query_ID>.
The output of the help command is shown below.
> ocli molsearch query export -h
Usage: ocli molsearch query export [OPTIONS] QUERY PROJECT PATH
Export search results to a dataset.
Example:
ocli molsearch query export <id> <project_id> /project/<project_id>/user/<username>/<path> --ids=1,2,3
Options:
--name TEXT Name of output dataset.
--format TEXT Output format.
--background Export in async mode?
--ids TEXT Comma-separated list of result IDs to export.
-h, --help Show this message and exit.
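The export path and the --ids value are plain strings, so they can be assembled with a few small helpers. The sketch below follows the path forms described above. The helper names are hypothetical, and the project ID 42 and username "alice" are made-up values used only for illustration.

```python
def my_data_path(project_id, username, *subfolders):
    """Path under the My Data folder: /project/<id>/user/<username>/..."""
    parts = ["project", str(project_id), "user", username, *subfolders]
    return "/" + "/".join(parts)


def team_data_path(project_id, *subfolders):
    """Path under the Team Data folder: /project/<id>/team/..."""
    parts = ["project", str(project_id), "team", *subfolders]
    return "/" + "/".join(parts)


def format_ids(hit_ids):
    """Comma-separated list of hit IDs, as expected by --ids."""
    return ",".join(str(hit_id) for hit_id in hit_ids)


# Illustrative values only; substitute your own project, user, and hits.
print(my_data_path(42, "alice", "molsearch", "exports"))
print(team_data_path(42))
print(format_ids([1, 2, 3]))
```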
Download Results and Query to Separate Files
The command ocli molsearch query download will download a query and its search results into two separate local files. There are optional arguments to download only selected hit IDs (--ids), as well as to exclude certain hit IDs (--exclude). The output filename for the results is a required positional argument, and the filename of the query molecule is derived from that as shown in the output of the help command below.
> ocli molsearch query download -h
Usage: ocli molsearch query download [OPTIONS] QUERY FILE
Download molsearch results and query to separate local files
File format is determined by filename extension. Supported formats are csv, smi, sdf, oeb
Downloaded query file will be named <result_file_name>-query.<ext>
Example:
ocli molsearch query download 1 search-results.csv --ids 1234,1236
Options:
--ids TEXT Comma-separated list of result IDs to export.
--exclude Exclude IDs from results.
-h, --help Show this message and exit.
In the following example, we download the results of query 246755 to a file called my_search_results.sdf, requesting that only two of the hits be downloaded to the file.
> ocli molsearch query download 246755 my_search_results.sdf --ids=25301210,25301209
This also creates an output file containing the query molecule, called my_search_results-query.sdf.
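The naming rule for the query file, <result_file_name>-query.<ext>, can be reproduced in a script that needs to locate both files afterward. The sketch below is a minimal illustration using pathlib. The helper name query_filename is hypothetical, and the extension check covers only the four formats listed in the help output.

```python
from pathlib import PurePath

# Formats listed in the help output of `ocli molsearch query download`.
SUPPORTED_FORMATS = {"csv", "smi", "sdf", "oeb"}


def query_filename(results_filename):
    """Return the name of the query file that accompanies a results file."""
    path = PurePath(results_filename)
    ext = path.suffix.lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file extension: {path.suffix!r}")
    # Insert "-query" between the file stem and its extension.
    return str(path.with_name(f"{path.stem}-query{path.suffix}"))


print(query_filename("my_search_results.sdf"))
```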
Delete a Query
A query can be deleted via the command ocli molsearch query delete, requiring only the query ID as an argument. The output of the help command is shown below.
> ocli molsearch query delete -h
Usage: ocli molsearch query delete [OPTIONS] QUERY
Delete a molecule search query.
Example: ocli molsearch query delete <query_id>
Options:
-h, --help Show this message and exit.
Database Manipulation Commands
For reference information on each of the commands, see the molecule search database commands.
Create a Database from a Prepared Collection
The command ocli molsearch db create can be used to create a molecule search database from a prepared collection. The database will initially be in the QUEUED state before it ends up in the LOADED state. It is recommended to request memory equal to 1.2x the size of the collection. The search type (2D or 3D) is also a required argument. The example below is for a prepared 2D, 5 GB collection.
> ocli molsearch db create --collection=145856 --memory-mb=6000 --search-type=2D
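The 1.2x recommendation can be computed rather than estimated by hand. The sketch below rounds up to a whole number of megabytes using integer arithmetic. The helper name recommended_memory_mb is hypothetical, and the conversion of 5 GB to 5000 MB is an assumption inferred from the example above (5 GB collection, 6000 MB requested).

```python
def recommended_memory_mb(collection_size_mb):
    """Return 1.2x the collection size in MB, rounded up to a whole MB."""
    if collection_size_mb <= 0:
        raise ValueError("collection_size_mb must be positive")
    # Integer ceiling of size * 12 / 10 avoids floating-point rounding.
    return (collection_size_mb * 12 + 9) // 10


# A 5 GB collection, taken as 5000 MB (assumption: 1 GB = 1000 MB).
print(f"--memory-mb={recommended_memory_mb(5000)}")
```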
Load/Unload a Database
Custom databases that have been created must be loaded onto a compute instance before searches
can be carried out against them. The OCLI command ocli molsearch db load <db_id> will load the database. A loaded database will incur on-demand pricing on AWS EC2 instances. Databases can be unloaded, which releases the compute instance, via ocli molsearch db unload <db_id>.
When a database is loading, it is initially in a QUEUED state until it receives a compute resource. When a database is unloading, it is initially in an UNLOADING state and, once unloading completes, in an UNLOADED state. The database cannot be loaded again until unloading completes. The example orionclient script below demonstrates these intermediate states: it unloads a loaded database, waits for the unloading to complete, loads the database again, and waits for the loading to complete, printing the state of the database every time it changes.
from orionclient.session import APISession
from orionclient.types import MolsearchDatabase
import time


def monitor_db_until_desired_state(db, desired_state):
    if desired_state not in ["LOADED", "UNLOADED"]:
        raise ValueError("Desired state must be either LOADED or UNLOADED")
    state = ''
    # Wait for completion
    while db.state != desired_state:
        APISession.refresh_resource(db)
        if state != db.state:
            print(f"DB state: {db.state}")
        time.sleep(1)
        state = db.state
    if state == '':
        print(f"DB state: {db.state}")


if __name__ == '__main__':
    db_id = <your_db_id>
    db = APISession.get_resource(MolsearchDatabase, db_id)
    db.unload()
    monitor_db_until_desired_state(db, 'UNLOADED')
    db.load()
    monitor_db_until_desired_state(db, 'LOADED')
To download this example, click here: test for loaded
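The monitoring loop in the script above waits indefinitely. Where a bounded wait is preferred, a timeout can be added. The sketch below shows one way to do that with the standard library only. The function wait_for_state and its parameters are hypothetical, and get_state is an assumed callable that returns the current state string.

```python
import time


def wait_for_state(get_state, desired_state, timeout_s=600.0, poll_s=1.0,
                   on_change=print):
    """Poll get_state() until it returns desired_state or time runs out.

    get_state is a hypothetical zero-argument callable returning the
    current state string; on_change is called each time the state changes.
    """
    deadline = time.monotonic() + timeout_s
    last_state = None
    while True:
        state = get_state()
        if state != last_state:
            on_change(f"DB state: {state}")
            last_state = state
        if state == desired_state:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"State is still {state!r} after {timeout_s} seconds")
        time.sleep(poll_s)


# Stand-in state sequence used only to demonstrate the loop.
states = iter(["QUEUED", "QUEUED", "LOADED"])
wait_for_state(lambda: next(states), "LOADED", timeout_s=5, poll_s=0)
```

With orionclient, get_state could refresh the resource with APISession.refresh_resource(db) and then return db.state, as the script above does.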
orionclient User Stories
This section contains a collection of complex, multistep tasks in the form of orionclient API scripts.
Molecule Input
3D queries submitted via orionclient accept an OEMol instance as the query molecule. The following examples demonstrate how to extract a molecule that can be passed in as an argument to query constructors. Each example starts from a different source: a record in a dataset, a molecule file, or a shape query file. Each example stores the final molecule in a variable called “mol” that can be passed directly to a MolsearchQuery constructor.
"""
Example 1: Extract molecule from the first record on an Orion dataset
"""
from orionclient.session import APISession
from orionclient.types import Dataset
from openeye.oechem import (OEMol, OEReadMolecule, OEThrow, OEPrimaryMolField,
                            OEMolToSmiles)

ds = APISession.get_resource(Dataset, <your_dataset_id>)
molfield = OEPrimaryMolField()
for rec in ds.records():
    if rec.has_field(molfield):
        mol = rec.get_value(molfield)
        break
To download this example, click here: extract from first
"""
Example 2: Extract molecule from a molecule file
"""
from openeye.oechem import oemolistream, OEThrow, OEMol, OEReadMolecule
ifs = oemolistream()
filename = "<your_molecule_filename>"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")
mol = OEMol()
OEReadMolecule(ifs, mol)
ifs.close()
To download this example, click here: extract from molecule
"""
Example 3: Extract shape query from a shape query file, and convert to
searchable molecule
"""
from openeye.oechem import oeifstream, OEThrow, OEMol
from openeye.oeshape import OEShapeQuery, OEReadShapeQuery
ifs = oeifstream()
filename = "<your_shape_query_filename>.sq"
if not ifs.open(filename):
    OEThrow.Fatal(f"Unable to open file {filename}")
sq = OEShapeQuery()
OEReadShapeQuery(ifs, sq)
ifs.close()
mol = OEMol()
sq.GetCompositeMolecule(mol)
To download this example, click here: extract from shape query
Query Round Trip
In some cases, it may be desirable for the search itself to be short-lived, with the results exported to a dataset and the search deleted immediately after the export. The following example demonstrates how to do this.
"""
'Query Round Trip' example script
Usage:
python <script_name>.py <molecule_file_containing_query>
"""
from orionclient.session import APISession
from orionclient.types import MolsearchQuery, MolsearchDatabase
from openeye.oechem import OEMol, oemolistream, OEReadMolecule, OEThrow
import time
import sys


def monitor_query_until_complete(query):
    state = ''
    # Wait for completion
    while query.state != 'SUCCESS':
        APISession.refresh_resource(query)
        if state != query.state:
            print(f"Query state: {query.state}")
        time.sleep(1)
        state = query.state
    if state == '':
        print(f"Query state: {query.state}")


def find_first_available_db(search_type='2D'):
    if search_type.upper() not in ['2D', '3D']:
        raise ValueError("Search type must be '2D' or '3D'")
    search_filter = {'search_type': search_type.upper()}
    db = None
    for db in APISession.list_resources(MolsearchDatabase, search_filter):
        if db.state == "LOADED":
            break
    if db is None or db.state != "LOADED":
        raise ValueError(f"No loaded {search_type} database found")
    return db


if __name__ == '__main__':
    PROJECT = APISession.get_current_project().id
    USER = APISession.get_user_profile()
    db = find_first_available_db()
    ifs = oemolistream()
    filename = sys.argv[1]
    if not ifs.open(filename):
        OEThrow.Fatal(f"Unable to open file {filename}")
    mol = OEMol()
    OEReadMolecule(ifs, mol)
    ifs.close()

    # Create molecule search query
    subsearch_query = MolsearchQuery.create_subsearch_query(
        database_id=db.id,
        num_hits=100,
        mdlquery=mol,
        subsearch_query_type="MDLJSON",
        name=f"{filename} subsearch",
        project=PROJECT,
        aliphatic_constraint=False,
        topology_constraint=False,
        stereo_constraint=False,
        isotope_constraint=False,
        session=APISession,
    )
    if subsearch_query.state in ['QUEUED', 'PROCESSING', 'SUCCESS']:
        monitor_query_until_complete(subsearch_query)
    else:
        raise SystemExit("Query was found to be in an unexpected state,"
                         f" {subsearch_query.state}")

    # Export query results to an Orion dataset
    ds_id = subsearch_query.export(
        name=subsearch_query.name,
        session=APISession,
        project=PROJECT,
        path=f"/project/{PROJECT}/user/{USER.username}",
    )
    print(f"Exported molecule search query to dataset with ID {ds_id}")

    # Delete molecule search query
    APISession.delete_resource(subsearch_query)
    print("Deleted the associated query")
To download this example, click here: query round trip