Orion Client Types

Storage Types

Datasets

class orionclient.types.Dataset(id: int = None, name: str = '', description: str = '', dirty: bool = False, created='', count: int = NOTHING, owner: int = None, project: int = None, systemtags: list = [], deleted='', session=None, buffer=None, written: int = 0, buffer_count=NOTHING)

A set of OERecords stored in Orion that can be queried and viewed quickly.

For large numbers of OERecords (more than a million), ShardCollections are suggested instead.

Creating a Dataset:

# Automatically will perform conversion to OERecords
ds = Dataset.upload(
    APISession,
    "Molecular File",
    "/path/to/file.sdf"
)
classmethod create(session, name, project=None)

Creates an empty dataset

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • name (string) – Name of the Dataset
  • project (Project/integer) – Project to associate with the dataset
Returns:

Instance of a Dataset

Return type:

Dataset

Raises:
  • AuthorizationRequired – If session doesn’t have valid credentials
  • ValidationError – If project is not an integer or Project
download(record_ids=None, offset=None, limit=None, format='binary')

Downloads the dataset as a blob, either in its entirety or limited to the specified set of records

Parameters:
  • record_ids (list) – List of record ids to download
  • offset (integer) – Relative offset into the dataset
  • limit (integer) – Limit number of records read
  • format (string) – Download format of records (binary or json)
Returns:

Iterator of bytes

Raises:

ValidationError – If the records list isn’t a list of ids

download_to_file(filename, **kwargs)

Download the complete dataset to a file

Parameters:
  • filename (string) – Location to write the dataset to
  • record_ids (list) – List of record ids to download
finalize()

Finalizes the dataset to update the count and indicate that the dataset is safely consumable

flush()

Flushes the underlying buffer that accumulates records before they are written to the Orion API

records(record_ids=None, offset=None, limit=None)

An iterator over all of the records in the dataset, or over the subset specified by record_ids. Returns OEMolRecords

Parameters:
  • record_ids (list) – List of record ids to retrieve from the dataset
  • offset (integer) – Relative offset into the dataset
  • limit (integer) – Limit number of records read
Returns:

Iterator of Records
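
The offset and limit parameters above can be combined to read a dataset in fixed-size pages. The following is a minimal sketch, not part of orionclient: iter_record_pages and page_size are hypothetical names, and it assumes that offset is counted in records from the start of the dataset and that an out-of-range offset yields no records.

```python
def iter_record_pages(dataset, page_size=1000):
    """Yield lists of records, page_size at a time.

    `dataset` is any object exposing records(offset=..., limit=...),
    such as an orionclient Dataset. Assumption: offset is counted in
    records, and an offset past the end yields nothing.
    """
    if page_size < 1:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = list(dataset.records(offset=offset, limit=page_size))
        if not page:
            return  # no more records to read
        yield page
        offset += len(page)
```

Each yielded page is an ordinary list, so it can be processed and discarded before the next request is made.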

classmethod upload(session, name, file_path, project=None)

Creates a dataset using a local file, which can be any OEChem readable format, OEDB, OEDU, or CSV.

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • name (string) – Name of the Dataset
  • file_path (string) – Path of the dataset to upload to Orion
Returns:

Instance of Dataset

Return type:

Dataset

Raises:
  • AuthorizationRequired – If session doesn’t have valid credentials
  • ValidationError – If name isn’t a string or is empty
  • ValidationError – If project is not a Project or integer
write(record)

Appends a record to the dataset and marks the dataset as dirty if it isn’t already dirty.

Parameters: record (OERecord) – Record to append to the dataset
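
The write(), flush(), and finalize() methods are typically used together when filling a dataset record by record. The sketch below shows one plausible ordering under stated assumptions: append_records is a hypothetical helper, and calling flush() before finalize() is an assumption about ordering (finalize() may already flush the buffer on its own).

```python
def append_records(dataset, records):
    """Write each record, flush the buffer, then finalize the dataset.

    `dataset` is any object with write(), flush(), and finalize(),
    such as an orionclient Dataset. Returns the number of records written.
    """
    written = 0
    for record in records:
        dataset.write(record)  # buffered; marks the dataset dirty
        written += 1
    dataset.flush()      # push any records still held in the buffer
    dataset.finalize()   # update the count and mark the dataset consumable
    return written
```

If a write raises, the function stops before finalize(), so a partially written dataset is not marked as safely consumable.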

Files

class orionclient.types.File(id: int = None, session=None, name: str = '', state: str = '', reason: str = '', multipart: bool = False, created='', size: int = NOTHING, owner: int = None, project: int = None, deleted='')

A file stored by Orion that can be accessed from within floes or locally.

Creating a File:

file = File.upload(
    APISession,
    "Local file",
    "/path/to/file.ext"
)
complete()

Completes a multipart file

Raises:
  • ValidationError – If the file is not multipart
  • BadResponse – Failure to complete
classmethod create_multipart_file(session, name, project=None)

Creates a file that allows for uploading in parts.

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • name (string) – Name of the File
  • project (Project/integer) – Project to associate with the upload
Returns:

Instance of File

Return type:

File

Raises:

BadResponse – Failure to create file

download(chunk_size=1048576)

An iterator that returns the chunks of bytes that make up the file

Parameters: chunk_size (integer) – Size of each chunk
Returns: Iterator of bytes
Return type: iter
Raises: AuthorizationRequired – If session doesn’t have valid credentials
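
Because download() yields chunks of bytes, the file can be processed while it streams rather than after it has been fully written. As an illustration, the sketch below writes the chunks to disk and computes a SHA-256 digest in the same pass. save_with_checksum is a hypothetical helper, not an orionclient function; it accepts any iterator of bytes.

```python
import hashlib


def save_with_checksum(chunks, destination):
    """Write an iterator of bytes to `destination`.

    Returns a (size_in_bytes, sha256_hex_digest) tuple.
    """
    digest = hashlib.sha256()
    size = 0
    with open(destination, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()
```

With a File instance named file_obj, this would be called as save_with_checksum(file_obj.download(), "local.bin").
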
download_to_file(filename, chunk_size=1048576)

Download the file to a local file path

Parameters:
  • filename (string) – Path of the file to download the file to
  • chunk_size (integer) – Size of each chunk
Return type:

None

Raises:

AuthorizationRequired – If session doesn’t have valid credentials

classmethod upload(session, name, path, project=None)

Uploads a file to the Orion API

URLs passed as a path must have a protocol, domain and path to be considered valid.

Example: https://example.com/some/path/file.extension
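
The rule above can be approximated locally before calling upload(). This is a sketch using the standard library's urllib.parse, not the validation that orionclient itself performs; looks_like_upload_url is a hypothetical name.

```python
from urllib.parse import urlparse


def looks_like_upload_url(path):
    """Return True if `path` has a protocol, a domain, and a path component."""
    parts = urlparse(path)
    # A bare "/" after the domain is not treated as a usable path here.
    return bool(parts.scheme and parts.netloc and parts.path.strip("/"))
```

A value that fails this check may still be a valid local file path, so the check only distinguishes well-formed URLs from everything else.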

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • name (string) – Name of the File
  • path (string) – Either a local file path or a URL to upload to Orion
  • project (Project/integer) – Project to associate with the upload
Returns:

Instance of File

Return type:

File

Raises:
  • AuthorizationRequired – If session doesn’t have valid credentials
  • ValidationError – If name isn’t a string or is empty
  • ValidationError – If name is longer than 255 characters
  • ValidationError – If project is not an integer or Project
  • ValidationError – If path is not a valid file path or URL
upload_part(data, attempts=3)

Uploads a chunk of data as part of a file. Must be called sequentially.

Parameters:
  • data (bytes) – Data to upload as a chunk
  • attempts (integer) – Number of attempts to upload the chunk
Raises:
  • ValidationError – If the file is not multipart
  • BadResponse – Failure to upload
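
Taken together, create_multipart_file(), upload_part(), and complete() describe a three-step flow: create the file, upload its parts in order, then complete it. A minimal sketch of the last two steps follows. upload_in_parts is a hypothetical helper, and the 8 MiB default part size is an arbitrary assumption, since this reference does not state size limits for parts.

```python
def upload_in_parts(multipart_file, stream, part_size=8 * 1024 * 1024):
    """Upload a binary stream part by part, then complete the file.

    `multipart_file` is any object with upload_part(data) and complete(),
    such as the File returned by File.create_multipart_file().
    Returns the number of parts uploaded.
    """
    if part_size < 1:
        raise ValueError("part_size must be positive")
    parts = 0
    while True:
        data = stream.read(part_size)
        if not data:
            break
        multipart_file.upload_part(data)  # sequential, in stream order
        parts += 1
    multipart_file.complete()
    return parts
```

The stream must be opened in binary mode so that read() returns bytes, matching the documented type of the data parameter.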

Collections and Shards

class orionclient.types.ShardCollection(id: int = None, name: str = '', state: str = '', owner: int = None, size: int = NOTHING, created='', deleted='', reason: str = '', metadata: dict = NOTHING, session=None)

Collections store arbitrary quantities of binary blobs, which the user can define as they see fit. ShardCollections are particularly useful in Workfloes because they allow IO to be parallelized at much larger scales. A collection is populated by creating Shards; each Shard is a reference into an object store that uses signed URLs to let the user upload and retrieve data.

Collection states:

  • “open”: Collection can have shards added, may contain temporary shards
  • “ready”: Collection is closed, contains only ready shards
  • “processing”: Collection is being closed, may contain temporary shards
  • “error”: Collection processing or deletion encountered an error, check reason
  • “deleting”: Collection is being deleted

Creating a ShardCollection:

collection = ShardCollection.create(
    APISession,
    "Corporate Database",
    metadata={
        # Arbitrary JSON as metadata
        "virtual": False,
    }
)
shard = Shard.create(
    collection,
    name="Shard 1",
    metadata={"actives": True}
)
close()

Completes the ShardCollection, making it no longer possible to create new shards associated with the collection

The collection enters the processing state and then ready if successful.
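
Since closing passes through the processing state, a caller that needs the finished collection may have to wait for it to become ready. The sketch below polls for that transition. wait_until_ready and fetch_state are hypothetical names: fetch_state stands for any zero-argument callable that returns the collection's current state string, for example by re-fetching the collection from the session (how the state is refreshed is an assumption, not something this reference specifies).

```python
import time


def wait_until_ready(fetch_state, timeout=300.0, interval=5.0):
    """Poll fetch_state() until it returns "ready".

    Raises RuntimeError if the state becomes "error" and TimeoutError
    if the collection is not ready within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        if state == "ready":
            return state
        if state == "error":
            raise RuntimeError("collection entered the error state; check reason")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"collection still {state!r} after {timeout} seconds")
        time.sleep(interval)
```

The timeout and interval defaults are placeholders; suitable values depend on how many shards the collection holds.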

classmethod create(session, name, metadata=None, project=None)

Creates an empty ShardCollection

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • name (string) – Name of the ShardCollection
  • metadata (dict) – User-specified metadata
Returns:

Instance of ShardCollection

Return type:

ShardCollection

Raises:
  • AuthorizationRequired – If session doesn’t have valid credentials
  • ValidationError – If project is not an integer or Project
get_count()

Returns the number of shards contained within the Collection

Returns: Number of shards
Return type: integer
list_shards(shard_ids: Union[List[int], NoneType] = None, filters: Union[Dict[str, Any], NoneType] = None)

An iterator that returns all of the Shards that are associated with the collection.

All shards in a collection are listed regardless of state, but only “ready” shards should be read or downloaded. See Shard documentation for more details.

Returns: Iterator of Shards
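
Because list_shards() returns shards in every state, readers usually need to filter on the state attribute first. A minimal sketch, with iter_ready_shards as a hypothetical helper name:

```python
def iter_ready_shards(collection):
    """Yield only the shards of `collection` whose state is "ready".

    `collection` is any object with list_shards(), such as a
    ShardCollection; each shard is expected to expose a `state` string.
    """
    for shard in collection.list_shards():
        if shard.state == "ready":
            yield shard
```
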
open()

Reopens the ShardCollection, making it possible to create new shards associated with the collection

The collection enters the open state if successful.

update(name=None, metadata=None)

Updates the name or metadata of the ShardCollection

Parameters:
  • name (string) – Name of the ShardCollection
  • metadata (dict) – User-specified metadata
Raises:

ValidationError – If parameters are wrong type

class orionclient.types.Shard(id: int = None, uri_prefix: str = '', session=None, collection=None, signed_url: str = '', kms_id: str = '', name: str = '', state: str = '', size: int = NOTHING, metadata: dict = NOTHING)

A single file that makes up a ShardCollection. Used for parallelizing IO in Parallel Cubes. Max file size of 5Gb per shard

Shard states:

  • “open”: Shard is empty and can be modified
  • “temporary”: Shard has data associated with it but is not ready yet
  • “error”: An error has occurred with the shard
  • “ready”: Shard has data associated with it, is safe to use, and
    cannot be modified. See Shard’s close method to mark a shard as “ready”

When you first create a shard, its state is “open”. Any shard not in “ready” is deleted when the associated collection is closed.

Creating a Shard:

# Requires a collection to be created
shard = Shard.create(
    collection,
    name="Shard 1",
    metadata={"actives": True}
)
shard.upload_file("path/to/file.ext")
close()

Closes the shard, indicating that the shard has content that the user wants to persist

Changes state of shard from temporary to ready.

Warning

Shards can be marked ready immediately after upload in a serial cube. However, shards created by a parallel cube or parallel cube group must only be marked ready downstream of that cube or group. Otherwise, duplicate shards could be created when work is retried.

classmethod create(collection, name='', metadata=None)

Creates a Shard to be a part of a ShardCollection

Parameters:
  • collection (ShardCollection) – ShardCollection to associate the shard with
  • name (str) – Name of the Shard
  • metadata (dict) – User-specified metadata
Returns:

Instance of Shard

Return type:

Shard

Raises:

AuthorizationRequired – If session doesn’t have valid credentials

download(chunk_size=1048576)

An iterator that returns the chunks of bytes that make up the shard

Parameters: chunk_size (integer) – Size of each chunk
Returns: Iterator of bytes
Return type: iter
Raises: AuthorizationRequired – If collection session doesn’t have valid credentials
download_to_file(filename, chunk_size=1048576, attempts=10)

Downloads the shard to a local file

Parameters:
  • filename (string) – Local file path to download file to
  • chunk_size (integer) – Size of each chunk
  • attempts (int) – Number of download attempts (not request-level retries)
Raises:

AuthorizationRequired – If collection session doesn’t have valid credentials

update(name=None, metadata=None)

Updates the name or metadata of the Shard

Parameters:
  • name (str) – Name of the Shard
  • metadata (dict) – User-specified metadata
Raises:

ValidationError – If parameters are wrong type

upload(shard_data)

Uploads the content of a file handler to the shard. The file must be opened in binary mode (‘rb’), not text mode (‘r’)

Parameters: shard_data (file) – File handler to file to upload
Raises: AuthorizationRequired – If collection session doesn’t have valid credentials
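
The binary-mode requirement is easiest to satisfy by opening the file inside a with block right where upload() is called. A minimal sketch, with upload_from_path as a hypothetical helper name:

```python
def upload_from_path(shard, path):
    """Open `path` in binary mode and pass the handle to shard.upload().

    `shard` is any object with upload(shard_data), such as a Shard.
    """
    with open(path, "rb") as handle:  # "rb", not "r"
        shard.upload(handle)
```
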
upload_file(filename, attempts=10)

Uploads a file to the shard

Warning

This may result in the shard’s id being updated. Do not emit a shard or save its ID until after an upload succeeds.

Parameters:
  • filename (str) – Local file path to upload
  • attempts (int) – Number of create/upload attempts (beyond request-level retries)
Returns:

Instance of Shard

Raises:
  • AuthorizationRequired – If collection session doesn’t have valid credentials
  • ValidationError – If file is empty or too large
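
Since upload_file() returns a Shard and may change the shard's id, it is safest to keep working with the returned instance. The sketch below uploads, closes, and only then reports the id. upload_and_close is a hypothetical helper, and closing immediately after upload is only appropriate in a serial cube, per the warning on close().

```python
def upload_and_close(shard, filename, attempts=10):
    """Upload `filename` to the shard, close it, and return the final id.

    `shard` is any object with upload_file() and close(), such as a Shard.
    """
    uploaded = shard.upload_file(filename, attempts=attempts)
    # Use the returned Shard: its id may differ from the one we started with.
    uploaded.close()
    return uploaded.id
```
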

Tags

class orionclient.types.OrionTag(id: int = None, name: str = '', parent: int = None, owner: int = None, project: int = None)

Represents a private or system tag. Can be set on most resources for organization or discovery purposes.

Tagging a resource:

APISession.tag_resource(file_obj, "tag 1", "tag 2", "tag 3")

Retrieving tags:

tags = APISession.list_tags(file_obj)

Delete a Tag:

APISession.delete_resource(OrionTag(id=618))

Projects

class orionclient.types.Project(id: int = None, name: str = '', description: str = '', owner: int = None, statistics=NOTHING)

An Orion Project resource type. A project is a location for storing the Datasets and Files that pertain to a specific goal.

Creating a Project:

project = APISession.create_resource(
    Project,
    params={"name": "My personal project"}
)

Secrets

class orionclient.types.Secret(id: int = None, name: str = '', owner: int = None, created='', last_updated='', description: str = '', value: str = '', session=None)

Orion Secrets are a way of storing values in an encrypted state that can be retrieved for later use.

Creating a Secret:

secret = APISession.create_resource(
    Secret,
    params={
        "name": "Database Password",
        "description": "The password to the corporate collection database"
    }
)
secret.set_value("corporate-password")
set_value(value)

Sets the value of the secret.

Parameters: value (string) – Secret value to store in Orion
Raises: BadResponse – Fails to set the value

Floe Types

WorkFloePackage

class orionclient.types.WorkFloePackage(id: int = None, session=None, uuid: str = None, created='', source_code=None, state: str = '', reason: str = '', version='', owner: int = None, size: int = NOTHING, specification=NOTHING, processing_detail=NOTHING)

Represents a Floe Package that contains Workfloes (WorkFloeSpec) and Cubes (Cube).

Upload a Workfloe Package:

package = WorkFloePackage.upload(
    APISession,
    "path/to/package.tar.gz"
)
get_environment_log()

Returns an iterator of bytes that make up the environment log

Returns: Iterator of Log bytes
Raises: BadResponse – Unable to make request to log stream
get_inspection_result()

Returns a dictionary that describes the results of package inspection

Returns: Dictionary of inspection
Raises: BadResponse – Unable to retrieve inspection results
classmethod upload(session, path)

Uploads a WorkFloe Package to Orion

Parameters:
  • session (OrionSession) – Orion Session
  • path (string) – Filepath or URL to the workfloe package
Raises:

ValidationError – Invalid file path or URL

Cube

class orionclient.types.Cube(id: int = None, name: str = '', uuid: str = None, title: str = '', state: str = '', version: str = '', cube_class: str = '', specification: dict = NOTHING, owner: int = None, package: int = None)

The description of a Cube that is on Orion.

List Cubes:

cubes = APISession.list_resources(Cube)
for cube in cubes:
    print(cube)

WorkFloeSpec

class orionclient.types.WorkFloeSpec(id: int = None, uuid: str = None, name: str = '', title: str = '', state: str = '', version: str = '', specification: dict = NOTHING, owner: int = None, package: int = NOTHING, cubes: list = NOTHING, systemtags: list = [])

A WorkFloeSpec is the description of a WorkFloe that defines what a job (WorkFloeJob) will perform.

List WorkFloe Specs:

specs = APISession.list_resources(WorkFloeSpec)
for spec in specs:
    print(spec)

WorkFloeJob

class orionclient.types.WorkFloeJob(id: int = None, name: str = '', state: str = '', reason: str = '', created='', started='', finished='', job_type: str = '', notify: bool = False, success: bool = False, project: int = None, status=NOTHING, parameters=NOTHING, system_parameters=NOTHING, owner: int = None, workfloe=None, session=None)

A WorkFloeJob is a specific run of a workfloe (WorkFloeSpec).

List WorkFloe Jobs:

jobs = APISession.list_resources(WorkFloeJob)
for job in jobs:
    print(job)

Starting a Job:

# Retrieve a workfloe spec that you wish to run
workfloe_spec = APISession.get_resource(WorkFloeSpec, 814)
job = WorkFloeJob.start(
    APISession,  # Session to start job with
    workfloe_spec,  # WorkFloeSpec to run the job from
    "Name of My Workfloe",  # Name of the job
    {
        "promoted": {"param": 618},
        "cube": {
            "cube1": {"param": "value"},
            "cube2": {"param": "value"}
        },
        "floe": {},
    }
)
cancel()

Cancels a job if it is running at the time

get_costs()

Returns a dictionary containing cost information for the workfloe and constituent cubes. Cost is cached server-side for 30 seconds, so call this function at most once every 30 seconds.
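
One way to respect the 30-second guidance is to cache the result on the client for the same interval. The wrapper below is a sketch, not part of orionclient: CachedCosts, min_interval, and clock are hypothetical names, and the clock parameter exists only so the behavior can be tested without waiting.

```python
import time


class CachedCosts:
    """Call job.get_costs() at most once per `min_interval` seconds."""

    def __init__(self, job, min_interval=30.0, clock=time.monotonic):
        self._job = job
        self._min_interval = min_interval
        self._clock = clock
        self._last_fetch = None
        self._value = None

    def get(self):
        now = self._clock()
        stale = (
            self._last_fetch is None
            or now - self._last_fetch >= self._min_interval
        )
        if stale:
            self._value = self._job.get_costs()
            self._last_fetch = now
        return self._value
```

Calls made inside the interval return the previously fetched dictionary without contacting Orion.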

Example response:

{
  "total": 0.03,
  "costs": [
    {
      "total": 0,
      "name": "Log processor",
      "usage": [],
      "rank": null
    },
    {
      "total": 0.01,
      "name": "Sink",
      "usage": [
        {
          "hours": 0.17,
          "hourly_rate": 0.06,
          "group_name": ...
        }
      ],
      "rank": 1
    },
    {
      "total": 0.01,
      "name": "Source",
      "usage": [
        {
          "hours": 0.17,
          "hourly_rate": 0.06,
          "group_name": ...
        }
      ],
      "rank": 0
    },
    {
      "total": 0.01,
      "name": "Job Controller",
      "usage": [
        {
          "hours": 0.17,
          "hourly_rate": 0.06,
          "group_name": ...
        }
      ],
      "rank": null
    }
  ]
}
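
A response shaped like the example above can be reduced to one row per entry. The sketch below assumes the dictionary has been decoded from JSON, so null ranks appear as None, and it relies only on the keys shown in the example. summarize_costs is a hypothetical helper name.

```python
def summarize_costs(costs):
    """Return (name, total, hours) tuples, ranked entries first.

    Entries are ordered by ascending rank; entries whose rank is None
    keep their original relative order and sort last.
    """
    rows = []
    for entry in costs["costs"]:
        hours = sum(usage["hours"] for usage in entry["usage"])
        rows.append((entry["rank"], entry["name"], entry["total"], hours))
    rows.sort(key=lambda row: (row[0] is None, 0 if row[0] is None else row[0]))
    return [(name, total, hours) for _, name, total, hours in rows]
```
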
logs()

Returns an iterator of bytes that make up the job logs

Returns: Iterator of bytes
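
The byte chunks returned by logs() do not necessarily end on line boundaries, so reading the log line by line requires buffering. The sketch below assumes the log is newline-separated UTF-8 text, which this reference does not confirm; iter_log_lines is a hypothetical helper name.

```python
def iter_log_lines(chunks, encoding="utf-8"):
    """Yield decoded lines from an iterator of byte chunks.

    Splitting happens on the raw newline byte before decoding, so a
    multi-byte character split across two chunks is decoded intact.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; keep the rest.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode(encoding, errors="replace")
    if pending:
        yield pending.decode(encoding, errors="replace")
```

With a WorkFloeJob instance named job, the lines would be read with: for line in iter_log_lines(job.logs()).
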
classmethod start(session, workfloe_spec, name, parameters, project=None, args=None)

Start will trigger a Job using the spec and parameters that are provided using the credentials that the session is configured for.

Parameters:
  • session (OrionSession) – Authenticated OrionSession
  • workfloe_spec (WorkFloeSpec) – WorkFloeSpec to run job from
  • name (string) – Name of the job
  • parameters (dict) – Parameters for the job to be run with
  • args (list<string>) – Command line arguments to be parsed and added to parameters
Returns:

Instance of WorkFloeJob

Return type:

WorkFloeJob

Raises:

AuthorizationRequired – If session doesn’t have valid credentials

trigger_debug_export()

Requests that Orion export debug data for this job

An email will be sent to the email registered to the user profile that triggers the export

Service Types

Services

class orionclient.types.Service(id: int = None, owner: int = None, name: str = '', description: str = '', url: str = '', metadata: dict = NOTHING, accepts_orion_tokens: bool = False, session=None)

Orion Services is a registry of internal or external services integrated with the larger Orion system.

Creating a Service:

service = APISession.create_resource(
    Service,
    params={
        "name": "Internal Cluster",
        "description": "Internal service that provides API access",
        "url": "special-sauce.company.com/api/",
        "metadata": {
            # Arbitrary JSON as metadata
            "contact": "help@company.com",
        }
    }
)

Management Types

Tokens

class orionclient.types.Token(id: int = None, description: str = '', value: str = '', owner: int = None, created='')

A token used to authenticate with the Orion APIs

List tokens:

for token in APISession.list_resources(Token):
    print(token)

Create token:

token = APISession.create_resource(Token, params={"description": "new token"})

Users

class orionclient.types.OrionUser(id: int = None, email: str = '', username: str = '', first_name: str = '', last_name: str = '', administrator: bool = False)

An Orion User.

List Users:

users = APISession.list_resources(OrionUser)
for user in users:
    print(user)
class orionclient.types.UserProfile(id: int = None, email='', username: str = '', first_name: str = '', last_name: str = '', organization: int = None, staff=False, remote=False, administrator=False, organization_feedback_email: str = '')

The profile that Orion Client is authenticated against

Get current profile:

profile = APISession.get_user_profile()

Instance Groups

class orionclient.types.InstanceGroup(id: int = None, instance_detail=NOTHING, usage=0.0, pool='', name='', size=0, min_size=0, max_size=0, desired_size=0, affinity=0.0, cost=0.0, spot=False, num_healthy=0, min_reserve=0.0, state='')

A group of Instances, maintained by Orion, on which Cubes run.

Warning

Does not support get_resource() as groups do not have identifiers. Use list_resources() to access the groups.

List Instance Groups:

groups = APISession.list_resources(InstanceGroup)
for group in groups:
    print(group)

Costs

class orionclient.types.LedgerEntry(id: int = None, created='', comment: str = '', creator: int = None, owner: int = None, entry_type: str = '', amount=Decimal('0.00'), project: int = None, job: int = None)

Entries in Orion that relate to the costs that users have.

Other users’ entries are visible only when authenticated as an Orion Admin.

List Ledger Entries:

entries = APISession.list_resources(LedgerEntry)
for entry in entries:
    print(entry)
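
Listed entries can be aggregated with the entry_type and amount attributes from the class signature. The sketch below sums amounts per entry type using Decimal arithmetic. total_by_entry_type is a hypothetical helper, and the set of possible entry_type values is not listed in this reference, so the function treats them as opaque strings.

```python
from collections import defaultdict
from decimal import Decimal


def total_by_entry_type(entries):
    """Sum the `amount` of each entry, grouped by its `entry_type`.

    `entries` is any iterable of objects with `entry_type` and `amount`
    attributes, such as LedgerEntry instances.
    """
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for entry in entries:
        totals[entry.entry_type] += Decimal(entry.amount)
    return dict(totals)
```
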
class orionclient.types.Balance(id: int = None, balance='', timestamp='')

Balances are snapshots of spending on Orion.

List Balances:

for balance in APISession.list_resources(Balance):
    print(balance)
class orionclient.types.CurrentBalance(id: int = None, balance=Decimal('0.0'), ytd=Decimal('0.0'), mtd=Decimal('0.0'), last_updated='')

The current balance for the organization.

Get current balance:

balance = APISession.get_current_balance()