Orion Platform File Cubes
Files read into a floe usually have to be converted to records before other cubes can interpret them. The following cubes convert files to records and records to various file formats.
Cubes
Binary File Reader
Import Statement
from orionplatform.cubes import BinaryFileReaderCube
Description
A cube that reads one or more files and emits their contents in a single stream. Generally used with a BinaryInputPort initializer.
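To make the "single stream" behavior concrete, here is a minimal standard-library sketch that reads several files in binary mode and yields their contents, in order, as one sequence of byte chunks. It does not use the Orion API: `stream_files` and its `chunk_size` argument are hypothetical names chosen for this illustration, and the cube's actual implementation may differ.

```python
from pathlib import Path
from typing import Iterable, Iterator


def stream_files(paths: Iterable[Path], chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield the contents of each file, in order, as one stream of byte chunks."""
    for path in paths:
        # Binary mode: bytes are passed through without decoding.
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break  # end of this file; continue with the next one
                yield chunk
```

Joining the yielded chunks reproduces the concatenation of the input files, which is what a downstream consumer of a single binary stream would see.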
Output Ports
success: BinaryOutputPort
Ungrouped Parameters
File to use as input: FileInputParameter
The file to read from in binary mode
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds
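The memory and disk parameters above are expressed in MiB (1048576 B) and recommend requesting somewhat more than the cube strictly needs. The arithmetic can be sketched as follows. The helper name `mib_with_margin` and the default margin of 200 MiB are assumptions made for this example (one reading of "a couple hundred MiB"), not values defined by Orion.

```python
MIB = 1048576  # bytes per MiB, as stated in the parameter descriptions


def mib_with_margin(required_bytes: int, margin_mib: float = 200.0) -> float:
    """Convert a byte requirement to MiB and add a fixed overhead margin."""
    return required_bytes / MIB + margin_mib


# A cube expected to need 3 GiB of memory would request 3072 + 200 MiB.
print(mib_with_margin(3 * 1024**3))  # 3272.0
```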
File to Record Converter
Import Statement
from orionplatform.cubes import FileToRecordConverter
Description
Reads a molecular or CSV file and converts it to records.
Output Ports
success: RecordOutputPort
Ungrouped Parameters
File to use as input: FileInputParameter
Molecular or CSV file to convert to records for use in a floe
File extension to append to input the file name: StringParameter
Override the file format derived from input file name
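The two parameters above interact: the format is normally derived from the input file name, and the extension parameter overrides it. The sketch below illustrates that idea with the standard library only, using CSV as the example format. `resolve_format` and `csv_to_records` are hypothetical helpers written for this illustration, and plain dictionaries stand in for Orion records.

```python
import csv
import io
from pathlib import Path
from typing import Dict, List, Optional


def resolve_format(file_name: str, override: Optional[str] = None) -> str:
    """Return the format extension, preferring an explicit override."""
    ext = override if override else Path(file_name).suffix
    return ext.lower().lstrip(".")


def csv_to_records(text: str) -> List[Dict[str, str]]:
    """Turn CSV text with a header row into one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


# A file uploaded without an extension can still be parsed as CSV.
if resolve_format("upload", override=".csv") == "csv":
    records = csv_to_records("name,mw\naspirin,180.16\n")
```

An override is mainly useful when the file name carries no extension or a misleading one.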
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds
Archive Reader
Import Statement
from orionplatform.cubes import ArchiveConverterCube
Description
Converts a tar or zip file into records (if the output port is connected) or directly into datasets (if the output port isn’t connected).
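As a rough illustration of turning an archive into a stream of records, the following standard-library sketch yields one dictionary per regular file found in a zip or tar archive. It is not the cube's implementation: `archive_members` is a hypothetical name, the dictionaries stand in for Orion records, and the dataset-creation path is not modeled.

```python
import io
import tarfile
import zipfile
from typing import Dict, Iterator, Union


def archive_members(data: bytes) -> Iterator[Dict[str, Union[str, bytes]]]:
    """Yield one {"name", "content"} dict per regular file in a zip or tar archive."""
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            for info in archive.infolist():
                if not info.is_dir():  # skip directory entries
                    yield {"name": info.filename, "content": archive.read(info)}
    else:
        buffer.seek(0)
        # "r:*" lets tarfile detect gzip, bz2, or xz compression transparently.
        with tarfile.open(fileobj=buffer, mode="r:*") as archive:
            for member in archive:
                if member.isfile():  # skip directories, links, and devices
                    extracted = archive.extractfile(member)
                    yield {"name": member.name, "content": extracted.read()}
```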
Output Ports
success: RecordOutputPort
Ungrouped Parameters
Tar or zip file to use as input: FileInputParameter
Archive file to convert to records
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds
Record to File Converter
Import Statement
from orionplatform.cubes import RecordsToFileConverter
Description
A writer that converts a stream of records to an OE-recognized file.
The format is determined from the file extension given to the “file_name” parameter. The cube raises an exception if the content of the records cannot be converted into the requested file format.
Input Ports
intake: RecordInputPort
Ungrouped Parameters
file_name: FileOutputParameter
Name of the file to create from records. The file extension will determine the format.
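The extension-driven behavior described above, including the failure case, can be sketched with the standard library. This is an analogy only: `records_to_text` is a hypothetical helper, CSV and JSON stand in for the OE-recognized formats the cube actually writes, and `ValueError` stands in for whatever exception the cube raises.

```python
import csv
import io
import json
from pathlib import Path
from typing import Dict, List


def records_to_text(records: List[Dict[str, object]], file_name: str) -> str:
    """Serialize records in the format implied by the extension of file_name."""
    ext = Path(file_name).suffix.lower()
    if ext == ".json":
        return json.dumps(records)
    if ext == ".csv":
        if not records:
            return ""
        out = io.StringIO()
        # The first record defines the header columns.
        writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
        writer.writeheader()
        # DictWriter raises ValueError if a record has a field not in the header.
        writer.writerows(records)
        return out.getvalue()
    raise ValueError(f"Unsupported output format: {ext!r}")
```

Failing early on an unknown extension or on inconsistent records mirrors the documented behavior of raising rather than writing a partial or malformed file.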
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds
Record File to Record Converter
Import Statement
from orionplatform.cubes import RecordFileToRecordConverter
Description
Reads a record file and converts it to records.
Output Ports
success: RecordOutputPort
Ungrouped Parameters
File to use as input: FileInputParameter
Record file to use as input to a floe
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds
Record to Record File
Import Statement
from orionplatform.cubes import RecordsToRecordFileConverter
Description
A writer that writes a stream of records to a record file (i.e. oedb).
Input Ports
intake: RecordBytesInputPort
Ungrouped Parameters
file_name: FileOutputParameter
Name of the file to create from records
Parameter Groups
Floe Internals
buffer_size: IntegerParameter
The amount of data buffered before sending downstream
Hardware
CPUs: IntegerParameter
The number of CPUs to run this cube with
Temporary Disk Space (MiB): DecimalParameter
The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
GPUs: IntegerParameter
The number of GPUs to run this cube with
Instance Tags: StringParameter
Only run on machines with matching tags (comma separated)
Instance Type: StringParameter
The type of instance that this cube needs to be run on
Memory (MiB): DecimalParameter
The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Shared Memory (MiB): DecimalParameter
The amount of shared memory to allow a container to address
Spot policy: StringParameter
Control cube placement on spot market instances
Metrics
Cube Metrics: StringParameter
Set of metrics to be collected
Metric Period: DecimalParameter
How often to sample metrics, in seconds