Collection to Hitlist Dataset


Creates a dataset of the top-scoring compounds from a collection, ranked by the value of a float field in the collection.


Title : Collection to Hitlist Dataset
Tags : Large Scale Floes Collection Hitlist Hit List Utility
Python Name : #06_collection_to_hit_list_dataset



  • Input Collection Collection to extract top scoring records from.
    Type : collection_source
    Required : True
    Python Name : input_collection
  • Sort Field Field in the collection to sort on. If you are processing the Raw Results collection from a giga docking run, the sort field is ‘Chemgauss4’.
    Type : field_parameter::float
    Required : True
    Python Name : sort_field


  • Output Dataset Output dataset to write to.
    Type : dataset_out
    Required : True
    Default : Collection Hit List
    Python Name : output_dataset


  • Descending If ‘On’, scores will be sorted in descending order (i.e., high scores will appear at the top of the hit list). If ‘Off’, scores will be sorted in ascending order (i.e., low scores will appear at the top of the hit list). Hint: Set this to ‘On’ when processing ROCS/FastROCS results and ‘Off’ when processing docking results.
    Type : boolean
    Required : True
    Default : False
    Choices : True, False
    Python Name : descending
  • Sort Hit List If turned off, the output dataset will still contain the top N molecules from the input collection, but the molecules will not be sorted within the dataset. Turning this off reduces the memory needed for the hit list cube.
    Type : boolean
    Required : True
    Default : True
    Choices : True, False
    Python Name : sort_switch
  • Hit List Size Size of the output hit list. If this value is greater than 100K and ‘Sort Hit List’ is true, the amount of memory for the serial cubes may need to be increased (see the ‘Serial Cube Memory’ parameter).
    Type : integer
    Required : True
    Default : 10000
    Min Value : 1
    Python Name : hit_list_size
  • Serial Cube Memory Memory (in MB) allocated to both the ‘Hit List’ and ‘Find Score Cutoff’ cubes.
    Type : decimal
    Required : False
    Default : 30720
    Range : 7168 to 8589934592
    Python Name : serial_memory_mb
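
The interaction of the ‘Descending’, ‘Sort Hit List’ and ‘Hit List Size’ parameters can be illustrated with a small standalone Python sketch. This is not the floe's implementation and does not use the Orion API. It assumes a two-step approach suggested by the cube names above (find a score cutoff, then collect the records that pass it), and the function names `find_score_cutoff` and `build_hit_list` as well as the `(identifier, score)` record layout are hypothetical. The keyword arguments reuse the Python Names listed above.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical record layout: (identifier, value of the sort field).
Record = Tuple[str, float]


def find_score_cutoff(scores: Iterable[float], hit_list_size: int,
                      descending: bool = False) -> float:
    """Return the score of the N-th best record (the worst score that still makes the list)."""
    pick = heapq.nlargest if descending else heapq.nsmallest
    best = pick(hit_list_size, scores)
    if not best:
        raise ValueError("no scores to rank")
    return best[-1]


def build_hit_list(records: Iterable[Record], hit_list_size: int = 10000,
                   descending: bool = False,
                   sort_switch: bool = True) -> List[Record]:
    """Keep the top `hit_list_size` records; sort them only if `sort_switch` is on."""
    if hit_list_size < 1:
        raise ValueError("hit_list_size must be at least 1")
    records = list(records)
    if not records:
        return []
    cutoff = find_score_cutoff((s for _, s in records), hit_list_size, descending)
    # Records strictly better than the cutoff always belong in the hit list.
    if descending:
        better = [r for r in records if r[1] > cutoff]
    else:
        better = [r for r in records if r[1] < cutoff]
    # Records tied at the cutoff fill the remaining slots.
    ties = [r for r in records if r[1] == cutoff]
    hits = better + ties[: hit_list_size - len(better)]
    if sort_switch:
        # Sorting is the optional step; skipping it leaves the hits unordered.
        hits.sort(key=lambda r: r[1], reverse=descending)
    return hits


# Docking-style scores: lower is better, so descending stays off.
docking = [("a", -5.0), ("b", -9.5), ("c", -7.2), ("d", -9.5), ("e", -1.0)]
print(build_hit_list(docking, hit_list_size=3))
```

With `sort_switch=False` the same records are returned, but their order carries no meaning. With `descending=True` the highest scores are kept instead, which matches the hint given for ROCS/FastROCS results.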