Dataset Subsetting Based on String Keys
- This flow takes a dataset and two input parameters: the name of a string field in that dataset,
and a string parameter. It splits the string parameter by line to create keys, and then emits the records from the
input dataset whose value for the specified string field matches any of these keys.
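
The matching step described above can be sketched in plain Python. This is only an illustration, not the flow's actual implementation: records are modeled as dictionaries, the names `parse_keys` and `subset_records` are hypothetical, and it assumes that the keys come from the string parameter, that blank lines and surrounding whitespace are ignored, and that a match means exact string equality.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: a record is modeled as a plain dict of field name -> string value.
Record = Dict[str, str]


def parse_keys(key_text: str) -> Set[str]:
    """Split the string parameter by line into a set of keys.

    Assumption: blank lines are skipped and surrounding whitespace is stripped.
    """
    return {line.strip() for line in key_text.splitlines() if line.strip()}


def subset_records(
    records: Iterable[Record],
    field_name: str,
    key_text: str,
    write_unmatched: bool = False,
) -> Tuple[List[Record], List[Record]]:
    """Return (matched, unmatched) records.

    A record is matched when its value for `field_name` equals one of the keys.
    Unmatched records are only collected when `write_unmatched` is True,
    mirroring the 'Write unmatched dataset' switch.
    """
    keys = parse_keys(key_text)
    matched: List[Record] = []
    unmatched: List[Record] = []
    for record in records:
        # Records that lack the field yield None, which never matches a key.
        if record.get(field_name) in keys:
            matched.append(record)
        elif write_unmatched:
            unmatched.append(record)
    return matched, unmatched


# Hypothetical example data: three records and a two-line key string.
example_records = [
    {"Subset Field": "alpha"},
    {"Subset Field": "beta"},
    {"Subset Field": "gamma"},
]
example_matched, example_unmatched = subset_records(
    example_records, "Subset Field", "alpha\ngamma\n", write_unmatched=True
)
```

Storing the keys in a set keeps each lookup at constant time on average, which matters when the key list is long and the dataset is processed in large batches.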
Extra Required Parameters
- CPUs (integer): Number of CPUs to use in each instance of the subset cube. Default: 4. Min: 1. Max: 128.
- Input String Field To Use For Subsetting (Field Type: String): The name of the string data field. Default: Subset Field.
- Number of messages to distribute at a time (integer): Number of records to process in each instance of the subset cube. Default: 5000. Min: 1. Max: 65535.
- Output matched dataset (dataset_out): Output dataset to write to. Default: matched.
- Write unmatched dataset (boolean): If off, the ‘unmatched’ dataset is not generated. Default: False.
- Input Dataset (data_source): Dataset to subset.