Command Line Parsing

This chapter describes how to parse command line options using the OEInterface class and associated free functions.

There are three basic steps to using the OEInterface class:

Configure parameters

Tell the OEInterface the names and types of the parameters for it to expect when it parses the command line.

Parse the command line

User specified parameter values are read from the command line into the OEInterface.

Get parameter values

The values of individual parameters are extracted from the OEInterface object via a get method.

Getting Started

The easiest way to get started with OEInterface for parsing command line options is to use the convenience constructor that takes an interface definition and a command line to parse. The constructor configures the interface with the specified definition, then parses the options from the command line passed in as the argument array (args in Listing 1).

Listing 1: Getting started with OEInterface command line parsing

using System;
using OpenEye.OEChem;

public class GettingStartedWithOEInterface
{
    static string InterfaceData = @"
!PARAMETER -b
  !TYPE bool
  !BRIEF A boolean parameter
!END
!PARAMETER -i
  !TYPE int
  !BRIEF An integer parameter
!END
!PARAMETER -f
  !TYPE float
  !BRIEF A float parameter
!END
!PARAMETER -str
  !TYPE string
  !BRIEF A string parameter
!END
";

    public static int Main(string[] args)
    {
        OEInterface itf = new OEInterface(InterfaceData, "GettingStartedWithOEInterface", args);

        Console.WriteLine("-b = " + itf.GetBool("-b"));
        Console.WriteLine("-i = " + itf.GetInt("-i"));
        Console.WriteLine("-f = " + itf.GetFloat("-f"));
        Console.WriteLine("-str = " + itf.GetString("-str"));
        return 0;
    }
}

The values specified on the command line are then accessed on the OEInterface object using a Get method that corresponds to the type of the parameter. The following is an example of executing the program in Listing 1:

Warning

The convenience constructor will call the system exit function forcing the program to exit in the following situations:

  • The interface definition is malformed
  • The user has requested help on any of the parameters, e.g., --help -str
  • The command line is malformed, i.e., missing or extra parameters

For these reasons the convenience constructor may not be suitable for use in a library setting; it is intended for standalone applications.

Parameter Value Access

The Get methods of OEInterface return the value the user entered on the command line for the specified parameter. If the user did not specify the parameter on the command line, the default value specified in the interface definition is returned. If there is neither a default nor a user specified value, a default constructed value is returned (e.g., 0.0f for float parameters and an empty string for string parameters) and a warning is issued. For example, the following demonstrates how running the program in Listing 1 without any arguments displays several warnings about uninitialized values being used.

The warnings can be avoided by using the OEInterface.Has methods to check whether a parameter has a default value or a user specified value before reading it. Listing 2 demonstrates how to check for the existence of certain parameters.

Listing 2: Testing for parameter existence with the Has method

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;

public class UsingOEInterfaceHas
{
    static string InterfaceData = @"
!PARAMETER -b
  !TYPE bool
  !BRIEF A boolean parameter
!END
!PARAMETER -i
  !TYPE int
  !BRIEF An integer parameter
!END
!PARAMETER -f
  !TYPE float
  !BRIEF A float parameter
!END
!PARAMETER -str
  !TYPE string
  !BRIEF A string parameter
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceHas", argv);

        if(itf.HasBool("-b"))
            Console.WriteLine("-b = " + itf.GetBool("-b"));
        if(itf.HasInt("-i"))
            Console.WriteLine("-i = " + itf.GetInt("-i"));
        if(itf.HasFloat("-f"))
            Console.WriteLine("-f = " + itf.GetFloat("-f"));
        if(itf.HasString("-str"))
            Console.WriteLine("-str = " + itf.GetString("-str"));
    }
}

The following demonstrates the output of the program in Listing 2. Notice that, because the interface definition in Listing 2 specifies no default values, only parameters present on the command line cause OEInterface.Has to return true. This is an important distinction when dealing with boolean parameters: OEInterface.Has does not return the value of the boolean parameter, only whether a value exists.
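The distinction between existence and value for boolean parameters can be sketched as follows. This is a minimal illustration that uses only the constructor and the HasBool and GetBool methods shown in Listings 1 and 2; the parameter name -verbose and the class name are hypothetical.

```csharp
using System;
using OpenEye.OEChem;

public class HasVersusGetBool
{
    // Hypothetical parameter: -verbose has no !DEFAULT, so HasBool is only
    // true when the user supplies a value on the command line.
    static string InterfaceData = @"
!PARAMETER -verbose
  !TYPE bool
  !BRIEF Print extra output
!END
";

    public static void Main(string[] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "HasVersusGetBool", argv);

        if (!itf.HasBool("-verbose"))
        {
            // No value exists at all: the user did not mention -verbose.
            Console.WriteLine("-verbose was not specified");
        }
        else if (itf.GetBool("-verbose"))
        {
            // A value exists and it is true.
            Console.WriteLine("-verbose is on");
        }
        else
        {
            // A value exists and it is false.
            Console.WriteLine("-verbose is off");
        }
    }
}
```

Checking HasBool first keeps "not specified" separate from "explicitly false", which a call to GetBool alone cannot distinguish.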

Default Values

Using OEInterface.Has all over a program can become quite clumsy. Therefore it is recommended to use !DEFAULT in the parameter interface definition to specify the parameter’s default value whenever a default value can be specified. When a parameter has a default value OEInterface.Has will always return true. Listing 3 demonstrates how to set a default value for a parameter.

Listing 3: Setting a default value for a parameter

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;


public class UsingOEInterfaceDefaults
{
    static String InterfaceData = @"
!PARAMETER -b
  !TYPE bool
  !DEFAULT false
  !BRIEF A boolean parameter
!END
!PARAMETER -i
  !TYPE int
  !DEFAULT 2
  !BRIEF An integer parameter
!END
!PARAMETER -f
  !TYPE float
  !DEFAULT 0.5
  !BRIEF A float parameter
!END
!PARAMETER -str
  !TYPE string
  !DEFAULT foo
  !BRIEF A string parameter
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceDefaults", argv);

        Console.WriteLine("-b = " + itf.GetBool("-b"));
        Console.WriteLine("-i = " + itf.GetInt("-i"));
        Console.WriteLine("-f = " + itf.GetFloat("-f"));
        Console.WriteLine("-str = " + itf.GetString("-str"));
    }
}

This allows the code to take a very simple structure that doesn’t have to care whether the parameter was actually specified on the command line. The following is what is printed by the program in Listing 3 when no arguments are given.

Requiring Parameters

Programs usually have a bare minimum set of parameters that must be specified. A parameter can be marked as required in the interface definition by using !REQUIRED true in the parameter definition. Listing 4 demonstrates making all the parameters used in the previous example required.

Listing 4: Requiring all parameters to be specified.

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;

public class UsingOEInterfaceRequired
{
    static string InterfaceData = @"
!PARAMETER -b
  !TYPE bool
  !REQUIRED true
  !BRIEF A boolean parameter
!END
!PARAMETER -i
  !TYPE int
  !REQUIRED true
  !BRIEF An integer parameter
!END
!PARAMETER -f
  !TYPE float
  !REQUIRED true
  !BRIEF A float parameter
!END
!PARAMETER -str
  !TYPE string
  !REQUIRED true
  !BRIEF A string parameter
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceRequired", argv);

        Console.WriteLine("-b = {0}", itf.GetBool("-b"));
        Console.WriteLine("-i = {0}", itf.GetInt("-i"));
        Console.WriteLine("-f = {0}", itf.GetFloat("-f"));
        Console.WriteLine("-str = {0}", itf.GetString("-str"));
    }
}

If any required parameter is missing from the command line, an appropriate error message is reported during the command line parsing stage. The following demonstrates the error messages reported by the program in Listing 4:

Parameter Lists

Sometimes it is useful to allow multiple values for a given parameter. The !LIST field is used to specify that a parameter can have multiple values separated by spaces on the command line. The OEInterface.GetList method for the parameter’s type (GetStringList in Listing 5) should then be used to get an iterator over the parameter’s values. Listing 5 demonstrates how to specify that a parameter is a list of values and then how to access it.

Listing 5: Specifying that a parameter is a list of values

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;


public class UsingOEInterfaceList
{
    static String InterfaceData = @"
!PARAMETER -strs
  !TYPE string
  !LIST true
  !BRIEF Some string parameters
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceList", argv);

        Console.Write("-strs = ");
        foreach (string param in itf.GetStringList("-strs"))
        {
            Console.WriteLine(param);
            Console.Write("        ");
        }
        Console.WriteLine();
    }
}

The command line parser will then assign elements as values based upon their position in the argv that was passed to the program. This allows the user to pass values that have spaces in them by simply quoting the relevant section. The following demonstrates the behavior of the program in Listing 5:

Positional Parameters

For commonly used parameters it is often useful to allow them to be specified by their position on the command line. The !KEYLESS record is used to specify that a parameter can be assigned from a positional value on the command line. Listing 6 demonstrates how to specify input and output file name parameters that can be specified from their position on the command line.

Listing 6: Specifying keyless input and output file parameters

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;

public class UsingOEInterfaceKeyless
{
    static String InterfaceData = @"
!PARAMETER -i
  !TYPE string
  !KEYLESS 1
  !BRIEF Input file name
!END
!PARAMETER -o
  !TYPE string
  !KEYLESS 2
  !BRIEF Output file name
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceKeyless", argv);

        Console.WriteLine("Input file name = " + itf.GetString("-i"));
        Console.WriteLine("Output file name = " + itf.GetString("-o"));
    }
}

!KEYLESS parameters can still be specified using their command line flag. However, positional arguments need to be the last values on the command line. The following demonstrates how !KEYLESS parameters can and cannot be specified on the command line:

Warning

An ambiguity occurs when mixing !KEYLESS with !LIST. There is no way for the parser to differentiate between multiple values for a !LIST and different !KEYLESS parameters. OEInterface resolves this by requiring that any keyless list parameter must be the last keyless parameter.

Molecule Parameters

The previous examples illustrated how to handle simple parameters such as float, int, or string. The OEInterface also provides a convenient way to open a molecule file and return its first molecule.

Similarly the first multi-conformer molecule can be imported from a molecule file by setting the type of a parameter as !TYPE OEMol.
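A parameter definition along the following lines would request a multi-conformer molecule. This is only a sketch: the parameter name -in and the description text are placeholders, and the accessor used to retrieve the molecule is left out because this section does not name it.

```
!PARAMETER -in
  !TYPE OEMol
  !REQUIRED true
  !BRIEF Input molecule file
!END
```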

Command Line Help

The OEInterface object can hold a significant amount of information about its parameters: default values, text descriptions, etc. All this information allows the OEInterface to provide interactive help to the user automatically, based upon the interface definition. Listing 8 is a fairly simple program that concatenates all the molecules given into a single molecule, yet it has a fairly complex set of command line parameters, which OEInterface handles automatically.

Listing 8: A program to concatenate any number of molecules together

/**********************************************************************
  Copyright (C) 2010 OpenEye Scientific Software, Inc.
***********************************************************************/
using System;
using OpenEye.OEChem;

public class UsingOEInterfaceHelp
{
    static String InterfaceData = @"
!BRIEF UsingOEInterfaceHelp [-d <delimiter>] [-o] <output> [-i] <input1> <input2> ...
!PARAMETER -delim
  !ALIAS -d
  !TYPE string
  !DEFAULT _
  !BRIEF Title delimiter
  !DETAIL
         This is the value given to the OEChem function OEAddMols to
         separate the titles of the input molecules in the output.
!END
!PARAMETER -output
  !ALIAS -o
  !TYPE string
  !REQUIRED true
  !KEYLESS 1
  !BRIEF The output file
  !DETAIL
         The molecule file to output to, can be any file format
         OEChem supports.
!END
!PARAMETER -inputs
  !ALIAS -i
  !TYPE string
  !REQUIRED true
  !LIST true
  !KEYLESS 2
  !BRIEF The input files
  !DETAIL
         A list of molecule files to add together. Note, only the
         first molecule from every file will be added.
!END
";

    public static void Main(string [] argv)
    {
        OEInterface itf = new OEInterface(InterfaceData, "UsingOEInterfaceHelp", argv);

        OEGraphMol outmol = new OEGraphMol();
        foreach (string param in itf.GetStringList("-inputs"))
        {
            oemolistream ims = new oemolistream(param);

            OEGraphMol inmol = new OEGraphMol();
            OEChem.OEReadMolecule(ims, inmol);

            OEChem.OEAddMols(outmol, inmol, itf.GetString("-delim"));
        }

        oemolostream oms = new oemolostream(itf.GetString("-output"));
        OEChem.OEWriteMolecule(oms, outmol);
    }
}

Note

  • !BRIEF can be used outside of a !PARAMETER to give a concise one line command line usage.
  • !ALIAS is used to specify an alternative name the parameter can be specified with.
  • !DETAIL is used for a more detailed description of the parameter.

Whenever there are !REQUIRED parameters the OEInterface will complain about an empty command line. This is usually the first way users interact with a command line program so the goal is to give them only enough information to get the program running. This is where specifying a top-level !BRIEF is exceptionally useful to the end user. The following is the output of the program in Listing 8 when no parameters are given:

More information is accessible from the command line by specifying --help as the first argument on the command line. The following is an example of requesting help from the program in Listing 8:

As seen above, --help can take arguments that allow the user to introspect even further into the semantics of a particular parameter.

Note

OECheckHelp is the low-level function used to check a command line for --help. By default the OEInterface constructor and the OEParseCommandLine function call OECheckHelp automatically so most users will never need to use this function directly.

Interface Definition Specification

Configuring an OEInterface is the process of telling it the name, type, and other details of all the command line parameters the program is going to use. An interface definition consists of a collection of PARAMETER blocks optionally organized into CATEGORY blocks. A PARAMETER block must always have the following information:

NAME: The command line flag, e.g., -i, -str, -cutoff
TYPE: The kind of information being passed, e.g., float, int, string, bool

A PARAMETER can also optionally contain the following additional information:

ALIAS: One or more aliases for the parameter name, generally used to create shorthand for lengthy parameter names. For example, a parameter named -energy_cutoff_window could be given an alias -ecw, which would make typing -energy_cutoff_window 10.0 on the command line equivalent to -ecw 10.0.
DEFAULT: A default value for the parameter that will be used if the parameter is not specified by the user on the command line.
REQUIRED: A flag specifying that the parameter is required. If a parameter is required the program will report an error if the user does not specify that parameter. By default parameters are not required.
LIST: A flag specifying whether the parameter is a list of multiple values. For example, ./program -inputs file1.smi file2.smi file3.smi.
KEYLESS: A number, N, that specifies that this parameter may be fulfilled by the Nth value on the command line that does not have a parameter flag specifically associated with it. For example, if -i had a KEYLESS value of 1 specified, the following are equivalent command lines: ./program -i foo.smi and ./program foo.smi.
VISIBILITY: A visibility level that controls when the parameter appears when the user requests --help. Valid values are: simple, normal, hidden. The default is simple if a VISIBILITY isn’t specified.
LEGAL_VALUE: A valid value for the parameter. The value given for the parameter must be one of the values specified in a LEGAL_VALUE field.
ILLEGAL_VALUE: A value that cannot be specified as the parameter value.
LEGAL_RANGE: A range that the parameter value must fall into. More useful than LEGAL_VALUE when the parameter is a float.
ILLEGAL_RANGE: A range that the parameter value must not fall into. More useful than ILLEGAL_VALUE when the parameter is a float.
BRIEF: A one line description of the parameter that will be displayed to users when listing multiple command line options using the high-level --help functions.
DETAIL: A multi line description of the parameter that will be displayed to users when the user requests help about a specific parameter, e.g., --help -foo.

Therefore, a PARAMETER block takes the following form:

!PARAMETER <name> [order priority]
  !ALIAS <alias>
  !TYPE <type>
  !DEFAULT <default value>
  !REQUIRED <true or false>
  !LIST <true or false>
  !KEYLESS <N>
  !VISIBILITY <visibility>
  !LEGAL_VALUE <value>
  !ILLEGAL_VALUE <value>
  !LEGAL_RANGE <hi value> <low value>
  !ILLEGAL_RANGE <hi value> <low value>
  !BRIEF <brief description>
  !DETAIL
    <detailed description line 1>
    <detailed description line 2>
    <detailed description line 3>
    .
    .
    .
!END
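As an illustration, the template might be filled in as follows for a parameter restricted to a small set of values. The parameter name -format, its alias, and its values are hypothetical, and repeating !LEGAL_VALUE once per allowed value is an assumption based on the LEGAL_VALUE description above.

```
!PARAMETER -format
  !ALIAS -fmt
  !TYPE string
  !DEFAULT smi
  !LEGAL_VALUE smi
  !LEGAL_VALUE sdf
  !LEGAL_VALUE mol2
  !BRIEF Output file format
  !DETAIL
    The format used when writing the output file. A value
    other than smi, sdf, or mol2 is rejected.
!END
```

Because the parameter has a !DEFAULT, the program can read it without first checking for its existence.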

PARAMETER blocks can then be organized into CATEGORY blocks. CATEGORY blocks can also contain nested CATEGORY blocks; there is no limit to the amount of nesting. The value specified after the !CATEGORY specifier is the name of the CATEGORY. The CATEGORY can also contain a !BRIEF to give a human readable description of the grouping. The following interface definition has an “Input and Output” category with sub-categories for input and output respectively, as well as a category for various other parameters.

!CATEGORY Input and Output
  !CATEGORY Input
    !PARAMETER -i
      ...
    !END
  !END
  !CATEGORY Output
    !PARAMETER -o
      ...
    !END
  !END
!END
!CATEGORY Other Stuff
  !PARAMETER -o
    ...
  !END
!END

By default parameters and categories appear in alphabetical order. This order can be adjusted, however, by adding an integer after the category or parameter name in the interface definition. Parameters and categories are then sorted first by this integer and secondarily by their names. If this integer is unspecified, it is assumed to be zero. The following example illustrates re-ordering of both categories and parameters.

!CATEGORY Input and Output 1
  !CATEGORY Input 1
    !PARAMETER -input 1
      ...
    !END
    !PARAMETER -dbase 2
      ...
    !END
  !END
  !CATEGORY Output 2
    !PARAMETER -output 1
      ...
    !END
    !PARAMETER -hits 2
      ...
    !END
  !END
!END
!CATEGORY Other Stuff 2
  !PARAMETER -o
    ...
  !END
!END
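The ordering rule described above (integer first, name second, missing integer treated as zero) can be sketched in isolation. The code below does not use OEChem; HelpOrderSketch and its Order method are hypothetical helpers written only to illustrate the rule, and the ordinal string comparison is an assumption, since the text does not say how names are compared.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HelpOrderSketch
{
    // Hypothetical helper, not part of OEChem: orders (name, priority)
    // entries by priority first and by name second.
    public static List<string> Order(IEnumerable<(string Name, int Priority)> entries)
    {
        return entries
            .OrderBy(e => e.Priority)
            .ThenBy(e => e.Name, StringComparer.Ordinal) // assumed comparison
            .Select(e => e.Name)
            .ToList();
    }

    public static void Main()
    {
        var parameters = new List<(string Name, int Priority)>
        {
            ("-hits", 2),
            ("-output", 1),
            ("-o", 0), // no integer given, so the priority is taken to be zero
        };

        // Prints: -o -output -hits
        Console.WriteLine(string.Join(" ", Order(parameters)));
    }
}
```

An entry without an explicit integer sorts ahead of every entry with a positive integer, so adding priorities to only some parameters pushes those parameters toward the end of the listing.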