Command Line Parsing

This chapter describes how to parse command line options using the OEInterface class and associated free functions.

There are three basic steps to using the OEInterface class:

Configure parameters

Tell the OEInterface the names and types of the parameters for it to expect when it parses the command line.

Parse the command line

User specified parameter values are read from the command line into the OEInterface.

Get parameter values

The values of individual parameters are extracted from the OEInterface object via a get method.

Getting Started

The easiest way to get started with OEInterface is the convenience constructor, which takes an interface definition and a command line to parse. The constructor configures the interface with the specified definition, then parses the options from the command line passed in through argv.

Listing 1: Getting started with OEInterface command line parsing

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData = 
"!PARAMETER -b\n"
"  !TYPE bool\n"
"  !BRIEF A boolean parameter\n"
"!END\n"
"!PARAMETER -i\n"
"  !TYPE int\n"
"  !BRIEF An integer parameter\n"
"!END\n"
"!PARAMETER -f\n"
"  !TYPE float\n"
"  !BRIEF A float parameter\n"
"!END\n"
"!PARAMETER -str\n"
"  !TYPE string\n"
"  !BRIEF A string parameter\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  std::cout << "-b = " << itf.Get<bool>("-b") << std::endl;
  std::cout << "-i = " << itf.Get<int>("-i") << std::endl;
  std::cout << "-f = " << itf.Get<float>("-f") << std::endl;
  std::cout << "-str = " << itf.Get<std::string>("-str") << std::endl;

  return 0;
}

The values specified on the command line are then accessed on the OEInterface object using a Get method that corresponds to the type of the parameter. The following is an example of executing the program in Listing 1:

prompt> ./GettingStartedWithOEInterface -b -i 12345 -f 0.5 -str foo
-b = 1
-i = 12345
-f = 0.5
-str = foo

Warning

The convenience constructor will call the system exit function forcing the program to exit in the following situations:

  • The interface definition is malformed
  • The user has requested help on any of the parameters, e.g., --help -str
  • The command line is malformed, i.e., missing or extra parameters

For these reasons the convenience constructor may not be suitable for use in a library setting; it is intended for standalone applications.

Parameter Value Access

The Get methods of OEInterface return the value the user entered on the command line for the specified parameter. If the user did not specify the parameter, the default value from the interface definition is returned. If there is neither a default nor a user specified value, a default constructed object is returned (e.g., 0.0f for float parameters and an empty string for string parameters) and a warning is issued. For example, running the program in Listing 1 without any arguments displays a warning for every unset parameter:

prompt> ./GettingStartedWithOEInterface
Warning: OEInterface::Get, requesting value of unset parameter -b
-b = 0
Warning: OEInterface::Get, requesting value of unset parameter -i
-i = 0
Warning: OEInterface::Get, requesting value of unset parameter -f
-f = 0
Warning: OEInterface::Get, requesting value of unset parameter -str
-str =
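
The lookup order described above (user value, then default, then a default constructed object plus a warning) can be sketched in plain C++. This is only a model of the described behavior, not OEInterface itself; the ToyParams type and its members are hypothetical names introduced for illustration, and it handles string values only.

```cpp
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for a store of string-typed parameters. It is not
// OEInterface, only a model of the lookup order described in the text.
struct ToyParams {
  std::map<std::string, std::string> userValues;     // values parsed from the command line
  std::map<std::string, std::string> defaultValues;  // values taken from !DEFAULT records

  // Returns the user value if present, else the default, else an empty
  // (default constructed) string after printing a warning to stderr.
  std::string Get(const std::string& name) const {
    std::map<std::string, std::string>::const_iterator it = userValues.find(name);
    if (it != userValues.end()) return it->second;
    it = defaultValues.find(name);
    if (it != defaultValues.end()) return it->second;
    std::cerr << "Warning: requesting value of unset parameter " << name << std::endl;
    return std::string();
  }
};
```

In this model a value in userValues always wins over one in defaultValues, and the warning is only reached when both maps lack the name.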

The warnings can be avoided by using the OEInterface::Has method to check whether a parameter has a default value or a user specified value before calling Get. Listing 2 demonstrates how to check for the existence of certain parameters.

Listing 2: Testing for parameter existence with the Has method

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData =
"!PARAMETER -b\n"
"  !TYPE bool\n"
"  !BRIEF A boolean parameter\n"
"!END\n"
"!PARAMETER -i\n"
"  !TYPE int\n"
"  !BRIEF An integer parameter\n"
"!END\n"
"!PARAMETER -f\n"
"  !TYPE float\n"
"  !BRIEF A float parameter\n"
"!END\n"
"!PARAMETER -str\n"
"  !TYPE string\n"
"  !BRIEF A string parameter\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  if (itf.Has<bool>("-b"))
    std::cout << "-b = " << itf.Get<bool>("-b") << std::endl;
  if (itf.Has<int>("-i"))
    std::cout << "-i = " << itf.Get<int>("-i") << std::endl;
  if (itf.Has<float>("-f"))
    std::cout << "-f = " << itf.Get<float>("-f") << std::endl;
  if (itf.Has<std::string>("-str"))
    std::cout << "-str = " << itf.Get<std::string>("-str") << std::endl;

  return 0;
}

The following demonstrates the output of the program in Listing 2. Notice that OEInterface::Has returns true only for parameters present on the command line, because none of the parameters in Listing 2 has a default value. This distinction is important when dealing with boolean parameters: OEInterface::Has does not return the value of the boolean parameter, only whether it has one.

prompt> ./UsingOEInterfaceHas
prompt> ./UsingOEInterfaceHas -b
-b = 1
prompt> ./UsingOEInterfaceHas -i 2
-i = 2
prompt> ./UsingOEInterfaceHas -f 0.5
-f = 0.5
prompt> ./UsingOEInterfaceHas -str foo
-str = foo

Default Values

Using OEInterface::Has all over a program can become quite clumsy. Therefore it is recommended to use !DEFAULT in the parameter interface definition to specify the parameter’s default value whenever a default value can be specified. When a parameter has a default value OEInterface::Has will always return true. Listing 3 demonstrates how to set a default value for a parameter.

Listing 3: Setting a default value for a parameter

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData =
"!PARAMETER -b\n"
"  !TYPE bool\n"
"  !DEFAULT false\n"
"  !BRIEF A boolean parameter\n"
"!END\n"
"!PARAMETER -i\n"
"  !TYPE int\n"
"  !DEFAULT 2\n"
"  !BRIEF An integer parameter\n"
"!END\n"
"!PARAMETER -f\n"
"  !TYPE float\n"
"  !DEFAULT 0.5\n"
"  !BRIEF A float parameter\n"
"!END\n"
"!PARAMETER -str\n"
"  !TYPE string\n"
"  !DEFAULT foo\n"
"  !BRIEF A string parameter\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  std::cout << "-b = " << itf.Get<bool>("-b") << std::endl;
  std::cout << "-i = " << itf.Get<int>("-i") << std::endl;
  std::cout << "-f = " << itf.Get<float>("-f") << std::endl;
  std::cout << "-str = " << itf.Get<std::string>("-str") << std::endl;

  return 0;
}

This allows the code to take a very simple structure that doesn’t have to care whether the parameter was actually specified on the command line. The following is what is printed by the program in Listing 3 when no arguments are given.

prompt> ./UsingOEInterfaceDefaults
-b = 0
-i = 2
-f = 0.5
-str = foo

Requiring Parameters

Programs usually have a bare minimum set of parameters that must be specified. A parameter can be marked as required in the interface definition by using !REQUIRED true in the parameter definition. Listing 4 demonstrates making all the parameters used in the previous example required.

Listing 4: Requiring all parameters to be specified.

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData =
"!PARAMETER -b\n"
"  !TYPE bool\n"
"  !REQUIRED true\n"
"  !BRIEF A boolean parameter\n"
"!END\n"
"!PARAMETER -i\n"
"  !TYPE int\n"
"  !REQUIRED true\n"
"  !BRIEF An integer parameter\n"
"!END\n"
"!PARAMETER -f\n"
"  !TYPE float\n"
"  !REQUIRED true\n"
"  !BRIEF A float parameter\n"
"!END\n"
"!PARAMETER -str\n"
"  !TYPE string\n"
"  !REQUIRED true\n"
"  !BRIEF A string parameter\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  std::cout << "-b = " << itf.Get<bool>("-b") << std::endl;
  std::cout << "-i = " << itf.Get<int>("-i") << std::endl;
  std::cout << "-f = " << itf.Get<float>("-f") << std::endl;
  std::cout << "-str = " << itf.Get<std::string>("-str") << std::endl;

  return 0;
}

If any required parameter is missing from the command line, an appropriate error message is reported during the command line parsing stage. The following demonstrates the error messages reported by the program in Listing 4:

prompt> ./UsingOEInterfaceRequired
No arguments specified on the command line
Required parameters:
    -b : A boolean parameter
    -i : An integer parameter
    -f : A float parameter
    -str : A string parameter
For more help type:
  ./UsingOEInterfaceRequired --help

prompt> ./UsingOEInterfaceRequired -b
Fatal: Missing required parameter -i
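
The required-parameter check can be pictured as a comparison between the declared required names and the names actually supplied. The following is a minimal sketch in plain C++, not the OEInterface implementation; MissingRequired and both of its arguments are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the required parameter names that were not supplied, preserving the
// order in which they were declared. Both arguments are hypothetical inputs
// that a parser would have collected; this is not part of OEInterface.
std::vector<std::string> MissingRequired(const std::vector<std::string>& required,
                                         const std::set<std::string>& supplied) {
  std::vector<std::string> missing;
  for (std::size_t i = 0; i < required.size(); ++i) {
    if (supplied.count(required[i]) == 0)
      missing.push_back(required[i]);
  }
  return missing;
}
```

With the four parameters of Listing 4 declared and only -b supplied, the first missing name in this model is -i, which matches the parameter named in the error message above.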

Parameter Lists

Sometimes it is useful to allow multiple values for a given parameter. The !LIST field is used to specify that a parameter can have multiple values separated by spaces on the command line. The OEInterface::GetList method should then be used to get an iterator over the parameter’s values. Listing 5 demonstrates how to specify that a parameter is a list of values and then how to access it with the OEInterface::GetList method.

Listing 5: Specifying that a parameter is a list of values

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData = 
"!PARAMETER -strs\n"
"  !TYPE string\n"
"  !LIST true\n"
"  !BRIEF Some string parameters\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  std::cout << "-strs = ";
  for (OEIter<const std::string> p = itf.GetList<std::string>("-strs"); p; ++p)
    std::cout << *p << std::endl
              << "        ";
  std::cout << std::endl;

  return 0;
}

The command line parser will then assign elements as values based upon their position in the argv that was passed to the program. This allows the user to pass values that have spaces in them by simply quoting the relevant section. The following demonstrates the behavior of the program in Listing 5:

prompt> ./UsingOEInterfaceList -strs foo bar blah
-strs = foo
        bar
        blah

prompt> ./UsingOEInterfaceList -strs "foo bar" blah
-strs = foo bar
        blah
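
Because the shell has already split the command line into argv elements, a quoted value such as "foo bar" arrives as a single element. The sketch below models how the values of a list parameter could be collected from such elements. It is an illustration in plain C++, not the OEInterface parser; CollectListValues is a hypothetical name, and ending the list at any token that starts with '-' is a simplifying assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects the values that follow `flag` in args, stopping at the next token
// that looks like a flag. Treating any token starting with '-' as a flag is a
// simplifying assumption of this sketch (it would misread negative numbers).
std::vector<std::string> CollectListValues(const std::vector<std::string>& args,
                                           const std::string& flag) {
  std::vector<std::string> values;
  std::size_t i = 0;
  while (i < args.size() && args[i] != flag) ++i;  // locate the flag
  for (++i; i < args.size(); ++i) {
    if (!args[i].empty() && args[i][0] == '-') break;  // next flag ends the list
    values.push_back(args[i]);
  }
  return values;
}
```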

Positional Parameters

For commonly used parameters it is often useful to allow them to be specified by their position on the command line. The !KEYLESS record is used to specify that a parameter can be assigned from a positional value on the command line. Listing 6 demonstrates how to specify input and output file name parameters that can be specified from their position on the command line.

Listing 6: Specifying keyless input and output file parameters

#include "openeye.h"
#include "oesystem.h"

using namespace OESystem;

const char *InterfaceData = 
"!PARAMETER -i\n"
"  !TYPE string\n"
"  !KEYLESS 1\n"
"  !BRIEF Input file name\n"
"!END\n"
"!PARAMETER -o\n"
"  !TYPE string\n"
"  !KEYLESS 2\n"
"  !BRIEF Output file name\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  std::cout << "Input file name = "  << itf.Get<std::string>("-i") << std::endl;
  std::cout << "Output file name = " << itf.Get<std::string>("-o") << std::endl;

  return 0;
}

!KEYLESS parameters can still be specified using their command line flag. However, positional arguments need to be the last values on the command line. The following demonstrates how !KEYLESS parameters can and cannot be specified on the command line:

prompt> ./UsingOEInterfaceKeyless -i foo -o bar
Input file name = foo
Output file name = bar

prompt> ./UsingOEInterfaceKeyless foo bar
Input file name = foo
Output file name = bar

prompt> ./UsingOEInterfaceKeyless -i foo bar
Input file name = foo
Output file name = bar

prompt> ./UsingOEInterfaceKeyless foo -o bar
Fatal: Unknown parameter, foo, on command line
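
One reading of the rules that reproduces the four command lines above is sketched below in plain C++: flagged values are consumed first, positional values must come last, and the positional values fill whichever keyless parameters were not already set by flag, in !KEYLESS order. This is an inference from the examples, not the documented algorithm; the real parser may differ. AssignKeyless is a hypothetical name, and every flag is assumed to take exactly one value.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Assigns command line tokens to parameters. `keyless` lists parameter names
// in !KEYLESS order (index 0 corresponds to KEYLESS 1). Assumption: every
// flag takes exactly one value and any token starting with '-' is a flag.
std::map<std::string, std::string> AssignKeyless(const std::vector<std::string>& args,
                                                 const std::vector<std::string>& keyless) {
  std::map<std::string, std::string> values;
  std::vector<std::string> positional;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const bool isFlag = !args[i].empty() && args[i][0] == '-';
    if (isFlag) {
      if (!positional.empty())
        throw std::runtime_error("positional values must come last");
      if (i + 1 >= args.size())
        throw std::runtime_error("missing value for " + args[i]);
      values[args[i]] = args[i + 1];
      ++i;  // skip the value just consumed
    } else {
      positional.push_back(args[i]);
    }
  }
  // Fill the keyless parameters not already set by flag, in !KEYLESS order.
  std::size_t next = 0;
  for (std::size_t k = 0; k < keyless.size() && next < positional.size(); ++k) {
    if (values.count(keyless[k]) == 0)
      values[keyless[k]] = positional[next++];
  }
  if (next < positional.size())
    throw std::runtime_error("too many positional values");
  return values;
}
```

Under these assumptions, the tokens -i foo bar yield -i = foo and -o = bar, while foo -o bar is rejected because a flag follows a positional value.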

Warning

An ambiguity occurs when mixing !KEYLESS with !LIST. There is no way for the parser to differentiate between multiple values for a !LIST and different !KEYLESS parameters. OEInterface resolves this by requiring that any keyless list parameter must be the last keyless parameter.

Molecule Parameters

The previous examples illustrated how to handle simple parameters such as float, int or string. OEInterface also provides a convenient way to open a molecule file and return its first molecule.

Listing 7: Example of using molecule parameter

#include <openeye.h>
#include <oeplatform.h>
#include <oechem.h>

using namespace OESystem;
using namespace OEChem;

const char *InterfaceData =
"!PARAMETER -mol\n"
"  !TYPE OEGraphMol\n"
"  !BRIEF Molecule file\n"
"  !REQUIRED true\n"
"  !KEYLESS 1\n"
"!END\n";

int main(int argc, char** argv)
{
  OERegisterMolParameters();
  OEInterface itf(InterfaceData, argc, argv);

  OEGraphMol mol = itf.Get<OEGraphMol>("-mol");
  std::cout << "Number of heavy atoms in molecule = " <<  OECount(mol, OEIsHeavy()) << std::endl;

  return 0;
}

prompt> ./OEInterfaceMolParam -mol benzene.mol
Number of heavy atoms in molecule = 6

Similarly, the first multi-conformer molecule can be read from a molecule file by declaring the parameter with !TYPE OEMol.

Command Line Help

The OEInterface object can hold a significant amount of information about its parameters: default values, text descriptions, etc. This information allows OEInterface to provide interactive help to the user automatically, based upon the interface definition. Listing 8 is a fairly simple program that concatenates all the molecules given into a single molecule, though it has a fairly complex set of command line parameters, which OEInterface handles automatically.

Listing 8: A program to concatenate any number of molecules together

#include "openeye.h"
#include "oesystem.h"
#include "oechem.h"

using namespace OESystem;
using namespace OEChem;

const char *InterfaceData = 
"!BRIEF UsingOEInterfaceHelp [-d <delimiter>] [-o] <output> [-i] <input1> <input2> ...\n"
"!PARAMETER -delim\n"
"  !ALIAS -d\n"
"  !TYPE string\n"
"  !DEFAULT _\n"
"  !BRIEF Title delimiter\n"
"  !DETAIL\n"
"         This is the value given to the OEChem function OEAddMols to\n"
"         separate the titles of the input molecules in the output.\n"
"!END\n"
"!PARAMETER -output\n"
"  !ALIAS -o\n"
"  !TYPE string\n"
"  !REQUIRED true\n"
"  !KEYLESS 1\n"
"  !BRIEF The output file\n"
"  !DETAIL\n"
"         The molecule file to output to, can be any file format\n"
"         OEChem supports.\n"
"!END\n"
"!PARAMETER -inputs\n"
"  !ALIAS -i\n"
"  !TYPE string\n"
"  !REQUIRED true\n"
"  !LIST true\n"
"  !KEYLESS 2\n"
"  !BRIEF The input files\n"
"  !DETAIL\n"
"         A list of molecule files to add together. Note, only the\n"
"         first molecule from every file will be added.\n"
"!END\n";

int main(int argc, char** argv)
{
  OEInterface itf(InterfaceData, argc, argv);

  OEGraphMol outmol; 
  for (OEIter<const std::string> i = itf.GetList<std::string>("-inputs"); i; ++i)
  {
    oemolistream ims(*i);
    
    OEGraphMol inmol;
    OEReadMolecule(ims, inmol);

    OEAddMols(outmol, inmol, itf.Get<std::string>("-delim").c_str());
  }

  oemolostream oms(itf.Get<std::string>("-output"));
  OEWriteMolecule(oms, outmol);

  return 0;
}

Note

  • !BRIEF can be used outside of a !PARAMETER to give a concise one line command line usage.
  • !ALIAS is used to specify an alternative name the parameter can be specified with.
  • !DETAIL is used for a more detailed description of the parameter.

Whenever there are !REQUIRED parameters the OEInterface will complain about an empty command line. This is usually the first way users interact with a command line program so the goal is to give them only enough information to get the program running. This is where specifying a top-level !BRIEF is exceptionally useful to the end user. The following is the output of the program in Listing 8 when no parameters are given:

prompt> ./UsingOEInterfaceHelp
No arguments specified on the command line
./UsingOEInterfaceHelp : UsingOEInterfaceHelp [-d <delimiter>] [-o] <output> [-i] <input1> <input2> ...
Required parameters:
    -output : The output file
    -inputs : The input files
For more help type:
  ./UsingOEInterfaceHelp --help

More information is accessible from the command line by specifying --help as the first argument on the command line. The following is an example of requesting help from the program in Listing 8:

prompt> ./UsingOEInterfaceHelp --help
./UsingOEInterfaceHelp : UsingOEInterfaceHelp [-d <delimiter>] [-o] <output> [-i] <input1> <input2> ...
Simple parameter list
    -delim : Title delimiter
    -inputs : The input files
    -output : The output file

Additional help functions:
  ./UsingOEInterfaceHelp --help simple      : Get a list of simple parameters (as seen above)
  ./UsingOEInterfaceHelp --help all         : Get a complete list of parameters
  ./UsingOEInterfaceHelp --help defaults    : List the defaults for all parameters
  ./UsingOEInterfaceHelp --help <parameter> : Get detailed help on a parameter
  ./UsingOEInterfaceHelp --help html        : Create an html help file for this program

As seen above, --help can take arguments that allow the user to introspect even further into the semantics of a particular parameter.

prompt> ./UsingOEInterfaceHelp --help -inputs

Contents of parameter -inputs
    Aliases : -i
    Type : string
    Allow list : true
    Default : (parameter does not have a default)
    Keyless : 2
    Simple : true
    Required : true
    Brief : The input files
    Detail
        A list of molecule files to add together. Note, only the
        first molecule from every file will be added.

Note

OECheckHelp is the low-level function used to check a command line for --help. By default the OEInterface constructor and the OEParseCommandLine function call OECheckHelp automatically so most users will never need to use this function directly.

Interface Definition Specification

Configuring an OEInterface is the process of telling it the name, type, and other details of all the command line parameters the program is going to use. An interface definition consists of a collection of PARAMETER blocks optionally organized into CATEGORY blocks. A PARAMETER block must always have the following information:

NAME: The command line flag, e.g., -i, -str, -cutoff
TYPE: The kind of information being passed, e.g., float, int, string, bool

A PARAMETER can also optionally contain the following additional information:

ALIAS: One or more aliases for the parameter name, generally used to create shorthand for lengthy parameter names. For example, a parameter named -energy_cutoff_window could be given an alias -ecw, which would make typing -energy_cutoff_window 10.0 on the command line equivalent to -ecw 10.0.
DEFAULT: A default value for the parameter that will be used if the parameter is not specified by the user on the command line.
REQUIRED: A flag specifying that the parameter is required. If a parameter is required the program will throw an error if the user does not specify that parameter. By default parameters are not required.
LIST: A flag specifying whether the parameter is a list of multiple values. For example, ./program -inputs file1.smi file2.smi file3.smi.
KEYLESS: A number, N, that specifies that this parameter may be fulfilled by the Nth value on the command line that does not have a parameter flag specifically associated with it. For example, if -i had a KEYLESS value of 1 specified, the following are equivalent command lines: ./program -i foo.smi and ./program foo.smi.
VISIBILITY: A visibility level that controls when the parameter appears when the user requests --help. Valid values are: simple, normal, hidden. The default is simple if a VISIBILITY isn’t specified.
LEGAL_VALUE: A valid value for the parameter. The value specified must be one of the values given in a LEGAL_VALUE field.
ILLEGAL_VALUE: A value that cannot be specified as the parameter value.
LEGAL_RANGE: A range that the parameter value must fall into. More useful than LEGAL_VALUE when the parameter is a float.
ILLEGAL_RANGE: A range that the parameter value must not fall into. More useful than ILLEGAL_VALUE when the parameter is a float.
BRIEF: A one line description of the parameter that will be displayed to users when listing multiple command line options using the high-level --help functions.
DETAIL: A multi line description of the parameter that will be displayed to users when the user requests help about a specific parameter, e.g., --help -foo.

Therefore, a PARAMETER block takes the following form:

!PARAMETER <name> [order priority]
  !ALIAS <alias>
  !TYPE <type>
  !DEFAULT <default value>
  !REQUIRED <true or false>
  !LIST <true or false>
  !KEYLESS <N>
  !VISIBILITY <visibility>
  !LEGAL_VALUE <value>
  !ILLEGAL_VALUE <value>
  !LEGAL_RANGE <hi value> <low value>
  !ILLEGAL_RANGE <hi_value> <low_value>
  !BRIEF <brief description>
  !DETAIL
    <detailed description line 1>
    <detailed description line 2>
    <detailed description line 3>
    .
    .
    .
!END
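
The LEGAL_VALUE and LEGAL_RANGE constraints amount to a membership test and an interval test respectively. The sketch below shows both in plain C++. It is not the OEInterface validation code: InRange and IsLegalValue are hypothetical names, inclusive end points are an assumption, and InRange takes its bounds as (low, high) regardless of the order in which they are written in the interface definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True when value lies in the closed interval [low, high]. Whether the real
// parser treats the end points as inclusive is an assumption of this sketch.
bool InRange(double value, double low, double high) {
  assert(low <= high);  // precondition: bounds passed in (low, high) order
  return value >= low && value <= high;
}

// True when value matches one of the listed legal values. An empty list is
// treated as "no restriction", which is also an assumption.
bool IsLegalValue(const std::string& value, const std::vector<std::string>& legal) {
  if (legal.empty()) return true;
  for (std::size_t i = 0; i < legal.size(); ++i) {
    if (legal[i] == value) return true;
  }
  return false;
}
```

An ILLEGAL_VALUE or ILLEGAL_RANGE check would be the negation of the corresponding test.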

PARAMETER blocks can then be organized into CATEGORY blocks. CATEGORY blocks can also contain nested CATEGORY blocks; there is no limit to the amount of nesting. The value specified after the !CATEGORY specifier is the name of the CATEGORY. The CATEGORY can also contain a !BRIEF to give a human readable description of the grouping. The following interface definition has an “Input and Output” category with sub-categories for input and output respectively, as well as a category for various other parameters.

!CATEGORY Input and Output
  !CATEGORY Input
    !PARAMETER -i
      ...
    !END
  !END
  !CATEGORY Output
    !PARAMETER -o
      ...
    !END
  !END
!END
!CATEGORY Other Stuff
  !PARAMETER -o
    ...
  !END
!END

By default, parameters and categories appear in alphabetical order. This order can be adjusted by adding an integer after the category or parameter name in the interface definition. Parameters and categories are then sorted first by this integer and secondarily by their names. If the integer is unspecified, it is assumed to be zero. The following example illustrates re-ordering of both categories and parameters.

!CATEGORY Input and Output 1
  !CATEGORY Input 1
    !PARAMETER -input 1
      ...
    !END
    !PARAMETER -dbase 2
      ...
    !END
  !END
  !CATEGORY Output 2
    !PARAMETER -output 1
      ...
    !END
    !PARAMETER -hits 2
      ...
    !END
  !END
!END
!CATEGORY Other Stuff 2
  !PARAMETER -o
    ...
  !END
!END
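
The ordering rule (sort by the integer first, then by name, with an unspecified integer treated as zero) can be expressed as a two-key comparison. The following plain C++ sketch is an illustration only: Entry, EntryLess and SortEntries are hypothetical names, and the case-sensitive string comparison is an assumption about how names are compared.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for a parameter or category in an interface definition.
struct Entry {
  std::string name;
  int order;  // integer written after the name; 0 when unspecified
};

// Orders entries by the integer first and by name second.
bool EntryLess(const Entry& a, const Entry& b) {
  if (a.order != b.order) return a.order < b.order;
  return a.name < b.name;
}

// Sorts entries in place into display order.
void SortEntries(std::vector<Entry>& entries) {
  std::sort(entries.begin(), entries.end(), EntryLess);
}
```

Under this rule an entry with no integer sorts ahead of any entry given a positive integer, since its order is taken to be zero.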