Optional Parameters
Execute Options
- -param
Command line options will be read from the specified file. This file may have been generated from a previous run or may be constructed de novo. The default name of the file is szybki.param. Any parameter in the parameter setup file is superseded by the parameter on the command line. For example, running szybki -param szybki.param -i my.pdb will perform calculations for the molecule my.pdb while using all other parameters taken from szybki.param.
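The precedence rule can be pictured as a dictionary merge in which command-line values overwrite values read from the parameter file. This is only an illustrative sketch: the function name effective_params and the example option values are hypothetical, and the code does not reflect how SZYBKI actually parses its parameter file.

```python
def effective_params(file_params, cli_params):
    """Merge two option dictionaries; command-line values win."""
    merged = dict(file_params)   # start from the parameter file
    merged.update(cli_params)    # command-line entries supersede it
    return merged


# Hypothetical contents of szybki.param and of the command line.
from_file = {"-i": "old.pdb", "-max_iter": "500"}
from_cli = {"-i": "my.pdb"}

print(effective_params(from_file, from_cli))
```

Only -i is replaced in this example; every option that appears solely in the file keeps its stored value.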
- -mpi_np <n>
Specifies the number of processors n when SZYBKI is run in MPI mode.
- -mpi_hostfile <filename>
Specifies the name of the file containing the processor configuration. For every host, this file should contain a line of the form host_name n, where n is the number of processors on the host.
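As a small illustration of the host file layout, the sketch below builds one "host_name n" line per host. The host names node01 and node02 and the helper name hostfile_text are hypothetical; only the two-column line format comes from the description above.

```python
def hostfile_text(hosts):
    """Return one 'host_name n' line per (host name, processor count) pair."""
    return "".join(f"{name} {count}\n" for name, count in hosts)


# Hypothetical cluster: two hosts with 8 and 4 processors.
hosts = [("node01", 8), ("node02", 4)]
text = hostfile_text(hosts)
print(text, end="")
```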
Molecule Preparation Options
- -strip_water
Causes removal of water molecules from the input protein when the options -protein or -complex are used.
- -largest_part
Calculations are performed only for the largest fragment of the noncovalent input complex. For example, when the input file contains coordinates for the salt, \([Large\_cation^+]\cdot[Cl^-]\), the use of the
-largest_part option will cause the \(Cl^-\) anion to be ignored. By default, calculations are done for the entire complex.
Output Options
- -silent
By default, the names of the processed molecules are displayed. The use of the option -silent will suppress this output.
- -verbose
An extensive general output will be generated containing initial and final energy and gradient data, as well as an optimization report.
- -prefix <pn>
Replaces the szybki prefix in .log, .rpt (report), .status, .param, and _out.oeb files with the input string pn.
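The effect of the prefix on output file names can be sketched as simple string concatenation. The helper name output_names and the example prefix run1 are hypothetical; the suffix list is taken from the description above.

```python
def output_names(prefix="szybki"):
    """List the prefixed output file names described for -prefix."""
    suffixes = [".log", ".rpt", ".status", ".param", "_out.oeb"]
    return [prefix + suffix for suffix in suffixes]


print(output_names())        # default prefix
print(output_names("run1"))  # hypothetical user-supplied prefix
```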
- -report
An output file in tabular form of final energy components for every molecule/conformer will be generated. The name of the file is fixed as prefix.rpt, where prefix is szybki by default or determined by the -prefix option.
- -sdtag <string>
When the output file is in the SD format, an energy tag can be added. For single molecules, by default all energy terms (MMFF, solvation, and constraint) plus Total energy are added, so the option has no effect. “Total energy” always contains the constraint terms. In the case of protein-ligand systems, only the relevant terms (Ligand-Protein Energy and all of its components) are attached as SD tags by default. The option can therefore be used to add one of the remaining terms from the list below. If string is set to all, all energy terms are written as SD tags.
VdW, Coulomb, Bond, Bend, StretchBend, Torsion, Improper Torsion
L VdW, L Coulomb, L Bond, L Bend, L StretchBend, L Torsion, L Improper Torsion
P VdW, P Coulomb, P Bond, P Bend, P StretchBend, P Torsion, P Improper Torsion
PL VdW, PL Coulomb, Ligand-Protein Energy
Sheffield Solvation, Constraint Potential, PB Solvent, Area Solvent
__VdW, __Coulomb, __Protein desolv, __Ligand desolv, __Solvent screening, __Grid Coulomb, __Exact Coulomb
P/L energy, __AMBER VdW, __AMBER Coulomb
Torsion Constraint, IEFF Interaction, InterLigand IEFF
If the parameter string does not correspond to one of the available energy terms, no tag will be written. Tags starting with L, P, PL, and __ are:
L: ligand intramolecular terms only
P: protein intramolecular terms only
PL: protein-ligand intermolecular terms only
__: protein-ligand interaction terms
- -keepFailures
By default, molecules which failed during processing are stored in the file prefix.FAIL.input_format, where input_format is the format of the input molecule file given with the option -ligands. If the flag is set to true, those failed molecules are written to the molecule output file.
- -heavy_rms
By default, the RMSD of all atoms after optimization is reported in the log file. This option replaces the default with the heavy atoms RMSD.
- -log <prefix>
Prefix of the log file name, prefix.log. If omitted, the prefix szybki is used by default. This option is aliased to -l.
- -out <filename>
Output file name, in any format supported by OEChem, for an optimized ligand. If not specified, szybki_out.oeb or prefix_out.oeb (when -prefix is used) will be generated. The alias -o can be used instead of -out. This can be used as a keyless parameter, without the -out key, when it is last on the command line just after the -in (or -i or -ligands) option.
- -out_protein <filename>
The partially optimized protein will be saved in a file named filename. By default, this option is not used; however, the modified coordinates of a protein are still saved as generic data attached to the optimized ligand when it is output in the oeb format, under the tag saveprot. One can easily extract the coordinates of the optimized protein using OEChem TK or VIDA.
- -out_complex <filename>
The protein-ligand complex for the partially optimized protein will be saved in a file named filename.
Force Field Options
- -ff
Force field to be used. Valid options are: mmff94, mmff94s, amber_mmff94, amber_mmff94s, ieff_mmff94, ieff_mmff94s, smirnoff99frosst, parsley_openff, sage_openff, ff14sb_parsley, and ff14sb_sage. Upper case options are also accepted. In addition to the above, this command also accepts any SMIRNOFF force field parameter file in XML format.
This option overwrites the deprecated option
-MMFF94s when both are used simultaneously. The combined potential AMBER-MMFF94 (or AMBER-MMFF94s) can be applied only for protein-ligand interactions using a rigid protein model. All intramolecular ligand interactions are described by the MMFF94 (or MMFF94s) force field while intermolecular protein-ligand interactions are handled by the Amber force field. The ieff_mmff94 and ieff_mmff94s potentials can be used only for intermolecular interactions, provided that electrostatic multipoles are assigned to the molecules. The combined potential ff14sb_parsley or ff14sb_sage can be applied for optimization of a protein system which might contain cofactors and ligands. In such a case, the protein part is handled with the ff14SB version of AMBER while ligands and cofactors are handled by the latest parsley or sage force field. Currently, selecting ff14sb_parsley or ff14sb_sage cannot be combined with the -protein option. In order to optimize a part of a protein-ligand complex (e.g., the ligand) with the ff14sb_parsley or ff14sb_sage potential, one can specify a list of flexible atoms using -flex_file.
- -exact_vdw
By default, the van der Waals protein-ligand interaction is calculated with the use of lookup tables in order to speed up the calculations. This option allows using the exact analytical van der Waals potential for the optimization of a ligand in the protein binding site.
- -mod_vdw
The regular MMFF van der Waals interaction equation will be replaced with:
\(V_{vdw} = \left\{ \begin{array}{cc} \epsilon_{ij} \left( \frac{1.07R_{ij}}{r_{ij}+0.07R_{ij}} \right) ^7 \left( \frac{1.12R_{ij}^7}{r_{ij}^7 +0.12R_{ij}^7} -2 \right) & \mbox{for $r_{ij} < R_{ij}$} \\ -\epsilon_{ij} & \mbox{for $r_{ij} \geq R_{ij}$} \end{array} \right.\)
in which no attractive van der Waals forces are present. This type of van der Waals potential prevents a so-called “hydrophobic collapse.”
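The piecewise potential can be checked numerically with a short sketch. It assumes the standard MMFF94 buffered 14-7 form for the short-range branch, with 0.12 R^7 in the second denominator, and the function name modified_vdw and the sample numbers are hypothetical. It is not SZYBKI's implementation.

```python
def modified_vdw(r, R, eps):
    """Modified van der Waals energy for one atom pair.

    r   : interatomic distance r_ij
    R   : minimum-energy separation R_ij
    eps : well depth epsilon_ij
    """
    if r >= R:
        # Constant beyond the minimum: zero slope, hence no attractive force.
        return -eps
    first = (1.07 * R / (r + 0.07 * R)) ** 7
    second = 1.12 * R**7 / (r**7 + 0.12 * R**7) - 2.0
    return eps * first * second


# Hypothetical pair parameters (distances in angstroms, energy in kcal/mol).
R, eps = 3.5, 0.1
for r in (2.0, 3.0, 3.5, 6.0):
    print(r, modified_vdw(r, R, eps))
```

At r = R both branches give -eps, so the energy is continuous there, and the repulsive wall at short range is unchanged.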
- -neglect_frozen
After an optimization with frozen terms, the default behavior calls for a full single-point calculation of the whole system. The -neglect_frozen option will skip this final calculation of the whole system, thus yielding results for only the non-frozen pieces. When there is a large frozen section, this can drastically reduce memory and CPU usage.
- -noCoulomb
Electrostatic terms defined in the electrostatic interactions section will be excluded from the force field potential. This option might be useful to prevent generation of folded structures.
- -protein_vdw <r>
Calculation of the van der Waals protein-ligand interaction energy will be limited to a sphere of radius r. The default value is 18.0 Å. In many applications, a value as small as 10 Å can be used with essentially no effect on the final optimized ligand geometry. The legal range is 5–500 Å.
- -strict
Enforces strict atom typing. This is the default behavior. When the value of the flag is set to false, this enforcement is removed.
- -vdw_cutoff
Sets the cutoff distance for intramolecular van der Waals interactions. By default, all atom pairs in the molecule contribute to the molecule’s van der Waals energy. The legal range is 5–500 Å.
Charging Options
- -am1bcc
AM1BCC charges [Jakalian-2002] are calculated for every conformation and used for the PB or Sheffield solvation energy only. Those charges are therefore conformation-dependent, and reflect changes of molecular density with molecular geometry. Note that AM1BCC charges are conventionally calculated for just one conformer or a few conformers of a multiconformer molecule, those conformers with electrostatically least-interacting functional (ELF) groups, so calculating AM1BCC charges for every conformer with this option is unconventional. For conventional behavior, apply AM1BCC ELF charges to the molecule separately and then use that as the input structure with the
-current_charges option.
- -current_charges
During optimization of molecules in solution with the use of the Poisson-Boltzmann solvation model, the free energy of solvation and solvent forces can be calculated with the use of atomic partial charges other than MMFF. The option
-current_charges allows the use of partial charges read from the input molecular file. In the case where the ligand is optimized inside a protein, protein-ligand electrostatic interactions (Coulomb and/or Poisson-Boltzmann) can be calculated based on ligand and protein partial charges read from the molecular input files. When input partial charges are not found, the MMFF94 partial charges will be used.
Solvent Options
- -inner_dielectric <d>
The default value for the protein or ligand dielectric constant of 1.0 can be changed to a user-selected value d. This option is aliased as -protein_dielectric. The upper allowed limit is 20.0.
- -shefA <a>
Parameter a in the Sheffield solvation potential given in option -sheffield. If the option is not used, a value of 1.553149 is assigned [Grant-2007].
- -shefB <b>
Parameter b in the Sheffield solvation potential given in option -sheffield. If the option is not used, a value of 0.735694 is assigned [Grant-2007].
- -sheffield
Use of this option will result in adding an additional electrostatic term in order to mimic the solution environment according to [Grant-2007]. This term is of the form: \(f_\epsilon/(8\pi\epsilon_0)\sum_{i,j}q_iq_j/\sqrt{(ar^2 + bR_iR_j)}\), where \(f_\epsilon=(1/\epsilon_{in}-1/\epsilon_{out})\), \(q_i\), and \(q_j\) are partial charges, and \(R_i\) and \(R_j\) are van der Waals radii of atoms
i and j. While less accurate than Poisson-Boltzmann, this method is very fast and offers analytic first and second derivatives. Because of the latter, it is the only solvation option for entropy calculations.
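A direct numerical reading of the Sheffield expression is sketched below. Several points are assumptions rather than statements from this manual: the double sum is taken over all ordered pairs including i = j, the sign follows the expression exactly as written, the Coulomb constant 332.0637 kcal Å/(mol e²) fixes the unit system, and the function name sheffield_term is hypothetical. Consult [Grant-2007] for the authoritative definition.

```python
import math

# Assumed unit system: 1/(4*pi*eps0) in kcal*angstrom/(mol*e^2).
COULOMB = 332.0637


def sheffield_term(charges, radii, coords, a=1.553149, b=0.735694,
                   eps_in=1.0, eps_out=80.0):
    """Evaluate f_eps/(8*pi*eps0) * sum_ij q_i q_j / sqrt(a r^2 + b R_i R_j)."""
    f_eps = 1.0 / eps_in - 1.0 / eps_out
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # assumption: all ordered pairs, including i == j
            r2 = sum((ci - cj) ** 2 for ci, cj in zip(coords[i], coords[j]))
            total += charges[i] * charges[j] / math.sqrt(a * r2 + b * radii[i] * radii[j])
    # 1/(8*pi*eps0) is half of the Coulomb constant.
    return 0.5 * COULOMB * f_eps * total


# Hypothetical two-atom system: opposite charges 1.5 angstroms apart.
print(sheffield_term([0.4, -0.4], [1.7, 1.5], [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]))
```

The defaults for a and b are the values quoted for -shefA and -shefB, and the dielectric defaults match those given for -inner_dielectric and -solv_dielectric.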
- -solv_dielectric <d>
Allows changing the default value for solvent dielectric constant used for Poisson-Boltzmann and Sheffield solvation energy calculations. The allowed range is 1 to 100. Default value is 80.
- -solventCA <s>
This option can only be used in combination with freeform -solventPB or -sheffield. It causes inclusion of a molecular surface solvation term (sometimes called cavity solvation term) in the total energy. The value of the parameter s (microscopic surface tension coefficient) is in the range 0.005–0.030 kcal/(mol Å\(^2\)). This option is aliased as -solventMA for compatibility with previous releases.
- -solventPB
For optimization of small molecules in solution, the electrostatic part of molecule-solvent interactions will be calculated using the Poisson-Boltzmann model.
- -protein_elec <m>
This option provides four choices for calculating protein-ligand electrostatic interaction energies: m = None, ExactCoulomb, GridCoulomb, and PB. The option m=None eliminates electrostatic interactions. Setting m to ExactCoulomb or GridCoulomb uses the exact Coulomb potential or the Coulomb potential digitized on a grid, respectively. The option m=PB provides a more realistic potential which accounts for solvent forces, calculated according to the Poisson-Boltzmann model at every iteration step. This option requires substantially higher CPU time, particularly for large proteins. By default, m=ExactCoulomb.
- -radii <type>
Determines the type of atomic radii used for PB calculations (options: freeform -solventPB and -protein_elec). By default, the parameter type is set to Bondi. Two other choices are ZAP9 and ZAP7 [Nicholls-2010].
- -salt <c>
Allows all PB calculations to be performed at a specified salt concentration in M, up to 0.08 M. By default, the salt concentration is zero.
Saving and Loading Coulombic Grids
- -loadPG <filename>
When the electrostatic component of the protein-ligand interaction energy has been precalculated on a grid, this option forces SZYBKI to read the grid potential from the file filename and use it for ligand optimization inside the protein. This option is available only when -protein_elec is set to GridCoulomb.
- -savePG <filename>
Saves a potential grid in the file named filename. Allows a significant saving in CPU time for runs when Coulomb grids are used to optimize a number of ligands inside the same protein. This option is available only when -protein_elec is set to GridCoulomb.
Optimization Options
- -optDOF
An alias to -optGeometry.
- -optdof
An alias to -optGeometry.
- -optGeometry <dof>
Optimization will be done in the specified degrees of freedom. The possible choices for the parameter dof are cart for Cartesian coordinates, tor or torsions for torsion optimization, solid for rigid ligand optimization inside a protein receptor, none or sp for single point calculation, Honly for hydrogen atoms only optimization, and calculationDependent. The value calculationDependent, which is selected by default when the option is not used, sets the type of degrees of freedom according to the calculation: free ligands are optimized in Cartesian coordinates, while protein-bound ligands are optimized in translational-rotational coordinates. The option is aliased with -optDOF and -optdof.
- -grad_conv <c>
Optimization is terminated when the RMS gradient reaches the input value c, unless it is finished earlier for other reasons. When omitted, the default convergence criterion is 0.1 on the gradient vector norm: \(\sqrt{\sum_i {g_ig_i} }\), where \(g_i\) is the i-th gradient component.
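The default convergence test on the gradient vector norm can be written out in a few lines. This sketch assumes that "reaches" means less than or equal to the threshold; the function names gradient_norm and is_converged are hypothetical.

```python
import math


def gradient_norm(gradient):
    """Euclidean norm sqrt(sum_i g_i * g_i) of a flat gradient vector."""
    return math.sqrt(sum(g * g for g in gradient))


def is_converged(gradient, threshold=0.1):
    """Assumed stopping rule: norm at or below the threshold."""
    return gradient_norm(gradient) <= threshold


# Hypothetical gradient components.
print(gradient_norm([3.0, 4.0]))
print(is_converged([0.03, 0.04]))
```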
- -max_iter <m>
Optimization will be terminated when the number of iteration cycles reaches the input number m. The default value is 1000.
- -optMethod <type>
Selects the optimization type. The value of type replaces the default BFGS optimizer for small molecules and the conjugate gradient for systems with 500 or more degrees of freedom. Allowed values of type are BFGS or bfgs for BFGS optimization; CG, cg, or conj for conjugate gradient; SD, sd, or steepest for steepest descent optimization; sd_bfgs for pre-optimization with five steps of steepest descent followed by BFGS optimization; sd_cg for pre-optimization with five steps of steepest descent followed by conjugate gradient optimization; and newton or NEWTON for Newton-Raphson optimization if analytic second derivatives are available. Entropy estimation requires a BFGS or Newton-Raphson type of optimization, so all values of the parameter type which represent different optimization methods are ignored for entropy runs. In addition, when a quasi-Newton method of entropy estimation is requested (see -entropy), the usage of BFGS is enforced.
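The default choice of optimizer and the accepted spellings can be summarized in a small lookup. This is a sketch only: it assumes that "small molecules" means fewer than 500 degrees of freedom, it ignores the entropy-specific rules, and the names ALIASES and choose_optimizer and the returned labels are hypothetical.

```python
# Accepted spellings from the option description, mapped to one label each.
ALIASES = {
    "BFGS": "BFGS", "bfgs": "BFGS",
    "CG": "CG", "cg": "CG", "conj": "CG",
    "SD": "SD", "sd": "SD", "steepest": "SD",
    "sd_bfgs": "SD_BFGS",
    "sd_cg": "SD_CG",
    "newton": "NEWTON", "NEWTON": "NEWTON",
}


def choose_optimizer(n_dof, requested=None):
    """Return the optimizer label for a system with n_dof degrees of freedom."""
    if requested is None:
        # Assumed reading of the default: BFGS below 500 DOF, CG otherwise.
        return "BFGS" if n_dof < 500 else "CG"
    if requested not in ALIASES:
        raise ValueError(f"unknown optimization type: {requested}")
    return ALIASES[requested]


print(choose_optimizer(90))
print(choose_optimizer(1200))
print(choose_optimizer(90, "steepest"))
```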
Fixing Ligand Atoms
- -fix_file <filename>
The text file filename should contain a list of molecule names followed by atom numbers to be fixed. An example of the command line for a SZYBKI run with the use of this option is given below in the Example Commands subsection.
- -fix_smarts <file_name>
All atoms which belong to a SMARTS pattern specified in a single line of the text file file_name will be fixed. For example, if the input string is [!#1], all heavy atoms will be fixed and a SZYBKI run will optimize all hydrogen atom positions.
- -flex_file <filename>
The text file filename should contain a list of molecule names followed by atom numbers to be optimized. All non-listed atoms will be fixed. The atom numbering convention used is from 0 to n-1, where n is the molecule’s total number of atoms.
Constraining Ligand Atoms
- -harm_constr1 <k>
A constraining potential of the form \(kr^2\) will be imposed on all heavy atoms, where \(k\) is the force constant. By default, no constraint is applied (\(k = 0\)), and the upper allowed limit is 1000 kcal/(mol Å\(^2\)).
- -harm_constr2 <d>
A constraining potential of the form \(V = k(r-d)^2\) for \(r>d\) and \(V=0\) for \(r \leq d\) will be imposed on all heavy atoms, where \(d\) is the constraining distance in angstroms. This can be used only together with -harm_constr1. The default value is 0, while the upper allowed value is 5 Å.
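The two harmonic constraint options combine into a single flat-bottom penalty, sketched below for one atom. The manual does not define r explicitly; this sketch assumes it is the displacement of the atom from its reference position. The function name harmonic_constraint is hypothetical.

```python
def harmonic_constraint(r, k, d=0.0):
    """Penalty for one atom displaced by r (angstroms) from its reference.

    k : force constant from -harm_constr1, in kcal/(mol angstrom^2)
    d : flat-bottom distance from -harm_constr2 (0 gives the plain k*r^2 form)
    """
    if r <= d:
        return 0.0            # free movement inside the flat region
    return k * (r - d) ** 2   # harmonic wall outside it


# Hypothetical values: k = 10, displacement of 0.5 or 1.5 angstroms.
print(harmonic_constraint(0.5, 10.0))
print(harmonic_constraint(0.5, 10.0, d=1.0))
print(harmonic_constraint(1.5, 10.0, d=1.0))
```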
- -harm_smarts <file.txt>
All atoms which belong to a SMARTS pattern read from the file file.txt will be constrained. For example, when the file file.txt contains a single line cO, the input option -harm_smarts file.txt will result in constraining all aromatic carbon atoms and oxygen atoms bonded to them. This must be used in conjunction with -harm_constr1.
Constraining Torsion Angles
- -tor_constr <fn>
File name containing a single line which determines the torsion to be constrained, the reference torsion angle, and a force constant. The constraining potential is in the form \(V=k_c(\cos\phi - \cos\phi_0)^2\), where \(k_c\) is the user-specified force constant and \(\phi_0\) is the reference torsion dihedral angle. The input data in the file fn should be in the following order: SMARTS \(\phi_0\) \(k_c\). SMARTS might be replaced with atom indices: \(i_1\) \(i_2\) \(i_3\) \(i_4\) \(\phi_0\) \(k_c\). Examples of valid inputs are:
[C:1][N:2][c:3][s:4] 0.0 2.0
5 0 10 25 0.0 2.0
Notice that atom indices are numbered from 0 to \(N-1\), where \(N\) is the number of atoms in the molecule.
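The two accepted line layouts and the cosine-difference penalty can be illustrated with a short sketch. It assumes the reference angle is given in degrees and that fields are separated by whitespace; neither is stated explicitly here. The function names parse_torsion_line and torsion_penalty are hypothetical, and the code is not SZYBKI's parser.

```python
import math


def parse_torsion_line(line):
    """Split a constraint line into its selector, reference angle, and force constant."""
    fields = line.split()
    phi0, k_c = float(fields[-2]), float(fields[-1])
    selector = fields[:-2]
    if len(selector) == 4:
        # Four zero-based atom indices.
        return {"atoms": tuple(int(x) for x in selector), "phi0": phi0, "k_c": k_c}
    if len(selector) == 1:
        # A single SMARTS pattern with four mapped atoms.
        return {"smarts": selector[0], "phi0": phi0, "k_c": k_c}
    raise ValueError("expected a SMARTS pattern or four atom indices")


def torsion_penalty(phi_deg, phi0_deg, k_c):
    """V = k_c * (cos(phi) - cos(phi0))**2, with angles assumed to be in degrees."""
    diff = math.cos(math.radians(phi_deg)) - math.cos(math.radians(phi0_deg))
    return k_c * diff ** 2


print(parse_torsion_line("[C:1][N:2][c:3][s:4] 0.0 2.0"))
print(parse_torsion_line("5 0 10 25 0.0 2.0"))
print(torsion_penalty(60.0, 0.0, 2.0))
```

Because the penalty depends only on the cosine, angles of +phi and -phi are penalized equally.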
Protein Flexibility
- -flexdist <d>
Specifies the distance d from the ligand that determines the flexible residues of a protein receptor in the optimization of a ligand inside the protein binding site. This has to be used together with -flextype.
- -flexlist <fn>
Similar to -flexdist, but instead of using distance from the ligand as the partial protein flexibility criterion, provides a file name fn containing flexible residues. Every line of this file should contain a PDB residue name, residue number, and chain ID. For example, a two-line file:
ILE 78 A
Phe 114 B
informs SZYBKI that Ile78 of chain A and Phe114 from chain B will be flexible during ligand optimization. This has to be used together with -flextype.
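The residue list layout (residue name, residue number, chain ID on each line) is easy to validate before a run. The sketch below assumes whitespace-separated fields and normalizes residue names to upper case, which is a choice made here because the example mixes ILE and Phe; the function name parse_flexlist is hypothetical.

```python
def parse_flexlist(text):
    """Parse 'NAME NUMBER CHAIN' lines into (name, number, chain) tuples."""
    residues = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, number, chain = line.split()
        residues.append((name.upper(), int(number), chain))
    return residues


example = "ILE 78 A\nPhe 114 B\n"
print(parse_flexlist(example))
```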
- -flextype <type>
Allows the user to specify the type of partial protein flexibility. The possible values of the parameter type are the strings polarH, sideC, or residue, with respective meanings of polar hydrogens, side chains, and complete residues. This must be used together with -flexdist or -flexlist. The molecular potential used for optimization consists of three components:
1. Ligand intramolecular MMFF potential
2. Protein-ligand potential
   a. MMFF terms for interactions between polar protein hydrogens and the ligand
   b. Interaction between fixed protein atoms and the ligand
3. Protein intramolecular potential
   a. MMFF terms involving polar protein hydrogens and atoms up to three bonds apart
   b. Interaction between the rest of the protein and flexible polar hydrogens
The sum of the components 2b and 3b might be called a “protein-pseudoligand” interaction and is evaluated according to the model selected with the -protein_elec option above.
Entropy Calculation Options
- -entropy <type>
Estimation of the entropy of a ligand will be done. The parameter type can take three values: “None” (default), “QN” (Quasi-Newton Hessian), and “AN” (analytical). The input ligand is assumed to be in the form of a multiconformational molecule. The environment is determined by the -sheffield option (solution), the -protein option (in the binding site), or neither one, indicating a gas phase.
- -t
Sets the temperature (in °C) of the system for entropy estimation. The default temperature is 25°C (298K).