Optional Parameters¶
Execute Options¶
-
-param
¶
Command line options will be read from the specified file. This file may have been generated from a previous run or may be constructed de novo. The default name of the file is
szybki.param
. Any parameter in the parameter setup file is superseded by the parameter on the command line. For example, running szybki -param szybki.param -i my.pdb
will perform calculations for the molecule my.pdb
while using all other parameters taken from szybki.param
.
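The precedence rule above can be pictured as a dictionary merge. The following is only a minimal sketch: the dictionaries stand in for the contents of a parameter file and a command line, and the function name `merge_parameters` is hypothetical, not part of SZYBKI.

```python
def merge_parameters(file_params, cli_params):
    """Return the effective settings: command-line values override file values."""
    effective = dict(file_params)   # start from the parameter file
    effective.update(cli_params)    # command-line entries supersede it
    return effective


# Hypothetical stand-in for the contents of szybki.param.
file_params = {"-i": "old.pdb", "-max_iter": "500"}
# Stand-in for: szybki -param szybki.param -i my.pdb
cli_params = {"-i": "my.pdb"}

effective = merge_parameters(file_params, cli_params)
print(effective)
```

Only the input molecule changes; every setting not repeated on the command line keeps the value from the file.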
-
-mpi_np
<n>
¶ Specifies the number of processors
n
when SZYBKI is run in MPI mode.
-
-mpi_hostfile
<filename>
¶ Specifies the name of the file containing the processor configuration. For every host, this file should contain a line
host_name n
where n
is the number of processors on the host.
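A host file in the `host_name n` layout described above could be generated with a few lines of Python. This is a sketch under assumptions: the host names `node01` and `node02` and the helper name `format_hostfile` are made up for illustration.

```python
def format_hostfile(hosts):
    """Build host file text with one 'host_name n' line per host."""
    lines = []
    for host_name, n in hosts:
        if n < 1:
            raise ValueError(f"processor count for {host_name} must be positive")
        lines.append(f"{host_name} {n}")
    return "\n".join(lines) + "\n"


# Hypothetical cluster: two hosts with 8 and 4 processors.
hosts = [("node01", 8), ("node02", 4)]
hostfile_text = format_hostfile(hosts)
total_processors = sum(n for _, n in hosts)
print(hostfile_text, end="")
```

The total processor count computed here would be a natural upper bound for the value passed to -mpi_np.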
Molecule preparation Options¶
-
-strip_water
¶
Causes removal of water molecules from the input protein when options
-protein
or -complex
are used.
-
-largest_part
¶
Calculations are performed only for the largest fragment of the noncovalent input complex. For example when the input file contains coordinates for the salt, \([Large\_cation^+]\cdot[Cl^-]\), the use of the
-largest_part
option will cause the \(Cl^-\) anion to be ignored. By default, calculations are done for the entire complex.
Output Options¶
-
-silent
¶
By default the names of the processed molecules are displayed. The use of the option
-silent
will suppress this output.
-
-verbose
¶
An extensive general output will be generated containing initial and final energy and gradient data, as well as an optimization report.
-
-prefix
<pn>
¶ Replaces the
szybki
prefix in .log, .rpt (report), .status, .param and _out.oeb files with the input string pn
.
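The effect of -prefix on file names can be summarized in a small helper. This is a sketch that assumes each listed extension is simply appended to the prefix; the function name and dictionary keys are illustrative only.

```python
def default_output_names(prefix="szybki"):
    """Return the file names affected by -prefix, keyed by a descriptive label."""
    return {
        "log": f"{prefix}.log",
        "report": f"{prefix}.rpt",
        "status": f"{prefix}.status",
        "param": f"{prefix}.param",
        "output": f"{prefix}_out.oeb",
    }


# Hypothetical run started with: -prefix run1
names = default_output_names("run1")
for label, file_name in names.items():
    print(label, file_name)
```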
-
-report
¶
An output file in tabular form of final energy components for every molecule/conformer will be generated. The name of the file is fixed as:
prefix.rpt
where prefix
is szybki
by default or determined by the -prefix
option.
-
-sdtag
<string>
¶ In the case when the output file is in the SD format, an energy tag can be added. For single molecules by default all energy terms (MMFF, solvation and constraint) plus
Total energy
are added, so the option has no effect. “Total energy” always contains the constraint terms. In the case of protein-ligand systems only the relevant terms (Ligand-Protein Energy
and all of its components) are attached as SD tags by default. The option can therefore be used to add one of the remaining terms from the list below. If string
is set to all
, all energy terms are written as SD tags.
VdW
Coulomb
Bond
Bend
StretchBend
Torsion
Improper Torsion
L VdW
L Coulomb
L Bond
L Bend
L StretchBend
L Torsion
L Improper Torsion
P VdW
P Coulomb
P Bond
P Bend
P StretchBend
P Torsion
P Improper Torsion
PL VdW
PL Coulomb
Ligand-Protein Energy
Sheffield Solvation
Constraint Potential
PB Solvent
Area Solvent
__VdW
__Coulomb
__Protein desolv
__Ligand desolv
__Solvent screening
__Grid Coulomb
__Exact Coulomb
P/L energy
__AMBER VdW
__AMBER Coulomb
Torsion Constraint
IEFF Interaction
InterLigand IEFF
If parameter
string
does not correspond to one of the available energy terms, no tag will be written. The meanings of the tag prefixes L, P, PL and __ are:
L: ligand intra-molecular terms only
P: protein intra-molecular terms only
PL: protein-ligand inter-molecular terms only
__: protein-ligand interaction terms
-
-keepFailures
¶
By default, molecules which failed during processing are stored in the file
prefix.FAIL.input_format
where input_format
is that of the input molecule used in option -ligands
. If the flag is set to true
, those failed molecules are written to the molecule output file.
-
-heavy_rms
¶
By default, the all-atom RMSD after optimization is reported in the log file. This option replaces the default with the heavy-atom RMSD.
-
-log
<prefix>
¶ Prefix of the log file name,
prefix.log
. If omitted, the prefix szybki
is used by default. This option is aliased to -l
.
-
-out
<filename>
¶ Output file name, in any format supported by OEChem, for an optimized ligand. If not specified,
szybki_out.oeb
or prefix_out.oeb
(when -prefix
is used) will be generated. Alias -o
can be used instead of -out
. Can be used as a keyless parameter without the -out
key when it is last on the command line just after the -in
(or -i
, or -ligands
) option.
-
-out_protein
<filename>
¶ Partially optimized protein will be saved in a file named
filename
. By default this option is not used; however, the modified coordinates of the protein are still saved as generic data attached to the optimized ligand when it is output in the oeb format, under the tag saveprot
. One can easily extract coordinates of the optimized protein using OEChem TK or OE application VIDA.
-
-out_complex
<filename>
¶ Protein-ligand complex for partially optimized protein will be saved in a file named
filename
.
Forcefield Options¶
-
-ff
¶
Force field to be used. Valid options are:
mmff94
, mmff94s
, amber_mmff94
, amber_mmff94s
, ieff_mmff94
, ieff_mmff94s
, smirnoff99frosst
, parsley_openff
, sage_openff
, ff14sb_parsley
and ff14sb_sage
. Upper case options are also accepted. In addition to the above, this command also accepts any SMIRNOFF force field parameter file in XML format.
This option overrides the deprecated option -MMFF94s when both are used simultaneously. The combined potential amber_mmff94 (or amber_mmff94s) can be applied only for protein-ligand interaction using a rigid protein model. All intramolecular ligand interactions are described by the mmff94 (or mmff94s) force field while intermolecular protein-ligand interactions are handled by the Amber force field. The ieff_mmff94 and
ieff_mmff94s
potentials can be used only for intermolecular interactions provided that electrostatic multipoles are assigned to the molecules. The combined potential ff14sb_parsley
or ff14sb_sage
can be applied for optimization of a protein system which might contain cofactors and ligand(s). In such a case the protein part is handled with the ff14sb version of Amber while ligand(s) and cofactor(s) are handled by the parsley_openff1.3.1 or sage_openff2.0.0 force field. Currently, selecting ff14sb_parsley
or ff14sb_sage
cannot be combined with the -protein
option. In order to optimize a part of a protein-ligand complex (for example, the ligand) with the ff14sb_parsley
or ff14sb_sage
potential, one can specify a list of flexible atoms using -flex_file
.
-
-exact_vdw
¶
By default, the VdW protein-ligand interaction is calculated using a lookup table in order to speed up the calculations. This option allows using the exact analytical VdW potential for the optimization of a ligand in the protein binding site.
-
-mod_vdw
¶
The regular MMFF Van der Waals interaction equation will be replaced with:
\(V_{vdw} = \left\{ \begin{array}{cc} \epsilon_{ij} \left( \frac{1.07R_{ij}}{r_{ij}+0.07R_{ij}} \right) ^7 \left( \frac{1.12R_{ij}^7}{r_{ij}^7 +0.12R_{ij}^7} -2 \right) & \mbox{for $r_{ij} < R_{ij}$} \\ -\epsilon_{ij} & \mbox{for $r_{ij} \geq R_{ij}$} \end{array} \right.\)
in which no attractive VdW forces are present. This type of VdW potential prevents so-called “hydrophobic collapse”.
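The piecewise potential can be evaluated with a short function. This is a minimal sketch, not SZYBKI's implementation: it uses the standard MMFF94 buffered 14-7 denominator \(r_{ij}^7 + 0.12R_{ij}^7\), and the numerical values of `R` and `eps` below are arbitrary illustration values.

```python
def modified_vdw(r, R, eps):
    """Buffered 14-7 term for r < R, constant -eps for r >= R.

    r   : interatomic distance
    R   : minimum-energy separation for the atom pair
    eps : well depth for the atom pair
    """
    if r >= R:
        # Flat region: the energy is constant, so there is no attractive force.
        return -eps
    buffered = (1.07 * R / (r + 0.07 * R)) ** 7
    return eps * buffered * (1.12 * R**7 / (r**7 + 0.12 * R**7) - 2.0)


# Arbitrary illustration values.
R, eps = 3.5, 0.1
for r in (2.0, 3.0, 3.5, 5.0):
    print(r, modified_vdw(r, R, eps))
```

Both branches give \(-\epsilon_{ij}\) at \(r_{ij} = R_{ij}\), so the potential is continuous there, and only the repulsive wall of the regular potential is retained.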
-
-neglect_frozen
¶
After an optimization with frozen terms, the default behavior calls for a full single-point calculation of the whole system. The
-neglect_frozen
option will skip this final calculation of the whole system, thus yielding results for only the non-frozen pieces. When there is a large frozen section, this can drastically reduce memory and CPU usage.
-
-noCoulomb
¶
Electrostatic terms defined in the Electrostatic interactions section will be excluded from the force field potential. This option might be useful to prevent generation of folded structures.
-
-protein_vdw
<r>
¶ Calculation of VdW protein-ligand interaction energy will be limited to a sphere of radius
r
. The default value is 18.0 Å. In many applications a value as small as 10 Å can be used with essentially no effect on the final optimized ligand geometry. Legal range is 5-500 Å.
-
-strict
¶
Enforces strict atom typing. This is the default behavior. When the flag is set to
false
, this enforcement is removed.
-
-vdw_cutoff
¶
Sets the cutoff distance for intramolecular VdW interactions. By default, all atom pairs in the molecule contribute to the molecule’s VdW energy. The legal range is 5 - 500 Å.
Charging Options¶
-
-am1bcc
¶
AM1BCC charges ([Jakalian-2002]) are calculated for every conformation and used for the PB or Sheffield solvation energy only. Those charges are therefore conformation dependent, and reflect changes of molecular density with molecular geometry. Note that AM1BCC charges are conventionally calculated for just one conformer or a few conformers of a multiconformer molecule, those conformer(s) with Electrostatically Least-interacting Functional (ELF) groups, so calculating AM1BCC charges for every conformer with this option is unconventional. For conventional behavior, apply AM1BCC ELF charges to the molecule separately and then use that as the input structure with the
-current_charges
option.
-
-current_charges
¶
During optimization of molecules in solution with the use of the Poisson-Boltzmann solvation model, free energy of solvation and solvent forces can be calculated with the use of atomic partial charges other than MMFF. Option
-current_charges
allows the use of partial charges read from the input molecular file. In the case where the ligand is optimized inside a protein, protein-ligand electrostatic interactions (Coulomb and/or Poisson-Boltzmann) could be calculated based on ligand and protein partial charges read from the molecular input file(s). When input partial charges are not found, the MMFF94 partial charges will be used.
Solvent Options¶
-
-inner_dielectric
<d>
¶ The default value for the protein or ligand dielectric constant of 1.0 can be changed to a user selected value
d
. This option is aliased as -protein_dielectric
. The upper allowed limit is 20.0.
-
-shefA
<a>
¶ Parameter
a
in the Sheffield solvation potential given in option -sheffield
. If the option is not used a value of 1.553149 is assigned ([Grant-2007]).
-
-shefB
<b>
¶ Parameter
b
in the Sheffield solvation potential given in option -sheffield
. If the option is not used a value of 0.735694 is assigned ([Grant-2007]).
-
-sheffield
¶
Using this option adds an electrostatic term that mimics the solution environment according to [Grant-2007]. This term is of the form: \(f_\epsilon/(8\pi\epsilon_0)\sum_{i,j}q_iq_j/\sqrt{(ar^2 + bR_iR_j)}\) where \(f_\epsilon=(1/\epsilon_{in}-1/\epsilon_{out})\), \(q_i\), \(q_j\) are partial charges and \(R_i\), \(R_j\) are Van der Waals radii of atoms
i
and j
. While less accurate than Poisson-Boltzmann, this method is very fast and offers analytic first and second derivatives. Because of the latter it is the only solvation option for entropy calculations.
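To make the structure of the Sheffield term concrete, the sketch below evaluates the expression exactly as printed above, with no additional sign convention. Several things are assumptions for illustration: the sum runs over all ordered pairs including \(i=j\), charges are in units of e, distances and radii are in Å, and the conversion constant 332.0637 kcal Å/(mol e\(^2\)) is used for \(1/(4\pi\epsilon_0)\). The defaults for `a`, `b` and the dielectric constants are the documented defaults; the function name is hypothetical and this is not SZYBKI's code.

```python
import math

# Assumed value of 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.0637


def sheffield_term(coords, charges, radii,
                   a=1.553149, b=0.735694, eps_in=1.0, eps_out=80.0):
    """Evaluate f_eps/(8*pi*eps0) * sum_ij q_i q_j / sqrt(a r_ij^2 + b R_i R_j)."""
    f_eps = 1.0 / eps_in - 1.0 / eps_out
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(n):  # assumption: all ordered pairs, including i == j
            r2 = sum((ci - cj) ** 2 for ci, cj in zip(coords[i], coords[j]))
            total += charges[i] * charges[j] / math.sqrt(a * r2 + b * radii[i] * radii[j])
    # 1/(8*pi*eps0) is half of 1/(4*pi*eps0).
    return 0.5 * COULOMB_KCAL * f_eps * total


# Arbitrary two-atom illustration: opposite unit charges 1.5 Angstrom apart.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
charges = [1.0, -1.0]
radii = [1.7, 1.5]
print(sheffield_term(coords, charges, radii))
```

Because every term is a smooth function of the interatomic distances, first and second derivatives can be written in closed form, which is the property the text relies on for entropy calculations.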
-
-solv_dielectric
<d>
¶ Changes the default value of the solvent dielectric constant used for Poisson-Boltzmann and Sheffield solvation energy calculations. The allowed range is 1 to 100. The default value is 80.
-
-solventCA
<s>
¶ This option can only be used in combination with
freeform -solvent
PB
or -sheffield
. It causes inclusion of a molecular surface solvation term (sometimes called cavity solvation term) in the total energy. The value of parameter s
(microscopic surface tension coefficient) is in the range 0.005 - 0.030 kcal/(mol Å\(^2\)). This option is aliased as -solventMA for compatibility with previous releases.
-
-solventPB
¶
For optimization of small molecules in solution, the electrostatic part of molecule-solvent interactions will be calculated using the Poisson-Boltzmann model.
-
-protein_elec
<m>
¶ This option provides four choices for calculating protein-ligand electrostatic interaction energies:
m = None, ExactCoulomb, GridCoulomb
and PB
. Option m
= None
eliminates electrostatic interactions. Setting m
to ExactCoulomb
or GridCoulomb
results in the use of the exact Coulomb potential or the Coulomb potential digitized on a grid, respectively. Option m
= PB
provides a more realistic potential which accounts for solvent forces, calculated according to the Poisson-Boltzmann (PB) model at every iteration step. This option requires substantially more CPU time, particularly for large proteins. By default m
= ExactCoulomb
.
-
-radii
<type>
¶ Determines types of atomic radii used for PB calculations (options:
freeform -solvent
PB
and -protein_elec
). By default parameter type
is set to Bondi
. Two other choices are ZAP9
and ZAP7
([Nicholls-2010]).
-
-salt
<c>
¶ Performs all PB calculations at the specified salt concentration in M, up to 0.08 M. By default the salt concentration is zero.
Saving and Loading Coulombic Grids¶
-
-loadPG
<filename>
¶ In the case when the electrostatic component of the protein-ligand interaction energy has been pre-calculated on a grid, this option forces SZYBKI to read the grid potential from the file
filename
, and use it for ligand optimization inside the protein. This option is available only when -protein_elec
is set to GridCoulomb
.
-
-savePG
<filename>
¶ Saves potential grid in the file named
filename
. Allows a significant saving in CPU time for runs when Coulomb grids are used to optimize a number of ligands inside the same protein. This option is available only when -protein_elec
is set to GridCoulomb
.
Optimization Options¶
-
-optDOF
¶
An alias to
-optGeometry
.
-
-optdof
¶
An alias to
-optGeometry
.
-
-optGeometry
<dof>
¶ Optimization will be done in the specified degrees of freedom. The possible choices for parameter
dof
are: cart
for Cartesian coordinates, tor
or torsions
for torsion optimization, solid
for rigid ligand optimization inside a protein receptor, none
or sp
for single point calculation, Honly
for optimization of hydrogen atoms only, and calculationDependent
. The value calculationDependent
, which is selected by default when the option is not used, sets the default type of degrees of freedom: free ligands are optimized in Cartesian coordinates, while protein-bound ligands are optimized in translational-rotational coordinates. The option is aliased with -optDOF
and -optdof
.
-
-grad_conv
<c>
¶ Optimization is terminated when the rms gradient reaches the input value
c
unless it finishes earlier for other reasons. When omitted, the default convergence criterion is 0.1 on the gradient vector norm: \(\sqrt{\sum_i {g_ig_i} }\), where \(g_i\) is the i-th
gradient component.
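The gradient vector norm used in the default criterion is straightforward to compute. The sketch below assumes that "reaches" means the norm is less than or equal to the threshold; the function names are illustrative and the gradient values are made up.

```python
import math


def gradient_norm(gradient):
    """Return sqrt(sum_i g_i * g_i) for a flat list of gradient components."""
    return math.sqrt(sum(g * g for g in gradient))


def is_converged(gradient, threshold=0.1):
    """Assumed test: converged once the norm is at or below the threshold."""
    return gradient_norm(gradient) <= threshold


# Made-up gradient components for illustration.
gradient = [0.05, -0.05, 0.02]
print(gradient_norm(gradient), is_converged(gradient))
```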
-
-max_iter
<m>
¶ Optimization will be terminated when the number of iteration cycles reaches input number
m
. The default value is 1000.
-
-optMethod
<type>
¶ Selects the optimization type. The value of
type
replaces the default BFGS optimizer for small molecules and conjugate gradient for systems with 500 or more degrees of freedom. Allowed values of type
are: BFGS
, bfgs
for BFGS optimization, CG
, cg
, conj
for conjugate gradient, SD
, sd
, steepest
for steepest descent optimization, sd_bfgs
for pre-optimization with 5 steps of steepest descent followed by BFGS optimization, sd_cg
for pre-optimization with 5 steps of steepest descent followed by conjugate gradient optimization, and “newton”, “NEWTON” for Newton-Raphson optimization if analytic second derivatives are available. Entropy estimation requires BFGS or Newton-Raphson type of optimization, so all values of parameter type
which represent different optimization methods are ignored for entropy runs. In addition, when the quasi-Newton method of entropy estimation is requested (see -entropy
), the usage of BFGS is enforced.
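The default selection rule and the accepted spellings can be summarized in a lookup. This is a sketch only: the descriptive labels on the right-hand side and the function name are illustrative, and the entropy-specific overrides described above are not modeled.

```python
# Accepted values of -optMethod mapped to a descriptive label (labels are illustrative).
OPT_METHOD_ALIASES = {
    "BFGS": "BFGS", "bfgs": "BFGS",
    "CG": "conjugate gradient", "cg": "conjugate gradient", "conj": "conjugate gradient",
    "SD": "steepest descent", "sd": "steepest descent", "steepest": "steepest descent",
    "sd_bfgs": "steepest descent then BFGS",
    "sd_cg": "steepest descent then conjugate gradient",
    "newton": "Newton-Raphson", "NEWTON": "Newton-Raphson",
}


def choose_optimizer(n_dof, opt_method=None):
    """Return the optimizer label for a system with n_dof degrees of freedom."""
    if opt_method is None:
        # Documented default: BFGS for small systems, conjugate gradient from 500 DOF up.
        return "conjugate gradient" if n_dof >= 500 else "BFGS"
    if opt_method not in OPT_METHOD_ALIASES:
        raise ValueError(f"unknown -optMethod value: {opt_method}")
    return OPT_METHOD_ALIASES[opt_method]


print(choose_optimizer(90))
print(choose_optimizer(1500))
print(choose_optimizer(1500, "sd_bfgs"))
```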
Fixing Ligand Atoms¶
-
-fix_file
<filename>
¶ Text file
filename
should contain a list of molecule names followed by atom numbers to be fixed. An example of the command line for SZYBKI run with the use of this option is given below in the subsection Example Commands.
-
-fix_smarts
<file_name>
¶ All atoms which belong to SMARTS pattern specified in a single line of the text file
file_name
will be fixed. For example, if the input string is [!#1], all heavy atoms will be fixed and the SZYBKI run will optimize all hydrogen atom positions.
-
-flex_file
<filename>
¶ Text file
filename
should contain a list of molecule names followed by atom numbers to be optimized. All non-listed atoms will be fixed. Atoms are numbered from 0 to n-1, where n is the molecule’s total number of atoms.
Constraining Ligand Atoms¶
-
-harm_constr1
<k>
¶ A constraining potential of the form \(kr^2\) will be imposed on all heavy atoms, where \(k\) is the force constant. By default no constraint is applied (\(k = 0\)), and the upper allowed limit is 1000 kcal/(mol Å\(^2\)).
-
-harm_constr2
<d>
¶ A constraining potential of the form \(V = k(r-d)^2\) for \(r>d\) and \(V=0\) for \(r \leq d\) will be imposed on all heavy atoms, where d is the constraining distance in angstroms. Can be used only together with
-harm_constr1
. Default value is 0, while the upper allowed value is 5 Å.
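The two harmonic constraint options combine into a single flat-bottom potential, sketched below. The interpretation of \(r\) as the displacement of an atom from its reference position is an assumption, since the text does not define it, and the function name is illustrative. The range checks mirror the documented limits.

```python
def harmonic_constraint(r, k, d=0.0):
    """Flat-bottom harmonic penalty: 0 for r <= d, k*(r - d)**2 for r > d.

    r : displacement in Angstrom (assumed: distance from the reference position)
    k : force constant from -harm_constr1, 0 to 1000 kcal/(mol A^2)
    d : constraining distance from -harm_constr2, 0 to 5 Angstrom
    """
    if not 0.0 <= k <= 1000.0:
        raise ValueError("k must be within 0 to 1000 kcal/(mol A^2)")
    if not 0.0 <= d <= 5.0:
        raise ValueError("d must be within 0 to 5 Angstrom")
    if r <= d:
        return 0.0
    return k * (r - d) ** 2


# Illustration values: k = 10 kcal/(mol A^2), with and without a flat region.
print(harmonic_constraint(2.0, 10.0))        # plain k*r^2
print(harmonic_constraint(2.0, 10.0, 1.0))   # penalty starts beyond d = 1.0
print(harmonic_constraint(0.5, 10.0, 1.0))   # inside the flat region
```

With the default \(d = 0\) the function reduces to the \(kr^2\) form of -harm_constr1.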
-
-harm_smarts
<file.txt>
¶ All atoms which belong to SMARTS pattern read from file
file.txt
will be constrained. For example, when the file file.txt
contains a single line cO
, input option -harm_smarts file.txt
will result in constraining all aromatic carbon atoms and oxygen atoms bonded to them. Must be used in conjunction with -harm_constr1
.
Constraining Torsion Angles¶
-
-tor_constr
<fn>
¶ File name containing a single line which determines the torsion to be constrained, the reference torsion angle and a force constant. The constraining potential is of the form: \(V=k_c(\cos\phi - \cos\phi_0)^2\), where \(k_c\) is the user-specified force constant and \(\phi_0\) is the reference torsion dihedral angle. The input data in the file
fn
should be in the following order: SMARTS \(\phi_0\) \(k_c\). SMARTS might be replaced with atom indices: \(i_1\) \(i_2\) \(i_3\) \(i_4\) \(\phi_0\) \(k_c\). Examples of valid inputs are:
[C:1][N:2][c:3][s:4] 0.0 2.0
5 0 10 25 0.0 2.0
Notice that atom indices are numbered from 0 to \(N-1\), where \(N\) is the number of atoms in the molecule.
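The constraint file line and the penalty it produces can both be sketched in a few lines. Two assumptions are made here: the reference angle is taken to be in degrees (the text does not state the unit), and the helper names are illustrative rather than part of SZYBKI.

```python
import math


def torsion_constraint(phi_deg, phi0_deg, k_c):
    """Evaluate V = k_c * (cos(phi) - cos(phi0))**2 with angles in degrees (assumed unit)."""
    delta = math.cos(math.radians(phi_deg)) - math.cos(math.radians(phi0_deg))
    return k_c * delta ** 2


def format_tor_constr_line(selector, phi0, k_c):
    """Build the single input line: a SMARTS string or four 0-based atom indices,
    followed by the reference angle and the force constant."""
    if isinstance(selector, str):
        head = selector
    else:
        if len(selector) != 4:
            raise ValueError("a torsion needs exactly four atom indices")
        head = " ".join(str(index) for index in selector)
    return f"{head} {phi0} {k_c}"


print(format_tor_constr_line("[C:1][N:2][c:3][s:4]", 0.0, 2.0))
print(format_tor_constr_line((5, 0, 10, 25), 0.0, 2.0))
print(torsion_constraint(180.0, 0.0, 2.0))
```

Because the penalty depends only on the cosine of the angle, torsions of \(+\phi\) and \(-\phi\) are penalized equally.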
Protein Flexibility¶
-
-flexdist
<d>
¶ Specifies the distance
d
from the ligand which determines the flexible residues of a protein receptor in the optimization of a ligand inside the protein binding site. Has to be used together with -flextype
.
-
-flexlist
<fn>
¶ Similar to
-flexdist
but instead of using distance from the ligand as the partial protein flexibility criterion, provides a file name fn
, containing flexible residues. Every line of this file should contain a pdb residue name, residue number and chain id. For example, a two-line file:
ILE 78 A
Phe 114 B
informs SZYBKI that Ile78 of chain A and Phe114 of chain B will be flexible during ligand optimization. Has to be used together with
-flextype
.
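A residue list in the layout described for -flexlist (residue name, residue number, chain id on each line) could be written with a helper like the one below. The helper name is hypothetical; the residues are the two from the example in the text.

```python
def format_flexlist(residues):
    """Build file text with one 'name number chain' line per flexible residue."""
    return "".join(f"{name} {number} {chain}\n" for name, number, chain in residues)


# The two residues from the example above.
flexlist_text = format_flexlist([("ILE", 78, "A"), ("Phe", 114, "B")])
print(flexlist_text, end="")
```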
-
-flextype
<type>
¶ Specifies the type of partial protein flexibility. Possible values of parameter
type
are the strings polarH
, sideC
or residue
, and their respective meanings are: polar hydrogens, side chains and complete residues. Has to be used together with -flexdist
or -flexlist
. The molecular potential used for optimization consists of three components:
1. Ligand intra-molecular MMFF potential
2. Protein-ligand potential
2.1. MMFF terms for interaction between polar protein hydrogens and the ligand
2.2. Interaction between fixed protein atoms and the ligand
3. Protein intra-molecular potential
3.1. MMFF terms involving polar protein hydrogens and atoms up to three bonds apart.
3.2. Interaction between the rest of the protein and flexible polar hydrogens
The sum of components 2.2 and 3.2 might be called “protein-pseudoligand” interaction and is evaluated according to the model selected with the
-protein_elec
option above.
Entropy calculation Options¶
-
-entropy
<type>
¶ Estimates the entropy of a ligand. Parameter
type
can take three values: “None” (default), “QN” (Quasi-Newton Hessian) and “AN” (analytical). The input ligand is assumed to be in the form of a multiconformer molecule. The environment is determined by the option -sheffield
(solution), -protein
(in the binding site) or neither of the two, indicating the gas phase.
-
-t
¶
Sets the temperature (in °C) of the system for entropy estimation. The default temperature is 25 °C (298 K).