Gaussian QM Parallel Run Input Files

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Solution-based/Small Molecule Lead-opt/QM Analysis

  • Task-based/Quantum Mechanics

  • Role-based/Computational Chemist

  • Product-based/Quantum Mechanics/Gaussian

Description

Important: A status of ‘Success’ in the Orion UI only indicates that the job is complete. Check the output of the Floe to see how many Gaussian calculations completed successfully, and pay careful attention to which output files are generated. If there is a failure output file, check the Failure Report in the Floe Report tab to see how many calculations failed. For examples of Gaussian calculations that succeed and fail, see the Gaussian Orion Module Tutorials.

Run any Gaussian input file(s) in parallel. This Floe takes advantage of the Orion scheduler with a parallel cube for the Gaussian calculations, meaning the number of calculations running simultaneously will scale automatically. However, there is also a strict time limit for these types of cubes. Any calculation that takes more than 10 hours will fail. Output for failed calculations is still saved, so it is recommended to store checkpoint files for all calculations so that any calculation that times out can be restarted. Output for all calculations is saved into archive (tar) file(s).
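As an illustration of the checkpoint recommendation above, the following minimal sketch builds the text of a Gaussian input file that names a checkpoint file through the `%chk` Link 0 command. The helper name `build_gaussian_input`, the route line, and the water geometry are illustrative assumptions and are not part of this Floe.

```python
def build_gaussian_input(name, route, title, charge, multiplicity, atoms):
    """Return Gaussian input text that requests a checkpoint file via %chk.

    Hypothetical helper for illustration only; adjust the route section
    and geometry to your own calculation.
    """
    lines = [
        f"%chk={name}.chk",   # Link 0 command: write a checkpoint file
        f"# {route}",         # route section
        "",                   # blank line ends the route section
        title,
        "",                   # blank line ends the title section
        f"{charge} {multiplicity}",
    ]
    lines += [f"{sym} {x:.6f} {y:.6f} {z:.6f}" for sym, x, y, z in atoms]
    lines.append("")          # blank line ends the geometry
    return "\n".join(lines) + "\n"


# Approximate water geometry, used only as placeholder input.
water = [
    ("O", 0.0, 0.0, 0.117),
    ("H", 0.0, 0.757, -0.469),
    ("H", 0.0, -0.757, -0.469),
]
text = build_gaussian_input("calc_1", "HF/3-21G Opt", "water optimization", 0, 1, water)
print(text)
```

Saving the returned text as calc_1.com gives an input file with an extension this Floe accepts.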

For calculations anticipated to take more than 10 hours, run the Gaussian QM Run Input Files Floe, which has no time limit.

The goal of this Floe is to allow Orion to be used as the computational engine for any Gaussian calculation. For help creating Gaussian input files, see the Gaussian documentation: https://gaussian.com/input/

Spot Policy for this floe is set to ‘Prohibited’. This is because spot instances on AWS can be taken away, causing a calculation to restart from scratch. For longer-running calculations, this can cause costs to increase dramatically. If you are running a large number of short calculations (under 30 minutes), it may be more cost efficient to switch this parameter to ‘Preferred’.

Input Files: Input files to this Floe should be Gaussian input files (with extension .com or .gjf) or TAR or ZIP archived directories. For archived directories, if a subdirectory has multiple input files, each one will be moved into a separate subdirectory. For example, if an input directory had two input files (calc_1.com and calc_2.com), the output archive would contain two output directories (calc_1 and calc_2), with all outputs from each calculation stored in the respective subdirectory.

To maintain the same directory structure for input and output, make sure there is only one input file in each subdirectory. In all cases, the Gaussian input file must have the extension .com or .gjf. Calculation files with any other extension will fail in this Floe. If a checkpoint file (or a file in another format) is in the same input directory, it can be used for the calculation as well.
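The one-input-per-subdirectory layout described above can be prepared locally before upload. The sketch below uses Python's standard `tarfile` module; the function name `pack_inputs`, the top-level directory name `gaussian_inputs`, and the choice to bundle a same-named `.chk` file are assumptions made for illustration, not behavior of the Floe.

```python
import tarfile
import tempfile
from pathlib import Path

ALLOWED = {".com", ".gjf"}  # extensions this Floe accepts for Gaussian inputs


def pack_inputs(input_files, archive_path, top_dir="gaussian_inputs"):
    """Write a gzip tar archive with each input file in its own subdirectory."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in map(Path, input_files):
            if path.suffix not in ALLOWED:
                raise ValueError(f"{path.name}: expected a .com or .gjf file")
            subdir = f"{top_dir}/{path.stem}"
            tar.add(path, arcname=f"{subdir}/{path.name}")
            # Assumption: a checkpoint file shares the input file's stem.
            chk = path.with_suffix(".chk")
            if chk.exists():
                tar.add(chk, arcname=f"{subdir}/{chk.name}")


# Demonstration with two placeholder input files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    inputs = []
    for stem in ("calc_1", "calc_2"):
        f = work / f"{stem}.com"
        f.write_text("# HF/3-21G\n\nplaceholder\n\n0 1\nHe 0.0 0.0 0.0\n\n")
        inputs.append(f)
    archive = work / "gaussian_inputs.tar.gz"
    pack_inputs(inputs, archive)
    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(names)
```

With one input file per subdirectory, the output archive should mirror the input layout, as described above.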

Gaussian Calculation: The Hardware Requirements set for this Floe attempt to be reasonable for calculations with drug-like molecules. However, if using expensive methods or large basis sets, it is recommended to run a benchmark calculation first to check that the requirements are sufficient. Metrics are enabled for the Run Gaussian Inputs Cubes so they can be monitored while a calculation is running.

Saving Output: The output directory for all calculations is saved as a tar file. If the file extension provided is not recognized, it will be replaced with ‘.tar’. Currently, tar files with no compression or with gzip, bzip2, or lzma compression are supported. If the size of this archive file exceeds the Max Gaussian Output File Size (default 1,000 MB), then a second output file will be created with the same name followed by a number.
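The extension rule above can be pictured with a short sketch. The extension table below is an assumption: this page does not list which extensions the Floe recognizes, so the entries simply pair common tar extensions with the write modes of Python's `tarfile` module. The function name `normalize_archive_name` is hypothetical.

```python
# Assumed mapping from file extension to tarfile write mode.
# The Floe's actual list of recognized extensions may differ.
TAR_MODES = {
    ".tar": "w",          # no compression
    ".tar.gz": "w:gz",    # gzip
    ".tar.bz2": "w:bz2",  # bzip2
    ".tar.xz": "w:xz",    # lzma
}


def normalize_archive_name(name):
    """Return (file name, tarfile mode), falling back to an uncompressed .tar."""
    for ext, mode in TAR_MODES.items():
        if name.endswith(ext):
            return name, mode
    # Unrecognized extension: replace it with .tar (append if there is none).
    stem = name.rsplit(".", 1)[0] if "." in name else name
    return f"{stem}.tar", "w"


print(normalize_archive_name("Gaussian_parallel_output.tar"))
print(normalize_archive_name("results.tar.gz"))
print(normalize_archive_name("results.zip"))
```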

Whether or not the Gaussian calculation succeeds, all output from the job will be saved for post-processing. The file from the ‘Successful Calculations Tar Archive File’ parameter will contain directories for all successful calculations. The file from the ‘Failed Calculation Tar Archive File’ parameter will contain a directory for each failed calculation. To understand these failures, look at the Orion Failure Report and the log file in each directory.
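After downloading an output archive, the log files can be checked in bulk. The sketch below assumes that log files inside the archive end in `.log` (this page does not state the extension) and relies on the "Normal termination" message that Gaussian prints at the end of a successful job. The function name `summarize_logs` and the demonstration archive are hypothetical.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def summarize_logs(archive_path):
    """Map each .log member of a tar archive to True if it ended normally."""
    results = {}
    with tarfile.open(archive_path, "r:*") as tar:  # "r:*" detects compression
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(".log"):
                raw = tar.extractfile(member).read()
                text = raw.decode("utf-8", errors="replace")
                results[member.name] = "Normal termination" in text
    return results


def _add_text(tar, name, text):
    """Add an in-memory text file to an open tar archive (demo helper)."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


# Demonstration with a small fabricated archive, not real Floe output.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "demo_output.tar"
    with tarfile.open(archive, "w") as tar:
        _add_text(tar, "calc_1/calc_1.log", "...\n Normal termination of Gaussian\n")
        _add_text(tar, "calc_2/calc_2.log", "...\n Error termination\n")
    summary = summarize_logs(archive)

print(summary)
```

A result of False only flags a log for closer inspection; the Orion Failure Report remains the primary source for why a calculation failed.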

Promoted Parameters

Title in user interface (promoted name)

Inputs

Gaussian Input File(s) (in): Gaussian input files (.com or .gjf) or archived directory with multiple input files. Zip files or Tar files with gzip, bzip2, lzma, or no compression are currently supported. Any additional input files (that is, checkpoint files) should be in the same subdirectory as their respective Gaussian file.

  • Required

  • Type: file_in

Outputs

Successful Calculations Tar Archive File (out): Title for the output archive file for successful calculations.

  • Required

  • Type: file_out

  • Default: Gaussian_parallel_output.tar

Failed Calculation Tar Archive File (failure): Title for the output archive file for failed calculations.

  • Required

  • Type: file_out

  • Default: Gaussian_parallel_failures.tar

Max Gaussian Output File Size (MB) (gau_max_directory_size): Specify a maximum output file size. When the file size is exceeded, a new tar file will be created, and this repeats until all of the output is saved. When multiple files are created, an increment is automatically added to the file name.

  • Type: decimal

  • Default: 1000

Save Log files (store_log_file): Save output log files for all calculations. Log files are always saved for failed calculations.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Report Name for Log Files (gau_log_report_name): Gaussian log files for the calculation are saved to a Floe Report. This allows Gaussian log files to be viewed on Orion before downloading results. Log files are saved for all failures and are optionally saved for successful calculations if the ‘Save Log files’ parameter above is set to True.

  • Type: string

  • Default: Gaussian Log File Report

Hardware Requirements for Gaussian Calculation

Gaussian # Threads (gaussian_nthreads): Number of CPUs for Gaussian calculation.

  • Type: integer

  • Default: 8

Gaussian Memory (gaussian_memory): Memory for Gaussian calculations in MB.

  • Type: decimal

  • Default: 14400

Gaussian Disk Space (gaussian_disk_space): Temporary disk space (in MB) required for your calculation.

  • Type: decimal

  • Default: 25600

Timeout for Gaussian Calculation (Hours) (gaussian_timeout): Parallel Cubes have a strict time limit. This parameter stops the Gaussian calculation before it reaches that limit. Note that this is a maximum only; the cube will finish as soon as the calculation completes.

  • Type: decimal

  • Default: 10

Spot policy (gaussian_spot_policy): Spot instances on AWS are cheaper, but can be taken away at any time. When a spot instance is removed, Orion will automatically restart the calculation from the beginning. This can be cost-prohibitive for longer calculations. For shorter calculations (<10 minutes), restarting the calculations may still be cost efficient, and the option could be switched to ‘Preferred’.

  • Type: string

  • Default: Prohibited

  • Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]