This page contains the complete documentation for pbqff as a single HTML
	  page. For a quick reference see
	  the man
	  directory in the main repository, where you can find a manual page in both
	  Unix man and PDF formats.
	
pbqff is a package (really a collection of packages) for computing anharmonic vibrational and rotational spectroscopic constants for small molecules. It does this through the use of quartic force fields (QFFs) to approximate the molecular potential energy function and second-order vibrational and rotational perturbation theory, which utilize the derivative information from the function approximation to compute the actual spectroscopic constants.
Because this is a complex task, pbqff is broken into several reusable components, only the final piece of which is included in this repository. Other major components include:
Molecule data structure and symmetry operations
	  
	  pbqff itself is a relatively thin wrapper around these packages that
	  primarily handles tasks like generating the grid of displaced geometries
	  to map out the QFF in various coordinate systems. This fact isn't really
	  important directly, but it may help to explain some of the potential error
	  messages you encounter, if they reference source code
	  in psqs, for example.
	
The primary input for pbqff is a TOML file. TOML as a format is widely used in the Rust community and is pretty easy for humans to write, although you will end up having to type a lot of quotes compared to something like YAML. To start, here's an example input file for a MOPAC QFF on a PBS queue:
geometry = """
O
H 1 OH
H 1 OH 2 HOH
OH = 1.0
HOH = 109.5
"""
optimize = true
charge = 0
step_size = 0.005
sleep_int = 2
job_limit = 2048
chunk_size = 1
coord_type = "norm"
template = "scfcrt=1.D-21 aux(precision=14 comp xp xs xw) PM6 THREADS=1"
queue_template = """
#!/bin/sh
#PBS -N {{.basename}}
#PBS -S /bin/bash
#PBS -j oe
#PBS -o pts/{{.basename}}.out
#PBS -W umask=022
#PBS -l walltime=100:00:00
#PBS -l cput=100:00:00
#PBS -l ncpus=1
#PBS -l mem=16gb
#PBS -q workq
module load openpbs
export WORKDIR=$PBS_O_WORKDIR
export TMPDIR=/tmp/$USER/$PBS_JOBID
export MOPAC_CMD=/ddnlus/r2518/Packages/mopac/build/mopac
export LD_LIBRARY_PATH=/ddnlus/r2518/Packages/mopac/build
cd $WORKDIR
mkdir -p $TMPDIR
trap 'rm -rf $TMPDIR' EXIT
"""
program = "mopac"
queue = "slurm"
findiff = true
check_int = 0
	  In addition to the options specified here, there are a few other optional
	  values like hybrid_template for specifying a separate quantum
	  chemistry program template for the cubic and quartic force
	  constants, weights for specifying atomic
	  weights, dummy_atoms for specifying a number of trailing
	  dummy atoms, and the very specialized norm_resume_hff for
	  resuming a normal coordinate QFF from the finished Cartesian harmonic
	  force field (HFF). A summary of all the possible input options is given in
	  the table below, followed by a more detailed description of each option.
	
| Option | Type | Description | 
|---|---|---|
| geometry | String | The initial molecular geometry as XYZ or Z-matrix | 
| optimize | bool | Whether or not to optimize the geometry before running the QFF | 
| charge | isize | The molecular charge | 
| step_size | f64 | The displacement size for the QFF in Å | 
| sleep_int | usize | Interval in seconds to sleep between job polling iterations | 
| job_limit | usize | Maximum number of jobs to submit to the queue at once | 
| chunk_size | usize | Number of jobs to combine into a single queue submission | 
| coord_type | CoordType | Displacement type for the QFF. Supported values are sic, cart, and norm | 
| template | TemplateSrc | Quantum chemistry program template. Can either be a literal String
		or a map with the key file | 
| hybrid_template | Option<TemplateSrc> | Separate quantum chemistry program template for cubic and quartic force constants | 
| queue_template | Option<TemplateSrc> | Queuing system template | 
| program | Program | A quantum chemistry program name. Supported values are dftb+, mopac, molpro, and cfour | 
| queue | Queue | A queuing system. Supported values are pbs, slurm, and local | 
| findiff | Option<bool> | Whether to use finite differences for derivative evaluations. Not all coordinate types support this | 
| check_int | usize | Interval in polling iterations to write checkpoints. A value of 0 disables checkpoints | 
| weights | Option<Vec<f64>> | An optional sequence of atomic masses for use in normal coordinate QFFs and spectroscopic calculations | 
| dummy_atoms | Option<usize> | An optional number of trailing dummy atoms | 
| norm_resume_hff | Option<bool> | Resume a normal coordinate QFF from a previously-finished HFF checkpoint | 
	  The geometry can be specified as either an XYZ geometry with
	  an optional number of atoms and comment line or as a Z-matrix. Currently,
	  pbqff cannot automatically convert between Z-matrix and XYZ geometries,
	  so optimize = true is required if the geometry is provided as
	  a Z-matrix. This allows pbqff to extract the Cartesian geometry from the
	  quantum chemistry program's output file and use that for the geometrical
	  displacements. For completeness, here is an example XYZ geometry for
	  cyclopropenylidene:
	
5
c3h2 geometry
C          0.0000000000        0.0000000000       -0.8887390447
C          0.0000000000       -0.6627588746        0.3682596138
C          0.0000000000        0.6627588746        0.3682596138
H          0.0000000000       -1.5952720351        0.9069548903
H          0.0000000000        1.5952720351        0.9069548903
Again, the first two lines are optional, but if either one is present then both must be present. And an example Z-matrix geometry for water:
O
H 1 OH
H 1 OH 2 HOH
OH = 1.0
HOH = 109.5
	  Aside from the caveats above about Z-matrix input,
	  the optimize keyword tells pbqff whether or not to optimize
	  the geometry before building the rest of the QFF. Because the quality of
	  the Taylor series approximation underlying the QFF is tied to the
	  assumption that the input geometry is a minimum on the potential energy
	  surface (PES), any geometries provided with optimize = false
	  should definitely have been optimized prior to the pbqff run and at the
	  same level of theory as the QFF.
	
	  charge is possibly the most straightforward input option as
	  it is simply the molecular charge as a whole number. As we'll see in later
	  sections, this won't necessarily be used if you omit
	  a {{.charge}} entry in your program template, but the
	  option must be provided in the pbqff input file even in those cases.
	
	  step_size sets the displacement size for the QFF in units of
	  Ångstroms. 0.005 is a reasonable default, with larger values like 0.010 or
	  even 0.020 helping occasionally with floppy systems.
	
	  pbqff uses a polling model for running jobs in parallel. On each iteration
	  of a big loop that runs until all jobs have finished, pbqff checks each
	  output file for the currently-running jobs to see if any have finished.
	  For very short single-point energy calculations (like PM6 or HF/STO-3G)
	  this makes a lot of sense because many jobs are likely to finish
	  constantly and each iteration will make some progress. On the other hand,
	  for long-running individual jobs (like CCSD(T)-F12/cc-pCVTZ-F12 on a large
	  molecule), pbqff will waste a lot of CPU time constantly checking the same
	  output files that may not have changed meaningfully since the last polling
	  interval. Thus, sleep_int is an interval in seconds for pbqff
	  to sleep between such polling attempts. Higher values should cause pbqff
	  to spend less CPU time reading the same files over and over, but extremely
	  high values could potentially lead to periods of no jobs running because
	  submitting more jobs as old ones finish is also part of the polling loop.
	  In practice, I usually use a minimum of 60 seconds for this and go up to
	  ~600 seconds for what I consider really long jobs. The options in the next
	  section also have some effect on what to choose here.
	
	  These two options, job_limit and chunk_size are
	  in the same section because they are very closely related. Early versions
	  of pbqff submitted each single-point energy calculation as a separate
	  queue submission limited by the job_limit setting. However,
	  this was inefficient enough to attract the attention of the HPC
	  administrators who recommended combining multiple program invocations into
	  a single submission script. This led to the addition of
	  the chunk_size option, which sets the number of invocations
	  per submission script (chunk). Hopefully this is already clear, but below
	  is an illustration of a chunk of MOPAC jobs for a local queue, where "job
	  submission" is handled by a very simple bash script. This would correspond
	  to a chunk_size of 5.
	
/opt/mopac/mopac job.0000000.mop
/opt/mopac/mopac job.0000001.mop
/opt/mopac/mopac job.0000002.mop
/opt/mopac/mopac job.0000003.mop
/opt/mopac/mopac job.0000004.mop
	  Because of this legacy, job_limit still corresponds to
	  individual calculations, despite the fact that they are now grouped into
	  chunks. As a result, if you want to restrict the number of jobs as
	  reported by your queue manager you should consider the
	  ratio job_limit / chunk_size rather
	  than job_limit alone. For example, to target a maximum number
	  of queued jobs of 500 with a chunk size of 25, you could set
	  a job_limit of 500 * 25 = 12_500. Note the use
	  of the underscore as a digit separator. This is not required in TOML but
	  can be handy for large numbers.
	
	  The main consideration when choosing a chunk_size is the
	  amount of time spent submitting jobs versus actually running jobs.
	  According to our system administrators, the cutoff (likely pretty
	  conservative) is around 5 minutes of job time. This can lead to pretty
	  crazy chunk sizes, especially for PM6 calculations on small molecules,
	  where single-point energy computations can take on the order of 0.01
	  seconds (5 minutes = 300 seconds | 1 job / 0.01 seconds => 30_000
	  jobs per chunk!). However, more expensive calculations can easily
	  take five minutes or more each, allowing you to use a chunk size of 1.
	  Individual jobs give pbqff the greatest freedom to queue new jobs as old
	  ones finish, but, again, it's possible to spend more time writing and
	  submitting separate jobs (and taxing the queue system) than actually
	  running calculations.
	
	  Of course, the preceding paragraphs all assume that you're running pbqff
	  on a large molecule with many single-point calculations and that
	  you're on some kind of distributed compute cluster. If, on the other hand,
	  you're running pbqff on a personal computer or a single node on a cluster
	  using the local queue option, you will only be able to run a very small
	  number of chunks at once anyway (roughly one per CPU or thread you allowed
	  pbqff to use). In this case, there's no real incentive to shrink
	  your chunk_size to allow more flexible submission. As long as
	  your chunk size is small enough to send one chunk to each core you
	  requested, all of your compute will be saturated already. For example, you
	  wouldn't want to request a chunk size of 250 for a 1000-job QFF and
	  request 8 cores because only 4 cores would get chunks to work on, but
	  there's no real reason to set a chunk size of 25 either. It will just lead
	  to a lot more shell scripts being written. The logic is similar for a
	  small molecule like a diatomic where the total number of points may be
	  smaller than a reasonable chunk size for other calculations.
	
	  This table contains my general heuristics for job_limit
	  and chunk_size requests on our compute cluster, where the
	  hard limit on queued jobs is around 1000, but most of the time I can only
	  get ~50 jobs to run at once anyway.
	
| Level of Theory | Time Per Job (s) | Job Limit | Chunk Size | 
|---|---|---|---|
| PM6 (5 atoms) | 0.01 | 1_500_000 | 30_000 | 
| F12-DZ (5 atoms) | 1 | 15_000 | 300 | 
| TZ-cCR (8 atoms) | 300 | 50 | 1 | 
	  pbqff originally grew out of two separate projects: go-cart,
	  which was my first implementation of finite-difference
	  Cartesian-coordinate QFFs, and pbqff proper, which automated our group's
	  traditional least-squares fitted symmetry-internal coordinate (SIC) QFFs.
	  These two (SICs and Cartesian coordinates) are still supported, along with
	  the newest but already most-used normal-coordinate QFFs. Fully documenting
	  the construction of a symmetry-internal coordinate system is beyond the
	  scope of this manual (at least for now but likely forever) because it
	  involves a lot of practice and manual construction. Thus, the two
	  recommended coordinate systems for pbqff are Cartesian coordinates,
	  requested by coord_type = "cart" and normal coordinates,
	  requested by coord_type = "norm".
	
	  Cartesian coordinates are the most straightforward coordinate system. The
	  initial geometry is either already in Cartesian (XYZ) form or can be
	  converted from a Z-matrix by the quantum chemistry program. Then, each of
	  these coordinates is displaced in all of the two-, three-, and
	  four-coordinate combinations needed to map out the QFF. It is
	  theoretically possible to approximate the PES with a least-squares fit in
	  Cartesian coordinates, but this is not currently implemented. I tried this
	  in a previous version of pbqff, and it led to significant bottlenecks in
	  the least-squares fit due to the much larger number of Cartesian points
	  compared to SICs without any improvement (and occasional worsening) of the
	  results. In other words, Cartesian coordinates imply findiff =
	  true.
	
	  Normal coordinates are generally a clear improvement over Cartesian
	  coordinates. First, the VPT2 implementation in spectro only requires the
	  semi-diagonal normal coordinate force constants (Φiijj and
	  Φiiii), but these require the fully-off-diagonal Cartesian
	  coordinate force constants like Fijkl to compute. Similarly,
	  and even more straightforwardly, normal coordinates discard the rotational
	  and translational degrees of freedom present in Cartesian coordinates,
	  lowering the number of coordinates from 3N to 3N
	  − 6. Combined, these factors allow for far fewer single-point
	  energies to be evaluated in normal coordinates.
	
Further, normal coordinates share the convenience of Cartesian coordinates: they can be derived automatically from the input geometry. This may not be clear from this manual because of the limited treatment of SICs, but this is the primary contrast between SICs and normal coordinates because SICs also require far fewer points than Cartesians, putting them on more equal footing with normal coordinates in terms of computational cost (but still worse for even medium-sized molecules) but certainly not in convenience.
	  Of course, the similarity to Cartesian coordinates is no accident. Normal
	  coordinates can only be derived automatically by computing a Cartesian
	  harmonic force field (HFF) to obtain the Hessian matrix, from which the
	  mass-weighted eigenvalues can be extracted to yield the normal
	  coordinates. This separation of the HFF and QFF leads to two additional
	  options for normal coordinate QFFs: hybrid_template
	  and norm_resume_hff. These are both discussed more fully in
	  later sections, but in brief, hybrid_template allows you to
	  use the plain template as the template for the HFF points and
	  the hybrid_template for the cubic and quartic points;
	  and norm_resume_hff allows you to resume a normal coordinate
	  QFF from a checkpoint after the initial HFF.
	
	  Normal coordinates are also the only coordinate type for which
	  the findiff option actually has any effect. SICs are assumed
	  to be fitted (findiff = false) and Cartesians are assumed to
	  be finite-difference. However, because of the way normal coordinates are
	  constructed (via the initial HFF), fitted normal coordinates are not refit
	  to zero the approximated function gradient as was typical with SICs.
	  Again, this is likely not to make much sense unless you're familiar with
	  the SIC procedure itself, but in short, this means normal coordinates
	  don't really capture the benefits SICs got from performing this fitting,
	  and finite differences are more economical, so stick with finite
	  differences (findiff = true) unless you really know what
	  you're doing.
	
Finally, normal coordinates are not without their limitations. Their main shortcoming relates to the mass-weighting involved in their definition, which causes the atomic masses to be baked into the resulting force constants in a way that is not easy to reverse. Practically, this means that computing isotopic spectroscopic data with normal coordinate QFFs requires recomputing the QFF for each combination of atomic masses. Second, because they are based on the Cartesian HFF, any numerical issues in the initial HFF can be dramatically compounded in the QFF. This mostly arises with highly symmetrical molecules (symmetric and spherical tops) with many resonances, but it can also pop up for other tricky molecules. It should be pretty obvious when this happens (anharmonic frequencies of 10,000 or more), but it's something to keep in the back of your mind.
	  As mentioned above, fully documenting the SIC code could double the size
	  of this document. All I'll say here is that running an SIC QFF in pbqff
	  requires the top of an INTDER input file (the simple- and
	  symmetry-internal coordinate definitions) in a file
	  named intder.in in the directory where you invoke pbqff. If
	  there's a resurgence of interest in SICs, I'd like to handle this kind of
	  input in a more compact way and without requiring a separate file. Feel
	  free to send me an email or open a GitHub issue if you are interested in
	  this or try using it and encounter any problems.
	
As mentioned in the introduction, pbqff offloads the interface to both chemistry programs and queuing systems to the psqs library. As listed in the psqs README, it currently supports writing input files and reading output files for the following programs:
	  Where these are listed roughly in order of support quality. MOPAC and
	  Molpro have been tested the most thoroughly, while CFOUR and DFTB+ have
	  been tested but to a much more limited extent. Each of these can be
	  requested in the pbqff input file using the program keyword
	  with valid options being the lowercase versions of each program
	  name as a TOML
	  string: "mopac", "molpro", "cfour", "dftb+".
	
	  Like the program option above, the queue option
	  depends on the values supported by psqs, which are the following:
	
PBS and Slurm are popular queuing systems on HPC clusters, while the local queue is a way to run pbqff on a single computer. Again, these are listed roughly in order of support. Hundreds of QFFs have been run with pbqff on a PBS cluster, while a much smaller number have been run on a smaller Slurm cluster. The local queue is used in all of the pbqff unit tests, so by that metric it is arguably the best-tested, but it essentially has no error handling, making it suited really only for these tests and other short test calculations rather than "production" QFFs.
	  These can be selected by their lowercase TOML string forms with
	  the queue keyword: "pbs", "slurm",
	  and "local".
	
	  The program template is the main way for you to customize how
	  pbqff runs the single-point energies that compose the QFF. The template
	  syntax is inspired by
	  Go's template library,
	  where patterns like {{.geom}} are replaced by the
	  appropriate information (somewhat obviously the geometry in this example).
	  Currently each supported program has its own set of supported options, but
	  most of them are the same at least.
	
	  Most users of pbqff specify both these template values and
	  the queue_template discussed in the next section as a literal
	  TOML string, but you can also specify an external file using the syntax
	  below.
	
template = { file = "/path/to/template" }
	  All of the examples in the following sections will work for either
	  plain template input or for the
	  special hybrid_template option. As mentioned in the section
	  on normal
	  coordinates, hybrid_template allows you to run the HFF
	  portion of a normal coordinate QFF with one level of theory and the cubic
	  and quartic portions of the QFF at a second level of theory. As far as I
	  know, this has only been tested to work with Molpro,
	  so see that section below for an example input file.
	
	  MOPAC has the simplest template format because MOPAC uses a
	  single line at the start of the file to specify all options, followed by
	  two comment lines and the geometry. As such, pbqff does not handle any
	  template parameter replacements for MOPAC (the curly brace stuff from the
	  paragraphs above). Options like the charge and whether or not to run an
	  optimization are simply appended to the first line of the template, and
	  the geometry for each single-point energy calculation is appended to the
	  file after two arbitrary comment lines. As such, the template
	  value in a MOPAC QFF should be a single-line TOML string like those in the
	  examples below.
	
template = "scfcrt=1.D-21 aux(precision=14 comp xp xs xw) PM6 THREADS=1"template = """gnorm=0.0 scfcrt=1.D-21 aux(precision=14 comp xp xs xw) \
PM6 SINGLET THREADS=1 external=params.dat"""As shown in the second example, you can also use the multi-line TOML string syntax (triple quotes) to allow very long templates to be broken across multiple lines with a backslash. Unfortunately this is not supported by single-line TOML strings with single quotes.
	  MOPAC defaults to running a geometry optimization, so the input
	  option 1SCF is added to the input line to disable this for
	  the single-point energy computations. As far as I know it's safe to
	  include any geometry-optimization-specific options
	  (like gnorm above) in these single-point calculations, so
	  pbqff doesn't have to do any further processing on the template.
	
	  For Molpro, pbqff supports two template
	  parameters: {{.geom}} and {{.charge}},
	  although it's often easier to supply the charge literally in
	  your template. The charge option is quite
	  straightforward as the input charge option is simply
	  substituted directly for {{.charge}} in the template
	  string. The geometry handling, on the other hand, is slightly more
	  complicated. Notice that there is no closing curly brace in the templates
	  below. The reason for this is simply programming convenience, at a slight
	  risk of confusion for the user. For Z-matrix geometries, Molpro expects
	  the main Z-matrix to be inside the curly braces and the variable
	  definitions to follow the closing curly brace. In contrast, XYZ geometries
	  should be fully enclosed in curly braces. Omitting the closing curly brace
	  makes it easier for pbqff to handle this distinction.
	
Here's a somewhat complicated template example that captures the flexibility of the template input for a TZ-cCR calculation:
template = """memory,8,g
gthresh,energy=1.d-12,zero=1.d-22,oneint=1.d-22,twoint=1.d-22;
gthresh,optgrad=1.d-8,optstep=1.d-8;
nocompress;
geometry={
{{.geom}}
basis={
default,cc-pCVTZ-f12
}
set,charge={{.charge}}
set,spin=0
hf,accuracy=16,energy=1.0d-10
{CCSD(T)-F12,thrden=1.0d-12,thrvar=1.0d-10,nocheck;core}
{optg,grms=1.d-8,srms=1.d-8}
etz=energy
basis=cc-pvtz-dk
hf,accuracy=16,energy=1.0d-10
{CCSD(T),thrden=1.0d-12,thrvar=1.0d-10,nocheck;}
edk=energy
basis=cc-pvtz-dk
dkroll=1
hf,accuracy=16,energy=1.0d-10
{CCSD(T),thrden=1.0d-12,thrvar=1.0d-10,nocheck;}
edkr=energy
pbqff=etz(2)+edkr-edk
show[1,f20.12],pbqff"""
	  In addition to the note about the missing curly brace, also note that
	  pbqff will only read the output energy if it is named pbqff
	  and displayed in the output using the show directive. For
	  simple calculations, this can be achieved using an expression
	  like pbqff=energy, followed by thte show line in
	  the example above. However, for F12 calculations in particular, it's
	  important to remember that the F12 energy in Molpro is a vector, with the
	  F12a and F12b energies in it. The example above uses the F12b energy
	  (etz(2)). This might look unusual if you're a programmer used
	  to indices starting at zero, but Fortran indices actually start at 1,
	  meaning etz(1) is the F12a energy and etz(2) is
	  the F12b energy.
	
Another important feature of Molpro template handling is how it adds or subtracts geometry optimization keywords depending on the type of calculation requested. There are four possibilities here:
optg included in templateoptg includedoptg includedoptg included
	  In the first and last cases, pbqff doesn't have to make any changes to
	  your template. However, in the second and third cases, pbqff needs to
	  bring the request for an initial geometry optimization into alignment with
	  the provided template. It does this by either adding a
	  default optg directive
	  ({optg,grms=1.d-8,srms=1.d-8}) in the second case, or by
	  deleting a line matching the regular
	  expression (?i)^.*optg(,|\s*$). If you're not familiar with
	  the intricacies of regular expressions, this looks for a line starting
	  with some optional junk (such as a curly brace), followed by the literal
	  word optg, followed by either a comma or some number of
	  spaces and a newline. While writing this, it doesn't seem quite right, but
	  it has worked for a couple of years in practice to identify
	  typical optg lines
	  like {optg,grms=1.d-8,srms=1.d-8} and skip lines
	  like gthresh,optgrad=1.d-8,optstep=1.d-8;, which also
	  includes optg but not in the right way.
	
	  In summary, if you're picky about how the optimization should be run, be
	  sure to include a single-line optg directive of your own
	  choosing, and pbqff will remove it as needed for single-point energy
	  calculations. Otherwise you can omit an optg line entirely
	  and pbqff will include a default one as needed too.
	
	  As one final example, here's a combination template
	  and hybrid_template input file section for a TZ-cCR+DZ
	  calculation:
	
template = """memory,1,g
gthresh,energy=1.d-12,zero=1.d-22,oneint=1.d-22,twoint=1.d-22;
gthresh,optgrad=1.d-8,optstep=1.d-8;
nocompress;
geometry={
{{.geom}}
basis={
default,cc-pCVTZ-f12
}
set,charge={{.charge}}
set,spin=0
hf,accuracy=16,energy=1.0d-10
{CCSD(T)-F12,thrden=1.0d-12,thrvar=1.0d-10,nocheck;core}
{optg,grms=1.d-8,srms=1.d-8}
etz=energy
basis=cc-pvtz-dk
hf,accuracy=16,energy=1.0d-10
{CCSD(T),thrden=1.0d-12,thrvar=1.0d-10,nocheck;}
edk=energy
basis=cc-pvtz-dk
dkroll=1
hf,accuracy=16,energy=1.0d-10
{CCSD(T),thrden=1.0d-12,thrvar=1.0d-10,nocheck;}
edkr=energy
pbqff=etz(2)+edkr-edk
show[1,f20.12],pbqff"""
hybrid_template = """memory,1,g
gthresh,energy=1.d-12,zero=1.d-22,oneint=1.d-22,twoint=1.d-22;
gthresh,optgrad=1.d-8,optstep=1.d-8;
nocompress;
geometry={
{{.geom}}
basis={
default,cc-pVDZ-f12
}
set,charge={{.charge}}
set,spin=0
hf,accuracy=16,energy=1.0d-10
{CCSD(T)-F12,thrden=1.0d-12,thrvar=1.0d-10}
{optg,grms=1.d-8,srms=1.d-8}
pbqff=energy(2)
show[1,f20.12],pbqff
"""
	  If renamed just to template, the hybrid_template
	  would serve as an F12-DZ template itself.
	
	  Like with Molpro, pbqff also supports the {{.geom}}
	  and {{.charge}} keywords for CFOUR. Additionally, pbqff
	  recognizes a {{.keywords}} template parameter that it
	  needs to know where to insert things like COORD=CARTESIAN for
	  the single-point energy calculations. One, admittedly strange, difference
	  with the CFOUR {{.charge}} parameter is that it also adds
	  the prefix CHARGE= as expected by CFOUR. With these
	  considerations in mind, an example CFOUR template for a CCSD/cc-pVTZ QFF
	  is shown below.
	
comment line
{{.geom}}
*CFOUR(CALC=CCSD,BASIS=PVTZ,MEMORY_SIZE=8,MEM_UNIT=GB,REF=RHF,MULT=1
{{.charge}}
{{.keywords}})
	  Unlike Molpro, there is no special handling of geometry optimization. If
	  you request a geometry optimization, pbqff assumes that you have marked
	  all of the variables in the Z-matrix to optimize with * as
	  expected by CFOUR and will extract the optimized Cartesian geometry from
	  the MOLDEN output file in the same directory as the main CFOUR output.
	
	  Again, like Molpro and CFOUR, DFTB+ supports
	  the {{.geom}} and {{.charge}} template
	  parameters. Also like Molpro, DFTB+ requires a rather elaborate geometry
	  optimization input section, so if you don't provide a section matching the
	  regular expression (?i)Driver = GeometryOptimization
	  (basically just that literal text but ignoring case), pbqff will add the
	  default value shown below.
	
Driver = GeometryOptimization {
	Optimizer = Rational {}
	MovedAtoms = 1:-1
	MaxSteps = 100
	OutputPrefix = "geom.out"
	Convergence {
		Energy   = 1e-8
		GradElem = 1e-7
		GradNorm = 1e-7
		DispElem = 1e-7
		DispNorm = 1e-7
	}
}
And here is a full DFTB+ example template with a custom geometry optimization block that is probably a more sensible default.
template = """Geometry = xyzFormat {
{{.geom}}
}
Hamiltonian = DFTB {
  Scc = Yes
  SlaterKosterFiles = Type2FileNames {
    Prefix = "/home/brent/chem/dftb/matsci/"
    Separator = "-"
    Suffix = ".skf"
  }
  MaxAngularMomentum {
    C = "p"
    H = "s"
  }
  Charge = {{.charge}}
  PolynomialRepulsive = SetForAll { Yes }
}
Options {
}
Analysis {
  CalculateForces = Yes
}
ParserOptions {
  ParserVersion = 12
}
Driver = GeometryOptimization {
  Optimizer = LBFGS { }
  MovedAtoms = 1:-1
  MaxSteps = 500
  OutputPrefix = "geom.out"
  Convergence {
    Energy = 1e-8
    GradElem = 1e-8
    GradNorm = 1e-8
    DispElem = 1e-7
    DispNorm = 1e-7
  }
}
"""
	  Just like the plain template keyword allows you to customize
	  the chemistry program input, the queue_template keyword
	  allows you to tell pbqff how to submit your jobs to the three kinds of
	  queues it knows about: PBS, Slurm, and what it calls a "local" queue.
	
	  One thing to keep in mind with the queue templates is how the chemistry
	  program is invoked for each input job. The primary means of customizing
	  this is through the PROGRAM_CMD variables. The table below
	  shows the command that pbqff writes to a submission script for each job in
	  a chunk. Don't worry about {filename} here, that's a Rust
	  format string parameter, not a template parameter like we saw in the
	  previous sections. pbqff will automatically replace this with the proper
	  filename.
	
| Program | Template | 
|---|---|
| MOPAC | $MOPAC_CMD {filename}.mop | 
| Molpro | $MOLPRO_CMD {filename}.inp | 
| CFOUR | (cd {filename} && $CFOUR_CMD) | 
| DFTB+ | (cd {filename} && $DFTB_CMD > out) | 
Note that CFOUR and DFTB+ expect a single filename for each input file (ZMAT and dftb_in.hsd, respectively), so their "filenames" actually correspond to directories. Again, that is effectively an implementation detail of pbqff, not something you should have to worry about.
	  In addition to using these command variables, all of the queue
	  implementations have access to the template
	  parameters {{.basename}}
	  and {{.filename}}. These both refer to the filename of
	  the submission script itself (usually something
	  like main0.pbs
	  or main0.slurm). basename just strips off any
	  directory information, which makes it a bit neater to use as the job name,
	  as shown in the examples below.
	
	  The structure of each section below is the same. First, tables containing
	  the default CMD values for each of the supported programs on
	  that queue are shown, followed by example queue_template
	  values for each program.
	
| Program | Default Command | 
|---|---|
| MOPAC | /ddnlus/r2518/Packages/mopac/build/mopac | 
| Molpro | "molpro -t $NCPUS --no-xml-output" | 
| CFOUR | "/ddnlus/r2518/bin/c4ext_new.sh $NCPUS" | 
| DFTB+ | /ddnlus/r2518/.conda/envs/dftb/bin/dftb+ | 
MOPAC and DFTB+ have simple commands (just paths to executables here that don't require quotes, but Molpro and CFOUR require additional arguments separated by spaces.
	  This example is repeated from the initial input file example. One feature
	  to point out here that will be shared with later templates is the use of
	  the trap shell command to run any commands that need to be
	  run after the single-point energy calculations. pbqff just appends
	  each CMD line to the template, so there's not any opportunity
	  to append commands after that. trap allows you to associate
	  an action with a signal received by the shell process, in this case the
	  EXIT signal sent when the process exits. If you need to run multiple
	  commands you can either separate them with colons or &&, or
	  define a simple shell cleanup function and invoke it like
	  this: trap cleanup EXIT.
	
queue_template = """
#!/bin/sh
#PBS -N {{.basename}}
#PBS -S /bin/bash
#PBS -j oe
#PBS -o pts/{{.basename}}.out
#PBS -W umask=022
#PBS -l walltime=100:00:00
#PBS -l cput=100:00:00
#PBS -l ncpus=1
#PBS -l mem=16gb
#PBS -q workq
module load openpbs
export WORKDIR=$PBS_O_WORKDIR
export TMPDIR=/tmp/$USER/$PBS_JOBID
cd $WORKDIR
mkdir -p $TMPDIR
trap 'rm -rf $TMPDIR' EXIT
export LD_LIBRARY_PATH=/ddnlus/r2518/Packages/mopac/build
export MOPAC_CMD=/ddnlus/r2518/Packages/mopac/build/mopac
"""
	  Note the addition of the LD_LIBRARY_PATH variable along
	  with MOPAC_CMD. This is not a pbqff requirement but something
	  to keep in mind if your command points to a dynamically-linked executable
	  like this version of MOPAC.
	
queue_template = """
#!/bin/sh
#PBS -N {{.basename}}
#PBS -S /bin/bash
#PBS -j oe
#PBS -o {{.basename}}.out
#PBS -W umask=022
#PBS -l walltime=1000:00:00
#PBS -l ncpus=1
#PBS -l mem=8gb
#PBS -q workq
module load openpbs molpro
export WORKDIR=$PBS_O_WORKDIR
export TMPDIR=/tmp/$USER/$PBS_JOBID
cd $WORKDIR
mkdir -p $TMPDIR
trap 'rm -rf $TMPDIR' EXIT
export MOLPRO_CMD=\"molpro -t $NCPUS --no-xml-output\"
"""
	  One potentially surprising part of this template is the use
	  of basename instead of filename in
	  the #PBS -o line. While Molpro PBS jobs get written into a
	  single directory, unlike CFOUR and DFTB+ jobs, the Molpro module on Maple
	  doesn't allow Molpro jobs to be submitted from a separate directory
	  anymore. As a result, pbqff will change to the appropriate directory
	  before submitting a job, leading to this template discrepancy.
	
This template and the following are pretty much the same, as were the previous two, so there's not much more to say. In general, the top portion of each template just consists of normal PBS directives with optional template parameters, followed by a single command or couple of commands (for MOPAC) to set the variable used by pbqff.
queue_template = """
#!/bin/sh
#PBS -N {{.basename}}
#PBS -S /bin/bash
#PBS -j oe
#PBS -o {{.filename}}.out
#PBS -W umask=022
#PBS -l walltime=1000:00:00
#PBS -l ncpus=1
#PBS -l mem=8gb
#PBS -q workq
module load openpbs
export WORKDIR=$PBS_O_WORKDIR
cd $WORKDIR
CFOUR_CMD=\"/ddnlus/r2518/bin/c4ext_new.sh $NCPUS\"
"""queue_template = """
#!/bin/sh
#PBS -N {{.basename}}
#PBS -S /bin/bash
#PBS -j oe
#PBS -o {{.filename}}.out
#PBS -W umask=022
#PBS -l walltime=1000:00:00
#PBS -l ncpus=1
#PBS -l mem=8gb
#PBS -q workq
module load openpbs
export WORKDIR=$PBS_O_WORKDIR
cd $WORKDIR
export DFTB_CMD=/ddnlus/r2518/.conda/envs/dftb/bin/dftb+
"""The general approach for Slurm and the local queue is identical to PBS, so you may want to read the previous section even if you're intending to use one of these other queues. Here I will just present some example templates for quick reference.
	  This template and the Molpro one below are embedded directly from psqs,
	  where the templates for pbqff are defined. psqs doesn't currently define
	  default templates for CFOUR or DFTB+, so you'll have to provide
	  a queue_template of your own for those programs.
	
#!/bin/bash
#SBATCH --job-name=semp
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -o {{.filename}}.out
#SBATCH --no-requeue
#SBATCH --mem=1gb
export LD_LIBRARY_PATH=/home/qc/mopac2016/
export MOPAC_CMD=/home/qc/mopac2016/MOPAC2016.exe
echo $SLURM_JOB_ID
date
hostname
#!/bin/bash
#SBATCH --job-name={{.filename}}
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -o {{.filename}}.out
#SBATCH --no-requeue
#SBATCH --mem=8gb
MOLPRO_CMD="/home/qc/bin/molpro2020.sh 1 1"
#!/bin/bash
#SBATCH --job-name={{.filename}}
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -o {{.filename}}.out
#SBATCH --no-requeue
#SBATCH --mem=8gb
CFOUR_CMD="cfour"
#!/bin/bash
#SBATCH --job-name={{.filename}}
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -o {{.filename}}.out
#SBATCH --no-requeue
#SBATCH --mem=8gb
DFTB_CMD="dftb+"
Like for Slurm, pbqff includes only two default templates for the local queue option, which is primarily used for testing. Because the local queue simply "submits" jobs using bash scripts rather than PBS or Slurm submission scripts, a local queue template can be much simpler than a PBS or Slurm template, essentially just defining the path to the executable to be run. Of course, you can add any additional commands you want, but this is all that pbqff requires.
export MOPAC_CMD=/opt/mopac/mopac
export LD_LIBRARY_PATH=/opt/mopac/ DFTB_CMD=/opt/dftb+/dftb+ MOLPRO_CMD=molpro CFOUR_CMD=cfour As alluded to in the coordinate type discussion, pbqff supports two types of derivative calculations for the force constants used in the VPT2 calculations. The first, and more traditional, of these involves a least-squares fit of the QFF energies to the Taylor series expansion of the potential energy surface using the ANPASS program. This works well for small molecules and can help to smooth over minor numerical issues because the initial guess at the PES can be refit to the function minimum to yield zero-gradient force constants. On the other hand, for large molecules, assembling the matrix of energies and performing the least-squares solution actually becomes a fairly significant bottleneck in terms of both time and memory requirements. This is compounded by larger coordinate systems like Cartesians.
As such, most of our recent calculations have relied instead on finite difference derivatives, specifically central finite differences. These are easier to compute because the individual energies comprising each force constant can be accumulated directly into their corresponding force constant rather than being kept around for a big calculation at the end.
	  Again, as described in the normal
	  coordinate section, normal coordinates are the only type for which
	  the findiff option is used. Cartesian coordinates
	  assume findiff = true (they always use finite differences),
	  and SICs assume findiff = false (they always use an ANPASS
	  fitting). Normal coordinates also default to findiff = false,
	  but because of the benefits mentioned above, you will usually want to use
	  them with findiff = true, so this default should probably be
	  updated.
	
	  Checkpoints are configured through the check_int option. As
	  the name somewhat implies, this sets the checkpoint interval in units of
	  polling iterations (see Sleep Interval for
	  more details). A value of 0 disables checkpoints, while a
	  positive value will write a psqs checkpoint file (chk.json)
	  at that interval.
	
	  In addition to the chk.json file written by psqs, which
	  contains all of the information needed to recreate the queue of running
	  chemistry jobs, pbqff writes its own checkpoint to
	  the res.chk file. This file contains the rest of the state
	  needed to pick up after psqs resumes from its own checkpoint. Despite the
	  strange file extension, this is also a JSON file that you can inspect if
	  needed, although modifying either of these files could cause some very
	  strange issues unless you are very careful.
	
	  To resume a QFF from these two checkpoint files, provide the pbqff binary
	  with the -c or --checkpoint
	  As described in the normal coordinate
	  section, there is also a configuration option
	  called norm_resume_hff, which allows resuming a normal
	  coordinate QFF from within the initial HFF (as in before it moves on to
	  the QFF portion). This relies on the same res.chk file as the
	  normal checkpoints, but its contents are very different during the HFF. To
	  trigger this kind of checkpoint, you actually need to omit
	  the -c command line flag and only include this option in the
	  input file itself.
	
	  The optional weights keyword is used to specify the atomic
	  masses to use for normal coordinate generation and mass-dependent
	  spectroscopic calculations. If this is not provided, the mass of the most
	  abundant isotope from
	  the NIST
	  atomic weight database is used. See below for an example of how to
	  include this in your input file for water. Currently the number of weights
	  must equal the number of atoms. There is no way to specify weights for a
	  subset of atoms.
	
geometry = """
O
H 1 OH
H 1 OH 2 HOH
OH = 1.0
HOH = 109.5
"""
# one-line version
weight = [15.99491461957, 1.00782503223, 1.00782503223]
# multiline version, useful for larger molecules
weight = [
    15.99491461957,
    1.00782503223,
    1.00782503223,
]
# use one deuterium
weight = [
    15.99491461957,
    2.01410177812, # <- D
    1.00782503223,
]
This is possibly the worst-named pbqff option because it has nothing to do with the kind of dummy atoms typically included in a Z-matrix, for example, which do not require any special treatment in the pbqff input file (they should be erased by the chemistry program before pbqff sees the Cartesian geometry). Instead this is used to specify a number of atoms that should not be displaced in the course of the QFF. This is a somewhat experimental option that I have currently only used in Molpro to compute QFFs for small molecules in the presence of a non-bonded atom. This takes advantage of Molpro's Z-matrix input, which allows including some Cartesian coordinates too. These Cartesian coordinates are frozen in geometry optimizations, allowing for the type of calculation described here. I consider it unlikely that this option will be useful or even usable in other packages, but it exists if you need it.
	  Finally, the norm_resume_hff option allows you to resume a
	  (presumably failed) normal coordinate QFF from the initial HFF stage. This
	  was a much-requested feature by users of pbqff, but the separation of
	  normal coordinate QFFs into the HFF and QFF stages made it difficult to
	  integrate with the existing checkpoint mechanism, leading to this separate
	  option. Indeed, it is fully separate from the command
	  line --checkpoint option, which must be omitted when
	  using norm_resume_hff.
	
	  In addition to the settings in the input file described in the previous
	  section, pbqff takes several command line options. To some extent, the
	  most important of these is the name of the input file, but even this has a
	  default value of pbqff.toml, allowing it to be omitted if you
	  use that as your input file name. The other options are summarized in the
	  table below.
	
| Option | Type | Description | 
|---|---|---|
| -c/--checkpoint | bool | Resume from the checkpoint files in the current directory (chk.json and res.chk). Defaults to false. | 
| -o/--overwrite | bool | Overwrite existing output from a previous run. Defaults to false. | 
| -v/--version | bool | Print the git commit hash and exit. Defaults to false. | 
| -t/--threads | usize | Set the maximum number of threads to use. Defaults to 0, which means to use as many threads as there are CPUS. | 
| -j/--json | bool | Serialize the input file to JSON and exit. For use by qffbuddy. | 
| -n/--no_del | bool | Don't delete any files when running the single-point energies. Defaults to false. | 
As indicated by the slash, each of these options can either be provided as a short option with a single dash and its first letter or as a long option with two dashes and the full name.
	  Of these options, -t/--threads is probably the most important
	  on a regular basis since it controls the number of threads spawned by
	  pbqff. If you're running pbqff on a local computer, you may want to limit
	  this to ~half of the CPUs on the machine to allow other programs to run
	  smoothly. Otherwise, if you're running pbqff on some kind of HPC cluster,
	  you'll likely want to set this to the number of CPUs you request in your
	  job submission script.
	
	  The -o/--overwrite option is also commonly used, especially
	  if you make a typo in your first submission of a pbqff job. By default,
	  pbqff will refuse to overwrite any output files in the current directory,
	  even if they just contain an error message. Passing the -o
	  flag will bypass this check and allow the run to continue.
	
	  The -c/--checkpoint flag can be used to resume from a
	  previously-written checkpoint file. See
	  the Checkpoints section above for more details.
	
	  Finally, the -n/--no_del option prevents pbqff from cleaning
	  up the (sometimes numerous) single-point energy input and output files as
	  it runs. This is primarily useful if you're debugging some kind of issue
	  in pbqff and need to see all of the output files at the end of a run. This
	  is not the default because large QFFs produce a very large quantity of
	  very similar files that HPC administrators don't always like to keep
	  around.
	
	  As noted in the table, the -j/--json flag is intended
	  primarily for use by qffbuddy and is hidden from the
	  built-in -h/--help flag contained in pbqff. However, it is
	  documented here in case it proves useful for other interfaces to pbqff.
	
You may know qffbuddy as a Python GUI for writing and even running pbqff jobs. This section contains a reimplementation of qffbuddy as an HTML form. While you can't use it to run pbqff directly, submitting the form will copy the generated TOML file to your clipboard for pasting into an actual TOML file.