Tool Wrappers (modelarchive.tools)

The modelarchive.tools submodule provides wrappers around external tools used in ModelArchive workflows, along with convenience functions for common use cases. Detailed documentation is available for each individual tool wrapper.

MAXIT (modelarchive.tools.maxit)

MAXIT from RCSB converts coordinate files in PDB legacy format to CIF, and CIF files to mmCIF. This module adds the ability to turn a PDB file into a (minimalist) ModelCIF file. But don’t get too excited: none of the functionality will turn a PDB file into a fully annotated ModelCIF file. It only makes sure the starting point has valid CIF syntax. Extra data still needs to be added afterwards.

MAXIT is not bundled with this module. Its source code and installation instructions are available from RCSB. Here is a TL;DR on how to compile it on macOS and most Linux distributions:

# cd into the unpacked source directory first
export RCSBROOT=$(pwd)
make
make binary
# binaries are found in bin/
# RCSBROOT must point at the directory that contains data/ when running maxit
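Before calling any of the wrappers, it can help to check that the environment described above is in place. The following is a minimal sketch that relies only on what these notes state (a maxit binary on $PATH and RCSBROOT being set). The helper name check_maxit_environment is hypothetical and not part of modelarchive.

```python
import os
import shutil


def check_maxit_environment(binary="maxit", environ=None):
    """Return a list of problems found; an empty list means all looks fine.

    Hypothetical helper for illustration, not part of modelarchive.
    """
    environ = os.environ if environ is None else environ
    problems = []
    # The wrappers raise RuntimeError when RCSBROOT is missing, so check early.
    if not environ.get("RCSBROOT"):
        problems.append("RCSBROOT is not set")
    # Look the binary up on the PATH taken from the same environment mapping.
    if shutil.which(binary, path=environ.get("PATH", "")) is None:
        problems.append(f"'{binary}' not found on PATH")
    return problems
```

Passing a plain dict as environ makes the check easy to try without touching the real environment.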

This module adds a new command line tool, ma-maxit. It wraps RCSB maxit and adds “CIF/PDB to ModelCIF” modes.

modelarchive.tools.maxit.MAXIT_BINARY = 'maxit'

Path to the maxit binary, defaults to maxit from $PATH.

Can be overridden by setting the MAXIT_BINARY environment variable before import.
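The "before import" requirement suggests that the value is read once when the module is loaded. A minimal sketch of that pattern follows, assuming a plain environment lookup with a default. The function resolve_binary is hypothetical and only illustrates the behaviour; the module's actual code may differ.

```python
import os


def resolve_binary(env_var, default, environ=None):
    """Return the binary path from the environment, or the default name.

    Hypothetical illustration of an import-time lookup.
    """
    environ = os.environ if environ is None else environ
    return environ.get(env_var, default)


# Typical use: set the variable first, then import the module.
# os.environ["MAXIT_BINARY"] = "/opt/maxit/bin/maxit"  # example path, an assumption
# import modelarchive.tools.maxit
```

Setting the variable after the import would have no effect under this assumption, because the lookup has already happened.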

modelarchive.tools.maxit.cif2mmcif(infile, outfile)

Convert a CIF file to mmCIF using MAXIT.

Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input CIF file (also as gzip).

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
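Since the function signals failure through a non-empty list rather than an exception, callers need to inspect the return value. A small sketch of how that could be handled is shown below. The helper summarize_log is hypothetical, and the file names in the usage comment are placeholders.

```python
def summarize_log(log, label="conversion"):
    """Turn a wrapper's return value into an (ok, message) pair.

    An empty list means success; anything else is treated as the
    MAXIT log of a failed run. Hypothetical helper for illustration.
    """
    if not log:
        return True, f"{label}: ok"
    details = "\n".join(line.rstrip() for line in log)
    return False, f"{label} failed:\n{details}"


# Usage with the real wrapper (requires MAXIT and RCSBROOT):
# from modelarchive.tools.maxit import cif2mmcif
# ok, message = summarize_log(cif2mmcif("model.cif", "model.mmcif.gz"), "cif2mmcif")
```

The same pattern applies to the other functions in this module that return list[str].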

modelarchive.tools.maxit.cif2modelcif(infile, outfile)

Convert a CIF file into a minimalist ModelCIF file.

Sanitizes the input and adds mandatory ModelCIF categories. The input file is expected to be in mmCIF format as produced by MAXIT.

Parameters:
  • infile (Path | str) – Input mmCIF/ModelCIF file (also as gzip).

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Path to the output file.

Return type:

Path | str

Raises:

RuntimeError – If reading or writing the CIF file fails.

modelarchive.tools.maxit.coordfile2modelcif(infile, outfile)

Convert a macromolecular structure file to a minimalist ModelCIF file.

Dispatches to pdb2modelcif() or cif2modelcif() based on the file extension. Supports .gz compressed files.

Parameters:
  • infile (Path | str) – Input file in PDB or CIF format. Supported extensions: .pdb, .cif, .mmcif, and their .gz variants.

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Error log on failure, empty list on success.

Return type:

list[str]

Raises:
  • RuntimeError – If RCSBROOT environment variable is not set.

  • ValueError – If the file extension is not supported.
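The extension-based dispatch described above can be sketched as follows. This is an illustration under the documented rules (.pdb, .cif, .mmcif and their .gz variants), not the module's actual code. The name pick_converter is hypothetical, and it returns the name of the target function instead of calling it.

```python
from pathlib import Path

PDB_SUFFIXES = {".pdb"}
CIF_SUFFIXES = {".cif", ".mmcif"}


def pick_converter(infile):
    """Return the name of the converter matching the file extension.

    Hypothetical helper; raises ValueError for unsupported extensions.
    """
    suffixes = [s.lower() for s in Path(infile).suffixes]
    # Ignore a trailing .gz so that model.pdb.gz is treated like model.pdb.
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    fmt = suffixes[-1] if suffixes else ""
    if fmt in PDB_SUFFIXES:
        return "pdb2modelcif"
    if fmt in CIF_SUFFIXES:
        return "cif2modelcif"
    raise ValueError(f"Unsupported file extension: {infile}")
```

Looking only at the last one or two suffixes keeps names such as model.v2.pdb.gz working.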

modelarchive.tools.maxit.fixing_pdb2mmcif(pdb_as_string, outfile)

Convert a PDB legacy format string to mmCIF, fixing known issues.

Adds missing chain names if necessary before conversion. Only returns log messages upon failure.

Parameters:
  • pdb_as_string (str) – Input file content in PDB legacy format.

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
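To illustrate what "adding missing chain names" can mean for PDB legacy format, here is a minimal sketch. It fills a blank chain identifier (column 22 of coordinate records in the PDB format) with a default value. Which chain name the module actually assigns, and which records it touches, is not documented here; the default "A", the record selection, and the function name add_missing_chain_ids are all assumptions.

```python
# Record types that carry a chain identifier in column 22 (index 21).
CHAIN_RECORDS = ("ATOM  ", "HETATM", "ANISOU", "TER   ")


def add_missing_chain_ids(pdb_as_string, chain_id="A"):
    """Return PDB text with blank chain identifiers replaced by chain_id.

    Hypothetical sketch; the real fixing_pdb2mmcif() may work differently.
    """
    fixed = []
    for line in pdb_as_string.splitlines(keepends=True):
        if line.startswith(CHAIN_RECORDS) and len(line) > 21 and line[21] == " ":
            # Replace exactly one character so the fixed columns stay aligned.
            line = line[:21] + chain_id + line[22:]
        fixed.append(line)
    return "".join(fixed)
```

Lines that already carry a chain identifier, and all other record types, pass through unchanged.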

modelarchive.tools.maxit.main()

Entry point for the ma-maxit command line tool.

modelarchive.tools.maxit.pdb2cif(infile, outfile)

Convert a PDB legacy format file to CIF using MAXIT.

Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (also as gzip).

  • outfile (Path | str) – Output CIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.

modelarchive.tools.maxit.pdb2mmcif(infile, outfile)

Convert a PDB legacy format file to mmCIF using MAXIT.

Runs MAXIT first in PDB to CIF mode, then converts the result to mmCIF. Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (also as gzip).

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

On failure, the first element indicates which conversion step failed.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
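The two-step conversion described above can be sketched roughly as follows. The runner is passed in as run_step and is assumed to behave like run_maxit_log2list(), returning a (log, exit status) pair. The temporary directory, the wording of the step labels, and the name two_step_convert are assumptions made for this illustration.

```python
import tempfile
from pathlib import Path


def two_step_convert(run_step, infile, outfile):
    """Convert PDB to mmCIF via an intermediate CIF file.

    run_step(infile, outfile, mode) must return (log_lines, exit_status).
    Returns an empty list on success; on failure the first element names
    the step that failed. Hypothetical sketch, not the module's code.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        intermediate = Path(tmpdir) / "intermediate.cif"
        # Step 1: PDB legacy format to CIF (MAXIT mode "1").
        log, status = run_step(infile, intermediate, "1")
        if status != 0:
            return ["PDB to CIF step failed"] + log
        # Step 2: CIF to mmCIF (MAXIT mode "8").
        log, status = run_step(intermediate, outfile, "8")
        if status != 0:
            return ["CIF to mmCIF step failed"] + log
    return []
```

Injecting the runner keeps the sketch testable without a MAXIT installation.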

modelarchive.tools.maxit.pdb2modelcif(infile, outfile)

Convert a PDB legacy format file into a minimalist ModelCIF file.

Fixes known issues before conversion. Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (also as gzip).

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Error log on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.

modelarchive.tools.maxit.run_maxit(infile, outfile, mode, logfile=None)

Run MAXIT without checks, mode-preselection, or cleanup.

Parameters:
  • infile (Path | str) – Input file. Either PDB legacy format or CIF.

  • outfile (Path | str) – Output file.

  • mode (str) – MAXIT operation mode. Use "1" for PDB to CIF, "2" for CIF to PDB, "8" for CIF to mmCIF.

  • logfile (Path | str, optional) – File for MAXIT log messages.

Returns:

Result of the MAXIT run.

Return type:

subprocess.CompletedProcess
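For orientation, the parameters map naturally onto a MAXIT command line. The sketch below builds such a command without running it, assuming the -input, -output, -o and -log options of the RCSB maxit binary. How run_maxit() assembles its call internally is not shown in this documentation, and build_maxit_command is a hypothetical name.

```python
def build_maxit_command(infile, outfile, mode, logfile=None, binary="maxit"):
    """Return the argument list for a MAXIT call.

    Hypothetical helper; performs no checks, mirroring run_maxit().
    """
    cmd = [binary, "-input", str(infile), "-output", str(outfile), "-o", str(mode)]
    if logfile is not None:
        cmd += ["-log", str(logfile)]
    return cmd


# The list can be handed to subprocess.run(), for example:
# import subprocess
# result = subprocess.run(build_maxit_command("in.pdb", "out.cif", "1"),
#                         capture_output=True, check=False)
```

Keeping the command as a list avoids shell quoting issues with unusual file names.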

modelarchive.tools.maxit.run_maxit_log2list(infile, outfile, mode)

Run MAXIT and return the log file content as a list.

Parameters:
  • infile (Path | str) – Input file. Either PDB legacy format or CIF (also as gzip).

  • outfile (Path | str) – Output file. If the filename ends with .gz or .gzip, the output will be compressed.

  • mode (str) – MAXIT operation mode. Use "1" for PDB to CIF, "2" for CIF to PDB, "8" for CIF to mmCIF.

Returns:

A tuple of the log file content as a list of strings and the MAXIT exit status.

Return type:

tuple[list[str], int]
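Several functions in this module compress their output when the file name ends in .gz or .gzip, and return log files as lists of lines. A minimal sketch of both ideas follows, using only the Python standard library. The helpers open_text and read_log are hypothetical, and the module's own implementation may differ.

```python
import gzip


def open_text(path, mode="rt"):
    """Open a text file, using gzip when the name ends in .gz or .gzip."""
    if str(path).endswith((".gz", ".gzip")):
        return gzip.open(path, mode, encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def read_log(path):
    """Return the lines of a (possibly compressed) log file as a list."""
    with open_text(path) as handle:
        return handle.read().splitlines()
```

Deciding by file name alone is simple, but it means a compressed file with an unexpected suffix would be read as plain text.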

MolScript (modelarchive.tools.molscript)

Convenience wrapper for MolScript and Raster3D image generation.

MolScript is a tool to create images of molecular structures with the help of Raster3D. While the system creates nice “out-of-the-box” images, the workflow is a bit involved. This module therefore exists as a convenience wrapper, producing (opinionated) images from a single function or command line call.

MolScript and Raster3D are not bundled with this module. Source code and installation instructions are available on the project web pages and/or in the source code distributions.

This module can also be used on the command line as ma-make-image.

modelarchive.tools.molscript.MOLAUTO_BINARY = 'molauto'

Path to the molauto binary, defaults to molauto from $PATH.

Can be overridden by setting the MOLAUTO_BINARY environment variable before import.

modelarchive.tools.molscript.MOLSCRIPT_BINARY = 'molscript'

Path to the molscript binary, defaults to molscript from $PATH.

Can be overridden by setting the MOLSCRIPT_BINARY environment variable before import.

modelarchive.tools.molscript.RENDER_BINARY = 'render'

Path to the render (Raster3D) binary, defaults to render from $PATH.

Can be overridden by setting the RENDER_BINARY environment variable before import.

modelarchive.tools.molscript.coordfile2image(input_file, png_path, colour_scheme=None, img_size=400)

Create a 2D image for a CIF/PDB file (gzip allowed).

Parameters:
  • input_file (Path | str) – Path to the input PDB or mmCIF file. Gzip-compressed files are supported (extensions .pdb.gz, .pdb.gzip, .cif.gz, .cif.gzip, .mmcif.gz, .mmcif.gzip).

  • png_path (Path | str) – Path to the output PNG file.

  • colour_scheme (str, optional) – Colour scheme to apply. Currently supported: "chain". Defaults to None.

  • img_size (int, optional) – Edge length of the square image in pixels. Defaults to 400.

Returns:

None

Raises:
  • RuntimeError – If image creation fails (PNG missing or too small after rendering) or if the MolScript script is corrupted.

  • ValueError – If the file extension of input_file is not supported.
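The "PNG missing or too small" condition can be pictured as a simple post-render check. The sketch below is an illustration only: the size threshold of 100 bytes and the function name png_looks_valid are assumptions, and the additional signature check goes beyond what is documented for coordfile2image().

```python
from pathlib import Path

# First eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_looks_valid(png_path, min_bytes=100):
    """Return True if the file exists, is large enough, and starts like a PNG.

    Hypothetical check; min_bytes is an arbitrary example threshold.
    """
    path = Path(png_path)
    if not path.is_file():
        return False
    if path.stat().st_size < min_bytes:
        return False
    with path.open("rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
```

A caller could raise RuntimeError when this returns False, matching the behaviour described above.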

modelarchive.tools.molscript.main()

Entry point for the ma-make-image command line tool.

modelarchive.tools.molscript.run_molauto(input_pdb, options=None)

Execute molauto without checks or cleanup.

Parameters:
  • input_pdb (Path | str) – Path to the input PDB file.

  • options (list[str], optional) – Additional command line options passed to molauto. Defaults to None.

Returns:

Result of the molauto run.

Return type:

subprocess.CompletedProcess

Raises:

subprocess.CalledProcessError – If molauto exits with a non-zero return code.

modelarchive.tools.molscript.run_molscript(script, options=None)

Execute molscript without checks or cleanup.

Parameters:
  • script (list[str]) – Lines of the MolScript input script.

  • options (list[str], optional) – Additional command line options passed to molscript. Defaults to None.

Returns:

Result of the molscript run.

Return type:

subprocess.CompletedProcess

Raises:

subprocess.CalledProcessError – If molscript exits with a non-zero return code.

modelarchive.tools.molscript.run_render(script_stdout, png_path, options=None)

Execute render without checks or cleanup.

Parameters:
  • script_stdout (bytes) – stdout of a molscript run.

  • png_path (Path | str) – Path to the output PNG file.

  • options (list[str], optional) – Additional command line options passed to render. Defaults to None.

Returns:

Result of the render run.

Return type:

subprocess.CompletedProcess

Raises:

subprocess.CalledProcessError – If render exits with a non-zero return code.
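Judging from the parameter types, the three low-level functions appear to chain into a pipeline: molauto produces a script, molscript turns it into Raster3D input, and render writes the PNG. The sketch below wires them together with the runners passed in as arguments. It assumes that the molauto result carries the script as bytes on stdout, which this documentation does not state explicitly, and the name make_image is hypothetical.

```python
def make_image(input_pdb, png_path, run_molauto, run_molscript, run_render):
    """Chain molauto, molscript and render into one call.

    The three runners are expected to follow the signatures documented
    above. Hypothetical sketch; coordfile2image() adds checks, colour
    schemes and cleanup on top of this.
    """
    # Assumption: molauto writes the MolScript script to stdout as bytes.
    auto_result = run_molauto(input_pdb)
    script_lines = auto_result.stdout.decode().splitlines()
    # molscript consumes the script lines; its stdout feeds render.
    scene_result = run_molscript(script_lines)
    return run_render(scene_result.stdout, png_path)


# With the real functions (requires MolScript and Raster3D):
# from modelarchive.tools import molscript as ms
# make_image("model.pdb", "model.png",
#            ms.run_molauto, ms.run_molscript, ms.run_render)
```

For everyday use, coordfile2image() is the more convenient entry point. The low-level functions mainly help when custom options need to be passed to one of the tools.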