Tool Wrappers (modelarchive.tools)

The modelarchive.tools submodule provides wrappers around external tools used in ModelArchive workflows, along with convenience functions for common use cases. Detailed documentation is available for each individual tool wrapper.

MAXIT (modelarchive.tools.maxit)

MAXIT from RCSB converts coordinate files in PDB legacy format to CIF, and CIF files to mmCIF. This module also adds functionality to turn a PDB file into a minimal ModelCIF file. But don’t get too excited: none of this turns a PDB file into a fully annotated ModelCIF file. It only makes sure the starting point is syntactically valid CIF. Additional data still needs to be added afterwards.

MAXIT is not bundled with this module. The source code and installation instructions are available from RCSB. In short, to compile on macOS and most Linux distributions:

# cd into the unpacked source directory first
export RCSBROOT=$(pwd)
make
make binary
# binaries are found in bin/
# RCSBROOT needs to point at data/ when running maxit

modelarchive.tools.maxit.MAXIT_BINARY = 'maxit'

Path to the maxit binary. Defaults to maxit as found on $PATH.

Can be overridden by setting the MAXIT_BINARY environment variable before import.
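The override described above can be pictured with a minimal sketch. It assumes the lookup is a plain environment read with a fallback; `resolve_maxit_binary` is a hypothetical helper for illustration, not part of modelarchive.tools.maxit.

```python
import os


def resolve_maxit_binary(environ=None):
    """Return the MAXIT_BINARY override if present, else the bare command name.

    Hypothetical stand-in for the lookup the module performs at import time.
    """
    env = os.environ if environ is None else environ
    # Fall back to "maxit", which is then resolved through $PATH when executed.
    return env.get("MAXIT_BINARY", "maxit")


# The path below is only an example value, not a location the module expects.
print(resolve_maxit_binary({"MAXIT_BINARY": "/opt/maxit/bin/maxit"}))
print(resolve_maxit_binary({}))
```

Because the value is read at import time, the variable has to be set in the environment before modelarchive.tools.maxit is imported for the override to take effect.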

modelarchive.tools.maxit.cif2mmcif(infile, outfile)[source]

Convert a CIF file to mmCIF using MAXIT.

Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input CIF file (may be gzip-compressed).

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
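Since failure is signalled through a non-empty list rather than an exception, callers may want to turn that into an error themselves. The following is a minimal sketch of one way to do so. `convert_or_raise` and `fake_converter` are hypothetical names used for illustration; the converter is passed in as an argument so the sketch runs without MAXIT installed.

```python
def convert_or_raise(converter, infile, outfile):
    """Call a converter that returns a log list and raise if the list is non-empty."""
    log = converter(infile, outfile)
    if log:
        # A non-empty list means the converter reported a failure.
        raise RuntimeError("MAXIT conversion failed:\n" + "\n".join(log))
    return outfile


def fake_converter(infile, outfile):
    # Stand-in that mimics a successful run by returning an empty log.
    return []


print(convert_or_raise(fake_converter, "model.cif", "model.mmcif.gz"))
```

With the package installed, the same helper could presumably be called as `convert_or_raise(cif2mmcif, "model.cif", "model.mmcif.gz")`.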

modelarchive.tools.maxit.cif2modelcif(infile, outfile)[source]

Convert a CIF file into a minimalist ModelCIF file.

Sanitizes the input and adds mandatory ModelCIF categories. The input file is expected to be in mmCIF format as produced by MAXIT.

Parameters:
  • infile (Path | str) – Input mmCIF/ModelCIF file (may be gzip-compressed).

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Path to the output file.

Return type:

Path | str

Raises:

RuntimeError – If reading or writing the CIF file fails.

modelarchive.tools.maxit.coordfile2modelcif(infile, outfile)[source]

Convert a macromolecular structure file to a minimalist ModelCIF file.

Dispatches to pdb2modelcif() or cif2modelcif() based on the file extension. Supports .gz compressed files.

Parameters:
  • infile (Path | str) – Input file in PDB or CIF format. Supported extensions: .pdb, .cif, .mmcif, and their .gz variants.

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Error log on failure, empty list on success.

Return type:

list[str]

Raises:
  • RuntimeError – If RCSBROOT environment variable is not set.

  • ValueError – If the file extension is not supported.
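The extension-based dispatch can be illustrated with a short sketch. This is an assumption about how such a dispatch might look, not the module's actual code; `pick_converter_name` is a hypothetical helper that only returns the name of the function that would be chosen.

```python
from pathlib import Path

# Extensions taken from the parameter description above.
PDB_EXTENSIONS = {".pdb"}
CIF_EXTENSIONS = {".cif", ".mmcif"}


def pick_converter_name(infile):
    """Return which converter a given filename would be dispatched to."""
    suffixes = [s.lower() for s in Path(infile).suffixes]
    # Look past a trailing .gz to find the real format extension.
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    ext = suffixes[-1] if suffixes else ""
    if ext in PDB_EXTENSIONS:
        return "pdb2modelcif"
    if ext in CIF_EXTENSIONS:
        return "cif2modelcif"
    raise ValueError(f"Unsupported file extension: {infile}")


for name in ("model.pdb", "model.cif.gz", "model.mmcif"):
    print(name, "->", pick_converter_name(name))
```

Whether the real function treats extensions case-insensitively is not stated in the documentation; the lowercasing here is a choice made for the sketch.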

modelarchive.tools.maxit.fixing_pdb2mmcif(pdb_as_string, outfile)[source]

Convert a PDB legacy format string to mmCIF, fixing known issues.

Adds missing chain names if necessary before conversion. Only returns log messages upon failure.

Parameters:
  • pdb_as_string (str) – Input file content in PDB legacy format.

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
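To give an idea of what "adding missing chain names" can mean for PDB legacy format, here is a minimal sketch. It assumes the fix amounts to filling a blank chain identifier (column 22 of ATOM/HETATM records) with a default letter. `add_missing_chain_ids` and the default "A" are hypothetical; the actual fix applied by fixing_pdb2mmcif() may differ.

```python
def add_missing_chain_ids(pdb_text, default_chain="A"):
    """Fill blank chain IDs in ATOM/HETATM records with a default chain name."""
    fixed = []
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain identifier in PDB legacy format.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21 and line[21] == " ":
            line = line[:21] + default_chain + line[22:]
        fixed.append(line)
    return "\n".join(fixed) + "\n"


# Build a truncated ATOM record column by column, with a blank chain ID.
record = ("ATOM".ljust(6) + "1".rjust(5) + " " + " N".ljust(4) + " "
          + "ALA" + " " + " " + "1".rjust(4))
print(add_missing_chain_ids(record))
```

The resulting string is the kind of value that could then be passed as `pdb_as_string`.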

modelarchive.tools.maxit.main()[source]

Entry point for the ma-maxit command line tool.

modelarchive.tools.maxit.pdb2cif(infile, outfile)[source]

Convert a PDB legacy format file to CIF using MAXIT.

Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (may be gzip-compressed).

  • outfile (Path | str) – Output CIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
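Most functions in this module raise RuntimeError when RCSBROOT is missing, so a pre-flight check can give an earlier and clearer failure. The sketch below assumes a simple environment lookup; `require_rcsbroot` is a hypothetical helper and the error text is not the module's own message.

```python
import os


def require_rcsbroot(environ=None):
    """Return the RCSBROOT value, raising RuntimeError if it is unset or empty."""
    env = os.environ if environ is None else environ
    root = env.get("RCSBROOT")
    if not root:
        raise RuntimeError("RCSBROOT environment variable is not set")
    return root


# The directory below is only an example value.
print(require_rcsbroot({"RCSBROOT": "/opt/maxit"}))
```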

modelarchive.tools.maxit.pdb2mmcif(infile, outfile)[source]

Convert a PDB legacy format file to mmCIF using MAXIT.

Runs MAXIT first in PDB to CIF mode, then converts the result to mmCIF. Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (may be gzip-compressed).

  • outfile (Path | str) – Output mmCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

MAXIT log messages on failure, empty list on success.

On failure, the first element indicates which conversion step failed.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.
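The two-step conversion and its return convention can be sketched as follows. This is an illustration under assumptions, not the module's implementation: `two_step_pdb2mmcif` is hypothetical, the runner is injected and expected to return a `(log_lines, exit_status)` tuple, failure is assumed to be a non-zero exit status, and the marker strings for the failing step are invented.

```python
import tempfile
from pathlib import Path


def two_step_pdb2mmcif(run, infile, outfile):
    """Chain PDB -> CIF (mode "1") and CIF -> mmCIF (mode "8") via a temporary file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        intermediate = Path(tmpdir) / "intermediate.cif"
        log, status = run(infile, intermediate, "1")
        if status != 0:
            # The first element names the failing step, followed by the tool's log.
            return ["PDB to CIF step failed"] + log
        log, status = run(intermediate, outfile, "8")
        if status != 0:
            return ["CIF to mmCIF step failed"] + log
    return []


def fake_run(infile, outfile, mode):
    # Stand-in runner that always succeeds with an empty log.
    return [], 0


print(two_step_pdb2mmcif(fake_run, "model.pdb", "model.cif"))
```

Injecting the runner keeps the sketch executable without MAXIT; with the package installed, run_maxit_log2list() has a matching call shape.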

modelarchive.tools.maxit.pdb2modelcif(infile, outfile)[source]

Convert a PDB legacy format file into a minimalist ModelCIF file.

Fixes known issues before conversion. Only returns log messages upon failure.

Parameters:
  • infile (Path | str) – Input file in PDB legacy format (may be gzip-compressed).

  • outfile (Path | str) – Output ModelCIF file. If the filename ends with .gz or .gzip, the output will be compressed.

Returns:

Error log on failure, empty list on success.

Return type:

list[str]

Raises:

RuntimeError – If RCSBROOT environment variable is not set.

modelarchive.tools.maxit.run_maxit(infile, outfile, mode, logfile=None)[source]

Run MAXIT without checks, mode-preselection, or cleanup.

Parameters:
  • infile (Path | str) – Input file. Either PDB legacy format or CIF.

  • outfile (Path | str) – Output file.

  • mode (str) – MAXIT operation mode. Use "1" for PDB to CIF, "2" for CIF to PDB, "8" for CIF to mmCIF.

  • logfile (Path | str, optional) – File for MAXIT log messages.

Returns:

Result of the MAXIT run.

Return type:

subprocess.CompletedProcess
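For orientation, the sketch below shows how the parameters might map onto a MAXIT command line. It assumes MAXIT's `-input`, `-output`, `-o`, and `-log` options; the exact invocation used by run_maxit() is not documented here, and `build_maxit_command` is a hypothetical helper.

```python
def build_maxit_command(infile, outfile, mode, logfile=None, binary="maxit"):
    """Assemble an argument list suitable for subprocess.run()."""
    cmd = [binary, "-input", str(infile), "-output", str(outfile), "-o", str(mode)]
    if logfile is not None:
        # The log file option is only added when a log file is requested.
        cmd += ["-log", str(logfile)]
    return cmd


# Mode "1" is PDB to CIF, as listed in the parameter description above.
print(build_maxit_command("model.pdb", "model.cif", "1", logfile="maxit.log"))
```

Such a list could be passed to `subprocess.run(cmd, capture_output=True, text=True)`, which returns the subprocess.CompletedProcess type named above.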

modelarchive.tools.maxit.run_maxit_log2list(infile, outfile, mode)[source]

Run MAXIT and return the log file content as a list.

Parameters:
  • infile (Path | str) – Input file. Either PDB legacy format or CIF (may be gzip-compressed).

  • outfile (Path | str) – Output file. If the filename ends with .gz or .gzip, the output will be compressed.

  • mode (str) – MAXIT operation mode. Use "1" for PDB to CIF, "2" for CIF to PDB, "8" for CIF to mmCIF.

Returns:

A tuple of the log file content as a list of strings and the MAXIT exit status.

Return type:

tuple[list[str], int]
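Several functions in this module compress the output when the filename ends with .gz or .gzip. A minimal sketch of that naming convention, using only the standard library, could look like this. `open_for_output` is a hypothetical helper and not necessarily how the module handles compression internally.

```python
import gzip
from pathlib import Path


def open_for_output(outfile):
    """Open a text file for writing, gzip-compressed if the name asks for it."""
    path = Path(outfile)
    if path.suffix.lower() in {".gz", ".gzip"}:
        # Text mode ("wt") lets callers write str instead of bytes.
        return gzip.open(path, "wt", encoding="utf-8")
    return open(path, "w", encoding="utf-8")
```

A caller would use it as a context manager, for example `with open_for_output("model.cif.gz") as fh: fh.write(text)`.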