A high-performance, pure Rust toolkit for standardizing and preparing biomolecular systems (proteins & nucleic acids). It heals missing atoms, resolves protonation states, adds solvation, and unifies topologies to forge simulation-ready structures.

TKanX/bio-forge

BioForge

BioForge is a pure-Rust toolkit for automated preparation of biological macromolecules. It reads experimental structures (PDB/mmCIF), reconciles them with high-quality residue templates, repairs missing atoms, assigns hydrogens and termini, builds topologies, and optionally solvates the system with water and ions—all without leaving the Rust type system.

Highlights

  • Template-driven accuracy – Curated TOML templates for standard amino acids, nucleotides, and water guarantee reproducible coordinates, charges, and bonding.
  • High performance – Multithreaded processing (via rayon) handles million-atom systems in milliseconds; single-pass parsing, in-place mutation, and zero-copy serialization minimize overhead.
  • Rich structure model – Lightweight Atom, Residue, Chain, and Structure types backed by nalgebra make geometric operations trivial.
  • Format interoperability – Buffered readers/writers for PDB, mmCIF, and MOL2 plus error types that surface precise parsing diagnostics.
  • Preparation pipeline – Cleaning, repairing, protonating, solvation, coordinate transforms, and topology reconstruction share a common ops::Error so workflows compose cleanly.
  • WebAssembly support – Full-featured WASM bindings for modern JavaScript bundlers (Vite, webpack, Rollup); ideal for browser-based molecular viewers and web applications.
  • Rust-first ergonomics – No FFI, no global mutable state beyond the lazily-loaded template store, and edition 2024 guarantees modern language features.
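
To illustrate why a shared error type lets preparation steps compose, here is a minimal, self-contained sketch. It does not use BioForge itself: `PrepError`, `Structure`, `clean`, `repair`, and `prepare` are hypothetical stand-ins invented for this example, and the real `ops::Error` and its variants may look quite different.

```rust
use std::fmt;

// Hypothetical error type standing in for a shared pipeline error.
// This is NOT bio-forge's actual ops::Error.
#[derive(Debug, PartialEq)]
enum PrepError {
    EmptyStructure,
    MissingTemplate(String),
}

impl fmt::Display for PrepError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PrepError::EmptyStructure => write!(f, "structure has no residues"),
            PrepError::MissingTemplate(name) => write!(f, "no template for residue {name}"),
        }
    }
}

impl std::error::Error for PrepError {}

// Hypothetical structure: just a list of residue names.
#[derive(Debug)]
struct Structure {
    residues: Vec<String>,
}

// Stage 1 (illustrative): drop waters, fail if nothing is left.
fn clean(s: &mut Structure) -> Result<(), PrepError> {
    s.residues.retain(|r| r != "HOH");
    if s.residues.is_empty() {
        return Err(PrepError::EmptyStructure);
    }
    Ok(())
}

// Stage 2 (illustrative): every residue must have a known template.
fn repair(s: &mut Structure) -> Result<(), PrepError> {
    const KNOWN: [&str; 3] = ["ALA", "GLY", "SER"];
    for r in &s.residues {
        if !KNOWN.contains(&r.as_str()) {
            return Err(PrepError::MissingTemplate(r.clone()));
        }
    }
    Ok(())
}

// Because both stages return the same error type, `?` chains them
// without any per-stage error conversion.
fn prepare(s: &mut Structure) -> Result<(), PrepError> {
    clean(s)?;
    repair(s)?;
    Ok(())
}

fn main() {
    let mut s = Structure {
        residues: vec!["ALA".into(), "HOH".into(), "GLY".into()],
    };
    match prepare(&mut s) {
        Ok(()) => println!("prepared {} residues", s.residues.len()),
        Err(e) => eprintln!("preparation failed: {e}"),
    }
}
```

The point of the design is that a caller writes one `Result` return type and one `match`, however many stages the workflow chains together.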

Processing Pipeline

Load → Clean → Repair → Hydrogenate → Solvate → Topology → Write

  1. Load – io::read_pdb_structure or io::read_mmcif_structure parses coordinates with IoContext alias resolution.
  2. Clean – ops::clean_structure removes waters, ions, hetero residues, or arbitrary residue names via CleanConfig.
  3. Repair – ops::repair_structure realigns residues to templates and rebuilds missing heavy atoms (OXT on C-termini, OP3 on 5'-phosphorylated nucleic acids).
  4. Hydrogenate – ops::add_hydrogens infers protonation states (configurable pH, histidine strategy, and salt bridge detection) and reconstructs hydrogens from template anchors.
  5. Solvate – ops::solvate_structure creates a periodic box, packs water on a configurable lattice, and swaps molecules for ions to satisfy a target charge.
  6. Topology – ops::TopologyBuilder emits bond connectivity with peptide-link detection, nucleic backbone connectivity, and disulfide heuristics.
  7. Write – io::write_pdb_structure / io::write_mmcif_structure serialize the processed structure; write_*_topology helpers emit CONECT or struct_conn records.
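
As a rough illustration of what a distance-based disulfide heuristic in the Topology step could look like, the sketch below pairs cysteine SG atoms that lie within a cutoff. It is not BioForge's implementation: the `Atom` record, `find_disulfides`, and the 2.2 Å cutoff are assumptions made for this example (an S-S bond is typically about 2.05 Å), and the library's real criteria may differ.

```rust
// Hypothetical minimal atom record; not bio-forge's Atom type.
#[derive(Debug, Clone)]
struct Atom {
    residue_index: usize,
    residue_name: &'static str,
    name: &'static str,
    pos: [f64; 3], // Cartesian coordinates in Å
}

// Euclidean distance between two points.
fn distance(a: [f64; 3], b: [f64; 3]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

// Assumed cutoff: slightly above the typical ~2.05 Å S-S bond length.
const SS_CUTOFF: f64 = 2.2;

// Return residue-index pairs whose CYS SG atoms are within the cutoff.
fn find_disulfides(atoms: &[Atom]) -> Vec<(usize, usize)> {
    let sg: Vec<&Atom> = atoms
        .iter()
        .filter(|a| a.residue_name == "CYS" && a.name == "SG")
        .collect();

    let mut pairs = Vec::new();
    for (i, a) in sg.iter().enumerate() {
        for b in &sg[i + 1..] {
            if a.residue_index != b.residue_index && distance(a.pos, b.pos) <= SS_CUTOFF {
                pairs.push((a.residue_index, b.residue_index));
            }
        }
    }
    pairs
}

fn main() {
    let atoms = vec![
        Atom { residue_index: 3, residue_name: "CYS", name: "SG", pos: [0.0, 0.0, 0.0] },
        Atom { residue_index: 40, residue_name: "CYS", name: "SG", pos: [2.05, 0.0, 0.0] },
        Atom { residue_index: 58, residue_name: "CYS", name: "SG", pos: [10.0, 0.0, 0.0] },
    ];
    println!("{:?}", find_disulfides(&atoms));
}
```

The all-pairs loop is quadratic in the number of SG atoms, which is fine for the handful of cysteines in a typical protein; a spatial grid would be the usual replacement at larger scales.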

Quick Start

For CLI Users

Install the latest BioForge CLI binary from the releases page or via cargo:

cargo install bio-forge

Once the bioforge binary is installed, you can repair a structure in a single step:

bioforge repair -i input.pdb -o repaired.pdb

Explore the complete preparation pipeline in the user manual and browse the examples directory for runnable walkthroughs.

For Library Developers (Rust)

BioForge is also available as a library crate. Add it to your Cargo.toml dependencies:

[dependencies]
bio-forge = "0.4.0"

Example: Preparing a PDB Structure

use std::{
    fs::File,
    io::{BufReader, BufWriter},
};

use bio_forge::{
    io::{read_pdb_structure, write_pdb_structure, write_pdb_topology, IoContext},
    ops::{
        add_hydrogens, clean_structure, repair_structure, solvate_structure, CleanConfig,
        HydroConfig, SolvateConfig, TopologyBuilder,
    },
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load: parse the input PDB with the default alias-resolution context.
    let ctx = IoContext::new_default();
    let input = BufReader::new(File::open("input.pdb")?);
    let mut structure = read_pdb_structure(input, &ctx)?;

    // Clean, repair, hydrogenate, and solvate in place.
    clean_structure(&mut structure, &CleanConfig::water_only())?;
    repair_structure(&mut structure)?;
    add_hydrogens(&mut structure, &HydroConfig::default())?;
    solvate_structure(&mut structure, &SolvateConfig::default())?;

    // Build bond connectivity from a copy of the prepared structure.
    let topology = TopologyBuilder::new().build(structure.clone())?;

    // Write the coordinates and the topology to separate files.
    write_pdb_structure(BufWriter::new(File::create("prepared.pdb")?), &structure)?;
    write_pdb_topology(BufWriter::new(File::create("prepared-topology.pdb")?), &topology)?;
    Ok(())
}

Prefer mmCIF? Swap in read_mmcif_structure / write_mmcif_structure. Need to process ligands? Parse them via io::read_mol2_template and feed the resulting Template into TopologyBuilder::add_hetero_template.

For Library Developers (JavaScript/TypeScript)

Install via npm:

npm install bio-forge-wasm

Prepare a structure with the following code:

import { Structure } from "bio-forge-wasm";

const pdb = await fetch("https://files.rcsb.org/view/1UBQ.pdb").then((r) =>
  r.text()
);
const structure = Structure.fromPdb(pdb);

structure.clean({ removeWater: true });
structure.repair();
structure.addHydrogens({ hisStrategy: "network" });

const topology = structure.toTopology();
console.log(`Bonds: ${topology.bondCount}`);

Documentation

| Resource | Description |
| --- | --- |
| CLI Manual | Command-line usage and options |
| JS/TS API | WebAssembly bindings reference |
| Rust API | Library documentation |
| Architecture | Internal design and algorithms |
| Examples | Runnable walkthroughs |

License

This project is licensed under the MIT License - see the LICENSE file for details.
