Backtracking Algorithms for Structural Isomer Generation
This project explores how computer science and chemistry intersect by attempting to enumerate all structural isomers for a given molecular formula. What began as a curiosity—why a formula like C₅H₅NO₂ could have hundreds of isomers—became a full summer project in algorithm design, graph theory, and chemical reasoning.
Traditional AP Chemistry approaches (drawing Lewis structures by hand) quickly break down: even simple molecules like acetic acid require pages of diagrams. This repository contains my attempt to automate that process with recursive backtracking, graph construction, and chemical validity checks.
In AP Chemistry, I learned that calculating the number of isomers is difficult and often impossible by hand. When software I was using unexpectedly returned 672 isomers for a seemingly simple formula, I realized how complex this problem truly is.
I initially believed CS and chemistry had little overlap. But as I implemented bond rules, valence constraints, and stability heuristics, I found myself merging discrete math, graph theory, and chemical intuition into one workflow.
Generates candidate molecular graphs that satisfy valence rules.
Encodes relative bond strengths and stability checks to restrict graph generation to chemically reasonable structures.
Inspired by set theory:
- Converts each molecular graph into a unique character sequence
- Eliminates duplicate structures
- Dramatically improves runtime
Atoms are nodes; bonds are edges. Validity is enforced during graph construction.
Despite successful generation for smaller formulas, the project faces several inherent difficulties:
- Algorithmic completeness: Ensuring all valid isomers are generated is still unsolved.
- Chemical edge cases: Some molecules (e.g., malic acid, C₄H₅O₆ variants) did not appear—likely due to missing constraints or pruning that is too aggressive.
- Computational explosion: The search space grows extremely quickly with molecular size.
Improving correctness, not just performance, is the next major step.
Here is a clean, polished README.md containing all the sections you asked for — project name, project description, usage, local launch instructions, and issues encountered.
Backtracking-Based Structural Isomer Generator
This project attempts to algorithmically enumerate all structural isomers for a given molecular formula by combining concepts from chemistry, graph theory, and computer science.
The idea originated after discovering that the formula C₅H₅NO₂ had 672 isomers, far more than expected. Manual attempts using Lewis diagrams quickly became impractical, so this project explores how recursion, backtracking, and chemical constraints can be used to systematically construct all valid molecular structures.
The program generates molecular graphs, checks chemical validity (valence rules, bond strengths, and stability heuristics), and uses an injective encoding function to remove duplicates efficiently.
This work sits at the intersection of:
- Chemistry (valence, bond types, structural rules)
- Computer Science (recursion, backtracking, graph theory)
- Discrete Math (injective functions, combinatorial logic)
The core function takes a formula in the form:
"CO2"
You can modify this dictionary to represent any molecular formula you want to analyze.
The generator builds and explores candidate molecular graphs using:
- recursive backtracking
- valence-based pruning
- duplicate elimination via unique encoding
The output is returned as a Python list. Each element is a dictionary describing one unique isomer, including:
- atom positions
- bond positions
- lone pair positions
- atom-level information (hybridization, VSEPR, bond angles, formal charge)
- molecule-level metadata (polarity, σ/π bonds, molar mass)
These objects are interpreted directly by the frontend’s canvas renderer.
{
"0": {
"atoms": {
"500.0,300": "C",
"435.0,300": "O",
"565.0,300": "O"
},
"bonds": {
"467.0,300": "=-",
"532.0,300": "=-"
},
"lonePairs": {
"416.25,300.0": ":-",
"435.0,281.25": ":|",
"583.75,300.0": ":-",
"565.0,281.25": ":|"
},
"atomsInfo": {
"500.0,300": {
"Hybridization": "sp1",
"VSEPR": "linear",
"bond angles": "180°",
"formal charge": "0"
},
"435.0,300": {
"Hybridization": "sp2",
"VSEPR": "N/A",
"bond angles": "N/A",
"formal charge": "0"
},
"565.0,300": {
"Hybridization": "sp2",
"VSEPR": "N/A",
"bond angles": "N/A",
"formal charge": "0"
}
},
"molInfo": {
"molar mass": "44.01g",
"polarity": "non-polar",
"σ bonds": "2",
"π bonds": "2"
}
}
}
}git clone https://github.com/jefferyz2008/Isomer_Enumerator_Backend
cd Isomer_Enumerator_BackendIf you use virtual environments:
python3 -m venv venv
source venv/bin/activateInstall requirements:
pip install -r requirements.txtFrom the repo root:
python3 src/main.pyModify main.py to input any molecular formula you want.
- Python
- Custom graph structures
- Backtracking algorithms
- Symmetry and duplicate-elimination utilities
- Formal proof-based correctness checks
- Improved chemical rule set (e.g., resonance, tautomers)
- Caching & memoization to reduce search depthepth
- Visualization layer for generated structures
- Integration with cheminformatics libraries (RDKit)
The program can generate hundreds of valid isomers for formulas like C₅H₅NO₂, far beyond what is possible by hand.
Here’s the updated section with a link to both the backend and frontend repositories:
You can explore the full codebases here:
- Backend (Isomer Enumerator Backend): https://github.com/jefferyz2008/Isomer_Enumerator_Backend
- Frontend (Isomer Enumerator Frontend): https://github.com/jefferyz2008/Isomer-Enumerator-Frontend