CZ Biohub's ESMFold2 and ESM3 produce high-quality structure predictions, but pLDDT and pTM measure only the model's confidence. They do not check rotamer geometry, clashscores, backbone validity, or docking readiness. StructSure fills that gap.
from esm.sdk.forge import SequenceStructureForgeInferenceClient

client = SequenceStructureForgeInferenceClient(
    model="esmfold2-fast-2026-05",
    url="https://biohub.ai",
    token="<your Biohub API token>",
)
result = client.fold(sequence="MKTAYIAKQRQISFVKSHFSRQ...")
result.to_pdb("my_protein.pdb")
Upload to Validate in the app (free). You'll get a safe_for_docking flag,
clash count, gap list, and missing atom count before spending credits on cleanup.
# Fills missing atoms, normalises chains, reports gaps
pdbprep submit \
  --recipe structure_cleanup \
  --file my_protein.pdb \
  --output cleaned.pdb

pdbprep submit \
  --recipe docking_ready_export \
  --file my_protein.pdb \
  --options '{"target_service": "colabdock"}' \
  --output docking_ready.pdb
The colabdock preset strips hydrogens, removes OXT terminal oxygens,
renumbers residues from 1 per chain, reorders atom serials, and
auto-detects VHH / nanobody chains to put the antigen first.
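
To make the preset's transformations concrete, here is a minimal local sketch of four of them: stripping hydrogens, dropping OXT, renumbering residues from 1 per chain, and reassigning atom serials. It is an illustration built on the fixed-column PDB format, not StructSure's implementation. The function name `normalise_for_docking` is hypothetical, and the sketch leaves out the VHH / nanobody detection and chain reordering.

```python
def normalise_for_docking(pdb_text: str) -> str:
    """Strip hydrogens and OXT, renumber residues from 1 per chain, reserialise atoms.

    Hypothetical helper for illustration only; works on fixed-column PDB records.
    """
    out = []
    serial = 0
    new_resnum = 0
    prev_chain = None
    prev_res_key = None
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # this sketch keeps coordinate records only
        line = line.ljust(80)
        atom_name = line[12:16].strip()
        element = line[76:78].strip().upper()
        # Prefer the element column; fall back to the atom name when it is blank.
        if element:
            is_hydrogen = element in ("H", "D")
        else:
            is_hydrogen = atom_name.lstrip("0123456789").startswith("H")
        if is_hydrogen or atom_name == "OXT":
            continue
        chain = line[21]
        res_key = (chain, line[22:27])  # residue number plus insertion code
        if chain != prev_chain:
            new_resnum = 0  # restart numbering for each chain
            prev_chain = chain
        if res_key != prev_res_key:
            new_resnum += 1
            prev_res_key = res_key
        serial += 1
        # Rewrite serial (cols 7-11) and residue number (cols 23-26); blank the insertion code.
        out.append(
            f"{line[:6]}{serial:>5}{line[11:22]}{new_resnum:>4} {line[27:]}".rstrip()
        )
    return "\n".join(out) + "\n"
```

Running this on a file before upload is not a substitute for the recipe. It only shows what "renumbered and reserialised" means at the record level.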
| Issue | pLDDT / pTM | StructSure |
|---|---|---|
| Backbone gaps (missing residues) | ✗ | ✓ detected + reported |
| Missing heavy atoms | ✗ | ✓ filled where safe |
| Steric clashes | ✗ | ✓ counted |
| Non-standard chain IDs | ✗ | ✓ normalised |
| Alternate location records | ✗ | ✓ resolved |
| Insertion codes | ✗ | ✓ removed |
| Non-sequential residue numbering | ✗ | ✓ renumbered |
| Hydrogen atoms (docking incompatibility) | ✗ | ✓ stripped |
| Safe-for-docking verdict | ✗ | ✓ explicit flag |
| Goal | Recipe | Cost |
|---|---|---|
| Quick sanity check | validate_only | Free |
| Feed into ABB3 or RFDiffusion | structure_cleanup | ~$0.15 / structure |
| ColabDock / ClusPro / ZDOCK | docking_ready_export + colabdock preset | ~$0.15 / structure |
| GROMACS / AMBER MD | md_preparation | ~$0.20 / structure |
| Screen 100+ ESM3 designs | Batch validate → cleanup passing only | Free to triage |
ESMFold2 folds a single sequence to all-atom coordinates. Output is generally clean but may have missing side-chain atoms in low-confidence regions (pLDDT < 70) and occasionally non-standard chain labelling. structure_cleanup handles both.
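
Low-confidence regions can be located locally before upload. The sketch below assumes the fold output stores per-residue pLDDT in the PDB B-factor column on a 0-100 scale. That is a common convention for predicted structures, but this page does not state it for these models. If the values turn out to be on a 0-1 scale, pass `threshold=0.70` instead. The function name `low_confidence_residues` is hypothetical.

```python
def low_confidence_residues(pdb_text: str, threshold: float = 70.0):
    """Return (chain, residue number, residue name) for residues below the threshold.

    Assumption: the B-factor column (cols 61-66) of each CA atom holds pLDDT.
    """
    flagged = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            plddt = float(line[60:66])
            if plddt < threshold:
                flagged.append((line[21], int(line[22:26]), line[17:20].strip()))
    return flagged
```

Residues returned here are the ones most likely to show up with missing side-chain atoms in the validation report.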
ESM3 is a generative model that can hallucinate structurally plausible but geometrically strained regions, especially when conditioning on partial functional motifs. Run validate_only first to triage candidates before paying for cleanup: the free validation report tells you which structures are worth cleaning.
When screening ESM3 designs, validate all candidates in one batch, filter on
safe_for_docking == true and clash_count == 0,
then clean only the structures that pass.
import pathlib
import zipfile

import requests

# Zip your ESM3 outputs
with zipfile.ZipFile("candidates.zip", "w") as zf:
    for pdb in pathlib.Path("esm3_outputs").glob("*.pdb"):
        zf.write(pdb, pdb.name)

# Submit batch validate (free)
r = requests.post(
    "https://api.structsure.bio/v1/batches",
    json={
        "recipe_id": "validate_only",
        "recipe_version": "1.0.0",
        "input_format": "zip",
        "batch_mode": {"file_glob": "**/*.pdb", "stop_on_error": False},
    },
)
r.raise_for_status()
batch = r.json()

# Upload the archive to the returned URL, then start the batch
with open("candidates.zip", "rb") as fh:
    requests.put(batch["upload"]["url"], data=fh).raise_for_status()
requests.post(
    f"https://api.structsure.bio/v1/batches/{batch['batch_id']}/submit"
).raise_for_status()

# Poll until done, download batch_report.csv
# Filter: safe_for_docking == true, clash_count == 0
# Submit passing structures for structure_cleanup
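
The filtering step can be sketched as follows once batch_report.csv has been downloaded. The column names `safe_for_docking` and `clash_count` mirror the fields named above. The `filename` column, the exact CSV layout, and the function name `passing_structures` are assumptions, so check them against a real report before relying on this.

```python
import csv
import io


def passing_structures(report_csv: str) -> list[str]:
    """Return the files flagged safe for docking with zero clashes.

    Assumed columns: filename, safe_for_docking, clash_count.
    """
    passing = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        safe = row["safe_for_docking"].strip().lower() == "true"
        clashes = int(row["clash_count"])
        if safe and clashes == 0:
            passing.append(row["filename"])
    return passing
```

For example, `passing_structures(pathlib.Path("batch_report.csv").read_text())` would give the list of candidates to submit for structure_cleanup.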
StructSure processes files ephemerally. Input files are deleted immediately after processing begins; output files are available for 30–60 minutes, then deleted. No user accounts or file contents are retained — only SHA-256 hashes and metrics for reproducibility.
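
If you want a matching record on your side, you can hash each file locally before upload. This sketch assumes the retained SHA-256 is computed over the raw bytes of the uploaded file, which the text above does not specify. The function name `sha256_of_file` is hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Storing the digest next to each design lets you tie a metrics record back to the exact file after the outputs have been deleted.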
Your sequences stay within Biohub's infrastructure. Fold with your own Biohub token, then upload only the resulting PDB to StructSure. StructSure never sees your sequence.