Methods & reproducibility

Documentation for researchers citing VectaBind. Predictions are computational estimates for ranking and triage — not a substitute for experimental binding assays, FEP, or structure-based docking campaigns.

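Validation status: The reported MAE of 0.20 pKd units was measured on the PDBBind 2020 validation split, the same split used for model selection, so it is likely optimistic compared with performance on unseen data. Formal held-out test evaluation (e.g. CASF) is in progress. Until those results are available, use VectaBind to rank compounds against each other on the same target rather than to read off absolute affinities.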
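To put the error figure in context: pKd is the negative base-10 logarithm of Kd in molar units, so an absolute error of 0.20 pKd units corresponds to roughly a 1.6-fold error in Kd. The sketch below shows that conversion and one simple way to rank compounds within each target. The function names and the input tuple layout are illustrative assumptions, not part of the VectaBind API.

```python
import math
from collections import defaultdict


def pkd_from_kd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    return -math.log10(kd_molar)


def fold_error(pkd_error: float) -> float:
    """Multiplicative error in Kd implied by an absolute error in pKd units."""
    return 10 ** abs(pkd_error)


def rank_within_target(predictions):
    """Group (target, compound, predicted_pkd) tuples by target and return,
    for each target, the compound names sorted from strongest to weakest
    predicted binder. Scores are never compared across targets."""
    by_target = defaultdict(list)
    for target, compound, pkd in predictions:
        by_target[target].append((compound, pkd))
    return {
        target: [name for name, _ in sorted(items, key=lambda x: x[1], reverse=True)]
        for target, items in by_target.items()
    }


# Hypothetical predictions, for illustration only.
example = [
    ("EGFR", "cmpd_A", 7.1),
    ("EGFR", "cmpd_B", 8.3),
    ("HER2", "cmpd_A", 6.4),
    ("HER2", "cmpd_C", 6.9),
]
ranking = rank_within_target(example)
```

Ranking is done per target because the source recommends comparing compounds only on the same target.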

Model

Component: Detail
Architecture: Stage 6 SE(3)-equivariant EGNN + cross-attention ligand–pocket fusion
Protein representation: ESM2-3B embeddings (2560-d) on binding-pocket residues + Cα coordinates
Ligand representation: RDKit graph → GNN encoder (Stage 5/6 path)
Parameters: ~65M trainable
Training structures: ~94k complexes (PDBBind-derived pipeline)
API version: v1.0.0 · endpoint https://api.vectabind.com

Outputs

Calibration

Raw model outputs are mapped through a potency calibration layer before display. The raw, uncalibrated scores are available in the app under Compound analysis → Advanced · model internals. Docking with GNINA is an optional Pro-tier feature and runs separately from the ML affinity head.
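The form of the calibration layer is not specified here. Purely as an illustration of what a calibration map of this kind can look like, the sketch below applies a monotone piecewise-linear mapping from raw scores to calibrated values. The knot values, the function name, and the choice of piecewise-linear interpolation are all assumptions and should not be read as VectaBind's actual method.

```python
from bisect import bisect_right


def make_monotone_calibrator(raw_knots, calibrated_knots):
    """Build a piecewise-linear map raw -> calibrated from matched knots.

    Hypothetical helper: raw_knots must be strictly increasing and
    calibrated_knots non-decreasing, so the resulting map never reverses
    the order of two raw scores.
    """
    if len(raw_knots) != len(calibrated_knots) or len(raw_knots) < 2:
        raise ValueError("need at least two matched knots")
    if any(b <= a for a, b in zip(raw_knots, raw_knots[1:])):
        raise ValueError("raw_knots must be strictly increasing")
    if any(b < a for a, b in zip(calibrated_knots, calibrated_knots[1:])):
        raise ValueError("calibrated_knots must be non-decreasing")

    def calibrate(raw):
        # Clamp outside the knot range instead of extrapolating.
        if raw <= raw_knots[0]:
            return calibrated_knots[0]
        if raw >= raw_knots[-1]:
            return calibrated_knots[-1]
        # Locate the segment containing `raw` and interpolate linearly.
        i = bisect_right(raw_knots, raw) - 1
        x0, x1 = raw_knots[i], raw_knots[i + 1]
        y0, y1 = calibrated_knots[i], calibrated_knots[i + 1]
        return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)

    return calibrate


# Made-up knots, for illustration only.
calibrate = make_monotone_calibrator([4.0, 6.0, 8.0], [5.0, 6.0, 9.0])
```

If a calibration map is monotone, as in this sketch, the rank order of compounds is the same whether raw or calibrated scores are used, apart from ties created by clamping at the ends of the range.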

Targets & structures

Scoreable targets use pre-computed pocket embeddings from crystallographic or modeled binding sites. Alias names (e.g. EGFR, HER2) map to PDB pocket IDs via an internal registry (GET /targets). Custom pockets can be uploaded on the Pro tier via POST /proteins/upload.
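A minimal sketch of querying the target registry with Python's standard library follows. Only the base URL and the GET /targets path come from this page. The bearer-token header, the JSON response format, and the helper names are assumptions, since authentication and response schema are not described here.

```python
import json
import urllib.request

BASE_URL = "https://api.vectabind.com"  # API endpoint listed under Model


def build_targets_request(base_url=BASE_URL, api_key=None):
    """Construct (but do not send) a GET request for the target registry."""
    headers = {"Accept": "application/json"}
    if api_key is not None:
        # Assumption: bearer-token auth; the real scheme is not documented here.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/targets", headers=headers, method="GET"
    )


def fetch_targets(request, timeout=10.0):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_targets_request()
# targets = fetch_targets(request)  # uncomment to perform the network call
```

Building the request separately from sending it makes it possible to inspect the URL and headers before any network traffic occurs.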

Recommended citation language

“Binding affinity was estimated using VectaBind (Stage 6 EGNN + ESM2-3B, API v1.0.0) as a computational rank-ordering tool. Predictions were not treated as experimental Ki/Kd values.”

Limitations

Links