AlphaFold 3 Protein-Ligand Co-Folding: Architecture & Use Cases

AlphaFold 3 is among the first deep-learning systems to predict a protein and its bound small molecule jointly in one model run, instead of docking a ligand into a pre-computed structure. Published by Google DeepMind and Isomorphic Labs in Nature in May 2024, the model replaces AlphaFold 2’s Evoformer with a slimmer Pairformer, swaps the SE(3)-equivariant structure module for a diffusion decoder, and extends the prediction target from protein chains to every heavy atom in a complex, including ligands, ions, and nucleic acids. This post unpacks the architecture, the training recipe, the benchmarks that actually matter for drug discovery, and the very real limits a med-chemist will hit on a Tuesday afternoon. By the end you will know when AF3 replaces a docking pipeline, when it does not, and which open-source descendants (Boltz-1, Chai-1) are safer for commercial work.

Architecture at a glance

AlphaFold 3 Protein-Ligand Co-Folding: Architecture & Use Cases — architecture diagram

What changed between AlphaFold 2 and AlphaFold 3

AlphaFold 3 keeps the spirit of AlphaFold 2 — sequence and MSA in, 3D structure out — but rewires three of its four major components. The Evoformer is replaced by a 48-block Pairformer, the SE(3)-equivariant structure module is replaced by a diffusion decoder, and the output is extended from protein backbones to all-atom coordinates of any biomolecular complex. MSAs are still required but used more sparingly.

The 2021 AlphaFold 2 architecture (Jumper et al., Nature 596) had four pieces: a feature embedder, the Evoformer trunk that exchanges information between an MSA representation and a residue-pair representation, an SE(3)-equivariant structure module that produced backbone frames plus side chains, and a recycling loop. It was protein-only. Ligands, nucleic acids, ions, and post-translational modifications had to be bolted on afterwards with separate tools like AutoDock Vina, Glide, or Rosetta.

AlphaFold 3 collapses that pipeline. A single network takes protein chains plus DNA, RNA, ligand SMILES strings, CCD codes, and ions, and predicts the joint 3D arrangement in one shot. The architectural simplifications matter because they made it possible to scale the chemistry vocabulary without exploding model size — AF3 has roughly the same parameter count as AF2 but covers a far larger biomolecular universe.

Three engineering choices explain the leap. First, switching to a token-based representation where each ligand heavy atom is a token unifies the vocabulary across polymers and small molecules — the same attention mechanism handles a tryptophan side chain and a methotrexate analog. Second, the diffusion decoder is not equivariant by construction, freeing the model from the rotation-equivariant constraints that limited AF2’s structure module to backbone frames. Third, the loss is rewritten to operate on raw coordinates rather than on frames-and-torsions, which allows the network to learn ligand conformations end-to-end without a stereochemistry-aware decoder.

The downside of these choices is loss of guaranteed symmetry: AF3 predictions of a homodimer can return slightly asymmetric coordinates because nothing in the architecture enforces the symmetry. In practice the asymmetry is small (sub-angstrom) and users symmetrize the output if needed.

Core architecture: from sequence to atom cloud

AlphaFold 3 is a token-based encoder-decoder. Tokens represent residues for polymers and individual heavy atoms for ligands and modified residues. The encoder is a Pairformer trunk that builds a learned single representation and a pair representation; the decoder is a conditional diffusion module that denoises a random atom cloud over a fixed number of steps (200 by default at inference), conditioned on the trunk’s single and pair representations.

AlphaFold 3 protein-ligand co-folding architecture diagram showing input embedder, Pairformer trunk, diffusion module and output heads

The input pipeline starts at the embedder. For protein and nucleic acid chains, each residue becomes one token; for ligands, modified residues, and ions, each heavy atom becomes a token, with bond-graph features supplied via the Chemical Component Dictionary (CCD) or parsed from SMILES. The embedder also computes a relative-position encoding that respects chain breaks and ligand atom adjacency. MSAs are built with genetic searches such as Jackhmmer against UniRef90 and HHBlits against Uniclust30, then processed by a small MSA module (four blocks in the paper) whose output is folded into the pair representation. That is a much lighter use of the MSA than AF2’s Evoformer, which carried an MSA representation through every block.
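As a rough illustration of the tokenization rule described above, the sketch below counts tokens for a complex: one per polymer residue, one per heavy atom for ligands, ions, and modified residues. The `Entity` class and `count_tokens` helper are hypothetical names introduced here, and the heavy-atom count for ATP (31, from its formula C10H16N5O13P3) is entered by hand rather than parsed from a CCD entry.

```python
from dataclasses import dataclass

POLYMER_KINDS = {"protein", "dna", "rna"}
ATOM_KINDS = {"ligand", "ion", "modified_residue"}


@dataclass
class Entity:
    kind: str        # one of POLYMER_KINDS or ATOM_KINDS
    size: int        # residues for polymers, heavy atoms for atom-level entities
    copies: int = 1


def count_tokens(entities):
    """Return (polymer_tokens, atom_tokens) for a list of Entity objects."""
    polymer_tokens = 0
    atom_tokens = 0
    for e in entities:
        if e.kind in POLYMER_KINDS:
            polymer_tokens += e.size * e.copies   # one token per residue
        elif e.kind in ATOM_KINDS:
            atom_tokens += e.size * e.copies      # one token per heavy atom
        else:
            raise ValueError(f"unknown entity kind: {e.kind}")
    return polymer_tokens, atom_tokens


complex_entities = [
    Entity("protein", 250),        # a 250-residue chain
    Entity("ligand", 31),          # ATP: 10 C + 5 N + 13 O + 3 P heavy atoms
    Entity("ion", 1, copies=2),    # two magnesium ions
]
polymer_tokens, atom_tokens = count_tokens(complex_entities)
total_tokens = polymer_tokens + atom_tokens
```

The point of the split is that a drug-sized ligand costs tens of tokens, not one, so token budgets grow with ligand size as well as with sequence length.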

The Pairformer trunk runs 48 identical blocks. Each block updates the pair representation with triangle attention and triangle multiplicative updates (inherited from AF2), then updates the single representation with attention biased by the pair representation. Crucially, the Pairformer carries no MSA representation: the MSA is summarized once upstream of the trunk, not re-attended in every block. That is the headline simplification. By this post’s estimate the trunk needs roughly 30 percent fewer FLOPs per block than the Evoformer, and the saved compute funds the diffusion decoder.

The diffusion module replaces the rotation-equivariant structure module. It receives the trunk’s final representations as conditioning, samples Gaussian noise around the centre of mass, and denoises atom coordinates step by step (200 steps by default at inference) using a transformer-based denoiser. The training objective is a weighted sum of a diffusion loss (a weighted mean-squared error between the denoised and ground-truth coordinates at a sampled noise level), an LDDT-based loss on the predicted coordinates, and auxiliary confidence losses on the pLDDT, PAE, and pTM heads.

The denoiser itself is a stack of attention blocks operating on atoms, with conditioning injected through attention biases derived from the trunk’s pair representation. Each denoising step takes a noised atom cloud and predicts a clean version; sampling repeats the prediction-and-renoise loop along a fixed noise schedule (200 iterations by default). The schedule spans large noise levels at the beginning (large noise gives the model freedom to reorganize topology) and small ones at the end (small noise refines local geometry). This is the same EDM-style sampler popularized by image diffusion, adapted to 3D coordinates with a sigma-conditioned MLP.
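To make the shape of an EDM-style schedule concrete, here is a minimal sketch of the Karras-type interpolation between a maximum and minimum noise level. The function name and the default constants (`sigma_data=16`, `s_max=160`, `s_min=4e-4`, `rho=7`) are assumptions for illustration, chosen to resemble values reported for AF3; check them against the paper’s supplementary material before relying on them. The step count is left as a parameter.

```python
def edm_noise_schedule(num_steps, sigma_data=16.0, s_max=160.0, s_min=4e-4, rho=7.0):
    """Noise levels, highest first, interpolated in sigma**(1/rho) space."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    hi = s_max ** (1.0 / rho)
    lo = s_min ** (1.0 / rho)
    schedule = []
    for i in range(num_steps):
        t = i / (num_steps - 1)            # runs from 0.0 to 1.0
        schedule.append(sigma_data * (hi + t * (lo - hi)) ** rho)
    return schedule


sigmas = edm_noise_schedule(8)
# Absolute drop in noise level between consecutive steps
drops = [a - b for a, b in zip(sigmas, sigmas[1:])]
```

With these constants the early steps remove most of the noise in absolute terms, while the final steps change the noise level only slightly, which matches the "reorganize first, refine last" behaviour described above.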

A subtle but important detail is “atom permutation symmetry”. Two carbons in a symmetric ligand (the two ortho carbons of a phenyl ring, say) are interchangeable in chemistry but distinct as token indices in the network. AF3 handles this with a permutation-aware loss that matches predicted atoms to ground-truth atoms by lowest RMSD across the symmetry orbit before computing the loss. Without this, the network would be punished for swapping equivalent atoms and would struggle to learn ligand geometry.
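The matching step can be sketched in a few lines. This is an illustrative toy, not AF3’s implementation: it assumes the predicted and reference coordinates are already in the same frame, and that the caller supplies the list of chemically equivalent relabelings (in practice these would come from a graph-automorphism search on the ligand). The function names are hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    total = 0.0
    for p, r in zip(pred, ref):
        total += sum((pi - ri) ** 2 for pi, ri in zip(p, r))
    return math.sqrt(total / len(ref))


def symmetry_aware_rmsd(pred, ref, permutations):
    """Lowest RMSD over reference relabelings that chemistry treats as identical.

    Each permutation is a tuple where perm[i] is the reference atom index
    matched to predicted atom i.
    """
    best = math.inf
    for perm in permutations:
        relabeled = [ref[j] for j in perm]
        best = min(best, rmsd(pred, relabeled))
    return best


# Toy fragment: atom 0 is the attachment point, atoms 1 and 2 are equivalent.
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]   # atoms 1 and 2 swapped
equivalent_labelings = [(0, 1, 2), (0, 2, 1)]

naive = rmsd(pred, ref)
matched = symmetry_aware_rmsd(pred, ref, equivalent_labelings)
```

Here the naive RMSD penalizes a pose that is chemically perfect, while the symmetry-aware version scores it as an exact match.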

Inputs and outputs in practice

The AlphaFold Server input schema is JSON, with one entry per chain or ligand. Protein chains are plain amino-acid sequences, RNA and DNA chains are nucleotide strings, ligands and ions are CCD codes drawn from the server’s fixed list, and modifications can be attached to specific residues. Arbitrary small molecules given as SMILES require a local run of the open-source AF3 inference code, whose input format differs. Outputs are atom coordinates plus five confidence metrics: pLDDT, PAE, ipae (interface PAE), pTM, and ipTM.

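Here is a minimal AlphaFold Server request for predicting a kinase bound to ATP. The sequence string is a short placeholder fragment, so substitute the kinase-domain sequence of your target; an empty modelSeeds list lets the server choose the seed.

[
  {
    "name": "EGFR_kinase_ATP",
    "modelSeeds": [],
    "sequences": [
      {
        "proteinChain": {
          "sequence": "GTEFKKVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYS",
          "count": 1
        }
      },
      {
        "ligand": {
          "ligand": "CCD_ATP",
          "count": 1
        }
      },
      {
        "ion": {"ion": "MG", "count": 2}
      }
    ],
    "dialect": "alphafoldserver",
    "version": 1
  }
]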

That request returns a CIF file with all heavy atoms, plus a JSON summary containing per-residue pLDDT (0-100), an N-by-N PAE matrix in angstroms, the interface PAE block for cross-chain pairs, a scalar pTM, and a per-chain-pair ipTM matrix. Confidence interpretation is the single biggest source of user error — we will return to it below.
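Pulling the interface block out of the PAE matrix is a small indexing exercise. The sketch below assumes you have already loaded the N-by-N matrix and a per-token chain label list as plain Python lists; the exact key names in the JSON sidecar are not specified in this post, so the loading step is left out, and `interface_pae` is a hypothetical helper name.

```python
def interface_pae(pae, token_chain_ids, chain_a, chain_b):
    """Mean PAE over token pairs that span two chains, in both directions."""
    idx_a = [i for i, c in enumerate(token_chain_ids) if c == chain_a]
    idx_b = [i for i, c in enumerate(token_chain_ids) if c == chain_b]
    if not idx_a or not idx_b:
        raise ValueError("one of the requested chains has no tokens")
    values = [pae[i][j] for i in idx_a for j in idx_b]    # rows A, columns B
    values += [pae[j][i] for i in idx_a for j in idx_b]   # rows B, columns A
    return sum(values) / len(values)


# Toy 3-token example: two protein tokens (chain A) and one ligand atom (chain B).
toy_pae = [
    [0.0, 1.0, 8.0],
    [1.0, 0.0, 6.0],
    [9.0, 7.0, 0.0],
]
toy_chains = ["A", "A", "B"]
protein_ligand_pae = interface_pae(toy_pae, toy_chains, "A", "B")
```

PAE is not symmetric, so averaging both directions avoids silently reporting only the more flattering half of the block.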

AlphaFold 3 protein-ligand co-folding sequence diagram from input embedder through diffusion denoising loop

Sample seeds matter. The server samples the diffusion module five times per submission and returns the top-ranked structure by confidence, plus four alternates. Different samples and seeds can produce visibly different ligand poses, especially in flexible pockets. Treat the five-structure ensemble as a poor-man’s uncertainty estimate rather than chasing a single “best” answer.
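One simple way to quantify that spread is the pairwise ligand RMSD across the returned poses. The sketch below is a minimal version under two stated assumptions: all poses have already been superposed on the protein pocket, and ligand atoms appear in the same order in every pose. The helper names are hypothetical.

```python
import math
from itertools import combinations


def ligand_rmsd(pose_a, pose_b):
    """RMSD between two ligand poses given as equally ordered (x, y, z) lists."""
    total = sum(math.dist(a, b) ** 2 for a, b in zip(pose_a, pose_b))
    return math.sqrt(total / len(pose_a))


def pose_spread(poses):
    """Return (mean, max) pairwise ligand RMSD across an ensemble of poses."""
    pairs = [ligand_rmsd(p, q) for p, q in combinations(poses, 2)]
    if not pairs:
        raise ValueError("need at least two poses")
    return sum(pairs) / len(pairs), max(pairs)


# Toy ensemble: two identical poses and one shifted 3 angstroms along x.
pose_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pose_2 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pose_3 = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
mean_spread, max_spread = pose_spread([pose_1, pose_2, pose_3])
```

A small mean with a large maximum usually means one outlier pose; a large mean suggests the model has no settled opinion about the binding mode.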

Training data and the self-distillation flywheel

AlphaFold 3 is trained on the Protein Data Bank (PDB) plus a self-distillation set drawn from the AlphaFold Database. The PDB cutoff is September 30, 2021, with quality filters on resolution and validation. The self-distillation set adds roughly 1.4 million predicted structures with high mean pLDDT, augmenting the diversity of sequences seen during training. Ligand training pairs come from PDB co-crystal structures, with chemistry parsed from the CCD.

The training pipeline runs in two phases. The first phase trains the full network on cropped tokens with a maximum of 384 tokens per crop, for roughly 35 percent of total steps. The second fine-tuning phase increases crop size to 768 tokens and re-weights the loss towards ligand atoms and inter-chain contacts. Confidence heads are trained in a final stage on frozen trunk and diffusion weights, predicting the actual LDDT of the network’s own predictions.

Two biases are worth naming. First, the PDB is enriched in soluble proteins, well-behaved kinases, and viral surface proteins; transmembrane GPCR complexes and large nucleoprotein assemblies are under-represented and AF3 quality drops accordingly. Second, the ligand training set is dominated by drug-like small molecules in the 200-600 Da range; covalent inhibitors, macrocycles above 1 kDa, and metal-coordinating warheads are weak spots. The original Nature paper shows the success-rate cliff explicitly in its Figure 3 ligand benchmark.

A third subtle bias comes from the self-distillation step. Self-distilled targets are AF2 predictions filtered by pLDDT, which means the training set inherits AF2’s blind spots. Domains AF2 struggled with (intrinsically disordered regions, novel folds without close MSA homologs) remain under-represented in AF3 training and unsurprisingly remain weak points at inference time. This is a structural reason to be skeptical of AF3 predictions for orphan proteins with shallow MSAs — the model has seen relatively few such examples even after self-distillation.

Training cost is non-trivial but tractable: published estimates put AF3 training at roughly 10-15 days on a 256-TPUv4 pod, with another 2-3 days for the confidence-head fine-tune. Inference is much cheaper. A single 250-residue protein-ligand complex predicted with five seeds runs in roughly 4-6 minutes on a single H100 GPU, and the AlphaFold Server batches multiple users onto shared accelerators so most jobs return within 15 minutes wall-clock.

Accuracy: what the benchmarks actually say

On the PoseBusters v2 benchmark of 308 recent PDB co-crystal structures, AlphaFold 3 produces a pose within 2 angstroms RMSD of the experimental ligand in 76 percent of cases — versus 52 percent for DiffDock and roughly 41 percent for AutoDock Vina. Protein-protein interface accuracy (DockQ > 0.23) hits 62 percent, up from 23 percent for AF2-Multimer. Antibody-antigen complexes remain hard: success rate sits at roughly 33 percent versus 19 percent for AF-Multimer.

Those numbers come straight from the Nature paper’s Tables 1-2 and the PoseBusters analysis that pre-dated AF3 by months. PoseBusters is important because it filters for physical validity: it rejects poses that clash with protein atoms, have wrong bond lengths, or have implausible stereochemistry. Earlier docking benchmarks rewarded RMSD-low but physically broken poses; PoseBusters does not.

Three caveats keep this from being a slam dunk. Performance on the 88 PoseBusters structures published after the AF3 training cutoff drops from 76 percent to 71 percent — small but measurable test-set leakage in the headline number. Multi-ligand pockets with overlapping cofactors (NADH plus a substrate, for example) confuse the network. And the network’s published “success rate” counts the top-ranked seed; if you measure best-of-5 seeds, the number rises by another 4-6 percentage points but you have to know how to pick the right seed without the answer key.

For an adjacent comparison with the geometric deep-learning revolution in materials chemistry, the same diffusion-decoder pattern is now being applied to crystal structure prediction with GNoME and MatterGen — AF3 is the biomolecular cousin of a broader shift away from physics-based simulation toward learned generative priors.

Open-weights alternatives: Boltz-1, Chai-1, and friends

Two open or semi-open AF3 reimplementations matter in 2026, along with one companion tool. Boltz-1 from MIT-CSAIL is MIT-licensed for commercial use, matches AF3 within a few percentage points on PoseBusters, and runs on a single 40 GB A100. Chai-1 from Chai Discovery is open-weights with a restricted commercial license, supports MSA-free inference, and adds a multimer-restraint mode. ProteinMPNN is not a co-folder but is the dominant sequence-design tool that pairs with AF3 for de-novo binder generation.

Comparison matrix for AlphaFold 2, AlphaFold 3, Boltz-1, and Chai-1 across backbone, ligand support, license, and weights

The licensing matrix decides which model a startup can actually use. AlphaFold 3 weights are released under a non-commercial Creative Commons-style license through a gated form on the AF3 GitHub repository; a pharma company that uses them in a paid pipeline is in breach. Boltz-1’s MIT license has no such trap. Chai-1’s license permits research use and restricts hosted-API resale. Pick once, document the choice in your model card, and stop relitigating it every quarter.

On head-to-head benchmarks Boltz-1 trails AF3 by roughly 2-3 points on PoseBusters v2 and by about 5 points on antibody-antigen DockQ, but the gap closes when you allow Boltz-1 to consume multiple seeds and pick best-of-N. Chai-1’s distinctive advantage is restraint-guided multimer prediction: you can supply cryo-EM density blobs or chemical cross-link distance restraints as additional conditioning, which is invaluable for big assemblies. AF3 has no equivalent restraint interface in its current public release.

Two further open projects bear watching. RoseTTAFold All-Atom from the Baker lab matches roughly 90 percent of AF3’s ligand pose accuracy with a fully permissive license. The OpenFold consortium has been working on an AF3-replication codebase that, once trained on equivalent data, could land as a fully open AF3-equivalent. The open ecosystem is moving fast enough that commercial users should re-evaluate their model choice every six months, not every project.

Drug-discovery use cases — beyond the demo

The four use cases where AlphaFold 3 has already changed a real workflow in 2025-2026 are early-stage hit triage, antibody developability filtering, CRISPR guide-design QC, and de-novo protein binder design. Each pairs AF3 with at least one orthogonal method — a physics-based scoring function, an MD simulation, or a wet-lab assay — because AF3 alone is a fast filter, not a verdict.

AlphaFold 3 drug discovery use case flow from disease target through co-folding screen to wet-lab validation

In hit triage, a medicinal chemist replaces an overnight Glide docking run on a 100,000-compound library with an AF3 co-folding screen that takes two hours on 8 H100s using Boltz-1 weights. Compounds with ipTM > 0.6 and pocket pLDDT > 70 move forward; the rest are dropped. The chemist then re-scores the top 200 with MM-GBSA or Vina and orders 20 for SPR and ITC. Industry teams at Isomorphic Labs report this funnel cuts time-to-hit by roughly 60 percent versus a docking-only pipeline, with comparable or better hit rates.
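Before committing hardware to a screen like this, it is worth a back-of-envelope throughput check. The sketch below uses only the per-complex timing quoted earlier in this post (4-6 minutes for a 250-residue complex with five seeds on one H100) and assumes a 5-minute midpoint and perfect scaling across GPUs; both are simplifying assumptions, and the helper names are hypothetical.

```python
def screen_wall_clock_hours(n_compounds, minutes_per_complex, n_gpus):
    """Wall-clock hours for a screen, assuming perfect parallel scaling."""
    return n_compounds * minutes_per_complex / n_gpus / 60.0


def max_compounds(budget_hours, minutes_per_complex, n_gpus):
    """How many complexes fit in a wall-clock budget under the same assumption."""
    return int(budget_hours * 60.0 * n_gpus / minutes_per_complex)


full_library_hours = screen_wall_clock_hours(100_000, 5.0, 8)
full_library_days = full_library_hours / 24.0
two_hour_capacity = max_compounds(2.0, 5.0, 8)
```

At that rate a two-hour budget on 8 GPUs covers on the order of a couple of hundred complexes, so screening a 100,000-compound library in that window would need a cheap pre-filter, fewer seeds per compound, or much faster inference than the figure quoted earlier.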

For antibody engineering, AF3 predicts the antibody-antigen complex from the antibody sequence plus the antigen structure. ipTM > 0.7 correlates loosely with experimentally validated binders, but not strongly enough to trust without an Octet assay. Practitioners use AF3 to rank a Rosetta-designed library of 50-200 variants and pick the top 20 for yeast display. The complementary trend in mRNA-encoded therapeutics means that designed antibody sequences can move from in-silico to in-vivo expression in weeks.

For CRISPR guide design, AF3 predicts a Cas9-sgRNA-DNA ternary complex and flags guides where the DNA strand kinks unfavourably or the protospacer adjacent motif is misaligned. It is not a substitute for a deep-learning specificity predictor like CRISPRoff, but it adds a structural sanity check that catches a small but real class of guides that score well on sequence-based models and fail in cells.

For de-novo binder design, the dominant 2026 pattern is the RFdiffusion-then-ProteinMPNN-then-AF3 loop. RFdiffusion generates a backbone scaffold targeted at a chosen epitope, ProteinMPNN designs a sequence that folds into that scaffold, and AF3 predicts the designed protein in complex with the target. Designs that survive AF3 with ipTM > 0.85 and target-pocket pLDDT > 80 advance to E. coli expression and SPR. Hit rates from in-silico design to validated nanomolar binder have climbed from below 1 percent in 2022 to roughly 8-12 percent in 2025 published reports, with AF3 doing the structural verification heavy lifting that previously required Rosetta plus weeks of MD.

Confidence metrics, demystified

AlphaFold 3 returns five confidence signals — pLDDT, PAE, ipae, pTM, ipTM — and each answers a different question. pLDDT is per-atom or per-residue local confidence on a 0-100 scale; trust values above 70 and treat above 90 as “as good as a 2.5 angstrom crystal structure”. PAE is a residue-by-residue matrix of expected position error in angstroms; values below 10 in a block indicate the network is confident about the relative arrangement. ipTM measures interface accuracy for multi-chain predictions; above 0.8 is strong evidence of correct topology, 0.6-0.8 is ambiguous, below 0.6 is essentially “do not trust”.

AlphaFold 3 confidence metrics map for pLDDT, PAE, PTM, and ipTM and when each is trustworthy

The single most common mistake is interpreting global pTM as proof of local correctness. A 500-residue protein with one mis-folded loop can still have pTM > 0.85 because the rest of the topology is right. Always cross-check the pLDDT trace and the PAE block over the region you care about. For a ligand pose in a binding pocket, the trustworthy signal is the joint ipTM-of-pocket plus ligand-atom pLDDT plus interface PAE between pocket residues and ligand atoms — not the global pTM.

A second common mistake is treating confidence as a binary trust signal. ipTM = 0.62 does not mean “right” or “wrong” — it means roughly 60-70 percent probability that the interface topology is correct given the network’s calibration. Build that probability into your downstream pipeline. If your screen has 1000 hits with ipTM > 0.6, expect 300-400 to be topologically wrong; design wet-lab capacity accordingly rather than ordering all 1000 compounds and being surprised by the hit rate.
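The capacity planning in that paragraph is simple arithmetic, sketched below. The 60-70 percent probability range is the figure from the text, treated here as if it were a calibrated probability, which is itself an assumption; the function names are hypothetical.

```python
import math


def expected_wrong(n_hits, p_correct):
    """Expected number of hits whose interface topology is wrong."""
    return n_hits * (1.0 - p_correct)


def orders_for_target(n_correct_wanted, p_correct):
    """Compounds to order so the expected number of correct poses meets a target."""
    return math.ceil(n_correct_wanted / p_correct)


wrong_low = round(expected_wrong(1000, 0.70))    # optimistic end of the range
wrong_high = round(expected_wrong(1000, 0.60))   # pessimistic end of the range
orders_optimistic = orders_for_target(20, 0.70)
orders_pessimistic = orders_for_target(20, 0.60)
```

Running the numbers both ways gives a range rather than a point estimate, which is the honest way to size an assay queue.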

A practical inspection workflow is to load the predicted CIF in PyMOL or ChimeraX, color residues by pLDDT (blue high, red low), overlay the PAE matrix from the JSON sidecar, and check three things by eye: does the binding pocket have uniformly high pLDDT, are the ligand-pocket distances chemically reasonable (no atom clashes, no carbon-carbon bonds squeezed to 1 angstrom), and is the PAE block between ligand atoms and pocket residues below 5 angstroms. Five minutes of structural inspection catches most of the failures the global metrics miss.
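The same three checks can be scripted as a first pass before opening a viewer. This is a minimal sketch: the pLDDT and PAE thresholds are the ones quoted above, the 2.0 angstrom heavy-atom clash cutoff is an assumption introduced here, and `pocket_checks` is a hypothetical helper that expects values you have already extracted from the CIF and JSON.

```python
import math


def min_distance(ligand_atoms, pocket_atoms):
    """Shortest ligand-to-pocket heavy-atom distance in angstroms."""
    return min(math.dist(a, b) for a in ligand_atoms for b in pocket_atoms)


def pocket_checks(pocket_plddt, ligand_pocket_pae, ligand_atoms, pocket_atoms,
                  plddt_min=70.0, pae_max=5.0, clash_cutoff=2.0):
    """Return a dict of pass/fail flags for the three inspection criteria."""
    return {
        "pocket_plddt_ok": min(pocket_plddt) > plddt_min,        # uniformly high
        "interface_pae_ok": max(ligand_pocket_pae) < pae_max,    # whole block low
        "no_clash": min_distance(ligand_atoms, pocket_atoms) >= clash_cutoff,
    }


good = pocket_checks(
    pocket_plddt=[85.0, 92.0, 78.0],
    ligand_pocket_pae=[2.1, 3.4],
    ligand_atoms=[(0.0, 0.0, 0.0)],
    pocket_atoms=[(3.0, 4.0, 0.0)],      # 5.0 angstroms away
)
clashing = pocket_checks(
    pocket_plddt=[85.0, 92.0, 78.0],
    ligand_pocket_pae=[2.1, 3.4],
    ligand_atoms=[(0.0, 0.0, 0.0)],
    pocket_atoms=[(1.0, 0.0, 0.0)],      # 1.0 angstrom away
)
```

Using the minimum pLDDT and maximum PAE, rather than means, is deliberate: a single weak residue or atom pair in the pocket is exactly what an average would hide. A script like this narrows the list; it does not replace looking at the structure.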

Trade-offs and failure modes

AlphaFold 3 fails predictably in five regimes: conformational dynamics, binding-affinity prediction, highly flexible regions, novel chemistry, and antibody-antigen complexes outside training distribution. Knowing the failure modes is more valuable than knowing the success cases, because the network is confident-looking even when it is wrong.

First, AF3 predicts one conformation per seed. It does not enumerate allosteric states, alternative side-chain rotamers, or domain motions. A kinase predicted in its DFG-in active state may exist 80 percent of the time in DFG-out in cells. If conformational ensemble matters — and for most drug targets it does — pair AF3 with a short MD run or with ensemble-prediction methods like Distributional Graphormer.

Second, AF3 does not directly predict binding affinity. ipTM correlates with whether a complex forms, not with how tightly. A compound with ipTM 0.85 may have an IC50 of 100 nM or 100 µM. Affinity ranking still requires MM-GBSA, FEP+, or wet-lab measurement.
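To see how large that ambiguity is in energetic terms, the sketch below converts the two potencies to binding free energies with the standard relation ΔG = RT ln(Kd). It treats the IC50 values as if they were dissociation constants, which is only a rough proxy, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a Kd in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


dg_tight = binding_free_energy(100e-9)   # 100 nM
dg_weak = binding_free_energy(100e-6)    # 100 µM
gap_kcal = dg_weak - dg_tight            # RT * ln(1000)
```

A thousand-fold potency difference corresponds to roughly 4 kcal/mol at room temperature, which is the scale of difference a pose-confidence score has no way to resolve.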

Third, intrinsically disordered regions and long flexible linkers get low pLDDT scores and should be treated as “this region has no single answer” rather than “the model failed”. Many users delete low-pLDDT residues and re-render — that is appropriate for visualizations, dangerous if you are designing mutations there.

Fourth, novel chemistry outside the training distribution — covalent warheads, organometallic complexes, peptide macrocycles above 1500 Da, PROTACs — produces visually plausible but mechanistically wrong poses. Validate any covalent geometry with a quantum-chemistry tool like Psi4 or with an MD restraint.

Fifth, antibody-antigen prediction remains the published weakness. The 33 percent success rate cited above means roughly two out of three antibody complexes are placed wrongly. For early triage that is still useful — better than random — but it does not replace structural biology for any antibody you are about to spend $200k optimizing.

A sixth failure mode worth flagging: AF3’s confidence metrics can be miscalibrated near the training-cutoff boundary. Targets with close homologs deposited just before September 2021 receive inflated pLDDT and ipTM, because the network effectively memorized them. Targets with no homologs in the PDB receive deflated pLDDT even when the prediction is correct, because the network is not used to seeing such input. A common defense is to compare AF3’s confidence on your target against AF3’s confidence on a held-out target you know is correct from cryo-EM or crystallography — calibration is relative, not absolute.

Practical workflow for a med-chemist in 2026 vs 2022

A 2022 med-chemist working on a new GPCR ligand series would start by homology-modelling the target with SWISS-MODEL or building from a cryo-EM map, then dock candidate ligands with Glide or AutoDock, then rank with MM-GBSA, then send 20-50 compounds for synthesis and assay. The structural prep alone took weeks.

A 2026 med-chemist takes the receptor sequence, runs it through an AF3-style co-folding model (local AF3 or Boltz-1, since the hosted AlphaFold Server accepts only a fixed list of ligands) with the top 10 candidate ligands as SMILES, gets co-folded complexes back in 30 minutes per ligand, filters by ipTM > 0.65 and per-pocket pLDDT > 70, re-scores survivors with FEP+, and ships 10-15 to synthesis. Cycle time is days, not weeks. The catch, and the reason senior chemists are not yet replaced by interns with AlphaFold Server logins, is that knowing which seeds to trust, which pockets are predicted correctly, and which compounds to reject for chemistry-feasibility reasons is still hard-won expertise.

A subtler shift is the change in what gets logged. In 2022 the artifact handed off to the next chemist was a PDB file and a docking score. In 2026 it is a JSON bundle: AF3 model version, weights checksum, five seeds, per-residue pLDDT, full PAE matrix, ipTM, the chosen ligand SMILES, plus a Boltz-1 cross-check and a Vina re-score. Provenance has become first-class because the rate of model updates is high — Boltz-2 lands within months, AF3.1 arrives via the server with no version banner — and any reported hit needs to be reproducible against the exact weights that generated it.
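A provenance bundle of this kind is easy to assemble with the standard library. The sketch below is one possible layout, not a standard: every field name is illustrative, and the weights checksum is computed by streaming the file through SHA-256 so that multi-gigabyte checkpoints do not need to fit in memory.

```python
import hashlib
import json


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(model_version, weights_path, seeds, ligand_smiles, iptm):
    """Assemble a JSON-serializable record; field names are illustrative."""
    return {
        "model_version": model_version,
        "weights_sha256": sha256_of_file(weights_path),
        "seeds": list(seeds),
        "ligand_smiles": ligand_smiles,
        "iptm": iptm,
    }


def to_json(record):
    """Serialize with sorted keys so repeated runs produce identical text."""
    return json.dumps(record, indent=2, sort_keys=True)
```

In use, a call such as `to_json(provenance_record("boltz-1", "weights.ckpt", [1, 2, 3, 4, 5], smiles, iptm))` would be written next to each pose file; the model name and path in that call are placeholders. Per-residue pLDDT and the PAE matrix can be added as further keys in the same way.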

For structural validation downstream, the parallel improvements in sub-2-angstrom cryo-EM resolution mean that AF3-predicted poses can now be confirmed experimentally faster and at higher confidence than during the AF2 era.

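# Sketch: AF3-style hit triage loop using Boltz-1 weights.
# `predict` is a hypothetical callable you supply around your co-folding
# backend (for example a wrapper that shells out to the Boltz-1 command line).
# It must return an object with .iptm, .pocket_plddt (a sequence), and .top_cif.
import pandas as pd

IPTM_MIN = 0.6            # interface confidence threshold
POCKET_PLDDT_MIN = 70.0   # mean pLDDT over pocket residues


def triage(smiles_list, target_sequence, predict, n_seeds=5):
    """Co-fold each ligand with the target and keep poses passing both thresholds."""
    hits = []
    for smi in smiles_list:
        pred = predict(protein=target_sequence, ligand=smi, n_seeds=n_seeds)
        pocket_plddt = sum(pred.pocket_plddt) / len(pred.pocket_plddt)
        if pred.iptm > IPTM_MIN and pocket_plddt > POCKET_PLDDT_MIN:
            hits.append({
                "smiles": smi,
                "iptm": pred.iptm,
                "pocket_plddt": pocket_plddt,
                "pose_cif": pred.top_cif,
            })
    return pd.DataFrame(hits, columns=["smiles", "iptm", "pocket_plddt", "pose_cif"])


# Usage, given your own library, target sequence, and wrapper:
# triage(compound_library.smiles, target_sequence, predict).to_csv(
#     "af3_triaged_hits.csv", index=False)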

Practical recommendations

  • Use the AlphaFold Server for non-commercial research; switch to Boltz-1 for any commercial pipeline.
  • Always run five seeds per prediction and inspect the spread of ligand RMSD across seeds.
  • Trust ipTM > 0.8 for interface topology; treat 0.6-0.8 as “needs orthogonal validation”.
  • Combine AF3 poses with MM-GBSA or FEP+ for affinity ranking — AF3 alone cannot rank potency.
  • Never delete low-pLDDT regions before checking whether they overlap your binding site or design target.
  • Document the model version, weights checksum, and seed count in your electronic lab notebook for reproducibility.

FAQ

How is AlphaFold 3 different from AlphaFold 2?

AlphaFold 3 replaces AF2’s Evoformer with a slimmer 48-block Pairformer and replaces the SE(3)-equivariant structure module with a diffusion-based all-atom decoder. The bigger change is scope: AF3 predicts protein-protein, protein-DNA, protein-RNA, protein-ligand, and ion-bound complexes in one model, where AF2 was protein-only and required separate docking tools for ligands.

Can AlphaFold 3 predict binding affinity?

No, not directly. AlphaFold 3 outputs the geometry of a bound complex plus confidence metrics like ipTM, but ipTM correlates with “is this complex plausible” rather than with potency. To rank binding affinity you still need MM-GBSA, FEP+, alchemical free-energy calculations, or wet-lab measurements like SPR, ITC, or fluorescence polarization.

Is AlphaFold 3 free to use commercially?

The AlphaFold Server at alphafoldserver.com is free for non-commercial use, and the released weights on GitHub are under a non-commercial license. For commercial drug-discovery pipelines you cannot use AF3 weights directly. Open-weights alternatives like Boltz-1 (MIT license) and Chai-1 (restricted commercial) provide nearly-equivalent quality with friendlier terms.

What is the accuracy of AlphaFold 3 protein-ligand co-folding?

On the PoseBusters v2 benchmark, AlphaFold 3 produces ligand poses within 2 angstroms RMSD of the crystal structure in 76 percent of cases, versus 52 percent for DiffDock and 41 percent for AutoDock Vina. Antibody-antigen prediction remains harder, at roughly 33 percent success rate, and performance drops by a few percentage points on structures published after the September 2021 training cutoff.

Does AlphaFold 3 work for membrane proteins and GPCRs?

Partially. AF3 inherits AF2’s lipid-naive view of membrane environments and does not explicitly model bilayers, so transmembrane helix packing is sometimes off. GPCR ligand co-folding works for orthosteric pockets in receptors well-represented in the PDB, less reliably for allosteric pockets and for inactive-state conformations of receptors whose training data are dominated by active states.

Should I use AlphaFold 3, Boltz-1, or Chai-1?

For academic research with non-commercial intent, AF3 via the AlphaFold Server is fastest and best-documented. For commercial pipelines, Boltz-1 is the safest pick because of its MIT license and near-AF3 quality. Chai-1 is worth evaluating when you need MSA-free inference or restraint-guided multimer prediction. Run all three on a small benchmark of your own targets before committing.
