Predicting Protein Conformational States with AI (2026)

Protein conformational state prediction is the field’s next big problem, and 2026 is the year it stopped being a niche concern. A single folded structure, the kind AlphaFold made famous, is a useful snapshot. But most proteins are not statues. They breathe, flex, and switch between distinct shapes to do their jobs. A drug that binds one shape may ignore another. An enzyme that looks inert in one state turns active in the next.

For years, AI structure prediction gave us one answer per sequence. That answer was often stunningly accurate. It was also, by design, incomplete. Real biology lives in the motion between states. The frontier now is predicting not one structure but the landscape a protein explores.

This shift matters because function, allostery, and druggability all hide in that landscape. Static models cannot see them. New methods can begin to.

What this covers: why one structure is not enough, the main AI methods for predicting conformational states, how a prediction-to-validation pipeline works end to end, where these methods break, and why drug discovery teams care.

Why a single structure isn’t enough

Proteins are dynamic objects. The same chain of amino acids can fold into several stable shapes, called conformational states. Each state has a different energy and a different job. A protein might sit mostly in an “open” state, dip occasionally into a “closed” one, and visit rare intermediates in between.

This motion is not noise. It is the mechanism. Allostery, where binding at one site changes behavior at a distant site, depends on the protein shifting between states. Signaling, transport, and catalysis all rely on these shifts. A transporter that never closed would never move its cargo.

Think about how much information a single shape throws away. A kinase that flips between active and inactive forms is two different drug targets, not one. A receptor that samples several conformations presents several distinct surfaces. Reduce that to one structure and you have answered the wrong question. You have a confident average where biology wanted a distribution.

There is also a deeper point about energy. Each state sits in an energy well, and the protein hops between wells driven by thermal motion. The shape of that energy landscape, how deep each well is and how high the barriers between them are, controls behavior. Two proteins can share a dominant structure yet behave differently because their landscapes differ. Structure prediction that ignores the landscape misses why proteins act as they do.
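To make the landscape idea concrete, here is a minimal sketch of how relative free energies translate into equilibrium populations under the Boltzmann distribution. The three state names and their energies are made-up illustrative values, not measurements for any real protein.

```python
import math

R_KCAL = 0.0019872041  # molar gas constant in kcal/(mol*K)

def boltzmann_populations(energies_kcal, temperature_k=298.15):
    """Return equilibrium populations for states with the given free energies."""
    kt = R_KCAL * temperature_k
    lowest = min(energies_kcal)  # shift so the largest weight is exp(0) = 1
    weights = [math.exp(-(e - lowest) / kt) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-state protein; energies are relative to the open state.
states = {"open": 0.0, "closed": 1.0, "intermediate": 3.0}
pops = boltzmann_populations(list(states.values()))
for name, p in zip(states, pops):
    print(f"{name:>12}: {p:.3f}")
```

Changing any of the assumed energies by a kilocalorie or so reshuffles the populations noticeably, which is why two proteins with the same dominant structure can still behave differently.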

AlphaFold2 and AlphaFold3 mostly return a single high-confidence structure. AlphaFold3, released by Google DeepMind and Isomorphic Labs, is diffusion-based and extends prediction to complexes, including proteins, nucleic acids, ligands, and ions. That is a major advance. But the default output is still one dominant conformation, not the full set of functional states. DeepMind has been explicit that conformational landscapes remain an open frontier (AlphaFold, DeepMind).

Why does the gap matter so much for drugs? Many binding pockets only appear in a specific state. A “cryptic” pocket can be invisible in the dominant structure and obvious in a rarer one. Target a protein in the wrong shape and your molecule has nowhere to grip.

This is not a fringe concern. A large fraction of important drug targets, including many receptors, kinases, and transporters, are defined by their motion. Their whole job is to change shape in response to a signal. Describing such a protein with one frozen frame is like reviewing a film from a single still. You might recognize the actors. You will miss the plot.

The research community has framed two named frontiers for the field: routine de novo binder design, and prediction of conformational ensembles. Reviews in venues such as Nature and Frontiers, alongside a steady stream of bioRxiv method papers, have pushed the second frontier hard since 2023. The reported claim that around 40% of new PDB depositions in 2024 to 2025 used AI methods, if it holds, shows how fast static prediction spread. The ensemble problem is the harder sequel.

It is worth being precise about how established each of these capabilities is. Single-structure prediction is mature and widely trusted. Conformational ensemble prediction is not yet at that level. It is an active research area with promising results, genuine successes on specific systems, and no general, validated method that works reliably across all proteins. Treat the techniques in this post as emerging tools, strong on certain problems, unproven on others. The hype around AlphaFold should not be transferred wholesale to the harder ensemble question.

See Figure 1 for the difference between a static prediction and a conformational ensemble.

Figure 1: A single sequence does not map to a single shape. Static prediction returns one dominant state. Ensemble prediction aims to recover the multiple functional conformations a protein actually samples.

The methods

There is no single recipe for conformational state prediction yet. Instead there is a toolbox of approaches, each with different assumptions, costs, and failure modes. Most production attempts combine two or more. Figure 2 lays out the broad taxonomy before we go deeper.

Figure 2: Three broad families of methods. They are not mutually exclusive, and hybrid pipelines are common.

The three families below cover most of what teams use in 2026. Read them as complementary tactics, not rivals. One way to keep them straight is by what each one trusts most. MSA methods trust evolution. Generative models trust learned patterns across many structures. Hybrid methods trust physics. The strongest pipelines borrow from all three, using evolution to seed candidates, learned models to expand them, and physics to keep them honest.

MSA subsampling and clustering

AlphaFold reads a multiple sequence alignment, or MSA, a stack of evolutionarily related sequences. The MSA carries the coevolution signal that drives accurate folding. It turns out that signal also encodes hints about alternative states, if you know how to read it.

The trick is to subsample the MSA. Instead of feeding the model the full alignment, you feed it smaller, carefully chosen subsets. Different subsets can nudge AlphaFold toward different conformations. AF-cluster, introduced in work led by groups including the Ovchinnikov lab and published in Nature, clusters MSA sequences and runs predictions per cluster. Some clusters surface alternative states the full MSA would average away.
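The clustering step can be pictured with a small sketch. This is not the AF-cluster algorithm itself; it swaps in a simple greedy grouping by Hamming distance as a stand-in, and the toy sequences and the `max_distance` threshold are assumptions chosen for illustration.

```python
def hamming_fraction(a, b):
    """Fraction of aligned positions where two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def greedy_cluster(sequences, max_distance):
    """Put each sequence in the first cluster whose representative is close enough."""
    clusters = []  # each cluster is a list; its first member is the representative
    for seq in sequences:
        for cluster in clusters:
            if hamming_fraction(seq, cluster[0]) <= max_distance:
                cluster.append(seq)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([seq])
    return clusters

# Toy aligned rows (hypothetical), forming two visibly different families.
relatives = ["MKTAYIAK", "MKTAYIAR", "MKSAYIAK", "MRPGWLVE", "MRPGWLVD", "MKPGWLVE"]
clusters = greedy_cluster(relatives, max_distance=0.25)
for i, members in enumerate(clusters):
    print(f"cluster {i}: {members}")
```

Each resulting cluster would then be handed to the predictor as its own small alignment. The grouping depends on the threshold and on input order, which mirrors the sensitivity to clustering choices that makes this family of methods fragile.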

This approach is attractive because it reuses existing models. No retraining required. It has produced alternative conformations for fold-switching proteins that single-MSA prediction missed. But it is also fragile. Results depend heavily on how you cluster and how deep you subsample. It does not give reliable populations, only candidate shapes. See Figure 3 for the pipeline.

A few practical notes make this concrete. Shallow subsampling, where you keep only a handful of sequences, tends to increase output diversity. The model becomes less certain and explores more. Keeping a deep subset, with many sequences, pins it to the dominant fold. There is no single correct depth. Teams sweep across several depths and several clustering schemes, then look at which states recur. A conformation that shows up across many independent subsets is more trustworthy than one that appears once and vanishes.
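A depth sweep of this kind might be organized as below. This is a sketch under stated assumptions: the alignment is a toy list of strings with the query in the first row, and `toy_predictor` is a placeholder so the example runs. A real run would call a structure predictor on each subset and assign the output to a state.

```python
import random
from collections import Counter

def subsample_msa(msa, depth, seed):
    """Keep the query (first row) plus up to `depth - 1` randomly chosen relatives."""
    query, relatives = msa[0], msa[1:]
    rng = random.Random(seed)  # seeded so every subset is reproducible
    k = min(depth - 1, len(relatives))
    return [query] + rng.sample(relatives, k)

def sweep(msa, depths, seeds_per_depth):
    """Yield (depth, seed, subset) for every combination in the sweep."""
    for depth in depths:
        for seed in range(seeds_per_depth):
            yield depth, seed, subsample_msa(msa, depth, seed)

def tally_states(jobs, predict_state):
    """Count how often each predicted state label recurs across subsets."""
    return Counter(predict_state(subset) for _, _, subset in jobs)

# Toy alignment: one query plus 19 single-residue variants (hypothetical).
msa = ["MKTAYIAK"] + [f"MKTAYIA{c}" for c in "ACDEFGHILMNPQRSTVWY"]
jobs = list(sweep(msa, depths=[4, 8, 16], seeds_per_depth=3))

def toy_predictor(subset):
    # Placeholder only: pretends shallow subsets land in a second state.
    return "state_A" if len(subset) >= 8 else "state_B"

print(tally_states(jobs, toy_predictor))
```

The tally is the part that matters in practice. States that recur across many independent depths and seeds earn more trust than one-off outputs.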

The method shines on fold-switching proteins, which are known to adopt radically different structures from one sequence. These are exactly the cases where a single prediction is most misleading. For proteins with subtler motions, the signal is weaker and the subsampling trick is less reliable. It is a clever exploit of the alignment, not a general solution.

Figure 3: MSA subsampling splits one alignment into many subsets. Each subset can bias the predictor toward a different conformational state.

Generative and diffusion ensemble prediction

The second family treats conformations as samples from a distribution. Generative models, including diffusion models, learn to produce many plausible structures for one sequence rather than a single best guess. AlphaFold3 itself is diffusion-based, and the wider field has adapted generative architectures specifically to output diverse states.

Diffusion works by starting from noise and denoising toward valid structures. Run the process many times with different noise seeds and you get a spread of candidate conformations. Tune the sampling and you can widen or narrow that spread. ESM-family protein language models also feed into several generative pipelines, supplying learned sequence representations.
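A toy version of seed-driven sampling fits in a few lines. This sketch is a one-dimensional stand-in, not a protein model: two "conformations" sit at -2 and +2 on a single coordinate, and the score function is written analytically where a real diffusion model would use a trained network. The noise schedule, step size, and sampler (annealed Langevin dynamics) are illustrative assumptions.

```python
import math
import random

MODE = 2.0       # toy "conformations" sit at -2 and +2 on one coordinate
MODE_STD = 0.3   # width of each toy state

def score(x, sigma):
    """Exact gradient of log-density for the two-mode mixture blurred by noise sigma."""
    var = MODE_STD**2 + sigma**2
    return (MODE * math.tanh(MODE * x / var) - x) / var

def sample(seed, sigmas=(3.0, 1.5, 0.8, 0.4, 0.2, 0.1), steps=30):
    """Start from noise and follow the score while the noise level is annealed down."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigmas[0])  # initial draw is pure noise
    for sigma in sigmas:           # coarse noise first, fine noise last
        step = 0.1 * (MODE_STD**2 + sigma**2)
        for _ in range(steps):
            x += step * score(x, sigma) + math.sqrt(2 * step) * rng.gauss(0.0, 1.0)
    return x

# One sample per seed; different seeds can settle in different states.
samples = [sample(seed) for seed in range(40)]
near_plus = sum(x > 0 for x in samples)
print(f"{near_plus} samples near +{MODE}, {len(samples) - near_plus} near -{MODE}")
```

The same seed always reproduces the same sample, while the spread across seeds is what gives an ensemble. In this toy the split between the two modes reflects the known target distribution; for a learned protein model there is no such guarantee.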

The appeal is direct sampling of diversity. The catch, covered later, is that this diversity is not guaranteed to match real thermodynamic populations. A model can generate ten distinct shapes that look reasonable while getting their relative likelihoods wrong.

It helps to separate two questions the field keeps blurring. First, which conformations can this protein adopt? Second, how likely is each one? Generative diffusion models are reasonably good at the first question and weak at the second. They cast a wide net of plausible shapes. What they do not reliably tell you is whether a given shape is the dominant state or a one-in-a-million excursion. That gap is the central honesty problem of the whole field.

Some recent generative methods try to condition sampling on extra information, such as a known reference state or a desired functional context. Conditioning narrows the output toward states you care about. It is a useful steering knob, though it also risks baking in your assumptions. If you condition too hard, you find what you went looking for and miss the surprise.

Hybrid methods with MD and Boltzmann generators

The third family grounds AI predictions in physics. Molecular dynamics, or MD, simulates how a protein moves atom by atom over time. MD is physically principled but expensive, and it struggles to reach rare states in feasible compute budgets.

The pairing is natural. AI proposes candidate states quickly; MD refines them and checks whether they are physically stable. Some pipelines use AI-generated structures as starting points to seed MD, cutting the time wasted exploring dead ends.

Boltzmann generators push this further. They are generative models trained to sample conformations weighted by their Boltzmann probability, the physics-correct likelihood of each state at equilibrium. In principle, this is exactly what we want: not just which shapes exist, but how often each one appears. In practice, training Boltzmann generators for large, realistic proteins remains hard. They work best on smaller systems today, and scaling them is an active research problem.

Why are they so hard to scale? The Boltzmann distribution is brutally sensitive to small energy errors, because probabilities depend on energy exponentially. A tiny mistake in the energy of a state can throw its predicted population off by orders of magnitude. Large proteins also have vast, rugged landscapes with many wells, and learning to sample all of them correctly is a hard statistical problem. Progress is real but incremental.
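The exponential sensitivity is easy to check with arithmetic. The sketch below computes how far a predicted population ratio drifts when the energy gap between two states is off by a given amount. The error sizes are arbitrary illustrative values and the temperature is assumed to be about 25 °C.

```python
import math

R_KCAL = 0.0019872041  # molar gas constant in kcal/(mol*K)

def population_error_factor(energy_error_kcal, temperature_k=298.15):
    """Multiplicative error in a two-state population ratio for a given energy error."""
    return math.exp(abs(energy_error_kcal) / (R_KCAL * temperature_k))

for err in (0.5, 1.0, 2.0, 3.0):
    factor = population_error_factor(err)
    print(f"energy error {err:.1f} kcal/mol -> ratio off by about {factor:.0f}x")
```

Because the factor compounds, doubling the energy error squares the population error rather than doubling it.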

The broader lesson of the hybrid family is humility about pure machine learning. Physics is a strong prior. Folding the laws of motion and energy into the pipeline, whether through MD refinement or Boltzmann-weighted training, is how the field guards against confident nonsense. The most credible conformational results in 2026 tend to have a physics check somewhere in the loop.

How it works end-to-end

A real conformational prediction run is a pipeline, not a single model call. It moves from sequence to candidate shapes to validated states. Figure 4 traces the full flow.

Figure 4: The end-to-end path. Each candidate conformation must survive scoring and comparison against experiment before anyone trusts it.

It starts with the sequence and its MSA. The MSA is built by searching large sequence databases for evolutionary relatives. This step alone shapes everything downstream, because the alignment carries the coevolution signal. For conformational work, the MSA may then be subsampled or clustered, as described above.

Next comes generation. The predictor, whether an AlphaFold variant, a dedicated diffusion model, or a hybrid, produces multiple candidate conformations. The aim here is coverage. You want the real states somewhere in your candidate pool, even if you also catch some junk. This is the step where the choice of method really shows. An MSA-subsampling run will sweep alignment depths. A diffusion run will vary noise seeds. A hybrid run may seed MD from the most promising candidates and let physics extend the search.

How many candidates is enough? There is no fixed number. The honest answer is that you keep generating until the set of distinct states stops growing, then a bit more for safety. For a protein with two well-separated states, a modest pool may suffice. For a flexible protein with many shallow wells, you may need a large pool and still worry you missed something. Knowing when you have explored enough is itself an unsolved problem.
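One way to express the "stop when the set of states stops growing" rule is a patience counter. The sketch below is a heuristic illustration, not a solved stopping criterion. `draw_state` stands in for "generate one candidate and assign it to a state", and the state labels, weights, batch size, and patience are all assumptions.

```python
import random

def generate_until_saturated(draw_state, batch_size=20, patience=3, max_batches=200):
    """Generate in batches; stop after `patience` batches in a row add no new state."""
    seen = set()
    stale = 0
    batches = 0
    while stale < patience and batches < max_batches:
        before = len(seen)
        for _ in range(batch_size):
            seen.add(draw_state())
        batches += 1
        # Reset the counter whenever a batch discovers something new.
        stale = stale + 1 if len(seen) == before else 0
    return seen, batches

# Toy generator: four hypothetical states with very uneven frequencies.
rng = random.Random(0)
labels = ["open", "closed", "intermediate", "rare"]
weights = [0.6, 0.3, 0.09, 0.01]

def draw_state():
    return rng.choices(labels, weights)[0]

states, batches = generate_until_saturated(draw_state)
print(f"stopped after {batches} batches with states: {sorted(states)}")
```

A rule like this can stop before a state with 1% frequency ever appears, which is the worry the paragraph describes. Raising `patience` buys safety at the cost of compute.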

Then comes the hard part: scoring and filtering. Not every generated shape is real. Pipelines apply several filters in sequence. Confidence metrics from the predictor flag low-quality structures. Physics-based energy checks, sometimes via short MD runs, remove unstable shapes. Clustering groups similar candidates so you can identify distinct states rather than near-duplicates. The ordering matters. Cheap filters run first to cut the pool fast, and expensive physics checks run last on the survivors.
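The cheap-first ordering can be sketched as a small pipeline. Everything here is illustrative: the `Candidate` fields, the confidence cutoff, the one-dimensional `coordinate` standing in for a full structure, and the stability check that stands in for a short MD run are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    confidence: float  # predictor confidence on an assumed 0-100 scale
    coordinate: float  # 1D stand-in for the full structure

def passes_confidence(c, cutoff=70.0):
    """Cheap filter: drop low-confidence structures immediately."""
    return c.confidence >= cutoff

def pick_representatives(cands, tol=0.5):
    """Cheap clustering: keep the most confident member of each group of near-duplicates."""
    reps = []
    for c in sorted(cands, key=lambda c: c.confidence, reverse=True):
        if all(abs(c.coordinate - r.coordinate) > tol for r in reps):
            reps.append(c)
    return reps

def passes_stability(c, calls):
    """Expensive check (placeholder for short MD); `calls` records every invocation."""
    calls.append(c.name)
    return abs(c.coordinate) < 5.0

pool = [
    Candidate("a", 91.0, -2.0), Candidate("b", 88.0, -1.9),
    Candidate("c", 45.0, 0.3), Candidate("d", 83.0, 2.1),
    Candidate("e", 79.0, 9.0), Candidate("f", 52.0, 2.0),
]

expensive_calls = []
confident = [c for c in pool if passes_confidence(c)]
representatives = pick_representatives(confident)
final = [c for c in representatives if passes_stability(c, expensive_calls)]
print([c.name for c in final], "expensive checks run:", len(expensive_calls))
```

Six candidates enter, but the expensive check only runs on the three representatives that survive the cheap stages. On a pool of thousands, that ordering is where most of the compute savings come from.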

The final and most important step is validation against experiment. This is what separates a credible result from a pretty picture. Candidate states are compared to experimental data where it exists:

Cryo-EM maps, which can capture multiple states in a single sample.
X-ray crystallography for known stable conformations.
NMR data, which is sensitive to dynamics and minor states.
Small-angle scattering and hydrogen-deuterium exchange for shape and flexibility cues.

When a predicted state matches an experimental observation, confidence rises sharply. When no experimental data exists, the prediction stays a hypothesis. That distinction is easy to blur and dangerous to ignore. A conformation no experiment has ever seen is a lead to test, not a fact to publish.
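When an experimental structure does exist, one simple comparison is a distance-based RMSD, which compares internal pairwise distances and so needs no superposition. The sketch below is a minimal version with toy four-point coordinates. It assumes both structures list the same atoms in the same order, and any cutoff for calling two structures "the same state" would be a judgment call not shown here.

```python
import math
from itertools import combinations

def pairwise_distances(coords):
    """All inter-point distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]

def drmsd(coords_a, coords_b):
    """Root-mean-square difference between two structures' internal distances."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of points")
    da, db = pairwise_distances(coords_a), pairwise_distances(coords_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))

# Toy coordinates in angstroms (hypothetical): a bent chain as the reference.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
# Same shape moved elsewhere in space: should match the reference.
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in reference]
# Straightened chain: a different shape.
straightened = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

print(f"reference vs shifted:      {drmsd(reference, shifted):.3f}")
print(f"reference vs straightened: {drmsd(reference, straightened):.3f}")
```

A distance-based score ignores where the molecule sits in space, which is convenient, but it cannot tell a structure from its mirror image. Comparisons against cryo-EM maps or NMR observables need dedicated tools rather than a coordinate metric like this.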

Cryo-EM deserves a special mention because it has changed what validation is possible. Modern cryo-EM can resolve several coexisting states from a single sample, sorting particles into classes that correspond to different conformations. That makes it a natural partner for ensemble prediction. Where a predicted state lines up with a cryo-EM class, you have strong corroboration. NMR complements this by probing motion directly, including states too rare or fleeting for other methods to catch. Together these techniques give the field a way to check its computational guesses against reality.

There is also a feedback loop worth noting. As more experimental structures of alternative states accumulate in the PDB, the data available to train and benchmark these models grows. Better data tends to produce better models, which suggest better experiments. The pace of that loop, not any single algorithm, may decide how fast the field matures.

One detail deserves emphasis: coverage and precision pull against each other. Generate too few candidates and you may miss the real state entirely. Generate too many and you drown in false positives that your filters must clear. Good pipelines deliberately over-generate at the start, then filter hard. It is cheaper to discard a bad shape than to never have proposed the right one.

The output of a good run is not one structure. It is a small set of distinct, physically plausible, ideally experiment-backed conformational states, each with a rough sense of how reliable it is. A useful result also flags which states have experimental support and which remain purely computational. That honesty about provenance is what lets a downstream team decide where to spend lab time.
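A result record that carries its own provenance could look like the sketch below. The class name, fields, and example entries are hypothetical and only illustrate the idea of flagging which states have experimental backing.

```python
from dataclasses import dataclass, field

@dataclass
class StateReport:
    label: str
    n_candidates: int  # how many generated structures fell into this state
    experimental_support: list = field(default_factory=list)  # e.g. ["cryo-EM class"]

    @property
    def provenance(self):
        """Summarize whether any experiment backs this state."""
        return "experiment-backed" if self.experimental_support else "computational only"

# Hypothetical output of a run with three distinct states.
report = [
    StateReport("open", 112, ["X-ray structure"]),
    StateReport("closed", 41, ["cryo-EM class"]),
    StateReport("intermediate", 6),
]

for state in report:
    print(f"{state.label:>12}: {state.n_candidates:>3} candidates, {state.provenance}")

# States still needing lab work are the ones with no experimental support.
to_test = [s.label for s in report if not s.experimental_support]
```

Keeping the candidate count next to the provenance flag avoids implying that the count is a population. It is only a rough reliability cue.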

Trade-offs, gotchas, what goes wrong

The biggest trap is mistaking diversity for accuracy. A method that outputs many conformations is not automatically giving you the right ones. Figure 5 maps the common failure modes.

Figure 5: Where conformational prediction goes wrong. Most failures are about weighting, validation, and false states rather than raw geometry.

The deepest issue is thermodynamic weighting. Most current ensembles do not give true Boltzmann populations. They tell you which shapes might exist, not how often each one appears in a real cell. Treating a generated ensemble as if its frequencies were physically correct is a serious error. Only physics-grounded methods, like well-trained Boltzmann generators, even attempt the right weighting, and those do not yet scale to most proteins.

Validation is genuinely hard. Experimental data on alternative states is sparse. Many predicted conformations have no experiment to check them against. Compute cost adds friction: generating large ensembles and refining them with MD is expensive. And false states are a real risk. Models can produce physically plausible shapes that the protein never actually adopts, leading you to chase pockets that do not exist.

A subtler failure is sampling bias inherited from training data. These models learn from the structures we have, and the PDB over-represents stable, crystallizable states. Rare and disordered conformations are under-represented. A model trained on that data may quietly favor the kinds of states it has seen most, missing exactly the unusual conformations that often matter for function. The blind spots of the data become the blind spots of the model.

Intrinsically disordered regions are a particular sore point. Some proteins, or parts of them, do not have a single stable fold at all. They are better described as fluctuating clouds than as switching states. Methods built around discrete conformations can struggle here, and forcing a tidy set of states onto a genuinely disordered region can mislead more than it informs.

The honest summary: these methods are powerful hypothesis generators and unreliable oracles. Use them to find candidate states worth testing, not to declare which states a protein occupies. The discipline is to keep the experimental check in the loop and to report uncertainty plainly.

Why it matters: drug discovery & design

Conformational state prediction is not an academic luxury. It changes what targets look druggable. Many proteins called “undruggable” simply lack an obvious pocket in their dominant shape. A rarer conformation might open one.

That reframes the whole hunt. If you can predict the states a target samples, you can:

Find cryptic pockets that only appear in specific conformations.
Design molecules that selectively stabilize one functional state over another.
Anticipate resistance, where a mutation shifts the conformational balance.
Improve binder and de novo design by targeting the right shape from the start.
Prioritize targets by whether a useful druggable state is reachable at all.

This connects directly to the binder-design frontier. De novo binders need a target surface, and that surface depends on the state. Get the state wrong and the binder grips nothing. The two frontiers, ensembles and binder design, are tightly coupled in practice. A binder designed against a conformation the protein rarely visits may bind beautifully in silico and do nothing in a cell.

There is a strategic angle too. Drugging a specific state, rather than just a protein, opens the door to finer control. A molecule that locks an enzyme in its inactive shape acts differently from one that simply blocks a pocket. State-selective drugs can, in principle, hit one function of a multi-function protein while sparing the others. That selectivity is where much of the long-term promise sits, and it is impossible to pursue without a view of the states in the first place.

The payoff is a richer picture of a target before any chemistry begins. That is worth real compute, even with today’s limits. Used carefully, with experiment kept in the loop, conformational prediction is becoming a standard part of how teams reason about hard targets rather than an exotic add-on.

FAQ

What is a protein conformational state?
A conformational state is one of the distinct stable shapes a protein can fold into. The same amino acid sequence can adopt several states, each with a different energy and function. Proteins switch between states to bind partners, signal, or transport molecules. Predicting these states, not just one, is the goal of conformational ensemble methods in 2026.

Does AlphaFold predict multiple conformations?
AlphaFold2 and AlphaFold3 mostly return a single dominant structure per sequence. They were not designed to output full conformational ensembles. AlphaFold3 is diffusion-based and handles complexes, but multiple functional states still require extra techniques. Researchers use MSA subsampling, such as AF-cluster, or dedicated generative models to coax alternative conformations out of these systems.

What is MSA subsampling?
MSA subsampling feeds a structure predictor smaller, selected subsets of the multiple sequence alignment instead of the full alignment. Different subsets carry different coevolution signals, which can bias the model toward different conformational states. AF-cluster, published in Nature, clusters MSA sequences and runs predictions per cluster, surfacing alternative states the full alignment would average away.

Are predicted conformational ensembles thermodynamically accurate?
Usually not. Most AI-generated ensembles show which conformations might exist but do not give true Boltzmann populations, the physics-correct frequency of each state. Only methods like Boltzmann generators attempt correct weighting, and those do not yet scale to most large proteins. Treat predicted ensembles as candidate states to validate, not as accurate population measurements.

How are predicted conformational states validated?
Predicted states are compared against experimental data such as cryo-EM, X-ray crystallography, NMR, and scattering experiments. Cryo-EM and NMR are especially useful because they can capture multiple or minor states. When a prediction matches experiment, confidence rises. When no data exists, the conformation stays a hypothesis to test, not a confirmed fact.

Why do conformational states matter for drug discovery?
Many drug binding pockets only appear in specific conformations. A cryptic pocket can be invisible in a protein’s dominant shape and accessible in a rarer one. Predicting conformational states helps teams find these pockets, design molecules that stabilize a chosen functional state, and target proteins once considered undruggable.
