
From Schrödinger to Silicon: Why AI Can't Solve Materials Discovery (Yet)

2024-06-15

I still remember the frustration. Late nights in the UCL computational chemistry lab, watching my DFT calculations crawl through yet another porphyrin system optimization. Each calculation took hours, sometimes days, and I was only studying a molecule with around 100 atoms. The Schrödinger equation, that elegant cornerstone of quantum mechanics, had become my computational nemesis.

That experience crystallized something important: the gap between what we can calculate in theory and what we can compute in practice is the central problem in materials discovery. And despite the recent explosion of AI accelerated materials discovery startups raising hundreds of millions, this fundamental bottleneck hasn't gone away. It's just been repackaged.

The Promise: Ab Initio Methods and the Schrodinger Equation

Let's start with why computational materials discovery is theoretically possible. At its heart is the Schrödinger equation, the fundamental equation of quantum mechanics that governs how quantum systems evolve. If you can solve it for a given atomic configuration, you can predict that material's properties: its crystal structure, electronic band gap, magnetic behavior, optical properties, thermal conductivity. All without ever stepping into a lab.

This is what chemists call ab initio or "from first principles" methods. You input only fundamental physical constants and the identities and positions of the atoms, and in theory, you get out real world material properties. It's beautiful. It's exact. And for anything larger than a few atoms, it's completely intractable.

The culprit is the many body problem. Every electron repels every other electron, and the Pauli exclusion principle forces their joint wavefunction to be antisymmetric. Their movements are fundamentally correlated, so adding just a few electrons creates exponentially increasing complexity. Current supercomputers can run exact calculations for quantum systems of only about 11 electrons. When practical chemistry involves systems with hundreds or thousands of electrons, you start to see the problem.
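A back-of-envelope sketch makes the exponential wall concrete. The exact ("full configuration interaction") basis counts every way of distributing electrons over available spin-orbitals; the two-spin-orbitals-per-electron ratio below is an arbitrary toy assumption, and real basis sets are larger, which only makes things worse:

```python
from math import comb

def hilbert_dim(n_spin_orbitals: int, n_electrons: int) -> int:
    """Size of the exact many-body basis: the number of ways to place
    n_electrons into n_spin_orbitals, with Pauli exclusion allowing
    at most one electron per spin-orbital."""
    return comb(n_spin_orbitals, n_electrons)

# Toy model: give each electron two spin-orbitals to occupy.
for n in (4, 8, 11, 20, 50):
    print(f"{n} electrons -> basis of {hilbert_dim(2 * n, n):.2e} states")
```

By 50 electrons the basis already exceeds 10^29 states, which is why exact solutions stall at roughly a dozen electrons.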

The Compromise: Density Functional Theory

Enter Density Functional Theory (DFT), developed in the 1960s by Walter Kohn and Lu Jeu Sham, building on Kohn's earlier work with Pierre Hohenberg; Kohn shared the 1998 Nobel Prize in Chemistry for it. Instead of tracking every electron's wavefunction, DFT focuses on electron density: the probability of finding an electron at a particular location in space. It's an approximation, but a theoretically sound one that makes quantum calculations computationally feasible.

DFT became the workhorse of computational chemistry and materials science. It's what I used for those porphyrin calculations, molecules relevant to catalysis, photosynthesis, and potentially solar energy applications. And it worked, sort of.

But here's what they don't tell you in the textbooks: DFT has serious limitations that become apparent when you actually try to use it.

Where DFT Struggles

First, there's the sheer computational cost. Even with its approximations, DFT scales poorly as systems grow larger. My porphyrin systems, organic molecules with conjugated ring structures, took days to optimize even on decent hardware. For the complex metal alloys or multi element compounds that make up most practical engineering materials (steels, glasses, ceramics with 5 to 10 elements), DFT becomes prohibitively expensive.
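To get a feel for that scaling, here is a crude extrapolation. Conventional Kohn-Sham DFT scales roughly cubically with system size; both the reference timing and the exponent below are illustrative assumptions, not benchmarks from any particular code:

```python
def dft_walltime_estimate(hours_ref: float, n_ref: int, n_target: int,
                          exponent: float = 3.0) -> float:
    """Crude wall-time extrapolation assuming O(N^3) scaling.
    The reference point and exponent are assumptions; real scaling
    depends on the code, basis set, and functional."""
    return hours_ref * (n_target / n_ref) ** exponent

# If a ~100-atom molecule takes a day, a 1,000-atom alloy supercell
# at the same level of theory extrapolates to on the order of years.
print(dft_walltime_estimate(24.0, 100, 1000) / 24 / 365)  # in years
```

A 10x increase in system size costs roughly 1,000x in compute, which is why multi element engineering alloys are out of reach.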

Second, DFT just doesn't work well for certain physics. High temperature systems present a major challenge: the theory breaks down when thermal fluctuations become significant. Complex crystal structures with grain boundaries, defects, and other structural complexities that define real materials are poorly handled. Strongly correlated systems, materials where electron-electron interactions dominate, like high temperature superconductors, remain problematic. And perhaps most critically, DFT struggles with itinerant magnetism, the magnetic behavior that emerges from the collective electronic structure of metal alloys, which is exactly the physics behind the rare earth free magnets everyone is desperate to find.

That last point is crucial. China's grip on rare earth magnets has sparked a race to find alternatives, and DFT is particularly bad at predicting the very properties that matter for these materials. The magnetic behavior of iron nitride magnets, discovered in the 1950s but still not commercialized, depends on achieving specific crystal structures that DFT struggles to model accurately.

Third, there's the training data problem. DFT is theoretically sound for small, simple molecules under controlled conditions. But does data from small molecules scale to large, complex systems? Does it capture the emergent properties that arise from specific crystal structures, operating temperatures, or compositional variations? We don't actually know yet.

I experienced this firsthand with porphyrins. The calculations would converge to a solution, but validating whether that solution actually represented the physical system, whether my approximations held, whether the functional I chose was appropriate, required careful benchmarking against experimental data. DFT gives you an answer, but not always the right answer.

The Dream: Quantum Computing

The obvious solution is quantum computing. A quantum computer with millions of qubits could, in principle, solve the Schrödinger equation exactly for large systems. No approximations, no scaling issues, just pure quantum simulation.

This is why companies like IBM, Google, and IonQ are investing heavily in quantum hardware, and why materials simulation is always cited as a killer application for quantum computing. If we had fault tolerant quantum computers with sufficient qubits, the materials discovery problem would be transformed overnight.

But here's the reality: we don't have those computers, and we won't for at least another decade, probably longer. Current quantum computers have dozens to hundreds of noisy qubits. Useful chemistry simulations will require millions of error corrected qubits. The engineering challenges of maintaining quantum coherence, scaling qubit counts, and achieving low error rates are immense.

So quantum computing remains a future promise, not a present solution.

The Interim Solution: AI as a DFT Surrogate

This brings us to the current wave of AI materials discovery startups: Orbital Materials, Radical AI, Periodic Labs (which raised $200M at a $1B valuation), and others. The pitch is compelling: use AI to shortcut the computational bottleneck.

Here's the workflow. Generate a large training dataset using DFT calculations on simpler systems. Augment with existing materials databases (AFLOW, Materials Project, ICSD). Train a machine learning model, typically a graph neural network rather than an LLM, to predict material properties. Once trained, the model can "one shot" property predictions without running expensive DFT calculations. Or better yet, work backwards: specify desired properties and have the model suggest candidate materials.
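In schematic terms, the surrogate model at the center of this workflow is a graph neural network over atoms and bonds. The sketch below is a minimal, untrained toy (random weights, invented function names, a single message-passing round), not any startup's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def message_pass(node_feats, adjacency, weight):
    """One round of neighbor aggregation: each atom's new feature
    vector is a nonlinear mix of its bonded neighbors' features."""
    aggregated = adjacency @ node_feats   # sum over bonded neighbors
    return np.tanh(aggregated @ weight)   # learned transform (random here)

def predict_property(node_feats, adjacency, w_hidden, w_out):
    """Graph-level readout: pool per-atom features into one vector,
    then map to a scalar such as a formation-energy estimate."""
    h = message_pass(node_feats, adjacency, w_hidden)
    graph_embedding = h.mean(axis=0)      # pool atoms -> one vector
    return float(graph_embedding @ w_out)

# Toy "material": 3 atoms in a chain A-B-C, 4 features per atom.
x = rng.normal(size=(3, 4))
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
w_hidden = rng.normal(size=(4, 8))
w_out = rng.normal(size=8)
print(predict_property(x, adj, w_hidden, w_out))
```

The point of the architecture is that inference is a handful of matrix multiplies, milliseconds instead of the hours a DFT call would take.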

Google DeepMind's GNoME (Graph Networks for Materials Exploration) is the flagship example. Published in Nature in late 2023, it "discovered" 2.2 million stable crystal structures, with 736 independently verified. The model predicts stability based on chemical formula, then confirms with DFT calculations in an iterative feedback loop.
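That iterative loop can be sketched as follows; `model_score` and `run_dft` are placeholder callables standing in for the trained network and the expensive verification step, and the stability cutoff is an illustrative convention (energy above the convex hull at or below zero):

```python
def screening_round(candidates, model_score, run_dft,
                    stability_cutoff=0.0, top_k=10):
    """One schematic model-then-DFT iteration: rank candidates with
    the cheap surrogate, spend DFT only on the most promising, and
    return verified hits plus fresh labels for retraining."""
    shortlist = sorted(candidates, key=model_score, reverse=True)[:top_k]
    labels = [(c, run_dft(c)) for c in shortlist]    # costly ground truth
    stable = [c for c, e_hull in labels if e_hull <= stability_cutoff]
    return stable, labels                            # labels feed retraining

# Toy stand-ins: the "model" prefers candidates near zero, and the
# fake "DFT" just reports the candidate's value as energy above hull.
stable, labels = screening_round(
    [-2.0, -0.5, 0.3, 1.5],
    model_score=lambda c: -abs(c),
    run_dft=lambda c: c,
    top_k=2)
print(stable)
```

Each round, the DFT labels retrain the model, so the surrogate gets sharper exactly where the search is concentrating.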

It's a clever approach. Instead of the 1997 Deep Blue chess computer approach (pure brute force DFT), you get something more like AlphaGo: AI centered with less computational overhead.

But Here's the Problem

The AI approach inherits all of DFT's limitations because DFT generates most of the training data. If DFT struggles with high temperatures, complex crystal structures, and itinerant magnetism, then so will the AI models trained on DFT data. You've made the computation faster, but you haven't made it more accurate for the cases where accuracy matters most.

This is what worries me about the current startup landscape. The pitch decks emphasize "compute" and "AI acceleration," but if your training data is fundamentally limited, adding more compute doesn't solve the problem. It's like training an image classifier on low resolution photos and expecting it to identify fine details. The information simply isn't there.

There are additional issues. The small to large molecule problem: DFT slows down with larger systems, so training datasets are biased toward smaller, simpler molecules. Will models trained on these generalize to the complex, multi element alloys used in engineering applications? Unknown.

The crystal structure problem: High throughput DFT struggles to predict which crystal polymorph will actually form under synthesis conditions. Remember, aragonite and calcite are both pure calcium carbonate, same composition, different arrangement, entirely different properties. If your training data can't capture this, your model won't either.

The validation problem: How do you validate AI predictions for truly novel materials when experimental synthesis is expensive and time consuming? You're often stuck in a catch-22: the materials AI suggests might be interesting, but you won't know until you make them, and making them is exactly the bottleneck you were trying to avoid.

The Real Path Forward: Hybrid Approaches

I think the startups that succeed won't be those that just add more compute or scale up AI models. They'll be the ones that recognize this requires both computational and experimental innovation.

The pharma industry provides a blueprint with high throughput screening: massive parallelization and automation to synthesize and test thousands of compounds per week. Materials discovery needs the same, but it's harder.

You need high throughput synthesis with robotic systems for thin film deposition, microfluidics, dip pen lithography, methods that can produce material libraries at scale, though current techniques struggle with crystal structure control. You need high throughput characterization through spectroscopy for optical and electronic properties, mechanical probes for hardness and elasticity, automated XRD for crystal structure. Most importantly, you need real experimental data in the loop: not just DFT generated synthetic data, but actual lab results feeding back into the AI training process.
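The whole hybrid pipeline reduces to a closed loop, sketched below. All five callables are placeholders for real software and hardware infrastructure; the essential point is that measured results, not synthetic DFT labels, drive the retraining step:

```python
def discovery_loop(model, propose, synthesize, characterize, retrain,
                   n_rounds=3, batch=8):
    """Schematic experiment-in-the-loop cycle: the model generates
    hypotheses, the lab is the oracle, and lab measurements feed
    back into training. Every callable here is a stand-in."""
    measurements = []
    for _ in range(n_rounds):
        candidates = propose(model, batch)              # AI suggests materials
        samples = [synthesize(c) for c in candidates]   # robotic synthesis
        results = [characterize(s) for s in samples]    # XRD, spectroscopy...
        measurements += list(zip(candidates, results))
        model = retrain(model, measurements)            # lab data in the loop
    return model, measurements

# Trivial stand-ins just to show the data flow through the loop.
model, measured = discovery_loop(
    model=0,
    propose=lambda m, k: list(range(k)),
    synthesize=lambda c: c,
    characterize=lambda s: 2 * s,
    retrain=lambda m, data: m + 1)
print(model, len(measured))
```

The economics of the loop hinge on the synthesize and characterize steps, which is why the robotics matter as much as the model.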

This is expensive. It's messy. It requires coordinating software and hardware teams. But it's the only way to get around DFT's limitations. You need real world data to capture the physics that DFT misses.

From my own experience wrestling with porphyrin calculations, I learned that computation and experiment need each other. DFT gave me starting points and hypotheses, but validation always came from the lab. The AI materials startups that treat their models as hypothesis generators rather than oracle machines, and invest heavily in experimental validation loops, are the ones with staying power.

Where We Stand

So where does this leave us?

DFT is powerful but limited. Great for small molecules and specific applications, it struggles with the complex systems and properties we care most about. Quantum computing is the ultimate solution, but 10+ years away from practical materials discovery. AI is a useful accelerant that makes DFT level predictions faster and can identify promising candidates, but can't transcend the limitations of its training data.

The current AI materials discovery wave isn't a revolution. It's an optimization. It makes the existing workflow faster and potentially more systematic, but it doesn't solve the fundamental problem: we still can't reliably predict the properties of complex materials from first principles.

That doesn't mean it's not valuable. Even an optimized workflow could accelerate discovery timelines from decades to years. Finding one commercially viable new material, a better magnet, a higher capacity battery cathode, a more efficient thermoelectric, could justify the entire investment.

But we should be clear eyed about what AI can and can't do. It's a bridge technology, making the best of our current computational limitations while we wait for quantum computers to mature. The startups that recognize this, that invest in hybrid experimental computational approaches, that focus on specific high value applications where DFT's limitations are less severe, that build tools for materials scientists rather than trying to replace them, those are the ones I'd bet on.

As someone who's spent too many nights watching progress bars inch forward on quantum chemistry calculations, I'm excited about any approach that makes this process less painful. But having felt DFT's limitations firsthand, I also know that no amount of AI can solve a problem when the underlying physics isn't captured in your data.

The materials discovery problem is fundamentally hard. AI makes it faster. Only quantum computing will make it truly solvable.