AI Doesn’t Just Need More Compute — It Needs Better Ways to Think

AI reasoning is often misunderstood. We assume that once a model is trained, it simply retrieves knowledge and applies it to solve complex problems — but real intelligence isn’t that simple. When I was building language model-powered workflows at Myra Labs in 2017-18, I saw this firsthand. Early NLP systems could generate responses, but getting them to reason through multi-step tasks reliably was an entirely different challenge.

There are two fundamentally different ways to improve AI reasoning:

  1. Pre-training structured knowledge and learning efficient reasoning pathways — ensuring the model organizes information well so it can retrieve and combine knowledge efficiently.

  2. Inference-time compute scaling — increasing computation at runtime to refine answers dynamically, allowing the model to adjust and reprocess information on demand.

Sebastian Raschka recently wrote an excellent blog post on (2) inference-time compute scaling, covering techniques such as self-backtracking, test-time optimization, and structured search. These methods don’t change what the model knows but instead refine how it processes that knowledge in real time.

But (1) pre-training structured knowledge and efficient reasoning traversal is just as important. Why? Because inference-time scaling only works if the model already has a well-structured knowledge base to build on. A model that hasn’t learned to structure information properly will struggle — no matter how much extra compute you throw at it.

In this post, I want to dive deep into (1) and sharpen my own understanding of it: why pre-training matters, how AI models structure knowledge, and why traversing latent space effectively is a critical yet often overlooked component of reasoning.


1. Pre-Training Structured Knowledge: The Foundation of Reasoning

If AI reasoning is about traversing a space of knowledge and actions, then the quality of that space matters just as much as how the model searches through it. This is where pre-training comes in.

The Geometry of Knowledge: Why Ideas Are Not Euclidean

A common misconception is that AI models store knowledge like a giant lookup table — a vast dictionary where every concept has a fixed position and can be retrieved with a simple query. The reality is far more complex. Modern AI systems encode knowledge in high-dimensional vector spaces, where concepts exist in relation to one another, forming a continuous structure rather than a discrete set of entries. And crucially, these spaces are not Euclidean.

To understand why, consider what a Euclidean space assumes: that concepts are arranged in a flat, linear structure, where “similar” ideas are always directly adjacent, and moving from one concept to another is as simple as drawing a straight line. But research suggests that knowledge is better represented as a curved manifold.

A manifold is a geometric structure that appears flat and predictable at small scales but reveals deep curvature and complexity when viewed globally. Imagine standing in a city where the streets around you form a perfect grid — navigating from one block to the next is easy. But if you zoom out, you realize that the city itself sits on a sphere, where traveling in a straight line eventually loops back on itself. Knowledge in AI models follows a similar principle: within a narrow domain, concepts cluster in expected ways, but moving between different regions of knowledge requires a structured, multi-step traversal.
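
To make the curvature intuition concrete, here is a tiny numpy sketch comparing the straight-line (chord) distance between two points on a sphere with the distance measured along its surface. It is only an analogy for latent spaces, not a claim that models literally embed knowledge on spheres.

```python
import numpy as np

def chord_distance(p, q):
    """Straight-line (Euclidean) distance through the ambient space."""
    return np.linalg.norm(p - q)

def great_circle_distance(p, q, radius=1.0):
    """Distance measured along the sphere's surface (the geodesic)."""
    cos_angle = np.clip(np.dot(p, q) / radius**2, -1.0, 1.0)
    return radius * np.arccos(cos_angle)

# Two points on a unit sphere, 120 degrees apart.
p = np.array([1.0, 0.0, 0.0])
theta = np.deg2rad(120)
q = np.array([np.cos(theta), np.sin(theta), 0.0])

print(chord_distance(p, q))        # ~1.73: the "tunnel" straight through the sphere
print(great_circle_distance(p, q)) # ~2.09: the path you actually have to walk
```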

Cognitive science research supports this view. Studies by Tenenbaum et al. (2000) suggest that human cognition organizes knowledge in high-dimensional, nonlinear spaces, where relationships between concepts do not follow simple, direct distances. Similarly, Bronstein et al. (2021) found that deep learning models naturally develop curved manifold representations of data, reflecting the inherent hierarchies and symmetries present in real-world knowledge. This phenomenon is evident in word embeddings as well — when Mikolov et al. (2013) demonstrated that AI models can perform operations like “king” - “man” + “woman” ≈ “queen,” they were revealing an underlying structured representation of meaning, where words are mapped in a way that preserves their relationships rather than just their absolute positions. Chris Olah wrote about this back in 2014 as well.
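
As a toy illustration of that vector arithmetic, here is a minimal sketch with hand-made 2-D "embeddings". The vocabulary and vector values are invented purely for illustration; real models learn hundreds of dimensions from data rather than having them specified by hand.

```python
import numpy as np

# Toy, hand-made 2-D "embeddings" (dimensions: roughly royalty and gender).
vectors = {
    "king":  np.array([0.9,  0.9]),
    "queen": np.array([0.9, -0.9]),
    "man":   np.array([0.1,  0.9]),
    "woman": np.array([0.1, -0.9]),
    "apple": np.array([-0.7, 0.0]),
}

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman should land closest to queen.
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max((w for w in vectors if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(target, vectors[w]))
print(best)  # queen
```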

What does this mean for AI reasoning? If a model learns a well-structured manifold, it can efficiently connect related ideas, making reasoning more natural and fluid. But if its knowledge is poorly structured — if key relationships are missing or distorted — the model will struggle to generalize, no matter how much inference-time computation is applied. This is why pre-training structured knowledge is critical: an AI system needs to internalize hierarchical reasoning structures, causal relationships, and efficient pathways through knowledge space before it can reason effectively.

How Do We Train Models to Structure Knowledge Well?

For AI to reason effectively, it must not only store information but also organize it into meaningful hierarchies — much like how human knowledge is structured, where abstract principles guide specific details.

The challenge is that neural networks, by default, do not naturally arrange knowledge into explicit multi-level structures. Instead, they often learn dense, tangled representations that lack clear semantic organization. Recent research has made significant strides in learning concept hierarchies, ensuring that AI models develop structured, interpretable reasoning capabilities.

One of the most promising approaches comes from Kong et al. (NeurIPS 2024), who frame high-level concepts as latent causal variables embedded in a hierarchy. Their work formalizes the idea that abstract concepts — such as “dog breed” — govern lower-level features, like “ear shape” or “coat pattern,” forming a generative causal graph that captures the dependencies between concepts. Crucially, they demonstrate that, under specific conditions, these hierarchies can be learned in an unsupervised manner from raw data, without predefined labels. This theoretical advancement broadens the scope of concept discovery, moving beyond tree-like structures to more flexible, expressive nonlinear hierarchies that can handle complex, continuous inputs like images or multi-modal datasets.
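
To make the idea of a concept hierarchy concrete, here is a toy sketch of ancestral sampling from a hand-specified causal hierarchy, where a high-level "breed" variable generates lower-level features. The structure and probabilities below are made up for illustration; Kong et al.'s contribution is showing when such hierarchies can be identified from data rather than written down by hand.

```python
import random

# A toy causal hierarchy: a high-level concept ("breed") generates
# lower-level observable features. Values are illustrative only.
BREEDS = {
    "corgi":  {"ear_shape": {"pointed": 0.9, "floppy": 0.1},
               "coat":      {"tan": 0.7, "tricolor": 0.3}},
    "basset": {"ear_shape": {"pointed": 0.05, "floppy": 0.95},
               "coat":      {"tan": 0.4, "tricolor": 0.6}},
}

def sample_categorical(dist):
    return random.choices(list(dist), weights=list(dist.values()))[0]

def sample_dog():
    # Ancestral sampling: draw the high-level cause first, then its effects.
    breed = random.choice(list(BREEDS))
    features = {name: sample_categorical(dist)
                for name, dist in BREEDS[breed].items()}
    return {"breed": breed, **features}

print(sample_dog())  # e.g. {'breed': 'corgi', 'ear_shape': 'pointed', 'coat': 'tan'}
```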

In practice, training models to discover and utilize hierarchical knowledge has been a long-standing challenge, particularly for deep generative models like Hierarchical Variational Autoencoders (HVAEs). Standard VAEs aim to encode data at multiple levels of abstraction, but they suffer from posterior collapse, where higher-level latent variables become uninformative. To address this, An et al. (CVPR 2024) introduced an RL-augmented HVAE, treating latent inference as a sequential decision process. Instead of passively encoding information, their model actively optimizes each latent level to ensure it contributes meaningfully to the overall representation. This method enforces a structured, multi-scale encoding where each layer captures progressively more abstract features — leading to models that not only generate better representations but also disentangle key concepts more effectively.
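
As a rough sketch of what posterior collapse looks like in practice, here is a minimal two-level encoder with per-level KL terms. This is a toy in PyTorch, not An et al.'s RL-augmented model; the point is simply that a level whose KL term sits near zero throughout training is carrying no information about the input.

```python
import torch
import torch.nn as nn

class TwoLevelEncoder(nn.Module):
    """A minimal two-level hierarchical encoder, just enough to show per-level KL."""
    def __init__(self, x_dim=32, z1_dim=8, z2_dim=4):
        super().__init__()
        self.to_z1 = nn.Linear(x_dim, 2 * z1_dim)   # bottom level: close to the data
        self.to_z2 = nn.Linear(z1_dim, 2 * z2_dim)  # top level: more abstract

    def forward(self, x):
        mu1, logvar1 = self.to_z1(x).chunk(2, dim=-1)
        z1 = mu1 + torch.randn_like(mu1) * (0.5 * logvar1).exp()  # reparameterized sample
        mu2, logvar2 = self.to_z2(z1).chunk(2, dim=-1)
        return (mu1, logvar1), (mu2, logvar2)

def kl_to_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)) per example, summed over latent dimensions.
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1)

x = torch.randn(64, 32)
(mu1, lv1), (mu2, lv2) = TwoLevelEncoder()(x)
print("KL level 1:", kl_to_standard_normal(mu1, lv1).mean().item())
print("KL level 2:", kl_to_standard_normal(mu2, lv2).mean().item())
# If a level's KL stays near zero during training, that level has collapsed:
# its posterior ignores the input and contributes nothing to the representation.
```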

Another key development comes from Rossetti and Pirri (NeurIPS 2024), who focus on hierarchical concept learning in vision. Their approach dynamically builds a tree of image segments, starting from raw pixels and recursively grouping regions into semantically meaningful parts. Unlike prior models that impose a fixed number of segmentation levels, their method adapts to the complexity of each image, discovering the appropriate number of hierarchical layers on the fly. This work is particularly exciting because it demonstrates that hierarchical structure is not just an artifact of human annotation; it can emerge naturally from data, given the right learning framework! Their results suggest that AI models can build visual taxonomies of concepts in an unsupervised manner, revealing part-whole relationships without external supervision.
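
Here is a deliberately simplified 1-D sketch of the adaptive-depth idea: greedily merge the most similar adjacent segments and stop once nothing left is similar enough, so the number of levels depends on the data. Their actual method operates on image regions with learned representations; the feature vectors below are hand-made stand-ins.

```python
import numpy as np

def build_segment_tree(features, stop_threshold=0.9):
    """Greedy agglomerative grouping of adjacent segments into a variable-depth tree."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    nodes = [np.asarray(f, dtype=float) for f in features]
    current = list(range(len(nodes)))
    levels = [list(current)]                    # level 0: leaf segment ids
    while len(current) > 1:
        sims = [cos(nodes[a], nodes[b]) for a, b in zip(current, current[1:])]
        i = int(np.argmax(sims))
        if sims[i] < stop_threshold:
            break                               # nothing left that belongs together
        merged = (nodes[current[i]] + nodes[current[i + 1]]) / 2
        nodes.append(merged)
        current = current[:i] + [len(nodes) - 1] + current[i + 2:]
        levels.append(list(current))
    return levels

# Four "segments": the first two are similar, the last two are similar.
segments = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
print(build_segment_tree(segments))
# [[0, 1, 2, 3], [4, 2, 3], [4, 5]] - two groups emerge, then merging stops.
```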

Beyond interpretability, structured knowledge discovery is also being leveraged for scientific discovery. Donhauser et al. (2024) demonstrated how dictionary learning on vision transformer representations can automatically extract biological concepts from microscopy images. By applying sparse coding to the latent space of a model trained on cellular images, they identified latent features corresponding to meaningful biological factors, such as cell types and genetic perturbations — none of which were manually labeled.

This work suggests that AI can hypothesize new scientific concepts simply by analyzing structure in data, offering a novel method for unsupervised knowledge discovery in domains where human intuition is limited.
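
A minimal sketch of the mechanics, using scikit-learn's DictionaryLearning on random vectors that stand in for frozen vision-transformer embeddings. The data, dimensions, and hyperparameters here are placeholders, not Donhauser et al.'s setup.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for vision-transformer embeddings of cell images: random vectors,
# purely to show the mechanics of sparse coding over a latent space.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))   # 200 "images", 64-dim features

# Learn a dictionary whose atoms are candidate "concepts"; each image is then
# explained by a sparse combination of a few atoms.
dico = DictionaryLearning(n_components=16, alpha=1.0, max_iter=50, random_state=0)
codes = dico.fit_transform(embeddings)    # (200, 16) sparse coefficients

# Atoms that activate on a coherent subset of images are candidate concepts
# (e.g. a cell type or a perturbation signature) for a human to inspect.
print("dictionary shape:", dico.components_.shape)                       # (16, 64)
print("nonzero codes per image:", (np.abs(codes) > 1e-6).sum(axis=1).mean())
```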

Taken together, these advances in hierarchical concept learning, structured representation learning, and interpretable AI point to a future where models do not just memorize and retrieve information, but learn to organize knowledge in ways that mirror the human brain. By ensuring that models internalize well-structured representations before reasoning even begins, we can improve both efficiency and generalization, reducing the need for brute-force inference-time search.

Latent Space Traversal: The Other Half of Pre-Training

Once a model has structured knowledge, it still needs to navigate it efficiently — a process just as crucial as the knowledge representation itself.

If pre-training ensures that knowledge is well-organized, latent space traversal ensures that reasoning follows meaningful paths rather than taking inefficient or arbitrary routes. Recent research has demonstrated that effective latent traversal can significantly improve reasoning, controllability, and goal-directed generation, whether by leveraging geometric insights, optimization techniques, or learned policies.

A prime example of geometry-aware latent traversal comes from Pegios et al. (NeurIPS 2024), who explored how to generate counterfactual examples — instances that modify input data just enough to flip a classifier’s decision, while still looking like natural data points. Traditional counterfactual generation methods often struggle because latent spaces are nonlinear and highly entangled — naive methods like linear interpolation or gradient-based updates can result in unnatural, unrealistic outputs. Pegios et al. introduce a Riemannian metric for latent traversal, which redefines “distance” based on the impact that small latent shifts have in output space. By following geodesics — shortest paths defined by this metric — rather than arbitrary latent interpolations, they ensure that counterfactuals remain both realistic and effective. This method provides a general framework for outcome-driven navigation in latent space, showing that AI can traverse knowledge manifolds in a structured, principled way rather than relying on trial-and-error.
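
The core trick can be sketched in a few lines: pull the output-space geometry back into latent space via the decoder Jacobian, so that "short" latent moves are ones that barely change the output. The toy decoder below is a stand-in, and this is the generic pullback-metric idea rather than Pegios et al.'s exact construction.

```python
import torch
from torch.autograd.functional import jacobian

# A toy decoder standing in for a trained generative model's decoder.
decoder = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 32))

def pullback_metric(z):
    # G(z) = J(z)^T J(z), where J is the decoder Jacobian at z. Distances
    # measured with G reflect how much the *output* changes, not how far
    # we moved in raw latent coordinates.
    J = jacobian(lambda v: decoder(v), z)   # shape (32, 4)
    return J.T @ J

z = torch.zeros(4)
dz = 0.1 * torch.randn(4)

euclidean_len = dz.norm()
riemannian_len = torch.sqrt(dz @ pullback_metric(z) @ dz)
print(float(euclidean_len), float(riemannian_len))
# A geodesic-following method takes many small steps, each chosen to be short
# under G(z), so the generated outputs change smoothly and stay realistic.
```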

A different but related approach treats latent space traversal as an optimization problem. Song et al. (ICLR 2024) introduced ReSample, a method for solving inverse problems — tasks where the goal is to recover missing or corrupted data using a generative model. Instead of passively sampling from a pre-trained model, ReSample actively constrains each step of the sampling process to satisfy known observations, such as available pixels in an image or partial MRI scans. By integrating hard consistency constraints directly into the diffusion sampling process, the method ensures that outputs are both plausible under the generative model’s learned prior and exactly consistent with the external observations. This results in high-fidelity, deterministic reconstruction, improving over naive diffusion-based sampling by staying on the model’s learned manifold while enforcing strict objectives. The same principle — embedding constraints directly into latent search — is also being explored for controlled image editing and domain adaptation tasks.
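
A heavily simplified sketch of the hard-consistency idea, written as a pixel-space projection after each sampling step. ReSample itself enforces consistency through optimization in a latent diffusion model's latent space; the `denoise_step` below is a hypothetical stand-in for one step of a real sampler.

```python
import numpy as np

def inpaint_with_hard_consistency(y, mask, denoise_step, num_steps=50):
    """Simplified data-consistency loop (not the actual ReSample algorithm).

    y:    observed image (values outside the mask are ignored)
    mask: 1 where a pixel is observed, 0 where it must be filled in
    denoise_step: hypothetical stand-in for one step of a generative sampler
    """
    x = np.random.randn(*y.shape)             # start from noise
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)                # move toward the model's learned manifold
        x = mask * y + (1 - mask) * x         # hard projection: observed pixels kept exactly
    return x

# Toy usage: a "denoiser" that just shrinks toward zero, standing in for a real model.
y = np.zeros((8, 8)); y[:, :4] = 1.0
mask = np.zeros((8, 8)); mask[:, :4] = 1.0
out = inpaint_with_hard_consistency(y, mask, lambda x, t: 0.9 * x)
print(np.allclose(out[:, :4], 1.0))  # True: the observed half is reproduced exactly
```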

In scenarios where objectives are complex or non-differentiable (meaning they can’t be optimized by gradient-based methods), reinforcement learning (RL) can be used to learn latent traversal policies. Lee et al. (ICML 2024) demonstrated this in protein design, where the goal is to generate new protein sequences with high biochemical fitness. Rather than using brute-force optimization, which often gets stuck in poor solutions, they modeled the problem as a Markov Decision Process (MDP) in latent space. Here, states correspond to latent codes, actions involve structured movements through the latent space (perturbing or recombining latent vectors), and rewards correspond to improvements in fitness metrics. By training an RL agent to optimize this process, they found that AI could systematically navigate to high-fitness regions of latent space, producing new protein sequences that outperformed prior search techniques. Some of these sequences were even experimentally validated, demonstrating the potential of learned latent traversal policies for real-world scientific discovery.
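
A toy sketch of the MDP framing, with a made-up decoder and fitness function and a crude random-search loop standing in for a trained RL policy. In Lee et al.'s setting the decoder is a trained generative model and the reward comes from a biochemical fitness predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a trained decoder and a fitness oracle.
def decode(z):
    return np.tanh(z)

def fitness(seq):
    return -np.sum((seq - 0.5) ** 2)   # toy objective with a known optimum

class LatentMDP:
    """States are latent codes; actions are structured perturbations;
    the reward is the change in fitness of the decoded output."""
    def __init__(self, dim=8):
        self.z = rng.normal(size=dim)

    def step(self, action):
        new_z = self.z + action
        reward = fitness(decode(new_z)) - fitness(decode(self.z))
        self.z = new_z
        return new_z, reward

# A crude random-search "policy" in place of a trained RL agent.
env = LatentMDP()
for _ in range(200):
    action = 0.1 * rng.normal(size=env.z.shape)
    _, reward = env.step(action)
    if reward < 0:
        env.z -= action                # undo moves that hurt fitness
print("final fitness:", fitness(decode(env.z)))
```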

This is an area I’m personally excited about, because learned latent traversal policies have direct applicability to system automation, such as workflows built on code and APIs.

These techniques — geodesic search, constraint-based optimization, and RL-guided search — are examples of how AI can move through its learned knowledge space in an efficient, structured way rather than relying on brute-force computation. Just as human thought follows structured pathways rather than randomly jumping between ideas, AI models must learn to traverse their latent spaces in ways that reflect meaningful relationships between concepts.

This is the missing half of pre-training: without intelligent search mechanisms, even the best-structured knowledge can become inaccessible, forcing models to fall back on inefficient heuristics. By integrating geometry, optimization, and learned policies, AI can not only store knowledge but reason through it effectively.

2. Inference-Time Compute Scaling: Refining Search at Runtime

Sebastian Raschka’s blog post provides an in-depth analysis of inference-time compute scaling, so I won’t repeat the full argument here. However, to summarize, several key techniques have emerged to refine AI reasoning at runtime. Self-backtracking (2025) allows models to detect when they have taken an unproductive reasoning path and restart, preventing them from getting stuck in local optima. Test-Time Preference Optimization (TPO, 2025) improves response quality by iteratively refining answers based on the model’s own outputs, effectively allowing it to adjust its reasoning dynamically. Meanwhile, Tree-of-Thought Search (CoAT, 2025) enhances multi-step exploration by enabling structured, branching pathways that improve reasoning depth.
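
The common thread across these techniques is structured exploration at inference time. Here is a generic sketch of best-first search over partial solutions, where `expand` and `score` are hypothetical stand-ins for sampling continuations from a model and rating them; it captures the flavor of tree-style search rather than any one paper's algorithm.

```python
import heapq

def tree_search(root, expand, score, budget=32, beam=3):
    """Generic best-first exploration over partial solutions.

    expand(state) proposes candidate next states (e.g. continuations sampled
    from a language model) and score(state) rates them; both are hypothetical
    stand-ins here. Keeping several branches alive and always extending the
    most promising one is the shared idea behind tree-style search methods.
    """
    frontier = [(-score(root), root)]
    best = root
    for _ in range(budget):
        if not frontier:
            break
        neg_s, state = heapq.heappop(frontier)
        if -neg_s > score(best):
            best = state
        children = sorted(expand(state), key=score, reverse=True)[:beam]
        for child in children:
            heapq.heappush(frontier, (-score(child), child))
    return best

# Toy usage: "states" are digit strings and the score simply counts 7s.
result = tree_search(
    root="",
    expand=lambda s: [s + d for d in "0123456789"] if len(s) < 3 else [],
    score=lambda s: s.count("7"),
)
print(result)  # "777"
```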

These methods demonstrate that inference-time scaling can significantly enhance AI reasoning — but they rely on the assumption that the model has a well-structured knowledge base to begin with. If an AI lacks a strong conceptual foundation, additional compute alone will not compensate for poorly learned representations. Inference scaling refines the search process, but it cannot create structure where none exists.

The Future of AI Reasoning is Hybrid

Reasoning in AI isn’t a single problem — it’s the combination of learning structured knowledge, navigating it efficiently, and refining answers when needed.

Pre-training defines the structure of knowledge.

Latent space traversal determines how efficiently models search through it.

Inference-time compute scaling refines answers dynamically.

The most powerful AI systems of the future won’t just think longer. They will think in the right space, in the right way, at the right time.

--

If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.