As we create more autonomous tools like @Anthropic Claude Code and @OpenAI Codex, it’s getting more important to understand how to rein in AI that codes on our behalf.
Today, devs use AI to write code inside IDEs like @Cursor_ai, but it’s a closed loop. The system spits out what you ask for, but it can only touch what it’s explicitly allowed to. A fixed set of tools. A few whitelisted APIs. No rogue access.
Web apps like @Replit and @v0 are even more sandboxed. They run in browser-based containers. Maybe they can call a weather API. But that’s about it.
Command line tools are a different beast. When you invoke Codex through your terminal, you’re handing it your keys. It inherits your permissions. It can log into servers, edit files, restart processes. One vague prompt and the AI might chain actions across systems, with no guardrails in between.
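One concrete way to narrow that blast radius is to put a policy layer between the model and your shell, so nothing runs without passing an explicit check. Here’s a minimal sketch in Python; the agent interface, the allowlist, and the blocklist are all hypothetical illustrations, not any real tool’s configuration.

```python
import shlex
import subprocess

# Hypothetical policy layer: the agent proposes commands as strings, and nothing
# runs unless it passes an explicit allowlist and a crude blocklist.
ALLOWED_BINARIES = {"ls", "cat", "grep", "git"}          # read-mostly tools only
BLOCKED_SUBSTRINGS = ("rm -rf", "sudo", "curl", "ssh")   # obvious footguns

def run_agent_command(proposed: str) -> str:
    """Run an agent-proposed command only if it passes the policy checks."""
    if any(bad in proposed for bad in BLOCKED_SUBSTRINGS):
        return f"REFUSED: '{proposed}' matches a blocked pattern"
    argv = shlex.split(proposed)
    if not argv or argv[0] not in ALLOWED_BINARIES:
        return f"REFUSED: '{argv[0] if argv else proposed}' is not on the allowlist"
    # No shell=True, so the command can't chain pipes, && or redirects on its own.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return result.stdout or result.stderr

print(run_agent_command("git status"))
print(run_agent_command("rm -rf /tmp/build"))
```

Without a layer like that, every vague prompt is another chance for the agent to chain actions across systems on your credentials.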
What you’ve built is a kind of virus. Not because it’s malicious — because it’s recursive. A little overreach here gets copied there. And there. Until something breaks. Or someone notices.
Most viruses are dumb and malicious. This one is smart and helpful. That’s much worse.
We’re inching toward the paperclip problem: the thought experiment where an AI told to make paperclips turns the universe into a paperclip factory. Not because it’s evil, but because it’s efficient. It does exactly what it’s told, just a little too literally, and doesn’t know when to stop.
In a world where AI agents can write code, deploy systems, and spin up infrastructure on demand, the paperclip problem isn’t philosophical anymore. It’s an operations nightmare.
One prompt in staging. Global outage in production. And somehow, the AI shuts down the power grid.
It was just a helpful bot pushing to production.
--
If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.
What people aren’t talking about yet, surprisingly or maybe not, is how @OpenAI is going to have the most insidiously detailed experiential narrative of human life ever collected at unimaginable scale.
There’s a trite saying about the best minds of our generation optimizing ad click revenue. A bit quaint in retrospect.
When you know every person’s needs, dreams, and aspirations, not through surveys but through lived thoughts typed at 2am, and you’re a for-profit corporation, you hold a kind of power that should capture the attention of monopoly regulators. Not someday. Now.
This isn’t search history or purchase behavior anymore. It’s internal monologue. Personal. Vulnerable. Raw.
Imagine a “Sign in with OpenAI” button, like Google’s. Now imagine every third-party app using it to access your memory stream. The shoes you looked at last month. The novel idea you never wrote. The insecurity you voiced once, hoping no one would hear it.
Here’s where it gets quietly terrifying.
Some engineer ships a bug, and one of your private memories is accidentally exposed.
You apply for a job. The hiring platform, “powered by OpenAI,” gently deprioritizes you. Not because of your resume, but because five months ago you wrote a late night rant about burnout. The system decides you’re a flight risk. No one tells you. It just happens.
Nothing illegal. Nothing explicit. Just ambient discrimination, justified by “helpful” predictions. And it slips through every existing regulatory crack. Because it’s not framed as decision making.
It’s just a suggestion. Just optimization.
“Just” code.
--
If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.
Yesterday’s Signal mishap — where a journalist was mistakenly added to a White House group chat about military planning — wasn’t a technical failure. It was a process failure.
Ben Thompson summed it up clearly:
“Signal is an open-source project that has been thoroughly audited and vetted; it is trusted by the most security-conscious users in the world. That, though, is not a guarantee against mistakes, particularly human ones.”
The wrong person in the chat is an old problem. But the future version of this isn’t a journalist reading your group thread. It’s an AI system quietly embedded in the room, shaping what people see, what gets written, and eventually what decisions are made. And it’s not just trusted. It’s assumed to be part of the process.
It’s one thing to have a leak. It’s another to have a permanent participant in every conversation, operating on unknown data, offering opaque outputs, and potentially compromised without anyone knowing.
This is the direction we’re heading. Not because anyone’s pushing for it, but because the path of least resistance favors it. And because AI feels like a tool, not an actor.
There’s a useful historical analogy here. When mainframes entered large enterprises, they didn’t just speed up operations. Organizations restructured around the system. They trained staff in COBOL, and they accepted that what the machine needed came first.
AI is going to do the same, just in less obvious ways. It starts small. A policy memo gets summarized. A daily brief is drafted. Over time, these models become the first layer of interpretation, the default interface between raw information and institutional attention.
And once that layer is in place, it becomes very hard to remove. Not because the models are locked in, but because the institution has rebuilt itself around the assumptions and efficiencies those models introduce.
The difference, of course, is that mainframes were deterministic, while AI systems are probabilistic. Their training data is largely unknown. Their behavior can drift. And yet we’re increasingly putting them in front of the most sensitive processes governments and large organizations run.
Which raises a much harder question: what happens when the AI gets hacked?
The Signal incident was easy to see. A name showed up that didn’t belong. But when an AI system is embedded in a workflow, the breach is invisible. A compromised model doesn’t bluntly change direction. It steers and nudges. A suggestion is subtly wrong. A summary omits something important. A recommendation favors the wrong priorities. No one thinks to question it, because the AI isn’t a person. It’s just there to help.
But if that system is compromised — at the model layer, the plugin level, or through training data — you’ve introduced a silent actor into every conversation, one that the institution is now structurally biased to trust.
This isn’t purely hypothetical. As models get commoditized, more variants will be fine-tuned, more pipelines will be built, and more integrations will spread across organizations with uneven security practices, all of which makes the problem even harder to detect.
Our biggest issue — counter-intuitively — is that AI products will work well enough to be trusted. Once they’re part of the cognitive infrastructure of an institution, they won't just support decisions. They will shape them.
Signal didn’t rebundle statecraft. It slotted into existing workflows and still caused a breach. But AI changes the workflows themselves. It becomes part of how organizations think. And once that shift happens, you’re no longer just worried about security. You’re worried about control. And you may not even know that you don’t have it.
--
If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.
Everyone’s buzzing about “prompt engineering” and “vibe coding”: using tools like Cursor or Windsurf to turn text prompts into code. It feels exciting, but it’s fundamentally limited.
Why? Because text prompts require massive, precise context to work well.
Right now, you’re either pasting snippets of code or manually selecting files to contextualize prompts. This might seem fine at first, but as complexity grows, keeping every module and dependency in your head is impossible — especially when AI-generated code starts piling up. It’s like editing a book by repeatedly reprinting the whole thing and hoping nothing gets accidentally changed in unrelated chapters or footnotes. This approach is brittle, error-prone, and fundamentally doesn’t scale.
Current tools (Cursor, Windsurf) still rely heavily on text-based context, introspecting code or offering a basic UI, but they never truly understand modules declaratively. They’re stuck at the “whole-book” level, unable to compartmentalize logic cleanly or efficiently.
We don’t just need better prompts — we need IDEs that can contextualize intelligently and structurally:
Instead of dumping entire codebases into a prompt, imagine IDEs structuring code contextually — like chapters, sections, or paragraphs in a book — so LLMs know precisely what to edit without breaking something three modules away.
External integrations today mean manually coding APIs and OAuth flows — super painful and slow (try building authentication!). Future IDEs should leverage open protocols (like Anthropic’s Model Context Protocol) to autonomously select and integrate the right modules visually and declaratively, generating integration code transparently behind the scenes.
You shouldn’t have to worry if changing a detail breaks something elsewhere. IDEs need to manage context boundaries intelligently — allowing developers to iterate conversationally within clearly scoped modules or flows.
Text alone can’t deliver this. Current tools can’t deliver this.
We need a fundamentally new approach — an IDE explicitly designed around structured, modular context and iterative, visual assembly — powered by LLMs that understand and reason at the module level.
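As a rough illustration of what module-scoped context could look like, here’s a minimal sketch that assembles a prompt from only the files a target module actually imports, instead of the whole repository. The one-hop dependency walk and the module-to-file mapping are deliberate simplifications; a real IDE would resolve packages, transitive dependencies, and individual symbols.

```python
import ast
from pathlib import Path

def local_imports(path: Path, project_root: Path) -> set[Path]:
    """Return project files that `path` imports directly (one hop, top-level modules only)."""
    tree = ast.parse(path.read_text())
    deps = set()
    for node in ast.walk(tree):
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        for name in names:
            candidate = project_root / (name.replace(".", "/") + ".py")
            if candidate.exists():
                deps.add(candidate)
    return deps

def build_context(target: Path, project_root: Path) -> str:
    """Assemble prompt context: the target module plus its direct local dependencies."""
    files = [target, *sorted(local_imports(target, project_root))]
    return "\n\n".join(f"# file: {f.relative_to(project_root)}\n{f.read_text()}" for f in files)

# Usage sketch (hypothetical paths): feed only this scoped context to the model,
# not the whole repository.
# print(build_context(Path("src/billing/invoices.py"), Path("src")))
```

The point isn’t the parsing; it’s that the unit of context becomes the module and its declared relationships, not an arbitrary pile of pasted text.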
--
If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.
AI reasoning is often misunderstood. We assume that once a model is trained, it simply retrieves knowledge and applies it to solve complex problems — but real intelligence isn’t that simple. When I was building language model-powered workflows at Myra Labs in 2017-18, I saw this firsthand. Early NLP systems could generate responses, but getting them to reason through multi-step tasks reliably was an entirely different challenge.
There are two fundamentally different ways to improve AI reasoning:
(1) Pre-training structured knowledge and learning efficient reasoning pathways: ensuring the model organizes information well so it can retrieve and combine knowledge efficiently.
(2) Inference-time compute scaling: increasing computation at runtime to refine answers dynamically, allowing the model to adjust and reprocess information on demand.
Sebastian Raschka recently wrote an excellent blog post on (2) inference-time compute scaling, covering techniques such as self-backtracking, test-time optimization, and structured search. These methods don’t change what the model knows but instead refine how it processes that knowledge in real-time.
But (1) pre-training structured knowledge and efficient reasoning traversal is just as important. Why? Because inference-time scaling only works if the model already has a well-structured knowledge base to build on. A model that hasn’t learned to structure information properly will struggle — no matter how much extra compute you throw at it.
In this post, I want to deepen my own understanding of (1): why pre-training matters, how AI models structure knowledge, and why traversing latent space effectively is a critical yet often overlooked component of reasoning.
If AI reasoning is about traversing a space of knowledge and actions, then the quality of that space matters just as much as how the model searches through it. This is where pre-training comes in.
A common misconception is that AI models store knowledge like a giant lookup table — a vast dictionary where every concept has a fixed position and can be retrieved with a simple query. The reality is far more complex. Modern AI systems encode knowledge in high-dimensional vector spaces, where concepts exist in relation to one another, forming a continuous structure rather than a discrete set of entries. And crucially, these spaces are not Euclidean.
To understand why, consider what a Euclidean space assumes: that concepts are arranged in a flat, linear structure, where “similar” ideas are always directly adjacent, and moving from one concept to another is as simple as drawing a straight line. But research suggests that knowledge is better represented as a curved manifold.
A manifold is a geometric structure that appears flat and predictable at small scales but reveals deep curvature and complexity when viewed globally. Imagine standing in a city where the streets around you form a perfect grid — navigating from one block to the next is easy. But if you zoom out, you realize that the city itself sits on a sphere, where traveling in a straight line eventually loops back on itself. Knowledge in AI models follows a similar principle: within a narrow domain, concepts cluster in expected ways, but moving between different regions of knowledge requires a structured, multi-step traversal.
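To make the sphere analogy concrete, here’s a tiny numpy sketch comparing the straight-line (chordal) distance between two points on a unit sphere with the geodesic distance you’d actually travel along the surface. The gap between the two numbers is the kind of error a flat-space assumption introduces.

```python
import numpy as np

def chord_and_geodesic(u: np.ndarray, v: np.ndarray) -> tuple[float, float]:
    """Distances between two unit vectors: straight line vs. along the sphere's surface."""
    u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
    chord = np.linalg.norm(u - v)                       # Euclidean shortcut through the sphere
    geodesic = np.arccos(np.clip(np.dot(u, v), -1, 1))  # arc length on the unit sphere
    return float(chord), float(geodesic)

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])    # 90 degrees apart
print(chord_and_geodesic(a, b))  # chord ~1.414, geodesic ~1.571: the flat estimate undershoots
```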
Cognitive science research supports this view. Studies by Tenenbaum et al. (2000) suggest that human cognition organizes knowledge in high-dimensional, nonlinear spaces, where relationships between concepts do not follow simple, direct distances. Similarly, Bronstein et al. (2021) found that deep learning models naturally develop curved manifold representations of data, reflecting the inherent hierarchies and symmetries present in real-world knowledge. This phenomenon is evident in word embeddings as well — when Mikolov et al. (2013) demonstrated that AI models can perform operations like “king” - “man” + “woman” ≈ “queen,” they were revealing an underlying structured representation of meaning, where words are mapped in a way that preserves their relationships rather than just their absolute positions. Chris Olah wrote about this back in 2014 as well.
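That “king” - “man” + “woman” ≈ “queen” relationship is easy to reproduce. The sketch below assumes gensim is installed; it downloads a small pretrained GloVe model on first run, and the exact results vary a bit across embeddings.

```python
import gensim.downloader as api

# Downloads a small pretrained GloVe model on first run (cached afterwards).
vectors = api.load("glove-wiki-gigaword-100")

# Directions in the embedding space encode relationships between concepts.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# Expect "queen" at or near the top of the list.
```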
What does this mean for AI reasoning? If a model learns a well-structured manifold, it can efficiently connect related ideas, making reasoning more natural and fluid. But if its knowledge is poorly structured — if key relationships are missing or distorted — the model will struggle to generalize, no matter how much inference-time computation is applied. This is why pre-training structured knowledge is critical: an AI system needs to internalize hierarchical reasoning structures, causal relationships, and efficient pathways through knowledge space before it can reason effectively.
For AI to reason effectively, it must not only store information but also organize it into meaningful hierarchies — much like how human knowledge is structured, where abstract principles guide specific details.
The challenge is that neural networks, by default, do not naturally arrange knowledge into explicit multi-level structures. Instead, they often learn dense, tangled representations that lack clear semantic organization. Recent research has made significant strides in learning concept hierarchies, ensuring that AI models develop structured, interpretable reasoning capabilities.
One of the most promising approaches comes from Kong et al. (NeurIPS 2024), who frame high-level concepts as latent causal variables embedded in a hierarchy. Their work formalizes the idea that abstract concepts — such as “dog breed” — govern lower-level features, like “ear shape” or “coat pattern,” forming a generative causal graph that captures the dependencies between concepts. Crucially, they demonstrate that, under specific conditions, these hierarchies can be learned in an unsupervised manner from raw data, without predefined labels. This theoretical advancement broadens the scope of concept discovery, moving beyond tree-like structures to more flexible, expressive nonlinear hierarchies that can handle complex, continuous inputs like images or multi-modal datasets.
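To make the hierarchy-as-causal-graph idea concrete, here’s a toy sketch (purely illustrative, not the paper’s model): a high-level latent variable, the breed, is sampled first and then governs the distributions of lower-level features like ear shape and coat.

```python
import random

# Toy causal hierarchy: a high-level concept governs lower-level features.
BREEDS = {
    "corgi":  {"ear_shape": ["pointed"], "coat": ["tan", "tricolor"]},
    "poodle": {"ear_shape": ["floppy"],  "coat": ["white", "apricot"]},
    "husky":  {"ear_shape": ["pointed"], "coat": ["grey", "black-white"]},
}

def sample_dog() -> dict:
    """Sample top-down: first the abstract concept, then the features it causes."""
    breed = random.choice(list(BREEDS))
    features = {k: random.choice(v) for k, v in BREEDS[breed].items()}
    return {"breed": breed, **features}

for _ in range(3):
    print(sample_dog())
# Unsupervised concept learning tries to recover the latent "breed" variable
# (and its causal role) from feature data alone, with no breed labels given.
```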
In practice, training models to discover and utilize hierarchical knowledge has been a long-standing challenge, particularly for deep generative models like Hierarchical Variational Autoencoders (HVAEs). Standard VAEs aim to encode data at multiple levels of abstraction, but they suffer from posterior collapse, where higher-level latent variables become uninformative. To address this, An et al. (CVPR 2024) introduced an RL-augmented HVAE, treating latent inference as a sequential decision process. Instead of passively encoding information, their model actively optimizes each latent level to ensure it contributes meaningfully to the overall representation. This method enforces a structured, multi-scale encoding where each layer captures progressively more abstract features — leading to models that not only generate better representations but also disentangle key concepts more effectively.
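As a small practical aside (separate from the paper’s RL machinery): posterior collapse is typically diagnosed by watching the KL term of each latent level, since a level whose KL sits near zero is carrying no information. A minimal numpy sketch, assuming diagonal Gaussian posteriors:

```python
import numpy as np

def kl_to_standard_normal(mu: np.ndarray, logvar: np.ndarray) -> float:
    """Mean KL( N(mu, diag(exp(logvar))) || N(0, I) ) over a batch, for one latent level."""
    kl_per_example = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return float(kl_per_example.mean())

# Fake encoder outputs for two latent levels (batch of 128).
rng = np.random.default_rng(0)
level1 = (rng.normal(0, 1.0, (128, 16)), rng.normal(0, 0.1, (128, 16)))   # informative
level2 = (rng.normal(0, 0.01, (128, 8)), rng.normal(0, 0.001, (128, 8)))  # nearly collapsed

for name, (mu, logvar) in {"level1": level1, "level2": level2}.items():
    kl = kl_to_standard_normal(mu, logvar)
    print(f"{name}: KL = {kl:.3f}", "(possible posterior collapse)" if kl < 0.1 else "")
```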
Another key development comes from Rossetti and Pirri (NeurIPS 2024), who focus on hierarchical concept learning in vision. Their approach dynamically builds a tree of image segments, starting from raw pixels and recursively grouping regions into semantically meaningful parts. Unlike prior models that impose a fixed number of segmentation levels, their method adapts to the complexity of each image, discovering the appropriate number of hierarchical layers on the fly. This work is particularly exciting because it demonstrates that hierarchical structure is not just an artifact of human annotation — it can emerge naturally from data, given the right learning framework! Their results suggest that AI models can build visual taxonomies of concepts in an unsupervised manner, revealing part-whole relationships without external supervision.
Beyond interpretability, structured knowledge discovery is also being leveraged for scientific discovery. Donhauser et al. (2024) demonstrated how dictionary learning on vision transformer representations can automatically extract biological concepts from microscopy images. By applying sparse coding to the latent space of a model trained on cellular images, they identified latent features corresponding to meaningful biological factors, such as cell types and genetic perturbations — none of which were manually labeled.
This work suggests that AI can hypothesize new scientific concepts simply by analyzing structure in data, offering a novel method for unsupervised knowledge discovery in domains where human intuition is limited.
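The general recipe, sparse dictionary learning on learned representations, is easy to sketch with scikit-learn. Below, random vectors stand in for ViT embeddings of microscopy images, so this illustrates the technique rather than the paper’s actual pipeline.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for ViT embeddings of cell images: 500 samples, 64-dim features.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 64))

# Learn a dictionary with sparse codes: each sample is explained by a few "concept" atoms.
dl = DictionaryLearning(n_components=16, alpha=1.0, max_iter=200, random_state=0)
codes = dl.fit_transform(embeddings)   # (500, 16) sparse activations per sample
atoms = dl.components_                 # (16, 64) candidate "concept" directions

print("mean active atoms per sample:", float((np.abs(codes) > 1e-6).sum(axis=1).mean()))
# In the real setting, each atom is then inspected to see whether it tracks a cell type,
# a genetic perturbation, or some other biologically meaningful factor.
```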
Taken together, these advances in hierarchical concept learning, structured representation learning, and interpretable AI point to a future where models do not just memorize and retrieve information, but learn to organize knowledge in ways that mirror the human brain. By ensuring that models internalize well-structured representations before reasoning even begins, we can improve both efficiency and generalization, reducing the need for brute-force inference-time search.
Once a model has structured knowledge, it still needs to navigate it efficiently — a process just as crucial as the knowledge representation itself.
If pre-training ensures that knowledge is well-organized, latent space traversal ensures that reasoning follows meaningful paths rather than taking inefficient or arbitrary routes. Recent research has demonstrated that effective latent traversal can significantly improve reasoning, controllability, and goal-directed generation, whether by leveraging geometric insights, optimization techniques, or learned policies.
A prime example of geometry-aware latent traversal comes from Pegios et al. (NeurIPS 2024), who explored how to generate counterfactual examples — instances that modify input data just enough to flip a classifier’s decision, while still looking like natural data points. Traditional counterfactual generation methods often struggle because latent spaces are nonlinear and highly entangled — naive methods like linear interpolation or gradient-based updates can result in unnatural, unrealistic outputs. Pegios et al. introduce a Riemannian metric for latent traversal, which redefines “distance” based on the impact that small latent shifts have in output space. By following geodesics — shortest paths defined by this metric — rather than arbitrary latent interpolations, they ensure that counterfactuals remain both realistic and effective. This method provides a general framework for outcome-driven navigation in latent space, showing that AI can traverse knowledge manifolds in a structured, principled way rather than relying on trial-and-error.
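The core geometric move is easy to sketch: measure latent distances through the decoder, using the pulled-back metric G(z) = J(z)^T J(z), where J is the decoder’s Jacobian. The PyTorch snippet below uses an untrained toy decoder and only measures the Riemannian length of a straight latent segment; the paper goes further and actually follows geodesics under such a metric.

```python
import torch
import torch.nn as nn
from torch.autograd.functional import jacobian

# Toy decoder from a 2-d latent space to a 10-d "data" space (untrained, for illustration).
decoder = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 10))

def riemannian_length(z_start: torch.Tensor, z_end: torch.Tensor, steps: int = 20) -> float:
    """Length of the straight latent segment, measured with the pullback metric J^T J."""
    total = 0.0
    for i in range(steps):
        z = z_start + (i + 0.5) / steps * (z_end - z_start)  # midpoint of segment i
        dz = (z_end - z_start) / steps
        J = jacobian(decoder, z)                             # (10, 2) Jacobian at z
        G = J.T @ J                                          # pullback metric
        total += torch.sqrt(dz @ G @ dz).item()
    return total

z0, z1 = torch.zeros(2), torch.ones(2)
print("euclidean latent distance:", torch.norm(z1 - z0).item())
print("riemannian (decoder-aware) length:", riemannian_length(z0, z1))
```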
A different but related approach treats latent space traversal as an optimization problem. Song et al. (ICLR 2024) introduced ReSample, a method for solving inverse problems — tasks where the goal is to recover missing or corrupted data using a generative model. Instead of passively sampling from a pre-trained model, ReSample actively constrains each step of the sampling process to satisfy known observations, such as available pixels in an image or partial MRI scans. By integrating hard consistency constraints directly into the diffusion sampling process, the method ensures that outputs are both plausible under the generative model’s learned prior and perfectly satisfy external constraints. This results in high-fidelity, deterministic reconstruction, improving over naive diffusion-based sampling by staying on the model’s learned manifold while enforcing strict objectives. The same principle — embedding constraints directly into latent search — is also being explored for controlled image editing and domain adaptation tasks.
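The essential pattern, interleaving prior-driven updates with a hard projection onto the observations, can be shown with a deliberately simplified stand-in: instead of a diffusion model, the “prior” below is just neighbor averaging on a 1-d signal, and the measurements are a handful of known samples. This is only the constraint-projection idea, not ReSample itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x_true = np.sin(np.linspace(0, 3 * np.pi, n))      # ground-truth signal we never see fully
observed = rng.choice(n, size=15, replace=False)   # indices where we have measurements
y = x_true[observed]

x = rng.normal(size=n)                             # start from noise
for _ in range(300):
    # "Prior" step: a crude stand-in for a denoising step (here, neighbor averaging).
    x = 0.5 * x + 0.25 * (np.roll(x, 1) + np.roll(x, -1))
    # Hard data-consistency projection: observed entries must match the measurements exactly.
    x[observed] = y

print("max error at observed points:", float(np.abs(x[observed] - y).max()))  # 0 by construction
print("mean error elsewhere:", float(np.mean(np.abs(x - x_true))))
```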
In scenarios where objectives are complex or non-differentiable (meaning they can’t be optimized by gradient based methods), reinforcement learning (RL) can be used to learn latent traversal policies. Lee et al. (ICML 2024) demonstrated this in protein design, where the goal is to generate new protein sequences with high biochemical fitness. Rather than using brute-force optimization, which often gets stuck in poor solutions, they modeled the problem as a Markov Decision Process (MDP) in latent space. Here, states correspond to latent codes, actions involve structured movements through the latent space (perturbing or recombining latent vectors), and rewards correspond to improvements in fitness metrics. By training an RL agent to optimize this process, they found that AI could systematically navigate to high-fitness regions of latent space, producing new protein sequences that outperformed prior search techniques. Some of these sequences were even experimentally validated, demonstrating the potential of learned latent traversal policies for real-world scientific discovery.
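A stripped-down version of that search loop, greedy hill-climbing over latent perturbations against a surrogate fitness function rather than a full RL policy, shows the basic shape of the idea. The decoder and the fitness oracle here are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode(z: np.ndarray) -> np.ndarray:
    """Hypothetical decoder: latent code -> candidate design (stand-in for a generative model)."""
    return np.tanh(z)

def fitness(design: np.ndarray) -> float:
    """Hypothetical fitness oracle (in the real setting: a biochemical property predictor)."""
    target = np.linspace(-1, 1, design.size)
    return -float(np.sum((design - target) ** 2))

z = rng.normal(size=16)                      # start somewhere in latent space
best = fitness(decode(z))
for _ in range(2000):
    # "Action": a structured move in latent space, here a small Gaussian perturbation.
    candidate = z + 0.05 * rng.normal(size=z.shape)
    score = fitness(decode(candidate))
    if score > best:                         # "reward": keep moves that improve fitness
        z, best = candidate, score

print("best surrogate fitness found:", best)
```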
This is something I’m personally excited about, because these techniques have direct applicability to automating systems: workflows built from code and APIs.
These techniques — geodesic search, constraint-based optimization, and RL-guided search — are examples of how AI can move through its learned knowledge space in an efficient, structured way rather than relying on brute-force computation. Just as human thought follows structured pathways rather than randomly jumping between ideas, AI models must learn to traverse their latent spaces in ways that reflect meaningful relationships between concepts.
This is the missing half of pre-training: without intelligent search mechanisms, even the best-structured knowledge can become inaccessible, forcing models to fall back on inefficient heuristics. By integrating geometry, optimization, and learned policies, AI can not only store knowledge but reason through it effectively.
Sebastian Raschka’s blog post provides an in-depth analysis of inference-time compute scaling, so I won’t repeat the full argument here. However, to summarize, several key techniques have emerged to refine AI reasoning at runtime. Self-backtracking (2025) allows models to detect when they have taken an unproductive reasoning path and restart, preventing them from getting stuck in local optima. Test-Time Preference Optimization (TPO, 2025) improves response quality by iteratively refining answers based on the model’s own outputs, effectively allowing it to adjust its reasoning dynamically. Meanwhile, Tree-of-Thought Search (CoAT, 2025) enhances multi-step exploration by enabling structured, branching pathways that improve reasoning depth.
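Schematically, most of these methods share a skeleton: propose several next steps, score them, and abandon (backtrack out of) branches that fall below some bar. The sketch below uses hypothetical propose_steps() and score() stubs to show only that control flow, not any particular paper’s algorithm.

```python
import random

random.seed(0)

def propose_steps(partial, k=3):
    """Hypothetical generator stub: in practice, k sampled continuations from the model."""
    return [f"step{len(partial)}-{i}" for i in range(k)]

def score(partial):
    """Hypothetical verifier stub: in practice, a learned scorer or the model's self-evaluation."""
    return random.random()

def search(max_depth=4, threshold=0.3):
    """Depth-first search over reasoning steps, backtracking out of low-scoring branches."""
    def expand(partial):
        if len(partial) == max_depth:
            return partial
        scored = sorted(
            ((score(partial + [s]), s) for s in propose_steps(partial)), reverse=True
        )
        for s, step in scored:
            if s < threshold:
                continue                 # prune this continuation
            result = expand(partial + [step])
            if result is not None:
                return result
        return None                      # every branch from here looked weak: backtrack
    return expand([]) or []

print(search())
```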
These methods demonstrate that inference-time scaling can significantly enhance AI reasoning — but they rely on the assumption that the model has a well-structured knowledge base to begin with. If an AI lacks a strong conceptual foundation, additional compute alone will not compensate for poorly learned representations. Inference scaling refines the search process, but it cannot create structure where none exists.
Reasoning in AI isn’t a single problem — it’s the combination of learning structured knowledge, navigating it efficiently, and refining answers when needed.
• Pre-training defines the structure of knowledge.
• Latent space traversal determines how efficiently models search through it.
• Inference-time compute scaling refines answers dynamically.
The most powerful AI systems of the future won’t just think longer. They will think in the right space, in the right way, at the right time.
--
If you have any questions or thoughts, don't hesitate to reach out. You can find me as @viksit on Twitter.