Latent Space: The AI Engineer Podcast Podcast Summary — Free Daily Recap

Latest Episodes


Yesterday · 1h 9m
Codex from 0 to 10M Users: Building ChatGPT Work — Akshay Nathan, OpenAI
There are roughly 100x more people who use code than who can write code. As code that “just works” becomes easier to generate, this group may be the biggest prize of all, if you can get the agentic interface right.

A key trend we have been tracking over at AINews is the absolute explosion in Codex usage this year, with MAU now up >10x from Jan 2026. Less than two weeks after their July 9th launch, OpenAI said ChatGPT Work and Codex had reached 10 million users combined (as we cover in the pod, Codex now powers ChatGPT Work, so all ChatGPT Work users are now users of the Codex harness, even if they aren’t traditional engineers), showing the early innings of what happens when you graduate from coding agents to knowledge work agents.

We’ve been calling out how coding agents are “breaking containment” this year to power every other part of knowledge work. It started with the org chart: a major reorg last month put two of Codex’s most prominent leaders, Greg and Tibo, in charge of product and ChatGPT specifically, completing a “Superapp” consolidation cycle first discussed in March.

With these updates Codex is no longer just a coding tool. In June, OpenAI said knowledge workers already accounted for roughly 20% of Codex’s user base and were growing more than 3x as quickly as developers. A product dedicated to knowledge workers was being pulled out of the Codex team.

However, knowledge work has a different set of problems and environments than coding. For decades, knowledge work has been scattered across different primitives like documents for writing, spreadsheets for analysis, slide decks for communication, and specialized applications for everything else. ChatGPT Work now enables users to work across every primitive with agents. Instead of opening an application and manually operating its features, the user can describe an outcome and collaborate with an agent that can assemble the tools, context, and artifacts needed to reach it.

From building no-code products at Airtable to leading Productivity Engineering at OpenAI, Akshay Nathan has spent much of his career trying to make the power of software accessible to people who do not write code. In this episode, Akshay joins swyx and Vibhu to unpack the launch of ChatGPT Work, why Codex unexpectedly took off among non-developers inside OpenAI, and the company’s broader plan to bring useful agents from software engineers to knowledge workers and eventually everyone.

We go deep on the shared agent harness behind Codex and ChatGPT Work, why OpenAI brought the experiences together without making them identical, and how persistent computers, artifacts, Sites, plugins, memory, and sub-agents are changing what people can delegate to AI. Akshay explains why some teams are replacing decks and spreadsheets with interactive websites, how agents can gather context across code, Slack, documents, and local files, and what OpenAI learned from personal-agent products like OpenClaw.

Side note: also don’t miss Abhihek’s sandbox track keynote at AIE, which now powers a lot of the sandboxing for ChatGPT Work… and yes, was also broken by an unreleased OpenAI model in the recent HuggingFace incident.

Akshay also reflects on how AI is transforming product development itself: why more people will become generalists with a specialty, why ideas and taste become the bottlenecks when almost anyone can build, why LLMs still struggle to generate genuinely grounded new ideas, and why teams must distinguish increased motion from actual progress.

We discuss:
* Why Codex unexpectedly took off among non-developers inside OpenAI
* Why employees felt like using Codex gave them a new superpower
* The product insight that led OpenAI to build ChatGPT Work
* Why Codex and ChatGPT Work share the same underl
6 days ago · 1h 54m
Inside the Model Factory — Eiso Kant, Poolside AI
In recent months, the open vs closed and US vs China discussions on model ownership and sovereign/local AI have heated up to a fever pitch. So it is very good news that Poolside AI are finally emerging with new models, like Laguna S 2.1, that are beating Thinking Machines’ recent release, which is nearly 10 times their size.

Poolside’s recent tech report got a lot of praise for its level of detail, and Vibhu first covered Laguna’s recent technical report on our paper club.

From spending $12 million building language models for code before the world cared to creating a Model Factory that can take a model from pre-training to release in eight weeks, Eiso Kant has spent more than a decade betting that code is the path to AGI. In this episode, the Poolside co-founder joins swyx and Vibhu to explain why ChatGPT felt like vindication, why Poolside embraced open weights and open research, and why he would rather live in a world with 100 foundation model companies than five, even if Poolside were one of the five.

We go deep on Poolside’s Model Factory: the engineering systems behind 10,000–20,000 experiments per month, streaming data directly into training, reproducible experimentation, low-precision compute, and agents that increasingly write code, launch jobs, evaluate results, and modify the pipelines used to train future models. Eiso also unpacks their recent launch, Laguna S, why persistence, verification, and backtracking may matter more than raw intelligence, how much capability remains inside smaller models, why reinforcement learning will move earlier into pre-training, and why next-token prediction is still extracting too little from the web.

We also discuss model-harness co-design, Poolside’s path from coding agents to AGI, why Eiso thinks MCP and traditional tool calls are “stupid,” the real economics behind frontier-model training, Poolside’s $500 million raise, open-source AI, regulation, NVIDIA and TSMC’s influence, engineering productivity in the agent era, high-agency teams, and hiring at Poolside.

We discuss:
* How Andrej Karpathy’s RNN work inspired Eiso to start building language models for code in 2015
* Why Eiso spent four years and $12 million pursuing an idea before the market cared
* Why ChatGPT felt like vindication and brought Poolside back to open source
* Why Eiso would prefer 100 foundation model companies over an oligopoly of five
* The difference between releasing open weights and publishing genuinely open research
* Why Poolside deliberately built a global research organization outside the Bay Area talent war
* Why model building is ultimately 90% engineering
* The Model Factory: Poolside’s end-to-end system for rapidly training and improving models
* How fewer than 70 researchers run roughly 10,000–20,000 experiments each month
* How Poolside moved from six-month model cycles to five- and eight-week launches
* Why streaming data directly into training unlocked faster experimentation
* How immutable data, versioned code, and reproducibility enable rigorous model research
* Why Eiso wants capable researchers to leave their labs and become Poolside’s competitors
* Why 95% of model building can be reduced to better data or compute efficiency
* Laguna S and why persistence, verification, and backtracking can outperform raw intelligence
* Why smaller models may handle far more knowledge work than previously expected
* Why reinforcement learning will move earlier into pre-training
* Why next-token prediction is still failing to extract enough knowledge from the web
* Why distillation and environments have become the AI industry’s favorite “drugs”
* Why mid-training is really an early form of curriculum design
* Low-precision training, networking bottlenecks, and the next ga
1 week ago · 1h 29m
Causal Models Need Causal Data - Xaira’s X-Cell model for Drug Discovery (Bo Wang & Ci Chu, Chief Discovery Officer & Chief AI Scientist)
Bet on information

If test loss flatlines after 1.5B parameters while training loss continues to drop as you scale, that tells you that your model is limited by the amount of information in your data.

Training on a single, smallish data set exposed an information gap: the 3.1B model falls off the scaling trend. Neither parameters nor compute will improve performance past this wall. For predicting changes to gene expression, you need more information-rich data.

This is what Chu and Bo’s teams have done, and here is what ~30x the information buys you: now we can scale with parameters and training compute! We don’t know how much this effort cost, but we can guess that data collection experiments and infrastructure were a few tens of millions, and compute + headcount + research was a few million. The budget looks like an RL rollout budget, rather than a data-rich pre-training one.

We were lucky enough to have the two central figures in this story on our podcast. Taking the lead from Ci Chu and Bo Wang, Xaira Therapeutics is betting that information-rich data is the key to AI-driven drug development. Chu was recently promoted to Chief Discovery Officer and Bo to Chief AI Scientist, underscoring just how strategic Xaira considers this bet.

Reverse engineering the human cell

If you had to figure out how a human cell works, what would you do? A good place to start might be by documenting what genes are expressed (e.g. what RNA is floating around) in different kinds of cells, in different circumstances.

That is CELLxGENE, a database of 168M cells built by the Chan Zuckerberg Initiative that maps each cell to a count of how many times 20K-30K genes were detected in that cell, plus detailed metadata about every cell. A ~4 trillion-entry matrix.

If the Protein Data Bank (PDB) unlocked structural biology models, CELLxGENE has done the same thing for Virtual Cell models. Like PDB, CELLxGENE has inspired a zoo of AI models of RNA expression; so much so that RNA expression models have become synonymous with Virtual Cell models. Bo Wang built one of the most influential, scGPT, which became the starting point for Xaira’s new model.

RNA expression ≠ Virtual Cell

Models trained on CELLxGENE describe the relationship between cell types and cell states, but they are not good at predicting what will happen if we make changes to RNA expression. Changes in gene expression are highly correlated, and it is difficult (often impossible) to figure out what causes what.

If you could “turn the dial down” on one gene at a time, however, then you would be able to observe what is upstream and downstream of a given gene. You could tell if A → B & C, or B → A & C, or B → A, C → B → … If you did this for all of the genes, then maybe you could train a model that could predict what would happen to a cell if you change a gene (e.g. with a drug or a gene edit). Or maybe you could figure out the least invasive way to change a particular gene’s expression.

X-Atlas → X-Cell

This is exactly what Chu and Bo’s teams have done. The data set is called X-Atlas and the model is called X-Cell.

In this episode, we discuss:
* Why the team abandoned autoregression for diffusion
* The CRISPR-based experiments that run millions of tests in parallel, and generate the raw data for X-Atlas and X-Cell
* Generalization to real lab experiments in real human cells
* Beating the linear baseline that has outperformed previous models
* Justifying a kitchen-sink of priors, and how that stacks up vs. data and architecture

Bo also shared with us some of the (major) advantages he has as an academic vs. industry leader, and how his labs keep up with the breakneck pace of AI innovation.

Check out the full episode on YouTube, or your favorite podcasting platform!

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
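As a quick sanity check on the CELLxGENE figures quoted in the X-Cell episode notes above (168M cells, 20K-30K genes per cell, a "~4 trillion-entry matrix"), the arithmetic can be sketched in a few lines of Python. The cell and gene counts come from the notes; the midpoint of 25K genes and the assumption of a dense cells-by-genes matrix are ours, purely for illustration.

```python
# Back-of-the-envelope check of the "~4 trillion-entry matrix" figure.
# CELLS and the gene range are taken from the episode notes; treating the
# data as a dense cells x genes matrix and using the midpoint gene count
# are illustrative assumptions, not details from the source.
CELLS = 168_000_000
GENES_LOW, GENES_HIGH = 20_000, 30_000


def matrix_entries(cells: int, genes: int) -> int:
    """Number of (cell, gene) count entries in a dense cells x genes matrix."""
    return cells * genes


low = matrix_entries(CELLS, GENES_LOW)
high = matrix_entries(CELLS, GENES_HIGH)
mid = matrix_entries(CELLS, (GENES_LOW + GENES_HIGH) // 2)

print(f"low:  {low / 1e12:.2f} trillion entries")
print(f"mid:  {mid / 1e12:.2f} trillion entries")
print(f"high: {high / 1e12:.2f} trillion entries")
```

Under these assumptions the range runs from about 3.4 to about 5.0 trillion entries, so "~4 trillion" is a reasonable round number for the middle of the gene range.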
1 week ago · 1h 41m
🔬 The Lab of the Future Should Feel Like a Data Center — Andy Beam & Rafa Gómez-Bombarelli, Lila Sciences
Imagine a dark warehouse. Racks and racks of devices with wires, tubes, and electronics sticking out. The next AI data center? No. This is Lila Sciences’ dream for the future of science. A dark warehouse full of AI-guided robotics and lab equipment, cranking out new experiments 24/7, building toward a scientific superintelligence.

Their automated lab is almost hypnotizing to watch. Floating plates zip around on Wall-E-esque tracks, vision-language models control Windows 95 boxes, and the team has created the world’s largest collection of voided warranties. In the process they’ve built a massive library of scientific reasoning tokens. Over 10 trillion of them, all experimentally validated.

No warranties were voided in the making of this video

To say Lila is ambitious is an understatement. Their goal is a scientific superintelligence wired directly into the wet lab. They are all in on the bitter lesson, and the thesis follows from it: a lab is an infinite token generator. Produce data at scale, and the synergies give you a general reasoner that can tackle any scientific problem. They are committing hard. Biology, chemistry, drug discovery, and materials science, all at the same time. Time will tell if it works, but it is an exciting hypothesis.

In our latest episode we sat down with Lila’s very own Andy Beam (CTO) and Rafa Gómez-Bombarelli (CSO, physical sciences) and went on a journey through the possibilities of AI-run science, almost as wide-ranging as Lila’s goals.

Did we mention they do both materials science and biology? In the same AI science factory? Same time, same lab, same AI. Finally a guest who can settle a long-running debate we’ve had amongst ourselves: is biology or materials science harder?

Watch to find out!

We discuss:
* The internet is spent, science is next. Why Lila thinks the scientific method is the last untapped internet-scale dataset, and why they treat RL as a data generation mechanism with nature as the verifier.
* The lab as a data center. Instruments as nodes on a graph, a magnetically levitating “PCI bus” transport layer between them, orchestration as a slurm queue. Andy is not short on analogies.
* Why Lila insists it is not an automation company. They optimize for flexibility and generalizability over raw throughput, which means humans stay below the API line wherever automating does not pay.
* Your experiment has a runtime. We put Escalante Bio’s question to Andy: if science is the token generator, what is the runtime of your data collection? His answer, in short, is that you cannot make the ribosome go faster. Why Lila bets on fast round-over-round iteration rather than big noisy multiplexed screens, and how Rafa’s team rebuilt a gas sorption measurement to run roughly 2,500x faster.
* What is actually in 10 trillion scientific tokens. Not sequences. Experimentally verified reasoning traces, a kind of data that Andy argues exists on the internet in quantities that round to zero.
* Breadth as a path to depth. Small molecule chemistry priors transferring to metal organic frameworks for carbon capture, and the claim that the general model beats domain-specific models sample for sample.
* If you have the data, what do you need the model for? Sri Kosuri’s koan about the ML-for-drug-discovery business model, and Andy’s answer: the coding model got better because it also read Shakespeare and carnitas recipes.
* The serendipity they want to automate. Emily Whitehead survived the first pediatric CAR-T cure only because the doctor treating her happened to know, from pediatric arthritis, which antibody would blunt her IL-6 response. Roll those dice again and you probably lose her. Breadth is how you stop depending on luck.
* Move 37 for catalysts. Model suggestions for platinum-group-free electrocatalysts that went from boring, to what a 40-paper expert called stupid, to the best performers they have made.
* Six months to in vivo CAR-T data in non-human primates, and the zero-FTE virtual startup commercial model that fell out of it. For context on why that number is startling, AbbVie paid $2.1B
2 weeks ago · 57 min
Why AI Infrastructure must evolve for Agent Experience — Akshat Bubna, Modal CTO
We’ve been running a bit of an Agent Cloud series surveying all the top inference/compute/cloud providers, from Databricks to Daytona to Railway and, even further back, E2B, but we’re excited to conclude this series by returning to Modal, which has just raised a monster $355M Series C.

The cloud was built for developers. But agents are now changing that.

The old infra stack was designed for a human who could read docs, reason through YAML, and understand dashboards to figure out what they needed when something broke. While this was painful for developers, it worked since they could fill in missing context in their heads.

However, agents don’t have that luxury. Now in this new era of agents, everything has to be tighter.

They need a place to write code, run it, inspect the output, change the environment, debug failures, and try again. Fast iteration and feedback loops with all the necessary context are crucial for agents to operate properly. Furthermore, sandboxes are a clear representation of this shift as agents can easily spin up isolated environments. This programmatic infra even extends to research.

Two years ago, we were one of the first to cover Modal with CEO Erik Bernhardsson, and Alessio designed our favorite LS thumbnail of all time.

At the time, Modal was just a teeny little company with a $17M Series A.

Today, fresh off their $355M Series C, Modal is one of the clearest examples of the agent cloud future being built in real time: a cloud platform moving past traditional web app assumptions toward the workloads AI actually creates, such as elastic inference, sandboxes, GPU burst, post-training, background agents, and infrastructure that agents themselves can operate.

In this episode, Modal CTO Akshat Bubna joins swyx and Vibhu to unpack why AI applications don’t fit traditional cloud assumptions, why Kubernetes was never designed for bursty compute-heavy workloads, and why Modal is now shifting from developer experience to agent experience.

We go deep on Modal’s AI infra stack: serverless functions, decorator-based infrastructure, elastic inference for custom models, GPU snapshotting, DeFlash, speculative decoding, Auto Endpoints, sandboxes, persistent storage, networked containers, private IPv6, RDMA, multi-node training, and Modal’s capacity pool across 17 cloud providers. Akshat also explains why RL rollouts can require 100,000 sandboxes, why production agents need hard guardrails, why observability may matter more than reading code, and why AI has made infrastructure exciting again.

We discuss:
* Why Kubernetes wasn’t built for bursty AI workloads
* How Modal started as a better runtime before becoming an AI cloud
* Why Modal added GPUs before ChatGPT
* The shift from developer experience to agent experience
* Why observability matters when agents are writing the code
* Elastic inference for custom models across audio, video, robotics, and comp bio
* GPU snapshotting, cold starts, and why inference workloads are so bursty
* Why RL rollouts can require 100,000 sandboxes
* DeFlash, speculative decoding, and frontier-level inference performance
* Auto Endpoints and making optimized inference easier to deploy
* What Modal adds beyond vLLM, SGLang, and raw GPU rental
* Modal’s 17-cloud capacity pool and supercloud strategy
* Networked sandboxes, sidecars, private IPv6, and RDMA
* Serverless multi-node training for post-training and research workloads
* Auto-research, model-guided sweeps, and agents launching GPU experiments
* Compute strategy, capacity planning, and batch tiers
* Why production agents need specialized sandboxes and hard g
4 weeks ago · 1h 48m
🔬 The Coolest Diffusion Research Isn't in LLMs — Evan Feinberg & Sergey Edunov, Genesis Molecular AI
This episode has a fun personal twist: There’s a counterfactual world where I was employee #1 at Genesis Molecular AI, the company behind today’s episode. A certain introduction happened a few weeks too late and I had already happily signed at Atomwise, another ML-for-drug-discovery startup. Same problem, different company. I was certain ML was going to transform small molecule drug discovery. Early results were underwhelming. Useful at times, but nowhere near revolutionary. In the last year I’ve seen signs that ML is finally ready to deliver on my convictions from a decade ago. Genesis is one of the places that might have finally cracked this problem. I was super excited to come full circle and catch up with co-founder Evan Feinberg and CTO Sergey Edunov.

If you are at all interested in small molecule drug discovery, we think you will find this fascinating!

In our nearly two hour chat we cover:
* What is small molecule drug discovery, and why is it hard
* Structure prediction as a hotbed of innovation in AI algorithms
* How advances in AI elsewhere have enabled stepwise improvements in predictive power
* How the community benchmarks are essentially calling AI slop good enough
* How the Genesis flagship model (PEARL) can routinely hit a threshold that is necessary for real-world applications
* New agentic workflows enabled by these highly accurate models

Read on for more, and also some personal thoughts on the future at the end.

The coolest diffusion research is happening at Genesis

Sergey Edunov came to Genesis from Meta, where he led Llama 2 training and Llama 3 pretraining. Sergey is a former physicist who thought he was done with physics after many years of training LLMs. Then he discovered Genesis, and was blown away by all the novel architecture work they’ve been developing.

It probably surprises no one that modern LLM research has not resulted in fundamentally novel or exciting updates in architectures since almost the advent of the transformer: the entire field is using variants on the same idea that came out in the original “Attention is all you need” paper. Sure, some were quite useful (mixture-of-experts in particular allowed for the massive model paradigm we’re in today), but there was very little that was conceptually exciting.

“We sort of had to wait for the right primitive to get created, and that turned out to be diffusion… Actually, some of the most innovative diffusion research that’s happening in our field is happening in 3D structure prediction right now.” (Evan Feinberg)

The field of 3D structure prediction, on the other hand, has been a hotbed of research. Genesis’ recent model PEARL (Place Every Atom at the Right Location) is able to understand protein flexibility, and model not just where the ligand goes, but also make small adjustments of the protein so that the two fit better than either alone. The field knew this was missing for a long time, but it was really hard to model until now.

Agentic Discovery

What makes this problem so hard? As Sergey points out, there are 10^60 possible drug-like small molecules. You’ll never be able to search them all, and trying to find the good ones is something like finding a needle in a haystack, except that everything other than your needle is dangerous.

“There are 10 to the 60 drug-like small molecules in the universe… it’s like finding a needle in a haystack, where everything except your needle is very, very dangerous.” (Sergey Edunov)

“Or finding hay in a needle stack might be a more apt analogy.” (Evan Feinberg)

Trying to solve the multi-parameter optimization problem is even worse. What makes a strong binder and what makes a molecule with good “ADMET properties” are often in tension with each other. For example, a good binder is likely greasy, but a greasy molecule is likely insoluble, so it won’t enter the bloodstream and get to where it needs to go!

Genesis’ advances in generative AI have now pushed them beyond the threshold where they believe agentic drug discovery loops are finally possible. We all remember the early days of LLMs. They were great chatbots but terrible agents, as small errors compounded rapidly into uselessness. As LLMs got better, the usefulness of agents rapidly improved. Evan and Sergey argue that their models at Genesis recently passed a similar threshold. Their internal agentic drug-discovery system (code named SAPPHIRE) can now iterate like a chemist: look at and reason about poses, form hypotheses, read literature, use internal tools, create candidates for the next iteration. Combining this with automated lab partnerships like the one Genesis has with <a targ
Jun 24, 2026 · 1h 8m
Why the Frontier Ecosystem must be Open — Matei Zaharia and Reynold Xin, Databricks
We’re excited to have Databricks join us at AIEWF, among hundreds of the top companies in the AI Engineer ecosystem. LS subscribers can use their discount to get past the late bird pricing and access over $50k in sponsor offers! Everyone is still talking about Satya’s Frontier Ecosystems post, but few have actually built a (now $175 billion) frontier ecosystem and cloud like our guests today.

From open-sourcing the layer above coding agents to rethinking databases for the agent era, Databricks cofounders Matei Zaharia and Reynold Xin are pushing the company beyond the lakehouse into a full data-and-AI operating system. In this episode, Matei and Reynold join swyx at the 2026 Data + AI Summit to unpack Omnigent, LTAP, Lakebase, agent security, open formats, Mosaic, and why databases may matter more than ever once AI agents start doing real work.

We go deep on Omnigent: Databricks’ open-source meta-harness for combining, controlling, and sharing agents across Claude Code, Codex, Cursor, Pi, custom agents, and internal tools. Matei explains why coding agents and enterprise agents run into the same problems: portability, collaboration, session history, security, spend controls, and the need for a common API above every harness.

Then Reynold walks through Databricks’ database dream: why CDC is brittle enough to joke that it means “continuous data corruption,” why HTAP has been the holy grail of database engineering, and why Databricks thinks LTAP gets most of the benefits by unifying the storage layer instead of collapsing every query engine. We also cover Databricks’ infrastructure scale, the culture behind rapid prototyping, the difference between tech and enterprise customers, Databricks vs Snowflake, whether vector databases should ever have existed, the Mosaic model strategy, Genie, AI Runtime, RL fine-tuning, and the thesis that traditional software gets rewritten once the data is in the right place and agents sit on top.

Databricks began as a company for the big data era. Spark, which originated at the Berkeley AMPLab and eventually grew into the Lakehouse product, convinced enterprises that they didn’t need a separate data lake, warehouse, ML platform, and governance layer. They just needed one open foundation where all of their data could live and be reasoned over.

Since then a lot has changed, but data has only become more important. Data is no longer something you keep track of and analyze ad hoc; it’s the necessary context agents need in order to act. So the framing has shifted from “where do we put all of our data?” to “how do we expose the right slice of state, history, permissions, and business logic to an AI system at the exact moment it’s doing work?”

If frontier model performance becomes commoditized, the durable advantage then becomes the company-specific context around the models: proprietary data, governed access, operational state, transaction logs, workflows, and feedback loops. That leaves Databricks perfectly positioned.

Now coming fresh off the Data + AI Summit 2026, the company is moving just as fast to keep up, announcing <a target="_blank" href="https://www.databricks.com/company/newsroom/press-releases/databricks-launches-genie-one-all-new-agen
Jun 22, 2026 · 1h 6m
Red-Teaming after Mythos — Zico Kolter & Matt Fredrikson, Gray Swan
AI Engineer World’s Fair regular bird tix will sell out ~today! Join us next week ahead of the Late Bird price hike and get >$40,000 in sponsor credits for attending!

Thanks to the US Government issuing an export control directive on Mythos and Fable, the risks of jailbreaks and (industry term) indirect prompt injection are suddenly the talk of the town, though we have been covering AI security for a few years now, from Hackaprompt to the enigmatic Pliny the Elder.

Zico Kolter, member of OpenAI’s board of directors on the Safety & Security Committee, and Matt Fredrikson, CMU professor and CEO of Gray Swan, co-authored the definitive paper on Indirect Prompt Injections, and Gray Swan were cited authorities on the Mythos model card, directly investigating the exact capabilities that are under scrutiny right now.

We seized the opportunity to ask them about the state of AI Red Teaming, and about Shade, the adversarial red teaming tool that Anthropic used to evaluate the robustness of their models against prompt injection attacks in coding environments. Shade is part of their overall toolkit covering Simon Willison’s Lethal Trifecta, including Cygnal, an AI guardrails product, and the world’s largest AI Red Teaming Arena, including AIRT celebrity Wyatt Walls.

All of this security tooling, and yet, we’re only staving off the inevitable.

The risks of extremely smart AI increasingly feel like gray swan events: events that everyone can see coming. In this episode, Gray Swan cofounders Zico Kolter and Matt Fredrikson join swyx to explain why AI security is not just “cybersecurity with AI,” why agents introduce a new class of vulnerabilities, and why the next major AI incident may be a gray swan: unlikely, but clearly visible before it happens.

We go deep on prompt injection, automated red teaming, model robustness, agent identity, computer-use agents, enterprise guardrails, and the emerging AI insurance/compliance stack. Zico and Matt also explain why frontier models are not automatically safer as they scale, why specialized red-teaming models can now beat humans at breaking AI systems, and why the future of AI security may depend on AI systems attacking, defending, and interpreting other AI systems.

We discuss:
* Why AI systems need a different security mindset from traditional software
* How prompt injection creates a new exploit class for agents like Codex and Claude Code
* Gray Swan Arena and the rise of community red teaming
* Shade: AI that can outperform humans at breaking models
* Why LLMs are an alien form of intelligence that fail differently from humans
* Human vs browser-agent robustness and why humans ranked fourth
* Why eval awareness and capability elicitation matter
* Cygnal: Gray Swan’s guardrail model for policy enforcement
* Why bigger models do not automatically become more robust
* The lethal trifecta: untrusted data, private data, and exfiltration
* Why “just prompt it better” is not enough for enterprise AI security
* OpenClaw, computer-use agents, and the agent security nightmare
* Agent-native identity, permissions, and enterprise deployment
* Why AI security may become p

About Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to the first introduction to the tech you'll be using in the next 3 months! We break news and publish exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al.

By Latent.Space

Science, Technology

Customized Recaps

AI-powered recaps with compact key takeaways, quotes, and insights.

Straight to Your Inbox

Get key takeaways from Latent Space: The AI Engineer Podcast in a 5-minute read.

Save Hours Every Week

Stay current on your favorite podcasts without falling behind.

Frequently Asked Questions

What is Podzilla's Latent Space: The AI Engineer Podcast daily summary?

It's a free AI-powered email that summarizes new episodes of Latent Space: The AI Engineer Podcast as soon as they're published. You get the key takeaways, notable quotes, and links & mentions — all in a quick read.

How does the Latent Space: The AI Engineer Podcast summary work?

When a new episode drops, our AI transcribes and analyzes it, then generates a personalized summary tailored to your interests and profession. It's delivered to your inbox every morning.

Is this an official Latent Space: The AI Engineer Podcast product?

No. Podzilla is an independent service that summarizes publicly available podcast content. We're not affiliated with or endorsed by Latent.Space.

Can I get summaries of other podcasts too?

Absolutely! The free plan covers up to 3 podcasts. Upgrade to Pro for 15, or Premium for 50. Browse our full catalog at /podcasts.

How often does Latent Space: The AI Engineer Podcast release new episodes?

Latent Space: The AI Engineer Podcast publishes every few days. Our AI generates a summary within hours of each new episode.

What topics does Latent Space: The AI Engineer Podcast cover?

Latent Space: The AI Engineer Podcast covers science and technology topics. Our AI identifies the specific themes in each episode and highlights what matters most to you.

Start getting Latent Space: The AI Engineer Podcast summaries tomorrow morning.

Free forever for up to 3 podcasts. No credit card required.


