This week, Anthropic, a leading AI company, showcased an ambitious coding experiment with its Claude Opus 4.6 model. The results are exciting, but there are important caveats to consider.
On Thursday, Nicholas Carlini, a researcher at Anthropic, described the project. He set up 16 instances of Claude Opus 4.6 to work on a shared codebase with minimal guidance. Their goal? To build a C compiler from the ground up.
Over two weeks, these AI agents ran nearly 2,000 coding sessions, costing around $20,000 in API fees. The result is a Rust-based compiler of about 100,000 lines of code. This compiler can build a bootable Linux 6.9 kernel for several architectures, including x86, ARM, and RISC-V.
Carlini, who brings years of experience from Google Brain and DeepMind, used a new feature known as “agent teams.” Each Claude instance ran in its own Docker container. Each one cloned a shared Git repository, picked up tasks autonomously, and submitted its completed work. When problems such as merge conflicts arose, the agents resolved them on their own.
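The article does not say how the agents avoided picking up the same task. One common approach, shown here purely as an illustrative sketch and not as Anthropic's actual mechanism, is for each worker to claim a task by atomically creating a lock file. The function names, the lock directory, and the task names below are all hypothetical.

```python
import os
import tempfile


def try_claim(lock_dir: str, task_id: str, agent_id: str) -> bool:
    """Try to claim a task by creating its lock file.

    O_CREAT | O_EXCL makes creation fail if the file already exists,
    so only one worker can win the claim for a given task.
    """
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another worker already holds this task
    with os.fdopen(fd, "w") as f:
        f.write(agent_id)  # record who owns the task
    return True


def claim_first_free(lock_dir: str, tasks: list[str], agent_id: str):
    """Return the first task this worker manages to claim, or None."""
    for task_id in tasks:
        if try_claim(lock_dir, task_id, agent_id):
            return task_id
    return None


if __name__ == "__main__":
    # Hypothetical task names; a temporary directory stands in for shared storage.
    shared = tempfile.mkdtemp()
    todo = ["lexer", "parser", "codegen"]
    for agent in ("agent-1", "agent-2"):
        print(agent, "claimed", claim_first_free(shared, todo, agent))
```

In a setup built on a shared Git repository, the same idea would need the lock to be committed and pushed so that other workers see it; this sketch only covers the single-filesystem case.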
The outcome? Anthropic has made the compiler available on GitHub. It can compile notable open-source projects like PostgreSQL, SQLite, Redis, FFmpeg, and QEMU. It also passes 99% of the GCC torture test suite. As a crowning achievement, it even compiled and ran the classic game Doom.
However, impressive as this accomplishment is, it’s important to recognize that building a C compiler is a nearly ideal task for AI. The specification is long-established and clear, comprehensive test suites are readily available, and there is a reliable reference compiler to compare against. In contrast, many real-world software projects lack these advantages. Often, the real challenge in software development isn’t writing code that passes the tests but deciding what the tests should be in the first place.
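To make the "reference compiler" point concrete, here is a minimal sketch of differential testing: run the same program as built by a trusted compiler and by the compiler under test, then check that both behave identically. This is an assumption about how such a comparison could be wired up, not a description of how the GCC torture test suite or Carlini's setup works. The commands are placeholders; in practice they would be binaries produced from the same C source by the two compilers.

```python
import subprocess
import sys


def run(cmd: list[str], timeout: float = 10.0) -> tuple[int, str]:
    """Run a command and return its (exit code, stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout


def same_behavior(reference_cmd: list[str], candidate_cmd: list[str]) -> bool:
    """True when both programs exit with the same code and print the same output."""
    return run(reference_cmd) == run(candidate_cmd)


def pass_rate(pairs: list[tuple[list[str], list[str]]]) -> float:
    """Fraction of (reference, candidate) pairs that behave identically."""
    if not pairs:
        return 0.0
    passed = sum(same_behavior(ref, cand) for ref, cand in pairs)
    return passed / len(pairs)


if __name__ == "__main__":
    # Stand-ins for compiled binaries: small Python one-liners.
    reference = [sys.executable, "-c", "print(6 * 7)"]
    candidate = [sys.executable, "-c", "print(42)"]
    print("match:", same_behavior(reference, candidate))
```

The check only works because a trusted reference exists. For a project with no reference implementation, someone still has to decide what the correct output is.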
This highlights a crucial aspect of AI-driven coding: while AI can excel at specific tasks with well-defined parameters, navigating the complexities of real-world projects remains a significant hurdle. As we continue to develop these technologies, understanding their strengths and limitations will shape our approach to integrating AI into various fields.

