36Kr Exclusive | Four Key Propositions for ByteDance AI in 2026
ByteDance AI set four ambitious goals for itself in 2026, and the most intriguing among them is the one that entered last, is catching up the fastest, and may hold the key to the future: the world model. When Wu Yonghong declared at the Seed all-hands meeting, "We must match Google Genie 3 by the end of the year," the atmosphere in the room likely carried not only ambition but also a hint of urgency—like being forced to catch up on missed lessons. Internal evaluations show a 10% performance gap
Analysis
ByteDance’s AI matrix was once praised for having "no obvious weaknesses": Seed 2.0, Seedance 2.0, Doubao’s 200 million daily active users... the report card was envy-worthy. But the gap in world models exposed the short-term utilitarianism of its technical approach. In 2024, when Zhou Chang took the lead, the internal judgment was to "wait for clearer scenarios and first focus on video models." This is a classic big-company mindset: do what’s trending, invest in what can scale quickly. It wasn’t until 2025 that a team was set up to explore VLA (Vision-Language-Action). In early 2026 the two paths were merged, and tens of millions of yuan were poured into data budgets. This isn’t strategic foresight; it’s a panicked "first move" made after seeing Google and OpenAI stake out positions in the embodied intelligence space. The data investment is reportedly 3-4 times that of other competitors, a return to the familiar "data flooding tactics." The problem is that the core of a world model lies in understanding the dynamic logic of the physical world, and that can’t be built simply by piling on more video data. This "brute force for miracles" path dependency reveals confusion at the foundational cognitive level.
What’s even more intriguing is the awkwardness of the Coding business. Despite investment "second only to the world model," it remains low-profile, and internal products refuse to use the company’s own Seed-Code. The reason is blunt: the model’s capabilities are lacking, so business units opt for external models such as DeepSeek or Claude. As a result, real feedback data doesn’t flow back, making the model even harder to improve: a perfect death spiral. It wasn’t until 2026, when application departments were mandated to use Seed models, that a basic closed loop was formed. Pushing a technical product through administrative orders feels quite ironic in a tech-driven company. Coding should have been the foundation of Agent capabilities, a whetstone for honing the model’s logical reasoning, but it has instead become a testing ground for internal political economy. When your "dogfooding" (internal product testing) has to be enforced, it’s a sign that your product isn’t strong enough to win voluntary adoption.
As for Seedance, which holds the SOTA (state-of-the-art) position, its success is summarized as "a victory of data"—a 2,000-person evaluation team, massive training datasets. This remains ByteDance’s signature scaling play. But the next frontier in video generation is "dynamic generation," which involves understanding motion physics and long-term consistency. This can no longer be solved by simply throwing more data at it. At the inflection point where generative AI moves from "perception" to "action," ByteDance still seems to trust the familiar scale effects over a disruptive restructuring of cognitive architecture.
Overall, ByteDance’s AI strategy in 2026 resembles a carefully calculated makeup exam: using the most abundant funds and the densest talent to quickly fill the gaps left by earlier strategic wavering. The "gamble" on world models acknowledges the uncertainty about future directions and chooses to cover the risk with resources. The mandated data feedback for Coding is an administrative correction after recognizing the failure of internal collaboration. The maintenance of video models continues past path dependencies. ByteDance certainly has the potential to achieve results through sheer money and manpower, but the real challenge lies ahead. As AI competition shifts from "who runs faster" to "who thinks deeper," can ByteDance, which is accustomed to winning through iteration speed and engineering scale, still find the narrow technological gateway that resources alone cannot force open? Chasing SOTA is important, but if you’re merely running after your opponent’s shadow, then even if you reach the same line, you may have long since strayed from the true endgame.
Disclaimer: The above content is generated by AI and is for reference only.