Step 3.7 Flash Tops Artificial Analysis's Output Speed Leaderboard for Mainstream Models
409 tokens/s, a new record marked in red on the speed leaderboard. Step 3.7 Flash from StepFun is like a race car that has hit maximum velocity on the digital highway, leaving all mainstream competitors behind. But is this glittering medal truly worth the entire industry's celebration?
Analysis
Speed is undoubtedly one of the sexiest metrics of our time. Users have long since lost patience for waiting, and a model that runs 10 times faster can mark the difference between "usable" and "truly useful." From a technical perspective, reaching this speed means mastering the whole suite of fundamentals, the "internal martial arts" of the field: architecture optimization, engineering deployment, operator fusion, and hardware adaptation. It is a genuine display of technical strength and a fine victory for engineering culture on the efficiency front. It also solves a real pain point: when models are deployed at scale for conversation, search, real-time translation, and similar scenarios, speed becomes the lifeline. Every 0.1-second reduction in latency removes one more chance of breaking the user experience.
However, behind the leaderboard's glow lies a glaring shadow. What is the cost of this unrivaled speed? The metric "intelligent efficiency" gets a mention, and it is precisely the core of the problem. We’ve seen this too many times: models that post sky-high scores on specific benchmarks often lose their luster once they meet complex real-world scenarios that require long-horizon reasoning or common-sense judgment. As the name suggests, Flash models prioritize lightning-fast responses. They may be highly specialized, trading versatility and depth of thinking for speed. For quick summaries, simple Q&A, or code completion, they are a godsend. But would you dare let one independently write an industry analysis report or handle a complex task requiring multi-step trade-offs? I have my doubts.
Another eye-catching point on the leaderboard is the "speed-to-price ratio." This signals a dangerous escalation: the price war has moved on from a "feature war" to a "speed war." When every player is dragged into an arms race to be "both fast and cheap," what happens? True innovation gets stifled. Developing a "smarter," more comprehensive model that understands nuance requires massive computational investment and lengthy training cycles; it is slow and expensive. Refining a model specialized for raw speed, by contrast, follows a clear path with immediate results, so capital and attention will flood toward the latter. Over time, we won’t get a more intelligent world but a digital fast-food world filled with "fast but shallow" models. They can give you an answer quickly, but the answer may not be worth even a short wait.
This brings me back to the original aspiration behind large language models. We seek intelligence closer to that of humans, partners capable of understanding, reasoning, and creating, not merely "faster typewriters." Speed is an important foundational attribute, but elevating it above intelligence itself puts the cart before the horse. A 1000-point model, even if slow, still delivers 1000 points of intelligence; a 100-point model, however blazingly fast, cannot solve truly difficult problems. Is the current evaluation system placing too much weight on speed, and potentially misleading the entire industry's R&D focus?
StepFun’s leaderboard sprint is a showcase of its technical prowess, and fair enough. But if the industry collectively celebrates and treats "speed first" as the supreme glory, that would be lamentably short-sighted. We must guard against the inertia of "speed for speed’s sake." Users need "good and fast," not "only fast." True competitiveness lies in raising speed while keeping the model sufficiently "smart," not in sacrificing smartness for a hollow reputation for speed.
So, applaud this speed record—but only once. Then, we should calmly ask: How "smart" is your model? On the "marathon" track that requires deep thinking, can you still keep running? Speed is a means; intelligence is the end. Don’t let the numbers on the leaderboard blur our pursuit of the true essence of intelligence.
Disclaimer: The above content is generated by AI and is for reference only.