China's self-driving industry is debating how to design end-to-end AI systems, as regulation and limited compute push developers toward approaches that differ from Tesla's.
Ni Tao is IE’s columnist, giving exclusive insight into China’s technology and engineering ecosystem. His monthly Inside China column explores the issues that shape discussions and understanding about Chinese innovation, providing fresh perspectives not found elsewhere.
Across the Pacific, this episode inspired awe and admiration, setting China’s intelligent driving industry abuzz with a familiar debate, this time with a new twist.

The debate has evolved. The question is no longer whether end-to-end is the future, but how to build it: should developers adopt a unified one-stage architecture, or retain a two-stage system separating perception from planning? This is not just a technical split. In China, it is shaped by regulations, limited computing power, intense price competition, and a fast-moving supply chain. As safety explainability becomes a regulatory requirement, the discussion is shifting from ideology to trade-offs that determine whether systems can survive in the real world.

For years, autonomous driving in China relied on a modular pipeline consisting of four layers: perception, prediction, planning and control. The structure made development manageable but introduced a fundamental weakness: errors accumulate between layers, and extreme “corner cases” multiply.

End-to-end learning promises to collapse this complexity. Yet the term is not monolithic; it comprises two distinct approaches. Engineers often describe their difference through metaphor.

A two-stage end-to-end system resembles a relay race. One neural network handles perception and produces structured representations such as lane geometry or obstacle positions. They are then passed to a second model responsible for planning and control. This division improves development efficiency and makes failures easier to diagnose. But translating raw perception into intermediate representations inevitably leads to information loss, limiting the system’s performance ceiling. As a result, a vehicle may correctly detect objects yet still misjudge interactions or timing once abstractions pass between modules. A one-stage model, by contrast, resembles a marathon runner maintaining control from start to finish.
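The relay-race and marathon metaphors can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than any company's design: the `Obstacle` and `Action` schemas, the 10-metre and 2-second thresholds, and the hand-written `toy_policy` that stands in for a trained network. The point is only to show how a hand-designed intermediate representation can discard a signal the planner would have needed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Obstacle:
    """Structured output of the perception stage (hypothetical schema)."""
    distance_m: float   # longitudinal gap to the obstacle


@dataclass
class Action:
    """Driving command produced by the planner (hypothetical schema)."""
    brake: float        # 0.0 = no braking, 1.0 = full braking


# Raw sensor frame: a list of (distance_m, closing_speed_mps) pairs,
# standing in for camera or lidar data.
Frame = Sequence[tuple]


def perceive(frame: Frame) -> List[Obstacle]:
    """Stage 1: compress the raw frame into a hand-designed representation.
    Closing speed is dropped because the schema has no field for it."""
    return [Obstacle(distance_m=d) for d, _closing in frame]


def plan(obstacles: List[Obstacle]) -> Action:
    """Stage 2: plan using only what stage 1 passed along."""
    nearest = min((o.distance_m for o in obstacles), default=float("inf"))
    return Action(brake=1.0 if nearest < 10.0 else 0.0)


def two_stage(frame: Frame) -> Action:
    """Relay race: perception hands a baton to planning."""
    return plan(perceive(frame))


def one_stage(frame: Frame, policy: Callable[[Frame], Action]) -> Action:
    """Marathon: a single policy maps the raw frame straight to an action."""
    return policy(frame)


def toy_policy(frame: Frame) -> Action:
    """Stand-in for a trained network; it can use every raw signal,
    here braking when time-to-collision falls below 2 seconds."""
    ttc = min((d / c for d, c in frame if c > 0), default=float("inf"))
    return Action(brake=1.0 if ttc < 2.0 else 0.0)


frame = [(15.0, 10.0)]  # 15 m away, closing at 10 m/s: 1.5 s to collision
print(two_stage(frame).brake)              # 0.0, planner never saw closing speed
print(one_stage(frame, toy_policy).brake)  # 1.0, policy reacts to closing speed
```

A richer `Obstacle` schema would fix this particular case, which is exactly the two-stage trade-off: each intermediate field is easy to inspect when something goes wrong, but whatever the schema leaves out is invisible to the planner.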
A single neural network converts raw sensor input directly into driving actions without human-designed intermediate steps. Because optimization occurs across the entire AI pipeline rather than module by module, this linear approach promises a higher performance ceiling. Additional benefits include smoother, more intuitive and human-like responses in complex traffic conditions. Its weakness lies at the opposite extreme. Unified models can behave like black boxes; when errors occur, tracing their causes becomes difficult, if not impossible. This concern resonates strongly with “safety-first” Chinese transportation regulators, pushing engineers to design systems that ensure accountability in the event of accidents.

Faced with this dilemma, Chinese autonomous driving tech providers have responded with distinctly localized engineering solutions.

QCraft represents one direction, pursuing a one-stage architecture reinforced through what it calls “safety alignment.” Rather than relying purely on data-driven learning, the company embeds human-defined safety constraints directly into model training. These constraints serve as reward functions for model fine-tuning. The aim is to preserve the flexibility and performance ceiling of unified models while keeping behavior verifiable. In practice, QCraft’s approach addresses a longstanding criticism of one-stage systems: that they trade explainability for performance.

The efficiency gains are also notable. QCraft’s system enables driverless urban navigation on a single Horizon Journey 6M chip delivering roughly 128 TOPS. This reflects a design philosophy that prioritizes efficiency over brute-force computational scaling. By contrast, some domestic competitors continue to rely on dual Nvidia Orin X chips with a combined 508 TOPS as their default configuration, which is far less cost-efficient.

Zhuoyu Technology, a DJI spin-off betting on two-stage systems, represents a different path.
Instead of jumping on the one-stage bandwagon, Zhuoyu leverages so-called interaction modeling to predict the intentions of vehicles and pedestrians, compensating for traditional two-stage systems’ limitations in capturing dynamic interactions. Rather than treating traffic as a collection of independent mobile objects, the system seeks to infer behavioral intent. It anticipates whether a pedestrian may cross the street or a nearby vehicle intends to merge. Reinforcement learning in large-scale simulations enables defensive driving behavior through repeated virtual testing. The resulting platform delivers urban navigation on the Texas Instruments TDA4 platform, using just 8 TOPS of compute per chip. Zhuoyu’s framework pushes advanced driving capabilities into lower-cost vehicles and demonstrates how algorithmic design can make up for modest hardware.

Together, these strategies show how the end-to-end debate is evolving within China’s industrial context. The divide between the two camps reflects differing engineering priorities rather than a fundamental philosophical split. Within China’s autonomous driving ecosystem, some companies, including Horizon Robotics, QCraft, and DeepRoute.ai, favor one-stage architectures, while Huawei ADS, Baidu Apollo, and Zhuoyu continue refining two-stage frameworks. Others, such as Momenta, are gradually transitioning toward one-stage methods, underscoring just how fluid the boundaries have become.

China’s divergence in end-to-end architectures has been shaped by three constraints.

The first is regulatory pressure. Draft national standards released in 2025 require intelligent driving systems to demonstrate traceable decision-making before implementation in 2027, turning safety explainability from a vague concept into a hard metric.
Systems that can account for their decisions gain structural advantages, forcing developers to consider regulatory clearance from the earliest design stages, instead of focusing entirely on building robust unified AI models.

The second constraint is the grim computing reality. Unlike Tesla, which has the wherewithal to continuously scale its chip and compute resources, Chinese rivals face a shortage of vehicular edge compute. Next-generation chips like Nvidia’s Jetson Thor promise higher performance but also surging cost. Developers thus cannot simply “buy compute” to bolster intelligence; a viable alternative is to optimize architectures and algorithms to extract maximum capability from limited hardware.

The third constraint is market pressure. As advanced driver assistance rapidly enters mass-market vehicles, intense competition compresses margins. Companies must cut costs and improve capabilities simultaneously, leaving little room for inefficient architectural experimentation. Additionally, when production volumes are insufficient to spread costs, invisible safety redundancies are often the first compromise. This makes ingeniously built but costly AI frameworks a hard sell.

Amid brutal market dynamics, architectural choices often end up as compromises with implications rippling across the value chain, from technical roadmaps to supplier relationships and commercialization paths. Many Tesla aspirants have argued fiercely over which end-to-end system is superior, only to discover that success in a cutthroat market depends less on choosing between one-stage and two-stage systems than on who can convert theoretical trade-offs into cleverly engineered solutions to real-world problems.

The trajectories chosen by Tesla and Chinese players reveal contrasting philosophies for advancing autonomous driving.
Tesla pursues what many engineers describe as “brute force,” combining massive real-world data, large neural networks and vertically integrated hardware, from vehicle chips to training infrastructure.

Few companies can afford to replicate this model. Its effectiveness depends on a feedback loop linking millions of vehicles, proprietary chips and large-scale supercomputing clusters.

Chinese developers, constrained by regulation, hardware supply and cost pressures, have no choice but to emphasize engineering improvement. Vehicular intelligence arises primarily from architectural efficiency and disciplined trade-offs. Again, rather than scaling models endlessly, a more realistic aim would be to extract maximum capability from each watt of compute and each line of code.

Even as the industry debates one-stage versus two-stage systems, the technological frontier is shifting again with the emergence of vision-language-action (VLA) models. Rather than replacing end-to-end driving, VLA extends it by integrating language-model reasoning into perception and control. Vehicles are no longer limited to seeing and acting; they begin interpreting context and articulating decision logic, potentially explaining why a maneuver was chosen.

This evolution may blur the architectural divide. Reasoning chains generated by VLA systems could enhance explainability while meeting regulatory demands. At the same time, VLA models with tens of billions of parameters amplify the challenge for limited vehicle-side compute, turning the debate toward how to deploy massive intelligence efficiently on constrained hardware. With companies from Xpeng to QCraft exploring VLA integration in their end-to-end platforms, the conversation is poised to move beyond the one-stage versus two-stage dichotomy.
Instead, the focus will be on deployment efficiency, safety assurance and system transparency.

From the vantage point of industry development, the debate around end-to-end architectures may matter less than the discipline it has imposed on the industry. China’s autonomous driving sector faces a delicate balancing act: pursuing higher technological ceilings without breaking safety protocol. And as concepts such as VLA and world models accelerate innovation cycles, developers have to guard against the risk of technological exuberance outpacing reliability. With China’s intelligent driving standards set to take effect in 2027, heralding the arrival of L3 autonomy, the stakes are rising over which architectural doctrine will prevail.

Ni Tao worked with state-owned Chinese media for over a decade before he decided to quit and venture down the rabbit hole of mass communication and part-time teaching. Toward the end of his stint as a journalist, he developed a keen interest in China's booming tech ecosystem. Since then, he has been an avid follower of news from sectors like robotics, AI, autonomous driving, intelligent hardware, and eVTOL. When he's not writing, you can expect him to be on his beloved Yanagisawa saxophones, trying to play some jazz riffs, often in vain and occasionally against the protests of an angry neighbor.
