Speeding AI With Co-Processors

📆 5/7/2025 2:52 PM

Accelerator, Co-Processor, Cadence

📰 ForbesTech

Instead of using only part of a DSP, ASIC designers can license an AI co-processor to accelerate the non-tensor portion of the AI workload.

Most chips today are built from a combination of customized logic blocks that deliver some special sauce and off-the-shelf blocks for commonplace functions such as I/O and memory controllers. But one needed function has been missing: an AI co-processor.

In AI, the special sauce has been the circuits that do the heavy lifting of parallel matrix operations. However, other types of operations used in AI do not lend themselves well to such matrix and tensor operators and silicon. These scalar and vector operators, which compute activations and averages, are typically run on a CPU or on a digital signal processor (DSP) that speeds up vector operations. Designers of custom AI chips often pair a neural processing unit (NPU) with a DSP block from companies like Cadence or Synopsys to accelerate scalar and vector calculations. However, these DSPs also include many features that are irrelevant to AI. Consequently, designers are spending money and power on unneeded features.

Large companies that design custom chips address this by building their own AI co-processors. Nvidia Jetson Orin uses a vector engine called PVA, Intel Gaudi uses its own vector processor within its TPCs, and Qualcomm Snapdragon has its vector engine within the Hexagon accelerator, as does the Google TPU.

But what if you are an automotive, TV, or edge infrastructure company designing your own AI ASIC for a specific application? Until now, you had to either design your own co-processor or license a DSP block and use only part of it for your AI needs.

Cadence Design has now introduced an AI co-processor, called the Tensilica NeuroEdge, which can deliver roughly the same performance as a DSP but consumes 30% less die area on an SoC. Since NeuroEdge was derived from the Cadence Vision DSP platform, it is fully supported by an existing robust software stack and development environment.

The new co-processor can be used with any NPU, is scalable, and helps circuit design teams get to market faster with a fully tested and configurable block.
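
To make the division of labor concrete, here is a minimal Python sketch of how a toolchain might split a model's layers between an NPU and an AI co-processor. It is only an illustration: the operator names, the `assign_engine` and `partition` helpers, and the sample layer list are all hypothetical and are not taken from Cadence's software or from the article.

```python
# Toy model of splitting an AI workload between engines.
# All operator names and helper functions here are hypothetical.

TENSOR_OPS = {"conv2d", "matmul"}                         # MAC-heavy matrix/tensor work
VECTOR_OPS = {"relu", "sigmoid", "softmax", "avg_pool"}   # activations and averages


def assign_engine(op: str) -> str:
    """Return the engine a layer would run on in this toy model."""
    if op in TENSOR_OPS:
        return "npu"
    if op in VECTOR_OPS:
        return "co-processor"
    return "cpu"  # fallback for anything this sketch does not classify


def partition(layers: list[str]) -> dict[str, list[str]]:
    """Group layers by target engine, preserving their original order."""
    plan: dict[str, list[str]] = {"npu": [], "co-processor": [], "cpu": []}
    for op in layers:
        plan[assign_engine(op)].append(op)
    return plan


# A made-up layer sequence for a small vision network.
model = ["conv2d", "relu", "conv2d", "relu", "avg_pool", "matmul", "softmax"]
plan = partition(model)

for engine, ops in plan.items():
    print(f"{engine}: {ops}")
```

In this made-up network, more than half of the layers are non-tensor operations, which is the portion of the workload a dedicated co-processor is meant to absorb.
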
Designers will combine CPUs from Arm or RISC-V, NPUs from EDA firms like Synopsys and Cadence, and now the “AICP” from Cadence, all off-the-shelf designs and chiplets.

The AICP was born from the Vision DSP and is configurable to meet a wide range of compute needs. The NeuroEdge supports up to 512 8x8 MACs with FP16, FP32, and BF16 support. It connects with the rest of the SoC using AXI or Cadence’s HBDO. Cadence has high hopes for NeuroEdge in the automotive market, and the design is ready for ISO 26262 FuSa certification. NeuroEdge fully supports the NeuroWeave AI compiler toolchain for fast development with a TVM-based front end.

With the rapid proliferation of AI processing in physical AI applications such as autonomous vehicles, robotics, drones, industrial automation and healthcare, NPUs are assuming a more critical role. Today, NPUs handle the bulk of the computationally intensive AI/ML workloads, but a large number of non-MAC layers include pre- and post-processing tasks that are better offloaded. Current CPU, GPU and DSP solutions require tradeoffs, and the industry needs a low-power, high-performance solution that is optimized for co-processing and allows future-proofing for rapidly evolving AI processing needs. Cadence is the first to take that step.

Disclosures: This article expresses the opinions of the author and is not to be taken as advice to purchase from or invest in the companies mentioned. My firm, Cambrian-AI Research, is fortunate to have many semiconductor firms as our clients, including Baya Systems, BrainChip, Cadence, Cerebras Systems, D-Matrix, Esperanto, Flex, Groq, IBM, Intel, Micron, NVIDIA, Qualcomm, Graphcore, SiMa.ai, Synopsys, Tenstorrent, Ventana Microsystems, and scores of investors. I have no investment positions in any of the companies mentioned in this article. For more information, please visit our website at
