Chips below 5nm are the fastest growing part of the semiconductor market, fueling today’s leading consumer devices and data centers. Not only are these devices getting smaller, but they’re also packed with greater functionality. Unfortunately, growing chip complexity is converging with shortages in technical talent, making it harder for companies to meet their aggressive market requirements.
The challenge is on for companies to drive greater efficiency and productivity into their organizations, and artificial intelligence (AI) has emerged as one way to do this. AI is already proving successful in helping to optimize designs to meet stringent power, performance, and area (PPA) targets. As global compute demand continues to outpace Moore’s law, companies need a way to reuse large-scale designs that are still viable by retargeting them to similar process nodes that have available capacity, while capturing the performance and power-efficiency gains a new node can offer.
Such retargeting projects are often executed as completely new projects, with timelines and engineering resources matching those of the original project. The time and effort required have been hurting time to market and cost and, therefore, the viability of such product and business maneuvers.
Today, AI can make this chip design retargeting effort a more streamlined, cost-efficient reality.
In 2020, we introduced Synopsys DSO.ai and since then, the technology has been used by the top semiconductor companies to maximize design productivity. The latest generation of DSO.ai includes new AI core engines to deliver 2x faster turnaround time (TAT) and up to a 20% improvement in quality of results (QoR).
Since DSO.ai was introduced, its AI engines have continually learned and applied the learnings from the initial design optimization to derivative designs. Rather than starting “cold,” the AI engine gets a “warm” start when it comes to figuring out the best optimization strategies to meet target specifications. The next generation of DSO.ai can take that learning to the next level and apply the “warm” start capability to derivative process nodes for design retargeting.
We can see this retargeting capability in action when we look at the following example of a RISC-V high-performance computing (HPC) core migration from 5nm to 4nm.
The 5nm RISC-V HPC core in this case study is a single “big core” with 500,000 instances targeted for data center applications. The target specifications for the original 5nm design included a performance of at least 1.95GHz with power not to exceed 30mW. The area of the core was specified at 426µm x 255µm. Applying the out-of-the-box RISC-V reference flow, the Synopsys Fusion Compiler RTL-to-GDSII implementation solution was able to meet the area and power requirements, but the performance fell short at only 1.75GHz, a gap of 0.2GHz (about 10%) against the target. Closing this performance gap manually is expected to require two expert engineers and one month of effort.
Figure 1: Baseline Optimization Results for RISC-V HPC 5nm Design
Let’s first understand how the design space optimization technology was applied from a “cold” start to reach the optimization targets. In this example, we allowed the solution to optimize a total of 25 permutons, including those from the RISC-V HPC toolbox as well as timing, legalizer, and power strategies. Given the permuton variations, the theoretical search space was 100 million combinations. Covering a search space of this size exhaustively would take 100 million Fusion Compiler jobs. However, by invoking a single DSO.ai AI-driven optimization host, we were able to reduce the number of Fusion Compiler jobs needed to just 30 running in parallel over 3 iterations (90 jobs in total). The solution completed the task within two days with zero human involvement. The target performance specification of 1.95GHz was met with better power (27.9mW) than the 30mW limit, all while staying within the designated area.
Figure 2: Optimization Results Using Synopsys DSO.ai on 5nm RISC-V HPC Core
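The job-count arithmetic behind the “cold” start can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the figures quoted above; the constant and function names are illustrative assumptions and are not part of DSO.ai or Fusion Compiler.

```python
# Figures quoted in the case study; all names here are illustrative only.
SEARCH_SPACE = 100_000_000  # theoretical number of permuton combinations
PARALLEL_JOBS = 30          # Fusion Compiler jobs per iteration
ITERATIONS = 3              # optimization iterations in the cold start


def total_jobs(parallel_jobs: int, iterations: int) -> int:
    """Total number of jobs launched across all iterations."""
    return parallel_jobs * iterations


def fraction_explored(jobs: int, search_space: int) -> float:
    """Share of the theoretical search space that was actually run."""
    return jobs / search_space


cold_jobs = total_jobs(PARALLEL_JOBS, ITERATIONS)
print(cold_jobs)                                             # 90
print(f"{fraction_explored(cold_jobs, SEARCH_SPACE):.1e}")   # 9.0e-07
```

In other words, the cold start reached the 1.95GHz target after running fewer than one in a million of the possible permuton combinations.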
Now, let’s see how the learnings from the 5nm “cold” start were applied to retargeting the design to 4nm in a “warm” start scenario. Moving from 5nm to 4nm included an area shrink of 10%, from 426µm x 255µm to 404µm x 242µm. The performance target increased from 1.95GHz to 2.1GHz while the power limit stayed at 30mW. The number of permuton variations remained unchanged, leading to the same 100 million search space. By utilizing the training database from the 5nm design, the compute configuration was reduced from 30 Fusion Compiler jobs running in parallel for 3 iterations (90 jobs) down to 15 Fusion Compiler jobs running a single iteration, a 6x reduction compared to the “cold” start. The solution was then able to complete the task in one day with zero human intervention. The result was a performance of 2.15GHz, exceeding the 2.1GHz target, with power of 29.4mW, below the 30mW limit, while staying within the reduced target area.
Figure 3: Retargeting Results Using Synopsys DSO.ai
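The retargeting numbers can be sanity-checked the same way. The sketch below recomputes the area shrink and the compute reduction and checks the reported results against each node's targets. The `Spec` class and `meets` helper are hypothetical names introduced for illustration; they do not correspond to any Synopsys API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Target specification for one process node (illustrative only)."""
    min_freq_ghz: float
    max_power_mw: float
    width_um: float
    height_um: float

    @property
    def area_um2(self) -> float:
        return self.width_um * self.height_um


# Targets quoted in the case study.
n5 = Spec(min_freq_ghz=1.95, max_power_mw=30.0, width_um=426, height_um=255)
n4 = Spec(min_freq_ghz=2.10, max_power_mw=30.0, width_um=404, height_um=242)


def meets(spec: Spec, freq_ghz: float, power_mw: float) -> bool:
    """True if a result satisfies the frequency and power targets."""
    return freq_ghz >= spec.min_freq_ghz and power_mw <= spec.max_power_mw


area_shrink = 1 - n4.area_um2 / n5.area_um2
cold_jobs = 30 * 3  # 30 parallel jobs, 3 iterations
warm_jobs = 15 * 1  # 15 parallel jobs, 1 iteration
job_reduction = cold_jobs / warm_jobs

print(f"area shrink: {area_shrink:.1%}")         # area shrink: 10.0%
print(f"job reduction: {job_reduction:.0f}x")    # job reduction: 6x
print(meets(n5, freq_ghz=1.95, power_mw=27.9))   # True (5nm cold-start result)
print(meets(n4, freq_ghz=2.15, power_mw=29.4))   # True (4nm warm-start result)
```

The check covers only frequency and power; area is treated as a fixed floorplan constraint, as in the case study.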
In the age of ever-shrinking market windows, increasing complexity, and talent shortages, the ability to use AI to retarget designs effectively and efficiently to smaller geometries allows teams to take advantage of proven designs and maximize productivity.