AMD’s MCM Design Cost Savings Allow Aggressive Pricing
As the underdog, AMD has to be more aggressive than Intel to seize market share. Intel, meanwhile, enjoys greater economies of scale and can charge more as the dominant brand. As a result, AMD has turned to innovative ways to undercut its giant competitor. For its new Summit Ridge designs, AMD turned to a multi-chip module (MCM) architecture, which turns out to yield significant cost savings.
For Ryzen, Threadripper, and EPYC, AMD is using a common 8 core Summit Ridge die, made up of two 4 core CCX units. To create Threadripper and EPYC CPUs, AMD stitches 2 or 4 of these dies together on an MCM package, connected by the high-speed Infinity Fabric interconnect. As a result, these larger CPUs are in a sense glued together, albeit designed that way from the start.
AMD MCM Cuts Cost by 41%
According to AMD, the move to an MCM design is critical to its pricing strategy. By using a common silicon design, AMD is able to reap greater economies of scale. Dies in the top 5% are used in Threadripper, with even better ones reserved for EPYC. Regular Ryzen gets the lower-binned dies, and partially defective ones are harvested for the low-end Ryzen chips. This means AMD can get better effective yields and make the most of its silicon.
Due to the MCM design, AMD’s 4 module 32 core CPU costs just 0.59X as much as a monolithic 32 core design. This 41% cost saving is being passed on to the consumer in the form of aggressive pricing. The remaining question is how much performance is lost with the MCM design. It will also be interesting to see how Intel reacts and evolves to meet this resurgent AMD.
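To see why several small dies can cost less than one large die, here is a rough sketch using the textbook Poisson yield model, in which the share of defect-free dies falls exponentially with die area. This is not AMD's own cost model. The function names, the defect density, the die area, the area overhead for die-to-die links, and the packaging yield are all illustrative assumptions, and the sketch ignores die harvesting and wafer edge effects.

```python
import math

def die_yield(area_mm2, defects_per_mm2):
    """Poisson yield model: fraction of dies that have zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

def cost_per_good_die(area_mm2, defects_per_mm2):
    """Silicon cost per working die, in arbitrary units per mm^2."""
    return area_mm2 / die_yield(area_mm2, defects_per_mm2)

def mcm_cost_ratio(mono_area_mm2, n_dies, defects_per_mm2,
                   area_overhead=0.10, package_yield=1.0):
    """Cost of an n-die MCM relative to one monolithic die.

    area_overhead: assumed extra silicon the MCM needs for die-to-die links.
    package_yield: assumed fraction of assembled packages that work.
    """
    small_area = mono_area_mm2 * (1 + area_overhead) / n_dies
    mcm = n_dies * cost_per_good_die(small_area, defects_per_mm2) / package_yield
    mono = cost_per_good_die(mono_area_mm2, defects_per_mm2)
    return mcm / mono

# Illustrative numbers only: a hypothetical 800 mm^2 monolithic die
# split into 4 dies, at an assumed 0.001 defects per mm^2.
ratio = mcm_cost_ratio(800, 4, 0.001)
print(f"MCM cost relative to monolithic: {ratio:.2f}x")
```

With no defects and no overhead the ratio is exactly 1, so in this model the whole saving comes from yield. Raising the defect density favors the MCM, while a lower packaging yield or a larger area overhead eats into the advantage.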
In EPYC, nothing is lost to Skylake SP from AMD’s MCM approach, because Infinity Fabric is that good. And since EPYC does not support more than 2 sockets, they were able to offer 8ch memory vs 6ch for Skylake SP. This higher memory BW makes up for any latency, which only exists at larger block sizes, where memory BW has even more of an effect on reducing overall transfer times.
EPYC is just plain better, lower power, and cheaper than Skylake SP.
BW and latency are two different concepts. The latency issue is not solved by providing moar BW.
AMD could not have known yields would be excellent. Had they not been, MCM would still be a great risk/cost reducing strategy.
With their inspired model mix, few flawed cores (or CCXs?) would be total discards. I hear 99% (of cores or CCXs?) are used.
A chip mask like Zen is a bet-the-company risk for the likes of AMD. It's terribly clever of them to have leveraged that one big bet to more than compete at every point, from entry level ~i3 and mobile up through HEDT, workstation, and both high and low end servers, with all products simply being multiples of the same stock, mass-produced CPU core.
Point being, sunk costs or overheads are as essential a viability factor as the marginal cost of each core, and MCM aced that cost factor too.
This is not anything “innovative”. MCM is an old technology that has been used in the industry for many years. IBM's POWER2 CPU, released in 1993, used MCM. The main chip on the Nintendo Wii U console, released in 2012, is another MCM package.
Moreover, AMD has used MCM designs before. For instance, the Bulldozer/Piledriver 16-core Opteron used two 8-core dies just to reduce costs.
The slide is a bit misleading. It mentions only the die cost reduction and leaves out the costs associated with packaging multiple dies, including packaging yield itself. But of course, even including the extra packaging cost, the whole system is cheaper than an alternative design with a giant monolithic die.
Why are most server chips monolithic? Because the MCM approach has both performance and power penalties. A monolithic die is better for both performance and power. That is the reason why engineers from IBM, Intel, Fujitsu, Sun/Oracle, MACOM, Cavium,… have preferred monolithic dies for their server chips.