I'm a bit baffled about chiplets stealing all the headlines in the tech news sit...

dragontamer · on April 5, 2023

Interposers.

The start of "modern" chiplets was AMD's R9 Nano GPU, which I believe was the first to use HBM / stacked memory. Each stack has 1024 microbumps, and the GPU has 4 stacks, meaning 4096 bumps/pins connect the GPU with its RAM. EDIT: The pitch between these microbumps is 40 microns / 0.040 millimeters.

This technology is "advanced packaging", the ability to provide thousands, or even tens-of-thousands, of bumps / effective pins to serve the signals going across our computers.

-------------

Yeah, chiplets or even socket-to-socket communications, have existed for decades. But today we can take advantage of thousands, tens-of-thousands, or even hundreds-of-thousands of external "pins" that connect these components together.

Here's a rundown on Intel's technology: https://www.anandtech.com/show/16823/intel-accelerated-offen...

> Intel is also stating today that it will be using its second generation Foveros technology on the platform, implementing a bump pitch of 36 micron, effectively doubling the connection density over the first generation.

36 microns, or 36 um (micrometers) pitch density on Intel's advanced packaging technology. That's a lot of pins per mm^2!
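To put a rough number on "a lot of pins per mm^2", here's a back-of-envelope sketch. It assumes the bumps sit on a simple square grid, which is my assumption and not something from the article (real layouts may be staggered and reserve bumps for power/ground). The function name is made up for illustration.

```python
def bumps_per_mm2(pitch_um: float) -> float:
    """Bumps per square millimeter, assuming a plain square grid (assumption)."""
    bumps_per_mm = 1000.0 / pitch_um  # how many bumps fit along 1 mm
    return bumps_per_mm ** 2


# Pitches mentioned in this thread: 40 um (HBM microbumps) and 36 um (Foveros gen 2).
for pitch_um in (40, 36):
    print(f"{pitch_um} um pitch: ~{bumps_per_mm2(pitch_um):.0f} bumps per mm^2")

# Minimum area for the 4096 HBM bumps (4 stacks x 1024) at 40 um pitch,
# again under the square-grid assumption.
hbm_bumps = 4 * 1024
hbm_area_mm2 = hbm_bumps / bumps_per_mm2(40)
print(f"{hbm_bumps} bumps at 40 um pitch need at least ~{hbm_area_mm2:.1f} mm^2")
```

Under that assumption, a few square millimeters is already enough for thousands of connections, which is the whole point of advanced packaging.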

Now I don't know how these guys are lining up 0.036 millimeter bumps and reliably making connections. I kind of imagine a very tiny soldering iron, but I'm probably wrong.

-------

In practice, AMD has led the way. Not only with "chiplets", but also off-die L3 cache (aka: x3d cache), adding 64MB of external SRAM to their chips through advanced packaging. So these thousands-of-microbumps are fast, reliable, and low-power enough to provide full-speed caches (something not quite possible with those earlier Pentiums you were talking about).

paulmd · on April 6, 2023

> Interposers.

Interposers are just one way of packaging MCM designs. AMD's RX 7900 series doesn't use a silicon interposer, it's still an (early) MCM design (although of course it's only pulling out memory controllers and cache, not multi-GCD). And technically since it's already a fiberglass package, there's no reason you can't stick things right onto a PCB instead, it's just a matter of convenience for partners placing/routing one thing instead of 7 things.

https://www.techpowerup.com/301071/amd-explains-the-economic...

Infinity Fanout integration is a different type of MCM integration with some of its own upsides (cost) and downsides (higher power, less bumpout). There are also bridges, and actual 3D stacks of multiple compute dies (or 2.5D with memory/etc) that don't use interposers at all.

https://semiengineering.com/using-silicon-bridges-in-package...

All of these are technically MCM - MCM is any sort of multi-chip package. MCM also really does include things like Core2Quad, Pentium-D, Crystalwell, Xenos, and various IBM modules in the 70s and 80s. The idea has been kicking around for a long time and "multi-chip module" refers to all of them collectively, not just interposer-based CPUs.

https://en.wikipedia.org/wiki/Multi-chip_module

And while yes, stacking and direct-bonding are great and lower power requirements a lot... there are still dies with stacking and no interposer! Ryzen x3d chips are a great example: v-cache is an MCM but there's no interposer there. And technically Ryzen itself (without v-cache) is also an MCM and still sees large benefits.

cma · on April 5, 2023

Pentium D was a multi chip module, and so are AMD's chiplets, but Pentium D didn't use chiplets:

> ICs that can perform most, if not all of the functions of a component of a computer, such as the CPU. Examples of this include implementations of IBM's POWER5 and Intel's Core 2 Quad. Multiple copies of the same IC are used to build the final product. In the case of POWER5, multiple POWER5 processors and their associated off-die L3 cache are used to build the final package. With the Core 2 Quad, effectively two Core 2 Duo dies were packaged together.

> ICs that perform only some of the functions, or "Intellectual Property Blocks" ("IP Blocks"), of a component in a computer. These are known as chiplets.[3][4] An example of this are the processing ICs and I/O IC of AMD's Zen 2-based processors.

https://www.wikipedia.org/wiki/Multi-chip_module

It seems chiplets are a distinct subset of MCMs, though a very similar one.

Also, AMD used different process nodes for different parts, which wouldn't make sense in a plain dual-CPU package. Certain circuitry is now starting to scale differently as process nodes shrink, so they may want an older process node for cache than for logic, etc. That could emphasize the different-functions aspect and explain why it is in the news more. 3D V-Cache uses an older node for that reason, though it needs denser connectivity than regular chiplet tech, so it uses interposers, I think.

You might also want to buy third-party IP that is only available for a different foundry's design rules. Chiplets let you integrate it, where a monolithic SoC approach wouldn't.

Shot noise is a bigger problem in EUV, so you can potentially get better yields by splitting functional components into different chiplets rather than the alternative of fusing off parts of a chip and selling it as a lower tier. Those are the reasons I see as to why they are getting lots of press and attention even though the packaging technology may be old.
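A rough sketch of the yield argument, using the textbook Poisson yield model (fraction of defect-free dies = exp(-area x defect density)). The defect density, die area, and chiplet count below are hypothetical numbers picked for illustration, not figures for any real process, and the model ignores packaging/assembly losses and the extra area chiplets spend on die-to-die links.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-area_mm2 * defects_per_mm2)


# All three values are hypothetical, chosen only to illustrate the effect.
defects_per_mm2 = 0.001
monolithic_area_mm2 = 600.0
n_chiplets = 4
chiplet_area_mm2 = monolithic_area_mm2 / n_chiplets

# One big die: a single defect anywhere costs the full 600 mm^2 (or forces a fuse-off).
y_mono = poisson_yield(monolithic_area_mm2, defects_per_mm2)

# Chiplets are tested individually, so a defect only costs one 150 mm^2 die.
y_chiplet = poisson_yield(chiplet_area_mm2, defects_per_mm2)

print(f"monolithic die yield:  {y_mono:.1%}")
print(f"per-chiplet yield:     {y_chiplet:.1%}")

# Without per-die testing, four untested chiplets are no better than one big die:
# the gain comes entirely from discarding bad chiplets before packaging.
print(f"4 untested chiplets all good: {y_chiplet ** n_chiplets:.1%}")
```

With these made-up numbers, roughly 55% of the monolithic dies are fully good versus about 86% of the chiplets, so far less silicon is thrown away or down-binned.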

Initial reasons for mixing nodes may have been more business related than technical: AMD had to buy a certain amount of output from GlobalFoundries after spinning it off.

wtallis · on April 6, 2023

I think another important distinction between today's chiplet CPUs and the Pentium D/Core 2 Quad is that those early parts were still using a traditional shared front side bus connecting CPU cores to the memory controller/northbridge residing on the motherboard. So those MCMs were functionally equivalent to a dual-socket system even from a performance perspective, but harder to cool.

AMD's chiplet-based CPUs gain some real benefits from using faster or lower-power short-range links that would not work between separate CPU sockets, and more advanced packaging using interposers or bridges further reduces the power and performance costs of communication between chiplets. These benefits mattered even for AMD's early chiplet-based CPUs that were homogeneous rather than using a mix of specialized dies.

paulmd · on April 6, 2023

This is about as meaningful a distinction as the marketing copy that defined the GeForce as "the world's first graphics processing unit (GPU)" by creating a definition that matched their exact specs while definitionally excluding its competitors. You have to do 10 million polys/sec on a single monolithic chip or it's not a real GPU guys!!!! Voodoo doesn't count because it's not monolithic!

> ICs that perform only some of the functions, or "Intellectual Property Blocks" ("IP Blocks"), of a component in a computer.

Well, a Pentium D or Core2 Quad didn't have a northbridge onboard, so each chiplet only performed "some of the functions" of a CPU.

Which functions are the important ones that count? Well obviously the ones that AMD did, and not the ones that Intel did, of course. I mean you can't really have a CPU without memory controller so... kind of an important one. One might describe that northbridge as... an IO die. Just not one that lives on the package, because that's not how it was done at the time (monolithic CPUs had external northbridges too).

And obviously that's changed over time, components of the CPU itself go through the same internalize-and-integrate/externalize-and-disambiguate lifecycle as has been well-remarked previously in other aspects of computer design. On-package northbridges aren't something unique to MCM either.

The parent comment that ascribes it to viral marketing and clever rebranding is correct. Everything that is old is new again - the IO die is just a northbridge-as-a-chiplet and the CCDs are pretty similar to pentium-d or core2quad core chiplets. Just branded.

There is very much a lesson to be learned here as far as technical marketing - how would customers know how awesome your thing is if you don't give it a special name to tell them? It's a Graphics Processing Unit, of course it's better than the competition's Boring Old Junk, it's got way more quadroflops and kilopixels! It's not a L3 cache, dad, it's AMD GameCache, or Radeon Infinity Cache, it's totally different! It's not memory paging/swapping, it's HBCC! It's not PCIe Resizable BAR, it's Smart Access Memory!

If you don't give it a brand name then people won't know how awesome it is and how lame your competition is for not having your exact implementation of the idea. Or even if they do, hey, yours is the one with the brandname. You can't be an ultrabook, that's our trademark, you're just some underpowered thin-n-light laptop.

https://www.vortez.net/news_story/amd_gamecache_canny_market...

https://www.amd.com/system/files/documents/infinity-cache-te...

https://itigic.com/what-is-amd-hbcc-features-and-how-it-work...

https://www.gpumag.com/smart-access-memory/

It's not that AMD didn't make any improvements - technology marches onwards and they built a good system. It's just not really a difference in kind in the sense you can draw some particular brightline and say "well this is MCM and this isn't"... these ideas have been kicking around for a long time and it's not AMD who invented them, even if they improved them.

Attempts to do so fall into the same trap as "it's only a GPU if it's a GeForce 256 descendant", because you end up with a definition specifically drawn to include the things you like and exclude the things you don't, rather than technically coherent distinctions. It's still the same general idea even if Intel's implementations weren't commercially successful (although I think Core2Quad was pretty successful overall).

--

And while I'm picking on NVIDIA with the "GPU" marketing thing... it's also not like NVIDIA didn't improve the state of the art too! Having 2D and 3D in the same chip was way more convenient overall, the "GPU" was quite revolutionary. But that's also pretty much the baseline expectation, a generationally newer product should be significantly better and will likely be conceived differently to match the nodes and the tech of the time. The reason AMD went super heavy on caches on all their 7nm products (CPU and GPU) was to take advantage of TSMC's SRAM density... and they were well-placed with a good architecture to do that too! But a lot of these things are just "products of their own time" in some ways, not having L3 is common on pre-TSMC N7 GPUs not because nobody had thought of L3 cache before, but rather because SRAM isn't a very efficient use of die area on older nodes and it wasn't a good engineering tradeoff. And then it was.

Like other kinds of alt-history hypotheticals, people tend to underweight the "overall forces of the times" that would have tended to push things in the same direction even if some specific decision had been made differently or whatever. Someone else would have thought of "wow let's use this high-density SRAM that N7 gives us and throw a big cache on our product".

natpalmer1776 · on April 5, 2023

Marketing & viral awareness go brrrrrr

Edit: since this was pretty low effort and likely violates the community guidelines, I'll add this:

I would guess the reason for seeing more content about chiplets and their implications is a combination of a big player adopting them for its main product line(s) and the resulting PR / marketing buzz having 'trickle down' effects on general industry discourse.

tdba · on April 5, 2023

Also the USG is making a big push to reshore chip manufacturing, especially in more future-facing areas such as chiplets.

natpalmer1776 · on April 5, 2023

Took me longer than I care to admit to realize that USG stood for United States Government.

Regarding the statement, I also would point out that there is an almost global push towards promoting domestic chip manufacturing and reducing the reliance on globalization of critical infrastructure, not just in the United States.

In my geopolitical armchair expert opinion, I would guess this is in part caused by the conflict in Ukraine as well as rising tensions between the various 'global powers' further compounding the general loss of confidence that followed COVID-19.

tdba · on April 5, 2023

This push began before Covid and well before the war in Ukraine went hot - but those two factors certainly increased the urgency. At root the push in the US began due to the rise of China as a military competitor in the early 2010s, and the consequent realization that TSMC might be blockaded, captured or destroyed.