Microsoft’s Project Scorpio: More Hardware Details Revealed


This news piece contains speculation, and suggests silicon implementation based on released products and roadmaps. The only elements confirmed for Project Scorpio are the eight x86 cores, >6 TFLOPs, 320 GB/s, it's built by AMD, and it is coming in 2017. If anyone wants to officially correct any speculation, please get in touch.

One of the critical points of contention with consoles, especially when viewed through the lens of the PC enthusiast, is the hardware specifications. Consoles have long development processes, and are thus already behind the curve at launch. Because the life-cycle of a console is anywhere from five to seven years, the gap to high-end components then widens rapidly. The trade-off is usually that the console is an optimized platform, particularly for software: performance is consistent and it is much easier to optimize for.

For six months or so now, Microsoft has been teasing its next generation console. Aside from launching the Xbox One S as a minor mid-season revision to the Xbox One, the next-generation ‘Project Scorpio’ aims to be the most powerful console available. While this is a commendable aspiration (one that would look odd if it wasn’t achieved), the meat and potatoes of the hardware discussion has still been relatively unknown. Well, some of the details have come to the surface through a PR reveal with Eurogamer’s Digital Foundry.

We know the aim with Project Scorpio is to support 4K playback (4K UHD Blu-Ray), as well as a substantial amount of 4K gaming. With ‘VR’ capable hardware in the PC space recently coming down in price, Microsoft is able to carefully navigate what hardware it can source. It is expected that this generation will still rely on AMD’s semi-custom business, given that high-end consoles are now on x86 technologies and Intel’s custom foundry business is still in the process of being enabled (Intel’s custom foundry is also expected to be expensive). Of course, pairing an AMD CPU and AMD GPU would be the sensible choice here, with AMD launching a new GPU architecture last year in Polaris.

Here’s a table of what was revealed:

Microsoft Console Specification Comparison

| | Xbox 360 | Xbox One | Project Scorpio |
|---|---|---|---|
| CPU Cores/Threads | - | - | 8 / ? |
| CPU Frequency | 3.2 GHz | 1.6 GHz (est) | 2.3 GHz |
| CPU µArch | IBM PowerPC | AMD Jaguar | AMD x86 ? |
| Shared L2 Cache | - | 2 x 2MB | GPU L2 is 4x |
| GPU Cores | - | 16 CUs, 768 SPs, 853 MHz | 40 CUs, 1920 SPs ?, 1172 MHz |
| Peak Shader Throughput | 0.24 TFLOPS | 1.23 TFLOPS | >6 TFLOPs |
| Embedded Memory | - | - | - |
| Embedded Memory Bandwidth | - | 102-204 GB/s | - |
| System Memory | 512MB GDDR3-1400 | 8GB DDR3-2133 | 12GB GDDR5-1700 |
| System Memory Bus | - | - | - |
| System Memory Bandwidth | 22.4 GB/s | 68.3 GB/s | - |
| Manufacturing Process | - | - | 16nm TSMC |

Specifications in italics were added after the table was created.

At the high level, we have eight ‘custom’ x86 cores running at 2.3 GHz for the CPU and 40 compute units at 1172 MHz for the GPU. The GPU will be paired with 12GB of GDDR5, to give 326GB/s of bandwidth. Storage is via a 1TB HDD, and the optical drive supports 4K UHD Blu-Ray.

Let’s break this down with some explanation and predictions.

Eight Custom CPU Cores

The Xbox One uses AMD’s Jaguar cores – these are low powered and simpler cores, aimed at a low-performance profile and optimized for cost and power. In non-custom designs, we saw these CPUs hit above 2 GHz, but these were limited to 1.75 GHz in the Xbox One. While not completely impossible, it would be unlikely that Jaguar cores (that were made on a 28nm process) would also be in the Scorpio.

The other cores AMD has available are Excavator based (28nm) or Zen based (14nm). The latter is a design that has returned AMD to the high-end of x86 performance computing, offering high performance for reasonable power, but a 14nm design would be relatively expensive. Eight cores would fit in with a standard Zeppelin silicon design, which AMD has been manufacturing hand-over-fist since the launch of desktop-based Zen CPUs for PCs in March. One of the arguments against Zen inside Scorpio is the fact that it was only launched recently, and arguably the desktop PC market is more financially lucrative for AMD.

Technically Microsoft could go for Zen in the Scorpio, but I suspect this would increase the base cost of the console. However, if Microsoft were going for a premium console ($700+), this might make sense.

A note on Zen power and frequency – 2.3 GHz is a low frequency for a Zen CPU based on what we have seen in desktop PCs. Some work done internally on the power consumption of Zen CPUs has shown that the design requires a lot of power to move between 3.5 GHz and 4.0 GHz, perhaps suggesting that 2.3 GHz is so far down the DVFS curve that the power consumption is relatively low. Also, we’re under the impression that getting a super high frequency on Zen is a tough restriction when it comes to binning chips – offering a low-frequency bin would mean that all the silicon that doesn’t make it to desktop retail due to an inability to go up the DVFS curve could end up in devices like the Scorpio. The spec list doesn’t have a turbo frequency, which remains an unknown (if present).

That being said, this is a ‘custom’ x86 core. Microsoft could have requested specific IP blocks and features not present in desktop CPUs, or different methods of branch prediction enabled etc. This would either require a new silicon design of the Zeppelin silicon, or it’s already in there, ready for Microsoft. Typically a console shares DRAM between the CPU and GPU, so it might be something as simple as the CPU memory controller supporting GDDR5. So either we’re seeing Zen coming to consoles, or we’re seeing another crack at using Jaguar on 28nm (it’s unlikely to get a 14nm spin), to keep overall costs down – and given that the main focus on a console is the GPU, that’s entirely possible.

40 Customized Compute Units

AMD launched Polaris 10 last year – their latest compute architecture on a 14nm process giving substantial power efficiency gains over previous 28nm designs. The first consumer GPUs were aimed at the $200-$230 market and below, which is something that would be of interest to console manufacturers. However, AMD is set to launch Vega this year, on a new architecture (also on 14nm) with additional performance per watt gains, but for high-end GPUs.

Setting aside AMD’s Fiji GPUs, which use silicon interposers and high-bandwidth memory, AMD’s latest design is the RX 480. The RX 480 is a 36 compute unit design, using 4GB or 8GB of 256-bit GDDR5 memory, giving 256GB/s of total memory bandwidth. According to the information given to Digital Foundry, Scorpio will have 40 compute units, 12 GB of GDDR5, and will be good for 326 GB/s of memory bandwidth. Technically the RX 480 is a fully enabled design, and only offers 36 compute units in total, suggesting that Scorpio is either using a new silicon spin version of this design (with a lop-sided memory configuration), or is moving on to a Vega based design. The fact that the spec list has 1172 MHz on it, and Vega is supposed to offer higher clocks, means that we’re back to a cost issue: Vega is expected to cost a pretty penny, whereas consoles are often low-cost designs. This is most likely a Polaris implementation, especially as we already know that Scorpio will be > 6 TFLOPs, and the RX 480 is ~5 TFLOPs.
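As a rough check on those throughput numbers, peak shader throughput can be estimated from the compute unit count and clock. The sketch below assumes 64 stream processors per compute unit (as in AMD's GCN designs) and two FLOPs per stream processor per clock; neither is stated in the reveal. The RX 480 clocks used (1120 MHz base, 1266 MHz boost) are the reference card's figures and are likewise an assumption here.

```python
# Theoretical peak FP32 throughput: CUs x SPs per CU x 2 FLOPs (one FMA) x clock.
SP_PER_CU = 64               # assumption: GCN-style compute unit, not from the reveal
FLOPS_PER_SP_PER_CLOCK = 2   # assumption: one fused multiply-add counted as two FLOPs


def peak_tflops(compute_units: int, clock_mhz: float) -> float:
    """Return theoretical peak single-precision throughput in TFLOPs."""
    flops = compute_units * SP_PER_CU * FLOPS_PER_SP_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


scorpio = peak_tflops(40, 1172)       # CU count and clock from the reveal
rx480_base = peak_tflops(36, 1120)    # assumption: RX 480 reference base clock
rx480_boost = peak_tflops(36, 1266)   # assumption: RX 480 reference boost clock

print(f"Scorpio:         {scorpio:.2f} TFLOPs")
print(f"RX 480 (base):   {rx480_base:.2f} TFLOPs")
print(f"RX 480 (boost):  {rx480_boost:.2f} TFLOPs")
```

Under these assumptions 40 CUs at 1172 MHz lands just over the 6 TFLOPs line, which suggests the clock was chosen to hit that headline number.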

Ideally I want to get Ryan’s thoughts on this, and will do so when he signs in for the day, but his analysis on some of the specifications back in June 2016 still stands:

The memory bandwidth of Project Scorpio, 320 GB/s, is also relatively interesting given the current rates of the RX 480 topping out at 256 GB/s. The 320 GB/s number seems round enough to be a GPU only figure, but given previous embedded memory designs is likely to include some form of embedded memory. How much is impossible to say at this point.

Additional: On 4K support, the latest AMD media block supports 4K60 with HEVC, as well as HDMI 2.0. Microsoft has mandated to all developers that, when rendering 4K content to a 1080p screen, Ultra-HD rendering should super-sample down to 1080p.

What We Don’t Know

The Xbox One used a combined CPU/GPU in a single piece of silicon – adding up the Zen silicon area + a Polaris 10 die comes up at almost 450mm2, which would be a large piece of silicon from Global Foundries (as well as being expensive with low yields), so we are probably looking at a split silicon design. This might mean that the memory is split between the CPU/GPU (perhaps 4GB for CPU, 8GB for GPU?), or some low-level software is managing DRAM distribution between the two to take advantage of HSA features such as zero-copy.

The original Xbox One used 8GB of DDR3 memory shared between the CPU and GPU, as well as a 32MB ESRAM mini-cache to help boost memory bandwidth. There’s no indication that Project Scorpio uses a similar caching method, though it may yet do so. The memory bandwidth value might be a combination of what’s available to the main memory and cache, or might just be related to the GPU. We don’t know at this point.

If the whole core silicon is using AMD's latest, then we’d expect it to be made at Global Foundries on a 14nm process. This leads to questions about yields and cost – we’re assuming that Microsoft is going for a high-end design, which is likely to attract a high-end price. Going back over the console generations and adjusting for inflation to today’s prices, some consoles in the last couple of decades have drifted into a $600+ equivalent territory. It might be likely that Microsoft is looking at that, if they’re going with the latest technology. The alternative is using older technologies (such as 28nm Jaguar cores for the CPU and 14nm GPU) to keep costs down.

Hardware aside, the launch titles will be an interesting story in itself, especially with recent closures of dedicated MS studios such as Lionhead.

Project Scorpio is due out in Fall / Q3 2017.

Additional 4/6 - 16nm TSMC

I missed this when I originally read the piece: Project Scorpio's central piece of silicon will be built on 16nm TSMC. Time to process this one.

Source: Digital Foundry

Jaguar was made at 28nm TSMC, and would require a redesign for 16nm. It would result in much lower power, and also much lower die area. Compared to the GPU, an 8-core Jaguar design might be 10-15% of the entire silicon.

However, AMD recently incurred additional quarterly costs for using foundries other than Global Foundries (as per their renegotiated wafer agreement), which a number of analysts chalked up to future server designs being made elsewhere. A few of us postulated it's more to do with AMD's semi-custom business, and either way it points to Zen being redesigned for 16nm TSMC. This makes it an interesting question all around. [update, see below]

Similarly for the GPU: Polaris and Vega are promoted as being built on a 14nm process, but could be redesigned for 16nm. The Eurogamer article quotes Andrew Goossen, Technical Fellow for Graphics at Microsoft:

"Those are the big ticket items, but there's a lot of other configuration that we had to do as well," says Goossen, pointing to a layout of the Scorpio Engine processor. "As you can see, we doubled the amount of shader engines. That has the effect of improvement of boosting our triangle and vertex rate by 2.7x when you include the clock boost as well. We doubled the number of render back-ends, which has the effect of increasing our fill-rate by 2.7x. We quadrupled the GPU L2 cache size, again for targeting the 4K performance."

Additional #2 4/6 - 384-bit interface, 12GB is split

The memory bus is listed as a 384-bit interface. This probably means we're dealing with a Vega-based design. This means 12 32-bit channels, with modules running at 6.8 Gbps per pin (or GDDR5-1700, which is similar to desktop graphics cards), which accounts for the 326 GB/s figure.
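The bandwidth figure follows directly from the bus width and the per-pin data rate. A minimal sketch of that arithmetic is below; the 8 Gbps rate for the RX 480 is an assumption implied by its 256-bit bus and 256 GB/s figure quoted earlier, and the function name is purely illustrative.

```python
def bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: pins x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


CHANNEL_WIDTH_BITS = 32                # one GDDR5 channel
channels = 384 // CHANNEL_WIDTH_BITS   # number of 32-bit channels on a 384-bit bus

scorpio_bw = bandwidth_gb_s(384, 6.8)  # 384-bit bus, 6.8 Gbps per pin (GDDR5-1700)
rx480_bw = bandwidth_gb_s(256, 8.0)    # assumption: 8 Gbps per pin on the RX 480

print(f"Channels:  {channels}")
print(f"Scorpio:   {scorpio_bw:.1f} GB/s")
print(f"RX 480:    {rx480_bw:.1f} GB/s")
```

This is peak theoretical bandwidth; achieved bandwidth in games will be lower.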

The 12GB of GDDR5 is split with 4GB available for the system and 8GB available for developers. There is no ESRAM, the reason given being that the bandwidth of the GDDR5 is sufficient. The counter to this is a slightly higher latency, which Microsoft expects developers to hide when pushing higher resolutions.

Additional #3 4/6 - DX12, 360mm2, 7B transistors, 245W Power Supply

Microsoft also confirms full DX12 support, making use of new features to push draw calls.

One element of the description passed me by initially: Digital Foundry saw the silicon floor plan, and reports two clusters of four CPU cores. These might be CCX units from Zen, each being four cores. AMD stated that a Zen CCX was 44mm2 each on GloFo 14nm, so it would be about the same on TSMC. But this would devote a sizeable chunk of the die area to the CPU, a little under one-third of the chip. We don't know the size of Vega, but 36 CUs of Polaris 10 on GloFo is 232mm2 at 5.7 billion transistors. So ~230 for GPU + ~100 for CPU comes out as around 330mm2. The total die size for the combination chip is listed as 360mm2, including CPU and GPU, with four shader engines each containing 11 compute units (one is disabled per block). This is all within 7 billion transistors.
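The compute unit count and the die area estimate can be tallied in a few lines. This is only a back-of-the-envelope sketch: it reuses the GloFo 14nm figures quoted above as stand-ins for 16nm TSMC, and it counts only the two CCX blocks for the CPU, so it comes in slightly under the ~100mm2 used above.

```python
# Compute units: four shader engines of 11 CUs, one CU disabled in each.
SHADER_ENGINES = 4
CUS_PER_ENGINE = 11
DISABLED_PER_ENGINE = 1

physical_cus = SHADER_ENGINES * CUS_PER_ENGINE
active_cus = SHADER_ENGINES * (CUS_PER_ENGINE - DISABLED_PER_ENGINE)

# Die area in mm^2, using the GloFo 14nm figures as a proxy (an assumption).
CCX_AREA = 44          # one four-core Zen CCX
POLARIS10_AREA = 232   # 36-CU Polaris 10 die
LISTED_DIE_AREA = 360  # Scorpio combination chip, as listed

cpu_area = 2 * CCX_AREA                 # two CCX blocks, no uncore included
rough_area = cpu_area + POLARIS10_AREA  # CPU + GPU estimate
unaccounted = LISTED_DIE_AREA - rough_area

print(f"CUs: {physical_cus} physical, {active_cus} active")
print(f"Estimate: {rough_area} mm2 of {LISTED_DIE_AREA} mm2 ({unaccounted} mm2 unaccounted)")
```

The remainder would have to cover the four extra compute units, the wider memory interface and any other custom blocks.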

Microsoft also states that the power supply with the unit is rated for up to 245W. If we assume a low frequency Zen CPU inside, that could be around 45W max, leaving 200W for the GPU. A full sized RX 480 comes in at 150W, and given this GPU is a little larger than that, it is perhaps nearer 170W. The power supply, in a Zen + Polaris configuration, seems to have a good 20-25% power budget in hand.
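To see where that margin comes from, the sketch below subtracts the estimated CPU and GPU draw from the supply rating. The 45W, 150W and 170W values are the estimates from the paragraph above rather than measured figures, and the helper name is hypothetical.

```python
PSU_RATING_W = 245
CPU_ESTIMATE_W = 45                              # estimate for a low-frequency Zen CPU
GPU_ALLOWANCE_W = PSU_RATING_W - CPU_ESTIMATE_W  # what is left for the GPU


def headroom(gpu_draw_w: float) -> tuple[float, float, float]:
    """Return (spare watts, spare as % of PSU rating, spare as % of GPU allowance)."""
    spare = GPU_ALLOWANCE_W - gpu_draw_w
    return spare, 100 * spare / PSU_RATING_W, 100 * spare / GPU_ALLOWANCE_W


for gpu_draw in (150, 170):  # RX 480 board power, then the larger-GPU estimate
    spare, pct_psu, pct_gpu = headroom(gpu_draw)
    print(f"GPU at {gpu_draw}W: {spare:.0f}W spare "
          f"({pct_psu:.0f}% of supply, {pct_gpu:.0f}% of GPU allowance)")
```

At the RX 480's 150W the spare 50W works out to roughly 20% of the supply rating, or 25% of the 200W GPU allowance. At the 170W estimate the margin shrinks to 30W.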

Source: Digital Foundry

Based on some of the discussion from the source, it would seem that AMD is implementing a good number of its power saving features, particularly related to unique DVFS profiles per silicon die as it comes off the production line, rather than a one-size fits all approach. The silicon will also be paired with a vapor chamber cooler, using a custom centrifugal fan.

Source: Digital Foundry
