Introduction
The purpose of this TekSpek is to delineate the publicly
known features of Intel's next generation desktop microarchitecture.
Codenamed Conroe and officially titled Intel Core 2 Duo, it's loosely
based on the current mobile Yonah (Core Duo) underpinnings.
General architecture
Intel will be unifying its platform CPUs into one family with a
broadly similar microarchitecture. Woodcrest, Conroe and Merom represent
the codenames for server, desktop and mobile CPUs, respectively,
and all are based on the current Yonah microarchitecture, albeit
with improvements in performance on a clock-for-clock basis.
Referring specifically to Intel's Core 2 Duo, or Conroe, Intel
claims that it's an energy-efficient design, delivering a high performance-per-watt
ratio. Here's why.
Dual core
Conroe-based CPUs will house two execution cores on a single
piece of silicon. The cores communicate with the rest of the system
via a single bus, which will be clocked at 1066MHz and offer
around 8.5GB/s CPU-to-MCH bandwidth. Initial Conroe CPUs will be
manufactured on Intel's proven 65nm process. Projections state that
45nm production will begin in Q2 2007. Conroe uses a 14-stage
execution pipeline, down (read: better) from the 31 stages of the
Prescott-based Pentium 4.
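The bandwidth figure quoted above can be checked with a little arithmetic.
The sketch below assumes a 64-bit-wide bus data path, which the text does
not state explicitly, and treats 1066MHz as the effective transfer rate.

```python
# Peak front-side bus bandwidth = transfers per second x bytes per transfer.
# Assumption (not stated in the article): the bus data path is 64 bits wide.
BUS_WIDTH_BITS = 64


def peak_bandwidth_gb_s(transfers_mhz, width_bits=BUS_WIDTH_BITS):
    """Return peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = width_bits / 8
    return transfers_mhz * 1e6 * bytes_per_transfer / 1e9


conroe_fsb = peak_bandwidth_gb_s(1066)
print(f"{conroe_fsb:.1f} GB/s")
```

This is a theoretical peak; real-world throughput will be lower.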
CPUs will initially be packaged in the present LGA775 form factor,
and most i975X chipset-based motherboards that are manufactured
post-April 2006 will support its differing power requirements. Intel
will also be launching a range of new chipsets, the 965-series,
that will carry native Conroe support.
Wide Dynamic Execution
Current x86 processors can deliver three instructions per clock cycle.
Conroe, however, has been architected to fetch, dispatch, execute
and retire up to four full instructions simultaneously, offering
a 33% boost over, say, a Pentium 4 CPU. Allied to this, Conroe also
supports what Intel terms Macro-Fusion, which can combine certain
pairs of common x86 instructions (a compare followed by a conditional
jump, say) into a single micro-op for execution, thereby reducing
overall processing time.
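To make the idea concrete, here is a toy model of macro-fusion together
with the issue-width arithmetic. The mnemonic sets and the fuse() helper
are illustrative assumptions, not Intel's actual fusion rules.

```python
# Toy model of macro-fusion: an adjacent compare + conditional-jump pair
# is merged into one fused operation before execution.
# The mnemonic sets below are illustrative, not Intel's actual fusion rules.
COMPARES = {"cmp", "test"}
COND_JUMPS = {"je", "jne", "jl", "jg", "jle", "jge"}


def fuse(instructions):
    """Return a new list with fusible pairs collapsed into single entries."""
    fused = []
    i = 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if cur in COMPARES and nxt in COND_JUMPS:
            fused.append(f"{cur}+{nxt}")  # one op instead of two
            i += 2
        else:
            fused.append(cur)
            i += 1
    return fused


program = ["mov", "add", "cmp", "jne", "mov", "test", "je", "ret"]
fused_program = fuse(program)

# Going from three to four instructions per clock is a one-third increase.
issue_width_gain = (4 - 3) / 3
print(len(program), "->", len(fused_program), "ops")
print(f"issue width gain: {issue_width_gain:.0%}")
```

Fewer operations in flight means the four-wide machine has more room for
useful work each clock.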
Intel Smart Memory Access and Advanced Smart Cache
Conroe will be equipped with 4MiB of on-chip L2 cache, minimising
the need to run back to system memory for frequently used data.
Unlike the present Pentium 4 microarchitecture, Conroe's two cores
will share the cache between them. Intel's engineering team
has found that letting the cores dynamically allocate and use
cache as needed is more efficient than allotting a fixed per-core amount.
By varying the amount of cache split over the two cores Intel hopes
that cache misses, the bane of modern CPUs in terms of execution
efficiency, will be further reduced.
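A small simulation illustrates why a shared pool can beat a fixed split.
It is a minimal sketch under stated assumptions: a plain LRU replacement
policy, a tiny eight-line cache and a made-up workload in which core 0
loops over six lines while core 1 loops over two. None of this reflects
Conroe's real cache organisation.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache that only counts misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
        else:
            self.misses += 1
            self.lines[tag] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used


# Hypothetical workload: core 0 cycles over 6 lines, core 1 over 2 lines.
trace = []
for _ in range(10):
    for i in range(6):
        trace.append((0, i))
        trace.append((1, i % 2))

# Fixed split: each core gets its own 4-line cache.
split = {0: LRUCache(4), 1: LRUCache(4)}
for core, line in trace:
    split[core].access((core, line))
split_misses = split[0].misses + split[1].misses

# Shared: both cores draw on one 8-line cache.
shared = LRUCache(8)
for core, line in trace:
    shared.access((core, line))
shared_misses = shared.misses

print("split misses:", split_misses)
print("shared misses:", shared_misses)
```

With the fixed split, core 0 thrashes its four lines while core 1 leaves
half of its allocation idle; the shared cache holds both working sets at
once.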
Conroe will also support what Intel terms Smart Memory Access.
Put simply, and falling under the banner of memory disambiguation,
it's a form of out-of-order, built-in intelligence that predicts
and loads the upcoming instruction data before current store instructions
have been processed. Intel has designed algorithms that can accurately
predict whether a load can be processed before the store, thereby,
again, potentially saving overall execution time.
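The prediction step can be pictured with a toy saturating counter per
load. This is purely an illustration of the general idea: the class name,
threshold and counter ceiling are hypothetical, and Intel's actual
algorithm is not described in the text.

```python
class LoadHoistPredictor:
    """Toy predictor: hoist a load above earlier stores unless it has
    recently clashed with one. All parameters are hypothetical."""

    def __init__(self, threshold=2, ceiling=3):
        self.threshold = threshold
        self.ceiling = ceiling
        self.conflicts = {}  # load identifier -> saturating conflict count

    def may_hoist(self, load_id):
        """Predict whether the load is safe to run before pending stores."""
        return self.conflicts.get(load_id, 0) < self.threshold

    def record(self, load_id, clashed_with_store):
        """Train on the real outcome once the store addresses are known."""
        count = self.conflicts.get(load_id, 0)
        if clashed_with_store:
            count = min(count + 1, self.ceiling)
        else:
            count = max(count - 1, 0)
        self.conflicts[load_id] = count


predictor = LoadHoistPredictor()
history = []
for clashed in [False, True, True, False, False]:
    history.append(predictor.may_hoist("load_A"))
    predictor.record("load_A", clashed)

print(history)
```

A wrong guess has to be detected and the load replayed, so a scheme like
this only pays off when clashes are rare.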
Coupled with a heavily optimised cache, Conroe's memory access
latency will be lower than the present Pentium 4's. Intel has toyed
with the idea of integrating a memory controller right on the CPU
die itself, a la AMD, but reckons that Conroe's intelligent architecture
masks latency well enough for it to do without.
Advanced Media Boost
Increasing efficiency with Streaming SIMD (Single Instruction Multiple
Data) Extensions, Conroe CPUs will be able to process a 128-bit
instruction in a single clock cycle, rather than the two clocks
that current generations require.
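In back-of-the-envelope terms, that halves the clocks spent on a stream
of 128-bit operations. The sketch below assumes an idealised case with
nothing else limiting throughput, and the operation count is arbitrary.

```python
# Toy throughput comparison for 128-bit SSE operations.
# Assumption: each op occupies the unit for `cycles_per_op` clocks and
# nothing else (memory, dependencies) gets in the way.
def cycles_needed(num_ops, cycles_per_op):
    return num_ops * cycles_per_op


ops = 1_000_000                    # arbitrary workload size
previous = cycles_needed(ops, 2)   # two clocks per 128-bit op
conroe = cycles_needed(ops, 1)     # one clock per 128-bit op
speedup = previous / conroe
print(f"{previous} vs {conroe} clocks, {speedup:.1f}x")
```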
Intelligent Power Capability
Intel has designed Conroe not only to perform well on a clock-for-clock
basis but also to be energy-efficient whilst doing so. This is precisely
where its mobile heritage shines through. Intelligent power management
(intelligent seems to be the watchword for Conroe) monitors core usage
and application requirements so that it can power-gate parts of the
CPU when they are not in use; there's little need for two cores running
at full power in single-threaded applications, for example. Intel reckons
that it has improved the physical implementation of power-gating enough
for Conroe to consume less power than previous generations.
Initial reports indicate that the majority of Conroe models will
carry a 65W TDP, half that of the Pentium Extreme Edition 965
CPU. The Conroe Extreme, however, is reckoned to ship with a slightly
higher 75W TDP.
SSE4, Virtualisation Technology, 64-bit processing
Keeping the features list ticking over is support for SSE4,
which adds 16 new instructions to the present SSE3 set
and, ideally, increases computational speed for SIMD-based execution.
Virtualisation Technology offers hardware-isolated virtual partitions
that allow the user to run multiple operating systems on one PC,
and 64-bit processing is carried over from the Pentium 4 line of
CPUs.
Overall thoughts
Conroe, the desktop arm of Intel's new microarchitecture, will
debut at the beginning of Q3 2006. It will replace the present Pentium
4 as the performance processor of choice. Taking the excellent Intel
Core architecture as a base and adding a sprinkling of performance-enhancing
attributes, it looks, on paper, to address both the performance-
and power-related shortcomings of the NetBurst Pentium 4 architecture.
Further afield, Intel will continue to invest in the Core 2 Duo
architecture by teaming up two Conroe dies in one package, giving
four execution cores and 8MiB of L2 cache. That'll be 2007, Kentsfield
and another TekSpek.