X86 architecture
x86 or
80x86 is the generic name of a
microprocessor architecture first developed and manufactured by
Intel. The x86 architecture has dominated the desktop computer, portable computer, and small server markets since the 1980s, starting with the
IBM PC, running primarily versions of
Microsoft Windows and
Unix variant
operating systems. Although more modern architectures such as PowerPC have challenged the x86 as a replacement for many niches, none have so far supplanted the x86 for its core markets.
[Image: An Intel Pentium 4 chip; early Northwood build]
The architecture is called
x86 because the earliest processors in this family were identified by model numbers ending in the sequence "86": the
8086, the
80186, the
80286, the
386, and the
486. Because one cannot establish
trademark rights on numbers, Intel and most of its competitors began to use trademark-acceptable names such as
Pentium for subsequent generations of processors, but the earlier naming scheme remains as a term for the entire family.
By the late 1970s, minicomputers were running up against the 64 KB limit of 16-bit addressing as memory became cheaper to install. Most minicomputer companies redesigned their processors to handle 32-bit addresses and data fully. The Intel 8086 instead adopted a much-criticized stopgap, segment registers, which raised the address width by 4 bits, from 16 bits (64 KB) to 20 bits (1 MB). Code and data could be managed within "near" 16-bit segments inside the larger 1 MB address space, or a compiler could operate in a "far" mode using both segment and offset. Although that limit also proved too small by the mid-1980s, it was ideal for the emerging PC market, and it made translating software from the older 8080 to the newer processor very simple.
As hardware has evolved, the architecture has twice been extended to a larger
word size. In 1985, Intel released the 32-bit 386 to replace the 16-bit 286. The 32-bit architecture is called
x86-32 or
IA-32 (an abbreviation for
Intel
Architecture, 32-bit). In 2003,
AMD introduced the
Athlon 64, which implemented a further extension to the architecture to 64 bits, variously called
x86-64,
AMD64 (AMD),
EM64T or
IA-32e (Intel), and
x64 (
Microsoft), not to be confused with
IA-64.
The x86 architecture first appeared inside the Intel
8086 CPU in 1978; the 8086 was a development of the
Intel 8080 processor (which itself followed the
4004 and
8008), and programs in 8080 assembler language could be mechanically translated to equivalent programs in 8086 assembler language. It was adopted (in the externally simpler 8-bit bus
8088 version) three years later as the standard CPU of the
IBM PC. The ubiquity of the PC platform has resulted in the x86 becoming numerically the most successful CPU architecture ever. (Another successful CPU design, based on and instruction-set compatible at the machine-language binary level with the 8080, is the
Zilog Z80 architecture.)
Companies such as
Cyrix,
NEC Corporation,
IBM,
IDT and
Transmeta have manufactured
CPUs conforming to the x86 architecture. The most successful of the clone manufacturers is
AMD, whose
Athlon series, while not as popular as the
Pentium series, has a significant market share.
Intel introduced the
IA-64, a separate 64-bit architecture used in its
Itanium processors and Itanium Processor Family (IPF). IA-64 is a completely new system that bears no resemblance to the x86 architecture, which might affect its marketplace acceptance; it should not be confused with
IA-32, which is synonymous with the 32-bit version of x86.
The x86 architecture is a variable instruction length
CISC design with emphasis on
backward compatibility. Word-sized memory accesses to unaligned addresses are allowed. Words are stored in
little-endian order. During
execution, current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces, micro-ops, which are readily executed by a
RISC-like micro-architecture.
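As a small illustration of the little-endian byte order mentioned above, the following Python sketch packs a 16-bit word the way an x86 processor lays it out in memory. The value 0x1234 and the variable names are arbitrary choices for the example, not taken from the article.

```python
import struct

# Pack a 16-bit word in little-endian order ("<H"), the order x86 uses:
# the least significant byte is stored at the lower memory address.
word = 0x1234  # arbitrary example value
stored = struct.pack("<H", word)

# Byte at the lower address, then byte at the higher address.
low_address_byte, high_address_byte = stored[0], stored[1]

# Reading the two bytes back in the same order recovers the word.
(recovered,) = struct.unpack("<H", stored)
```

The low-order byte (34h) ends up first in memory and the high-order byte (12h) second.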
The Intel 8086 and 8088 have fourteen 16-bit registers. Four of them (AX, BX, CX, DX) are general purpose, although each also has an additional purpose; for example, only CX can be used as a counter with the
loop instruction. Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and its low byte as BL). Four segment registers (CS, DS, SS and ES) are used to form a memory address. There are two pointer registers: SP points to the top of the stack (the most recently pushed word), and BP is used to point at some other place in the stack or in memory. Two registers (SI and DI) are for array indexing. The
FLAGS register contains
flags such as
carry,
overflow and zero. Finally, the instruction pointer (IP) points to the current instruction.
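The relationship between a 16-bit general-purpose register and its two byte halves can be sketched in Python as follows. This is only a model of the arithmetic; the helper names `split_register` and `join_register` are hypothetical and the value in BX is an arbitrary example.

```python
def split_register(value16):
    """Return (high_byte, low_byte) of a 16-bit register value, e.g. BX -> (BH, BL)."""
    value16 &= 0xFFFF
    return (value16 >> 8) & 0xFF, value16 & 0xFF


def join_register(high, low):
    """Rebuild the 16-bit register value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


bx = 0x12AB                      # arbitrary example contents of BX
bh, bl = split_register(bx)      # BH holds 12h, BL holds ABh
bx_after = join_register(0x7F, bl)  # writing BH changes only the upper byte of BX
```

Writing to one half leaves the other half untouched, which is why the byte registers can be used as independent 8-bit registers.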
The 8086 has 64
KB of 8-bit (or alternatively 32 K-word of 16-bit)
I/O space, and a 64 KB (one segment)
stack in memory supported by
hardware. Only words (2 bytes) can be pushed onto the stack. The stack grows downwards (toward numerically lower addresses), and its top is pointed to by SS:SP. There are 256
interrupts, which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return address.
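The stack behaviour described above (word-sized pushes, growth toward lower addresses) can be modelled with a short Python sketch. It is a toy model of a single 64 KB stack segment, not an emulator; the class name `Stack8086` and the initial SP value are assumptions made for the example.

```python
class Stack8086:
    """Toy model of one 64 KB stack segment addressed by SS:SP."""

    def __init__(self, sp=0x0100):
        self.memory = bytearray(0x10000)  # one 64 KB segment
        self.sp = sp                      # offset of the current top of stack

    def push(self, word):
        # SP is decremented by 2 first, then the word is stored little-endian.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self.sp] = word & 0xFF
        self.memory[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        # The word is read from SS:SP, then SP is incremented by 2.
        low = self.memory[self.sp]
        high = self.memory[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = Stack8086()
stack.push(0x1234)
stack.push(0xBEEF)
sp_after_pushes = stack.sp   # two words pushed: SP moved down by 4 bytes
first = stack.pop()          # last word pushed comes off first
second = stack.pop()
```

Each push lowers SP by two, so the most recently pushed word always sits at the numerically lowest address in use.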
Real mode
Real mode is an operating mode of
80286 and later
x86-compatible
CPUs. Real mode is characterized by a 20-bit segmented memory address space (meaning that only 1
MB of memory can be addressed), direct software access to
BIOS routines and peripheral hardware, and no concept of
memory protection or
multitasking at the hardware level. All x86 CPUs in the
80286 series and later start up in real mode at power-on;
80186 CPUs and earlier had only one operational mode, which is equivalent to real mode in later chips.
In real mode, memory access is
segmented. The segment value is shifted left by 4 bits and an offset is added to obtain a final 20-bit address. For example, if DS is A000h and SI is 5677h, DS:SI will point at the absolute address DS × 16 + SI = A5677h. Thus the total address space in real mode is 2^20 bytes, or 1
MiB, quite an impressive figure for 1978. All memory addresses consist of both a segment and an offset; every type of access (code, data, or stack) has a default segment register associated with it (for data the register is usually DS, for code it is CS, and for stack it is SS). For data accesses, the segment register can be explicitly specified (using a segment override prefix) to use any of the four segment registers.
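The address calculation above can be written as a one-line function. The Python sketch below uses a hypothetical helper name, `linear_address`; the final mask is an assumption that models the 20 address lines of the 8086, where results beyond 1 MiB wrap around.

```python
def linear_address(segment, offset):
    """Real-mode address: segment shifted left by 4 bits, plus the 16-bit offset."""
    # Masking to 20 bits models the 8086's 20 address lines (an assumption
    # of this sketch; later processors can behave differently).
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


ds, si = 0xA000, 0x5677          # the example values from the text
address = linear_address(ds, si)
address_space_bytes = 2 ** 20    # 20-bit addresses: 1 MiB in total
```

With the values from the text, `address` is A5677h.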
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. This scheme makes it impossible to use more than four segments at once. CS and SS are vital for the correct functioning of the program, so that only DS and ES can be used to point to data segments outside the program (or, more precisely, outside the currently-executing segment of the program) or the stack. This scheme was intended as a compatibility measure with the
Intel 8085.
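To make the aliasing concrete, the following Python sketch enumerates the segment/offset pairs that reach a given absolute address. The function name `aliases` is hypothetical, and the sketch deliberately ignores address wraparound above 1 MiB.

```python
def aliases(address):
    """List every (segment, offset) pair with segment * 16 + offset == address.

    Wraparound past the top of the address space is not considered.
    """
    pairs = []
    highest_segment = min(address >> 4, 0xFFFF)
    for segment in range(highest_segment, -1, -1):
        offset = address - (segment << 4)
        if offset > 0xFFFF:      # offset no longer fits in 16 bits: stop
            break
        pairs.append((segment, offset))
    return pairs


pairs = aliases(0xA5677)
```

Both pairs mentioned in the text, A000h:5677h and A111h:4567h, appear in the result, because moving the segment up by one and the offset down by 16 leaves the absolute address unchanged.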
The segmented nature can make programming and compiler design difficult because the use of near and far pointers affects performance. The introduction of bank switching schemes such as EEMS made programming even more complicated before the adoption of 32-bit addressing with later processors.
16-bit protected mode
In addition to real mode, the Intel 80286 supports protected mode, expanding addressable
physical memory to 16
MB and addressable
virtual memory to 1
GB. This is done by using the segment registers only for storing an index to a segment table. There are two such tables, the
GDT and the
LDT, each holding up to 8192 segment descriptors, each segment giving access to 64 KB of memory. The segment table provides a 24-bit
base address, which can be added to the desired offset to create an absolute address. Each segment can be assigned one of four
ring levels used for hardware-based
computer security.
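The figures above can be checked with a little arithmetic, and the way a segment register value is interpreted in protected mode can be sketched in Python. The field layout used here (a 13-bit index in the upper bits, one table-indicator bit, and a 2-bit requested privilege level) is the standard selector format documented by Intel rather than something stated in this article; the function and constant names are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit protected-mode selector into its three fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,             # descriptor number, 0..8191
        "table": "LDT" if selector & 0x4 else "GDT",   # table indicator bit
        "rpl": selector & 0x3,                         # requested privilege level, 0..3
    }


TABLES = 2                      # GDT and LDT
DESCRIPTORS_PER_TABLE = 8192    # 13-bit index
SEGMENT_SIZE = 64 * 1024        # 16-bit offset: 64 KB per segment

virtual_bytes = TABLES * DESCRIPTORS_PER_TABLE * SEGMENT_SIZE   # 1 GB
physical_bytes = 2 ** 24                                        # 24-bit base: 16 MB

fields = decode_selector(0x000F)
```

Two tables of 8192 descriptors, each covering 64 KB, give the 1 GB of virtual address space; the 24-bit base address accounts for the 16 MB physical limit.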
Because real mode
DOS programs may do direct hardware access or perform segment arithmetic, both incompatible with protected mode, an operating system (OS) is limited in its ability to run these applications as
processes. To overcome these difficulties, Intel introduced the 80386 with
virtual 8086 mode. While still subject to paging, it uses real mode to form linear addresses and allows the OS to
trap both I/O and memory access. By design, protected mode programs do not assume a relation between selector values and physical addresses.
Operating systems like
OS/2 try to switch the processor between protected and real modes. This is both slow and unsafe, because a real mode program can easily
crash a computer. OS/2 defines restrictive programming rules allowing a
Family API or
bound program to run in either real or protected mode.
Windows 3.0 is meant to run real mode programs in 16-bit protected mode. If a Windows 1.x or 2.x program is written properly and avoids segment arithmetic, it runs equally well in real and protected modes. Windows programs generally avoid segment arithmetic because Windows implements a software virtual memory scheme, moving program code and data in memory when programs are not running, so manipulating absolute addresses is dangerous; programs should only keep
handles to memory blocks when not running. Starting an old program while Windows 3.0 is running in protected mode triggers a warning dialog that suggests either running Windows in real mode or obtaining an updated version of the application. Updating well-behaved programs with a special tool avoids this dialog. It is not possible to have some GUI programs running in 16-bit protected mode and other GUI programs running in real mode. In
Windows 3.1, real mode was removed.
32-bit protected mode
The
Intel 80386 introduced a significant advance in x86 architecture: an all
32-bit design supporting
paging. All of the registers, instructions, I/O space and memory are 32-bit. Memory is accessed through a 32-bit extension of protected mode. As in the 286, segment registers are used to index a segment table describing the division of memory. With a 32-bit offset, every application may access up to 4
GB (or more with
memory segments). In addition, 32-bit protected mode supports
paging, a mechanism making it possible to use
virtual memory. An exception to this design is the
Intel 80386SX, which is 32-bit with
24-bit addressing and a
16-bit data bus.
No new general-purpose registers were added. All 16-bit registers except the segment registers were expanded to 32 bits. This is represented by prefixing an "E" to the register
names (thus the expanded AX became EAX, SI became ESI and so on). With a greater number of registers, instructions and operands, the
machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa.
Paging and segmented memory access are required for modern multitasking operating systems.
Linux,
386BSD and
Windows NT were developed for the 386 because it was the first x86 CPU to support paging and 32-bit segment offsets. The 386 architecture became the basis of all further development in the x86 series. The success of Windows 3.1, the first widely accepted version, was largely because of compatibility with the 386 processor, even though it was used mainly to run multiple sessions rather than to take advantage of the native 32-bit
instruction set.
The
Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486 (the 486SX, sold as a budget processor, had its co-processor disabled or removed). The new
floating point unit (FPU) performs
floating point calculations, important for scientific applications and graphic design.
MMX and beyond
MMX is a
SIMD instruction set designed by Intel, introduced in 1997 for
Pentium MMX microprocessors. It developed out of a similar unit first used on the
Intel i860. It is supported on most subsequent IA-32 processors by Intel and other vendors. MMX is typically used for video applications.
MMX added 8 new
64-bit registers to the architecture, known as MM0 through MM7 (generically MMn). In reality, these new registers are aliases for the existing x87 FPU stack registers. Hence, anything done to the floating point stack also affects the MMX registers. Unlike the floating point stack, these MMn registers are
randomly accessible.
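To illustrate what SIMD means for a 64-bit MMX register, the Python sketch below models a packed byte addition in the style of the MMX PADDB instruction: the register is treated as eight independent 8-bit lanes, and each lane wraps around on overflow without carrying into its neighbour. This is a plain software model for illustration only; the function name `paddb` and the operand values are chosen for the example.

```python
def paddb(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes (wraparound per lane)."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (lane_sum & 0xFF) << shift   # discard the carry out of the lane
    return result


mm0 = 0x01020304050607FF   # arbitrary example contents of two MMn registers
mm1 = 0x0101010101010101

packed = paddb(mm0, mm1)                      # lane-wise addition
scalar = (mm0 + mm1) & 0xFFFFFFFFFFFFFFFF     # ordinary 64-bit addition, for contrast
```

In the lowest lane FFh + 01h wraps to 00h; in the packed result the neighbouring lane is unaffected, whereas the ordinary 64-bit addition propagates a carry into it.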
3DNow!
In 1998 AMD introduced 3DNow!, which consisted of SIMD floating point instruction enhancements to MMX. The introduction of this technology coincided with the rise of
3D entertainment applications and was designed to improve the CPU's
vector processing performance of graphic-intensive applications. 3D video game developers and 3D graphics hardware vendors use 3DNow! to enhance their performance on AMD's
K6 and
Athlon series of processors.
SSE
In 1999 Intel introduced the Streaming SIMD Extensions (SSE)
instruction set, which added eight new 128-bit registers (not overlaid with other registers) and 70 floating point instructions.
SSE2
In 2000 Intel introduced the SSE2 instruction set which added 1) a complete complement of integer instructions (analogous to MMX) to the original SSE registers and 2) 64-bit SIMD floating point instructions to the original SSE registers. The first addition made MMX almost obsolete, and the second allowed the instructions to be realistically targeted by conventional compilers.
SSE3
Introduced in
2004 along with the
Prescott revision of the
Pentium 4 processor, SSE3 added specific memory and
thread-handling instructions to boost the performance of Intel's
HyperThreading technology.
AMD later licensed the SSE3 instruction set for its latest (E) revision Athlon 64 processors. The SSE3 instruction set included on the new Athlons lacks only a couple of the instructions that Intel designed for HyperThreading, since the
Athlon 64 does not support HyperThreading; however SSE3 is still recognized in software as being supported on the platform.
64-bit
By 2002, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space would allow the processor to directly address only 4 GB of data - a size frequently surpassed by applications such as video processing or
database engines.
Intel had originally decided not to extend x86 to 64 bits as it had to 32 bits, and instead introduced a new architecture called
IA-64. IA-64 technology is the basis for its
Itanium line of processors. IA-64 provides backward compatibility with older 32-bit x86 software; this mode of operation, however, is exceedingly slow.
AMD, which traditionally would follow the lead of Intel, took the initiative of extending the 32-bit x86 (which Intel calls
IA-32) to
64-bit. It came up with an architecture, called
AMD64 (or
x86-64, prior to rebranding), and based the
Opteron and
Athlon 64 family of processors on this technology. The success of the AMD64 line of processors coupled with the lukewarm reception of the IA-64 architecture prompted Intel to ironically reverse-engineer and adopt the AMD64 instruction set, adding some new extensions of its own and branding it the
EM64T architecture. In its literature and product version names, Microsoft refers to this processor architecture as x64. By 2006, it was mainly used in very high-end servers, though its small additional cost and its performance as a 32-bit processor with growth potential made it a competitive offering in desktop and laptop PCs as well.
This was the first time that a major upgrade of the x86 architecture was initiated and originated by a manufacturer other than Intel. Perhaps more importantly, it was the first time that Intel actually accepted technology of this nature from an outside source.
Virtualization
x86
virtualization is difficult because the architecture does not meet the
Popek and Goldberg virtualization requirements. Nevertheless, there are several commercial
x86 virtualization products, such as
VMware,
Parallels and
Microsoft Virtual PC, as well as open source virtualization projects like
Qemu. Intel and AMD have both announced that future x86 processors will have new enhancements to facilitate more efficient virtualization. Intel's code names for their virtualization features are "Vanderpool" and "Silvervale"; AMD uses the code name "Pacifica".
An x86 system-on-a-chip is a combination of an x86 CPU
core with a
northbridge (
memory controller) and a
southbridge (input/output (I/O) controller) in a single
integrated circuit (IC).
x86 and compatibles have been designed, manufactured and sold by a number of companies, including:
* Intel
* AMD
* Chips and Technologies
* Cyrix
* IBM
* IDT
* National Semiconductor
* NEC
* NexGen
* Rise Technology
* SGS-Thomson
* SiS
* Texas Instruments
* Transmeta
* UMC
* VIA
* 8086 - first member is Intel 8086 (and derivatives), later multiple clones appeared.
* 80186 - first member is Intel 80186 (and derivatives), later multiple clones appeared.
* 80286 - first member is Intel 80286, later multiple clones appeared.
* 80386 - first member is Intel 80386 (and derivatives), later multiple clones appeared.
* 80486 - first member is Intel 80486 (and derivatives), later multiple clones appeared.
* 80586 - first member is Pentium (and derivatives), later appeared Nx586, 5x86, 5k86, WinChip, mP6
* 80686 - first member is Pentium Pro (and derivatives, incl. Pentium M and Core), later appeared 6x86, K6, C3, Crusoe
* 80786 - first member is Athlon (and derivatives), later appeared Pentium 4 (and derivatives), C7, Efficeon
* 80886 - first member is Opteron (and derivatives, incl. Athlon 64), later appeared Core 2
* IA-32
* x86 assembly language
* x86 instruction listings
* x87
* Real mode, Unreal mode, Virtual 8086 mode, Protected mode, Long mode