Bit
A
bit refers to a
digit in the
binary numeral system (
base 2). For example, the number 1001011 is 7 bits long. Binary digits are almost always used as the basic unit of
information storage and
communication in digital
computing and digital
information theory. Information theory also often uses the natural digit, called either a
nit or a
nat.
Quantum computing also uses
qubits, units of quantum information that can exist in a superposition of the states 0 and 1.
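As a minimal sketch of the "7 bits long" example above, the following Python snippet counts the binary digits of the numeral 1001011. The variable names are illustrative only.

```python
# Count the binary digits in the example numeral from the text.
value = 0b1001011          # the binary numeral 1001011 (decimal 75)
digits = bin(value)[2:]    # strip the "0b" prefix that bin() adds

print(digits)              # the binary digits as a string
print(len(digits))         # number of bits in the numeral
print(value.bit_length())  # same count, using the built-in int method
```

Both counts agree because the numeral has no leading zeros; `bit_length()` ignores leading zeros, whereas a stored field of fixed width would count them.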
The bit is also a unit of measurement, the information capacity of one binary digit. It has the symbol
bit, and less formally
b (see discussion below). The unit is also known as the
shannon, with symbol
Sh.
Claude E. Shannon first used the word
bit in a 1948 paper. He attributed its origin to
John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary digit" to simply "bit", forming a
portmanteau. Interestingly,
Vannevar Bush had written in 1936 of "bits of information" that could be stored on the
punch cards used in the mechanical computers of that time.
[Darwin Among the Machines: The Evolution of Global Intelligence, George Dyson, 1997. ISBN 0-201-40649-7]
A bit of storage is like a light switch; it can be either on (1) or off (0). A single bit is a one or a zero, a true or a false, a "flag" which is "on" or "off", or in general, the quantity of information required to distinguish two mutually exclusive
states from each other.
The bit is the smallest unit of storage used in computing.
It is important to differentiate between the use of "bit" in referring to a discrete storage unit and the use of "bit" in referring to a statistical unit of information. The bit, as a discrete storage unit, can by definition store only 0 or 1. A statistical bit is the amount of information that,
on average, can be stored in a discrete bit. It is thus the amount of information carried by a choice between two equally likely outcomes. One bit corresponds to about 0.693
nats (ln(2)), or 0.301
hartleys (log₁₀(2)).
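The conversion factors quoted above can be checked with a short calculation. This is a sketch only; the constant and function names are chosen here for illustration.

```python
import math

# One bit expressed in the other units of information.
NATS_PER_BIT = math.log(2)        # ln(2), about 0.693
HARTLEYS_PER_BIT = math.log10(2)  # log10(2), about 0.301


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# A choice between two equally likely outcomes carries exactly one bit.
fair_choice = entropy_bits([0.5, 0.5])

print(round(NATS_PER_BIT, 3), round(HARTLEYS_PER_BIT, 3), fair_choice)
```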
Consider, for example, a
computer file with 1,000 0s and 1s which can be
losslessly compressed to a file of 500 0s and 1s (on average, over all files of that kind). The original file, although having 1,000 bits of storage, has at most 500 bits of
information entropy, since information is not destroyed by lossless compression. A file cannot contain more bits of information than it has bits of storage. If these two ideas need to be distinguished, sometimes the name
bit is used when discussing data storage while
shannon is used for the statistical bit. However, most of the time, the meaning is clear from the context.
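The gap between storage bits and statistical bits can be illustrated numerically. The sketch below assumes a hypothetical source that emits each digit independently, with a 1 appearing 11% of the time. That bias is not from the text; it is chosen only because it gives an entropy close to half a bit per stored digit, which matches the 1,000-to-500 example.

```python
import math


def binary_entropy(p_one):
    """Entropy in bits per stored bit when a 1 occurs with probability p_one."""
    if p_one in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    p_zero = 1.0 - p_one
    return -(p_one * math.log2(p_one) + p_zero * math.log2(p_zero))


storage_bits = 1000
p_one = 0.11  # assumed bias (hypothetical), giving roughly 0.5 bit per digit
information_bits = storage_bits * binary_entropy(p_one)

print(f"{storage_bits} storage bits hold about {information_bits:.0f} bits of entropy")
```

Under this assumption, no lossless compressor can shrink such files below roughly 500 bits on average, while a file from an unbiased source (`p_one = 0.5`) cannot be compressed on average at all.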
No uniform agreement has been reached yet about what the official unit symbols for bit and
byte should be. One commonly quoted standard, the
International Electrotechnical Commission's
IEC 60027, specifies that "bit" should be the unit symbol for the unit bit (e.g. "kbit" for kilobit), but it does not yet define any symbol for the unit byte.
The other commonly quoted relevant standard,
IEEE 1541, specifies "b" to be the unit symbol for bit and "B" to be that for byte. This convention is also widely used in computing, but has so far not been considered acceptable internationally for several reasons:
* both these symbols are already used for other units: "b" for
barn and "B" for
bel;
* "bit" is already short for "binary digit", so there is little reason to abbreviate it any further;
* it is customary to start a unit symbol with an uppercase letter only if the unit was named after a person (see also
Claude Émile Jean-Baptiste Litre);
* instead of byte, the term
octet (unit symbol: "o") is used in some fields and in some French-speaking countries, which adds to the difficulty of agreeing on an international symbol;
* "b" is occasionally also used for byte, along with "bit" for bit.
The unit bel is rarely used by itself (only as decibel, "dB"), so the chances of conflict with "B" for byte are quite small, even though both units are very commonly used in the same fields (e.g., telecommunication).
The combination of the symbols "bit" for bit and "B" for byte is also widely used in computing, and is perhaps least ambiguous. It is proposed, for example, in Aubrey Jaffer's
Metric Interchange Format.
A
byte is a collection of bits, originally variable in size but now almost always eight bits. Eight-bit bytes, also known as
octets, can represent 256 values (2⁸ values, 0–255). A four-bit quantity is known as a
nibble, and can represent 16 values (2⁴ values, 0–15).
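The value counts for bytes and nibbles follow directly from the number of bits, as this short sketch shows. The helper `split_nibbles` is a hypothetical name introduced here for illustration.

```python
BITS_PER_BYTE = 8
BITS_PER_NIBBLE = 4

byte_values = 2 ** BITS_PER_BYTE      # number of distinct eight-bit patterns
nibble_values = 2 ** BITS_PER_NIBBLE  # number of distinct four-bit patterns


def split_nibbles(byte):
    """Return the (high, low) four-bit halves of an eight-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    return byte >> 4, byte & 0x0F


high, low = split_nibbles(0xA7)
print(byte_values, nibble_values, high, low)
```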
"Word" is a term for a larger group of bits, but it has no standard size. It typically corresponds to the size of one register in a computer's
CPU. In the
IA-32 architecture, 16 bits are called a "word" (with 32 bits being a
double word or dword), but other architectures have word sizes of 8, 32, 64 or 80 bits, among others.
Terms for large quantities of bits can be formed using the standard range of prefixes, e.g.,
kilobit (
kbit) (1,000 bits),
megabit (
Mbit) and
gigabit (
Gbit). Note that much confusion exists regarding these units and their abbreviations (see above).
Certain
bitwise computer
processor instructions (such as
xor) operate on individual bits rather than on data interpreted as an aggregate of bits.
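Most programming languages expose these bitwise operations directly. The following sketch uses Python's integer operators to set, clear, toggle and test a single bit. The helper names are illustrative and do not come from any particular library.

```python
def set_bit(value, n):
    """Return value with bit n forced to 1."""
    return value | (1 << n)


def clear_bit(value, n):
    """Return value with bit n forced to 0."""
    return value & ~(1 << n)


def toggle_bit(value, n):
    """Return value with bit n inverted, using xor."""
    return value ^ (1 << n)


def is_bit_set(value, n):
    """Return True if bit n of value is 1."""
    return bool((value >> n) & 1)


flags = 0b0101
print(bin(set_bit(flags, 1)), bin(clear_bit(flags, 0)), bin(toggle_bit(flags, 2)))
print(is_bit_set(flags, 2), bin(0b1100 ^ 0b1010))
```

Bit positions are counted from 0 at the least significant end, which is the usual convention but still an assumption of this sketch.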
Telecommunications or
computer network transfer rates are usually described in terms of
bits per second (not to be confused with
baud).
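Because rates are quoted in bits per second while file sizes are usually quoted in bytes, a factor of eight appears in transfer-time estimates. The sketch below assumes eight-bit bytes and decimal prefixes (1 kbit = 1,000 bits), and ignores protocol overhead. The function names and the 2400-baud figure are illustrative assumptions.

```python
KBIT = 1_000          # bits in a kilobit (decimal prefix)
MBIT = 1_000_000      # bits in a megabit
GBIT = 1_000_000_000  # bits in a gigabit


def transfer_seconds(size_bytes, rate_bits_per_second):
    """Idealised time to send size_bytes over a link, ignoring overhead."""
    return size_bytes * 8 / rate_bits_per_second


def bit_rate(baud, bits_per_symbol):
    """Bit rate of a line that sends `baud` symbols per second."""
    return baud * bits_per_symbol


print(transfer_seconds(1_000_000, 8 * MBIT))  # 1,000,000 bytes over 8 Mbit/s
print(bit_rate(2400, 4))                      # 2400 baud at 4 bits per symbol
```

The second function shows why baud and bits per second differ: baud counts symbols per second, and one symbol may carry more than one bit.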
* Integral data type
* Bitstream
* Information entropy
* Qubit
* Binary arithmetic
* Ternary numeral system