2724 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 6, OCTOBER 1998
Quantum Information Theory
Charles H. Bennett and Peter W. Shor
(Invited Paper)
Abstract—We survey the field of quantum information theory.
In particular, we discuss the fundamentals of the field, source
coding, quantum error-correcting codes, capacities of quantum
channels, measures of entanglement, and quantum cryptography.
Index Terms— Entanglement, quantum cryptography, quan-
tum error-correcting codes, quantum information, quantum
source coding.
I. INTRODUCTION
RECENTLY, the historic connection between informationand physics has been revitalized, as the methods of
information and computation theory have been extended to
treat the transmission and processing of intact quantum states,
and the interaction of such “quantum information” with tradi-
tional “classical” information. Although many of the quantum
results are similar to their classical analogs, there are notable
differences. This new research has the potential to shed light
both on quantum physics and on classical information theory.
In retrospect, this development seems somewhat belated,
since quantum mechanics has long been thought to underlie
all classical processes. But until recently, information itself
had largely been thought of in classical terms, with quantum
mechanics playing a supporting role of helping design the
equipment used to process it, setting limits on the rate at
which it could be sent through certain quantum channels.
Now we know that a fully quantum theory of information and
information processing offers, among other benefits, a brand
of cryptography whose security rests on fundamental physics,
and a reasonable hope of constructing quantum computers that
could dramatically speed up the solution of certain mathemat-
ical problems. These feats depend on distinctively quantum
properties such as uncertainty, interference, and entanglement.
At a more fundamental level, it has become clear that an
information theory based on quantum principles extends and
completes classical information theory, somewhat as complex
numbers extend and complete the reals. The new theory
includes quantum generalizations of classical notions such as
sources, channels, and codes, as well as two complementary,
quantifiable kinds of information—classical information and
quantum entanglement. Classical information can be copied
freely, but can only be transmitted forward in time, to a
Manuscript received January 4, 1998; revised June 2, 1998.
C. H. Bennett is with the IBM Research Division, T. J. Watson
Research Center, Yorktown Heights, NY 10598 USA (e-mail: ben-
netc@watson.ibm.com).
P. W. Shor is with the AT&T Labs–Research, Florham Park, NJ 07932
USA (e-mail: shor@research.att.com).
Publisher Item Identifier S 0018-9448(98)06316-0.
receiver in the sender’s forward light cone. Entanglement, by
contrast, cannot be copied, but can connect any two points in
space–time. Conventional data-processing operations destroy
entanglement, but quantum operations can create it, preserve
it, and use it for various purposes, notably speeding up certain
computations and assisting in the transmission of classical
data (“quantum superdense coding”) or intact quantum states
(“quantum teleportation”) from a sender to a receiver.
Any means, such as an optical fiber, for delivering quantum
systems more or less intact from one place to another, may
be viewed as a quantum channel. Unlike classical channels,
such channels have three distinct capacities: a capacity
for transmitting classical data, a typically lower capacity
for transmitting intact quantum states, and a third capacity
, often between and , for transmitting intact quantum
states with the assistance of a two-way classical side-channel
between sender and receiver.
How, then, does quantum information, and the operations
that can be performed on it, differ from conventional digital
data and data-processing operations? A classical bit (e.g., a
memory element or wire carrying a binary signal) is generally
a system containing many atoms, and is described by one
or more continuous parameters such as voltages. Within this
parameter space two well-separated regions are chosen by the
designer to represent and , and signals are periodically
restored toward these standard regions to prevent them from
drifting away due to environmental perturbations, manufac-
turing defects, etc. An -bit memory can exist in any of
logical states, labeled to . Besides storing binary
data, classical computers manipulate it, a sequence of Boolean
operations (for example, NOT and AND) acting on the bits one
or two at a time being sufficient to realize any deterministic
transformation.
A quantum bit, or “qubit,” by contrast is typically a micro-
scopic system, such as an atom or nuclear spin or polarized
photon. The Boolean states and are represented by a fixed
pair of reliably distinguishable states of the qubit (for exam-
ple, horizontal and vertical polarizations: , ).
A qubit can also exist in a continuum of intermediate states, or
“superpositions,” represented mathematically as unit vectors in
a two-dimensional complex vector space (the “Hilbert space”
) spanned by the basis vectors and . For photons,
these intermediate states correspond to other polarizations, for
example,
0018–9448/98$10.00 1998 IEEE
BENNETT AND SHOR: QUANTUM INFORMATION THEORY 2725
and
(right circular polarization). Unlike the intermediate states
of a classical bit (e.g., voltages between the standard
and values), these intermediate states cannot be reliably
distinguished, even in principle, from the basis states. With
regard to any measurement which distinguishes the states
and , the superposition behaves like
with probability and like with probability . More
generally, two quantum states are reliably distinguishable
if and only if their vector representations are orthogonal;
thus and are reliably distinguishable by one type of
measurement, and and by another, but no measurement
can reliably distinguish from . Multiplying a state vector
by an arbitrary phase factor does not change its physical
significance: thus although they are usually represented by
unit vectors, quantum states are more properly identified with
rays, a ray being the equivalence class of a vector under
multiplication by a complex constant.
It is convenient to use the so-called bracket or bra-ket no-
tation, in which the inner product between two -dimensional
vectors and is denoted
where the asterisk denotes complex conjugation. This may
be thought of as matrix product of the row vector
, by a column vector
.
.
.
where for any standard column (or “ket”) vector , its row
(or “bra”) representation is obtained by transposing and
taking the complex conjugate.
A pair of qubits (for example, two photons in different
locations) is capable of existing in four basis states, , ,
, and , as well as all possible superpositions of them.
States of a pair of qubits thus lie in a four-dimensional Hilbert
space. This space contains states like
which can be interpreted in terms of individual polarizations
for the two photons, as well as “entangled” states, i.e., states
like
in which neither photon by itself has a definite state, even
though the pair together does.
More generally, where a string of classical bits could exist
in any of Boolean states through ,
a string of qubits can exist in any state of the form
(1)
where the are complex numbers such that .
In other words, a quantum state of qubits is represented by a
complex unit vector (more properly a ray, since multiplying
by a phase factor does not change its physical meaning) in
the -dimensional Hilbert space , defined as
the tensor product of copies of the two-dimensional Hilbert
space representing quantum states of a single qubit. The
exponentially large dimensionality of this space distinguishes
quantum computers from classical analog computers, whose
state is described by a number of parameters that grows
only linearly with the size of the system. This is because
classical systems, whether digital or analog, can be completely
described by separately describing the state of each part. The
vast majority of quantum states, by contrast, are entangled,
and admit no such description. The ability to preserve and
manipulate entangled states is the distinguishing feature of
quantum computers, responsible both for their power and for
the difficulty of building them.
An isolated quantum system evolves in such a way as to
preserve superpositions and distinguishability; mathematically,
such evolution is a unitary (i.e., linear and inner-product-
conserving) transformation, the Hilbert-space analog of rigid
rotation in Euclidean space. Unitary evolution and superposi-
tion are the central principles of quantum mechanics.
Just as any classical computation can be expressed as
a sequence of one- and two-bit operations (e.g., NOT and
AND gates), any quantum computation can be expressed as
a sequence of one- and two-qubit quantum gates, i.e., unitary
operations acting on one or two qubits at a time [1] (cf. Fig. 1).
The most general one-qubit gate is described by a unitary
matrix1 mapping to and to .
One-qubit gates are easily implementable physically, e.g., by
quarter- and half-wave plates acting on polarized photons, or
by radio-frequency tipping pulses acting on nuclear spins in
a magnetic field.
The standard two-qubit gate is the controlled-NOT or XOR
gate, which flips its second (or “target”) input if its first
(“control”) input is and does nothing if the first input is
. In other words, it interchanges and while leaving
and unchanged. The XOR gate is represented by the
unitary matrix
Unlike one-qubit gates, two-qubit gates are difficult to realize
in the laboratory, because they require two separate quantum
information carriers to be brought into strong and controlled
interaction. The XOR gate, together with the set of one-bit
gates, form a universal set of primitives for quantum com-
putation; that is, any quantum computation can be performed
using just this set of gates without an undue increase in the
number of gates used [1].
1A complex matrix is called unitary and represents a unitary transformation,
iff its rows are orthogonal unit vectors. The inverse of any unitary matrix U
is given by its adjoint, or conjugate transpose Uy.
2726 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 6, OCTOBER 1998
(a)
(b) (c)
(d)
Fig. 1. (a) Any unitary operation U on quantum data can be synthesized from
the two-qubit XOR gate and one-qubit unitary operations (U). (b) The XOR
can clone Boolean-valued inputs, but if one attempts to clone intermediate
superposition, an entangled state (shading) results instead. (c) A classical wire
(thick line) conducts 0 and 1 faithfully but not superpositions or entangled
states. It may be defined as a quantum wire that interacts (via an XOR)
with an ancillary 0 qubit which is then discarded. (d) The most general
treatment, or superoperator, that can be applied to quantum data is a unitary
interaction with one or more 0 qubits, followed by discarding some of the
qubits. Superoperators are typically irreversible.
The XOR gate is a prototype interaction between two quan-
tum systems, and illustrates several key features of quantum
information, in particular the impossibility of “cloning” an
unknown quantum state, and the way interaction produces
entanglement. If the XOR is applied to Boolean data in which
the second qubit is and the first is or (cf. Fig. 1(b)) the
effect is to leave the first qubit unchanged while the second
becomes a copy of it: for or
. One might suppose that the XOR operation could also be
used to copy superpositions, such as , so
that would yield , but this is not so. The
unitarity of quantum evolution requires that a superposition
of input states evolve to a corresponding superposition of
outputs. Thus the result of applying to must
be , an entangled state in which neither
output qubit alone has definite state. If one of the entangled
output qubits is lost (e.g., discarded, or allowed to escape into
the environment), the other thenceforth behaves as if it had
acquired a random classical value (with probability ) or
(with probability ). Unless the lost output is brought back
into play, all record of the original superposition will have
been lost. This behavior is characteristic not only of the XOR
gate but of unitary interactions generally: their typical effect
is to map most unentangled initial states of the interacting
systems into entangled final states, which from the viewpoint
of either system alone causes an unpredictable disturbance.
Since quantum physics underlies classical, there should
be a way to represent classical data and operations within
the quantum formalism. If a classical bit is a qubit having
the value or , a classical wire should be a wire that
conducts and reliably, but not superpositions. This can
be implemented using the XOR gate as described above, with
an initial in the target position which is later discarded
(Fig. 1(c)). In other words, from the viewpoint of quantum
information, classical communication is an irreversible process
in which the signal interacts enroute with an environment
or eavesdropper in such a way that Boolean signals pass
through undisturbed, but other states become entangled with
the environment. If the environment is lost or discarded, the
surviving signal behaves as if it had irreversibly collapsed onto
one of the Boolean states. Having defined a classical wire, we
can then go on to define a classical gate as a quantum gate with
classical wires on its inputs and outputs. The classical wire of
Fig. 1(c) is an example of quantum information processing
in an open system. Any processing that can be applied to
quantum data, including unitary processing as a special case,
can be described (Fig. 1(d)) as a unitary interaction of the
quantum data with some ancillary qubits, initially in a standard
state, followed by discarding some of the qubits. Such a
general quantum data processing operation (also called a trace-
preserving completely positive map or superoperator [47],
[64]) can therefore have an output Hilbert space larger or
smaller than its input space (for unitary operations, the input
and output Hilbert spaces are, of course, equidimensional).
Paradoxically, entangling interactions with the environment
are thought to be the main reason why the macroscopic world
seems to behave classically and not quantum-mechanically
[74]. Macroscopically different states, e.g., the different charge
states representing and in a VLSI memory cell, interact so
strongly with their environment that information rapidly leaks
out as to which state the cell is in. Therefore, even if it were
possible to prepare the cell in superposition of and , the
superposition would rapidly evolve into a complex entangled
state involving the environment, which from the viewpoint
of the memory cell would appear as a statistical mixture,
rather than a superposition, of the two classical values. This
spontaneous decay of superpositions into mixtures is called
decoherence.
The quantum states we have been talking about so far,
identified with rays in Hilbert space, are called pure states.
They represent situations of minimal ignorance, in which, in
principle, there is nothing more to be known about the system.
Pure states are fundamental in that the quantum mechanics
of a closed system can be completely described as a unitary
evolution of pure states in an appropriately dimensioned
Hilbert space, without need of further notions. However, a
very useful notion, the mixed state has been introduced to
deal with situations of greater ignorance, in particular
• an ensemble in which the system in question may be in
any of several pure states , , with specified
probabilities ;
• a situation in which the system in question (call it ) is
part of larger system (call it ), which itself is in an
entangled pure state .
BENNETT AND SHOR: QUANTUM INFORMATION THEORY 2727
In open systems, a pure state may naturally evolve into a mixed
state (which can also be described as a pure state of a larger
system comprising the original system and its environment).
Mathematically, a mixed state is represented by a positive-
semidefinite, self-adjoint density matrix , having unit trace,
and being defined in the first situation by
(2)
and in the second situation by
(3)
Here denotes a partial trace over the indices of the
subsystem. A pure state is represented in the density-matrix
formalism by the rank-one projection matrix .
It is evident, in the first situation, that infinitely many
different ensembles can give rise to the same density matrix.
For example, the density matrix
may be viewed as an equal mixture of the pure states
and , or as an equal mixture of and
, or indeed as any other equal mixture of two
orthogonal single-qubit pure states. Similarly, in the second
situation, it is evident that infinitely many different pure states
of the system can give rise to the same density matrix
for the subsystem. One may therefore wonder in what
sense a density matrix is an adequate description of a statistical
ensemble of pure states, or of part of a larger system in a
pure entangled state. The answer is that the density matrix
captures all and only that information that can be obtained by
an observer allowed to examine infinitely many states sampled
from the ensemble , or given infinitely many opportunities to
examine part of an system prepared in entangled pure
state . This follows from the elementary fact that for any test
vector , if a specimen drawn from ensemble is
tested for whether it is in state , the probability of a positive
outcome is
Similarly, for any test state of the subsystem, the
probability that an entangled state of the system having
as its partial trace will give a positive outcome is simply
.
Perhaps more remarkable than the indistinguishability of
the different ensembles compatible with is the fact that any
of them can be produced at will starting from any entangled
state of the system having as its partial trace. More
specifically, if two parties (call them Alice and Bob) are in
possession of the and parts, respectively, of a system
in state , then for each compatible ensemble
that Bob might wish to create in Alice’s hands, there is a
measurement he can perform on the subsystem alone,
without Alice’s knowledge or cooperation, that will realize that
ensemble in the sense that the measurement yields outcome
with probability , and conditionally on that outcome having
occurred, Bob will know that Alice holds pure state . Bob’s
ability to decide Alice’s ensemble in this unilateral, post facto
fashion, has an important bearing on quantum cryptography
as will be discussed later.
Since a mixed state represents incomplete information, it is
natural to associate with any mixed state an entropy, given by
the von Neumann formula
(4)
If the pure states comprising an ensemble are orthogonal,
then they are mutually distinguishable, and can thus be treated
as classical states. In this situation, the von Neumann entropy
is equal to the Shannon entropy of the probabilities
When the pure states are nonorthogonal, and thus not
wholly distinguishable as physical states, the ensemble’s von
Neumann entropy is less than the Shannon entropy.
It is not hard to show that for any bipartite pure state , the
density matrices and of its parts have equal rank and
equal spectra of nonzero eigenvalues. Moreover, the original
state has an especially simple expression in terms of these
eigenvalues and eigenvectors
(5)
where and are eigenvectors of and , respec-
tively, corresponding to the positive eigenvalues . This
expression, known as the Schmidt decomposition, unfortu-
nately has no simple counterpart for tripartite and higher
systems.
The recent rapid progress in the theory of quantum informa-
tion processing can be d