Tuesday, 15 January 2013

Who Created the Internet Network?

Question: Who Created the Internet Network?
Development of the technologies that became the Internet began decades ago. The development of the World Wide Web (WWW) portion of the Internet happened much later, although many people consider this synonymous with creating the Internet itself.
Answer: No single person or organization created the modern Internet, including Al Gore, Lyndon Johnson, or any other individual. Instead, multiple people developed the key technologies that later grew to become the Internet:
  • Email - Long before the World Wide Web, email was the dominant communication method on the Internet. In 1971, Ray Tomlinson developed the first email system that worked over the early Internet.

  • Ethernet - The physical communication technology underlying the Internet, Ethernet was created by Robert Metcalfe and David Boggs in 1973.

  • TCP/IP - In May 1974, the Institute of Electrical and Electronics Engineers (IEEE) published a paper titled "A Protocol for Packet Network Interconnection." The paper's authors, Vinton Cerf and Robert Kahn, described a protocol called TCP that incorporated both connection-oriented and datagram services. This protocol later became known as TCP/IP.

Capacity of wireless channels


5 Capacity of wireless channels
In the previous two chapters, we studied specific techniques for communication
over wireless channels. In particular, Chapter 3 is centered on the
point-to-point communication scenario and there the focus is on diversity as
a way to mitigate the adverse effect of fading. Chapter 4 looks at cellular
wireless networks as a whole and introduces several multiple access and
interference management techniques.
The present chapter takes a more fundamental look at the problem of
communication over wireless fading channels. We ask: what is the optimal
performance achievable on a given channel and what are the techniques to
achieve such optimal performance? We focus on the point-to-point scenario in
this chapter and defer the multiuser case until Chapter 6. The material covered
in this chapter lays down the theoretical basis of the modern development in
wireless communication to be covered in the rest of the book.
The framework for studying performance limits in communication is information
theory. The basic measure of performance is the capacity of a channel:
the maximum rate of communication for which arbitrarily small error
probability can be achieved. Section 5.1 starts with the important example
of the AWGN (additive white Gaussian noise) channel and introduces
the notion of capacity through a heuristic argument. The AWGN channel
is then used as a building block to study the capacity of wireless
fading channels. Unlike the AWGN channel, there is no single definition
of capacity for fading channels that is applicable in all scenarios. Several
notions of capacity are developed, and together they form a systematic
study of performance limits of fading channels. The various capacity
measures allow us to see clearly the different types of resources available
in fading channels: power, diversity and degrees of freedom. We will see
how the diversity techniques studied in Chapter 3 fit into this big picture.
More importantly, the capacity results suggest an alternative technique,
opportunistic communication, which will be explored further in the later
chapters.
5.1 AWGN channel capacity
Information theory was invented by Claude Shannon in 1948 to characterize
the limits of reliable communication. Before Shannon, it was widely believed
that the only way to achieve reliable communication over a noisy channel,
i.e., to make the error probability as small as desired, was to reduce the data
rate (by, say, repetition coding). Shannon showed the surprising result that
this belief is incorrect: by more intelligent coding of the information, one
can in fact communicate at a strictly positive rate but at the same time with
as small an error probability as desired. However, there is a maximal rate,
called the capacity of the channel, for which this can be done: if one attempts
to communicate at rates above the channel capacity, then it is impossible to
drive the error probability to zero.
In this section, the focus is on the familiar (real) AWGN channel:
$$y[m] = x[m] + w[m] \tag{5.1}$$
where $x[m]$ and $y[m]$ are real input and output at time $m$ respectively and $w[m]$
is $\mathcal{N}(0, \sigma^2)$ noise, independent over time. The importance of this channel is
two-fold:
• It is a building block of all of the wireless channels studied in this book.
• It serves as a motivating example of what capacity means operationally and
gives some sense as to why arbitrarily reliable communication is possible
at a strictly positive data rate.
5.1.1 Repetition coding
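Using uncoded BPSK symbols $x[m] = \pm\sqrt{P}$, the error probability is
$Q\left(\sqrt{P/\sigma^2}\right)$. To reduce the error probability, one can repeat the same
symbol $N$ times to transmit the one bit of information. This is a
repetition code of block length $N$, with codewords $\mathbf{x}_A = \sqrt{P}\,[1, \ldots, 1]^t$
and $\mathbf{x}_B = \sqrt{P}\,[-1, \ldots, -1]^t$. The codewords meet a power constraint of
$P$ joules/symbol. If $\mathbf{x}_A$ is transmitted, the received vector is
$$\mathbf{y} = \mathbf{x}_A + \mathbf{w} \tag{5.2}$$
where $\mathbf{w} = (w[1], \ldots, w[N])^t$. Error occurs when $\mathbf{y}$ is closer to $\mathbf{x}_B$ than to
$\mathbf{x}_A$, and the error probability is given by
$$Q\left(\frac{\|\mathbf{x}_A - \mathbf{x}_B\|}{2\sigma}\right) = Q\left(\sqrt{\frac{NP}{\sigma^2}}\right) \tag{5.3}$$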
which decays exponentially with the block length N. The good news is that
communication can now be done with arbitrary reliability by choosing a large
enough N. The bad news is that the data rate is only 1/N bits per symbol
time and with increasing N the data rate goes to zero.
The reliably communicated data rate with repetition coding can be
marginally improved by using multilevel PAM (generalizing the two-level
BPSK scheme from earlier). By repeating an $M$-level PAM symbol, the levels
equally spaced between $\pm\sqrt{P}$, the rate is $(\log M)/N$ bits per symbol time¹ and
the error probability for the inner levels is equal to
$$Q\left(\frac{\sqrt{NP}}{(M-1)\sigma}\right) \tag{5.4}$$
As long as the number of levels $M$ grows at a rate less than $\sqrt{N}$, reliable
communication is guaranteed at large block lengths. But the data rate is
bounded by $(\log\sqrt{N})/N$ and this still goes to zero as the block length
increases. Is that the price one must pay to achieve reliable communication?
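The trade-off in (5.3) can be tabulated numerically. The following is a minimal sketch, assuming unit power and unit noise variance purely for illustration; the function names `q_function` and `repetition_bpsk` are our own and do not come from the text.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def repetition_bpsk(n, power, noise_var):
    """Return (rate, error probability) of an N-fold BPSK repetition code.

    The rate is 1/N bits per symbol time and the error probability is
    Q(sqrt(N * P / sigma^2)), as in (5.3).
    """
    rate = 1.0 / n
    p_err = q_function(math.sqrt(n * power / noise_var))
    return rate, p_err


if __name__ == "__main__":
    # Illustrative values only: P = 1, sigma^2 = 1 (0 dB SNR per symbol).
    for n in (1, 2, 4, 8, 16, 32):
        rate, p_err = repetition_bpsk(n, power=1.0, noise_var=1.0)
        print(f"N={n:3d}  rate={rate:.4f} bits/symbol  P_e={p_err:.3e}")
```

The printed table shows the error probability falling quickly with $N$ while the rate shrinks as $1/N$, which is exactly the price questioned above.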
5.1.2 Packing spheres
Geometrically, repetition coding puts all the codewords (the M levels) in just
one dimension (Figure 5.1 provides an illustration; here, all the codewords
are on the same line). On the other hand, the signal space has a large number
of dimensions N. We have already seen in Chapter 3 that this is a very
inefficient way of packing codewords. To communicate more efficiently, the
codewords should be spread in all the N dimensions.
Figure 5.1 Repetition coding packs points inefficiently in the high-dimensional signal space.
¹ In this chapter, all logarithms are taken to be to the base 2 unless specified otherwise.
We can get an estimate on the maximum number of codewords that can
be packed in for the given power constraint $P$, by appealing to the classic
sphere-packing picture (Figure 5.2). By the law of large numbers, the
$N$-dimensional received vector $\mathbf{y} = \mathbf{x} + \mathbf{w}$ will, with high probability, lie within
a $\mathbf{y}$-sphere of radius $\sqrt{N(P + \sigma^2)}$; so without loss of generality we need only
focus on what happens inside this $\mathbf{y}$-sphere. On the other hand
$$\frac{1}{N} \sum_{m=1}^{N} w^2[m] \to \sigma^2 \tag{5.5}$$
as $N \to \infty$, by the law of large numbers again. So, for $N$ large, the received
vector $\mathbf{y}$ lies, with high probability, near the surface of a noise sphere of radius
$\sqrt{N}\sigma$ around the transmitted codeword (this is sometimes called the sphere
hardening effect). Reliable communication occurs as long as the noise spheres
around the codewords do not overlap. The maximum number of codewords
that can be packed with non-overlapping noise spheres is the ratio of the
volume of the $\mathbf{y}$-sphere to the volume of a noise sphere:²
$$\frac{\left(\sqrt{N(P + \sigma^2)}\right)^N}{\left(\sqrt{N\sigma^2}\right)^N} \tag{5.6}$$
Figure 5.2 The number of noise spheres that can be packed into the y-sphere yields the maximum number of codewords that can be reliably distinguished. (Labeled radii: $\sqrt{N(P + \sigma^2)}$ for the y-sphere, $\sqrt{NP}$ for the codewords, $\sqrt{N\sigma^2}$ for a noise sphere.)
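This implies that the maximum number of bits per symbol that can be reliably
communicated is
$$\frac{1}{N} \log\left(\frac{\left(\sqrt{N(P + \sigma^2)}\right)^N}{\left(\sqrt{N\sigma^2}\right)^N}\right) = \frac{1}{2} \log\left(1 + \frac{P}{\sigma^2}\right) \tag{5.7}$$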
This is indeed the capacity of the AWGN channel. (The argument might sound
very heuristic. Appendix B.5 takes a more careful look.)
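The identity in (5.7) can be checked numerically. The sketch below works in the log domain, because the volume ratio itself overflows for large $N$. The function names and the test values ($P = 3$, $\sigma^2 = 1$) are illustrative assumptions, not taken from the text.

```python
import math


def sphere_packing_bits_per_symbol(n, power, noise_var):
    """(1/N) * log2 of the volume ratio in (5.6), computed in the log domain.

    The volume of an N-dimensional sphere is proportional to r^N, so the
    log of the ratio is N times the difference of the log radii.
    """
    log2_y_radius = 0.5 * math.log2(n * (power + noise_var))
    log2_noise_radius = 0.5 * math.log2(n * noise_var)
    log2_ratio = n * (log2_y_radius - log2_noise_radius)
    return log2_ratio / n


def awgn_capacity_real(power, noise_var):
    """Right-hand side of (5.7): 0.5 * log2(1 + P / sigma^2) bits per symbol."""
    return 0.5 * math.log2(1.0 + power / noise_var)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        lhs = sphere_packing_bits_per_symbol(n, power=3.0, noise_var=1.0)
        rhs = awgn_capacity_real(power=3.0, noise_var=1.0)
        print(f"N={n:5d}  sphere-packing={lhs:.6f}  closed form={rhs:.6f}")
```

The block length $N$ cancels, so the bits-per-symbol figure depends only on the ratio $P/\sigma^2$.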
The sphere-packing argument only yields the maximum number of codewords
that can be packed while ensuring reliable communication. How to construct
codes to achieve the promised rate is another story. In fact, in Shannon’s
argument, he never explicitly constructed codes. What he showed is that if
one picks the codewords randomly and independently, with the components
of each codeword i.i.d. $\mathcal{N}(0, P)$, then with very high probability the randomly
chosen code will do the job at any rate $R < C$. This is the so-called i.i.d.
Gaussian code. A sketch of this random coding argument can be found in
Appendix B.5.
² The volume of an $N$-dimensional sphere of radius $r$ is proportional to $r^N$ and an exact
expression is evaluated in Exercise B.10.
From an engineering standpoint, the essential problem is to identify easily
encodable and decodable codes that have performance close to the capacity.
The study of this problem is a separate field in itself and Discussion 5.1
briefly chronicles the success story: codes that operate very close to capacity
have been found and can be implemented in a relatively straightforward way
using current technology. In the rest of the book, these codes are referred to
as “capacity-achieving AWGN codes”.
Discussion 5.1 Capacity-achieving AWGN channel codes
Consider a code for communication over the real AWGN channel in (5.1).
The ML decoder chooses the nearest codeword to the received vector as
the most likely transmitted codeword. The closer two codewords are to
each other, the higher the probability of confusing one for the other: this
yields a geometric design criterion for the set of codewords, i.e., place
the codewords as far apart from each other as possible. While such a set
of maximally spaced codewords is likely to perform very well, this in
itself does not constitute an engineering solution to the problem of code
construction: what is required is an arrangement that is “easy” to describe
and “simple” to decode. In other words, the computational complexity of
encoding and decoding should be practical.
Many of the early solutions centered around the theme of ensuring
efficient ML decoding. The search for codes that have this property leads to
a rich class of codes with nice algebraic properties, but their performance
is quite far from capacity. A significant breakthrough occurred when the
stringent ML decoding was relaxed to an approximate one. An iterative
decoding algorithm with near ML performance has led to turbo and low
density parity check codes.
A large ensemble of linear parity check codes can be considered in conjunction
with the iterative decoding algorithm. Codes with good performance
can be found offline and they have been verified to perform very close to
capacity. To get a feel for their performance, we consider some sample performance
numbers. The capacity of the AWGN channel at 0 dB SNR is 0.5 bits
per symbol. The error probability of a carefully designed LDPC code in these
operating conditions (rate 0.5 bits per symbol, and the signal-to-noise ratio is
equal to 0.1 dB) with a block length of 8000 bits is approximately $10^{-4}$. With
a larger block length, much smaller error probabilities have been achieved.
These modern developments are well surveyed in [100].
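As a quick check of the 0 dB figure quoted in this discussion, substituting $P/\sigma^2 = 1$ into the real AWGN capacity expression (5.7) gives:

```latex
\[
C = \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right)\bigg|_{P/\sigma^2 = 1}
  = \frac{1}{2}\log_2 2
  = 0.5 \ \text{bits per symbol}.
\]
```

Read the other way, a rate of 0.5 bits per symbol needs at least 0 dB SNR, so the quoted LDPC operating point of 0.1 dB lies about 0.1 dB above that limit.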
The capacity of the AWGN channel is probably the most well-known
result of information theory, but it is in fact only a special case of Shannon’s
general theory applied to a specific channel. This general theory is outlined
in Appendix B. All the capacity results used in the book can be derived from
this general framework. To focus more on the implications of the results in
the main text, the derivation of these results is relegated to Appendix B. In
the main text, the capacities of the channels looked at are justified by either
transforming the channels back to the AWGN channel, or by using the type
of heuristic sphere-packing arguments we have just seen.
Figure 5.3 summarizes the three communication schemes discussed.
Figure 5.3 The three communication schemes when viewed in N-dimensional space:
(a) uncoded signaling: error probability is poor since large noise in any dimension is
enough to confuse the receiver; (b) repetition code: codewords are now separated in all
dimensions, but there are only a few codewords packed in a single dimension;
(c) capacity-achieving code: codewords are separated in all dimensions and there are many
of them spread out in the space.
Summary 5.1 Reliable rate of communication and capacity
• Reliable communication at rate R bits/symbol means that one can design
codes at that rate with arbitrarily small error probability.
• To get reliable communication, one must code over a long block; this
is to exploit the law of large numbers to average out the randomness of
the noise.
• Repetition coding over a long block can achieve reliable communication,
but the corresponding data rate goes to zero with increasing block length.
• Repetition coding does not pack the codewords in the available degrees
of freedom in an efficient manner. One can pack a number of codewords
that is exponential in the block length and still communicate reliably.
This means the data rate can be strictly positive even as reliability is
increased arbitrarily by increasing the block length.
• The maximum data rate at which reliable communication is possible is
called the capacity C of the channel.
• The capacity of the (real) AWGN channel with power constraint $P$ and
noise variance $\sigma^2$ is:
$$C_{\text{awgn}} = \frac{1}{2} \log\left(1 + \frac{P}{\sigma^2}\right) \tag{5.8}$$
and the engineering problem of constructing codes close to this performance
has been successfully addressed.
5.2 Resources of the AWGN channel
The AWGN capacity formula (5.8) can be used to identify the roles of the
key resources of power and bandwidth.
5.2.1 Continuous-time AWGN channel
Consider a continuous-time AWGN channel with bandwidth $W$ Hz, power
constraint $\bar{P}$ watts, and additive white Gaussian noise with power spectral
density $N_0/2$. Following the passband–baseband conversion and sampling at
intervals of $1/W$ seconds (as described in Chapter 2), this can be represented by a discrete-time
complex baseband channel:
$$y[m] = x[m] + w[m] \tag{5.9}$$
where $w[m]$ is $\mathcal{CN}(0, N_0)$ and is i.i.d. over time. Note that since the noise is
independent in the I and Q components, each use of the complex channel can
be thought of as two independent uses of a real AWGN channel. The noise
variance and the power constraint per real symbol are $N_0/2$ and $\bar{P}/(2W)$
respectively. Hence, the capacity of the channel is
$$\frac{1}{2} \log\left(1 + \frac{\bar{P}}{N_0 W}\right) \ \text{bits per real dimension} \tag{5.10}$$
or
$$\log\left(1 + \frac{\bar{P}}{N_0 W}\right) \ \text{bits per complex dimension} \tag{5.11}$$
This is the capacity in bits per complex dimension or degree of freedom.
Since there are $W$ complex samples per second, the capacity of the continuous-time
AWGN channel is
$$C_{\text{awgn}}(\bar{P}, W) = W \log\left(1 + \frac{\bar{P}}{N_0 W}\right) \ \text{bits/s} \tag{5.12}$$
Note that $\mathrm{SNR} := \bar{P}/(N_0 W)$ is the SNR per (complex) degree of freedom.
Hence, AWGN capacity can be rewritten as
$$C_{\text{awgn}} = \log(1 + \mathrm{SNR}) \ \text{bits/s/Hz} \tag{5.13}$$
This formula measures the maximum achievable spectral efficiency through
the AWGN channel as a function of the SNR.
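Formulas (5.12) and (5.13) translate directly into code. The following is a small sketch; the function names and the example numbers (1 W received power, 1 MHz bandwidth, $N_0 = 10^{-6}$ W/Hz) are illustrative assumptions of ours.

```python
import math


def awgn_capacity_bps(power_w, bandwidth_hz, n0):
    """Capacity (5.12) of the continuous-time AWGN channel, in bits/s."""
    snr = power_w / (n0 * bandwidth_hz)  # SNR per complex degree of freedom
    return bandwidth_hz * math.log2(1.0 + snr)


def spectral_efficiency(snr_db):
    """Spectral efficiency (5.13) in bits/s/Hz, with the SNR given in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    return math.log2(1.0 + snr)


if __name__ == "__main__":
    # Hypothetical operating point: SNR = 1 (0 dB) over 1 MHz.
    print(awgn_capacity_bps(power_w=1.0, bandwidth_hz=1e6, n0=1e-6))
    for snr_db in (-10, 0, 10, 20, 30):
        print(f"SNR={snr_db:4d} dB  ->  {spectral_efficiency(snr_db):.3f} bits/s/Hz")
```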
5.2.2 Power and bandwidth
Let us ponder the significance of the capacity formula (5.12) to a communication
engineer. One way of using this formula is as a benchmark for evaluating
the performance of channel codes. For a system engineer, however, the main
significance of this formula is that it provides a high-level way of thinking
about how the performance of a communication system depends on the basic
resources available in the channel, without going into the details of specific
modulation and coding schemes used. It will also help identify the bottleneck
that limits performance.
The basic resources of the AWGN channel are the received power $\bar{P}$ and
the bandwidth $W$. Let us first see how the capacity depends on the received
power. To this end, a key observation is that the function
$$f(\mathrm{SNR}) := \log(1 + \mathrm{SNR}) \tag{5.14}$$
is concave, i.e., $f''(x) \le 0$ for all $x \ge 0$ (Figure 5.4). This means that increasing
the power $\bar{P}$ suffers from a law of diminishing marginal returns: the higher
the SNR, the smaller the effect on capacity. In particular, let us look at the
low and the high SNR regimes. Observe that
$$\log_2(1 + x) \approx x \log_2 e \quad \text{when } x \approx 0 \tag{5.15}$$
$$\log_2(1 + x) \approx \log_2 x \quad \text{when } x \gg 1 \tag{5.16}$$
Thus, when the SNR is low, the capacity increases linearly with the received
power $\bar{P}$: every 3 dB increase in (or, doubling) the power doubles the capacity.
When the SNR is high, the capacity increases logarithmically with $\bar{P}$: every
3 dB increase in the power yields only one additional bit per dimension.
This phenomenon should not come as a surprise. We have already seen in
Chapter 3 that packing many bits per dimension is very power-inefficient.
The capacity result says that this phenomenon not only holds for specific
schemes but is in fact fundamental to all communication schemes. In fact,
for a fixed error probability, the data rate of uncoded QAM also increases
logarithmically with the SNR (Exercise 5.7).
Figure 5.4 Spectral efficiency $\log(1 + \mathrm{SNR})$ of the AWGN channel. (Plot of $\log(1 + \mathrm{SNR})$, from 0 to 7, against SNR, from 0 to 100.)
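The two regimes in (5.15) and (5.16) can be seen numerically. The sketch below compares the exact spectral efficiency with both approximations and measures the effect of doubling the power; the helper names and the sample SNR values are our own choices for illustration.

```python
import math


def spectral_efficiency(snr):
    """Exact log2(1 + SNR), with SNR on a linear scale."""
    return math.log2(1.0 + snr)


def low_snr_approx(snr):
    """Approximation (5.15): SNR * log2(e), valid for SNR near 0."""
    return snr * math.log2(math.e)


def high_snr_approx(snr):
    """Approximation (5.16): log2(SNR), valid for SNR much larger than 1."""
    return math.log2(snr)


if __name__ == "__main__":
    for snr in (0.001, 0.01, 0.1):
        print(f"SNR={snr:<8}  exact={spectral_efficiency(snr):.6f}  "
              f"low-SNR approx={low_snr_approx(snr):.6f}")
    for snr in (100.0, 1000.0, 10000.0):
        print(f"SNR={snr:<8}  exact={spectral_efficiency(snr):.6f}  "
              f"high-SNR approx={high_snr_approx(snr):.6f}")
    # Effect of a 3 dB (2x) power increase in each regime.
    print("low SNR ratio :", spectral_efficiency(0.02) / spectral_efficiency(0.01))
    print("high SNR delta:", spectral_efficiency(2000.0) - spectral_efficiency(1000.0))
```

At low SNR the doubling ratio is close to 2, while at high SNR the same doubling adds close to one bit.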
The dependency of the capacity on the bandwidth W is somewhat more
complicated. From the formula, the capacity depends on the bandwidth in two
ways. First, it increases the degrees of freedom available for communication.
This can be seen in the linear dependency on $W$ for a fixed $\mathrm{SNR} = \bar{P}/(N_0 W)$.
On the other hand, for a given received power $\bar{P}$, the SNR per dimension
decreases with the bandwidth as the energy is spread more thinly across the
degrees of freedom. In fact, it can be directly calculated that the capacity is
an increasing, concave function of the bandwidth W (Figure 5.5). When the
bandwidth is small, the SNR per degree of freedom is high, and then the
capacity is insensitive to small changes in SNR. Increasing W yields a rapid
increase in capacity because the increase in degrees of freedom more than
compensates for the decrease in SNR. The system is in the bandwidth-limited
regime. When the bandwidth is large such that the SNR per degree of freedom
is small,
$$W \log\left(1 + \frac{\bar{P}}{N_0 W}\right) \approx W \left(\frac{\bar{P}}{N_0 W}\right) \log_2 e = \frac{\bar{P}}{N_0} \log_2 e \tag{5.17}$$
In this regime, the capacity is proportional to the total received power across
the entire band. It is insensitive to the bandwidth, and increasing the bandwidth
has a small impact on capacity. On the other hand, the capacity is now linear
in the received power and increasing power has a significant effect. This is
the power-limited regime.
Figure 5.5 Capacity as a function of the bandwidth $W$. Here $\bar{P}/N_0 = 10^6$. (Plot of capacity $C(W)$ in Mbps against bandwidth $W$ in MHz, from 0 to 30; the curve passes from the bandwidth-limited region into the power-limited region and approaches the limit $(\bar{P}/N_0) \log_2 e$ as $W \to \infty$.)
As $W$ increases, the capacity increases monotonically (why must it?) and
reaches the asymptotic limit
$$C_\infty = \frac{\bar{P}}{N_0} \log_2 e \ \text{bits/s} \tag{5.18}$$
This is the infinite bandwidth limit, i.e., the capacity of the AWGN channel
with only a power constraint but no limitation on bandwidth. It is seen that
even if there is no bandwidth constraint, the capacity is finite.
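The saturation described here can be reproduced with the value $\bar{P}/N_0 = 10^6$ used in Figure 5.5. The following is a minimal sketch; the function names and the particular bandwidths sampled are illustrative assumptions.

```python
import math

P_OVER_N0 = 1e6  # received power over noise density, as in Figure 5.5


def capacity_bps(bandwidth_hz, p_over_n0=P_OVER_N0):
    """Capacity (5.12) in bits/s, written in terms of the ratio P/N0.

    log1p keeps the result accurate when P/(N0 W) is very small.
    """
    return bandwidth_hz * math.log1p(p_over_n0 / bandwidth_hz) / math.log(2.0)


def infinite_bandwidth_limit(p_over_n0=P_OVER_N0):
    """Asymptotic limit (5.18): (P/N0) * log2(e) bits/s."""
    return p_over_n0 * math.log2(math.e)


if __name__ == "__main__":
    limit = infinite_bandwidth_limit()
    for w_mhz in (0.1, 1, 5, 10, 30, 1000):
        c = capacity_bps(w_mhz * 1e6)
        print(f"W={w_mhz:7.1f} MHz  C={c / 1e6:.4f} Mbps  ({100 * c / limit:.1f}% of limit)")
```

The capacity keeps growing with $W$ but never exceeds the limit.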
In some communication applications, the main objective is to minimize
the required energy per bit $\mathcal{E}_b$ rather than to maximize the spectral efficiency.
At a given power level $\bar{P}$, the minimum required energy per bit
$\mathcal{E}_b$ is $\bar{P}/C_{\text{awgn}}(\bar{P}, W)$. To minimize this, we should be operating in the most
power-efficient regime, i.e., $\bar{P} \to 0$. Hence, the minimum $\mathcal{E}_b/N_0$ is given by
$$\left(\frac{\mathcal{E}_b}{N_0}\right)_{\min} = \lim_{\bar{P} \to 0} \frac{\bar{P}}{C_{\text{awgn}}(\bar{P}, W) N_0} = \frac{1}{\log_2 e} = -1.59\ \text{dB} \tag{5.19}$$
To achieve this, the SNR per degree of freedom goes to zero. The price
to pay for the energy efficiency is delay: if the bandwidth W is fixed, the
communication rate (in bits/s) goes to zero. This essentially mimics the
infinite bandwidth regime by spreading the total energy over a long time
interval, instead of spreading the total power over a large bandwidth.
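The limit in (5.19) can be checked numerically. Substituting (5.12) gives $\mathcal{E}_b/N_0 = \mathrm{SNR}/\log_2(1 + \mathrm{SNR})$ at capacity, and the sketch below evaluates this as the SNR per degree of freedom shrinks. The function names are our own and the sample SNR values are arbitrary.

```python
import math


def eb_over_n0_db(snr):
    """Eb/N0 in dB required at capacity for a given linear SNR per degree of freedom.

    Eb/N0 = P / (C * N0) = SNR / log2(1 + SNR); log1p is used for accuracy
    at very small SNR.
    """
    bits_per_dof = math.log1p(snr) / math.log(2.0)
    return 10.0 * math.log10(snr / bits_per_dof)


def min_eb_over_n0_db():
    """Limiting value 1 / log2(e) = ln 2, expressed in dB."""
    return 10.0 * math.log10(1.0 / math.log2(math.e))


if __name__ == "__main__":
    print(f"minimum Eb/N0 = {min_eb_over_n0_db():.4f} dB")
    for snr in (10.0, 1.0, 0.1, 0.01, 1e-4):
        print(f"SNR={snr:<8}  Eb/N0={eb_over_n0_db(snr):.4f} dB")
```

The required $\mathcal{E}_b/N_0$ decreases toward the limiting value as the SNR goes to zero, which is the power-efficient regime described above.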
It was already mentioned that the success story of designing capacity-achieving
AWGN codes is a relatively recent one. In the infinite bandwidth
regime, however, it has long been known that orthogonal codes³ achieve the
capacity (or, equivalently, achieve the minimum $\mathcal{E}_b/N_0$ of −1.59 dB). This is
explored in Exercises 5.8 and 5.9.
Example 5.2 Bandwidth reuse in cellular systems
The capacity formula for the AWGN channel can be used to conduct
a simple comparison of the two orthogonal cellular systems discussed
in Chapter 4: the narrowband system with frequency reuse versus the
wideband system with universal reuse. In both systems, users within a cell
are orthogonal and do not interfere with each other. The main parameter
of interest is the reuse ratio $\rho$ ($\rho \le 1$). If $W$ denotes the bandwidth per user
within a cell, then each user transmission occurs over a bandwidth of $\rho W$.
The parameter $\rho = 1$ yields the full reuse of the wideband OFDM system
and $\rho < 1$ yields the narrowband system.
³ One example of orthogonal coding is the Hadamard sequences used in the IS-95 system
(Section 4.3.1). Pulse position modulation (PPM), where the position of the on–off pulse
(with large duty cycle) conveys the information, is another example.
Here we consider the uplink of this cellular system; the study of the
downlink in orthogonal systems is similar. A user at a distance $r$ is heard
at the base-station with an attenuation of a factor $r^{-\alpha}$ in power; in free
space the decay rate $\alpha$ is equal to 2 and the decay rate is 4 in the model
of a single reflected path off the ground plane, cf. Section 2.1.5.
The uplink user transmissions in a neighboring cell that reuses the same
frequency band are averaged and this constitutes the interference (this
averaging is an important feature of the wideband OFDM system; in the
narrowband system in Chapter 4, there is no interference averaging but that
effect is ignored here). Let us denote by $f_\rho$ the amount of total out-of-cell
interference at a base-station as a fraction of the received signal power of
a user at the edge of the cell. Since the amount of interference depends
on the number of neighboring cells that reuse the same frequency band,
the fraction $f_\rho$ depends on the reuse ratio and also on the topology of the
cellular system.
For example, in a one-dimensional linear array of base-stations
(Figure 5.6), a reuse ratio of $\rho$ corresponds to one in every $1/\rho$ cells using
the same frequency band. Thus the fraction $f_\rho$ decays roughly as $\rho^\alpha$. On
the other hand, in a two-dimensional hexagonal array of base-stations, a
reuse ratio of $\rho$ corresponds to the nearest reusing base-station roughly a
distance of $\sqrt{1/\rho}$ away: this means that the fraction $f_\rho$ decays roughly as
$\rho^{\alpha/2}$. The exact fraction $f_\rho$ takes into account geographical features of the
cellular system (such as shadowing) and the geographic averaging of the
interfering uplink transmissions; it is usually arrived at using numerical
simulations (Table 6.2 in [140] has one such enumeration for a full reuse
system). In a simple model where the interference is considered to come
from the center of the cell reusing the same frequency band, $f_\rho$ can be
taken to be $2(\rho/2)^\alpha$ for the linear cellular system and $6(\rho/4)^{\alpha/2}$ for the
hexagonal planar cellular system (see Exercises 5.2 and 5.3).
The received SINR at the base-station for a cell edge user is
$$\mathrm{SINR} = \frac{\mathrm{SNR}}{\rho + f_\rho \mathrm{SNR}} \tag{5.20}$$
where the SNR for the cell edge user is
$$\mathrm{SNR} := \frac{P}{N_0 W d^\alpha} \tag{5.21}$$
with $d$ the distance of the user to the base-station and $P$ the uplink
transmit power. The operating value of the parameter SNR is decided by the
coverage of a cell: a user at the edge of a cell has to have a minimum SNR
to be able to communicate reliably (at least at a fixed minimum rate) with
the nearest base-station. Each base-station comes with a capital installation
cost and recurring operation costs and to minimize the number of base-stations,
the cell size $d$ is usually made as large as possible; depending on
the uplink transmit power capability, coverage decides the cell size $d$.
Figure 5.6 A linear cellular system with base-stations along a line (representing a highway).
Using the AWGN capacity formula (cf. (5.14)), the rate of reliable
communication for a user at the edge of the cell, as a function of the reuse
ratio $\rho$, is
$$R_\rho = \rho W \log_2(1 + \mathrm{SINR}) = \rho W \log_2\left(1 + \frac{\mathrm{SNR}}{\rho + f_\rho \mathrm{SNR}}\right) \ \text{bits/s} \tag{5.22}$$
The rate depends on the reuse ratio through the available degrees of
freedom and the amount of out-of-cell interference. A large $\rho$ increases
the available bandwidth per cell but also increases the amount of out-of-cell
interference. The formula (5.22) allows us to study the optimal reuse
factor. At low SNR, the system is not degree of freedom limited and the
interference is small relative to the noise; thus the rate is insensitive to the
reuse factor and this can be verified directly from (5.22). On the other hand,
at large SNR the interference grows as well and the SINR peaks at $1/f_\rho$.
(A general rule of thumb in practice is to set SNR such that the interference
is of the same order as the background noise; this will guarantee that the
operating SINR is close to the largest value.) The largest rate is
$$\rho W \log_2\left(1 + \frac{1}{f_\rho}\right) \tag{5.23}$$
This rate goes to zero for small values of $\rho$; thus sparse reuse is not
favored. It can be verified that universal reuse yields the largest rate in
(5.23) for the hexagonal cellular system (Exercise 5.3). For the linear
cellular model, the corresponding optimal reuse is $\rho = 1/2$, i.e., reusing
the frequency every other cell (Exercise 5.5). The reduction in interference
due to less reuse is more dramatic in the linear cellular system when
compared to the hexagonal cellular system. This difference is highlighted
in the optimal reuse ratios for the two systems at high SNR: universal
reuse is preferred for the hexagonal cellular system while a reuse ratio of
1/2 is preferred for the linear cellular system.
This comparison also holds for a range of SNR between the small and
the large values: Figures 5.7 and 5.8 plot the rates in (5.22) for different
reuse ratios for the linear and hexagonal cellular systems respectively.
Here the power decay rate $\alpha$ is fixed to 3 and the rates are plotted as a
function of the SNR for a user at the edge of the cell, cf. (5.21). In the
hexagonal cellular system, universal reuse is clearly preferred at all ranges
of SNR. On the other hand, in a linear cellular system, universal reuse
and a reuse of 1/2 have comparable performance and if the operating
SNR value is larger than a threshold (10 dB in Figure 5.7), then it pays to
reuse, i.e., $R_{1/2} > R_1$. Otherwise, universal reuse is optimal. If this SNR
threshold is within the rule of thumb setting mentioned earlier (i.e., the
gain in rate is worth operating at this SNR), then reuse is preferred. This
preference has to be traded off against the size of the cell dictated by (5.21)
due to a transmit power constraint on the mobile device.
Figure 5.7 Rates in bits/s/Hz as a function of the SNR for a user at the edge of the cell for
universal reuse and reuse ratios of 1/2 and 1/3 for the linear cellular system. The power decay
rate $\alpha$ is set to 3. (Axes: rate in bits/s/Hz, from 0 to 3, against cell edge SNR in dB, from −10 to 30; one curve per frequency reuse factor 1, 1/2 and 1/3.)
Figure 5.8 Rates in bits/s/Hz as a function of the SNR for a user at the edge of the cell for
universal reuse, reuse ratios 1/2 and 1/7 for the hexagonal cellular system. The power decay rate
$\alpha$ is set to 3. (Axes: rate in bits/s/Hz, from 0 to 1.4, against cell edge SNR in dB, from −10 to 30; one curve per frequency reuse factor 1, 1/2 and 1/7.)
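The comparison in Example 5.2 can be explored with a short script. The sketch below evaluates (5.22) per unit bandwidth using the simple center-of-cell interference model quoted in the example, $f_\rho = 2(\rho/2)^\alpha$ for the linear system and $6(\rho/4)^{\alpha/2}$ for the hexagonal system. It is not a reproduction of the simulations behind Figures 5.7 and 5.8, so crossover points may differ from the plotted ones. The function names and the `topology` strings are our own.

```python
import math


def interference_fraction(rho, alpha, topology):
    """Simple-model f_rho: out-of-cell interference relative to a cell-edge user's power."""
    if topology == "linear":
        return 2.0 * (rho / 2.0) ** alpha
    if topology == "hexagonal":
        return 6.0 * (rho / 4.0) ** (alpha / 2.0)
    raise ValueError("topology must be 'linear' or 'hexagonal'")


def rate_per_hz(rho, snr_db, alpha, topology):
    """R_rho / W from (5.22), in bits/s/Hz, for a cell-edge SNR given in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    f_rho = interference_fraction(rho, alpha, topology)
    sinr = snr / (rho + f_rho * snr)  # (5.20)
    return rho * math.log2(1.0 + sinr)


def high_snr_rate_per_hz(rho, alpha, topology):
    """Largest rate (5.23) per unit bandwidth: rho * log2(1 + 1/f_rho)."""
    return rho * math.log2(1.0 + 1.0 / interference_fraction(rho, alpha, topology))


if __name__ == "__main__":
    alpha = 3.0
    for topology, ratios in (("linear", (1.0, 1 / 2, 1 / 3)),
                             ("hexagonal", (1.0, 1 / 2, 1 / 7))):
        print(topology)
        for snr_db in (-10, 0, 10, 20, 30):
            rates = "  ".join(f"rho={r:.3f}: {rate_per_hz(r, snr_db, alpha, topology):.3f}"
                              for r in ratios)
            print(f"  SNR={snr_db:4d} dB  {rates}")
```

Under this model, a reuse ratio of 1/2 overtakes universal reuse at high SNR in the linear system, while universal reuse stays ahead in the hexagonal system.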
5.3 Linear time-invariant Gaussian channels
We give three examples of channels which are closely related to the simple
AWGN channel and whose capacities can be easily computed. Moreover,
optimal codes for these channels can be constructed directly from an optimal
code for the basic AWGN channel. These channels are time-invariant, known
to both the transmitter and the receiver, and they form a bridge to the fading
channels which will be studied in the next section.
5.3.1 Single input multiple output (SIMO) channel
Consider a SIMO channel with one transmit antenna and $L$ receive antennas:
$$y_\ell[m] = h_\ell x[m] + w_\ell[m], \qquad \ell = 1, \ldots, L \tag{5.24}$$
where $h_\ell$ is the fixed complex channel gain from the transmit antenna to
the $\ell$th receive antenna, and $w_\ell[m] \sim \mathcal{CN}(0, N_0)$ is additive Gaussian noise
independent across antennas. A sufficient statistic for detecting $x[m]$ from
$\mathbf{y}[m] := [y_1[m], \ldots, y_L[m]]^t$ is
$$\tilde{y}[m] := \mathbf{h}^* \mathbf{y}[m] = \|\mathbf{h}\|^2 x[m] + \mathbf{h}^* \mathbf{w}[m] \tag{5.25}$$
where $\mathbf{h} := [h_1, \ldots, h_L]^t$ and $\mathbf{w}[m] := [w_1[m], \ldots, w_L[m]]^t$. This is an
AWGN channel with received SNR $P\|\mathbf{h}\|^2/N_0$ if $P$ is the average energy per
transmit symbol. The capacity of this channel is therefore
$$C = \log\left(1 + \frac{P\|\mathbf{h}\|^2}{N_0}\right) \ \text{bits/s/Hz} \tag{5.26}$$
Multiple receive antennas increase the effective SNR and provide a power
gain. For example, for $L = 2$ and $|h_1| = |h_2| = 1$, dual receive antennas provide
a 3 dB power gain over a single antenna system. The linear combining (5.25)
maximizes the output SNR and is sometimes called receive beamforming.
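The power gain from receive beamforming can be illustrated with a few lines of code. The sketch below computes the output SNR of a general linear combiner $\mathbf{c}^*\mathbf{y}$ and compares the matched choice $\mathbf{c} = \mathbf{h}$ of (5.25) with using a single antenna. The function names and the example channel are illustrative assumptions.

```python
import math


def receive_beamforming_snr(h, power, n0):
    """SNR after projecting onto h, as in (5.25): P * ||h||^2 / N0."""
    norm_sq = sum(abs(g) ** 2 for g in h)
    return power * norm_sq / n0


def combiner_output_snr(h, c, power, n0):
    """Output SNR of a linear combiner c^* y.

    Signal power is P * |c^* h|^2; noise power is N0 * ||c||^2 because the
    noise is independent across antennas.
    """
    signal_gain = abs(sum(ci.conjugate() * hi for ci, hi in zip(c, h))) ** 2
    noise_gain = sum(abs(ci) ** 2 for ci in c)
    return power * signal_gain / (n0 * noise_gain)


def simo_capacity(h, power, n0):
    """Capacity (5.26) in bits/s/Hz."""
    return math.log2(1.0 + receive_beamforming_snr(h, power, n0))


if __name__ == "__main__":
    h = [1 + 0j, 1j]  # |h_1| = |h_2| = 1, as in the example in the text
    single = receive_beamforming_snr([h[0]], power=1.0, n0=1.0)
    dual = receive_beamforming_snr(h, power=1.0, n0=1.0)
    print(f"power gain = {10 * math.log10(dual / single):.2f} dB")
    print("matched combiner  :", combiner_output_snr(h, h, 1.0, 1.0))
    print("antenna 1 only    :", combiner_output_snr(h, [1 + 0j, 0j], 1.0, 1.0))
```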
5.3.2 Multiple input single output (MISO) channel
Consider a MISO channel with $L$ transmit antennas and a single receive
antenna:
$$y[m] = \mathbf{h}^* \mathbf{x}[m] + w[m] \tag{5.27}$$
where $\mathbf{h} = [h_1, \ldots, h_L]^t$ and $h_\ell$ is the (fixed) channel gain from transmit
antenna $\ell$ to the receive antenna. There is a total power constraint of $P$ across
the transmit antennas.
In the SIMO channel above, the sufficient statistic is the projection of the
L-dimensional received signal onto h: the projections in orthogonal directions
contain noise that is not helpful to the detection of the transmit signal. A natural
reciprocal transmission strategy for the MISO channel would send information
only in the direction of the channel vector h; information sent in any orthogonal
direction will be nulled out by the channel anyway. Therefore, by setting
$$\mathbf{x}[m] = \frac{\mathbf{h}}{\|\mathbf{h}\|} \tilde{x}[m] \tag{5.28}$$
the MISO channel is reduced to the scalar AWGN channel:
$$y[m] = \|\mathbf{h}\| \tilde{x}[m] + w[m] \tag{5.29}$$
with a power constraint $P$ on the scalar input. The capacity of this scalar
channel is
$$\log\left(1 + \frac{P\|\mathbf{h}\|^2}{N_0}\right) \ \text{bits/s/Hz} \tag{5.30}$$
Can one do better than this scheme? Any reliable code for the MISO channel
can be used as a reliable code for the scalar AWGN channel $y[m] = x[m] + w[m]$:
if $\{\mathbf{X}_i\}$ are the transmitted $L \times N$ (space-time) code matrices for the MISO channel,
then the received $1 \times N$ vectors $\{\mathbf{h}^* \mathbf{X}_i\}$ form a code for the scalar AWGN
channel. Hence, the rate achievable by a reliable code for the MISO channel
must be at most the capacity of a scalar AWGN channel with the same received
SNR. Exercise 5.11 shows that the received SNR $P\|\mathbf{h}\|^2/N_0$ of the transmission
strategy above is in fact the largest possible SNR given the transmit power constraint
of $P$. Any other scheme has a lower received SNR and hence its reliable
rate must be less than (5.30), the rate achieved by the proposed transmission
strategy. We conclude that the capacity of the MISO channel is indeed
$$C = \log\left(1 + \frac{P\|\mathbf{h}\|^2}{N_0}\right) \ \text{bits/s/Hz} \tag{5.31}$$
Intuitively, the transmission strategy maximizes the received SNR by having
the received signals from the various transmit antennas add up in-phase
(coherently) and by allocating more power to the transmit antenna with the
better gain. This strategy, “aligning the transmit signal in the direction of
the transmit antenna array pattern”, is called transmit beamforming. Through
beamforming, the MISO channel is converted into a scalar AWGN channel
and thus any code which is optimal for the AWGN channel can be used directly.
In both the SIMO and the MISO examples the benefit from having multiple
antennas is a power gain. To get a gain in degrees of freedom, one has to use
both multiple transmit and multiple receive antennas (MIMO). We will study
this in depth in Chapter 7.
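As a rough numerical illustration of transmit beamforming, the sketch below compares the received SNR obtained by sending along $\mathbf{h}/\|\mathbf{h}\|$, as in (5.28), with that of an arbitrary unit-norm direction. The function names and the randomly drawn channel are assumptions made for illustration only; the channel output is modelled as $y = \mathbf{h}^*\mathbf{x} + w$.

```python
import math
import random


def received_snr(h, direction, P, N0):
    """Received SNR when a scalar symbol of power P is sent along a unit-norm direction.

    Assumes the MISO output y = sum_l conj(h_l) * x_l + w (the h^* x convention).
    """
    gain = sum(hl.conjugate() * dl for hl, dl in zip(h, direction))
    return P * abs(gain) ** 2 / N0


def beamforming_direction(h):
    """Unit vector h / ||h||, the direction used in (5.28)."""
    norm = math.sqrt(sum(abs(hl) ** 2 for hl in h))
    return [hl / norm for hl in h]


def miso_capacity(h, P, N0):
    """Capacity (5.31) in bits/s/Hz."""
    return math.log2(1 + P * sum(abs(hl) ** 2 for hl in h) / N0)


rng = random.Random(0)
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]  # hypothetical channel
P, N0 = 1.0, 1.0

snr_bf = received_snr(h, beamforming_direction(h), P, N0)

# An arbitrary unit-norm direction for comparison.
d = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
d_norm = math.sqrt(sum(abs(x) ** 2 for x in d))
snr_rand = received_snr(h, [x / d_norm for x in d], P, N0)

print("beamforming SNR:", snr_bf)
print("random-direction SNR:", snr_rand)
print("capacity (bits/s/Hz):", miso_capacity(h, P, N0))
```

By the Cauchy-Schwarz inequality the random direction can never exceed the beamforming SNR $P\|\mathbf{h}\|^2/N_0$, which is the point made by Exercise 5.11.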
5.3.3 Frequency-selective channel
Transformation to a parallel channel
Consider a time-invariant L-tap frequency-selective AWGN channel:
$$y[m] = \sum_{\ell=0}^{L-1} h_\ell\, x[m-\ell] + w[m] \tag{5.32}$$
with an average power constraint P on each input symbol. In Section 3.4.4, we
saw that the frequency-selective channel can be converted into Nc independent
sub-carriers by adding a cyclic prefix of length L−1 to a data vector of
length Nc, cf. (3.137). Suppose this operation is repeated over blocks of data
symbols (of length Nc each, along with the corresponding cyclic prefix of
length L−1); see Figure 5.9. Then communication over the ith OFDM block
can be written as
$$\tilde{y}_n[i] = \tilde{h}_n\,\tilde{d}_n[i] + \tilde{w}_n[i], \qquad n = 0, 1, \ldots, N_c-1 \tag{5.33}$$
Here,
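$$\tilde{\mathbf{d}}[i] := \left[\tilde{d}_0[i], \ldots, \tilde{d}_{N_c-1}[i]\right]^t \tag{5.34}$$
$$\tilde{\mathbf{w}}[i] := \left[\tilde{w}_0[i], \ldots, \tilde{w}_{N_c-1}[i]\right]^t \tag{5.35}$$
$$\tilde{\mathbf{y}}[i] := \left[\tilde{y}_0[i], \ldots, \tilde{y}_{N_c-1}[i]\right]^t \tag{5.36}$$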
are the DFTs of the input, the noise and the output of the ith OFDM block
respectively. $\tilde{\mathbf{h}}$ is the DFT of the channel scaled by $\sqrt{N_c}$ (cf. (3.138)). Since the
overhead in the cyclic prefix relative to the block length $N_c$ can be made arbitrarily
small by choosing $N_c$ large, the capacity of the original frequency-selective
channel is the same as the capacity of this transformed channel as $N_c \to \infty$.
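The conversion to (5.33) can be checked numerically. The following minimal sketch assumes a noiseless channel, a unitary DFT for the data and output blocks, and $\tilde{h}_n = \sum_\ell h_\ell e^{-j2\pi \ell n/N_c}$ for the channel; the helper names and the example taps and data are hypothetical. It prepends a cyclic prefix of length $L-1$, applies the linear convolution (5.32), discards the prefix, and compares the DFT of the output with $\tilde{h}_n \tilde{d}_n$.

```python
import cmath
import math


def unitary_dft(x):
    """Unitary DFT: X[n] = (1/sqrt(N)) * sum_m x[m] * exp(-j*2*pi*n*m/N)."""
    n_pts = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * math.pi * n * m / n_pts) for m in range(n_pts))
        / math.sqrt(n_pts)
        for n in range(n_pts)
    ]


def channel_gains(h, n_c):
    """h_tilde[n] = sum_l h[l] * exp(-j*2*pi*l*n/n_c): sqrt(n_c) times the unitary DFT of h."""
    return [
        sum(h_l * cmath.exp(-2j * math.pi * l * n / n_c) for l, h_l in enumerate(h))
        for n in range(n_c)
    ]


def ofdm_block_output(h, d):
    """Noiseless output for one data block d after removing the cyclic prefix."""
    taps, n_c = len(h), len(d)
    x = d[n_c - (taps - 1):] + d  # cyclic prefix of length L-1, then the data
    # Linear convolution, kept only for the n_c samples that follow the prefix.
    return [
        sum(h[l] * x[m - l] for l in range(taps))
        for m in range(taps - 1, taps - 1 + n_c)
    ]


h = [1.0, 0.5]  # example two-tap channel
d = [1 + 0j, -1 + 0j, 1j, 0.5 + 0j, -0.25 + 0j, 2 + 0j, 0j, 1 - 1j]  # one block, N_c = 8

y_tilde = unitary_dft(ofdm_block_output(h, d))
d_tilde = unitary_dft(d)
h_tilde = channel_gains(h, len(d))

max_err = max(abs(y - g * s) for y, g, s in zip(y_tilde, h_tilde, d_tilde))
print("largest deviation from y_n = h_n * d_n:", max_err)
```

Because the DFT used for the data is unitary, the block energy is preserved, which is the Parseval relation behind the power constraint on the sub-channels.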
[Figure 5.9: A coded OFDM system. Information bits are coded and then sent over the frequency-selective channel via OFDM modulation. Each channel use corresponds to an OFDM block. Coding can be done across different OFDM blocks as well as over different sub-carriers.]
The transformed channel (5.33) can be viewed as a collection of sub-channels,
one for each sub-carrier $n$. Each of the sub-channels is an AWGN channel. The
transformed noise $\tilde{\mathbf{w}}[i]$ is distributed as $\mathcal{CN}(0, N_0\mathbf{I})$, so the noise is $\mathcal{CN}(0, N_0)$
in each of the sub-channels and, moreover, the noise is independent across
sub-channels. The power constraint on the input symbols in time translates
to one on the data symbols on the sub-channels (Parseval theorem for DFTs):
$$\|\tilde{\mathbf{d}}[i]\|^2 \le N_c P \tag{5.37}$$
In information theory jargon, a channel which consists of a set of non-interfering
sub-channels, each of which is corrupted by independent noise, is
called a parallel channel. Thus, the transformed channel here is a parallel
AWGN channel, with a total power constraint across the sub-channels. A natural
strategy for reliable communication over a parallel AWGN channel is
illustrated in Figure 5.10. We allocate power to each sub-channel, Pn to the
nth sub-channel, such that the total power constraint is met. Then, a separate
capacity-achieving AWGN code is used to communicate over each of the sub-channels.
The maximum rate of reliable communication using this scheme is
$$\sum_{n=0}^{N_c-1} \log\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right)\ \text{bits/OFDM symbol} \tag{5.38}$$
Further, the power allocation can be chosen appropriately, so as to maximize
the rate in (5.38). The “optimal power allocation”, thus, is the solution to the
optimization problem:
$$C_{N_c} := \max_{P_0,\ldots,P_{N_c-1}} \sum_{n=0}^{N_c-1} \log\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right) \tag{5.39}$$
subject to
$$\sum_{n=0}^{N_c-1} P_n = N_c P, \qquad P_n \ge 0, \quad n = 0, \ldots, N_c-1 \tag{5.40}$$
[Figure 5.10: Coding independently over each of the sub-carriers. This architecture, with appropriate power and rate allocations, achieves the capacity of the frequency-selective channel.]
Waterfilling power allocation
The optimal power allocation can be explicitly found. The objective function
in (5.39) is jointly concave in the powers and this optimization problem can
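be solved by Lagrangian methods. Consider the Lagrangian
$$\mathcal{L}(\lambda, P_0, \ldots, P_{N_c-1}) := \sum_{n=0}^{N_c-1} \log\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right) - \lambda\sum_{n=0}^{N_c-1} P_n \tag{5.41}$$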
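where $\lambda$ is the Lagrange multiplier. The Kuhn–Tucker condition for the
optimality of a power allocation is
$$\frac{\partial\mathcal{L}}{\partial P_n}\ \begin{cases} = 0 & \text{if } P_n > 0 \\ \le 0 & \text{if } P_n = 0 \end{cases} \tag{5.42}$$
Define $x^+ := \max(x, 0)$. The power allocation
$$P_n^* = \left(\frac{1}{\lambda} - \frac{N_0}{|\tilde{h}_n|^2}\right)^+ \tag{5.43}$$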
satisfies the conditions in (5.42) and is therefore optimal, with the Lagrange
multiplier $\lambda$ chosen such that the power constraint is met:
$$\frac{1}{N_c}\sum_{n=0}^{N_c-1}\left(\frac{1}{\lambda} - \frac{N_0}{|\tilde{h}_n|^2}\right)^+ = P \tag{5.44}$$
Figure 5.11 gives a pictorial view of the optimal power allocation strategy
for the OFDM system. Think of the values $N_0/|\tilde{h}_n|^2$ plotted as a function
of the sub-carrier index $n = 0, \ldots, N_c-1$, as tracing out the bottom of a
vessel. If $P$ units of water per sub-carrier are filled into the vessel, the depth
of the water at sub-carrier $n$ is the power allocated to that sub-carrier, and
$1/\lambda$ is the height of the water surface. Thus, this optimal strategy is called
waterfilling or waterpouring. Note that there are some sub-carriers where the
bottom of the vessel is above the water and no power is allocated to them. In
these sub-carriers, the channel is too poor for it to be worthwhile to transmit
information. In general, the transmitter allocates more power to the stronger
sub-carriers, taking advantage of the better channel conditions, and less or
even no power to the weaker ones.
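The waterfilling solution (5.43) and (5.44) can be computed with a short search over the number of active sub-carriers. The sketch below is one common way to do this, not necessarily how a production system would; the names `waterfill` and `rate_bits`, the variable `mu` (standing for the water level $1/\lambda$), and the example gains are assumptions for illustration.

```python
import math


def waterfill(gains, N0, total_power):
    """Return (powers, mu) with powers[n] = max(mu - N0/gains[n], 0) and sum(powers) = total_power.

    gains[n] stands for |h_tilde_n|^2, mu for the water level 1/lambda,
    and total_power for N_c * P. Requires total_power > 0 and a positive gain.
    """
    floors = sorted(N0 / g for g in gains if g > 0)  # bottom of the vessel, lowest first
    mu = None
    # Try the k lowest floors as the active set, starting from all of them,
    # and stop at the first k whose water level clears the k-th floor.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            mu = level
            break
    powers = [max(mu - N0 / g, 0.0) if g > 0 else 0.0 for g in gains]
    return powers, mu


def rate_bits(gains, powers, N0):
    """Sum rate (5.38) in bits per OFDM symbol."""
    return sum(math.log2(1 + p * g / N0) for p, g in zip(powers, gains))


gains = [1.0, 0.5, 0.1]  # example values of |h_tilde_n|^2
N0, P = 1.0, 1.0
powers, mu = waterfill(gains, N0, len(gains) * P)
uniform = [P] * len(gains)

print("water level 1/lambda:", mu)
print("waterfilling powers:", powers)
print("waterfilling rate:", rate_bits(gains, powers, N0))
print("uniform-power rate:", rate_bits(gains, uniform, N0))
```

In this example the weakest sub-carrier lies above the water surface and receives no power, as described in the text.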
[Figure 5.11: Waterfilling power allocation over the $N_c$ sub-carriers. The curve $N_0/|H(f)|^2$ is plotted against the sub-carrier index $n$; the allocations $P_1^* = 0$, $P_2^*$ and $P_3^*$ are marked.]
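Observe that
$$\tilde{h}_n = \sum_{\ell=0}^{L-1} h_\ell \exp\left(-\frac{j2\pi \ell n}{N_c}\right) \tag{5.45}$$
is the discrete-time Fourier transform $H(f)$ evaluated at $f = nW/N_c$, where
(cf. (2.20))
$$H(f) := \sum_{\ell=0}^{L-1} h_\ell \exp\left(-\frac{j2\pi \ell f}{W}\right), \qquad f \in [0, W] \tag{5.46}$$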
As the number of sub-carriers Nc grows, the frequency width W/Nc of the
sub-carriers goes to zero and they represent a finer and finer sampling of the
continuous spectrum. So, the optimal power allocation converges to
$$P^*(f) = \left(\frac{1}{\lambda} - \frac{N_0}{|H(f)|^2}\right)^+ \tag{5.47}$$
where the constant $\lambda$ satisfies (cf. (5.44))
$$\int_0^W P^*(f)\,df = P \tag{5.48}$$
The power allocation can be interpreted as waterfilling over frequency (see
Figure 5.12). With $N_c$ sub-carriers, the largest reliable communication rate
with independent coding is $C_{N_c}$ bits per OFDM symbol or $C_{N_c}/N_c$ bits/s/Hz
($C_{N_c}$ given in (5.39)). So as $N_c \to \infty$, $W C_{N_c}/N_c$ converges to
$$C = \int_0^W \log\left(1+\frac{P^*(f)|H(f)|^2}{N_0}\right) df\ \text{bits/s} \tag{5.49}$$
[Figure 5.12: Waterfilling power allocation over the frequency spectrum of the two-tap channel (high-pass filter): $h_0 = 1$ and $h_1 = 0.5$. The curve $N_0/|H(f)|^2$ and the allocation $P^*(f)$ are plotted against frequency $f$ from $-0.4W$ to $0.4W$.]
Does coding across sub-carriers help?
So far we have considered a very simple scheme: coding independently over
each of the sub-carriers. By coding jointly across the sub-carriers, presumably
better performance can be achieved. Indeed, over a finite block length, coding
jointly over the sub-carriers yields a smaller error probability than can be
achieved by coding separately over the sub-carriers at the same rate. However,
somewhat surprisingly, the capacity of the parallel channel is equal to the
largest reliable rate of communication with independent coding within each
sub-carrier. In other words, if the block length is very large then coding jointly
over the sub-carriers cannot increase the rate of reliable communication any
more than what can be achieved simply by allocating power and rate over
the sub-carriers but not coding across the sub-carriers. So indeed (5.49) is the
capacity of the time-invariant frequency-selective channel.
To get some insight into why coding across the sub-carriers with large
block length does not improve capacity, we turn to a geometric view. Consider
a code, with block length $N_cN$ symbols, coding over all $N_c$ of the sub-carriers
with $N$ symbols from each sub-carrier. In high dimensions, i.e., $N \gg 1$, the
$N_cN$-dimensional received vector after passing through the parallel channel
(5.33) lives in an ellipsoid, with different axes stretched and shrunk by the
different channel gains $\tilde{h}_n$. The volume of the ellipsoid is proportional to
$$\prod_{n=0}^{N_c-1}\left(|\tilde{h}_n|^2 P_n + N_0\right)^N \tag{5.50}$$
see Exercise 5.12. The volume of the noise sphere is, as in Section 5.1.2,
proportional to $N_0^{N_cN}$. The maximum number of distinguishable codewords
that can be packed in the ellipsoid is therefore
$$\prod_{n=0}^{N_c-1}\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right)^N \tag{5.51}$$
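The maximum reliable rate of communication is
$$\frac{1}{N}\log\prod_{n=0}^{N_c-1}\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right)^N = \sum_{n=0}^{N_c-1}\log\left(1+\frac{P_n|\tilde{h}_n|^2}{N_0}\right)\ \text{bits/OFDM symbol} \tag{5.52}$$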
This is precisely the rate (5.38) achieved by separate coding and this suggests
that coding across sub-carriers can do no better. While this sphere-packing
argument is heuristic, Appendix B.6 gives a rigorous derivation from information
theoretic first principles.
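For reference, the step from the ellipsoid volume (5.50) to the codeword count (5.51) is just the ratio of the two volumes, using $N_0^{N_cN} = \prod_{n=0}^{N_c-1} N_0^{N}$. Written out (a heuristic count, in keeping with the sphere-packing argument):

```latex
\frac{\prod_{n=0}^{N_c-1}\left(|\tilde{h}_n|^2 P_n + N_0\right)^{N}}{N_0^{N_c N}}
  = \prod_{n=0}^{N_c-1}\left(\frac{|\tilde{h}_n|^2 P_n + N_0}{N_0}\right)^{N}
  = \prod_{n=0}^{N_c-1}\left(1 + \frac{P_n|\tilde{h}_n|^2}{N_0}\right)^{N}
```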
Even though coding across sub-carriers cannot improve the reliable rate of
communication, it can still improve the error probability for a given data rate.
Thus, coding across sub-carriers can still be useful in practice, particularly
when the block length for each sub-carrier is small, in which case the coding
effectively increases the overall block length.
In this section we have used parallel channels to model a frequency-selective
channel, but parallel channels will be seen to be very useful in
modeling many other wireless communication scenarios as well.
5.4 Capacity of fading channels
The basic capacity results developed in the last few sections are now applied
to analyze the limits to communication over wireless fading channels.
Consider the complex baseband representation of a flat fading channel:
$$y[m] = h[m]\,x[m] + w[m] \tag{5.53}$$
where $\{h[m]\}$ is the fading process and $\{w[m]\}$ is i.i.d. $\mathcal{CN}(0, N_0)$ noise.
As before, the symbol rate is $W$ Hz, there is a power constraint of $P$
joules/symbol, and $\mathbb{E}[|h[m]|^2] = 1$ is assumed for normalization. Hence
$\mathsf{SNR} := P/N_0$ is the average received SNR.
In Section 3.1.2, we analyzed the performance of uncoded transmission for
this channel. What is the ultimate performance limit when information can
be coded over a sequence of symbols? To answer this question, we make
the simplifying assumption that the receiver can perfectly track the fading
process, i.e., coherent reception. As we discussed in Chapter 2, the coherence
time of typical wireless channels is of the order of hundreds of symbols and
so the channel varies slowly relative to the symbol rate and can be estimated
by, say, a pilot signal. For now, the transmitter is not assumed to have any
knowledge of the channel realization other than the statistical characterization.
The situation when the transmitter has access to the channel realizations will
be studied in Section 5.4.6.
5.4.1 Slow fading channel
Let us first look at the situation when the channel gain is random but remains
constant for all time, i.e., $h[m] = h$ for all $m$. This models the slow fading
situation where the delay requirement is short compared to the channel
coherence time (cf. Table 2.2). This is also called the quasi-static scenario.
Conditional on a realization of the channel $h$, this is an AWGN channel
with received signal-to-noise ratio $|h|^2\mathsf{SNR}$. The maximum rate of reliable
communication supported by this channel is $\log(1+|h|^2\mathsf{SNR})$ bits/s/Hz. This
quantity is a function of the random channel gain $h$ and is therefore random
(Figure 5.13). Now suppose the transmitter encodes data at a rate $R$ bits/s/Hz.
If the channel realization $h$ is such that $\log(1+|h|^2\mathsf{SNR}) < R$, then whatever
the code used by the transmitter, the decoding error probability cannot be
made arbitrarily small. The system is said to be in outage, and the outage
probability is
$$p_{\mathrm{out}}(R) := \mathbb{P}\left\{\log(1+|h|^2\mathsf{SNR}) < R\right\} \tag{5.54}$$
Thus, the best the transmitter can do is to encode the data assuming that
the channel gain is strong enough to support the desired rate R. Reliable
communication can be achieved whenever that happens, and outage occurs
otherwise.
A more suggestive interpretation is to think of the channel as allowing
$\log(1+|h|^2\mathsf{SNR})$ bits/s/Hz of information through when the fading gain is $h$.
Reliable decoding is possible as long as this amount of information exceeds
the target rate.
[Figure 5.13: Density of $\log(1+|h|^2\mathsf{SNR})$, for Rayleigh fading and SNR = 0 dB. For any target rate $R$, there is a non-zero outage probability; the area under the density to the left of $R$ is $p_{\mathrm{out}}(R)$.]
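For Rayleigh fading (i.e., $h$ is $\mathcal{CN}(0, 1)$), the outage probability is
$$p_{\mathrm{out}}(R) = 1 - \exp\left(-\frac{2^R-1}{\mathsf{SNR}}\right) \tag{5.55}$$
At high SNR,
$$p_{\mathrm{out}}(R) \approx \frac{2^R-1}{\mathsf{SNR}} \tag{5.56}$$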
and the outage probability decays as 1/SNR. Recall that when we discussed
uncoded transmission in Section 3.1.2, the detection error probability also
decays like 1/SNR. Thus, we see that coding cannot significantly improve the
error probability in a slow fading scenario. The reason is that while coding
can average out the Gaussian white noise, it cannot average out the channel
fade, which affects all the coded symbols. Thus, deep fade, which is the
typical error event in the uncoded case, is also the typical error event in the
coded case.
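A minimal sketch of the two expressions (5.55) and (5.56), with rates measured in bits (base-2 logarithm) and `snr` given as a linear ratio rather than in dB; the function names are illustrative.

```python
import math


def outage_rayleigh(R, snr):
    """Exact outage probability (5.55) for Rayleigh fading."""
    return 1.0 - math.exp(-(2.0 ** R - 1.0) / snr)


def outage_rayleigh_high_snr(R, snr):
    """High-SNR approximation (5.56)."""
    return (2.0 ** R - 1.0) / snr


for snr_db in (0, 10, 20, 30):
    snr = 10 ** (snr_db / 10)
    print(snr_db, "dB:", outage_rayleigh(1.0, snr), outage_rayleigh_high_snr(1.0, snr))
```

Each additional 10 dB of SNR reduces the outage probability by roughly a factor of 10, the $1/\mathsf{SNR}$ decay noted above.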
There is a conceptual difference between the AWGN channel and the slow
fading channel. In the former, one can send data at a positive rate (in fact, any
rate less than C) while making the error probability as small as desired. This
cannot be done for the slow fading channel as long as the probability that
the channel is in deep fade is non-zero. Thus, the capacity of the slow fading
channel in the strict sense is zero. An alternative performance measure is the
$\epsilon$-outage capacity $C_\epsilon$. This is the largest rate of transmission $R$ such that the
outage probability $p_{\mathrm{out}}(R)$ is less than $\epsilon$. Solving $p_{\mathrm{out}}(R) = \epsilon$ in (5.54) yields
$$C_\epsilon = \log\left(1+F^{-1}(1-\epsilon)\,\mathsf{SNR}\right)\ \text{bits/s/Hz} \tag{5.57}$$
where $F$ is the complementary cumulative distribution function of $|h|^2$, i.e.,
$F(x) := \mathbb{P}\{|h|^2 > x\}$.
In Section 3.1.2, we looked at uncoded transmission and there it was natural
to focus only on the high SNR regime; at low SNR, the error probability of
uncoded transmission is very poor. On the other hand, for coded systems,
it makes sense to consider both the high and the low SNR regimes. For
example, the CDMA system in Chapter 4 operates at very low SINR and
uses very low-rate orthogonal coding. A natural question is: in which regime
does fading have a more significant impact on outage performance? One can
answer this question in two ways. Eqn (5.57) says that, to achieve the same
rate as the AWGN channel, an extra $10\log_{10}\left(1/F^{-1}(1-\epsilon)\right)$ dB of power is
needed. This is true regardless of the operating SNR of the environment. Thus
the fade margin is the same at all SNRs. If we look at the outage capacity
at a given SNR, however, the impact of fading depends very much on the
operating regime. To get a sense, Figure 5.14 plots the $\epsilon$-outage capacity as
a function of SNR for the Rayleigh fading channel. To assess the impact of
fading, the $\epsilon$-outage capacity is plotted as a fraction of the AWGN capacity
at the same SNR. It is clear that the impact is much more significant in the
low SNR regime.
[Figure 5.14: $\epsilon$-outage capacity as a fraction of AWGN capacity under Rayleigh fading, for $\epsilon = 0.1$ and $\epsilon = 0.01$. Horizontal axis: SNR (dB); vertical axis: $C_\epsilon/C_{\mathrm{awgn}}$.]
Indeed, at high SNR,
$$C_\epsilon \approx \log\mathsf{SNR} + \log\left(F^{-1}(1-\epsilon)\right) \tag{5.58}$$
$$\approx C_{\mathrm{awgn}} - \log\left(\frac{1}{F^{-1}(1-\epsilon)}\right) \tag{5.59}$$
a constant difference irrespective of the SNR. Thus, the relative loss gets
smaller at high SNR. At low SNR, on the other hand,
$$C_\epsilon \approx F^{-1}(1-\epsilon)\,\mathsf{SNR}\log_2 e \tag{5.60}$$
$$\approx F^{-1}(1-\epsilon)\,C_{\mathrm{awgn}} \tag{5.61}$$
For reasonably small outage probabilities, the outage capacity is only a
small fraction of the AWGN capacity at low SNR. For Rayleigh fading,
$F^{-1}(1-\epsilon) \approx \epsilon$ for small $\epsilon$ and the impact of fading is very significant. At
an outage probability of 0.01, the outage capacity is only 1% of the AWGN
capacity! Diversity has a significant effect at high SNR (as already seen in
Chapter 3), but can be more important at low SNR. Intuitively, the impact
of the randomness of the channel is in the received SNR, and the reliable
rate supported by the AWGN channel is much more sensitive to the received
SNR at low SNR than at high SNR. Exercise 5.10 elaborates on this point.
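The two viewpoints above (a fixed fade margin in dB versus a regime-dependent fraction of AWGN capacity) can be tabulated for Rayleigh fading. The sketch assumes $F(x) = e^{-x}$, so that $F^{-1}(1-\epsilon) = -\ln(1-\epsilon)$, with rates in bits and a linear `snr`; the function names are illustrative.

```python
import math


def f_inv_rayleigh(eps):
    """F^{-1}(1 - eps) for F(x) = P{|h|^2 > x} = exp(-x)."""
    return -math.log(1.0 - eps)


def outage_capacity_rayleigh(eps, snr):
    """eps-outage capacity (5.57) in bits/s/Hz."""
    return math.log2(1.0 + f_inv_rayleigh(eps) * snr)


def awgn_capacity(snr):
    """AWGN capacity log2(1 + SNR) at the same SNR."""
    return math.log2(1.0 + snr)


def fade_margin_db(eps):
    """Extra power, in dB, needed to match the AWGN rate; it does not depend on the SNR."""
    return 10.0 * math.log10(1.0 / f_inv_rayleigh(eps))


eps = 0.01
print("fade margin (dB):", fade_margin_db(eps))
for snr_db in (-20, 0, 20, 40):
    snr = 10 ** (snr_db / 10)
    ratio = outage_capacity_rayleigh(eps, snr) / awgn_capacity(snr)
    print(snr_db, "dB: C_eps / C_awgn =", ratio)
```

At low SNR the printed fraction is close to $\epsilon$, while at high SNR it grows, consistent with the constant gap in (5.59).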
5.4.2 Receive diversity
Let us increase the diversity of the channel by having L receive antennas
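instead of one. For given channel gains $\mathbf{h} := [h_1, \ldots, h_L]^t$, the capacity was
calculated in Section 5.3.1 to be $\log(1+\|\mathbf{h}\|^2\mathsf{SNR})$. Outage occurs whenever
this is below the target rate $R$:
$$p^{\mathrm{rx}}_{\mathrm{out}}(R) := \mathbb{P}\left\{\log(1+\|\mathbf{h}\|^2\mathsf{SNR}) < R\right\} \tag{5.62}$$
This can be rewritten as
$$p_{\mathrm{out}}(R) = \mathbb{P}\left\{\|\mathbf{h}\|^2 < \frac{2^R-1}{\mathsf{SNR}}\right\} \tag{5.63}$$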
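Under independent Rayleigh fading, $\|\mathbf{h}\|^2$ is a sum of the squares of $2L$
independent Gaussian random variables and is distributed as Chi-square with
$2L$ degrees of freedom. Its density is
$$f(x) = \frac{1}{(L-1)!}\,x^{L-1}e^{-x}, \qquad x \ge 0 \tag{5.64}$$
Approximating $e^{-x}$ by 1 for $x$ small, we have (cf. (3.44)),
$$\mathbb{P}\left\{\|\mathbf{h}\|^2 < \delta\right\} \approx \frac{1}{L!}\,\delta^L \tag{5.65}$$
for $\delta$ small. Hence at high SNR the outage probability is given by
$$p_{\mathrm{out}}(R) \approx \frac{(2^R-1)^L}{L!\,\mathsf{SNR}^L} \tag{5.66}$$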
Comparing with (5.55), we see a diversity gain of L: the outage probability
now decays like $1/\mathsf{SNR}^L$. This parallels the performance of uncoded transmission
discussed in Section 3.3.1: thus, coding cannot increase the diversity
gain.
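The high-SNR approximation (5.66) can be compared with the exact outage probability. The sketch below relies on the closed-form integral of the density (5.64), $\mathbb{P}\{\|\mathbf{h}\|^2 < x\} = 1 - e^{-x}\sum_{k=0}^{L-1} x^k/k!$, a standard result that is not stated in the text; the function names are illustrative.

```python
import math


def norm_sq_cdf(x, L):
    """P{||h||^2 < x} for L i.i.d. CN(0,1) gains: 1 - exp(-x) * sum_{k<L} x^k / k!."""
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(L))


def outage_rx(R, snr, L):
    """Exact outage probability (5.63) with L receive antennas."""
    return norm_sq_cdf((2.0 ** R - 1.0) / snr, L)


def outage_rx_high_snr(R, snr, L):
    """High-SNR approximation (5.66)."""
    return (2.0 ** R - 1.0) ** L / (math.factorial(L) * snr ** L)


for L in (1, 2, 3):
    for snr_db in (10, 20):
        snr = 10 ** (snr_db / 10)
        print(L, snr_db, outage_rx(1.0, snr, L), outage_rx_high_snr(1.0, snr, L))
```

Going from 10 dB to 20 dB lowers the outage probability by roughly $10^L$, which is the diversity gain of $L$.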
The impact of receive diversity on the $\epsilon$-outage capacity is plotted in
Figure 5.15. The $\epsilon$-outage capacity is given by (5.57) with $F$ now the complementary
cumulative distribution function of $\|\mathbf{h}\|^2$. Receive antennas yield a diversity gain
and an $L$-fold power gain. To emphasize the impact of the diversity gain, let
us normalize the outage capacity $C_\epsilon$ by $C_{\mathrm{awgn}} = \log(1+L\,\mathsf{SNR})$. The dramatic
salutary effect of diversity on outage capacity can now be seen. At low SNR
and small $\epsilon$, (5.61) and (5.65) yield
$$C_\epsilon \approx F^{-1}(1-\epsilon)\,\mathsf{SNR}\log_2 e \tag{5.67}$$
$$\approx (L!)^{1/L}\,\epsilon^{1/L}\,\mathsf{SNR}\log_2 e\ \text{bits/s/Hz} \tag{5.68}$$
and the loss with respect to the AWGN capacity is by a factor of $\epsilon^{1/L}$ rather
than by $\epsilon$ when there is no diversity. At $\epsilon = 0.01$ and $L = 2$, the outage
capacity is increased to 14% of the AWGN capacity (as opposed to 1% for
$L = 1$).
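The quantity $F^{-1}(1-\epsilon)$ in (5.67) is the $\epsilon$-quantile of $\|\mathbf{h}\|^2$, and it can be found numerically and compared with the small-$\epsilon$ approximation $(L!)^{1/L}\epsilon^{1/L}$ in (5.68). The sketch assumes the same closed-form distribution $\mathbb{P}\{\|\mathbf{h}\|^2 < x\} = 1 - e^{-x}\sum_{k<L} x^k/k!$ (a standard result, not given in the text) and uses plain bisection; all names are illustrative.

```python
import math


def norm_sq_cdf(x, L):
    """P{||h||^2 < x} for L i.i.d. CN(0,1) gains."""
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(L))


def quantile(eps, L):
    """Solve P{||h||^2 < x} = eps for x by bisection; equals F^{-1}(1 - eps) in (5.57)."""
    lo, hi = 0.0, 1.0
    while norm_sq_cdf(hi, L) < eps:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_sq_cdf(mid, L) < eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def outage_capacity(eps, snr, L):
    """eps-outage capacity (5.57) in bits/s/Hz with L-fold receive diversity."""
    return math.log2(1.0 + quantile(eps, L) * snr)


eps = 0.01
for L in (1, 2, 3):
    exact = quantile(eps, L)
    approx = (math.factorial(L) * eps) ** (1.0 / L)
    print(L, "quantile:", exact, "approximation (L! eps)^(1/L):", approx)
```

For $L = 2$ and $\epsilon = 0.01$ both values are near 0.14 to 0.15, the low-SNR coefficient behind the 14% figure quoted above.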
[Figure 5.15: $\epsilon$-outage capacity with $L$-fold receive diversity, as a fraction of the AWGN capacity $\log(1+L\,\mathsf{SNR})$, for $\epsilon = 0.01$ and $L = 1, \ldots, 5$. Horizontal axis: SNR (dB); vertical axis: $C_\epsilon/C_{\mathrm{awgn}}$.]
5.4.3 Transmit diversity
Now suppose there are L transmit antennas but only one receive antenna, with
a total power constraint of P. From Section 5.3.2, the capacity of the channel
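conditioned on the channel gains $\mathbf{h} = [h_1, \ldots, h_L]^t$ is $\log(1+\|\mathbf{h}\|^2\mathsf{SNR})$.
Following the approach taken in the SISO and the SIMO cases, one is tempted
to say that the outage probability for a fixed rate $R$ is
$$p^{\mathrm{full\text{-}csi}}_{\mathrm{out}}(R) = \mathbb{P}\left\{\log(1+\|\mathbf{h}\|^2\mathsf{SNR}) < R\right\} \tag{5.69}$$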
which would have been exactly the same as the corresponding SIMO system
with 1 transmit and L receive antennas. However, this outage performance
is achievable only if the transmitter knows the phases and magnitudes of the
gains h so that it can perform transmit beamforming, i.e., allocate more power
to the stronger antennas and arrange the signals from the different antennas to
align in phase at the receiver. When the transmitter does not know the channel
gains h, it has to use a fixed transmission strategy that does not depend on h.
(This subtlety does not arise in either the SISO or the SIMO case because the
transmitter need not know the channel realization to achieve the capacity for
those channels.) How much performance loss does not knowing the channel
entail?
Alamouti scheme revisited
For concreteness, let us focus on L = 2 (dual transmit antennas). In this
situation, we can use the Alamouti scheme, which extracts transmit diversity
without transmitter channel knowledge (introduced in Section 3.3.2). Recall
from (3.76) that, under this scheme, both the transmitted symbols $u_1, u_2$ over a
block of 2 symbol times see an equivalent scalar fading channel with gain $\|\mathbf{h}\|$
and additive noise $\mathcal{CN}(0, N_0)$ (Figure 5.16(b)). The energy in the symbols
$u_1$ and $u_2$ is $P/2$. Conditioned on $h_1, h_2$, the capacity of the equivalent scalar
channel is
$$\log\left(1+\|\mathbf{h}\|^2\,\frac{\mathsf{SNR}}{2}\right)\ \text{bits/s/Hz} \tag{5.70}$$
[Figure 5.16: A space-time coding scheme combined with the MISO channel can be viewed as an equivalent scalar channel: (a) repetition coding; (b) the Alamouti scheme. The outage probability of the scheme is the outage probability of the equivalent channel.]
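Thus, if we now consider successive blocks and use an AWGN capacity-achieving
code of rate $R$ over each of the streams $\{u_1[m]\}$ and $\{u_2[m]\}$
separately, then the outage probability of each stream is
$$p^{\mathrm{Ala}}_{\mathrm{out}}(R) = \mathbb{P}\left\{\log\left(1+\|\mathbf{h}\|^2\,\frac{\mathsf{SNR}}{2}\right) < R\right\} \tag{5.71}$$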
Compared to (5.69) when the transmitter knows the channel, the Alamouti
scheme performs strictly worse: the loss is 3 dB in the received SNR. This
can be explained in terms of the efficiency with which energy is transferred
to the receiver. In the Alamouti scheme, the symbols sent at the two transmit
antennas in each time are independent since they come from two separately
coded streams. Each of them has power P/2. Hence, the total SNR at the
receive antenna at any given time is
$$\left(|h_1|^2 + |h_2|^2\right)\frac{\mathsf{SNR}}{2} \tag{5.72}$$
In contrast, when the transmitter knows the channel, the symbols transmitted
at the two antennas are completely correlated in such a way that the
signals add up in phase at the receive antenna and the SNR is now
$$\left(|h_1|^2 + |h_2|^2\right)\mathsf{SNR}$$
a 3-dB power gain over the independent case (see footnote 4). Intuitively, there is a power
loss because, without channel knowledge, the transmitter is sending signals
that have energy in all directions instead of focusing the energy in a specific
direction. In fact, the Alamouti scheme radiates energy in a perfectly isotropic
manner: the signal transmitted from the two antennas has the same energy
when projected in any direction (Exercise 5.14).
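The equivalent scalar channel seen by each Alamouti symbol can be checked directly. The sketch below assumes the usual Alamouti mapping, in which the two antennas send $(u_1, u_2)$ in the first symbol time and $(-u_2^*, u_1^*)$ in the second, with linear combining at the receiver; the sign convention, the function names and the example values are assumptions and may differ in detail from (3.76).

```python
def alamouti_receive(h1, h2, u1, u2, w1=0j, w2=0j):
    """Received samples over the two symbol times of one Alamouti block."""
    y1 = h1 * u1 + h2 * u2 + w1                           # time 1: antennas send (u1, u2)
    y2 = -h1 * u2.conjugate() + h2 * u1.conjugate() + w2  # time 2: antennas send (-u2*, u1*)
    return y1, y2


def alamouti_combine(h1, h2, y1, y2):
    """Linear post-processing that separates u1 and u2."""
    r1 = h1.conjugate() * y1 + h2 * y2.conjugate()
    r2 = h2.conjugate() * y1 - h1 * y2.conjugate()
    return r1, r2


h1, h2 = 0.3 + 0.4j, -1.2 + 0.5j  # example channel gains
u1, u2 = 1 + 1j, -1 + 1j          # example data symbols
y1, y2 = alamouti_receive(h1, h2, u1, u2)
r1, r2 = alamouti_combine(h1, h2, y1, y2)

norm_sq = abs(h1) ** 2 + abs(h2) ** 2
print(r1 / norm_sq, r2 / norm_sq)  # recovers u1 and u2 in the noiseless case
```

In the noiseless case each combined output equals $(|h_1|^2 + |h_2|^2)$ times the corresponding symbol, so after normalizing by $\|\mathbf{h}\|$ each symbol sees a scalar channel of gain $\|\mathbf{h}\|$ without any channel knowledge at the transmitter.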
A scheme radiates energy isotropically whenever the signals transmitted from
the antennas are uncorrelated and have equal power (Exercise 5.14). Although
the Alamouti scheme does not perform as well as transmit beamforming, it
is optimal in one important sense: it has the best outage probability among
all schemes that radiate energy isotropically. Indeed, any such scheme must
have a received SNR equal to (5.72) and hence its outage performance must be
no better than that of a scalar slow fading AWGN channel with that received
SNR. But this is precisely the performance achieved by the Alamouti scheme.
Can one do even better by radiating energy in a non-isotropic manner (but
in a way that does not depend on the random channel gains)? In other words,
can one improve the outage probability by correlating the signals from the
transmit antennas and/or allocating unequal powers on the antennas? The
answer depends of course on the distribution of the gains $h_1, h_2$. If $h_1, h_2$
are i.i.d. Rayleigh, Exercise 5.15 shows, using symmetry considerations, that
correlation never improves the outage performance, but it is not necessarily
optimal to use all the transmit antennas. Exercise 5.16 shows that uniform
power allocation across antennas is always optimal, but the number of antennas
used depends on the operating SNR. For reasonable values of target outage
probabilities, it is optimal to use all the antennas. This implies that in most
cases of interest, the Alamouti scheme has the optimal outage performance
for the i.i.d. Rayleigh fading channel.
What about for $L > 2$ transmit antennas? An information theoretic argument
in Appendix B.8 shows (in a more general framework) that
$$p_{\mathrm{out}}(R) = \mathbb{P}\left\{\log\left(1+\|\mathbf{h}\|^2\,\frac{\mathsf{SNR}}{L}\right) < R\right\} \tag{5.73}$$
is achievable. This is the natural generalization of (5.71) and corresponds again
to isotropic transmission of energy from the antennas. Again, Exercises 5.15
and 5.16 show that this strategy is optimal for the i.i.d. Rayleigh fading
channel and for most target outage probabilities of interest. However, there
is no natural generalization of the Alamouti scheme for a larger number
of transmit antennas (cf. Exercise 3.17). We will return to the problem of
outage-optimal code design for L>2 in Chapter 9.
Footnote 4: The addition of two in-phase signals of equal power yields a sum signal that has double the
amplitude and four times the power of each of the signals. In contrast, the addition of two
independent signals of equal power only doubles the power.
The outage performances of the SIMO and the MISO channels with i.i.d.
Rayleigh gains are plotted in Figure 5.17 for different numbers of transmit
antennas. The difference in outage performance clearly outlines the asymmetry
between receive and transmit antennas caused by the transmitter lacking
knowledge of the channel.
[Figure 5.17: Comparison of outage performance between SIMO and MISO channels for different $L$ ($L = 1, 3, 5$): (a) outage probability as a function of SNR, for fixed $R = 1$; (b) outage capacity as a function of SNR, for a fixed outage probability of $10^{-2}$.]
Suboptimal schemes: repetition coding
In the above, the Alamouti scheme is viewed as an inner code that converts
the MISO channel into a scalar channel. The outage performance (5.71) is
achieved when the Alamouti scheme is used in conjunction with an outer code
that is capacity-achieving for the scalar AWGN channel. Other space-time
schemes can be similarly used as inner codes and their outage probability
analyzed and compared to the channel outage performance.
Here we consider the simplest example, the repetition scheme: the same
symbol is transmitted over the L different antennas over L symbol periods,
using only one antenna at a time to transmit. The receiver does maximal
ratio combining to demodulate each symbol. As a result, each symbol sees
an equivalent scalar fading channel with gain $\|\mathbf{h}\|$ and noise variance $N_0$
(Figure 5.16(a)). Since only one symbol is transmitted every L symbol periods,
a rate of LR bits/symbol is required on this scalar channel to achieve a target
rate of R bits/symbol on the original channel. The outage probability of this
scheme, when combined with an outer capacity-achieving code, is therefore:
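$$p^{\mathrm{rep}}_{\mathrm{out}}(R) = \mathbb{P}\left\{\frac{1}{L}\log(1+\|\mathbf{h}\|^2\mathsf{SNR}) < R\right\} \tag{5.74}$$
Compared to the outage probability (5.73) of the channel, this scheme is
suboptimal: the SNR has to be increased by a factor of
$$\frac{L(2^R-1)}{2^{LR}-1} \tag{5.75}$$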
to achieve the same outage probability for the same target rate R. Equivalently,
the reciprocal of this ratio can be interpreted as the maximum achievable
coding gain over the simple repetition scheme. For a fixed R, the performance
loss increases with L: the repetition scheme becomes increasingly inefficient
in using the degrees of freedom of the channel. For a fixed L, the performance
loss increases with the target rate R. On the other hand, for R small,
$2^R-1 \approx R\ln 2$ and $2^{RL}-1 \approx RL\ln 2$, so
$$\frac{L(2^R-1)}{2^{LR}-1} \approx \frac{LR\ln 2}{LR\ln 2} = 1 \tag{5.76}$$
and there is hardly any loss in performance. Thus, while the repetition scheme
is very suboptimal in the high SNR regime where the target rate can be high,
it is nearly optimal in the low SNR regime. This is not surprising: the system
is degree-of-freedom limited in the high SNR regime and the inefficiency of
the repetition scheme is felt more there.
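To see how the ratio in (5.75) behaves, the following sketch evaluates it, together with its reciprocal expressed in dB (the coding gain over repetition mentioned above), for a few values of $L$ and $R$. The function names are illustrative, and rates are in bits.

```python
import math


def repetition_ratio(L, R):
    """The ratio L(2^R - 1) / (2^{LR} - 1) of (5.75)."""
    return L * (2.0 ** R - 1.0) / (2.0 ** (L * R) - 1.0)


def coding_gain_db(L, R):
    """Reciprocal of the ratio (5.75), expressed in dB."""
    return -10.0 * math.log10(repetition_ratio(L, R))


for L in (2, 4):
    for R in (0.01, 1.0, 4.0):
        print("L =", L, "R =", R, "ratio =", repetition_ratio(L, R),
              "coding gain (dB) =", coding_gain_db(L, R))
```

For small $R$ the ratio is close to 1 (about 0 dB), while for larger $L$ or $R$ the gap grows quickly, in line with (5.76) and the discussion of the two SNR regimes.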
Summary 5.2 Transmit and receive diversity
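With receive diversity, the outage probability is
$$p^{\mathrm{rx}}_{\mathrm{out}}(R) := \mathbb{P}\left\{\log(1+\|\mathbf{h}\|^2\mathsf{SNR}) < R\right\} \tag{5.77}$$
With transmit diversity and isotropic transmission, the outage probability is
$$p^{\mathrm{tx}}_{\mathrm{out}}(R) := \mathbb{P}\left\{\log\left(1+\|\mathbf{h}\|^2\,\frac{\mathsf{SNR}}{L}\right) < R\right\} \tag{5.78}$$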
a loss of a factor of L in the received SNR because the transmitter has
no knowledge of the channel direction and is unable to beamform in the
specific channel direction.
With two transmit antennas, capacity-achieving AWGN codes in conjunction
with the Alamouti scheme achieve the outage probability.
5.4.4 Time and frequency diversity
Outage performance of parallel channels
Another way to increase channel diversity is to exploit the time-variation
of the channel: in addition to coding over symbols within one coherence
period, one can code over symbols from L such periods. Note that this is
a generalization of the schemes considered in Section 3.2, which take one
symbol from each coherence period. When coding can be performed over
many symbols from each period, as well as between symbols from different
periods, what is the performance limit?
One can model this situation using the idea of parallel channels introduced
in Section 5.3.3: each of the sub-channels, $\ell = 1, \ldots, L$, represents
a coherence period of duration $T_c$ symbols:
$$y_\ell[m] = h_\ell\,x_\ell[m] + w_\ell[m], \qquad m = 1, \ldots, T_c \tag{5.79}$$
Here $h_\ell$ is the (non-varying) channel gain during the $\ell$th coherence period.
It is assumed that the coherence time Tc is large such that one can code
over many symbols in each of the sub-channels. An average transmit power
constraint of P on the original channel translates into a total power constraint
of LP on the parallel channel.
For a given realization of the channel, we have already seen in Section 5.3.3
that the optimal power allocation across the sub-channels is waterfilling.
However, since the transmitter does not know what the channel gains are, a
reasonable strategy is to allocate equal power P to each of the sub-channels.
In Section 5.3.3, it was mentioned that the maximum rate of reliable communication
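given the fading gains $h_\ell$ is
$$\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR})\ \text{bits/s/Hz} \tag{5.80}$$
where $\mathsf{SNR} = P/N_0$. Hence, if the target rate is $R$ bits/s/Hz per sub-channel,
then outage occurs when
$$\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR}) < LR \tag{5.81}$$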
Can one design a code to communicate reliably whenever
$$\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR}) > LR\,? \tag{5.82}$$
If so, an $L$-fold diversity is achieved for i.i.d. Rayleigh fading: outage occurs
only if each of the terms in the sum $\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR})$ is small.
The term $\log(1+|h_\ell|^2\mathsf{SNR})$ is the capacity of an AWGN channel with
received SNR equal to $|h_\ell|^2\mathsf{SNR}$. Hence, a seemingly straightforward strategy,
already used in Section 5.3.3, would be to use a capacity-achieving AWGN
code with rate
$$\log(1+|h_\ell|^2\mathsf{SNR})$$
for the $\ell$th coherence period, yielding an average rate of
$$\frac{1}{L}\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR})\ \text{bits/s/Hz}$$
and meeting the target rate whenever condition (5.82) holds. The caveat is
that this strategy requires the transmitter to know in advance the channel state
during each of the coherence periods so that it can adapt the rate it allocates to
each period. This knowledge is not available. However, it turns out that such
transmitter adaptation is unnecessary: information theory guarantees that
one can design a single code that communicates reliably at rate R whenever
the condition (5.82) is met. Hence, the outage probability of the time diversity
channel is precisely
$$p_{\mathrm{out}}(R) = \mathbb{P}\left\{\frac{1}{L}\sum_{\ell=1}^{L}\log(1+|h_\ell|^2\mathsf{SNR}) < R\right\} \tag{5.83}$$
Even though this outage performance can be achieved with or without
transmitter knowledge of the channel, the coding strategy is vastly different.
With transmitter knowledge of the channel, dynamic rate allocation and separate
coding for each sub-channel suffices. Without transmitter knowledge,
separate coding would mean using a fixed-rate code for each sub-channel and
poor diversity results: errors occur whenever one of the sub-channels is bad.
Indeed, coding across the different coherence periods is now necessary: if the
channel is in deep fade during one of the coherence periods, the information
bits can still be protected if the channel is strong in other periods.
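The gap between coding across coherence periods and using a separate fixed-rate code in each one can be estimated by simulation. The Monte Carlo sketch below assumes i.i.d. Rayleigh fading, so each $|h_\ell|^2$ is exponential with unit mean, and counts outages under (5.83) and under the rule "any sub-channel alone falls below $R$"; the function name, seed and parameter values are illustrative.

```python
import math
import random


def simulate_outage(L, R, snr, trials, seed=0):
    """Estimate (joint-coding outage, separate fixed-rate outage) over L sub-channels."""
    rng = random.Random(seed)
    joint = separate = 0
    for _ in range(trials):
        # |h_l|^2 is exponential with mean 1 under Rayleigh fading.
        rates = [math.log2(1.0 + rng.expovariate(1.0) * snr) for _ in range(L)]
        if sum(rates) / L < R:  # outage event of (5.83)
            joint += 1
        if min(rates) < R:      # some sub-channel cannot support rate R by itself
            separate += 1
    return joint / trials, separate / trials


for L in (1, 2, 4):
    p_joint, p_sep = simulate_outage(L, R=1.0, snr=10.0, trials=20000)
    print("L =", L, "joint coding:", p_joint, "separate coding:", p_sep)
```

With joint coding the estimated outage probability falls as $L$ grows, whereas with separate fixed-rate codes it rises, since one bad sub-channel is enough to cause an error.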
A geometric view
Figure 5.18 gives a geometric view of our discussion so far. Consider a code
with rate $R$, coding over all the sub-channels and over one coherence time-interval;
the block length is $LT_c$ symbols. The codewords lie in an $LT_c$-dimensional
sphere. The received $LT_c$-dimensional signal lives in an ellipsoid,
with ($L$ groups of) different axes stretched and shrunk by the different sub-channel
gains (cf. Section 5.3.3). The ellipsoid is a function of the sub-channel
gains, and hence random. The no-outage condition (5.82) has a geometric
interpretation: it says that the volume of the ellipsoid is large enough to
contain $2^{LT_cR}$ noise spheres, one for each codeword. (This was already seen
in the sphere-packing argument in Section 5.3.3.) An outage-optimal code is
one that communicates reliably whenever the random ellipsoid is at least this
large. The subtlety here is that the same code must work for all such ellipsoids.
Since the shrinking can occur in any of the L groups of dimensions, a robust
code needs to have the property that the codewords are simultaneously wellseparated
in each of the sub-channels (Figure 5.18(a)). A set of independent
codes, one for each sub-channel, is not robust: errors will be made when even
only one of the sub-channels fades (Figure 5.18(b)).
Figure 5.18 Effect of the fading gains on codes for the parallel channel. Here there are $L = 2$ sub-channels and each axis represents $T_c$ dimensions within a sub-channel. (a) Coding across the sub-channels. The code works as long as the volume of the ellipsoid is big enough. This requires good codeword separation in both the sub-channels. (b) Separate, non-adaptive code for each sub-channel. Shrinking of one of the axes is enough to cause confusion between the codewords. (Panel labels: "Reliable communication" in (a), "Noise spheres overlap" in (b).)

We have already seen, in the simple context of Section 3.2, codes for
the parallel channel which are designed to be well-separated in all the sub-channels.
For example, the repetition code and the rotation code in Figure 3.8
have the property that the codewords are separated in both the sub-channels
(here $T_c = 1$ symbol and $L = 2$ sub-channels). More generally, the code design
criterion of maximizing the product distance for all pairs of codewords naturally
favors codes that satisfy this property. Coding over long blocks affords
a larger coding gain; information theory guarantees the existence of codes
with large enough coding gain to achieve the outage probability in (5.83).
To achieve the outage probability, one wants to design a code that communicates
reliably over every parallel channel that is not in outage (i.e., parallel
channels that satisfy (5.82)). In information theory jargon, a code that communicates
reliably for a class of channels is said to be universal for that class.
In this language, we are looking for universal codes for parallel channels that
are not in outage. In the slow fading scalar channel without diversity (L = 1),
this problem is the same as the code design problem for a specific channel.
This is because all scalar channels are ordered by their received SNR; hence a
code that works for the channel that is just strong enough to support the target
rate will automatically work for all better channels. For parallel channels,
each channel is described by a vector of channel gains and there is no natural
ordering of channels; the universal code design problem is now non-trivial.
In Chapter 9, a universal code design criterion will be developed to construct
universal codes that come close to achieving the outage probability.
Extensions
In the above development, a uniform power allocation across the sub-channels
is assumed. Instead, if we choose to allocate power $P_\ell$ to sub-channel $\ell$, then
the outage probability (5.83) generalizes to
\[ p_{\mathrm{out}}(R) = \mathbb{P}\left\{\sum_{\ell=1}^{L}\log\left(1+|h_\ell|^2\mathsf{SNR}_\ell\right) < LR\right\}, \tag{5.84} \]
where $\mathsf{SNR}_\ell = P_\ell/N_0$. Exercise 5.17 shows that for the i.i.d. Rayleigh fading
model, a non-uniform power allocation that does not depend on the channel
gains cannot improve the outage performance.
The parallel channel is used to model time diversity, but it can model
frequency diversity as well. By using the usual OFDM transformation, a slow
frequency-selective fading channel can be converted into a set of parallel sub-channels,
one for each sub-carrier. This allows us to characterize the outage
capacity of such channels as well (Exercise 5.22).
We summarize the key idea in this section using more suggestive
language.
Summary 5.3 Outage for parallel channels
Outage probability for a parallel channel with $L$ sub-channels and the $\ell$th
channel having random gain $h_\ell$:
\[ p_{\mathrm{out}}(R) = \mathbb{P}\left\{\frac{1}{L}\sum_{\ell=1}^{L}\log\left(1+|h_\ell|^2\mathsf{SNR}\right) < R\right\}, \tag{5.85} \]
where $R$ is in bits/s/Hz per sub-channel.
The $\ell$th sub-channel allows $\log(1+|h_\ell|^2\mathsf{SNR})$ bits of information per symbol
through. Reliable decoding can be achieved as long as the total amount
of information allowed through exceeds the target rate.
5.4.5 Fast fading channel
In the slow fading scenario, the channel remains constant over the transmission
duration of the codeword. If the codeword length spans several coherence
periods, then time diversity is achieved and the outage probability improves.
When the codeword length spans many coherence periods, we are in the
so-called fast fading regime. How does one characterize the performance limit
of such a fast fading channel?
Capacity derivation
Let us first consider a very simple model of a fast fading channel:
\[ y[m] = h[m]x[m] + w[m], \tag{5.86} \]
where $h[m] = h_\ell$ remains constant over the $\ell$th coherence period of $T_c$ symbols
and is i.i.d. across different coherence periods. This is the so-called
block fading model; see Figure 5.19(a). Suppose coding is done over $L$ such
coherence periods. If $T_c \gg 1$, we can effectively model this as $L$ parallel
sub-channels that fade independently. The outage probability from (5.83) is
\[ p_{\mathrm{out}}(R) = \mathbb{P}\left\{\frac{1}{L}\sum_{\ell=1}^{L}\log\left(1+|h_\ell|^2\mathsf{SNR}\right) < R\right\}. \tag{5.87} \]
Figure 5.19 (a) Typical trajectory of the channel strength as a function of symbol time under a block fading model. (b) Typical trajectory of the channel strength after interleaving. One can equally think of these plots as rates of flow of information allowed through the channel over time. (Axes: $h[m]$ versus symbol time $m$, with coherence periods $\ell = 0, 1, 2, 3$ marked.)
For finite $L$, the quantity
\[ \frac{1}{L}\sum_{\ell=1}^{L}\log\left(1+|h_\ell|^2\mathsf{SNR}\right) \]
is random and there is a non-zero probability that it will drop below any
target rate $R$. Thus, there is no meaningful notion of capacity in the sense of
maximum rate of arbitrarily reliable communication and we have to resort to
the notion of outage. However, as $L \to \infty$, the law of large numbers says that
\[ \frac{1}{L}\sum_{\ell=1}^{L}\log\left(1+|h_\ell|^2\mathsf{SNR}\right) \to \mathbb{E}\left[\log\left(1+|h|^2\mathsf{SNR}\right)\right]. \tag{5.88} \]
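The convergence in (5.88) can be observed numerically. The sketch below is an illustration under stated assumptions rather than part of the original development: it assumes Rayleigh fading with $|h|^2$ exponentially distributed with unit mean, uses base-2 logarithms, and approximates the expectation with a simple midpoint rule. The function names are hypothetical.

```python
import math
import random

def average_rate(snr, num_periods, seed=0):
    """Empirical average (1/L) * sum_l log2(1 + |h_l|^2 SNR) over L coherence periods."""
    rng = random.Random(seed)
    total = sum(math.log2(1.0 + rng.expovariate(1.0) * snr)
                for _ in range(num_periods))
    return total / num_periods

def ergodic_rate(snr, upper=50.0, steps=200000):
    """Midpoint-rule approximation of E[log2(1 + |h|^2 SNR)] for |h|^2 ~ Exp(1)."""
    dx = upper / steps
    acc = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        acc += math.log2(1.0 + x * snr) * math.exp(-x) * dx
    return acc

if __name__ == "__main__":
    snr = 10.0  # linear scale
    for L in (1, 10, 100, 10000):
        print(L, average_rate(snr, L))
    print("expectation:", ergodic_rate(snr))
```

For small $L$ the printed average fluctuates from seed to seed; for large $L$ it settles near the expectation.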
Now we can average over many independent fades of the channel by coding


The wireless channel

2 The wireless channel
A good understanding of the wireless channel, its key physical parameters
and the modeling issues, lays the foundation for the rest of the book. This is
the goal of this chapter.
A defining characteristic of the mobile wireless channel is the variations
of the channel strength over time and over frequency. The variations can be
roughly divided into two types (Figure 2.1):
• Large-scale fading, due to path loss of signal as a function of distance
and shadowing by large objects such as buildings and hills. This occurs as
the mobile moves through a distance of the order of the cell size, and is
typically frequency independent.
• Small-scale fading, due to the constructive and destructive interference of the
multiple signal paths between the transmitter and receiver. This occurs at the
spatial scale of the order of the carrier wavelength, and is frequency dependent.
We will talk about both types of fading in this chapter, but with more
emphasis on the latter. Large-scale fading is more relevant to issues such as
cell-site planning. Small-scale multipath fading is more relevant to the design
of reliable and efficient communication systems – the focus of this book.
We start with the physical modeling of the wireless channel in terms of electromagnetic
waves. We then derive an input/output linear time-varying model
for the channel, and define some important physical parameters. Finally, we
introduce a few statistical models of the channel variation over time and over
frequency.
2.1 Physical modeling for wireless channels
Wireless channels operate through electromagnetic radiation from the transmitter
to the receiver. In principle, one could solve the electromagnetic
field equations, in conjunction with the transmitted signal, to find the
electromagnetic field impinging on the receiver antenna. This would have to
be done taking into account the obstructions caused by ground, buildings,
vehicles, etc. in the vicinity of this electromagnetic wave.1

Figure 2.1 Channel quality varies over multiple time-scales. At a slow scale, channel varies due to large-scale fading effects. At a fast scale, channel varies due to multipath effects. (Axes: channel quality versus time.)
Cellular communication in the USA is limited by the Federal Communications
Commission (FCC), and by similar authorities in other countries,
to one of three frequency bands, one around 0.9 GHz, one around 1.9 GHz,
and one around 5.8 GHz. The wavelength $\lambda$ of electromagnetic radiation at
any given frequency $f$ is given by $\lambda = c/f$, where $c = 3\times 10^8$ m/s is the
speed of light. The wavelength in these cellular bands is thus a fraction of a
meter, so to calculate the electromagnetic field at a receiver, the locations of
the receiver and the obstructions would have to be known within sub-meter
accuracies. The electromagnetic field equations are therefore too complex to
solve, especially on the fly for mobile users. Thus, we have to ask what we
really need to know about these channels, and what approximations might be
reasonable.
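As a quick check of the claim that the wavelength is a fraction of a meter in each band, the following short sketch evaluates $\lambda = c/f$ for the three carrier frequencies mentioned above. The function name `wavelength` is ours, and $c$ is the rounded value used in the text.

```python
SPEED_OF_LIGHT = 3e8  # m/s, the rounded value used in the text

def wavelength(frequency_hz):
    """Wavelength in meters of radiation at the given frequency: lambda = c / f."""
    return SPEED_OF_LIGHT / frequency_hz

for f_ghz in (0.9, 1.9, 5.8):
    print(f"{f_ghz} GHz -> {wavelength(f_ghz * 1e9):.3f} m")
```

The three wavelengths come out at roughly 33 cm, 16 cm, and 5 cm, which is why obstruction locations would have to be known to sub-meter accuracy.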
One of the important questions is where to choose to place the base-stations,
and what range of power levels are then necessary on the downlink and uplink
channels. To some extent this question must be answered experimentally, but
it certainly helps to have a sense of what types of phenomena to expect.
Another major question is what types of modulation and detection techniques
look promising. Here again, we need a sense of what types of phenomena to
expect. To address this, we will construct stochastic models of the channel,
assuming that different channel behaviors appear with different probabilities,
and change over time (with specific stochastic properties). We will return to
the question of why such stochastic models are appropriate, but for now we
simply want to explore the gross characteristics of these channels. Let us start
by looking at several over-idealized examples.
1 By obstructions, we mean not only objects in the line-of-sight between transmitter and
receiver, but also objects in locations that cause non-negligible changes in the electromagnetic
field at the receiver; we shall see examples of such obstructions later.
2.1.1 Free space, fixed transmit and receive antennas
First consider a fixed antenna radiating into free space. In the far field,2 the
electric field and magnetic field at any given location are perpendicular both
to each other and to the direction of propagation from the antenna. They
are also proportional to each other, so it is sufficient to know only one of
them (just as in wired communication, where we view a signal as simply
a voltage waveform or a current waveform). In response to a transmitted
sinusoid $\cos 2\pi ft$, we can express the electric far field at time $t$ as
\[ E(f,t,(r,\theta,\psi)) = \frac{\alpha_s(\theta,\psi,f)\cos 2\pi f(t-r/c)}{r}. \tag{2.1} \]
Here, $(r,\theta,\psi)$ represents the point $u$ in space at which the electric field is
being measured, where $r$ is the distance from the transmit antenna to $u$ and
where $(\theta,\psi)$ represents the vertical and horizontal angles from the antenna
to $u$ respectively. The constant $c$ is the speed of light, and $\alpha_s(\theta,\psi,f)$ is the
radiation pattern of the sending antenna at frequency $f$ in the direction $(\theta,\psi)$;
it also contains a scaling factor to account for antenna losses. Note that the
phase of the field varies with $fr/c$, corresponding to the delay caused by the
radiation traveling at the speed of light.
We are not concerned here with actually finding the radiation pattern for
any given antenna, but only with recognizing that antennas have radiation
patterns, and that the free space far field behaves as above.
It is important to observe that, as the distance $r$ increases, the electric field
decreases as $r^{-1}$ and thus the power per square meter in the free space wave
decreases as $r^{-2}$. This is expected, since if we look at concentric spheres of
increasing radius $r$ around the antenna, the total power radiated through the
sphere remains constant, but the surface area increases as $r^2$. Thus, the power
per unit area must decrease as $r^{-2}$. We will see shortly that this $r^{-2}$ reduction
of power with distance is often not valid when there are obstructions to free
space propagation.
Next, suppose there is a fixed receive antenna at the location $u = (r,\theta,\psi)$.
The received waveform (in the absence of noise) in response to the above
transmitted sinusoid is then
\[ E_r(f,t,u) = \frac{\alpha(\theta,\psi,f)\cos 2\pi f(t-r/c)}{r}, \tag{2.2} \]
where $\alpha(\theta,\psi,f)$ is the product of the antenna patterns of transmit and receive
antennas in the given direction. Our approach to (2.2) is a bit odd since we
started with the free space field at $u$ in the absence of an antenna. Placing a
receive antenna there changes the electric field in the vicinity of $u$, but this
is taken into account by the antenna pattern of the receive antenna.
2 The far field is the field sufficiently far away from the antenna so that (2.1) is valid. For
cellular systems, it is a safe assumption that the receiver is in the far field.
Now suppose, for the given $u$, that we define
\[ H(f) := \frac{\alpha(\theta,\psi,f)e^{-j2\pi fr/c}}{r}. \tag{2.3} \]
We then have $E_r(f,t,u) = \Re\left[H(f)e^{j2\pi ft}\right]$. We have not mentioned it yet,
but (2.1) and (2.2) are both linear in the input. That is, the received field
(waveform) at $u$ in response to a weighted sum of transmitted waveforms is
simply the weighted sum of responses to those individual waveforms. Thus,
$H(f)$ is the system function for an LTI (linear time-invariant) channel, and its
inverse Fourier transform is the impulse response. The need for understanding
electromagnetism is to determine what this system function is. We will find in
what follows that linearity is a good assumption for all the wireless channels
we consider, but that the time invariance does not hold when either the
antennas or obstructions are in relative motion.
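The relation between the system function (2.3) and the received waveform (2.2) can be checked numerically. The sketch below assumes, for illustration only, that the antenna-pattern product $\alpha$ is a real constant (set to 1 by default); the function names are ours.

```python
import cmath
import math

C = 3e8  # speed of light in m/s (rounded value used in the text)

def system_function(f, r, alpha=1.0):
    """H(f) = alpha * exp(-j 2 pi f r / c) / r, as in (2.3)."""
    return alpha * cmath.exp(-2j * math.pi * f * r / C) / r

def received_field(f, t, r, alpha=1.0):
    """Direct evaluation of the received waveform (2.2)."""
    return alpha * math.cos(2 * math.pi * f * (t - r / C)) / r

def received_via_system_function(f, t, r, alpha=1.0):
    """Real part of H(f) * exp(j 2 pi f t); should agree with (2.2)."""
    return (system_function(f, r, alpha) * cmath.exp(2j * math.pi * f * t)).real

if __name__ == "__main__":
    f, t, r = 900e6, 1e-6, 100.0
    print(received_field(f, t, r), received_via_system_function(f, t, r))
```

The two printed values agree up to floating-point rounding, which is all that the identity above asserts.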
2.1.2 Free space, moving antenna
Next consider the fixed antenna and free space model above with a receive
antenna that is moving with speed v in the direction of increasing distance
from the transmit antenna. That is, we assume that the receive antenna is at
a moving location described as $u(t) = (r(t),\theta,\psi)$ with $r(t) = r_0 + vt$. Using
(2.1) to describe the free space electric field at the moving point $u(t)$ (for the
moment with no receive antenna), we have
\[ E(f,t,(r_0+vt,\theta,\psi)) = \frac{\alpha_s(\theta,\psi,f)\cos 2\pi f(t-r_0/c-vt/c)}{r_0+vt}. \tag{2.4} \]
Note that we can rewrite $f(t-r_0/c-vt/c)$ as $f(1-v/c)t - fr_0/c$. Thus,
the sinusoid at frequency $f$ has been converted to a sinusoid of frequency
$f(1-v/c)$; there has been a Doppler shift of $-fv/c$ due to the motion of
the observation point.3 Intuitively, each successive crest in the transmitted
sinusoid has to travel a little further before it gets observed at the moving
observation point. If the antenna is now placed at $u(t)$, and the change of
field due to the antenna presence is again represented by the receive antenna
pattern, the received waveform, in analogy to (2.2), is
\[ E_r(f,t,(r_0+vt,\theta,\psi)) = \frac{\alpha(\theta,\psi,f)\cos 2\pi f\left[(1-v/c)t-r_0/c\right]}{r_0+vt}. \tag{2.5} \]
3 The reader should be familiar with the Doppler shift associated with moving cars. When an
ambulance is rapidly moving toward us we hear a higher frequency siren. When it passes us
we hear a rapid shift toward a lower frequency.
This channel cannot be represented as an LTI channel. If we ignore the time-varying
attenuation in the denominator of (2.5), however, we can represent the
channel in terms of a system function followed by translating the frequency $f$
by the Doppler shift $-fv/c$. It is important to observe that the amount of shift
depends on the frequency $f$. We will come back to discussing the importance
of this Doppler shift and of the time-varying attenuation after considering the
next example.
The above analysis does not depend on whether it is the transmitter or
the receiver (or both) that are moving. So long as $r(t)$ is interpreted as the
distance between the antennas (and the relative orientations of the antennas
are constant), (2.4) and (2.5) are valid.
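To get a feel for the size of the Doppler shift $-fv/c$, the following sketch evaluates it for a few carrier frequencies. The speed of 60 km/h is an illustrative choice, and the function names are ours.

```python
C = 3e8  # speed of light in m/s (rounded value used in the text)

def doppler_shift(f, v):
    """Doppler shift -f v / c (Hz) seen by a receiver moving away at speed v (m/s)."""
    return -f * v / C

def received_frequency(f, v):
    """Frequency f (1 - v/c) of the sinusoid observed at the moving point."""
    return f * (1 - v / C)

if __name__ == "__main__":
    v = 60 / 3.6  # 60 km/h expressed in m/s
    for f in (0.9e9, 1.9e9, 5.8e9):
        print(f"{f / 1e9} GHz: shift {doppler_shift(f, v):.1f} Hz")
```

The shift grows linearly with $f$, which is the frequency dependence noted above; at 900 MHz and 60 km/h it is about $-50$ Hz.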
2.1.3 Reflecting wall, fixed antenna
Consider Figure 2.2 in which there is a fixed antenna transmitting the sinusoid
$\cos 2\pi ft$, a fixed receive antenna, and a single perfectly reflecting large fixed
wall. We assume that in the absence of the receive antenna, the electromagnetic
field at the point where the receive antenna will be placed is the sum of
the free space field coming from the transmit antenna plus a reflected wave
coming from the wall. As before, in the presence of the receive antenna, the
perturbation of the field due to the antenna is represented by the antenna pattern.
An additional assumption here is that the presence of the receive antenna does
not appreciably affect the plane wave impinging on the wall. In essence, what
we have done here is to approximate the solution of Maxwell’s equations by a
method called ray tracing. The assumption here is that the received waveform
can be approximated by the sum of the free space wave from the transmitter plus
the reflected free space waves from each of the reflecting obstacles.
In the present situation, if we assume that the wall is very large, the reflected
wave at a given point is the same (except for a sign change4) as the free space
wave that would exist on the opposite side of the wall if the wall were not present
(see Figure 2.3). This means that the reflected wave from the wall has the intensity
of a free space wave at a distance equal to the distance to the wall and then
back to the receive antenna, i.e., $2d-r$. Using (2.2) for both the direct and the
reflected wave, and assuming the same antenna gain $\alpha$ for both waves, we get
\[ E_r(f,t) = \frac{\alpha\cos 2\pi f(t-r/c)}{r} - \frac{\alpha\cos 2\pi f\left(t-(2d-r)/c\right)}{2d-r}. \tag{2.6} \]
Figure 2.2 Illustration of a direct path and a reflected path. (Labels: transmit antenna, receive antenna, wall; distances $r$ and $d$.)
Figure 2.3 Relation of reflected wave to wave without wall. (Labels: transmit antenna, wall.)
4 By basic electromagnetics, this sign change is a consequence of the fact that the electric field is
parallel to the plane of the wall for this example.
The received signal is a superposition of two waves, both of frequency $f$.
The phase difference between the two waves is
\[ \Delta\theta = \left(\frac{2\pi f(2d-r)}{c} + \pi\right) - \left(\frac{2\pi fr}{c}\right) = \frac{4\pi f}{c}(d-r) + \pi. \tag{2.7} \]
When the phase difference is an integer multiple of $2\pi$, the two waves add
constructively, and the received signal is strong. When the phase difference
is an odd integer multiple of $\pi$, the two waves add destructively, and the
received signal is weak. As a function of $r$, this translates into a spatial pattern
of constructive and destructive interference of the waves. The distance from
a peak to a valley is called the coherence distance:
\[ \Delta x_c := \frac{\lambda}{4}, \tag{2.8} \]
where $\lambda := c/f$ is the wavelength of the transmitted sinusoid. At distances
much smaller than $\Delta x_c$, the received signal at a particular time does not
change appreciably.
The constructive and destructive interference pattern also depends on the
frequency $f$: for a fixed $r$, if $f$ changes by
\[ \frac{1}{2}\left(\frac{2d-r}{c} - \frac{r}{c}\right)^{-1} \tag{2.9} \]
we move from a peak to a valley. The quantity
\[ T_d := \frac{2d-r}{c} - \frac{r}{c} \tag{2.10} \]
is called the delay spread of the channel: it is the difference between the propagation
delays along the two signal paths. The constructive and destructive interference
pattern does not change appreciably if the frequency changes by an amount
much smaller than $1/T_d$. This parameter is called the coherence bandwidth.
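The quantities in (2.7) to (2.10) are easy to evaluate for a concrete geometry. The sketch below uses hypothetical values ($f = 900$ MHz, $r = 500$ m, $d = 1000$ m) chosen purely for illustration, and checks that changing $f$ by the amount in (2.9) changes the phase difference (2.7) by $\pi$, i.e. moves from a peak to a valley. Function names are ours.

```python
import math

C = 3e8  # speed of light in m/s (rounded value used in the text)

def phase_difference(f, r, d):
    """Phase difference (2.7) between reflected and direct waves, in radians."""
    return 4 * math.pi * f * (d - r) / C + math.pi

def coherence_distance(f):
    """Peak-to-valley distance (2.8): a quarter of the wavelength c / f."""
    return (C / f) / 4

def delay_spread(r, d):
    """Delay spread (2.10): difference of the two propagation delays."""
    return (2 * d - r) / C - r / C

def peak_to_valley_frequency_change(r, d):
    """Frequency change (2.9) that moves the receiver from a peak to a valley."""
    return 1 / (2 * delay_spread(r, d))

if __name__ == "__main__":
    f, r, d = 900e6, 500.0, 1000.0  # hypothetical example values
    print("coherence distance (m):", coherence_distance(f))
    print("delay spread (s):", delay_spread(r, d))
    print("frequency change (Hz):", peak_to_valley_frequency_change(r, d))
```

With these example values the delay spread is about 3.3 µs and the peak-to-valley frequency change is about 150 kHz, while the coherence distance is about 8 cm.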
2.1.4 Reflecting wall, moving antenna
Suppose the receive antenna is now moving at a velocity v (Figure 2.4). As it
moves through the pattern of constructive and destructive interference created
by the two waves, the strength of the received signal increases and decreases.
This is the phenomenon of multipath fading. The time taken to travel from a
peak to a valley is $c/(4fv)$: this is the time-scale at which the fading occurs,
and it is called the coherence time of the channel.
An equivalent way of seeing this is in terms of the Doppler shifts of the
direct and the reflected waves. Suppose the receive antenna is at location $r_0$
at time 0. Taking $r = r_0 + vt$ in (2.6), we get
\[ E_r(f,t) = \frac{\alpha\cos 2\pi f\left[(1-v/c)t - r_0/c\right]}{r_0+vt} - \frac{\alpha\cos 2\pi f\left[(1+v/c)t + (r_0-2d)/c\right]}{2d-r_0-vt}. \tag{2.11} \]
The first term, the direct wave, is a sinusoid at frequency $f(1-v/c)$, experiencing
a Doppler shift $D_1 := -fv/c$. The second is a sinusoid at frequency
$f(1+v/c)$, with a Doppler shift $D_2 := +fv/c$. The parameter
\[ D_s := D_2 - D_1 \tag{2.12} \]
is called the Doppler spread. For example, if the mobile is moving at 60 km/h
and $f = 900$ MHz, the Doppler spread is 100 Hz. The role of the Doppler
spread can be visualized most easily when the mobile is much closer to the
wall than to the transmit antenna. In this case the attenuations are roughly the
same for both paths, and we can approximate the denominator of the second
term by $r = r_0 + vt$. Then, combining the two sinusoids, we get
\[ E_r(f,t) \approx \frac{2\alpha\sin 2\pi f\left[vt/c + (r_0-d)/c\right]\,\sin 2\pi f\left[t - d/c\right]}{r_0+vt}. \tag{2.13} \]
This is the product of two sinusoids, one at the input frequency $f$, which is typically
of the order of GHz, and the other one at $fv/c = D_s/2$, which might be of
the order of 50 Hz. Thus, the response to a sinusoid at $f$ is another sinusoid at
$f$ with a time-varying envelope, with peaks going to zeros around every 5 ms
(Figure 2.5). The envelope is at its widest when the mobile is at a peak of the
interference pattern and at its narrowest when the mobile is at a valley. Thus,
the Doppler spread determines the rate of traversal across the interference
pattern and is inversely proportional to the coherence time of the channel.
Figure 2.4 Illustration of a direct path and a reflected path. (Labels: transmit antenna, wall; distances $r(t)$ and $d$; velocity $v$.)
Figure 2.5 The received waveform oscillating at frequency $f$ with a slowly varying envelope at frequency $D_s/2$. (Axes: $E_r(t)$ versus $t$.)
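The numbers quoted in this example (a Doppler spread of 100 Hz and peaks going to zeros roughly every 5 ms at 60 km/h and 900 MHz) can be reproduced directly from $D_s = 2fv/c$ and the coherence time $c/(4fv)$. The sketch below does this; the function names are ours.

```python
C = 3e8  # speed of light in m/s (rounded value used in the text)

def doppler_spread(f, v):
    """D_s = D_2 - D_1 = 2 f v / c for the reflecting-wall example, in Hz."""
    return 2 * f * v / C

def coherence_time(f, v):
    """Time c / (4 f v) to move from a peak to a valley of the pattern, in s."""
    return C / (4 * f * v)

if __name__ == "__main__":
    f = 900e6      # carrier frequency in Hz
    v = 60 / 3.6   # 60 km/h expressed in m/s
    print("Doppler spread (Hz):", doppler_spread(f, v))
    print("coherence time (ms):", 1e3 * coherence_time(f, v))
```

In this two-path example the coherence time equals $1/(2D_s)$, which makes the inverse proportionality explicit.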
We now see why we have partially ignored the denominator terms in (2.11)
and (2.13). When the difference in the length between two paths changes by
a quarter wavelength, the phase difference between the responses on the two
paths changes by $\pi/2$, which causes a very significant change in the overall
received amplitude. Since the carrier wavelength is very small relative to
the path lengths, the time over which this phase effect causes a significant
change is far smaller than the time over which the denominator terms cause
a significant change. The effect of the phase changes is of the order of
milliseconds, whereas the effect of changes in the denominator is of the order
of seconds or minutes. In terms of modulation and detection, the time-scales
of interest are in the range of milliseconds and less, and the denominators are
effectively constant over these periods.
The reader might notice that we are constantly making approximations in
trying to understand wireless communication, much more so than for wired
communication. This is partly because wired channels are typically time-invariant
over a very long time-scale, while wireless channels are typically
time-varying, and appropriate models depend very much on the time-scales of
interest. For wireless systems, the most important issue is what approximations
to make. Thus, it is important to understand these modeling issues thoroughly.
2.1.5 Reflection from a ground plane
Consider a transmit and a receive antenna, both above a plane surface such
as a road (Figure 2.6). When the horizontal distance $r$ between the antennas
becomes very large relative to their vertical displacements from the ground
plane (i.e., height), a very surprising thing happens. In particular, the difference
between the direct path length and the reflected path length goes to zero
as $r^{-1}$ with increasing $r$ (Exercise 2.5). When $r$ is large enough, this difference
between the path lengths becomes small relative to the wavelength $c/f$. Since
the sign of the electric field is reversed on the reflected path5, these two waves
start to cancel each other out. The electric wave at the receiver is then attenuated
as $r^{-2}$, and the received power decreases as $r^{-4}$. This situation is particularly
important in rural areas where base-stations tend to be placed on roads.
Figure 2.6 Illustration of a direct path and a reflected path off a ground plane. (Labels: transmit antenna at height $h_s$, receive antenna at height $h_r$, ground plane, horizontal distance $r$, path lengths $r_1$ and $r_2$.)
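The $r^{-1}$ behavior of the path-length difference can be seen numerically. The sketch below assumes a flat ground plane and uses the standard image construction: the reflected path has the length of a straight line from the transmit antenna to the mirror image of the receive antenna below the ground. The antenna heights (30 m and 1.5 m) are illustrative values, and the function name is ours.

```python
import math

def path_difference(r, hs, hr):
    """Reflected minus direct path length for antenna heights hs, hr at horizontal distance r."""
    direct = math.hypot(r, hs - hr)
    # Image method: reflect the receive antenna through the ground plane.
    reflected = math.hypot(r, hs + hr)
    return reflected - direct

if __name__ == "__main__":
    hs, hr = 30.0, 1.5  # hypothetical antenna heights in meters
    for r in (100.0, 1000.0, 10000.0):
        print(r, path_difference(r, hs, hr), 2 * hs * hr / r)
```

For large $r$ the difference is well approximated by $2h_sh_r/r$, so doubling the distance halves it.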
2.1.6 Power decay with distance and shadowing
The previous example with reflection from a ground plane suggests that the
received power can decrease with distance faster than $r^{-2}$ in the presence of
disturbances to free space. In practice, there are several obstacles between
the transmitter and the receiver and, further, the obstacles might also absorb
some power while scattering the rest. Thus, one expects the power decay to
be considerably faster than $r^{-2}$. Indeed, empirical evidence from experimental
field studies suggests that while power decay near the transmitter is like $r^{-2}$,
at large distances the power can even decay exponentially with distance.
The ray tracing approach used so far provides a high degree of numerical
accuracy in determining the electric field at the receiver, but requires a precise
physical model including the location of the obstacles. But here, we are only
looking for the order of decay of power with distance and can consider an
alternative approach. So we look for a model of the physical environment with
the fewest parameters but one that still provides useful global information
about the field properties. A simple probabilistic model with two parameters
of the physical environment, the density of the obstacles and the fraction of
energy each object absorbs, is developed in Exercise 2.6. With each obstacle
absorbing the same fraction of the energy impinging on it, the model allows
us to show that the power decays exponentially in distance at a rate that is
proportional to the density of the obstacles.
5 This is clearly true if the electric field is parallel to the ground plane. It turns out that this is
also true for arbitrary orientations of the electric field, as long as the ground is not a perfect
conductor and the angle of incidence is small enough. The underlying electromagnetics is
analyzed in Chapter 2 of Jakes [62].
With a limit on the transmit power (either at the base-station or at the
mobile), the largest distance between the base-station and a mobile at which
communication can reliably take place is called the coverage of the cell. For
reliable communication, a minimal received power level has to be met and
thus the fast decay of power with distance constrains cell coverage. On the
other hand, rapid signal attenuation with distance is also helpful; it reduces the
interference between adjacent cells. As cellular systems become more popular,
however, the major determinant of cell size is the number of mobiles in the
cell. In engineering jargon, the cell is said to be capacity limited instead of
coverage limited. The size of cells has been steadily decreasing, and one talks
of micro cells and pico cells as a response to this effect. With capacity limited
cells, the inter-cell interference may be intolerably high. To alleviate the
inter-cell interference, neighboring cells use different parts of the frequency
spectrum, and frequency is reused at cells that are far enough. Rapid signal
attenuation with distance allows frequencies to be reused at closer distances.
The density of obstacles between the transmit and receive antennas depends
very much on the physical environment. For example, outdoor plains have
very little by way of obstacles while indoor environments pose many obstacles.
This randomness in the environment is captured by modeling the density
of obstacles and their absorption behavior as random numbers; the overall
phenomenon is called shadowing.6 The effect of shadow fading differs from
multipath fading in an important way. The duration of a shadow fade lasts for
multiple seconds or minutes, and hence occurs at a much slower time-scale
compared to multipath fading.
2.1.7 Moving antenna, multiple reflectors
Dealing with multiple reflectors, using the technique of ray tracing, is in principle
simply a matter of modeling the received waveform as the sum of the responses
from the different paths rather than just two paths. We have seen enough examples,
however, to understand that finding the magnitudes and phases of these
responses is no simple task. Even for the very simple large wall example in
Figure 2.2, the reflected field calculated in (2.6) is valid only at distances from
the wall that are small relative to the dimensions of the wall. At very large distances,
the total power reflected from the wall is proportional to both $d^{-2}$ and
to the area of the cross section of the wall. The power reaching the receiver is
proportional to $(d-r(t))^{-2}$. Thus, the power attenuation from transmitter to
receiver (for the large distance case) is proportional to $(d(d-r(t)))^{-2}$ rather
than to $(2d-r(t))^{-2}$. This shows that ray tracing must be used with some
caution. Fortunately, however, linearity still holds in these more complex cases.
6 This is called shadowing because it is similar to the effect of clouds partly blocking sunlight.
Another type of reflection is known as scattering and can occur in the
atmosphere or in reflections from very rough objects. Here there are a very
large number of individual paths, and the received waveform is better modeled
as an integral over paths with infinitesimally small differences in their lengths,
rather than as a sum.
Knowing how to find the amplitude of the reflected field from each type
of reflector is helpful in determining the coverage of a base-station (although
ultimately experimentation is necessary). This is an important topic if our
objective is trying to determine where to place base-stations. Studying this in
more depth, however, would take us afield and too far into electromagnetic
theory. In addition, we are primarily interested in questions of modulation,
detection, multiple access, and network protocols rather than location of
base-stations. Thus, we turn our attention to understanding the nature of the
aggregate received waveform, given a representation for each reflected wave.
This leads to modeling the input/output behavior of a channel rather than the
detailed response on each path.
2.2 Input/output model of the wireless channel
We derive an input/output model in this section. We first show that the multipath
effects can be modeled as a linear time-varying system. We then obtain
a baseband representation of this model. The continuous-time channel is then
sampled to obtain a discrete-time model. Finally we incorporate additive noise.
2.2.1 The wireless channel as a linear time-varying system
In the previous section we focused on the response to the sinusoidal input
$\phi(t) = \cos 2\pi ft$. The received signal can be written as $\sum_i a_i(f,t)\phi(t-\tau_i(f,t))$,
where $a_i(f,t)$ and $\tau_i(f,t)$ are respectively the overall attenuation and propagation
delay at time $t$ from the transmitter to the receiver on path $i$. The
overall attenuation is simply the product of the attenuation factors due to the
antenna pattern of the transmitter and the receiver, the nature of the reflector,
as well as a factor that is a function of the distance from the transmitting
antenna to the reflector and from the reflector to the receive antenna. We have
described the channel effect at a particular frequency $f$. If we further assume
that the $a_i(f,t)$ and the $\tau_i(f,t)$ do not depend on the frequency $f$, then we
can use the principle of superposition to generalize the above input/output
relation to an arbitrary input $x(t)$ with non-zero bandwidth:
\[ y(t) = \sum_i a_i(t)x(t-\tau_i(t)). \tag{2.14} \]
In practice the attenuations and the propagation delays are usually slowly
varying functions of frequency. These variations follow from the time-varying
path lengths and also from frequency-dependent antenna gains. However, we
are primarily interested in transmitting over bands that are narrow relative
to the carrier frequency, and over such ranges we can omit this frequency
dependence. It should however be noted that although the individual attenuations
and delays are assumed to be independent of the frequency, the overall
channel response can still vary with frequency due to the fact that different
paths have different delays.
For the example of a perfectly reflecting wall in Figure 2.4, then,
$$a_1(t) = \frac{|\alpha|}{r_0+vt}, \qquad a_2(t) = \frac{|\alpha|}{2d-r_0-vt}, \tag{2.15}$$
$$\tau_1(t) = \frac{r_0+vt}{c} - \frac{\angle\phi_1}{2\pi f}, \qquad \tau_2(t) = \frac{2d-r_0-vt}{c} - \frac{\angle\phi_2}{2\pi f}, \tag{2.16}$$
where the first expression is for the direct path and the second for the reflected
path. The term $\angle\phi_j$ here is to account for possible phase changes at the
transmitter, reflector, and receiver. For the example here, there is a phase
reversal at the reflector so we take $\phi_1 = 0$ and $\phi_2 = \pi$.
Since the channel (2.14) is linear, it can be described by the response
$h(\tau,t)$ at time $t$ to an impulse transmitted at time $t-\tau$. In terms of
$h(\tau,t)$, the input/output relationship is given by
$$y(t) = \int_{-\infty}^{\infty} h(\tau,t)\, x(t-\tau)\, d\tau. \tag{2.17}$$
Comparing (2.17) and (2.14), we see that the impulse response for the fading
multipath channel is
$$h(\tau,t) = \sum_i a_i(t)\, \delta(\tau-\tau_i(t)). \tag{2.18}$$
This expression is really quite nice. It says that the effect of mobile users,
arbitrarily moving reflectors and absorbers, and all of the complexities of solving
Maxwell’s equations, finally reduce to an input/output relation between
transmit and receive antennas which is simply represented as the impulse
response of a linear time-varying channel filter.
The effect of the Doppler shift is not immediately evident in this representation.
From (2.16) for the single reflecting wall example, $\tau_i'(t) = v_i/c$,
where $v_i$ is the velocity with which the $i$th path length is increasing. Thus,
the Doppler shift on the $i$th path is $-f\tau_i'(t)$.
In the special case when the transmitter, receiver and the environment
are all stationary, the attenuations $a_i(t)$ and propagation delays $\tau_i(t)$ do not
depend on time $t$, and we have the usual linear time-invariant channel with
an impulse response
$$h(\tau) = \sum_i a_i\, \delta(\tau-\tau_i). \tag{2.19}$$
For the time-varying impulse response $h(\tau,t)$, we can define a time-varying
frequency response
$$H(f;t) := \int_{-\infty}^{\infty} h(\tau,t)\, e^{-j2\pi f\tau}\, d\tau = \sum_i a_i(t)\, e^{-j2\pi f\tau_i(t)}. \tag{2.20}$$
In the special case when the channel is time-invariant, this reduces to the
usual frequency response. One way of interpreting $H(f;t)$ is to think of the
system as a slowly varying function of $t$ with a frequency response $H(f;t)$
at each fixed time $t$. Correspondingly, $h(\tau,t)$ can be thought of as the impulse
response of the system at a fixed time t. This is a legitimate and useful
way of thinking about many multipath fading channels, as the time-scale
at which the channel varies is typically much longer than the delay spread
(i.e., the amount of memory) of the impulse response at a fixed time. In the
reflecting wall example in Section 2.1.4, the time taken for the channel to
change significantly is of the order of milliseconds while the delay spread is
of the order of microseconds. Fading channels which have this characteristic
are sometimes called underspread channels.
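As a rough numerical illustration of (2.20), the sketch below evaluates the frequency response of the two-path reflecting wall example at a fixed time. The function names, the unit value of alpha, and the geometry used in the usage lines are illustrative assumptions, not values taken from the text; the phase reversal at the wall is carried as a sign on the second path gain rather than folded into its delay.

```python
import cmath
import math

C = 3.0e8  # speed of light in m/s


def wall_paths(t, r0, d, v, alpha=1.0):
    """Return [(a_1, tau_1), (a_2, tau_2)] for the direct and reflected paths.

    Follows (2.15)-(2.16); the phase reversal at the wall (phi_2 = pi)
    appears here as a negative sign on a_2 (an equivalent, assumed choice).
    """
    a1 = alpha / (r0 + v * t)
    a2 = -alpha / (2.0 * d - r0 - v * t)
    tau1 = (r0 + v * t) / C
    tau2 = (2.0 * d - r0 - v * t) / C
    return [(a1, tau1), (a2, tau2)]


def freq_response(paths, f):
    """H(f; t) = sum_i a_i(t) exp(-j 2 pi f tau_i(t)) for paths frozen at time t."""
    return sum(a * cmath.exp(-2j * math.pi * f * tau) for a, tau in paths)


# Hypothetical geometry: wall at d = 1000 m, receiver at r0 = 900 m, stationary.
paths = wall_paths(t=0.0, r0=900.0, d=1000.0, v=0.0)
delay_spread = paths[1][1] - paths[0][1]          # (2d - 2 r0) / c
gains = [abs(freq_response(paths, 900e6 + k * 250e3)) for k in range(8)]
```

Sweeping `f` in this way shows the gain rising and falling as the two paths move in and out of phase, which is the frequency dependence that the text attributes to different paths having different delays.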
2.2.2 Baseband equivalent model
In typical wireless applications, communication occurs in a passband
$[f_c - W/2,\ f_c + W/2]$ of bandwidth $W$ around a center frequency $f_c$, the
spectrum having been specified by regulatory authorities. However, most
of the processing, such as coding/decoding, modulation/demodulation,
synchronization, etc., is actually done at the baseband. At the transmitter, the
last stage of the operation is to “up-convert” the signal to the carrier frequency
and transmit it via the antenna. Similarly, the first step at the receiver is to
“down-convert” the RF (radio-frequency) signal to the baseband before further
processing. Therefore from a communication system design point of view, it
is most useful to have a baseband equivalent representation of the system.
We first start with defining the baseband equivalent representation of signals.
Consider a real signal $s(t)$ with Fourier transform $S(f)$, band-limited in
$[f_c - W/2,\ f_c + W/2]$ with $W < 2f_c$. Define its complex baseband equivalent
$s_b(t)$ as the signal having Fourier transform:
$$S_b(f) = \begin{cases} \sqrt{2}\, S(f+f_c) & f+f_c > 0, \\ 0 & f+f_c \le 0. \end{cases} \tag{2.21}$$
Figure 2.7 Illustration of the relationship between a passband spectrum S(f) and its baseband equivalent Sb(f).
Since $s(t)$ is real, its Fourier transform satisfies $S(f) = S^*(-f)$, which means
that $s_b(t)$ contains exactly the same information as $s(t)$. The factor of
$\sqrt{2}$ is quite arbitrary but chosen to normalize the energies of $s_b(t)$ and
$s(t)$ to be the same. Note that $s_b(t)$ is band-limited in $[-W/2,\ W/2]$. See Figure 2.7.
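To reconstruct $s(t)$ from $s_b(t)$, we observe that
$$\sqrt{2}\, S(f) = S_b(f-f_c) + S_b^*(-f-f_c). \tag{2.22}$$
Taking inverse Fourier transforms, we get
$$s(t) = \frac{1}{\sqrt{2}} \left\{ s_b(t)\, e^{j2\pi f_c t} + s_b^*(t)\, e^{-j2\pi f_c t} \right\} = \sqrt{2}\, \Re\left[ s_b(t)\, e^{j2\pi f_c t} \right]. \tag{2.23}$$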
In terms of real signals, the relationship between $s(t)$ and $s_b(t)$ is
shown in Figure 2.8. The passband signal $s(t)$ is obtained by modulating
$\Re[s_b(t)]$ by $\sqrt{2}\cos 2\pi f_c t$ and $\Im[s_b(t)]$ by $-\sqrt{2}\sin 2\pi f_c t$
and summing, to get $\sqrt{2}\,\Re[s_b(t)\, e^{j2\pi f_c t}]$ (up-conversion). The
baseband signal $\Re[s_b(t)]$ (respectively $\Im[s_b(t)]$) is obtained by modulating
$s(t)$ by $\sqrt{2}\cos 2\pi f_c t$ (respectively $-\sqrt{2}\sin 2\pi f_c t$) followed
by ideal low-pass filtering at the baseband $[-W/2,\ W/2]$ (down-conversion).
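The two descriptions of up-conversion, the complex form in (2.23) and the in-phase/quadrature mixer form of Figure 2.8, can be checked against each other numerically. The following is a minimal sketch with hypothetical function names and arbitrary test values; it covers only the up-conversion step and does not model the low-pass filters used in down-conversion.

```python
import cmath
import math


def upconvert(sb, fc, t):
    """Passband sample s(t) = sqrt(2) * Re[s_b(t) exp(j 2 pi fc t)], as in (2.23)."""
    return math.sqrt(2.0) * (sb * cmath.exp(2j * math.pi * fc * t)).real


def upconvert_iq(sb, fc, t):
    """Same sample built from the two mixer branches of Figure 2.8."""
    i_branch = sb.real * math.sqrt(2.0) * math.cos(2.0 * math.pi * fc * t)
    q_branch = sb.imag * (-math.sqrt(2.0)) * math.sin(2.0 * math.pi * fc * t)
    return i_branch + q_branch


# Arbitrary baseband value and carrier, chosen only for illustration.
sb_value = 0.7 - 0.2j
errors = [abs(upconvert(sb_value, 1.0e3, k * 1.0e-4) -
              upconvert_iq(sb_value, 1.0e3, k * 1.0e-4)) for k in range(50)]
```

The two functions agree to floating-point precision because Re[(I + jQ)(cos + j sin)] = I cos - Q sin, which is why the quadrature branch carries the minus sign.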
Let us now go back to the multipath fading channel (2.14) with impulse
response given by (2.18). Let $x_b(t)$ and $y_b(t)$ be the complex baseband
equivalents of the transmitted signal $x(t)$ and the received signal $y(t)$,
respectively. Figure 2.9 shows the system diagram from $x_b(t)$ to $y_b(t)$. This
implementation of a passband communication system is known as quadrature
amplitude modulation (QAM). The signal $\Re[x_b(t)]$ is sometimes called the
in-phase component I and $\Im[x_b(t)]$ the quadrature component Q (rotated
by $\pi/2$).
Figure 2.8 Illustration of upconversion from sb(t) to s(t), followed by downconversion from s(t) back to sb(t).
Figure 2.9 System diagram from the baseband transmitted signal xb(t) to the baseband received signal yb(t).
We now calculate the baseband equivalent channel. Substituting
$x(t) = \sqrt{2}\,\Re[x_b(t)\, e^{j2\pi f_c t}]$ and $y(t) = \sqrt{2}\,\Re[y_b(t)\, e^{j2\pi f_c t}]$
into (2.14) we get
$$\Re\left[ y_b(t)\, e^{j2\pi f_c t} \right] = \sum_i a_i(t)\, \Re\left[ x_b(t-\tau_i(t))\, e^{j2\pi f_c (t-\tau_i(t))} \right] = \Re\left[ \left\{ \sum_i a_i(t)\, x_b(t-\tau_i(t))\, e^{-j2\pi f_c \tau_i(t)} \right\} e^{j2\pi f_c t} \right]. \tag{2.24}$$
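Similarly, one can obtain (Exercise 2.13)
$$\Im\left[ y_b(t)\, e^{j2\pi f_c t} \right] = \Im\left[ \left\{ \sum_i a_i(t)\, x_b(t-\tau_i(t))\, e^{-j2\pi f_c \tau_i(t)} \right\} e^{j2\pi f_c t} \right]. \tag{2.25}$$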
Hence, the baseband equivalent channel is
$$y_b(t) = \sum_i a_i^b(t)\, x_b(t-\tau_i(t)), \tag{2.26}$$
where
$$a_i^b(t) := a_i(t)\, e^{-j2\pi f_c \tau_i(t)}. \tag{2.27}$$
The input/output relationship in (2.26) is also that of a linear time-varying
system, and the baseband equivalent impulse response is
$$h_b(\tau,t) = \sum_i a_i^b(t)\, \delta(\tau-\tau_i(t)). \tag{2.28}$$
This representation is easy to interpret in the time domain, where the effect
of the carrier frequency can be seen explicitly. The baseband output is the
sum, over each path, of the delayed replicas of the baseband input. The
magnitude of the ith such term is the magnitude of the response on the given
path; this changes slowly, with significant changes occurring on the order of
seconds or more. The phase is changed by $\pi/2$ (i.e., is changed significantly)
when the delay on the path changes by $1/(4f_c)$, or equivalently, when the
path length changes by a quarter wavelength, i.e., by $c/(4f_c)$. If the path
length is changing at velocity $v$, the time required for such a phase change
is $c/(4f_c v)$. Recalling that the Doppler shift $D$ at frequency $f$ is $fv/c$, and
noting that $f \approx f_c$ for narrowband communication, the time required for a
$\pi/2$ phase change is $1/(4D)$. For the single reflecting wall example, this is
about 5 ms (assuming $f_c = 900$ MHz and $v = 60$ km/h). The phases of both
paths are rotating at this rate but in opposite directions.
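The 5 ms figure follows directly from D = f_c v / c and the 1/(4D) rule. A short sketch of the arithmetic, with hypothetical function names and c taken as 3 x 10^8 m/s:

```python
C = 3.0e8  # speed of light in m/s (rounded)


def doppler_shift(fc_hz, v_mps):
    """Doppler shift D = fc * v / c for a path whose length changes at speed v."""
    return fc_hz * v_mps / C


def quarter_cycle_time(fc_hz, v_mps):
    """Time for the path phase to change by pi/2: 1/(4D) = c / (4 fc v)."""
    return 1.0 / (4.0 * doppler_shift(fc_hz, v_mps))


v = 60.0 * 1000.0 / 3600.0           # 60 km/h in m/s
D = doppler_shift(900e6, v)          # about 50 Hz
t_quarter = quarter_cycle_time(900e6, v)  # about 5 ms
```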
Note that the Fourier transform $H_b(f;t)$ of $h_b(\tau,t)$ for a fixed $t$ is simply
$H(f+f_c;t)$, i.e., the frequency response of the original system (at a fixed $t$)
shifted by the carrier frequency. This provides another way of thinking about
the baseband equivalent channel.
2.2.3 A discrete-time baseband model
The next step in creating a useful channel model is to convert the
continuous-time channel to a discrete-time channel. We take the usual approach
of the sampling theorem. Assume that the input waveform is band-limited to $W$.
The baseband equivalent is then limited to $W/2$ and can be represented as
$$x_b(t) = \sum_n x[n]\, \mathrm{sinc}(Wt-n), \tag{2.29}$$
where $x[n]$ is given by $x_b(n/W)$ and $\mathrm{sinc}(t)$ is defined as
$$\mathrm{sinc}(t) := \frac{\sin(\pi t)}{\pi t}. \tag{2.30}$$
This representation follows from the sampling theorem, which says that any
waveform band-limited to $W/2$ can be expanded in terms of the orthogonal
basis $\{\mathrm{sinc}(Wt-n)\}_n$, with coefficients given by the samples (taken uniformly
at integer multiples of $1/W$).
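The expansion (2.29) can be tried out on a short, finite list of samples. The sketch below is only illustrative: the function names and sample values are made up, and truncating the sum to a few terms is an assumption that a real band-limited waveform would not satisfy exactly.

```python
import math


def sinc(t):
    """sinc(t) = sin(pi t) / (pi t), with the limiting value 1 at t = 0, as in (2.30)."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)


def interpolate(samples, W, t):
    """x_b(t) = sum_n x[n] sinc(W t - n) for a finite list of complex samples."""
    return sum(x * sinc(W * t - n) for n, x in enumerate(samples))


W = 1.0e6                                  # hypothetical bandwidth in Hz
samples = [1.0 + 1.0j, -2.0 + 0.0j, 0.5j]  # hypothetical symbols x[0], x[1], x[2]
at_sample = interpolate(samples, W, 1.0 / W)    # recovers x[1]
between = interpolate(samples, W, 1.5 / W)      # blends neighbouring symbols
```

Evaluating at t = n/W returns x[n] because every other sinc pulse has a zero there, while values between sampling instants mix contributions from all symbols.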
Using (2.26), the baseband output is given by
$$y_b(t) = \sum_n x[n] \sum_i a_i^b(t)\, \mathrm{sinc}(Wt - W\tau_i(t) - n). \tag{2.31}$$
The sampled outputs at multiples of $1/W$, $y[m] := y_b(m/W)$, are then
given by
$$y[m] = \sum_n x[n] \sum_i a_i^b(m/W)\, \mathrm{sinc}[m - n - \tau_i(m/W)W]. \tag{2.32}$$
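The sampled output $y[m]$ can equivalently be thought of as the projection
of the waveform $y_b(t)$ onto the waveform $W\,\mathrm{sinc}(Wt-m)$. Let $\ell := m-n$.
Then
$$y[m] = \sum_\ell x[m-\ell] \sum_i a_i^b(m/W)\, \mathrm{sinc}[\ell - \tau_i(m/W)W]. \tag{2.33}$$
By defining
$$h_\ell[m] := \sum_i a_i^b(m/W)\, \mathrm{sinc}[\ell - \tau_i(m/W)W], \tag{2.34}$$
(2.33) can be written in the simple form
$$y[m] = \sum_\ell h_\ell[m]\, x[m-\ell]. \tag{2.35}$$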
We denote $h_\ell[m]$ as the $\ell$th (complex) channel filter tap at time $m$. Its value
is a function of mainly the gains $a_i^b(t)$ of the paths, whose delays $\tau_i(t)$ are
close to $\ell/W$ (Figure 2.10). In the special case where the gains $a_i^b(t)$ and the
delays $\tau_i(t)$ of the paths are time-invariant, (2.34) simplifies to
$$h_\ell = \sum_i a_i^b\, \mathrm{sinc}[\ell - \tau_i W], \tag{2.36}$$
and the channel is linear time-invariant. The $\ell$th tap can be interpreted as
the sample $(\ell/W)$th of the low-pass filtered baseband channel response $h_b(\tau)$
(cf. (2.19)) convolved with $\mathrm{sinc}(W\tau)$.
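To make the tap formula concrete, the following sketch computes time-invariant taps from a list of path gains and delays, as in (2.36) with the baseband gains of (2.27), and then applies the discrete-time convolution (2.35). The function names, the truncation to a fixed number of taps, and the example path values are all illustrative assumptions.

```python
import cmath
import math


def sinc(t):
    """sinc(t) = sin(pi t) / (pi t), with sinc(0) = 1."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)


def channel_taps(paths, fc, W, num_taps):
    """h_l = sum_i a_i exp(-j 2 pi fc tau_i) sinc(l - tau_i W) for l = 0..num_taps-1.

    `paths` is a list of (a_i, tau_i): passband gain and delay in seconds.
    Keeping only `num_taps` taps truncates the slowly decaying sinc tails.
    """
    return [sum(a * cmath.exp(-2j * math.pi * fc * tau) * sinc(l - tau * W)
                for a, tau in paths)
            for l in range(num_taps)]


def channel_output(taps, x):
    """y[m] = sum_l h_l x[m - l], treating x[n] as zero outside the given list."""
    return [sum(h * x[m - l] for l, h in enumerate(taps) if 0 <= m - l < len(x))
            for m in range(len(x) + len(taps) - 1)]


# One hypothetical path whose delay is exactly two sample periods at W = 1 MHz.
taps = channel_taps([(0.5, 2.0e-6)], fc=1.0e9, W=1.0e6, num_taps=4)
y = channel_output([1.0, 0.5], [1.0, 2.0])
```

A path whose delay is an integer multiple of 1/W lands entirely in one tap; a delay between sampling instants spreads over several taps through the sinc tails, which is the effect sketched in Figure 2.10.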
Figure 2.10 Due to the decay of the sinc function, the $i$th path contributes most significantly to the $\ell$th tap if its delay falls in the window $[\ell/W - 1/(2W),\ \ell/W + 1/(2W)]$.
We can interpret the sampling operation as modulation and demodulation in
a communication system. At time $n$, we are modulating the complex symbol
$x[n]$ (in-phase plus quadrature components) by the sinc pulse before the
up-conversion. At the receiver, the received signal is sampled at times $m/W$
at the output of the low-pass filter. Figure 2.11 shows the complete system.
In practice, other transmit pulses, such as the raised cosine pulse, are often
used in place of the sinc pulse, which has rather poor time-decay property
and tends to be more susceptible to timing errors. This necessitates sampling
at the Nyquist sampling rate, but does not alter the essential nature of the
model. Hence we will confine ourselves to Nyquist sampling.
Due to the Doppler spread, the bandwidth of the output $y_b(t)$ is generally
slightly larger than the bandwidth $W/2$ of the input $x_b(t)$, and thus the output
samples $\{y[m]\}$ do not fully represent the output waveform. This problem is
usually ignored in practice, since the Doppler spread is small (of the order
of tens to hundreds of Hz) compared to the bandwidth W. Also, it is very
convenient for the sampling rate of the input and output to be the same.
Alternatively, it would be possible to sample the output at twice the rate of
the input. This would recapture all the information in the received waveform.
The number of taps would be almost doubled because of the reduced sample
interval, but it would typically be somewhat less than doubled since the
representation would not spread the path delays so much.
Figure 2.11 System diagram from the baseband transmitted symbol x[m] to the baseband sampled received signal y[m].
Discussion 2.1 Degrees of freedom
The symbol $x[m]$ is the $m$th sample of the transmitted signal; there are
$W$ samples per second. Each symbol is a complex number; we say that it
represents one (complex) dimension or degree of freedom. The continuous-time
signal $x(t)$ of duration one second corresponds to $W$ discrete symbols;
thus we could say that the band-limited, continuous-time signal has $W$
degrees of freedom per second.
The mathematical justification for this interpretation comes from the
following important result in communication theory: the signal space of
complex continuous-time signals of duration $T$ which have most of their
energy within the frequency band $[-W/2,\ W/2]$ has dimension approximately
$WT$. (A precise statement of this result is in standard communication
theory textbooks; see Section 5.3 of [148] for example.)
This result reinforces our interpretation that a continuous-time signal
with bandwidth W can be represented by W complex dimensions per
second.
The received signal $y(t)$ is also band-limited to approximately $W$ (due
to the Doppler spread, the bandwidth is slightly larger than W) and has W
complex dimensions per second. From the point of view of communication
over the channel, the received signal space is what matters because it
dictates the number of different signals which can be reliably distinguished
at the receiver. Thus, we define the degrees of freedom of the channel
to be the dimension of the received signal space, and whenever we refer
to the signal space, we implicitly mean the received signal space unless
stated otherwise.
2.2.4 Additive white noise
As a last step, we include additive noise in our input/output model. We make
the standard assumption that $w(t)$ is zero-mean additive white Gaussian noise
(AWGN) with power spectral density $N_0/2$ (i.e., $E[w(0)w(t)] = (N_0/2)\,\delta(t)$).
The model (2.14) is now modified to be
$$y(t) = \sum_i a_i(t)\, x(t-\tau_i(t)) + w(t). \tag{2.37}$$
See Figure 2.12. The discrete-time baseband-equivalent model (2.35) now
becomes
$$y[m] = \sum_\ell h_\ell[m]\, x[m-\ell] + w[m], \tag{2.38}$$
where $w[m]$ is the low-pass filtered noise at the sampling instant $m/W$.
Just like the signal, the white noise $w(t)$ is down-converted, filtered at the
baseband and ideally sampled. Thus, it can be verified (Exercise 2.11) that
$$\Re(w[m]) = \int_{-\infty}^{\infty} w(t)\, \psi_{m,1}(t)\, dt, \tag{2.39}$$
$$\Im(w[m]) = \int_{-\infty}^{\infty} w(t)\, \psi_{m,2}(t)\, dt, \tag{2.40}$$
where
$$\psi_{m,1}(t) := \sqrt{2W}\cos(2\pi f_c t)\, \mathrm{sinc}(Wt-m), \qquad \psi_{m,2}(t) := -\sqrt{2W}\sin(2\pi f_c t)\, \mathrm{sinc}(Wt-m). \tag{2.41}$$
It can further be shown that $\{\psi_{m,1}(t), \psi_{m,2}(t)\}_m$ forms an orthonormal set of
waveforms, i.e., the waveforms are orthogonal to each other (Exercise 2.12).
In Appendix A we review the definition and basic properties of white Gaussian
random vectors (i.e., vectors whose components are independent and
identically distributed (i.i.d.) Gaussian random variables). A key property is
that the projections of a white Gaussian random vector onto any orthonormal
vectors are independent and identically distributed Gaussian random
variables. Heuristically, one can think of continuous-time Gaussian white
noise as an infinite-dimensional white random vector and the above property
carries through: the projections onto orthogonal waveforms are uncorrelated
and hence independent. Hence the discrete-time noise process $\{w[m]\}$
is white, i.e., independent over time; moreover, the real and imaginary
components are i.i.d. Gaussians with variances $N_0/2$. A complex Gaussian
random variable $X$ whose real and imaginary components are i.i.d. satisfies
a circular symmetry property: $e^{j\phi}X$ has the same distribution as $X$ for
any $\phi$. We shall call such a random variable circular symmetric complex
Gaussian, denoted by $\mathcal{CN}(0,\sigma^2)$, where $\sigma^2 = E[|X|^2]$. The concept of
circular symmetry is discussed further in Section A.1.3 of Appendix A.
Figure 2.12 A complete system diagram.
The assumption of AWGN essentially means that we are assuming that the
primary source of the noise is at the receiver or is radiation impinging on
the receiver that is independent of the paths over which the signal is being
received. This is normally a very good assumption for most communication
situations.
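The discrete-time noise model can be simulated directly: each w[m] has independent real and imaginary parts, each Gaussian with variance N0/2, so that E[|w[m]|^2] = N0. The sketch below uses Python's standard `random` module; the function name, the seed, and the value N0 = 2 are arbitrary choices for illustration.

```python
import math
import random


def complex_awgn(n0, count, seed=0):
    """Draw `count` samples of w[m] with i.i.d. N(0, n0/2) real and imaginary parts."""
    rng = random.Random(seed)
    sigma = math.sqrt(n0 / 2.0)  # standard deviation of each component
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(count)]


noise = complex_awgn(n0=2.0, count=20000)
# Empirical average power; should be close to N0.
avg_power = sum(abs(w) ** 2 for w in noise) / len(noise)
# Empirical correlation between real and imaginary parts; should be close to 0.
cross = sum(w.real * w.imag for w in noise) / len(noise)
```

Because the two components are i.i.d. and zero-mean, rotating every sample by a fixed phase leaves these statistics unchanged, which is the circular symmetry property described above.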
2.3 Time and frequency coherence
2.3.1 Doppler spread and coherence time
An important channel parameter is the time-scale of the variation of the
channel. How fast do the taps $h_\ell[m]$ vary as a function of time $m$? Recall that
$$h_\ell[m] = \sum_i a_i^b(m/W)\, \mathrm{sinc}[\ell - \tau_i(m/W)W] = \sum_i a_i(m/W)\, e^{-j2\pi f_c \tau_i(m/W)}\, \mathrm{sinc}[\ell - \tau_i(m/W)W]. \tag{2.42}$$
Let us look at this expression term by term. From Section 2.2.2 we gather that
significant changes in $a_i$ occur over periods of seconds or more. Significant
changes in the phase of the $i$th path occur at intervals of $1/(4D_i)$, where
$D_i := f_c\tau_i'(t)$ is the Doppler shift for that path. When the different paths
contributing to the $\ell$th tap have different Doppler shifts, the magnitude of
$h_\ell[m]$ changes significantly. This is happening at the time-scale inversely
proportional to the largest difference between the Doppler shifts, the Doppler
spread $D_s$:
$$D_s := \max_{i,j} f_c\, |\tau_i'(t) - \tau_j'(t)|, \tag{2.43}$$
where the maximum is taken over all the paths that contribute significantly to
a tap.⁷ Typical intervals for such changes are on the order of 10 ms. Finally,
changes in the sinc term of (2.42) due to the time variation of each $\tau_i(t)$ are
proportional to the bandwidth, whereas those in the phase are proportional
to the carrier frequency, which is typically much larger. Essentially, it takes
much longer for a path to move from one tap to the next than for its phase
to change significantly. Thus, the fastest changes in the filter taps occur
because of the phase changes, and these are significant over delay changes
of $1/(4D_s)$.
The coherence time $T_c$ of a wireless channel is defined (in an order of
magnitude sense) as the interval over which $h_\ell[m]$ changes significantly as a
function of $m$. What we have found, then, is the important relation
$$T_c = \frac{1}{4D_s}. \tag{2.44}$$
This is a somewhat imprecise relation, since the largest Doppler shifts may
belong to paths that are too weak to make a difference. We could also view a
phase change of $\pi/4$ to be significant, and thus replace the factor of 4 above
by 8. Many people instead replace the factor of 4 by 1. The important thing
is to recognize that the major effect in determining time coherence is the
Doppler spread, and that the relationship is reciprocal; the larger the Doppler
spread, the smaller the time coherence.
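As an order-of-magnitude check of (2.43) and (2.44), the sketch below computes the Doppler spread from the rates at which individual path lengths change, using tau_i'(t) = v_i / c. The function names are hypothetical, and the example reuses the reflecting wall numbers quoted earlier (900 MHz carrier, 60 km/h), where the direct path lengthens at +v and the reflected path shortens at -v.

```python
def doppler_spread(fc_hz, path_velocities_mps, c=3.0e8):
    """D_s = max_{i,j} fc |tau_i' - tau_j'|, with tau_i' = v_i / c for each path."""
    rates = [v / c for v in path_velocities_mps]
    return fc_hz * (max(rates) - min(rates))


def coherence_time(ds_hz):
    """T_c = 1 / (4 D_s); an order-of-magnitude relation, not an exact one."""
    return 1.0 / (4.0 * ds_hz)


v = 60.0 * 1000.0 / 3600.0               # 60 km/h in m/s
ds = doppler_spread(900e6, [v, -v])      # about 100 Hz
tc = coherence_time(ds)                  # about 2.5 ms
```

Replacing the factor 4 by 8 or by 1, as the text notes some authors do, changes `tc` by a constant but not its reciprocal dependence on the Doppler spread.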
In the wireless communication literature, channels are often categorized as
fast fading and slow fading, but there is little consensus on what these terms
mean. In this book, we will call a channel fast fading if the coherence time Tc
is much shorter than the delay requirement of the application, and slow fading
if Tc is longer. The operational significance of this definition is that, in a
fast fading channel, one can transmit the coded symbols over multiple fades
of the channel, while in a slow fading channel, one cannot. Thus, whether a
channel is fast or slow fading depends not only on the environment but also
on the application; voice, for example, typically has a short delay requirement
of less than 100 ms, while some types of data applications can have a laxer
delay requirement.
⁷ The Doppler spread can in principle be different for different taps. Exercise 2.10 explores this possibility.
2.3.2 Delay spread and coherence bandwidth
Another important general parameter of a wireless system is the multipath
delay spread, $T_d$, defined as the difference in propagation time between the
longest and shortest path, counting only the paths with significant energy.
Thus,
$$T_d := \max_{i,j} |\tau_i(t) - \tau_j(t)|. \tag{2.45}$$
This is defined as a function of t, but we regard it as an order of magnitude
quantity, like the time coherence and Doppler spread. If a cell or LAN has
a linear extent of a few kilometers or less, it is very unlikely to have path
lengths that differ by more than 300 to 600 meters. This corresponds to path
delays of one or two microseconds. As cells become smaller due to increased
cellular usage, Td also shrinks. As was already mentioned, typical wireless
channels are underspread, which means that the delay spread Td is much
smaller than the coherence time Tc.
The bandwidths of cellular systems range between several hundred kilohertz
and several megahertz, and thus, for the above multipath delay spread values,
all the path delays in (2.34) lie within the peaks of two or three sinc functions;
more often, they lie within a single peak. Adding a few extra taps to each
channel filter because of the slow decay of the sinc function, we see that
cellular channels can be represented with at most four or five channel filter
taps. On the other hand, there is a recent interest in ultra-wideband (UWB)
communication, operating from 3.1 to 10.6 GHz. These channels can have up
to a few hundred taps.
When we study modulation and detection for cellular systems, we shall see
that the receiver must estimate the values of these channel filter taps. The taps
are estimated via transmitted and received waveforms, and thus the receiver
makes no explicit use of (and usually does not have) any information about
individual path delays and path strengths. This is why we have not studied the
details of propagation over multiple paths with complicated types of reflection
mechanisms. All we really need is the aggregate values of gross physical
mechanisms such as Doppler spread, coherence time, and multipath spread.
The delay spread of the channel dictates its frequency coherence. Wireless
channels change both in time and frequency. The time coherence shows
us how quickly the channel changes in time, and similarly, the frequency
coherence shows how quickly it changes in frequency. We first understood
about channels changing in time, and correspondingly about the duration of
fades, by studying the simple example of a direct path and a single reflected
path. That same example also showed us how channels change with frequency.
We can see this in terms of the frequency response as well.
Figure 2.13 (a) A channel over 200 MHz is frequency-selective, and the impulse response has many taps. (b) The spectral content of the same channel. (c) The same channel over 40 MHz is flatter, and has far fewer taps. (d) The spectral contents of the same channel, limited to 40 MHz bandwidth. At larger bandwidths, the same physical paths are resolved into a finer resolution. (Panels (a) and (c): amplitude, linear scale, versus time in ns; panels (b) and (d): power spectrum in dB versus frequency in GHz.)
Recall that the frequency response at time $t$ is
$$H(f;t) = \sum_i a_i(t)\, e^{-j2\pi f\tau_i(t)}. \tag{2.46}$$
The contribution due to a particular path has a phase linear in $f$. For multiple
paths, there is a differential phase, $2\pi f(\tau_i(t) - \tau_k(t))$. This differential
phase causes selective fading in frequency. This says that $E_r(f,t)$ changes
significantly, not only when $t$ changes by $1/(4D_s)$, but also when $f$ changes
by $1/(2T_d)$. This argument extends to an arbitrary number of paths, so the
coherence bandwidth, $W_c$, is given by
$$W_c = \frac{1}{2T_d}. \tag{2.47}$$
This relationship, like (2.44), is intended as an order of magnitude relation,
essentially pointing out that the coherence bandwidth is reciprocal to the
multipath spread. When the bandwidth of the input is considerably less than
Wc, the channel is usually referred to as flat fading. In this case, the delay
spread Td is much less than the symbol time 1/W, and a single channel
filter tap is sufficient to represent the channel. When the bandwidth is much
larger than Wc, the channel is said to be frequency-selective, and it has to
be represented by multiple taps. Note that flat or frequency-selective fading
is not a property of the channel alone, but of the relationship between the
bandwidth $W$ and the coherence bandwidth $W_c$ (Figure 2.13).
The physical parameters and the time-scale of change of key parameters of
the discrete-time baseband channel model are summarized in Table 2.1. The
different types of channels are summarized in Table 2.2.
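The flat versus frequency-selective distinction can be sketched as a comparison between the signal bandwidth W and W_c = 1/(2 T_d). In the code below the function names and the factor-of-ten `margin` used to stand in for "much less than" and "much larger than" are assumptions made purely for illustration; the text gives only order-of-magnitude criteria.

```python
def coherence_bandwidth(td_s):
    """W_c = 1 / (2 T_d), an order-of-magnitude relation."""
    return 1.0 / (2.0 * td_s)


def fading_type(w_hz, td_s, margin=10.0):
    """Label a channel by comparing W with W_c.

    `margin` is an arbitrary threshold standing in for 'much less/greater than'.
    """
    wc = coherence_bandwidth(td_s)
    if w_hz * margin <= wc:
        return "flat"
    if w_hz >= margin * wc:
        return "frequency-selective"
    return "intermediate"


wc = coherence_bandwidth(1.0e-6)        # 1 microsecond delay spread: about 500 kHz
narrow = fading_type(30.0e3, 1.0e-6)    # 30 kHz signal
wide = fading_type(200.0e6, 1.0e-6)     # 200 MHz signal
middle = fading_type(1.0e6, 1.0e-6)     # 1 MHz signal
```

The same physical channel, with the same delay spread, is labelled differently depending on the bandwidth of the signal sent over it, which is the point made in connection with Figure 2.13.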
Table 2.1 A summary of the physical parameters of the channel and the
time-scale of change of the key parameters in its discrete-time baseband
model.
Key channel parameters and time-scales              Symbol            Representative values
Carrier frequency                                   fc                1 GHz
Communication bandwidth                             W                 1 MHz
Distance between transmitter and receiver           d                 1 km
Velocity of mobile                                  v                 64 km/h
Doppler shift for a path                            D = fc v/c        50 Hz
Doppler spread of paths corresponding to a tap      Ds                100 Hz
Time-scale for change of path amplitude             d/v               1 minute
Time-scale for change of path phase                 1/(4D)            5 ms
Time-scale for a path to move over a tap            c/(vW)            20 s
Coherence time                                      Tc = 1/(4Ds)      2.5 ms
Delay spread                                        Td                1 µs
Coherence bandwidth                                 Wc = 1/(2Td)      500 kHz
Table 2.2 A summary of the types of wireless
channels and their defining characteristics.
Types of channel                Defining characteristic
Fast fading                     Tc ≪ delay requirement
Slow fading                     Tc ≫ delay requirement
Flat fading                     W ≪ Wc
Frequency-selective fading      W ≫ Wc
Underspread                     Td ≪ Tc
2.4 Statistical channel models
2.4.1 Modeling philosophy
We defined Doppler spread and multipath spread in the previous section as
quantities associated with a given receiver at a given location, velocity, and
time. However, we are interested in a characterization that is valid over some
range of conditions. That is, we recognize that the channel filter taps $\{h_\ell[m]\}$
must be measured, but we want a statistical characterization of how many
taps are necessary, how quickly they change and how much they vary.
Such a characterization requires a probabilistic model of the channel tap
values, perhaps gathered by statistical measurements of the channel. We are
familiar with describing additive noise by such a probabilistic model (as
a Gaussian random variable). We are also familiar with evaluating error
probability while communicating over a channel using such models. These
error probability evaluations, however, depend critically on the independence
and Gaussian distribution of the noise variables.
It should be clear from the description of the physical mechanisms generating
Doppler spread and multipath spread that probabilistic models for the
channel filter taps are going to be far less believable than the models for
additive noise. On the other hand, we need such models, even if they are
quite inaccurate. Without models, systems are designed using experience and
experimentation, and creativity becomes somewhat stifled. Even with highly
over-simplified models, we can compare different system approaches and get
a sense of what types of approaches are worth pursuing.
To a certain extent, all analytical work is done with simplified models. For
example, white Gaussian noise (WGN) is often assumed in communication
models, although we know the model is valid only over sufficiently small
frequency bands. With WGN, however, we expect the model to be quite good
when used properly. For wireless channel models, however, probabilistic
models are quite poor and only provide order-of-magnitude guides to system
design and performance. We will see that we can define Doppler spread, multipath
spread, etc. much more cleanly with probabilistic models, but the underlying
problem remains that these channels are very different from each other and
cannot really be characterized by probabilistic models. At the same time, there
is a large literature based on probabilistic models for wireless channels, and it
has been highly useful for providing insight into wireless systems. However,
it is important to understand the robustness of results based on these models.
There is another question in deciding what to model. Recall the
continuous-time multipath fading channel
$$y(t) = \sum_i a_i(t)\, x(t-\tau_i(t)) + w(t). \tag{2.48}$$
This contains an exact specification of the delay and magnitude of each path.
From this, we derived a discrete-time baseband model in terms of channel
filter taps as
$$y[m] = \sum_\ell h_\ell[m]\, x[m-\ell] + w[m], \tag{2.49}$$
where
$$h_\ell[m] = \sum_i a_i(m/W)\, e^{-j2\pi f_c \tau_i(m/W)}\, \mathrm{sinc}[\ell - \tau_i(m/W)W]. \tag{2.50}$$
We used the sampling theorem expansion in which $x[m] = x_b(m/W)$ and
$y[m] = y_b(m/W)$. Each channel tap $h_\ell[m]$ contains an aggregate of paths,
with the delays smoothed out by the baseband signal bandwidth.
Fortunately, it is the filter taps that must be modeled for input/output
descriptions, and also fortunately, the filter taps often contain a sufficient path
aggregation so that a statistical model might have a chance of success.
2.4.2 Rayleigh and Rician fading
The simplest probabilistic model for the channel filter taps is based on
the assumption that there are a large number of statistically independent
reflected and scattered paths with random amplitudes in the delay window
corresponding to a single tap. The phase of the $i$th path is $2\pi f_c\tau_i$ modulo $2\pi$. Now,
$f_c\tau_i = d_i/\lambda$, where $d_i$ is the distance travelled by the $i$th path and $\lambda$ is the
carrier wavelength. Since the reflectors and scatterers are far away relative to the
carrier wavelength, i.e., $d_i \gg \lambda$, it is reasonable to assume that the phase for each
path is uniformly distributed between 0 and $2\pi$ and that the phases of different
paths are independent. The contribution of each path in the tap gain $h_\ell[m]$ is
$$a_i(m/W)\, e^{-j2\pi f_c \tau_i(m/W)}\, \mathrm{sinc}[\ell - \tau_i(m/W)W] \tag{2.51}$$
and this can be modeled as a circular symmetric complex random variable.⁸
Each tap $h_\ell[m]$ is the sum of a large number of such small independent
circular symmetric random variables. It follows that $\Re(h_\ell[m])$ is the sum of
many small independent real random variables, and so by the Central Limit
Theorem, it can reasonably be modeled as a zero-mean Gaussian random
variable. Similarly, because of the uniform phase, $\Re(h_\ell[m]e^{j\phi})$ is Gaussian
with the same variance for any fixed $\phi$. This assures us that $h_\ell[m]$ is in
fact circular symmetric $\mathcal{CN}(0,\sigma_\ell^2)$ (see Section A.1.3 in Appendix A for an
elaboration). It is assumed here that the variance of $h_\ell[m]$ is a function of the
tap $\ell$, but independent of time $m$ (there is little point in creating a probabilistic
model that depends on time). With this assumed Gaussian probability density,
we know that the magnitude $|h_\ell[m]|$ of the $\ell$th tap is a Rayleigh random
variable with density (cf. (A.20) in Appendix A and Exercise 2.14)
$$\frac{x}{\sigma_\ell^2} \exp\left\{ \frac{-x^2}{2\sigma_\ell^2} \right\}, \qquad x \ge 0, \tag{2.52}$$
and the squared magnitude $|h_\ell[m]|^2$ is exponentially distributed with density
$$\frac{1}{\sigma_\ell^2} \exp\left\{ \frac{-x}{\sigma_\ell^2} \right\}, \qquad x \ge 0. \tag{2.53}$$
This model, which is called Rayleigh fading, is quite reasonable for scattering
mechanisms where there are many small reflectors, but is adopted
primarily for its simplicity in typical cellular situations with a relatively small
number of reflectors. The word Rayleigh is almost universally used for this
model, but the assumption is that the tap gains are circularly symmetric
complex Gaussian random variables.
8 See Section A.1.3 in Appendix A for a more in-depth discussion of circular symmetric
random variables and vectors.
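The Central Limit Theorem argument can be checked numerically. The following Python sketch is an illustration only and is not part of the original development. It assumes that all paths in a tap have the same amplitude $\sqrt{\sigma^2/K}$ and i.i.d. uniform phases, and it ignores the sinc weighting of (2.51); the names `tap_gain` and `simulate` and all parameter values are our own choices. It sums the paths into one tap and compares the empirical squared magnitude with the exponential density (2.53).

```python
import cmath
import math
import random


def tap_gain(num_paths, sigma2, rng):
    """Sum num_paths equal-amplitude paths with i.i.d. uniform phases.

    Each path has amplitude sqrt(sigma2 / num_paths), so E|h|^2 = sigma2.
    (Equal amplitudes are an assumption made for this illustration.)
    """
    amp = math.sqrt(sigma2 / num_paths)
    return sum(amp * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
               for _ in range(num_paths))


def simulate(num_taps=4000, num_paths=50, sigma2=2.0, seed=1):
    """Return the empirical mean of |h|^2 and the fraction exceeding sigma2."""
    rng = random.Random(seed)
    power = [abs(tap_gain(num_paths, sigma2, rng)) ** 2
             for _ in range(num_taps)]
    mean_power = sum(power) / num_taps
    # The exponential density (2.53) implies P(|h|^2 > x) = exp(-x / sigma2).
    tail = sum(p > sigma2 for p in power) / num_taps
    return mean_power, tail


mean_power, tail = simulate()
print("mean |h|^2:", mean_power)
print("P(|h|^2 > sigma^2):", tail, "exponential model:", math.exp(-1.0))
```

With a few tens of paths per tap the empirical tail probability should already be close to $e^{-1}$, which is one way of seeing why the model is considered reasonable when there are many small reflectors.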
There is a frequently used alternative model in which the line-of-sight path
(often called a specular path) is large and has a known magnitude, and
there are also a large number of independent paths. In this case, $h_\ell[m]$, at
least for one value of $\ell$, can be modeled as
$$h_\ell[m] = \sqrt{\frac{\kappa}{\kappa+1}}\,\sigma_\ell e^{j\theta} + \sqrt{\frac{1}{\kappa+1}}\,\mathcal{CN}(0, \sigma_\ell^2) \tag{2.54}$$
with the first term corresponding to the specular path arriving with uniform
phase $\theta$ and the second term corresponding to the aggregation of the large
number of reflected and scattered paths, independent of $\theta$. The parameter $\kappa$
(so-called K-factor) is the ratio of the energy in the specular path to the
energy in the scattered paths; the larger $\kappa$ is, the more deterministic is the
channel. The magnitude of such a random variable is said to have a Rician
distribution. Its density has quite a complicated form; it is often a better model
of fading than the Rayleigh model.
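As an illustration of (2.54), the sketch below draws Rician tap samples and looks at how the spread of the received power shrinks as $\kappa$ grows. It is a minimal example under our own assumptions: the helper names (`complex_gaussian`, `rician_tap`, `power_stats`), the sample size and the seed are hypothetical, and a fresh uniform phase $\theta$ is drawn for every sample.

```python
import cmath
import math
import random


def complex_gaussian(sigma2, rng):
    """One CN(0, sigma2) sample: independent N(0, sigma2/2) real and imaginary parts."""
    s = math.sqrt(sigma2 / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def rician_tap(kappa, sigma2, rng):
    """One sample of (2.54): specular term plus scattered term."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    specular = (math.sqrt(kappa / (kappa + 1.0)) * math.sqrt(sigma2)
                * cmath.exp(1j * theta))
    scattered = math.sqrt(1.0 / (kappa + 1.0)) * complex_gaussian(sigma2, rng)
    return specular + scattered


def power_stats(kappa, sigma2=1.0, n=20000, seed=7):
    """Empirical mean and variance of |h|^2 for a given K-factor."""
    rng = random.Random(seed)
    p = [abs(rician_tap(kappa, sigma2, rng)) ** 2 for _ in range(n)]
    mean = sum(p) / n
    var = sum((v - mean) ** 2 for v in p) / n
    return mean, var


for kappa in (0.0, 10.0):
    print("kappa =", kappa, "mean, variance of |h|^2:", power_stats(kappa))
```

The mean power stays at $\sigma_\ell^2$ for every $\kappa$, while the variance of $|h_\ell[m]|^2$ falls as $\kappa$ increases; setting $\kappa = 0$ recovers the Rayleigh model.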
2.4.3 Tap gain auto-correlation function
Modeling each $h_\ell[m]$ as a complex random variable provides part of the statistical
description that we need, but this is not the most important part. The more
important issue is how these quantities vary with time. As we will see in the rest
of the book, the rate of channel variation has significant impact on several aspects
of the communication problem. A statistical quantity that models this relationship
is known as the tap gain auto-correlation function, $R_\ell[n]$. It is defined as
$$R_\ell[n] := \mathbb{E}\left\{h_\ell^*[m]\, h_\ell[m+n]\right\}. \tag{2.55}$$
For each tap $\ell$, this gives the auto-correlation function of the sequence of
random variables modeling that tap as it evolves in time. We are tacitly
assuming that this is not a function of time $m$. Since the sequence of random
variables $\{h_\ell[m]\}$ for any given $\ell$ has both a mean and covariance function
that does not depend on $m$, this sequence is wide-sense stationary. We also
assume that, as a random variable, $h_\ell[m]$ is independent of $h_{\ell'}[m']$ for all
$\ell \ne \ell'$ and all $m, m'$. This final assumption is intuitively plausible since paths
in different ranges of delay contribute to $h_\ell[m]$ for different values of $\ell$.9
The coefficient $R_\ell[0]$ is proportional to the energy received in the $\ell$th
tap. The multipath spread $T_d$ can be defined as the product of $1/W$ times
the range of $\ell$ which contains most of the total energy $\sum_{\ell=0}^{\infty} R_\ell[0]$. This is
somewhat preferable to our previous “definition” in that the statistical nature
of $T_d$ becomes explicit and the reliance on some sort of stationarity becomes
explicit. Now, we can also define the coherence time $T_c$ more explicitly as
the smallest value of $n > 0$ for which $R_\ell[n]$ is significantly different from
$R_\ell[0]$. With both of these definitions, we still have the ambiguity of what
“significant” means, but we are now facing the reality that these quantities
must be viewed as statistics rather than as instantaneous values.
9 One could argue that a moving reflector would gradually travel from the range of one tap to
another, but as we have seen, this typically happens over a very large time-scale.
The tap gain auto-correlation function is useful as a way of expressing the
statistics for how tap gains change given a particular bandwidth W, but gives
little insight into questions related to choice of a bandwidth for communication.
If we visualize increasing the bandwidth, we can see several things happening.
First, the ranges of delay that are separated into different taps become narrower
(1/W seconds), so there are fewer paths corresponding to each tap, and thus the
Rayleigh approximation becomes poorer. Second, the sinc functions of (2.50)
become narrower, and $R_\ell[0]$ gives a finer grained picture of the amount of power
being received in the $\ell$th delay window of width $1/W$. In summary, as we try
to apply this model to larger W, we get more detailed information about delay
and correlation at that delay, but the information becomes more questionable.
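To make the definition (2.55) and the coherence-time definition concrete, here is a small Python sketch that estimates $R_\ell[n]$ from a single realization of one tap by time averaging, which is justified only under the stationarity assumed above. The function names and the 0.5 threshold for “significantly different” are our own assumptions, and the comparison uses magnitudes only; the text deliberately leaves the meaning of “significant” open.

```python
import math


def tap_autocorrelation(h, n):
    """Time-average estimate of R[n] = E{h*[m] h[m+n]} from one realization."""
    count = len(h) - n
    if count <= 0:
        raise ValueError("lag n must be smaller than the sequence length")
    return sum(h[m].conjugate() * h[m + n] for m in range(count)) / count


def coherence_lag(h, threshold=0.5):
    """Smallest n > 0 with |R[n]| < threshold * |R[0]|, or None if there is none."""
    r0 = abs(tap_autocorrelation(h, 0))
    for n in range(1, len(h)):
        if abs(tap_autocorrelation(h, n)) < threshold * r0:
            return n
    return None


# Hypothetical tap made of two paths with opposite Doppler shifts of
# nu = 0.01 cycles per sample: h[m] = sqrt(2) * cos(2 pi nu m).
nu = 0.01
h = [complex(math.sqrt(2.0) * math.cos(2.0 * math.pi * nu * m))
     for m in range(5000)]
print("coherence lag in samples:", coherence_lag(h))
```

For this two-path tap the estimate is close to $\cos(2\pi\nu n)$, so the lag returned scales inversely with the Doppler shift $\nu$; multiplying the lag by $1/W$ converts it to seconds.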
Example 2.2 Clarke’s model
This is a popular statistical model for flat fading. The transmitter is fixed,
the mobile receiver is moving at speed v, and the transmitted signal is
scattered by stationary objects around the mobile. There are $K$ paths, the
$i$th path arriving at an angle $\theta_i := 2\pi i/K$, $i = 0, \ldots, K-1$, with respect
to the direction of motion. $K$ is assumed to be large. The scattered path
arriving at the mobile at the angle $\theta$ has a delay of $\tau_\theta(t)$ and a time-invariant
gain $a_\theta$, and the input/output relationship is given by
$$y(t) = \sum_{i=0}^{K-1} a_{\theta_i}\, x(t - \tau_{\theta_i}(t)). \tag{2.56}$$
The most general version of the model allows the received power distribution
$p(\theta)$ and the antenna gain pattern $\alpha(\theta)$ to be arbitrary functions of
the angle $\theta$, but the most common scenario assumes uniform power distribution
and isotropic antenna gain pattern, i.e., the amplitudes $a_\theta = a/\sqrt{K}$
for all angles $\theta$. This models the situation when the scatterers are located
in a ring around the mobile (Figure 2.14). We scale the amplitude of each
path by $\sqrt{K}$ so that the total received energy along all paths is $a^2$; for large
$K$, the received energy along each path is a small fraction of the total energy.
Suppose the communication bandwidth $W$ is much smaller than the
reciprocal of the delay spread. The complex baseband channel can be
represented by a single tap at each time:
$$y[m] = h_0[m]\, x[m] + w[m]. \tag{2.57}$$
Figure 2.14 The one-ring model.
The phase of the signal arriving at time 0 from an angle $\theta$ is $2\pi f_c\tau_\theta(0)$
mod $2\pi$, where $f_c$ is the carrier frequency. Making the assumption that
this phase is uniformly distributed in $[0, 2\pi]$ and independently distributed
across all angles $\theta$, the tap gain process $\{h_0[m]\}$ is a sum of many small
independent contributions, one from each angle. By the Central Limit
Theorem, it is reasonable to model the process as Gaussian. Exercise 2.17
shows further that the process is in fact stationary with an autocorrelation
function $R_0[n]$ given by:
$$R_0[n] = 2a^2\pi J_0\left(n\pi D_s/W\right), \tag{2.58}$$
where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind:
$$J_0(x) := \frac{1}{\pi}\int_0^{\pi} e^{jx\cos\theta}\, d\theta, \tag{2.59}$$
and $D_s = 2f_c v/c$ is the Doppler spread. The power spectral density $S(f)$,
defined on $[-1/2, +1/2]$, is given by
$$S(f) = \begin{cases} \dfrac{4a^2 W}{D_s\sqrt{1 - (2fW/D_s)^2}}, & -D_s/(2W) \le f \le +D_s/(2W), \\ 0, & \text{else.} \end{cases} \tag{2.60}$$
This can be verified by computing the inverse Fourier transform of (2.60)
to be (2.58). Plots of the autocorrelation function and the spectrum are
shown in Figure 2.15. If we define the coherence time $T_c$ to be the value
of $n/W$ such that $R_0[n] = 0.05R_0[0]$, then
$$T_c = \frac{J_0^{-1}(0.05)}{\pi D_s}, \tag{2.61}$$
i.e., the coherence time is inversely proportional to $D_s$.
Figure 2.15 Plots of the auto-correlation function and Doppler spectrum in Clarke’s model. (Left panel: $R_0[n]$ against $n$. Right panel: $S(f)$ on $[-1/2, 1/2]$, nonzero only between $-D_s/(2W)$ and $D_s/(2W)$.)
In Exercise 2.17, you will also verify that $S(f)\,df$ has the physical
interpretation of the received power along paths that have Doppler shifts
in the range $[f, f+df]$. Thus, $S(f)$ is also called the Doppler spectrum.
Note that $S(f)$ is zero beyond the maximum Doppler shift.
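The quantities in (2.58), (2.59) and (2.61) are easy to evaluate numerically. The sketch below is one possible way to do so, not a prescribed method: it computes $J_0$ directly from the integral (2.59) with a midpoint rule and inverts it by bisection. The function names, the number of integration steps and the bracketing interval $[0, 2.4]$ are our own assumptions; the bracket relies on $J_0$ being decreasing up to its first zero near 2.405.

```python
import math


def bessel_j0(x, steps=2000):
    """J0(x) from (2.59) by the midpoint rule.

    The imaginary part of the integrand integrates to zero by symmetry,
    so only cos(x cos(theta)) is kept.
    """
    d = math.pi / steps
    total = sum(math.cos(x * math.cos((k + 0.5) * d)) for k in range(steps))
    return total * d / math.pi


def clarke_autocorrelation(n, a, doppler_spread, bandwidth):
    """R_0[n] from (2.58)."""
    arg = n * math.pi * doppler_spread / bandwidth
    return 2.0 * a * a * math.pi * bessel_j0(arg)


def j0_inverse(level, hi=2.4):
    """Smallest x >= 0 with J0(x) = level, by bisection on [0, hi].

    Assumes J0(hi) < level < 1 and that hi lies below the first zero of J0.
    """
    lo = 0.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j0(mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coherence_time(doppler_spread, level=0.05):
    """T_c from (2.61), in seconds when doppler_spread is in Hz."""
    return j0_inverse(level) / (math.pi * doppler_spread)


print("J0^{-1}(0.05) =", j0_inverse(0.05))
print("T_c for D_s = 100 Hz:", coherence_time(100.0), "s")
```

For a Doppler spread of 100 Hz this gives a coherence time of a few milliseconds, and doubling $D_s$ halves $T_c$, as (2.61) states.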
Chapter 2 The main plot
Large-scale fading
Variation of signal strength over distances of the order of cell sizes.
Received power decreases with distance $r$ like:
$$\frac{1}{r^2} \ \text{(free space)}, \qquad \frac{1}{r^4} \ \text{(reflection from ground plane)}.$$
Decay can be even faster due to shadowing and scattering effects.
Small-scale fading
Variation of signal strength over distances of the order of the carrier
wavelength, due to constructive and destructive interference of multipaths.
Key parameters:
Doppler spread $D_s$ ←→ coherence time $T_c \sim 1/D_s$
Doppler spread is proportional to the velocity of the mobile and to the
angular spread of the arriving paths.
delay spread $T_d$ ←→ coherence bandwidth $W_c \sim 1/T_d$
Delay spread is proportional to the difference between the lengths of the
shortest and the longest paths.
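A quick order-of-magnitude calculation shows how these parameters relate. The Python sketch below is an illustration under stated assumptions: it uses $D_s = 2f_c v/c$ (the value for paths arriving from all directions, as in Clarke’s model), takes the proportionality constants in $T_c \sim 1/D_s$ and $W_c \sim 1/T_d$ to be 1, and uses made-up path lengths of 1 km and 1.3 km. All function names are our own.

```python
SPEED_OF_LIGHT = 3.0e8  # m/s, rounded


def doppler_spread(carrier_hz, speed_mps):
    """D_s = 2 f_c v / c, assuming paths arrive from all directions."""
    return 2.0 * carrier_hz * speed_mps / SPEED_OF_LIGHT


def delay_spread(shortest_path_m, longest_path_m):
    """T_d: propagation-time difference between longest and shortest paths."""
    return (longest_path_m - shortest_path_m) / SPEED_OF_LIGHT


def coherence_time(ds_hz):
    """Order-of-magnitude T_c ~ 1 / D_s (constant factor taken as 1)."""
    return 1.0 / ds_hz


def coherence_bandwidth(td_s):
    """Order-of-magnitude W_c ~ 1 / T_d (constant factor taken as 1)."""
    return 1.0 / td_s


ds = doppler_spread(900e6, 60.0 / 3.6)  # 900 MHz carrier, 60 km/h
td = delay_spread(1000.0, 1300.0)       # hypothetical path lengths in metres
print("D_s [Hz]:", ds, " T_c [s]:", coherence_time(ds))
print("T_d [s]:", td, " W_c [Hz]:", coherence_bandwidth(td))
```

With these numbers the Doppler spread is about 100 Hz, the coherence time about 10 ms, the delay spread about 1 microsecond, and the coherence bandwidth about 1 MHz.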
Input/output channel models
• Continuous-time passband (2.14):
$$y(t) = \sum_i a_i(t)\, x(t - \tau_i(t)).$$
• Continuous-time complex baseband (2.26):
$$y_b(t) = \sum_i a_i(t)\, e^{-j2\pi f_c\tau_i(t)}\, x_b(t - \tau_i(t)).$$
• Discrete-time complex baseband with AWGN (2.38):
$$y[m] = \sum_\ell h_\ell[m]\, x[m-\ell] + w[m].$$
The $\ell$th tap is the aggregation of the physical paths with delays in
$[\ell/W - 1/(2W),\ \ell/W + 1/(2W)]$.
Statistical channel models
• $\{h_\ell[m]\}_m$ is modeled as circular symmetric processes independent across
the taps.
• If for all taps,
$$h_\ell[m] \sim \mathcal{CN}(0, \sigma_\ell^2),$$
the model is called Rayleigh.
• If for one tap,
$$h_\ell[m] = \sqrt{\frac{\kappa}{\kappa+1}}\,\sigma_\ell e^{j\theta} + \sqrt{\frac{1}{\kappa+1}}\,\mathcal{CN}(0, \sigma_\ell^2),$$
the model is called Rician with K-factor $\kappa$.
• The tap gain auto-correlation function $R_\ell[n] := \mathbb{E}\left[h_\ell^*[0]\, h_\ell[n]\right]$ models
the dependency over time.
• The delay spread is $1/W$ times the range of taps $\ell$ which contains most
of the total gain $\sum_{\ell=0}^{\infty} R_\ell[0]$. The coherence time is $1/W$ times the range
of $n$ for which $R_\ell[n]$ is significantly different from $R_\ell[0]$.
2.5 Bibliographical notes
This chapter was modified from R. G. Gallager’s MIT 6.450 course notes on digital
communication. The focus is on small-scale multipath fading. Large-scale fading
models are discussed in many texts; see for example Rappaport [98]. Clarke’s model
was introduced in [22] and elaborated further in [62]. Our derivation here of the Clarke
power spectrum follows the approach of [111].
2.6 Exercises
Exercise 2.1 (Gallager) Consider the electric field in (2.4).
1. It has been derived under the assumption that the motion is in the direction of
the line-of-sight from sending antenna to receive antenna. Find the electric field
assuming that $\phi$ is the angle between the line-of-sight and the direction of motion
of the receiver. Assume that the range of time of interest is small enough so that
changes in $(\theta, \psi)$ can be ignored.
2. Explain why, and under what conditions, it is a reasonable approximation to ignore
the change in $(\theta, \psi)$ over small intervals of time.
Exercise 2.2 (Gallager) Equation (2.13) was derived under the assumption that
$r(t) \approx d$. Derive an expression for the received waveform for general $r(t)$. Break the
first term in (2.11) into two terms, one with the same numerator but the denominator
$2d - r_0 - vt$ and the other with the remainder. Interpret your result.
Exercise 2.3 In the two-path example in Sections 2.1.3 and 2.1.4, the wall is on the
right side of the receiver so that the reflected wave and the direct wave travel in opposite
directions. Suppose now that the reflecting wall is on the left side of the transmitter. Redo the
analysis. What is the nature of the multipath fading, both over time and over frequency?
Explain any similarity or difference with the case considered in Sections 2.1.3 and 2.1.4.
Exercise 2.4 A mobile receiver is moving at a speed v and is receiving signals arriving
along two reflected paths which make angles $\theta_1$ and $\theta_2$ with the direction of motion.
The transmitted signal is a sinusoid at frequency f.
1. Is the above information enough for estimating (i) the coherence time Tc; (ii) the
coherence bandwidth Wc? If so, express them in terms of the given parameters. If
not, specify what additional information would be needed.
2. Consider an environment in which there are reflectors and scatterers in all directions
from the receiver and an environment in which they are clustered within a small
angular range. Using part (1), explain how the channel would differ in these two
environments.
Exercise 2.5 Consider the propagation model in Section 2.1.5 where there is a reflected
path from the ground plane.
1. Let $r_1$ be the length of the direct path in Figure 2.6. Let $r_2$ be the length of the
reflected path (summing the path length from the transmitter to the ground plane
and the path length from the ground plane to the receiver). Show that $r_2 - r_1$ is
asymptotically equal to $b/r$ and find the value of the constant $b$. Hint: Recall that
for $x$ small, $\sqrt{1+x} \approx 1 + x/2$ in the sense that $(\sqrt{1+x} - 1)/x \to 1/2$ as $x \to 0$.
2. Assume that the received waveform at the receive antenna is given by
$$E_r(f, t) = \frac{\alpha\cos 2\pi\left[ft - fr_1/c\right]}{r_1} - \frac{\alpha\cos 2\pi\left[ft - fr_2/c\right]}{r_2}. \tag{2.62}$$
Approximate the denominator $r_2$ by $r_1$ in (2.62) and show that $E_r \approx \beta/r^2$ for $r^{-1}$
much smaller than $c/f$. Find the value of $\beta$.
3. Explain why this asymptotic expression remains valid without first approximating
the denominator $r_2$ in (2.62) by $r_1$.
Exercise 2.6 Consider the following simple physical model in just a single dimension.
The source is at the origin and transmits an isotropic wave of angular frequency $\omega$.
The physical environment is filled with uniformly randomly located obstacles. We
will model the inter-obstacle distance as an exponential random variable, i.e., it has
the density10
$$\eta e^{-\eta r}, \quad r \ge 0. \tag{2.63}$$
Here $1/\eta$ is the mean distance between obstacles and captures the density of the obstacles.
Viewing the source as a stream of photons, suppose each obstacle independently
(from one photon to the other and independent of the behavior of the other obstacles)
either absorbs the photon with probability $\gamma$ or scatters it either to the left or to the
right (both with equal probability $(1-\gamma)/2$).
Now consider the path of a photon transmitted either to the left or to the right with
equal probability from some fixed point on the line. The probability density function
of the distance (denoted by r) to the first obstacle (the distance can be on either side
of the starting point, so r takes values on the entire line) is equal to
$$q(r) := \frac{\eta e^{-\eta|r|}}{2}, \quad r \in \mathbb{R}. \tag{2.64}$$
So the probability density function of the distance at which the photon is absorbed
upon hitting the first obstacle is equal to
$$f_1(r) := \gamma\, q(r), \quad r \in \mathbb{R}. \tag{2.65}$$
10 This random arrangement of points on a line is called a Poisson point process.
1. Show that the probability density function of the distance from the origin at which
the second obstacle is met is
$$f_2(r) = \int_{-\infty}^{\infty} (1-\gamma)\, q(x)\, f_1(r-x)\, dx, \quad r \in \mathbb{R}. \tag{2.66}$$
2. Denote by $f_k(r)$ the probability density function of the distance from the origin
at which the photon is absorbed by exactly the $k$th obstacle it hits and show the
recursive relation
$$f_{k+1}(r) = \int_{-\infty}^{\infty} (1-\gamma)\, q(x)\, f_k(r-x)\, dx, \quad r \in \mathbb{R}. \tag{2.67}$$
3. Conclude from the previous step that the probability density function of the distance
from the source at which the photon is absorbed (by some obstacle), denoted by
$f(r)$, satisfies the recursive relation
$$f(r) = \gamma\, q(r) + (1-\gamma)\int_{-\infty}^{\infty} q(x)\, f(r-x)\, dx, \quad r \in \mathbb{R}. \tag{2.68}$$
Hint: Observe that $f(r) = \sum_{k=1}^{\infty} f_k(r)$.
4. Show that
$$f(r) = \frac{\sqrt{\gamma}\,\eta}{2}\, e^{-\eta\sqrt{\gamma}\,|r|} \tag{2.69}$$
is a solution to the recursive relation in (2.68). Hint: Observe that the convolution
between the probability densities $q(\cdot)$ and $f(\cdot)$ in (2.68) is more easily represented
using Fourier transforms.
5. Now consider the photons that are absorbed at a distance of more than $r$ from the
source. This is the radiated power density at a distance $r$ and is found by integrating
$f(x)$ over the range $(r, \infty)$ if $r > 0$ and $(-\infty, r)$ if $r < 0$. Calculate the radiated
power density to be
$$\frac{e^{-\eta\sqrt{\gamma}\,|r|}}{2}, \tag{2.70}$$
and conclude that the power decreases exponentially with distance $r$. Also observe
that with very low absorption ($\gamma \to 0$) or very few obstacles ($\eta \to 0$), the power
density converges to 0.5; this is expected since the power splits equally on either
side of the line.
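A Monte Carlo experiment offers a sanity check on (2.70) before attempting the derivation. The sketch below is our own illustration, not part of the exercise: it follows the recursion (2.67) literally, so every hop is an independent draw from $q$ in (2.64), and the names `absorption_position` and `tail_fraction` and the values $\eta = 1$, $\gamma = 0.25$ are hypothetical choices. It counts photons absorbed farther than $r$ on either side of the source, which by (2.70) should be a fraction $e^{-\eta\sqrt{\gamma}\,r}$ (twice the one-sided value).

```python
import math
import random


def absorption_position(eta, gamma, rng):
    """Follow one photon until it is absorbed; return the absorption point."""
    position = 0.0
    while True:
        # Each hop goes left or right with equal probability and has an
        # exponential length with mean 1/eta, i.e. a draw from q in (2.64).
        step = rng.expovariate(eta)
        position += step if rng.random() < 0.5 else -step
        # The obstacle that was reached absorbs with probability gamma;
        # otherwise the photon is scattered and takes another hop.
        if rng.random() < gamma:
            return position


def tail_fraction(r, eta=1.0, gamma=0.25, photons=20000, seed=3):
    """Fraction of photons absorbed farther than r from the source (both sides)."""
    rng = random.Random(seed)
    hits = sum(abs(absorption_position(eta, gamma, rng)) > r
               for _ in range(photons))
    return hits / photons


r = 2.0
print("simulated:", tail_fraction(r))
print("from (2.70), both sides:", math.exp(-1.0 * math.sqrt(0.25) * r))
```

Repeating the run for several values of $r$ and plotting the logarithm of the simulated fraction against $r$ should give approximately a straight line with slope $-\eta\sqrt{\gamma}$.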
Exercise 2.7 In Exercise 2.6, we considered a single-dimensional physical model of a
scattering and absorption environment and concluded that power decays exponentially
with distance. A reading exercise is to study [42], which considers a natural extension
of this simple model to two- and three-dimensional spaces. Further, it extends the
analysis to two- and three-dimensional physical models. While the analysis is more
complicated, we arrive at the same conclusion: the radiated power decays exponentially
with distance.
Exercise 2.8 (Gallager) Assume that a communication channel first filters the transmitted
passband signal before adding WGN. Suppose the channel is known and the
channel filter has an impulse response $h(t)$. Suppose that a QAM scheme with symbol
duration $T$ is developed without knowledge of the channel filtering. A baseband filter
$\theta(t)$ is developed satisfying the Nyquist property that $\{\theta(t - kT)\}_k$ is an orthonormal
set. The matched filter $\theta(-t)$ is used at the receiver before sampling and detection.
If one is aware of the channel filter $h(t)$, one may want to redesign either the
baseband filter at the transmitter or the baseband filter at the receiver so that there
is no intersymbol interference between receiver samples and so that the noise on the
samples is i.i.d.
1. Which filter should one redesign?
2. Give an expression for the impulse response of the redesigned filter (assume a
carrier frequency fc).
3. Draw a figure of the various filters at passband to show why your solution is
correct. (We suggest you do this before answering the first two parts.)
Exercise 2.9 Consider the two-path example in Section 2.1.4 with d = 2 km and the
receiver at 1.5 km from the transmitter moving at velocity 60 km/h away from the
transmitter. The carrier frequency is 900 MHz.
1. Plot in MATLAB the magnitudes of the taps of the discrete-time baseband channel
at a fixed time t. Give a few plots for several bandwidths W so as to exhibit both
flat and frequency-selective fading.
2. Plot the time variation of the phase and magnitude of a typical tap of the discrete-time
baseband channel for a bandwidth where the channel is (approximately)
flat and for a bandwidth where the channel is frequency-selective. How do the
time-variations depend on the bandwidth? Explain.
Exercise 2.10 For each tap of the discrete-time channel response, the Doppler spread
is the range of Doppler shifts of the paths contributing to that tap. Give an example
of an environment (i.e. location of reflectors/scatterers with respect to the location of
the transmitter and the receiver) in which the Doppler spread is the same for different
taps and an environment in which they are different.
Exercise 2.11 Verify (2.39) and (2.40).
Exercise 2.12 In this problem we consider generating passband orthogonal waveforms
from baseband ones.
1. Show that if the waveforms $\{\theta(t - nT)\}_n$ form an orthogonal set, then the
waveforms $\{\psi_{n,1}, \psi_{n,2}\}_n$ also form an orthogonal set, provided that $\theta(t)$ is band-limited
to $[-f_c, f_c]$. Here,
$$\psi_{n,1}(t) = \theta(t - nT)\cos 2\pi f_c t, \qquad \psi_{n,2}(t) = \theta(t - nT)\sin 2\pi f_c t.$$
How should we normalize the energy of $\theta(t)$ to make the $\psi(t)$ orthonormal?
2. For a given $f_c$, find an example where the result in part (1) is false when the
condition that $\theta(t)$ is band-limited to $[-f_c, f_c]$ is violated.
Exercise 2.13 Verify (2.25). Does this equation contain any more information about
the communication system in Figure 2.9 beyond what is in (2.24)? Explain.
Exercise 2.14 Compute the probability density function of the magnitude $|X|$ of a
complex circular symmetric Gaussian random variable $X$ with variance $\sigma^2$.
Exercise 2.15 In the text we have discussed the various reasons why the channel tap
gains, $h_\ell[m]$, vary in time (as a function of $m$) and how the various dynamics operate
at different time-scales. The analysis is based on the assumption that communication
takes place on a bandwidth $W$ around a carrier frequency $f_c$ with $f_c \gg W$. This
assumption is not valid for ultra-wideband (UWB) communication systems, where the
transmission bandwidth is from 3.1 GHz to 10.6 GHz, as regulated by the FCC. Redo
the analysis for this system. What is the main mechanism that causes the tap gains to
vary at the fastest time-scale, and what is this fastest time-scale determined by?
Exercise 2.16 In Section 2.4.2, we argue that the channel gain $h_\ell[m]$ at a particular
time $m$ can be assumed to be circular symmetric. Extend the argument to show that it
is also reasonable to assume that the complex random vector
$$\mathbf{h} := \begin{bmatrix} h_\ell[m] \\ h_\ell[m+1] \\ \vdots \\ h_\ell[m+n] \end{bmatrix}$$
is circular symmetric for any $n$.
Exercise 2.17 In this question, we will analyze in detail Clarke’s one-ring model
discussed at the end of the chapter. Recall that the scatterers are assumed to be located
in a ring around the receiver moving at speed $v$. There are $K$ paths coming in at angles
$\theta_i = 2\pi i/K$ with respect to the direction of motion of the mobile, $i = 0, \ldots, K-1$.
The path coming at angle $\theta$ has a delay of $\tau_\theta(t)$ and a time-invariant gain $a/\sqrt{K}$ (not
dependent on the angle), and the input/output relationship is given by
$$y(t) = \frac{a}{\sqrt{K}}\sum_{i=0}^{K-1} x(t - \tau_{\theta_i}(t)). \tag{2.71}$$
1. Give an expression for the impulse response $h(\tau, t)$ for this channel, and give an
expression for $\tau_\theta(t)$ in terms of $\tau_\theta(0)$. (You can assume that the distance the mobile
travelled in $[0, t]$ is small compared to the radius of the ring.)
2. Suppose communication takes place at carrier frequency $f_c$ and over a narrowband
of bandwidth $W$ such that the delay spread of the channel $T_d$ satisfies $T_d \ll 1/W$.
Argue that the discrete-time baseband model can be approximately represented by
a single tap
$$y[m] = h_0[m]\, x[m] + w[m] \tag{2.72}$$
and give an approximate expression for that tap in terms of the $a_\theta$’s and $\tau_\theta(t)$’s.
Hint: Your answer should contain no sinc functions.
3. Argue that it is reasonable to assume that the phase of the path from an angle $\theta$ at
time 0,
$$2\pi f_c\tau_\theta(0) \bmod 2\pi,$$
is uniformly distributed in $[0, 2\pi]$ and that it is i.i.d. across $\theta$.
4. Based on the assumptions in part (3), for large K one can use the Central Limit
Theorem to approximate $\{h_0[m]\}$ as a Gaussian process. Verify that the limiting
process is stationary and the autocorrelation function $R_0[n]$ is given by (2.58).
5. Verify that the Doppler spectrum $S(f)$ is given by (2.60). Hint: It is easier to show
that the inverse Fourier transform of (2.60) is (2.58).
6. Verify that $S(f)\,df$ is indeed the received power from the paths that have Doppler
shifts in $[f, f+df]$. Is this surprising?
Exercise 2.18 Consider a one-ring model where there are $K$ scatterers located at
angles $\theta_i = 2\pi i/K$, $i = 0, \ldots, K-1$, on a circle of radius 1 km around the receiver
and the transmitter is 2 km away. (The angles are with respect to the line joining the
transmitter and the receiver.) The transmit power is $P$. The power attenuation along a
path from the transmitter to a scatterer to the receiver is
$$\frac{G}{K}\cdot\frac{1}{s^2}\cdot\frac{1}{r^2}, \tag{2.73}$$
where $G$ is a constant and $r$ and $s$ are the distance from the transmitter to the scatterer
and the distance from the scatterer to the receiver respectively. Communication takes
place at a carrier frequency $f_c = 1.9$ GHz and the bandwidth is $W$ Hz. You can assume
that, at any time, the phases of each arriving path in the baseband representation of
the channel are independent and uniformly distributed between 0 and $2\pi$.
1. What are the key differences and the similarities between this model and
Clarke’s model in the text?
2. Find approximate conditions on the bandwidth W for which one gets a flat fading
channel.
3. Suppose the bandwidth is such that the channel is frequency selective. For large
$K$, find approximately the amount of power in tap $\ell$ of the discrete-time baseband
impulse response of the channel (i.e., compute the power-delay profile.). Make any
simplifying assumptions but state them. (You can leave your answers in terms of
integrals if you cannot evaluate them.)
4. Compute and sketch the power-delay profile as the bandwidth becomes very large
(and K is large).
5. Suppose now the receiver is moving at speed v towards the (fixed) transmitter. What
is the Doppler spread of tap $\ell$? Argue heuristically from physical considerations
what the Doppler spectrum (i.e., power spectral density) of tap $\ell$ is, for large $K$.
6. We have made the assumptions that the scatterers are all on a circle of radius 1 km
around the receiver and the paths arrive with independent and uniform distributed
phases at the receiver. Mathematically, are the two assumptions consistent? If not,
do you think it matters, in terms of the validity of your answers to the earlier parts
of this question?
Exercise 2.19 Often in modeling multiple input multiple output (MIMO) fading
channels the fading coefficients between different transmit and receive antennas are
assumed to be independent random variables. This problem explores whether this is
a reasonable assumption based on Clarke’s one-ring scattering model and the antenna
separation.
1. (Antenna separation at the mobile) Assume a mobile with velocity v moving away
from the base-station, with uniform scattering from the ring around it.
(a) Compute the Doppler spread Ds for a carrier frequency fc, and the corresponding
coherence time Tc.
(b) Assuming that fading states separated by Tc are approximately uncorrelated, at
what distance should we place a second antenna at the mobile to get an independently
faded signal? Hint: How much distance does the mobile travel in Tc?
2. (Antenna separation at the base-station) Assume that the scattering ring has radius
R and that the distance between the base-station and the mobile is d. Further
assume for the time being that the base-station is moving away from the mobile
with velocity $v$. Repeat the previous part to find the minimum antenna spacing at
the base-station for uncorrelated fading. Hint: Is the scattering still uniform around
the base-station?
3. Typically, the scatterers are local around the mobile (near the ground) and far away
from the base-station (high on a tower). What is the implication of your result in
part (2) for this scenario?