17 Solutions to homework problems

This chapter collects worked solutions to the homework problems posed at the end of each chapter. Conceptual answers are given as model paragraphs; quantitative answers show the intermediate steps and the final numeric or derived result. Each solution is labeled with the same problem identifier used in the corresponding chapter.

17.1 Markets

MKT-C1. Street-name holding means the broker is the registered holder of record on the issuer’s books, while the broker’s own internal records identify the client as the beneficial owner — the person actually entitled to the economic value (dividends, voting instructions, sale proceeds). The friend’s worry confuses registered title with ownership. The point of street-name holding is mechanical convenience: because the broker (or, in practice, a central depository through the broker) is the name on record, a sale can be settled by an accounting entry rather than by physically re-registering a paper certificate in each new owner’s name. This is what makes transfer of title fast and cheap when you sell. Your claim is protected not by whose name is on the certificate but by the broker’s fiduciary and recordkeeping obligations, by regulatory segregation of client assets from the broker’s own, and by investor-protection arrangements — the beneficial ownership is legally recognized even though the registered name is the broker’s.

MKT-C2. A long position means the investor is the rightful owner of the asset; a short position means she owes the security to another entity (she has borrowed and sold it, or otherwise is obligated to deliver it). An open position is either of these; to close a position is to move from an open to a closed state, eliminating the obligation or the holding. To close a long position the investor need only sell the security she already owns — she has it in hand, so a single sale extinguishes the position. To close a short position she must buy the security in the market and return the borrowed shares to the lender. The asymmetry is that the short seller does not own what she has already sold: she has an outstanding obligation to deliver shares, and the only way to satisfy it is to acquire those shares by purchase and hand them back.

MKT-C3. Both investors lose the same dollar amount on the position (20% of $200,000 = $40,000), because both control $200,000 of stock. But the percentage loss is measured against each investor’s own capital. The cash investor put up $200,000, so her loss is $40,000/$200,000 = 20%. The margin investor put up only $100,000 of her own money (borrowing the other $100,000), so her equity falls from $100,000 to $60,000 — a 40% loss on her capital. Leverage magnifies the percentage return in both directions; borrowing doubles the exposure per dollar of equity, so it doubles the percentage swing. The maintenance-margin requirement can force the leveraged investor to sell at the worst moment: as the price falls, her equity-to-asset ratio drops, and once it crosses the maintenance threshold the broker issues a margin call. If she cannot post additional collateral, the broker liquidates — selling precisely when the price is low. The cash investor has no loan, no margin ratio, and therefore can never receive a margin call; she can simply hold and wait.

MKT-C4. The initial margin requirement (Regulation T, currently 50%) governs how much of a new position the investor must fund with her own money — it limits the size of the loan at the moment the position is opened. The maintenance margin requirement (set by the exchange, e.g. 25% for a long position, 30% for a short) governs the ongoing minimum equity that must remain as the price moves. A margin call is triggered by the maintenance requirement, not the initial one, because the initial requirement only applies at inception; once the position is open, what matters is whether current equity as a fraction of current asset value has fallen too low. The maintenance level must be lower than the initial level: if it were equal or higher, essentially any adverse price move — or even no move at all — would immediately breach it, and no leveraged position could survive normal fluctuation. The gap between the two levels is the cushion that lets a position absorb ordinary price movement before a call is issued.

MKT-C5. A long buyer’s worst case is that the price falls to zero: she can lose only the amount she paid, so her loss is bounded. A short seller has sold at today’s price and must eventually buy back to return the shares; there is no ceiling on how high the price can climb, so her potential loss is in principle unlimited (a stock that doubles, triples, or more forces her to repurchase at ever-higher prices). The short seller must make good any dividend because she has borrowed and sold someone else’s shares. The lender of the shares still expects the full economic benefit of ownership, including dividends, as if the shares had never left. When the company pays a dividend, whoever now holds the sold shares receives it — not the lender — so the short seller must pay the lender an amount equal to the dividend to make the lender whole. The lender bears none of this cost; the obligation falls entirely on the short seller who created it by borrowing and selling.

MKT-C6. Marking to market means the position’s value is recomputed at the current market price each period, and the margin account is adjusted to reflect the resulting equity. For a short position the liability is the cost of buying back the shares needed to close, which equals (number of shares) times the current price. As the price changes, that liability is re-measured — marked to market — and the equity in the account (initial margin plus sale proceeds minus the current buy-back cost) is updated accordingly. It is used so that collateral always reflects the true, current exposure rather than stale values, protecting the broker against a loss that has already built up on paper. Because a short position loses money when the price rises, marking to market causes equity to fall as the price climbs; once the marked-to-market equity divided by the current buy-back cost falls below the maintenance requirement, a margin call is issued — the mirror image of the long case, where the call comes when the price falls.

MKT-C7. The specialist buys at the bid ($17.49) and sells at the offer ($17.50), pocketing the $0.01 spread on each round trip. If it buys 1,000 shares from a seller and sells 1,000 shares to a buyer during the day, its net share position is zero but it has captured $0.01 × 1,000 = $10 in cash. Profit comes from the spread, not from taking a directional position in the stock. During periods of unusually high volatility a market maker charged with maintaining an orderly market may widen the spread because it faces greater inventory risk: any shares it is forced to hold between an incoming buy and an incoming sell can move sharply against it, and there is a higher chance it is trading against better-informed counterparties. A wider spread compensates it for this risk and lets it keep quoting continuously rather than withdrawing — which would leave the market with no liquidity at all.

MKT-C8. An OTC market has no centralized exchange; transactions occur bilaterally between dealers, often by recorded phone call. Its chief advantage is flexibility: because each trade is a private negotiation, the contract can be tailored to almost any form — nonstandard sizes, dates, or terms — which is valuable for customized instruments. A centralized exchange (open outcry like the NYSE or electronic like NASDAQ) concentrates order flow in one place and typically has a specialist or market maker standing ready to quote a bid and an offer at all times. Its advantage is more reliable price discovery and liquidity: with all interest visible in one venue and a market maker always willing to trade, a participant can buy or sell quickly at a transparent, continuously quoted price, and the aggregation of many orders produces a price that better reflects overall supply and demand. The trade-off is standardization — exchange-traded contracts are uniform — against the bespoke flexibility of the OTC market.

MKT-C9. By becoming the buyer to every seller and the seller to every buyer (novation), the clearinghouse ensures that each side faces the clearinghouse rather than an unknown counterparty. It backs this guarantee with member margin accounts and variation-margin calls: each member posts margin, and if a member’s balance falls below the maintenance level the clearinghouse demands additional (variation) margin, seizing the account if the member fails to post. Because the clearinghouse holds collateral against every member’s exposure and can top it up as prices move, it can make the non-defaulting side whole even when one party defaults — the defaulter’s margin, plus the clearinghouse’s own reserves, covers the shortfall. The AIG episode shows the dark side of the same collateral logic in OTC contracts. AIG’s OTC derivatives required it to post collateral as the contracts’ values moved and as its creditworthiness changed. When rating agencies downgraded AIG on 15 September 2008, its contracts required roughly $14.5 billion in additional collateral immediately. Here the collateral mechanism did not contain stress locally; the downgrade triggered a sudden, correlated demand for cash that AIG could not meet, transmitting distress outward to its many counterparties and the broader system.

MKT-C10. “$T+3$” is the convention that a trade agreed on day $T$ settles — asset delivered to the buyer, cash delivered to the seller — three business days later. Settlement is not instantaneous because the back-office work of confirming, netting, and transferring securities and cash across accounts and institutions takes time, especially when many trades must be reconciled. During the interval between trade and settlement, the clearinghouse stands between the two sides as guarantor: it holds members’ securities and cash in trust, records the agreed transaction, and commits to move the asset and the cash on the settlement date. This interposition reduces counterparty risk because neither side is relying on the other’s promise to perform three days later — each relies instead on the clearinghouse, which is collateralized by member margin and its own reserves. If a party were to default in the interim, the clearinghouse, not the innocent counterparty, absorbs the exposure.

MKT-C11. Under gross margining the member posts margin on the total value of all positions, long and short, held in the clearing account: here that is 50 shares long plus 30 shares short, i.e. margin on 80 shares of GE. Under net margining the long and short positions offset, leaving a net position of 50 − 30 = 20 shares, so the member posts margin on only 20 shares. A clearinghouse worried about a member’s total risk may prefer gross margining because the offset assumed by netting is not risk-free: the long and short positions belong to different clients, and if the member or a client fails, the clearinghouse may not be able to neatly cancel one against the other. Gross margining collects collateral against the full underlying exposure, giving a larger buffer against default even though it ties up more of the member’s capital and is therefore more expensive for the member.

MKT-C12. A haircut is the discount applied to an asset’s market value when it is accepted as collateral: a security worth $100 might be credited as only $98 (a 2% haircut) or $75 (a 25% haircut). Its function is to protect the lender — the broker or clearinghouse — against the risk that the collateral loses value in the time between a margin breach and the actual liquidation of the position. Volatile stocks receive a larger haircut (around 25%) than Treasury securities (around 2%) precisely because their prices can move much further and faster: the collateral could fall substantially before it is sold, so a bigger cushion is needed. Treasuries are stable and highly liquid, so a small discount suffices. In effect the haircut sizes the buffer to the collateral’s own price risk, ensuring that even after an adverse move the collateral still covers the exposure it secures.

MKT-Q1. With a 50% initial-margin requirement, the investor’s own cash of $60,000 must be at least half the position, so the maximum position is $60,000 / 0.50 = $120,000. (a) Largest dollar position = $120,000; shares = $120,000 / $40 = 3,000 shares. (b) Immediately after the purchase: assets (stock) = $120,000; loan liability = $120,000 − $60,000 = $60,000; equity = assets − liability = $120,000 − $60,000 = $60,000.
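The arithmetic in MKT-Q1 can be checked with a short Python sketch. The function name is hypothetical and introduced only for illustration; it assumes the investor commits her entire cash balance as initial margin.

```python
def max_margin_position(cash, share_price, initial_margin):
    """Largest position fundable when cash must cover initial_margin of it."""
    position = cash / initial_margin   # total dollar position
    shares = position / share_price    # number of shares bought
    loan = position - cash             # amount borrowed from the broker
    equity = position - loan           # assets minus liability
    return position, shares, loan, equity

# MKT-Q1 inputs: $60,000 cash, $40 share price, 50% initial margin.
position, shares, loan, equity = max_margin_position(60_000, 40, 0.50)
print(position, shares, loan, equity)
```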

MKT-Q2. A margin call occurs when the margin ratio hits the maintenance level 0.30: \[\frac{P\cdot N - L}{N\cdot P} = \frac{2000P - 70{,}000}{2000P} = 0.30.\] Then $2000P - 70{,}000 = 0.30(2000P) = 600P$, so $1400P = 70{,}000$ and $P = \$50.00$. She receives a margin call when the price falls to $50.00 per share.

MKT-Q3. With $N = 1000$, $L = 40{,}000$, and maintenance 0.25: \[\frac{1000P - 40{,}000}{1000P} = 0.25.\] Then $1000P - 40{,}000 = 0.25(1000P) = 250P$, so $750P = 40{,}000$ and $P = 40{,}000/750 = \$53.33$. She receives a margin call when the price falls to about $53.33 per share, roughly 33% below her $80 purchase price.
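MKT-Q2 and MKT-Q3 follow one pattern: solving $(NP - L)/(NP) = m$ for $P$ gives $P = L / (N(1 - m))$. The Python sketch below applies that rearrangement to both problems; the function name is an illustrative assumption, not notation from the chapter.

```python
def long_margin_call_price(shares, loan, maintenance):
    """Price at which (shares*P - loan) / (shares*P) equals the maintenance ratio."""
    # Rearranging (N*P - L) / (N*P) = m gives P = L / (N * (1 - m)).
    return loan / (shares * (1.0 - maintenance))

q2 = long_margin_call_price(2000, 70_000, 0.30)  # MKT-Q2 inputs
q3 = long_margin_call_price(1000, 40_000, 0.25)  # MKT-Q3 inputs
print(round(q2, 2), round(q3, 2))
```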

MKT-Q4. Levered: at $95.00 the assets are $2303 \times 95 = \$218{,}785$. Subtracting the $100,000 loan gives equity = $118,785. Return on the $100,000 invested = $118,785/$100,000 − 1 = 0.18785 ≈ 18.8%. Unlevered: with $100,000 she buys 1,151 shares at $86.82 (cost $99,929.82). The per-share gain is $95.00 − $86.82 = $8.18, so the profit is $1151 \times 8.18 = \$9{,}415.18$, a return of $8.18/$86.82 ≈ 9.4%. Leverage roughly doubles the return (18.8% vs. 9.4%), reflecting the 2-to-1 exposure per dollar of equity.

MKT-Q5. (a) The $100,000 position rises 10% to $110,000. Equity = assets − loan = $110,000 − $50,000 = $60,000. Return on her $50,000 = $60,000/$50,000 − 1 = 0.20 = 20%. (b) With no leverage she would put up the full $100,000 and earn the 10% price return, i.e. 10%. The leverage factor is (total position)/(own equity) = $100,000/$50,000 = 2, and indeed the levered return (20%) equals 2 × the unlevered return (10%). Leverage of 2 doubles the percentage return in either direction.
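The leverage effect in MKT-Q5, and the loss comparison in MKT-C3, can be reproduced with a small Python sketch. The function name is hypothetical, and interest on the loan is ignored, as it is in the solutions above.

```python
def levered_return(asset_return, own_equity, loan):
    """Return on own equity for a position financed partly by a loan."""
    position = own_equity + loan
    # Loan interest is ignored, matching the simplification in the solutions.
    ending_equity = position * (1.0 + asset_return) - loan
    return ending_equity / own_equity - 1.0

gain = levered_return(0.10, 50_000, 50_000)      # MKT-Q5(a)
unlevered = levered_return(0.10, 100_000, 0)     # MKT-Q5(b)
loss = levered_return(-0.20, 100_000, 100_000)   # margin investor in MKT-C3
print(round(gain, 4), round(unlevered, 4), round(loss, 4))
```

With a leverage factor of 2, both the 10% gain and the 20% loss are doubled when measured against the investor's own capital.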

MKT-Q6. Proceeds from the short sale = $200 \times \$8.00 = \$1{,}600$. (a) Initial margin = 50% of the asset value = $0.50 \times \$1{,}600 = \$800$. (b) Equity = proceeds + initial margin − buy-back cost = $1600 + 800 - 200P = 2400 - 200P$, so the margin ratio is \[\text{Margin} = \frac{2400 - 200P}{200P}.\] (c) Setting the ratio equal to the maintenance level 0.30: $2400 - 200P = 0.30(200P) = 60P$, so $2400 = 260P$ and $P = 2400/260 = \$9.23$. She receives a margin call when the price rises to about $9.23 per share.

MKT-Q7. Proceeds = $100 \times \$5.00 = \$500$; initial margin = $0.50 \times \$500 = \$250$. Equity = $500 + 250 - 100P = 750 - 100P$, so \[\text{Margin} = \frac{750 - 100P}{100P}.\] Setting this equal to 0.30: $750 - 100P = 0.30(100P) = 30P$, so $750 = 130P$ and $P = 750/130 = \$5.77$. She receives a margin call when the price rises to about $5.77 per share.
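The short-sale margin calls in MKT-Q6 and MKT-Q7 share one rearrangement: setting (proceeds + posted margin − $NP$)/($NP$) equal to $m$ gives $P$ = (proceeds + posted margin)/($N(1+m)$). A minimal Python sketch, with a hypothetical function name:

```python
def short_margin_call_price(shares, sale_price, initial_margin, maintenance):
    """Price at which a short position's margin ratio hits the maintenance level."""
    proceeds = shares * sale_price
    posted = initial_margin * proceeds
    # (proceeds + posted - N*P) / (N*P) = m  =>  P = (proceeds + posted) / (N * (1 + m))
    return (proceeds + posted) / (shares * (1.0 + maintenance))

q6 = short_margin_call_price(200, 8.00, 0.50, 0.30)  # MKT-Q6 inputs
q7 = short_margin_call_price(100, 5.00, 0.50, 0.30)  # MKT-Q7 inputs
print(round(q6, 2), round(q7, 2))
```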

MKT-Q8. She shorted 200 shares at $8.00 and buys them back at $5.00 to close. Profit = $200 \times (\$8.00 - \$5.00) = 200 \times \$3.00 = \$600$. Her posted initial margin was $800 (from MKT-Q6), so the rate of return on that capital is $600/$800 = 0.75 = 75%.

17.2 Mean-Variance Portfolios

MV-C1. Because both investors use the same CAL, their complete portfolios lie on the single line $Er_C = r_f + \frac{\sigma_C}{\sigma_P}(Er_P - r_f)$; each investor simply picks a point on it. The point is pinned down by the standard deviation $\sigma_C = y\sigma_P$, so a larger $\sigma_C$ means a larger risky fraction $y$. Ben, with $\sigma_C = 0.15 > 0.05 = \sigma_A$, holds more of the risky asset and sits farther up and to the right on the CAL, so Ana is the more risk-averse investor: for a mean-variance utility $U = Er_C - \tfrac12 A \sigma_C^2$, a higher $A$ lowers the optimal $y$ and hence $\sigma_C$. Crucially, the composition of the risky portion is identical for both — both hold the same risky portfolio $P$ and differ only in how much of their wealth they put in it versus the risk-free asset. This is the separation property made visible: the CAL fixes the risky portfolio for everyone, and risk aversion only determines the position along the line.

MV-C2. The CAL slope $\frac{Er_P - r_f}{\sigma_P}$ — the Sharpe ratio — is determined entirely by the properties of the underlying risky portfolio and the risk-free rate, not by the broker. For the brokerage to genuinely offer a steeper line using the same risky portfolio, it would need a different (lower) $r_f$ at which clients can borrow or lend, or it would have to change $Er_P$ or $\sigma_P$ — but by assumption the risky portfolio is the same, so $Er_P$ and $\sigma_P$ are fixed. An ordinary broker cannot make the line steeper at will because the slope reflects the market’s actual reward-per-unit-of-risk tradeoff, which is a property of the assets, not a marketing choice. Economically, the slope is exactly that reward: the extra expected return the market grants per unit of standard deviation borne. A “steeper CAL” that is not backed by a better risky portfolio or a better risk-free rate is not a free lunch — it is either a repackaging or a claim of superior portfolio selection (a better tangency portfolio).

MV-C3. By $y = \frac{Er_P - r_f}{A\sigma^2_P}$, doubling the risk premium (with $A$ and $\sigma_P$ fixed) doubles the optimal $y$. Starting from $y = 1$, the investor should move toward $y = 2$ — i.e., borrow at the risk-free rate to lever up her position in the risky portfolio, moving up and to the right along the CAL past the point $P$. Both a very risk-averse and a nearly risk-neutral investor respond in the same direction (both increase $y$), because $y$ is proportional to the risk premium for every $A > 0$. They differ only in magnitude: the near-risk-neutral investor (small $A$) has a large $y$ that increases by a large absolute amount, while the very risk-averse investor (large $A$) has a small $y$ that increases by a smaller absolute amount. The sign of the response is universal; the size scales inversely with risk aversion.

MV-C4. In $y = \frac{Er_P - r_f}{A\sigma^2_P}$, risk aversion $A$ and variance $\sigma^2_P$ both sit in the denominator. A higher $A$ means the investor penalizes each unit of portfolio variance more heavily in $U = Er_C - \tfrac12 A\sigma^2_C$, so she demands less risk exposure and cuts $y$. A lower $\sigma^2_P$ means each unit of $y$ buys less risk (since $\sigma_C = y\sigma_P$), so for the same risk premium she can hold more of the risky asset, raising $y$. For two investors with the same $Er_P$, $\sigma_P$, $r_f$ but $A_2 = 2A_1$, the more risk-averse one holds exactly half the risky fraction: $y_2 = y_1/2$, placing her lower and to the left on the CAL. But the difference in $A$ changes only how much of the risky portfolio each holds; it does not change which risky portfolio — both hold the identical tangency portfolio, so composition is unaffected by risk aversion (separation).

MV-C5. Portfolio variance is $\sigma^2_P = w^2_D\sigma^2_D + w^2_E\sigma^2_E + 2w_Dw_E\sigma_D\sigma_E\rho_{DE}$; the cross term carries the correlation. Asset $D$ has strongly negative correlation with $E$, so adding it drives the covariance term negative and pulls total portfolio variance down, often by more than enough to offset $D$’s slightly higher own variance. Asset $F$, with the same variance as $E$ but positive correlation, adds a positive cross term and so offers less risk reduction. What matters for portfolio risk is not an asset’s own variance in isolation but its covariance with the rest of the portfolio; a modestly volatile asset that moves opposite the rest is more valuable than a same-variance asset that moves with the rest. An asset like $D$, with negative correlation to the other holdings, is called a hedge.

MV-C6. Holding $w_D, w_E, \sigma_D, \sigma_E$ fixed (with both weights positive), only the cross term $2w_Dw_E\sigma_D\sigma_E\rho_{DE}$ varies with $\rho_{DE}$, and it is strictly increasing in $\rho_{DE}$. So portfolio variance is strictly increasing in correlation: it is largest at $\rho_{DE}=1$ and smallest at $\rho_{DE}=-1$. At $\rho_{DE}=1$ the expression becomes the perfect square $(w_D\sigma_D + w_E\sigma_E)^2$, so $\sigma_P = w_D\sigma_D + w_E\sigma_E$, which is exactly the weighted average of the individual standard deviations, meaning no risk reduction at all. For any $\rho_{DE}<1$ the cross term is smaller, so $\sigma_P$ is strictly below that weighted average. This reveals that the diversification benefit comes entirely from imperfect correlation: it is the fact that assets do not move in lockstep, not the mere act of holding several assets, that lowers risk below the average of the parts.

MV-C7. The number of holdings is irrelevant; what matters is how they co-move. Writing $\sigma^2_P = \sum_j w_j\,Cov(r_j, r_P)$, total risk is the weight-averaged covariance of each holding with the whole portfolio. If 500 stocks are all strongly positively correlated — driven by common factors — then each $Cov(r_j, r_P)$ is large and positive, and the sum stays large despite the many names. Diversification requires low covariances, not a high count. This is exactly the Magnificent Seven point: once a handful of highly correlated technology names dominate the index, the “500-stock” fund behaves like a concentrated bet, because the covariances beneath it are large. Genuine diversification is a property of the covariance structure, not the number of tickers.

MV-C8. $Cov(r_j, r_P) = \sum_i w_i Cov(r_i, r_j)$ measures how asset $j$’s return moves together with the return of the whole portfolio it sits in. It is the right notion of the risk asset $j$ contributes because total portfolio variance decomposes exactly as $\sum_j w_j\,Cov(r_j, r_P) = Cov(r_P, r_P) = \sigma^2_P$: each asset’s contribution to $\sigma^2_P$ is $w_j\,Cov(r_j, r_P)$, not $w_j^2\sigma^2_j$. An asset’s own variance $\sigma^2_j$ overstates its marginal risk contribution if it moves opposite the rest of the portfolio, because part of that variance is offset by negative covariances with other holdings. An asset with low $Cov(r_j, r_P)$ contributes little to total risk; one with negative $Cov(r_j, r_P)$ actually reduces $\sigma^2_P$, which is why hedge assets are prized even when volatile on their own.
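The decomposition $\sum_j w_j\,Cov(r_j, r_P) = \sigma^2_P$ can be verified numerically. The Python sketch below uses an illustrative two-asset covariance matrix and equal weights; these inputs and the function names are assumptions chosen for the example, not data from the problem.

```python
def cov_with_portfolio(weights, sigma, j):
    """Cov(r_j, r_P) = sum_i w_i * Cov(r_i, r_j)."""
    return sum(w_i * sigma[i][j] for i, w_i in enumerate(weights))

def portfolio_variance(weights, sigma):
    """w' Sigma w computed directly as a double sum."""
    n = len(weights)
    return sum(weights[i] * weights[j] * sigma[i][j]
               for i in range(n) for j in range(n))

# Illustrative inputs (assumed, not taken from the problem).
sigma = [[0.04, 0.01], [0.01, 0.09]]
weights = [0.5, 0.5]

# Each asset's contribution to total variance is w_j * Cov(r_j, r_P).
contributions = [w * cov_with_portfolio(weights, sigma, j)
                 for j, w in enumerate(weights)]
total = sum(contributions)
direct = portfolio_variance(weights, sigma)
print(contributions, total, direct)
```

The contributions add up to the directly computed variance, and the second asset contributes more than the first even though the weights are equal.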

MV-C9. The minimum-variance frontier gives, for each target return $\bar r$, the lowest attainable variance. A portfolio strictly inside the frontier has more variance than some frontier portfolio at the same expected return (and less expected return than some frontier portfolio at the same variance). It is “dominated” because a mean-variance investor, who likes higher $Er$ and dislikes higher $\sigma$, can be made strictly better off. Two distinct improvements are available by moving to the frontier: (1) hold expected return fixed and slide left to the frontier, lowering variance for the same return; or (2) hold variance fixed and slide up to the frontier, raising expected return for the same risk. Because at least one of these strict improvements is always possible, no rational mean-variance investor holds an interior portfolio.

MV-C10. From $\sigma^2_P = \frac{C\bar r^2 - 2A\bar r + B}{D}$, variance is a parabola in $\bar r$ opening upward with vertex at $\bar r^\ast = A/C$. Because a parabola is symmetric about its vertex, two target returns equidistant from $\bar r^\ast$ — one above, one below — yield the same variance but different expected returns. A mean-variance investor always prefers the one with the higher expected return, since more expected return at equal variance is strictly better. Hence only the branch with $\bar r \ge \bar r^\ast = A/C$ is ever chosen; portfolios below the vertex are dominated by their mirror images above it. The acceptable upper half of the frontier is called the efficient frontier.

MV-C11. The skeptic is right that the retiree and the 25-year-old hold different complete portfolios, but that is fully consistent with separation. Separation says the composition of the risky portion is the same for both: both hold the identical tangency portfolio as their risky holding. Where they differ is in the split between that risky portfolio and the risk-free asset, i.e., the fraction $y = \frac{Er_P - r_f}{A\sigma^2_P}$. The cautious retiree (high $A$) holds a small $y$, mostly risk-free, while the aggressive youth (low $A$) holds a large $y$, possibly levered. The difference shows up in the risk-free/risky mix; it does not show up in the internal makeup of the risky holding, which is common to all investors who agree on the inputs.

MV-C12. Because every frontier portfolio can be written $w(\bar r) = g + \bar r\, h$ with $g, h$ fixed vectors, the entire frontier is a one-parameter straight line in weight space indexed by $\bar r$. Pick any two distinct frontier portfolios, say $w(\bar r_1)$ and $w(\bar r_2)$; any other frontier portfolio $w(\bar r)$ is an affine combination of these two (choosing the mixing coefficient so the target returns line up), because affine combinations of points on a line stay on the line. Thus two fixed funds span the whole frontier — the two-fund separation result. When a risk-free asset is added, the efficient set collapses to the single straight CAL from $r_f$ to the tangency portfolio; the only risky fund anyone needs is that tangency portfolio, so the “two funds” become the risk-free asset plus one common risky portfolio, identical across investors.

MV-C13. The weights $w^\ast = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}$ solve $\min w'\Sigma w$ s.t. $w'\mathbf{1}=1$ with no return constraint imposed — the objective is pure variance and the only side condition is that weights sum to one. Since neither the objective ($\Sigma$) nor the constraint ($\mathbf{1}$) involves $\mu$, expected returns never enter, and the answer depends solely on the covariance structure. Intuitively, if you only want the least-risky portfolio and do not care what return you earn, expected returns are simply irrelevant to the problem. A practitioner might deliberately hold $w^\ast$ when expected-return estimates are too unreliable to trust: covariances are estimated far more precisely than means, so a portfolio that ignores $\mu$ entirely can be more robust, accepting lower expected return in exchange for stability and minimal risk.

MV-C14. The weights are proportional to $\Sigma^{-1}\mathbf{1}$. Loosely, $\Sigma^{-1}$ “downweights” directions of high variance and high covariance: assets that are volatile or strongly co-move with others receive smaller entries in $\Sigma^{-1}\mathbf{1}$, while low-variance, low-covariance assets receive larger ones, because putting weight on the latter reduces total variance most effectively. (In the diagonal, uncorrelated case this is exact: weights are proportional to $1/\sigma^2_i$.) In practice, sample expected returns are extremely noisy, and portfolio rules that target a particular $\bar r$ amplify that noise, producing unstable, extreme weights. Because $w^\ast$ uses only $\Sigma$ — which is estimated much more accurately — it sidesteps the worst estimation errors, giving stable, well-behaved weights. That robustness is why some practitioners favor the global minimum-variance portfolio even though it makes no attempt to earn a high expected return.

MV-Q1. (a) The CAL is $Er_C = r_f + \frac{\sigma_C}{\sigma_P}(Er_P - r_f) = 0.03 + \frac{0.11 - 0.03}{0.20}\,\sigma_C = 0.03 + 0.4\,\sigma_C$. (b) The Sharpe ratio is the slope, $\frac{Er_P - r_f}{\sigma_P} = \frac{0.08}{0.20} = 0.4$. (c) At $\sigma_C = 0.08$: $Er_C = 0.03 + 0.4(0.08) = 0.03 + 0.032 = 0.062$, i.e. $6.2\%$.

MV-Q2. (a) Solve $Er_C = r_f + \frac{\sigma_C}{\sigma_P}(Er_P - r_f)$ for $\sigma_C$: $0.06 = 0.02 + \frac{\sigma_C}{0.25}(0.10 - 0.02) = 0.02 + 0.32\,\sigma_C$, so $\sigma_C = \frac{0.04}{0.32} = 0.125$, i.e. $12.5\%$. (b) Since $\sigma_C = y\sigma_P$, $y = \frac{\sigma_C}{\sigma_P} = \frac{0.125}{0.25} = 0.5$. She holds half her wealth in the risky portfolio.
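The CAL calculations in MV-Q1 and MV-Q2 use the same line read in two directions: from $\sigma_C$ to $Er_C$, and from a target $Er_C$ back to $\sigma_C$. A minimal Python sketch, with hypothetical function names:

```python
def cal_expected_return(r_f, er_p, sd_p, sd_c):
    """Expected return on the CAL at complete-portfolio risk sd_c."""
    sharpe = (er_p - r_f) / sd_p   # slope of the CAL
    return r_f + sharpe * sd_c

def cal_sigma_for_target(r_f, er_p, sd_p, target_return):
    """Standard deviation on the CAL needed to reach target_return."""
    sharpe = (er_p - r_f) / sd_p
    return (target_return - r_f) / sharpe

er_q1 = cal_expected_return(0.03, 0.11, 0.20, 0.08)    # MV-Q1(c)
sd_q2 = cal_sigma_for_target(0.02, 0.10, 0.25, 0.06)   # MV-Q2(a)
y_q2 = sd_q2 / 0.25                                    # MV-Q2(b): y = sigma_C / sigma_P
print(round(er_q1, 4), round(sd_q2, 4), round(y_q2, 4))
```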

MV-Q3. (a) $y = \frac{Er_P - r_f}{A\sigma^2_P} = \frac{0.12 - 0.04}{4 \times (0.20)^2} = \frac{0.08}{4 \times 0.04} = \frac{0.08}{0.16} = 0.5$. (b) $Er_C = r_f + y(Er_P - r_f) = 0.04 + 0.5(0.08) = 0.08$, i.e. $8\%$; $\sigma_C = y\sigma_P = 0.5 \times 0.20 = 0.10$, i.e. $10\%$. (c) With $A = 2$: $y = \frac{0.08}{2 \times 0.04} = \frac{0.08}{0.08} = 1.0$. Halving risk aversion doubles the risky fraction (from $0.5$ to $1.0$); the less risk-averse investor holds the entire portfolio in the risky asset, sitting farther out on the CAL.

MV-Q4. (a) From $y = \frac{Er_P - r_f}{A\sigma^2_P}$: $0.5 = \frac{Er_P - r_f}{3 \times (0.20)^2} = \frac{Er_P - r_f}{3 \times 0.04} = \frac{Er_P - r_f}{0.12}$, so $Er_P - r_f = 0.5 \times 0.12 = 0.06$, a risk premium of $6\%$. (b) With $\sigma_P = 0.25$: $y = \frac{0.06}{3 \times (0.25)^2} = \frac{0.06}{3 \times 0.0625} = \frac{0.06}{0.1875} = 0.32$. The optimal risky fraction falls from $0.5$ to $0.32$ because higher portfolio variance makes each unit of $y$ riskier, so the investor reduces her exposure.
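The optimal risky fraction used in MV-Q3 and MV-Q4 is a one-line formula, sketched here in Python. The function name is an illustrative assumption; the risk premium is passed in directly so the same function covers both problems.

```python
def optimal_risky_fraction(risk_premium, risk_aversion, sd_p):
    """y* = (E[r_P] - r_f) / (A * sigma_P^2) for mean-variance utility."""
    return risk_premium / (risk_aversion * sd_p ** 2)

y_q3a = optimal_risky_fraction(0.12 - 0.04, 4, 0.20)  # MV-Q3(a)
y_q3c = optimal_risky_fraction(0.12 - 0.04, 2, 0.20)  # MV-Q3(c)
y_q4b = optimal_risky_fraction(0.06, 3, 0.25)         # MV-Q4(b)
print(round(y_q3a, 4), round(y_q3c, 4), round(y_q4b, 4))
```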

MV-Q5. $\sigma^2_P = (0.4)^2(0.30)^2 + (0.6)^2(0.20)^2 + 2(0.4)(0.6)(0.30)(0.20)(0.25)$. Term by term: $0.16 \times 0.09 = 0.0144$; $0.36 \times 0.04 = 0.0144$; cross term $= 2 \times 0.24 \times 0.06 \times 0.25 = 2 \times 0.0036 = 0.0072$. So $\sigma^2_P = 0.0144 + 0.0144 + 0.0072 = 0.0360$, and $\sigma_P = \sqrt{0.036} = 0.1897 \approx 0.19$. The weighted average of the standard deviations is $0.4(0.30) + 0.6(0.20) = 0.12 + 0.12 = 0.24$. The portfolio’s $\sigma_P \approx 0.19$ is well below $0.24$; the gap ($\approx 0.05$) is the diversification benefit, arising because $\rho_{DE} = 0.25 < 1$.
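The MV-Q5 computation, and the way $\sigma_P$ responds to correlation, can be checked with a short Python sketch. The function name is hypothetical; the sweep over $\rho_{DE}$ values is an added illustration that reuses the weights and standard deviations from the problem.

```python
import math

def two_asset_std(w_d, w_e, sd_d, sd_e, rho):
    """Standard deviation of a two-asset portfolio."""
    variance = ((w_d * sd_d) ** 2 + (w_e * sd_e) ** 2
                + 2 * w_d * w_e * sd_d * sd_e * rho)
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding

sigma_q5 = two_asset_std(0.4, 0.6, 0.30, 0.20, 0.25)
weighted_avg = 0.4 * 0.30 + 0.6 * 0.20

# Same weights and volatilities, with the correlation varied from -1 to +1.
sweep = [two_asset_std(0.4, 0.6, 0.30, 0.20, rho)
         for rho in (-1.0, -0.5, 0.0, 0.5, 1.0)]
print(round(sigma_q5, 4), round(weighted_avg, 2))
print([round(s, 4) for s in sweep])
```

The swept values rise with the correlation and reach the weighted average of the standard deviations only at $\rho_{DE} = 1$.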

MV-Q6. With $\rho_{DE} = -1$, $\sigma_P = 0$ requires $w_D\sigma_D - w_E\sigma_E = 0$, i.e. $w_D(0.40) = w_E(0.10)$, so $w_D = 0.25\,w_E$. Combined with $w_D + w_E = 1$: $0.25w_E + w_E = 1 \Rightarrow 1.25\,w_E = 1 \Rightarrow w_E = 0.8$, $w_D = 0.2$. Verification: $w_D\sigma_D - w_E\sigma_E = 0.2(0.40) - 0.8(0.10) = 0.08 - 0.08 = 0$, so $\sigma^2_P = 0^2 = 0$. The portfolio is risk-free.
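For $\rho_{DE} = -1$ the zero-variance weights have a closed form: $w_D\sigma_D = w_E\sigma_E$ with $w_D + w_E = 1$ gives $w_D = \sigma_E/(\sigma_D + \sigma_E)$. The Python sketch below, with a hypothetical function name, applies it to the MV-Q6 inputs and confirms that the variance vanishes.

```python
def zero_variance_weights(sd_d, sd_e):
    """Weights (w_d, w_e) that give sigma_P = 0 when rho = -1."""
    # w_d * sd_d = w_e * sd_e with w_d + w_e = 1  =>  w_d = sd_e / (sd_d + sd_e)
    w_d = sd_e / (sd_d + sd_e)
    return w_d, 1.0 - w_d

w_d, w_e = zero_variance_weights(0.40, 0.10)

# Two-asset variance with rho = -1, so the cross term enters with a minus sign.
variance = (w_d * 0.40) ** 2 + (w_e * 0.10) ** 2 - 2 * w_d * w_e * 0.40 * 0.10
print(round(w_d, 2), round(w_e, 2), round(variance, 12))
```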

MV-Q7. (a) $D = BC - A^2 = (0.30)(20) - (2.0)^2 = 6.0 - 4.0 = 2.0$. (b) $\sigma^2_P = \frac{C\bar r^2 - 2A\bar r + B}{D} = \frac{20(0.15)^2 - 2(2.0)(0.15) + 0.30}{2.0} = \frac{20(0.0225) - 0.60 + 0.30}{2.0} = \frac{0.45 - 0.60 + 0.30}{2.0} = \frac{0.15}{2.0} = 0.075$. (c) $\bar r^\ast = A/C = 2.0/20 = 0.10$; $\sigma^2_{min} = 1/C = 1/20 = 0.05$.

MV-Q8. (a) $D = BC - A^2 = (0.20)(8) - (1.2)^2 = 1.6 - 1.44 = 0.16$. (b) At $\bar r = 0.10$: $\sigma^2_P = \frac{8(0.10)^2 - 2(1.2)(0.10) + 0.20}{0.16} = \frac{0.08 - 0.24 + 0.20}{0.16} = \frac{0.04}{0.16} = 0.25$. At $\bar r = 0.20$: $\sigma^2_P = \frac{8(0.20)^2 - 2(1.2)(0.20) + 0.20}{0.16} = \frac{0.32 - 0.48 + 0.20}{0.16} = \frac{0.04}{0.16} = 0.25$. (c) Both give $0.25$ because $\bar r^\ast = A/C = 1.2/8 = 0.15$ is exactly midway, so $0.10$ and $0.20$ are symmetric about the vertex and yield equal variance. At $\bar r^\ast = 0.15$: $\sigma^2_P = \frac{8(0.0225) - 2(1.2)(0.15) + 0.20}{0.16} = \frac{0.18 - 0.36 + 0.20}{0.16} = \frac{0.02}{0.16} = 0.125 = 1/8 = 1/C$, confirming the minimum equals $1/C$.
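The frontier parabola used in MV-Q7 and MV-Q8 is easy to evaluate in code. The Python sketch below takes the constants $A$, $B$, $C$ as given inputs; the function name is an illustrative assumption.

```python
def frontier_variance(a, b, c, target_return):
    """Minimum variance at a target return, given the frontier constants A, B, C."""
    d = b * c - a ** 2   # D = BC - A^2
    return (c * target_return ** 2 - 2 * a * target_return + b) / d

q7_b = frontier_variance(2.0, 0.30, 20, 0.15)    # MV-Q7(b)
q7_min = frontier_variance(2.0, 0.30, 20, 0.10)  # at the vertex A/C = 0.10
q8_low = frontier_variance(1.2, 0.20, 8, 0.10)   # MV-Q8(b)
q8_high = frontier_variance(1.2, 0.20, 8, 0.20)  # MV-Q8(b)
q8_min = frontier_variance(1.2, 0.20, 8, 0.15)   # at the vertex A/C = 0.15
print(round(q7_b, 4), round(q7_min, 4))
print(round(q8_low, 4), round(q8_high, 4), round(q8_min, 4))
```

Evaluating at the vertex reproduces $1/C$ in both problems, and the two MV-Q8 targets on either side of the vertex give the same variance.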

MV-Q9. $\Sigma^{-1} = \frac{1}{\det\Sigma}\begin{pmatrix} 0.09 & -0.01 \\ -0.01 & 0.04 \end{pmatrix}$ with $\det\Sigma = (0.04)(0.09) - (0.01)^2 = 0.0036 - 0.0001 = 0.0035$. Then $\Sigma^{-1}\mathbf{1} = \frac{1}{0.0035}\begin{pmatrix} 0.09 - 0.01 \\ -0.01 + 0.04 \end{pmatrix} = \frac{1}{0.0035}\begin{pmatrix} 0.08 \\ 0.03 \end{pmatrix}$. The sum of these entries is $\mathbf{1}'\Sigma^{-1}\mathbf{1} = \frac{0.08 + 0.03}{0.0035} = \frac{0.11}{0.0035} = 31.43 = C$. Normalizing, $w^\ast = \frac{1}{0.11}\begin{pmatrix} 0.08 \\ 0.03 \end{pmatrix} = \begin{pmatrix} 0.727 \\ 0.273 \end{pmatrix}$. Minimum variance $= 1/C = 0.0035/0.11 = 0.0318$. Asset 1 gets the larger weight ($\approx 0.73$) because it has the lower variance ($0.04$ vs. $0.09$); the min-variance portfolio tilts toward the less volatile asset.

MV-Q10. For a diagonal $\Sigma$, $\Sigma^{-1} = \begin{pmatrix} 1/0.01 & 0 \\ 0 & 1/0.04 \end{pmatrix} = \begin{pmatrix} 100 & 0 \\ 0 & 25 \end{pmatrix}$. Then $\Sigma^{-1}\mathbf{1} = \begin{pmatrix} 100 \\ 25 \end{pmatrix}$ and $\mathbf{1}'\Sigma^{-1}\mathbf{1} = 100 + 25 = 125 = C$. So $w^\ast = \frac{1}{125}\begin{pmatrix} 100 \\ 25 \end{pmatrix} = \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}$, and the minimum variance is $1/C = 1/125 = 0.008$. The weights $0.8:0.2$ are in ratio $100:25 = (1/0.01):(1/0.04)$, i.e. inversely proportional to the variances. This makes sense: with no correlation, each asset’s weight depends only on its own variance, so the safer asset (variance $0.01$) gets four times the weight of the riskier one (variance $0.04$). The riskier asset still receives a positive weight because spreading wealth across uncorrelated assets pushes the portfolio variance ($0.008$) below that of the safer asset held alone ($0.01$).
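The inverse-variance rule extends to any number of uncorrelated assets. The following sketch assumes a diagonal covariance matrix; the name `inverse_variance_weights` is introduced here for illustration and is not from the chapter.

```python
def inverse_variance_weights(variances):
    """Minimum-variance weights when all covariances are zero."""
    inverse = [1.0 / v for v in variances]
    c = sum(inverse)  # C = 1' Sigma^{-1} 1 for a diagonal Sigma
    return [x / c for x in inverse], 1.0 / c

weights, var_min = inverse_variance_weights([0.01, 0.04])
print(f"weights = {[round(w, 3) for w in weights]}, minimum variance = {var_min:.4f}")
```

Note that the resulting variance is lower than the variance of the safer asset held alone, which is the diversification benefit at work.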

17.3 The Capital Asset Pricing Model

CAPM-C1. Because both investors have homogeneous expectations and are mean-variance optimizers, they face identical inputs $(\mu,\Sigma,r_f)$ and solve the same optimization problem. The first-order condition gives risky weights proportional to $\Sigma^{-1}(\mu - r_f\mathbf{1})$ for every investor; the only thing that differs across them is the scalar $1/A_i$ that multiplies this common vector. Normalizing to unit weight, both investors hold the identical tangency portfolio $w_T$, so the composition of the risky part is the same. Each investor expresses her risk preference solely through the fraction $y$ of wealth she allocates to that risky portfolio versus the risk-free asset: the cautious retiree chooses a small $y$ (much in the risk-free asset), while the aggressive professional chooses a large $y$ (possibly borrowing, $y>1$). The risk-free asset is the “dial” that scales overall risk without changing the mix of risky holdings — this is the separation property.

CAPM-C2. Every investor holds the same risky portfolio with the same weights on each asset. Since all assets are publicly traded and must be held by someone, aggregating across all investors means the common weights must equal the value weights of the assets in the market as a whole. Concretely, if every investor puts, say, 20% of her risky wealth in asset 1, then across all investors 20% of aggregate risky wealth is in asset 1, so asset 1 constitutes 20% of the market’s value. Hence the common risky portfolio must be the value-weighted market portfolio. If the common risky portfolio placed zero weight on some traded asset, then no one would hold it, yet the asset exists and must be held — a contradiction of market clearing. Its price would have to fall until its expected return rose enough for investors to want to hold it, restoring positive weight.

CAPM-C3. With homogeneous expectations, all investors agree on $(\mu,\Sigma)$ and therefore compute the same tangency portfolio, which market clearing forces to be the market portfolio. If a group of investors instead believes technology stocks have a higher variance (or covariances) than everyone else, that group solves a different mean-variance problem: their $\Sigma$ differs, so their $\Sigma^{-1}(\mu-r_f\mathbf{1})$ differs, and they choose a different-composition risky portfolio — typically underweighting tech relative to the rest of the market. Because investors now hold different risky portfolios, there is no single portfolio that is simultaneously tangency-optimal for everyone. The aggregate of these differing portfolios is still the market (all assets are held), but the market portfolio is no longer the tangency portfolio for any given investor, and the clean single-beta CAPM pricing relation breaks down.

CAPM-C4. Mean-variance optimization (Assumption 5) means every investor cares only about the mean and variance of her portfolio return, so her demand is fully summarized by the first-order condition $w_i^* = \tfrac{1}{A_i}\Sigma^{-1}(\mu - r_f\mathbf{1})$; the form of the solution is common to all. Homogeneous expectations (Assumption 6) means the inputs $\mu$ and $\Sigma$ in that solution are literally the same numbers for everyone, so the direction $\Sigma^{-1}(\mu - r_f\mathbf{1})$ — and hence the composition of the risky portfolio — is identical across investors. Market clearing then forces this common tangency portfolio to be the market portfolio. If investors instead disagreed about expected returns (different $\mu_i$), they would compute different directions $\Sigma^{-1}(\mu_i - r_f\mathbf{1})$ and hold different-composition risky portfolios; no single portfolio would be tangency-optimal for all, and the identification of the market portfolio with the tangency portfolio would fail.

CAPM-C5. In equilibrium the expected excess return on any asset is $Er_i - r_f = \beta_i(Er_M - r_f)$ with $\beta_i = Cov(r_i,r_M)/\sigma^2_M$. An asset’s standalone variance $\sigma^2_i$ can be decomposed as $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i}$; the idiosyncratic piece $\sigma^2_{\epsilon_i}$ can be diversified away for free and earns no reward, so only beta is priced. Asset A, with $\beta_A = 1.4$, contributes more to the variance of the market portfolio — the only portfolio anyone holds — than asset B with $\beta_B = 0.3$, so A must offer the higher expected return to be held willingly. The fact that A and B have equal $\sigma_i$ is irrelevant: the CAPM prices the systematic contribution to aggregate risk (beta), not total variance, because a diversified investor bears only the systematic part.

CAPM-C6. Beta, $\beta_i = Cov(r_i,r_M)/\sigma^2_M$, measures how much an asset moves with the market and hence how much it adds to the variance of the market portfolio — the only portfolio a diversified investor holds. A stock can have very high standalone volatility $\sigma_i$ yet low beta if most of that volatility is idiosyncratic (uncorrelated with the market). Because idiosyncratic risk diversifies away at no cost, the market pays nothing for it, so a high-$\sigma_i$ but low-$\beta_i$ stock does not command a large premium. If the stock had a beta of exactly zero, its covariance with the market is zero, it adds nothing to market variance, and in equilibrium it earns only the risk-free rate — despite its large standalone volatility. The advisor’s error is confusing total variance with systematic (priced) risk.

CAPM-C7. The equilibrium condition $Er_M - r_f = \bar A\,\sigma^2_M$ says the market risk premium is the product of the average (wealth-weighted harmonic mean) risk aversion $\bar A$ and market variance $\sigma^2_M$. Holding $\sigma^2_M$ fixed, if investors collectively become more risk-averse during a panic — each $A_i$ rises, so $\bar A$ rises — the equilibrium premium $\bar A\,\sigma^2_M$ rises. Intuitively, more risk-averse investors demand a larger reward per unit of market variance to be induced to hold the same aggregate risky supply; since supply is fixed, prices fall and expected returns (the premium) rise until markets clear. The mechanism is entirely on the risk-aversion side, so no change in $\sigma^2_M$ is required.

CAPM-C8. Each investor $i$ places dollars $W_i y_i$ into the risky market portfolio and $W_i(1-y_i)$ into the risk-free asset. Because any borrowing by one investor (negative risk-free holding) must be matched by lending from another, the risk-free asset is in zero net supply: $\sum_i W_i(1-y_i)=0$. This is equivalent to $\sum_i W_i y_i = W$, i.e. the wealth-weighted average of $y$ equals one, $\sum_i (W_i/W)y_i = 1$. Equivalently, all aggregate wealth must be held in real (risky) assets because that is the only net supply that exists. Substituting the individual demands $y_i = (Er_M-r_f)/(A_i\sigma^2_M)$ into this clearing condition and solving is exactly what pins down the market risk premium $Er_M - r_f = \bar A\sigma^2_M$.
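The clearing argument can be illustrated with a small numerical sketch. The two investors below (their wealth and risk-aversion values) are hypothetical and chosen only for illustration; the function name `market_risk_premium` is likewise an assumption of this example.

```python
def market_risk_premium(wealth, risk_aversion, var_m):
    """Return (A_bar, premium), where A_bar is the wealth-weighted harmonic mean of A_i."""
    total = sum(wealth)
    a_bar = 1.0 / sum((w / total) / a for w, a in zip(wealth, risk_aversion))
    return a_bar, a_bar * var_m

wealth = [100.0, 300.0]      # hypothetical investor wealth W_i
risk_aversion = [2.0, 4.0]   # hypothetical risk aversion A_i
var_m = 0.04                 # hypothetical market variance

a_bar, premium = market_risk_premium(wealth, risk_aversion, var_m)

# Individual demands y_i = premium / (A_i * var_m) at the clearing premium.
y = [premium / (a * var_m) for a in risk_aversion]

# Net dollars held in the risk-free asset should sum to zero.
net_risk_free = sum(w * (1.0 - yi) for w, yi in zip(wealth, y))

print(f"A_bar = {a_bar:.2f}, premium = {premium:.4f}")
print(f"risky shares = {[round(v, 2) for v in y]}, net risk-free holding = {net_risk_free:.6f}")
```

In this example the less risk-averse investor borrows ($y > 1$) and the more risk-averse investor lends ($y < 1$), and the two positions offset exactly.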

CAPM-C9. Writing $r_i = \alpha_i + \beta_i r_M + \epsilon_i$ and taking variances gives $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i}$. The systematic term $\beta_i^2\sigma^2_M$ is the risk that comes from the asset’s exposure to economy-wide movements — it is shared with the whole market and cannot be escaped by diversification. The idiosyncratic term $\sigma^2_{\epsilon_i}$ is firm-specific: it reflects events affecting this asset alone and is uncorrelated with the market. A concrete example that would appear in $\epsilon_i$ but not in $r_M$ is a company-specific event such as a product recall, a failed drug trial, a lawsuit, or a surprise CEO resignation at that one firm — such news moves that stock without moving the broad market. By contrast, a change in interest rates or a recession moves $r_M$ and is systematic.

CAPM-C10. The residual $\epsilon_i$ is defined as the part of $r_i$ left over after projecting on the market: it is the regression residual, and by construction of a regression it is uncorrelated with the regressor, $Cov(\epsilon_i, r_M)=0$, with $E\epsilon_i = 0$. When we take the variance of $r_i = \alpha_i + \beta_i r_M + \epsilon_i$, the cross term is $2\beta_i Cov(r_M,\epsilon_i)$, which is exactly zero by that construction. That is why the variance splits cleanly into $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i}$ with no cross term. The first term, $\beta_i^2\sigma^2_M$, captures the asset’s exposure to economy-wide (market) movements; the second, $\sigma^2_{\epsilon_i}$, captures firm-specific events uncorrelated with the market.

CAPM-C11. For an equally weighted portfolio of $n$ uncorrelated-residual assets, the idiosyncratic variance is $Var\!\big(\tfrac1n\sum\epsilon_i\big) = \tfrac1{n^2}\sum\sigma^2_{\epsilon_i} \le \bar\sigma^2_\epsilon/n$, which shrinks toward zero as $n$ grows. With only three stocks, $n=3$ is small, so the firm-specific portion of the variance is large and the portfolio swings on idiosyncratic news. Adding many more uncorrelated stocks drives $\bar\sigma^2_\epsilon/n \to 0$ at essentially no cost (no expected-return sacrifice, since idiosyncratic risk is unpriced). What remains no matter how many stocks she adds is the systematic variance $\bar\beta^2\sigma^2_M$, the market-risk portion, which does not diversify away because it is shared across all assets.
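A short numerical sketch makes the $1/n$ effect concrete. The inputs below (average beta, market variance, average residual variance) are hypothetical values chosen for illustration, and `variance_parts` is a name introduced only for this example.

```python
def variance_parts(n, beta_bar, var_m, avg_var_eps):
    """Systematic and idiosyncratic variance of an equally weighted n-stock portfolio,
    assuming the residuals are uncorrelated across stocks."""
    systematic = beta_bar ** 2 * var_m
    idiosyncratic = avg_var_eps / n
    return systematic, idiosyncratic

beta_bar, var_m, avg_var_eps = 1.0, 0.04, 0.09  # hypothetical inputs

results = {n: variance_parts(n, beta_bar, var_m, avg_var_eps) for n in (3, 30, 300)}
for n, (systematic, idiosyncratic) in results.items():
    print(f"n = {n:3d}: systematic = {systematic:.4f}, idiosyncratic = {idiosyncratic:.4f}")
```

The systematic column does not change with $n$, while the idiosyncratic column falls by a factor of ten each time the number of stocks rises tenfold.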

CAPM-C12. In $r_P = \bar\alpha + \bar\beta\, r_M + \tfrac1n\sum\epsilon_i$, the systematic loading is the average beta $\bar\beta = \tfrac1n\sum\beta_i$, which stays of order one as $n$ grows, so the systematic variance $\bar\beta^2\sigma^2_M$ does not shrink. The idiosyncratic term is an average of uncorrelated shocks, whose variance falls like $1/n$. The fundamental difference is correlation: idiosyncratic shocks are uncorrelated across assets, so averaging them causes cancellation (the law-of-large-numbers effect), whereas the market factor $r_M$ hits every asset in the same direction and cannot cancel. Systematic risk is common risk that everyone bears together; there is no free lunch because it cannot be netted out by combining assets. Only idiosyncratic risk is “free” to eliminate.

CAPM-C13. Suppose an asset offered a positive premium as compensation for its idiosyncratic risk $\sigma^2_{\epsilon_i}$. Investors would recognize that this firm-specific risk can be diversified away at no cost by holding the asset inside a large well-diversified portfolio. They would rush to buy the asset to capture the “free” extra return without bearing extra undiversifiable risk. This buying pressure raises the asset’s price and lowers its expected return, and it continues until the idiosyncratic premium is competed down to zero. In equilibrium, then, only systematic risk — the part that survives in every diversified portfolio and cannot be escaped — is compensated. This is precisely why the Security Market Line prices assets by beta rather than by total variance.

CAPM-C14. By the Security Market Line, $Er_i - r_f = \beta_i(Er_M - r_f)$. With $\beta_i = 0$, the asset earns only the risk-free rate, $Er_i = r_f$, regardless of how large its standalone standard deviation is. The reason is the decomposition $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i}$: with $\beta_i = 0$ the systematic part vanishes and all of the stock’s large variance is idiosyncratic, which diversifies away for free inside the market portfolio and hence earns no premium. A naive investor who prices the stock by its total volatility would demand a large risk premium and expect a high return; the CAPM says she is wrong, because the market pays only for the systematic (beta) contribution, which here is zero.

CAPM-Q1. Apply the Security Market Line $Er_i - r_f = \beta_i(Er_M - r_f)$. The market risk premium is $Er_M - r_f = 0.09 - 0.02 = 0.07$. Then $Er_i - r_f = 1.3 \times 0.07 = 0.091$, so $Er_i = 0.02 + 0.091 = 0.111$. The asset’s equilibrium expected return is $Er_i = 0.111$ (11.1%).

CAPM-Q2. With premium $Er_M - r_f = 0.06$: for asset A, $Er_A = r_f + \beta_A(Er_M - r_f) = 0.03 + 0.5\times 0.06 = 0.03 + 0.03 = 0.06$ (6%). For asset B, $Er_B = 0.03 + 1.6\times 0.06 = 0.03 + 0.096 = 0.126$ (12.6%). The difference is $Er_B - Er_A = 0.126 - 0.06 = 0.066$ (6.6 percentage points), equal to $(\beta_B - \beta_A)(Er_M - r_f) = 1.1\times 0.06$.
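The Security Market Line is a one-line function, which makes the comparison easy to reproduce. This is an illustrative sketch; the function name `sml_expected_return` is an assumption of this example.

```python
def sml_expected_return(risk_free, beta, market_premium):
    """Equilibrium expected return from the SML: r_f + beta * (E r_M - r_f)."""
    return risk_free + beta * market_premium

risk_free, market_premium = 0.03, 0.06

er_a = sml_expected_return(risk_free, 0.5, market_premium)
er_b = sml_expected_return(risk_free, 1.6, market_premium)
difference = er_b - er_a

print(f"E r_A = {er_a:.3f}, E r_B = {er_b:.3f}, difference = {difference:.3f}")
```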

CAPM-Q3. First, $\beta_i = Cov(r_i,r_M)/\sigma^2_M = 0.018/0.036 = 0.5$. Then apply the SML: $Er_i - r_f = \beta_i(Er_M - r_f) = 0.5 \times 0.055 = 0.0275$, so $Er_i = 0.025 + 0.0275 = 0.0525$. The stock’s expected return is $Er_i = 0.0525$ (5.25%).

CAPM-Q4. The covariance is $Cov(r_i,r_M) = \rho_{iM}\,\sigma_i\,\sigma_M = 0.4 \times 0.30 \times 0.20 = 0.024$. The market variance is $\sigma^2_M = (0.20)^2 = 0.04$. Therefore $\beta_i = Cov(r_i,r_M)/\sigma^2_M = 0.024/0.04 = 0.6$. The stock’s beta is $\beta_i = 0.6$.
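The two steps (covariance, then division by market variance) can be sketched as follows. The function name `beta_from_correlation` is hypothetical and used only for this illustration.

```python
def beta_from_correlation(rho, sigma_i, sigma_m):
    """Beta = Cov(r_i, r_M) / Var(r_M), with Cov = rho * sigma_i * sigma_m."""
    if sigma_m <= 0:
        raise ValueError("market volatility must be positive")
    covariance = rho * sigma_i * sigma_m
    return covariance / sigma_m ** 2

beta = beta_from_correlation(rho=0.4, sigma_i=0.30, sigma_m=0.20)
print(f"beta = {beta:.2f}")
```

Because one factor of $\sigma_M$ cancels, beta can equivalently be read as $\rho_{iM}\,\sigma_i/\sigma_M$.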

CAPM-Q5. Using $Er_M - r_f = \bar A\,\sigma^2_M = 2.5 \times 0.04 = 0.10$, the equilibrium market risk premium is $0.10$ (10%). With $r_f = 0.03$, the expected market return is $Er_M = r_f + 0.10 = 0.03 + 0.10 = 0.13$ (13%).

CAPM-Q6. Solve $Er_M - r_f = \bar A\,\sigma^2_M$ for $\bar A$: $\bar A = (Er_M - r_f)/\sigma^2_M = 0.06/0.05 = 1.2$. The implied average degree of risk aversion is $\bar A = 1.2$.

CAPM-Q7. The demand rule gives $y = (Er_M - r_f)/(A\sigma^2_M) = 0.07/(3 \times 0.04) = 0.07/0.12 \approx 0.583$. Since $y < 1$, she places about 58.3% of wealth in the risky market portfolio and the remaining 41.7% in the risk-free asset, so she is a net lender at the risk-free rate.
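The demand rule and the lender/borrower classification can be sketched in code. The names `risky_share` and `position` are introduced here for illustration only.

```python
def risky_share(market_premium, risk_aversion, var_m):
    """Optimal fraction of wealth in the market portfolio: premium / (A * var_m)."""
    return market_premium / (risk_aversion * var_m)

y = risky_share(market_premium=0.07, risk_aversion=3.0, var_m=0.04)

# y < 1 leaves wealth in the risk-free asset (lending); y > 1 requires borrowing.
if y < 1.0:
    position = "lender"
elif y > 1.0:
    position = "borrower"
else:
    position = "neither"

print(f"y = {y:.3f}, risk-free share = {1.0 - y:.3f}, position: {position}")
```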

CAPM-Q8. Set $y = 1$ in $y = (Er_M - r_f)/(A\sigma^2_M)$: $1 = 0.05/(A \times 0.025)$, so $A = 0.05/0.025 = 2$. An investor with $A = 2$ holds exactly the market portfolio and nothing in the risk-free asset. Since $Er_M - r_f = \bar A\sigma^2_M$, an investor whose $A$ equals the market’s average risk aversion $\bar A$ chooses $y = 1$; here $\bar A = (Er_M-r_f)/\sigma^2_M = 0.05/0.025 = 2 = A$, confirming that this investor’s risk aversion is exactly the equilibrium average.

CAPM-Q9. Systematic variance $= \beta_i^2\sigma^2_M = (1.2)^2 \times 0.05 = 1.44 \times 0.05 = 0.072$. Total variance $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i} = 0.072 + 0.02 = 0.092$. Total standard deviation $\sigma_i = \sqrt{0.092} \approx 0.303$. The systematic fraction is $0.072/0.092 \approx 0.783$, so about 78.3% of the total variance is systematic.
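The decomposition can be reproduced with a short helper. This is a sketch for illustration; the function name `decompose_variance` is not from the chapter.

```python
import math

def decompose_variance(beta, var_m, var_eps):
    """Split total variance into systematic and idiosyncratic parts."""
    systematic = beta ** 2 * var_m
    total = systematic + var_eps
    return systematic, total, math.sqrt(total), systematic / total

systematic, total, sigma_i, share = decompose_variance(beta=1.2, var_m=0.05, var_eps=0.02)
print(f"systematic = {systematic:.3f}, total = {total:.3f}, "
      f"sigma = {sigma_i:.3f}, systematic share = {share:.3f}")
```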

CAPM-Q10. Systematic variance $= \beta_i^2\sigma^2_M = (0.8)^2 \times 0.045 = 0.64 \times 0.045 = 0.0288$. From $\sigma^2_i = \beta_i^2\sigma^2_M + \sigma^2_{\epsilon_i}$, the idiosyncratic variance is $\sigma^2_{\epsilon_i} = \sigma^2_i - \beta_i^2\sigma^2_M = 0.10 - 0.0288 = 0.0712$. The systematic share of total variance is $0.0288/0.10 = 0.288$, i.e. about 28.8%.

17.4 Portfolio performance evaluation

PERF-C1. She should choose Fund B, the fund with the higher Sharpe ratio, even though it has the lower average return. Because she is holding all of her risky wealth in one fund and then mixing it with the risk-free asset, the set of expected-return/risk combinations available to her is the capital allocation line through the fund, a straight line rising from $r_f$ with slope equal to the fund’s Sharpe ratio. A higher Sharpe ratio means a steeper CAL, and a steeper CAL delivers more expected return at every level of total risk she might choose to bear. So Fund B dominates Fund A: at any risk level she likes, the CAL through B lies above the CAL through A. She is not stuck with Fund B’s own expected return, however. By levering up — placing more than $100\%$ of her wealth in Fund B (borrowing at $r_f$ to do so) — she moves out along the steeper CAL and can reach an expected return as high as, or higher than, Fund A’s, but at a lower standard deviation than Fund A would impose for that same expected return. The Sharpe ratio, not the raw average return, is what ranks the two funds because it is the slope of the opportunity set she actually faces.

PERF-C2. The first analyst, who used total standard deviation $\sigma_P$, has applied the Sharpe ratio correctly: the Sharpe ratio is by definition excess return divided by total risk, $S_P = (Er_P - r_f)/\sigma_P$, and it is the appropriate measure precisely when the fund is (or would be) held as the investor’s entire risky position, so that the investor bears all of the fund’s volatility. The second analyst, by dividing excess return by systematic risk (beta) instead, has not computed a Sharpe ratio at all — he has effectively computed the Treynor measure, $T_P = (Er_P - r_f)/\beta_P$. That measure is the right one, but only under a different assumption about how the fund is held: when the fund is one component of a large, well-diversified portfolio, its idiosyncratic risk is diversified away and only its beta contributes to the risk the investor ultimately bears. The two analysts have not made an arithmetic error against a common standard; they have implicitly assumed different roles for the fund, and each measure is correct for its own role.

PERF-C3. With forty sleeves already in place, any single sleeve is only a small component of a large, well-diversified whole, and its idiosyncratic risk is diversified away by the other thirty-nine holdings. What the new sleeve actually adds to the fund’s risk is therefore not its total volatility but its covariance with everything else — summarized by its beta. The Sharpe ratio charges the sleeve for its total standard deviation, including diversifiable risk the pension fund will never actually bear, so it is the wrong yardstick here; it would penalize a sleeve for idiosyncratic swings that vanish in the aggregate. The Treynor measure, $T_P = (Er_P - r_f)/\beta_P$, charges the sleeve only for its systematic risk and so measures reward per unit of the risk the sleeve genuinely contributes. The feature of the CAPM that justifies this is the lesson that in equilibrium only systematic (non-diversifiable) risk is priced: idiosyncratic risk earns no premium because it can be diversified away at no cost, so a component held inside a diversified portfolio should be rewarded — and evaluated — only for the market risk it brings.

PERF-C4. Equal Treynor measures with $\sigma_X \gg \sigma_Y$ tell you that the two portfolios differ substantially in idiosyncratic risk. Recall that total variance decomposes as $\sigma_P^2 = \beta_P^2\sigma_M^2 + \sigma^2(\varepsilon_P)$. Since the Treynor measures are equal, the portfolios earn the same excess return per unit of beta. Equal Treynor measures do not by themselves pin down the betas, but if the betas are similar, the systematic-risk components $\beta_P^2\sigma_M^2$ are comparable, so the much larger total variance of X must come from a much larger residual variance $\sigma^2(\varepsilon_X)$. Portfolio X therefore carries far more diversifiable risk than Y. Yet an investor holding either as one small piece of a well-diversified whole would be nearly indifferent between them: the extra idiosyncratic risk in X is exactly the kind of risk that gets diversified away inside the larger portfolio, so it does not affect the risk the investor ultimately bears. What matters for a diversified holder is the systematic contribution and the reward per unit of it, and on that dimension, captured by the Treynor measure, the two portfolios are identical.

PERF-C5. “Abnormal” means the fund’s expected return lies $2\%$ above the Security Market Line: the CAPM assigns a fund of beta $\beta_P$ the fair return $r_f + \beta_P(Er_M - r_f)$, and Jensen’s alpha is the vertical gap between the return actually earned and that fair return. A positive alpha is return the single-factor CAPM cannot explain by market exposure alone. Two distinct reasons it is not, by itself, convincing evidence of skill: (1) Statistical noise. An estimated alpha is the intercept of a regression run on a finite sample and carries a standard error; a positive point estimate can easily arise from luck. Whether it is distinguishable from zero is a question about its $t$-statistic, which is roughly $IR_P\sqrt{T}$ and typically requires many years of data to clear the significance threshold. (2) Model misspecification. A positive alpha may reflect exposure to a priced source of risk that the one-factor CAPM omits (a value, size, or momentum tilt, say) rather than any genuine informational edge. What looks like skill relative to a single-factor benchmark can be ordinary compensation for a risk the benchmark fails to account for.

PERF-C6. Fund G, the lower-beta fund, has the larger Jensen’s alpha whenever the market risk premium is positive. Alpha is the raw return minus the CAPM-required return $r_f + \beta_P(Er_M - r_f)$. Both funds earned the same raw return of $11\%$, but Fund H’s higher beta of $1.4$ entitles it to a higher CAPM-required return than Fund G’s beta of $0.8$. Subtracting a larger required return from the same raw return leaves Fund H with the smaller (possibly negative) alpha, while Fund G, held to a lower bar, keeps more of its return as abnormal; the gap is $\alpha_G - \alpha_H = (1.4 - 0.8)(Er_M - r_f) = 0.6\,(Er_M - r_f)$. The general lesson is that raw return can mask poor risk-adjusted performance: a fund can post an impressive headline return simply by taking more systematic risk, and once you charge it for the return that risk was supposed to earn, the apparent outperformance can shrink or vanish. Two funds with identical raw returns are not equally good if one reached that return by running a much higher beta.

PERF-C7. The high-conviction active fund has the larger tracking error. Tracking error is the volatility of the active return $r_P - r_b$, and its square decomposes as $TE_P^2 = \sigma_P^2 + \sigma_b^2 - 2\rho_{Pb}\sigma_P\sigma_b$. A pure index fund is built to replicate the S&P 500, so its return moves almost in lockstep with the benchmark: $\rho_{Pb}$ is near $1$ and $\sigma_P \approx \sigma_b$, which drives the expression toward zero and gives a tracking error near zero. The high-conviction active fund takes large positions away from the index, lowering $\rho_{Pb}$ and often raising $\sigma_P$ relative to $\sigma_b$; both effects enlarge the right-hand side, so its tracking error is large. But a large tracking error is neither good nor bad news on its own, because tracking error is a measure of risk, not of performance: it says only how far the manager is willing to stray from the benchmark, not whether straying paid off. A manager can run a large tracking error and still underperform; tracking error becomes informative only as the denominator of the information ratio, which pairs it with active return.

PERF-C8. The statement is wrong because tracking error measures only risk, not performance. Two funds with the same tracking error have taken the same amount of active (benchmark-relative) risk, but that says nothing about what they earned for taking it. The information ratio, $IR_P = \overline{r_P - r_b}/TE_P$, is the measure of performance: it divides the active return — how much the fund actually beat or trailed its benchmark — by the tracking error. Two funds with equal tracking error are distinguished by their numerators: the one with the higher average active return has the higher information ratio and is the better active manager, while a fund with the same tracking error but zero or negative active return is worse despite the identical risk. To rank the two funds one needs their active returns $\overline{r_P - r_b}$; with those in hand, the information ratio orders them by reward per unit of active risk.

PERF-C9. The parallel is exact once the right analogues are identified. The Sharpe ratio measures the excess return over the risk-free rate per unit of total risk, $S_P = (Er_P - r_f)/\sigma_P$. The information ratio replaces both pieces with their active-management counterparts: the numerator becomes the excess return over the benchmark — the active return $Er_P - Er_b$ — and the denominator becomes the tracking error, the volatility of that active return, which is the total risk of the active bet. So where the Sharpe ratio measures reward per unit of risk for total investing, the information ratio measures reward per unit of risk for the active-management overlay, treating the benchmark rather than cash as the zero point. The information ratio, not Jensen’s alpha, is the natural reward-to-risk measure for a benchmarked manager because alpha is an absolute number of return units that says nothing about how much risk was taken to earn it: two managers with the same alpha are not equally good if one incurred far more active risk. Dividing by tracking error converts alpha into a reward-per-risk ratio, exactly the standard by which a benchmarked manager should be judged.

PERF-C10. The two definitions — active return over tracking error, $\overline{r_P - r_b}/TE_P$, and alpha over residual risk, $\alpha_P/\sigma(\varepsilon_P)$ — coincide in the special case where the benchmark is the market portfolio and the manager’s only systematic exposure is to that market. In that case the active return is exactly the alpha plus residual noise, the tracking error is exactly the residual volatility, and the two ratios are the same object written two ways. They diverge whenever the benchmark differs from the market, or the portfolio has systematic exposures beyond the single market factor: then the active return $r_P - r_b$ contains a component due to the portfolio’s beta differing from the benchmark’s (a systematic tilt), so tracking error includes systematic as well as residual risk, and it will generally differ from $\sigma(\varepsilon_P)$. It matters to state which definition is in use because the two can give materially different numbers for the same fund (as the chapter’s worked example shows, $0.50$ versus about $0.32$); a reported information ratio is not interpretable without knowing whether its denominator is benchmark-relative tracking error or regression residual risk.

PERF-C11. From $t(\hat\alpha_P) \approx IR_P\sqrt{T}$, an information ratio of $0.5$ produces a $t$-statistic of $2$ — the conventional threshold for statistical significance — only when $0.5\sqrt{T} = 2$, i.e. $\sqrt{T} = 4$, i.e. $T = 16$ years. So even a respectable, sustained information ratio of $0.5$ requires sixteen years of data before the manager’s alpha is distinguishable from luck at the usual standard. This is what makes the information ratio a far more demanding standard than a single year’s outperformance: one good year, or even a handful, produces a $t$-statistic far below $2$ and is statistically indistinguishable from a lucky streak. Because the $t$-statistic grows only with the square root of the sample length, establishing skill with confidence takes a long, consistent track record; a headline number from one year carries almost no evidentiary weight about whether genuine skill is present.

PERF-C12. The relationship rests on two ingredients. First, the $t$-statistic for testing whether the true alpha is zero is, as always, the estimate divided by its standard error, $t(\hat\alpha_P) = \hat\alpha_P/se(\hat\alpha_P)$. Second, the standard error of the regression intercept is approximately the residual volatility divided by the square root of the number of observations, $se(\hat\alpha_P) \approx \sigma(\varepsilon_P)/\sqrt{T}$ — more data shrinks the sampling uncertainty in the estimated intercept at the usual $1/\sqrt{T}$ rate. Substituting the second into the first, $t(\hat\alpha_P) \approx \hat\alpha_P/[\sigma(\varepsilon_P)/\sqrt{T}] = [\hat\alpha_P/\sigma(\varepsilon_P)]\sqrt{T} = IR_P\sqrt{T}$, since $IR_P = \alpha_P/\sigma(\varepsilon_P)$ is the information ratio in its residual-risk form. The $\sqrt{T}$ factor is why two managers with the same information ratio can have very different $t$-statistics: the one with the longer track record has accumulated more evidence and so a larger $t$. It also pins down what a manager can and cannot do to raise her $t$-statistic: she can raise her information ratio (more skill relative to residual risk) or accumulate more years of data, but she cannot make a short record convincing — the $\sqrt{T}$ term guarantees that a genuinely skilled manager evaluated over too few periods will still fail to clear the significance threshold.

PERF-Q1. Apply the Sharpe ratio, excess return over total risk, to each. For the fund (ex post form): $\hat S_P = (\bar r_P - \bar r_f)/s_P = (10 - 2.5)/15 = 7.5/15 = 0.5$. For the market: $S_M = (8 - 2.5)/16 = 5.5/16 \approx 0.344$. Since $0.5 > 0.344$, the fund delivered more reward per unit of total risk than the market.

PERF-Q2. (a) The Sharpe ratio is $S_P = (Er_P - r_f)/\sigma_P = (14 - 4)/25 = 10/25 = 0.4$. (b) On the capital allocation line, a fraction $y$ in Fund Z gives expected return $r_f + y(Er_P - r_f)$. Setting this to $11\%$: $11 = 4 + y(14 - 4) = 4 + 10y$, so $10y = 7$ and $y = 0.7$. She places $70\%$ of her wealth in Fund Z (and $30\%$ in the risk-free asset). (c) The standard deviation of the combined position is $y\sigma_P = 0.7 \times 25 = 17.5\%$. (Check: slope times risk gives excess return, $0.4 \times 17.5 = 7 = 11 - 4$, consistent.)
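The three parts can be checked together. The sketch below is illustrative; `sharpe_ratio` and `cal_position` are hypothetical helper names, and all returns are entered in percent as in the solution.

```python
def sharpe_ratio(expected_return, risk_free, sigma):
    """Excess return per unit of total risk."""
    return (expected_return - risk_free) / sigma

def cal_position(target_return, expected_return, risk_free, sigma):
    """Fraction y in the fund that hits target_return, and the resulting std. dev."""
    y = (target_return - risk_free) / (expected_return - risk_free)
    return y, y * sigma

s_z = sharpe_ratio(14.0, 4.0, 25.0)
y, sigma_c = cal_position(11.0, 14.0, 4.0, 25.0)

print(f"Sharpe = {s_z:.2f}, y = {y:.2f}, sigma of combined position = {sigma_c:.2f}%")
```

Multiplying the Sharpe ratio by the combined standard deviation recovers the target excess return, which is the consistency check noted in part (c).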

PERF-Q3. The Treynor measure is excess return over beta. Portfolio: $T_P = (Er_P - r_f)/\beta_P = (13 - 3)/1.5 = 10/1.5 \approx 6.67\%$. Market: $T_M = (Er_M - r_f)/1 = (9 - 3)/1 = 6.0\%$. Since $T_P = 6.67\% > T_M = 6.0\%$, the portfolio earns more excess return per unit of systematic risk than the market, which is equivalent to saying it plots above the Security Market Line.

PERF-Q4. $T_D = (Er_D - r_f)/\beta_D = (10 - 3)/0.9 = 7/0.9 \approx 7.78\%$. $T_E = (Er_E - r_f)/\beta_E = (15 - 3)/1.6 = 12/1.6 = 7.5\%$. Since $T_D \approx 7.78\% > T_E = 7.5\%$, portfolio D is the better performer per unit of systematic risk, even though E has the higher raw return: E’s larger excess return is more than accounted for by its larger beta.
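The comparison can be sketched as a small function. The name `treynor` is chosen for this illustration, and returns are in percent as in the solution.

```python
def treynor(expected_return, risk_free, beta):
    """Excess return per unit of systematic risk (beta)."""
    if beta == 0:
        raise ValueError("Treynor measure is undefined for beta = 0")
    return (expected_return - risk_free) / beta

t_d = treynor(10.0, 3.0, 0.9)
t_e = treynor(15.0, 3.0, 1.6)
better = "D" if t_d > t_e else "E"

print(f"T_D = {t_d:.2f}%, T_E = {t_e:.2f}%, better per unit of beta: {better}")
```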

PERF-Q5. CAPM-predicted return: $r_f + \beta_P(Er_M - r_f) = 2 + 0.7(10 - 2) = 2 + 0.7 \times 8 = 2 + 5.6 = 7.6\%$. Jensen’s alpha: $\alpha_P = Er_P - 7.6 = 11 - 7.6 = 3.4\%$. The alpha is positive, so the fund earned $3.4\%$ more than its market exposure alone would justify — it plots above the Security Market Line, putative (though not conclusive) evidence of skill.
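Jensen's alpha is the realized return minus the CAPM-required return, which the following sketch computes. The function name `jensen_alpha` is an assumption of this example; returns are in percent.

```python
def jensen_alpha(realized_return, risk_free, beta, market_return):
    """Realized return minus the CAPM-required return r_f + beta * (E r_M - r_f)."""
    required = risk_free + beta * (market_return - risk_free)
    return realized_return - required

alpha = jensen_alpha(realized_return=11.0, risk_free=2.0, beta=0.7, market_return=10.0)
print(f"alpha = {alpha:.1f}%")
```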

PERF-Q6. Alpha is zero when the raw return equals the CAPM-required return: $Er_P = r_f + \beta_P(Er_M - r_f)$, i.e. $12 = 3 + \beta_P(8 - 3) = 3 + 5\beta_P$. Solving, $5\beta_P = 9$, so $\beta_P = 1.8$. A beta above $1.8$ raises the CAPM-required return above the fund’s $12\%$ raw return, so the required return exceeds the realized return and the fund’s alpha becomes negative.

PERF-Q7. The information ratio is active return over tracking error: $IR_P = \overline{r_P - r_b}/TE_P = 3/6 = 0.5$. Interpretation: the fund earned half a percentage point of active return for each percentage point of active (benchmark-relative) risk it took. Equivalently, its reward-to-risk ratio for active management is $0.5$ — a respectable but not exceptional figure, and one that (by $t \approx IR\sqrt{T}$) would take roughly sixteen years to establish as statistically significant.

PERF-Q8. First compute the tracking error from the decomposition: $TE_P^2 = \sigma_P^2 + \sigma_b^2 - 2\rho_{Pb}\sigma_P\sigma_b = 18^2 + 15^2 - 2(0.9)(18)(15) = 324 + 225 - 486 = 63$. So $TE_P = \sqrt{63} \approx 7.94\%$. The average active return is $\bar r_P - \bar r_b = 10.5 - 8 = 2.5\%$. The information ratio is therefore $IR_P = 2.5/7.94 \approx 0.315$.
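The tracking-error decomposition and the resulting information ratio can be reproduced as follows. This is an illustrative sketch; `tracking_error` is a hypothetical helper name, and volatilities and returns are in percent.

```python
import math

def tracking_error(sigma_p, sigma_b, rho):
    """Volatility of the active return r_P - r_b."""
    variance = sigma_p ** 2 + sigma_b ** 2 - 2.0 * rho * sigma_p * sigma_b
    return math.sqrt(variance)

te = tracking_error(sigma_p=18.0, sigma_b=15.0, rho=0.9)
active_return = 10.5 - 8.0
information_ratio = active_return / te

print(f"TE = {te:.2f}%, active return = {active_return:.1f}%, IR = {information_ratio:.3f}")
```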

PERF-Q9. The information ratio is $IR_P = 3/6 = 0.5$. Using $t(\hat\alpha_P) \approx IR_P\sqrt{T}$ and requiring $t = 2$: $2 = 0.5\sqrt{T}$, so $\sqrt{T} = 4$ and $T = 16$. The manager needs about sixteen years of data for her alpha to reach a $t$-statistic of $2$, the conventional significance threshold.

PERF-Q10. Information ratio (residual-risk form): $IR_P = \alpha_P/\sigma(\varepsilon_P) = 1.5/6 = 0.25$. Approximate $t$-statistic: $t(\hat\alpha_P) \approx IR_P\sqrt{T} = 0.25 \times \sqrt{25} = 0.25 \times 5 = 1.25$. Since $1.25 < 2$, the manager’s alpha is not statistically distinguishable from zero at the conventional threshold, despite twenty-five years of data — her information ratio is simply too low for that sample length to establish skill.
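
The approximation $t \approx IR\sqrt{T}$ used in PERF-Q9 and PERF-Q10 can be inverted to ask how long a track record a given information ratio needs. The following sketch does both directions; the helper names are hypothetical, and the threshold of 2 is the conventional value quoted above.

```python
import math

def t_stat(information_ratio, years):
    """Approximate t-statistic of alpha after `years` of annual data."""
    return information_ratio * math.sqrt(years)

def years_needed(information_ratio, t_target=2.0):
    """Sample length at which the approximate t-statistic reaches t_target."""
    return (t_target / information_ratio) ** 2

years_q9 = years_needed(3 / 6)        # PERF-Q9: IR = 0.5
t_q10 = t_stat(1.5 / 6, 25)           # PERF-Q10: IR = 0.25, T = 25
years_q10 = years_needed(1.5 / 6)     # how long PERF-Q10's manager would need

print(f"PERF-Q9 years needed: {years_q9:.0f}")
print(f"PERF-Q10 t-stat: {t_q10:.2f}, years needed: {years_q10:.0f}")
```

Because the required sample grows with the inverse square of the information ratio, halving the ratio quadruples the years needed.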

PERF-Q11. Residual-risk definition: $IR_P = \alpha_P/\sigma(\varepsilon_P) = 2.4/8 = 0.30$. Benchmark-difference definition: $IR_P = \overline{r_P - r_b}/TE_P = 2.0/5 = 0.40$. They differ because the two definitions measure active risk differently: the residual standard deviation $\sigma(\varepsilon_P)$ from the market-model regression is not the same quantity as the tracking error $TE_P$ relative to the benchmark (they coincide only when the benchmark is the market and the portfolio’s sole systematic exposure is to it), so dividing by different denominators yields different ratios.

PERF-Q12. From the identity $\sigma_P^2 = \beta_P^2\sigma_M^2 + \sigma^2(\varepsilon_P)$, solve for residual variance: $\sigma^2(\varepsilon_P) = \sigma_P^2 - \beta_P^2\sigma_M^2 = 22^2 - 1.1^2 \times 15^2 = 484 - 1.21 \times 225 = 484 - 272.25 = 211.75$. So $\sigma(\varepsilon_P) = \sqrt{211.75} \approx 14.55\%$. The residual-risk information ratio is then $\alpha_P/\sigma(\varepsilon_P) = 1.9/14.55 \approx 0.131$.
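
A brief numerical check of PERF-Q12, under the same variance identity, is sketched below. Volatilities and alpha are in percent, and `residual_sigma` is an illustrative helper name.

```python
import math

def residual_sigma(sigma_p, beta, sigma_m):
    """Residual standard deviation implied by sigma_P^2 = beta^2 sigma_M^2 + sigma^2(eps)."""
    residual_var = sigma_p**2 - beta**2 * sigma_m**2
    if residual_var < 0:
        raise ValueError("inputs imply negative residual variance")
    return math.sqrt(residual_var)

sigma_eps = residual_sigma(22, 1.1, 15)   # PERF-Q12 inputs
ir_q12 = 1.9 / sigma_eps

print(f"sigma(eps) = {sigma_eps:.2f}%  IR = {ir_q12:.3f}")
```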

17.5 Rational expectations and the efficient markets hypothesis

EMH-C1. The chart-based strategy relies only on the history of prices and trading volume — exactly the information set covered by the weak form of the EMH. If weak-form efficiency holds, all information in past prices and volume is already impounded in the current price, so patterns in that history cannot be exploited for expected returns beyond compensation for risk; this is precisely why weak-form efficiency rules out profitable technical analysis. The fund trading on newly released accounting ratios uses publicly available (but non-price) information, so ruling it out requires semi-strong efficiency, which asserts that all public information — earnings, filings, ratios — is already priced. Semi-strong efficiency is a stronger assumption because its information set strictly contains the weak-form set (past prices are a subset of all public information), so it makes a broader claim about what prices already reflect and rules out both technical and fundamental/factor analysis.

EMH-C2. The three forms differ by the information set they assume is already reflected in prices. Weak form: the history of market prices (and volume). Semi-strong form: all publicly available information — the price history plus earnings, filings, news, and other public data. Strong form: all information relevant to the asset’s payoffs, including private/insider information. The forms are nested because each information set contains the previous one: private-plus-public information (strong) includes all public information (semi-strong), which includes past prices (weak). Hence if prices reflect the larger set they necessarily reflect the smaller subset, so strong implies semi-strong implies weak. An example of information in the semi-strong set but not the weak set is a company’s just-released quarterly earnings report: it is public but is not contained in the past price history.

EMH-C3. The observation bears on the strong form of the EMH, because insider (private) information is exactly what the strong form asserts is already impounded in prices. A profitable insider trade shows only that non-public information had value, which contradicts strong-form efficiency; it is not evidence against the weak or semi-strong forms, since those make claims only about past-price and public information and are silent on the value of insider information. The strong form is generally the hardest to defend empirically precisely because there is abundant evidence — and legal cases — that insiders can and do profit from private information; it is implausible that prices already reflect everything a CEO or insider knows before it becomes public.

EMH-C4. Technical analysis uses only past prices, so weak-form efficiency is enough to render it unprofitable in expectation: if past-price information is already priced, chart patterns carry no exploitable signal. Fundamental and factor analysis use public non-price data (earnings, balance-sheet items, characteristics), so ruling them out requires semi-strong efficiency, under which all public information is already impounded. Insider trading uses private information, so only strong-form efficiency rules it out. If the strong form holds, then prices already reflect literally all payoff-relevant information, so no analysis of any kind — including insider information — can identify an asset expected to outperform its risk-adjusted benchmark. In that world there is nothing to be gained from active security selection, so every investor should simply hold the market portfolio and pursue a passive strategy.

EMH-C5. Two reasons the chapter gives: (1) Volatility masks the signal. Market returns are so volatile that a strategy raising returns by only 5% is very hard to distinguish statistically from noise over any realistic sample; a much larger edge (say 20%) would be detectable, but an edge that large would itself imply the market is grossly inefficient. (2) The finite-sample “dart-throwing monkey” problem. We observe only finite histories, and out of many managers some will beat a benchmark for 5, 10, or 20 years purely by luck — like a monkey throwing darts who happens to throw in the right direction. Ten years of outperformance therefore does not establish a genuine repeatable strategy. More convincing evidence would include a long out-of-sample track record, performance persistence across many independent periods and assets, a clear economic mechanism, and results robust to the benchmark used (addressing the joint-hypothesis problem) — ideally with statistical significance that survives correction for the number of strategies searched.
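
The dart-throwing-monkey point can be made concrete with a small probability calculation. The sketch below rests on assumptions that are not in the chapter: each zero-skill manager beats the benchmark in any year with probability 0.5, years and managers are independent, "outperformance" is read as beating the benchmark in every single year, and the population of 1000 managers is a hypothetical figure.

```python
def prob_some_lucky_streak(n_managers, n_years, p_beat=0.5):
    """Chance that at least one zero-skill manager beats the benchmark
    in every one of n_years, assuming independent yearly outcomes."""
    p_streak = p_beat ** n_years
    return 1 - (1 - p_streak) ** n_managers

for years in (5, 10, 20):
    p = prob_some_lucky_streak(1000, years)
    print(f"{years:>2} years, 1000 managers: P(at least one streak) = {p:.4f}")
```

Under these assumptions a ten-year streak somewhere in the population is more likely than not, which is why a single such record carries little evidential weight.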

EMH-C6. If a trader has a genuinely profitable strategy, she has every incentive to keep it secret and trade on it herself until it stops working, only revealing it afterward. This means the strategies most damaging to the EMH are precisely the ones researchers cannot observe while they are effective, biasing the observable record toward strategies that have already decayed. On detectability: because market returns have high variance, a 5% improvement in expected return is small relative to the year-to-year swings in realized returns, so it is easily lost in the noise and would require a very long sample to confirm statistically. A 20% improvement is large enough to stand out against that volatility and would be observable — but the ability to earn an extra 20% reliably would itself mean the market was leaving enormous, easily exploited profits on the table, i.e., that it was highly inefficient. So the very edges we could detect are the ones an efficient market should not permit.

EMH-C7. The conclusion does not follow because biases at the individual level need not translate into biases in market prices. If rational arbitrageurs can profit from trading against biased investors, they will buy underpriced and sell overpriced assets, pushing prices back toward fundamental value and eliminating the mispricing. For individual biases to move equilibrium prices, arbitrage must be limited — the arbitrageurs must be unable or unwilling to trade away the mispricing. Two concrete limits: (1) transaction costs and bid-ask spreads that make small mispricings unprofitable to correct; and (2) the risk that the mispricing widens before it converges (so a correctly identified bet can lose money in the interim), possibly combined with financing/short-sale constraints or capital withdrawals that force the arbitrageur to unwind at a loss. When such limits bind, biased traders can affect prices.

EMH-C8. Overconfidence is overestimating the precision of one’s own beliefs or forecasts; it can lead an investor to take a larger position in a particular stock than a correctly calibrated assessment of the odds would justify, and to trade too much — behavior inconsistent with expected-utility maximization under accurate beliefs. Representativeness bias (the “law of small numbers”) is treating small samples as more informative than they are; it leads investors to extrapolate recent trends, underestimating how noisy those trends are. On mutual-fund survival: if many investors suffer representativeness bias and chase recently strong (extrapolated) performance, they allocate capital to funds that had good recent runs regardless of whether the managers are skilled. Because biased investors keep supplying capital to such funds, even funds run by otherwise irrational managers can attract flows and survive rather than being competed out of existence — the biased beliefs of the investors sustain them.

EMH-C9. With frictionless arbitrage, any gap between price and fundamental value is a riskless (or nearly riskless) profit opportunity: arbitrageurs would immediately buy the underpriced asset and sell the overpriced one in unlimited size, and their trading would move prices back to fundamentals before any lasting mispricing could persist. So biased traders would have no durable effect. The risk that a mispricing worsens before it corrects deters arbitrage in a way transaction costs do not: even an arbitrageur who has correctly identified that an asset is overpriced can suffer mark-to-market losses if the overpricing grows further in the short run. If she faces a finite horizon, margin calls, or investors who withdraw capital after interim losses, she may be forced to close the position at the worst time. This “noise-trader” or convergence risk means arbitrage capital is limited even when the arbitrageur is right, leaving room for biases to persist in prices.

EMH-C10. (Any three of the six.) All investors are price takers: fails when a large institution unwinding a position, an activist, or a large order in a thin market moves the price by its own trading, disturbing the competitive equilibrium. One identical holding period: fails because a young saver, a retiree, and a quarterly-judged trading desk optimize over very different horizons, and long-horizon investors may want to hedge shifts in the opportunity set that a one-period model cannot see. All investments are publicly traded: fails because human capital, private businesses, and closely held real estate are large, non-tradable, market-correlated assets. No transaction costs or taxes: fails because commissions, spreads, and differing tax situations drive wedges so investors need not agree on prices. All investors are mean-variance optimizers: fails because investors care about downside risk, skewness, and ruin. Homogeneous expectations: fails because investors visibly disagree, and the sheer volume of trade is hard to reconcile with universal agreement. Any one failure undermines the sharp conclusion because the security-market-line result is derived from these assumptions holding jointly; if even one breaks, the derived equilibrium in which every asset must lie exactly on the SML no longer follows, and assets can plausibly earn returns off the line.

EMH-C11. Non-traded wealth includes human capital and future labor income, privately held businesses, closely held real estate, and the returns to education — none of which trades in public markets. Because these assets are large and correlated with the market yet cannot be freely bought, sold, diversified, or hedged, the “market portfolio” of publicly traded assets that the CAPM prices is not the true portfolio of aggregate wealth; a large slice of real wealth is simply missing from it. This matters for an investor whose largest asset is her own future labor income: her overall risk-return position is dominated by an asset the CAPM does not describe, so the CAPM’s prescription (hold the traded market portfolio) can be badly wrong for her. For instance, if her human capital already behaves like a risky, market-correlated asset, she may rationally want to tilt her traded holdings away from assets correlated with her earnings, something the standard CAPM cannot capture.

EMH-C12. Alpha is the return an asset or portfolio earns in excess of what the benchmark model — here the CAPM — says it should earn given its risk. Relative to the CAPM the benchmark return is the security market line, $r_f + \beta_i(Er_M - r_f)$, so $\alpha_i = Er_i - [\,r_f + \beta_i(Er_M - r_f)\,]$. Geometrically, alpha is the vertical distance between the asset’s expected return and the security market line: a positive alpha means the asset plots above the line (more return than its beta entitles it to), and a negative alpha means it plots below. If both the CAPM and the EMH hold, the CAPM makes the SML the correct description of equilibrium expected returns and the EMH ensures prices actually equal those fair values, so every asset lies exactly on the line and $\alpha_i = 0$ for everything.

EMH-C13. The joint-hypothesis problem is that any test of alpha is simultaneously a test of the asset-pricing model used to define the benchmark and of market efficiency, so a nonzero alpha cannot by itself say which of the two has failed. For a significant positive estimated alpha the chapter offers three distinct explanations: (1) bad benchmark — the CAPM is the wrong model (an assumption fails, e.g., non-traded human capital or heterogeneous beliefs), so the asset earns fair compensation for a risk the CAPM omits and only appears to have alpha; (2) genuine mispricing — the CAPM is right but the EMH fails, so the price departs from fundamental value and the alpha is a real, temporarily available return; and (3) finite-sample chance — the true alpha is zero and the estimate is nonzero purely by luck in a limited sample of returns. A single nonzero estimate cannot discriminate among these because all three produce the same number; you need additional evidence (out-of-sample tests, alternative benchmarks, larger samples) to separate them.

EMH-C14. The CAPM says that unless one of its six assumptions is violated, no opportunity can offer returns higher than its risk warrants — every legitimate opportunity sits on the security market line. So a “high return, low risk” pitch leaves only two possibilities: the claim is false, or an assumption is genuinely being violated. The checklist walks the assumptions one at a time: Is the return really compensation for illiquidity/non-tradability (assumption 3, e.g., a private business or real estate)? Is there a genuine tax advantage specific to you (assumption 4)? Do you actually possess private information others lack (strong-form EMH and assumption 6)? Is the “high return” hiding a fat downside or tail risk (assumption 5)? Could a large player be moving the price (assumption 1)? If you can name the assumption the opportunity relies on, that names the specific risk you are being paid to bear. If, after going through the whole list, you cannot identify any assumption that is plausibly violated, the CAPM’s verdict is that the promised returns are almost certainly illusory, mismeasured because risk is being ignored, or outright fraudulent.

EMH-C15. Affinity fraud is fraud that spreads through tight-knit groups — families, religious congregations, ethnic or professional communities — in which shared identity substitutes for due diligence. The chapter argues that a trusted-community source is a reason to work through the checklist more carefully, not to skip it, because trust in the messenger does exactly the work the CAPM says the numbers cannot: it makes an off-the-security-market-line promise “feel safe” even though nothing about the returns has been justified. The empirical record (e.g., recruitment concentrated in affinity groups in the Fortune Hi-Tech Marketing scheme) shows these frauds exploit precisely that trust, often turning victims into recruiters. So the right response is to run the assumptions regardless of who is pitching, and to treat a familiar, trusted source as raising the stakes of skipping analysis rather than lowering the need for it.

EMH-C16. The EMH predicts this pattern because if prices already impound available information, consistently identifying mispriced securities is extremely hard, so the fees active managers charge become a near-certain drag on returns; the average active fund should therefore underperform a cheap index after fees, and rational savers should shift toward low-cost index funds — which is what the flows show. However, persistent active underperformance is not decisive proof of efficiency. The same evidence is also consistent with a world where some mispricings exist but are too small, too costly, or too risky to exploit reliably (limits to arbitrage), where genuine skill exists but is rare and hard to identify ex ante among many luck-driven track records (the dart-throwing-monkey problem), or where the benchmark used to judge “underperformance” is itself the wrong model (the joint-hypothesis problem). Fee drag alone can explain average underperformance without the market being fully efficient.

EMH-Q1. (a) The security-market-line (CAPM) required return is $r_f + \beta_i(Er_M - r_f) = 0.03 + 1.2\,(0.09 - 0.03) = 0.03 + 1.2(0.06) = 0.03 + 0.072 = 0.102 = 10.2\%$.

The alpha is the forecast expected return less this required return: $\alpha_i = Er_i - [\,r_f + \beta_i(Er_M - r_f)\,] = 0.12 - 0.102 = 0.018 = 1.8\%$.
Since the alpha is positive, the stock plots above the security market line — the analyst forecasts $1.8\%$ more than its beta entitles it to. By the joint-hypothesis problem, this nonzero alpha is not self-interpreting: it could mean the CAPM is the wrong benchmark (a “bad benchmark,” so the stock is fairly compensating some risk the CAPM omits), it could reflect a genuine mispricing (the CAPM holds but the EMH fails), or it could be an artifact of a finite sample / an inaccurate forecast. The estimate alone cannot say which.

EMH-Q2. (a) CAPM required return is $r_f + \beta(Er_M - r_f)$ with $r_f = 0.02$ and market premium $0.05$.

Portfolio A: required $= 0.02 + 0.8(0.05) = 0.02 + 0.04 = 0.06 = 6\%$; $\alpha_A = 0.07 - 0.06 = 0.01 = 1\%$.

Portfolio B: required $= 0.02 + 1.5(0.05) = 0.02 + 0.075 = 0.095 = 9.5\%$; $\alpha_B = 0.09 - 0.095 = -0.005 = -0.5\%$.

Portfolio A has the larger alpha ($+1\%$ versus $-0.5\%$). Portfolio B has the higher raw expected return ($9\%$ versus $7\%$) but the lower alpha — in fact a negative one. This shows that raw expected return is a misleading basis for comparison: B earns more only because it takes more systematic risk (higher beta), and once you adjust for that risk it actually falls short of its CAPM benchmark, whereas A exceeds its benchmark. Risk-adjusted return (alpha), not raw return, is the correct comparison.
If the true population alphas are both zero, the nonzero estimates would be explained by finite-sample chance — the estimated alpha is nonzero purely by luck in a limited sample of returns.
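
The contrast between ranking by raw return and ranking by alpha in EMH-Q2 can be sketched as follows. Returns are in decimals, and `capm_alpha` is a hypothetical helper written for this illustration.

```python
def capm_alpha(expected_return, beta, risk_free, market_premium):
    """Expected return minus the Security Market Line benchmark."""
    required = risk_free + beta * market_premium
    return expected_return - required

# EMH-Q1: single stock
alpha_q1 = capm_alpha(0.12, 1.2, 0.03, 0.09 - 0.03)

# EMH-Q2: (expected return, beta) for each portfolio
portfolios = {"A": (0.07, 0.8), "B": (0.09, 1.5)}
alphas = {name: capm_alpha(er, beta, 0.02, 0.05)
          for name, (er, beta) in portfolios.items()}

best_raw = max(portfolios, key=lambda name: portfolios[name][0])
best_alpha = max(alphas, key=alphas.get)

print(f"EMH-Q1 alpha: {alpha_q1:.3%}")
for name, a in alphas.items():
    print(f"Portfolio {name}: alpha = {a:.3%}")
print(f"Highest raw return: {best_raw}; highest alpha: {best_alpha}")
```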

17.6 Interest

INT-C1. The time value of money says most people are not indifferent between \$1000 today and \$1000 in a year: \$1000 received today can be invested at the risk-free rate $r$ and will grow to $\$1000(1+r) > \$1000$ by next year, so the future payment is worth strictly less. By the arbitrage principle, the present value of the delayed \$1000 is exactly the amount you would need to set aside today to reproduce it, namely $\$1000/(1+r)$, which is below \$1000 whenever $r > 0$. Taking the payment today therefore leaves you strictly wealthier (you can always lend it and hold more than \$1000 a year from now), so a certain payment today dominates the same certain payment later.

INT-C2. Asset A (paid in 3 years) has the larger present value: discounting compounds with the horizon, so $PV_T = (1+r)^{-T}x$ shrinks as $T$ grows, and 3 years of discounting removes less value than 8 years. As $r$ rises, the discount factor $(1+r)^{-T}$ falls for both assets, so both present values decline; because the longer-dated asset is discounted over more periods, its value falls proportionally faster, so the dollar gap between A and B narrows toward zero at high $r$ (both values head to zero). Only in the limit $r \to 0$ do the two present values coincide at \$5000, since with no discounting timing is irrelevant.
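
To see the INT-C2 argument numerically, the sketch below tabulates both present values for a handful of interest rates. The payment of 5000 and the horizons of 3 and 8 years come from the problem; the grid of rates is an arbitrary illustrative choice.

```python
def present_value(x, r, T):
    """Value today of a single payment x received T years from now."""
    return x * (1 + r) ** (-T)

for r in (0.0, 0.02, 0.05, 0.10, 0.50):
    pv_a = present_value(5000, r, 3)   # Asset A, paid in 3 years
    pv_b = present_value(5000, r, 8)   # Asset B, paid in 8 years
    print(f"r = {r:.2f}: PV_A = {pv_a:8.2f}  PV_B = {pv_b:8.2f}  PV_B/PV_A = {pv_b / pv_a:.4f}")
```

The ratio column equals $(1+r)^{-5}$, which is one way to see that the longer-dated asset loses value proportionally faster as $r$ rises.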

INT-C3. The value $V = M/r$ falls as $r$ rises because a higher interest rate means a smaller present-day deposit is needed to throw off the same perpetual payment (equivalently, each future dollar is discounted more heavily). Even a small $M$ can be worth a large lump sum because the payments continue forever: dividing by a small $r$ magnifies the sum. The replication/arbitrage argument guarantees exactness because $M/r$ is precisely the principal that, invested at rate $r$, earns interest $r \cdot (M/r) = M$ each period; paying out exactly the interest $M$ leaves the principal intact to earn $M$ again next period, perpetually reproducing the stream without depleting capital. Any other price would let someone arbitrage the difference.

INT-C4. Both routes price the same cash flows — a payment of $M$ at each date $1,\dots,T$ — so by the law of one price they must carry the same value; if they differed, one could buy the cheap version and sell the dear one for a riskless profit. The arbitrage route makes this concrete: buying a perpetuity ($M/r$ today) and selling the rights to all payments after date $T$ (worth $M/r$ at date $T$, hence $\frac{M/r}{(1+r)^T}$ today) leaves exactly the first $T$ payments, so the annuity is worth $M/r - \frac{M/r}{(1+r)^T} = M\frac{1}{r}(1 - (1+r)^{-T})$, matching the geometric-series result. As $T \to \infty$ the term $(1+r)^{-T} \to 0$ and the annuity value approaches $M/r$ — obvious from the picture, since selling payments infinitely far in the future removes something of zero present value, leaving the whole perpetuity.
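
The agreement between the two routes can be verified numerically. The sketch below prices the same annuity by brute-force discounting and by the perpetuity-minus-deferred-perpetuity argument; the inputs are borrowed from INT-Q5, and the function names are illustrative.

```python
def annuity_by_sum(M, r, T):
    """Discount each of the T payments individually and add them up."""
    return sum(M / (1 + r) ** t for t in range(1, T + 1))

def annuity_by_arbitrage(M, r, T):
    """Perpetuity bought today minus the perpetuity whose payments start after date T."""
    perpetuity_today = M / r
    deferred_today = (M / r) / (1 + r) ** T
    return perpetuity_today - deferred_today

M, r, T = 1500, 0.06, 15              # INT-Q5 inputs
v_sum = annuity_by_sum(M, r, T)
v_arb = annuity_by_arbitrage(M, r, T)
v_long = annuity_by_arbitrage(M, r, 500)   # very long annuity, close to M / r

print(f"sum of discounted payments: {v_sum:.2f}")
print(f"arbitrage construction:     {v_arb:.2f}")
print(f"T = 500: {v_long:.2f} versus perpetuity {M / r:.2f}")
```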

INT-C5. For a one-year horizon with interest credited a single time at year-end, simple and compound interest give the identical balance $x(1+r)$: there has been no prior interest for compounding to act upon, so the student is right in that narrow case. The student is wrong once interest is credited more than once (e.g. compounded $n$ times, giving $x(1+r/n)^n > x(1+r)$ for $n>1$) or the money is left for several years: compound interest earns interest on previously credited interest, so after $t$ years compound growth $x(1+r)^t$ pulls ahead of simple growth $x(1+rt)$, and the gap widens with both the frequency and the horizon.

INT-C6. Under compound interest the first year’s interest $xr$ is itself left in the account and earns interest in the second year, so the two-year balance is $x(1+r)^2 = x(1 + 2r + r^2)$. Under simple interest only the original principal earns, giving $x(1+2r)$. The extra term compounding adds is $xr^2$ — the interest earned on the first year’s interest $xr$. Because each extra year adds further “interest on interest,” the gap between the compound balance $x(1+r)^t$ and the simple balance $x(1+rt)$ widens as the horizon $t$ lengthens.
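
A small numerical illustration of the widening gap follows. The deposit of 1000 and rate of 5% are hypothetical values chosen only for the example.

```python
def simple_balance(x, r, t):
    """Only the original principal earns interest."""
    return x * (1 + r * t)

def compound_balance(x, r, t):
    """Interest is credited annually and itself earns interest."""
    return x * (1 + r) ** t

x, r = 1000.0, 0.05   # hypothetical deposit and rate
gaps = [compound_balance(x, r, t) - simple_balance(x, r, t) for t in range(1, 11)]

for t, gap in enumerate(gaps, start=1):
    print(f"t = {t:2d}: compound - simple = {gap:8.2f}")
```

The entry for $t = 2$ should equal $xr^2$, and each later entry should exceed the one before it.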

INT-C7. Both CDs quote the same annual rate $r = 6\%$, but the annual rate ignores compounding within the year. The monthly-compounding CD credits interest twelve times, and each credited amount itself earns interest for the rest of the year, so it accumulates more than the semiannual CD, which compounds only twice. The effective annual rate is $r^* = (1 + r/n)^n - 1$, the actual one-year percentage increase in the balance; with $n=12$ it exceeds the $n=2$ value. Because $r^*$ folds in the compounding frequency while the quoted rate $r$ does not, $r^*$ is the correct apples-to-apples figure for comparing the two CDs.

INT-C8. Raising $n$ credits interest more often, and each earlier credit earns interest for the remainder of the year, so $r^* = (1+r/n)^n - 1$ rises with $n$. It does not grow without bound because the extra “interest-on-interest” from each additional split is progressively smaller; the sequence $(1+r/n)^n$ converges. The chapter shows via L’Hôpital’s rule that $\lim_{n\to\infty}(1+r/n)^n = e^r$, so the limiting gross annual return is $e^r$ and the limiting effective rate is $e^r - 1$. The natural exponential appears as the ceiling because continuous compounding is the mathematical limit of ever-finer compounding, and $e^r$ is exactly the value that self-consistent instantaneous growth at rate $r$ produces.
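
The convergence described above can be observed directly. The sketch below uses the 6% quoted rate from INT-C7; the particular values of $n$ are arbitrary.

```python
import math

def effective_annual_rate(r, n):
    """One-year percentage growth when rate r is compounded n times per year."""
    return (1 + r / n) ** n - 1

r = 0.06
limit = math.exp(r) - 1   # continuous-compounding ceiling

for n in (1, 2, 12, 365, 10_000):
    print(f"n = {n:>6}: r* = {effective_annual_rate(r, n):.6%}")
print(f"limit e^r - 1 = {limit:.6%}")
```

Each increase in $n$ raises $r^*$ by less than the previous one, and the values stay below the limit.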

INT-C9. The expression $(1+r/n)^n$ has the variable $n$ in both the base and the exponent, which makes the limit hard to attack directly; taking the logarithm turns it into $n\log(1+r/n)$, a product, and then a quotient $\frac{\log(1+r/n)}{1/n}$ that ordinary limit tools can handle. As $n \to \infty$ the numerator $\log(1+r/n) \to \log 1 = 0$ and the denominator $1/n \to 0$, so the quotient has the indeterminate form $0/0$, exactly the setting in which L’Hôpital’s rule applies; differentiating top and bottom yields the limit $r$. Exponentiating back gives $e^r$. The balance formula $B_t = xe^{rt}$ then says a continuously compounded deposit grows exponentially and smoothly through time, with the instantaneous growth rate equal to $r$ at every moment.

INT-C10. The nominal rate is the rate the account literally pays in dollars; the real rate, by the Fisher relation, is roughly the nominal rate minus expected inflation. When the Fed’s nominal rate (~3.6%) equals inflation (~3.6%), the real return is about zero: the saver’s dollar balance grows at 3.6% per year, but prices rise just as fast, so the basket of goods those dollars can buy is unchanged (and shrinks if inflation exceeds the rate). The compounding machinery operates on the nominal rate because that is the rate at which actual dollars in the account accumulate; inflation is a separate erosion of what those dollars are worth, applied after the nominal growth is computed.
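
The Fisher relation can be illustrated with the figures quoted above. The answer uses the approximation (nominal minus inflation); the exact form $(1+i)/(1+\pi) - 1$ shown alongside it is the standard statement of the relation and is included here as a supplement. The 4.5% inflation case is a hypothetical scenario.

```python
def real_rate_exact(nominal, inflation):
    """Growth in purchasing power: (1 + nominal) / (1 + inflation) - 1."""
    return (1 + nominal) / (1 + inflation) - 1

def real_rate_approx(nominal, inflation):
    """First-order approximation: nominal minus inflation."""
    return nominal - inflation

for inflation in (0.036, 0.045):   # 4.5% is a hypothetical higher-inflation case
    exact = real_rate_exact(0.036, inflation)
    approx = real_rate_approx(0.036, inflation)
    print(f"inflation {inflation:.1%}: exact real rate {exact:.3%}, approximation {approx:.3%}")
```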

INT-C11. She should discount the payment stream at a rate consistent with its nominal terms and then judge the result in real terms. A stream of monthly payments over a fixed term is an annuity, so its present value is the annuity value of this chapter; a pension paid for an uncertain lifetime behaves more like a perpetuity. The key Fisher point is that a fixed nominal annuity pays the same dollar amount every month regardless of prices, so inflation steadily erodes its purchasing power: the real value of each payment shrinks over time. A payment indexed to prices rises with inflation, preserving real value, so it does not suffer this erosion. Thus the lump-sum-versus-annuity choice hinges on both the discount rate used and whether the annuity is nominal or inflation-indexed.

INT-C12. With a positive interest rate, a dollar received at maturity is worth less than a dollar today, and by $PV_T = (1+r)^{-T}x$ the bill’s price is the face value discounted back over its life: $(1+r)^{-T} < 1$ when $r > 0$, so the price sits strictly below face value. This discount is precisely the investor’s return — the gap between purchase price and the face value paid at maturity. If the six-month rate $r$ suddenly rose, the discount factor $(1+r)^{-T}$ would fall, so the bill’s price would drop: higher required return means a lower price today for the same fixed future payment.

INT-Q1. Discount each payment and sum, with $r = 0.08$: \[PV = \frac{400}{1.08} + \frac{600}{1.08^2} + \frac{1000}{1.08^3}.\] Term by term: $400/1.08 = 370.37$; $600/1.1664 = 514.40$; $1000/1.259712 = 793.83$. Summing gives $PV \approx \$1678.61$.

INT-Q2. Single payment, $x = 2500$. At $T = 4$, $r = 0.07$: \[PV = 2500(1.07)^{-4} = 2500/1.310796 \approx \$1907.24.\] At $T = 9$, $r = 0.07$ held fixed: \[PV = 2500(1.07)^{-9} = 2500/1.838459 \approx \$1359.83.\] The longer horizon lowers the present value.
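
INT-Q1 and INT-Q2 are both instances of discounting dated payments, so one small helper covers them. This is a sketch; `pv_stream` and its dictionary input format are illustrative choices.

```python
def pv_stream(cash_flows, r):
    """Present value of payments; cash_flows maps payment date (years) to amount."""
    return sum(amount / (1 + r) ** date for date, amount in cash_flows.items())

pv_q1 = pv_stream({1: 400, 2: 600, 3: 1000}, 0.08)   # INT-Q1
pv_q2_short = pv_stream({4: 2500}, 0.07)             # INT-Q2, T = 4
pv_q2_long = pv_stream({9: 2500}, 0.07)              # INT-Q2, T = 9

print(f"INT-Q1: {pv_q1:.2f}")
print(f"INT-Q2: T=4 -> {pv_q2_short:.2f}, T=9 -> {pv_q2_long:.2f}")
```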

INT-Q3. Perpetuity value $V = M/r$ with $M = 750$. At $r = 0.05$: \[V = 750/0.05 = \$15{,}000.\] At $r = 0.04$: \[V = 750/0.04 = \$18{,}750.\] Lowering the interest rate raises the perpetuity’s value — value moves inversely with $r$, since each future payment is discounted less heavily.

INT-Q4. Annual rate $r = 0.08$, so periodic (quarterly) rate $r/4 = 0.02$ and quarterly coupon $rM/4 = 0.08 \cdot 1000/4 = \$20$. The perpetuity of quarterly coupons, valued immediately after a payment, is \[P = \frac{rM/4}{r/4} = \frac{20}{0.02} = \$1000.\] Immediately before a coupon, the buyer also collects the imminent \$20 payment, so the price is $1000 + 20 = \$1020$.

INT-Q5. Annuity factor with $M = 1500$, $r = 0.06$, $T = 15$: \[V = 1500 \cdot \frac{1}{0.06}\left(1 - \frac{1}{1.06^{15}}\right).\] Here $1.06^{15} = 2.396558$, so $1/1.06^{15} = 0.417265$ and $1 - 0.417265 = 0.582735$. Then $V = 1500 \cdot (1/0.06) \cdot 0.582735 = 25000 \cdot 0.582735 \approx \$14{,}568.37$.

INT-Q6. Annuity with $M = 800$, $r = 0.09$, $T = 25$: \[V = 800 \cdot \frac{1}{0.09}\left(1 - \frac{1}{1.09^{25}}\right).\] Here $1.09^{25} = 8.623081$, so $1/1.09^{25} = 0.115968$ and $1 - 0.115968 = 0.884032$. Then $V = 800 \cdot (1/0.09) \cdot 0.884032 = 8888.89 \cdot 0.884032 \approx \$7858.06$. Check: the corresponding perpetuity is $M/r = 800/0.09 = \$8888.89$, and indeed $7858.06 < 8888.89$, as expected since the annuity omits all payments after year 25.

INT-Q7. Compounded balance with $x = 5000$, $r = 0.10$, $n = 4$, $t = 8$: \[B_t = 5000\left(1 + \frac{0.10}{4}\right)^{4 \cdot 8} = 5000(1.025)^{32}.\] Since $(1.025)^{32} = 2.203757$, $B_t = 5000 \cdot 2.203757 \approx \$11{,}018.78$.
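
The INT-Q7 balance can be checked, and compared with once-a-year compounding at the same quoted rate, with the sketch below. The function name is illustrative.

```python
def compound_balance(x, r, n, t):
    """Balance after t years when annual rate r is credited n times per year."""
    return x * (1 + r / n) ** (n * t)

balance_q7 = compound_balance(5000, 0.10, 4, 8)    # quarterly, as in INT-Q7
annual_only = compound_balance(5000, 0.10, 1, 8)   # same rate, credited once a year

print(f"quarterly compounding: {balance_q7:.2f}")
print(f"annual compounding:    {annual_only:.2f}")
```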

INT-Q8. Effective annual rate with $r = 0.18$. Monthly ($n = 12$): \[r^* = \left(1 + \frac{0.18}{12}\right)^{12} - 1 = (1.015)^{12} - 1 = 1.195618 - 1 \approx 0.19562 \ (19.56\%).\] Daily ($n = 365$): \[r^* = \left(1 + \frac{0.18}{365}\right)^{365} - 1 \approx 1.197164 - 1 \approx 0.19716 \ (19.72\%).\] More frequent compounding raises the effective rate slightly.

INT-Q9. Continuous compounding with $x = 3000$, $r = 0.05$, $t = 4$: \[B_t = 3000\,e^{0.05 \cdot 4} = 3000\,e^{0.2} = 3000 \cdot 1.221403 \approx \$3664.21.\] Effective annual rate: $r^* = e^{0.05} - 1 = 1.051271 - 1 \approx 0.051271 \ (5.13\%)$.

INT-Q10. Continuous compounding with $x = 10{,}000$, $r = 0.03$, $t = 10$: \[B_t = 10000\,e^{0.03 \cdot 10} = 10000\,e^{0.3} = 10000 \cdot 1.349859 \approx \$13{,}498.59.\] Effective annual rate: $r^* = e^{0.03} - 1 = 1.030455 - 1 \approx 0.030455 \ (3.05\%)$.
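
The continuous-compounding answers in INT-Q9 and INT-Q10 follow from two one-line formulas, sketched here with illustrative function names.

```python
import math

def continuous_balance(x, r, t):
    """Balance after t years of continuous compounding at rate r."""
    return x * math.exp(r * t)

def continuous_ear(r):
    """Effective annual rate implied by continuous compounding at rate r."""
    return math.exp(r) - 1

b_q9 = continuous_balance(3000, 0.05, 4)
b_q10 = continuous_balance(10_000, 0.03, 10)

print(f"INT-Q9:  balance {b_q9:.2f}, effective rate {continuous_ear(0.05):.4%}")
print(f"INT-Q10: balance {b_q10:.2f}, effective rate {continuous_ear(0.03):.4%}")
```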

17.7 Fixed Income Yields

YLD-C1. The yield to maturity $y$ is defined as the single discount rate that equates the present value of a bond’s promised cash flows to its observed price, i.e. the solution to $P = \sum_{t=1}^{T}\frac{b}{(1+y)^t} + \frac{M}{(1+y)^T}$. The colleague’s statement is correct only under a specific condition: the yield to maturity is the return earned by holding to maturity provided every coupon can be reinvested at the rate $y$ until maturity. It is a promised or “if-all-goes-as-assumed” return, not a guaranteed one. If reinvestment rates differ from $y$ over the bond’s life, the realized compound return will differ from the quoted yield. So the statement is correct as a definition of the internal rate of return implied by today’s price, but incorrect as an unconditional forecast of the return the investor will actually earn.
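
Because the pricing equation generally has no closed-form solution for $y$, the yield is found numerically. The sketch below uses bisection, which is one simple option and not necessarily the method the chapter has in mind. The bond (face 1000, annual coupon 50, ten years, price 950) is a hypothetical example, and the search bracket of 0 to 100% is an assumption that suits ordinary bonds.

```python
def bond_price(y, coupon, face, T):
    """Present value of T annual coupons plus the face value, discounted at y."""
    coupons = sum(coupon / (1 + y) ** t for t in range(1, T + 1))
    return coupons + face / (1 + y) ** T

def yield_to_maturity(price, coupon, face, T, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on y; relies on the price being strictly decreasing in y."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bond_price(mid, coupon, face, T) > price:
            lo = mid   # model price too high, so the yield must be higher
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical bond: 5% coupon on 1000 face, 10 years, trading at 950
y = yield_to_maturity(950, 50, 1000, 10)
print(f"yield to maturity: {y:.4%}")
print(f"price recovered at that yield: {bond_price(y, 50, 1000, 10):.4f}")
```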

YLD-C2. A single discount rate is used because the yield to maturity is defined as the one number $y$ that makes the discounted sum of all promised payments equal the price. It is the internal rate of return of the bond’s cash-flow stream: rather than assigning a distinct spot rate to each maturity, it compresses the whole term structure into one summary rate specific to that bond. This is what lets markets quote value and compare instruments in a single figure — two bonds with different coupons and maturities can be ranked by their yields even though their cash-flow patterns differ. The cost of this convenience is that $y$ is bond-specific and embeds the reinvestment assumption; it is not a set of maturity-by-maturity discount rates.

YLD-C3. Write the per-period coupon as $b = cM$, where $c$ is the coupon rate, and evaluate the pricing equation at the candidate yield $y = c$. The coupon stream is an annuity, so using the annuity factor $\frac{1}{c}\left(1-(1+c)^{-T}\right)$, $P = cM\cdot\frac{1}{c}\left(1-\frac{1}{(1+c)^T}\right) + \frac{M}{(1+c)^T} = M\left(1-\frac{1}{(1+c)^T}\right)+\frac{M}{(1+c)^T}$. The two occurrences of $M(1+c)^{-T}$ cancel, leaving $P = M$. So discounting at the coupon rate prices the bond exactly at par. Because the pricing equation has a unique yield solution for any given price, the converse follows: if $P = M$ then $y$ must equal $c$.
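The par-bond result can also be checked numerically. The sketch below is a minimal illustration under assumed inputs (face value 1000, coupon rate 7%, ten periods); `bond_price` is a hypothetical helper that evaluates the pricing equation directly.

```python
def bond_price(b, M, y, T):
    """Price of T per-period coupons b plus face value M, all discounted at yield y."""
    coupons = sum(b / (1 + y) ** t for t in range(1, T + 1))
    return coupons + M / (1 + y) ** T

M = 1000.0   # face value (illustrative)
c = 0.07     # coupon rate (illustrative)
T = 10       # periods to maturity (illustrative)

print(round(bond_price(c * M, M, c, T), 6))   # discounting at y = c
print(bond_price(c * M, M, 0.05, T) > M)      # y < c: the bond prices above par
print(bond_price(c * M, M, 0.09, T) < M)      # y > c: the bond prices below par
```

The last two lines preview the premium and discount rules: moving the yield away from the coupon rate moves the price away from par in the opposite direction.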

YLD-C4. If a bond sells above par, buyers are paying more than the face value they will get back, so the promised payments must be discounted at a rate below the coupon rate to make their present value large enough to reach that high price; hence YTM $< c$. If a bond sells below par, the promised payments must be discounted at a rate above the coupon rate to shrink their present value down to that low price; hence YTM $> c$. Both rules follow from the definition of $y$ as the discount rate equating the present value of promised payments to $P$: raising $y$ lowers present value and raising the price requires lowering $y$, so price and yield move inversely around the par/coupon-rate pivot.

YLD-C5. The realized compound return will be below the quoted yield to maturity. The yield to maturity implicitly assumes that every coupon received is reinvested at the original yield until the bond matures. When market rates fall below that original yield, the coupons are reinvested at lower rates, so the terminal value of the reinvested coupon stream is smaller than the yield-to-maturity calculation presumed. Solving $P(1+r)^T = \text{final cash}$, where $P$ is the purchase price (equal to $M$ for a bond bought at par), then produces a realized $r$ less than $y$. The violated assumption is precisely the constant-reinvestment-at-$y$ assumption embedded in the yield to maturity.

YLD-C6. The objection misunderstands what the reinvestment assumption governs. The assumption matters whenever the return is measured as a compound return over the holding horizon: even coupons that are spent still have a value, and comparing the yield to maturity to a realized compound return implicitly asks what those coupons could have earned. An investor who spends every coupon and holds to maturity does earn the coupon cash flows, but relative to the quoted yield he forgoes the interest-on-interest that the yield to maturity credited him with. When rates rise after purchase, an investor who reinvests gains because coupons compound at higher rates; the investor who spends them captures no such gain, so his realized experience diverges from the yield to maturity exactly through the reinvestment channel — even though, for him, that divergence shows up as forgone reinvestment income rather than lower cash in hand.

YLD-C7. The holding period return is $R = \frac{P_1 + b - P_0}{P_0}$, which splits into a coupon component $\frac{b}{P_0}$ and a capital-gain component $\frac{P_1 - P_0}{P_0}$. The coupon component is fixed and positive over the period. When market interest rates rise sharply just after purchase, the bond must be discounted at a higher yield, so its resale price $P_1$ falls below the purchase price $P_0$, making the capital-gain component negative. If that capital loss is large enough to exceed the coupon income, the total return turns negative. So it is the capital-gain (price) component — driven by the inverse price–yield relationship — that can convert a positive coupon return into a negative total return.

YLD-C8. A rate increase raises the discount rate applied to all remaining cash flows, so the market price of the bond falls immediately. For the investor who plans to sell after one year, this is a direct loss: the price at which she can sell has dropped, so she realizes a capital loss on the sale. For the investor holding to maturity, the price drop is irrelevant to the terminal payoff — he still collects every coupon and the face value $M$ — but the higher rates actually help him, because the coupons he receives can now be reinvested at the higher prevailing rates, raising his realized compound return. The same rate move thus hurts the early seller through the price channel and helps the buy-and-hold investor through the reinvestment channel.

YLD-C9. In $P = \sum_{t=1}^{T}\frac{b}{(1+y)^t} + \frac{M}{(1+y)^T}$, every term has $(1+y)$ in the denominator, so raising $y$ shrinks each discounted payment and lowers the total present value $P$; lowering $y$ raises $P$. Price and yield therefore move in opposite directions by construction. The news item illustrates the same mechanism from the holder’s side: a hot inflation report pushed the 10-year Treasury yield up to about 4.46%, and because the promised coupon and face payments are fixed, the only way the market can offer a higher yield on those fixed payments is by paying a lower price for them. Investors already holding the bond therefore saw its price — and hence its market value — fall as yields rose.

YLD-C10. For a given rise in the market yield, the 20-year bond’s price falls by more in percentage terms than the 2-year bond’s. In the pricing equation, a longer maturity means more distant cash flows and a larger power of $(1+y)$ in the denominators; distant payments are far more sensitive to a change in the discount rate because the discount factor $(1+y)^{-t}$ changes proportionally faster as $t$ grows. The 20-year bond has a large fraction of its value in payments many periods away, so when $y$ rises, those far-off terms shrink dramatically and the whole price drops sharply. The 2-year bond’s value is concentrated in near-term payments that are barely affected by the higher discount rate, so its price moves little. Longer maturity therefore means greater price sensitivity to yield changes.
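A small numerical sketch makes the maturity effect concrete. The inputs are assumptions chosen for illustration only (annual coupon 5 on face 100, yield rising from 5% to 6%); the helper names are hypothetical.

```python
def bond_price(b, M, y, T):
    """Price of T per-period coupons b plus face value M, discounted at yield y."""
    return sum(b / (1 + y) ** t for t in range(1, T + 1)) + M / (1 + y) ** T

def pct_price_change(b, M, y_old, y_new, T):
    """Percentage price change when the yield moves from y_old to y_new."""
    p_old = bond_price(b, M, y_old, T)
    p_new = bond_price(b, M, y_new, T)
    return (p_new - p_old) / p_old

# Illustrative inputs: same coupon and yield move, different maturities.
for T in (2, 20):
    change = pct_price_change(5.0, 100.0, 0.05, 0.06, T)
    print(f"T = {T:2d}: price change = {change:.2%}")
```

Both changes are negative, and the 20-year bond's loss is several times the 2-year bond's, in line with the argument above.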

YLD-C11. The current yield is annual coupon income divided by price, $b/P$; it measures only the cash coupon return relative to the amount invested. The yield to maturity is the single discount rate solving the full pricing equation, and it incorporates all sources of return — coupon income and the capital gain or loss realized as the price converges to face value at maturity, along with the timing of every payment. The current yield omits this capital-gain/loss and timing dimension. A scenario in which the higher current yield delivers the lower yield to maturity: a premium bond (priced above par) has a high coupon and thus a high current yield, but because it will decline toward face value by maturity, that capital loss drags its yield to maturity below the current yield; a discount bond with a lower coupon (lower current yield) enjoys a capital gain to par that lifts its yield to maturity. Comparing the two by current yield alone can reverse their true ranking.

YLD-C12. For a bond with semiannual coupons and periodic yield $y$, the bond equivalent yield annualizes simply by doubling, $2y$, ignoring compounding of the mid-year coupon. The effective annual yield compounds the periodic yield over the two half-years, $(1+y)^2 - 1$, thereby crediting interest earned on the first semiannual coupon during the second half-year. Because $(1+y)^2 - 1 = 2y + y^2 \ge 2y$ for any $y \ge 0$, the effective annual yield is always at least as large as the bond equivalent yield; the extra term $y^2$ is exactly the interest-on-interest that simple annualization ignores. The two coincide only when $y = 0$ (no compounding to accumulate).

YLD-Q1. The bond sells at par: $P = M = \$1000$. By the par-bond result, a bond priced at par has yield to maturity equal to its coupon rate. The coupon rate here is $b/M = 70/1000 = 0.07$, so $y = 0.07$. No numerical root-finding is needed — one can verify directly that $\sum_{t=1}^{3}\frac{70}{1.07^t} + \frac{1000}{1.07^3} = 1000$. Result: $y = 0.07$ (7%).

YLD-Q2. Substitute $y = 0.07$: $\frac{60}{1.07} + \frac{60}{1.07^2} + \frac{1000}{1.07^2} = 56.075 + 52.406 + 873.439 = 981.92$, which equals the observed price $P = \$981.92$. So $y = 0.07$ is confirmed. Since $P = \$981.92 < M = \$1000$, the bond trades below par, and consistent with the chapter’s rule the yield to maturity (7%) exceeds the coupon rate $b/M = 60/1000 = 6\%$.
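When the yield is not given, it has to be found numerically. The following sketch uses bisection, which is one reasonable choice rather than the chapter's prescribed method; it relies only on the fact that price falls as yield rises. The bracket `[0, 1]` and the tolerance are assumptions.

```python
def bond_price(b, M, y, T):
    """Price of T per-period coupons b plus face value M, discounted at yield y."""
    return sum(b / (1 + y) ** t for t in range(1, T + 1)) + M / (1 + y) ** T

def ytm_bisection(P, b, M, T, lo=0.0, hi=1.0, tol=1e-10):
    """Solve bond_price(b, M, y, T) = P for y, assuming the root lies in [lo, hi]."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if bond_price(b, M, mid, T) > P:
            lo = mid    # model price too high, so the yield must be higher
        else:
            hi = mid    # model price too low (or equal), so the yield is at most mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

print(round(ytm_bisection(981.92, 60.0, 1000.0, 2), 4))   # YLD-Q2 inputs
print(round(ytm_bisection(1000.0, 70.0, 1000.0, 3), 4))   # YLD-Q1 inputs
```

Because the price is strictly decreasing in the yield, the bracket always contains exactly one root as long as the observed price lies between the prices at the two endpoints.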

YLD-Q3. Coupons of $\$80$ are received at $t = 1, 2, 3$; reinvested to maturity ($t = 3$) at 10%: the $t=1$ coupon grows to $80(1.10)^2 = 96.80$, the $t=2$ coupon to $80(1.10) = 88.00$, and the $t=3$ coupon stays $80.00$. Adding the face value $\$1000$: final cash $= 96.80 + 88.00 + 80.00 + 1000 = \$1264.80$. Then $1000(1+r)^3 = 1264.80 \Rightarrow r = (1.2648)^{1/3} - 1 = 0.0815$, i.e. about 8.15%. Because reinvestment occurred at 10% > 8%, the realized compound return exceeds the quoted yield of 8%.

YLD-Q4. The first coupon ($\$50$, received at $t=1$) is reinvested at 3% to $t=2$: $50(1.03) = 51.50$. The second coupon ($\$50$) is received at $t=2$, plus the face value $\$1000$. Final cash $= 51.50 + 50 + 1000 = \$1101.50$. Then $1000(1+r)^2 = 1101.50 \Rightarrow r = (1.1015)^{1/2} - 1 = 0.0495$, about 4.95%. Because the coupon was reinvested at 3% < 5%, the realized compound return (4.95%) falls short of the quoted yield to maturity (5%).
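The calculations in YLD-Q3 and YLD-Q4 follow the same recipe, sketched below. The function name and signature are illustrative; the sketch assumes a single constant reinvestment rate and coupons paid at the end of each period.

```python
def realized_compound_return(P, b, M, T, reinvest_rate):
    """Per-period compound return when each coupon is reinvested at reinvest_rate until T."""
    # The coupon received at time t compounds for the remaining T - t periods.
    coupon_value = sum(b * (1 + reinvest_rate) ** (T - t) for t in range(1, T + 1))
    final_cash = coupon_value + M
    return (final_cash / P) ** (1 / T) - 1

print(round(realized_compound_return(1000.0, 80.0, 1000.0, 3, 0.10), 4))  # YLD-Q3 inputs
print(round(realized_compound_return(1000.0, 50.0, 1000.0, 2, 0.03), 4))  # YLD-Q4 inputs
print(round(realized_compound_return(1000.0, 80.0, 1000.0, 3, 0.08), 4))  # reinvest at the yield itself
```

The third call reinvests at the bond's own yield and recovers that yield, which is the reinvestment assumption stated in YLD-C1.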

YLD-Q5. Purchase price at periodic yield $y = 0.04$, 20 periods: $P_0 = \frac{6}{0.04}\left(1 - \frac{1}{1.04^{20}}\right) + \frac{100}{1.04^{20}} = 150(1 - 0.456387) + 45.639 = 81.542 + 45.639 = \$127.181$. Six months later, 19 periods remain: $P_1 = \frac{6}{0.04}\left(1 - \frac{1}{1.04^{19}}\right) + \frac{100}{1.04^{19}} = 150(1 - 0.474642) + 47.464 = 78.804 + 47.464 = \$126.268$. Holding period return over the first six months: $R = \frac{P_1 + b - P_0}{P_0} = \frac{126.268 + 6 - 127.181}{127.181} = \frac{5.087}{127.181} = 0.040$, i.e. 4% — exactly the new periodic market rate, as expected.
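The YLD-Q5 computation can be reproduced with the closed-form annuity price. This is a sketch under the problem's inputs (coupon 6 per period, face 100, periodic yield 4%); the helper names are assumptions.

```python
def annuity_bond_price(b, M, y, n):
    """Closed-form price: annuity of n coupons b plus face M, at periodic yield y > 0."""
    discount = (1 + y) ** (-n)
    return b / y * (1 - discount) + M * discount

def holding_period_return(P0, P1, b):
    """One-period return from buying at P0, collecting coupon b, and selling at P1."""
    return (P1 + b - P0) / P0

P0 = annuity_bond_price(6.0, 100.0, 0.04, 20)   # 20 periods remaining at purchase
P1 = annuity_bond_price(6.0, 100.0, 0.04, 19)   # 19 periods remaining six months later
print(round(P0, 3), round(P1, 3), round(holding_period_return(P0, P1, 6.0), 6))
```

When the yield is unchanged between purchase and sale, the one-period return equals that yield; the small price decline is the premium amortizing toward par.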

YLD-Q6. $R = \frac{P_1 + b - P_0}{P_0} = \frac{955 + 40 - 980}{980} = \frac{15}{980} = 0.0153$, about 1.53%. Coupon component: $\frac{b}{P_0} = \frac{40}{980} = 0.0408$ (4.08%). Capital-gain component: $\frac{P_1 - P_0}{P_0} = \frac{955 - 980}{980} = -0.0255$ (−2.55%). The two sum to the total: $0.0408 - 0.0255 = 0.0153$. The capital loss partly offsets the coupon income.

YLD-Q7. The periodic yield is $y = 0.04$. The bond equivalent yield (simple annualization) is $2y = 2(0.04) = 0.08$, or 8%. The effective annual yield is $(1+y)^2 - 1 = (1.04)^2 - 1 = 1.0816 - 1 = 0.0816$, or 8.16%. The effective annual yield exceeds the bond equivalent yield by 16 basis points, reflecting compounding of the first semiannual coupon.

YLD-Q8. The periodic yield is $y = 0.035$. Bond equivalent yield: $2y = 2(0.035) = 0.07$, or 7%. Effective annual yield: $(1+y)^2 - 1 = (1.035)^2 - 1 = 1.071225 - 1 = 0.071225$, or about 7.1225%. The effective annual yield exceeds the bond equivalent yield by $0.071225 - 0.07 = 0.001225$, i.e. about 12.25 basis points (the interest-on-interest term $y^2 = 0.001225$).
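The two annualization conventions in YLD-Q7 and YLD-Q8 can be compared in a few lines. The function names are illustrative.

```python
def bond_equivalent_yield(y):
    """Simple annualization of a semiannual periodic yield y."""
    return 2 * y

def effective_annual_yield(y):
    """Compound annualization of a semiannual periodic yield y."""
    return (1 + y) ** 2 - 1

for y in (0.04, 0.035):   # periodic yields from YLD-Q7 and YLD-Q8
    bey = bond_equivalent_yield(y)
    eay = effective_annual_yield(y)
    gap_bp = (eay - bey) * 1e4   # gap in basis points; equals y**2 * 1e4
    print(f"y = {y}: BEY = {bey:.4%}, EAY = {eay:.4%}, gap = {gap_bp:.2f} bp")
```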

YLD-Q9. Current yield $= \frac{b}{P} = \frac{65}{928.57} = 0.0700$, i.e. 7%. The coupon rate is $b/M = 65/1000 = 6.5\%$. Since $P = \$928.57 < M = \$1000$, the bond sells below par, so by the chapter’s rule the yield to maturity is above the coupon rate of 6.5%.

YLD-Q10. Current yield $= \frac{b}{P} = \frac{70}{1050} = 0.0667$, i.e. 6.67%. Coupon rate $= b/M = 70/1000 = 0.07$, i.e. 7%. Since $P = \$1050 > M = \$1000$, the bond trades above par, so the yield to maturity is below the coupon rate. For a premium bond the ordering from highest to lowest is: coupon rate (7%) > current yield (6.67%) > yield to maturity (below 6.67%). The coupon rate tops the list because it is measured against face value; the current yield is lower because the price paid exceeds face; and the yield to maturity is lowest because it further subtracts the capital loss the holder will incur as the premium price declines to par at maturity.

17.8 The Yield Curve

YC-C1. When the yield curve is not flat, each cash flow of the note must be discounted at the spot rate whose maturity matches the timing of that cash flow, not at a single rate. The stripping-and-reconstitution argument shows why: the coupon due in one year is, by itself, a one-year zero and must sell at the one-year spot rate $y_1$; the coupon-plus-principal due in two years is a two-year zero and must sell at the two-year spot rate $y_2$. If the note were instead priced by discounting both flows at $y_2$, its price would differ from the sum of the stripped pieces, and one could strip (or reconstitute) the note through the Treasury to capture a riskless profit. The correct pricing rule is therefore $P = \frac{b}{1+y_1} + \frac{b+M}{(1+y_2)^2}$: coupon $b$ discounted at $y_1$, and the final coupon-plus-face $b+M$ discounted at $y_2$. Discounting everything at $y_2$ is correct only in the special case of a flat curve, $y_1 = y_2$.

YC-C2. If the coupon bond and a portfolio of zeros with identical cash flows traded at different prices, the arbitrageur buys the cheaper and sells the dearer today. Concretely, if the coupon bond is cheaper than its stripped pieces, buy the bond, ask the Treasury to strip it into its component zeros, and sell those zeros for more than you paid — a riskless profit realized immediately. If the bond is more expensive than the pieces, do the reverse: buy the individual zeros, reconstitute them into the coupon bond through the Treasury, and sell the reconstituted bond. Because the cash flows are identical by construction, the trade carries no risk and requires no waiting for the market to “correct.” The mere availability of this trade forces the coupon bond’s price to equal the sum of the values of its stripped cash flows: any gap would be seized instantly by arbitrageurs, so no gap can persist. This is the law of one price, and it holds independently of investor optimism or pessimism because it rests on cash-flow identity, not on beliefs.

YC-C3. The statement confuses a spot rate with a forward rate. The three-year spot rate $y_3$ is the yield to maturity, quoted today, on a zero-coupon bond that matures in three years; it summarizes the average rate an investor locks in over the entire three-year horizon starting now. It is not a forecast of any single future short rate. The forward rate $f_3$ is the future one-year rate, for the interval from year 2 to year 3, that is implied by today’s curve; it is what the market’s spot rates say the year-2-to-year-3 rate must be to rule out arbitrage. The two are linked by $1 + f_3 = \frac{(1+y_3)^3}{(1+y_2)^2}$: the three-year spot rate embeds the forward rates for each year, so $f_3$ is extracted from $y_3$ and $y_2$ rather than being read off $y_3$ directly. Only in a world of certainty does the forward rate $f_3$ equal the short rate $r_3$ that will actually prevail over that year-2-to-year-3 interval.

YC-C4. The spot rate $y_n$ is the yield to maturity observed today on an $n$-period zero, so it is observable now. The forward rate $f_n$ is computed today from the observed spot curve via $1 + f_n = \frac{(1+y_n)^n}{(1+y_{n-1})^{n-1}}$, so it too is observable now; it is a deterministic function of today’s data. The short rate $r_n$ is the one-period rate that will actually prevail over interval $n$; except for $r_1$ (today’s short rate, equal to $y_1$), the future short rates $r_2, r_3, \ldots$ are random variables not known until they arrive. Under certainty there is no randomness, so no-arbitrage forces the realized short rate to equal the implied forward rate, $r_n = f_n$. Under uncertainty they generally differ, and the chapter’s theories concern the sign of $f_n - Er_n$. Finally, $y_n$ is always a geometric average because compounding over $n$ periods gives $(1+y_n)^n = (1+r_1)(1+r_2)\cdots(1+r_n)$ (under certainty) or the analogous product of one-plus-forward-rates, so $1 + y_n = \left[(1+r_1)\cdots(1+r_n)\right]^{1/n}$.

YC-C5. The investor buys $x/M$ one-year zeros today and pays for them by shorting $(1+f_2)\,x/M$ two-year zeros, so the net cash outlay today is zero. The sizing follows from the prices: the purchase costs $\frac{x}{1+y_1}$, and each two-year zero sells for $\frac{M}{(1+y_2)^2}$, so the number shorted is $\frac{x}{1+y_1}\cdot\frac{(1+y_2)^2}{M} = (1+f_2)\frac{x}{M}$. At time 1 the one-year zeros mature and deliver exactly $x$ (the loan proceeds). At time 2 the short position in two-year zeros must be closed by paying their face value, which totals $(1+f_2)x$ (the repayment). Because every quantity is fixed today from the observed spot curve, there is no uncertainty. The effective borrowing rate is exactly $f_2$, not $y_1$ (which covers only year 1) or $y_2$ (which covers the full two years), because the trade isolates the single year-1-to-year-2 interval, and $f_2$ is precisely the rate the current curve implies for that interval.

YC-C6. No explicit forward contract is needed because a complete set of zeros lets the borrower manufacture the forward loan out of spot instruments. The construction shorts $(1+f_2)\,x/M$ two-year zeros today and uses the proceeds to buy $x/M$ one-year zeros today; the two legs are sized so that their time-0 cash flows exactly cancel, so nothing net changes hands today. At time 1, the one-year zeros mature and pay the borrower $x$ — this is the loan being “received.” Between time 0 and time 1 the two-year zero short position simply sits open, generating no intermediate cash flow. At time 2, the borrower must buy back the two-year zeros to close the short, paying their face value $(1+f_2)x$ — this is the loan being “repaid.” Thus cash flows out only at time 2 and in only at time 1, exactly like a one-period loan running from year 1 to year 2, and the synthetic contract replicates a forward loan at rate $f_2$ without any counterparty signing a forward agreement.

YC-C7. With only long-horizon (two-year) investors, everyone can lock in a certain two-year payoff by holding the two-year zero; the roll-over strategy (one-year zero, then reinvest at the random $r_2$) exposes them to interest-rate risk they do not want. Since investors are risk-averse, they will hold the risky roll-over strategy only if it offers a higher expected payoff than the safe two-year zero. Formally, indifference requires $EU((1+y_1)(1+r_2)) = U((1+y_2)^2)$, and Jensen’s inequality (for the strictly concave $U$) gives $U(E[(1+y_1)(1+r_2)]) > EU((1+y_1)(1+r_2)) = U((1+y_2)^2)$. Since $U$ is increasing, $E[(1+y_1)(1+r_2)] > (1+y_2)^2 = (1+y_1)(1+f_2)$, hence $E(1+r_2) > 1 + f_2$, i.e. $f_2 < Er_2$. The forward rate lies below the expected future short rate because long-horizon investors accept a lower implied rate on the safe long bond in exchange for avoiding reinvestment risk.

YC-C8. Under the expectations hypothesis $f_n = Er_n$, so the forward rates embedded in today’s curve are exactly the market’s forecasts of future short rates. Because the two-year (and longer) spot rate is a geometric average of the sequence of forward rates, $(1+y_2)^2 = (1+y_1)(1+f_2)$, the curve slopes upward ($y_2 > y_1$) precisely when $f_2 > y_1$, i.e. when the market expects the future short rate $Er_2$ to exceed today’s short rate $y_1$. So an upward slope signals expected rate increases. A downward-sloping (inverted) curve, $y_2 < y_1$, requires $f_2 < y_1$, which under the expectations hypothesis means $Er_2 < y_1$: the market expects future short rates to fall below today’s short rate — the configuration often read as a recession signal.

YC-C9. Liquidity preference theory decomposes the forward rate as $f_2 = Er_2 + LP$ with $LP > 0$. Because the forward rate is what drives the slope — $(1+y_2)^2 = (1+y_1)(1+f_2)$ — a positive $LP$ can make $f_2 > y_1$, and hence $y_2 > y_1$, even when the market expects short rates to be flat ($Er_2 = y_1$). The economic force $LP$ represents is a premium that risk-averse short-horizon investors demand as compensation for bearing the interest-rate risk of holding longer-term bonds. Contrast: under the expectations hypothesis an upward slope means the market forecasts rising short rates ($Er_2 > y_1$); under liquidity preference an upward slope can arise purely from the term premium $LP$, even with unchanged expected short rates. The two theories thus attribute the same observed slope to different causes — expectations versus risk compensation.

YC-C10. Short-horizon (one-year) investors would prefer to hold the safe one-year zero and earn $y_1$ with certainty. To hold the two-year zero for one year and sell it, they bear the risk that the resale price depends on the random future short rate $r_2$. Because they are risk-averse, they will accept this gamble only if its expected return exceeds the safe return, which forces $E\frac{(1+y_2)^2}{1+r_2} > \frac{(1+y_2)^2}{1+f_2}$; carrying the Jensen argument through shows this requires $f_2 > Er_2$, i.e. a positive liquidity premium $LP = f_2 - Er_2 > 0$. The liquidity preference theory posits that the market is dominated by these short-horizon investors — not the long-horizon investors of YC-C7 — so their demand for compensation dominates, pushing forward rates above expected future short rates and tilting the curve upward. It is the assumption that most real-world investors have short effective horizons (and are averse to interest-rate risk) that makes the short-investor case the empirically relevant one.

YC-C11. Under certainty the two strategies — (i) hold a one-year zero for one year, and (ii) buy a two-year zero and sell it after one year — deliver payoffs that are known in advance, and no-arbitrage forces their one-year returns to be equal. If they differed, an investor could short the lower-returning strategy and go long the higher-returning one for a riskless profit. After one year the two-year zero has become a one-year zero priced at $\frac{1000}{1+r_2}$, and substituting the no-arbitrage relation $1+r_2 = \frac{(1+y_2)^2}{1+y_1}$ shows the resale value grows the purchase price by exactly a factor of $1+y_1$. Therefore the gross one-year return is $1 + y_1$, and the net one-year holding-period return is $r = (1+y_1) - 1 = y_1$. It is the net return that equals $y_1$; the gross return is $1 + y_1$.

YC-C12. The statement is wrong because it reports the gross return as if it were the holding-period (net) return. The gross return is the ratio of ending value to beginning value, $\frac{\text{resale price}}{\text{purchase price}}$, and under certainty this ratio equals $1 + y_1$. The net holding-period return is the gross return minus one, $\frac{\text{resale price}}{\text{purchase price}} - 1 = (1+y_1) - 1 = y_1$. So the net one-year holding-period return is $y_1$, while $1 + y_1$ is the gross return. In the chapter’s derivation the “$-1$” enters exactly at $r = \frac{(1+y_2)^2}{1+r_2} - 1$, i.e. when the ending-over-beginning price ratio is converted into a net return by subtracting the initial dollar. Saying the holding-period return is $1 + y_1$ double-counts the return of principal.

YC-C13. Under the expectations hypothesis, the forward rates embedded in the curve equal expected future short rates, so an inverted curve — long yields below short yields — means the market expects future short rates to fall. Since forward rates below today’s short rate translate into $Er_n < y_1$, the market is forecasting rate cuts, which typically accompany an anticipated economic slowdown; this is why an inverted curve is historically a recession warning. A bank treasurer who borrows short and lends long should pay close attention: the profitability of that maturity mismatch depends on short funding rates staying low relative to the long assets, and an inversion signals both an expected reversal in rate direction and the heightened chance of a downturn that could impair the loan book and squeeze the funding spread. The configuration is precisely the term-structure signal the chapter frames as central to fixed-income risk management.

YC-C14. Borrowing short and lending long is typically profitable when the curve slopes upward because the bank pays the low short-term rate ($\approx y_1$) on its deposits while earning the higher long-term rate ($y_n$) on its loans, capturing the positive spread. In term-structure language, an upward slope means the forward rates — and hence the long spot rates that are their geometric average — exceed today’s short rate, whether because the market expects rising short rates (expectations hypothesis) or because a positive term/liquidity premium $LP$ is embedded (liquidity preference). The risk is that this spread is not locked in: the bank must keep rolling over its short-term funding at whatever short rate actually prevails, i.e. at the realized future short rates $r_n$, which are random. If realized short rates $r_n$ turn out higher than the forward rates $f_n$ priced into the long assets, the bank’s funding cost rises above the yield on its fixed long-term loans and the spread can vanish or turn negative — the maturity-mismatch exposure that the forward-rate framework of this chapter lets the treasurer measure rather than ignore.

YC-Q1. No-arbitrage price with $b = 60$, $M = 1000$, $(y_1, y_2) = (0.030, 0.035)$: \[P = \frac{60}{1.030} + \frac{60 + 1000}{1.035^2} = 58.2524 + \frac{1060}{1.071225} = 58.2524 + 989.5213 = 1047.774.\] So the correct price is about $\$1047.77$. Discounting both flows at $y_2 = 0.035$ instead gives $\frac{60}{1.035} + \frac{1060}{1.035^2} = 57.9710 + 989.5213 = 1047.492$, about $\$1047.49$, a different (incorrect) price because the year-1 coupon is over-discounted at $y_2 > y_1$. The two would agree only if the curve were flat.
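The comparison in YC-Q1 can be sketched in Python. The helper `price_with_spot_curve` is a hypothetical name; it assumes one cash flow per period, with the flow at time $t$ discounted at the spot rate $y_t$.

```python
def price_with_spot_curve(cash_flows, spot_rates):
    """Discount the cash flow at time t (t = 1, 2, ...) at its own spot rate y_t."""
    return sum(cf / (1 + y) ** t
               for t, (cf, y) in enumerate(zip(cash_flows, spot_rates), start=1))

flows = [60.0, 1060.0]                                        # coupon, then coupon plus face
correct = price_with_spot_curve(flows, [0.030, 0.035])        # each flow at its own spot rate
single_rate = price_with_spot_curve(flows, [0.035, 0.035])    # both flows at y_2
print(round(correct, 2), round(single_rate, 2), round(correct - single_rate, 2))
```

The gap between the two prices is exactly the mispricing of the first coupon, since the final cash flow is discounted at $y_2$ in both cases.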

YC-Q2. (a) Spot rates from $P = \frac{1000}{(1+y_n)^n}$. For $y_1$: $1 + y_1 = 1000/970.874 = 1.030000 \Rightarrow y_1 = 0.0300$. For $y_2$: $(1+y_2)^2 = 1000/942.596 = 1.060900 \Rightarrow 1 + y_2 = \sqrt{1.060900} = 1.030000 \Rightarrow y_2 = 0.0300$. For $y_3$: $(1+y_3)^3 = 1000/915.142 = 1.092727 \Rightarrow 1 + y_3 = 1.092727^{1/3} = 1.030000 \Rightarrow y_3 = 0.0300$. (Here the three prices happen to imply a flat 3.00% curve.) (b) Three-year note, face $\$1000$, annual coupon $\$50$, discount each flow at its own spot rate: \[P = \frac{50}{1.030} + \frac{50}{1.030^2} + \frac{1050}{1.030^3} = 48.5437 + 47.1298 + 960.8987 = 1056.572.\] Price is about $\$1056.57$. Equivalently, using the observed zero prices per $\$1000$ par: $50\cdot0.970874 + 50\cdot0.942596 + 1050\cdot0.915142 = 48.5437 + 47.1298 + 960.8991 = 1056.57$, the same price.

YC-Q3. With $(y_1, y_2, y_3) = (0.030, 0.035, 0.038)$: \[1 + f_2 = \frac{(1.035)^2}{1.030} = \frac{1.071225}{1.030} = 1.040024 \Rightarrow f_2 = 0.040024 \approx 4.00\%.\] \[1 + f_3 = \frac{(1.038)^3}{(1.035)^2} = \frac{1.118386}{1.071225} = 1.044026 \Rightarrow f_3 = 0.044026 \approx 4.40\%.\]
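The forward-rate formula extends to a whole curve. The sketch below is illustrative; it adopts the convention $f_1 = y_1$ (the formula with $n = 1$), and the function name is an assumption.

```python
def forward_rates(spot_rates):
    """Return [f_1, f_2, ...] where f_1 = y_1 and 1 + f_n = (1+y_n)^n / (1+y_{n-1})^(n-1)."""
    forwards = []
    for n, y_n in enumerate(spot_rates, start=1):
        if n == 1:
            forwards.append(y_n)
        else:
            y_prev = spot_rates[n - 2]   # spot rate for maturity n - 1
            forwards.append((1 + y_n) ** n / (1 + y_prev) ** (n - 1) - 1)
    return forwards

spots = [0.030, 0.035, 0.038]            # curve from YC-Q3
print([round(f, 6) for f in forward_rates(spots)])
```

Multiplying the terms $(1 + f_1)(1 + f_2)(1 + f_3)$ recovers $(1 + y_3)^3$, which is the geometric-average relation discussed in YC-C4.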

YC-Q4. Spot rates from prices ($M = 1000$): $1 + y_1 = 1000/952.381 = 1.050000 \Rightarrow y_1 = 0.0500$. $(1+y_2)^2 = 1000/898.452 = 1.113026 \Rightarrow 1 + y_2 = 1.055000 \Rightarrow y_2 = 0.0550$. Forward rate: \[1 + f_2 = \frac{(1.055)^2}{1.050} = \frac{1.113025}{1.050} = 1.060024 \Rightarrow f_2 = 0.060024 \approx 6.00\%.\]

YC-Q5. Under certainty $(1+y_3)^3 = (1+r_1)(1+r_2)(1+r_3)$: \[(1+y_3)^3 = 1.020 \times 1.030 \times 1.040 = 1.092624.\] \[1 + y_3 = 1.092624^{1/3} = 1.029968 \Rightarrow y_3 = 0.029968 \approx 3.00\%.\]

YC-Q6. Arbitrage relation $(1+y_2)^2 = (1+y_1)(1+r_2)$ with $y_2 = 0.05$, $y_1 = 0.04$: \[1 + r_2 = \frac{(1.05)^2}{1.04} = \frac{1.1025}{1.04} = 1.060096 \Rightarrow r_2 = 0.060096 \approx 6.01\%.\]

YC-Q7. $(y_1, y_2) = (0.04, 0.05)$, face $\$1000$. (a) Purchase price of the two-year zero today: $P_0 = \frac{1000}{(1.05)^2} = \frac{1000}{1.1025} = 907.029$. (b) Short rate under certainty: $1 + r_2 = \frac{(1.05)^2}{1.04} = \frac{1.1025}{1.04} = 1.060096 \Rightarrow r_2 = 0.060096$. After one year the bond is a one-year zero, resale price $P_1 = \frac{1000}{1+r_2} = \frac{1000}{1.060096} = 943.310$. (c) Net one-year holding-period return: $r = \frac{P_1 - P_0}{P_0} = \frac{943.310 - 907.029}{907.029} = \frac{36.281}{907.029} = 0.04000 = y_1$. It equals $y_1 = 0.04$. ✓ (d) The corresponding gross return is $\frac{P_1}{P_0} = \frac{943.310}{907.029} = 1.04000 = 1 + y_1$.

YC-Q8. $(y_1, y_2) = (0.03, 0.045)$, face $\$1000$. Purchase price today: $P_0 = \frac{1000}{(1.045)^2} = \frac{1000}{1.092025} = 915.730$. Short rate: $1 + r_2 = \frac{(1.045)^2}{1.03} = \frac{1.092025}{1.03} = 1.060218 \Rightarrow r_2 = 0.060218$. Resale price after one year: $P_1 = \frac{1000}{1.060218} = 943.201$. Gross one-year holding-period return: $\frac{P_1}{P_0} = \frac{943.201}{915.730} = 1.03000 = 1 + y_1$. Net return: $1.03000 - 1 = 0.03000 = y_1$. ✓ Reporting $1 + y_1 = 1.03$ as “the holding-period return” would be an error because that is the gross return (ending value per dollar invested); the holding-period (net) return subtracts the returned principal, giving $y_1 = 0.03$.
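The steps shared by YC-Q7 and YC-Q8 can be sketched as one function. It assumes certainty, so the future short rate is the one implied by no-arbitrage; the function name and return layout are illustrative.

```python
def one_year_return_on_two_year_zero(y1, y2, M=1000.0):
    """Buy a two-year zero, sell it after one year, assuming r_2 is known with certainty."""
    p0 = M / (1 + y2) ** 2                  # purchase price today
    r2 = (1 + y2) ** 2 / (1 + y1) - 1       # short rate implied by no-arbitrage
    p1 = M / (1 + r2)                       # resale price: now a one-year zero
    gross = p1 / p0                         # ending value per dollar invested
    net = gross - 1                         # holding-period return
    return p0, r2, p1, gross, net

for y1, y2 in [(0.04, 0.05), (0.03, 0.045)]:   # YC-Q7 and YC-Q8 inputs
    p0, r2, p1, gross, net = one_year_return_on_two_year_zero(y1, y2)
    print(round(p0, 3), round(r2, 6), round(gross, 5), round(net, 5))
```

Keeping `gross` and `net` as separate return values mirrors the distinction drawn in YC-C12.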

YC-Q9. $(y_1, y_2) = (0.0154, 0.0174)$. Forward rate: $1 + f_2 = \frac{(1.0174)^2}{1.0154} = \frac{1.035103}{1.0154} = 1.019404 \Rightarrow f_2 = 0.019404 \approx 1.9404\%$. Under the expectations hypothesis $Er_2 = f_2 = 0.019404$. Under liquidity preference with $LP = 0.004$ and the same $f_2$: $Er_2 = f_2 - LP = 0.019404 - 0.004 = 0.015404 \approx 1.5404\%$. The expected future short rate is lower once the term premium is stripped out of the forward rate.

YC-Q10. $y_1 = 0.03$, $Er_2 = 0.05$. (a) Expectations hypothesis, $f_2 = Er_2 = 0.05$: $(1+y_2)^2 = (1.03)(1.05) = 1.0815 \Rightarrow 1 + y_2 = 1.039952 \Rightarrow y_2 = 0.039952 \approx 4.00\%$. The curve slopes up ($y_2 > y_1$) because the market expects the short rate to rise. (b) With $LP = 0.01$, $f_2 = Er_2 + LP = 0.06$: $(1+y_2)^2 = (1.03)(1.06) = 1.0918 \Rightarrow 1 + y_2 = 1.044892 \Rightarrow y_2 = 0.044892 \approx 4.49\%$. The liquidity premium raises the forward rate and hence the two-year spot rate, steepening the curve relative to the expectations-hypothesis case (4.49% vs 4.00%).
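Both parts of YC-Q10 use the same relation, $(1+y_2)^2 = (1+y_1)(1+f_2)$ with $f_2 = Er_2 + LP$. A minimal sketch, with an illustrative function name and the premium defaulting to zero for the expectations-hypothesis case:

```python
def two_year_spot(y1, expected_r2, liquidity_premium=0.0):
    """Two-year spot rate implied by y1 and the forward rate f2 = E[r2] + LP."""
    f2 = expected_r2 + liquidity_premium
    return ((1 + y1) * (1 + f2)) ** 0.5 - 1

eh = two_year_spot(0.03, 0.05)          # part (a): expectations hypothesis
lp = two_year_spot(0.03, 0.05, 0.01)    # part (b): liquidity premium of 1%
print(round(eh, 6), round(lp, 6), round(lp - eh, 6))
```

Setting `expected_r2` equal to `y1` with a positive premium shows an upward slope with no expected rate change, the case described in YC-C9.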

YC-Q11. $(y_1, y_2) = (0.020, 0.025)$, $M = 1000$, want $\$2{,}000{,}000$ at time 1. (a) Prices: one-year zero $P_1 = \frac{1000}{1.020} = 980.392$; two-year zero $P_2 = \frac{1000}{(1.025)^2} = \frac{1000}{1.050625} = 951.814$. (b) To receive $\$2{,}000{,}000$ at time 1, buy $2{,}000{,}000/1000 = 2000$ one-year zeros, costing $2000 \times 980.392 = \$1{,}960{,}784$. (c) Fund this by shorting $q = 1{,}960{,}784 / 951.814 \approx 2060.049$ two-year zeros (carrying unrounded prices). (d) Repay at time 2 the face value of the shorted zeros: $qM = 2060.049 \times 1000 = \$2{,}060{,}049$. Verification: effective interest $= 2{,}060{,}049 - 2{,}000{,}000 = \$60{,}049$ on $\$2{,}000{,}000$, i.e. $60{,}049/2{,}000{,}000 = 0.030025$. Forward rate $1 + f_2 = \frac{(1.025)^2}{1.020} = \frac{1.050625}{1.020} = 1.030025 \Rightarrow f_2 = 0.030025$, matching the effective rate. ✓

YC-Q12. $(y_1, y_2, y_3) = (0.030, 0.035, 0.040)$, $x = \$500{,}000$, $M = 1000$. Forward rate: $1 + f_3 = \frac{(1.040)^3}{(1.035)^2} = \frac{1.124864}{1.071225} = 1.050073 \Rightarrow f_3 = 0.050073 \approx 5.01\%$. Number of (time-3) zeros to short: $q = (1 + f_3)\frac{x}{M} = 1.050073 \times \frac{500{,}000}{1000} = 1.050073 \times 500 = 525.036$. Amount repaid at time 3: $(1 + f_3)x = 1.050073 \times 500{,}000 = \$525{,}036$ (equivalently $qM = 525.036 \times 1000 = \$525{,}036$). The interest cost is about $\$25{,}036$, the forward rate $f_3 \approx 5.01\%$ applied to the $\$500{,}000$ loan.
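The synthetic forward loan can be sketched for a general interval. This is an illustration only: the function name, argument order, and the default face value are assumptions, and the sketch takes the spot rates for maturities $n-1$ and $n$ as inputs.

```python
def synthetic_forward_loan(x, y_short, y_long, n, M=1000.0):
    """Receive x at time n-1 and repay at time n, using zeros maturing at n-1 and n."""
    price_short = M / (1 + y_short) ** (n - 1)   # price of the zero maturing at n-1
    price_long = M / (1 + y_long) ** n           # price of the zero maturing at n
    units_bought = x / M                         # these zeros deliver x at time n-1
    cost_today = units_bought * price_short
    units_shorted = cost_today / price_long      # short sale exactly funds the purchase
    repayment = units_shorted * M                # face value owed at time n
    effective_rate = repayment / x - 1
    return units_shorted, repayment, effective_rate

q, repay, rate = synthetic_forward_loan(500_000.0, 0.035, 0.040, n=3)   # YC-Q12 inputs
print(round(q, 3), round(repay), round(rate, 6))
```

The effective rate returned by the function equals the forward rate for the interval, because the ratio of the two zero prices is exactly $1 + f_n$.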

17.9 Fixed Income Portfolio Management

FIP-C1. Property 2 states that Macaulay duration decreases in the coupon rate. Bond A (3% coupon) delivers a larger fraction of its total present value at maturity, so the center of gravity of its cash-flow stream sits farther out in time; bond B (9% coupon) returns more value early, pulling its center of gravity toward the present. Hence $D_A > D_B$. Because modified duration $D^* = D/(1+y)$ inherits this ordering and $\frac{\Delta P}{P} \approx -D^*\Delta y$, the higher-duration bond A loses the larger percentage of its value when yields rise by a small $\Delta y$.

FIP-C2. In $D = \frac{1}{P}\sum_t \frac{t\,CF_t}{(1+y)^t}$, the weight attached to time $t$ is the present value $CF_t/(1+y)^t$ divided by $P$. Raising $y$ discounts distant cash flows more heavily than near ones, because $(1+y)^{-t}$ falls faster the larger $t$ is. The far-future payments — which carry the largest time indices $t$ and therefore the biggest terms in the numerator — lose the most weight in the average. This shifts the time-weighted average (the center of gravity) toward the present, so Macaulay duration falls as $y$ rises.

FIP-C3. Modified duration measures interest-rate sensitivity: $\frac{\Delta P}{P}\approx -D^*\Delta y$. Although both 30-year bonds mature at the same time, the coupon bond returns substantial value before maturity, so its cash-flow center of gravity — its Macaulay duration — is well below 30, whereas the zero’s Macaulay duration equals its maturity $T = 30$. The zero therefore has the far larger $D^*$ and is much more sensitive, so it falls more in price for a given increase in $y$. “Same maturity” is not “same sensitivity.”

FIP-C4. With $\frac{\Delta P}{P}\approx -D^*\Delta y$, sensitivity is governed entirely by $D^*$. Longer maturity spreads cash flows further into the future, raising duration (property 3), while a lower coupon shifts more value toward the distant principal payment, also raising duration (property 2). For a long-maturity, low-coupon bond both effects push duration up, so it is far more rate-sensitive than a short-maturity, high-coupon bond, where both effects push duration down. Because $D^*$ collapses the whole cash-flow timing into a single number, and portfolio duration is the value-weighted average of component durations, a manager can compare and aggregate this sensitivity across the entire portfolio with one statistic.

FIP-C5. Property 2: duration decreases in the coupon rate, because a higher coupon moves more of the bond’s value forward in time, lowering the time-weighted average date at which the holder receives value. A fixed-coupon perpetuity has no maturity yet a duration of $(1+y)/y$ that is independent of maturity (it has no maturity to depend on); a zero coupon bond has duration exactly equal to its maturity $T$ and pays no coupon at all. It is the perpetuity whose duration does not depend on maturity. This connects to property 2 because as the coupon rate rises the bond behaves more like a stream of early payments — its center of gravity, and hence its duration, moves toward the present.

FIP-C6. Lengthening maturity usually adds cash flows further into the future and pushes the principal payment further out, both of which raise the time-weighted average date of receipt and hence duration. For a bond selling at a very deep discount, however, the price is dominated by the distant principal while the coupons are small relative to that principal’s present value; adding still more maturity can change the balance of present values in a way that fails to increase — and can even slightly decrease — the time-weighted average, so duration need not rise monotonically with maturity.

FIP-C7. “Duration-matched” means the Macaulay (equivalently modified) duration of the asset portfolio equals that of the liability stream, so both sides of the balance sheet have the same first-order sensitivity to yield. (a) If asset duration equals liability duration, a small parallel rise in yields lowers the present value of assets and liabilities by approximately the same percentage, leaving the net position roughly unchanged — the fund is immunized. (b) If asset duration is much shorter than liability duration, a rise in yields shrinks the liabilities more than the assets, which happens to help; but the dangerous case is a rise when assets are longer than liabilities. Silicon Valley Bank held long-duration Treasuries against short-duration deposits: when rates rose in 2022 the assets fell far more in value than the liabilities, opening a large funding gap and forcing losses on sale.

FIP-C8. The condition $V = 0$ guarantees that the present value of the zero exactly funds the present value of the obligation today ($Z = P_{obl}$): the two sides start balanced. The condition $\partial V/\partial y = 0$ guarantees that a small yield change moves both sides equally, so the net position is first-order insensitive to yield. Writing $\partial P_{obl}/\partial(1+y) = -\frac{D_{obl}}{1+y}P_{obl}$ and the zero’s $\partial Z/\partial(1+y) = -\frac{\tau}{1+y}Z$, the second condition becomes $\frac{D_{obl}}{1+y}P_{obl} - \frac{\tau}{1+y}Z = 0$; since $V=0$ already forces $Z = P_{obl}$, the common factors cancel and $\tau = D_{obl}$. It is the Macaulay duration of the obligation, not its maturity $T$, because duration — not maturity — measures the yield sensitivity that must be matched, and the obligation pays value throughout its life rather than all at $T$.

FIP-C9. Bond price is convex in yield, so the price-yield curve lies above its tangent line. Duration is the slope of that tangent (a first-order, linear approximation). The positive second derivative $\partial^2 P/\partial y^2 = \sum_t \frac{t(t+1)CF_t}{(1+y)^{t+2}} > 0$ means the true price curves upward away from the tangent as $\Delta y$ grows, so the linear estimate falls below the true price and understates it by more the larger the move. The sign of the error is the same whether yields rise or fall because convexity is a symmetric curvature term: the correction $+\tfrac12 C P(\Delta y)^2$ depends on $(\Delta y)^2$, which is positive regardless of the direction of the move.

FIP-C10. For two bonds of identical duration, convexity is a free upgrade: because the price-yield curve of the more convex bond bows out more, it gains more when yields fall and loses less when yields rise, so it weakly dominates the less convex bond for any yield move. A rational investor therefore always prefers higher convexity, and competition bids its price up — the manager is willing to pay a premium because the extra convexity delivers a favorable asymmetric payoff at no cost in duration.

FIP-C11. The hedge must satisfy three conditions: match present value, match duration, and match convexity. A single zero offers only two free choices (its face value $F$ and its maturity $\tau$), which can satisfy at most present value and duration — leaving convexity unmatched. Two zeros provide four free choices ($p_1, p_2, \tau_1, \tau_2$), enough to satisfy all three matching equations with a degree of freedom to spare. In the worked example, fixing equal present-value weights reduced the duration and convexity conditions to $\tau_1 + \tau_2 = 5.472$ and $\tau_1(\tau_1+1) + \tau_2(\tau_2+1) = 21.190$ — two equations in the two maturities, which a single zero could never satisfy simultaneously.

FIP-C12. The present value of a portfolio is the sum of the present values of its parts, and both duration and convexity are defined through derivatives of price divided by price. Differentiating $V = p_1 + p_2$ and dividing by $V$ makes each component’s contribution scale by its share of total value $w_i = p_i/V$; hence portfolio duration and convexity are the present-value-weighted (not face-value-weighted) averages of the components’. Matching Macaulay duration automatically matches modified duration because $D^* = D/(1+y)$ and both the obligation and the hedge are discounted at the same yield $y$, so the common factor $1/(1+y)$ cancels from both sides of the matching equation — only one duration condition is needed.

FIP-Q1. With cash flows $CF_1 = CF_2 = 60$ and $CF_3 = 1060$, $y = 0.06$, $P = 1000$: \[D = \frac{1}{1000}\left(1\cdot\frac{60}{1.06} + 2\cdot\frac{60}{1.06^2} + 3\cdot\frac{1060}{1.06^3}\right).\] The terms are $56.60$, $106.80$, and $2669.99$, summing to $2833.39$. Dividing by $1000$ gives $D = 2.8334$ years.

FIP-Q2. With $CF_1 = 40$, $CF_2 = 1040$, $y = 0.04$, $P = 1000$: \[D = \frac{1}{1000}\left(1\cdot\frac{40}{1.04} + 2\cdot\frac{1040}{1.04^2}\right) = \frac{1}{1000}\left(38.46 + 1923.08\right) = \frac{1961.54}{1000} = 1.9615 \text{ years}.\]
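
The duration calculations in FIP-Q1 and FIP-Q2 follow the same recipe, which can be sketched as a short function. The function name and the list-based cash-flow layout are illustrative assumptions.

```python
# Sketch: Macaulay duration as the PV-weighted average payment time.

def macaulay_duration(cash_flows, y):
    """cash_flows[i] is paid at time i + 1; y is the per-period yield."""
    pv = [cf / (1.0 + y) ** t for t, cf in enumerate(cash_flows, start=1)]
    price = sum(pv)
    return sum(t * v for t, v in enumerate(pv, start=1)) / price

d_q1 = macaulay_duration([60, 60, 1060], 0.06)   # FIP-Q1 bond
d_q2 = macaulay_duration([40, 1040], 0.04)       # FIP-Q2 bond

print(f"FIP-Q1 duration: {d_q1:.4f} years")
print(f"FIP-Q2 duration: {d_q2:.4f} years")
```

The function computes the price from the cash flows rather than taking it as an input, so it also works for bonds that do not trade at par.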

FIP-Q3. A zero’s Macaulay duration equals its maturity, so $D = T = 8$ years. Its modified duration is \[D^* = \frac{D}{1+y} = \frac{8}{1.05} = 7.619 \text{ years}.\] (The face value $M = 5000$ does not affect duration.)

FIP-Q4. $D^* = \frac{D}{1+y} = \frac{2.833}{1.06} = 2.673$ years. Economic meaning: since $\frac{\Delta P}{P}\approx -D^*\Delta y$, a 1-percentage-point rise in yield ($\Delta y = 0.01$) lowers the bond’s value by about $D^*\times 0.01 = 2.673\% \approx 2.67\%$.

FIP-Q5. Percentage change: $\frac{\Delta P}{P} = -D^*\Delta y = -7.2\times 0.0040 = -0.0288 = -2.88\%$. Dollar change: $\Delta P = -0.0288\times 40{,}000 = -\$1{,}152$. The portfolio loses about $1,152.

FIP-Q6. Percentage change: $\frac{\Delta P}{P} = -D^*\Delta y = -6.5\times(-0.0025) = +0.01625 = +1.625\%$. Dollar change: $\Delta P = 0.01625\times 25{,}000 = +\$406.25$. A yield decline raises the portfolio value by about $406.25.
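
The first-order estimates in FIP-Q5 and FIP-Q6 can be reproduced with a one-line rule. The helper name below is a hypothetical choice for illustration.

```python
# Sketch: duration-only estimate of the change in portfolio value.

def duration_price_change(value, mod_duration, dy):
    """Return (percentage change, dollar change) from dP/P ~ -D* dy."""
    pct = -mod_duration * dy
    return pct, pct * value

pct5, usd5 = duration_price_change(40_000, 7.2, 0.0040)    # FIP-Q5
pct6, usd6 = duration_price_change(25_000, 6.5, -0.0025)   # FIP-Q6

print(f"FIP-Q5: {pct5:+.4%}, {usd5:+,.2f}")
print(f"FIP-Q6: {pct6:+.4%}, {usd6:+,.2f}")
```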

FIP-Q7. For first-order immunity the zero’s modified duration must equal the liability’s, $D^* = 11$ years, so that both sides move together under $\frac{\Delta P}{P}\approx -D^*\Delta y$. A zero’s modified duration is $D^* = \tau/(1+y)$, so \[\tau = D^*(1+y) = 11\times 1.04 = 11.44 \text{ years}.\] (The zero must also have present value $250,000 to satisfy $V=0$.)

FIP-Q8. Present value: $V = \frac{200}{0.05}\left(1 - \frac{1}{1.05^4}\right) = 4000\left(1 - 0.822702\right) = 4000\times 0.177298 = \$709.19$. Required face value: from $V = F/(1+y)^\tau$, \[F = V(1+y)^\tau = 709.19\times 1.05^{2.5} = 709.19\times 1.12973 = \$801.19.\]
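
A quick numerical check of FIP-Q8 is sketched below. It takes $\tau = 2.5$ directly from the solution above rather than deriving it, and the helper name `annuity_pv` is an illustrative assumption.

```python
# Sketch: PV of a level annuity and the face value of the immunizing zero.

def annuity_pv(payment, y, n):
    """Present value of n end-of-period payments at per-period yield y."""
    return payment / y * (1.0 - (1.0 + y) ** -n)

y = 0.05
tau = 2.5                      # maturity of the zero, as used in the solution
V = annuity_pv(200.0, y, 4)    # present value of the obligation
F = V * (1.0 + y) ** tau       # face value so the zero's PV equals V

print(f"V = {V:.2f}, growth factor = {(1.0 + y) ** tau:.5f}, F = {F:.2f}")
```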

FIP-Q9. With $CF_1 = 50$, $CF_2 = 1050$, $y = 0.05$, $P = 1000$: \[Convexity = \frac{1}{1000(1.05)^2}\left(\frac{1\cdot 2\cdot 50}{1.05^1} + \frac{2\cdot 3\cdot 1050}{1.05^2}\right).\] The bracket is $\frac{100}{1.05} + \frac{6300}{1.1025} = 95.238 + 5714.286 = 5809.524$. Dividing by $1000\times 1.1025 = 1102.5$ gives $C = 5.2694$.

FIP-Q10. Using the zero’s closed form with $T = 3$, $y = 0.05$: \[Convexity = \frac{T(T+1)}{(1+y)^2} = \frac{3\cdot 4}{1.05^2} = \frac{12}{1.1025} = 10.884.\] Check via the general formula on a single cash flow at $t=3$ with $P = M/1.05^3$: the numerator $\frac{3\cdot4\,M}{1.05^{3}}$ divided by $P(1.05)^2 = \frac{M}{1.05^3}(1.05)^2$ leaves $\frac{12}{1.05^2} = 10.884$ — the same value.
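
The convexity formula used in FIP-Q9 and FIP-Q10 can be sketched as follows. The function name and the convention that `cash_flows[i]` is paid at time `i + 1` are assumptions made for this example.

```python
# Sketch: convexity = sum of t(t+1) PV(CF_t), divided by P (1+y)^2.

def convexity(cash_flows, y):
    """Convexity of a bond whose cash_flows[i] is paid at time i + 1."""
    price = sum(cf / (1.0 + y) ** t for t, cf in enumerate(cash_flows, start=1))
    weighted = sum(
        t * (t + 1) * cf / (1.0 + y) ** t
        for t, cf in enumerate(cash_flows, start=1)
    )
    return weighted / (price * (1.0 + y) ** 2)

c_coupon = convexity([50, 1050], 0.05)      # FIP-Q9 coupon bond
c_zero = convexity([0, 0, 1000], 0.05)      # FIP-Q10 three-year zero
closed_form = 3 * 4 / 1.05 ** 2             # T(T+1)/(1+y)^2 for the zero

print(f"coupon bond: {c_coupon:.4f}")
print(f"zero: {c_zero:.3f} (closed form {closed_form:.3f})")
```

The face value chosen for the zero is arbitrary, since it cancels between the numerator and the price.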

FIP-Q11. $D^* = D/(1+y) = 1.9524/1.05 = 1.85943$. With $P = 1000$, $C = 5.2694$, $\Delta y = 0.02$: \[\Delta P \approx -D^* P\,\Delta y + \tfrac12 C P(\Delta y)^2 = -1.85943(1000)(0.02) + \tfrac12(5.2694)(1000)(0.0004).\] This is $-37.1886 + 1.0539 = -\$36.13$. The convexity term adds back about $1.05 relative to the duration-only estimate of $-\$37.19$.

FIP-Q12. $D^* = 1.9524/1.05 = 1.85943$. With $P = 1000$, $C = 5.2694$, $\Delta y = -0.03$: \[\Delta P \approx -1.85943(1000)(-0.03) + \tfrac12(5.2694)(1000)(0.0009) = +55.783 + 2.371 = +\$58.15.\] Duration alone predicts a $+\$55.78$ gain; the convexity term ($+\$2.37$, always positive because it depends on $(\Delta y)^2$) raises the estimated gain. Convexity works in the investor’s favor: it enlarges the gain when yields fall, just as (in FIP-Q11) it cushions the loss when yields rise.
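
To see how good the second-order estimates in FIP-Q11 and FIP-Q12 are, one can compare them with an exact repricing of the bond. The sketch below assumes the two-year 5% coupon bond from FIP-Q9 and reuses the rounded $D$ and $C$ quoted in the solutions; all names are illustrative.

```python
# Sketch: duration-only and duration-plus-convexity estimates vs exact repricing.

def price(cash_flows, y):
    """Price of a bond whose cash_flows[i] is paid at time i + 1."""
    return sum(cf / (1.0 + y) ** t for t, cf in enumerate(cash_flows, start=1))

CASH_FLOWS = [50, 1050]
Y0 = 0.05
P0 = price(CASH_FLOWS, Y0)     # 1000 at par
D_MOD = 1.9524 / 1.05          # modified duration from the solution
CONVEXITY = 5.2694             # convexity from FIP-Q9

results = {}
for dy in (0.02, -0.03):
    linear = -D_MOD * P0 * dy
    quadratic = linear + 0.5 * CONVEXITY * P0 * dy ** 2
    exact = price(CASH_FLOWS, Y0 + dy) - P0
    results[dy] = (linear, quadratic, exact)
    print(f"dy={dy:+.2f}: duration {linear:+.2f}, "
          f"with convexity {quadratic:+.2f}, exact {exact:+.2f}")
```

In both directions the exact change lies above the duration-only estimate, and adding the convexity term closes most of that gap.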

17.10 Forwards and Futures

FF-C1. The gold-mining company is the hedger: it already has an asset position (10,000 ounces it will produce and sell) and wants to remove the price risk. It goes short a gold forward/futures contract — agreeing to deliver (sell) gold at a locked-in price $K$. If the spot price falls, the loss on the physical gold is offset by the gain $K - S_T$ on the short contract, so the company’s realized sale price is fixed near $K$. The hedge-fund trader is the speculator: holding no gold, the trader has no underlying exposure to offset and simply wants to profit from a directional view. Believing prices will rise, the trader goes long a contract — agreeing to buy at $K$ — earning $S_T - K > 0$ if the price rises as expected. The long trader takes on precisely the price risk the miner sheds.

FF-C2. The farmer is short futures at price $K$, so the short payoff is $K - S_T$. When the spot price rises sharply to a high $S_T$, the farmer sells the physical wheat in the spot market at that high price, earning more revenue than expected — but the short futures position loses $S_T - K$ (i.e. the payoff $K - S_T$ is negative). These two effects offset: the extra revenue on the physical sale is handed over on the futures, so the farmer’s net proceeds are approximately the locked-in price $K$ per bushel regardless of $S_T$. The farmer is not made worse off in absolute terms (net proceeds are what was targeted, near $K$); the hedge did exactly its job of removing uncertainty. What the farmer locked in is the delivery price $K$; what the farmer gave up is the upside — the windfall that an unhedged farmer would have kept when prices rose. Hedging trades away both downside and upside for certainty.

FF-C3. Being long a forward that delivers $Y$ units in one year is economically equivalent to a second strategy: borrow $SY$ today at rate $r$, buy $Y$ units spot now, and hold them. Both strategies deliver $Y$ units of the asset in one year in exchange for a cash outlay — the forward pays $FY$ at delivery, while the borrow-and-carry strategy repays the loan $(1+r)SY$. Because the two portfolios produce identical positions, no-arbitrage forces equal cost: $(1+r)SY = FY$, i.e. $F = (1+r)S$. If instead $(1+r)S > F$, an arbitrageur short-sells the asset today, receiving $S$ and investing it at $r$, while simultaneously going long the forward. In one year the invested proceeds have grown to $(1+r)S$; the trader pays $F$ under the forward to receive the asset, uses it to close the short, and keeps the riskless profit $(1+r)S - F > 0$. Such trading bids the forward price back up until $(1+r)S = F$.

FF-C4. Buying the asset today ties up cash that could otherwise earn the risk-free rate (or fund another use); the forward defers payment to delivery. The forward price therefore embeds the financing cost of carrying the asset — this is the cost-of-carry intuition, giving $F = Se^{rT}$ (continuous compounding) so that $F > S$ when $r > 0$. If, however, holding the asset itself generates value while it is carried — a dividend, a lease/storage-adjusted income stream, or a convenience yield $k$ (the benefit of physically holding a commodity) — then carrying the asset is cheaper by that yield. The net cost of carry falls to $r - k$, and the pricing relation becomes $F = Se^{(r-k)T}$. When $k > r$ the forward price falls below the spot price; when $k < r$ it remains above. The yield $k$ thus directly offsets the financing cost in the exponent.

FF-C5. An over-the-counter forward is a private bilateral agreement, so each side bears counterparty credit risk: if the price moves far against one party, that party may default, and the winning party has no institutional protection. A futures contract removes this by being standardized (fixed quantity, grade, and delivery dates) so it can trade on an organized exchange, and by interposing the exchange/clearinghouse as counterparty to every trade. To keep default risk from accumulating, the exchange requires each trader to post margin and marks the contract to market daily: gains and losses are settled every day by transferring cash between the short and long margin accounts. Because losses are collected as they occur rather than allowed to build up until maturity, no party can accumulate a large unpaid obligation, and a trader whose margin falls below the maintenance level receives a margin call. Standardization plus daily settlement together convert the forward’s open-ended credit exposure into a tightly controlled, continuously-settled position.

FF-C6. Both long traders are marked to market daily. Each day the corn futures price falls, the contract loses value, so the clearinghouse deducts that day’s price decline (per bushel × 5,000 bushels per contract) from each long trader’s margin account and credits it to the shorts. As the price keeps falling over the week, the long accounts are debited each day and the balances erode. A margin call is triggered when an account’s balance falls to (or below) the maintenance margin level; the trader must then deposit additional funds to restore the account, or the position is closed out. This differs from an otherwise-identical forward, which involves no interim cash flows at all: a forward settles only once, at maturity, so the same cumulative loss would be realized as a single payment at delivery. The futures thus demands cash as losses accrue (creating interim liquidity/financing demands), whereas the forward defers the entire settlement to expiration.

FF-C7. Basis risk is the residual price uncertainty that remains because the hedging instrument does not perfectly match the exposure. In the airline’s hedge there are two distinct mismatches. First, an asset mismatch: it hedges jet fuel using crude-oil futures, and the jet-fuel price and crude-oil price do not move one-for-one (refining spreads vary), so the spot price of fuel it must buy differs from the crude price the futures tracks. Second, a timing mismatch: the crude contract delivers one month after the airline actually buys fuel, so even for the same commodity the futures price at close-out need not equal the spot price on the airline’s purchase date. Because of these gaps, the price the airline effectively pays after closing the futures position is the futures price plus the (uncertain) basis, not a perfectly fixed number. Closing out the position removes the outright price level risk but leaves the basis — the difference between the fuel spot and the crude futures — unhedged, so the hedge is imperfect.

FF-C8. By definition $Basis(T) = S(T) - F_\tau(T)$, where $S(T)$ is the June spot and $F_\tau(T)$ is the July futures price at the June close-out date. The basis is generally not zero because the July contract has not expired at time $T$: it still reflects an extra month of cost-of-carry (financing, storage, convenience yield) and its own supply/demand, so July futures and June spot differ. Convergence forces the basis toward zero only at the July contract’s expiration, not in June. Nonetheless, the basis is far smaller and more predictable than the outright price: over a one-month horizon the difference between two closely related prices moves much less than either price level moves on its own. The farmer with no hedge faces the full swing in the spot price of wheat; the hedged farmer faces only the residual variation in the basis, which is a substantial reduction in risk even though it is not zero.

FF-C9. In such a market the natural hedgers are producers who are long the physical commodity and want to lock in a sale price, so they crowd onto the short side of the futures. For the market to clear, speculators must be induced to take the long side — bearing the price risk the producers shed. Speculators will only accept that risk if they expect to be compensated, which requires that the price they pay today be low enough to leave an expected profit: the futures price must sit below the expected future spot price, $F < ES_T$ (equivalently $ES_T > F$), the condition of normal backwardation. The long speculator then expects, on average, to buy at $F$ and see the asset worth $ES_T > F$ at delivery, earning $ES_T - F$. Economically, this discount is a risk premium — the reward paid by hedging producers to speculators for absorbing commodity price risk.

FF-C10. The expectations hypothesis holds that the futures price is simply the market’s unbiased forecast of the future spot, $ES_T = F$, with no risk premium built in either direction. Contango is the reverse of normal backwardation: it arises when the consumers of a commodity (natural buyers) dominate the demand for hedging. Wishing to lock in a purchase price, these hedgers crowd onto the long side of the futures market. To clear the market, speculators must be induced to take the short side and bear the risk the consumers shed. They will do so only if compensated, which requires selling futures at a price above the expected future spot, so that on average they gain $F - ES_T$. Hence in contango $ES_T < F$: the short side (the speculators) is rewarded for holding the risk the long-hedging consumers wanted to offload.

FF-C11. From $F = e^{(r-k)T}ES_T$, the sign of $F - ES_T$ is governed by $r - k$. A high-$\beta$ commodity carries substantial systematic risk, so investors demand a high required discount rate $k$ to hold it — in particular $k > r$. Then $r - k < 0$, the exponential factor $e^{(r-k)T} < 1$, and the futures price lies below the expected spot, $F < ES_T$ (normal backwardation). The pure expectations hypothesis, by contrast, predicts $F = ES_T$ exactly, with no discount. The two views are inconsistent for a positive-$\beta$ asset precisely because expectations pricing ignores the risk premium: a positive-$\beta$ asset commands $k > r$, which drives a wedge $F < ES_T$ that the expectations hypothesis assumes away. Only for a zero-risk-premium asset ($k = r$) do the two coincide.

FF-C12. The two starting equations are pricing statements for the same spot price $S$. First, $S = e^{-kT}ES_T$ discounts the expected future spot at the risk-adjusted rate $k$ — the rate investors require to hold the (risky) asset, so it embeds the asset’s risk premium. Second, $S = e^{-rT}F$ is the cost-of-carry relation $F = Se^{rT}$ rearranged, discounting the futures price at the risk-free rate (valid when interest rates are uncorrelated with the mark-to-market flows). Setting the two right-hand sides equal, $e^{-rT}F = e^{-kT}ES_T$, and multiplying both sides by $e^{rT}$ isolates $F = e^{(r-k)T}ES_T$. For a negative-$\beta$ asset, its returns move opposite the market, so it is desirable as a hedge and investors accept a low required return: $k < r$. Then $r - k > 0$, $e^{(r-k)T} > 1$, and $F > ES_T$ — the futures price exceeds the expected spot (a contango outcome), because the asset’s negative systematic risk commands a negative risk premium.

FF-Q1. Cost-of-carry gives the fair forward $F = (1+r)S = (1.04)(30) = \$31.20$ per ounce. If the market instead quotes $F = \$32 > 31.20$, the forward is overpriced, so exploit the mispricing $(1+r)S < F$: short the forward (agree to deliver at 32) and borrow $30$ to buy one ounce spot now. In one year, deliver the ounce to settle the short forward, receive $\$32$, and repay the loan $(1+r)S = \$31.20$. Riskless profit $= F - (1+r)S = 32 - 31.20 = \$0.80$ per ounce.
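
The cash flows of the trade in FF-Q1 can be laid out explicitly. This is a minimal sketch assuming annual compounding and no storage costs or income on the asset; the function name is illustrative.

```python
# Sketch: borrow, buy spot, and short the forward when F > (1 + r) S.

def carry_trade_profit(spot, forward, r):
    """Profit per unit at delivery from borrowing `spot`, buying the asset,
    and delivering it into a short forward at `forward`."""
    loan_repayment = spot * (1.0 + r)
    return forward - loan_repayment

S, r = 30.0, 0.04
fair_forward = S * (1.0 + r)
profit = carry_trade_profit(S, 32.0, r)

print(f"fair forward {fair_forward:.2f}, profit per ounce {profit:.2f}")
```

A negative result from `carry_trade_profit` would indicate that the reverse trade (short the asset, lend the proceeds, go long the forward) is the profitable one.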

FF-Q2. Fair forward: $F = Se^{rt} = 2000\,e^{0.05 \times 0.5} = 2000\,e^{0.025} = 2000(1.025315) = \$2{,}050.63$. The actual six-month forward of $\$2{,}080$ exceeds this, so the forward is overpriced relative to cost-of-carry by $2080 - 2050.63 = \$29.37$.

FF-Q3. Gold forward: $F = Se^{rt} = 1800\,e^{0.05 \times 0.5} = 1800\,e^{0.025} = 1800(1.025315) = \$1{,}845.57$ per ounce. Currency implied rate: with $S = 1.4208$, $F = 1.4187$, $T = 0.25$, \[r = \frac{1}{T}\ln\!\left(\frac{F}{S}\right) = \frac{1}{0.25}\ln\!\left(\frac{1.4187}{1.4208}\right) = 4\ln(0.998522) = 4(-0.0014791) = -0.0059165 \approx -0.59\%.\] The implied rate is negative. Because the forward is below the spot ($F < S$), the ratio $F/S < 1$, its natural log is negative, and so is $r$: the currency’s forward trading at a discount to spot signals a slightly negative annualized implied dollar interest rate over the quarter.
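
The continuous-compounding relation $F = Se^{rT}$ and its inversion for the implied rate can be sketched as two small helpers. The function names are assumptions made for this example.

```python
# Sketch: forward price under continuous compounding, and the implied rate.
import math

def forward_price(spot, r, T):
    """Cost-of-carry forward price F = S exp(rT)."""
    return spot * math.exp(r * T)

def implied_rate(spot, forward, T):
    """Rate r that solves F = S exp(rT)."""
    return math.log(forward / spot) / T

f_q2 = forward_price(2000.0, 0.05, 0.5)        # FF-Q2 fair forward
f_gold = forward_price(1800.0, 0.05, 0.5)      # FF-Q3 gold forward
r_fx = implied_rate(1.4208, 1.4187, 0.25)      # FF-Q3 currency implied rate

print(f"FF-Q2 fair forward: {f_q2:.2f}")
print(f"FF-Q3 gold forward: {f_gold:.2f}")
print(f"FF-Q3 implied rate: {r_fx:.4%}")
```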

FF-Q4. Long payoff $S_T - K$ with $K = 59$: (a) $56 - 59 = -\$3$; (b) $63 - 59 = +\$4$. Short payoff $K - S_T$: (a) $59 - 56 = +\$3$; (b) $59 - 63 = -\$4$. In each case the long and short payoffs sum to zero: $-3 + 3 = 0$ and $+4 + (-4) = 0$. A forward is a zero-sum contract — one party’s gain is exactly the other’s loss.

FF-Q5. (a) Physical revenue: $1{,}000{,}000 \times \$52 = \$52{,}000{,}000$. (b) Short futures payoff per barrel is $K - S_T = 59 - 52 = \$7$; over 1,000,000 barrels (1,000 contracts × 1,000 bbl) the gain is $7 \times 1{,}000{,}000 = \$7{,}000{,}000$. (c) Net proceeds $= \$52\text{M} + \$7\text{M} = \$59{,}000{,}000$, i.e. $\$59$ per barrel — the hedge locks in approximately the futures price of $\$59$/barrel regardless of the realized spot.
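
The claim that the hedge locks in the futures price regardless of the realized spot can be checked across several scenarios. The sketch below ignores basis risk and margin financing, and the function name is illustrative.

```python
# Sketch: net proceeds of a producer who sells spot and is short futures at K.

def hedged_proceeds(spot_T, futures_K, quantity):
    """Physical sale revenue plus the short futures payoff (K - S_T) per unit."""
    physical = spot_T * quantity
    futures_gain = (futures_K - spot_T) * quantity
    return physical + futures_gain

K = 59.0
BARRELS = 1_000_000

for spot in (40.0, 52.0, 59.0, 75.0):
    total = hedged_proceeds(spot, K, BARRELS)
    print(f"S_T = {spot:5.2f}: net proceeds = {total:,.0f}")
```

Every scenario produces the same total, which is the sense in which the short futures position removes both the downside and the upside.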

FF-Q6. $Basis(T) = S(T) - F_\tau(T) = 6.20 - 6.35 = -\$0.15$ per bushel. The basis is negative (spot below futures). A farmer who is short the futures to hedge a sale effectively receives $F_{\text{initial}} + Basis(T)$ at close-out (sell spot at $S(T)$, buy back the futures at $F_\tau(T)$); a negative basis means the spot came in below the futures level, which hurt the farmer relative to a zero basis — the realized proceeds are $\$0.15$/bushel lower than they would have been had the basis been zero.

FF-Q7. $Basis(T) = 385 - 383 = 2$ cents per bushel (spot above futures). Since convergence forces the basis to zero at expiration, a trader expecting the 2-cent gap to close can buy the futures (long, at 383) and sell the corn short in the spot market (at 385); as the futures price rises toward the spot at expiration, the trader closes both legs and captures the 2-cent convergence. Gross profit per 5,000-bushel contract $= 0.02 \times 5{,}000 = \$100$ (2 cents × 5,000 bushels).

FF-Q8. $F = e^{(r-k)T}ES_T = e^{(0.03 - 0.08)(1)}(100) = e^{-0.05}(100) = 0.951229 \times 100 = \$95.12$. Since $k > r$, the exponent is negative and $F = \$95.12 < ES_T = \$100$: the futures price lies below the expected spot. This is consistent with normal backwardation ($ES_T > F$).

FF-Q9. $F = e^{(r-k)T}ES_T = e^{(0.04 - 0.01)(2)}(50) = e^{0.06}(50) = 1.061837 \times 50 = \$53.09$. Here $k < r$ (negative-$\beta$ asset), the exponent is positive, and $F = \$53.09 > ES_T = \$50$: the futures price exceeds the expected spot. This matches contango ($ES_T < F$).
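
FF-Q8 and FF-Q9 apply the same formula with opposite signs of $r - k$. A minimal sketch, with an illustrative function name, makes the comparison direct.

```python
# Sketch: futures price implied by the expected spot, F = exp((r - k) T) E[S_T].
import math

def futures_from_expected_spot(expected_spot, r, k, T):
    """r is the risk-free rate, k the required return on the asset."""
    return math.exp((r - k) * T) * expected_spot

f_q8 = futures_from_expected_spot(100.0, r=0.03, k=0.08, T=1.0)
f_q9 = futures_from_expected_spot(50.0, r=0.04, k=0.01, T=2.0)

print(f"FF-Q8: F = {f_q8:.2f} (below expected spot 100)")
print(f"FF-Q9: F = {f_q9:.2f} (above expected spot 50)")
```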

FF-Q10. (a) New contract value at 390 cents: $3.90 \times 5{,}000 = \$19{,}500$. (b) Marking-to-market gain to the long: $19{,}500 - 19{,}409.375 = \$90.625$, credited to the long margin account. (c) New margin balance: $2{,}000 + 90.625 = \$2{,}090.625$. (d) A margin call occurs when the balance falls to the $1,500 maintenance level, a loss of $2{,}000 - 1{,}500 = \$500$ from the starting $2,000. That corresponds to the contract value falling by $500, to $19{,}409.375 - 500 = \$18{,}909.375$, i.e. price $= 18{,}909.375 / 5{,}000 = \$3.781875$ per bushel $= 378.1875$ cents, or $378\tfrac{6}{32}$ cents per bushel.
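
The mark-to-market arithmetic in FF-Q10 can be sketched for a single long contract. The entry price is backed out from the contract value of 19,409.375 quoted in the solution; the function and constant names are illustrative assumptions.

```python
# Sketch: daily mark-to-market and the margin-call trigger price (long side).

CONTRACT_SIZE = 5_000   # bushels per contract

def margin_after_move(balance, old_price, new_price, size=CONTRACT_SIZE):
    """Long margin balance after the futures price moves (prices in dollars)."""
    return balance + (new_price - old_price) * size

def margin_call_price(entry_price, initial, maintenance, size=CONTRACT_SIZE):
    """Futures price at which a long account falls to the maintenance level."""
    return entry_price - (initial - maintenance) / size

entry_price = 19_409.375 / CONTRACT_SIZE   # dollars per bushel
balance = margin_after_move(2_000.0, entry_price, 3.90)
trigger = margin_call_price(entry_price, 2_000.0, 1_500.0)

print(f"entry price: {entry_price * 100:.4f} cents")
print(f"balance after move to 390 cents: {balance:.3f}")
print(f"margin call at {trigger * 100:.4f} cents")
```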

17.11 Options

OPT-C1. The call is a right, not an obligation. If the stock falls to $30, the strike-$42 call is out of the money and the holder simply walks away, exercising nothing; the net payoff after the premium is $\max\{30-42,0\} - 4 = 0 - 4 = -4$. The premium $c$ is the entire cost, and the $\max\{S_T-K,0\}$ term can never go below zero, so the loss is capped at that premium. The leveraged stock holder, by contrast, holds the underlying directly and bears every dollar of decline: 100 shares falling from $40 to $30 lose $1,000, wiping out the $4 margin many times over and generating a margin call. The floor in the option payoff comes precisely from the $\max\{\cdot,0\}$ operator: the option converts an unlimited downside into a fixed, prepaid premium, which is what “right without obligation” means in cash terms.

OPT-C2. For the underlying price $S_0$: a higher $S_0$ makes a call more likely to finish above $K$ (in the money) and a put less likely to finish below $K$, so the call rises and the put falls. For the strike $K$: a higher $K$ is a higher hurdle for the call (less likely in the money, so the call falls) but a higher floor for the put (more likely in the money, so the put rises). These two factors are exact mirror images because a call profits when $S_T$ exceeds $K$ and a put profits when $S_T$ falls below $K$: they sit on opposite tails of the same distribution. Volatility is different: greater dispersion of $S_T$ fattens both tails, and since each option’s payoff is truncated at zero on its unfavorable side, only the favorable tail matters to each holder. More volatility therefore raises the value of both the call and the put, making it the sole factor that moves them the same way.

OPT-C3. The strike is a fixed amount owed at exercise. A higher $r$ lowers the present value $Ke^{-rT}$ of that fixed payment. For the call holder, who will pay $K$ to buy the stock, a smaller present value of the obligation makes the call more valuable: this is exactly the $-Ke^{-rT}$ term entering the Black-Scholes call price, which becomes less negative as $r$ rises. For the put holder, who will receive $K$ by selling the stock, a higher $r$ shrinks the present value of that receipt, lowering the put. Put-call parity $c - p = S_0 - Ke^{-rT}$ makes this crisp: raising $r$ raises the right-hand side, so $c-p$ increases, with the call gaining and the put losing. The colleague’s error is treating $K$ as a cost to both parties; it is a payment owed by the call holder but received by the put holder, so discounting it helps one and hurts the other.

OPT-C4. Parity requires $c + Ke^{-rT} = p + S_0$. Portfolio 1 (call plus $Ke^{-rT}$ cash) costs $c + Ke^{-rT} = 6 + 46 = \$52$. Portfolio 2 (stock plus put) costs $S_0 + p = 50 + 1 = \$51$. The two sides differ by $1, so parity is violated and an arbitrage exists: portfolio 1 (the call-plus-cash side) is overpriced by $1 relative to portfolio 2. The arbitrageur sells the expensive portfolio (writes the call for $6 and borrows $Ke^{-rT} = \$46$) and buys the cheap one (buys the stock for $50 and the put for $1). She collects $6 + 46 = \$52$ from the short portfolio and pays $50 + 1 = \$51$ for the long portfolio, netting $+\$1$ up front. At expiration both portfolios pay exactly $\max(S_T, K)$, so the long and short legs cancel dollar-for-dollar in every state, leaving zero exposure. The $1 collected today is therefore a riskless profit.
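
The arbitrage in OPT-C4 can be verified state by state. The sketch below uses the prices from the problem; the strike itself is not quoted in the solution, so `K_ASSUMED = 48.0` is a purely hypothetical value chosen for illustration (the cancellation holds for any strike). Function names are likewise illustrative.

```python
# Sketch: upfront cash and expiry payoff of the put-call parity arbitrage.

def upfront_cash(c, p, s0, pv_strike):
    """Cash today from selling (call + borrowing PV(K)) and buying (stock + put)."""
    return (c + pv_strike) - (s0 + p)

def net_payoff_at_expiry(s_T, K):
    """Long stock + put, minus what is owed on the written call and the loan."""
    long_leg = s_T + max(K - s_T, 0.0)     # equals max(S_T, K)
    short_leg = max(s_T - K, 0.0) + K      # also equals max(S_T, K)
    return long_leg - short_leg

K_ASSUMED = 48.0   # hypothetical strike, not given in the solution

today = upfront_cash(c=6.0, p=1.0, s0=50.0, pv_strike=46.0)
print(f"cash collected today: {today:+.2f}")
for s_T in (20.0, 48.0, 50.0, 90.0):
    print(f"S_T = {s_T:5.1f}: net payoff = {net_payoff_at_expiry(s_T, K_ASSUMED):+.2f}")
```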

OPT-C5. From parity, $c = p + S_0 - Ke^{-rT}$, and since $p \ge 0$ we have $c \ge S_0 - Ke^{-rT}$. An American call is worth at least its European counterpart, $C \ge c$, so $C \ge c \ge S_0 - Ke^{-rT} > S_0 - K$ (the last strict inequality because $Ke^{-rT} < K$ for $r>0$). The right-hand quantity $S_0 - K$ is exactly what exercising the call today yields; the left-hand quantity $C$ is what selling the option yields. Since selling always beats exercising, early exercise is never optimal and $C = c$. For a put the logic reverses: a put’s value is bounded above by $K$, and once a firm goes bankrupt the stock price is 0 and can fall no further, so the payoff $K$ is already maximal. Waiting only forgoes the interest that could be earned by collecting $K$ now; exercising early to reinvest the proceeds can therefore be strictly optimal, which is why $P > p$.

OPT-C6. Put-call parity is the statement that two economically distinct portfolios have identical payoffs at expiration and must therefore cost the same today. Portfolio 1 is a call plus $Ke^{-rT}$ in cash; if $S_T < K$ the call expires worthless and the cash grows to $K$, giving $K$; if $S_T > K$ the call pays $S_T - K$ and the cash gives $K$, totaling $S_T$. Portfolio 2 is one share plus a put; if $S_T < K$ the put pays $K - S_T$ and the share is worth $S_T$, giving $K$; if $S_T > K$ the put is worthless and the share is worth $S_T$. In both states each portfolio equals $\max(S_T, K)$. Because their future payoffs coincide state by state, no-arbitrage forces their present prices to coincide: $c + Ke^{-rT} = p + S_0$. This is an identity, not an approximation, because any price gap would let a trader buy the cheaper portfolio, sell the dearer one, and collect a certain profit while the payoffs cancel at $T$.

OPT-C7. “Perfectly hedged” means the combined position (write 3.33 calls, hold one share) delivers the same dollar payoff regardless of which state occurs. In the up state the share is worth $55 and the written calls owe $3.33 \times \$3 = \$10$, netting $45; in the down state the share is worth $45 and the calls owe nothing, again $45. Because the payoff is a certain $45 in every state, the position is riskless and must be priced by discounting at the risk-free rate. This is what pins the option to one value: the replicating portfolio’s cost is fixed by arbitrage, and the probability of an up move never enters. Two investors who disagree about that probability must still agree on the option price, because either could construct the same riskless hedge — preferences and beliefs about the up-probability are irrelevant, which is the central insight of replication-based pricing.
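
The riskless payoff in OPT-C7 can be reproduced from the hedge ratio. This sketch uses only the state payoffs quoted above (share worth 55 or 45, call paying 3 or 0) and deliberately stops short of pricing, since the current stock price and interest rate are not restated here; the names are illustrative.

```python
# Sketch: one-period binomial hedge, one share held against written calls.

def hedge_ratio(c_up, c_down, s_up, s_down):
    """Shares per call: range of option payoffs over range of stock prices."""
    return (c_up - c_down) / (s_up - s_down)

s_up, s_down = 55.0, 45.0
c_up, c_down = 3.0, 0.0

h = hedge_ratio(c_up, c_down, s_up, s_down)   # shares per written call
calls_written = 1.0 / h                       # calls written per share held

payoff_up = s_up - calls_written * c_up
payoff_down = s_down - calls_written * c_down

print(f"hedge ratio {h:.2f}, calls written per share {calls_written:.2f}")
print(f"payoff up {payoff_up:.2f}, payoff down {payoff_down:.2f}")
```

No probability of the up move appears anywhere in the calculation, which is the point the solution makes about replication-based pricing.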

OPT-C8. The hedge ratio is $(C_u - C_d)/(S_u - S_d)$, the range of option payoffs over the range of stock prices. For stock A the stock range is $55 - 45 = \$10$; for stock B it is $60 - 40 = \$20$. With the same strike, stock B’s wider terminal spread produces a larger range of option payoffs in the favorable state while the unfavorable payoff stays floored at zero, so the numerator grows even as the denominator doubles. More importantly, the replicating portfolio for stock B must combine more borrowing and shares to span the wider payoff gap, and the cost of that portfolio — the option’s value — is higher. Economically, the option holder captures only the favorable tail; a wider spread of possible terminal prices lengthens that favorable tail without adding downside (the payoff cannot go below zero), so the option on the more volatile stock B is worth more. This is the discrete-time version of the general result that option value increases in volatility.

OPT-C9. Volatility $\sigma$ is the only input to Black-Scholes that cannot be read off a screen: $S_0$, $K$, $r$, and $T$ are all observable, but the standard deviation of future returns is not. Implied volatility is the value of $\sigma$ that, when plugged into the formula, reproduces the option’s observed market price exactly. Because the call price is strictly increasing in $\sigma$, this inversion is unique: given a market price one solves $C^{\text{market}} = C^{\text{BS}}(\sigma)$ numerically for the implied $\sigma$. The CBOE VIX applies exactly this logic to a basket of S&P 500 options, aggregating their implied volatilities into a single index of the market’s expected near-term volatility. If two options on the same stock with different strikes imply different volatilities, the model’s assumption of a single constant $\sigma$ is violated — the empirical pattern known as the volatility “smile” or “skew,” signaling that the true return distribution has fatter tails or asymmetry that Black-Scholes does not capture.
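The inversion described above can be sketched with a simple bisection search. This is an illustrative sketch, not the method any exchange or data vendor uses; the inputs are hypothetical, and the "market price" is generated from the formula itself so that the recovered $\sigma$ can be compared with a known value.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF


def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call on a non-dividend stock."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s0 * N(d1) - k * exp(-r * t) * N(d2)


def implied_vol(price, s0, k, r, t, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection on sigma; valid because the call price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, k, r, mid, t) < price:
            lo = mid  # model price too low, so sigma must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical check: price an option at sigma = 0.20, then recover sigma.
target = bs_call(100.0, 100.0, 0.05, 0.20, 0.5)
sigma_hat = implied_vol(target, 100.0, 100.0, 0.05, 0.5)
print(round(sigma_hat, 6))
```

Bisection is chosen here only for transparency; a Newton step using vega would converge faster but needs a safeguard when vega is small.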

OPT-C10. In the Black-Scholes call $C_0 = S_0 N(d_1) - Ke^{-rT}N(d_2)$, the term $N(d_2)$ is the (risk-neutral) probability the option finishes in the money. When both $N(d_1)$ and $N(d_2)$ approach 1, exercise is virtually certain: the holder will almost surely pay $K$ to acquire a share worth $S_0$ today, so the call is worth $S_0 - Ke^{-rT}$ — the current stock price minus the present value of the near-certain payment. This is the correct value for an all-but-guaranteed forward purchase. When both terms approach 0, exercise is almost impossible: the stock will almost surely finish below $K$, the right to buy is never used, and the call is worth essentially nothing. Both limits confirm the formula respects the economics: value equals the stock’s worth minus the discounted obligation, weighted by how likely that obligation is to be triggered.

OPT-C11. A straddle — a long call and a long put at the same strike — pays off whenever $S_T$ moves far from $K$ in either direction: the call captures large up moves, the put captures large down moves, and only a stock that stays near $K$ leaves both nearly worthless. This is exactly a bet on the magnitude of the move, not its sign, which fits a trader who expects a big FDA-driven jump but does not know which way. A strangle uses a call struck above and a put struck below the current price, so both options start further out of the money and are cheaper — but the stock must travel past one of the two wider strikes before either pays, requiring a larger move to profit. In both cases the trader is implicitly buying a view on volatility (realized dispersion), in contrast to a bull spread, which is a directional bet that profits only if the stock rises.

OPT-C12. Buying the call at $K_1$ alone gives unlimited upside above $K_1$ for the cost $c_1$. Adding a written call at $K_2 > K_1$ brings in premium $c_2$ (with $c_2 < c_1$ because the higher-strike call is less likely to pay), lowering the net cost of the position to $c_1 - c_2$. What the investor gives up is all upside beyond $K_2$: above $K_2$ the written call’s obligation grows one-for-one with the stock, capping the spread’s payoff at $K_2 - K_1$. What she gains is the reduced entry cost and therefore a lower breakeven and smaller maximum loss. A moderately bullish investor — one who expects the stock to rise toward $K_2$ but not far beyond — prefers the spread because she is not sacrificing upside she expects to realize, and she pays less for the exposure she actually wants.

OPT-C13. Delta $\Delta = N(d_1)$ depends on $d_1$, which itself depends on the current stock price $S$; as $S$ changes, $d_1$ and hence $N(d_1)$ change, so delta is not a constant. The Greek measuring the rate of change of delta with respect to the stock price is gamma, $\Gamma = N'(d_1)/(S\sigma\sqrt{T})$, the second derivative of the option price in $S$. When a call is deeply in the money, $d_1$ is large and positive, so $N(d_1) \approx 1$: the option moves almost dollar-for-dollar with the stock, behaving like a share. When it is deeply out of the money, $d_1$ is large and negative, so $N(d_1) \approx 0$: the option barely responds to stock moves because it is nearly certain to expire worthless. Because delta drifts as the stock moves, a delta-neutral hedge must be continually rebalanced — the practical reason gamma matters.

OPT-C14. Gamma $\Gamma = N'(d_1)/(S\sigma\sqrt{T})$ measures how fast delta changes as the underlying moves; vega $\mathcal{V} = S_0\sqrt{T}N'(d_1)$ measures how the option value changes with volatility. A short-dated (small $T$) at-the-money option has very large gamma: the $\sqrt{T}$ in the denominator shrinks toward zero as expiry nears, so $\Gamma$ blows up and delta swings sharply with even small price moves. A market maker short many such calls must therefore rehedge frequently and in the wrong direction (buying as the market rises, selling as it falls), incurring cumulative transaction costs; this is the cost of being short gamma. Separately, being short options means negative vega: if volatility spikes, the options she has written become more valuable and her position loses money even if the underlying price does not move at all. The tension is that the short-option position earns time decay but is punished by both realized moves (gamma) and rising implied volatility (vega).

OPT-Q1. Call profit, $\max\{S_T-45,0\}-3$ (payoff minus the premium paid): at $S_T=38$, $0-3=-\$3$; at $S_T=45$, $0-3=-\$3$; at $S_T=52$, $7-3=+\$4$. Put profit, $\max\{45-S_T,0\}-2$: at $S_T=38$, $7-2=+\$5$; at $S_T=45$, $0-2=-\$2$; at $S_T=52$, $0-2=-\$2$. Breakeven for the long call: $S_T = K + c = 45 + 3 = \$48$. Breakeven for the long put: $S_T = K - p = 45 - 2 = \$43$.

OPT-Q2. Profit $= \max(S_T-50,0) - \max(S_T-55,0) - 6 + 3 = \max(S_T-50,0) - \max(S_T-55,0) - 3$. At $S_T=48$: $0 - 0 - 3 = -\$3$. At $S_T=53$: $3 - 0 - 3 = \$0$. At $S_T=60$: $10 - 5 - 3 = +\$2$. Maximum profit occurs for $S_T \ge K_2$: $(K_2 - K_1) - (c_1 - c_2) = (55-50) - (6-3) = 5 - 3 = +\$2$. Maximum loss occurs for $S_T \le K_1$: $-(c_1 - c_2) = -\$3$.
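The bull-spread arithmetic can be tabulated with a short function. This is a sketch using the strikes and premiums from OPT-Q2 as defaults; the function name is illustrative.

```python
def bull_spread_profit(s_t, k1=50.0, k2=55.0, c1=6.0, c2=3.0):
    """Profit at expiry: long call at k1, short call at k2, net premium c1 - c2."""
    long_call = max(s_t - k1, 0.0)    # payoff of the purchased call
    short_call = -max(s_t - k2, 0.0)  # obligation on the written call
    return long_call + short_call - (c1 - c2)


for s_t in (48.0, 53.0, 60.0):
    print(s_t, bull_spread_profit(s_t))
```

The profit is flat at $-(c_1 - c_2)$ below $K_1$ and flat at $(K_2 - K_1) - (c_1 - c_2)$ above $K_2$, which is the cap-and-floor shape described in OPT-C12.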

OPT-Q3. Put-call parity: $p = c + Ke^{-rT} - S_0$. Discount factor: $Ke^{-rT} = 100 \cdot e^{-0.04 \times 0.5} = 100 \cdot e^{-0.02} = 100 \cdot 0.98020 = 98.020$. Then $p = 7.50 + 98.020 - 98 = \$7.52$.

OPT-Q4. Left side: $c + Ke^{-rT} = 5 + 62e^{-0.03} = 5 + 62 \cdot 0.97045 = 5 + 60.168 = 65.168$. Right side: $p + S_0 = 5.50 + 60 = 65.50$. They do not match. The gap is $65.50 - 65.168 = \$0.332$. The right-hand portfolio (put plus stock) is overpriced by about $0.33 relative to parity; equivalently, the put is roughly $0.33 too expensive (or the call too cheap). An arbitrageur would sell the stock-plus-put portfolio and buy the call-plus-cash portfolio.

OPT-Q5. Up state $S_u = 1.1 \times 50 = \$55$, down state $S_d = 0.9 \times 50 = \$45$. (a) Payoffs: $C_u = \max\{55-52,0\} = \$3$, $C_d = \max\{45-52,0\} = \$0$. (b) Hedge ratio $= (C_u - C_d)/(S_u - S_d) = (3-0)/(55-45) = 3/10 = 0.3$. (c) Replicate: hold $h = 0.3$ shares and borrow the present value of the down-state stock value $h \cdot S_d = 0.3 \times 45 = 13.5$, i.e. borrow $13.5/1.01 = 13.366$. Cost today $C = h S_0 - 13.366 = 0.3 \times 50 - 13.366 = 15 - 13.366 = \$1.634$. This matches the chapter’s answer $C = 1.635$ (rounding).

OPT-Q6. Up state $55, down state $45, strike $48. Payoffs: $C_u = \max\{55-48,0\} = \$7$, $C_d = \max\{45-48,0\} = \$0$. Hedge ratio $= (7-0)/(55-45) = 7/10 = 0.7$. Replicate with $h = 0.7$ shares. The riskless “hold one share, write $1/h = 1.4286$ calls” portfolio pays $45$ in each state (up: $55 - 1.4286 \times 7 = 55 - 10 = 45$; down: $45 - 0 = 45$), worth $45/1.01 = 44.554$ today. Equivalently, borrow $h S_d / 1.01 = 0.7 \times 45 / 1.01 = 31.5/1.01 = 31.188$; then $C = h S_0 - 31.188 = 0.7 \times 50 - 31.188 = 35 - 31.188 = \$3.812$.
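The replication steps in OPT-Q5 and OPT-Q6 follow the same recipe, sketched below. The one-period rate of 1% is read off the $1.01$ discount factor used in the solutions; the function name and argument layout are illustrative.

```python
def one_period_call(s0, s_u, s_d, k, r):
    """Price a call by replication in a one-period, two-state model.

    Returns (hedge_ratio, call_value). The replicating portfolio holds
    `hedge_ratio` shares and borrows enough to match the down-state payoff.
    """
    c_u = max(s_u - k, 0.0)
    c_d = max(s_d - k, 0.0)
    h = (c_u - c_d) / (s_u - s_d)        # range of option payoffs over range of stock prices
    borrow = (h * s_d - c_d) / (1.0 + r)  # present value of the riskless part
    return h, h * s0 - borrow


h5, c5 = one_period_call(50.0, 55.0, 45.0, 52.0, 0.01)  # OPT-Q5 inputs
h6, c6 = one_period_call(50.0, 55.0, 45.0, 48.0, 0.01)  # OPT-Q6 inputs
print(round(h5, 2), round(c5, 3))
print(round(h6, 2), round(c6, 3))
```

No probability of the up move appears anywhere in the function, which is the point made in OPT-C7.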

OPT-Q7. $\sigma\sqrt{T} = 0.20 \times \sqrt{0.5} = 0.20 \times 0.70711 = 0.14142$. $d_1 = \dfrac{\ln(100/100) + (0.05 + 0.02)\times 0.5}{0.14142} = \dfrac{0 + 0.035}{0.14142} = 0.2475$. $d_2 = d_1 - \sigma\sqrt{T} = 0.2475 - 0.14142 = 0.1061$. With $N(d_1) = 0.5977$ and $N(d_2) = 0.5422$: $Ke^{-rT} = 100 e^{-0.025} = 97.531$. $C_0 = 100 \times 0.5977 - 97.531 \times 0.5422 = 59.77 - 52.882 = \$6.89$.

OPT-Q8. $\sigma\sqrt{T} = 0.30 \times \sqrt{0.25} = 0.30 \times 0.5 = 0.15$. $d_1 = \dfrac{\ln(100/105) + (0.05 + 0.045)\times 0.25}{0.15} = \dfrac{-0.04879 + 0.023750}{0.15} = \dfrac{-0.025040}{0.15} = -0.1669$. $d_2 = -0.1669 - 0.15 = -0.3169$. With $N(d_1) = 0.4337$, $N(d_2) = 0.3756$: $Ke^{-rT} = 105 e^{-0.0125} = 105 \times 0.98758 = 103.696$. $C_0 = 100 \times 0.4337 - 103.696 \times 0.3756 = 43.37 - 38.948 = \$4.42$. Put by parity: $p = C_0 + Ke^{-rT} - S_0 = 4.42 + 103.696 - 100 = \$8.11$.
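The two Black-Scholes calculations can be reproduced without a normal table. This is a sketch using only the Python standard library; the inputs are those implied by the worked steps in OPT-Q7 and OPT-Q8 ($r = 0.05$ in both), and small differences in the last digit relative to table lookups are expected.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF


def black_scholes(s0, k, r, sigma, t):
    """Return (call, put) for European options on a non-dividend stock."""
    vol = sigma * sqrt(t)
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / vol
    d2 = d1 - vol
    pv_k = k * exp(-r * t)
    call = s0 * N(d1) - pv_k * N(d2)
    put = call + pv_k - s0  # put-call parity
    return call, put


c7, _ = black_scholes(100.0, 100.0, 0.05, 0.20, 0.50)   # OPT-Q7 inputs
c8, p8 = black_scholes(100.0, 105.0, 0.05, 0.30, 0.25)  # OPT-Q8 inputs
print(round(c7, 2), round(c8, 2), round(p8, 2))
```

Carrying full precision through to the parity step avoids the small rounding drift that appears when the put is computed from an already-rounded call price.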

OPT-Q9. Using $S_0=100$, $\sigma=0.30$, $T=0.25$ so $\sigma\sqrt{T} = 0.15$, and $N'(d_1) = 0.3934$. Gamma: $\Gamma = \dfrac{N'(d_1)}{S_0\sigma\sqrt{T}} = \dfrac{0.3934}{100 \times 0.15} = \dfrac{0.3934}{15} = 0.02623$. Vega: $\mathcal{V} = S_0\sqrt{T}\,N'(d_1) = 100 \times 0.5 \times 0.3934 = 19.67$. Vega is the change in call price per one-unit (i.e. per 1.00, or 100 percentage points) change in $\sigma$; for a one-percentage-point rise ($\Delta\sigma = 0.01$) the call price changes by approximately $\mathcal{V} \times 0.01 = 19.67 \times 0.01 = \$0.197$, so about $0.20 more expensive.
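Gamma and vega can be computed directly from the formulas above. In this sketch the strike $K = 105$ and rate $r = 0.05$ are assumptions carried over from OPT-Q8, since they reproduce the stated $N'(d_1) = 0.3934$; the function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

pdf = NormalDist().pdf  # standard normal density N'(x)


def gamma_vega(s0, k, r, sigma, t):
    """Black-Scholes gamma and vega (vega per 1.00 change in sigma)."""
    vol = sigma * sqrt(t)
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / vol
    density = pdf(d1)
    gamma = density / (s0 * vol)
    vega = s0 * sqrt(t) * density
    return gamma, vega


# K and r below are assumed to match OPT-Q8.
gamma, vega = gamma_vega(100.0, 105.0, 0.05, 0.30, 0.25)
price_change = vega * 0.01  # effect of a one-percentage-point rise in sigma
print(round(gamma, 5), round(vega, 2), round(price_change, 3))
```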

OPT-Q10. (a) Total shares controlled $= 20 \times 100 = 2000$. Delta-neutral short $= \Delta \times 2000 = 0.55 \times 2000 = 1100$ shares shorted. (b) If the ETF falls by $1, the call position loses about $\Delta \times \$1 \times 2000 = 0.55 \times 2000 = \$1100$, while the 1100 short shares gain $1100 \times \$1 = \$1100$ — the two offset. (c) If $\Delta \approx 1$ (deep in the money), the hedge would require shorting $1 \times 2000 = 2000$ shares — the full share count, since the calls then move nearly one-for-one with the ETF and behave like the stock itself.

OPT-Q11. (a) Call delta $= N(d_1) = 0.62$; put delta $= N(d_1) - 1 = 0.62 - 1 = -0.38$. (b) For a $2 rise: long call changes by $\Delta_{\text{call}} \times 2 = 0.62 \times 2 = +\$1.24$; long put changes by $\Delta_{\text{put}} \times 2 = -0.38 \times 2 = -\$0.76$. (c) The put delta is negative because a put loses value as the stock rises (it is a right to sell). Differentiating put-call parity $c - p = S_0 - Ke^{-rT}$ with respect to $S_0$ gives $\partial c/\partial S_0 - \partial p/\partial S_0 = 1$, i.e. $\Delta_{\text{call}} - \Delta_{\text{put}} = 1$; here $0.62 - (-0.38) = 1$, confirming the relationship.

OPT-Q12. (a) Total shares controlled $= 10 \times 100 = 1000$; delta-neutral short $= \Delta \times 1000 = 0.4 \times 1000 = 400$ shares. (b) If the ETF falls from $50 to $49, the 400 short shares gain $400 \times \$1 = \$400$. The call price falls by about $\Delta \times \$1 = 0.4 \times 1 = \$0.40$ per share, from $5.00 to $4.60, so the call position is now worth $4.60 \times 1000 = \$4600$, a loss of $400 from the original $5000. The $400 stock gain offsets the $400 call loss. (c) Delta is not constant: as the ETF price moves, $N(d_1)$ changes (gamma is nonzero), so the number of shares needed to stay delta-neutral changes, forcing the hedge to be rebalanced after the move.
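The hedge sizing and the offsetting gains and losses in OPT-Q10 and OPT-Q12 can be sketched as follows. This is a first-order approximation that ignores gamma, so it is only meant for small price moves; the function name and the 100-share contract multiplier default are illustrative.

```python
def delta_hedge_pnl(contracts, delta, price_move, multiplier=100):
    """First-order P&L of long calls hedged by shorting delta * shares.

    Returns (shares_to_short, option_pnl, stock_pnl) for a given price move.
    """
    shares_controlled = contracts * multiplier
    shares_to_short = delta * shares_controlled
    option_pnl = delta * price_move * shares_controlled  # calls move by delta per $1
    stock_pnl = -shares_to_short * price_move            # short stock gains when price falls
    return shares_to_short, option_pnl, stock_pnl


short10, opt10, stk10 = delta_hedge_pnl(20, 0.55, -1.0)  # OPT-Q10 inputs
short12, opt12, stk12 = delta_hedge_pnl(10, 0.40, -1.0)  # OPT-Q12 inputs
print(short10, opt10, stk10)
print(short12, opt12, stk12)
```

After the move, delta itself has changed, so `shares_to_short` would need to be recomputed with the new delta before the next move.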

17.12 The Black-Litterman model and active portfolio management

BL-C1. The optimizer treats the point estimates of $\mu$ as if they were the true expected returns, known with certainty, and it fully exploits every difference between them. When two assets are highly correlated, the optimizer can construct a nearly offsetting long-short pair whose relative weight is enormous, because even a tiny estimated difference in their means looks like a “free” opportunity to earn return with little marginal variance. But that tiny estimated difference is well inside the range of sampling error. A two-month change in the estimation window shifts the sample means by amounts comparable to their own standard errors, and since the optimal weights depend on $\Sigma^{-1}\mu$ — where $\Sigma^{-1}$ amplifies differences among correlated assets — the sign and magnitude of the large offsetting positions swing wildly. XLK and XLE flip because the optimizer was never estimating a stable quantity in the first place; it was leveraging noise.

BL-C2. The root cause the chapter identifies is that unconstrained mean-variance optimization treats point estimates of expected returns as if they were known with certainty and fully exploits any differences between them, so it over-fits estimation noise into extreme, unstable weights. Black-Litterman’s remedy is to treat the expected returns themselves as random variables with a prior distribution rather than as fixed parameters. Anchoring the prior to the equilibrium returns $\Pi$ (those that reproduce market-cap weights) and then updating with the investor’s views in a Bayesian way means the posterior expected returns cannot drift far from the equilibrium anchor unless a view with real precision pushes them. Noise in a return estimate no longer translates directly into a large weight, because the model’s output is a precision-weighted compromise that shrinks toward the market whenever the investor has no confident view to the contrary.

BL-C3. Reverse optimization starts from the mean-variance first order condition $\mu = A\Sigma\theta$. In forward optimization we treat $\mu$, $A$, and $\Sigma$ as known and solve for the optimal weights $\theta = \tfrac{1}{A}\Sigma^{-1}\mu$. In reverse optimization we instead treat the weights as known — we set them equal to the observed market-capitalization weights $\theta_{\text{mkt}}$ — together with $A$ and $\Sigma$, and we back out the expected returns $\Pi = A\Sigma\theta_{\text{mkt}}$ that would make those weights optimal. The quantity being solved for is $\Pi$; the market weights and covariance are the knowns. Anchoring the prior to market weights is sensible because an investor who begins by effectively owning the whole market is, by revealed preference, already holding the portfolio that these implied returns rationalize; absent any special opinion, her best guess for expected returns is precisely the set consistent with holding the market.

BL-C4. Even before either manager expresses a view, Manager A’s baseline is a vector of raw sample means, each carrying large sampling error, and the optimizer will lever up the noisy differences among them into extreme long-short positions. Manager B’s baseline is $\Pi = A\Sigma\theta_{\text{mkt}}$, which by construction reproduces the market weights when run through the optimizer — a diversified, sensible, fully-invested portfolio with no extreme positions. The structural feature doing the work is that the reverse-optimization prior is internally consistent with a reasonable portfolio: it is the fixed point of the optimizer, so feeding it back in yields the market rather than a wild allocation. Manager B therefore starts from a stable, intuitive point and only departs from it deliberately, whereas Manager A starts from noise.

BL-C5. In the classical treatment $\mu$ is a known constant, fully trusted. Here $\mu = \Pi + \epsilon^{e}$ with $\epsilon^{e} \sim N(0, \tau\Sigma)$ makes the expected returns random, centered on the equilibrium prior $\Pi$ but uncertain. The scalar $\tau$ scales the magnitude of that uncertainty: it measures how much the investor distrusts the equilibrium prior as a statement of the true means. A very small $\tau$ means $\epsilon^{e}$ has tiny variance, so the investor is highly confident that the true $\mu$ equals $\Pi$; the prior is nearly rigid and views must be very precise to move the posterior much. A large $\tau$ means the prior is loosely held, so even modestly confident views can pull the posterior substantially. Relative to the classical constant-$\mu$ assumption, this converts a single trusted point into a distribution whose spread the investor tunes through $\tau$.

BL-C6. Scaling the prior uncertainty by $\tau\Sigma$ says that uncertainty about mean returns has the same correlation structure as returns themselves: assets whose returns are volatile and move together have expected returns that are correspondingly uncertain and jointly uncertain. This is a natural default — the same forces that make returns co-move make our estimates of their means co-move — and it means the investor needs to specify only the single scalar $\tau$ rather than an entire new covariance matrix. It is also convenient in the derivation: because both the prior covariance $\tau\Sigma$ and the return covariance $\Sigma$ share the same structure, the conditional-normal algebra collapses cleanly, and the posterior precision $(\tau\Sigma)^{-1} + P'\Omega^{-1}P$ combines the equilibrium and view information without requiring any unrelated matrix to be estimated.

BL-C7. The two managers’ posteriors differ only through the view variance — $\omega^2$ in the single-view case, or the corresponding diagonal entry of $\Omega$. The directional view $p$ (long XLK, short XLE) and its target $q$ are identical for both. The highly confident manager assigns a small $\omega^2$, giving her view large precision $1/\omega^2$; in the posterior mean $\bar\mu = [(\tau\Sigma)^{-1} + p'\tfrac{1}{\omega^2}p]^{-1}[(\tau\Sigma)^{-1}\Pi + p'\tfrac{1}{\omega^2}q]$ this pulls the blend strongly toward $q$. The “hunch” manager assigns a large $\omega^2$, low precision, so her posterior stays close to the equilibrium prior $\Pi$. The single input encoding the difference between conviction and hunch is $\omega^2$ (equivalently $\Omega$), the variance of the view error $\epsilon^{v}$.

BL-C8. A single row of $P$ specifies the portfolio the view is about: its entries are the weights on each asset in the linear combination whose expected return the investor is expressing an opinion on (e.g. $+1$ on the asset believed to outperform, $-1$ on the asset it is believed to beat, zeros elsewhere for a relative view). The corresponding entry of $Q$ gives the expected return the investor assigns to that portfolio. The corresponding diagonal entry of $\Omega$ is the variance $\omega^2$ of the view’s error $\epsilon^{v}$ — how uncertain the investor is about that particular view. A view is written as a linear combination because most investment opinions are naturally relative (“A will beat B”) and because expressing views on portfolios lets the model translate an opinion on a spread into coherent implications for all correlated assets, rather than forcing every opinion to be an isolated absolute forecast.

BL-C9. The conservative property comes from the structure of $P$: a row that expresses no view about a given asset has a zero in that asset’s column, so that asset never enters the view term $P'\Omega^{-1}Q$ nor the view precision $P'\Omega^{-1}P$ except through its correlation with assets that are in a view. Its posterior expected return therefore stays at (or very near) its equilibrium prior value $\Pi$, and moves only to the extent that $\Sigma$ links it to an asset the investor has an opinion about. In a naive optimizer there is no such anchor: revising a single asset’s estimated mean feeds directly into $\theta = \tfrac{1}{A}\Sigma^{-1}\mu$, and because $\Sigma^{-1}$ is dense, that one revised number can swing the weights of many unrelated assets. Black-Litterman confines the disturbance to where the investor actually has information.

BL-C10. The posterior mean is a matrix-weighted average of the prior $\Pi$ and the view target $Q$, with weights given by the two precision matrices $(\tau\Sigma)^{-1}$ (prior precision) and $P'\Omega^{-1}P$ (view precision). Writing $\bar\mu = W^{-1}[(\tau\Sigma)^{-1}\Pi + P'\Omega^{-1}Q]$ with $W = (\tau\Sigma)^{-1} + P'\Omega^{-1}P$, the coefficients on $\Pi$ and on the view are each positive-semidefinite and sum (in the precision sense) to $W$, so $\bar\mu$ is always a convex-type compromise between the prior and the views — it can never lie outside the range they span. As $\Omega \to \infty$ the view precision $P'\Omega^{-1}P \to 0$: the views carry no information and $\bar\mu \to \Pi$, the pure equilibrium prior. As $\Omega \to 0$ the view precision explodes and, along the directions spanned by $P$, $\bar\mu$ is pulled all the way onto the views $Q$ (the views are treated as certain), while directions orthogonal to $P$ remain at the prior.

BL-C11. The chapter ties the size of the tilt directly to the precision of the view. Posterior expected returns deviate from equilibrium in a direction and by an amount governed by view precision, and those posterior returns feed back through the FOC $\theta = \tfrac{1}{A}\Sigma^{-1}\bar\mu$ into the weights. Near-certain view (very low $\omega^2$): the posterior mean is pulled essentially onto the view value $q$, producing a large deviation from $\Pi$ and hence a large tilt of the optimal weights toward the implied position. Almost-no-confidence view (very high $\omega^2$): the view has negligible precision, the posterior mean stays essentially at $\Pi$, and the optimal weights barely move from the market portfolio. Confidence is thus an interpretable dial: conviction scales the tilt.

BL-C12. Starting from the equilibrium prior means the CIO’s default portfolio is the market itself — she is not betting anything until she chooses to. Because the posterior is a precision-weighted blend, each view moves the weights only in proportion to the confidence she assigns it, so her opinions produce disciplined tilts: large where conviction is high, small where it is weak, and always anchored to the market. That is exactly “expressing an opinion without betting the fund.” If instead she simply set her favored sectors’ expected returns to large positive numbers and re-optimized, she would be back in the failure mode of BL-C1: the optimizer would treat those fabricated means as certain, lever them up through $\Sigma^{-1}$, and generate extreme, unstable, undiversified positions with no anchor to the market and no principled scaling by conviction.

BL-Q1. Compute $\Sigma\theta$ first. $\Sigma\theta = (0.04\cdot0.6 + 0.01\cdot0.4,\ 0.01\cdot0.6 + 0.02\cdot0.4)' = (0.024+0.004,\ 0.006+0.008)' = (0.028,\ 0.014)'$. Then $\Pi = A\Sigma\theta = 3\cdot(0.028, 0.014)' = (0.084,\ 0.042)'$. So the implied equilibrium expected returns are $\Pi_1 = 0.084$ and $\Pi_2 = 0.042$ per period.

BL-Q2. With $A = 5$ and the same $\Sigma\theta = (0.028, 0.014)'$, $\Pi = 5\cdot(0.028, 0.014)' = (0.140,\ 0.070)'$. To recover the weights, solve $\theta = \tfrac{1}{A}\Sigma^{-1}\Pi = \tfrac{1}{5}\Sigma^{-1}(0.140, 0.070)'$. But $\Pi$ was defined as $5\Sigma\theta_{\text{mkt}}$, so $\tfrac{1}{5}\Sigma^{-1}(5\Sigma\theta_{\text{mkt}}) = \theta_{\text{mkt}} = (0.6, 0.4)'$. The market weights are recovered exactly. They must be, because reverse optimization defines $\Pi$ as precisely the return vector that makes $\theta_{\text{mkt}}$ satisfy the first order condition; forward-optimizing that same $\Pi$ inverts the operation and returns the weights we started from.
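The round trip between weights and implied returns in BL-Q1 and BL-Q2 can be verified numerically. This is a sketch for the two-asset case only, with a hand-coded $2 \times 2$ inverse so that no external library is assumed; the helper names are illustrative.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def inv_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def implied_returns(risk_aversion, sigma, theta):
    """Reverse optimization: Pi = A * Sigma * theta."""
    return [risk_aversion * x for x in mat_vec(sigma, theta)]


def optimal_weights(risk_aversion, sigma, mu):
    """Forward optimization: theta = (1/A) * Sigma^{-1} * mu."""
    return [x / risk_aversion for x in mat_vec(inv_2x2(sigma), mu)]


SIGMA = [[0.04, 0.01], [0.01, 0.02]]
THETA_MKT = [0.6, 0.4]

pi_3 = implied_returns(3.0, SIGMA, THETA_MKT)  # BL-Q1
pi_5 = implied_returns(5.0, SIGMA, THETA_MKT)  # BL-Q2
theta_back = optimal_weights(5.0, SIGMA, pi_5)
print(pi_3, pi_5, theta_back)
```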

BL-Q3. With assets ordered (XLB, XLK, XLE), a portfolio long XLK and short XLE by equal amounts has view vector $p = (0,\ 1,\ -1)$. The scalar is $q = 0.015$. The view error $\epsilon^{v}$ has variance $\omega^2 = 0.05^2 = 0.0025$. Applying $p$ to the mean vector $\mu = (\mu_{\text{XLB}}, \mu_{\text{XLK}}, \mu_{\text{XLE}})'$ gives $p\mu = 0\cdot\mu_{\text{XLB}} + 1\cdot\mu_{\text{XLK}} - 1\cdot\mu_{\text{XLE}} = \mu_{\text{XLK}} - \mu_{\text{XLE}}$, exactly the expected outperformance of XLK over XLE that the view is about. The specification is thus $p\mu = q + \epsilon^{v}$ with $q = 0.015$ and $\mathrm{Var}(\epsilon^{v}) = 0.0025$.

BL-Q4. Order the assets (A, B, C, D). View (i) is relative — A over B by $0.010$ — so its row is $(1, -1, 0, 0)$ with target $0.010$. View (ii) is absolute on C — expected return $0.008$ — so its row is $(0, 0, 1, 0)$ with target $0.008$. Hence \[P = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} 0.010 \\ 0.008 \end{pmatrix}.\] The error standard deviations are $0.04$ and $0.06$, and the views are independent, so \[\Omega = \begin{pmatrix} 0.04^2 & 0 \\ 0 & 0.06^2 \end{pmatrix} = \begin{pmatrix} 0.0016 & 0 \\ 0 & 0.0036 \end{pmatrix}.\]

BL-Q5. In scalar form: prior precision $(\tau\Sigma)^{-1} = 1/0.02 = 50$; view precision $1/\omega^2 = 1/0.01 = 100$ (with $p = 1$). The posterior precision is $50 + 100 = 150$, so the posterior variance is $1/150$. The posterior mean is \[\bar\mu = \frac{50\cdot 0.006 + 100\cdot 0.012}{150} = \frac{0.30 + 1.20}{150} = \frac{1.50}{150} = 0.010.\] Since $0.006 < 0.010 < 0.012$, the posterior lies between the prior and the view. It is closer to the view ($0.012$) because the view carries more precision ($100 > 50$); the posterior is a precision-weighted average, so it leans toward whichever source is more confident.

BL-Q6. Keep $(\tau\Sigma)^{-1} = 50$, $\Pi = 0.006$, $q = 0.012$.

(a) $\omega^2 = 0.001 \Rightarrow 1/\omega^2 = 1000$. $\bar\mu = \dfrac{50\cdot0.006 + 1000\cdot0.012}{50 + 1000} = \dfrac{0.30 + 12.0}{1050} = \dfrac{12.3}{1050} \approx 0.01171$.
(b) $\omega^2 = 0.2 \Rightarrow 1/\omega^2 = 5$. $\bar\mu = \dfrac{50\cdot0.006 + 5\cdot0.012}{50 + 5} = \dfrac{0.30 + 0.06}{55} = \dfrac{0.36}{55} \approx 0.00655$.

In (a) the very confident view ($\omega^2 = 0.001$) has precision $1000$, dwarfing the prior precision $50$, so $\bar\mu \approx 0.0117$ sits almost on the view $0.012$. In (b) the very unconfident view ($\omega^2 = 0.2$) has precision only $5$, far below $50$, so $\bar\mu \approx 0.0065$ barely leaves the prior $0.006$. Lower $\omega^2$ (higher confidence) pulls the posterior strongly toward the view, exactly as the chapter argues.

BL-Q7. Posterior variance $= [(\tau\Sigma)^{-1} + p'\tfrac{1}{\omega^2}p]^{-1} = [50 + 100]^{-1} = 1/150 \approx 0.00667$. Compare: prior variance $\tau\Sigma = 0.02$; view variance $\omega^2 = 0.01$. The posterior variance $0.00667$ is smaller than both. It must be, because precisions add: the posterior precision $50 + 100 = 150$ exceeds either individual precision ($50$ or $100$), and inverting a larger precision gives a smaller variance. Combining two independent noisy sources can only sharpen the estimate. For $\omega^2 = 0.001$ (precision $1000$): posterior variance $= [50 + 1000]^{-1} = 1/1050 \approx 0.000952$, smaller still and now below even the view variance $0.001$ — a highly confident view drives the posterior uncertainty down toward the view’s own tiny variance.

BL-Q8. Fix $\Sigma = 0.02$, $\Pi = 0.006$, $q = 0.012$, $\omega^2 = 0.01$ (view precision $1/\omega^2 = 100$).

$\tau = 0.1$: $\tau\Sigma = 0.002$, prior precision $= 1/0.002 = 500$. $\bar\mu = \dfrac{500\cdot0.006 + 100\cdot0.012}{500 + 100} = \dfrac{3.0 + 1.2}{600} = \dfrac{4.2}{600} = 0.0070$.

$\tau = 1.0$: $\tau\Sigma = 0.02$, prior precision $= 50$. $\bar\mu = \dfrac{50\cdot0.006 + 100\cdot0.012}{150} = 0.010$ (as in BL-Q5).

Shrinking $\tau$ from $1.0$ to $0.1$ moves the posterior from $0.010$ down toward the prior $0.006$ (to $0.007$). A smaller $\tau$ shrinks the prior variance $\tau\Sigma$, raising the prior precision, which means the investor is more confident in the equilibrium prior; the posterior therefore leans more heavily on $\Pi$ and away from the view.
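The scalar updates in BL-Q5 through BL-Q8 all use the same precision-weighted formula, so one small function covers them. This is a sketch for the single-asset, single-view case with $p = 1$; the function name is illustrative.

```python
def posterior(prior_mean, prior_var, view, view_var):
    """Precision-weighted blend of a scalar prior and a scalar view.

    Returns (posterior_mean, posterior_variance).
    """
    prior_prec = 1.0 / prior_var
    view_prec = 1.0 / view_var
    post_prec = prior_prec + view_prec  # precisions add
    mean = (prior_prec * prior_mean + view_prec * view) / post_prec
    return mean, 1.0 / post_prec


PI, Q, SIGMA = 0.006, 0.012, 0.02
cases = [(1.0, 0.01), (1.0, 0.001), (1.0, 0.2), (0.1, 0.01)]  # (tau, omega^2)
for tau, omega2 in cases:
    mean, var = posterior(PI, tau * SIGMA, Q, omega2)
    print(tau, omega2, round(mean, 5), round(var, 6))
```

Lowering either $\omega^2$ or raising $\tau$ shifts weight toward the view, since both changes raise the view's precision relative to the prior's.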

BL-Q9. Equilibrium returns $\Pi = (0.084, 0.042)'$ from BL-Q1, and $\bar\mu = \Pi + (0.006, 0)' = (0.090,\ 0.042)'$. Need $\theta = \tfrac{1}{A}\Sigma^{-1}\bar\mu$ with $A = 3$. First $\Sigma^{-1}$: $\det\Sigma = 0.04\cdot0.02 - 0.01^2 = 0.0008 - 0.0001 = 0.0007$, so \[\Sigma^{-1} = \frac{1}{0.0007}\begin{pmatrix} 0.02 & -0.01 \\ -0.01 & 0.04 \end{pmatrix}.\] Then $\Sigma^{-1}\bar\mu$: component 1 $= (0.02\cdot0.090 - 0.01\cdot0.042)/0.0007 = (0.0018 - 0.00042)/0.0007 = 0.00138/0.0007 = 1.9714$; component 2 $= (-0.01\cdot0.090 + 0.04\cdot0.042)/0.0007 = (-0.0009 + 0.00168)/0.0007 = 0.00078/0.0007 = 1.1143$. Divide by $A = 3$: $\theta = (0.6571,\ 0.3714)'$. Tilt $\theta - \theta_{\text{mkt}} = (0.6571 - 0.6,\ 0.3714 - 0.4)' = (+0.0571,\ -0.0286)'$. Raising only asset 1’s expected return still changes both weights because $\Sigma^{-1}$ is not diagonal: the assets are positively correlated, so overweighting asset 1 to exploit its higher return means the optimizer trims asset 2 to keep total portfolio risk in check.

BL-Q10. Now $\bar\mu = \Pi + (0, -0.007)' = (0.084,\ 0.035)'$, same $A = 3$ and $\Sigma^{-1}$ as BL-Q9. $\Sigma^{-1}\bar\mu$: component 1 $= (0.02\cdot0.084 - 0.01\cdot0.035)/0.0007 = (0.00168 - 0.00035)/0.0007 = 0.00133/0.0007 = 1.9$; component 2 $= (-0.01\cdot0.084 + 0.04\cdot0.035)/0.0007 = (-0.00084 + 0.0014)/0.0007 = 0.00056/0.0007 = 0.8$. Divide by $3$: $\theta = (0.6333,\ 0.2667)'$. Tilt $\theta - \theta_{\text{mkt}} = (+0.0333,\ -0.1333)'$. Asset 2’s weight falls sharply because its posterior return was lowered — the investor wants to hold less of it. Asset 1’s weight rises even though its own expected return is unchanged: because the two assets are positively correlated, cutting the position in asset 2 removes some exposure that the optimizer partly restores by adding to the still-attractive, positively-correlated asset 1, keeping the risk-return trade-off optimal.
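The tilts in BL-Q9 and BL-Q10 can be recomputed from the first order condition $\theta = \tfrac{1}{A}\Sigma^{-1}\bar\mu$. This is a two-asset sketch with the $2 \times 2$ inverse written out explicitly; the function name is illustrative.

```python
def weights(mu, sigma, risk_aversion):
    """theta = (1/A) * Sigma^{-1} * mu for two assets, using the adjugate inverse."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    w1 = (s22 * mu[0] - s12 * mu[1]) / det / risk_aversion
    w2 = (-s21 * mu[0] + s11 * mu[1]) / det / risk_aversion
    return w1, w2


SIGMA = ((0.04, 0.01), (0.01, 0.02))
A = 3.0

mkt = weights((0.084, 0.042), SIGMA, A)  # equilibrium returns from BL-Q1
q9 = weights((0.090, 0.042), SIGMA, A)   # BL-Q9 posterior
q10 = weights((0.084, 0.035), SIGMA, A)  # BL-Q10 posterior

for label, w in (("BL-Q9", q9), ("BL-Q10", q10)):
    tilt = (w[0] - mkt[0], w[1] - mkt[1])
    print(label, [round(x, 4) for x in w], [round(x, 4) for x in tilt])
```

In both cases the weight on the asset whose return was not revised still moves, because the off-diagonal entries of $\Sigma^{-1}$ are nonzero.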

17.13 Credit Risk Models

CR-C1. The objection confuses the shape of the default-time distribution with the tools available to handle it. The transformation proceeds in two steps: first map $t$ through its own CDF, $u = F(t)$, which is a probability-integral transform yielding a uniform-$(0,1)$ variable; then map that uniform through the inverse standard-normal CDF, $x = N^{-1}(u)$, yielding a standard normal variable. Because $F$ and $N$ are (weakly) increasing, the event $\{t \le t_0\}$ is identical to the event $\{x \le x(t_0)\}$, so $\Pr(t \le t_0) = \Pr(x \le x(t_0))$: the transformation preserves the entity’s marginal default probability exactly. Nothing about the borrower’s economics is altered; we have only relabeled the outcome axis onto a normal scale so that normal-distribution machinery (correlation, factor models) can be brought to bear. The right-skewness of $t$ is fully accommodated because it is baked into $F$, which is used to build $x$.

CR-C2. Each entity is transformed through its own marginal CDF: $x_1 = N^{-1}(F_1(t_1))$ and $x_2 = N^{-1}(F_2(t_2))$. The probability-integral transform guarantees that $F_i(t_i)$ is uniform on $(0,1)$ regardless of the shape of $F_i$, and applying $N^{-1}$ to a uniform variable produces a standard normal. Hence both $x_1$ and $x_2$ are standard normal even though the fast-defaulting sub-prime mortgage ($t_1$) and the slow-defaulting investment-grade bond ($t_2$) have wildly different, non-identical marginals. Putting both entities on a common standard-normal scale is exactly what lets the modeler impose correlation with familiar Gaussian tools — a single correlation parameter (or factor loading) now has a clean meaning as the correlation between two standard normals, and the joint behavior can be handled with the multivariate normal, without ever forcing $F_1 = F_2$.
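The two-step map $x = N^{-1}(F(t))$ can be illustrated with concrete marginals. The exponential default-time distributions and the hazard rates below are assumptions made purely for this sketch; the chapter does not specify a functional form for $F_i$.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()


def exp_cdf(t, hazard):
    """Hypothetical marginal: exponential default time with constant hazard."""
    return 1.0 - exp(-hazard * t)


def to_normal_scale(t, hazard):
    """Map a default time to the standard normal scale: x = N^{-1}(F(t))."""
    return STD_NORMAL.inv_cdf(exp_cdf(t, hazard))


T0 = 5.0
x1 = to_normal_scale(T0, 0.20)  # fast-defaulting entity (assumed hazard)
x2 = to_normal_scale(T0, 0.01)  # slow-defaulting entity (assumed hazard)

# The marginal default probability is preserved: N(x(t0)) equals F(t0).
print(STD_NORMAL.cdf(x1), exp_cdf(T0, 0.20))
print(STD_NORMAL.cdf(x2), exp_cdf(T0, 0.01))
```

The two thresholds differ because the marginals differ, yet both live on the same standard normal scale, which is what allows a single correlation parameter to link them.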

CR-C3. $Y$ is the systematic or common factor — a single economy-wide random driver shared by every borrower, representing forces like the business cycle, interest-rate environment, or nationwide house prices. $\epsilon_i$ is the idiosyncratic shock specific to entity $i$ — firm-level or household-level fortunes independent across borrowers ($cov(\epsilon_i,\epsilon_j)=0$). The loading $a_i$ splits each borrower’s default driver between these two sources. When every $a_i \to 1$, $x_i \approx Y$: all borrowers move together, defaults cluster heavily, and diversification fails because a bad $Y$ sinks everyone at once. When every $a_i \to 0$, $x_i \approx \epsilon_i$: defaults are essentially independent and pool losses are highly diversifiable. A concrete real-world $Y$ is a nationwide collapse in house prices (or a deep recession), which simultaneously pushes many mortgages toward default.

CR-C4. The coefficient $\sqrt{1-a_i^2}$ is chosen so that the variance of $x_i$ is exactly one. Since $Y$ and $\epsilon_i$ are independent standard normals, $\mathrm{Var}(x_i) = a_i^2\,\mathrm{Var}(Y) + (1-a_i^2)\,\mathrm{Var}(\epsilon_i) = a_i^2 + (1-a_i^2) = 1$, and the mean is zero; being a linear combination of independent normals, $x_i$ is itself standard normal. With unit variance, the correlation of $x_i$ with $Y$ equals their covariance, $cov(x_i,Y) = a_i\,\mathrm{Var}(Y) = a_i$, so $a_i$ is the correlation of the default driver with the common factor. Preserving the standard-normal marginal of $x_i$ is essential because the copula construction required $x_i = N^{-1}(F_i(t_i))$ to be standard normal in the first place; if the factor model rescaled the variance, the threshold $N^{-1}(F_i(T))$ would no longer correspond to the entity’s true default probability $F_i(T)$ and the marginal calibration would break.
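
The unit-variance argument can be checked by simulation. The sketch below is illustrative only: the helper names `simulate_driver` and `covariance`, the loading $a = 0.6$, the sample size, and the seed are all assumptions chosen for the example, not part of the problem.

```python
import math
import random

def simulate_driver(a, n=100_000, seed=0):
    """Draw n pairs (Y, x) with x = a*Y + sqrt(1 - a^2)*eps."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - a * a)
    ys, xs = [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)    # common factor
        eps = rng.gauss(0.0, 1.0)  # idiosyncratic shock, independent of y
        ys.append(y)
        xs.append(a * y + scale * eps)
    return ys, xs

def covariance(u, v):
    """Sample covariance of two equal-length lists (divides by n)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((s - mean_u) * (t - mean_v) for s, t in zip(u, v)) / len(u)

ys, xs = simulate_driver(a=0.6)
var_x = covariance(xs, xs)   # should be close to 1
cov_xy = covariance(xs, ys)  # should be close to the loading a = 0.6
print(f"Var(x) = {var_x:.3f}, Cov(x, Y) = {cov_xy:.3f}")
```

Up to sampling error, the simulated variance sits near one and the covariance with $Y$ sits near the loading, as the algebra predicts.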

CR-C5. Conditional on a fixed value of $Y$, each entity’s default driver is $x_i = a_i Y + \sqrt{1-a_i^2}\,\epsilon_i$, and once $Y$ is held constant the only remaining randomness is the idiosyncratic $\epsilon_i$, which are mutually independent across entities. Therefore, given $Y$, the default events are independent, and the joint probability that a group all default factors into the product of their individual conditional probabilities $F_i(T|Y)$. This is what makes the model tractable: to get an unconditional joint probability one integrates the product against the one-dimensional density of $Y$, a single integral, instead of confronting a high-dimensional dependent distribution. If one tried to model the joint default distribution directly, one would have to specify and estimate the full correlation structure among all entities simultaneously — an intractable, parameter-heavy object with no conditional-independence shortcut — losing exactly the computational and estimation simplicity the factor structure provides.
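
The single integral over $Y$ can be sketched numerically. The code below is a minimal illustration, not the chapter's own implementation: the function names, the trapezoid rule, the integration range $[-8, 8]$, and the example inputs ($F_i(T) = 0.05$, $a_i = 0.5$) are assumptions made for the example. It requires $|a_i| < 1$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def conditional_pd(p, a, y):
    """Default probability given Y = y, for unconditional probability p and loading a."""
    threshold = STD_NORMAL.inv_cdf(p)
    return STD_NORMAL.cdf((threshold - a * y) / math.sqrt(1.0 - a * a))

def joint_pd(ps, loadings, lo=-8.0, hi=8.0, steps=4000):
    """P(all entities default): integrate the product of conditional
    probabilities against the density of Y using the trapezoid rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        integrand = STD_NORMAL.pdf(y)
        for p, a in zip(ps, loadings):
            integrand *= conditional_pd(p, a, y)
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * integrand
    return total * h

independent = joint_pd([0.05, 0.05], [0.0, 0.0])  # no common factor
correlated = joint_pd([0.05, 0.05], [0.5, 0.5])   # shared factor
marginal = joint_pd([0.05], [0.5])                # one entity alone
print(independent, correlated, marginal)
```

With zero loadings the joint probability reduces to the product $0.05 \times 0.05$; with positive loadings it is larger; and integrating a single conditional probability recovers the unconditional $0.05$.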

CR-C6. The expression says every entity’s conditional default probability is a monotone function of the single scalar $Y$. Because $Y$ enters with a minus sign, a low (unfavorable) draw of $Y$ raises the numerator $N^{-1}(F(T)) - \rho Y$ and hence raises $F(T|Y)$ for every entity at once — this simultaneous shift is precisely the mechanism that makes defaults cluster in bad states. As $\rho \to 0$, the factor drops out and $F(T|Y) \to N(N^{-1}(F(T))) = F(T)$, the unconditional probability, so defaults are independent and the conditional probability no longer depends on $Y$. As $\rho \to 1$, the denominator $\sqrt{1-\rho^2} \to 0$ and the conditional probability collapses toward 0 or 1 depending on the sign of the numerator: defaults become perfectly comonotone, either everyone survives or everyone fails. This one expression captures the entire correlation structure because all cross-entity dependence in the model flows through the single shared factor $Y$; conditioning on it removes all dependence.

CR-C7. Positive correlation, introduced through the common factor, makes the two loans’ outcomes move together: it makes the joint events “both default” and “both survive” more likely and thins out the middle “one defaults, one survives” case. The senior tranche (first payment received) defaults only when both borrowers fail; its safety rested on that joint failure being unlikely, so fattening the both-default event raises its default probability from $\phi^2$ to $\phi^2 + a^2\phi^2$ and lowers its value. The junior tranche pays only when neither borrower defaults; fattening the both-survive event raises its survival probability from $(1-\phi)^2$ to $(1-\phi)^2 + a^2\phi^2$ and raises its value. Diversification “evaporates” for the senior position because in a bad common state ($y$ low) many borrowers default at once — exactly when the senior tranche needed the loans to be independent, they instead all fail together.

CR-C8. The two tranches, taken together, receive the first payment plus the second payment — that is, whatever total the two loans pay — so their combined payoff is identical to simply holding both loans outright, whose value is $\frac{2(1-\phi)M}{1+r_f}$. This combined value does not depend on $a$ at all: it is fixed by the two individual loans’ expected payoffs. Correlation cannot change the sum, so any value the senior tranche loses must reappear in the junior tranche. Algebraically the senior falls by $a^2\phi^2 M/(1+r_f)$ and the junior rises by the same $a^2\phi^2 M/(1+r_f)$, and the terms cancel. Hence correlation redistributes value between the tranches rather than creating or destroying it — it reslices a pie whose total size is unchanged.

CR-C9. The two-loan example is a miniature of a mortgage-backed security: many individual loans pooled and carved into a safe senior tranche and a risky junior tranche. Rating the senior tranche under an independence assumption ($a=0$) assigns it default probability $\phi^2$, which for small $\phi$ is tiny — the basis for a AAA rating. But if a common factor is actually present, the senior tranche’s true default probability is $\phi^2 + a^2\phi^2$, strictly larger. The dangerous error is that the rating ignored the correlation: in the crisis the common factor $y$ was a nationwide shock to house prices, which drove many mortgages into default simultaneously. When that shock hit, the diversification the AAA rating assumed vanished, and “safe” senior tranches suffered losses far beyond what an independence-based rating implied.

CR-C10. The reasoning treats the senior tranche’s default probability as if it were $\phi^2$ — vanishingly small for small $\phi$ — which is correct only under the counterfactual assumption of independent defaults ($a=0$). Once a common factor is present, the correct default probability is $\phi^2 + a^2\phi^2$, and the added $a^2\phi^2$ term is of the same order as $\phi^2$ (it is $a^2$ times as large), so it can inflate the senior default probability substantially. The single assumption whose misjudgment does the most damage is the assumed default correlation — the loading $a$ (equivalently $\rho$). Getting the marginal default probability $\phi$ roughly right is not enough; underestimating $a$ makes the senior tranche look far safer than it is, because the senior tranche’s entire risk sits in the joint-default event that correlation controls.

CR-Q1. (a) $F(5) = 1 - e^{-0.03 \times 5} = 1 - e^{-0.15} = 0.139292$. (b) $x(5) = N^{-1}(0.139292) = -1.083506$. So the entity defaults within 5 years with probability about $0.1393$, and the corresponding normal threshold is about $-1.0835$.

CR-Q2. (a) We need $N^{-1}(F(t^{*})) = 0$, i.e. $F(t^{*}) = N(0) = 0.5$. So $1 - e^{-0.05 t^{*}} = 0.5 \Rightarrow e^{-0.05 t^{*}} = 0.5 \Rightarrow t^{*} = \frac{\ln 2}{0.05} = 13.862944$ years. (b) Check: $F(13.862944) = 1 - e^{-0.05 \times 13.862944} = 1 - e^{-\ln 2} = 1 - 0.5 = 0.500000 = N(0)$. So $t^{*} \approx 13.863$ years and $F(t^{*}) = 0.5$, confirming the transform maps this default time to the normal threshold $x=0$.
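
The calculations in CR-Q1 and CR-Q2 can be reproduced with the standard library. This is a sketch under the exponential default-time CDF used in those solutions; the function names and the parameter name `hazard` are labels introduced here for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def default_cdf(t, hazard):
    """F(t) = 1 - exp(-hazard * t)."""
    return 1.0 - math.exp(-hazard * t)

def to_normal_scale(t, hazard):
    """x = N^{-1}(F(t)): map a default time onto the standard-normal scale."""
    return STD_NORMAL.inv_cdf(default_cdf(t, hazard))

def from_normal_scale(x, hazard):
    """Inverse map: t = F^{-1}(N(x)) = -ln(1 - N(x)) / hazard."""
    return -math.log(1.0 - STD_NORMAL.cdf(x)) / hazard

p5 = default_cdf(5.0, 0.03)            # CR-Q1(a)
x5 = to_normal_scale(5.0, 0.03)        # CR-Q1(b)
t_star = from_normal_scale(0.0, 0.05)  # CR-Q2(a)
print(f"F(5) = {p5:.6f}, x(5) = {x5:.4f}, t* = {t_star:.6f}")
```

Applying `from_normal_scale` to `x5` returns the original five years, which illustrates that the transform only relabels the outcome axis.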

CR-Q3. With $F_i(T) = 0.05$, $N^{-1}(0.05) = -1.644854$, and $a_i = 0.5$ so $\sqrt{1-0.25} = 0.866025$. (a) $Y=+1$: $F_i(T|Y) = N\!\left(\frac{-1.644854 - 0.5}{0.866025}\right) = N(-2.4767) = 0.006631$. (b) $Y=-1$: $F_i(T|Y) = N\!\left(\frac{-1.644854 + 0.5}{0.866025}\right) = N(-1.3220) = 0.093090$. The “bad” state $Y=-1$ produces far more defaults (about $9.3\%$ vs $0.66\%$), since a low common factor pushes every entity toward default.

CR-Q4. With $F(T)=0.10$, $N^{-1}(0.10) = -1.281552$, and $Y = -1.5$. (a) $\rho=0.2$: denominator $\sqrt{1-0.04}=0.979796$; $F(T|Y) = N\!\left(\frac{-1.281552 - 0.2(-1.5)}{0.979796}\right) = N(-1.0018) = 0.158222$. (b) $\rho=0.7$: denominator $\sqrt{1-0.49}=0.714143$; $F(T|Y) = N\!\left(\frac{-1.281552 - 0.7(-1.5)}{0.714143}\right) = N(-0.3242) = 0.372879$. The higher loading gives a larger conditional default probability ($37.3\%$ vs $15.8\%$) in the bad state because a larger $\rho$ ties the entity more tightly to the common factor, so an unfavorable $Y$ raises its default probability more.
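
The four conditional probabilities in CR-Q3 and CR-Q4 follow from one formula, so they can be checked together. The sketch below assumes the conditional-probability expression used in those solutions; `conditional_pd` and the variable names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def conditional_pd(p, loading, y):
    """N((N^{-1}(p) - loading*y) / sqrt(1 - loading^2))."""
    threshold = STD_NORMAL.inv_cdf(p)
    return STD_NORMAL.cdf((threshold - loading * y) / math.sqrt(1.0 - loading ** 2))

q3_good = conditional_pd(0.05, 0.5, +1.0)  # CR-Q3(a)
q3_bad = conditional_pd(0.05, 0.5, -1.0)   # CR-Q3(b)
q4_low = conditional_pd(0.10, 0.2, -1.5)   # CR-Q4(a)
q4_high = conditional_pd(0.10, 0.7, -1.5)  # CR-Q4(b)

for label, value in [("Q3 Y=+1", q3_good), ("Q3 Y=-1", q3_bad),
                     ("Q4 rho=0.2", q4_low), ("Q4 rho=0.7", q4_high)]:
    print(f"{label}: {value:.6f}")
```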

CR-Q5. (a) $cov(x_i,x_j) = a_i a_j = 0.6 \times 0.8 = 0.48$; because each $x_i$ has unit variance, this covariance is also the correlation. (b) A common loading $a$ with $a^2 = 0.48$ gives the same pairwise correlation, so $a = \sqrt{0.48} = 0.692820$. Result: correlation $0.48$; equivalent common loading $\approx 0.6928$.

CR-Q6. (a) $cov(x_1,x_2) = 0.5 \times 0.5 = 0.25$; $cov(x_1,x_3) = 0.5 \times 0.9 = 0.45$; $cov(x_2,x_3) = 0.5 \times 0.9 = 0.45$. (b) The pairs $(1,3)$ and $(2,3)$ are most correlated, at $0.45$. Entity 3’s large loading $a_3 = 0.9$ dominates because pairwise correlation is the product of loadings, so any pair including the high-loading entity inherits a large correlation. Correlations: $0.25,\ 0.45,\ 0.45$.
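
Since every pairwise correlation is a product of two loadings, the results of CR-Q5 and CR-Q6 can be tabulated directly. The helper `pairwise_correlations` and its 1-based entity labels are conventions assumed for this sketch.

```python
import math
from itertools import combinations

def pairwise_correlations(loadings):
    """Map each pair (i, j), numbered from 1, to the correlation a_i * a_j."""
    return {(i + 1, j + 1): loadings[i] * loadings[j]
            for i, j in combinations(range(len(loadings)), 2)}

corr = pairwise_correlations([0.5, 0.5, 0.9])  # CR-Q6
q5_corr = 0.6 * 0.8                            # CR-Q5(a)
q5_common = math.sqrt(q5_corr)                 # CR-Q5(b): a with a^2 = 0.48

for pair, value in corr.items():
    print(pair, round(value, 4))
print(round(q5_corr, 4), round(q5_common, 6))
```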

CR-Q7. With $\phi=0.10$, $M=\$1{,}000$, $r_f=0.04$. (a) Senior: $P_{\text{senior}} = \frac{(1-\phi^2)M}{1+r_f} = \frac{(1-0.01)(1000)}{1.04} = \frac{990}{1.04} = \$951.92$. (b) Junior: $P_{\text{junior}} = \frac{(1-\phi)^2 M}{1+r_f} = \frac{(0.9)^2(1000)}{1.04} = \frac{810}{1.04} = \$778.85$.

CR-Q8. With $a = 0.20$, so $a^2\phi^2 = 0.04 \times 0.01 = 0.0004$. Senior: $P_{\text{senior}} = \frac{(1-0.01-0.0004)(1000)}{1.04} = \frac{989.6}{1.04} = \$951.54$. Junior: $P_{\text{junior}} = \frac{(0.81+0.0004)(1000)}{1.04} = \frac{810.4}{1.04} = \$779.23$. Changes relative to CR-Q7: senior falls by $\$0.3846$ (from $\$951.92$ to $\$951.54$); junior rises by $\$0.3846$ (from $\$778.85$ to $\$779.23$). The loss to the senior equals the gain to the junior.

CR-Q9. With $a=0.30$, $a^2\phi^2 = 0.09 \times 0.01 = 0.0009$. (a) Transfer $= \frac{a^2\phi^2 M}{1+r_f} = \frac{0.0009 \times 1000}{1.04} = \frac{0.9}{1.04} = \$0.8654$ moves from senior to junior. (b) $P_{\text{senior}} = \frac{(0.99 - 0.0009)(1000)}{1.04}$ and $P_{\text{junior}} = \frac{(0.81 + 0.0009)(1000)}{1.04}$; summing, the $\pm 0.0009$ terms cancel: $P_{\text{senior}}+P_{\text{junior}} = \frac{(0.99+0.81)(1000)}{1.04} = \frac{1800}{1.04} = \$1{,}730.77$. Holding both loans outright: $\frac{2(1-\phi)M}{1+r_f} = \frac{2(0.9)(1000)}{1.04} = \frac{1800}{1.04} = \$1{,}730.77$. The two match, confirming value conservation. Transfer $\approx \$0.87$; combined value $\$1{,}730.77$.
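
The tranche prices in CR-Q7 through CR-Q9 share one pair of formulas, so the conservation of value can be checked for any loading. This is a minimal sketch assuming the two-loan pricing expressions used in those solutions; `tranche_prices` and `both_loans_value` are illustrative names.

```python
def tranche_prices(phi, a, face, rf):
    """Return (senior, junior) prices for default probability phi and loading a."""
    shift = a ** 2 * phi ** 2  # probability mass moved by correlation
    senior = (1.0 - phi ** 2 - shift) * face / (1.0 + rf)
    junior = ((1.0 - phi) ** 2 + shift) * face / (1.0 + rf)
    return senior, junior

def both_loans_value(phi, face, rf):
    """Value of holding the two loans outright."""
    return 2.0 * (1.0 - phi) * face / (1.0 + rf)

phi, face, rf = 0.10, 1000.0, 0.04
total = both_loans_value(phi, face, rf)
for a in (0.0, 0.2, 0.3):  # CR-Q7, CR-Q8, CR-Q9
    senior, junior = tranche_prices(phi, a, face, rf)
    print(f"a={a}: senior={senior:.2f}, junior={junior:.2f}, sum={senior + junior:.2f}")
```

For every loading the two prices sum to the same total, which is the value of holding both loans outright.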

CR-Q10. With $\phi=0.10$. (a) Under independence ($a=0$): default probability $= \phi^2 = 0.01$. Under $a=0.5$: $\phi^2 + a^2\phi^2 = 0.01 + 0.25 \times 0.01 = 0.0125$. (b) Percentage increase $= \frac{0.0125 - 0.01}{0.01} = \frac{a^2\phi^2}{\phi^2} = a^2 = 0.25 = 25\%$. So the senior tranche’s default probability rises from $0.0100$ to $0.0125$, a $25\%$ increase — a substantial jump driven entirely by the correlation loading.

17.14 Market Makers

MM-C1. Stock Y, with $\mu = 0.40$, has the wider spread. In the Glosten-Milgrom model a competitive market maker cannot tell an informed order from an uninformed one, so she must protect herself against the possibility that the trader on the other side knows the true value. When a buy order arrives, the posterior probability that it came from an informed trader (who buys only when $v = V_1$) rises with $\mu$; the zero-profit condition then forces $A = E[V|\mbox{buy order}]$, which lies above $E[V]$ because a buy order is partly informative “good news.” Symmetrically a sell order pushes $B = E[V|\mbox{sell order}]$ below $E[V]$. With more informed traders (stock Y), each order carries more information, the conditional expectations move farther from $E[V]$, and the spread $A - B$ widens. The money lost trading against informed traders is exactly recouped from the uninformed, so the market maker breaks even on average.
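
The break-even claim can be made concrete by splitting the market maker's expected profit on a buy order into its two sources. The sketch below is illustrative: it borrows $\pi = 0.5$, $V_1 = 110$, $V_0 = 90$ from MM-Q1, and the low-$\mu$ comparison value of $0.10$ is a hypothetical number, not taken from the problem. It assumes, as in the chapter's $P(\mbox{buy})$ formula, that an uninformed trader buys with probability one half.

```python
def ask_price(pi, mu, v_high, v_low):
    """Zero-profit ask: E[V | buy] = E[V * 1{buy}] / P(buy)."""
    ev = pi * v_high + (1 - pi) * v_low
    p_informed_buy = pi * mu           # informed trader who knows the value is high
    p_uninformed_buy = 0.5 * (1 - mu)  # uninformed trader who happens to buy
    weighted_value = p_informed_buy * v_high + p_uninformed_buy * ev
    return weighted_value / (p_informed_buy + p_uninformed_buy)

def profit_split(pi, mu, v_high, v_low):
    """Expected profit at the ask from informed and from uninformed buyers."""
    ask = ask_price(pi, mu, v_high, v_low)
    ev = pi * v_high + (1 - pi) * v_low
    from_informed = pi * mu * (ask - v_high)    # negative: sold below true value
    from_uninformed = 0.5 * (1 - mu) * (ask - ev)
    return from_informed, from_uninformed

loss, gain = profit_split(0.5, 0.40, 110.0, 90.0)
ask_low_mu = ask_price(0.5, 0.10, 110.0, 90.0)   # hypothetical low-mu stock
ask_high_mu = ask_price(0.5, 0.40, 110.0, 90.0)
print(loss, gain, ask_low_mu, ask_high_mu)
```

The loss to informed buyers and the gain from uninformed buyers offset each other, and the ask sits farther above $E[V]$ when $\mu$ is larger.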

MM-C2. The spread does not require monopoly power; it is an adverse-selection cost. Even perfectly competitive market makers who earn zero expected profit must quote $A = E[V|\mbox{buy}] > E[V]$ and $B = E[V|\mbox{sell}] < E[V]$, because otherwise they would lose money on the informed trades and, being unable to distinguish informed from uninformed, could not stay solvent. If every arriving trader were uninformed ($\mu = 0$), then a buy or sell order conveys no information, $A = B = E[V]$, and the spread collapses to zero. This directly refutes the commentator: the spread is generated by information asymmetry, not market power, and it vanishes precisely when there is nothing to be adversely selected against.

MM-C3. A buy order is not a random event: informed traders buy only when the value is high, so seeing a buy order raises the market maker’s assessment of the probability that $v = V_1$. Bayes’ rule quantifies this. The unconditional probability of a buy is $P(\mbox{buy}) = \pi\mu + \frac12(1-\mu)$, and of that the informed contribute $\pi\mu$, giving $P(\mbox{In}|\mbox{buy}) = \frac{\pi\mu}{\pi\mu + \frac12(1-\mu)}$. Because informed buyers are certain the value is high, the posterior expectation $E[V|\mbox{buy}]$ tilts upward relative to $E[V]$. Setting $A = E[V|\mbox{buy order}]$ rather than $E[V]$ means the market maker charges buyers for the “bad news to her” that a buy order represents; quoting the unconditional $E[V]$ would leave her expecting to lose money whenever the buyer turns out to be informed.

MM-C4. The spread $A - B = \frac{4\pi(1-\pi)\mu(V_1 - V_0)}{1 - (2\pi-1)^2\mu^2}$ is (i) proportional to the value gap $V_1 - V_0$ and (ii) increasing in the informed fraction $\mu$ (the numerator rises linearly and the denominator falls as $\mu$ grows). Channel (i) corresponds to the chapter’s third narrowing force — lower volatility of the underlying asset shrinks $V_1 - V_0$ and hence the spread. Channel (ii) corresponds to the second force — a decrease in the probability of informed trading. The first listed force, competition between market makers, is what drives the market maker to the zero-profit quotes $A = E[V|\mbox{buy}]$ and $B = E[V|\mbox{sell}]$ in the first place, without which the spread could be wider still.

MM-C5. If the insider bought without limit, her enormous order flow would reveal that $v$ is high, the market maker’s price $P(y) = \mu + \lambda y$ would jump, and she would end up buying at prices near the true value, eliminating her profit. She therefore restrains demand to the optimal $x = \frac{v-\mu}{2\lambda}$, trading enough to exploit her signal but little enough to keep the price impact from swallowing her gains. The noise traders’ demand $u \sim N(0,\sigma^2_u)$ provides camouflage: because the market maker sees only total order flow $y = x + u$, the insider’s trade is hidden inside random noise, so a large $y$ could reflect either informed buying or a noise shock. If $\sigma^2_u \to 0$ there is no camouflage: order flow becomes fully revealing of $v$, the price-impact slope $\lambda = \frac{\sqrt{\Sigma_0}}{2\sigma_u}$ grows without bound, her optimal order shrinks toward zero, and her profit (and with it any incentive to trade on information) disappears.

MM-C6. From $\lambda = \frac{\sqrt{\Sigma_0}}{2\sigma_u}$, asset A (high $\Sigma_0$, low $\sigma_u$) has the larger $\lambda$ and thus the shallower market; asset B (low $\Sigma_0$, high $\sigma_u$) has the smaller $\lambda$ and the deeper market. “Deeper” means a trader can push through a large order with less price impact: each unit of order flow moves the price by only $\lambda$, so a small $\lambda$ lets big orders execute close to the prior price. Market makers rationally choose a steeper pricing rule (larger $\lambda$) for asset A because its high fundamental uncertainty and thin noise trading make order flow very informative, so each unit of imbalance signals a large revision in value and must be priced aggressively to avoid losses to informed traders.

MM-C7. The optimal order $x = \frac{v-\mu}{2\lambda}$ is proportional to the surprise $v - \mu$ because the insider profits from the gap between the true value and the price the market maker would otherwise set; a bigger informational edge justifies a bigger position. It is inversely proportional to $2\lambda$ because $\lambda$ measures how fast her own trading moves the price against her: the factor of $2$ reflects that trading a marginal share raises the price paid both on that share and on all inframarginal shares. A deeper market (smaller $\lambda$) means less price impact per share, so she can exploit the same signal more aggressively before the price catches up. In equilibrium this linear demand feeds back into a normally distributed order flow, and matching the coefficient on $y$ in $E[v|y]$ to $\lambda$ pins down $\lambda = \frac{\sqrt{\Sigma_0}}{2\sigma_u}$, confirming the market maker’s conjectured linear rule is self-consistent.
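
The self-consistency of the linear rule can be verified numerically. The sketch below uses the standard projection of one normal variable on another, $E[v|y] = \mu + \frac{cov(v,y)}{\mathrm{Var}(y)}\,y$, with order flow $y = \beta(v-\mu) + u$ and $\beta = \frac{1}{2\lambda}$. The function names are illustrative and the inputs are those of MM-Q7(a).

```python
import math

def kyle_lambda(prior_var, noise_sd):
    """Equilibrium price-impact slope sqrt(Sigma_0) / (2 * sigma_u)."""
    return math.sqrt(prior_var) / (2.0 * noise_sd)

def regression_slope(beta, prior_var, noise_sd):
    """Coefficient on y in E[v | y] when y = beta*(v - mu) + u."""
    cov_vy = beta * prior_var
    var_y = beta ** 2 * prior_var + noise_sd ** 2
    return cov_vy / var_y

prior_var, noise_sd = 25.0, 10.0  # Sigma_0 and sigma_u from MM-Q7(a)
lam = kyle_lambda(prior_var, noise_sd)
beta = 1.0 / (2.0 * lam)          # insider's trading intensity
implied = regression_slope(beta, prior_var, noise_sd)
print(lam, beta, implied)
```

The slope the market maker infers from the order flow equals the $\lambda$ that the insider took as given, so the conjectured rule reproduces itself.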

MM-C8. If the price $P = a + bs - dZ$ perfectly revealed $s$, an uninformed trader could read the signal straight off the price for free, so no one would pay the cost of becoming informed — the private return to information would be zero. But if no one acquires the signal, there is nothing for the price to reveal, so a fully revealing price cannot be sustained in equilibrium. The noisy supply $Z \sim N(0,\sigma^2_z)$ breaks this impossibility: because the price mixes the signal $s$ together with the unobserved supply shock $Z$, observers cannot invert the price to recover $s$ exactly. This partial revelation leaves a positive return to being informed, so a strictly interior fraction of traders is willing to pay for the signal, and equilibrium exists.

MM-C9. With $\sigma^2_z \to 0$ the price becomes an exact, invertible function of the signal: there is no longer a noisy supply term to garble it, so the price fully reveals $s$. Uninformed traders can then infer the signal costlessly from the price, making the private value of paying for information zero. Rational traders would therefore stop acquiring the costly signal, so the fraction $\lambda$ of informed traders collapses toward zero. This is the Grossman-Stiglitz paradox in reverse: a healthy market needs some traders who trade for reasons unrelated to fundamental value (noise/liquidity traders), because their randomness is exactly what keeps prices from being perfectly revealing and thereby preserves the incentive for anyone to gather information at all.

MM-C10. In both models the transaction price sits away from the market maker’s unconditional expectation of value, but the wedge is generated differently. In Glosten-Milgrom the wedge is the discrete bid-ask spread: values are binary ($V_0$ or $V_1$), each trader demands exactly one share, and the adverse-selection cost is the chance that a single order comes from an informed trader. In Kyle values and quantities are continuous and Gaussian, the informed trader chooses a strategic quantity $x = \frac{v-\mu}{2\lambda}$, and the adverse-selection cost shows up as the price-impact slope $\lambda$ applied to order flow. In both the market maker is competitive and earns zero expected profit: she systematically loses to informed traders but recovers exactly those losses from uninformed/noise traders, so the quotes (spread in one, $\lambda$ in the other) are set precisely at the break-even level.

MM-C11. Informed and uninformed traders hold different positions because their conditioning information differs: the informed trader uses $E[v|s]$ and the tighter variance $\sigma^2(v|s)$, while the uninformed trader uses $E[v|P]$ and the wider $\sigma^2(v|P)$. Since the informed posterior mean typically diverges from the price more sharply and with more confidence, her demand $\theta = \frac{E[v|I]-P}{\gamma\sigma^2(v|I)}$ is generally larger in magnitude. The term $\sigma^2(v|I)$ in the denominator encodes the value of information: better information means a smaller posterior variance, which shrinks the denominator and lets the trader take a larger position on the same perceived mispricing. A risk-averse trader scales back positions she is uncertain about, so reducing that uncertainty is exactly what makes information valuable.

MM-C12. In an information-based theory such as Glosten-Milgrom, the spread compensates the dealer for adverse selection — the risk that the counterparty is better informed — so it is generated purely by information asymmetry and would exist even for a risk-neutral dealer with unlimited capital. In an inventory-based theory, the spread reflects the dealer’s need to manage risk from holding unbalanced positions: a dealer who accumulates a large long position lowers both her bid and ask to encourage buyers and discourage further sellers, nudging her inventory back toward a target. One distinguishing empirical prediction: inventory models imply quote revisions that depend on the dealer’s accumulated position and mean-revert as inventory normalizes, whereas information models predict quote changes that are permanent (prices ratchet in the direction of the informative order flow and do not revert).

MM-Q1. With $\pi = 0.5$, $E[V] = 0.5(110) + 0.5(90) = 100$. Ask numerator $\pi(1+\mu)V_1 + (1-\pi)(1-\mu)V_0 = 0.5(1.3)(110) + 0.5(0.7)(90) = 71.5 + 31.5 = 103$; denominator $1 + (2\pi-1)\mu = 1 + 0 = 1$, so $A = 103$. Bid numerator $0.5(0.7)(110) + 0.5(1.3)(90) = 38.5 + 58.5 = 97$; denominator $1$, so $B = 97$. Spread $A - B = 103 - 97 = 6$. The simplified formula gives $A - B = \mu(V_1 - V_0) = 0.3(20) = 6$, matching. Result: $A = 103$, $B = 97$, $A - B = 6$.

MM-Q2. $E[V] = 0.7(80) + 0.3(20) = 56 + 6 = 62$. Ask numerator $\pi(1+\mu)V_1 + (1-\pi)(1-\mu)V_0 = 0.7(1.5)(80) + 0.3(0.5)(20) = 84 + 3 = 87$; denominator $1 + (2\pi-1)\mu = 1 + (0.4)(0.5) = 1.2$. So $A = 87/1.2 = 72.5$. Indeed $A = 72.5 > 62 = E[V]$: a buy order raises the posterior probability the value is high, so the market maker prices above the unconditional expectation to avoid losing to informed buyers. Result: $E[V] = 62$, $A = 72.5$.

MM-Q3. $E[V] = 0.6(60) + 0.4(40) = 52$ (for reference). Ask numerator $0.6(1.25)(60) + 0.4(0.75)(40) = 45 + 12 = 57$; denominator $1 + (2\pi-1)\mu = 1 + (0.2)(0.25) = 1.05$; $A = 57/1.05 = 54.2857$. Bid numerator $0.6(0.75)(60) + 0.4(1.25)(40) = 27 + 20 = 47$; denominator $1 + (1-2\pi)\mu = 1 + (-0.2)(0.25) = 0.95$; $B = 47/0.95 = 49.4737$. Spread $A - B = 54.2857 - 49.4737 = 4.812$. Closed form: $\frac{4(0.6)(0.4)(0.25)(20)}{1 - (0.2)^2(0.25)^2} = \frac{4.8}{1 - 0.0025} = \frac{4.8}{0.9975} = 4.812$, matching. Result: $A \approx 54.29$, $B \approx 49.47$, $A - B \approx 4.81$.
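
The quotes in MM-Q1 through MM-Q3 can be rebuilt from Bayes’ rule rather than from the closed-form expressions. The sketch below assumes the setup implied by those formulas: an informed trader buys when the value is high and sells when it is low, and an uninformed trader buys or sells with probability one half. Function names are illustrative.

```python
def glosten_milgrom_quotes(pi, mu, v_high, v_low):
    """Return (ask, bid) as posterior expectations given a buy or a sell."""
    p_buy_high = mu + 0.5 * (1 - mu)  # P(buy | value high)
    p_buy_low = 0.5 * (1 - mu)        # P(buy | value low)
    p_buy = pi * p_buy_high + (1 - pi) * p_buy_low
    post_high_buy = pi * p_buy_high / p_buy
    ask = post_high_buy * v_high + (1 - post_high_buy) * v_low

    p_sell_high = 0.5 * (1 - mu)      # P(sell | value high)
    p_sell_low = mu + 0.5 * (1 - mu)  # P(sell | value low)
    p_sell = pi * p_sell_high + (1 - pi) * p_sell_low
    post_high_sell = pi * p_sell_high / p_sell
    bid = post_high_sell * v_high + (1 - post_high_sell) * v_low
    return ask, bid

def closed_form_spread(pi, mu, v_high, v_low):
    """Spread formula quoted in MM-C4."""
    return 4 * pi * (1 - pi) * mu * (v_high - v_low) / (1 - (2 * pi - 1) ** 2 * mu ** 2)

for pi, mu, v1, v0 in [(0.5, 0.3, 110, 90), (0.7, 0.5, 80, 20), (0.6, 0.25, 60, 40)]:
    ask, bid = glosten_milgrom_quotes(pi, mu, v1, v0)
    print(f"ask={ask:.4f}, bid={bid:.4f}, spread={ask - bid:.4f}")
```

Setting `mu` to zero in this function returns an ask and a bid both equal to $E[V]$, which is the zero-spread case discussed in MM-C2.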

MM-Q4. With $\pi = 0.5$, spread $A - B = \mu(V_1 - V_0) = 0.4(140 - 100) = 0.4(40) = 16$. $E[V] = 0.5(140) + 0.5(100) = 120$; because $\pi = 0.5$ the spread is symmetric about $E[V]$, so $A = 120 + 8 = 128$ and $B = 120 - 8 = 112$. If $\mu$ rises to $0.8$, the spread becomes $0.8(40) = 32$. Result: $A - B = 16$, $A = 128$, $B = 112$; at $\mu = 0.8$, $A - B = 32$.

MM-Q5. $\lambda = \frac{\sqrt{\Sigma_0}}{2\sigma_u} = \frac{\sqrt{16}}{2(2)} = \frac{4}{4} = 1$. With $\mu = p_0 = 100$ and $y = 3$: $P(y) = \mu + \lambda y = 100 + 1(3) = 103$. Result: $\lambda = 1$, $P(3) = 103$.

MM-Q6. (a) $\lambda = \frac{\sqrt{36}}{2(3)} = \frac{6}{6} = 1$. (b) With $\sigma^2_u = 36$, $\sigma_u = 6$: $\lambda = \frac{\sqrt{36}}{2(6)} = \frac{6}{12} = 0.5$. (c) The market became deeper (smaller $\lambda$): with more noise trading to hide behind, each unit of order flow is less informative, so a large order moves the price by less — lower price impact. Result: $\lambda = 1$ then $\lambda = 0.5$; deeper.

MM-Q7. (a) $\lambda = \frac{\sqrt{25}}{2\sqrt{100}} = \frac{5}{2(10)} = \frac{5}{20} = 0.25$. (b) With $\mu = p_0 = 50$ and $v = 60$: $x = \frac{v-\mu}{2\lambda} = \frac{60 - 50}{2(0.25)} = \frac{10}{0.5} = 20$. (c) With $\sigma^2_u = 400$, $\sigma_u = 20$: $\lambda = \frac{5}{2(20)} = \frac{5}{40} = 0.125$; then $x = \frac{10}{2(0.125)} = \frac{10}{0.25} = 40$. The insider trades more aggressively because the deeper market (smaller $\lambda$) means less price impact per share. Result: $\lambda = 0.25$, $x = 20$; then $\lambda = 0.125$, $x = 40$.

MM-Q8. (a) $\lambda = \frac{\sqrt{64}}{2(4)} = \frac{8}{8} = 1$; $\mu = p_0 = 30$. (b) $x = \frac{v-\mu}{2\lambda} = \frac{45 - 30}{2(1)} = \frac{15}{2} = 7.5$. (c) $y = x + u = 7.5 + (-2.5) = 5$; $P(y) = \mu + \lambda y = 30 + 1(5) = 35$. The insider bought at $35$, below the true value $v = 45$, so she earns a positive profit of $45 - 35 = 10$ per share. Result: $\lambda = 1$, $x = 7.5$, $y = 5$, $P = 35$ (below $v$).
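
The Kyle-model calculations in MM-Q5 through MM-Q8 chain together three formulas, which the following sketch collects. The function names are illustrative; `noise_sd` is the standard deviation $\sigma_u$, so a problem stated in terms of $\sigma^2_u$ needs a square root first.

```python
import math

def kyle_lambda(prior_var, noise_sd):
    """Price-impact slope sqrt(Sigma_0) / (2 * sigma_u)."""
    return math.sqrt(prior_var) / (2.0 * noise_sd)

def insider_order(v, p0, lam):
    """Optimal informed order x = (v - p0) / (2 * lambda)."""
    return (v - p0) / (2.0 * lam)

def price(p0, lam, order_flow):
    """Market maker's linear rule P(y) = p0 + lambda * y."""
    return p0 + lam * order_flow

# MM-Q8: Sigma_0 = 64, sigma_u = 4, p0 = 30, v = 45, noise demand u = -2.5
lam = kyle_lambda(64.0, 4.0)
x = insider_order(45.0, 30.0, lam)
y = x + (-2.5)
p = price(30.0, lam, y)
profit = (45.0 - p) * x  # insider's total profit on her order
print(lam, x, y, p, profit)
```

Re-running `kyle_lambda` with the inputs of MM-Q7 shows the depth effect: doubling $\sigma_u$ from 10 to 20 halves $\lambda$ and doubles the insider’s order.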

MM-Q9. $M = \frac{\sigma^2_v}{\sigma^2_v + \sigma^2_\epsilon} = \frac{9}{9 + 3} = \frac{9}{12} = 0.75$. With $\mu = 20$ and $s = 26$: $E[v|s] = \mu + M s = 20 + 0.75(26) = 20 + 19.5 = 39.5$. Result: $M = 0.75$, $E[v|s] = 39.5$.

MM-Q10. (a) $M = \frac{16}{16 + 16} = \frac{16}{32} = 0.5$. (b) $E[v|s] = \mu + M s = 50 + 0.5(70) = 50 + 35 = 85$. (c) As $\sigma^2_\epsilon \to \infty$, $M \to 0$: an extremely noisy signal carries almost no information, so the trader places essentially no weight on it and relies on the prior. Result: $M = 0.5$, $E[v|s] = 85$; $M \to 0$ as noise grows.

MM-Q11. $\theta = \frac{E[v|I] - P}{\gamma \sigma^2(v|I)} = \frac{25 - 22}{2(5)} = \frac{3}{10} = 0.3$. If posterior uncertainty $\sigma^2(v|s)$ doubled to $10$, the denominator would double and her demand would halve to $0.15$ — greater uncertainty makes her scale back her position on the same expected gain. Result: $\theta = 0.3$; doubling $\sigma^2(v|s)$ halves it to $0.15$.

MM-Q12. (a) $\theta = \frac{E[v|I] - P}{\gamma \sigma^2(v|I)} = \frac{30 - 24}{4(2)} = \frac{6}{8} = 0.75$ for each trader. (b) With $\gamma = 1$: $\theta = \frac{30 - 24}{1(2)} = \frac{6}{2} = 3$. (c) Demand is inversely proportional to risk aversion: for a given expected gain and posterior variance, a less risk-averse investor takes a proportionally larger position. Result: $\theta = 0.75$ (at $\gamma = 4$), $\theta = 3$ (at $\gamma = 1$).
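
The demand formula used in MM-Q11 and MM-Q12 is simple enough to tabulate. The sketch below is illustrative; `demand` and its parameter names are labels introduced here.

```python
def demand(expected_value, price, risk_aversion, posterior_var):
    """theta = (E[v | I] - P) / (gamma * sigma^2(v | I))."""
    return (expected_value - price) / (risk_aversion * posterior_var)

q11_base = demand(25.0, 22.0, 2.0, 5.0)      # MM-Q11
q11_doubled = demand(25.0, 22.0, 2.0, 10.0)  # posterior variance doubled
q12_a = demand(30.0, 24.0, 4.0, 2.0)         # MM-Q12(a)
q12_b = demand(30.0, 24.0, 1.0, 2.0)         # MM-Q12(b)
print(q11_base, q11_doubled, q12_a, q12_b)
```

Doubling either the risk aversion or the posterior variance halves the position, since both enter only through their product in the denominator.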