Efficient Portfolios

Keith A. Lewis

April 25, 2024

Abstract
The Capital Asset Pricing Model holds as equality of random variables, not just their expected values.

Given two random realized returns on an investment, which is to be preferred? This is a fundamental problem in finance that has no definitive answer except in the case one investment always returns more than the other, in which case arbitrage exists. In 1952 Markowitz (Markowitz 1952) and Roy (Roy 1952) introduced a criterion for risk vs. return in portfolio selection: if two portfolios have the same expected realized return then prefer the one with smaller variance. An efficient portfolio has the least variance among all portfolios having the same expected realized return. This was developed into the Capital Asset Pricing Model by Treynor (Treynor 1961), Sharpe (Sharpe 1964), Lintner (Lintner 1965), and many others.

The Capital Asset Pricing Model marked the transition from the due diligence required for Graham-Dodd security analysis to using the wisdom of the markets to inform investing. The “market portfolio” was assumed to be in an efficient “equilibrium” resulting from the cadre of investment professionals performing “market clearing” trades. This short note is agnostic to the quoted terms and proves a simple mathematical result about efficient portfolios.

There are well-founded criticisms of the CAPM, but it has value as an easily understood model. Portfolio managers use Sharpe ratios to tailor returns for an investment strategy while accounting for risk. The CAPM demonstrates a constraint on expected returns and covariance of efficient portfolios. We show a much stronger constraint: efficient portfolios satisfy an equality of realized returns as random variables. This allows the value-at-risk, or any risk measure, of efficient portfolios to be calculated, something not possible using the classical result that only holds for expected values.

This result follows directly from writing down a mathematical model for one-period investments. The only thing remarkable is that this has not heretofore been noted in the literature. Prior work fails to explicitly specify a sample space and probability measure, the first step in any model involving probability since Kolmogorov legitimized probability as a branch of measure theory (Kolmogorov 1956).

CAPM

The CAPM places a constraint on the excess expected realized return of efficient portfolios. \tag{1} E[R] - R_0 = \beta(E[R_1] - R_0) where R is the realized return of an efficient portfolio, R_0 is the realized return of a risk-less portfolio, R_1 is the realized return of the “market portfolio”, and \beta = \operatorname{Cov}(R, R_1)/\operatorname{Var}(R_1).

This short note shows the random realized return R of any efficient portfolio satisfies \tag{2} R - R_0 = \beta(R_1 - R_0) where R_0 and R_1 are the random realized returns of any two linearly independent efficient portfolios. This implies \beta = \operatorname{Cov}(R - R_0, R_1 - R_0)/\operatorname{Var}(R_1 - R_0). Taking expected values of both sides when R_0 has zero variance and R_1 is the “market portfolio” yields the classical CAPM formula (1).

One-Period Model

Let I be the set of market instruments and \Omega be the set of possible market outcomes over the period. The one-period model specifies the initial instrument prices x\in\mathbf{R}^I and the final instrument prices X\colon\Omega\to\mathbf{R}^I depending on the outcome \omega\in\Omega that occurs. The one-period model also specifies a probability measure P on the space of outcomes. It is common to assume \Omega is \mathbf{R}^I, X is the identity function, and P is multivariate normal. We allow arbitrary distributions to be specified for final prices.

A portfolio \xi\in\mathbf{R}^I is the number of shares initially purchased in each instrument. It costs {\xi^* x = \sum_{i\in I} \xi_i x_i} to acquire the portfolio at the beginning of the period and returns {\xi^* X(\omega) = \sum_{i\in I} \xi_i X_i(\omega)} when liquidated at the end of the period if \omega\in\Omega occurs. The realized return of \xi is {R_\xi = \xi^* X/\xi^* x} when \xi^* x \not= 0.
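The definitions above can be illustrated on a finite sample space. The following is a minimal sketch, assuming a hypothetical model with two instruments (a bond and a stock) and three outcomes; the prices and the names `dot` and `realized_return` are illustrative only. It uses exact rational arithmetic so that the identity R_\xi = R_{t\xi} can be checked without rounding.

```python
from fractions import Fraction as F

# Hypothetical two-instrument, three-outcome model (illustrative numbers).
x = [F(1), F(100)]                      # initial prices: bond, stock
X = {                                   # final prices X(omega) per outcome
    "down": [F(21, 20), F(90)],
    "flat": [F(21, 20), F(100)],
    "up":   [F(21, 20), F(120)],
}

def dot(a, b):
    """xi^* x = sum_i xi_i x_i."""
    return sum(ai * bi for ai, bi in zip(a, b))

def realized_return(xi, x, X):
    """R_xi(omega) = xi^* X(omega) / xi^* x, defined when xi^* x != 0."""
    cost = dot(xi, x)
    if cost == 0:
        raise ValueError("realized return is undefined when xi^* x = 0")
    return {omega: dot(xi, X_omega) / cost for omega, X_omega in X.items()}

xi = [F(50), F(1)]                      # 50 bonds and 1 share, cost 150
R = realized_return(xi, x, X)           # a random variable on Omega
R_scaled = realized_return([2 * s for s in xi], x, X)   # same return
```

The realized return is a function on the set of outcomes, not a single number, and scaling the portfolio by a non-zero constant leaves it unchanged.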

Efficient Portfolio

A portfolio is efficient if its variance is less than or equal to the variance of any portfolio having the same expected realized return. Note R_\xi = R_{t\xi} for any non-zero t\in\mathbf{R} so there is no loss in assuming \xi^* x = 1. In this case R_\xi = \xi^* X is the realized return of the portfolio. If \xi^* x = 1 then the variance of the realized return is \operatorname{Var}(R_\xi) = \xi^*V\xi where {V = E[X X^*] - E[X] E[X^*]}.

For a given expected realized return r\in\mathbf{R} we can use Lagrange multipliers to minimize {\frac{1}{2}\xi^* V\xi - \lambda(\xi^* x - 1) - \mu(\xi^* E[X] - r)} over \xi\in\mathbf{R}^I, {\lambda, \mu\in\mathbf{R}}. As is well-known, the first order condition on \xi is {0 = V\xi - \lambda x - \mu E[X]}. See the Appendix for a proof.

If V is invertible then \xi = \lambda V^{-1}x + \mu V^{-1} E[X]. This shows every efficient portfolio is in the span of V^{-1}x and V^{-1} E[X].

The only novel result in this paper is the observation that if \xi_0 and \xi_1 are any two linearly independent efficient portfolios, normalized so that \xi_0^* x = \xi_1^* x = 1, then {\xi = \beta_0\xi_0 + \beta_1\xi_1} for some scalars \beta_0 and \beta_1 with \beta_0 + \beta_1 = \xi^* x \not= 0, so {R_\xi = (\beta_0 R_{\xi_0} + \beta_1 R_{\xi_1})/(\beta_0 + \beta_1)}. This shows R_\xi - R_{\xi_0} = \beta(R_{\xi_1} - R_{\xi_0}) as random variables where \beta = \beta_1/(\beta_0 + \beta_1). Taking the covariance with {R_{\xi_1} - R_{\xi_0}} on both sides gives \beta = \operatorname{Cov}(R_\xi - R_{\xi_0}, R_{\xi_1} - R_{\xi_0})/\operatorname{Var}(R_{\xi_1} - R_{\xi_0}).
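The equality of realized returns can be checked outcome by outcome on a small example. The sketch below assumes a hypothetical model with three instruments and four equally likely outcomes; all numbers and the helper names (`solve`, `efficient`, `returns`) are illustrative. It solves the first order conditions exactly for three target expected returns and compares R_\xi - R_{\xi_0} with \beta(R_{\xi_1} - R_{\xi_0}) for every \omega.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A y = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical model: 3 instruments, 4 equally likely outcomes.
n = 3
x = [F(1), F(1), F(1)]
P = [F(1, 4)] * 4
X = [[F(12, 10), F(9, 10), F(1)],
     [F(1), F(13, 10), F(11, 10)],
     [F(9, 10), F(1), F(13, 10)],
     [F(11, 10), F(11, 10), F(1)]]

EX = [sum(p * Xw[i] for p, Xw in zip(P, X)) for i in range(n)]
V = [[sum(p * Xw[i] * Xw[j] for p, Xw in zip(P, X)) - EX[i] * EX[j]
      for j in range(n)] for i in range(n)]

def efficient(r):
    """Solve V xi = lam x + mu E[X], xi^* x = 1, xi^* E[X] = r for xi."""
    A = [V[i] + [-x[i], -EX[i]] for i in range(n)]
    A.append(x + [F(0), F(0)])
    A.append(EX + [F(0), F(0)])
    return solve(A, [F(0)] * n + [F(1), r])[:n]

def returns(xi):
    """R_xi(omega) = xi^* X(omega) since xi^* x = 1."""
    return [dot(xi, Xw) for Xw in X]

def E(Y):
    return sum(p * y for p, y in zip(P, Y))

def cov(Y, Z):
    return E([y * z for y, z in zip(Y, Z)]) - E(Y) * E(Z)

R0, R1, R = (returns(efficient(r))
             for r in (F(106, 100), F(109, 100), F(108, 100)))
D1 = [a - b for a, b in zip(R1, R0)]        # R_1 - R_0 per outcome
D = [a - b for a, b in zip(R, R0)]          # R - R_0 per outcome
beta = cov(D, D1) / cov(D1, D1)
holds = all(d == beta * d1 for d, d1 in zip(D, D1))
```

In this example `holds` is true for every outcome, not only in expectation, and `beta` equals the ratio (r - r_0)/(r_1 - r_0) of excess expected returns.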

If V is not invertible then there exists a non-zero \zeta\in\mathbf{R}^I with {V\zeta = 0}. If \zeta is efficient, the first order condition reduces to {0 = -\lambda x - \mu E[X]}, which gives x = (-\mu/\lambda)E[X] when \lambda\not=0. The first order conditions {0 = \zeta^*x - 1} and {0 = \zeta^*E[X] - r} then show {1 = (-\mu/\lambda)r} so {x = (1/r)E[X]}. This is a special case of the condition for a one-period model to be arbitrage-free.

There may be two independent portfolios having variance zero. If they have different returns then arbitrage exists. If they have the same return then the model has redundant assets.

Appendix

We use the notation \xi^* for what is usually denoted by the transpose \xi^T or \xi'. It is simpler and more illuminating to work with abstract vector spaces and linear operators between them than with \mathbf{R}^n and matrices. Matrix multiplication is just composition of linear operators.

Recall \mathbf{R}^I = \{x\colon I\to\mathbf{R}\} is the vector space of all functions from the set I to \mathbf{R} with scalar multiplication and vector addition defined point-wise: {(ax)(i) = ax(i)} and {(x + y)(i) = x(i) + y(i)} for a\in\mathbf{R}, {x,y\in\mathbf{R}^I}, and i\in I.

For \xi\in\mathbf{R}^I define \xi^*\colon\mathbf{R}^I\to\mathbf{R} by \xi^*(x) = \xi^*x = \sum_{i\in I} \xi_i x_i if I is finite. Note \xi^* is linear.

Let \mathcal{L}(V,W) be the set of all linear operators from the vector space V to W. Note \mathcal{L}(V,W) is also a vector space with scalar multiplication and addition defined point-wise. The dual of a vector space V is V^*=\mathcal{L}(V,\mathbf{R}). For \xi\in\mathbf{R}^I we have \xi^*\in (\mathbf{R}^I)^* and \xi^*x = x^*\xi allows us to identify (\mathbf{R}^I)^* with \mathbf{R}^I. If T\in\mathcal{L}(V,W) its adjoint is T^*\in\mathcal{L}(W^*,V^*) defined by T^*w^*\in V^* where T^*w^*(v) = w^*(Tv), w^*\in W^*, v\in V. If S\in\mathcal{L}(W,U) then ST\in\mathcal{L}(V,U) and (ST)^* = T^*S^*.

Let \mathcal{B}(V,W) be the set of bounded linear operators from the normed linear spaces V to W. A linear operator T\in\mathcal{L}(V,W) is bounded if there exists C\in\mathbf{R} with \|Tv\| \le C\|v\| for all v\in V. The greatest lower bound of such constants is the norm of T. This makes \mathcal{B}(V,W) a normed vector space.

Fréchet derivative

If F\colon V\to W is a function between normed vector spaces its Fréchet derivative {DF\colon V\to \mathcal{B}(V,W)} is defined by F(\xi + h) = F(\xi) + DF(\xi)h + o(\|h\|) where f(h) = g(h) + o(\|h\|) means \|f(h) - g(h)\|/\|h\|\to 0 as \|h\|\to 0. If the Fréchet derivative exists at \xi then F can be approximated by a linear operator near \xi.

Given c\in\mathbf{R}^I define F\colon\mathbf{R}^I\to\mathbf{R} by F(\xi) = \xi^*c. We have {F(\xi + h) = \xi^*c + h^*c} so DF(\xi) = c^* since h^*c = c^*h.

Given T\colon\mathbf{R}^I\to\mathbf{R}^I define F\colon\mathbf{R}^I\to\mathbf{R} by F(\xi) = \xi^*T\xi. We have \begin{aligned} F(\xi + h) &= (\xi + h)^*T(\xi + h) \\ &= \xi^*T\xi + \xi^*Th + h^*T\xi + h^*Th \\ &= \xi^*T\xi + \xi^*Th + \xi^*T^*h + o(\|h\|) \\ \end{aligned} since h^*T\xi = \xi^*T^*h and |h^*Th| \le \|T\|\|h\|^2. This shows DF(\xi) = \xi^*(T + T^*).
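The derivative of the quadratic form can be checked numerically. This is a small sketch, assuming an arbitrary non-symmetric 2 by 2 operator T and point \xi chosen only for illustration; the names `Fq` and `DF` are hypothetical. It confirms that the remainder F(\xi + h) - F(\xi) - DF(\xi)h is exactly h^*Th, which is o(\|h\|).

```python
from fractions import Fraction as Fr

T = [[Fr(2), Fr(1)], [Fr(-3), Fr(4)]]   # arbitrary non-symmetric operator
xi = [Fr(1), Fr(2)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def apply(M, v):
    """Matrix-vector product M v."""
    return [dot(row, v) for row in M]

def Fq(v):
    """F(v) = v^* T v."""
    return dot(v, apply(T, v))

def DF(xi, h):
    """Claimed derivative applied to h: xi^* (T + T^*) h."""
    S = [[T[i][j] + T[j][i] for j in range(2)] for i in range(2)]
    return dot(xi, apply(S, h))

remainders, ratios = [], []
for k in range(1, 5):
    h = [Fr(1, 10**k), Fr(-1, 10**k)]
    shifted = [a + b for a, b in zip(xi, h)]
    rem = Fq(shifted) - Fq(xi) - DF(xi, h)
    remainders.append((h, rem))
    # Ratio of the remainder to the max-norm of h; it should tend to 0.
    ratios.append(abs(rem) / max(abs(c) for c in h))
```

The max-norm is used in place of the Euclidean norm only to keep the arithmetic exact; on a finite-dimensional space all norms give the same notion of o(\|h\|).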

Lagrange Multiplier

To find the minimum value of \operatorname{Var}(R_\xi) given E[R_\xi] = r we use Lagrange multipliers and solve \min \frac{1}{2}\xi^* V\xi - \lambda(\xi^* x - 1) - \mu(\xi^* E[X] - r) for \xi\in\mathbf{R}^I, \lambda, \mu\in\mathbf{R}. If {\xi^* x = 1} then {E[R_\xi] = \xi^* E[X]} and {\operatorname{Var}(R_\xi) = \xi^* V\xi} where {V = E[XX^*] - E[X]E[X^*]}.

Since V^* = V, the first order conditions for an extremum are \begin{aligned} 0 &= \xi^*V - \lambda x^* - \mu E[X^*] \\ 0 &= \xi^* x - 1 \\ 0 &= \xi^* E[X] - r \\ \end{aligned} Assuming V is invertible, taking the adjoint of the first condition gives \xi = V^{-1}(\lambda x + \mu E[X]). Note every extremum lies in the (at most) two dimensional subspace spanned by V^{-1}x and V^{-1}E[X].

The constraints 1 = x^*\xi and r = E[X^*]\xi can be written \begin{bmatrix} 1 \\ r \\ \end{bmatrix} = \begin{bmatrix} \lambda x^*V^{-1}x + \mu x^*V^{-1}E[X] \\ \lambda E[X^*]V^{-1}x + \mu E[X^*]V^{-1}E[X] \\ \end{bmatrix} = \begin{bmatrix} A & B \\ B & C\\ \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} with A = x^* V^{-1}x, B = x^* V^{-1}E[X] = E[X^*]V^{-1}x, and C = E[X^*] V^{-1}E[X]. Inverting gives \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \frac{1}{D} \begin{bmatrix} C & -B \\ -B & A\\ \end{bmatrix} \begin{bmatrix} 1 \\ r \end{bmatrix} = \begin{bmatrix} (C - r B)/D \\ (-B + r A)/D\\ \end{bmatrix} where D = AC - B^2. The solution is \lambda = (C - r B)/D, \mu = (-B + r A)/D, and \xi = \frac{C - r B}{D} V^{-1}x + \frac{-B + r A}{D} V^{-1}E[X].
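The closed-form solution can be evaluated directly. The following sketch assumes a hypothetical model with three instruments and four equally likely outcomes; the numbers and the helper names (`solve`, `efficient`) are illustrative. It computes A, B, C, D, the multipliers, and \xi, then checks both constraints and the first order condition exactly.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A y = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical model: 3 instruments, 4 equally likely outcomes.
n = 3
x = [F(1), F(1), F(1)]
P = [F(1, 4)] * 4
X = [[F(12, 10), F(9, 10), F(1)],
     [F(1), F(13, 10), F(11, 10)],
     [F(9, 10), F(1), F(13, 10)],
     [F(11, 10), F(11, 10), F(1)]]

EX = [sum(p * Xw[i] for p, Xw in zip(P, X)) for i in range(n)]
V = [[sum(p * Xw[i] * Xw[j] for p, Xw in zip(P, X)) - EX[i] * EX[j]
      for j in range(n)] for i in range(n)]

def efficient(r):
    """Closed-form efficient portfolio for expected realized return r."""
    Vx, VE = solve(V, x), solve(V, EX)          # V^{-1} x and V^{-1} E[X]
    A, B, C = dot(x, Vx), dot(x, VE), dot(EX, VE)
    D = A * C - B * B
    lam, mu = (C - r * B) / D, (-B + r * A) / D
    xi = [lam * a + mu * b for a, b in zip(Vx, VE)]
    return xi, lam, mu

r = F(108, 100)
xi, lam, mu = efficient(r)
V_xi = [dot(row, xi) for row in V]              # V xi
rhs = [lam * a + mu * b for a, b in zip(x, EX)]  # lam x + mu E[X]
```

Solving V y = x and V y = E[X] avoids forming V^{-1} explicitly; this is a design choice of the sketch rather than part of the derivation.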

A straightforward calculation shows the variance is \operatorname{Var}(R_\xi) = \xi^* V\xi = (C - 2Br + Ar^2)/D.
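The calculation can be written out as follows, using only the first order condition V\xi = \lambda x + \mu E[X], the constraints \xi^* x = 1 and \xi^* E[X] = r, and the values of \lambda and \mu found above.

```latex
\begin{aligned}
\xi^* V \xi &= \xi^*(\lambda x + \mu E[X]) \\
            &= \lambda\,\xi^* x + \mu\,\xi^* E[X] \\
            &= \lambda + \mu r \\
            &= \frac{C - rB}{D} + r\,\frac{-B + rA}{D} \\
            &= \frac{C - 2Br + Ar^2}{D}.
\end{aligned}
```

The variance of an efficient portfolio is therefore a quadratic function of its expected realized return r.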

FTAP

Arbitrage exists in the one-period model if there is a \xi\in\mathbf{R}^I with \xi^* x < 0 and \xi^* X(\omega)\ge0 for \omega\in\Omega. The cost of putting on a position \xi\in\mathbf{R}^I is \xi^*x so you make money entering the position and never lose money unwinding it.

Note if x = \sum_j X(\omega_j) D_j where \omega_j\in\Omega and D_j are non-negative scalars then {\xi^*x = \sum_j \xi^*X(\omega_j) D_j \ge 0} if \xi^*X\ge0. In this case there is no arbitrage.

The one-period Fundamental Theorem of Asset Pricing states there is no model arbitrage if and only if x belongs to the smallest closed cone containing the range of X. Note this statement does not involve any measures. The FTAP is a geometric result, not a probabilistic result.

Recall that a cone is a subset of a vector space closed under addition and multiplication by a positive scalar, that is, C + C\subseteq C and tC\subseteq C for t > 0. For example, the set of arbitrage portfolios is a cone.

The above proves the “easy” direction. The contrapositive of the other direction is the following lemma.

Lemma. If x\in\mathbf{R}^n and C is a closed cone in \mathbf{R}^n with x\not\in C then there exists \xi\in\mathbf{R}^n with \xi^* x < 0 and \xi^* y \ge0 for y\in C.

Proof. Since C is closed and convex there exists a nearest point \hat{x}\in C with 0 < \|\hat{x} - x\| \le \|y - x\| for all y\in C. Let \xi = \hat{x} - x. For any y\in C and t > 0 we have ty + \hat{x}\in C so \|\xi\| \le \|ty + \xi\|. Simplifying gives t^2\|y\|^2 + 2t\xi^* y\ge 0. Dividing by t > 0 and letting t decrease to 0 shows \xi^* y\ge 0. Take y = \hat{x} then t\hat{x} + \hat{x}\in C for t \ge -1. By similar reasoning, letting t increase to 0 shows \xi^* \hat{x}\le 0 so \xi^* \hat{x} = 0. Because 0 < \|\xi\|^2 = \xi^* (\hat{x} - x) = -\xi^* x we have \xi^* x < 0.

The proof also shows \xi is an arbitrage when one exists.
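The construction in the proof can be illustrated when the nearest point has a simple formula. The sketch below assumes the cone is the non-negative orthant of \mathbf{R}^2, which is the smallest closed cone containing the hypothetical final prices X(\omega_1) = (1, 0) and X(\omega_2) = (0, 1); the nearest point is then the component-wise positive part. The function names are illustrative.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def nearest_in_orthant(x):
    """Nearest point to x in the cone C = {y : y_i >= 0 for all i}."""
    return [max(xi, 0) for xi in x]

def separating_vector(x):
    """Return xi = x_hat - x from the proof of the Lemma, and x_hat."""
    x_hat = nearest_in_orthant(x)
    return [a - b for a, b in zip(x_hat, x)], x_hat

# Hypothetical prices lying outside the cone: the second instrument
# has a negative price although it never pays a negative amount.
x = [1, -2]
xi, x_hat = separating_vector(x)

generators = [[1, 0], [0, 1]]                 # X(omega_1), X(omega_2)
cost = dot(xi, x)                             # xi^* x, negative
payoffs = [dot(xi, g) for g in generators]    # xi^* X(omega), non-negative
```

Here \xi = (0, 2): the position is paid 4 to be put on and never loses money when unwound. For a general cone the nearest point has no closed form and would have to be computed numerically.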

If X is bounded, as it is in the real world, and the model is arbitrage-free then there exists a positive finitely-additive measure D on \Omega (Dunford and Schwartz 1958) with x = \int_\Omega X\,dD. Since D/D(\Omega) is a positive measure with mass 1 we have x = E[X]D(\Omega) under this “probability” measure.

We say \zeta\in\mathbf{R}^I is a zero coupon bond if \zeta^* X = 1. Since \zeta^*x = \int_\Omega \zeta^* X\,dD = D(\Omega) the realized return on \zeta is the constant R_\zeta = 1/D(\Omega). The discount of the zero coupon bond is D(\Omega) = 1/R_\zeta. In this case x is the discounted “expected value” of X.
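A finite example makes the role of D concrete. This sketch assumes a hypothetical model in which instrument 0 is a bond paying 1 in every outcome and instrument 1 is a stock, with three outcomes and positive weights D_j chosen for illustration. Prices are defined by x = \sum_j X(\omega_j) D_j, so the model is arbitrage-free by construction.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical final prices X(omega_j) = [bond, stock] and weights D_j.
X = [[F(1), F(90)], [F(1), F(100)], [F(1), F(120)]]
D = [F(3, 10), F(3, 10), F(7, 20)]

# Arbitrage-free initial prices x = sum_j X(omega_j) D_j.
x = [sum(d * Xw[i] for d, Xw in zip(D, X)) for i in range(2)]

mass = sum(D)                                   # D(Omega), the discount
zeta = [F(1), F(0)]                             # zero coupon bond
R_zeta = [dot(zeta, Xw) / dot(zeta, x) for Xw in X]

# Under the "probability" measure P = D / D(Omega), x = E[X] D(Omega).
P = [d / mass for d in D]
EX = [sum(p * Xw[i] for p, Xw in zip(P, X)) for i in range(2)]
discounted = [e * mass for e in EX]
```

The realized return of the bond is the same constant in every outcome, and the initial prices agree with the discounted expected final prices.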

References

Dunford, Nelson, and Jacob T. Schwartz. 1958. Linear Operators, Part I: General Theory. New York: Interscience Publishers.
Kolmogorov, Andrey. 1956. Foundations of the Theory of Probability. New York, US: Chelsea Publishing Company.
Lintner, John. 1965. “The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets.” The Review of Economics and Statistics 47 (1): 13–37. https://www.jstor.org/stable/1924119.
Markowitz, Harry. 1952. “Portfolio Selection.” The Journal of Finance 7 (1): 77–91.
Roy, A. D. 1952. “Safety First and the Holding of Assets.” Econometrica 20 (3): 431–49. https://www.jstor.org/stable/1907413.
Sharpe, William F. 1964. “Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk.” The Journal of Finance 19 (3): 425–42.
Treynor, Jack L. 1961. “Market Value, Time, and Risk.” SSRN Electronic Journal. http://www.ssrn.com/abstract=2600356.