*Market risk* is the risk of losses in
positions arising from movements in market prices. Value-at-risk (VaR)
is one of the main measures of financial risk. VaR is an estimate
of how much value a portfolio can lose in a given time period with
a given confidence level. For example, if the 1-day 95% VaR of a portfolio
is 10MM, then there is a 95% chance that the portfolio loses less
than 10MM the following day. In other words, only 5% of the time (or
about once in 20 days) the portfolio losses exceed 10MM.

For many portfolios, especially trading portfolios, VaR is computed daily. At the closing of the following day, the actual profits and losses for the portfolio are known and can be compared to the VaR estimated the day before. You can use this daily data to assess the performance of VaR models, which is the goal of VaR backtesting. The performance of VaR models can be measured in different ways. In practice, many different metrics and statistical tests are used to distinguish VaR models that perform poorly from those that perform well. As a best practice, use more than one criterion to backtest the performance of VaR models, because all tests have strengths and weaknesses.

Suppose that you have VaR limits and corresponding returns or
profits and losses for days *t* = 1,…,*N*.
Use VaR_{t} to denote the VaR estimate for day *t* (determined
on day *t* − 1). Use *R*_{t} to
denote the actual return or profit and loss observed on day *t*.
Profits and losses are expressed in monetary units and represent value
changes in a portfolio. The corresponding VaR limits are also given
in monetary units. Returns represent the change in portfolio value
as a proportion (or percentage) of its value on the previous day.
The corresponding VaR limits are also given as a proportion (or percentage).
The VaR limits must be produced from existing VaR models. Then, to
perform a VaR backtesting analysis, provide these limits and their
corresponding returns as data inputs to the VaR backtesting tools
in Risk Management
Toolbox™.
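As a rough illustration of how the two data inputs relate, the following Python sketch counts VaR failures from a series of returns and VaR limits. It is not the toolbox implementation. The function name `count_exceptions`, the sample numbers, and the sign convention (VaR limits are positive and day *t* is a failure when the return is below the negative of the limit) are assumptions made for this example.

```python
def count_exceptions(returns, var_limits):
    """Return (x, N): the number of VaR failures and the number of observations.

    Assumption: VaR limits are positive numbers, and day t counts as a
    failure when the loss exceeds the limit, that is, when R_t < -VaR_t.
    """
    if len(returns) != len(var_limits):
        raise ValueError("returns and var_limits must have the same length")
    failures = [r < -v for r, v in zip(returns, var_limits)]
    return sum(failures), len(failures)


# Made-up daily returns and VaR limits, both as proportions of portfolio value.
returns = [0.004, -0.021, 0.010, -0.003, -0.017]
var_limits = [0.015, 0.015, 0.016, 0.016, 0.016]
x, n_obs = count_exceptions(returns, var_limits)
```

The pair (`x`, `n_obs`) is the input that the frequency tests described in the following sections work with.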

The toolbox supports these VaR backtests:

Binomial test

Traffic light test

Kupiec’s tests

Christoffersen’s tests

Haas’s tests

The most straightforward test is to compare the observed number
of exceptions, *x*, to the expected number of exceptions.
From the properties of a binomial distribution, you can build a confidence
interval for the expected number of exceptions, using either exact probabilities
from the binomial distribution or a normal approximation. The `bin` function
uses a normal approximation. By computing the probability of observing *x* exceptions,
you can compute the probability of wrongly rejecting a good model
when *x* exceptions occur. This is the *p*-value
for the observed number of exceptions *x*. For a
given test confidence level, a straightforward accept-or-reject result
in this case is to fail the VaR model whenever *x* is
outside the test confidence interval for the expected number of exceptions.
“Outside the confidence interval” can mean too many
exceptions, or too few exceptions. Too few exceptions might be a sign
that the VaR model is too conservative.

The test statistic is

$${Z}_{bin}=\frac{x-Np}{\sqrt{Np(1-p)}}$$

where *x* is the number of failures, *N* is
the number of observations, and *p* = 1 – VaR level. The binomial test
statistic is approximately distributed as a standard normal variable.

For more information, see References for Jorion and `bin`.
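The binomial test statistic can be sketched in a few lines of Python. This is an illustrative reading of the formula above, not the code behind `bin`. The function name `binomial_test`, the default test confidence level of 95%, and the use of a two-sided p-value from the standard normal distribution are assumptions.

```python
import math


def binomial_test(n_obs, x, var_level, test_level=0.95):
    """Normal-approximation binomial test for x failures in n_obs days.

    Returns (z, p_value, reject). The two-sided p-value is an assumption
    that matches "too many or too few exceptions" in the text.
    """
    p = 1.0 - var_level
    z = (x - n_obs * p) / math.sqrt(n_obs * p * (1.0 - p))
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    reject = p_value < 1.0 - test_level
    return z, p_value, reject


# Example: 250 days of a 95% VaR, so 12.5 failures are expected.
z_many, p_many, reject_many = binomial_test(250, 20, 0.95)
z_near, p_near, reject_near = binomial_test(250, 12, 0.95)
```

With 20 failures the statistic lies well above zero and the sketch rejects the model at the 95% test level. With 12 failures it stays close to zero and the model is not rejected.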

A variation on the binomial test proposed by the Basel Committee
is the *traffic light test* or *three
zones test*. For a given number of exceptions *x*,
you can compute the probability of observing up to *x* exceptions.
That is, any number of exceptions from 0 to *x*,
or the cumulative probability up to *x*. The probability
is computed using a binomial distribution. The three zones are defined
as follows:

The “red” zone starts at the number of exceptions where this probability equals or exceeds 99.99%. It is unlikely that too many exceptions come from a correct VaR model.

The “yellow” zone covers the number of exceptions where the probability equals or exceeds 95% but is smaller than 99.99%. Even though there is a high number of violations, the violation count is not exceedingly high.

Everything below the yellow zone is “green.” Too few failures also fall in the green zone, so only too many failures lead to model rejection.

For more information, see References for Basel Committee on Banking Supervision and `tl`.
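The zone boundaries follow directly from the cumulative binomial probability. The Python sketch below applies the 95% and 99.99% thresholds quoted above. The function names are hypothetical, and the sketch leaves out anything the `tl` function may do beyond assigning a zone.

```python
import math


def binomial_cdf(x, n_obs, p):
    """Probability of observing at most x failures in n_obs days."""
    return sum(
        math.comb(n_obs, k) * p**k * (1.0 - p) ** (n_obs - k)
        for k in range(x + 1)
    )


def traffic_light_zone(n_obs, x, var_level):
    """Assign the green, yellow, or red zone to x failures."""
    prob = binomial_cdf(x, n_obs, 1.0 - var_level)
    if prob >= 0.9999:
        return "red"
    if prob >= 0.95:
        return "yellow"
    return "green"


# Example: 250 days of a 99% VaR.
zones = {x: traffic_light_zone(250, x, 0.99) for x in (4, 5, 9, 10)}
```

For 250 observations at the 99% VaR level, this gives green up to 4 failures, yellow from 5 to 9, and red from 10 onward.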

Kupiec (1995) introduced a variation on the binomial test called the proportion of
failures (POF) test. The POF test works with the binomial distribution approach. In
addition, it uses a likelihood ratio to test whether the probability of exceptions
is consistent with the probability *p* implied by the VaR
confidence level. If the data suggests that the probability of exceptions is
different from *p*, the VaR model is rejected. The POF test
statistic is

$$L{R}_{POF}=-2\mathrm{log}\left(\frac{{\left(1-p\right)}^{N-x}{p}^{x}}{{\left(1-\frac{x}{N}\right)}^{N-x}{\left(\frac{x}{N}\right)}^{x}}\right)$$

where *x* is the number of failures, *N* is the
number of observations, and *p* = 1 – VaR level.

This statistic is asymptotically distributed as a chi-square variable with one degree of freedom. The VaR model fails the test if this likelihood ratio exceeds a critical value. The critical value depends on the test confidence level.
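A minimal Python sketch of the POF statistic follows. It works with log-likelihoods to avoid underflow and treats 0·log(0) as 0 so that *x* = 0 and *x* = *N* do not fail. The function names and the default 95% test level are assumptions, and the p-value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable. This is not the implementation of `pof`.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def pof_test(n_obs, x, var_level, test_level=0.95):
    """Kupiec proportion-of-failures test. Returns (lr, p_value, reject)."""
    p = 1.0 - var_level
    p_hat = x / n_obs
    log_null = _xlogy(n_obs - x, 1.0 - p) + _xlogy(x, p)
    log_alt = _xlogy(n_obs - x, 1.0 - p_hat) + _xlogy(x, p_hat)
    lr = -2.0 * (log_null - log_alt)
    # Chi-square(1) upper tail: P(Z^2 > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value, p_value < 1.0 - test_level


# Example: 100 days of a 95% VaR with 5 failures (as expected) and with 20.
lr_ok, p_ok, reject_ok = pof_test(100, 5, 0.95)
lr_bad, p_bad, reject_bad = pof_test(100, 20, 0.95)
```

When the observed failure rate equals *p*, the ratio inside the logarithm is 1 and the statistic is zero. The further the observed rate moves from *p*, the larger the statistic becomes.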

Kupiec also proposed a second test called the time until first failure (TUFF). The
TUFF test looks at when the first failure occurred. If it happens too soon, the
test fails the VaR model. Checking only the first exception leaves much information
out; specifically, whatever happened after the first exception is ignored. The TBFI
test extends the TUFF approach to include all the failures. See `tbfi`.

The TUFF test is also based on a likelihood ratio, but the underlying distribution
is a geometric distribution. If *n* is the number of days until the
first failure, the test statistic is given by

$$L{R}_{TUFF}=-2\mathrm{log}\left(\frac{p{\left(1-p\right)}^{n-1}}{\left(\frac{1}{n}\right){\left(1-\frac{1}{n}\right)}^{n-1}}\right)$$

This statistic is asymptotically distributed as a chi-square variable with one
degree of freedom. For more information, see References for
Kupiec, `pof`, and `tuff`.
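The TUFF statistic depends only on *n* and *p*, so a sketch is short. The Python function below is an illustration under assumed names, not the code behind `tuff`. It evaluates the formula in logs and handles *n* = 1 separately, where the factor (1 − 1/*n*)^(*n* − 1) equals 1.

```python
import math


def tuff_statistic(n, var_level):
    """Likelihood ratio for the first failure occurring on day n."""
    if n < 1:
        raise ValueError("n must be a positive number of days")
    p = 1.0 - var_level
    log_null = math.log(p) + (n - 1) * math.log(1.0 - p)
    log_alt = -math.log(n)
    if n > 1:
        log_alt += (n - 1) * math.log(1.0 - 1.0 / n)
    return -2.0 * (log_null - log_alt)


# Example: a 99% VaR whose first failure occurs on day 1,
# and a 95% VaR whose first failure occurs on day 20 (= 1 / p).
lr_early = tuff_statistic(1, 0.99)
lr_expected = tuff_statistic(20, 0.95)
```

The statistic is zero when *n* = 1/*p* and grows as the first failure arrives much earlier or much later than that.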

Christoffersen (1998) proposed a test to measure whether the probability of observing an exception on a particular day depends on whether an exception occurred on the previous day. Unlike the unconditional probability of observing an exception, Christoffersen's test measures the dependency between consecutive days only. The test statistic for independence in Christoffersen’s interval forecast (IF) approach is given by

$$L{R}_{CCI}=-2\mathrm{log}\left(\frac{{\left(1-\pi \right)}^{n00+n10}{\pi}^{n01+n11}}{{\left(1-{\pi}_{0}\right)}^{n00}{\pi}_{0}^{n01}{\left(1-{\pi}_{1}\right)}^{n10}{\pi}_{1}^{n11}}\right)$$

where

*n*_{00} = Number of periods with no failures followed by a period with no failures.

*n*_{10} = Number of periods with failures followed by a period with no failures.

*n*_{01} = Number of periods with no failures followed by a period with failures.

*n*_{11} = Number of periods with failures followed by a period with failures.

and

*π*_{0} = Probability of having a failure on period *t*, given that no failure occurred on period *t* − 1 = *n*_{01} / (*n*_{00} + *n*_{01})

*π*_{1} = Probability of having a failure on period *t*, given that a failure occurred on period *t* − 1 = *n*_{11} / (*n*_{10} + *n*_{11})

*π* = Probability of having a failure on period *t* = (*n*_{01} + *n*_{11}) / (*n*_{00} + *n*_{01} + *n*_{10} + *n*_{11})

This statistic is asymptotically distributed as a chi-square with one degree of freedom. You can combine this statistic with the frequency POF test to get a conditional coverage (CC) mixed test:

$$L{R}_{CC}=L{R}_{POF}+L{R}_{CCI}$$

This test is asymptotically distributed as a chi-square variable with two degrees of freedom.

For more information, see References for Christoffersen, `cc`, and `cci`.
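The independence statistic can be sketched in Python by counting the four kinds of day-to-day transitions and plugging them into the formula above. The function names are hypothetical, the input is assumed to be a sequence of 0/1 failure indicators, and terms of the form 0·log(0) are taken as 0. This is an illustration, not the implementation of `cci`.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def transition_counts(failures):
    """Return (n00, n01, n10, n11) for a sequence of 0/1 failure flags."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, curr in zip(failures, failures[1:]):
        counts[(int(bool(prev)), int(bool(curr)))] += 1
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def cci_statistic(failures):
    """Christoffersen independence likelihood ratio."""
    if len(failures) < 2:
        raise ValueError("need at least two observations")
    n00, n01, n10, n11 = transition_counts(failures)
    pi0 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi1 = n11 / (n10 + n11) if n10 + n11 else 0.0
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    log_null = _xlogy(n00 + n10, 1.0 - pi) + _xlogy(n01 + n11, pi)
    log_alt = (
        _xlogy(n00, 1.0 - pi0) + _xlogy(n01, pi0)
        + _xlogy(n10, 1.0 - pi1) + _xlogy(n11, pi1)
    )
    return -2.0 * (log_null - log_alt)


# Example: five consecutive failures surrounded by calm periods.
clustered = [0] * 20 + [1] * 5 + [0] * 20
lr_clustered = cci_statistic(clustered)
```

Clustered failures make *π*_{1} much larger than *π*_{0}, which drives the statistic up. When the two conditional probabilities are equal, the statistic is zero.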

Haas (2001) extended Kupiec’s TUFF test to incorporate the time information between all the exceptions in the sample. Haas’s test applies the TUFF test to each exception in the sample and aggregates the time between failures (TBF) test statistic.

$$L{R}_{TBFI}=-2{\displaystyle {\sum}_{i=1}^{x}\mathrm{log}}\left(\frac{p{\left(1-p\right)}^{{n}_{i}-1}}{\left(\frac{1}{{n}_{i}}\right){\left(1-\frac{1}{{n}_{i}}\right)}^{{n}_{i}-1}}\right)$$

In this statistic, *p* = 1 – VaR level and
*n*_{i} is the number of
days between failures *i* − 1 and *i* (or until the
first exception for *i* = 1). This statistic is asymptotically
distributed as a chi-square variable with *x* degrees of freedom,
where *x* is the number of failures.

Like Christoffersen’s test, you can combine this test with the frequency POF test to get a TBF mixed test, sometimes called Haas’ mixed Kupiec’s test:

$$L{R}_{TBF}=L{R}_{POF}+L{R}_{TBFI}$$

This test is asymptotically distributed as a chi-square variable with
*x*+1 degrees of freedom. For more information, see References for
Haas, `tbf`, and `tbfi`.
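The TBFI statistic is the sum of one TUFF-style term per failure, which the following Python sketch makes explicit. The function names are hypothetical and the input is assumed to be a sequence of 0/1 failure indicators ordered by day. The sketch returns only the statistic. Computing a p-value would need the chi-square distribution with *x* degrees of freedom, which is left out here.

```python
import math


def times_between_failures(failures):
    """Return [n_1, ..., n_x]: days until the first failure, then the
    number of days between each pair of consecutive failures."""
    intervals = []
    last_failure_day = 0
    for day, failed in enumerate(failures, start=1):
        if failed:
            intervals.append(day - last_failure_day)
            last_failure_day = day
    return intervals


def tbfi_statistic(intervals, var_level):
    """Sum of TUFF-style likelihood ratios over all failure intervals."""
    p = 1.0 - var_level
    total = 0.0
    for n in intervals:
        log_null = math.log(p) + (n - 1) * math.log(1.0 - p)
        log_alt = -math.log(n)
        if n > 1:
            log_alt += (n - 1) * math.log(1.0 - 1.0 / n)
        total += -2.0 * (log_null - log_alt)
    return total


# Example: failures on days 3, 5, and 6 of a six-day sample.
intervals = times_between_failures([0, 0, 1, 0, 1, 1])
lr_tbfi = tbfi_statistic(intervals, 0.99)
```

Adding the POF statistic for the same sample to `lr_tbfi` gives the mixed TBF statistic defined above.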

[1] Basel Committee on Banking Supervision, *Supervisory framework
for the use of “backtesting” in conjunction with the internal
models approach to market risk capital requirements.* January 1996,
http://www.bis.org/publ/bcbs22.htm.

[2] Christoffersen, P. "Evaluating Interval Forecasts."
*International Economic Review.* Vol. 39, 1998, pp.
841–862.

[3] Cogneau, P. "Backtesting Value-at-Risk: how good is the
model?" *Intelligent Risk.* PRMIA, July 2015.

[4] Haas, M. "New Methods in Backtesting." Financial
Engineering, Research Center Caesar, Bonn, 2001.

[5] Jorion, P. *Financial Risk Manager Handbook.*
*6th Edition*, Wiley Finance, 2011.

[6] Kupiec, P. "Techniques for Verifying the Accuracy of Risk Management
Models." *Journal of Derivatives.* Vol. 3, 1995, pp.
73–84.

[7] McNeil, A., Frey, R., and Embrechts, P. *Quantitative Risk
Management.* Princeton University Press, 2005.

[8] Nieppola, O. "Backtesting Value-at-Risk Models." Master's Thesis, Helsinki School of Economics, 2009.

`bin` | `cc` | `cci` | `pof` | `runtests` | `summary` | `tbf` | `tbfi` | `tl` | `tuff` | `varbacktest`
