Time Series and Forecasting

   Compiled by M.Barros, D.Sc.

   December 12th, 2012

   Source: Wikipedia




Contents

Articles
   Time series
   Forecasting
   Stationary process
   Stochastic process
   Covariance
   Autocovariance
   Autocorrelation
   Cross-correlation
   White noise
   Random walk
   Brownian motion
   Wiener process
   Autoregressive model
   Moving average
   Autoregressive–moving-average model
   Fourier transform
   Spectral density
   Signal processing
   Autoregressive conditional heteroskedasticity
   Autoregressive integrated moving average
   Volatility (finance)
   Stable distribution
   Mathematical finance
   Stochastic differential equation
   Brownian model of financial markets
   Stochastic volatility
   Black–Scholes
   Black model
   Black–Derman–Toy model
   Cox–Ingersoll–Ross model
   Monte Carlo method

References
   Article Sources and Contributors
   Image Sources, Licenses and Contributors

Article Licenses
   License




AVAILABLE FREE OF CHARGE AT:
www.mbarros.com
http://mbarrosconsultoria.blogspot.com
http://mbarrosconsultoria2.blogspot.com



    Time series
In statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering and communications engineering, a time series is a sequence of data points, measured typically at successive time instants spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the Nile River at Aswan. Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. Time series are very frequently plotted via line charts.

[Figure: Time series: random data plus trend, with best-fit line and different smoothings]

    Time series data have a natural temporal ordering. This makes time series analysis distinct from other common data
    analysis problems, in which there is no natural ordering of the observations (e.g. explaining people's wages by
    reference to their respective education levels, where the individuals' data could be entered in any order). Time series
    analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations
    (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic
    model for a time series will generally reflect the fact that observations close together in time will be more closely
    related than observations further apart. In addition, time series models will often make use of the natural one-way
    ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather
than from future values (see time reversibility).

    Methods for time series analyses may be divided into two classes: frequency-domain methods and time-domain
methods. The former include spectral analysis and, more recently, wavelet analysis; the latter include auto-correlation and
    cross-correlation analysis.
    Additionally time series analysis techniques may be divided into parametric and non-parametric methods. The
parametric approaches assume that the underlying stationary stochastic process has a certain structure which can be
    described using a small number of parameters (for example, using an autoregressive or moving average model). In
    these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By
    contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without
    assuming that the process has any particular structure.
    Additionally methods of time series analysis may be divided into linear and non-linear, univariate and multivariate.
    Time series analysis can be applied to:
    • real-valued, continuous data
    • discrete numeric data
• discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language[1]).
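To make the definition above concrete, here is a minimal sketch of a uniformly spaced, real-valued time series; it assumes the NumPy and pandas libraries are available, and the monthly frequency and simulated values are purely illustrative:

```python
import numpy as np
import pandas as pd

# A time series: data points indexed by successive, uniformly spaced time instants.
index = pd.date_range(start="2010-01-31", periods=36, freq="M")   # 36 monthly instants
values = np.random.default_rng(0).normal(size=36).cumsum()        # illustrative random data
series = pd.Series(values, index=index)

print(series.head())       # first few (timestamp, value) pairs
print(series.index.freq)   # the uniform spacing (monthly)
```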


    Analysis
    There are several types of data analysis available for time series which are appropriate for different purposes.
In the context of statistics, econometrics, quantitative finance, seismology, meteorology and geophysics, the primary goal of time series analysis is forecasting; in the context of signal processing, control engineering and communication engineering it is used for signal detection and estimation; while in the context of data mining, pattern recognition and machine learning, time series analysis can be used for clustering, classification, query by content and anomaly detection, as well as forecasting.


    Exploratory analysis
The clearest way to examine a regular time series manually is with a line chart such as the one shown for tuberculosis in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as ±10%, with 'surges' in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic.

[Figure: Tuberculosis incidence, US, 1953–2009]

Other techniques include:

    • Autocorrelation analysis to examine serial dependence
    • Spectral analysis to examine cyclic behaviour which need not be related to seasonality. For example, sun spot
      activity varies over 11 year cycles.[2][3] Other common examples include celestial phenomena, weather patterns,
      neural activity, commodity prices, and economic activity.
• Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity: see decomposition of time series (a brief sketch of autocorrelation analysis and decomposition follows this list)
    • Simple properties of marginal distributions
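As a rough illustration of the autocorrelation and decomposition techniques listed above, the following sketch assumes the statsmodels library is available; the acf and seasonal_decompose helpers, the seasonal period of 12 and the simulated data are illustrative choices, not part of the original text:

```python
import numpy as np
from statsmodels.tsa.stattools import acf
from statsmodels.tsa.seasonal import seasonal_decompose

# Illustrative monthly series: linear trend + annual seasonality + noise.
rng = np.random.default_rng(1)
t = np.arange(120)
series = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.3, size=t.size)

# Autocorrelation analysis: serial dependence at lags 0..24.
autocorr = acf(series, nlags=24)

# Decomposition into trend, seasonal and residual components (period assumed to be 12).
components = seasonal_decompose(series, model="additive", period=12)

print(autocorr[:5])
print(components.seasonal[:12])   # one full estimated seasonal cycle
```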


    Prediction and forecasting
    • Fully formed statistical models for stochastic simulation purposes, so as to generate alternative versions of the
      time series, representing what might happen over non-specific time-periods in the future
    • Simple or fully formed statistical models to describe the likely outcome of the time series in the immediate future,
      given knowledge of the most recent outcomes (forecasting).
• Forecasting on time series is usually done using automated statistical software packages and programming languages, such as R, S, SAS, SPSS, Minitab and many others.


    Classification
• Assigning a time series pattern to a specific category, for example identifying a word based on a series of hand movements in sign language.
See the main article: Statistical classification


    Regression analysis
• Estimating the future value of a signal based on its previous behavior, e.g. predicting the price of AAPL stock based on its previous price movements for that hour, day or month, or predicting the position of the Apollo 11 spacecraft at a certain future moment based on its current trajectory (i.e. the time series of its previous locations).[4]
• Regression analysis is usually based on statistical interpretation of time series properties in the time domain, pioneered by the statisticians George Box and Gwilym Jenkins in the 1950s: see Box–Jenkins
See the main article: Regression analysis


    Signal Estimation
• This approach is based on harmonic analysis and filtering of signals in the frequency domain using the Fourier transform and spectral density estimation, the development of which was significantly accelerated during World War II by the mathematician Norbert Wiener, the electrical engineers Rudolf E. Kálmán, Dennis Gabor and others, for filtering signals from noise and predicting signal values at a certain point in time; see Kalman filter, estimation theory and digital signal processing
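To illustrate the filtering idea in the simplest possible setting (this is a generic textbook-style sketch, not the historical derivations), here is a one-dimensional Kalman filter in plain NumPy; the constant-level model and the noise variances are assumptions for the example:

```python
import numpy as np

def kalman_1d(measurements, q=1e-4, r=0.01):
    """Scalar Kalman filter for a constant level observed in noise.

    q: assumed process-noise variance, r: assumed measurement-noise variance.
    """
    x_est, p_est = 0.0, 1.0              # initial state estimate and its variance
    estimates = []
    for z in measurements:
        x_pred, p_pred = x_est, p_est + q        # predict: the level carries over
        k = p_pred / (p_pred + r)                # Kalman gain
        x_est = x_pred + k * (z - x_pred)        # update with the new measurement
        p_est = (1.0 - k) * p_pred
        estimates.append(x_est)
    return np.array(estimates)

rng = np.random.default_rng(2)
noisy = 1.0 + rng.normal(scale=0.1, size=50)     # noisy observations of a true level of 1.0
print(kalman_1d(noisy)[-5:])                     # estimates settle near the true level
```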


    Models
    Models for time series data can have many forms and represent different stochastic processes. When modeling
    variations in the level of a process, three broad classes of practical importance are the autoregressive (AR) models,
    the integrated (I) models, and the moving average (MA) models. These three classes depend linearly[5] on previous
    data points. Combinations of these ideas produce autoregressive moving average (ARMA) and autoregressive
    integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA)
    model generalizes the former three. Extensions of these classes to deal with vector-valued data are available under
    the heading of multivariate time-series models and sometimes the preceding acronyms are extended by including an
    initial "V" for "vector". An additional set of extensions of these models is available for use where the observed
    time-series is driven by some "forcing" time-series (which may not have a causal effect on the observed series): the
    distinction from the multivariate case is that the forcing series may be deterministic or under the experimenter's
    control. For these models, the acronyms are extended with a final "X" for "exogenous".
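As a minimal sketch of fitting one of these linear models in practice, the example below uses the ARIMA class from recent versions of statsmodels; the package, the order (1, 1, 1) and the simulated random-walk data are assumptions made for illustration:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Illustrative integrated series: a random walk with drift.
rng = np.random.default_rng(3)
y = np.cumsum(0.1 + rng.normal(size=200))

# Fit an ARIMA(p=1, d=1, q=1) model and forecast the next 10 observations.
model = ARIMA(y, order=(1, 1, 1))
fitted = model.fit()

print(fitted.params)              # estimated AR, MA and noise-variance parameters
print(fitted.forecast(steps=10))  # out-of-sample forecasts
```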
    Non-linear dependence of the level of a series on previous data points is of interest, partly because of the possibility
    of producing a chaotic time series. However, more importantly, empirical investigations can indicate the advantage
    of using predictions derived from non-linear models, over those from linear models, as for example in nonlinear
    autoregressive exogenous models.
    Among other types of non-linear time series models, there are models to represent the changes of variance along
    time (heteroskedasticity). These models represent autoregressive conditional heteroskedasticity (ARCH) and the
collection comprises a wide variety of representations (GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc.).
    Here changes in variability are related to, or predicted by, recent past values of the observed series. This is in
    contrast to other possible representations of locally varying variability, where the variability might be modelled as
    being driven by a separate time-varying process, as in a doubly stochastic model.
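To make the ARCH/GARCH idea concrete, the following NumPy sketch simulates a GARCH(1,1) process, in which today's conditional variance is driven by yesterday's squared shock and yesterday's variance; the parameter values are illustrative and chosen so that α + β < 1:

```python
import numpy as np

def simulate_garch11(n, omega=0.05, alpha=0.1, beta=0.85, seed=4):
    """Simulate eps_t = sigma_t * z_t with sigma2_t = omega + alpha*eps_{t-1}^2 + beta*sigma2_{t-1}."""
    rng = np.random.default_rng(seed)
    eps = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = omega / (1.0 - alpha - beta)      # start at the unconditional variance
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2

returns, cond_var = simulate_garch11(1000)
print(returns.std(), cond_var.mean())   # overall spread vs. average conditional variance
```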
    In recent work on model-free analyses, wavelet transform based methods (for example locally stationary wavelets
    and wavelet decomposed neural networks) have gained favor. Multiscale (often referred to as multiresolution)
    techniques decompose a given time series, attempting to illustrate time dependence at multiple scales. See also
    Markov switching multifractal (MSMF) techniques for modeling volatility evolution.


    Notation
    A number of different notations are in use for time-series analysis. A common notation specifying a time series X
    that is indexed by the natural numbers is written
           X = {X1, X2, ...}.
    Another common notation is
           Y = {Yt: t ∈ T},
    where T is the index set.


    Conditions
    There are two sets of conditions under which much of the theory is built:
    • Stationary process
    • Ergodic process
    However, ideas of stationarity must be expanded to consider two important ideas: strict stationarity and second-order
    stationarity. Both models and applications can be developed under each of these conditions, although the models in
    the latter case might be considered as only partly specified.
    In addition, time-series analysis can be applied where the series are seasonally stationary or non-stationary.
    Situations where the amplitudes of frequency components change with time can be dealt with in time-frequency
    analysis which makes use of a time–frequency representation of a time-series or signal.[6]


    Models
The general representation of an autoregressive model, well known as AR(p), is

$$ X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t $$

where the term εt is the source of randomness and is called white noise. It is assumed to have the following characteristics:
• $E[\varepsilon_t] = 0$,
• $E[\varepsilon_t^2] = \sigma^2$,
• $E[\varepsilon_t \varepsilon_s] = 0$ for all $s \neq t$.
With these assumptions, the process is specified up to second-order moments and, subject to conditions on the coefficients, may be second-order stationary.
    If the noise also has a normal distribution, it is called normal or Gaussian white noise. In this case, the AR process
    may be strictly stationary, again subject to conditions on the coefficients.
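The following NumPy sketch simulates a stationary AR(2) process with Gaussian white noise, matching the AR(p) representation above; the coefficients are illustrative and chosen inside the stationarity region:

```python
import numpy as np

def simulate_ar(phi, n, c=0.0, sigma=1.0, seed=5):
    """Simulate X_t = c + sum_i phi_i * X_{t-i} + eps_t with Gaussian white noise eps_t."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    x = np.zeros(n + p)                        # zero-padded start-up values
    for t in range(p, n + p):
        x[t] = c + np.dot(phi, x[t - p:t][::-1]) + rng.normal(scale=sigma)
    return x[p:]

series = simulate_ar([0.6, -0.3], n=5000)      # AR(2) with phi1 = 0.6, phi2 = -0.3
print(series.mean(), series.var())             # roughly constant mean and variance
```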
    Tools for investigating time-series data include:
    • Consideration of the autocorrelation function and the spectral density function (also cross-correlation functions
      and cross-spectral density functions)
    • Scaled cross- and auto-correlation functions[7]
    • Performing a Fourier transform to investigate the series in the frequency domain
    • Use of a filter to remove unwanted noise
    • Principal components analysis (or empirical orthogonal function analysis)
    • Singular spectrum analysis
    • "Structural" models:
       • General State Space Models
       • Unobserved Components Models
    • Machine Learning
       • Artificial neural networks
       • Support Vector Machine
       • Fuzzy Logic
    • Hidden Markov model
    • Control chart
       • Shewhart individuals control chart
       • CUSUM chart
       • EWMA chart
    • Detrended fluctuation analysis
    • Dynamic time warping
    • Dynamic Bayesian network
    • Time-frequency analysis techniques:
       •   Fast Fourier Transform
       •   Continuous wavelet transform
       •   Short-time Fourier transform
       •   Chirplet transform
       •   Fractional Fourier transform
    • Chaotic analysis
       •   Correlation dimension
       •   Recurrence plots
       •   Recurrence quantification analysis
       •   Lyapunov exponents
       •   Entropy encoding


    Measures
    Time series metrics or features that can be used for time series classification or regression analysis[8]:
    • Univariate linear measures
       •   Moment (mathematics)
       •   Spectral band power
       •   Spectral edge frequency
       •   Accumulated Energy (signal processing)
       •   Characteristics of the autocorrelation function
       •   Hjorth parameters
       •   FFT parameters
       •   Autoregressive model parameters
    • Univariate non-linear measures
       • Measures based on the correlation sum
       • Correlation dimension
       • Correlation integral
       • Correlation density
       • Correlation entropy
       •   Approximate Entropy[9]
       •   Sample Entropy
       •   Fourier entropy
       •   Wavelet entropy
       •   Rényi entropy
       •   Higher-order methods
       •   Marginal predictability
       •   Dynamical similarity index
       •   State space dissimilarity measures
       •   Lyapunov exponent
       •   Permutation methods
       •   Local flow
    • Other univariate measures
       •   Algorithmic complexity
       •   Kolmogorov complexity estimates
       •   Hidden Markov Model states
       •   Surrogate time series and surrogate correction
       • Loss of recurrence (degree of non-stationarity)
    • Bivariate linear measures
       • Maximum linear cross-correlation
       • Linear Coherence (signal processing)
    • Bivariate non-linear measures
       • Non-linear interdependence
       • Dynamical Entrainment (physics)
       • Measures for Phase synchronization
    • Similarity measures[10]:
       •   Dynamic Time Warping
       •   Hidden Markov Models
       •   Edit distance
       •   Total correlation
       •   Newey–West estimator
       •   Prais-Winsten transformation
       •   Data as Vectors in a Metrizable Space
         • Minkowski distance
         • Mahalanobis distance
       • Data as Time Series with Envelopes
         • Global Standard Deviation
         • Local Standard Deviation
         • Windowed Standard Deviation
       • Data Interpreted as Stochastic Series
         • Pearson product-moment correlation coefficient
         • Spearman's rank correlation coefficient
       • Data Interpreted as a Probability Distribution Function
           • Kolmogorov-Smirnov test
           • Cramér-von Mises criterion


    References
[1] Lin, Jessica; Keogh, Eamonn; Lonardi, Stefano; Chiu, Bill. A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003. URL: http://doi.acm.org/10.1145/882082.882086
[2] Bloomfield, P. (1976). Fourier analysis of time series: An introduction. New York: Wiley.
[3] Shumway, R. H. (1988). Applied statistical time series analysis. Englewood Cliffs, NJ: Prentice Hall.
[4] Lawson, Charles L.; Hanson, Richard J. (1987). Solving Least Squares Problems. Society for Industrial and Applied Mathematics.
[5] Gershenfeld, N. (1999). The nature of mathematical modeling. pp. 205–208.
[6] Boashash, B. (ed.) (2003). Time-Frequency Signal Analysis and Processing: A Comprehensive Reference. Elsevier Science, Oxford. ISBN 0-08-044335-4.
[7] Nikolić, D.; Muresan, R. C.; Feng, W.; Singer, W. (2012). Scaled correlation analysis: a better way to compute a cross-correlogram. European Journal of Neuroscience, pp. 1–21, doi:10.1111/j.1460-9568.2011.07987.x. http://www.danko-nikolic.com/wp-content/uploads/2012/03/Scaled-correlation-analysis.pdf
[8] Mormann, Florian; Andrzejak, Ralph G.; Elger, Christian E.; Lehnertz, Klaus. Seizure prediction: the long and winding road. Brain, 2007, 130 (2): 314–333. URL: http://brain.oxfordjournals.org/content/130/2/314.abstract
[9] Land, Bruce; Elias, Damian. Measuring the "Complexity" of a time series. URL: http://www.nbb.cornell.edu/neurobio/land/PROJECTS/Complexity/
[10] Ropella, G. E. P.; Nag, D. A.; Hunt, C. A. "Similarity measures for automated comparison of in silico and in vitro experimental results." Engineering in Medicine and Biology Society, 2003. Proceedings of the 25th Annual International Conference of the IEEE, vol. 3, pp. 2933–2936, 17–21 Sept. 2003. doi:10.1109/IEMBS.2003.1280532. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1280532&isnumber=28615



    Further reading
    • Bloomfield, P. (1976). Fourier analysis of time series: An introduction. New York: Wiley.
    • Box, George; Jenkins, Gwilym (1976), Time series analysis: forecasting and control, rev. ed., Oakland,
      California: Holden-Day
    • Brillinger, D. R. (1975). Time series: Data analysis and theory. New York: Holt, Rinehart. & Winston.
    • Brigham, E. O. (1974). The fast Fourier transform. Englewood Cliffs, NJ: Prentice-Hall.
    • Elliott, D. F., & Rao, K. R. (1982). Fast transforms: Algorithms, analyses, applications. New York: Academic
      Press.
    • Gershenfeld, Neil (2000), The nature of mathematical modeling, Cambridge: Cambridge Univ. Press,
      ISBN 978-0-521-57095-4, OCLC 174825352
    • Hamilton, James (1994), Time Series Analysis, Princeton: Princeton Univ. Press, ISBN 0-691-04289-6
    • Jenkins, G. M., & Watts, D. G. (1968). Spectral analysis and its applications. San Francisco: Holden-Day.
    • Priestley, M. B. (1981). Spectral Analysis and Time Series. London: Academic Press. ISBN 978-0-12-564901-8
    • Shasha, D. (2004), High Performance Discovery in Time Series, Berlin: Springer, ISBN 0-387-00857-8
    • Shumway, R. H. (1988). Applied statistical time series analysis. Englewood Cliffs, NJ: Prentice Hall.
    • Wiener, N.(1964). Extrapolation, Interpolation, and Smoothing of Stationary Time Series.The MIT Press.
    • Wei, W. W. (1989). Time series analysis: Univariate and multivariate methods. New York: Addison-Wesley.
    • Weigend, A. S., and N. A. Gershenfeld (Eds.) (1994) Time Series Prediction: Forecasting the Future and
      Understanding the Past. Proceedings of the NATO Advanced Research Workshop on Comparative Time Series
      Analysis (Santa Fe, May 1992) MA: Addison-Wesley.
    • Durbin J., and Koopman S.J. (2001) Time Series Analysis by State Space Methods. Oxford University Press.


    External links
    • A First Course on Time Series Analysis (http://statistik.mathematik.uni-wuerzburg.de/timeseries/) - an open
      source book on time series analysis with SAS
    • Introduction to Time series Analysis (Engineering Statistics Handbook) (http://www.itl.nist.gov/div898/
      handbook/pmc/section4/pmc4.htm) - a practical guide to Time series analysis
    • MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases (http://www.jstatsoft.org/
      v33/i05/paper)



    Forecasting
    Forecasting is the process of making statements about events whose actual outcomes (typically) have not yet been
    observed. A commonplace example might be estimation of some variable of interest at some specified future date.
    Prediction is a similar, but more general term. Both might refer to formal statistical methods employing time series,
    cross-sectional or longitudinal data, or alternatively to less formal judgemental methods. Usage can differ between
    areas of application: for example, in hydrology, the terms "forecast" and "forecasting" are sometimes reserved for
    estimates of values at certain specific future times, while the term "prediction" is used for more general estimates,
    such as the number of times floods will occur over a long period.
    Risk and uncertainty are central to forecasting and prediction; it is generally considered good practice to indicate the
    degree of uncertainty attaching to forecasts. In any case, the data must be up to date in order for the forecast to be as
    accurate as possible.[1]
    Although quantitative analysis can be very precise, it is not always appropriate. Some experts in the field of
    forecasting have advised against the use of mean square error to compare forecasting methods.[2]


    Categories of forecasting methods

    Qualitative vs. quantitative methods
Qualitative forecasting techniques are subjective, based on the opinion and judgment of consumers and experts; they are appropriate when past data are not available. They are usually applied to intermediate- to long-range decisions. Examples of qualitative forecasting methods are: informed opinion and judgment, the Delphi method, market research, and historical life-cycle analogy.
Quantitative forecasting models are used to estimate future demand as a function of past data; they are appropriate when past data are available. These methods are usually applied to short- to intermediate-range decisions. Examples of quantitative forecasting methods are: last period demand, simple and weighted N-period moving averages, simple exponential smoothing, and multiplicative seasonal indexes.


    Naïve approach
    Naïve forecasts are the most cost-effective and efficient objective forecasting model, and provide a benchmark
    against which more sophisticated models can be compared. For stable time series data, this approach says that the
    forecast for any period equals the previous period's actual value.
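A minimal sketch of the naïve benchmark, together with a simple N-period moving average from the quantitative methods above; the window length and the demand figures are illustrative:

```python
import numpy as np

demand = np.array([112.0, 118, 132, 129, 121, 135, 148, 148, 136, 119])

# Naïve forecast: the forecast for any period equals the previous period's actual value.
naive_forecast = demand[:-1]     # forecasts for periods 2..N
actual = demand[1:]
print(actual - naive_forecast)   # naïve forecast errors

# 3-period simple moving average, used as the forecast for the following period.
window = 3
moving_avg = np.convolve(demand, np.ones(window) / window, mode="valid")
print(moving_avg[:-1])           # forecasts for periods window+1 .. N
```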


    Reference class forecasting
    Reference class forecasting was developed by Oxford professor Bent Flyvbjerg to eliminate or reduce bias in
    forecasting by focusing on distributional information about past, similar outcomes to that being forecasted.[3] Daniel
    Kahneman, Nobel Prize winner in economics, calls Flyvbjerg's counsel to use reference class forecasting to de-bias
    forecasts, "the single most important piece of advice regarding how to increase accuracy in forecasting.”[4]


    Time series methods
    Time series methods use historical data as the basis of estimating future outcomes.
    • Moving average
    • Weighted moving average
    • Kalman filtering
    • Exponential smoothing
    • Autoregressive moving average (ARMA)
    • Autoregressive integrated moving average (ARIMA)
           e.g. Box-Jenkins
    •   Extrapolation
    •   Linear prediction
    •   Trend estimation
    •   Growth curve


    Causal / econometric forecasting methods
    Some forecasting methods use the assumption that it is possible to identify the underlying factors that might
    influence the variable that is being forecast. For example, including information about weather conditions might
    improve the ability of a model to predict umbrella sales. This is a model of seasonality which shows a regular pattern
    of up and down fluctuations. In addition to weather, seasonality can also be due to holidays and customs such as
    predicting that sales in college football apparel will be higher during football season as opposed to the off season.[5]
Causal forecasting methods are also subject to the discretion of the forecaster. There are several informal methods
    which do not have strict algorithms, but rather modest and unstructured guidance. One can forecast based on, for
    example, linear relationships. If one variable is linearly related to the other for a long enough period of time, it may
    be beneficial to predict such a relationship in the future. This is quite different from the aforementioned model of
    seasonality whose graph would more closely resemble a sine or cosine wave. The most important factor when
    performing this operation is using concrete and substantiated data. Forecasting off of another forecast produces
    inconclusive and possibly erroneous results.
    Such methods include:
    • Regression analysis includes a large group of methods that can be used to predict future values of a variable using
      information about other variables. These methods include both parametric (linear or non-linear) and
      non-parametric techniques.
    • Autoregressive moving average with exogenous inputs (ARMAX)[6]


    Judgmental methods
    Judgmental forecasting methods incorporate intuitive judgements, opinions and subjective probability estimates.
    •   Composite forecasts
    •   Delphi method
    •   Forecast by analogy
    •   Scenario building
    •   Statistical surveys
    •   Technology forecasting


    Artificial intelligence methods
    • Artificial neural networks
    • Group method of data handling
    • Support vector machines
Often these are done today by specialized programs loosely labeled as:
    • Data mining
    • Machine Learning
    • Pattern Recognition


    Other methods
    • Simulation
    • Prediction market
    • Probabilistic forecasting and Ensemble forecasting


    Forecasting accuracy
The forecast error is the difference between the actual value and the forecast value for the corresponding period:

$$ E_t = Y_t - F_t $$

where E is the forecast error at period t, Y is the actual value at period t, and F is the forecast for period t.
Measures of aggregate error:

Mean absolute error (MAE): $ \mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N} |E_t| $

Mean absolute percentage error (MAPE): $ \mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N} \left|\frac{E_t}{Y_t}\right| $

Mean absolute deviation (MAD): $ \mathrm{MAD} = \frac{1}{N}\sum_{t=1}^{N} |E_t| $

Percent mean absolute deviation (PMAD): $ \mathrm{PMAD} = \frac{\sum_{t=1}^{N} |E_t|}{\sum_{t=1}^{N} |Y_t|} $

Mean squared error (MSE): $ \mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N} E_t^2 $

Root mean squared error (RMSE): $ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} E_t^2} $

Forecast skill (SS): $ SS = 1 - \frac{\mathrm{MSE}_{\text{forecast}}}{\mathrm{MSE}_{\text{reference}}} $

Average of errors (E): $ \bar{E} = \frac{1}{N}\sum_{t=1}^{N} E_t $
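A short NumPy sketch computing several of the measures above from illustrative actual and forecast values:

```python
import numpy as np

actual = np.array([100.0, 105, 98, 110, 120])
forecast = np.array([102.0, 100, 101, 108, 115])

error = actual - forecast                               # E_t = Y_t - F_t
mae = np.mean(np.abs(error))                            # mean absolute error (MAE / MAD)
mape = np.mean(np.abs(error / actual))                  # mean absolute percentage error
pmad = np.sum(np.abs(error)) / np.sum(np.abs(actual))   # percent mean absolute deviation
mse = np.mean(error ** 2)                               # mean squared error
rmse = np.sqrt(mse)                                     # root mean squared error

print(mae, mape, pmad, mse, rmse)
```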


    Business forecasters and practitioners sometimes use different terminology in the industry. They refer to the PMAD
    as the MAPE, although they compute this as a volume weighted MAPE. For more information see Calculating
    demand forecast accuracy.
    Reference class forecasting was developed to increase forecasting accuracy by framing the forecasting problem so as
    to take into account available distributional information.[7] Daniel Kahneman, winner of the Nobel Prize in
    economics, calls the use of reference class forecasting "the single most important piece of advice regarding how to
increase accuracy in forecasting.”[8] Contrary to common belief, forecasting accuracy cannot be increased by adding experts in the subject area relevant to the phenomenon to be forecast.[9]
    See also
    •   Calculating demand forecast accuracy
    •   Consensus forecasts
    •   Forecast error
    •   Predictability
    •   Prediction intervals, similar to confidence intervals
    •   Reference class forecasting


    Applications of forecasting
    The process of climate change and increasing energy prices has led to the usage of Egain Forecasting of buildings.
    The method uses forecasting to reduce the energy needed to heat the building, thus reducing the emission of
greenhouse gases. Forecasting is used in the practice of Customer Demand Planning in everyday business forecasting for manufacturing companies. Forecasting has also been used to predict the development of conflict
    situations. Experts in forecasting perform research that use empirical results to gauge the effectiveness of certain
    forecasting models.[10] Research has shown that there is little difference between the accuracy of forecasts performed
    by experts knowledgeable of the conflict situation of interest and that performed by individuals who knew much
    less.[11]
    Similarly, experts in some studies argue that role thinking does not contribute to the accuracy of the forecast.[12] The
    discipline of demand planning, also sometimes referred to as supply chain forecasting, embraces both statistical
    forecasting and a consensus process. An important, albeit often ignored aspect of forecasting, is the relationship it
    holds with planning. Forecasting can be described as predicting what the future will look like, whereas planning
    predicts what the future should look like.[13][14] There is no single right forecasting method to use. Selection of a
method should be based on your objectives and your conditions (data etc.).[15] A good place to find a method is by visiting a selection tree. An example of a selection tree can be found here.[16] Forecasting has application in many
    situations:
    • Supply chain management - Forecasting can be used in Supply Chain Management to make sure that the right
      product is at the right place at the right time. Accurate forecasting will help retailers reduce excess inventory and
      therefore increase profit margin. Studies have shown that extrapolations are the least accurate, while company
      earnings forecasts are the most reliable.[17] Accurate forecasting will also help them meet consumer demand.
    • Economic forecasting
    • Earthquake prediction
    • Egain Forecasting
    • Land use forecasting
    • Player and team performance in sports
    • Political Forecasting
    • Product forecasting
    • Sales Forecasting
    • Technology forecasting
    • Telecommunications forecasting
    • Transport planning and Transportation forecasting
    • Weather forecasting, Flood forecasting and Meteorology


    Limitations
As proposed by Edward Lorenz in 1963, long-range weather forecasts, those made at a range of two weeks or more, cannot definitively predict the state of the atmosphere, owing to the chaotic nature of the fluid dynamics equations involved. Extremely small errors in the initial input, such as temperatures and winds, within numerical models double every five days.[18]


    References
[1] Scott Armstrong, Fred Collopy, Andreas Graefe and Kesten C. Green (2010, last updated). "Answers to Frequently Asked Questions". http://qbox.wharton.upenn.edu/documents/mktg/research/FAQ.pdf
[2] J. Scott Armstrong and Fred Collopy (1992). "Error Measures For Generalizing About Forecasting Methods: Empirical Comparisons". International Journal of Forecasting 8: 69–80. http://marketing.wharton.upenn.edu/ideas/pdf/armstrong2/armstrong-errormeasures-empirical.pdf
[3] Flyvbjerg, B. (2008). "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice". European Planning Studies 16 (1): 3–21. http://www.sbs.ox.ac.uk/centres/bt/Documents/Curbing Optimism Bias and Strategic Misrepresentation.pdf
[4] Daniel Kahneman (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux, p. 251.
[5] Nahmias, Steven (2009). Production and Operations Analysis.
[6] Ellis, Kimberly (2008). Production Planning and Inventory Control. Virginia Tech. McGraw Hill. ISBN 978-0-390-87106-0.
[7] Flyvbjerg, B. (2008). "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice". European Planning Studies 16 (1): 3–21. http://www.sbs.ox.ac.uk/centres/bt/Documents/Curbing Optimism Bias and Strategic Misrepresentation.pdf
[8] Daniel Kahneman (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux, p. 251.
[9] J. Scott Armstrong (1980). "The Seer-Sucker Theory: The Value of Experts in Forecasting". Technology Review: 16–24. http://www.forecastingprinciples.com/paperpdf/seersucker.pdf
[10] J. Scott Armstrong, Kesten C. Green and Andreas Graefe (2010). "Answers to Frequently Asked Questions". http://qbox.wharton.upenn.edu/documents/mktg/research/FAQ.pdf
[11] Kesten C. Greene and J. Scott Armstrong (2007). "The Ombudsman: Value of Expertise for Forecasting Decisions in Conflicts". Interfaces (INFORMS) 0: 1–12. http://marketing.wharton.upenn.edu/documents/research/Value of expertise.pdf
[12] Kesten C. Green and J. Scott Armstrong (1975). "Role thinking: Standing in other people’s shoes to forecast decisions in conflicts". 39: 111–116. http://www.forecastingprinciples.com/paperpdf/Escalation Bias.pdf
[13] "FAQ". Forecastingprinciples.com. 1998-02-14. Retrieved 2012-08-28. http://www.forecastingprinciples.com/index.php?option=com_content&task=view&id=3&Itemid=3
[14] Kesten C. Greene and J. Scott Armstrong. "Structured analogies for forecasting" (PDF). qbox.wharton.upenn.edu. http://www.qbox.wharton.upenn.edu/documents/mktg/research/INTFOR3581 - Publication%
[15] "FAQ". Forecastingprinciples.com. 1998-02-14. Retrieved 2012-08-28. http://www.forecastingprinciples.com/index.php?option=com_content&task=view&id=3&Itemid=3#D._Choosing_the_best_method
[16] "Selection Tree". Forecastingprinciples.com. 1998-02-14. Retrieved 2012-08-28. http://www.forecastingprinciples.com/index.php?option=com_content&task=view&id=17&Itemid=17
[17] J. Scott Armstrong (1983). "Relative Accuracy of Judgmental and Extrapolative Methods in Forecasting Annual Earnings". Journal of Forecasting 2: 437–447. http://www.forecastingprinciples.com/paperpdf/Monetary Incentives.pdf
[18] Cox, John D. (2002). Storm Watchers. John Wiley & Sons, Inc. pp. 222–224. ISBN 0-471-38108-X.

    • Armstrong, J. Scott (ed.) (2001) (in English). Principles of forecasting: a handbook for researchers and
      practitioners. Norwell, Massachusetts: Kluwer Academic Publishers. ISBN 0-7923-7930-6.
    • Flyvbjerg, Bent, 2008, "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class
      Forecasting in Practice," European Planning Studies, vol. 16, no. 1, January, pp. 3-21. (http://www.sbs.ox.ac.
      uk/centres/bt/Documents/Curbing Optimism Bias and Strategic Misrepresentation.pdf)
    • Ellis, Kimberly (2010) (in English). Production Planning and Inventory Control. McGraw-Hill.
      ISBN 0-412-03471-9.
    • Geisser, Seymour (1 June 1993) (in English). Predictive Inference: An Introduction. Chapman & Hall, CRC
      Press. ISBN 0-390-87106-0.
    • Gilchrist, Warren (1976) (in English). Statistical Forecasting. London: John Wiley & Sons. ISBN 0-471-99403-0.
    • Hyndman, R.J., Koehler, A.B (2005) "Another look at measures of forecast accuracy" (http://www.
      robjhyndman.com/papers/mase.pdf), Monash University note.
    • Makridakis, Spyros; Wheelwright, Steven; Hyndman, Rob J. (1998) (in English). Forecasting: methods and
      applications (http://www.robjhyndman.com/forecasting/). New York: John Wiley & Sons.
      ISBN 0-471-53233-9.
    • Kress, George J.; Snyder, John (30 May 1994) (in English). Forecasting and market analysis techniques: a
      practical approach. Westport, Connecticut, London: Quorum Books. ISBN 0-89930-835-X.
    • Rescher, Nicholas (1998) (in English). Predicting the future: An introduction to the theory of forecasting. State
      University of New York Press. ISBN 0-7914-3553-9.
    • Taesler, R. (1990/91) Climate and Building Energy Management. Energy and Buildings, Vol. 15-16, pp 599 –
      608.
    • Turchin, P. (2007) "Scientific Prediction in Historical Sociology: Ibn Khaldun meets Al Saud". In: History &
      Mathematics: Historical Dynamics and Development of Complex Societies. (http://edurss.ru/cgi-bin/db.
      pl?cp=&page=Book&id=53185&lang=en&blang=en&list=Found) Moscow: KomKniga. ISBN
      978-5-484-01002-8
    • Sasic Kaligasidis, A et al. (2006) Upgraded weather forecast control of building heating systems. p. 951 ff in
      Research in Building Physics and Building Engineering Paul Fazio (Editorial Staff), ISBN 0-415-41675-2
    • United States Patent 6098893 Comfort control system incorporating weather forecast data and a method for
      operating such a system (Inventor Stefan Berglund)


    External links
    • Forecasting Principles: "Evidence-based forecasting" (http://www.forecastingprinciples.com)
    • International Institute of Forecasters (http://www.forecasters.org)
    • Introduction to Time series Analysis (Engineering Statistics Handbook) (http://www.itl.nist.gov/div898/
      handbook/pmc/section4/pmc4.htm) - A practical guide to Time series analysis and forecasting
    • Time Series Analysis (http://www.statsoft.com/textbook/sttimser.html)
    • Global Forecasting with IFs (http://www.ifs.du.edu)
    • Earthquake Electromagnetic Precursor Research (http://www.quakefinder.com)



    Stationary process
    In mathematics, a stationary process (or strict(ly) stationary process or strong(ly) stationary process) is a
    stochastic process whose joint probability distribution does not change when shifted in time or space. Consequently,
    parameters such as the mean and variance, if they exist, also do not change over time or position.
    Stationarity is used as a tool in time series analysis, where the raw data are often transformed to become stationary;
    for example, economic data are often seasonal and/or dependent on a non-stationary price level. An important type
    of non-stationary process that does not include a trend-like behavior is the cyclostationary process.
    Note that a "stationary process" is not the same thing as a "process with a stationary distribution". Indeed there are
    further possibilities for confusion with the use of "stationary" in the context of stochastic processes; for example a
    "time-homogeneous" Markov chain is sometimes said to have "stationary transition probabilities". On the other
    hand, all stationary Markov random processes are time-homogeneous.


    Definition
Formally, let $\{X_t\}$ be a stochastic process and let $F_X(x_{t_1+\tau}, \ldots, x_{t_k+\tau})$ represent the cumulative distribution function of the joint distribution of $\{X_t\}$ at times $t_1+\tau, \ldots, t_k+\tau$. Then, $\{X_t\}$ is said to be stationary if, for all $k$, for all $\tau$, and for all $t_1, \ldots, t_k$,

$$ F_X(x_{t_1+\tau}, \ldots, x_{t_k+\tau}) = F_X(x_{t_1}, \ldots, x_{t_k}). $$

Since $\tau$ does not affect $F_X(\cdot)$, $F_X$ is not a function of time.


    Examples
    As an example, white noise is stationary. The sound of a cymbal
    clashing, if hit only once, is not stationary because the acoustic power
    of the clash (and hence its variance) diminishes with time. However, it
    would be possible to invent a stochastic process describing when the
    cymbal is hit, such that the overall response would form a stationary
    process.

An example of a discrete-time stationary process where the sample space is also discrete (so that the random variable may take one of N possible values) is a Bernoulli scheme. Other examples of a discrete-time stationary process with continuous sample space include some autoregressive and moving average processes which are both subsets of the autoregressive moving average model. Models with a non-trivial autoregressive component may be either stationary or non-stationary, depending on the parameter values, and important non-stationary special cases are where unit roots exist in the model.

[Figure: Two simulated time series processes, one stationary, the other non-stationary. The Augmented Dickey–Fuller test is reported for each process and non-stationarity cannot be rejected for the second process.]

Let Y be any scalar random variable, and define a time series $\{X_t\}$ by

$$ X_t = Y \quad \text{for all } t. $$

Then $\{X_t\}$ is a stationary time series, for which realisations consist of a series of constant values, with a different constant value for each realisation. A law of large numbers does not apply in this case, as the limiting value of an average from a single realisation takes the random value determined by Y, rather than taking the expected value of Y.

As a further example of a stationary process for which any single realisation has an apparently noise-free structure, let Y have a uniform distribution on (0, 2π] and define the time series $\{X_t\}$ by

$$ X_t = \cos(t + Y) \quad \text{for } t \in \mathbb{R}. $$

Then $\{X_t\}$ is strictly stationary.
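In practice, stationarity of an observed series is often checked with a unit-root test such as the Augmented Dickey–Fuller test mentioned in the figure caption above. A minimal sketch using the adfuller function from statsmodels; the library and the simulated data are assumptions made for illustration:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(6)
white_noise = rng.normal(size=500)               # stationary
random_walk = np.cumsum(rng.normal(size=500))    # unit root: non-stationary

for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    stat, pvalue = adfuller(series)[:2]
    # A small p-value rejects the unit-root null, i.e. gives evidence of stationarity.
    print(name, round(stat, 2), round(pvalue, 3))
```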


    Weaker forms of stationarity

    Weak or wide-sense stationarity
    A weaker form of stationarity commonly employed in signal processing is known as weak-sense stationarity,
    wide-sense stationarity (WSS) or covariance stationarity. WSS random processes only require that 1st moment
    and covariance do not vary with respect to time. Any strictly stationary process which has a mean and a covariance is
    also WSS.
So, a continuous-time random process x(t) which is WSS has the following restrictions on its mean function,

$$ \mathbb{E}[x(t)] = m_x(t) = m_x(t + \tau) \quad \text{for all } \tau, $$

and autocovariance function,

$$ \mathbb{E}[(x(t_1) - m_x(t_1))(x(t_2) - m_x(t_2))] = C_x(t_1, t_2) = C_x(t_1 - t_2, 0). $$

The first property implies that the mean function m_x(t) must be constant. The second property implies that the covariance function depends only on the difference between t_1 and t_2 and only needs to be indexed by one variable rather than two. Thus, instead of writing

$$ C_x(t_1 - t_2, 0), $$

the notation is often abbreviated and written as

$$ C_x(\tau), \quad \text{where } \tau = t_1 - t_2. $$

This also implies that the autocorrelation depends only on $\tau = t_1 - t_2$, since

$$ R_x(t_1, t_2) = R_x(t_1 - t_2, 0) = R_x(\tau). $$
    When processing WSS random signals with linear, time-invariant (LTI) filters, it is helpful to think of the correlation
    function as a linear operator. Since it is a circulant operator (depends only on the difference between the two
    arguments), its eigenfunctions are the Fourier complex exponentials. Additionally, since the eigenfunctions of LTI
    operators are also complex exponentials, LTI processing of WSS random signals is highly tractable—all
    computations can be performed in the frequency domain. Thus, the WSS assumption is widely employed in signal
    processing algorithms.
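A small NumPy sketch estimating the autocovariance of a weakly stationary series as a function of the lag alone, matching the abbreviated notation C_x(τ) above; the AR(1)-type data are illustrative:

```python
import numpy as np

def sample_autocovariance(x, max_lag):
    """Estimate C(tau) = E[(x_t - m)(x_{t+tau} - m)] for tau = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    m, n = x.mean(), x.size
    return np.array([np.mean((x[: n - tau] - m) * (x[tau:] - m)) for tau in range(max_lag + 1)])

rng = np.random.default_rng(7)
x = np.zeros(2000)
for t in range(1, x.size):                 # weakly stationary AR(1)-type example
    x[t] = 0.7 * x[t - 1] + rng.normal()

cov = sample_autocovariance(x, max_lag=5)
print(cov / cov[0])                        # autocorrelation R(tau)/R(0): depends only on the lag
```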


    Second-order stationarity
    The case of second-order stationarity arises when the requirements of strict stationarity are only applied to pairs of
    random variables from the time-series. The definition of second order stationarity can be generalized to Nth order
    (for finite N) and strict stationary means stationary of all orders.
A process is second order stationary if the first- and second-order density functions satisfy

$$ f_X(x_1; t_1) = f_X(x_1; t_1 + \tau) $$
$$ f_X(x_1, x_2; t_1, t_2) = f_X(x_1, x_2; t_1 + \tau, t_2 + \tau) $$

for all $t_1$, $t_2$, and $\tau$. Such a process will be wide sense stationary if the mean and correlation functions are finite.
    A process can be wide sense stationary without being second order stationary.


    Other terminology
    The terminology used for types of stationarity other than strict stationarity can be rather mixed. Some examples
    follow.
       • Priestley[1][2] uses stationary up to order m if conditions similar to those given here for wide sense
         stationarity apply relating to moments up to order m. Thus wide sense stationarity would be equivalent to
         "stationary to order 2", which is different from the definition of second-order stationarity given here.
       • Honarkhah[3] also uses the assumption of stationarity in the context of multiple-point geostatistics, where
         higher n-point statistics are assumed to be stationary in the spatial domain.


    References
[1] Priestley, M. B. (1981). Spectral Analysis and Time Series. Academic Press. ISBN 0-12-564922-3.
[2] Priestley, M. B. (1988). Non-linear and Non-stationary Time Series Analysis. Academic Press. ISBN 0-12-564911-8.
[3] Honarkhah, M. and Caers, J. (2010). Stochastic Simulation of Patterns Using Distance-Based Pattern Modeling (http://dx.doi.org/10.1007/s11004-010-9276-7). Mathematical Geosciences, 42: 487–517.



    External links
    • Spectral decomposition of a random function (Springer) (http://eom.springer.de/s/s086360.htm)



    Stochastic process
    In probability theory, a stochastic process (pronunciation: /stəʊˈkæstɪk/), or sometimes random process (widely used)
    is a collection of random variables; this is often used to represent the evolution of some random value, or system,
    over time. This is the probabilistic counterpart to a deterministic process (or deterministic system). Instead of
    describing a process which can only evolve in one way (as in the case, for example, of solutions of an ordinary
    differential equation), in a stochastic or random process there is some indeterminacy: even if the initial condition (or
    starting point) is known, there are several (often infinitely many) directions in which the process may evolve.
    In the simple case of discrete time, a stochastic process amounts to a sequence of random variables known as a time
    series (for example, see Markov chain). Another basic type of a stochastic process is a random field, whose domain
    is a region of space, in other words, a random function whose arguments are drawn from a range of continuously
    changing values. One approach to stochastic processes treats them as functions of one or several deterministic
    arguments (inputs, in most cases regarded as time) whose values (outputs) are random variables: non-deterministic
    (single) quantities which have certain probability distributions. Random variables corresponding to various times (or
    points, in the case of random fields) may be completely different. The main requirement is that these different
    random quantities all have the same type. Type refers to the codomain of the function. Although the random values
    of a stochastic process at different times may be independent random variables, in most commonly considered
    situations they exhibit complicated statistical correlations.
    Familiar examples of processes modeled as stochastic time series include stock market and exchange rate
    fluctuations, signals such as speech, audio and video, medical data such as a patient's EKG, EEG, blood pressure or
    temperature, and random movement such as Brownian motion or random walks. Examples of random fields include
    static images, random terrain (landscapes), wind waves or composition variations of a heterogeneous material.


    Formal definition and basic properties

    Definition
Given a probability space $(\Omega, \mathcal{F}, P)$ and a measurable space $(S, \Sigma)$, an S-valued stochastic process is a collection of S-valued random variables on $\Omega$, indexed by a totally ordered set T ("time"). That is, a stochastic process X is a collection

$$ \{ X_t : t \in T \} $$

where each $X_t$ is an S-valued random variable on $\Omega$. The space S is then called the state space of the process.


    Finite-dimensional distributions
Let X be an S-valued stochastic process. For every finite subset $F = \{t_1, \ldots, t_k\} \subseteq T$, the k-tuple $X_F = (X_{t_1}, X_{t_2}, \ldots, X_{t_k})$ is a random variable taking values in $S^k$. The distribution $\mathbb{P}_F(\cdot) = \mathbb{P}(X_F^{-1}(\cdot))$ of this random variable is a probability measure on $S^k$. This is called a finite-dimensional distribution of X.
    Under suitable topological restrictions, a suitably "consistent" collection of finite-dimensional distributions can be
    used to define a stochastic process (see Kolmogorov extension in the next section).


    Construction
    In the ordinary axiomatization of probability theory by means of measure theory, the problem is to construct a
    sigma-algebra of measurable subsets of the space of all functions, and then put a finite measure on it. For this
    purpose one traditionally uses a method called Kolmogorov extension.[1]
    There is at least one alternative axiomatization of probability theory by means of expectations on C-star algebras of
    random variables. In this case the method goes by the name of Gelfand–Naimark–Segal construction.
    This is analogous to the two approaches to measure and integration, where one has the choice to construct measures
    of sets first and define integrals later, or construct integrals first and define set measures as integrals of characteristic
    functions.


    Kolmogorov extension
The Kolmogorov extension proceeds along the following lines: assuming that a probability measure on the space of all functions $f : T \to S$ exists, then it can be used to specify the joint probability distribution of finite-dimensional random variables $f(t_1), \ldots, f(t_n)$. Now, from this n-dimensional probability distribution we can deduce an (n − 1)-dimensional marginal probability distribution for $f(t_1), \ldots, f(t_{n-1})$. Note that the
    obvious compatibility condition, namely, that this marginal probability distribution be in the same class as the one
    derived from the full-blown stochastic process, is not a requirement. Such a condition only holds, for example, if the
    stochastic process is a Wiener process (in which case the marginals are all gaussian distributions of the exponential
    class) but not in general for all stochastic processes. When this condition is expressed in terms of probability
    densities, the result is called the Chapman–Kolmogorov equation.
    The Kolmogorov extension theorem guarantees the existence of a stochastic process with a given family of
    finite-dimensional probability distributions satisfying the Chapman–Kolmogorov compatibility condition.


    Separability, or what the Kolmogorov extension does not provide
    Recall that in the Kolmogorov axiomatization, measurable sets are the sets which have a probability or, in other
    words, the sets corresponding to yes/no questions that have a probabilistic answer.
The Kolmogorov extension starts by declaring to be measurable all sets of functions where finitely many coordinates $[f(t_1), \ldots, f(t_n)]$ are restricted to lie in measurable subsets of $S^n$. In other words, if a yes/no question about f can be answered by looking at the values of at most finitely many coordinates, then it has a probabilistic answer.
    In measure theory, if we have a countably infinite collection of measurable sets, then the union and intersection of all
    of them is a measurable set. For our purposes, this means that yes/no questions that depend on countably many
    coordinates have a probabilistic answer.
    The good news is that the Kolmogorov extension makes it possible to construct stochastic processes with fairly
    arbitrary finite-dimensional distributions. Also, every question that one could ask about a sequence has a
    probabilistic answer when asked of a random sequence. The bad news is that certain questions about functions on a
    continuous domain don't have a probabilistic answer. One might hope that the questions that depend on uncountably
    many values of a function be of little interest, but the really bad news is that virtually all concepts of calculus are of
    this sort. For example:
    1. boundedness
    2. continuity
    3. differentiability
    all require knowledge of uncountably many values of the function.
One solution to this problem is to require that the stochastic process be separable. In other words, that there be some
countable set of coordinates {f(xi)} whose values determine the whole random function f.
    The Kolmogorov continuity theorem guarantees that processes that satisfy certain constraints on the moments of
    their increments have continuous modifications and are therefore separable.


    Filtrations
Given a probability space (Ω, F, P), a filtration is a weakly increasing collection of sigma-algebras on Ω,
{Ft, t ∈ T}, indexed by some totally ordered set T, and bounded above by F, i.e. for s, t ∈ T with s < t,

    Fs ⊆ Ft ⊆ F.

A stochastic process X on the same time set T is said to be adapted to the filtration if, for every t ∈ T, Xt is
Ft-measurable.[2]

    The natural filtration
Given a stochastic process X = {Xt : t ∈ T}, the natural filtration for (or induced by) this process is the
filtration where Ft is generated by all values of Xs up to time s = t, i.e.

    Ft = σ({Xs : s ≤ t}).
    A stochastic process is always adapted to its natural filtration.


    Classification
Stochastic processes can be classified according to the cardinality of their index set (usually interpreted as time) and
    state space.


    Discrete time and discrete states
If both t and Xt belong to N, the set of natural numbers, then we have models which lead to Markov chains. For
example:
(a) If Xn means the bit (0 or 1) in position n of a sequence of transmitted bits, then {Xn} can be modelled as a
Markov chain with two states. This leads to the error-correcting Viterbi algorithm in data transmission.
(b) If Xn means the combined genotype of a breeding couple in the nth generation in an inbreeding model, it can be
shown that the proportion of heterozygous individuals in the population approaches zero as n goes to ∞.[3]
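As an illustration added here (not part of the original article), case (a) can be simulated directly; the two-state transition matrix below is a hypothetical assumption chosen only for demonstration.

    import numpy as np

    # Hypothetical 2-state Markov chain for a transmitted bit (states 0 and 1).
    # P[i, j] = assumed probability of moving from state i to state j.
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

    rng = np.random.default_rng(0)
    state = 0
    states = [state]
    for _ in range(1000):
        state = rng.choice(2, p=P[state])   # draw the next state from row `state`
        states.append(state)

    # Empirical fraction of time spent in state 1; for a long run this approaches
    # the chain's stationary probability of state 1.
    print(np.mean(states))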

    Continuous time and continuous state space
The paradigmatic example of a continuous stochastic process is the Wiener process. In its original form the problem was
concerned with a particle floating on a liquid surface, receiving "kicks" from the molecules of the liquid. The particle
is then viewed as being subject to a random force which, since the molecules are very small and very close together,
is treated as being continuous and, since the particle is constrained to the surface of the liquid by surface tension, is
at each point in time a vector parallel to the surface. Thus the random force is described by a two-component
stochastic process: two real-valued random variables are associated with each point in the index set, time, with the
values of the two random variables ranging over R, giving the x and y components of the force. (Since the liquid is
viewed as homogeneous, the force is independent of the spatial coordinates.) A treatment of Brownian motion
generally also includes the effect of viscosity, resulting in an equation of motion known as the Langevin equation.[4]


    Discrete time and continuous state space
    If the index set of the process is N (the natural numbers), and the range is R (the real numbers), there are some
    natural questions to ask about the sample sequences of a process {Xi}i ∈ N, where a sample sequence is {Xi(ω)}i ∈ N.
    1. What is the probability that each sample sequence is bounded?
    2. What is the probability that each sample sequence is monotonic?
    3. What is the probability that each sample sequence has a limit as the index approaches ∞?
4. What is the probability that the series obtained by summing a sample sequence converges?
5. What is the probability distribution of the sum?
Main applications of discrete time continuous state stochastic models include Markov chain Monte Carlo (MCMC)
and the analysis of time series.


    Continuous time and discrete state space
Similarly, if the index space I is a finite or infinite interval, we can ask about the sample paths {Xt(ω)}t ∈ I:
1. What is the probability that it is bounded/integrable/continuous/differentiable...?
2. What is the probability that it has a limit at ∞?
    3. What is the probability distribution of the integral?


    References
    [1] Karlin, Samuel & Taylor, Howard M. (1998). An Introduction to Stochastic Modeling, Academic Press. ISBN 0-12-684887-4.
    [2] Durrett, Rick. Probability: Theory and Examples. Fourth Edition. Cambridge: Cambridge University Press, 2010.
[3] Allen, Linda J. S., An Introduction to Stochastic Processes with Applications to Biology, 2nd Edition, Chapman and Hall, 2010, ISBN
    1-4398-1882-7
[4] Gardiner, C. Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences, 3rd Edition, Springer, 2004, ISBN
    3540208828


    Further reading
    • Wio, S. Horacio, Deza, R. Roberto & Lopez, M. Juan (2012). An Introduction to Stochastic Processes and
      Nonequilibrium Statistical Physics. World Scientific Publishing. ISBN 978-981-4374-78-1.
    • Papoulis, Athanasios & Pillai, S. Unnikrishna (2001). Probability, Random Variables and Stochastic Processes.
      McGraw-Hill Science/Engineering/Math. ISBN 0-07-281725-9.
    • Boris Tsirelson. "Lecture notes in Advanced probability theory" (http://www.webcitation.org/5cfvVZ4Kd).
    • Doob, J. L. (1953). Stochastic Processes. Wiley.
    • Klebaner, Fima C. (2011). Introduction to Stochastic Calculus With Applications. Imperial College Press.
      ISBN 1-84816-831-4.
    • Bruce Hajek (July 2006). "An Exploration of Random Processes for Engineers" (http://www.ifp.uiuc.edu/
      ~hajek/Papers/randomprocesses.html).
    • "An 8 foot tall Probability Machine (named Sir Francis) comparing stock market returns to the randomness of the
      beans dropping through the quincunx pattern" (http://www.youtube.com/watch?v=AUSKTk9ENzg). Index
      Funds Advisors IFA.com (http://www.ifa.com).
    • "Popular Stochastic Processes used in Quantitative Finance" (http://www.sitmo.com/article/
      popular-stochastic-processes-in-finance/). sitmo.com.
    • "Addressing Risk and Uncertainty" (http://www.goldsim.com/Content.asp?PageID=455).



    Covariance
    In probability theory and statistics, covariance is a measure of how much two random variables change together. If
    the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds
    for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive.[1] In the opposite
    case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables
    tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency
    in the linear relationship between the variables. The magnitude of the covariance is not that easy to interpret. The
    normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of
    the linear relation.
    A distinction must be made between (1) the covariance of two random variables, which is a population parameter
    that can be seen as a property of the joint probability distribution, and (2) the sample covariance, which serves as an
    estimated value of the parameter.


    Definition
The covariance between two jointly distributed real-valued random variables x and y with finite second moments is
defined[2] as

    Cov(x, y) = E[(x − E[x])(y − E[y])],

where E[x] is the expected value of x, also known as the mean of x. By using the linearity property of expectations,
this can be simplified to

    Cov(x, y) = E[xy] − E[x]E[y].
For random vectors x and y (of dimension m and n respectively) the m×n cross-covariance matrix is equal to

    Cov(x, y) = E[(x − E[x])(y − E[y])^T] = E[x y^T] − E[x] E[y]^T,

where the superscript T denotes the transpose of a vector (or matrix).
    The (i,j)-th element of this matrix is equal to the covariance Cov(xi, yj) between the i-th scalar component of x and
    the j-th scalar component of y. In particular, Cov(y, x) is the transpose of Cov(x, y).

For a vector x = (x1, ..., xm)^T of m jointly distributed random variables with finite second moments, its
covariance matrix is defined as

    Σ(x) = Cov(x, x).
    Random variables whose covariance is zero are called uncorrelated.
    The units of measurement of the covariance Cov(x, y) are those of x times those of y. By contrast, correlation
    coefficients, which depend on the covariance, are a dimensionless measure of linear dependence. (In fact, correlation
    coefficients can simply be understood as a normalized version of covariance.)
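As a quick numerical sketch added here (not from the source), the simplified formula Cov(x, y) = E[xy] − E[x]E[y] and its normalized version can be approximated from simulated draws; the particular distributions below are arbitrary assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=100_000)
    y = 2.0 * x + rng.normal(size=100_000)     # y depends linearly on x, so Cov(x, y) > 0

    # sample analogue of E[xy] - E[x]E[y]
    cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

    # normalizing by the standard deviations gives the correlation coefficient
    corr_xy = cov_xy / (np.std(x) * np.std(y))

    print(cov_xy, corr_xy)    # roughly 2.0 and 0.89 for this construction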


    Properties
• Variance is a special case of the covariance when the two variables are identical:

    Cov(x, x) = Var(x).

• If x, y, w, and v are real-valued random variables and a, b, c, d are constant ("constant" in this context means
  non-random), then the following facts are a consequence of the definition of covariance:

    Cov(x, a) = 0,
    Cov(x, x) = Var(x),
    Cov(x, y) = Cov(y, x),
    Cov(ax, by) = ab Cov(x, y),
    Cov(x + a, y + b) = Cov(x, y),
    Cov(ax + by, cw + dv) = ac Cov(x, w) + ad Cov(x, v) + bc Cov(y, w) + bd Cov(y, v).

For sequences x1, ..., xn and y1, ..., ym of random variables, we have

    Cov(Σi xi, Σj yj) = Σi Σj Cov(xi, yj).

For a sequence x1, ..., xn of random variables, and constants a1, ..., an, we have

    Var(Σi ai xi) = Σi ai² Var(xi) + 2 Σ_{i<j} ai aj Cov(xi, xj) = Σ_{i,j} ai aj Cov(xi, xj).
    A more general identity for covariance matrices
Let x be a random vector, let Σ(x) denote its covariance matrix, and let A be a matrix that can act on x. The
result of applying this matrix to x is a new vector with covariance matrix

    Σ(Ax) = A Σ(x) A^T.
    This is a direct result of the linearity of expectation and is useful when applying a linear transformation, such as a
    whitening transformation, to a vector.


    Uncorrelatedness and independence
If x and y are independent, then their covariance is zero. This follows because under independence,

    E[xy] = E[x] E[y],

so that Cov(x, y) = E[xy] − E[x]E[y] = 0.
The converse, however, is not generally true. For example, let x be uniformly distributed in [−1, 1] and let y = x².
Clearly, x and y are dependent, but

    Cov(x, y) = E[xy] − E[x]E[y] = E[x³] − E[x]E[x²] = 0 − 0 = 0.
    In this case, the relationship between y and x is non-linear, while correlation and covariance are measures of linear
    dependence between two variables. Still, as in the example, if two variables are uncorrelated, that does not imply that
    they are independent.


    Relationship to inner products
    Many of the properties of covariance can be extracted elegantly by observing that it satisfies similar properties to
    those of an inner product:
    1. bilinear: for constants a and b and random variables x, y, z, σ(ax + by, z) = a σ(x, z) + b σ(y, z);
    2. symmetric: σ(x, y) = σ(y, x);
3. positive semi-definite: σ²(x) = σ(x, x) ≥ 0, and σ(x, x) = 0 implies that x is a constant random variable (equal to some constant K almost surely).
    In fact these properties imply that the covariance defines an inner product over the quotient vector space obtained by
    taking the subspace of random variables with finite second moment and identifying any two that differ by a constant.
    (This identification turns the positive semi-definiteness above into positive definiteness.) That quotient vector space
    is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the
    covariance is exactly the L2 inner product of real-valued functions on the sample space.
As a result for random variables with finite variance, the inequality

    |Cov(x, y)| ≤ sqrt(σ²(x) σ²(y))

holds via the Cauchy–Schwarz inequality.
Proof: If σ²(y) = 0, then it holds trivially. Otherwise, let the random variable

    z = x − (Cov(x, y) / σ²(y)) y.

Then we have

    0 ≤ σ²(z) = σ²(x) − Cov(x, y)² / σ²(y),

so that Cov(x, y)² ≤ σ²(x) σ²(y).
QED.


    Calculating the sample covariance
The sample covariance of N observations of K variables is the K-by-K matrix Q = [qjk] with the entries

    qjk = (1 / (N − 1)) Σ_{i=1}^{N} (xij − x̄j)(xik − x̄k),

which is an estimate of the covariance between variable j and variable k.
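A short sketch, added for illustration (not in the source), of the N − 1 estimator above in NumPy; np.cov uses the same denominator by default.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 3))              # N = 500 observations of K = 3 variables

    N = X.shape[0]
    xbar = X.mean(axis=0)                      # sample mean of each variable
    Q = (X - xbar).T @ (X - xbar) / (N - 1)    # K-by-K sample covariance matrix

    # np.cov uses the same N - 1 denominator; rowvar=False because variables are columns here
    print(np.allclose(Q, np.cov(X, rowvar=False)))    # True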
The sample mean and the sample covariance matrix are unbiased estimates of the mean and the covariance matrix of
the random vector x, a row vector whose jth element (j = 1, ..., K) is one of the random variables. The reason the
sample covariance matrix has N − 1 in the denominator rather than N is essentially that the population mean E(x)
is not known and is replaced by the sample mean x̄. If the population mean E(x) is known, the analogous
unbiased estimate is given by

    qjk = (1 / N) Σ_{i=1}^{N} (xij − E(xj))(xik − E(xk)).

    Comments
    The covariance is sometimes called a measure of "linear dependence" between the two random variables. That does
    not mean the same thing as in the context of linear algebra (see linear dependence). When the covariance is
    normalized, one obtains the correlation coefficient. From it, one can obtain the Pearson coefficient, which gives us
    the goodness of the fit for the best possible linear function describing the relation between the variables. In this sense
    covariance is a linear gauge of dependence.


    References
[1] http://mathworld.wolfram.com/Covariance.html
    [2] Oxford Dictionary of Statistics, Oxford University Press, 2002, p. 104.



    External links
    • Hazewinkel, Michiel, ed. (2001), "Covariance" (http://www.encyclopediaofmath.org/index.php?title=p/
      c026800), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
    • MathWorld page on calculating the sample covariance (http://mathworld.wolfram.com/Covariance.html)
    • Covariance Tutorial using R (http://www.r-tutor.com/elementary-statistics/numerical-measures/covariance)



    Autocovariance
In statistics, given a real stochastic process X(t), the autocovariance is the covariance of the variable against a
time-shifted version of itself. If the process has the mean E[X(t)] = μt, then the autocovariance is given by

    CXX(t, s) = E[(X(t) − μt)(X(s) − μs)] = E[X(t) X(s)] − μt μs,

where E is the expectation operator.


    Stationarity
If X(t) is a stationary process, then the following are true:

    μt = μs = μ   for all t, s

and

    CXX(t, s) = CXX(s − t) = CXX(τ),

where

    τ = s − t

is the lag time, or the amount of time by which the signal has been shifted.
As a result, the autocovariance becomes

    CXX(τ) = E[(X(t) − μ)(X(t + τ) − μ)]
           = E[X(t) X(t + τ)] − μ²
           = RXX(τ) − μ²,

where RXX represents the autocorrelation in the signal processing sense.


    Normalization
When normalized by dividing by the variance σ², the autocovariance C becomes the autocorrelation coefficient
function c,[1]

    c(τ) = CXX(τ) / σ².

The autocovariance function is itself a version of the autocorrelation function with the mean level removed. If the
signal has a mean of 0, the autocovariance and autocorrelation functions are identical.[1]
However, often the autocovariance is called autocorrelation even if this normalization has not been performed.
The autocovariance can be thought of as a measure of how similar a signal is to a time-shifted version of itself, with
an autocovariance of σ² indicating perfect correlation at that lag. The normalization by the variance puts this
into the range [−1, 1].
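The following sketch (an addition, not from the source) estimates the autocovariance at a few lags from a simulated series and divides by the variance to obtain the autocorrelation coefficient; the AR(1)-style series is an arbitrary assumption.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 10_000
    x = np.zeros(n)
    for t in range(1, n):                      # simple autocorrelated series (assumed)
        x[t] = 0.7 * x[t - 1] + rng.normal()

    def autocovariance(x, lag):
        xc = x - x.mean()
        # biased (1/n) estimate of C(lag)
        return np.dot(xc[:len(xc) - lag], xc[lag:]) / len(xc)

    var = autocovariance(x, 0)                 # C(0) is the variance
    for lag in range(4):
        # autocorrelation coefficients; for this series they are close to 0.7**lag
        print(lag, autocovariance(x, lag) / var)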


    Properties
The autocovariance of a linearly filtered process Yt,

    Yt = Σ_{k=−∞}^{∞} ak X_{t+k},

is

    CYY(τ) = Σ_{k,l=−∞}^{∞} ak al CXX(τ + k − l).


    References
    • P. G. Hoel, Mathematical Statistics, Wiley, New York, 1984.
    • Lecture notes on autocovariance from WHOI [2]
    [1] Westwick, David T. (2003). Identification of Nonlinear Physiological Systems. IEEE Press. pp. 17–18. ISBN 0-471-27456-9.
[2] http://w3eos.whoi.edu/12.747/notes/lect06/l06s02.html




    Autocorrelation
    Autocorrelation is the cross-correlation of
    a signal with itself. Informally, it is the
    similarity between observations as a
    function of the time separation between
    them. It is a mathematical tool for finding
    repeating patterns, such as the presence of a
    periodic signal which has been buried under
    noise,    or    identifying    the    missing
    fundamental frequency in a signal implied
    by its harmonic frequencies. It is often used
    in signal processing for analyzing functions
    or series of values, such as time domain
    signals.


Definitions
[Figure: A plot showing 100 random numbers with a "hidden" sine function, and an autocorrelation (correlogram) of
the series on the bottom.]
Different fields of study define autocorrelation differently, and not all of these definitions are equivalent. In some
fields, the term is used interchangeably with autocovariance.


Statistics
[Figure: Visual comparison of convolution, cross-correlation and autocorrelation.]
In statistics, the autocorrelation of a random process describes the correlation between values of the process at
different times, as a function of the two times or of the time difference. Let X be some repeatable process, and i be
some point in time after the start of that process. (i may be an integer for a discrete-time process or a real number
for a continuous-time process.) Then Xi is the value (or realization) produced by a given run of the process at time i.
Suppose that the process is further known to have defined values for mean μi and variance σi² for all times i. Then
the definition of the autocorrelation between times s and t is

    R(s, t) = E[(Xt − μt)(Xs − μs)] / (σt σs),
    where "E" is the expected value operator. Note that this expression is not well-defined for all time series or
    processes, because the variance may be zero (for a constant process) or infinite. If the function R is well-defined, its


    value must lie in the range [−1, 1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation.
If Xt is a second-order stationary process then the mean μ and the variance σ² are time-independent, and further the
    autocorrelation depends only on the difference between t and s: the correlation depends only on the time-distance
    between the pair of values but not on their position in time. This further implies that the autocorrelation can be
    expressed as a function of the time-lag, and that this would be an even function of the lag τ = s − t. This gives the
more familiar form

    R(τ) = E[(Xt − μ)(Xt+τ − μ)] / σ²,

and the fact that this is an even function can be stated as

    R(τ) = R(−τ).
    It is common practice in some disciplines, other than statistics and time series analysis, to drop the normalization by
σ² and use the term "autocorrelation" interchangeably with "autocovariance". However, the normalization is
    important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the
    strength of statistical dependence, and because the normalization has an effect on the statistical properties of the
    estimated autocorrelations.


    Signal processing
    In signal processing, the above definition is often used without the normalization, that is, without subtracting the
    mean and dividing by the variance. When the autocorrelation function is normalized by mean and variance, it is
    sometimes referred to as the autocorrelation coefficient.[1]
Given a signal f(t), the continuous autocorrelation Rff(τ) is most often defined as the continuous
cross-correlation integral of f(t) with itself, at lag τ:

    Rff(τ) = (f(t) ∗ f*(−t))(τ) = ∫_{−∞}^{∞} f(t + τ) f*(t) dt,

where f* represents the complex conjugate and ∗ represents convolution. For a real function, f* = f.
The discrete autocorrelation R at lag j for a discrete signal xn is

    Rxx(j) = Σn xn x*_{n−j}.
    The above definitions work for signals that are square integrable, or square summable, that is, of finite energy.
    Signals that "last forever" are treated instead as random processes, in which case different definitions are needed,
based on expected values. For wide-sense-stationary random processes, the autocorrelations are defined as

    Rff(τ) = E[f(t) f*(t − τ)],
    Rxx(j) = E[xn x*_{n−j}].

For processes that are not stationary, these will also be functions of t, or n.
    For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The
autocorrelation of an ergodic process is sometimes defined as or equated to[1]

    Rff(τ) = lim_{T→∞} (1/T) ∫_0^T f(t + τ) f*(t) dt,
    Rxx(j) = lim_{N→∞} (1/N) Σ_{n=0}^{N−1} xn x*_{n−j}.
    These definitions have the advantage that they give sensible well-defined single-parameter results for periodic
    functions, even when those functions are not the output of stationary ergodic processes.
    Alternatively, signals that last forever can be treated by a short-time autocorrelation function analysis, using finite
    time integrals. (See short-time Fourier transform for a related process.)


Multi-dimensional autocorrelation is defined similarly. For example, in three dimensions the autocorrelation of a
square-summable discrete signal would be

    R(j, k, ℓ) = Σ_{n,q,r} x_{n,q,r} x*_{n−j, q−k, r−ℓ}.
    When mean values are subtracted from signals before computing an autocorrelation function, the resulting function
    is usually called an auto-covariance function.


    Properties
    In the following, we will describe properties of one-dimensional autocorrelations only, since most properties are
    easily transferred from the one-dimensional case to the multi-dimensional cases.
• A fundamental property of the autocorrelation is symmetry, which is easy to prove from the definition. In the
  continuous case,
      the autocorrelation is an even function, Rf(−τ) = Rf(τ), when f is a real function,
      and the autocorrelation is a Hermitian function, Rf(−τ) = Rf*(τ), when f is a complex function.
• The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for any
  delay τ, |Rf(τ)| ≤ Rf(0). This is a consequence of the Cauchy–Schwarz inequality. The same result holds
  in the discrete case.
    • The autocorrelation of a periodic function is, itself, periodic with the same period.
• The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for all τ)
  is the sum of the autocorrelations of each function separately.
    • Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation.
• The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a Dirac delta
  function) at τ = 0 and will be absolutely 0 for all other τ.
• The Wiener–Khinchin theorem relates the autocorrelation function to the power spectral density via the Fourier
  transform:

      R(τ) = ∫_{−∞}^{∞} S(f) e^{i 2π f τ} df,
      S(f) = ∫_{−∞}^{∞} R(τ) e^{−i 2π f τ} dτ.

• For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the
  Wiener–Khinchin theorem can be re-expressed in terms of real cosines only:

      R(τ) = ∫_{−∞}^{∞} S(f) cos(2π f τ) df,
      S(f) = ∫_{−∞}^{∞} R(τ) cos(2π f τ) dτ.


    Efficient computation
    For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high
    computational efficiency. The brute force method based on the definition can be used. For example, to calculate the
autocorrelation of the real signal sequence (2, 3, 1), we employ the usual multiplication method with right shifts:
           231
           ×231
           ________
           231
           693
           462
           _____________
           2 9 14 9 2
    Thus the required autocorrelation is (2,9,14,9,2). In this calculation we do not perform the carry-over operation
    during addition because the vector has been defined over a field of real numbers. Note that we can halve the
    number of operations required by exploiting the inherent symmetry of the autocorrelation.
While the brute force algorithm is order n², several efficient algorithms exist which can compute the autocorrelation
    in order n log(n). For example, the Wiener–Khinchin theorem allows computing the autocorrelation from the raw
    data X(t) with two Fast Fourier transforms (FFT)[2]:
                  FR(f) = FFT[X(t)]
                  S(f) = FR(f) FR*(f)
                  R(τ) = IFFT[S(f)]
    where IFFT denotes the inverse Fast Fourier transform. The asterisk denotes complex conjugate.
    Alternatively, a multiple τ correlation can be performed by using brute force calculation for low τ values, and then
    progressively binning the X(t) data with a logarithmic density to compute higher values, resulting in the same n
    log(n) efficiency, but with lower memory requirements.
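A minimal sketch of the FFT route above, added here as an illustration (NumPy is assumed); the sequence is zero-padded to at least 2n − 1 points so that the circular correlation computed via the FFT equals the linear autocorrelation.

    import numpy as np

    def autocorrelation_fft(x):
        n = len(x)
        f = np.fft.fft(x, n=2 * n)     # FR(f) = FFT[X(t)], zero-padded
        s = f * np.conj(f)             # S(f) = FR(f) FR*(f)
        r = np.fft.ifft(s).real        # R(tau) = IFFT[S(f)]
        return r[:n]                   # lags 0, 1, ..., n-1

    print(autocorrelation_fft(np.array([2.0, 3.0, 1.0])))    # [14.  9.  2.]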


    Estimation
For a discrete process of length n defined as {X1, X2, ..., Xn} with known mean μ and variance σ², an estimate
of the autocorrelation may be obtained as

    R̂(k) = (1 / ((n − k) σ²)) Σ_{t=1}^{n−k} (Xt − μ)(Xt+k − μ)

for any positive integer k < n. When the true mean μ and variance σ² are known, this estimate is unbiased. If
the true mean and variance of the process are not known there are several possibilities:
• If μ and σ² are replaced by the standard formulae for sample mean and sample variance, then this is a biased
  estimate.
• A periodogram-based estimate replaces n − k in the above formula with n. This estimate is always biased;
  however, it usually has a smaller mean square error.[3][4]
• Other possibilities derive from treating the two portions of data {X1, X2, ..., Xn−k} and
  {Xk+1, Xk+2, ..., Xn} separately and calculating separate sample means and/or sample variances for use
  in defining the estimate.
The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of k, then
form a function which is a valid autocorrelation in the sense that it is possible to define a theoretical process having


    exactly that autocorrelation. Other estimates can suffer from the problem that, if they are used to calculate the
variance of a linear combination of the X's, the variance calculated may turn out to be negative.
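As an added illustration (not in the original), the first bulleted estimator can be written directly; whether the sum is divided by n − k or by n (the periodogram-style choice) is controlled by a flag.

    import numpy as np

    def acf_estimate(x, k, periodogram_style=False):
        """Estimate R(k) using the sample mean and the (biased) sample variance."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        mu = x.mean()
        var = x.var()
        denom = n if periodogram_style else n - k
        return np.sum((x[:n - k] - mu) * (x[k:] - mu)) / (denom * var)

    rng = np.random.default_rng(4)
    x = rng.normal(size=2000)          # white noise: R(k) should be near 0 for k > 0
    print([round(acf_estimate(x, k), 3) for k in range(4)])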


    Regression analysis
    In regression analysis using time series data, autocorrelation of the errors is a problem. Autocorrelation of the errors,
    which themselves are unobserved, can generally be detected because it produces autocorrelation in the observable
    residuals. (Errors are also known as "error terms" in econometrics.)
    Autocorrelation violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated. While it
    does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores
    overestimated) when the autocorrelations of the errors at low lags are positive.
    The traditional test for the presence of first-order autocorrelation is the Durbin–Watson statistic or, if the explanatory
    variables include a lagged dependent variable, Durbin's h statistic. A more flexible test, covering autocorrelation of
    higher orders and applicable whether or not the regressors include lags of the dependent variable, is the
    Breusch–Godfrey test. This involves an auxiliary regression, wherein the residuals obtained from estimating the
    model of interest are regressed on (a) the original regressors and (b) k lags of the residuals, where k is the order of the
test. The simplest version of the test statistic from this auxiliary regression is TR², where T is the sample size and R²
    is the coefficient of determination. Under the null hypothesis of no autocorrelation, this statistic is asymptotically
distributed as χ² with k degrees of freedom.
    Responses to nonzero autocorrelation include generalized least squares and the Newey–West HAC estimator
    (Heteroskedasticity and Autocorrelation Consistent).[5]
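For readers who want to run these tests, the sketch below is an addition that assumes the statsmodels package and its durbin_watson and acorr_breusch_godfrey helpers; it fits an OLS regression on simulated data with AR(1) errors and applies both diagnostics.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson
    from statsmodels.stats.diagnostic import acorr_breusch_godfrey

    rng = np.random.default_rng(5)
    n = 200
    x = rng.normal(size=n)
    e = np.zeros(n)
    for t in range(1, n):                  # AR(1) errors induce autocorrelation
        e[t] = 0.6 * e[t - 1] + rng.normal()
    y = 1.0 + 2.0 * x + e

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print("Durbin-Watson:", durbin_watson(res.resid))           # well below 2 here
    lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=2)
    print("Breusch-Godfrey LM p-value:", lm_pvalue)             # small => autocorrelation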


    Applications
    • One application of autocorrelation is the measurement of optical spectra and the measurement of
      very-short-duration light pulses produced by lasers, both using optical autocorrelators.
• Autocorrelation is used to analyze dynamic light scattering data, which notably enables determination of the particle
  size distributions of nanometer-sized particles or micelles suspended in a fluid. A laser shining into the mixture
      produces a speckle pattern that results from the motion of the particles. Autocorrelation of the signal can be
      analyzed in terms of the diffusion of the particles. From this, knowing the viscosity of the fluid, the sizes of the
      particles can be calculated.
    • The Small-angle X-ray scattering intensity of a nanostructured system is the Fourier transform of the spatial
      autocorrelation function of the electron density.
    • In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic
      field.
    • In signal processing, autocorrelation can give information about repeating events like musical beats (for example,
      to determine tempo) or pulsar frequencies, though it cannot tell the position in time of the beat. It can also be used
      to estimate the pitch of a musical tone.
    • In music recording, autocorrelation is used as a pitch detection algorithm prior to vocal processing, as a distortion
      effect or to eliminate undesired mistakes and inaccuracies.[6]
    • Autocorrelation in space rather than time, via the Patterson function, is used by X-ray diffractionists to help
      recover the "Fourier phase information" on atom positions not available through diffraction alone.
    • In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties
      when sampling a heterogeneous population.
    • The SEQUEST algorithm for analyzing mass spectra makes use of autocorrelation in conjunction with
      cross-correlation to score the similarity of an observed spectrum to an idealized spectrum representing a peptide.


• In astrophysics, autocorrelation is used to study and characterize the spatial distribution of galaxies in the
      Universe and in multi-wavelength observations of Low Mass X-ray Binaries.
    • In panel data, spatial autocorrelation refers to correlation of a variable with itself through space.
    • In analysis of Markov chain Monte Carlo data, autocorrelation must be taken into account for correct error
      determination.


    References
    [1] Patrick F. Dunn, Measurement and Data Analysis for Engineering and Science, New York: McGraw–Hill, 2005 ISBN 0-07-282538-3
    [2] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Upper Saddle River, NJ:
        Prentice–Hall, 1994.
    [3] Spectral analysis and time series, M.B. Priestley (London, New York : Academic Press, 1982)
    [4] Percival, Donald B.; Andrew T. Walden (1993). Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate
        Techniques. Cambridge University Press. pp. 190–195. ISBN 0-521-43541-2.
[5] Christopher F. Baum (2006). An Introduction to Modern Econometrics Using Stata (http://books.google.com/?id=acxtAylXvGMC&
    pg=PA141&dq=newey-west-standard-errors+generalized-least-squares). Stata Press. ISBN 1-59718-013-0.
[6] Tyrangiel, Josh (2009-02-05). "Auto-Tune: Why Pop Music Sounds Perfect" (http://www.time.com/time/magazine/article/
    0,9171,1877372,00.html). Time Magazine.



    External links
    • Weisstein, Eric W., " Autocorrelation (http://mathworld.wolfram.com/Autocorrelation.html)" from
      MathWorld.
    • Autocorrelation articles in Comp.DSP (DSP usenet group). (http://www.dsprelated.com/comp.dsp/keyword/
      Autocorrelation.php)
    • GPU accelerated calculation of autocorrelation function. (http://www.iop.org/EJ/abstract/1367-2630/11/9/
      093024/)



    Cross-correlation
    In signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time-lag
applied to one of them. This is also known as a sliding dot product or sliding inner-product. It is commonly used for
searching a long signal for a shorter, known feature. It also has applications in pattern recognition, single particle
    analysis, electron tomographic averaging, cryptanalysis, and neurophysiology.
For continuous functions f and g, the cross-correlation is defined as:

    (f ⋆ g)(τ) = ∫_{−∞}^{∞} f*(t) g(t + τ) dt,

where f* denotes the complex conjugate of f.
Similarly, for discrete functions, the cross-correlation is defined as:

    (f ⋆ g)[n] = Σ_{m=−∞}^{∞} f*[m] g[m + n].
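A small numerical sketch, added here for illustration, of the discrete definition for two short real sequences (the arrays are arbitrary assumptions); note that numpy.correlate applies the conjugate to its second argument, so the argument order below matches this convention.

    import numpy as np

    def cross_correlation(f, g):
        # (f ⋆ g)[n] = sum_m conj(f[m]) g[m + n], for n = -(len(f)-1), ..., len(g)-1
        f = np.asarray(f)
        g = np.asarray(g)
        lags = list(range(-(len(f) - 1), len(g)))
        out = []
        for n in lags:
            s = 0.0
            for m in range(len(f)):
                if 0 <= m + n < len(g):
                    s += np.conj(f[m]) * g[m + n]
            out.append(s)
        return lags, out

    lags, c = cross_correlation([1.0, 2.0, 3.0], [0.0, 1.0, 0.5])
    print(list(zip(lags, c)))                                        # lags -2..2: 0, 3, 3.5, 2, 0.5
    print(np.correlate([0.0, 1.0, 0.5], [1.0, 2.0, 3.0], "full"))    # same values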




The cross-correlation is similar in nature to the convolution of two functions.
In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of
zero unless the signal is a trivial zero signal.
[Figure: Visual comparison of convolution, cross-correlation and autocorrelation.]
In probability theory and statistics, correlation is always used to include a standardising factor in such a way that
correlations have values between −1 and +1, and the term cross-correlation is used for referring to the correlation
corr(X, Y) between two random variables X and Y, while the "correlation" of a random vector X is considered to be
the correlation matrix (matrix of correlations) between the scalar elements of X.

If X and Y are two independent random variables with probability density functions f and g, respectively, then the
probability density of the difference Y − X is formally given by the cross-correlation (in the signal-processing
sense) f ⋆ g; however this terminology is not used in probability and statistics. In contrast, the convolution f ∗ g
(equivalent to the cross-correlation of f(t) and g(−t)) gives the probability density function of the sum X + Y.


    Explanation
As an example, consider two real valued functions f and g differing only by an unknown shift along the x-axis.
One can use the cross-correlation to find how much g must be shifted along the x-axis to make it identical to f.
The formula essentially slides the g function along the x-axis, calculating the integral of their product at each
position. When the functions match, the value of (f ⋆ g) is maximized. This is because when peaks (positive areas)
are aligned, they make a large contribution to the integral. Similarly, when troughs (negative areas) align, they also
make a positive contribution to the integral because the product of two negative numbers is positive.
With complex-valued functions f and g, taking the conjugate of f ensures that aligned peaks (or aligned troughs)
with imaginary components will contribute positively to the integral.
In econometrics, lagged cross-correlation is sometimes referred to as cross-autocorrelation.[1]


    Properties
• The correlation of functions f(t) and g(t) is equivalent to the convolution of f*(−t) and g(t). I.e.:

      f(t) ⋆ g(t) = f*(−t) ∗ g(t).

• If f is Hermitian, then f ⋆ g = f ∗ g.
• (f ⋆ g) ⋆ (f ⋆ g) = (f ⋆ f) ⋆ (g ⋆ g).
• Analogous to the convolution theorem, the cross-correlation satisfies:

      F{f ⋆ g} = (F{f})* · F{g},

where F denotes the Fourier transform, and an asterisk again indicates the complex conjugate. Coupled with fast
Fourier transform algorithms, this property is often exploited for the efficient numerical computation of
cross-correlations. (see circular cross-correlation)
• The cross-correlation is related to the spectral density. (see Wiener–Khinchin theorem)
    • The cross correlation of a convolution of f and h with a function g is the convolution of the correlation of f and g
      with the kernel h:




    Normalized cross-correlation
    For image-processing applications in which the brightness of the image and template can vary due to lighting and
    exposure conditions, the images can be first normalized. This is typically done at every step by subtracting the mean
and dividing by the standard deviation. That is, the cross-correlation of a template t(x, y) with a subimage f(x, y)
is

    (1/n) Σ_{x,y} (f(x, y) − f̄)(t(x, y) − t̄) / (σf σt),

where n is the number of pixels in t(x, y) and f(x, y), f̄ is the average of f and σf is the standard deviation of f.
In functional analysis terms, this can be thought of as the dot product of two normalized vectors. That is, if

    F(x, y) = f(x, y) − f̄

and

    T(x, y) = t(x, y) − t̄,

then the above sum is equal to

    ⟨ F/‖F‖, T/‖T‖ ⟩,

where ⟨·,·⟩ is the inner product and ‖·‖ is the L² norm. Thus, if f and t are real matrices, their normalized
    cross-correlation equals the cosine of the angle between the unit vectors F and T, being thus 1 if and only if F equals
    T multiplied by a positive scalar.
    Normalized correlation is one of the methods used for template matching, a process used for finding incidences of a
    pattern or object within an image. It is also the 2-dimensional version of Pearson product-moment correlation
    coefficient.
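A compact sketch of the zero-mean, unit-norm form just described, added as an illustration (the small arrays are assumed):

    import numpy as np

    def normalized_cross_correlation(f, t):
        # cosine of the angle between the mean-subtracted subimage and template
        F = f - f.mean()
        T = t - t.mean()
        return np.sum(F * T) / (np.linalg.norm(F) * np.linalg.norm(T))

    template = np.array([[1.0, 2.0], [3.0, 4.0]])
    bright   = 10.0 + 2.0 * template       # same pattern, different brightness and contrast
    print(normalized_cross_correlation(bright, template))    # 1.0: a perfect match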


    Time series analysis
    In time series analysis, as applied in statistics, the cross correlation between two time series describes the normalized
    cross covariance function.
Let (Xt, Yt) represent a pair of stochastic processes that are jointly wide sense stationary. Then the cross
covariance is given by[2]

    γxy(τ) = E[(Xt − μx)(Yt+τ − μy)],

where μx and μy are the means of Xt and Yt respectively.
The cross correlation function ρxy(τ) is the normalized cross-covariance function:

    ρxy(τ) = γxy(τ) / (σx σy),

where σx and σy are the standard deviations of processes Xt and Yt respectively.
Note that if Xt = Yt for all t, then the cross correlation function is simply the autocorrelation function.


    Scaled correlation
    In the analysis of time series scaled correlation can be applied to reveal cross-correlation exclusively between fast
    components of the signals, the contributions of slow components being removed.[3]


    Time delay analysis
    Cross-correlations are useful for determining the time delay between two signals, e.g. for determining time delays
    for the propagation of acoustic signals across a microphone array.[4][5] After calculating the cross-correlation
    between the two signals, the maximum (or minimum if the signals are negatively correlated) of the cross-correlation
    function indicates the point in time where the signals are best aligned, i.e. the time delay between the two signals is
determined by the argument of the maximum, or arg max of the cross-correlation, as in

    τdelay = argmax_t ((f ⋆ g)(t)).

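A sketch of delay estimation by the arg max of the cross-correlation, added here as an illustration; the sampling rate, test signal and delay are assumptions.

    import numpy as np

    fs = 1000.0                                    # assumed sampling rate, Hz
    t = np.arange(0, 1.0, 1.0 / fs)
    s = np.sin(2 * np.pi * 5 * t) * np.exp(-3 * t) # reference signal (assumed)

    true_delay = 0.040                             # 40 ms
    shift = int(true_delay * fs)
    r = np.concatenate([np.zeros(shift), s[:len(s) - shift]])   # delayed copy

    xcorr = np.correlate(r, s, mode="full")        # peak location encodes the lag
    lag = np.argmax(xcorr) - (len(s) - 1)          # convert array index to a signed lag
    print("estimated delay:", lag / fs, "s")       # 0.04 s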
    References
    [1] Campbell, Lo, and MacKinlay 1996: The Econometrics of Financial Markets, NJ: Princeton University Press.
    [2] von Storch, H.; F. W Zwiers (2001). Statistical analysis in climate research. Cambridge Univ Pr. ISBN 0-521-01230-9.
[3] Nikolić D, Muresan RC, Feng W, Singer W (2012) Scaled correlation analysis: a better way to compute a cross-correlogram. European
    Journal of Neuroscience, pp. 1–21, doi:10.1111/j.1460-9568.2011.07987.x
    http://www.danko-nikolic.com/wp-content/uploads/2012/03/Scaled-correlation-analysis.pdf
    [4] Rhudy, Matthew; Brian Bucci, Jeffrey Vipperman, Jeffrey Allanach, and Bruce Abraham (November 2009). "Microphone Array Analysis
        Methods Using Cross-Correlations". Proceedings of 2009 ASME International Mechanical Engineering Congress, Lake Buena Vista, FL.
    [5] Rhudy, Matthew (November 2009). "Real Time Implementation of a Military Impulse Classifier". University of Pittsburgh, Master's Thesis.


    External links
    •   Cross Correlation from Mathworld (http://mathworld.wolfram.com/Cross-Correlation.html)
    •   http://scribblethink.org/Work/nvisionInterface/nip.html
    •   http://www.phys.ufl.edu/LIGO/stochastic/sign05.pdf
    •   http://www.staff.ncl.ac.uk/oliver.hinton/eee305/Chapter6.pdf
    •   http://www.springerlink.com/content/0150455247264825/fulltext.pdf



    White noise
    White noise is a random signal or process with a flat power spectral density. In other words, the signal contains
    equal power within a fixed bandwidth at any center frequency. White noise draws its name from white light in which
    the power spectral density of the light is distributed over the visible band in such a way that the eye's three color
    receptors (cones) are approximately equally stimulated.
In the statistical sense, a time series rt is characterized as weak white noise if {rt} is a sequence of serially
uncorrelated random variables with zero mean and finite variance. Strong white noise additionally has the property
of being independent and identically distributed, which implies no autocorrelation. In particular, if rt is normally
distributed with mean zero and standard deviation σ, the series is called Gaussian white noise.[1]
    An infinite-bandwidth white noise signal is a purely theoretical construction. The bandwidth of white noise is limited
    in practice by the mechanism of noise generation, by the transmission medium and by finite observation capabilities.
    A random signal is considered "white noise" if it is observed to have a flat spectrum over a medium's widest possible
    bandwidth.


    White noise in a spatial context
    While it is usually applied in the context of frequency domain signals, the term white noise is also commonly applied
    to a noise signal in the spatial domain. In this case, it has an auto correlation which can be represented by a delta
    function over the relevant space dimensions. The signal is then "white" in the spatial frequency domain (this is
    equally true for signals in the angular frequency domain, e.g., the distribution of a signal across all angles in the
    night sky).


    Statistical properties
[Figure: An example realization of a Gaussian white noise process.]
The figure shows a finite length, discrete time realization of a white noise process generated from a computer.
Being uncorrelated in time does not restrict the values a signal can take. Any distribution of values is possible
(although it must have zero DC component). Even a binary signal which can only take on the values 1 or −1 will be
white if the sequence is statistically uncorrelated. Noise having a continuous distribution, such as a normal
distribution, can of course be white.
It is often incorrectly assumed that Gaussian noise (i.e., noise with a Gaussian amplitude distribution; see normal
distribution) is necessarily white noise, yet neither property implies the other. Gaussianity refers to the probability
distribution with respect to the value, in this context the probability of the signal reaching an amplitude, while the
term 'white' refers to the way the signal power is distributed over time or among frequencies.


We can therefore find Gaussian white noise, but also Poisson, Cauchy, etc. white noises. Thus, the two words
"Gaussian" and "white" are often both specified in mathematical models of systems. Gaussian white noise is a good
approximation of many real-world situations and generates mathematically tractable models. These models are used
so frequently that the term additive white Gaussian noise has a standard abbreviation: AWGN.
White noise is the generalized mean-square derivative of the Wiener process or Brownian motion.
[Figure: Spectrogram of pink noise (left) and white noise (right), shown with linear frequency axis (vertical).]


Applications
    White noise is commonly used in the production of electronic music, usually either directly or as an input for a filter
    to create other types of noise signal. It is used extensively in audio synthesis, typically to recreate percussive
    instruments such as cymbals which have high noise content in their frequency domain.
    It is also used to generate impulse responses. To set up the equalization (EQ) for a concert or other performance in a
    venue, a short burst of white or pink noise is sent through the PA system and monitored from various points in the
    venue so that the engineer can tell if the acoustics of the building naturally boost or cut any frequencies. The
    engineer can then adjust the overall equalization to ensure a balanced mix.
    White noise can be used for frequency response testing of amplifiers and electronic filters. It is not used for testing
    loudspeakers as its spectrum contains too great an amount of high frequency content. Pink noise is used for testing
    transducers such as loudspeakers and microphones. White noise is used as the basis of some random number
    generators. For example, Random.org uses a system of atmospheric antennae to generate random digit patterns from
    white noise.
    White noise is a common synthetic noise source used for sound masking by a tinnitus masker.[2] White noise
    machines and other white noise sources are sold as privacy enhancers and sleep aids and to mask tinnitus.[3]
    Alternatively, the use of an FM radio tuned to unused frequencies ("static") is a simpler and more cost-effective
    source of white noise.[4] However, white noise generated from a common commercial radio receiver tuned to an
    unused frequency is extremely vulnerable to being contaminated with spurious signals, such as adjacent radio
    stations, harmonics from non-adjacent radio stations, electrical equipment in the vicinity of the receiving antenna
    causing interference, or even atmospheric events such as solar flares and especially lightning.
The effects of white noise upon cognitive function are mixed. Recently, a small study found that white noise
background stimulation improves cognitive functioning among secondary students with attention deficit
hyperactivity disorder (ADHD), while decreasing performance of non-ADHD students.[5][6] Other work indicates it
    is effective in improving the mood and performance of workers by masking background office noise,[7] but decreases
    cognitive performance in complex card sorting tasks.[8]


    Mathematical definition

    White random vector
A random vector w is a white random vector if and only if its mean vector and autocorrelation matrix are the
following:

    μw = E{w} = 0,
    Rww = E{w w^T} = σ² I.
    That is, it is a zero mean random vector, and its autocorrelation matrix is a multiple of the identity matrix. When the
    autocorrelation matrix is a multiple of the identity, we say that it has spherical correlation.


    White random process (white noise)
A continuous time random process w(t), where t ∈ R, is a white noise process if and only if its mean function and
autocorrelation function satisfy the following:

    μw(t) = E{w(t)} = 0,
    Rww(t1, t2) = E{w(t1) w(t2)} = (N0/2) δ(t1 − t2),

i.e. it is a zero mean process for all time and has infinite power at zero time shift since its autocorrelation function is
the Dirac delta function.
The above autocorrelation function implies the following power spectral density:

    Sww(ω) = N0/2,

since the Fourier transform of the delta function is equal to 1. Since this power spectral density is the same at all
frequencies, we call it white as an analogy to the frequency spectrum of white light.
    A generalization to random elements on infinite dimensional spaces, such as random fields, is the white noise
    measure.


    Random vector transformations
    Two theoretical applications using a white random vector are the simulation and whitening of another arbitrary
    random vector. To simulate an arbitrary random vector, we transform a white random vector with a carefully chosen
    matrix. We choose the transformation matrix so that the mean and covariance matrix of the transformed white
    random vector matches the mean and covariance matrix of the arbitrary random vector that we are simulating. To
    whiten an arbitrary random vector, we transform it by a different carefully chosen matrix so that the output random
    vector is a white random vector.
    These two ideas are crucial in applications such as channel estimation and channel equalization in communications
    and audio. These concepts are also used in data compression.
White noise                                                                                                                                      38


    Simulating a random vector
Suppose that a random vector x has covariance matrix Kxx. Since this matrix is Hermitian symmetric and positive
semidefinite, by the spectral theorem from linear algebra, we can diagonalize or factor the matrix in the following
way:

    Kxx = E Λ E^T,

where E is the orthogonal matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues. Thus, the inverse
equation Λ = E^T Kxx E also holds.
We can simulate the 1st and 2nd moment properties of this random vector x with mean μ and covariance matrix
Kxx via the following transformation of a white vector w of unit variance:

    x = H w + μ,

where

    H = E Λ^(1/2).

Thus, the output of this transformation has expectation

    E{x} = H E{w} + μ = μ

and covariance matrix

    Kxx = E{(x − μ)(x − μ)^T} = H E{w w^T} H^T = H H^T = E Λ^(1/2) Λ^(1/2) E^T = E Λ E^T.
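A sketch of the transformation above in NumPy, added for illustration; the target mean and covariance matrix are arbitrary assumptions.

    import numpy as np

    K_xx = np.array([[4.0, 1.5],
                     [1.5, 1.0]])           # desired covariance matrix (assumed)
    mu = np.array([10.0, -2.0])             # desired mean (assumed)

    lam, E = np.linalg.eigh(K_xx)           # K_xx = E diag(lam) E^T
    H = E @ np.diag(np.sqrt(lam))           # H = E Lambda^(1/2)

    rng = np.random.default_rng(6)
    w = rng.normal(size=(2, 100_000))       # white vectors of unit variance
    x = H @ w + mu[:, None]                 # simulated samples

    print(np.cov(x))                        # close to K_xx
    print(x.mean(axis=1))                   # close to mu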




    Random signal transformations
The same two concepts of simulating and whitening can be extended to the case of continuous time random signals
or processes, with filters taking the place of matrices. For simulating, we create a filter into which we feed a white noise signal. We choose the filter so that
    the output signal simulates the 1st and 2nd moments of any arbitrary random process. For whitening, we feed any
    arbitrary random signal into a specially chosen filter so that the output of the filter is a white noise signal.


    Simulating a continuous-time random signal
[Figure: White noise fed into a linear, time-invariant filter to simulate the 1st and 2nd moments of an arbitrary
random process.]
White noise can simulate any wide-sense stationary, continuous-time random process x(t) with constant mean μ,
covariance function

    Kx(τ) = E{(x(t1) − μ)(x(t2) − μ)},   τ = t1 − t2,

and power spectral density

    Sx(ω) = ∫_{−∞}^{∞} Kx(τ) e^{−iωτ} dτ.

We can simulate this signal using frequency domain techniques.
Because Kx(τ) is Hermitian symmetric and positive semi-definite, it follows that Sx(ω) is real and can be
factored as

    Sx(ω) = |H(ω)|² = H(ω) H*(ω)

if and only if Sx(ω) satisfies the Paley-Wiener criterion,

    ∫_{−∞}^{∞} |log(Sx(ω))| / (1 + ω²) dω < ∞.

If Sx(ω) is a rational function, we can then factor it into pole-zero form as




Choosing a minimum phase H(ω) so that its poles and zeros lie inside the left half s-plane, we can then simulate
x(t) with H(ω) as the transfer function of the filter.
We can simulate x(t) by constructing the following linear, time-invariant filter

    x̂(t) = F^{-1}{H(ω)} ∗ w(t) + μ,

where w(t) is a continuous-time, white-noise signal with the following 1st and 2nd moment properties:

    E{w(t)} = 0,
    E{w(t1) w(t2)} = Kw(t1, t2) = δ(t1 − t2).

Thus, the resultant signal x̂(t) has the same 2nd moment properties as the desired signal x(t).

    Whitening a continuous-time random signal
[Figure: An arbitrary random process x(t) fed into a linear, time-invariant filter that whitens x(t) to create white
noise at the output.]
Suppose we have a wide-sense stationary, continuous-time random process x(t) defined with the same mean μ,
covariance function Kx(τ), and power spectral density Sx(ω) as above.
We can whiten this signal using frequency domain techniques. We factor the power spectral density Sx(ω) as
described above.
Choosing the minimum phase H(ω) so that its poles and zeros lie inside the left half s-plane, we can then whiten
x(t) with the following inverse filter

    Hinv(ω) = 1 / H(ω).

We choose the minimum phase filter so that the resulting inverse filter is stable. Additionally, we must be sure that
Sx(ω) is strictly positive for all ω so that Hinv(ω) does not have any singularities.
    The final form of the whitening procedure is as follows:


    so that       is a white noise random process with zero mean and constant, unit power spectral density




    Note that this power spectral density corresponds to a delta function for the covariance function of                          .
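A discrete-time illustration of the inverse-filter idea (the AR(1) shaping filter and its parameter are assumptions; the continuous-time treatment above is the rigorous version): a signal produced by a known one-pole filter is whitened by the corresponding inverse FIR filter.

```python
import numpy as np
from scipy import signal

# Sketch: whiten a correlated signal by applying the inverse of the (minimum-phase)
# filter that shaped it. Here x is an AR(1) process, x[n] = phi*x[n-1] + w[n],
# so the whitening filter is the FIR filter [1, -phi].
rng = np.random.default_rng(2)
phi, n = 0.9, 200_000                            # assumed shaping parameter and length

w = rng.standard_normal(n)                       # white noise
x = signal.lfilter([1.0], [1.0, -phi], w)        # shaping (coloring) filter: one pole at phi
w_hat = signal.lfilter([1.0, -phi], [1.0], x)    # inverse filter: one zero at phi

def acf(y, k):                                   # sample autocorrelation at lag k
    y = y - y.mean()
    return np.dot(y[:-k], y[k:]) / np.dot(y, y)

print("lag-1 autocorrelation of x     :", acf(x, 1))      # ~ phi, strongly correlated
print("lag-1 autocorrelation of w_hat :", acf(w_hat, 1))  # ~ 0, approximately white
```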


    Generation
    White noise may be generated digitally with a digital signal processor, microprocessor, or microcontroller.
    Generating white noise typically entails feeding an appropriate stream of random numbers to a digital-to-analog
    converter. The quality of the white noise will depend on the quality of the algorithm used.[9]
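A minimal sketch of such digital generation (NumPy's Gaussian generator, the 48 kHz rate, and the 16-bit scaling are illustrative assumptions; a real system would stream the samples to a digital-to-analog converter):

```python
import numpy as np

# Sketch: generate one second of white Gaussian noise at 48 kHz and scale it
# to 16-bit integers, the kind of sample stream a DAC would consume.
rng = np.random.default_rng(42)
fs = 48_000                                 # sample rate in Hz
samples = rng.standard_normal(fs)           # zero-mean, unit-variance Gaussian noise
pcm = np.clip(samples * 0.2, -1.0, 1.0)     # leave headroom, then clip to [-1, 1]
pcm16 = (pcm * 32767).astype(np.int16)      # 16-bit PCM samples for a DAC
print(pcm16[:10])
```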


    References
[1] Diebold, Frank (2007). Elements of Forecasting (Fourth ed.).
[2] Jastreboff, P. J. (2000). "Tinnitus Habituation Therapy (THT) and Tinnitus Retraining Therapy (TRT)". Tinnitus Handbook. San Diego: Singular. pp. 357–376.
[3] López, HH; Bracha, AS; Bracha, HS (September 2002). "Evidence based complementary intervention for insomnia" (http://cogprints.org/5032/1/2002_H.M.J_White-noise_for_PTSD.pdf). Hawaii Med J 61 (9): 192, 213. PMID 12422383.
[4] Noell, Courtney A; William L Meyerhoff (February 2003). "Tinnitus. Diagnosis and treatment of this elusive symptom". Geriatrics 58 (2): 28–34. ISSN 0016-867X. PMID 12596495.
[5] Soderlund, Goran; Sverker Sikstrom, Jan Loftesnes, Edmund Sonuga Barke (2010). "The effects of background white noise on memory performance in inattentive school children". Behavioral and Brain Functions 6 (1): 55.
[6] Söderlund, Göran; Sverker Sikström, Andrew Smart (2007). "Listen to the noise: Noise is beneficial for cognitive performance in ADHD". Journal of Child Psychology and Psychiatry 48 (8): 840–847. doi:10.1111/j.1469-7610.2007.01749.x. ISSN 0021-9630.
[7] Loewen, Laura J.; Peter Suedfeld (1992-05-01). "Cognitive and Arousal Effects of Masking Office Noise" (http://eab.sagepub.com/content/24/3/381.abstract). Environment and Behavior 24 (3): 381–395. doi:10.1177/0013916592243006. Retrieved 2011-10-28.
[8] Baker, Mary Anne; Dennis H. Holding (July 1993). "The effects of noise and speech on cognitive task performance". Journal of General Psychology 120 (3): 339–355. ISSN 0022-1309.
[9] Matt Donadio. "How to Generate White Gaussian Noise" (http://www.dspguru.com/dsp/howtos/how-to-generate-white-gaussian-noise). Retrieved 2012-09-19.



    External links
    • Meaning of a White-Noise Process (http://www.digitalsignallabs.com/white.pdf) - "proper" definition of the
      term white noise



Random walk
A random walk is a mathematical formalization of a path that consists of a succession of random steps. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the financial status of a gambler can all be modeled as random walks, although they may not be truly random in reality. The term random walk was first introduced by Karl Pearson in 1905.[1] Random walks have been used in many fields: ecology, economics, psychology, computer science, physics, chemistry, and biology.[2][3][4][5][6][7][8][9] Random walks explain the observed behaviors of processes in these fields, and thus serve as a fundamental model for the recorded stochastic activity.

Example of eight random walks in one dimension starting at 0. The plot shows the current position on the line (vertical axis) versus the time steps (horizontal axis).

Various different types of random walks are of interest. Often, random walks are assumed to be Markov chains or Markov processes, but other, more complicated walks are also of interest. Some random walks are on graphs, others on the line, in the plane, or in higher dimensions, while some random walks are on groups. Random walks also vary with regard to the time parameter. Often, the walk is in discrete time, and indexed by the natural numbers, as in $S_0, S_1, S_2, \ldots$. However, some walks take their steps at random times, and in that case the position $X_t$ is defined for the continuum of times $t \ge 0$. Specific cases or limits of random walks include the Lévy flight. Random walks are related to the diffusion models and are a fundamental topic in discussions of Markov processes. Several properties of random walks, including dispersal distributions, first-passage times and encounter rates, have been extensively studied.

An animated example of a Brownian motion-like random walk with toroidal boundary conditions.


   Lattice random walk
   A popular random walk model is that of a random walk on a regular lattice, where at each step the location jumps to
   another site according to some probability distribution. In a simple random walk, the location can only jump to
   neighboring sites of the lattice. In simple symmetric random walk on a locally finite lattice, the probabilities of the
location jumping to each one of its immediate neighbours are the same. The best studied example is of random walk on the d-dimensional integer lattice $\mathbb{Z}^d$ (sometimes called the hypercubic lattice).


   One-dimensional random walk
An elementary example of a random walk is the random walk on the integer number line, $\mathbb{Z}$, which starts at 0 and at each step moves +1 or −1 with equal probability.
This walk can be illustrated as follows. A marker is placed at zero on the number line and a fair coin is flipped. If it lands on heads, the marker is moved one unit to the right. If it lands on tails, the marker is moved one unit to the left. After five flips, the marker could now be on 1, −1, 3, −3, 5, or −5. With five flips, three heads and two tails, in any order, will land on 1. There are 10 ways of landing on 1 (by flipping three heads and two tails), 10 ways of landing on −1 (by flipping three tails and two heads), 5 ways of landing on 3 (by flipping four heads and one tail), 5 ways of landing on −3 (by flipping four tails and one head), 1 way of landing on 5 (by flipping five heads), and 1 way of landing on −5 (by flipping five tails). See the figure below for an illustration of the possible outcomes of 5 flips.




                                      All possible random walk outcomes after 5 flips of a fair coin


To define this walk formally, take independent random variables $Z_1, Z_2, \ldots$, where each variable is either 1 or −1, with a 50% probability for either value, and set $S_0 = 0$ and $S_n = \sum_{j=1}^{n} Z_j$. The series $\{S_n\}$ is called the simple random walk on $\mathbb{Z}$. This series (the sum of the sequence of −1s and 1s) gives the distance walked, if each part of the walk is of length one. The expectation $E[S_n]$ of $S_n$ is zero. That is, the mean of all coin flips approaches zero as the number of flips increases. This follows by the finite additivity property of expectation:
$$E[S_n] = \sum_{j=1}^{n} E[Z_j] = 0.$$
A similar calculation, using the independence of the random variables and the fact that $E[Z_n^2] = 1$, shows that:
$$E[S_n^2] = \sum_{j=1}^{n} E[Z_j^2] = n.$$
This hints that $E[|S_n|]$, the expected translation distance after n steps, should be of the order of $\sqrt{n}$. In fact,
$$\lim_{n\to\infty} \frac{E[|S_n|]}{\sqrt{n}} = \sqrt{\frac{2}{\pi}}.$$
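These moment claims are easy to check by simulation; the sketch below (sample sizes are arbitrary choices) estimates $E[S_n]$, $E[S_n^2]$ and $E[|S_n|]$ for the simple random walk:

```python
import numpy as np

# Sketch: Monte Carlo check that E[S_n] = 0, E[S_n^2] = n and E|S_n| ~ sqrt(2n/pi).
rng = np.random.default_rng(0)
n, trials = 500, 100_000

steps = 2 * rng.integers(0, 2, size=(trials, n), dtype=np.int8) - 1   # each Z_j is +1 or -1
S_n = steps.sum(axis=1)                                               # endpoint of each walk

print("E[S_n]   ", S_n.mean(), "   (theory 0)")
print("E[S_n^2] ", (S_n.astype(float) ** 2).mean(), "   (theory", n, ")")
print("E|S_n|   ", np.abs(S_n).mean(), "   (theory ~", np.sqrt(2 * n / np.pi), ")")
```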



Derivation of the $\sqrt{N}$ dispersion proportionality

If we have the situation where the probabilities of moving either left or right are equal, $p = q = 1/2$, the probability of taking $k$ steps to the right out of a total of $N$ steps is given by
$$P(k;N) = \binom{N}{k}\left(\frac{1}{2}\right)^{N},$$
since there are $\binom{N}{k}$ possible ways of taking $k$ and $N-k$ steps to the right and left, respectively. The probability of taking any of these independent steps is 1/2, and so we have the product $(1/2)^N$. Now, the expectation value of taking $k$ steps is
$$\langle k \rangle = \sum_{k=0}^{N} k\,P(k;N) = \frac{N}{2}.$$
It is generally the case that $\langle k \rangle = Np$ for an arbitrary step probability $p$. Note the identity we have used with the binomial coefficient, $k\binom{N}{k} = N\binom{N-1}{k-1}$. We use it again below. We must then calculate the expectation value of $k^2$:
$$\langle k^2 \rangle = \sum_{k=0}^{N} k^2\,P(k;N) = \frac{N(N+1)}{4}.$$
It is generally the case that $\langle k^2 \rangle = Npq + N^2p^2$. The dispersion is
$$\sigma_k^2 = \langle k^2 \rangle - \langle k \rangle^2 = \frac{N}{4}, \qquad \sigma_k = \frac{\sqrt{N}}{2}.$$
Since the net displacement of the walk is $S_N = 2k - N$, its root mean square value grows like $2\sigma_k = \sqrt{N}$. This result shows that diffusion is ineffective for mixing because of the way the square root behaves for large $N$.


How many times will a random walk cross a boundary line if permitted to continue walking forever? A simple random walk on $\mathbb{Z}$ will cross every point an infinite number of times. This result has many names: the
    level-crossing phenomenon, recurrence or the gambler's ruin. The reason for the last name is as follows: a gambler
    with a finite amount of money will always lose when playing a fair game against a bank with an infinite amount of
    money. The gambler's money will perform a random walk, and it will reach zero at some point, and the game will be
    over.
If a and b are positive integers, then the expected number of steps until a one-dimensional simple random walk starting at 0 first hits b or −a is ab. The probability that this walk will hit b before −a is $\frac{a}{a+b}$, which can be derived from the fact that simple random walk is a martingale.
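Both facts can be checked by simulation. The sketch below (the particular values of a and b are arbitrary) estimates the expected hitting time and the probability of reaching b before −a:

```python
import numpy as np

# Sketch: for a simple random walk started at 0, estimate the mean number of steps to
# first hit b or -a (theory: a*b) and the probability of hitting b first (theory: a/(a+b)).
rng = np.random.default_rng(0)
a, b, trials = 4, 6, 20_000

steps_taken, hit_b_first = [], 0
for _ in range(trials):
    pos, steps = 0, 0
    while -a < pos < b:
        pos += 1 if rng.random() < 0.5 else -1   # one fair +/-1 step
        steps += 1
    steps_taken.append(steps)
    hit_b_first += (pos == b)

print("mean hitting time  ", np.mean(steps_taken), "  theory", a * b)
print("P(hit b before -a) ", hit_b_first / trials, "  theory", a / (a + b))
```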
Some of the results mentioned above can be derived from properties of Pascal's triangle. The number of different walks of n steps where each step is +1 or −1 is clearly $2^n$. For the simple random walk, each of these walks is equally likely. In order for $S_n$ to be equal to a number k it is necessary and sufficient that the number of +1 in the walk exceeds those of −1 by k. Thus, the number of walks which satisfy $S_n = k$ is precisely the number of ways of choosing (n + k)/2 elements from an n element set (for this to be non-zero, it is necessary that n + k be an even number), which is an entry in Pascal's triangle denoted by $\binom{n}{(n+k)/2}$. Therefore, the probability that $S_n = k$ is equal to
$$P(S_n = k) = 2^{-n}\binom{n}{\frac{n+k}{2}}.$$
By representing entries of Pascal's triangle in terms of factorials and using Stirling's formula, one can obtain good estimates for these probabilities for large values of $n$.
   This relation with Pascal's triangle is demonstrated for small values of n. At zero turns, the only possibility will be to
   remain at zero. However, at one turn, there is one chance of landing on −1 or one chance of landing on 1. At two
turns, a marker at 1 could move to 2 or back to zero. A marker at −1 could move to −2 or back to zero. Therefore,
   there is one chance of landing on −2, two chances of landing on zero, and one chance of landing on 2.

    n    −5   −4   −3   −2   −1    0    1    2    3    4    5

    0                               1
    1                          1         1
    2                     1         2         1
    3                1         3         3         1
    4           1         4         6         4         1
    5      1         5        10        10         5         1


   The central limit theorem and the law of the iterated logarithm describe important aspects of the behavior of simple
   random walk on        . In particular, the former entails that as n increases, the probabilities (proportional to the
   numbers in each row) approach a normal distribution.
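The binomial expression for $P(S_n = k)$ given above is easy to evaluate directly; the following sketch (the values of n and k are arbitrary) compares it with the normal approximation suggested by the central limit theorem, where the factor 2 accounts for the lattice spacing of the possible endpoints:

```python
from math import comb, sqrt, pi, exp

# Sketch: P(S_n = k) = C(n, (n+k)/2) * 2^(-n) for n + k even, compared with the
# normal density (mean 0, variance n) multiplied by the lattice spacing 2.
def prob_endpoint(n, k):
    if (n + k) % 2 or abs(k) > n:
        return 0.0
    return comb(n, (n + k) // 2) * 0.5 ** n

n = 100
for k in (0, 2, 10, 20):
    exact = prob_endpoint(n, k)
    normal_approx = 2 / sqrt(2 * pi * n) * exp(-k ** 2 / (2 * n))
    print(f"n={n}, k={k}: exact={exact:.5f}, normal approximation={normal_approx:.5f}")
```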

   As a Markov chain
A one-dimensional random walk can also be looked at as a Markov chain whose state space is given by the integers $i = 0, \pm 1, \pm 2, \ldots$. For some number p satisfying $0 < p < 1$, the transition probabilities (the probability $P_{i,j}$ of moving from state i to state j) are given by
$$P_{i,i+1} = p = 1 - P_{i,i-1}.$$
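A short sketch of this Markov-chain view (truncating the state space to −N,…,N and the chosen values of p and the number of steps are assumptions needed to keep the matrix finite): build the transition matrix and propagate an initial distribution concentrated at 0.

```python
import numpy as np

# Sketch: transition matrix of a walk with P[i, i+1] = p and P[i, i-1] = 1 - p on the
# states -N..N. The two ends hold the walker in place on an outward step so that rows
# still sum to 1; with these parameters the boundary is never reached anyway.
N, p, steps = 50, 0.5, 30
size = 2 * N + 1
P = np.zeros((size, size))
for i in range(size):
    P[i, min(i + 1, size - 1)] += p        # step right
    P[i, max(i - 1, 0)] += 1 - p           # step left

dist = np.zeros(size)
dist[N] = 1.0                              # start at state 0 (index N)
for _ in range(steps):
    dist = dist @ P                        # one step of the chain

positions = np.arange(-N, N + 1)
mean = dist @ positions
print("mean position    ", mean)                          # ~ steps * (2p - 1)
print("position variance", dist @ positions**2 - mean**2) # ~ 4*p*(1-p)*steps
```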



   Gaussian random walk
A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets. The Black–Scholes formula for modeling option prices, for example, uses a Gaussian random walk as an underlying assumption.
Here, the step size is the inverse cumulative normal distribution $\Phi^{-1}(z;\mu,\sigma)$, where 0 ≤ z ≤ 1 is a uniformly distributed random number, and μ and σ are the mean and standard deviation of the normal distribution, respectively.
For steps distributed according to any distribution with zero mean and a finite variance (not necessarily just a normal distribution), the root mean square translation distance after n steps is
$$\sqrt{E[S_n^2]} = \sigma\sqrt{n}.$$
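A small sketch of this construction (scipy's norm.ppf is used as the inverse cumulative normal; the parameter values are arbitrary): the walk is built from inverse-CDF transformed uniform numbers, and for μ = 0 its root-mean-square displacement grows like σ√n.

```python
import numpy as np
from scipy.stats import norm

# Sketch: Gaussian random walk whose steps are Phi^{-1}(z; mu, sigma) with z uniform on (0, 1).
rng = np.random.default_rng(0)
mu, sigma, n, trials = 0.0, 2.0, 1_000, 5_000

z = rng.uniform(size=(trials, n))             # uniform random numbers in (0, 1)
steps = norm.ppf(z, loc=mu, scale=sigma)      # inverse cumulative normal => N(mu, sigma^2) steps
walks = steps.cumsum(axis=1)                  # S_1, ..., S_n for each trial

rms = np.sqrt((walks[:, -1] ** 2).mean())
print("RMS distance after n steps:", rms, "  theory:", sigma * np.sqrt(n))
```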


   Higher dimensions
   Imagine now a drunkard walking randomly in an
   idealized city. The city is effectively infinite and
   arranged in a square grid, and at every
   intersection, the drunkard chooses one of the
   four possible routes (including the one he came
   from) with equal probability. Formally, this is a
   random walk on the set of all points in the plane
   with integer coordinates. Will the drunkard ever
   get back to his home from the bar? It turns out
   that he will. This is the high dimensional
   equivalent of the level crossing problem
   discussed above. The probability of returning to
   the origin decreases as the number of dimensions
   increases. In three dimensions, the probability
decreases to roughly 34%. A derivation, along with values of p(d), is discussed here: Pólya's Random Walk Constants [10].

   The trajectory of a random walk is the collection
   of sites it visited, considered as a set with
   disregard to when the walk arrived at the point.
                                                                               Random walk in two dimensions
   In one dimension, the trajectory is simply all
   points between the minimum height the walk
   achieved and the maximum (both are, on average, on the order of √n). In higher dimensions the set has interesting
   geometric properties. In fact, one gets a discrete fractal, that is a set which exhibits stochastic self-similarity on large
   scales, but on small scales one can observe "jaggedness" resulting from the grid on which the walk is performed. The
   two books of Lawler referenced below are a good source on this topic.




         Random walk in two dimensions with more, and smaller, steps




        Random walk in two dimensions with two million even smaller steps. This
          image was generated in such a way that points that are more frequently
       traversed are darker. In the limit, for very small steps, one obtains Brownian
                                            motion.




                      Three random walks in three dimensions




   Relation to Wiener process
   A Wiener process is a stochastic process with similar behaviour to
   Brownian motion, the physical phenomenon of a minute particle
   diffusing in a fluid. (Sometimes the Wiener process is called
   "Brownian motion", although this is strictly speaking a confusion
   of a model with the phenomenon being modeled.)
A Wiener process is the scaling limit of random walk in dimension 1. This means that if you take a random walk with very small steps you get an approximation to a Wiener process (and, less accurately, to Brownian motion). To be more precise, if the step size is ε, one needs to take a walk of length L/ε² to approximate a Wiener process walk of length L. As the step size tends to 0 (and the number of steps increases proportionally) random walk converges to a Wiener process in an appropriate sense. Formally, if B is the space of all paths of length L with the maximum topology, and if M is the space of measure over B with the norm topology, then the convergence is in the space M. Similarly, a Wiener process in several dimensions is the scaling limit of random walk in the same number of dimensions.

Simulated steps approximating a Wiener process in two dimensions

A random walk is a discrete fractal (a function with integer dimensions; 1, 2, ...), but a Wiener process trajectory is a true fractal, and there is a connection between the two. For example, take a random walk until it hits a circle of radius r times the step length. The average number of steps it performs is r². This fact is the discrete version of the fact that a Wiener process walk is a fractal of Hausdorff dimension 2.
In two dimensions, the average number of points the same random walk has on the boundary of its trajectory is r^{4/3}. This corresponds to the fact that the boundary of the trajectory of a Wiener process is a fractal of dimension 4/3, a fact predicted by Mandelbrot using simulations but proved only in 2000 by Lawler, Schramm and Werner.[11]
   A Wiener process enjoys many symmetries random walk does not. For example, a Wiener process walk is invariant
   to rotations, but random walk is not, since the underlying grid is not (random walk is invariant to rotations by 90
   degrees, but Wiener processes are invariant to rotations by, for example, 17 degrees too). This means that in many
   cases, problems on random walk are easier to solve by translating them to a Wiener process, solving the problem
there, and then translating back. On the other hand, some problems are easier to solve with random walks due to their discrete nature.
   Random walk and Wiener process can be coupled, namely manifested on the same probability space in a dependent
   way that forces them to be quite close. The simplest such coupling is the Skorokhod embedding, but other, more
   precise couplings exist as well.
   The convergence of a random walk toward the Wiener process is controlled by the central limit theorem. For a
   particle in a known fixed position at t = 0, the theorem tells us that after a large number of independent steps in the
random walk, the walker's position is distributed according to a normal distribution of total variance
$$\sigma^2 = \frac{t}{\delta t}\,\varepsilon^2,$$
where t is the time elapsed since the start of the random walk, $\varepsilon$ is the size of a step of the random walk, and $\delta t$ is the time elapsed between two successive steps.
   This corresponds to the Green function of the diffusion equation that controls the Wiener process, which
   demonstrates that, after a large number of steps, the random walk converges toward a Wiener process.
In 3D, the variance corresponding to the Green's function of the diffusion equation is:
$$\sigma^2 = 6\,D\,t.$$
By equalizing this quantity with the variance associated to the position of the random walker, one obtains the equivalent diffusion coefficient to be considered for the asymptotic Wiener process toward which the random walk converges after a large number of steps:
$$D = \frac{\varepsilon^2}{6\,\delta t} \qquad \text{(valid only in 3D)}.$$
Remark: the two expressions of the variance above correspond to the distribution associated to the vector $\vec{R}$ that links the two ends of the random walk, in 3D. The variance associated to each component $R_x$, $R_y$ or $R_z$ is only one third of this value (still in 3D).
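A quick numerical sketch of these relations (step length, time step, and sample sizes are arbitrary choices): simulate 3D lattice walks with steps of fixed length ε taken every δt and compare the empirical variance of the end-to-end vector with (t/δt)ε² and with 6Dt for D = ε²/(6δt).

```python
import numpy as np

# Sketch: 3D walk with steps of length eps taken every dt; the variance of the
# end-to-end vector should be (t/dt)*eps^2, which equals 6*D*t with D = eps^2/(6*dt).
rng = np.random.default_rng(0)
eps, dt = 0.1, 0.01                 # step size and time between steps (arbitrary)
n_steps, trials = 2_000, 5_000
t = n_steps * dt

# each step: +/-eps along one of the three axes, chosen uniformly
axes = rng.integers(0, 3, size=(trials, n_steps), dtype=np.int8)
signs = 2 * rng.integers(0, 2, size=(trials, n_steps), dtype=np.int8) - 1
R = np.zeros((trials, 3))
for axis in range(3):
    R[:, axis] = (signs * (axes == axis)).sum(axis=1) * eps

var_R = (R ** 2).sum(axis=1).mean()   # E[|R|^2]; the mean displacement is zero
D = eps ** 2 / (6 * dt)
print("simulated variance:", var_R)
print("(t/dt) * eps^2    :", (t / dt) * eps ** 2)
print("6 * D * t         :", 6 * D * t)
```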


   Anomalous diffusion
In disordered systems such as porous media and fractals, the mean squared displacement $\langle r^2 \rangle$ may not be proportional to the time $t$ but to $t^{2/d_w}$. The exponent $d_w$ is called the anomalous diffusion exponent and can be larger or smaller than 2.[12]


   Applications
   The following are some applications of random walk:
   • In economics, the "random walk hypothesis" is used to model shares
     prices and other factors. Empirical studies found some deviations
     from this theoretical model, especially in short term and long term
     correlations. See share prices.
   • In population genetics, random walk describes the statistical
     properties of genetic drift
   • In physics, random walks are used as simplified models of physical
     Brownian motion and diffusion such as the random movement of
     molecules in liquids and gases. See for example diffusion-limited
     aggregation. Also in physics, random walks and some of the self
     interacting walks play a role in quantum field theory.
   • In mathematical ecology, random walks are used to describe
     individual animal movements, to empirically support processes of
     biodiffusion, and occasionally to model population dynamics.               Antony Gormley's Quantum Cloud sculpture in
   • In polymer physics, random walk describes an ideal chain. It is the         London was designed by a computer using a
                                                                                          random walk algorithm.
     simplest model to study polymers.
   • In other fields of mathematics, random walk is used to calculate
     solutions to Laplace's equation, to estimate the harmonic measure, and for various constructions in analysis and
     combinatorics.
• In computer science, random walks are used to estimate the size of the Web. In the World Wide Web Conference 2006 [13], Bar-Yossef et al. published their findings and algorithms for this problem.
   • In image segmentation, random walks are used to determine the labels (i.e., "object" or "background") to
     associate with each pixel.[14] This algorithm is typically referred to as the random walker segmentation algorithm.
   In all these cases, random walk is often substituted for Brownian motion.
   • In brain research, random walks and reinforced random walks are used to model cascades of neuron firing in the
     brain.
   • In vision science, fixational eye movements are well described by a random walk.[15]
   • In psychology, random walks explain accurately the relation between the time needed to make a decision and the
     probability that a certain decision will be made.[16]
• Random walks can be used to sample from a state space which is unknown or very large, for example to pick a random page off the internet or, in research on working conditions, a random worker in a given country.
      • When this last approach is used in computer science it is known as Markov Chain Monte Carlo or MCMC for
        short. Often, sampling from some complicated state space also allows one to get a probabilistic estimate of the
        space's size. The estimate of the permanent of a large matrix of zeros and ones was the first major problem
        tackled using this approach.
   • Random walks have also been used to sample massive online graphs such as online social networks.
   • In wireless networking, a random walk is used to model node movement.
   • Motile bacteria engage in a biased random walk.
   • Random walks are used to model gambling.


   • In physics, random walks underlie the method of Fermi estimation.


   Variants of random walks
A number of types of stochastic processes have been considered that are similar to the pure random walks but where the simple structure is allowed to be more generalized. The pure structure can be characterized by the steps being defined by independent and identically distributed random variables.


   Random walk on graphs
A random walk of length k on a possibly infinite graph G with a root 0 is a stochastic process with random variables $X_1, X_2, \ldots, X_k$ such that $X_1 = 0$ and $X_{i+1}$ is a vertex chosen uniformly at random from the neighbors of $X_i$. Then the number $p_{v,w,k}(G)$ is the probability that a random walk of length k starting at v ends at w. In particular, if G is a graph with root 0, $p_{0,0,2k}$ is the probability that a $2k$-step random walk returns to 0.
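A compact sketch of this definition (the small example graph and the number of trials are assumptions for illustration): walk by repeatedly choosing a uniformly random neighbor and estimate the return probability $p_{0,0,2k}$.

```python
import random

# Sketch: simple random walk on a small undirected graph, given as an adjacency list.
# Estimates p_{0,0,2k}: the probability that a 2k-step walk started at the root 0 ends at 0.
graph = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2],
}

def walk_ends_at(start, length):
    v = start
    for _ in range(length):
        v = random.choice(graph[v])      # uniformly random neighbor
    return v

random.seed(0)
k, trials = 3, 100_000
returns = sum(walk_ends_at(0, 2 * k) == 0 for _ in range(trials))
print(f"estimated p_0,0,{2 * k} =", returns / trials)
```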
   Assume now that our city is no longer a perfect square grid. When our drunkard reaches a certain junction he picks
   between the various available roads with equal probability. Thus, if the junction has seven exits the drunkard will go
   to each one with probability one seventh. This is a random walk on a graph. Will our drunkard reach his home? It
   turns out that under rather mild conditions, the answer is still yes. For example, if the lengths of all the blocks are
   between a and b (where a and b are any two finite positive numbers), then the drunkard will, almost surely, reach his
   home. Notice that we do not assume that the graph is planar, i.e. the city may contain tunnels and bridges. One way
   to prove this result is using the connection to electrical networks. Take a map of the city and place a one ohm resistor
   on every block. Now measure the "resistance between a point and infinity". In other words, choose some number R
   and take all the points in the electrical network with distance bigger than R from our point and wire them together.
   This is now a finite electrical network and we may measure the resistance from our point to the wired points. Take R
   to infinity. The limit is called the resistance between a point and infinity. It turns out that the following is true (an
   elementary proof can be found in the book by Doyle and Snell):
   Theorem: a graph is transient if and only if the resistance between a point and infinity is finite. It is not important
   which point is chosen if the graph is connected.
   In other words, in a transient system, one only needs to overcome a finite resistance to get to infinity from any point.
   In a recurrent system, the resistance from any point to infinity is infinite.
   This characterization of recurrence and transience is very useful, and specifically it allows us to analyze the case of a
   city drawn in the plane with the distances bounded.
   A random walk on a graph is a very special case of a Markov chain. Unlike a general Markov chain, random walk on
   a graph enjoys a property called time symmetry or reversibility. Roughly speaking, this property, also called the
   principle of detailed balance, means that the probabilities to traverse a given path in one direction or in the other
   have a very simple connection between them (if the graph is regular, they are just equal). This property has important
   consequences.
   Starting in the 1980s, much research has gone into connecting properties of the graph to random walks. In addition to
   the electrical network connection described above, there are important connections to isoperimetric inequalities, see
   more here, functional inequalities such as Sobolev and Poincaré inequalities and properties of solutions of Laplace's
   equation. A significant portion of this research was focused on Cayley graphs of finitely generated groups. For
   example, the proof of Dave Bayer and Persi Diaconis that 7 riffle shuffles are enough to mix a pack of cards (see
   more details under shuffle) is in effect a result about random walk on the group Sn, and the proof uses the group
   structure in an essential way. In many cases these discrete results carry over to, or are derived from manifolds and
   Lie groups.
A good reference for random walk on graphs is the online book by Aldous and Fill [17]. For groups see the book of Woess. If the transition kernel $p(x,y)$ is itself random (based on an environment $\omega$) then the random walk is called a "random walk in random environment". When the law of the random walk includes the randomness of $\omega$, the law is called the annealed law; on the other hand, if $\omega$ is seen as fixed, the law is called a quenched law. See the book of Hughes or the lecture notes of Zeitouni.
We can think about choosing every possible edge with the same probability as maximizing uncertainty (entropy) locally. We could also do it globally – in maximal entropy random walk (MERW) [18] we want all paths to be equally probable, or in other words: for every two vertices, each path of a given length is equally probable. This random walk has much stronger localization properties.


   Self-interacting random walks
There are a number of interesting models of random paths in which each step depends on the past in a complicated manner. All are more difficult to analyze than the usual random walk; still, the behavior of any model of a random walker is obtainable using computers. Examples include:
•   The self-avoiding walk (Madras and Slade 1996).[19]
The self-avoiding walk of length n on $\mathbb{Z}^d$ is the random n-step path which starts at the origin, makes transitions only between adjacent sites in $\mathbb{Z}^d$, never revisits a site, and is chosen uniformly among all such paths. In two dimensions, due to self-trapping, a typical self-avoiding walk is very short,[20] while in higher dimensions it grows beyond all bounds (a simulation sketch of the related growth process follows this list). This model has often been used in polymer physics (since the 1960s).
   •   The loop-erased random walk (Gregory Lawler).[21][22]
   •   The reinforced random walk (Robin Pemantle 2007).[23]
   •   The exploration process.
   •   The multiagent random walk.[24]
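The self-trapping phenomenon can be illustrated with the closely related growth walk that, at each step, moves to a uniformly chosen unvisited neighbor. This greedy growth process is not the uniform measure on self-avoiding walks described above, but it is the process whose mean trapping length of roughly 71 steps on the square lattice is reported in [20]; the sketch below estimates that length.

```python
import random

# Sketch: grow a walk on the square lattice that never revisits a site, stepping to a
# uniformly chosen unvisited neighbor, until it traps itself.
random.seed(0)

def grow_until_trapped():
    pos, visited, length = (0, 0), {(0, 0)}, 0
    while True:
        x, y = pos
        free = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        free = [p for p in free if p not in visited]
        if not free:                  # no unvisited neighbor: the walk is trapped
            return length
        pos = random.choice(free)
        visited.add(pos)
        length += 1

lengths = [grow_until_trapped() for _ in range(5_000)]
print("mean length before trapping:", sum(lengths) / len(lengths))
```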


   Long-range correlated walks
   Long-range correlated time series are found in many biological, climatological and economic systems.
   •   Heartbeat records[25]
   •   Non-coding DNA sequences[26]
   •   Volatility time series of stocks[27]
   •   Temperature records around the globe[28]


   Heterogeneous random walks in one dimension
Heterogeneous random walks in one dimension can have either discrete time or continuous time. The interval is also either discrete or continuous, and it is either finite or without bounds. In a discrete system, the connections are among adjacent states. The dynamics are either Markovian, semi-Markovian, or even non-Markovian depending on the model. Heterogeneous random walks in 1D have jump probabilities that depend on the location in the system, and/or different jumping time (JT) probability density functions (PDFs) that depend on the location in the system.

Figure 1. A part of a semi-Markovian chain in 1D with directional JT-PDFs. One way of simulating such a random walk is to first draw a random number out of a uniform distribution that determines the propagation direction according to the transition probabilities, and then draw a random time out of the relevant JT-PDF.

Known important results in simple systems include:
• In a symmetric Markovian random walk, the Green's function (also termed the PDF of the walker) for occupying state i is a Gaussian in the position and has a variance that scales like the time. This result holds in a system with discrete time and space, yet also in a system with continuous time and space. This result is for systems without bounds.


• When there is a simple bias in the system (i.e. a constant force is applied on the system in a particular direction), the average distance of the random walker from its starting position is linear with time.
• When trying to reach a distance L from the starting position in a finite interval of length L under a constant force, the time for reaching this distance is exponential in the length L ($t \sim e^{cL}$ for some constant $c > 0$) when moving against the force, and is linear in the length L ($t \sim L$) when moving with the force. Without force: $t \sim L^2$.
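The simulation recipe mentioned in the caption of Figure 1 can be sketched as follows (the three-state chain, its transition probabilities, and the exponential jumping-time PDFs are all assumptions made for the illustration):

```python
import numpy as np

# Sketch of a semi-Markovian random walk in 1D: at each jump, draw the direction from the
# location-dependent transition probabilities, then draw the waiting time from the JT-PDF
# attached to that state (here taken to be exponential with a state-dependent rate).
rng = np.random.default_rng(0)

p_right = {0: 0.7, 1: 0.5, 2: 0.3}     # probability of jumping right from each state
rate = {0: 1.0, 1: 2.0, 2: 0.5}        # rate of the exponential JT-PDF at each state

def simulate(t_max):
    state, t, history = 1, 0.0, [(0.0, 1)]
    while t < t_max:
        direction = 1 if rng.random() < p_right[state] else -1
        t += rng.exponential(1.0 / rate[state])        # waiting time from the JT-PDF
        state = min(max(state + direction, 0), 2)      # reflecting ends of the 3-state chain
        history.append((t, state))
    return history

for time, state in simulate(5.0):
    print(f"t = {time:6.3f}  state = {state}")
```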
In a completely heterogeneous semi-Markovian random walk in a discrete system of L (>1) states, the Green's function was found in Laplace space[29][30][31] (the Laplace transform of a function $f(t)$ is defined by $\bar{f}(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt$). Here, the system is defined through the jumping time (JT) PDFs connecting state i with state j (the jump is from state i). The solution is based on the path representation of the Green's function, calculated when including all the path probability density functions of all lengths:


                                                                                 (1)




   Here,                        and                           . Also, in Eq. (1),

                                                                           (2)


   and,

                                                                                       (3)


   with,

                                                                                    (4)


   and,

                                                                                 (5)


   For L=1,                  . The symbol [L/2] that appears in the upper bound in the       in eq. (3) is the floor operation
   (round towards zero). Finally, the factor           in eq. (1) has the same form as in            in eqs. (3)-(5), yet it is
   calculated on a lattice    . Lattice   is constructed from the original lattice by taking out from it the states i and j
   and the states between them, and then connecting the obtained two fragments. For cases in which a fragment is a
   single state, this fragment is excluded; namely, lattice is the longer fragment. When each fragment is a single
   state,                .
Equations (1)-(5) hold for any 1D semi-Markovian random walk in an L-state chain, and form the most general solution in an explicit form for random walks in 1D.


   References
   [1] Pearson, K. (1905). The Problem of the Random Walk. Nature. 72, 294.
   [2] Van Kampen N. G., Stochastic Processes in Physics and Chemistry, revised and enlarged edition (North-Holland, Amsterdam) 1992.
   [3] Redner S., A Guide to First-Passage Process (Cambridge University Press, Cambridge, UK) 2001.
   [4] Goel N. W. and Richter-Dyn N., Stochastic Models in Biology (Academic Press, New York) 1974.
   [5] Doi M. and Edwards S. F., The Theory of Polymer Dynamics (Clarendon Press, Oxford) 1986
   [6] De Gennes P. G., Scaling Concepts in Polymer Physics (Cornell University Press, Ithaca and London) 1979.
   [7] Risken H., The Fokker–Planck Equation (Springer, Berlin) 1984.
   [8] Weiss G. H., Aspects and Applications of the Random Walk (North-Holland, Amsterdam) 1994.
   [9] Cox D. R., Renewal Theory (Methuen, London) 1962.
[10] http://mathworld.wolfram.com/PolyasRandomWalkConstants.html
[11] Dana Mackenzie, Taking the Measure of the Wildest Dance on Earth (http://www.sciencemag.org/content/290/5498/1883.full), Science, Vol. 290, no. 5498, pp. 1883–1884.
[12] D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (http://havlin.biu.ac.il/Shlomo Havlin books_d_r.php), Cambridge University Press, 2000.
[13] http://www2006.org/
[14] Leo Grady (2006): "Random Walks for Image Segmentation" (http://www.cns.bu.edu/~lgrady/grady2006random.pdf), IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1768–1783, Vol. 28, No. 11
[15] Ralf Engbert, Konstantin Mergenthaler, Petra Sinn, and Arkady Pikovsky: "An integrated model of fixational eye movements and microsaccades" (http://www.pnas.org/content/early/2011/08/17/1102730108.full.pdf)
[16] Nosofsky, 1997 (http://66.102.1.104/scholar?num=100&hl=en&lr=&safe=off&q=cache:0cCSw8zlPjoJ:oz.ss.uci.edu/237/readings/EBRW_nosofsky_1997.pdf+decision+random+walk)
[17] http://stat-www.berkeley.edu/users/aldous/RWG/book.html
[18] http://arxiv.org/abs/0810.4113
[19] Neal Madras and Gordon Slade (1996), The Self-Avoiding Walk, Birkhäuser Boston. ISBN 0-8176-3891-1.
[20] S. Hemmer and P. C. Hemmer (1984), "An average self-avoiding random walk on the square lattice lasts 71 steps", J. Chem. Phys. 81: 584, Bibcode 1984JChPh..81..584H, doi:10.1063/1.447349. Papercore summary (http://papercore.org/Hemmer1984)
[21] Gregory Lawler (1996). Intersection of random walks, Birkhäuser Boston. ISBN 0-8176-3892-X.
[22] Gregory Lawler, Conformally Invariant Processes in the Plane, book.ps (http://www.math.cornell.edu/~lawler/book.ps).
[23] Robin Pemantle (2007), A survey of random processes with reinforcement (http://www.emis.de/journals/PS/images/getdoc9b04.pdf?id=432&article=94&mode=pdf).
[24] Alamgir, M and von Luxburg, U (2010). "Multi-agent random walks for local clustering on graphs" (http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/AlamgirLuxburg2010_[0].pdf), IEEE 10th International Conference on Data Mining (ICDM), 2010, pp. 18–27.
[25] C.-K. Peng, J. Mietus, J. M. Hausdorff, S. Havlin, H. E. Stanley, A. L. Goldberger (1993). "Long-range anticorrelations and non-gaussian behavior of the heartbeat" (http://havlin.biu.ac.il/Publications.php?keyword=Long-range+anticorrelations+and+non-gaussian+behavior+of+the+heartbeat&year=*&match=all). Phys. Rev. Lett. 70 (9): 1343–6. Bibcode 1993PhRvL..70.1343P. doi:10.1103/PhysRevLett.70.1343. PMID 10054352.
[26] C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H. E. Stanley (1992). "Long-range correlations in nucleotide sequences" (http://havlin.biu.ac.il/Publications.php?keyword=Long-range+correlations+in+nucleotide+sequences&year=*&match=all). Nature 356 (6365): 168–70. Bibcode 1992Natur.356..168P. doi:10.1038/356168a0. PMID 1301010.
[27] Y. Liu, P. Cizeau, M. Meyer, C.-K. Peng, H. E. Stanley (1997). "Correlations in economic time series". Physica A 245 (3–4): 437. doi:10.1016/S0378-4371(97)00368-3.
[28] E. Koscielny-Bunde, A. Bunde, S. Havlin, H. E. Roman, Y. Goldreich, H.-J. Schellenhuber (1998). "Indication of a universal persistence law governing atmospheric variability" (http://havlin.biu.ac.il/Publications.php?keyword=Indication+of+a+universal+persistence+law+governing+atmospheric+variability&year=*&match=all). Phys. Rev. Lett. 81 (3): 729. Bibcode 1998PhRvL..81..729K. doi:10.1103/PhysRevLett.81.729.
   [29] Flomenbom O. and Klafter J., Phys. Rev. Lett., 95 (2005) 098105-1
   [30] Flomenbom O. and Silbey R. J., J. Chem. Phys. 127, 034102 (2007).
   [31] Flomenbom O and Silbey RJ, Phys. Rev. E 76, 041101 (2007).


   Bibliography
   • David Aldous and Jim Fill, Reversible Markov Chains and Random Walks on Graphs, http://stat-www.berkeley.
     edu/users/aldous/RWG/book.html
   • Doyle, Peter G.; Snell, J. Laurie (1984). Random walks and electric networks. Carus Mathematical Monographs.
     22. Mathematical Association of America. arXiv:math.PR/0001057. ISBN 978-0-88385-024-4. MR920811
   • William Feller (1968), An Introduction to Probability Theory and its Applications (Volume 1). ISBN
     0-471-25708-7
        Chapter 3 of this book contains a thorough discussion of random walks, including advanced results, using only
        elementary tools.
   • Barry D. Hughes (1996), Random walks and random environments, Oxford University Press. ISBN
     0-19-853789-1
   • James Norris (1998), Markov Chains, Cambridge University Press. ISBN 0-521-63396-6
• Pólya, G. (1921), "Über eine Aufgabe der Wahrscheinlichkeitsrechnung betreffend die Irrfahrt im Strassennetz", Mathematische Annalen, 84(1–2):149–160, March 1921. Springer (http://www.springerlink.com/(brnqxc55mlvpxs452ufzp555)/app/home/contribution.asp?referrer=parent&backto=issue,13,13;journal,798,1099;linkingpublicationresults,1:100442,1)
   • Pal Révész (1990), Random walk in random and non-random environments, World Scientific Pub Co. ISBN
     981-02-0237-7
   • Wolfgang Woess (2000), Random walks on infinite graphs and groups, Cambridge tracts in mathematics 138,
     Cambridge University Press. ISBN 0-521-55292-3
   • Mackenzie, Dana, "Taking the Measure of the Wildest Dance on Earth" (http://www.sciencemag.org/cgi/
     content/full/sci;290/5498/1883), Science, Vol. 290, 8 December 2000.
   • G. Weiss Aspects and Applications of the Random Walk, North-Holland, 1994.
   • D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (http://havlin.biu.
     ac.il/Shlomo Havlin books_d_r.php), Cambridge University Press, 2000.
• "Numb3rs Blog." Department of Mathematics, Northeastern University, 29 April 2006. Retrieved 12 December 2007. http://www.atsweb.neu.edu/math/cp/blog/?id=137&month=04&year=2006&date=2006-04-29


   External links
   • Pólya's Random Walk Constants (http://mathworld.wolfram.com/PolyasRandomWalkConstants.html)
   • Random walk in Java Applet (http://vlab.infotech.monash.edu.au/simulations/swarms/random-walk/)



    Brownian motion
Brownian motion or pedesis (from Greek: πήδησις pēdēsis, "leaping") is the presumably random moving of particles suspended in a fluid (a liquid or a gas) resulting from their bombardment by the fast-moving atoms or molecules in the gas or liquid. The term "Brownian motion" can also refer to the mathematical model used to describe such random movements, which is often called a particle theory.[1]

In 1827 the biologist Robert Brown, looking through a microscope at pollen grains in water, noted that the grains moved through the water but was not able to determine the mechanisms that caused this motion. The direction of the force of atomic bombardment is constantly changing, and at different times the pollen grain is hit more on one side than another, leading to the seemingly random nature of the motion. This type of phenomenon is named in Brown's honor.

This is a simulation of the Brownian motion of a big particle (dust particle) that collides with a large set of smaller particles (molecules of a gas) which move with different velocities in different random directions.

The mathematical model of Brownian motion has several real-world applications. Stock market fluctuations are often cited, although Benoit Mandelbrot rejected its applicability to stock price movements in part because these are discontinuous.[2]

Brownian motion is among the simplest of the continuous-time stochastic (or probabilistic) processes, and it is a limit of both simpler and more complicated stochastic processes (see random walk and Donsker's theorem). This universality is closely related to the universality of the normal distribution. In both cases, it is often mathematical convenience rather than the accuracy of the models that motivates their use. This is because Brownian motion, whose time derivative is everywhere infinite, is an idealised approximation to actual random physical processes, which always have a finite time scale.

This is a simulation of the Brownian motion of 5 particles (yellow) that collide with a large set of 800 particles. The yellow particles leave 5 blue trails of random motion and one of them has a red velocity vector.




        Three different views of Brownian motion, with
         32 steps, 256 steps, and 2048 steps denoted by
                   progressively lighter colors




           A single realisation of three-dimensional
             Brownian motion for times 0 ≤ t ≤ 2


    History
The Roman Lucretius's scientific poem "On the Nature of Things" (c. 60 BC) has a remarkable description of Brownian motion of dust particles. He uses this as a proof of the existence of atoms:
     "Observe what happens when sunbeams are admitted into a building and shed light on its shadowy places. You will see a multitude of tiny particles mingling in a multitude of ways... their dancing is an actual indication of underlying movements of matter that are hidden from our sight... It originates with the atoms which move of themselves [i.e., spontaneously]. Then those small compound bodies that are least removed from the impetus of the atoms are set in motion by the impact of their invisible blows and in turn cannon against slightly larger bodies. So the movement mounts up from the atoms and gradually emerges to the level of our senses, so that those bodies are in motion that we see in sunbeams, moved by blows that remain invisible."

Reproduced from the book of Jean Baptiste Perrin, Les Atomes, three tracings of the motion of colloidal particles of radius 0.53 µm, as seen under the microscope, are displayed. Successive positions every 30 seconds are joined by straight line segments (the mesh size is 3.2 µm).[3]
    Although the mingling motion of dust particles is caused largely by air currents, the glittering, tumbling motion of
    small dust particles is, indeed, caused chiefly by true Brownian dynamics.
    Jan Ingenhousz had described the irregular motion of coal dust particles on the surface of alcohol in 1785 —
    nevertheless the discovery is often credited to the botanist Robert Brown in 1827. Brown was studying pollen grains
    of the plant Clarkia pulchella suspended in water under a microscope when he observed minute particles, ejected by
    the pollen grains, executing a jittery motion. By repeating the experiment with particles of inorganic matter he was
    able to rule out that the motion was life-related, although its origin was yet to be explained.
    The first person to describe the mathematics behind Brownian motion was Thorvald N. Thiele in a paper on the
    method of least squares published in 1880. This was followed independently by Louis Bachelier in 1900 in his PhD
    thesis "The theory of speculation", in which he presented a stochastic analysis of the stock and option markets.
    Albert Einstein (in one of his 1905 papers) and Marian Smoluchowski (1906) brought the solution of the problem to
    the attention of physicists, and presented it as a way to indirectly confirm the existence of atoms and molecules.
    Their equations describing Brownian motion were subsequently verified by the experimental work of Jean Baptiste
    Perrin in 1913.


    Einstein's theory
    There are two parts to Einstein's theory: the first part consists in the formulation of a diffusion equation for Brownian
    particles, in which the diffusion coefficient is related to the mean squared displacement of a Brownian particle, while
    the second part consists in relating the diffusion coefficient to measurable physical quantities. In this way Einstein
    was able to determine the size of atoms, and how many atoms there are in a mole, or the molecular weight in grams,
of a gas. In accordance with Avogadro's law, this volume is the same for all ideal gases, which is 22.414 liters at
    standard temperature and pressure. The number of atoms contained in this volume is referred to as Avogadro's
    number, and the determination of this number is tantamount to the knowledge of the mass of an atom since the latter
    is obtained by dividing the mass of a mole of the gas by Avogadro's number.


The first part of Einstein's argument was to determine how far a Brownian particle travels in a given time interval. Classical mechanics is unable to determine this distance because of the enormous number of bombardments a Brownian particle will undergo, roughly of the order of 10^21 collisions per second.[4] Thus Einstein was led to consider the collective motion of Brownian particles. He showed that if ρ(x,t) is the density of Brownian particles at point x at time t, then ρ satisfies the diffusion equation:
$$\frac{\partial \rho}{\partial t} = D\,\frac{\partial^2 \rho}{\partial x^2},$$
where D is the mass diffusivity.

The characteristic bell-shaped curves of the diffusion of Brownian particles. The distribution begins as a Dirac delta function, indicating that all the particles are located at the origin at time t=0, and for increasing times they become flatter and flatter until the distribution becomes uniform in the asymptotic time limit.

Assuming that all the particles start from the origin at the initial time t=0, the diffusion equation has the solution
$$\rho(x,t) = \frac{1}{\sqrt{4\pi D t}}\, e^{-\frac{x^2}{4 D t}}.$$
This expression allowed Einstein to calculate the moments directly. The first moment is seen to vanish, meaning that the Brownian particle is equally likely to move to the left as it is to move to the right. The second moment is, however, non-vanishing, being given by
$$\overline{x^2} = 2\,D\,t.$$
This expresses the mean squared displacement in terms of the time elapsed and the diffusivity. From this expression Einstein argued that the displacement of a Brownian particle is not proportional to the elapsed time, but rather to its square root.[5] His argument is based on a conceptual switch from the "ensemble" of Brownian particles to the "single" Brownian particle: we can speak of the relative number of particles at a single instant just as well as of the time it takes a Brownian particle to reach a given point.[6]
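A numerical sketch of this result (the value of D, the time step, and the number of sample paths are arbitrary choices): simulate many Brownian particles with Gaussian increments of variance 2DΔt and compare the empirical mean squared displacement with 2Dt.

```python
import numpy as np

# Sketch: one-dimensional Brownian particles with increments N(0, 2*D*dt);
# the mean squared displacement should grow as 2*D*t.
rng = np.random.default_rng(0)
D, dt, n_steps, particles = 0.5, 0.01, 500, 20_000

increments = rng.normal(0.0, np.sqrt(2 * D * dt), size=(particles, n_steps))
x = increments.cumsum(axis=1)                     # particle positions over time

for step in (100, 250, 500):
    t = step * dt
    msd = (x[:, step - 1] ** 2).mean()
    print(f"t = {t:5.2f}:  simulated MSD = {msd:.4f},  2*D*t = {2 * D * t:.4f}")
```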
The second part of Einstein's theory relates the diffusion constant to physically measurable quantities, such as the mean squared displacement of a particle in a given time interval. This result enables the experimental determination of Avogadro's number and therefore the size of molecules. Einstein analyzed a dynamic equilibrium being established between opposing forces. The beauty of his argument is that the final result does not depend upon which forces are involved in setting up the dynamic equilibrium. In his original treatment, Einstein considered an osmotic pressure experiment, but the same conclusion can be reached in other ways. Consider, for instance, particles suspended in a viscous fluid in a gravitational field. Gravity tends to make the particles settle, whereas diffusion acts to homogenize them, driving them into regions of smaller concentration. Under the action of gravity, a particle acquires a downward speed of $v = \mu m g$, where $m$ is the mass of the particle, $g$ is the acceleration due to gravity, and $\mu$ is the particle's mobility in the fluid. George Stokes had shown that the mobility for a spherical particle with radius $r$ is $\mu = \frac{1}{6\pi\eta r}$, where $\eta$ is the dynamic viscosity of the fluid. In a state of dynamic equilibrium, the particles are distributed according to the barometric distribution
$$\rho = \rho_0\, e^{-\frac{m g h}{k_B T}},$$
where $\rho - \rho_0$ is the difference in density of particles separated by a height difference of $h$, $k_B$ is Boltzmann's constant (namely, the ratio of the universal gas constant, $R$, to Avogadro's number, $N_A$), and $T$ is the absolute temperature. It is Avogadro's number that is to be determined.


    Dynamic equilibrium is established because the more that particles are
    pulled down by gravity, the greater is the tendency for the particles to
    migrate to regions of lower concentration. The flux is given by Fick's
    law,



    where            . Introducing the formula for    , we find that



    In a state of dynamical equilibrium, this speed must also be equal to            The equilibrium distribution for particles of
                                                                                     gamboge shows the tendency for granules to
                . Notice that both expressions for are proportional to
                                                                                     move to regions of lower concentration when
        , reflecting how the derivation is independent of the type of forces                     affected by gravity.
    considered. Equating these two expressions yields a formula for the
    diffusivity:




Here the first equality follows from the first part of Einstein's theory, the third equality follows from the definition of
Boltzmann's constant as kB = R/NA, and the fourth equality follows from Stokes' formula for the mobility. By
measuring the mean squared displacement over a time interval along with the universal gas constant R, the
temperature T, the viscosity η, and the particle radius r, Avogadro's number NA can be determined.
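A back-of-the-envelope sketch of this determination in Python follows; the measured mean squared displacement, observation time, temperature, viscosity and particle radius are made-up but physically plausible values, not data from the text above.

    # Illustrative (hypothetical) numbers in SI units; not Perrin's actual data.
    import math

    R = 8.314           # universal gas constant, J/(mol K)
    T = 300.0           # absolute temperature, K
    eta = 1.0e-3        # dynamic viscosity of water, Pa s
    r = 0.5e-6          # particle radius, m
    msd = 5.3e-11       # observed mean squared displacement, m^2 (hypothetical)
    t = 60.0            # observation time, s (hypothetical)

    D = msd / (2 * t)                             # first part of the theory: <x^2> = 2 D t
    N_A = R * T / (6 * math.pi * eta * r * D)     # D = R T / (6 pi eta r N_A), solved for N_A
    print(f"D   = {D:.3e} m^2/s")
    print(f"N_A = {N_A:.3e} 1/mol")               # roughly 6e23 with these inputs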
    The type of dynamical equilibrium proposed by Einstein was not new. It had been pointed out previously by J. J.
    Thomson[7] in his series of lectures at Yale University in May 1903 that the dynamic equilibrium between the
    velocity generated by a concentration gradient given by Fick's law and the velocity due to the variation of the partial
    pressure caused when ions are set in motion "gives us a method of determining Avogadro's Constant which is
    independent of any hypothesis as to the shape or size of molecules, or of the way in which they act upon each
    other".[7]
    An identical expression to Einstein's formula for the diffusion coefficient was also found by Walther Nernst in
    1888[8] in which he expressed the diffusion coefficient as the ratio of the osmotic pressure to the ratio of the
    frictional force and the velocity to which it gives rise. The former was equated to the law of van 't Hoff while the
    latter was given by Stokes's law. He writes                for the diffusion coefficient , where     is the osmotic
    pressure and    is the ratio of the frictional force to the molecular viscosity which he assumes is given by Stokes's
    formula for the viscosity. Introducing the ideal gas law per unit volume for the osmotic pressure, the formula
    becomes identical to that of Einstein's.[9] The use of Stokes's law in Nernst's case, as well as in Einstein and
    Smoluchowski, is not strictly applicable since it does not apply to the case where the radius of the sphere is small in
    comparison with the mean free path.[10]
    At first the predictions of Einstein's formula were seemingly refuted by a series of experiments by Svedberg in 1906
    and 1907, which gave displacements of the particles as 4 to 6 times the predicted value, and by Henri in 1908 who
    found displacements 3 times greater than Einstein's formula predicted.[11] But Einstein's predictions were finally
    confirmed in a series of experiments carried out by Chaudesaigues in 1908 and Perrin in 1909. The confirmation of
    Einstein's theory constituted empirical progress for the kinetic theory of heat. In essence, Einstein showed that the
    motion can be predicted directly from the kinetic model of thermal equilibrium. The importance of the theory lay in
    the fact that it confirmed the kinetic theory's account of the second law of thermodynamics as being an essentially
    statistical law.[12]


    Intuitive metaphor
    Consider a large balloon of 100 metres in diameter. Imagine this large balloon in a football stadium. The balloon is
    so large that it lies on top of many members of the crowd. Because they are excited, these fans hit the balloon at
    different times and in different directions with the motions being completely random. In the end, the balloon is
    pushed in random directions, so it should not move on average. Consider now the force exerted at a certain time. We
    might have 20 supporters pushing right, and 21 other supporters pushing left, where each supporter is exerting
    equivalent amounts of force. In this case, the forces exerted towards the left and the right are imbalanced in favor of
    the left; the balloon will move slightly to the left. This type of imbalance exists at all times, and it causes random
    motion of the balloon. If we look at this situation from far above, so that we cannot see the supporters, we see the
    large balloon as a small object animated by erratic movement.




                                Brownian motion model of the trajectory of a particle of dye in water.

    Consider the particles emitted by Brown's pollen grain moving randomly in water: we know that a water molecule is
    about 0.1 by 0.2 nm in size, whereas the particles which Brown observed were of the order of a few micrometres in
    size (these are not to be confused with the actual pollen particle which is about 100 micrometres). So a particle from
    the pollen may be likened to the balloon, and the water molecules to the fans, except that in this case the balloon is
    surrounded by fans. The Brownian motion of a particle in a liquid is thus due to the instantaneous imbalance in the
    combined forces exerted by collisions of the particle with the much smaller liquid molecules (which are in random
    thermal motion) surrounding it.
    An animation of the Brownian motion concept [13] is available as a Java applet.


    Theory

    Smoluchowski model
Smoluchowski's theory of Brownian motion[14] starts from the same premise as that of Einstein and derives the same
probability distribution ρ(x, t) for the displacement of a Brownian particle along the x-axis in time t. He therefore
gets the same expression for the mean squared displacement: ⟨x²⟩ = 2Dt. However, when he relates it to a particle of
    mass     moving at a velocity    which is the result of a frictional force governed by Stokes's law, he finds



    where    is the viscosity coefficient, and     is the radius of the particle. Associating the kinetic energy
    with the thermal energy            , the expression for the mean squared displacement is 64/27 times that found by
Einstein. The fraction 27/64 was commented on by Arnold Sommerfeld in his necrology on Smoluchowski: "The
    numerical coefficient of Einstein, which differs from Smoluchowski by 27/64 can only be put in doubt."[15]


    Smoluchowski[16] attempts to answer the question of why a Brownian particle should be displaced by bombardments
    of smaller particles when the probabilities for striking it in the forward and rear directions are equal. In order to do
so, he uses, unknowingly, the ballot theorem, first proved by W.A. Whitworth in 1887.[17] The ballot theorem states
that if candidate A scores m votes and candidate B scores m′ < m votes, then the probability that A will have more
votes than B throughout the counting is (m − m′)/(m + m′), no matter how large the total number of votes m + m′ may
be. In other words, if one candidate has an edge on the other candidate, he will tend to keep that edge even though
there is nothing favoring either candidate on a ballot extraction.
    If the probability of   gains and           losses follows a binomial distribution,




    with equal a priori probabilities of      , the mean total gain is




    If   is large enough so that Stirling's approximation can be used in the form



    then the expected total gain will be




    showing that it increases as the square root of the total population.
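A small Monte Carlo sketch of this square-root growth, assuming nothing beyond fair, independent "collisions"; the benchmark value sqrt(2n/π) is the large-n limit of the mean absolute excess for a symmetric binomial and is an added assumption, not a quantity quoted in the text above.

    import numpy as np

    rng = np.random.default_rng(0)

    def mean_excess(n, trials=20_000):
        # number of "right" collisions in n fair trials; excess = |right - left| = |2*right - n|
        right = rng.binomial(n, 0.5, size=trials)
        return np.abs(2 * right - n).mean()

    for n in [100, 400, 1600, 6400]:
        # the two columns should track each other, both growing like sqrt(n)
        print(n, mean_excess(n), np.sqrt(2 * n / np.pi))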
Suppose that a Brownian particle of mass M is surrounded by lighter particles of mass m which are traveling at a
speed u. Then, reasons Smoluchowski, in any collision between a surrounding particle and the Brownian particle, the
velocity transmitted to the latter will be mu/M. This ratio is of the order of 10⁻⁷ cm/s. But we also have to take into
consideration that in a gas there will be more than 10¹⁶ collisions in a second, and even more in a liquid, where we
expect 10²⁰ collisions in one second. Some of these collisions will tend to accelerate the Brownian particle; others
will tend to decelerate it. If there is a mean excess of one kind of collision or the other of the order of 10⁸ to
10¹⁰ collisions in one second, then the velocity of the Brownian particle may be anywhere between 10 and 1000 cm/s.
Thus, even though there are equal probabilities for forward and backward collisions, there will be a net tendency to
keep the Brownian particle in motion, just as the ballot theorem predicts.
These orders of magnitude are not exact because they do not take into consideration the velocity of the Brownian
particle, U, which depends on the collisions that tend to accelerate and decelerate it. The larger U is, the greater
will be the number of collisions that retard it, so that the velocity of a Brownian particle can never increase
without limit. Could such a process occur, it would be tantamount to a perpetual motion of the second kind. And since
equipartition of energy applies, the kinetic energy of the Brownian particle, MU²/2, will be equal, on average, to
the kinetic energy of the surrounding fluid particles, mu²/2.
    In 1906 Smoluchowski published a one-dimensional model to describe a particle undergoing Brownian motion.[18]
    The model assumes collisions with M ≫ m where M is the test particle's mass and m the mass of one of the
    individual particles composing the fluid. It is assumed that the particle collisions are confined to one dimension and
    that it is equally probable for the test particle to be hit from the left as from the right. It is also assumed that every
collision always imparts the same magnitude of ΔV. If NR is the number of collisions from the right and NL the
number of collisions from the left, then after N collisions the particle's velocity will have changed by
ΔV(2NR − N). The multiplicity is then simply given by:

    N! / (NR! (N − NR)!)

and the total number of possible states is given by 2^N. Therefore the probability of the particle being hit from the
right NR times is:

    P(NR) = N! / (2^N NR! (N − NR)!)
    As a result of its simplicity, Smoluchowski's 1D model can only qualitatively describe Brownian motion. For a
    realistic particle undergoing Brownian motion in a fluid many of the assumptions cannot be made. For example, the
    assumption that on average there occurs an equal number of collisions from the right as from the left falls apart once
the particle is in motion. Also, in a realistic situation there would be a distribution of different possible values
of ΔV instead of always just one.


    Modeling using differential equations
    The equations governing Brownian motion relate slightly differently to each of the two definitions of Brownian
    motion given at the start of this article.


    Mathematics
    In mathematics, Brownian motion is
    described by the Wiener process; a
    continuous-time stochastic process named in
    honor of Norbert Wiener. It is one of the
    best known Lévy processes (càdlàg
    stochastic processes with stationary
    independent increments) and occurs
    frequently in pure and applied mathematics,
    economics and physics.

The Wiener process Wt is characterised by four facts:
1. W0 = 0
2. Wt is almost surely continuous
3. Wt has independent increments
4. Wt−Ws ~ N(0, t−s) (for 0 ≤ s < t).
N(μ, σ2) denotes the normal distribution with expected value μ and variance σ2. The condition that it has independent
increments means that if 0 ≤ s1 < t1 ≤ s2 < t2 then Wt1−Ws1 and Wt2−Ws2 are independent random variables.
An animated example of a Brownian motion-like random walk on a torus. In the scaling limit, the random walk
approaches the Wiener process according to Donsker's theorem.
An alternative characterisation of the Wiener process is the so-called Lévy characterisation that says that the Wiener
process is an almost surely continuous martingale with W0 = 0 and quadratic variation [Wt, Wt] = t.
A third characterisation is that the Wiener process has a spectral representation as a sine series whose coefficients are
independent N(0, 1) random variables. This representation can be obtained using the Karhunen–Loève theorem.
    The Wiener process can be constructed as the scaling limit of a random walk, or other discrete-time stochastic
    processes with stationary independent increments. This is known as Donsker's theorem. Like the random walk, the
    Wiener process is recurrent in one or two dimensions (meaning that it returns almost surely to any fixed


    neighborhood of the origin infinitely often) whereas it is not recurrent in dimensions three and higher. Unlike the
    random walk, it is scale invariant.
    The time evolution of the position of the Brownian particle itself can be described approximately by a Langevin
    equation, an equation which involves a random force field representing the effect of the thermal fluctuations of the
    solvent on the Brownian particle. On long timescales, the mathematical Brownian motion is well described by a
Langevin equation. On small timescales, inertial effects are prevalent in the Langevin equation. However, the
mathematical Brownian motion is exempt from such inertial effects. Inertial effects have to be considered in the
Langevin equation; otherwise the equation becomes singular, and simply removing the inertia term from this equation
would not yield an exact description, but rather a singular behavior in which the particle does not move at all.


    Physics
The diffusion equation yields an approximation of the time evolution of the probability density function associated
with the position of a particle undergoing Brownian motion under the physical definition. The approximation is valid
on timescales long compared with the momentum relaxation time; at shorter times inertial effects dominate.
The time evolution of the position of the Brownian particle itself is best described using the Langevin equation, an
equation which involves a random force field representing the effect of the thermal fluctuations of the solvent on the
particle.
    The displacement of a particle undergoing Brownian motion is obtained by solving the diffusion equation under
    appropriate boundary conditions and finding the rms of the solution. This shows that the displacement varies as the
    square root of the time (not linearly), which explains why previous experimental results concerning the velocity of
    Brownian particles gave nonsensical results. A linear time dependence was incorrectly assumed.
    At very short time scales, however, the motion of a particle is dominated by its inertia and its displacement will be
    linearly dependent on time: Δx = vΔt. So the instantaneous velocity of the Brownian motion can be measured as v =
    Δx/Δt, when Δt << τ, where τ is the momentum relaxation time. In 2010, the instantaneous velocity of a Brownian
    particle (a glass microsphere trapped in air with an optical tweezer) was measured successfully.[19] The velocity data
    verified the Maxwell-Boltzmann velocity distribution, and the equipartition theorem for a Brownian particle.
    The Brownian motion can be modeled by a random walk.[20] Random walks in porous media or fractals are
    anomalous.[21]
In the general case, Brownian motion is a non-Markov random process and is described by stochastic integral
equations.[22]
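A minimal numerical sketch of the Langevin description mentioned above, integrated with the Euler-Maruyama scheme; the parameter values and reduced units are illustrative assumptions, not taken from any experiment discussed here.

    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative parameters in reduced units (assumptions for the sketch)
    m, gamma, kBT = 1.0, 2.0, 1.0          # mass, friction coefficient, thermal energy
    dt, n_steps = 1e-3, 100_000

    # Euler-Maruyama integration of the Langevin equation  m dv = -gamma v dt + sqrt(2 gamma kBT) dW
    v = np.zeros(n_steps)
    for i in range(1, n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        v[i] = v[i - 1] + (-(gamma / m) * v[i - 1] * dt
                           + np.sqrt(2 * gamma * kBT) / m * dW)

    # Equipartition check: the stationary variance of v should be close to kBT/m
    print("empirical <v^2>:", v[n_steps // 2:].var())
    print("kBT / m        :", kBT / m)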


    Lévy characterisation
    The French mathematician Paul Lévy proved the following theorem, which gives a necessary and sufficient
    condition for a continuous Rn-valued stochastic process X to actually be n-dimensional Brownian motion. Hence,
    Lévy's condition can actually be used as an alternative definition of Brownian motion.
    Let X = (X1, ..., Xn) be a continuous stochastic process on a probability space (Ω, Σ, P) taking values in Rn. Then the
    following are equivalent:
    1. X is a Brownian motion with respect to P, i.e., the law of X with respect to P is the same as the law of an
       n-dimensional Brownian motion, i.e., the push-forward measure X∗(P) is classical Wiener measure on
       C0([0, +∞); Rn).
    2. both
       1. X is a martingale with respect to P (and its own natural filtration); and
       2. for all 1 ≤ i, j ≤ n, Xi(t)Xj(t) −δijt is a martingale with respect to P (and its own natural filtration), where δij
          denotes the Kronecker delta.


    Riemannian manifold
    The infinitesimal generator (and hence characteristic operator) of a
    Brownian motion on Rn is easily calculated to be ½Δ, where Δ denotes
    the Laplace operator. This observation is useful in defining Brownian
    motion on an m-dimensional Riemannian manifold (M, g): a Brownian
    motion on M is defined to be a diffusion on M whose characteristic
    operator     in local coordinates xi, 1 ≤ i ≤ m, is given by ½ΔLB,
    where ΔLB is the Laplace–Beltrami operator given in local coordinates
by

    ΔLB = (1/√(det g)) Σi,j ∂/∂xi ( √(det g) g^ij ∂/∂xj ),

where [g^ij] = [g_ij]^−1 in the sense of the inverse of a square matrix.
Brownian motion on a 2-sphere


    Gravitational motion
    In stellar dynamics, a massive body (star, black hole, etc.) can experience Brownian motion as it responds to
gravitational forces from surrounding stars.[23] The rms velocity V of the massive object, of mass M, is related to
the rms velocity v★ of the background stars by

    MV² ≈ mv★²,

where m ≪ M is the mass of the background stars. The gravitational force from the massive object causes nearby
stars to move faster than they otherwise would, increasing both v★ and V.[23] The Brownian velocity of Sgr A*,
    the supermassive black hole at the center of the Milky Way galaxy, is predicted from this formula to be less than
    1 km s−1.[24]


    Narrow escape
    The Narrow escape problem is a ubiquitous problem in biology, biophysics and cellular biology which has the
    following formulation: a Brownian particle (ion, molecule, or protein) is confined to a bounded domain (a
    compartment or a cell) by a reflecting boundary, except for a small window through which it can escape. The narrow
    escape problem is that of calculating the mean escape time. This time diverges as the window shrinks, thus rendering
    the calculation a singular perturbation problem.


    References
[1] Mörters, Peter; Peres, Yuval (25 May 2008). Brownian Motion (http://www.stat.berkeley.edu/~peres/bmbook.pdf).
    Retrieved 25 May 2008.
    [2] Mandelbrot, B.; Hudson, R. (2004). The (Mis)behavior of Markets: A Fractal View of Risk, Ruin, and Reward. New York: Basic Books.
        ISBN 0465043550.
[3] Perrin, 1914, p. 115 (http://www.archive.org/stream/atomsper00perruoft#page/115/mode/1up)
    [4] Chandrasekhar, S. (1943). "Stochastic problems in physics and astronomy". Reviews of Modern Physics 15 (1): 1–89.
        doi:10.1103/RevModPhys.15.1. MR0008130.
    [5] A. Einstein, Investigations of the Theory of Brownian Movement (Dover, 1956).
[6] Lavenda, Bernard H. (1985). Nonequilibrium Statistical Thermodynamics. John Wiley & Sons Inc. p. 20. ISBN 0-471-90670-0.
    [7] "Electricity and Matter" (Yale University Press, New Haven, 1904), pp 80-83
    [8] Zeit. Phys. Chem. 9 (1888) 613.
    [9] J. Leveugle, La Relativite', Poincaré' et Einstein, Planck, Hilbert (L'Harmattan, Paris, 2004) p. 181

    [10] J.E.S. Townsend, "Electricity in Gases" (Clarendon Press, Oxford, 1915), p. 254.
    [11] See P. Clark 1976, p. 97
    [12] See P. Clark 1976 for this whole paragraph
[13] http://galileo.phys.virginia.edu/classes/109N/more_stuff/Applets/brownian/brownian.html
    [14] Smoluchowski, M. (1906), Bull. Int. de l'Acad. des Sci. de Cracovie: 202
    [15] Sommerfeld, A. (15 November 1917), Phys. Zeit.: 535
[16] Smoluchowski, loc. cit., p. 577.
[17] Whitworth, W. A. (1965), Choice and Chance, Hafner Pub. Co.
    [18] Smoluchowski, M. (1906), "Zur kinetischen Theorie der Brownschen Molekularbewegung und der Suspensionen", Annalen der Physik 326
        (14): 756–780, Bibcode 1906AnP...326..756V, doi:10.1002/andp.19063261405
    [19] Li, Tongcang; Kheifets, Simon; Medellin, David; Raizen, Mark (June 2010). "Measurement of the instantaneous velocity of a Brownian
        particle". Science 328 (5986): 1673–1675. Bibcode 2010Sci...328.1673L. doi:10.1126/science.1189403.
    [20] Weiss, G. H. (1994). Aspects and applications of the random walk. North Holland.
[21] Ben-Avraham, D.; Havlin, S. (2000). Diffusion and Reaction in Disordered Systems
    (http://havlin.biu.ac.il/Shlomo Havlin books_d_r.php). Cambridge University Press.
    [22] Morozov, A. N.; Skripkin, A. V. (2011). "Spherical particle Brownian motion in viscous medium as non-Markovian random process".
        Physics Letters A 375 (46): 4113–4115. Bibcode 2011PhLA..375.4113M. doi:10.1016/j.physleta.2011.10.001.
    [23] Merritt, D.; Berczik, P.; Laun, F. (February 2007), "Brownian motion of black holes in dense nuclei", The Astronomical Journal 133 (2):
        553–563, arXiv:astro-ph/0408029, Bibcode 2007AJ....133..553M, doi:10.1086/510294
    [24] Reid, M. J.; Brunthaler, A. (December 2004), "The Proper Motion of Sagittarius A*. II. The Mass of Sagittarius A*", The Astrophysical
        Journal 616 (2): 872–884, arXiv:astro-ph/0408107, Bibcode 2004ApJ...616..872R, doi:10.1086/424960



    Further reading
    • Brown, Robert, "A brief account of microscopical observations made in the months of June, July and August,
      1827, on the particles contained in the pollen of plants; and on the general existence of active molecules in
      organic and inorganic bodies." Phil. Mag. 4, 161–173, 1828. (PDF version of original paper including a
      subsequent defense by Brown of his original observations, Additional remarks on active molecules.) (http://
      sciweb.nybg.org/science2/pdfs/dws/Brownian.pdf)
    • Chaudesaigues, M. (1908). "Le mouvement brownien et la formule d'Einstein". Comptes Rendus 147: 1044–6.
    • Clark, P. (1976) 'Atomism versus thermodynamics' in Method and appraisal in the physical sciences, Colin
      Howson (Ed), Cambridge University Press 1976
    • Einstein, A. (1905), "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in
      ruhenden Flüssigkeiten suspendierten Teilchen." (http://www.physik.uni-augsburg.de/annalen/history/
      einstein-papers/1905_17_549-560.pdf), Annalen der Physik 17: 549–560, Bibcode 1905AnP...322..549E,
      doi:10.1002/andp.19053220806
    • Einstein, A. "Investigations on the Theory of Brownian Movement". New York: Dover, 1956. ISBN
      0-486-60304-0 (http://lorentz.phl.jhu.edu/AnnusMirabilis/AeReserveArticles/eins_brownian.pdf)
• Henri, V. (1908) 'Etudes cinematographique du mouvement brownien' Comptes Rendus 146 pp. 1024–6
    • Lucretius, 'On The Nature of Things.', translated by William Ellery Leonard. ( on-line version (http://
      onlinebooks.library.upenn.edu/webbin/gutbook/lookup?num=785), from Project Gutenberg. see the heading
      'Atomic Motions'; this translation differs slightly from the one quoted).
    • Pearle, P., Collett, B., Bart, K., Bilderback, D., Newman, D., and Samuels, S. (2010) What Brown saw and you
      can too (http://ajp.aapt.org/resource/1/ajpias/v78/i12/p1278_s1). Am. J. Phys. 78: 1278-1289.
    • Nelson, Edward, Dynamical Theories of Brownian Motion (1967)   (PDF version of this out-of-print book, from
      the author's webpage.) (http://www.math.princeton.edu/~nelson/books.html)
    • J. Perrin, "Mouvement brownien et réalité moléculaire". Ann. Chim. Phys. 8ième série 18, 5–114 (1909). See also
      Perrin's book "Les Atomes" (1914).
    • Ruben D. Cohen (1986) "Self Similarity in Brownian Motion and Other Ergodic Phenomena", Journal of
      Chemical Education 63, pp. 933–934 (http://rdcohen.50megs.com/BrownianMotion.pdf)
    • Smoluchowski, M. (1906), "Zur kinetischen Theorie der Brownschen Molekularbewegung und der Suspensionen"
      (http://gallica.bnf.fr/ark:/12148/bpt6k15328k/f770.chemindefer), Annalen der Physik 21: 756–780,


      Bibcode 1906AnP...326..756V, doi:10.1002/andp.19063261405
    • Svedberg, T. Studien zur Lehre von den kolloiden Losungen 1907
• Thiele, T. N. Danish version: "Om Anvendelse af mindste Kvadraters Methode i nogle Tilfælde, hvor en
      Komplikation af visse Slags uensartede tilfældige Fejlkilder giver Fejlene en ‘systematisk’ Karakter". French
      version: "Sur la compensation de quelques erreurs quasi-systématiques par la méthodes de moindre carrés"
      published simultaneously in Vidensk. Selsk. Skr. 5. Rk., naturvid. og mat. Afd., 12:381–408, 1880.


    External links
    • Brownian motion java simulation (http://galileo.phys.virginia.edu/classes/109N/more_stuff/Applets/
      brownian/applet.html)
    • Article for the school-going child (http://xxx.imsc.res.in/abs/physics/0412132)
    • Einstein on Brownian Motion (http://www.bun.kyoto-u.ac.jp/~suchii/einsteinBM.html)
    • Brownian Motion, "Diverse and Undulating" (http://arxiv.org/abs/0705.1951)
    • Discusses history, botany and physics of Brown's original observations, with videos (http://physerver.hamilton.
      edu/Research/Brownian/index.html)
    • "Einstein's prediction finally witnessed one century later" (http://www.gizmag.com/
      einsteins-prediction-finally-witnessed/16212/) : a test to observe the velocity of Brownian motion
    • "Café math : brownian motion (Part 1)" (http://cafemath.kegtux.org/mathblog/article.
      php?page=BrownianMotion.php) : A blog article describing brownian motion (definition, symmetries,
      simulation)



    Wiener process
    In mathematics, the Wiener process is a
    continuous-time stochastic process named in
    honor of Norbert Wiener. It is often called
    standard Brownian motion, after Robert
    Brown. It is one of the best known Lévy
    processes (càdlàg stochastic processes with
    stationary independent increments) and
    occurs frequently in pure and applied
    mathematics,     economics,     quantitative
    finance and physics.

    The Wiener process plays an important role
    both in pure and applied mathematics. In
    pure mathematics, the Wiener process gave
    rise to the study of continuous time
                                                                  A single realization of a one-dimensional Wiener process
    martingales. It is a key process in terms of
    which     more      complicated    stochastic
    processes can be described. As such, it plays a vital role in stochastic calculus, diffusion processes and even potential
theory. It is the driving process of Schramm–Loewner


    evolution. In applied mathematics, the
    Wiener process is used to represent the
    integral of a Gaussian white noise process,
    and so is useful as a model of noise in
    electronics engineering, instrument errors in
    filtering theory and unknown forces in
    control theory.

    The Wiener process has applications
    throughout the mathematical sciences. In
    physics it is used to study Brownian motion,
    the diffusion of minute particles suspended
    in fluid, and other types of diffusion via the
    Fokker–Planck and Langevin equations. It
    also forms the basis for the rigorous path
    integral formulation of quantum mechanics
    (by the Feynman–Kac formula, a solution to
    the Schrödinger equation can be represented
in terms of the Wiener process) and the study of eternal inflation in physical cosmology. It is also prominent in the
mathematical theory of finance, in particular the Black–Scholes option pricing model.
A single realization of a three-dimensional Wiener process


    Characterizations of the Wiener process
    The Wiener process Wt is characterized by three properties:[1]
    1. W0 = 0
    2. The function t → Wt is almost surely everywhere continuous
    3. Wt has independent increments with Wt−Ws ~ N(0, t−s) (for 0 ≤ s < t).
    N(μ, σ2) denotes the normal distribution with expected value μ and variance σ2. The condition that it has independent
    increments means that if 0 ≤ s1 < t1 ≤ s2 < t2 then Wt1−Ws1 and Wt2−Ws2 are independent random variables, and the
    similar condition holds for n increments.
    An alternative characterization of the Wiener process is the so-called Lévy characterization that says that the Wiener
    process is an almost surely continuous martingale with W0 = 0 and quadratic variation [Wt, Wt] = t (which means that
    Wt2−t is also a martingale).
    A third characterization is that the Wiener process has a spectral representation as a sine series whose coefficients are
    independent N(0,1) random variables. This representation can be obtained using the Karhunen–Loève theorem.
    The Wiener process can be constructed as the scaling limit of a random walk, or other discrete-time stochastic
    processes with stationary independent increments. This is known as Donsker's theorem. Like the random walk, the
    Wiener process is recurrent in one or two dimensions (meaning that it returns almost surely to any fixed
    neighborhood of the origin infinitely often) whereas it is not recurrent in dimensions three and higher. Unlike the
random walk, it is scale invariant, meaning that

    Vt = (1/α) W(α²t)

is a Wiener process for any nonzero constant α. The Wiener measure is the probability law on the space of
    continuous functions g, with g(0) = 0, induced by the Wiener process. An integral based on Wiener measure may be
    called a Wiener integral.
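A short numerical illustration of the random-walk construction mentioned above (Donsker's theorem), assuming only a simple ±1 walk; the variable names and sample sizes are ad hoc choices for the sketch.

    import numpy as np

    rng = np.random.default_rng(2)

    # Scaling limit of a simple random walk: W_t is approximated by S_{floor(n t)} / sqrt(n)
    n, n_paths, t = 1_000, 5_000, 1.0
    steps = rng.choice([-1.0, 1.0], size=(n_paths, int(n * t)))
    W_t = steps.sum(axis=1) / np.sqrt(n)       # rescaled walk evaluated at time t

    # W_t should be approximately N(0, t): mean 0, variance t
    print("mean    :", W_t.mean())
    print("variance:", W_t.var(), "(target:", t, ")")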


    Properties of a one-dimensional Wiener process

    Basic properties
The unconditional probability density function at a fixed time t:

    fWt(x) = (1/√(2πt)) exp(−x²/(2t)).

The expectation is zero:

    E[Wt] = 0.

The variance, using the computational formula, is t:

    Var(Wt) = E[Wt²] − (E[Wt])² = t.

The covariance and correlation:

    Cov(Ws, Wt) = min(s, t),
    Corr(Ws, Wt) = min(s, t) / √(st).
    The results for the expectation and variance follow immediately from the definition that increments have a normal
    distribution, centered at zero. Thus


    The results for the covariance and correlation follow from the definition that non-overlapping increments are
independent, of which only the property that they are uncorrelated is used. Suppose that t1 < t2. Then

    Cov(W(t1), W(t2)) = E[W(t1) W(t2)].

Substitute the simple identity W(t2) = W(t1) + (W(t2) − W(t1)):

    E[W(t1) W(t2)] = E[W(t1)²] + E[W(t1) (W(t2) − W(t1))].

Since W(t1) = W(t1) − W(t0) and W(t2) − W(t1) are independent,

    E[W(t1) (W(t2) − W(t1))] = E[W(t1)] E[W(t2) − W(t1)] = 0.

Thus

    Cov(W(t1), W(t2)) = E[W(t1)²] = t1.

    Running maximum
The joint distribution of the running maximum

    Mt = max{Ws : 0 ≤ s ≤ t}

and Wt is

    f(Mt, Wt)(m, w) = (2(2m − w) / (t √(2πt))) exp(−(2m − w)²/(2t)),   m ≥ 0, w ≤ m.

To get the unconditional distribution of Mt, integrate over −∞ < w ≤ m:

    f(Mt)(m) = √(2/(πt)) exp(−m²/(2t)),   m ≥ 0.
And the expectation[2] is E[Mt] = √(2t/π).
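A Monte Carlo sketch of the running maximum against the expectation stated above; the grid size and path count are arbitrary assumptions, and the discrete grid slightly underestimates the true maximum.

    import numpy as np

    rng = np.random.default_rng(3)

    # Simulate Wiener paths on [0, t] and record the running maximum M_t
    t, n_steps, n_paths = 1.0, 1_000, 5_000
    dt = t / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)
    M_t = np.maximum(paths.max(axis=1), 0.0)      # maximum over [0, t], with W_0 = 0

    print("empirical E[M_t]:", M_t.mean())
    print("sqrt(2 t / pi)  :", np.sqrt(2 * t / np.pi))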


    Self-similarity

Brownian scaling
For every c > 0 the process Vt = (1/√c) Wct is another Wiener process.

Time reversal
The process Vt = W1 − W1−t for 0 ≤ t ≤ 1 is distributed like Wt for 0 ≤ t ≤ 1.

Time inversion
The process Vt = tW1/t (for t > 0, with V0 = 0) is another Wiener process.

    A class of Brownian martingales
    If a polynomial p(x, t) satisfies the PDE



    then the stochastic process


    is a martingale.
Example: Wt² − t is a martingale, which shows that the quadratic variation of W on [0, t] is equal to t. It follows
that the expected time of first exit of W from (−c, c) is equal to c².
    More generally, for every polynomial p(x, t) the following stochastic process is a martingale:




    where a is the polynomial




    Example:                                                   the process                               is a martingale,


    which shows that the quadratic variation of the martingale               on [0, t] is equal to

About functions p(x, t) more general than polynomials, see local martingales.


    Some properties of sample paths
    The set of all functions w with these properties is of full Wiener measure. That is, a path (sample function) of the
    Wiener process has all these properties almost surely.

    Qualitative properties
    • For every ε > 0, the function w takes both (strictly) positive and (strictly) negative values on (0, ε).
    • The function w is continuous everywhere but differentiable nowhere (like the Weierstrass function).
    • Points of local maximum of the function w are a dense countable set; the maximum values are pairwise different;
      each local maximum is sharp in the following sense: if w has a local maximum at t then




    The same holds for local minima.


    • The function w has no points of local increase, that is, no t > 0 satisfies the following for some ε in (0, t): first,
      w(s) ≤ w(t) for all s in (t − ε, t), and second, w(s) ≥ w(t) for all s in (t, t + ε). (Local increase is a weaker condition
      than that w is increasing on (t − ε, t + ε).) The same holds for local decrease.
    • The function w is of unbounded variation on every interval.
    • Zeros of the function w are a nowhere dense perfect set of Lebesgue measure 0 and Hausdorff dimension 1/2
      (therefore, uncountable).

    Quantitative properties

Law of the iterated logarithm

    limsup (t→∞) Wt / √(2t log log t) = 1,   almost surely.

    Modulus of continuity
Local modulus of continuity:

    limsup (h→0+) |W(t+h) − W(t)| / √(2h log log(1/h)) = 1,   almost surely.

Global modulus of continuity (Lévy):

    limsup (h→0+) sup{|W(t+h) − W(t)| : 0 ≤ t ≤ 1−h} / √(2h log(1/h)) = 1,   almost surely.

    Local time
    The image of the Lebesgue measure on [0, t] under the map w (the pushforward measure) has a density Lt(·). Thus,




    for a wide class of functions f (namely: all continuous functions; all locally integrable functions; all non-negative
    measurable functions). The density Lt is (more exactly, can and will be chosen to be) continuous. The number Lt(x) is
    called the local time at x of w on [0, t]. It is strictly positive for all x of the interval (a, b) where a and b are the least
    and the greatest value of w on [0, t], respectively. (For x outside this interval the local time evidently vanishes.)
    Treated as a function of two variables x and t, the local time is still continuous. Treated as a function of t (while x is
    fixed), the local time is a singular function corresponding to a nonatomic measure on the set of zeros of w.
    These continuity properties are fairly non-trivial. Consider that the local time can also be defined (as the density of
    the pushforward measure) for a smooth function. Then, however, the density is discontinuous, unless the given
    function is monotone. In other words, there is a conflict between good behavior of a function and good behavior of
    its local time. In this sense, the continuity of the local time of the Wiener process is another manifestation of
    non-smoothness of the trajectory.


    Related processes
The stochastic process defined by

    Xt = μt + σWt

is called a Wiener process with drift μ and infinitesimal variance σ2.
    These processes exhaust continuous Lévy processes.
    Two random processes on the time interval [0, 1] appear, roughly
    speaking, when conditioning the Wiener process to vanish on both
    ends of [0,1]. With no further conditioning, the process takes both
    positive and negative values on [0, 1] and is called Brownian bridge.
Conditioned also to stay positive on (0, 1), the process is called Brownian excursion.[3] In both cases a rigorous
treatment involves a limiting procedure, since the formula P(A|B) = P(A ∩ B)/P(B) does not apply when P(B) = 0.
The generator of a Brownian motion is ½ times the Laplace–Beltrami operator. The image above is of the Brownian
motion on a special manifold: the surface of a sphere.

    A geometric Brownian motion can be written


    It is a stochastic process which is used to model processes that can never take on negative values, such as the value
    of stocks.
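A minimal simulation sketch, assuming the usual parametrisation St = S0 exp((μ − σ²/2)t + σWt) of geometric Brownian motion; the drift, volatility and step count are illustrative choices, not values from the text.

    import numpy as np

    rng = np.random.default_rng(4)

    S0, mu, sigma, T, n_steps = 100.0, 0.05, 0.2, 1.0, 252
    dt = T / n_steps
    W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n_steps))   # a Wiener path on (0, T]
    t = np.linspace(dt, T, n_steps)
    S = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)      # geometric Brownian motion

    print("terminal value S_T:", S[-1])
    print("path stays strictly positive:", bool((S > 0).all()))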
    The stochastic process


    is distributed like the Ornstein–Uhlenbeck process.
    The time of hitting a single point x > 0 by the Wiener process is a random variable with the Lévy distribution. The
    family of these random variables (indexed by all positive numbers x) is a left-continuous modification of a Lévy
    process. The right-continuous modification of this process is given by times of first exit from closed intervals [0, x].
    The local time Lt(0) treated as a random function of t is a random process distributed like the process


    The local time Lt(x) treated as a random function of x (while t is constant) is a random process described by
    Ray–Knight theorems in terms of Bessel processes.


    Brownian martingales
    Let A be an event related to the Wiener process (more formally: a set, measurable with respect to the Wiener
    measure, in the space of functions), and Xt the conditional probability of A given the Wiener process on the time
    interval [0, t] (more formally: the Wiener measure of the set of trajectories whose concatenation with the given
    partial trajectory on [0, t] belongs to A). Then the process Xt is a continuous martingale. Its martingale property
    follows immediately from the definitions, but its continuity is a very special fact – a special case of a general
    theorem stating that all Brownian martingales are continuous. A Brownian martingale is, by definition, a martingale
    adapted to the Brownian filtration; and the Brownian filtration is, by definition, the filtration generated by the
    Wiener process.


    Integrated Brownian motion

The time-integral of the Wiener process,

    W(−1)(t) = ∫ from 0 to t of W(s) ds,

is called integrated Brownian motion or integrated Wiener process. It arises in many applications and can be shown
to be normally distributed with zero mean and variance t³/3; this follows from the fact that the covariance of the
Wiener process is Cov(W(s), W(t)) = min(s, t).
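A quick Monte Carlo check of the t³/3 variance, approximating the time-integral by a Riemann sum; the step and path counts are arbitrary.

    import numpy as np

    rng = np.random.default_rng(5)

    t, n_steps, n_paths = 2.0, 1_000, 5_000
    dt = t / n_steps
    W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)
    integral = W.sum(axis=1) * dt            # Riemann-sum approximation of the time-integral

    print("empirical variance:", integral.var())
    print("t**3 / 3          :", t**3 / 3)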

    Time change
    Every continuous martingale (starting at the origin) is a time changed Wiener process.
    Example: 2Wt = V(4t) where V is another Wiener process (different from W but distributed like W).

    Example.                       where                           and V is another Wiener process.

In general, if M is a continuous martingale then Mt − M0 = V(A(t)), where A(t) is the quadratic variation of M on
[0, t], and V is a Wiener process.
    Corollary. (See also Doob's martingale convergence theorems) Let Mt be a continuous martingale, and




    Then only the following two cases are possible:




    other cases (such as                                                     etc.) are of probability 0.
    Especially, a nonnegative continuous martingale has a finite limit (as t → ∞) almost surely.
    All stated (in this subsection) for martingales holds also for local martingales.


    Change of measure
    A wide class of continuous semimartingales (especially, of diffusion processes) is related to the Wiener process via a
    combination of time change and change of measure.
    Using this fact, the qualitative properties stated above for the Wiener process can be generalized to a wide class of
    continuous semimartingales.


    Complex-valued Wiener process
The complex-valued Wiener process may be defined as a complex-valued random process of the form
Zt = Xt + iYt, where Xt, Yt are independent Wiener processes (real-valued).[4]


    Self-similarity
    Brownian scaling, time reversal, time inversion: the same as in the real-valued case.
    Rotation invariance: for every complex number c such that |c|=1 the process cZt is another complex-valued Wiener
    process.

    Time change
    If f is an entire function then the process                               is a time-changed complex-valued Wiener process.

    Example.                                                                where                                         and      U     is     another

    complex-valued Wiener process.
    In contrast to the real-valued case, a complex-valued martingale is generally not a time-changed complex-valued
    Wiener process. For example, the martingale 2Xt + iYt is not (here Xt, Yt are independent Wiener processes, as
    before).


    Notes
    [1] Durrett 1996, Sect. 7.1
    [2] Shreve, Steven E (2008). Stochastic Calculus for Finance II: Continuous Time Models. Springer. pp. 114. ISBN 978-0-387-40101-0.
    [3] Vervaat, W. (1979). "A relation between Brownian bridge and Brownian excursion". Ann. Prob. 7 (1): 143–149. JSTOR 2242845.
[4] Navarro-Moreno, J.; Estudillo-Martinez, M. D.; Fernandez-Alcala, R. M.; Ruiz-Molina, J. C., "Estimation of Improper Complex-Valued Random
    Signals in Colored Noise by Using the Hilbert Space Theory"
    (http://ieeexplore.ieee.org/Xplore/login.jsp?url=http://ieeexplore.ieee.org/iel5/18/4957623/04957648.pdf?arnumber=4957648&authDecision=-203),
    IEEE Transactions on Information Theory 55 (6): 2859–2867, doi:10.1109/TIT.2009.2018329, retrieved 2010-03-30.



    References
    • Kleinert, Hagen, Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th
      edition, World Scientific (Singapore, 2004); Paperback ISBN 981-238-107-4 (also available online: PDF-files
      (http://www.physik.fu-berlin.de/~kleinert/b5))
• Stark, Henry; Woods, John W., Probability and Random Processes with Applications to Signal Processing, 3rd
  edition, Prentice Hall (New Jersey, 2002); Textbook ISBN 0-13-020071-9
• Durrett, R. (2000) Probability: Theory and Examples, 4th edition. Cambridge University Press, ISBN
  0-521-76539-0
    • Daniel Revuz and Marc Yor, Continuous martingales and Brownian motion, second edition, Springer-Verlag
      1994.


    External links
    • Brownian motion java simulation (http://galileo.phys.virginia.edu/classes/109N/more_stuff/Applets/
      brownian/applet.html)
    • Article for the school-going child (http://xxx.imsc.res.in/abs/physics/0412132)
    • Einstein on Brownian Motion (http://www.bun.kyoto-u.ac.jp/~suchii/einsteinBM.html)
    • Brownian Motion, "Diverse and Undulating" (http://arxiv.org/abs/0705.1951)
    • Short Movie on Brownian Motion (http://www.composite-agency.com/brownian_movement.htm)
    • Discusses history, botany and physics of Brown's original observations, with videos (http://physerver.hamilton.
      edu/Research/Brownian/index.html)
    • "Einstein's prediction finally witnessed one century later" (http://www.gizmag.com/
      einsteins-prediction-finally-witnessed/16212/) : a test to observe the velocity of Brownian motion



    Autoregressive model
    In statistics and signal processing, an autoregressive (AR) model is a type of random process which is often used to
    model and predict various types of natural phenomena. The autoregressive model is one of a group of linear
    prediction formulas that attempt to predict an output of a system based on the previous outputs.


    Definition
The notation AR(p) indicates an autoregressive model of order p. The AR(p) model is defined as

    Xt = c + Σ (i = 1 to p) φi Xt−i + εt

where φ1, ..., φp are the parameters of the model, c is a constant (often omitted for simplicity) and εt is white
noise.
    An autoregressive model can thus be viewed as the output of an all-pole infinite impulse response filter whose input
    is white noise.
Some constraints are necessary on the values of the parameters of this model in order that the model remains
wide-sense stationary. For example, processes in the AR(1) model with |φ1| ≥ 1 are not stationary. More generally,
for an AR(p) model to be wide-sense stationary, the roots of the polynomial z^p − φ1 z^(p−1) − ⋯ − φp must lie within
the unit circle, i.e., each root zi must satisfy |zi| < 1.
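A small helper, as a sketch, that checks this root condition numerically; the function name is_stationary and the example coefficients are illustrative assumptions.

    import numpy as np

    def is_stationary(phi):
        # Roots of z**p - phi_1 z**(p-1) - ... - phi_p must lie strictly inside the unit circle.
        phi = np.asarray(phi, dtype=float)
        roots = np.roots(np.concatenate(([1.0], -phi)))
        return bool(np.all(np.abs(roots) < 1.0))

    print(is_stationary([0.5]))          # AR(1) with |phi_1| < 1  -> True
    print(is_stationary([1.1]))          # AR(1) with |phi_1| >= 1 -> False
    print(is_stationary([0.5, -0.3]))    # AR(2) inside the stationarity region -> True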

    Characteristic polynomial
    The autocorrelation function of an AR(p) process can be expressed as




    where      are the roots of the polynomial




    The autocorrelation function of an AR(p) process is a sum of decaying exponentials.
    • Each real root contributes a component to the autocorrelation function that decays exponentially.
    • Similarly, each pair of complex conjugate roots contributes an exponentially damped oscillation.


    Graphs of AR(p) processes
    The simplest AR process is AR(0), which has no dependence between
    the terms. Only the error/innovation/noise term contributes to the
    output of the process, so in the figure, AR(0) corresponds to white
    noise.
    For an AR(1) process with a positive       , only the previous term in the
    process and the noise term contribute to the output. If       is close to 0,
    then the process still looks like white noise, but as   approaches 1, the
    output gets a larger contribution from the previous term relative to the
    noise. This results in a "smoothing" or integration of the output, similar       AR(0), AR(1), and AR(2) processes with white
    to a low pass filter.                                                                               noise


    For an AR(2) process, the previous two terms and the noise term contribute to the output. If both                 and      are
    positive, the output will resemble a low pass filter, with the high frequency part of the noise decreased. If                  is
    positive while         is negative, then the process favors changes in sign between terms of the process. The output
    oscillates.


    Example: An AR(1)-process
An AR(1)-process is given by:

    Xt = c + φ Xt−1 + εt

where εt is a white noise process with zero mean and variance σε². (Note: The subscript on φ1 has been dropped.)
The process is wide-sense stationary if |φ| < 1, since it is obtained as the output of a stable filter whose input is
white noise. (If |φ| ≥ 1 then Xt has infinite variance, and is therefore not wide-sense stationary.) Consequently,
assuming |φ| < 1, the mean E[Xt] is identical for all values of t. If the mean is denoted by μ, it follows from

    E[Xt] = c + φ E[Xt−1] + E[εt]

that

    μ = c + φμ,

and hence

    μ = c / (1 − φ).

In particular, if c = 0, then the mean is 0.
The variance is

    Var(Xt) = σε² / (1 − φ²),

where σε is the standard deviation of εt. This can be shown by noting that

    Var(Xt) = φ² Var(Xt−1) + σε²,

and then by noticing that the quantity above is a stable fixed point of this relation.
The autocovariance is given by

    B(n) = E[(Xt+n − μ)(Xt − μ)] = (σε² / (1 − φ²)) φ^|n|.
    It can be seen that the autocovariance function decays with a decay time (also called time constant) of
                    [to see this, write               where     is independent of    . Then note that
                        and match this to the exponential decay law             ].
    The spectral density function is the Fourier transform of the autocovariance function. In discrete terms this will be
    the discrete-time Fourier transform:




    This expression is periodic due to the discrete nature of the              , which is manifested as the cosine term in the
    denominator. If we assume that the sampling time (                    ) is much smaller than the decay time (      ), then we
    can use a continuum approximation to               :



    which yields a Lorentzian profile for the spectral density:




    where              is the angular frequency associated with the decay time    .
    An alternative expression for      can be derived by first substituting                       for       in the defining
    equation. Continuing this process N times yields




    For N approaching infinity,       will approach zero and:




    It is seen that     is white noise convolved with the         kernel plus the constant mean. If the white noise    is a
    Gaussian process then       is also a Gaussian process. In other cases, the central limit theorem indicates that    will
    be approximately normally distributed when        is close to one.
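A simulation sketch of this AR(1) example, comparing the empirical mean, variance and lag-1 autocorrelation with the closed-form values c/(1 − φ), σε²/(1 − φ²) and φ discussed above; the particular parameter values are arbitrary.

    import numpy as np

    rng = np.random.default_rng(6)

    c, phi, sigma_eps, n = 1.0, 0.7, 1.0, 200_000
    X = np.empty(n)
    X[0] = c / (1 - phi)                         # start at the stationary mean
    eps = rng.normal(0.0, sigma_eps, size=n)
    for t in range(1, n):
        X[t] = c + phi * X[t - 1] + eps[t]       # the AR(1) recursion

    print("mean     :", X.mean(), "vs", c / (1 - phi))
    print("variance :", X.var(),  "vs", sigma_eps**2 / (1 - phi**2))
    lag1 = np.corrcoef(X[:-1], X[1:])[0, 1]
    print("lag-1 autocorrelation:", lag1, "vs", phi)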

    Calculation of the AR parameters
    There are many ways to estimate the coefficients, such as the ordinary least squares procedure, method of moments
(through the Yule-Walker equations), or Markov chain Monte Carlo methods.
    The AR(p) model is given by the equation




    It is based on parameters       where i = 1, ..., p. There is a direct correspondence between these parameters and the
    covariance function of the process, and this correspondence can be inverted to determine the parameters from the
    autocorrelation function (which is itself obtained from the covariances). This is done using the Yule-Walker
    equations.


    Yule-Walker equations
The Yule-Walker equations are the following set of equations:

    γm = Σ (k = 1 to p) φk γm−k + σε² δm,0

where m = 0, ..., p, yielding p + 1 equations. Here γm is the autocovariance function of Xt, σε is the standard
deviation of the input noise process, and δm,0 is the Kronecker delta function.
Because the last part of an individual equation is non-zero only if m = 0, the set of equations can be solved by
representing the equations for m > 0 in matrix form, thus getting the equations

    γm = Σ (k = 1 to p) φk γm−k,   m = 1, ..., p,

i.e. a p × p linear system with the Toeplitz coefficient matrix [γ|j−k|], which can be solved for all φ1, ..., φp.
The remaining equation for m = 0 is

    γ0 = Σ (k = 1 to p) φk γk + σε²,

which, once φ1, ..., φp are known, can be solved for σε².


An alternative formulation is in terms of the autocorrelation function. The AR parameters are determined by the first
p + 1 elements ρ(0), ρ(1), ..., ρ(p) of the autocorrelation function. The full autocorrelation function can then be
derived by recursively calculating[1]

    ρ(τ) = Σ (k = 1 to p) φk ρ(τ − k).
Examples for some low-order AR(p) processes
• p=1
  • γ1 = φ1 γ0
  • Hence ρ1 = γ1 / γ0 = φ1
• p=2
  • The Yule-Walker equations for an AR(2) process are

        γ1 = φ1 γ0 + φ2 γ−1
        γ2 = φ1 γ1 + φ2 γ0

    • Remember that γ−k = γk.
    • Using the first equation yields ρ1 = γ1 / γ0 = φ1 / (1 − φ2).
    • Using the recursion formula yields ρ2 = γ2 / γ0 = φ1² / (1 − φ2) + φ2.


    Estimation of AR parameters
    The above equations (the Yule-Walker equations) provide several routes to estimating the parameters of an AR(p)
    model, by replacing the theoretical covariances with estimated values. Some of these variants can be described as
    follows:
    • Estimation of autocovariances or autocorrelations. Here each of these terms is estimated separately, using
      conventional estimates. There are different ways of doing this and the choice between these affects the properties
      of the estimation scheme. For example, negative estimates of the variance can be produced by some choices.
    • Formulation as a least squares regression problem in which an ordinary least squares prediction problem is
      constructed, basing prediction of values of Xt on the p previous values of the same series. This can be thought of
      as a forward-prediction scheme. The normal equations for this problem can be seen to correspond to an
      approximation of the matrix form of the Yule-Walker equations in which each appearance of an autocovariance of
      the same lag is replaced by a slightly different estimate.
    • Formulation as an extended form of ordinary least squares prediction problem. Here two sets of prediction
equations are combined into a single estimation scheme and a single set of normal equations. One set is the set of
      forward-prediction equations and the other is a corresponding set of backward prediction equations, relating to the
      backward representation of the AR model:




Here predicted values of Xt would be based on the p future values of the same series. This way of estimating
the AR parameters is due to Burg,[2] and is called the Burg method:[3] Burg and later authors called these particular
           estimates "maximum entropy estimates",[4] but the reasoning behind this applies to the use of any set of
           estimated AR parameters. Compared to the estimation scheme using only the forward prediction equations,
           different estimates of the autocovariances are produced, and the estimates have different stability properties.
           Burg estimates are particularly associated with maximum entropy spectral estimation.[5]


    Other possible approaches to estimation include maximum likelihood estimation. Two distinct variants of maximum
    likelihood are available: in one (broadly equivalent to the forward prediction least squares scheme) the likelihood
function considered is that corresponding to the conditional distribution of later values in the series given the initial
p values in the series; in the second, the likelihood function considered is that corresponding to the unconditional
    joint distribution of all the values in the observed series. Substantial differences in the results of these approaches can
    occur if the observed series is short, or if the process is close to non-stationarity.
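A sketch of the method-of-moments route described above: estimate the autocovariances from the data and solve the Yule-Walker system with plain numpy. The helper name yule_walker and the simulated AR(2) example are assumptions for illustration, not a reference implementation of any particular package.

    import numpy as np

    rng = np.random.default_rng(7)

    def yule_walker(x, p):
        # Fit AR(p) by solving the Yule-Walker equations with sample autocovariances.
        x = np.asarray(x, dtype=float) - np.mean(x)
        n = len(x)
        gamma = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(p + 1)])
        R = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz matrix
        phi = np.linalg.solve(R, gamma[1:])
        sigma2 = gamma[0] - np.dot(phi, gamma[1:])      # the m = 0 equation
        return phi, sigma2

    # Simulate an AR(2) series and recover its parameters
    true_phi, n = np.array([0.6, -0.3]), 100_000
    eps = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = true_phi[0] * x[t - 1] + true_phi[1] * x[t - 2] + eps[t]

    phi_hat, sigma2_hat = yule_walker(x, p=2)
    print("phi_hat   :", phi_hat)        # close to [0.6, -0.3]
    print("sigma2_hat:", sigma2_hat)     # close to 1.0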


    Spectrum
The power spectral density of an AR(p) process with noise variance Var(Zt) = σZ² is[1]

    S(f) = σZ² / |1 − Σ (k = 1 to p) φk exp(−2πikf)|².
    AR(0)
For white noise (AR(0)),

    S(f) = σZ².
    AR(1)
For AR(1),

    S(f) = σZ² / (1 + φ1² − 2φ1 cos(2πf)).
• If φ1 > 0 there is a single spectral peak at f = 0, often referred to as red noise. As φ1 becomes nearer 1, there
  is stronger power at low frequencies, i.e. larger time lags.
• If φ1 < 0 there is a minimum at f = 0, often referred to as blue noise.



    AR(2)
    AR(2) processes can be split into three groups depending on the characteristics of their roots:



    • When                      , the process has a pair of complex-conjugate roots, creating a mid-frequency peak at:




    Otherwise the process has real roots, and:
• When \varphi_1 > 0 it acts as a low-pass filter on the white noise with a spectral peak at f = 0
• When \varphi_1 < 0 it acts as a high-pass filter on the white noise with a spectral peak at f = 1/2.
The process is stable when the roots are within the unit circle, or equivalently when the coefficients are in the
triangle -1 \le \varphi_2 \le 1 - |\varphi_1|.
The full PSD function can be expressed in real form as:

    S(f) = \frac{\sigma_Z^2}{1 + \varphi_1^2 + \varphi_2^2 - 2\varphi_1(1 - \varphi_2)\cos(2\pi f) - 2\varphi_2 \cos(4\pi f)}

    Implementations in statistics packages
    • R, the stats package includes an ar function.[6]
    • Matlab and Octave: the TSA toolbox contains several estimation functions for uni-variate, multivariate and
      adaptive autoregressive models.[7]
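
Outside these packages, the spectral density formula given above can also be evaluated directly. A minimal NumPy
sketch (the coefficient values are illustrative) for an AR(2) process with complex-conjugate roots:

    import numpy as np

    def ar_psd(phi, sigma2, freqs):
        """Power spectral density of an AR(p) process at frequencies given in cycles per sample,
        using S(f) = sigma2 / |1 - sum_k phi_k exp(-2 pi i k f)|^2."""
        phi = np.asarray(phi, dtype=float)
        k = np.arange(1, len(phi) + 1)
        denom = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k)) @ phi
        return sigma2 / np.abs(denom) ** 2

    freqs = np.linspace(0.0, 0.5, 501)
    # phi1 = 0.9, phi2 = -0.8 satisfies phi1^2 + 4*phi2 < 0, so a mid-frequency peak is expected
    psd = ar_psd([0.9, -0.8], sigma2=1.0, freqs=freqs)
    print(freqs[np.argmax(psd)])     # location of the spectral peak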


    Notes
    [1] Von Storch, H.; F. W Zwiers (2001). Statistical analysis in climate research. Cambridge Univ Pr. ISBN 0-521-01230-9.
    [2] Burg, J. P. (1968). "A new analysis technique for time series data". In Modern Spectrum Analysis (Edited by D. G. Childers), NATO
        Advanced Study Institute of Signal Processing with emphasis on Underwater Acoustics. IEEE Press, New York.
    [3] Brockwell, Peter J.; Dahlhaus, Rainer; Trindade, A. Alexandre (2005). "Modified Burg Algorithms for Multivariate Subset Autoregression"
        (http:/ / www3. stat. sinica. edu. tw/ statistica/ oldpdf/ A15n112. pdf). Statistica Sinica 15: 197–213. .
    [4] Burg, J.P. (1967) "Maximum Entropy Spectral Analysis", Proceedings of the 37th Meeting of the Society of Exploration Geophysicists,
        Oklahoma City, Oklahoma.
    [5] Bos, R.; De Waele, S.; Broersen, P. M. T. (2002). "Autoregressive spectral estimation by application of the burg algorithm to irregularly
        sampled data". IEEE Transactions on Instrumentation and Measurement 51 (6): 1289. doi:10.1109/TIM.2002.808031.
    [6] "Fit Autoregressive Models to Time Series" (http:/ / finzi. psych. upenn. edu/ R/ library/ stats/ html/ ar. html) (in R)
    [7] "Time Series Analysis toolbox for Matlab and Octave" (http:/ / pub. ist. ac. at/ ~schloegl/ matlab/ tsa/ )



    References
    • Mills, Terence C. (1990) Time Series Techniques for Economists. Cambridge University Press
    • Percival, Donald B. and Andrew T. Walden. (1993) Spectral Analysis for Physical Applications. Cambridge
      University Press
    • Pandit, Sudhakar M. and Wu, Shien-Ming. (1983) Time Series and System Analysis with Applications. John Wiley
      & Sons
    • Yule, G. Udny (1927) "On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to
      Wolfer's Sunspot Numbers" (http://visualiseur.bnf.fr/Visualiseur?Destination=Gallica&O=NUMM-56031),
      Philosophical Transactions of the Royal Society of London, Ser. A, Vol. 226, 267–298.]
    • Walker, Gilbert (1931) "On Periodicity in Series of Related Terms" (http://visualiseur.bnf.fr/
      Visualiseur?Destination=Gallica&O=NUMM-56224), Proceedings of the Royal Society of London, Ser. A, Vol.
      131, 518–532.


    External links
    • AutoRegression Analysis (AR) by Paul Bourke (http://paulbourke.net/miscellaneous/ar/)



    Moving average
    In statistics, a moving average, also called rolling average, rolling
    mean or running average, is a type of finite impulse response filter
    used to analyze a set of data points by creating a series of averages of
    different subsets of the full data set.
    Given a series of numbers and a fixed subset size, the first element of
    the moving average is obtained by taking the average of the initial
    fixed subset of the number series. Then the subset is modified by
    "shifting forward", that is excluding the first number of the series and
    including the next number following the original subset in the series.
    This creates a new subset of numbers, which is averaged. This process
    is repeated over the entire data series. The plot line connecting all the (fixed) averages is the moving average. A
    moving average is a set of numbers, each of which is the average of the corresponding subset of a larger set of datum
    points. A moving average may also use unequal weights for each datum value in the subset to emphasize particular
    values in the subset.

    A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight
    longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the
    parameters of the moving average will be set accordingly. For example, it is often used in technical analysis of
    financial data, like stock prices, returns or trading volumes. It is also used in economics to examine gross domestic
    product, employment or other macroeconomic time series. Mathematically, a moving average is a type of
    convolution and so it can be viewed as an example of a low-pass filter used in signal processing. When used with
    non-time series data, a moving average filters higher frequency components without any specific connection to time,
    although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data.


    Simple moving average
    In financial applications a simple moving average (SMA) is the unweighted mean of the previous n datum points.
    However, in science and engineering the mean is normally taken from an equal number of data on either side of a
    central value. This ensures that variations in the mean are aligned with the variations in the data rather than being
shifted in time. An example of a simple unweighted running mean for an n-day sample of closing price is the mean of
the previous n days' closing prices. If those prices are p_M, p_{M-1}, \ldots, p_{M-n+1} then the formula is

    \mathit{SMA} = \frac{p_M + p_{M-1} + \cdots + p_{M-n+1}}{n}

    When calculating successive values, a new value comes into the sum and an old value drops out, meaning a full
summation each time is unnecessary for this simple case,

    \mathit{SMA}_{\mathrm{today}} = \mathit{SMA}_{\mathrm{yesterday}} - \frac{p_{M-n+1}}{n} + \frac{p_{M+1}}{n}

    The period selected depends on the type of movement of interest, such as short, intermediate, or long term. In
    financial terms moving average levels can be interpreted as support in a rising market, or resistance in a falling
    market.
    If the data used are not centred around the mean, a simple moving average lags behind the latest datum point by half
    the sample width. An SMA can also be disproportionately influenced by old datum points dropping out or new data
    coming in. One characteristic of the SMA is that if the data have a periodic fluctuation, then applying an SMA of
    that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular
    cycle is rarely encountered.[1]
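
As a concrete illustration of the running-sum update described above, here is a minimal NumPy sketch (the function
name and test data are illustrative):

    import numpy as np

    def simple_moving_average(prices, n):
        """Trailing SMA computed with the running-sum update: each step adds the newest
        price and drops the one that leaves the n-sample window."""
        prices = np.asarray(prices, dtype=float)
        out = np.empty(len(prices) - n + 1)
        window_sum = prices[:n].sum()
        out[0] = window_sum / n
        for i in range(1, len(out)):
            window_sum += prices[i + n - 1] - prices[i - 1]    # new value in, old value out
            out[i] = window_sum / n
        return out

    print(simple_moving_average([1, 2, 3, 4, 5, 6], n=3))      # [2. 3. 4. 5.]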


    For a number of applications it is advantageous to avoid the shifting induced by using only 'past' data. Hence a
    central moving average can be computed, using data equally spaced either side of the point in the series where the
    mean is calculated. This requires using an odd number of datum points in the sample window.


    Cumulative moving average
    In a cumulative moving average, the data arrive in an ordered datum stream and the statistician would like to get
    the average of all of the data up until the current datum point. For example, an investor may want the average price
    of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the
    average price at the time of the transaction can be calculated for all of the transactions up to that point using the
cumulative average, typically an unweighted average of the sequence of i values x1, ..., xi up to the current time:

    \mathit{CA}_i = \frac{x_1 + \cdots + x_i}{i}

    The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the
    number of datum points every time a new datum point arrived. However, it is possible to simply update cumulative
average as a new value xi+1 becomes available, using the formula:

    \mathit{CA}_{i+1} = \mathit{CA}_i + \frac{x_{i+1} - \mathit{CA}_i}{i+1}

where \mathit{CA}_0 can be taken to be equal to 0.
    Thus the current cumulative average for a new datum point is equal to the previous cumulative average plus the
    difference between the latest datum point and the previous average divided by the number of points received so far.
    When all of the datum points arrive (i = N), the cumulative average will equal the final average.
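
A short Python sketch of this streaming update (the function name and example prices are illustrative):

    def cumulative_moving_average(stream):
        """Applies CA_{i+1} = CA_i + (x_{i+1} - CA_i) / (i + 1) without storing the data."""
        ca = 0.0
        for i, x in enumerate(stream, start=1):
            ca += (x - ca) / i
            yield ca

    prices = [10.0, 11.0, 13.0, 10.0]
    print(list(cumulative_moving_average(prices)))    # [10.0, 10.5, 11.333..., 11.0]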
The derivation of the cumulative average formula is straightforward. Using

    x_1 + \cdots + x_i = i \cdot \mathit{CA}_i

and similarly for i + 1, it is seen that

    x_{i+1} = (i + 1) \cdot \mathit{CA}_{i+1} - i \cdot \mathit{CA}_i

Solving this equation for CAi+1 results in:

    \mathit{CA}_{i+1} = \frac{x_{i+1} + i \cdot \mathit{CA}_i}{i + 1} = \mathit{CA}_i + \frac{x_{i+1} - \mathit{CA}_i}{i + 1}


    Weighted moving average
    A weighted average is any average that has multiplying factors to give different weights to data at different positions
    in the sample window. Mathematically, the moving average is the convolution of the datum points with a fixed
    weighting function. One application is removing pixelisation from a digital graphical image.
    In technical analysis of financial data, a weighted moving average (WMA) has the specific meaning of weights that
    decrease in arithmetical progression.[2] In an n-day WMA the latest day has weight n, the second latest n − 1, etc.,
    down to one.

    \mathit{WMA}_M = \frac{n\,p_M + (n-1)\,p_{M-1} + \cdots + 2\,p_{M-n+2} + p_{M-n+1}}{n + (n-1) + \cdots + 2 + 1}

The denominator is a triangle number equal to \frac{n(n+1)}{2}. In the more general case the denominator will always
be the sum of the individual weights.
    When calculating the WMA across successive values, the difference
    between the numerators of WMAM+1 and WMAM is
    npM+1 − pM − ... − pM−n+1. If we denote the sum pM + ... + pM−n+1 by
TotalM, then

    \mathrm{Total}_{M+1} = \mathrm{Total}_M + p_{M+1} - p_{M-n+1}

    \mathrm{Numerator}_{M+1} = \mathrm{Numerator}_M + n\,p_{M+1} - \mathrm{Total}_M

    \mathit{WMA}_{M+1} = \frac{\mathrm{Numerator}_{M+1}}{n(n+1)/2}

[Figure: WMA weights, n = 15]

    The graph at the right shows how the weights decrease, from highest weight for the most recent datum points, down
    to zero. It can be compared to the weights in the exponential moving average which follows.
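
A minimal NumPy sketch of the n-day WMA with arithmetically decreasing weights (names and test data are
illustrative):

    import numpy as np

    def weighted_moving_average(prices, n):
        """n-day WMA: the latest price gets weight n, the one before n-1, ..., the oldest gets 1."""
        prices = np.asarray(prices, dtype=float)
        weights = np.arange(1, n + 1)                 # 1, 2, ..., n (oldest to newest)
        denom = n * (n + 1) / 2                       # triangle number
        return np.array([np.dot(prices[i:i + n], weights) / denom
                         for i in range(len(prices) - n + 1)])

    print(weighted_moving_average([1, 2, 3, 4, 5], n=3))   # [2.333..., 3.333..., 4.333...]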


    Exponential moving average
    An exponential moving average (EMA), also known as an
    exponentially weighted moving average (EWMA),[3] is a type of
    infinite impulse response filter that applies weighting factors which
    decrease exponentially. The weighting for each older datum point
    decreases exponentially, never reaching zero. The graph at right shows
    an example of the weight decrease.

The EMA for a series Y may be calculated recursively:

    S_t = \alpha \cdot Y_{t-1} + (1 - \alpha) \cdot S_{t-1} \quad \text{for } t > 1

[Figure: EMA weights, N = 15]

    Where:
    • The coefficient α represents the degree of weighting decrease, a constant smoothing factor between 0 and 1. A
      higher α discounts older observations faster. Alternatively, α may be expressed in terms of N time periods, where
      α = 2/(N+1). For example, N = 19 is equivalent to α = 0.1. The half-life of the weights (the interval over which the
      weights decrease by a factor of two) is approximately N/2.8854 (within 1% if N > 5).
    • Yt is the value at a time period t.
    • St is the value of the EMA at any time period t.
    S1 is undefined. S1 may be initialized in a number of different ways, most commonly by setting S1 to Y1, though
    other techniques exist, such as setting S1 to an average of the first 4 or 5 observations. The prominence of the S1
    initialization's effect on the resultant moving average depends on α; smaller α values make the choice of S1 relatively
    more important than larger α values, since a higher α discounts older observations faster.
    This formulation is according to Hunter (1986).[4] By repeated application of this formula for different times, we can
eventually write St as a weighted sum of the datum points Yt, as:

    S_t = \alpha \left[ Y_{t-1} + (1-\alpha) Y_{t-2} + (1-\alpha)^2 Y_{t-3} + \cdots + (1-\alpha)^{k} Y_{t-(k+1)} \right] + (1-\alpha)^{k+1} S_{t-(k+1)}

for any suitable k = 0, 1, 2, ... The weight of the general datum point Y_{t-i} is \alpha (1-\alpha)^{i-1}.
An alternate approach by Roberts (1959)[5] uses Yt in lieu of Yt−1:

    S_t = \alpha \cdot Y_t + (1 - \alpha) \cdot S_{t-1}

    This formula can also be expressed in technical analysis terms as follows, showing how the EMA steps towards the
latest datum point, but only by a proportion of the difference (each time):

    \mathit{EMA}_{\mathrm{today}} = \mathit{EMA}_{\mathrm{yesterday}} + \alpha \cdot \left( \mathrm{price}_{\mathrm{today}} - \mathit{EMA}_{\mathrm{yesterday}} \right)

Expanding out \mathit{EMA}_{\mathrm{yesterday}} each time results in the following power series, showing how the weighting factor on
each datum point p1, p2, etc., decreases exponentially:

    \mathit{EMA} = \alpha \left( p_1 + (1-\alpha)\,p_2 + (1-\alpha)^2 p_3 + (1-\alpha)^3 p_4 + \cdots \right)

where
• p_1 is \mathrm{price}_{\mathrm{today}}
• p_2 is \mathrm{price}_{\mathrm{yesterday}}
• and so on

    \mathit{EMA} = \frac{p_1 + (1-\alpha)\,p_2 + (1-\alpha)^2 p_3 + (1-\alpha)^3 p_4 + \cdots}{1 + (1-\alpha) + (1-\alpha)^2 + (1-\alpha)^3 + \cdots},

since \frac{1}{\alpha} = 1 + (1-\alpha) + (1-\alpha)^2 + \cdots.
    This is an infinite sum with decreasing terms.
    The N periods in an N-day EMA only specify the α factor. N is not a stopping point for the calculation in the way it
is in an SMA or WMA. For sufficiently large N, the first N datum points in an EMA represent about 86% of the
total weight in the calculation[6]:

    \frac{\alpha \left( 1 + (1-\alpha) + (1-\alpha)^2 + \cdots + (1-\alpha)^{N-1} \right)}{\alpha \left( 1 + (1-\alpha) + (1-\alpha)^2 + \cdots \right)} = 1 - (1-\alpha)^{N}

i.e. 1 - \left(1 - \frac{2}{N+1}\right)^{N}, which, simplified,[7] tends to 1 - e^{-2} \approx 0.8647.

    The power formula above gives a starting value for a particular day, after which the successive days formula shown
    first can be applied. The question of how far back to go for an initial value depends, in the worst case, on the data.
Large price values in old data will affect the total even if their weighting is very small. If prices have small
variations then just the weighting can be considered. The weight omitted by stopping after k terms is

    \alpha \left( (1-\alpha)^{k} + (1-\alpha)^{k+1} + (1-\alpha)^{k+2} + \cdots \right)

which is

    \alpha (1-\alpha)^{k} \left( 1 + (1-\alpha) + (1-\alpha)^2 + \cdots \right),

i.e. a fraction

    (1-\alpha)^{k}

out of the total weight.
For example, to have 99.9% of the weight, set the above ratio equal to 0.1% and solve for k:

    k = \frac{\log(0.001)}{\log(1-\alpha)}

terms should be used. Since \log(1-\alpha) approaches -\frac{2}{N+1} as N increases,[8] this simplifies to approximately[9]

    k \approx 3.45\,(N+1)

for this example (99.9% weight).
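
A short Python sketch of the recursive EMA, using the Roberts form S_t = αY_t + (1−α)S_{t−1} given above (names and
test data are illustrative):

    import numpy as np

    def exponential_moving_average(y, alpha, s1=None):
        """EMA in the form S_t = alpha*Y_t + (1 - alpha)*S_{t-1}.
        S_1 is initialised to Y_1 unless another starting value is supplied."""
        y = np.asarray(y, dtype=float)
        s = np.empty_like(y)
        s[0] = y[0] if s1 is None else s1
        for t in range(1, len(y)):
            s[t] = alpha * y[t] + (1.0 - alpha) * s[t - 1]
        return s

    N = 19                        # an "N-day" EMA
    alpha = 2.0 / (N + 1)         # = 0.1 for N = 19, as in the text
    print(exponential_moving_average([10, 11, 12, 11, 10], alpha))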


    Modified moving average
A modified moving average (MMA), running moving average (RMA), or smoothed moving average is defined
as:

    \mathit{MMA}_{\mathrm{today}} = \frac{(N-1) \cdot \mathit{MMA}_{\mathrm{yesterday}} + \mathrm{price}_{\mathrm{today}}}{N}

In short, this is an exponential moving average with \alpha = 1/N.

    Application to measuring computer performance
    Some computer performance metrics, e.g. the average process queue length, or the average CPU utilization, use a
form of exponential moving average.

    S_n = \alpha(t_n - t_{n-1}) \cdot Y_n + \left( 1 - \alpha(t_n - t_{n-1}) \right) \cdot S_{n-1}

Here \alpha is defined as a function of the time between two readings. An example of a coefficient giving bigger weight to
the current reading, and smaller weight to the older readings, is

    \alpha(t_n - t_{n-1}) = 1 - e^{-\frac{t_n - t_{n-1}}{W \cdot 60}}

where time for readings tn is expressed in seconds, and W is the period of time in minutes over which the reading is
said to be averaged (the mean lifetime of each reading in the average). Given the above definition of \alpha, the moving
average can be expressed as

    S_n = \left( 1 - e^{-\frac{t_n - t_{n-1}}{W \cdot 60}} \right) Y_n + e^{-\frac{t_n - t_{n-1}}{W \cdot 60}} \, S_{n-1}

For example, a 15-minute average L of a process queue length Q, measured every 5 seconds (time difference is 5
seconds), is computed as

    L_n = \left( 1 - e^{-\frac{5}{15 \cdot 60}} \right) Q_n + e^{-\frac{5}{15 \cdot 60}} \, L_{n-1}
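
A Python sketch of this time-dependent smoothing, assuming the reconstruction of the coefficient given above
(function name and sample data are illustrative):

    import math

    def irregular_ema(samples, w_minutes):
        """EMA whose coefficient depends on the time between readings:
        alpha = 1 - exp(-dt / (60 * W)), so older readings decay with mean lifetime W minutes.
        `samples` is a list of (time_in_seconds, value) pairs."""
        avg = None
        t_prev = None
        for t, y in samples:
            if avg is None:
                avg = y                                    # first reading initialises the average
            else:
                alpha = 1.0 - math.exp(-(t - t_prev) / (60.0 * w_minutes))
                avg = alpha * y + (1.0 - alpha) * avg
            t_prev = t
        return avg

    # 15-minute average of a queue length sampled every 5 seconds
    samples = [(5 * k, q) for k, q in enumerate([0, 2, 3, 1, 4, 2, 0, 1])]
    print(irregular_ema(samples, w_minutes=15))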




    Other weightings
    Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each
    time period in proportion to its trading volume.
    A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[10] (a central moving average). The
    symmetric weight coefficients are -3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3.
    Outside the world of finance, weighted running means have many forms and applications. Each weighting function
    or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is
    often of primary importance in understanding the desired and undesired distortions that a particular filter will apply
    to the data.
    A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used
    should be understood in order to make an appropriate choice. On this point, the French version of this article
    discusses the spectral effects of 3 kinds of means (cumulative, exponential, Gaussian).


    Moving median
    From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is
    susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple
moving median over n time points:

    \widetilde{p}_{\mathrm{SM}} = \mathrm{Median}\left( p_M, p_{M-1}, \ldots, p_{M-n+1} \right)

    where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle.
    Statistically, the moving average is optimal for recovering the underlying trend of the time series when the
    fluctuations about the trend are normally distributed. However, the normal distribution does not place high
    probability on very large deviations from the trend which explains why such deviations will have a
    disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to
    be Laplace distributed, then the moving median is statistically optimal.[11] For a given variance, the Laplace
    distribution places higher probability on rare events than does the normal, which explains why the moving median
    tolerates shocks better than the moving mean.
    When the simple moving median above is central, the smoothing is identical to the median filter which has
    applications in, for example, image signal processing.
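
A minimal NumPy sketch of the trailing moving median (names and test data are illustrative):

    import numpy as np

    def moving_median(x, n):
        """Simple trailing moving median over n points; more robust to shocks than the mean."""
        x = np.asarray(x, dtype=float)
        return np.array([np.median(x[i:i + n]) for i in range(len(x) - n + 1)])

    data = [1.0, 1.1, 0.9, 50.0, 1.0, 1.2, 0.8]     # one large shock
    print(moving_median(data, n=3))                 # the shock barely moves the median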


    Notes and references
    [1]  Statistical Analysis, Ya-lun Chou, Holt International, 1975, ISBN 0-03-089422-0, section 17.9.
    [2]  "Weighted Moving Averages: The Basics" (http:/ / www. investopedia. com/ articles/ technical/ 060401. asp). Investopedia. .
    [3]  http:/ / lorien. ncl. ac. uk/ ming/ filter/ filewma. htm
    [4]  NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing (http:/ / www. itl. nist. gov/ div898/ handbook/ pmc/
        section4/ pmc431. htm) at the National Institute of Standards and Technology
    [5] NIST/SEMATECH e-Handbook of Statistical Methods: EWMA Control Charts (http:/ / www. itl. nist. gov/ div898/ handbook/ pmc/
        section3/ pmc324. htm) at the National Institute of Standards and Technology
    [6] The denominator on the left-hand side should be unity, and the numerator will become the right-hand side (geometric series),

                                         .

    [7] Because (1+x/n)n becomes ex for large n.
[8] It means \alpha \to 0, and the Taylor series of \log(1-\alpha) tends to -\alpha.
    [9] loge(0.001) / 2 = -3.45
    [10] Spencer's 15-Point Moving Average — from Wolfram MathWorld (http:/ / mathworld. wolfram. com/ Spencers15-PointMovingAverage.
        html)
    [11] G.R. Arce, "Nonlinear Signal Processing: A Statistical Approach", Wiley:New Jersey, USA, 2005.



    Autoregressive–moving-average model
    In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious
description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression and
    the second for moving averages. The general ARMA model was described in the 1951 thesis of Peter Whittle,
    Hypothesis testing in time series analysis, and it was popularized in the 1971 book by George E. P. Box and Gwilym
    Jenkins.
    Given a time series of data Xt, the ARMA model is a tool for understanding and, perhaps, predicting future values in
    this series. The model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. The
    model is usually then referred to as the ARMA(p,q) model where p is the order of the autoregressive part and q is the
    order of the moving average part (as defined below).


    Autoregressive model
The notation AR(p) refers to the autoregressive model of order p. The AR(p) model is written

    X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t

where \varphi_1, \ldots, \varphi_p are parameters, c is a constant, and the random variable \varepsilon_t is white noise.
    An autoregressive model is essentially an all-pole infinite impulse response filter with some additional interpretation
    placed on it.
    Some constraints are necessary on the values of the parameters of this model in order that the model remains
    stationary. For example, processes in the AR(1) model with |φ1| ≥ 1 are not stationary.


    Moving-average model
The notation MA(q) refers to the moving average model of order q:

    X_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}

where the θ1, ..., θq are the parameters of the model, μ is the expectation of X_t (often assumed to equal 0), and the
\varepsilon_t, \varepsilon_{t-1}, ... are again white noise error terms. The moving-average model is essentially a finite impulse response
    filter with some additional interpretation placed on it.


    Autoregressive–moving-average model
The notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms. This model
contains the AR(p) and MA(q) models,

    X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}

    The general ARMA model was described in the 1951 thesis of Peter Whittle, who used mathematical analysis
    (Laurent series and Fourier analysis) and statistical inference.[1][2] ARMA models were popularized by a 1971 book
    by George E. P. Box and Jenkins, who expounded an iterative (Box–Jenkins) method for choosing and estimating
    them. This method was useful for low-order polynomials (of degree three or less).[3]
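
For illustration, a minimal NumPy sketch that simulates a sample path from the ARMA equation above (function
name, parameter values and burn-in length are illustrative):

    import numpy as np

    def simulate_arma(phi, theta, c=0.0, n=500, sigma=1.0, burn=200, seed=0):
        """Simulate X_t = c + sum_i phi_i X_{t-i} + eps_t + sum_j theta_j eps_{t-j}
        with Gaussian white-noise errors; the first `burn` samples are discarded."""
        rng = np.random.default_rng(seed)
        p, q = len(phi), len(theta)
        eps = rng.normal(0.0, sigma, size=n + burn)
        x = np.zeros(n + burn)
        for t in range(max(p, q), n + burn):
            ar = sum(phi[i] * x[t - 1 - i] for i in range(p))
            ma = sum(theta[j] * eps[t - 1 - j] for j in range(q))
            x[t] = c + ar + eps[t] + ma
        return x[burn:]

    x = simulate_arma(phi=[0.75], theta=[0.4])    # an ARMA(1,1) sample path
    print(x[:5])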


    Note about the error terms
    The error terms are generally assumed to be independent identically distributed random variables (i.i.d.) sampled
from a normal distribution with zero mean: \varepsilon_t \sim N(0,\sigma^2) where \sigma^2 is the variance. These assumptions may be
    weakened but doing so will change the properties of the model. In particular, a change to the i.i.d. assumption would
    make a rather fundamental difference.


    Specification in terms of lag operator
    In some texts the models will be specified in terms of the lag operator L. In these terms then the AR(p) model is
given by

    \varepsilon_t = \left( 1 - \sum_{i=1}^{p} \varphi_i L^{i} \right) X_t = \varphi(L)\, X_t

where \varphi represents the polynomial

    \varphi(L) = 1 - \sum_{i=1}^{p} \varphi_i L^{i}

The MA(q) model is given by

    X_t = \left( 1 + \sum_{i=1}^{q} \theta_i L^{i} \right) \varepsilon_t = \theta(L)\, \varepsilon_t

where θ represents the polynomial

    \theta(L) = 1 + \sum_{i=1}^{q} \theta_i L^{i}

Finally, the combined ARMA(p, q) model is given by

    \left( 1 - \sum_{i=1}^{p} \varphi_i L^{i} \right) X_t = \left( 1 + \sum_{i=1}^{q} \theta_i L^{i} \right) \varepsilon_t

or more concisely,

    \varphi(L)\, X_t = \theta(L)\, \varepsilon_t

    Alternative notation
    Some authors, including Box, Jenkins & Reinsel[4] use a different convention for the autoregression coefficients.
    This allows all the polynomials involving the lag operator to appear in a similar form throughout. Thus the ARMA
    model would be written as


    Fitting models
    ARMA models in general can, after choosing p and q, be fitted by least squares regression to find the values of the
    parameters which minimize the error term. It is generally considered good practice to find the smallest values of p
    and q which provide an acceptable fit to the data. For a pure AR model the Yule-Walker equations may be used to
    provide a fit.
    Finding appropriate values of p and q in the ARMA(p,q) model can be facilitated by plotting the partial
    autocorrelation functions for an estimate of p, and likewise using the autocorrelation functions for an estimate of q.
    Further information can be gleaned by considering the same functions for the residuals of a model fitted with an
    initial selection of p and q.
    Brockwell and Davis[5] (p. 273) recommend using AICc for finding p and q.
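
As a sketch of this diagnostic step, the sample ACF and PACF can be computed with NumPy alone (function names
are illustrative; the statistical packages listed below provide equivalent routines):

    import numpy as np

    def sample_acf(x, max_lag):
        """Sample autocorrelation function up to max_lag (biased covariance estimates)."""
        x = np.asarray(x, dtype=float) - np.mean(x)
        n = len(x)
        acov = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])
        return acov / acov[0]

    def sample_pacf(x, max_lag):
        """Sample partial autocorrelations via successive Yule-Walker fits:
        the lag-k PACF is the last coefficient of the fitted AR(k) model."""
        rho = sample_acf(x, max_lag)
        pacf = [1.0]
        for k in range(1, max_lag + 1):
            R = np.array([[rho[abs(i - j)] for j in range(k)] for i in range(k)])
            phi = np.linalg.solve(R, rho[1:k + 1])
            pacf.append(phi[-1])
        return np.array(pacf)

    # For an MA(q) series the ACF cuts off after lag q; for an AR(p) series the PACF cuts
    # off after lag p -- plotting both is the usual first step in choosing p and q.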


    Implementations in statistics packages
    • In R, the arima function (in standard package stats) is documented in ARIMA Modelling of Time Series [6].
      Extension packages contain related and extended functionality, e.g., the tseries package includes an arma
      function, documented in "Fit ARMA Models to Time Series" [7]; the fracdiff package [8] contains fracdiff() for
      fractionally integrated ARMA processes, etc. The CRAN task view on Time Series [9] contains links to most of
      these.
    • Mathematica has a complete library of time series functions including ARMA[10]
    • MATLAB includes a function ar to estimate AR models, see here for more details [11].
    • IMSL Numerical Libraries are libraries of numerical analysis functionality including ARMA and ARIMA
      procedures implemented in standard programming languages like C, Java, C# .NET, and Fortran.
    • gretl can also estimate ARMA models, see here where it's mentioned [12].
    • GNU Octave can estimate AR models using functions from the extra package octave-forge [13].
    • Stata includes the function arima which can estimate ARMA and ARIMA models. see here for more details [14]
    • SuanShu is a Java library of numerical methods, including comprehensive statistics packages, in which
      univariate/multivariate ARMA, ARIMA, ARMAX, etc. models are implemented in an object-oriented approach.
      These implementations are documented in "SuanShu, a Java numerical and statistical library" [15].
• SAS has an econometric package, ETS, that estimates ARIMA models; see here for more details [16].


    Applications
    ARMA is appropriate when a system is a function of a series of unobserved shocks (the MA part) as well as its own
    behavior. For example, stock prices may be shocked by fundamental information as well as exhibiting technical
    trending and mean-reversion effects due to market participants.


    Generalizations
    The dependence of Xt on past values and the error terms εt is assumed to be linear unless specified otherwise. If the
    dependence is nonlinear, the model is specifically called a nonlinear moving average (NMA), nonlinear
    autoregressive (NAR), or nonlinear autoregressive–moving-average (NARMA) model.
    Autoregressive–moving-average models can be generalized in other ways. See also autoregressive conditional
    heteroskedasticity (ARCH) models and autoregressive integrated moving average (ARIMA) models. If multiple time
    series are to be fitted then a vector ARIMA (or VARIMA) model may be fitted. If the time-series in question
    exhibits long memory then fractional ARIMA (FARIMA, sometimes called ARFIMA) modelling may be
    appropriate: see Autoregressive fractionally integrated moving average. If the data is thought to contain seasonal
    effects, it may be modeled by a SARIMA (seasonal ARIMA) or a periodic ARMA model.


    Another generalization is the multiscale autoregressive (MAR) model. A MAR model is indexed by the nodes of a
    tree, whereas a standard (discrete time) autoregressive model is indexed by integers.
    Note that the ARMA model is a univariate model. Extensions for the multivariate case are the Vector
    Autoregression (VAR) and Vector Autoregression Moving-Average (VARMA).


    Autoregressive–moving-average model with exogenous inputs model (ARMAX model)
    The notation ARMAX(p, q, b) refers to the model with p autoregressive terms, q moving average terms and b
    exogenous inputs terms. This model contains the AR(p) and MA(q) models and a linear combination of the last b
terms of a known and external time series d_t. It is given by:

    X_t = \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \sum_{i=1}^{b} \eta_i d_{t-i}

where \eta_1, \ldots, \eta_b are the parameters of the exogenous input d_t.
    Some nonlinear variants of models with exogenous variables have been defined: see for example Nonlinear
    autoregressive exogenous model.
    Statistical packages implement the ARMAX model through the use of "exogenous" or "independent" variables. Care
    must be taken when interpreting the output of those packages, because the estimated parameters usually (for
example, in R[17] and gretl) refer to the regression:

    X_t - m_t = \varepsilon_t + \sum_{i=1}^{p} \varphi_i \left( X_{t-i} - m_{t-i} \right) + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}

where mt incorporates all exogenous (or independent) variables:

    m_t = c + \sum_{i=0}^{b} \eta_i d_{t-i}


    References
    [1] Hannan, Edward James (1970). Multiple time series. Wiley series in probability and mathematical statistics. New York: John Wiley and Sons.
    [2] Whittle, P. (1951). Hypothesis Testing in Time Series Analysis. Almquist and Wicksell.

    Whittle, P. (1963). Prediction and Regulation. English Universities Press. ISBN 0-8166-1147-5.
            Republished as: Whittle, P. (1983). Prediction and Regulation by Linear Least-Square Methods. University of
            Minnesota Press. ISBN 0-8166-1148-3.
    [3] Hannan & Deistler (1988, p. 227): Hannan, E. J.; Deistler, Manfred (1988). Statistical theory of linear systems. Wiley series in probability
        and mathematical statistics. New York: John Wiley and Sons.
    [4] George Box, Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control, third edition. Prentice-Hall, 1994.
    [5] Brockwell, P.J., and Davis, R.A. Time Series: Theory and Methods, 2nd ed. Springer, 2009.
    [6] http:/ / search. r-project. org/ R/ library/ stats/ html/ arima. html
    [7] http:/ / finzi. psych. upenn. edu/ R/ library/ tseries/ html/ arma. html
    [8] http:/ / cran. r-project. org/ web/ packages/ fracdiff
    [9] http:/ / cran. r-project. org/ web/ views/ TimeSeries. html
    [10] Time series features in Mathematica (http:/ / www. wolfram. com/ products/ applications/ timeseries/ features. html)
    [11] http:/ / www. mathworks. de/ help/ toolbox/ ident/ ref/ arx. html
    [12] http:/ / constantdream. wordpress. com/ 2008/ 03/ 16/ gnu-regression-econometrics-and-time-series-library-gretl/
    [13] http:/ / octave. sourceforge. net/
    [14] http:/ / www. stata. com/ help. cgi?arima
    [15] http:/ / www. numericalmethod. com/ javadoc/ suanshu/
    [16] http:/ / support. sas. com/ rnd/ app/ ets/ proc/ ets_arima. html
    [17] ARIMA Modelling of Time Series (http:/ / search. r-project. org/ R/ library/ stats/ html/ arima. html), R documentation

    • Mills, Terence C. Time Series Techniques for Economists. Cambridge University Press, 1990.
    • Percival, Donald B. and Andrew T. Walden. Spectral Analysis for Physical Applications. Cambridge University
      Press, 1993.



    Fourier transform
The Fourier transform, named after Joseph Fourier, is a mathematical transform with many applications in physics
and engineering. Very commonly it transforms a mathematical function of time, f(t), into a new function, sometimes
denoted by \hat{f} or F, whose argument is frequency with units of cycles per second or radians per second. The new
function is then known as the Fourier transform and/or the frequency spectrum of the function f. The Fourier
transform is also a reversible operation: given the function \hat{f}, one can determine the original function f.
(See Fourier inversion theorem.) f and \hat{f} are also respectively known as time domain and frequency domain
representations of the same "event". Most often perhaps, f is a real-valued function, and \hat{f} is complex valued,
where a complex number describes both the amplitude and phase of a corresponding frequency component. In
general, f is also complex, such as the analytic representation of a real-valued function. The term "Fourier
transform" refers to both the transform operation and to the complex-valued function it produces.
    In the case of a periodic function (for example, a continuous but not necessarily sinusoidal musical sound), the
    Fourier transform can be simplified to the calculation of a discrete set of complex amplitudes, called Fourier series
    coefficients. Also, when a time-domain function is sampled to facilitate storage or computer-processing, it is still
    possible to recreate a version of the original Fourier transform according to the Poisson summation formula, also
    known as discrete-time Fourier transform. These topics are addressed in separate articles. For an overview of those
    and other related operations, refer to Fourier analysis or List of Fourier-related transforms.


    Definition
    There are several common conventions for defining the Fourier transform ƒ̂ of an integrable function ƒ: R → C
(Kaiser 1994, p. 29), (Rahman 2011, p. 11). This article will use the definition:

    \hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi} \, dx,   for every real number ξ.

    When the independent variable x represents time (with SI unit of seconds), the transform variable ξ represents
frequency (in hertz). Under suitable conditions, ƒ is determined by ƒ̂ via the inverse transform:

    f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{2\pi i x \xi} \, d\xi,   for every real number x.

    The statement that ƒ can be reconstructed from ƒ̂ is known as the Fourier integral theorem, and was first introduced in
    Fourier's Analytical Theory of Heat (Fourier 1822, p. 525), (Fourier & Freeman 1878, p. 408), although what would
    be considered a proof by modern standards was not given until much later (Titchmarsh 1948, p. 1). The functions ƒ
    and ƒ̂ often are referred to as a Fourier integral pair or Fourier transform pair (Rahman 2011, p. 10).
    For other common conventions and notations, including using the angular frequency ω instead of the frequency ξ,
    see Other conventions and Other notations below. The Fourier transform on Euclidean space is treated separately, in
    which the variable x often represents position and ξ momentum.


    Introduction
    The motivation for the Fourier transform comes from the study of Fourier series. In the study of Fourier series,
    complicated but periodic functions are written as the sum of simple waves mathematically represented by sines and
    cosines. The Fourier transform is an extension of the Fourier series that results when the period of the represented
    function is lengthened and allowed to approach infinity.(Taneja 2008, p. 192)
    Due to the properties of sine and cosine, it is possible to recover the amplitude of each wave in a Fourier series using
    an integral. In many cases it is desirable to use Euler's formula, which states that e2πiθ= cos(2πθ) + i sin(2πθ), to
    write Fourier series in terms of the basic waves e2πiθ. This has the advantage of simplifying many of the formulas
    involved, and provides a formulation for Fourier series that more closely resembles the definition followed in this
    article. Re-writing sines and cosines as complex exponentials makes it necessary for the Fourier coefficients to be
    complex valued. The usual interpretation of this complex number is that it gives both the amplitude (or size) of the
    wave present in the function and the phase (or the initial angle) of the wave. These complex exponentials sometimes
    contain negative "frequencies". If θ is measured in seconds, then the waves e2πiθ and e−2πiθ both complete one cycle
    per second, but they represent different frequencies in the Fourier transform. Hence, frequency no longer measures
    the number of cycles per unit time, but is still closely related.
    There is a close connection between the definition of Fourier series and the Fourier transform for functions ƒ which
    are zero outside of an interval. For such a function, we can calculate its Fourier series on any interval that includes
    the points where ƒ is not identically zero. The Fourier transform is also defined for such a function. As we increase
    the length of the interval on which we calculate the Fourier series, then the Fourier series coefficients begin to look
    like the Fourier transform and the sum of the Fourier series of ƒ begins to look like the inverse Fourier transform. To
    explain this more precisely, suppose that T is large enough so that the interval [−T/2,T/2] contains the interval on
which ƒ is not identically zero. Then the n-th series coefficient cn is given by:

    c_n = \int_{-T/2}^{T/2} f(x)\, e^{-2\pi i (n/T) x} \, dx

    Comparing this to the definition of the Fourier transform, it follows that cn = ƒ̂(n/T) since ƒ(x) is zero outside
    [−T/2,T/2]. Thus the Fourier coefficients are just the values of the Fourier transform sampled on a grid of width 1/T.
    As T increases the Fourier coefficients more closely represent the Fourier transform of the function.
    Under appropriate conditions, the sum of the Fourier series of ƒ will equal the function ƒ. In other words, ƒ can be
written:

    f(x) = \sum_{n=-\infty}^{\infty} c_n \, e^{2\pi i (n/T) x} \, \frac{1}{T} = \sum_{n=-\infty}^{\infty} \hat{f}(\xi_n)\, e^{2\pi i \xi_n x} \, \Delta\xi

    where the last sum is simply the first sum rewritten using the definitions ξn = n/T, and Δξ = (n + 1)/T − n/T = 1/T.
    This second sum is a Riemann sum, and so by letting T → ∞ it will converge to the integral for the inverse Fourier
    transform given in the definition section. Under suitable conditions this argument may be made precise (Stein &
    Shakarchi 2003).
    In the study of Fourier series the numbers cn could be thought of as the "amount" of the wave present in the Fourier
    series of ƒ. Similarly, as seen above, the Fourier transform can be thought of as a function that measures how much
    of each individual frequency is present in our function ƒ, and we can recombine these waves by using an integral (or
    "continuous sum") to reproduce the original function.


    Example
    The following images provide a visual illustration of how the Fourier transform measures whether a frequency is
present in a particular function. The function depicted ƒ(t) = cos(6πt) e^{-\pi t^2} oscillates at 3 hertz (if t measures
    seconds) and tends quickly to 0. (The second factor in this equation is an envelope function that shapes the
    continuous sinusoid into a short pulse. Its general form is a Gaussian function). This function was specially chosen to
    have a real Fourier transform which can easily be plotted. The first image contains its graph. In order to calculate
    ƒ̂(3) we must integrate e−2πi(3t)ƒ(t). The second image shows the plot of the real and imaginary parts of this function.
    The real part of the integrand is almost always positive, because when ƒ(t) is negative, the real part of e−2πi(3t) is
    negative as well. Because they oscillate at the same rate, when ƒ(t) is positive, so is the real part of e−2πi(3t). The
    result is that when you integrate the real part of the integrand you get a relatively large number (in this case 0.5). On
    the other hand, when you try to measure a frequency that is not present, as in the case when we look at ƒ̂(5), the
    integrand oscillates enough so that the integral is very small. The general situation may be a bit more complicated
    than this, but this in spirit is how the Fourier transform measures how much of an individual frequency is present in a
    function ƒ(t).




[Figures: the original function showing oscillation at 3 hertz; the real and imaginary parts of the integrand for the
Fourier transform at 3 hertz; the same at 5 hertz; the Fourier transform with 3 and 5 hertz labeled.]
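
This numerical experiment is easy to reproduce. A short NumPy sketch (the grid is illustrative) approximating
ƒ̂(3) and ƒ̂(5) by a Riemann sum:

    import numpy as np

    t = np.linspace(-10.0, 10.0, 200001)
    dt = t[1] - t[0]
    f = np.cos(6 * np.pi * t) * np.exp(-np.pi * t**2)

    def ft_at(xi):
        """Approximate f_hat(xi) = integral of f(t) exp(-2 pi i xi t) dt by a Riemann sum."""
        return np.sum(f * np.exp(-2j * np.pi * xi * t)) * dt

    print(abs(ft_at(3.0)))   # about 0.5 -- the 3 hertz component is present
    print(abs(ft_at(5.0)))   # about 0   -- essentially no 5 hertz component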




    Properties of the Fourier transform
Here we assume ƒ(x), g(x) and h(x) are integrable functions, are Lebesgue-measurable on the real line, and satisfy:

    \int_{-\infty}^{\infty} |f(x)| \, dx < \infty

We denote the Fourier transforms of these functions by \hat{f}(\xi), \hat{g}(\xi) and \hat{h}(\xi) respectively.

    Basic properties
    The Fourier transform has the following basic properties: (Pinsky 2002).
Linearity
      For any complex numbers a and b, if h(x) = a f(x) + b g(x), then \hat{h}(\xi) = a \hat{f}(\xi) + b \hat{g}(\xi).
Translation
      For any real number x0, if h(x) = f(x - x_0), then \hat{h}(\xi) = e^{-2\pi i x_0 \xi} \hat{f}(\xi).
Modulation
      For any real number ξ0, if h(x) = e^{2\pi i x \xi_0} f(x), then \hat{h}(\xi) = \hat{f}(\xi - \xi_0).
Scaling
      For a non-zero real number a, if h(x) = ƒ(ax), then \hat{h}(\xi) = \frac{1}{|a|} \hat{f}\!\left(\frac{\xi}{a}\right). The case a = -1 leads to the
      time-reversal property, which states: if h(x) = f(-x), then \hat{h}(\xi) = \hat{f}(-\xi).


Conjugation
      If h(x) = \overline{f(x)}, then \hat{h}(\xi) = \overline{\hat{f}(-\xi)}.
      In particular, if ƒ is real, then one has the reality condition \hat{f}(-\xi) = \overline{\hat{f}(\xi)}.
      And if ƒ is purely imaginary, then \hat{f}(-\xi) = -\overline{\hat{f}(\xi)}.


    Uniform continuity and the Riemann–Lebesgue lemma
    The Fourier transform may be defined in some cases for non-integrable
    functions, but the Fourier transforms of integrable functions have
    several strong properties.
    The Fourier transform, ƒ̂, of any integrable function ƒ is uniformly
continuous and \|\hat{f}\|_{\infty} \le \|f\|_{L^1} (Katznelson 1976). By the
Riemann–Lebesgue lemma (Stein & Weiss 1971),

    \hat{f}(\xi) \to 0 \quad \text{as } |\xi| \to \infty

However, ƒ̂ need not be integrable. For example, the Fourier transform
of the rectangular function, which is integrable, is the sinc function,
which is not Lebesgue integrable, because its improper integrals
behave analogously to the alternating harmonic series, in converging to
a sum without being absolutely convergent.

[Figure: the rectangular function is Lebesgue integrable.]
    It is not generally possible to write the inverse transform as a Lebesgue
    integral. However, when both ƒ and ƒ̂ are integrable, the inverse
equality

    f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{2\pi i x \xi} \, d\xi

holds almost everywhere. That is, the Fourier transform is injective on L1(R). (But if ƒ is continuous, then equality
holds for every x.)

[Figure: the sinc function, which is the Fourier transform of the rectangular function, is bounded and continuous,
but not Lebesgue integrable.]

    Plancherel theorem and Parseval's theorem
Let ƒ(x) and g(x) be integrable, and let ƒ̂(ξ) and \hat{g}(\xi) be their Fourier transforms. If ƒ(x) and g(x) are also
square-integrable, then we have Parseval's theorem (Rudin 1987, p. 187):

    \int_{-\infty}^{\infty} f(x)\, \overline{g(x)} \, dx = \int_{-\infty}^{\infty} \hat{f}(\xi)\, \overline{\hat{g}(\xi)} \, d\xi

where the bar denotes complex conjugation.
The Plancherel theorem, which is equivalent to Parseval's theorem, states (Rudin 1987, p. 186):

    \int_{-\infty}^{\infty} |f(x)|^{2} \, dx = \int_{-\infty}^{\infty} |\hat{f}(\xi)|^{2} \, d\xi

    The Plancherel theorem makes it possible to extend the Fourier transform, by a continuity argument, to a unitary
    operator on L2(R). On L1(R)∩L2(R), this extension agrees with original Fourier transform defined on L1(R), thus
    enlarging the domain of the Fourier transform to L1(R) + L2(R) (and consequently to Lp(R) for 1 ≤ p ≤ 2). The
    Plancherel theorem has the interpretation in the sciences that the Fourier transform preserves the energy of the
    original quantity. Depending on the author either of these theorems might be referred to as the Plancherel theorem or
    as Parseval's theorem.
    See Pontryagin duality for a general formulation of this concept in the context of locally compact abelian groups.
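
A discretised analogue of the Plancherel identity can be checked numerically with the DFT; a minimal NumPy
sketch (the test function and grid are illustrative):

    import numpy as np

    # For samples f_n taken with spacing dt, the DFT scaled by dt approximates the continuous
    # Fourier transform, and the discrete Parseval identity gives
    #   sum |f_n|^2 dt  ==  sum |f_hat_k|^2 dxi,  with dxi = 1/(N*dt).
    N, dt = 4096, 0.01
    t = (np.arange(N) - N // 2) * dt
    f = np.exp(-np.pi * t**2) * (1 + 0.3 * np.cos(2 * np.pi * t))
    f_hat = np.fft.fft(f) * dt
    energy_time = np.sum(np.abs(f) ** 2) * dt
    energy_freq = np.sum(np.abs(f_hat) ** 2) / (N * dt)
    print(energy_time, energy_freq)    # the two numbers agree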


    Poisson summation formula
    The Poisson summation formula (PSF) is an equation that relates the Fourier series coefficients of the periodic
    summation of a function to values of the function's continuous Fourier transform. It has a variety of useful forms that
    are derived from the basic one by application of the Fourier transform's scaling and time-shifting properties. The
    frequency-domain dual of the standard PSF is also called discrete-time Fourier transform, which leads directly to:
    • a popular, graphical, frequency-domain representation of the phenomenon of aliasing, and
    • a proof of the Nyquist-Shannon sampling theorem.


    Convolution theorem
The Fourier transform translates between convolution and multiplication of functions. If ƒ(x) and g(x) are integrable
functions with Fourier transforms ƒ̂(ξ) and \hat{g}(\xi) respectively, then the Fourier transform of the convolution is given
by the product of the Fourier transforms ƒ̂(ξ) and \hat{g}(\xi) (under other conventions for the definition of the Fourier
transform a constant factor may appear).
This means that if:

    h(x) = (f * g)(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y) \, dy

where ∗ denotes the convolution operation, then:

    \hat{h}(\xi) = \hat{f}(\xi) \cdot \hat{g}(\xi)

In linear time invariant (LTI) system theory, it is common to interpret g(x) as the impulse response of an LTI system
with input ƒ(x) and output h(x), since substituting the unit impulse for ƒ(x) yields h(x) = g(x). In this case, \hat{g}(\xi)
represents the frequency response of the system.
Conversely, if ƒ(x) can be decomposed as the product of two square integrable functions p(x) and q(x), then the
Fourier transform of ƒ(x) is given by the convolution of the respective Fourier transforms \hat{p}(\xi) and \hat{q}(\xi).
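
The discrete analogue of this statement is easy to verify with the FFT; a minimal NumPy sketch (the random test
data are illustrative):

    import numpy as np

    # The DFT convolution theorem mirrors the continuous statement: the transform of a
    # convolution is the product of the transforms. Zero-padding turns circular convolution
    # into ordinary (linear) convolution.
    rng = np.random.default_rng(1)
    f = rng.normal(size=256)
    g = rng.normal(size=256)

    conv_freq = np.real(np.fft.ifft(np.fft.fft(f, 512) * np.fft.fft(g, 512)))[:511]
    conv_direct = np.convolve(f, g)            # length 2*256 - 1 = 511
    print(np.allclose(conv_freq, conv_direct))  # True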


    Cross-correlation theorem
In an analogous manner, it can be shown that if h(x) is the cross-correlation of ƒ(x) and g(x):

    h(x) = (f \star g)(x) = \int_{-\infty}^{\infty} \overline{f(y)}\, g(x + y) \, dy

then the Fourier transform of h(x) is:

    \hat{h}(\xi) = \overline{\hat{f}(\xi)} \cdot \hat{g}(\xi)

As a special case, the autocorrelation of function ƒ(x) is:

    h(x) = (f \star f)(x) = \int_{-\infty}^{\infty} \overline{f(y)}\, f(x + y) \, dy

for which

    \hat{h}(\xi) = \overline{\hat{f}(\xi)}\, \hat{f}(\xi) = |\hat{f}(\xi)|^{2}

    Eigenfunctions
    One important choice of an orthonormal basis for L2(R) is given by the Hermite functions




    where            are the "probabilist's" Hermite polynomials, defined by



    Under this convention for the Fourier transform, we have that


    In other words, the Hermite functions form a complete orthonormal system of eigenfunctions for the Fourier
    transform on L2(R) (Pinsky 2002). However, this choice of eigenfunctions is not unique. There are only four
    different eigenvalues of the Fourier transform (±1 and ±i) and any linear combination of eigenfunctions with the
    same eigenvalue gives another eigenfunction. As a consequence of this, it is possible to decompose L2(R) as a direct
sum of four spaces H0, H1, H2, and H3 where the Fourier transform acts on Hk simply by multiplication by i^k. This
    approach to define the Fourier transform is due to N. Wiener (Duoandikoetxea 2001). Among other properties,
    Hermite functions decrease exponentially fast in both frequency and time domains and they are used to define a
    generalization of the Fourier transform, namely the fractional Fourier transform used in time-frequency analysis
    (Boashash 2003).


    Fourier transform on Euclidean space
The Fourier transform can be defined in any arbitrary number of dimensions n. As with the one-dimensional case, there are
many conventions. For an integrable function ƒ(x), this article takes the definition:

    \hat{f}(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i \, x \cdot \xi} \, dx

where x and ξ are n-dimensional vectors, and x · ξ is the dot product of the vectors. The dot product is sometimes
written as \langle x, \xi \rangle.
    All of the basic properties listed above hold for the n-dimensional Fourier transform, as do Plancherel's and
    Parseval's theorem. When the function is integrable, the Fourier transform is still uniformly continuous and the
    Riemann–Lebesgue lemma holds. (Stein & Weiss 1971)


    Uncertainty principle
    Generally speaking, the more concentrated ƒ(x) is, the more spread out its Fourier transform ƒ̂(ξ) must be. In
    particular, the scaling property of the Fourier transform may be seen as saying: if we "squeeze" a function in x, its
    Fourier transform "stretches out" in ξ. It is not possible to arbitrarily concentrate both a function and its Fourier
    transform.
    The trade-off between the compaction of a function and its Fourier transform can be formalized in the form of an
    uncertainty principle by viewing a function and its Fourier transform as conjugate variables with respect to the
    symplectic form on the time–frequency domain: from the point of view of the linear canonical transformation, the
    Fourier transform is rotation by 90° in the time–frequency domain, and preserves the symplectic form.
    Suppose ƒ(x) is an integrable and square-integrable function. Without loss of generality, assume that ƒ(x) is
normalized:

    \int_{-\infty}^{\infty} |f(x)|^{2} \, dx = 1

    It follows from the Plancherel theorem that ƒ̂(ξ) is also normalized.


The spread around x = 0 may be measured by the dispersion about zero (Pinsky 2002, p. 131) defined by

    D_0(f) = \int_{-\infty}^{\infty} x^{2}\, |f(x)|^{2} \, dx

    In probability terms, this is the second moment of |ƒ(x)|2 about zero.
    The Uncertainty principle states that, if ƒ(x) is absolutely continuous and the functions x·ƒ(x) and ƒ′(x) are square
integrable, then

    D_0(f)\, D_0(\hat{f}) \ge \frac{1}{16\pi^{2}}   (Pinsky 2002).


    The equality is attained only in the case                            (hence                         ) where σ > 0 is
                                         2
    arbitrary and C1 is such that ƒ is L –normalized (Pinsky 2002). In other words, where ƒ is a (normalized) Gaussian
    function with variance σ2, centered at zero, and its Fourier transform is a Gaussian function with variance 1/σ2.
In fact, this inequality implies that:

    \left( \int_{-\infty}^{\infty} (x - x_0)^{2}\, |f(x)|^{2} \, dx \right) \left( \int_{-\infty}^{\infty} (\xi - \xi_0)^{2}\, |\hat{f}(\xi)|^{2} \, d\xi \right) \ge \frac{1}{16\pi^{2}}

for any x_0, \xi_0 in R (Stein & Shakarchi 2003, p. 158).
    In quantum mechanics, the momentum and position wave functions are Fourier transform pairs, to within a factor of
    Planck's constant. With this constant properly taken into account, the inequality above becomes the statement of the
    Heisenberg uncertainty principle (Stein & Shakarchi 2003, p. 158).
A stronger uncertainty principle is the Hirschman uncertainty principle which is expressed as:

    H\!\left(|f|^{2}\right) + H\!\left(|\hat{f}|^{2}\right) \ge \log\frac{e}{2}

where H(p) is the differential entropy of the probability density function p(x):

    H(p) = -\int_{-\infty}^{\infty} p(x) \log p(x) \, dx

    where the logarithms may be in any base which is consistent. The equality is attained for a Gaussian, as in the
    previous case.


    Spherical harmonics
    Let the set of homogeneous harmonic polynomials of degree k on Rn be denoted by Ak. The set Ak consists of the
    solid spherical harmonics of degree k. The solid spherical harmonics play a similar role in higher dimensions to the
    Hermite polynomials in dimension one. Specifically, if ƒ(x) = e−π|x|2P(x) for some P(x) in Ak, then
                       . Let the set Hk be the closure in L2(Rn) of linear combinations of functions of the form ƒ(|x|)P(x)
    where P(x) is in Ak. The space L2(Rn) is then a direct sum of the spaces Hk and the Fourier transform maps each
space Hk to itself and it is possible to characterize the action of the Fourier transform on each space Hk (Stein & Weiss
    1971). Let ƒ(x) = ƒ0(|x|)P(x) (with P(x) in Ak), then                         where



    Here J(n + 2k − 2)/2 denotes the Bessel function of the first kind with order (n + 2k − 2)/2. When k = 0 this gives a
    useful formula for the Fourier transform of a radial function (Grafakos 2004).


    Restriction problems
    In higher dimensions it becomes interesting to study restriction problems for the Fourier transform. The Fourier
    transform of an integrable function is continuous and the restriction of this function to any set is defined. But for a
    square-integrable function the Fourier transform could be a general class of square integrable functions. As such, the
    restriction of the Fourier transform of an L2(Rn) function cannot be defined on sets of measure 0. It is still an active
    area of study to understand restriction problems in Lp for 1 < p < 2. Surprisingly, it is possible in some cases to
    define the restriction of a Fourier transform to a set S, provided S has non-zero curvature. The case when S is the unit
    sphere in Rn is of particular interest. In this case the Tomas-Stein restriction theorem states that the restriction of the
    Fourier transform to the unit sphere in Rn is a bounded operator on Lp provided 1 ≤ p ≤ (2n + 2) / (n + 3).
    One notable difference between the Fourier transform in 1 dimension versus higher dimensions concerns the partial
    sum operator. Consider an increasing collection of measurable sets ER indexed by R ∈ (0,∞): such as balls of radius
    R centered at the origin, or cubes of side 2R. For a given integrable function ƒ, consider the function ƒR defined by:




    Suppose in addition that ƒ ∈ Lp(Rn). For n = 1 and 1 < p < ∞, if one takes ER = (−R, R), then ƒR converges to ƒ in Lp
    as R tends to infinity, by the boundedness of the Hilbert transform. Naively one may hope the same holds true for n >
    1. In the case that ER is taken to be a cube with side length R, then convergence still holds. Another natural candidate
    is the Euclidean ball ER = {ξ : |ξ| < R}. In order for this partial sum operator to converge, it is necessary that the
    multiplier for the unit ball be bounded in Lp(Rn). For n ≥ 2 it is a celebrated theorem of Charles Fefferman that the
    multiplier for the unit ball is never bounded unless p = 2 (Duoandikoetxea 2001). In fact, when p ≠ 2, this shows that
    not only may ƒR fail to converge to ƒ in Lp, but for some functions ƒ ∈ Lp(Rn), ƒR is not even an element of Lp.


    Fourier transform on other function spaces
The definition of the Fourier transform by the integral formula

    \hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi} \, dx

is valid for Lebesgue integrable functions ƒ; that is, ƒ ∈ L1(R). The image of L1 is a subset of the space C0(R) of
    continuous functions that tend to zero at infinity (the Riemann–Lebesgue lemma), although it is not the entire space.
    Indeed, there is no simple characterization of the image.
    It is possible to extend the definition of the Fourier transform to other spaces of functions. Since compactly
    supported smooth functions are integrable and dense in L2(R), the Plancherel theorem allows us to extend the
definition of the Fourier transform to general functions in L2(R) by continuity arguments. Further, the Fourier
transform ℱ : L2(R) → L2(R) is a unitary operator (Stein & Weiss 1971, Thm. 2.3). In particular, the image of L2(R) is itself under the
    Fourier transform. The Fourier transform in L2(R) is no longer given by an ordinary Lebesgue integral, although it
can be computed by an improper integral, here meaning that for an L2 function ƒ,

    \hat{f}(\xi) = \lim_{R \to \infty} \int_{-R}^{R} f(x)\, e^{-2\pi i x \xi} \, dx

    where the limit is taken in the L2 sense. Many of the properties of the Fourier transform in L1 carry over to L2, by a
    suitable limiting argument.
    The definition of the Fourier transform can be extended to functions in Lp(R) for 1 ≤ p ≤ 2 by decomposing such
    functions into a fat tail part in L2 plus a fat body part in L1. In each of these spaces, the Fourier transform of a
function in Lp(R) is in Lq(R), where q (with 1/p + 1/q = 1) is the Hölder conjugate of p, by the Hausdorff–Young
    inequality. However, except for p = 2, the image is not easily characterized. Further extensions become more
    technical. The Fourier transform of functions in Lp for the range 2 < p < ∞ requires the study of distributions
    (Katznelson 1976). In fact, it can be shown that there are functions in Lp with p>2 so that the Fourier transform is not


    defined as a function (Stein & Weiss 1971).


    Tempered distributions
    One might consider enlarging the domain of the Fourier transform from L1+L2 by considering generalized functions,
    or distributions. A distribution on R is a continuous linear functional on the space Cc(R) of compactly supported
    smooth functions, equipped with a suitable topology. The strategy is then to consider the action of the Fourier
transform on Cc(R) and pass to distributions by duality. The obstruction to doing this is that the Fourier transform does
not map Cc(R) to Cc(R). In fact the Fourier transform of an element in Cc(R) cannot vanish on an open set; see the
above discussion on the uncertainty principle. The right space here is the slightly larger space of Schwartz functions. The
Fourier transform is an automorphism on the Schwartz space, as a topological vector space, and thus induces an
automorphism on its dual, the space of tempered distributions (Stein & Weiss 1971). The tempered distributions
include all the integrable functions mentioned above, as well as well-behaved functions of polynomial growth and
distributions of compact support.

For the definition of the Fourier transform of a tempered distribution, let f and g be integrable functions, and let ƒ̂
and ĝ be their Fourier transforms respectively. Then the Fourier transform obeys the following multiplication
formula (Stein & Weiss 1971),

    \int_{\mathbb{R}} \hat{f}(x)\, g(x)\, dx = \int_{\mathbb{R}} f(x)\, \hat{g}(x)\, dx.

Every integrable function ƒ defines (induces) a distribution Tƒ by the relation

    T_f(\varphi) = \int_{\mathbb{R}} f(x)\, \varphi(x)\, dx    for all Schwartz functions φ.

So it makes sense to define the Fourier transform \hat{T}_f of Tƒ by

    \hat{T}_f(\varphi) = T_f(\hat{\varphi})    for all Schwartz functions φ.
    Extending this to all tempered distributions T gives the general definition of the Fourier transform.
    Distributions can be differentiated and the above mentioned compatibility of the Fourier transform with
    differentiation and convolution remains true for tempered distributions.
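The multiplication formula above is easy to test numerically. The sketch below is an illustration only, assuming NumPy; the grid and the two Gaussian test functions are arbitrary choices. It approximates the transforms by Riemann sums in the e^{−2πixξ} convention and checks that both sides agree.

    import numpy as np

    # Numerical check of the multiplication formula
    #   ∫ f̂(x) g(x) dx = ∫ f(x) ĝ(x) dx
    # for two Gaussians, with the e^{-2πixξ} (ordinary-frequency) convention.
    x = np.linspace(-10, 10, 2001)
    dx = x[1] - x[0]

    f = np.exp(-np.pi * x**2)
    g = np.exp(-np.pi * (x / 2.0)**2)

    def ft(h):
        """Riemann-sum Fourier transform of h, evaluated on the same grid."""
        return np.array([np.sum(h * np.exp(-2j * np.pi * x * s)) * dx for s in x])

    fhat, ghat = ft(f).real, ft(g).real    # both transforms are real (even inputs)

    lhs = np.sum(fhat * g) * dx
    rhs = np.sum(f * ghat) * dx
    print(lhs, rhs, np.isclose(lhs, rhs))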


    Generalizations

    Fourier–Stieltjes transform
The Fourier transform of a finite Borel measure μ on Rn is given by (Pinsky 2002, p. 256):

    \hat{\mu}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x \cdot \xi}\, d\mu(x).

    This transform continues to enjoy many of the properties of the Fourier transform of integrable functions. One
notable difference is that the Riemann–Lebesgue lemma fails for measures (Katznelson 1976). If dμ =
ƒ(x) dx, then the formula above reduces to the usual definition of the Fourier transform of ƒ. If μ is the
probability distribution associated to a random variable X, the Fourier–Stieltjes transform is closely related to the
characteristic function, but the typical conventions in probability theory take e^{ix·ξ} instead of e^{−2πix·ξ} (Pinsky 2002).
When the distribution has a probability density function, this definition reduces to the Fourier transform
applied to the probability density function, again with a different choice of constants.
    The Fourier transform may be used to give a characterization of measures. Bochner's theorem characterizes which
    functions may arise as the Fourier–Stieltjes transform of a positive measure on the circle (Katznelson 1976).
Furthermore, the Dirac delta function, although not a function, is a finite Borel measure. Its Fourier transform is a
constant function (whose specific value depends upon the form of the Fourier transform used).
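For a probability measure, the Fourier–Stieltjes transform in the probability convention is just the characteristic function E[e^{itX}]. The sketch below (NumPy assumed; sample size and evaluation grid are arbitrary) compares the empirical characteristic function of standard normal samples with the closed form e^{−t²/2}.

    import numpy as np

    # Empirical characteristic function (a Monte Carlo average) versus the
    # closed form for a standard normal random variable.
    rng = np.random.default_rng(1)
    samples = rng.standard_normal(200_000)

    t = np.linspace(-3, 3, 13)
    empirical = np.array([np.mean(np.exp(1j * ti * samples)) for ti in t])
    exact = np.exp(-t**2 / 2)                 # characteristic function of N(0, 1)

    print(np.max(np.abs(empirical - exact)))  # small Monte Carlo error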


    Locally compact abelian groups
    The Fourier transform may be generalized to any locally compact abelian group. A locally compact abelian group is
    an abelian group which is at the same time a locally compact Hausdorff topological space so that the group operation
    is continuous. If G is a locally compact abelian group, it has a translation invariant measure μ, called Haar measure.
    For a locally compact abelian group G, the set of irreducible, i.e. one-dimensional, unitary representations are called
    its characters. With its natural group structure and the topology of pointwise convergence, the set of characters    is
    itself a locally compact abelian group, called the Pontryagin dual of G. For a function ƒ in L1(G), its Fourier
transform is defined by (Katznelson 1976):

    \hat{f}(\chi) = \int_G f(x)\, \overline{\chi(x)}\, d\mu(x), \qquad \chi \in \hat{G}.

The Riemann–Lebesgue lemma holds in this case: ƒ̂ is a function vanishing at infinity on the Pontryagin dual of G.
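The simplest concrete case is the finite cyclic group Z/N with counting measure as Haar measure: its characters are χ_k(n) = e^{2πikn/N}, and the abstract definition reduces to the ordinary DFT. The check below assumes NumPy; pairing against the conjugate character is my choice of convention, and the sizes are illustrative.

    import numpy as np

    # On G = Z/N the abstract Fourier transform  f̂(χ_k) = Σ_n f(n) conj(χ_k(n))
    # is exactly the discrete Fourier transform.
    N = 16
    rng = np.random.default_rng(2)
    f = rng.standard_normal(N)

    n = np.arange(N)
    fhat_chars = np.array([np.sum(f * np.exp(-2j * np.pi * k * n / N)) for k in range(N)])

    print(np.allclose(fhat_chars, np.fft.fft(f)))   # True: same object, two viewpoints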

    Gelfand transform
The Fourier transform is also a special case of the Gelfand transform. In this particular context, it is closely related to the
Pontryagin duality map defined above.
Given an abelian locally compact Hausdorff topological group G, as before we consider the space L1(G), defined using a
Haar measure. With convolution as multiplication, L1(G) is an abelian Banach algebra. It also has an involution *
given by

    f^{*}(x) = \overline{f(x^{-1})}.

Taking the completion with respect to the largest possible C*-norm gives its enveloping C*-algebra, called the group
C*-algebra C*(G) of G. (Any C*-norm on L1(G) is bounded by the L1 norm, therefore their supremum exists.)
Given any abelian C*-algebra A, the Gelfand transform gives an isomorphism between A and C0(A^), where A^ is
the space of multiplicative linear functionals, i.e. one-dimensional representations, on A with the weak-* topology. The map
is simply given by

    a \mapsto \hat{a}, \qquad \hat{a}(\varphi) = \varphi(a).

It turns out that the multiplicative linear functionals of C*(G), after suitable identification, are exactly the characters
of G, and the Gelfand transform, when restricted to the dense subset L1(G), is the Fourier–Pontryagin transform.
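A finite-dimensional illustration of this picture, assuming NumPy: for G = Z/N the group algebra is C^N under circular convolution, the Gelfand transform is evaluation at the characters (again the DFT), and it turns convolution into pointwise multiplication. This is only an analogue of the general construction, with illustrative sizes.

    import numpy as np

    # The group algebra of Z/N under circular convolution is diagonalized by the
    # DFT (= evaluation at the characters): convolution becomes multiplication.
    N = 32
    rng = np.random.default_rng(3)
    a, b = rng.standard_normal(N), rng.standard_normal(N)

    def circ_conv(u, v):
        """Circular convolution on Z/N."""
        return np.array([np.sum(u * np.roll(v[::-1], k + 1)) for k in range(N)])

    lhs = np.fft.fft(circ_conv(a, b))
    rhs = np.fft.fft(a) * np.fft.fft(b)
    print(np.allclose(lhs, rhs))   # True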


    Non-abelian groups
    The Fourier transform can also be defined for functions on a non-abelian group, provided that the group is compact.
Once the assumption that the underlying group is abelian is removed, irreducible unitary representations need not always
be one-dimensional. This means the Fourier transform on a non-abelian group takes values as Hilbert space operators
    (Hewitt & Ross 1970, Chapter 8). The Fourier transform on compact groups is a major tool in representation theory
    (Knapp 2001) and non-commutative harmonic analysis.
    Let G be a compact Hausdorff topological group. Let Σ denote the collection of all isomorphism classes of
    finite-dimensional irreducible unitary representations, along with a definite choice of representation U(σ) on the
    Hilbert space Hσ of finite dimension dσ for each σ ∈ Σ. If μ is a finite Borel measure on G, then the Fourier–Stieltjes
transform of μ is the operator on Hσ defined by

    \hat{\mu}(\sigma) = \int_G \overline{U^{(\sigma)}_g}\, d\mu(g),

where \overline{U^{(\sigma)}} is the complex-conjugate representation of U(σ) acting on Hσ. If μ is absolutely continuous with respect
to the left-invariant probability measure λ on G, represented as

    d\mu(g) = f(g)\, d\lambda(g)

    for some ƒ ∈ L1(λ), one identifies the Fourier transform of ƒ with the Fourier–Stieltjes transform of μ.


The mapping μ ↦ μ̂ defines an isomorphism between the Banach space M(G) of finite Borel measures (see rca
space) and a closed subspace of the Banach space C∞(Σ) consisting of all sequences E = (Eσ) indexed by Σ of
(bounded) linear operators Eσ: Hσ → Hσ for which the norm

    \|E\| = \sup_{\sigma \in \Sigma} \|E_\sigma\|

    is finite. The "convolution theorem" asserts that, furthermore, this isomorphism of Banach spaces is in fact an
    isometric isomorphism of C* algebras into a subspace of C∞(Σ). Multiplication on M(G) is given by convolution of
measures and the involution * defined by

    \mu^{*}(E) = \overline{\mu(E^{-1})},

    and C∞(Σ) has a natural C*-algebra structure as Hilbert space operators.
The Peter–Weyl theorem holds, and a version of the Fourier inversion formula (Plancherel's theorem) follows: if ƒ ∈
L2(G), then

    f(g) = \sum_{\sigma \in \Sigma} d_\sigma \operatorname{tr}\bigl( \hat{f}(\sigma)\, U^{(\sigma)}_g \bigr),

    where the summation is understood as convergent in the L2 sense.
    The generalization of the Fourier transform to the noncommutative situation has also in part contributed to the
    development of noncommutative geometry. In this context, a categorical generalization of the Fourier transform to
noncommutative groups is Tannaka–Krein duality, which replaces the group of characters with the category of
    representations. However, this loses the connection with harmonic functions.


    Alternatives
    In signal processing terms, a function (of time) is a representation of a signal with perfect time resolution, but no
    frequency information, while the Fourier transform has perfect frequency resolution, but no time information: the
    magnitude of the Fourier transform at a point is how much frequency content there is, but location is only given by
    phase (argument of the Fourier transform at a point), and standing waves are not localized in time – a sine wave
    continues out to infinity, without decaying. This limits the usefulness of the Fourier transform for analyzing signals
    that are localized in time, notably transients, or any signal of finite extent.
    As alternatives to the Fourier transform, in time-frequency analysis, one uses time-frequency transforms or
    time-frequency distributions to represent signals in a form that has some time information and some frequency
    information – by the uncertainty principle, there is a trade-off between these. These can be generalizations of the
    Fourier transform, such as the short-time Fourier transform or fractional Fourier transform, or other functions to
    represent signals, as in wavelet transforms and chirplet transforms, with the wavelet analog of the (continuous)
Fourier transform being the continuous wavelet transform (Boashash 2003).
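A minimal short-time Fourier transform sketch follows, assuming NumPy; the window length, hop size and test signal are illustrative. It slides a Hann window along a signal whose frequency changes halfway through and recovers a different dominant frequency in early and late frames, which is exactly the time information the plain Fourier transform discards.

    import numpy as np

    # Minimal short-time Fourier transform: window each frame, then FFT it.
    fs = 1000.0                                  # sample rate (Hz)
    t = np.arange(0, 2.0, 1 / fs)
    x = np.where(t < 1.0, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

    def stft(x, frame_len=256, hop=64):
        win = np.hanning(frame_len)
        starts = range(0, len(x) - frame_len + 1, hop)
        frames = np.stack([x[s:s + frame_len] * win for s in starts])
        return np.fft.rfft(frames, axis=1)       # one spectrum per time frame

    S = stft(x)
    freqs = np.fft.rfftfreq(256, d=1 / fs)
    dominant = freqs[np.argmax(np.abs(S), axis=1)]
    print(dominant[:3], dominant[-3:])           # ≈ 50 Hz early, ≈ 200 Hz late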


    Applications

    Analysis of differential equations
    Fourier transforms and the closely related Laplace transforms are widely used in solving differential equations. The
    Fourier transform is compatible with differentiation in the following sense: if ƒ(x) is a differentiable function with
    Fourier transform ƒ̂(ξ), then the Fourier transform of its derivative is given by 2πiξ ƒ̂(ξ). This can be used to
    transform differential equations into algebraic equations. This technique only applies to problems whose domain is
    the whole set of real numbers. By extending the Fourier transform to functions of several variables partial
    differential equations with domain Rn can also be translated into algebraic equations.
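As a sketch of the technique, assuming NumPy and a periodic domain (a simplification of the whole-real-line setting described above), the constant-coefficient equation u″ − u = f becomes division by the symbol −(2πξ)² − 1 in frequency space; the grid and right-hand side are illustrative.

    import numpy as np

    # Solve u'' - u = f on a periodic interval with the Fourier derivative rule:
    # in frequency space the equation reads  (-(2πξ)^2 - 1) û(ξ) = f̂(ξ).
    L = 2 * np.pi
    N = 256
    x = np.linspace(0, L, N, endpoint=False)
    f = np.cos(3 * x)                      # right-hand side

    xi = np.fft.fftfreq(N, d=L / N)        # ordinary frequencies (cycles per unit)
    fhat = np.fft.fft(f)
    uhat = fhat / (-(2 * np.pi * xi) ** 2 - 1.0)
    u = np.real(np.fft.ifft(uhat))

    u_exact = -np.cos(3 * x) / 10.0        # since u = -cos(3x) / (3^2 + 1)
    print(np.max(np.abs(u - u_exact)))     # ~ machine precision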


    Fourier transform spectroscopy
    The Fourier transform is also used in nuclear magnetic resonance (NMR) and in other kinds of spectroscopy, e.g.
    infrared (FTIR). In NMR an exponentially shaped free induction decay (FID) signal is acquired in the time domain
    and Fourier-transformed to a Lorentzian line-shape in the frequency domain. The Fourier transform is also used in
    magnetic resonance imaging (MRI) and mass spectrometry.


    Quantum mechanics and signal processing
    In quantum mechanics, Fourier transforms of solutions to the Schrödinger equation are known as momentum space
(or k space) wave functions. They display the amplitudes for momenta; the squared absolute value gives the probability
density of the momenta. The same holds for classical waves treated in signal processing, such as in swept-frequency radar,
where data are taken in the frequency domain and transformed to the time domain, yielding range. The absolute square is
then the power.
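A small illustration, assuming NumPy and natural units: the momentum-space amplitudes of a Gaussian wave packet are obtained with a unitary FFT, and the squared magnitudes still sum to one, so probability is conserved under the transform. The packet's width and momentum are arbitrary.

    import numpy as np

    # Momentum-space amplitudes of a Gaussian wave packet (natural units):
    # the unitary DFT preserves total probability.
    N = 1024
    x = np.linspace(-20, 20, N, endpoint=False)
    dx = x[1] - x[0]

    psi = (1 / np.pi) ** 0.25 * np.exp(-x**2 / 2) * np.exp(1j * 5.0 * x)  # k0 = 5
    psi *= np.sqrt(dx)                      # make the discrete vector unit-norm

    phi = np.fft.fft(psi, norm="ortho")     # momentum-space wave function

    print(np.sum(np.abs(psi) ** 2), np.sum(np.abs(phi) ** 2))   # both ≈ 1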


    Other notations
Other common notations for ƒ̂(ξ) include:

    \tilde{f}(\xi),\quad F(\xi),\quad \mathcal{F}(f)(\xi),\quad (\mathcal{F}f)(\xi),\quad \mathcal{F}(f),\quad \mathcal{F}\{f\},\quad \mathcal{F}\bigl(f(t)\bigr),\quad \mathcal{F}\bigl\{f(t)\bigr\}.

Denoting the Fourier transform by a capital letter corresponding to the letter of the function being transformed (such as
ƒ(x) and F(ξ)) is especially common in the sciences and engineering. In electronics, omega (ω) is often used
instead of ξ because of its interpretation as angular frequency; sometimes it is written as F(jω), where j is the imaginary
unit, to indicate its relationship with the Laplace transform, and sometimes it is written informally as F(2πƒ) in order
    to use ordinary frequency.
The interpretation of the complex function ƒ̂(ξ) may be aided by expressing it in polar coordinate form

    \hat{f}(\xi) = A(\xi)\, e^{i\varphi(\xi)}

in terms of the two real functions A(ξ) and φ(ξ), where

    A(\xi) = |\hat{f}(\xi)|

is the amplitude and

    \varphi(\xi) = \arg\bigl(\hat{f}(\xi)\bigr)

is the phase (see arg function).
Then the inverse transform can be written:

    f(x) = \int_{-\infty}^{\infty} A(\xi)\, e^{i\left(2\pi \xi x + \varphi(\xi)\right)}\, d\xi,

    which is a recombination of all the frequency components of ƒ(x). Each component is a complex sinusoid of the
form e^{2πixξ} whose amplitude is A(ξ) and whose initial phase angle (at x = 0) is φ(ξ).
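Numerically, the amplitude and phase are simply the modulus and argument of the transform. The sketch below (NumPy assumed; the pulse shape and shift are arbitrary) shows that delaying a pulse leaves A(ξ) unchanged and only tilts the phase, from whose slope the delay can be read off.

    import numpy as np

    # Polar form of the transform: a time shift leaves the amplitude A(ξ)
    # unchanged and only adds a linear ramp to the phase φ(ξ).
    N = 512
    n = np.arange(N)
    pulse = np.exp(-0.5 * ((n - 100) / 8.0) ** 2)
    shifted = np.roll(pulse, 60)                   # same pulse, delayed by 60 samples

    F1, F2 = np.fft.rfft(pulse), np.fft.rfft(shifted)
    A1, A2 = np.abs(F1), np.abs(F2)                # amplitudes
    phi1, phi2 = np.angle(F1), np.angle(F2)        # phases

    print(np.allclose(A1, A2))                     # True: amplitude is shift-invariant
    delay = -(phi2[1] - phi1[1]) * N / (2 * np.pi) # read the shift off the phase slope
    print(round(delay))                            # 60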
The Fourier transform may be thought of as a mapping on function spaces. This mapping is here denoted \mathcal{F}, and
\mathcal{F}(f) is used to denote the Fourier transform of the function ƒ. This mapping is linear, which means that \mathcal{F} can
also be seen as a linear transformation on the function space and implies that the standard notation in linear algebra
of applying a linear transformation to a vector (here the function ƒ) can be used to write \mathcal{F}f instead of \mathcal{F}(f).
Since the result of applying the Fourier transform is again a function, we can be interested in the value of this
function evaluated at the value ξ for its variable, and this is denoted either as \mathcal{F}(f)(\xi) or as (\mathcal{F}f)(\xi). Notice that
in the former case, it is implicitly understood that \mathcal{F} is applied first to ƒ and then the resulting function is evaluated
at ξ, not the other way around.
In mathematics and various applied sciences it is often necessary to distinguish between a function ƒ and the value of
ƒ when its variable equals x, denoted ƒ(x). This means that a notation like \mathcal{F}(f(x)) formally can be interpreted as
the Fourier transform of the values of ƒ at x. Despite this flaw, the previous notation appears frequently, often when a
particular function or a function of a particular variable is to be transformed.
For example, \mathcal{F}\bigl(\operatorname{rect}(x)\bigr) = \operatorname{sinc}(\xi) is sometimes used to express that the Fourier transform of a rectangular
function is a sinc function,
or \mathcal{F}\bigl(f(x + x_0)\bigr) = \mathcal{F}\bigl(f(x)\bigr)\, e^{2\pi i x_0 \xi} is used to express the shift property of the Fourier transform.
Notice that the last example is only correct under the assumption that the transformed function is a function of x, not
of x0.


    Other conventions
The Fourier transform can also be written in terms of angular frequency ω = 2πξ, whose units are radians per
second.
The substitution ξ = ω/(2π) into the formulas above produces this convention:

    \hat{f}(\omega) = \int_{\mathbb{R}^n} f(x)\, e^{-i \omega \cdot x}\, dx.

Under this convention, the inverse transform becomes:

    f(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(\omega)\, e^{i \omega \cdot x}\, d\omega.

    Unlike the convention followed in this article, when the Fourier transform is defined this way, it is no longer a
    unitary transformation on L2(Rn). There is also less symmetry between the formulas for the Fourier transform and its
    inverse.
Another convention is to split the factor of (2π)^n evenly between the Fourier transform and its inverse, which leads to
definitions:

    \hat{f}(\omega) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x)\, e^{-i \omega \cdot x}\, dx, \qquad
    f(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat{f}(\omega)\, e^{i \omega \cdot x}\, d\omega.

    Under this convention, the Fourier transform is again a unitary transformation on L2(Rn). It also restores the
    symmetry between the Fourier transform and its inverse.
    Variations of all three conventions can be created by conjugating the complex-exponential kernel of both the forward
    and the reverse transform. The signs must be opposites. Other than that, the choice is (again) a matter of convention.
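The conventions in the summary table below differ only in where the factors of 2π appear. As a quick numerical cross-check, assuming NumPy (the Gaussian test function, grid and test frequency are arbitrary), all three can be evaluated by Riemann sums and compared with their closed forms:

    import numpy as np

    # The three common conventions applied to f(x) = exp(-πx²), compared with
    # their closed forms; they differ only in the placement of factors of 2π.
    x = np.linspace(-10, 10, 4001)
    dx = x[1] - x[0]
    f = np.exp(-np.pi * x**2)
    w = 1.7                                        # an arbitrary test frequency

    ft_ordinary   = np.sum(f * np.exp(-2j * np.pi * x * w)) * dx   # unitary, ξ
    ft_nonunitary = np.sum(f * np.exp(-1j * x * w)) * dx           # non-unitary, ω
    ft_unitary_w  = ft_nonunitary / np.sqrt(2 * np.pi)             # unitary, ω

    print(ft_ordinary.real,   np.exp(-np.pi * w**2))               # e^{-πξ²}
    print(ft_nonunitary.real, np.exp(-w**2 / (4 * np.pi)))         # e^{-ω²/(4π)}
    print(ft_unitary_w.real,  np.exp(-w**2 / (4 * np.pi)) / np.sqrt(2 * np.pi))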

                                Summary of popular forms of the Fourier transform

         ordinary frequency ξ (hertz), unitary:
             \hat{f}_1(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i x \cdot \xi}\, dx
             f(x) = \int_{\mathbb{R}^n} \hat{f}_1(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi

         angular frequency ω (rad/s), non-unitary:
             \hat{f}_2(\omega) = \int_{\mathbb{R}^n} f(x)\, e^{-i \omega \cdot x}\, dx
             f(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}_2(\omega)\, e^{i \omega \cdot x}\, d\omega

         angular frequency ω (rad/s), unitary:
             \hat{f}_3(\omega) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x)\, e^{-i \omega \cdot x}\, dx
             f(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat{f}_3(\omega)\, e^{i \omega \cdot x}\, d\omega


As discussed above, the characteristic function of a random variable is the same as the Fourier–Stieltjes transform of
its distribution measure, but in this context it is typical to take a different convention for the constants. Typically the
characteristic function is defined as

    E\bigl(e^{itX}\bigr) = \int e^{itx}\, d\mu_X(x).

As in the case of the "non-unitary angular frequency" convention above, there is no factor of 2π appearing either in
the integral or in the exponent. Unlike any of the conventions appearing above, this convention takes the
opposite sign in the exponent.


    Tables of important Fourier transforms
The following tables record some closed form Fourier transforms. For functions ƒ(x), g(x) and h(x) denote their
Fourier transforms by ƒ̂, ĝ, and ĥ respectively. Only the three most common conventions are included. It may be
    useful to notice that entry 105 gives a relationship between the Fourier transform of a function and the original
    function, which can be seen as relating the Fourier transform and its inverse.

    Functional relationships
    The Fourier transforms in this table may be found in Erdélyi (1954) or Kammler (2000, appendix).

              Function          Fourier transform       Fourier transform         Fourier transform                      Remarks
                                unitary, ordinary   unitary, angular frequency   non-unitary, angular
                                    frequency                                         frequency

                                                                                                        Definition




    101                                                                                                 Linearity

    102                                                                                                 Shift in time domain

    103                                                                                                 Shift in frequency domain, dual
                                                                                                        of 102

    104                                                                                                 Scaling in the time domain. If
                                                                                                            is large, then       is
                                                                                                        concentrated around 0 and
                                                                                                                         spreads out and

                                                                                                        flattens.

    105                                                                                                 Duality. Here        needs to be
                                                                                                        calculated using the same
                                                                                                        method as Fourier transform
                                                                                                        column. Results from swapping
                                                                                                        "dummy" variables of and
                                                                                                        or    or     .
    106


    107                                                                                                 This is the dual of 106


    108                                                                                                 The notation            denotes the
                                                                                                        convolution of       and    — this
                                                                                                        rule is the convolution theorem
    109                                                                                                 This is the dual of 108


    110 For         a purely real                                                                                      Hermitian symmetry.
                                                                                                                       indicates the complex
                                                                                                                       conjugate.

    111 For         a purely real                      ,        and       are purely real even functions.
          even function
    112 For         a purely real                  ,          and      are purely imaginary odd functions.
          odd function



    Square-integrable functions
    The Fourier transforms in this table may be found in (Campbell & Foster 1948), (Erdélyi 1954), or the appendix of
    (Kammler 2000).

              Function         Fourier transform               Fourier transform          Fourier transform                    Remarks
                               unitary, ordinary           unitary, angular frequency    non-unitary, angular
                                   frequency                                                  frequency




    201                                                                                                         The rectangular pulse and the
                                                                                                                normalized sinc function, here defined
                                                                                                                as sinc(x) = sin(πx)/(πx)

    202                                                                                                         Dual of rule 201. The rectangular
                                                                                                                function is an ideal low-pass filter, and
                                                                                                                the sinc function is the non-causal
                                                                                                                impulse response of such a filter.

    203                                                                                                         The function tri(x) is the triangular
                                                                                                                function

    204                                                                                                         Dual of rule 203.


    205                                                                                                         The function u(x) is the Heaviside unit
                                                                                                                step function and a>0.

    206                                                                                                         This shows that, for the unitary Fourier
                                                                                                                transforms, the Gaussian function
                                                                                                                exp(−αx2) is its own Fourier transform
                                                                                                                for some choice of α. For this to be
                                                                                                                integrable we must have Re(α)>0.

    207                                                                                                         For a>0. That is, the Fourier transform
                                                                                                                of a decaying exponential function is a
                                                                                                                Lorentzian function.

    208                                                                                                         Hyperbolic secant is its own Fourier
                                                                                                                transform

    209                                                                                                             is the Hermite polynomial. If
                                                                                                                        then the Gauss–Hermite
                                                                                                                functions are eigenfunctions of the
                                                                                                                Fourier transform operator. For a
                                                                                                                derivation, see Hermite polynomial.
                                                                                                                The formula reduces to 206 for
                                                                                                                .


      Distributions
      The Fourier transforms in this table may be found in (Erdélyi 1954) or the appendix of (Kammler 2000).

          Function            Fourier transform            Fourier transform         Fourier transform             Remarks
                         unitary, ordinary frequency   unitary, angular frequency   non-unitary, angular
                                                                                         frequency




301                                                                                                        The distribution δ(ξ)
                                                                                                           denotes the Dirac delta
                                                                                                           function.

302                                                                                                        Dual of rule 301.


303                                                                                                        This follows from 103
                                                                                                           and 301.

304                                                                                                        This follows from rules
                                                                                                           101 and 303 using Euler's
                                                                                                           formula:


305                                                                                                        This follows from 101
                                                                                                           and 303 using




306


307


308                                                                                                        Here, n is a natural
                                                                                                           number and           is
                                                                                                           the n-th distribution
                                                                                                           derivative of the Dirac
                                                                                                           delta function. This rule
                                                                                                           follows from rules 107
                                                                                                           and 301. Combining this
                                                                                                           rule with 101, we can
                                                                                                           transform all
                                                                                                           polynomials.

309                                                                                                        Here sgn(ξ) is the sign
                                                                                                           function. Note that 1/x is
                                                                                                           not a distribution. It is
                                                                                                           necessary to use the
                                                                                                           Cauchy principal value
                                                                                                           when testing against
                                                                                                           Schwartz functions. This
                                                                                                           rule is useful in studying
                                                                                                           the Hilbert transform.

310
                                                                                                           1/xn is the homogeneous
                                                                                                           distribution defined by
                                                                                                           the distributional
                                                                                                           derivative


311                   This formula is valid for
                      0 > α > −1. For α > 0
                      some singular terms arise
                      at the origin that can be
                      found by differentiating
                      318. If Re α > −1, then
                            is a locally
                      integrable function, and
                      so a tempered
                      distribution. The function
                                   is a
                      holomorphic function
                      from the right half-plane
                      to the space of tempered
                      distributions. It admits a
                      unique meromorphic
                      extension to a tempered
                      distribution, also denoted
                            for α ≠ −2, −4, ...
                      (See homogeneous
                      distribution.)
312                   The dual of rule 309. This
                      time the Fourier
                      transforms need to be
                      considered as Cauchy
                      principal value.

313                   The function u(x) is the
                      Heaviside unit step
                      function; this follows
                      from rules 101, 301, and
                      312.

314                   This function is known as
                      the Dirac comb function.
                      This result can be derived
                      from 302 and 102,
                      together with the fact that




                                                  as

                      distributions.

315                   The function J0(x) is the
                      zeroth order Bessel
                      function of first kind.

316                   This is a generalization of
                      315. The function Jn(x) is
                      the n-th order Bessel
                      function of first kind. The
                      function Tn(x) is the
                      Chebyshev polynomial of
                      the first kind.

317                     is the
                      Euler–Mascheroni
                      constant.


318                                                                                                                   This formula is valid for
                                                                                                                      1 > α > 0. Use
                                                                                                                      differentiation to derive
                                                                                                                      formula for higher
                                                                                                                      exponents. is the
                                                                                                                      Heaviside function.



      Two-dimensional functions

                Function              Fourier transform                   Fourier transform                Fourier transform
                                 unitary, ordinary frequency          unitary, angular frequency     non-unitary, angular frequency

        400




        401


        402




      Remarks
      To 400: The variables ξx, ξy, ωx, ωy, νx and νy are real numbers. The integrals are taken over the entire plane.
      To 401: Both functions are Gaussians, which may not have unit volume.
To 402: The function is defined by circ(r) = 1 for 0 ≤ r ≤ 1, and is 0 otherwise. This is the Airy distribution, and is
      expressed using J1 (the order 1 Bessel function of the first kind). (Stein & Weiss 1971, Thm. IV.3.3)


      Formulas for general n-dimensional functions

                      Function               Fourier transform             Fourier transform              Fourier transform
                                        unitary, ordinary frequency    unitary, angular frequency   non-unitary, angular frequency

         500




         501




         502


      Remarks
      To 501: The function χ[0,1] is the indicator function of the interval [0, 1]. The function Γ(x) is the gamma function.
      The function Jn/2 + δ is a Bessel function of the first kind, with order n/2 + δ. Taking n = 2 and δ = 0 produces 402.
      (Stein & Weiss 1971, Thm. 4.15)
      To 502: See Riesz potential. The formula also holds for all α ≠ −n, −n − 1, ... by analytic continuation, but then the
      function and its Fourier transforms need to be understood as suitably regularized tempered distributions. See
      homogeneous distribution.


    References
    • Boashash, B., ed. (2003), Time-Frequency Signal Analysis and Processing: A Comprehensive Reference, Oxford:
      Elsevier Science, ISBN 0-08-044335-4
• Bochner, S.; Chandrasekharan, K. (1949), Fourier Transforms, Princeton University Press
    • Bracewell, R. N. (2000), The Fourier Transform and Its Applications (3rd ed.), Boston: McGraw-Hill,
      ISBN 0-07-116043-4.
    • Campbell, George; Foster, Ronald (1948), Fourier Integrals for Practical Applications, New York: D. Van
      Nostrand Company, Inc..
    • Duoandikoetxea, Javier (2001), Fourier Analysis, American Mathematical Society, ISBN 0-8218-2172-5.
    • Dym, H; McKean, H (1985), Fourier Series and Integrals, Academic Press, ISBN 978-0-12-226451-1.
• Erdélyi, Arthur, ed. (1954), Tables of Integral Transforms, 1, New York: McGraw-Hill
    • Fourier, J. B. Joseph (1822), Théorie Analytique de la Chaleur [1], Paris: Chez Firmin Didot, père et fils
    • Fourier, J. B. Joseph; Freeman, Alexander, translator (1878), The Analytical Theory of Heat [2], The University
      Press
    • Grafakos, Loukas (2004), Classical and Modern Fourier Analysis, Prentice-Hall, ISBN 0-13-035399-X.
    • Hewitt, Edwin; Ross, Kenneth A. (1970), Abstract harmonic analysis. Vol. II: Structure and analysis for compact
      groups. Analysis on locally compact Abelian groups, Die Grundlehren der mathematischen Wissenschaften, Band
      152, Berlin, New York: Springer-Verlag, MR0262773.
    • Hörmander, L. (1976), Linear Partial Differential Operators, Volume 1, Springer-Verlag,
      ISBN 978-3-540-00662-6.
    • James, J.F. (2011), A Student's Guide to Fourier Transforms (3rd ed.), New York: Cambridge University Press,
      ISBN 978-0-521-17683-5.
    • Kaiser, Gerald (1994), A Friendly Guide to Wavelets [3], Birkhäuser, ISBN 0-8176-3711-7
    • Kammler, David (2000), A First Course in Fourier Analysis, Prentice Hall, ISBN 0-13-578782-3
    • Katznelson, Yitzhak (1976), An introduction to Harmonic Analysis, Dover, ISBN 0-486-63331-4
    • Knapp, Anthony W. (2001), Representation Theory of Semisimple Groups: An Overview Based on Examples [4],
      Princeton University Press, ISBN 978-0-691-09089-4
    • Pinsky, Mark (2002), Introduction to Fourier Analysis and Wavelets [5], Brooks/Cole, ISBN 0-534-37660-6
    • Polyanin, A. D.; Manzhirov, A. V. (1998), Handbook of Integral Equations, Boca Raton: CRC Press,
      ISBN 0-8493-2876-4.
    • Rudin, Walter (1987), Real and Complex Analysis (Third ed.), Singapore: McGraw Hill, ISBN 0-07-100276-6.
    • Rahman, Matiur (2011), Applications of Fourier Transforms to Generalized Functions [6], WIT Press,
      ISBN 1845645642.
    • Stein, Elias; Shakarchi, Rami (2003), Fourier Analysis: An introduction [7], Princeton University Press,
      ISBN 0-691-11384-X.
    • Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces [8], Princeton, N.J.:
      Princeton University Press, ISBN 978-0-691-08078-9.
• Taneja, HC (2008), "Chapter 18: Fourier integrals and Fourier transforms" [9], Advanced Engineering
  Mathematics, Volume 2, New Delhi, India: I. K. International Pvt Ltd, ISBN 8189866567.
    • Titchmarsh, E (1948), Introduction to the theory of Fourier integrals (2nd ed.), Oxford University: Clarendon
      Press (published 1986), ISBN 978-0-8284-0324-5.
    • Wilson, R. G. (1995), Fourier Series and Optical Transform Techniques in Contemporary Optics, New York:
      Wiley, ISBN 0-471-30357-7.
    • Yosida, K. (1968), Functional Analysis, Springer-Verlag, ISBN 3-540-58654-7.


    External links
    • The Discrete Fourier Transformation (DFT): Definition and numerical examples [10] — A Matlab tutorial
    • The Fourier Transform Tutorial Site [11] (thefouriertransform.com)
    • Fourier Series Applet [12] (Tip: drag magnitude or phase dots up or down to change the wave form).
    • Stephan Bernsee's FFTlab [13] (Java Applet)
    • Stanford Video Course on the Fourier Transform [14]
    • Hazewinkel, Michiel, ed. (2001), "Fourier transform" [15], Encyclopedia of Mathematics, Springer,
      ISBN 978-1-55608-010-4
    • Weisstein, Eric W., "Fourier Transform [16]" from MathWorld.
    • The DFT “à Pied”: Mastering The Fourier Transform in One Day [17] at The DSP Dimension
    • An Interactive Flash Tutorial for the Fourier Transform [18]


    References
[1] http://books.google.com/books?id=TDQJAAAAIAAJ&pg=PA525&dq=%22c%27est-%C3%A0-dire+qu%27on+a+l%27%C3%A9quation%22&hl=en&sa=X&ei=SrC7T9yKBorYiALVnc2oDg&sqi=2&ved=0CEAQ6AEwAg#v=onepage&q=%22c%27est-%C3%A0-dire%20qu%27on%20a%20l%27%C3%A9quation%22&f=false
[2] http://books.google.com/books?id=-N8EAAAAYAAJ&pg=PA408&dq=%22that+is+to+say,+that+we+have+the+equation%22&hl=en&sa=X&ei=F667T-u5I4WeiALEwpHXDQ&ved=0CDgQ6AEwAA#v=onepage&q=%22that%20is%20to%20say%2C%20that%20we%20have%20the%20equation%22&f=false
[3] http://books.google.com/books?id=rfRnrhJwoloC&pg=PA29&dq=%22becomes+the+Fourier+%28integral%29+transform%22&hl=en&sa=X&ei=osO7T7eFOqqliQK3goXoDQ&ved=0CDQQ6AEwAA#v=onepage&q=%22becomes%20the%20Fourier%20%28integral%29%20transform%22&f=false
[4] http://books.google.com/?id=QCcW1h835pwC
[5] http://books.google.com/books?id=tlLE4KUkk1gC&pg=PA256&dq=%22The+Fourier+transform+of+the+measure%22&hl=en&sa=X&ei=w8e7T43XJsiPiAKZztnRDQ&ved=0CEUQ6AEwAg#v=onepage&q=%22The%20Fourier%20transform%20of%20the%20measure%22&f=false
[6] http://books.google.com/books?id=k_rdcKaUdr4C&pg=PA10
[7] http://books.google.com/books?id=FAOc24bTfGkC&pg=PA158&dq=%22The+mathematical+thrust+of+the+principle%22&hl=en&sa=X&ei=Esa7T5PZIsqriQKluNjPDQ&ved=0CDQQ6AEwAA#v=onepage&q=%22The%20mathematical%20thrust%20of%20the%20principle%22&f=false
[8] http://books.google.com/books?id=YUCV678MNAIC&dq=editions:xbArf-TFDSEC&source=gbs_navlinks_s
[9] http://books.google.com/books?id=X-RFRHxMzvYC&pg=PA192&dq=%22The+Fourier+integral+can+be+regarded+as+an+extension+of+the+concept+of+Fourier+series%22&hl=en&sa=X&ei=D4rDT_vdCueQiAKF6PWeCA&ved=0CDQQ6AEwAA#v=onepage&q=%22The%20Fourier%20integral%20can%20be%20regarded%20as%20an%20extension%20of%20the%20concept%20of%20Fourier%20series%22&f=false
[10] http://www.nbtwiki.net/doku.php?id=tutorial:the_discrete_fourier_transformation_dft
[11] http://www.thefouriertransform.com
[12] http://www.westga.edu/~jhasbun/osp/Fourier.htm
[13] http://www.dspdimension.com/fftlab/
[14] http://www.academicearth.org/courses/the-fourier-transform-and-its-applications
[15] http://www.encyclopediaofmath.org/index.php?title=p/f041150
[16] http://mathworld.wolfram.com/FourierTransform.html
[17] http://www.dspdimension.com/admin/dft-a-pied/
[18] http://www.fourier-series.com/f-transform/index.html



     Spectral density
     In statistical signal processing, statistics, and physics, the spectrum of a time-series or signal is a positive real
     function of a frequency variable associated with a stationary stochastic process, or a deterministic function of time,
     which has dimensions of power per hertz (Hz), or energy per hertz. Intuitively, the spectrum decomposes the content
     of a stochastic process into different frequencies present in that process, and helps identify periodicities. More
     specific terms which are used are the power spectrum, spectral density, power spectral density, or energy
     spectral density.


     Explanation
     In physics, the signal is usually a wave, such as an electromagnetic wave, random vibration, or an acoustic wave.
     The spectral density of the wave, when multiplied by an appropriate factor, will give the power carried by the wave,
     per unit frequency, known as the power spectral density (PSD) of the signal. Power spectral density is commonly
     expressed in watts per hertz (W/Hz).[1]
For voltage signals, it is customary to use units of V² Hz⁻¹ for PSD, and V² s Hz⁻¹ for ESD.[2]
For random vibration analysis, units of g² Hz⁻¹ are sometimes used for acceleration spectral density.[3]
     Although it is not necessary to assign physical dimensions to the signal or its argument, in the following discussion
     the terms used will assume that the signal varies in time.


     Preliminary conventions on notations for time series
     The phrase time series has been defined as "... a collection of observations made sequentially in time."[4] But it is
     also used to refer to a stochastic process that would be the underlying theoretical model for the process that
     generated the data (and thus include consideration of all the other possible sequences of data that might have been
     observed, but weren't). Furthermore, time can be either continuous or discrete. There are, therefore, four different but
     closely related definitions and formulas for the power spectrum of a time series.
If X_n (discrete time) or X(t) (continuous time) is a stochastic process, we will refer to a possible time series of data
     coming from it as a sample or path or signal of the stochastic process. To avoid confusion, we will reserve the word
     process for a stochastic process, and use one of the words signal, or sample, to refer to a time series of data.
     For X any random variable, standard notations of angle brackets or E will be used for ensemble average, also known
     as statistical expectation, and Var for the theoretical variance.


     Motivating example
     Suppose       , from          to            is a time series (discrete time) with zero mean. Suppose that it is a sum of a
     finite number of periodic components (all frequencies are positive):




     The variance of        is, by definition,               . If these data were samples taken from an electrical signal, this

     would be its average power (power is energy per unit time, so it is analogous to variance if energy is analogous to the
     amplitude squared).
     Now, for simplicity, suppose the signal extends infinitely in time, so we pass to the limit as                . If the
     average power is bounded, which is almost always the case in reality, then the following limit exists and is the
     variance of the data.




     Again, for simplicity, we will pass to continuous time, and assume that the signal extends infinitely in time in both
     directions. Then these two formulas become



     and




     But obviously the root mean square of either             or     is       , so the variance of                        is
     and that of                      is        . Hence, the power of      which comes from the component with frequency
       is              . All these contributions add up to the power of               .
     Then the power as a function of frequency is obviously                           , and its statistical cumulative distribution
     function            will be



  is a step function, monotonically non-decreasing. Its jumps occur at the frequencies of the periodic
     components of         , and the value of each jump is the power or variance of that component.
     The variance is the covariance of the data with itself. If we now consider the same data but with a lag of , we can
     take the covariance of      with            , and define this to be the autocorrelation function of the signal (or
     data)     :



     When it exists, it is an even function of . If the average power is bounded, then           exists everywhere, is finite, and
     is bounded by         , which is the power or variance of the data.
     It is elementary to show that         can be decomposed into periodic components with the same periods as        :




     This is in fact the spectral decomposition of over the different frequencies, and is obviously related to the
     distribution of power of over the frequencies: the amplitude of a frequency component of is its contribution to
     the power of the signal.
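A tiny numerical version of this example follows, assuming NumPy; the amplitudes, frequencies and sample rate are made up for illustration. The average power of a zero-mean sum of sinusoids equals the sum of the per-component contributions A_k²/2, which is exactly what the power spectrum records frequency by frequency.

    import numpy as np

    # The average power (variance) of a sum of sinusoids is the sum of A_k^2 / 2,
    # one contribution per frequency.
    fs = 1000.0
    t = np.arange(0, 10.0, 1 / fs)
    amps, freqs = [1.0, 0.5, 0.25], [50.0, 120.0, 300.0]   # illustrative values

    x = sum(A * np.sin(2 * np.pi * f0 * t) for A, f0 in zip(amps, freqs))

    variance = np.mean(x**2)                               # zero-mean signal
    per_component = [A**2 / 2 for A in amps]
    print(variance, sum(per_component))                    # ≈ equal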


     Definition

     Energy spectral density
     The energy spectral density describes how the energy of a signal or a time series is distributed with frequency.
     Here, the term energy is used in the generalized sense of signal processing. This energy spectral density is most
     suitable for transients, i.e., pulse-like signals, having a finite total energy; mathematically, we require that the signal
is described by a square-integrable function. In this case, the energy spectral density Φ(ω) of the signal is the square
of the magnitude of the Fourier transform of the signal,

    \Phi(\omega) = \left| \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x(t)\, e^{-i\omega t}\, dt \right|^2 = \frac{X(\omega)\, X^{*}(\omega)}{2\pi},

where ω is the angular frequency (i.e., 2π times the ordinary frequency), X(ω) is the Fourier transform of
x(t), and X*(ω) is its complex conjugate. As is always the case, the multiplicative factor of 1/(2π) is not


     universally agreed upon, but rather depends on the particular normalizing constants used in the definition of the
     various Fourier transforms.
As an example, if x(t) represents the potential (in volts) of an electrical signal propagating across a transmission
line, then the units of measure for spectral density Φ(ω) would appear as volt²·second², which is per se not yet
dimensionally correct for a spectral energy density in the sense of the physical sciences. However, after dividing by
the characteristic impedance Z (in ohms) of the transmission line, the dimensions of Φ(ω) would become
volt²·second² per ohm, which is equivalent to joules per hertz, the SI unit for spectral energy density as defined in
the physical sciences.
This definition generalizes in a straightforward manner to a discrete signal with an infinite number of values x_n,
such as a signal sampled at discrete times t_n = n·Δt:

    \Phi(\omega) = \frac{(\Delta t)^2}{2\pi} \left| \sum_{n=-\infty}^{\infty} x_n\, e^{-i\omega n \Delta t} \right|^2,

where the sum is, up to the factor Δt, the discrete-time Fourier transform of x_n. In the mathematical sciences, the
sampling interval Δt is often set to one. It is needed, however, to keep the correct physical units and to ensure that we
recover the continuous case in the limit Δt → 0.
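A discrete sketch of the energy spectral density follows, assuming NumPy; the pulse and sample rate are arbitrary. Keeping the sampling interval in the definition makes the frequency-domain sum of the ESD reproduce the total signal energy (a discrete Parseval identity), which is the sense in which it is an energy density per hertz.

    import numpy as np

    # Energy spectral density of a sampled pulse: summing it over frequency
    # returns the total signal energy.
    fs = 500.0                                 # sample rate (Hz)
    dt = 1 / fs
    t = np.arange(-2, 2, dt)
    x = np.exp(-t**2 / (2 * 0.05**2))          # a short Gaussian pulse

    X = dt * np.fft.fft(x)                     # approximates the continuous transform
    esd = np.abs(X) ** 2                       # energy per hertz at each DFT frequency
    df = fs / len(x)                           # frequency-bin width

    energy_time = np.sum(np.abs(x) ** 2) * dt
    energy_freq = np.sum(esd) * df
    print(energy_time, energy_freq)            # equal (Parseval)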

     Power spectral density
     The above definition of energy spectral density is most suitable for transients, i.e., pulse-like signals, for which the
     Fourier transforms of the signals exist. For continued signals that describe, for example, stationary physical
     processes, it makes more sense to define a power spectral density (PSD), which describes how the power of a
     signal or time series is distributed over the different frequencies, as in the simple example given previously. Here,
     power can be the actual physical power, or more often, for convenience with abstract signals, can be defined as the
     squared value of the signal. (Statisticians study the variance of a set of data, but because of the analogy with
     electrical signals, still refer to it as the power spectrum). The total power of the signal will be a time average since
power has units of energy/time: the power of the signal will be given by

    P = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t)^2\, dt,

     if it exists, and the power of a signal may be finite even if the energy is infinite.
If the following normalized Fourier transform exists,

    F_T(\omega) = \frac{1}{\sqrt{2T}} \int_{-T}^{T} x(t)\, e^{-i\omega t}\, dt,

which is not necessarily the case, we can define the power spectral density as[5][6]

    S_{xx}(\omega) = \lim_{T \to \infty} \mathrm{E}\bigl[\, |F_T(\omega)|^2 \,\bigr].

     Remark: Many signals of interest are not integrable and the non-normalized (=ordinary) Fourier transform
     of the signal does not exist. Some authors (e.g. Risken[7] ) still use the non-normalized Fourier transform in a formal
     way to formulate a definition of the power spectral density
                                                         ,
     where                 is the Dirac delta function. Such formal statements may be sometimes useful to guide the
     intuition, but should always be used with utmost care.
Using such formal reasoning, one may already guess that for a stationary random process, the power spectral density
S(ω) and the autocorrelation function of this signal, R(τ) = ⟨X(t) X(t + τ)⟩, should be a Fourier transform
pair. Provided that R(τ) is absolutely integrable, which is not always true, then

    S(\omega) = \int_{-\infty}^{\infty} R(\tau)\, e^{-i\omega \tau}\, d\tau.

     A deep theorem that was worked out by Norbert Wiener and Aleksandr Khinchin (the Wiener–Khinchin theorem)
makes sense of this formula for any wide-sense stationary process under weaker hypotheses: R(τ) does not need to be
absolutely integrable; it only needs to exist. But the integral can no longer be interpreted as usual. The formula also
makes sense if interpreted as involving distributions (in the sense of Laurent Schwartz, not in the sense of a
statistical cumulative distribution function) instead of functions. If R is continuous, Bochner's theorem can be used
     to prove that its Fourier transform exists as a positive measure, whose distribution function is F (but not necessarily
     as a function and not necessarily possessing a probability density).
     Many authors use this equality to actually define the power spectral density.[8]
     The power of the signal in a given frequency band                can be calculated by integrating over positive and
     negative frequencies,



     where F is the integrated spectrum whose derivative is f.
     More generally, similar techniques may be used to estimate a time-varying spectral density.
The definition of the power spectral density generalizes in a straightforward manner to finite time-series x_n with
1 ≤ n ≤ N, such as a signal sampled at discrete times t_n = n·Δt for a total measurement period T = N·Δt:

    S_{xx}(\omega) = \frac{(\Delta t)^2}{T} \left| \sum_{n=1}^{N} x_n\, e^{-i\omega n \Delta t} \right|^2.

     In a real-world application, one would typically average this single-measurement PSD over several repetitions of the
     measurement to obtain a more accurate estimate of the theoretical PSD of the physical process underlying the
individual measurements. This computed PSD is sometimes called a periodogram. One can prove that the
periodogram converges to the true PSD as the averaging time interval T goes to infinity (Brown & Hwang[9]).
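A plain-NumPy sketch of such an averaged (Welch-style) periodogram follows; the signal, window and segment length are illustrative, and the normalization follows one common per-hertz convention. Averaging over segments lowers the variance of the estimate, and the sinusoid's peak stands out clearly above the noise floor.

    import numpy as np

    # Averaged periodogram (the idea behind Welch's method): averaging the
    # periodograms of many segments reduces the variance of the PSD estimate.
    fs = 1000.0
    rng = np.random.default_rng(4)
    t = np.arange(0, 20.0, 1 / fs)
    x = np.sin(2 * np.pi * 123.0 * t) + rng.standard_normal(t.size)

    nseg = 1024
    segments = x[: (x.size // nseg) * nseg].reshape(-1, nseg)
    win = np.hanning(nseg)
    scale = fs * np.sum(win**2)                        # per-hertz normalization

    psd = np.mean(np.abs(np.fft.rfft(segments * win, axis=1)) ** 2, axis=0) / scale
    freqs = np.fft.rfftfreq(nseg, d=1 / fs)

    print(freqs[np.argmax(psd)])                       # ≈ 123 Hz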
     If two signals both possess power spectra (the correct terminology), then a cross-power spectrum can be calculated
     by using their cross-correlation function.

     Properties of the power spectral density
     Some properties of the PSD include:[10]
• the spectrum of a real valued process is symmetric: S(−ω) = S(ω), or in other words, it is an even function
     • if the process is continuous and purely indeterministic, the autocovariance function can be reconstructed by using
       the Inverse Fourier transform
     • it describes the distribution of the variance over frequency. In particular,




     • it is a linear function of the autocovariance function
           If      is decomposed into two functions                                  , then


                    where
     The integrated spectrum or power spectral distribution             is defined as[11]


     Cross-spectral density
     "Just as the Power Spectral Density (PSD) is the Fourier transform of the auto-covariance function we may define
     the Cross Spectral Density (CSD) as the Fourier transform of the cross-covariance function."[12]
The PSD is a special case of the cross spectral density (CPSD) function, defined between two signals xn and yn as the Fourier transform of their cross-covariance function.
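A sketch of a cross-spectral density estimate by segment averaging follows, assuming NumPy; the signals, window and segment length are illustrative. The phase of the CSD at the shared frequency encodes the relative delay between the two signals.

    import numpy as np

    # Cross-spectral density by segment averaging: the CSD's phase at the
    # common frequency reveals the relative delay between the signals.
    fs = 1000.0
    rng = np.random.default_rng(5)
    t = np.arange(0, 20.0, 1 / fs)
    delay = 0.002                                              # 2 ms
    x = np.sin(2 * np.pi * 100.0 * t) + 0.5 * rng.standard_normal(t.size)
    y = np.sin(2 * np.pi * 100.0 * (t - delay)) + 0.5 * rng.standard_normal(t.size)

    nseg = 2048
    xs = x[: (x.size // nseg) * nseg].reshape(-1, nseg)
    ys = y[: (y.size // nseg) * nseg].reshape(-1, nseg)
    win = np.hanning(nseg)

    X = np.fft.rfft(xs * win, axis=1)
    Y = np.fft.rfft(ys * win, axis=1)
    csd = np.mean(X * np.conj(Y), axis=0)
    freqs = np.fft.rfftfreq(nseg, d=1 / fs)

    k = np.argmin(np.abs(freqs - 100.0))
    print(np.angle(csd[k]) / (2 * np.pi * 100.0))              # ≈ +0.002 s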




     Estimation
     The goal of spectral density estimation is to estimate the spectral density of a random signal from a sequence of time
     samples. Depending on what is known about the signal, estimation techniques can involve parametric or
     non-parametric approaches, and may be based on time-domain or frequency-domain analysis. For example, a
     common parametric technique involves fitting the observations to an autoregressive model. A common
     non-parametric technique is the periodogram.
     The spectral density is usually estimated using Fourier transform methods, but other techniques such as Welch's
     method and the maximum entropy method can also be used.
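A minimal parametric sketch, assuming NumPy: simulate an AR(2) process, fit it with the Yule–Walker equations, and evaluate the fitted model's spectrum. The model order, coefficients and normalization are illustrative choices; dedicated libraries provide more careful estimators.

    import numpy as np

    # Parametric PSD estimation: fit an AR(2) model via Yule–Walker, then
    # evaluate its spectrum  S(f) = sigma² / |1 - a1 e^{-iω} - a2 e^{-2iω}|² / fs.
    fs = 1.0
    rng = np.random.default_rng(6)

    # Simulate an AR(2) process  x_n = 1.5 x_{n-1} - 0.9 x_{n-2} + e_n.
    n_samples = 20_000
    x = np.zeros(n_samples)
    e = rng.standard_normal(n_samples)
    for n in range(2, n_samples):
        x[n] = 1.5 * x[n - 1] - 0.9 * x[n - 2] + e[n]

    def autocov(x, lag):
        x = x - x.mean()
        return np.mean(x[: len(x) - lag] * x[lag:])

    r = np.array([autocov(x, k) for k in range(3)])
    R = np.array([[r[0], r[1]], [r[1], r[0]]])       # Yule–Walker system
    a = np.linalg.solve(R, r[1:])                    # AR coefficients (a1, a2)
    sigma2 = r[0] - a @ r[1:]                        # innovation variance

    f = np.linspace(0, 0.5, 501)                     # cycles per sample
    z = np.exp(-2j * np.pi * f)
    psd = sigma2 / np.abs(1 - a[0] * z - a[1] * z**2) ** 2 / fs

    print(np.round(a, 2), round(float(sigma2), 2))   # ≈ [1.5, -0.9], ≈ 1.0
    print(f[np.argmax(psd)])                         # spectral peak of the fitted model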


     Properties
• The spectral density of x(t) and the autocorrelation of x(t) form a Fourier transform pair (for PSD versus
  ESD, different definitions of the autocorrelation function are used).
• One of the results of Fourier analysis is Parseval's theorem, which states that the area under the energy spectral
  density curve is equal to the area under the square of the magnitude of the signal, the total energy:

      \int_{-\infty}^{\infty} |x(t)|^2\, dt = \int_{-\infty}^{\infty} \Phi(\omega)\, d\omega.

           The above theorem holds true in the discrete cases as well. A similar result holds for power: the area under the
           power spectral density curve is equal to the total signal power, which is       , the autocorrelation function at
           zero lag. This is also (up to a constant which depends on the normalization factors chosen in the definitions
           employed) the variance of the data comprising the signal.
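The discrete form of Parseval's theorem mentioned above can be checked numerically; with NumPy's unnormalized DFT convention the identity picks up a 1/N factor on the frequency side. This is an illustrative sketch, not from the article.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(4096)

energy_time = np.sum(np.abs(x) ** 2)
X = np.fft.fft(x)
# with NumPy's unnormalized DFT, Parseval's theorem reads sum|x|^2 = (1/N) * sum|X|^2
energy_freq = np.sum(np.abs(X) ** 2) / x.size

assert np.isclose(energy_time, energy_freq)
```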


     Related concepts
     • Most "frequency" graphs really display only the spectral density. Sometimes the complete frequency spectrum is
       graphed in two parts, "amplitude" versus frequency (which is the spectral density) and "phase" versus frequency
       (which contains the rest of the information from the frequency spectrum).       cannot be recovered from the
        spectral density part alone — the "temporal information" is lost.
     • The spectral centroid of a signal is the midpoint of its spectral density function, i.e. the frequency that divides the
       distribution into two equal parts.
     • The spectral edge frequency of a signal is an extension of the previous concept to any proportion instead of two
       equal parts.
     • Spectral density is a function of frequency, not a function of time. However, the spectral density of small
       "windows" of a longer signal may be calculated, and plotted versus time associated with the window. Such a
       graph is called a spectrogram. This is the basis of a number of spectral analysis techniques such as the short-time
       Fourier transform and wavelets.
     • In radiometry and colorimetry (or color science more generally), the spectral power distribution (SPD) of a light
       source is a measure of the power carried by each frequency or "color" in a light source. The light spectrum is
       usually measured at points (often 31) along the visible spectrum, in wavelength space instead of frequency space,
        which makes it not strictly a spectral density. Some spectrophotometers can measure increments as fine as one to
        two nanometers. Values are used to calculate other specifications and then plotted to demonstrate the spectral
        attributes of the source. This can be a helpful tool in analyzing the color characteristics of a particular source.


     Applications

     Electrical engineering
     The concept and use of the power spectrum of a signal is fundamental in electrical engineering, especially in
     electronic communication systems, including radio communications, radars, and related systems, plus passive
     [remote sensing] technology. Much effort has been expended and millions of dollars spent on developing and
     producing electronic instruments called "spectrum analyzers" for aiding electrical engineers and technicians in
     observing and measuring the power spectra of signals. The cost of a spectrum analyzer varies depending on its
     frequency range, its bandwidth, and its accuracy. The higher the frequency range (S-band, C-band, X-band, Ku-band,
     K-band, Ka-band, etc.), the more difficult the components are to make, assemble, and test and the more expensive
     the spectrum analyzer is. Also, the wider the bandwidth that a spectrum analyzer possesses, the more costly that it is,
     and the capability for more accurate measurements increases costs as well.
     The spectrum analyzer measures the magnitude of the short-time Fourier transform (STFT) of an input signal. If the
     signal being analyzed can be considered a stationary process, the STFT is a good smoothed estimate of its power
     spectral density. These devices work in low frequencies and with small bandwidths.


     Coherence
     See Coherence (signal processing) for use of the cross-spectral density.


     References
[1] Gérard Maral (2003). VSAT Networks (http://books.google.com/books?id=CMx5HQ1Mr_UC&pg=PR20). John Wiley and Sons. ISBN 0-470-86684-5.
[2] Michael Peter Norton and Denis G. Karczub (2003). Fundamentals of Noise and Vibration Analysis for Engineers (http://books.google.com/books?id=jDeRCSqtev4C&pg=PA352). Cambridge University Press. ISBN 0-521-49913-5.
[3] Alessandro Birolini (2007). Reliability Engineering (http://books.google.com/books?id=xPIW3AI9tdAC&pg=PA83). Springer. p. 83. ISBN 978-3-540-49388-4.
[4] C. Chatfield (1989). The Analysis of Time Series—An Introduction (fourth ed.). Chapman and Hall, London. p. 1. ISBN 0-412-31820-2.
[5] Fred Rieke, William Bialek, and David Warland (1999). Spikes: Exploring the Neural Code (Computational Neuroscience). MIT Press. ISBN 978-0262681087.
[6] Scott Millers and Donald Childers (2012). Probability and Random Processes. Academic Press.
[7] Hannes Risken (1996). The Fokker–Planck Equation: Methods of Solution and Applications (http://books.google.com/books?id=MG2V9vTgSgEC&pg=PA30) (2nd ed.). Springer. p. 30. ISBN 9783540615309.
[8] Dennis Ward Ricker (2003). Echo Signal Processing (http://books.google.com/books?id=NF2Tmty9nugC&pg=PA23). Springer. ISBN 1-4020-7395-X.
[9] Robert Grover Brown & Patrick Y.C. Hwang (1997). Introduction to Random Signals and Applied Kalman Filtering (http://www.amazon.com/dp/0471128392). John Wiley & Sons. ISBN 0-471-12839-2.
[10] Storch, H. von; F. W. Zwiers (2001). Statistical Analysis in Climate Research. Cambridge University Press. ISBN 0-521-01230-9.
[11] Wilbur B. Davenport and William L. Root (1987). An Introduction to the Theory of Random Signals and Noise. IEEE Press, New York. ISBN 0-87942-235-1.
[12] http://www.fil.ion.ucl.ac.uk/~wpenny/course/course.html, chapter 7.
    Signal processing
Signal processing is an area of systems engineering, electrical engineering and applied mathematics that deals with operations on or analysis of signals, or measurements of time-varying or spatially varying physical quantities. Signals of interest can include sound, images, and sensor data, for example biological data such as electrocardiograms, control system signals, telecommunication transmission signals, and many others.

[Figure: Signal transmission using electronic signal processing. Transducers convert signals from other physical waveforms to electrical current or voltage waveforms, which then are processed, transmitted as electromagnetic waves, received and converted by another transducer to final form.]


    Typical operations and applications
    The goals of signal processing can roughly be divided into the following categories.
    • Signal acquisition and reconstruction, which involves measuring a physical signal, storing it, and possibly later
      rebuilding the original signal or an approximation thereof. For digital systems, this typically includes sampling
      and quantization.
    • Quality improvement, such as noise reduction, image enhancement, and echo cancellation.
    • Signal compression, including audio compression, image compression, and video compression.
    • Feature extraction, such as image understanding and speech recognition.
    In communication systems, signal processing may occur at OSI layer 1, the Physical Layer (modulation,
    equalization, multiplexing, etc.) in the seven layer OSI model, as well as at OSI layer 6, the Presentation Layer
    (source coding, including analog-to-digital conversion and data compression).


    History
    According to Alan V. Oppenheim and Ronald W. Schafer, the principles of signal processing can be found in the
    classical numerical analysis techniques of the 17th century. They further state that the "digitalization" or digital
    refinement of these techniques can be found in the digital control systems of the 1940s and 1950s.[1]


    Mathematical methods applied in signal processing
    •   Linear signals and systems, and transform theory
    •   System identification and classification
    •   Calculus
    •   Differential equations
    •   Vector spaces and Linear algebra
    •   Functional analysis
    •   Probability and stochastic processes
    •   Detection theory
    • Estimation theory
    • Optimization
    • Programming
    • Numerical methods
    • Iterative methods


    Categories of signal processing

    Analog signal processing
    Analog signal processing is for signals that have not been digitized, as in legacy radio, telephone, radar, and
    television systems. This involves linear electronic circuits such as passive filters, active filters, additive mixers,
    integrators and delay lines. It also involves non-linear circuits such as compandors, multiplicators (frequency mixers
    and voltage-controlled amplifiers), voltage-controlled filters, voltage-controlled oscillators and phase-locked loops.


    Discrete time signal processing
    Discrete time signal processing is for sampled signals that are considered as defined only at discrete points in time,
    and as such are quantized in time, but not in magnitude.
    Analog discrete-time signal processing is a technology based on electronic devices such as sample and hold circuits,
    analog time-division multiplexers, analog delay lines and analog feedback shift registers. This technology was a
    predecessor of digital signal processing (see below), and is still used in advanced processing of gigahertz signals.
    The concept of discrete-time signal processing also refers to a theoretical discipline that establishes a mathematical
    basis for digital signal processing, without taking quantization error into consideration.


    Digital signal processing
    Digital signal processing is the processing of digitised discrete time sampled signals. Processing is done by
    general-purpose computers or by digital circuits such as ASICs, field-programmable gate arrays or specialized digital
    signal processors (DSP chips). Typical arithmetical operations include fixed-point and floating-point, real-valued
    and complex-valued, multiplication and addition. Other typical operations supported by the hardware are circular
    buffers and look-up tables. Examples of algorithms are the Fast Fourier transform (FFT), finite impulse response
    (FIR) filter, Infinite impulse response (IIR) filter, and adaptive filters such as the Wiener and Kalman filters.
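As a small illustration of the operations listed above (illustrative only; the signal, the sampling rate and the 5-tap kernel are arbitrary choices, not from the article), a finite impulse response filter can be applied by direct convolution:

```python
import numpy as np

rng = np.random.default_rng(4)
fs = 500.0                                           # sampling rate in Hz
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 3.0 * t) + 0.5 * rng.standard_normal(t.size)   # noisy 3 Hz tone

fir_taps = np.ones(5) / 5                            # 5-tap moving-average FIR kernel
y = np.convolve(x, fir_taps, mode="same")            # direct-form FIR filtering (multiply-accumulate)
```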


    Fields of signal processing
    • Statistical signal processing – analyzing and extracting information from signals and noise based on their
      stochastic properties
    • Audio signal processing – for electrical signals representing sound, such as speech or music
    • Speech signal processing – for processing and interpreting spoken words
    • Image processing – in digital cameras, computers and various imaging systems
    • Video processing – for interpreting moving pictures
    • Array processing – for processing signals from arrays of sensors
    • Time-frequency analysis – for processing non-stationary signals[2]
    • Filtering – used in many fields to process signals
    • Seismic signal processing
    • Data mining
    • Financial signal processing
    Notes and references
    [1] Oppenheim, Alan V.; Schafer, Ronald W. (1975). Digital Signal Processing. Prentice Hall. p. 5. ISBN 0-13-214635-5.
    [2] Boashash, Boualem, ed. (2003). Time frequency signal analysis and processing a comprehensive reference (1 ed.). Amsterdam: Elsevier.
        ISBN 0-08-044335-4.



    External links
    • Signal Processing for Communications (http://www.sp4comm.org/) — free online textbook by Paolo Prandoni
      and Martin Vetterli (2008)
    • Scientists and Engineers Guide to Digital Signal Processing (http://www.dspguide.com) — free online
      textbook by Stephen Smith



    Autoregressive conditional heteroskedasticity
    In econometrics, AutoRegressive Conditional Heteroskedasticity (ARCH) models are used to characterize and
    model observed time series. They are used whenever there is reason to believe that, at any point in a series, the terms
    will have a characteristic size, or variance. In particular ARCH models assume the variance of the current error term
    or innovation to be a function of the actual sizes of the previous time periods' error terms: often the variance is
    related to the squares of the previous innovations.
    Such models are often called ARCH models (Engle, 1982),[1] although a variety of other acronyms are applied to
    particular structures of model which have a similar basis. ARCH models are employed commonly in modeling
    financial time series that exhibit time-varying volatility clustering, i.e. periods of swings followed by periods of
    relative calm.


    ARCH(q) model Specification
Suppose one wishes to model a time series using an ARCH process. Let $\epsilon_t$ denote the error terms (return residuals, with respect to a mean process), i.e. the series terms. These $\epsilon_t$ are split into a stochastic piece $z_t$ and a time-dependent standard deviation $\sigma_t$ characterizing the typical size of the terms, so that

    $\epsilon_t = \sigma_t z_t .$

The random variable $z_t$ is a strong white noise process. The series $\sigma_t^2$ is modelled by

    $\sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \cdots + \alpha_q \epsilon_{t-q}^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 ,$

where $\alpha_0 > 0$ and $\alpha_i \ge 0$ for $i > 0$.
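A minimal simulation sketch of the ARCH(1) special case of this specification (illustrative; the parameter values and the function name are arbitrary choices, not from the article):

```python
import numpy as np

def simulate_arch1(n, a0=0.2, a1=0.6, seed=0):
    """Simulate an ARCH(1) process: eps_t = sigma_t * z_t with sigma_t^2 = a0 + a1 * eps_{t-1}^2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)                 # the strong white noise z_t
    eps = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = a0 / (1.0 - a1)                # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = a0 + a1 * eps[t - 1] ** 2
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

eps, sigma2 = simulate_arch1(5000)
```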
    An ARCH(q) model can be estimated using ordinary least squares. A methodology to test for the lag length of
    ARCH errors using the Lagrange multiplier test was proposed by Engle (1982). This procedure is as follows:
1. Estimate the best fitting autoregressive model AR(q):

    $y_t = a_0 + a_1 y_{t-1} + \cdots + a_q y_{t-q} + \epsilon_t = a_0 + \sum_{i=1}^{q} a_i y_{t-i} + \epsilon_t .$

2. Obtain the squares of the error $\hat\epsilon^2$ and regress them on a constant and q lagged values:

    $\hat\epsilon_t^2 = \hat\alpha_0 + \sum_{i=1}^{q} \hat\alpha_i \hat\epsilon_{t-i}^2 ,$

   where q is the length of ARCH lags.
3. The null hypothesis is that, in the absence of ARCH components, we have $\alpha_i = 0$ for all $i = 1, \dots, q$. The
   alternative hypothesis is that, in the presence of ARCH components, at least one of the estimated $\alpha_i$ coefficients
   must be significant. In a sample of T residuals under the null hypothesis of no ARCH errors, the test statistic TR²
   follows a $\chi^2$ distribution with q degrees of freedom, where R² is the coefficient of determination of the regression
   in step 2. If TR² is greater than the Chi-square table value, we reject the null hypothesis and conclude there is an
   ARCH effect in the ARMA model. If TR² is smaller than the Chi-square table value, we do not reject the null hypothesis.
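The Lagrange multiplier procedure above can be sketched directly as an auxiliary least-squares regression; this is an illustrative implementation, and packaged versions exist (for example statsmodels' het_arch), which would normally be preferred.

```python
import numpy as np

def arch_lm_stat(resid, q):
    """Engle's LM test: regress squared residuals on a constant and q lags, return T * R^2.

    Under the null of no ARCH effects the statistic is asymptotically chi-squared with q df.
    """
    e2 = np.asarray(resid, dtype=float) ** 2
    y = e2[q:]                                                    # dependent variable
    X = np.column_stack([np.ones(y.size)] +                       # constant
                        [e2[q - i:-i] for i in range(1, q + 1)])  # q lagged squared residuals
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return y.size * r2
```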


    GARCH
    If an autoregressive moving average model (ARMA model) is assumed for the error variance, the model is a
    generalized autoregressive conditional heteroskedasticity (GARCH, Bollerslev(1986)) model.
In that case, the GARCH(p, q) model (where p is the order of the GARCH terms $\sigma^2$ and q is the order of the ARCH
terms $\epsilon^2$) is given by

    $\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 .$
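A minimal GARCH(1,1) simulation following this recursion (illustrative parameter values; the stationarity condition α₁ + β₁ < 1 is assumed so that the unconditional variance exists):

```python
import numpy as np

def simulate_garch11(n, a0=0.1, a1=0.08, b1=0.9, seed=1):
    """Simulate a GARCH(1,1) process: sigma_t^2 = a0 + a1 * eps_{t-1}^2 + b1 * sigma_{t-1}^2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    eps = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = a0 / (1.0 - a1 - b1)        # unconditional variance, requires a1 + b1 < 1
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

returns, cond_var = simulate_garch11(5000)   # exhibits the volatility clustering described earlier
```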
Generally, when testing for heteroskedasticity in econometric models, the best test is the White test. However, when
dealing with time series data, this means testing for ARCH errors (as described above) and GARCH errors (below).
Exponentially weighted moving average (EWMA) models preceded GARCH and have largely been superseded by it, although some practitioners use both.


    GARCH(p, q) model specification
The lag length p of a GARCH(p, q) process is established in three steps:
1. Estimate the best fitting AR(q) model

    $y_t = a_0 + a_1 y_{t-1} + \cdots + a_q y_{t-q} + \epsilon_t = a_0 + \sum_{i=1}^{q} a_i y_{t-i} + \epsilon_t .$

2. Compute and plot the autocorrelations of $\epsilon^2$ by

    $\rho(i) = \frac{\sum_{t=i+1}^{T} \left(\hat\epsilon_t^2 - \overline{\hat\epsilon^2}\right)\left(\hat\epsilon_{t-i}^2 - \overline{\hat\epsilon^2}\right)}{\sum_{t=1}^{T} \left(\hat\epsilon_t^2 - \overline{\hat\epsilon^2}\right)^2} .$

3. The asymptotic (large-sample) standard deviation of $\rho(i)$ is $1/\sqrt{T}$. Individual values larger
   than this indicate GARCH errors. To estimate the total number of lags, use the Ljung-Box test until the value of
   these is less than, say, 10% significant. The Ljung-Box Q-statistic follows a $\chi^2$ distribution with n degrees of
   freedom if the squared residuals $\hat\epsilon_t^2$ are uncorrelated. It is recommended to consider up to T/4 values of n. The
   null hypothesis states that there are no ARCH or GARCH errors. Rejecting the null thus means that such errors
   exist in the conditional variance.
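Step 2 and the 1/√T rule of step 3 can be sketched as follows; the 1.96/√T band used here is the usual 95% convention, not something prescribed by the article.

```python
import numpy as np

def acf_of_squares(resid, nlags):
    """Sample autocorrelations of the squared residuals, as used in step 2 above."""
    e2 = np.asarray(resid, dtype=float) ** 2
    e2 = e2 - e2.mean()
    denom = np.sum(e2 ** 2)
    return np.array([np.sum(e2[i:] * e2[:-i]) / denom for i in range(1, nlags + 1)])

# for plain white noise the autocorrelations should stay inside the ~1.96/sqrt(T) band;
# for a simulated GARCH(1,1) series the early lags typically fall well outside it
rng = np.random.default_rng(7)
noise = rng.standard_normal(5000)
rho = acf_of_squares(noise, nlags=20)
band = 1.96 / np.sqrt(noise.size)
flagged_lags = np.nonzero(np.abs(rho) > band)[0] + 1   # lags suggesting (G)ARCH structure
```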

    NGARCH
    Nonlinear GARCH (NGARCH) also known as Nonlinear Asymmetric GARCH(1,1) (NAGARCH) was introduced
    by Engle and Ng in 1993.


                             .
    For stock returns, parameter      is usually estimated to be positive; in this case, it reflects the leverage effect,
    signifying that negative returns increase future volatility by a larger amount than positive returns of the same
    magnitude.[2][3]
    This model shouldn't be confused with the NARCH model, together with the NGARCH extension, introduced by
    Higgins and Bera in 1992.
    IGARCH
    Integrated Generalized Autoregressive Conditional Heteroskedasticity IGARCH is a restricted version of the
    GARCH model, where the persistent parameters sum up to one, and therefore there is a unit root in the GARCH
process. The condition for this is

    $\sum_{i=1}^{p} \beta_i + \sum_{i=1}^{q} \alpha_i = 1 .$



    EGARCH
The exponential general autoregressive conditional heteroskedastic (EGARCH) model by Nelson (1991) is
another form of the GARCH model. Formally, an EGARCH(p,q):

    $\log \sigma_t^2 = \omega + \sum_{k=1}^{q} \beta_k g(Z_{t-k}) ,$

where $g(Z_t) = \theta Z_t + \lambda\left(|Z_t| - E|Z_t|\right)$, $\sigma_t^2$ is the conditional variance, $\omega$, $\beta_k$, $\theta$ and $\lambda$ are
coefficients, and $Z_t$ may be a standard normal variable or come from a generalized error distribution. The
formulation for $g(Z_t)$ allows the sign and the magnitude of $Z_t$ to have separate effects on the volatility. This is
particularly useful in an asset pricing context.[4]
Since $\log \sigma_t^2$ may be negative, there are no (fewer) restrictions on the parameters.


    GARCH-M
    The GARCH-in-mean (GARCH-M) model adds a heteroskedasticity term into the mean equation. It has the
    specification:


    The residual      is defined as




    QGARCH
    The Quadratic GARCH (QGARCH) model by Sentana (1995) is used to model symmetric effects of positive and
    negative shocks.
    In the example of a GARCH(1,1) model, the residual process         is


    where     is i.i.d. and
    GJR-GARCH
Similar to QGARCH, the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model by Glosten, Jagannathan
and Runkle (1993) also models asymmetry in the ARCH process. The suggestion is to model $\epsilon_t = \sigma_t z_t$ where $z_t$
is i.i.d., and


    where                if              , and               if             .


    TGARCH model
    The Threshold GARCH (TGARCH) model by Zakoian (1994) is similar to GJR GARCH, and the specification is
    one on conditional standard deviation instead of conditional variance:


    where                      if                , and               if              . Likewise,                        if              , and
                if              .


    fGARCH
    Hentschel's fGARCH model,[5] also known as Family GARCH, is an omnibus model that nests a variety of other
    popular symmetric and asymmetric GARCH models including APARCH, GJR, AVGARCH, NGARCH, etc.


    References
[1] Robert F. Engle (1982). "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation" (http://www.jstor.org/stable/1912773). Econometrica, Vol. 50, No. 4 (Jul., 1982), pp. 987-1007. Published by: The Econometric Society.
[2] Engle, R.F.; Ng, V.K. (1991). "Measuring and testing the impact of news on volatility" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=262096). Journal of Finance 48 (5): 1749–1778.
[3] Posedel, Petra (2006). "Analysis Of The Exchange Rate And Pricing Foreign Currency Options On The Croatian Market: The Ngarch Model As An Alternative To The Black Scholes Model" (http://www.ijf.hr/eng/FTP/2006/4/posedel.pdf). Financial Theory and Practice 30 (4): 347–368.
[4] St. Pierre, Eilleen F. (1998). "Estimating EGARCH-M Models: Science or Art", The Quarterly Review of Economics and Finance, Vol. 38, No. 2, pp. 167-180 (http://dx.doi.org/10.1016/S1062-9769(99)80110-0).
[5] Hentschel, Ludger (1995). "All in the Family: Nesting Symmetric and Asymmetric GARCH Models" (http://www.personal.anderson.ucla.edu/rossen.valkanov/hentschel_1995.pdf), Journal of Financial Economics, Volume 39, Issue 1, Pages 71-104.

    • Bollerslev, Tim (1986). "Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics,
      31:307-327
    • Bollerslev, Tim (2008). Glossary to ARCH (GARCH) (ftp://ftp.econ.au.dk/creates/rp/08/rp08_49.pdf),
      working paper
    • Enders, W. (1995). Applied Econometrics Time Series, John-Wiley & Sons, 139-149, ISBN 0-471-11163-5
    • Engle, Robert F. (1982). "Autoregressive Conditional Heteroscedasticity with Estimates of Variance of United
      Kingdom Inflation", Econometrica 50:987-1008. (the paper which sparked the general interest in ARCH models)
    • Engle, Robert F. (2001). "GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics", Journal
      of Economic Perspectives 15(4):157-168. (a short, readable introduction) Preprint (http://pages.stern.nyu.edu/
      ~rengle/Garch101.doc)
    • Engle, R.F. (1995) ARCH: selected readings. Oxford University Press. ISBN 0-19-877432-X
    • Gujarati, D. N. (2003) Basic Econometrics, 856-862
    • Hacker, R. S. and Hatemi-J, A. (2005). A Test for Multivariate ARCH Effects (http://ideas.repec.org/a/taf/
      apeclt/v12y2005i7p411-417.html), Applied Economics Letters, 12(7), 411–417.
    • Nelson, D. B. (1991). "Conditional heteroskedasticity in asset returns: A new approach", Econometrica 59:
      347-370.



    Autoregressive integrated moving average
    In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving
    average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. These models
    are fitted to time series data either to better understand the data or to predict future points in the series (forecasting).
    They are applied in some cases where data show evidence of non-stationarity, where an initial differencing step
    (corresponding to the "integrated" part of the model) can be applied to remove the non-stationarity.
    The model is generally referred to as an ARIMA(p,d,q) model where p, d, and q are non-negative integers that refer
    to the order of the autoregressive, integrated, and moving average parts of the model respectively. ARIMA models
    form an important part of the Box-Jenkins approach to time-series modelling.
    When one of the three terms is zero, it's usual to drop "AR", "I" or "MA". For example, ARIMA(0,1,0) is I(1), and
    ARIMA(0,0,1) is MA(1).


    Definition
Given a time series of data $X_t$, where $t$ is an integer index and the $X_t$ are real numbers, an ARMA(p,q)
model is given by:

    $\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \epsilon_t ,$

where $L$ is the lag operator, the $\phi_i$ are the parameters of the autoregressive part of the model, the $\theta_i$ are the
parameters of the moving average part and the $\epsilon_t$ are error terms. The error terms are generally assumed to be
independent, identically distributed variables sampled from a normal distribution with zero mean.

Assume now that the polynomial $\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)$ has a unit root of multiplicity d. Then it can be rewritten as:

    $\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) = \left(1 - \sum_{i=1}^{p-d} \varphi_i L^i\right) (1 - L)^d .$

An ARIMA(p,d,q) process expresses this polynomial factorisation property, and is given by:

    $\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) (1 - L)^d X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \epsilon_t ,$

and thus can be thought of as a particular case of an ARMA(p+d,q) process having an autoregressive polynomial with
d roots equal to unity. For this reason, no ARIMA model with d > 0 is wide-sense stationary.
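To make the factorisation concrete, an ARIMA(1,1,1) path can be simulated by generating the stationary ARMA(1,1) part and integrating it once (an illustrative sketch; the parameter values are arbitrary choices, not from the article):

```python
import numpy as np

def simulate_arima_111(n, phi=0.5, theta=0.3, seed=0):
    """Simulate ARIMA(1,1,1): build the stationary ARMA(1,1) part Y_t and integrate it once."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.zeros(n)                           # Y_t = (1 - L) X_t
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t] + theta * eps[t - 1]
    return np.cumsum(y)                       # undo the differencing: X_t = Y_1 + ... + Y_t

x = simulate_arima_111(500)                   # non-stationary path; np.diff(x) is ARMA(1,1)
```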


    Other special forms
The explicit identification of the factorisation of the autoregression polynomial into factors as above can be
extended to other cases, firstly to apply to the moving average polynomial and secondly to include other special
factors. For example, having a factor $(1 - L^s)$ in a model is one way of including a non-stationary seasonality of
period s into the model. Another example is the factor $\left(1 + L + L^2 + \cdots + L^{11}\right)$, which includes a (non-stationary)
seasonality of period 12. The effect of the first type of factor is to allow each season's value to drift separately over
time, whereas with the second type values for adjacent seasons move together.
    Identification and specification of appropriate factors in an ARIMA model can be an important step in modelling as
    it can allow a reduction in the overall number of parameters to be estimated, while allowing the imposition on the
    model of types of behaviour that logic and experience suggest should be there.


    Forecasts using ARIMA models
    ARIMA models are used for observable non-stationary processes            that have some clearly identifiable trends:
    • a constant trend (i.e. zero average) is modeled by
    • a linear trend (i.e. linear growth behavior) is modeled by
    • a quadratic trend (i.e. quadratic growth behavior) is modeled by
In these cases the ARIMA model can be viewed as a "cascade" of two models. The first is non-stationary:

    $Y_t = (1 - L)^d X_t ,$

while the second is wide-sense stationary:

    $\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) Y_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \epsilon_t .$

Now standard forecasting techniques can be formulated for the process $Y_t$, and then (given a sufficient number of
initial conditions) $X_t$ can be forecast via suitable integration steps.
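A sketch of this cascade using statsmodels (assumed available; it is not one of the packages listed later in the article): differencing is handled internally through d = 1, and the forecasts are integrated back to the level of the original series.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA   # statsmodels is assumed to be available

rng = np.random.default_rng(3)
x = np.cumsum(0.05 + rng.standard_normal(300))  # an integrated (non-stationary) series

# d = 1 is the non-stationary step of the cascade (differencing);
# the ARMA(1,1) part models the resulting wide-sense stationary series
result = ARIMA(x, order=(1, 1, 1)).fit()
forecast = result.forecast(steps=10)            # forecasts are integrated back to the level of x
```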


    Examples
Some well-known special cases arise naturally. For example, an ARIMA(0,1,0) model is given by:

    $X_t = X_{t-1} + \epsilon_t ,$

which is simply a random walk.
A number of variations on the ARIMA model are commonly used. For example, if multiple time series are used then
the $X_t$ can be thought of as vectors and a VARIMA model may be appropriate. Sometimes a seasonal effect is
suspected in the model. For example, consider a model of daily road traffic volumes. Weekends clearly exhibit
different behaviour from weekdays. In this case it is often considered better to use a SARIMA (seasonal ARIMA)
model than to increase the order of the AR or MA parts of the model. If the time series is suspected to exhibit
long-range dependence, then the d parameter may be replaced by certain non-integer values in an autoregressive
fractionally integrated moving average model, which is also called a Fractional ARIMA (FARIMA or ARFIMA)
model.


    Implementations in statistics packages
    Various packages that apply methodology like Box-Jenkins parameter optimization are available to find the right
    parameters for the ARIMA model.
• In R, the standard stats package includes an arima function, which is documented in "ARIMA Modelling of Time
  Series" [6]. Besides the ARIMA(p,d,q) part, the function also includes seasonal factors, an intercept term, and
  exogenous variables (xreg, called "external regressors"). The CRAN task view on Time Series [9] is the reference
  with many more links.
    • The "forecast" [1] package in R can automatically select an ARIMA model for a given time series with the
      auto.arima() function. The package can also simulate seasonal and non-seasonal ARIMA models with its
      simulate.Arima() function. It also has a function Arima(), which is a wrapper for the arima from the "stats"
      package.
    • SAS(R) of "SAS Institute Inc." [2] includes extensive ARIMA processing in its Econometric and Time Series
      Analysis system: SAS/ETS.
    • Stata includes ARIMA modelling (using its arima command) as of Stata 9.
    • SAP(R) "SAP" [3] allows creating models like ARIMA by using native, predictive algorithms and by employing
      algorithms from R.


    References
    • Mills, Terence C. (1990) Time Series Techniques for Economists. Cambridge University Press
    • Percival, Donald B. and Andrew T. Walden. (1993) Spectral Analysis for Physical Applications. Cambridge
      University Press.


    External links
    • The US Census Bureau uses ARIMA for "seasonally adjusted" data (programs, docs, and papers here) [4]


    References
[1] http://cran.r-project.org/web/packages/forecast/index.html
[2] http://www.sas.com
[3] http://www.sap.com/solutions/analytics/business-intelligence/predictive-analysis/index.epx
[4] http://www.census.gov/srd/www/x12a/




    Volatility (finance)
In finance, volatility is a measure of the variation of the price of a financial instrument over time. Historic volatility is
derived from time series of past market prices. An implied volatility is derived from the market price of a market-traded
derivative (in particular, an option). The symbol σ is used for volatility and corresponds to the standard deviation; it
should not be confused with the variance, which is its square, σ².


    Volatility terminology
    Volatility as described here refers to the actual current volatility of a financial instrument for a specified period (for
    example 30 days or 90 days). It is the volatility of a financial instrument based on historical prices over the specified
    period with the last observation the most recent price. This phrase is used particularly when it is wished to
    distinguish between the actual current volatility of an instrument and
    • actual historical volatility which refers to the volatility of a financial instrument over a specified period but with
      the last observation on a date in the past
    • actual future volatility which refers to the volatility of a financial instrument over a specified period starting at
      the current time and ending at a future date (normally the expiry date of an option)
    • historical implied volatility which refers to the implied volatility observed from historical prices of the financial
      instrument (normally options)
    • current implied volatility which refers to the implied volatility observed from current prices of the financial
      instrument
    • future implied volatility which refers to the implied volatility observed from future prices of the financial
      instrument
    For a financial instrument whose price follows a Gaussian random walk, or Wiener process, the width of the
    distribution increases as time increases. This is because there is an increasing probability that the instrument's price
    will be farther away from the initial price as time increases. However, rather than increase linearly, the volatility
    increases with the square-root of time as time increases, because some fluctuations are expected to cancel each other
    out, so the most likely deviation after twice the time will not be twice the distance from zero.
     Since observed price changes do not follow Gaussian distributions, others such as the Lévy distribution are often
     used.[1] These can capture attributes such as "fat tails".


     Volatility and Liquidity
     Much research has been devoted to modeling and forecasting the volatility of financial returns, and yet few
     theoretical models explain how volatility comes to exist in the first place.
Roll (1984) shows that volatility is affected by market microstructure.[2] Glosten and Milgrom (1985) show that at
     least one source of volatility can be explained by the liquidity provision process. When market makers infer the
     possibility of adverse selection, they adjust their trading ranges, which in turn increases the band of price
     oscillation.[3]


     Volatility for investors
Investors care about volatility for five reasons:
1. The wider the swings in an investment's price, the harder emotionally it is to not worry.
2. When certain cash flows from selling a security are needed at a specific future date, higher volatility means a greater chance of a shortfall.
3. Higher volatility of returns while saving for retirement results in a wider distribution of possible final portfolio values.
4. Higher volatility of return when retired gives withdrawals a larger permanent impact on the portfolio's value.
5. Price volatility presents opportunities to buy assets cheaply and sell when overpriced.[4]
     In today's markets, it is also possible to trade volatility directly, through the use of derivative securities such as
     options and variance swaps. See Volatility arbitrage.


     Volatility versus direction
     Volatility does not measure the direction of price changes, merely their dispersion. This is because when calculating
     standard deviation (or variance), all differences are squared, so that negative and positive differences are combined
     into one quantity. Two instruments with different volatilities may have the same expected return, but the instrument
     with higher volatility will have larger swings in values over a given period of time.
     For example, a lower volatility stock may have an expected (average) return of 7%, with annual volatility of 5%.
     This would indicate returns from approximately negative 3% to positive 17% most of the time (19 times out of 20, or
     95% via a two standard deviation rule). A higher volatility stock, with the same expected return of 7% but with
     annual volatility of 20%, would indicate returns from approximately negative 33% to positive 47% most of the time
     (19 times out of 20, or 95%). These estimates assume a normal distribution; in reality stocks are found to be
     leptokurtotic.


     Volatility over time
     Although the Black Scholes equation assumes predictable constant volatility, this is not observed in real markets,
     and amongst the models are Bruno Dupire's Local Volatility, Poisson Process where volatility jumps to new levels
     with a predictable frequency, and the increasingly popular Heston model of Stochastic Volatility.[5]
     It is common knowledge that types of assets experience periods of high and low volatility. That is, during some
     periods prices go up and down quickly, while during other times they barely move at all.
     Periods when prices fall quickly (a crash) are often followed by prices going down even more, or going up by an
     unusual amount. Also, a time when prices rise quickly (a possible bubble) may often be followed by prices going up
     even more, or going down by an unusual amount.
     The converse behavior, 'doldrums', can last for a long time as well.
     Most typically, extreme movements do not appear 'out of nowhere'; they are presaged by larger movements than
     usual. This is termed autoregressive conditional heteroskedasticity. Of course, whether such large movements have
     the same direction, or the opposite, is more difficult to say. And an increase in volatility does not always presage a
     further increase—the volatility may simply go back down again.


     Mathematical definition
The annualized volatility σ is the standard deviation of the instrument's yearly logarithmic returns.[6]
The generalized volatility σT for time horizon T in years is expressed as:

    $\sigma_T = \sigma \sqrt{T} .$

Therefore, if the daily logarithmic returns of a stock have a standard deviation of σSD and the time period of returns
is P, the annualized volatility is

    $\sigma = \frac{\sigma_{SD}}{\sqrt{P}} .$

A common assumption is that P = 1/252 (there are 252 trading days in any given year). Then, if σSD = 0.01, the
annualized volatility is

    $\sigma = \frac{0.01}{\sqrt{1/252}} = 0.01 \sqrt{252} \approx 0.1587 ,$

or about 15.87%. The monthly volatility (i.e., T = 1/12 of a year) would be

    $\sigma_{\text{month}} = 0.1587 \sqrt{1/12} \approx 0.0458 ,$

or about 4.6%.
The formulas used above to convert returns or volatility measures from one time period to another assume a particular
underlying model or process. These formulas are accurate extrapolations of a random walk, or Wiener process,
whose steps have finite variance. However, more generally, for natural stochastic processes, the precise relationship
between volatility measures for different time periods is more complicated. Some use the Lévy stability exponent α
to extrapolate natural processes:

    $\sigma_T = T^{1/\alpha} \sigma .$

If α = 2 you get the Wiener process scaling relation, but some people believe α < 2 for financial activities such as
stocks, indexes and so on. This was discovered by Benoît Mandelbrot, who looked at cotton prices and found that
they followed a Lévy alpha-stable distribution with α = 1.7. (See New Scientist, 19 April 1997.)
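A small sketch of the σ = σSD/√P conversion applied to a price series (illustrative; the helper name and the 252-day convention follow the text above and are not part of the article):

```python
import numpy as np

def annualized_volatility(prices, periods_per_year=252):
    """Annualize the standard deviation of logarithmic returns: sigma = sigma_SD / sqrt(P)."""
    log_returns = np.diff(np.log(np.asarray(prices, dtype=float)))
    sigma_period = log_returns.std(ddof=1)
    return sigma_period * np.sqrt(periods_per_year)   # same as dividing by sqrt(1/252)

prices = [100.0, 101.2, 100.7, 102.3, 101.9, 103.0]   # illustrative price series
sigma_annual = annualized_volatility(prices)
```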


     Crude volatility estimation
     Using a simplification of the formulae above it is possible to estimate annualized volatility based solely on
     approximate observations. Suppose you notice that a market price index, which has a current value near 10,000, has
     moved about 100 points a day, on average, for many days. This would constitute a 1% daily movement, up or down.
     To annualize this, you can use the "rule of 16", that is, multiply by 16 to get 16% as the annual volatility. The
     rationale for this is that 16 is the square root of 256, which is approximately the number of trading days in a year
     (252). This also uses the fact that the standard deviation of the sum of n independent variables (with equal standard
     deviations) is √n times the standard deviation of the individual variables.
     Of course, the average magnitude of the observations is merely an approximation of the standard deviation of the
     market index. Assuming that the market index daily changes are normally distributed with mean zero and standard
     deviation σ, the expected value of the magnitude of the observations is √(2/π)σ = 0.798σ. The net effect is that this
     crude approach underestimates the true volatility by about 20%.
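The arithmetic of the rule of 16 and the √(2/π) correction can be written out directly (an illustrative sketch using the numbers from the example above):

```python
import math

index_level = 10_000
avg_daily_move = 100
daily_move = avg_daily_move / index_level            # about 1% per day

annual_vol_rule_of_16 = daily_move * 16              # sqrt(256) ~ sqrt(252) trading days
correction = 1 / math.sqrt(2 / math.pi)              # mean |X| of a normal is ~0.798 sigma
annual_vol_corrected = annual_vol_rule_of_16 * correction   # roughly 0.20 after removing the ~20% bias
```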


     Estimate of compound annual growth rate (CAGR)
     Consider the Taylor series:


     Taking only the first two terms one has:


     Realistically, most financial assets have negative skewness and leptokurtosis, so this formula tends to be
     over-optimistic. Some people use the formula:


     for a rough estimate, where k is an empirical factor (typically five to ten).


     Criticisms
Despite their sophisticated composition, critics claim the predictive power of most volatility forecasting models is similar to that of plain-vanilla measures, such as simple past volatility.[7][8] Other works have agreed, but claim critics failed to correctly implement the more complicated models.[9] Some practitioners and portfolio managers seem to completely ignore or dismiss volatility forecasting models. For example, Nassim Taleb famously titled one of his Journal of Portfolio Management papers "We Don't Quite Know What We are Talking About When We Talk About Volatility".[10] In a similar note, Emanuel Derman expressed his disillusion with the enormous supply of empirical models unsupported by theory.[11] He argues that, while "theories are attempts to uncover the hidden principles underpinning the world around us, as Albert Einstein did with his theory of relativity", we should remember that "models are metaphors -- analogies that describe one thing relative to another".

[Figure: Performance of VIX (left) compared to past volatility (right) as 30-day volatility predictors, for the period of Jan 1990-Sep 2009. Volatility is measured as the standard deviation of S&P500 one-day returns over a month's period. The blue lines indicate linear regressions, resulting in the correlation coefficients r shown. Note that VIX has virtually the same predictive power as past volatility, insofar as the shown correlation coefficients are nearly identical.]
     References
[1] http://www.wilmottwiki.com/wiki/index.php/Levy_distribution
[2] Roll, R. (1984). "A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market", Journal of Finance 39 (4), 1127-1139.
[3] Glosten, L. R. and P. R. Milgrom (1985). "Bid, Ask and Transaction Prices in a Specialist Market with Heterogeneously Informed Traders", Journal of Financial Economics 14 (1), 71-100.
[4] Investment Risks - Price Volatility (http://www.retailinvestor.org/risk.html#volatility)
[5] http://www.wilmottwiki.com/wiki/index.php?title=Volatility
[6] "Calculating Historical Volatility: Step-by-Step Example" (http://www.lfrankcabrera.com/calc-hist-vol.pdf). 2011-07-14. Retrieved 2011-07-15.
[7] Cumby, R.; Figlewski, S.; Hasbrouck, J. (1993). "Forecasting Volatility and Correlations with EGARCH models". Journal of Derivatives 1 (2): 51–63. doi:10.3905/jod.1993.407877.
[8] Jorion, P. (1995). "Predicting Volatility in Foreign Exchange Market". Journal of Finance 50 (2): 507–528. JSTOR 2329417.
[9] Andersen, Torben G.; Bollerslev, Tim (1998). "Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts". International Economic Review 39 (4): 885–905. JSTOR 2527343.
[10] Goldstein, Daniel and Taleb, Nassim (March 28, 2007). "We Don't Quite Know What We are Talking About When We Talk About Volatility" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480). Journal of Portfolio Management 33 (4), 2007.
[11] Derman, Emanuel (2011). Models.Behaving.Badly: Why Confusing Illusion With Reality Can Lead to Disaster, on Wall Street and in Life. Free Press.



     External links
     • Graphical Comparison of Implied and Historical Volatility (http://training.thomsonreuters.com/video/v.
       php?v=273), video
     • An introduction to volatility and how it can be calculated in excel, by Dr A. A. Kotzé (http://quantonline.co.za/
       Articles/article_volatility.htm)
     • Interactive Java Applet " What is Historic Volatility? (http://www.frog-numerics.com/ifs/ifs_LevelA/
       HistVolaBasic.html)"
     • Diebold, Francis X.; Hickman, Andrew; Inoue, Atsushi & Schuermannm, Til (1996) "Converting 1-Day
       Volatility to h-Day Volatility: Scaling by sqrt(h) is Worse than You Think" (http://citeseer.ist.psu.edu/244698.
       html)
     • A short introduction to alternative mathematical concepts of volatility (http://staff.science.uva.nl/~marvisse/
       volatility.html)
     • Volatility estimation from predicted return density (http://www.macroaxis.com/invest/market/
       GOOG--symbolVolatility) Example based on Google daily return distribution using standard density function
     • Research paper including excerpt from report entitled Identifying Rich and Cheap Volatility (http://www.
       iijournals.com/doi/abs/10.3905/JOT.2010.5.2.035) Excerpt from Enhanced Call Overwriting, a report by
       Ryan Renicker and Devapriya Mallick at Lehman Brothers (2005).

     Stable distribution
[Infobox figures: probability density functions of symmetric α-stable distributions and of skewed centered stable distributions with unit scale factor; cumulative distribution functions of symmetric α-stable distributions and of skewed centered stable distributions.]

Parameters: α ∈ (0, 2] — stability parameter; β ∈ [−1, 1] — skewness parameter (note that skewness is undefined); c ∈ (0, ∞) — scale parameter; μ ∈ (−∞, ∞) — location parameter
Support: x ∈ R, or x ∈ [μ, +∞) if α < 1 and β = 1, or x ∈ (−∞, μ] if α < 1 and β = −1
PDF: not analytically expressible, except for some parameter values
CDF: not analytically expressible, except for certain parameter values
Mean: μ when α > 1, otherwise undefined
Median: μ when β = 0, otherwise not analytically expressible
Mode: μ when β = 0, otherwise not analytically expressible
Variance: 2c² when α = 2, otherwise infinite
Skewness: 0 when α = 2, otherwise undefined
Ex. kurtosis: 0 when α = 2, otherwise undefined
Entropy: not analytically expressible, except for certain parameter values
MGF: undefined
CF: $\varphi(t) = \exp\!\left[\, i t \mu - |c t|^{\alpha}\left(1 - i\beta \operatorname{sgn}(t)\,\Phi\right)\right]$, where $\Phi = \tan(\pi\alpha/2)$ for α ≠ 1 and $\Phi = -\tfrac{2}{\pi}\log|t|$ for α = 1
     In probability theory, a random variable is said to be stable (or to have a stable distribution) if it has the property
     that a linear combination of two independent copies of the variable has the same distribution, up to location and scale
     parameters. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution.
     The importance of stable probability distributions is that they are "attractors" for properly normed sums of
     independent and identically-distributed (iid) random variables. The normal distribution is one family of stable
     distributions. By the classical central limit theorem the properly normed sum of a set of random variables, each with
     finite variance, will tend towards a normal distribution as the number of variables increases. Without the finite
     variance assumption the limit may be a stable distribution. Stable distributions that are non-normal are often called
     stable Paretian distributions, after Vilfredo Pareto.
     Umarov, Tsallis, Gell-Mann and Steinberg have defined q-analogs of all symmetric stable distributions which
     recover the usual symmetric stable distributions in the limit of q → 1.[1]


     Definition
     A non-degenerate distribution is a stable distribution if it satisfies the following property:
           Let X1 and X2 be independent copies of a random variable X. Then X is said to be stable if for any constants a
           > 0 and b > 0 the random variable aX1 + bX2 has the same distribution as cX + d for some constants c > 0 and
           d. The distribution is said to be strictly stable if this holds with d = 0 (Nolan 2009).
     Since the normal distribution, the Cauchy distribution, and the Lévy distribution all have the above property, it
     follows that they are special cases of stable distributions.
     Such distributions form a four-parameter family of continuous probability distributions parametrized by location and
     scale parameters μ and c, respectively, and two shape parameters β and α, roughly corresponding to measures of
     asymmetry and concentration, respectively (see the figures).
Although the probability density function for a general stable distribution cannot be written analytically, the general
characteristic function can be. Any probability distribution is determined by its characteristic function φ(t) by:

    $f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi(t)\, e^{-ixt}\,dt .$

A random variable X is called stable if its characteristic function can be written as (see Nolan (2009) and Voit (2003, § 5.4.3))

    $\varphi(t; \mu, c, \alpha, \beta) = \exp\!\left[\, i t \mu - |c t|^{\alpha}\left(1 - i\beta \operatorname{sgn}(t)\,\Phi\right)\right] ,$

where sgn(t) is just the sign of t and Φ is given by

    $\Phi = \tan\!\left(\frac{\pi\alpha}{2}\right)$

for all α except α = 1, in which case:

    $\Phi = -\frac{2}{\pi}\log|t| .$
     μ ∈ R is a shift parameter, β ∈ [−1, 1], called the skewness parameter, is a measure of asymmetry. Notice that in this
     context the usual skewness is not well defined, as for α < 2 the distribution does not admit 2nd or higher moments,
     and the usual skewness definition is the 3rd central moment.
     In the simplest case β = 0, the characteristic function is just a stretched exponential function; the distribution is
     symmetric about μ and is referred to as a (Lévy) symmetric alpha-stable distribution, often abbreviated SαS.
     When α < 1 and β = 1, the distribution is supported by [μ, ∞).
     The parameter |c| > 0 is a scale factor which is a measure of the width of the distribution and α is the exponent or
     index of the distribution and specifies the asymptotic behavior of the distribution for α < 2.
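For numerical work with this four-parameter family one typically relies on library routines, since the density has no closed form. The sketch below assumes SciPy's levy_stable distribution (SciPy is not mentioned in the article) and uses α = 1.7, the index cited later in connection with Mandelbrot's cotton-price fits.

```python
import numpy as np
from scipy.stats import levy_stable   # SciPy is assumed to be available

alpha, beta = 1.7, 0.0                # symmetric case; 1.7 is the index cited for cotton prices
samples = levy_stable.rvs(alpha, beta, loc=0.0, scale=1.0, size=10_000, random_state=0)

# the special cases recover familiar densities
x = np.linspace(-5, 5, 11)
gaussian_case = levy_stable.pdf(x, 2.0, 0.0)   # alpha = 2: normal with variance 2*c^2
cauchy_case = levy_stable.pdf(x, 1.0, 0.0)     # alpha = 1, beta = 0: Cauchy
```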
     Parameterizations
     The above definition is only one of the parameterizations in use for stable distributions; it is the most common but is
     not continuous in the parameters. For example, for the case α = 1 we could replace Φ by: (Nolan 2009)




     and μ by



     This parameterization has the advantage that we may define a standard distribution using



     and




     The pdf for all α will then have the following standardization property:




     Applications
     Stable distributions owe their importance in both theory and practice to the generalization of the central limit
     theorem to random variables without second (and possibly first) order moments and the accompanying
     self-similarity of the stable family. It was the seeming departure from normality along with the demand for a
     self-similar model for financial data (i.e. the shape of the distribution for yearly asset price changes should resemble
     that of the constituent daily or monthly price changes) that led Benoît Mandelbrot to propose that cotton prices
     follow an alpha-stable distribution with α equal to 1.7. Lévy distributions are frequently found in analysis of critical
     behavior and financial data (Voit 2003, § 5.4.3).
     They are also found in spectroscopy as a general expression for a quasistatically pressure-broadened spectral line
     (Peach 1981, § 4.5).
     The statistics of solar flares are described by a non-Gaussian distribution. The solar flare statistics were shown to be
     describable by a Lévy distribution and it was assumed that intermittent solar flares perturb the intrinsic fluctuations
     in Earth’s average temperature. The end result of this perturbation is that the statistics of the temperature anomalies
     inherit the statistical structure that was evident in the intermittency of the solar flare data. [2]
     Lévy distribution of solar flare waiting time events (time between flare events) was demonstrated for CGRO BATSE
     hard x-ray solar flares December 2001. Analysis of the Lévy statistical signature revealed that two different memory
     signatures were evident; one related to the solar cycle and the second whose origin appears to be associated with a
     localized or combination of localized solar active region effects. [3]
     Properties
     • All stable distributions are infinitely divisible.
     • With the exception of the normal distribution (α = 2), stable distributions are leptokurtotic and heavy-tailed
       distributions.
• Closure under convolution

Stable distributions are closed under convolution for a fixed value of α. Since convolution is equivalent to
multiplication of the Fourier-transformed function, it follows that the product of two stable characteristic functions
with the same α will yield another such characteristic function. The product of two stable characteristic functions is
given by:

    $\varphi_1(t)\,\varphi_2(t) = \exp\!\left[\, i t (\mu_1 + \mu_2) - |c_1 t|^{\alpha}\left(1 - i\beta_1 \operatorname{sgn}(t)\,\Phi\right) - |c_2 t|^{\alpha}\left(1 - i\beta_2 \operatorname{sgn}(t)\,\Phi\right)\right] .$

Since Φ is not a function of the μ, c or β variables, it follows that these parameters for the convolved function are
given by:

    $\mu = \mu_1 + \mu_2 , \qquad c = \left(c_1^{\alpha} + c_2^{\alpha}\right)^{1/\alpha} , \qquad \beta = \frac{\beta_1 c_1^{\alpha} + \beta_2 c_2^{\alpha}}{c_1^{\alpha} + c_2^{\alpha}} .$

In each case, it can be shown that the resulting parameters lie within the required intervals for a stable distribution.


     The distribution
     A stable distribution is therefore specified by the above four parameters. It can be shown that any non-degenerate
     stable distribution has a smooth (infinitely differentiable) density function.(Nolan 2009, Theorem 1.9) If
                        denotes the density of X and Y is the sum of independent copies of X:




     then Y has the density                           with




     The asymptotic behavior is described, for α< 2, by: (Nolan 2009, Theorem 1.12)




     where Γ is the Gamma function (except that when α < 1 and β = ±1, the tail vanishes to the left or right, resp., of μ).
     This "heavy tail" behavior causes the variance of stable distributions to be infinite for all α < 2. This property is
     illustrated in the log-log plots below.
     When α = 2, the distribution is Gaussian (see below), with tails asymptotic to exp(−x2/4c2)/(2c√π).
     Special cases
There is no general analytic solution for the form of p(x). There are, however, three special cases which can be expressed in terms of elementary functions, as can be seen by inspection of the characteristic function:
• For α = 2 the distribution reduces to a Gaussian distribution with variance σ² = 2c² and mean μ; the skewness parameter β has no effect (Nolan 2009) (Voit 2003, § 5.4.3).
• For α = 1 and β = 0 the distribution reduces to a Cauchy distribution with scale parameter c and shift parameter μ (Voit 2003, § 5.4.3) (Nolan 2009).
• For α = 1/2 and β = 1 the distribution reduces to a Lévy distribution with scale parameter c and shift parameter μ (Peach 1981, § 4.5) (Nolan 2009).
[Figure: log-log plot of symmetric centered stable distribution PDFs showing the power-law behavior for large x. The power-law behavior is evidenced by the straight-line appearance of the PDF for large x, with slope equal to −(α+1); the only exception is α = 2, which is a normal distribution.]
[Figure: log-log plot of skewed centered stable distribution PDFs showing the power-law behavior for large x. Again the slope of the linear portions is equal to −(α+1).]
Note that the above three distributions are also connected, in the following way: a standard Cauchy random variable can be viewed as a mixture of Gaussian random variables (all with mean zero), with the variance being drawn from a standard Lévy distribution. And in fact this is a special case of a more general theorem which allows any symmetric alpha-stable distribution to be viewed in this way (with the alpha parameter of the mixture distribution equal to twice the alpha parameter of the mixing distribution, and the beta parameter of the mixing distribution always equal to one).
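This Gaussian–Lévy mixture statement is easy to verify by simulation. The sketch below (an added illustration, assuming SciPy's levy and cauchy distributions) draws a variance from a standard Lévy distribution, then a zero-mean Gaussian with that variance, and compares the quartiles of the result with those of a standard Cauchy.

import numpy as np
from scipy.stats import levy, cauchy

rng = np.random.default_rng(1)
n = 500_000

v = levy.rvs(size=n, random_state=rng)      # variances drawn from a standard Lévy law
x = rng.normal(0.0, np.sqrt(v))             # N(0, V) draws given each variance V

# Quartiles of the mixture should match the standard Cauchy quartiles (-1, 0, 1).
print(np.round(np.quantile(x, [0.25, 0.5, 0.75]), 3))
print(cauchy.ppf([0.25, 0.5, 0.75]))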

     A general closed form expression for stable PDF's with rational values of α has been given by Zolotarev[4] in terms
     of Meijer G-functions. For simple rational numbers, the closed form expression is often in terms of less complicated
     special functions. Lee(Lee 2010, § 2.4) has listed a number of closed form expressions having rather simple
     expressions in terms of special functions. In the table below, PDF's expressible by elementary functions are indicated
     by an E and those given by Lee that are expressible by special functions are indicated by an s.


                                                                     α

                                                        1/3 1/2 2/3 1 4/3 3/2 2

                                                  β=0   s   s    s   E   s   s   E

                                                  β=1   s   E    s   s       s


     Some of the special cases are known by particular names:
     • For α = 1 and β = 1, the distribution is a Landau distribution which has a specific usage in physics under this
       name.
     • For α = 3/2 and β = 0 the distribution reduces to a Holtsmark distribution with scale parameter c and shift
       parameter μ.(Lee 2010, § 2.4)
     Also, in the limit as c approaches zero or as α approaches zero the distribution will approach a Dirac delta function
     δ(x−μ).


     A generalized central limit theorem
     Another important property of stable distributions is the role that they play in a generalized central limit theorem.
     The central limit theorem states that the sum of a number of independent and identically distributed (i.i.d.) random
     variables with finite variances will tend to a normal distribution as the number of variables grows. A generalization
     due to Gnedenko and Kolmogorov states that the sum of a number of random variables with power-law tail
distributions decreasing as |x|^(−α−1) where 0 < α < 2 (and therefore having infinite variance) will tend to a stable
distribution as the number of variables grows (Voit 2003, § 5.4.3).
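A small simulation makes the statement concrete. In the sketch below (illustrative only; the Pareto summands, the tail index and the centering/scaling constants are choices made for this example rather than anything prescribed by the text), normalized sums of infinite-variance Pareto variables keep a heavy right tail whose weight stabilizes as the number of terms grows, consistent with convergence to a non-Gaussian stable limit.

import numpy as np

rng = np.random.default_rng(2)
alpha = 1.5                # tail exponent of the summands (infinite variance)
n_sums = 10_000            # number of independent sums per experiment

def normalized_sums(n_terms):
    # Classical Pareto(alpha) samples with minimum 1: P(X > x) = x**(-alpha).
    x = rng.pareto(alpha, size=(n_sums, n_terms)) + 1.0
    s = x.sum(axis=1)
    mean = alpha / (alpha - 1.0)                 # E[X], finite because alpha > 1
    return (s - n_terms * mean) / n_terms ** (1.0 / alpha)

for n in (10, 100, 1000):
    z = normalized_sums(n)
    # Tail weight of the normalized sum; it stabilizes near 5**(-alpha) (about 0.09)
    # rather than vanishing, consistent with a heavy-tailed stable limit.
    print(n, round((z > 5.0).mean(), 4))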


     Series representation
     The stable distribution can be restated as the real part of a simpler integral:(Peach 1981, § 4.5)



     Expressing the second exponential as a Taylor series, we have:




     where                        . Reversing the order of integration and summation, and carrying out the integration
     yields:




     which will be valid for x ≠ μ and will converge for appropriate values of the parameters. (Note that the n = 0 term
     which yields a delta function in x−μ has therefore been dropped.) Expressing the first exponential as a series will
     yield another series in positive powers of x−μ which is generally less useful.


     References
     [1] Umarov, Sabir; Tsallis, Constantino, Gell-Mann, Murray and Steinberg, Stanly (2010). "Generalization of symmetric α-stable Lévy
         distributions for q>1". J Math Phys. (American Institute of Physics) 51 (3). arXiv:0911.2009. Bibcode 2010JMP....51c3502U.
         doi:10.1063/1.3305292. PMC 2869267. PMID 20596232.
[2] Scafetta, N., Bruce, J.W., Is climate sensitive to solar variability? Physics Today, 60, 50-51 (2008) (http://www.fel.duke.edu/~scafetta/pdf/opinion0308.pdf).
[3] Leddon, D., A Statistical Study of Hard X-Ray Solar Flares (http://www.library.unt.edu/theses/open/20013/leddon_deborah/dissertation.pdf).
[4] Zolotarev, V.M. (1995). "On Representation of Densities of Stable Laws by Special Functions" (http://epubs.siam.org/tvp/resource/1/tprbau/v39/i2/p354_s1). Theory Probab. Appl. (SIAM) 39 (2): 354–362. Retrieved 2011-08-15.

     • Feller, W. (1971) An Introduction to Probability Theory and Its Applications, Volume 2. Wiley. ISBN
       0-471-25709-5
     • Gnedenko, B. V.; Kolmogorov, A. N. (1954). Limit Distributions for Sums of Independent Random Variables.
       Addison-Wesley.
     • Ibragimov, I.; Linnik, Yu (1971). Independent and Stationary Sequences of Random Variables.
       Wolters-Noordhoff Publishing Groningen, The Netherlands.
     • Lee, W.H. (2010). Continuous and discrete properties of stochastic processes (http://etheses.nottingham.ac.uk/
       1194/) (PhD thesis). The University of Nottingham.
     • Matsui, M.; Takemura, A.. "Some improvements in numerical evaluation of symmetric stable density and its
       derivatives" (http://www.e.u-tokyo.ac.jp/cirje/research/dp/2004/2004cf292.pdf) (PDF). CIRGE Discussion
       paper. Retrieved July 13, 2005.
     • Nolan, John P. (2009). "Stable Distributions: Models for Heavy Tailed Data" (http://academic2.american.edu/
       ~jpnolan/stable/chap1.pdf) (PDF). Retrieved 2009-02-21.
     • Peach, G. (1981). "Theory of the pressure broadening and shift of spectral lines" (http://journalsonline.tandf.co.
       uk/openurl.asp?genre=article&eissn=1460-6976&volume=30&issue=3&spage=367). Advances in Physics 30
       (3): 367–474. Bibcode 1981AdPhy..30..367P. doi:10.1080/00018738100101467.
     • Rachev, S.; Mittnik, S. (2000). "Stable Paretian Models in Finance". Wiley. ISBN 978-0-471-95314-2.
     • Samorodnitsky, G.; Taqqu, M. (1994). "Stable Non-Gaussian Random Processes: Stochastic Models with Infinite
       Variance (Stochastic Modeling Series)". Chapman and Hall/CRC. ISBN 978-0-412-05171-5.
     • Voit, Johannes (2003). The Statistical Mechanics of Financial Markets (Texts and Monographs in Physics).
       Springer-Verlag. ISBN 3-540-00978-7.
     • Zolotarev, V.M. (1986). One-dimensional Stable Distributions. American Mathematical Society.


     External links
     • PlanetMath (http://planetmath.org/encyclopedia/StrictlyStableRandomVariable.html) stable random variable
       article
     • John P. Nolan (http://academic2.american.edu/~jpnolan/stable/stable.html) page on stable distributions
     • stable distributions (http://www.gnu.org/software/gsl/manual/gsl-ref.
       html#The-Levy-alpha_002dStable-Distributions) in GNU Scientific Library — Reference Manual
     • Applications (http://www.mathestate.com/tools/Financial/map/Overview.html) of stable laws in finance.
     • fBasics (http://cran.r-project.org/web/packages/fBasics/index.html) R package with functions to compute
       stable density, distribution function, quantile function and generate random variates.
     • STBL (http://math.bu.edu/people/mveillet/html/alphastablepub.html) MATLAB package which includes
       functions to compute stable densities, CDFs and inverse CDFs. Also can fit stable distributions to data, and
       generate stable random variables.
     • StableDistribution (http://reference.wolfram.com/mathematica/ref/StableDistribution.html) is fully supported
       in Mathematica since version 8.



    Mathematical finance
    Mathematical finance is a field of applied mathematics, concerned with financial markets. The subject has a close
    relationship with the discipline of financial economics, which is concerned with much of the underlying theory.
    Generally, mathematical finance will derive and extend the mathematical or numerical models without necessarily
    establishing a link to financial theory, taking observed market prices as input. Thus, for example, while a financial
    economist might study the structural reasons why a company may have a certain share price, a financial
    mathematician may take the share price as a given, and attempt to use stochastic calculus to obtain the fair value of
    derivatives of the stock (see: Valuation of options; Financial modeling). The fundamental theorem of arbitrage-free
    pricing is one of the key theorems in mathematical finance, while the Black–Scholes equation and formula are
    amongst the key results.
    In terms of practice, mathematical finance also overlaps heavily with the field of computational finance (also known
    as financial engineering). Arguably, these are largely synonymous, although the latter focuses on application, while
the former focuses on modeling and derivation (see: Quantitative analyst), often with the help of stochastic asset models.
    In general, there exist two separate branches of finance that require advanced quantitative techniques: derivatives
    pricing on the one hand, and risk- and portfolio management on the other hand. These are discussed below.
    Many universities around the world now offer degree and research programs in mathematical finance; see Master of
    Mathematical Finance.


    History: Q versus P
    There exist two separate branches of finance that require advanced quantitative techniques: derivatives pricing on the
    one hand, and risk and portfolio management on the other hand. One of the main differences is that they use different
    probabilities, namely the risk-neutral probability, denoted by "Q", and the actual probability, denoted by "P".


    Derivatives pricing: the Q world
    The goal of derivatives pricing is to determine the fair price of a given security in terms of more liquid securities
    whose price is determined by the law of supply and demand. The meaning of "fair" depends, of course, on whether
    one considers buying or selling the security. Examples of securities being priced are plain vanilla and exotic options,
    convertible bonds, etc.
    Once a fair price has been determined, the sell-side trader can make a market on the security. Therefore, derivatives
    pricing is a complex "extrapolation" exercise to define the current market value of a security, which is then used by
    the sell-side community.

                                          Derivatives pricing: the Q world
                                             Goal          "extrapolate the present"

                                             Environment risk-neutral probability

                                             Processes     continuous-time martingales

                                             Dimension     low

                                             Tools         Ito calculus, PDE’s

                                             Challenges    calibration

                                             Business      sell-side


    Quantitative derivatives pricing was initiated by Louis Bachelier in The Theory of Speculation (published 1900),
    with the introduction of the most basic and most influential of processes, the Brownian motion, and its applications


    to the pricing of options. Bachelier modeled the time series of changes in the logarithm of stock prices as a random
    walk in which the short-term changes had a finite variance. This causes longer-term changes to follow a Gaussian
    distribution. Bachelier's work, however, was largely unknown outside academia.
    The theory remained dormant until Fischer Black and Myron Scholes, along with fundamental contributions by
    Robert C. Merton, applied the second most influential process, the geometric Brownian motion, to option pricing.
    For this M. Scholes and R. Merton were awarded the 1997 Nobel Memorial Prize in Economic Sciences. Black was
    ineligible for the prize because of his death in 1995.
The next important step was the fundamental theorem of asset pricing by Harrison and Pliska (1981), according to which the suitably normalized current price P0 of a security is arbitrage-free, and thus truly fair, only if there exists a stochastic process Pt with constant expected value which describes its future evolution:

E_Q[ Pt | Fs ] = Ps   for all 0 ≤ s ≤ t     (1)

A process satisfying (1) is called a "martingale". A martingale does not reward risk. Thus the probability of the normalized security price process is called "risk-neutral" and is typically denoted by the blackboard font letter "ℚ".
    The relationship (1) must hold for all times t: therefore the processes used for derivatives pricing are naturally set in
    continuous time.
    The quants who operate in the Q world of derivatives pricing are specialists with deep knowledge of the specific
    products they model.
    Securities are priced individually, and thus the problems in the Q world are low-dimensional in nature. Calibration is
    one of the main challenges of the Q world: once a continuous-time parametric process has been calibrated to a set of
    traded securities through a relationship such as (1), a similar relationship is used to define the price of new
    derivatives.
    The main quantitative tools necessary to handle continuous-time Q-processes are Ito’s stochastic calculus and partial
    differential equations (PDE’s).
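As a deliberately minimal illustration of this Q-world recipe (added here; the geometric Brownian motion model and all numbers are illustrative assumptions), the sketch below prices a European call as the discounted risk-neutral expectation of its payoff and compares the Monte Carlo result with the Black–Scholes closed form.

import numpy as np
from math import exp, log, sqrt
from scipy.stats import norm

s0, k, r, sigma, t = 100.0, 105.0, 0.03, 0.2, 1.0   # illustrative market data
rng = np.random.default_rng(3)

# Monte Carlo under Q: simulate the terminal price with drift r, then discount.
z = rng.standard_normal(1_000_000)
s_t = s0 * np.exp((r - 0.5 * sigma ** 2) * t + sigma * sqrt(t) * z)
mc_price = exp(-r * t) * np.maximum(s_t - k, 0.0).mean()

# Black-Scholes closed form for the same call, as a benchmark.
d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
d2 = d1 - sigma * sqrt(t)
bs_price = s0 * norm.cdf(d1) - k * exp(-r * t) * norm.cdf(d2)

print(round(mc_price, 4), round(bs_price, 4))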


    Risk and portfolio management: the P world
    Risk and portfolio management aims at modelling the probability distribution of the market prices of all the
    securities at a given future investment horizon.
This "real" probability distribution of the market prices is typically denoted by the blackboard font letter "ℙ", as opposed to the "risk-neutral" probability "ℚ" used in derivatives pricing.
    Based on the P distribution, the buy-side community takes decisions on which securities to purchase in order to
    improve the prospective profit-and-loss profile of their positions considered as a portfolio.

                                   Risk and portfolio management: the P world
                                                  Goal         "model the future"

                                                  Environment real probability

                                                  Processes    discrete-time series

                                                  Dimension    large

                                                  Tools        multivariate statistics

                                                  Challenges   estimation

                                                  Business     buy-side


    The quantitative theory of risk and portfolio management started with the mean-variance framework of Harry
    Markowitz (1952), who caused a shift away from the concept of trying to identify the best individual stock for
    investment. Using a linear regression strategy to understand and quantify the risk (i.e. variance) and return (i.e.
    mean) of an entire portfolio of stocks, bonds, and other securities, an optimization strategy was used to choose a
    portfolio with largest mean return subject to acceptable levels of variance in the return. Next, breakthrough advances
    were made with the Capital Asset Pricing Model (CAPM) and the Arbitrage Pricing Theory (APT) developed by
    Treynor (1962), Mossin (1966), William Sharpe (1964), Lintner (1965) and Ross (1976).
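The mean-variance idea can be sketched in a few lines of Python. The toy example below uses illustrative numbers and the unconstrained closed-form solution (weights proportional to the inverse covariance matrix times the mean vector, then normalized); it is a sketch of the principle, not a full Markowitz implementation with constraints.

import numpy as np

mu = np.array([0.08, 0.10, 0.04])               # estimated mean returns (illustrative)
cov = np.array([[0.040, 0.006, 0.002],          # estimated covariance of returns
                [0.006, 0.090, 0.004],
                [0.002, 0.004, 0.010]])

w = np.linalg.solve(cov, mu)                    # unconstrained mean-variance direction
w = w / w.sum()                                 # normalize so the weights sum to one

port_mean = float(w @ mu)
port_vol = float(np.sqrt(w @ cov @ w))
print(np.round(w, 3), round(port_mean, 4), round(port_vol, 4))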
    For their pioneering work, Markowitz and Sharpe, along with Merton Miller, shared the 1990 Nobel Memorial Prize
    in Economic Sciences, for the first time ever awarded for a work in finance.
    The portfolio-selection work of Markowitz and Sharpe introduced mathematics to the "black art" of investment
    management. With time, the mathematics has become more sophisticated. Thanks to Robert Merton and Paul
    Samuelson, one-period models were replaced by continuous time, Brownian-motion models, and the quadratic utility
    function implicit in mean–variance optimization was replaced by more general increasing, concave utility
    functions.[1] Furthermore, in more recent years the focus shifted toward estimation risk, i.e., the dangers of
    incorrectly assuming that advanced time series analysis alone can provide completely accurate estimates of the
market parameters.[2]
    Much effort has gone into the study of financial markets and how prices vary with time. Charles Dow, one of the
    founders of Dow Jones & Company and The Wall Street Journal, enunciated a set of ideas on the subject which are
    now called Dow Theory. This is the basis of the so-called technical analysis method of attempting to predict future
    changes. One of the tenets of "technical analysis" is that market trends give an indication of the future, at least in the
    short term. The claims of the technical analysts are disputed by many academics.


    Criticism
    Over the years, increasingly sophisticated mathematical models and derivative pricing strategies have been
    developed, but their credibility was damaged by the financial crisis of 2007–2010.
    Contemporary practice of mathematical finance has been subjected to criticism from figures within the field notably
    by Nassim Nicholas Taleb, a professor of financial engineering at Polytechnic Institute of New York University, in
    his book The Black Swan[3] and Paul Wilmott. Taleb claims that the prices of financial assets cannot be characterized
    by the simple models currently in use, rendering much of current practice at best irrelevant, and, at worst,
    dangerously misleading. Wilmott and Emanuel Derman published the Financial Modelers' Manifesto in January
    2008[4] which addresses some of the most serious concerns.
    Bodies such as the Institute for New Economic Thinking are now attempting to establish more effective theories and
    methods.[5]
    In general, modeling the changes by distributions with finite variance is, increasingly, said to be inappropriate.[6] In
    the 1960s it was discovered by Benoît Mandelbrot that changes in prices do not follow a Gaussian distribution, but
    are rather modeled better by Lévy alpha-stable distributions. The scale of change, or volatility, depends on the length
    of the time interval to a power a bit more than 1/2. Large changes up or down are more likely than what one would
    calculate using a Gaussian distribution with an estimated standard deviation.[3] See also Financial models with
    long-tailed distributions and volatility clustering.


       Mathematical finance articles
            See also Outline of finance: § Financial mathematics; § Mathematical tools; § Derivatives pricing.


       Mathematical tools

                 •   Asymptotic analysis                    •   Mathematical models             •       Stochastic calculus
                                                                                                        •       Brownian motion
                                                                                                        •       Lévy process
                 •   Calculus                               •   Monte Carlo method              •       Stochastic differential equations
                 •   Copulas                                •   Numerical analysis              •       Stochastic volatility
                                                                                                        •       Numerical partial differential equations
                                                                                                                •       Crank–Nicolson method
                                                                                                                •       Finite difference method
                 •   Differential equations                 •   Real analysis                   •       Value at risk
                 •   Expected value                         •   Partial differential equations •        Volatility
                                                                                                        •       ARCH model
                                                                                                        •       GARCH model
                 •   Ergodic theory                         •   Probability
                 •   Feynman–Kac formula                    •   Probability distributions
                                                                •   Binomial distribution
                                                                •   Log-normal distribution
                 •   Fourier transform                      •   Quantile functions
                                                                •   Heat equation
                 •   Gaussian copulas                       •   Radon–Nikodym derivative
                 •   Girsanov's theorem                     •   Risk-neutral measure
                 •   Itô's lemma
                 •   Martingale representation theorem



       Derivatives pricing

   •    The Brownian Motion Model of          •   Options                                           •       Interest rate derivatives
        Financial Markets                         •   Put–call parity (Arbitrage                            •       Black model
   •    Rational pricing assumptions                  relationships for options)                                    • caps and floors
        • Risk neutral valuation                  •   Intrinsic value, Time value                                   • swaptions
        • Arbitrage-free pricing                  •   Moneyness                                                     • Bond options
   •    Forward Price Formula                     •   Pricing models                                        •       Short-rate models
   •    Futures contract pricing                      •  Black–Scholes model                                        • Rendleman-Bartter model
   •    Swap Valuation                                •  Black model                                                • Vasicek model
                                                      •  Binomial options model                                     • Ho-Lee model
                                                      •  Monte Carlo option model                                   • Hull–White model
                                                      •  Implied volatility, Volatility smile                       • Cox–Ingersoll–Ross model
                                                      •  SABR Volatility Model                                      • Black–Karasinski model
                                                      •  Markov Switching Multifractal                              • Black–Derman–Toy model
                                                      •  The Greeks                                                 • Kalotay–Williams–Fabozzi model
                                                      •  Finite difference methods for                              • Longstaff–Schwartz model
                                                         option pricing                                             • Chen model
                                                      • Vanna Volga method                                  •       Forward rate-based models
                                                      • Trinomial tree
                                                                                                                    •     LIBOR market model
                                                      • Garman-Kohlhagen model
                                                                                                                          (Brace–Gatarek–Musiela Model, BGM)
                                                  •   Optimal stopping (Pricing of
                                                                                                                    •     Heath–Jarrow–Morton Model (HJM)
                                                      American options)


    Notes
    [1] Karatzas, Ioannis; Shreve, Steve (1998). Methods of Mathematical Finance. Secaucus, NJ, USA: Springer-Verlag New York, Incorporated.
        ISBN 9780387948393.
    [2] Meucci, Attilio (2005). Risk and Asset Allocation. Springer. ISBN 9783642009648.
    [3] Taleb, Nassim Nicholas (2007). The Black Swan: The Impact of the Highly Improbable. Random House Trade. ISBN 978-1-4000-6351-2.
[4] "Financial Modelers' Manifesto" (http://www.wilmott.com/blogs/paul/index.cfm/2009/1/8/Financial-Modelers-Manifesto). Paul Wilmott's Blog. January 8, 2009. Retrieved June 1, 2012.
[5] Gillian Tett (April 15, 2010). "Mathematicians must get out of their ivory towers" (http://www.ft.com/cms/s/0/cfb9c43a-48b7-11df-8af4-00144feab49a.html). Financial Times.
    [6] Svetlozar T. Rachev, Frank J. Fabozzi, Christian Menn (2005). Fat-Tailed and Skewed Asset Return Distributions: Implications for Risk
        Management, Portfolio Selection, and Option Pricing. John Wiley and Sons. ISBN 978-0471718864.



    References
    • Harold Markowitz, Portfolio Selection, Journal of Finance, 7, 1952, pp. 77–91
    • William Sharpe, Investments, Prentice-Hall, 1985
• Attilio Meucci, P versus Q: Differences and Commonalities between the Two Areas of Quantitative Finance (http://ssrn.com/abstract=1717163), GARP Risk Professional, February 2011, pp. 41-44



    Stochastic differential equation
    A stochastic differential equation (SDE) is a differential equation in which one or more of the terms is a stochastic
process, resulting in a solution which is itself a stochastic process. SDEs are used to model diverse phenomena such as fluctuating stock prices or physical systems subject to thermal fluctuations. Typically, SDEs incorporate white
    noise which can be thought of as the derivative of Brownian motion (or the Wiener process); however, it should be
    mentioned that other types of random fluctuations are possible, such as jump processes.


    Background
    The earliest work on SDEs was done to describe Brownian motion in Einstein's famous paper, and at the same time
    by Smoluchowski. However, one of the earlier works related to Brownian motion is credited to Bachelier (1900) in
his thesis 'Theory of Speculation'. This work was followed up by Langevin. Later, Itō and Stratonovich put SDEs
    on more solid mathematical footing.


    Terminology
    In physical science, SDEs are usually written as Langevin equations. These are sometimes confusingly called "the
    Langevin equation" even though there are many possible forms. These consist of an ordinary differential equation
    containing a deterministic part and an additional random white noise term. A second form is the Smoluchowski
    equation and, more generally, the Fokker-Planck equation. These are partial differential equations that describe the
    time evolution of probability distribution functions. The third form is the stochastic differential equation that is used
    most frequently in mathematics and quantitative finance (see below). This is similar to the Langevin form, but it is
    usually written in differential form. SDEs come in two varieties, corresponding to two versions of stochastic
    calculus.


     Stochastic Calculus
     Brownian motion or the Wiener process was discovered to be exceptionally complex mathematically. The Wiener
     process is nowhere differentiable; thus, it requires its own rules of calculus. There are two dominating versions of
     stochastic calculus, the Ito stochastic calculus and the Stratonovich stochastic calculus. Each of the two has
advantages and disadvantages, and newcomers are often confused about whether one is more appropriate than the other
     in a given situation. Guidelines exist (e.g. Øksendal, 2003) and conveniently, one can readily convert an Ito SDE to
     an equivalent Stratonovich SDE and back again. Still, one must be careful which calculus to use when the SDE is
     initially written down.


     Numerical Solutions
Numerical solution of stochastic differential equations, and especially of stochastic partial differential equations, is a relatively young field. Almost all algorithms that are used for the solution of ordinary differential equations work very poorly for SDEs, exhibiting very slow numerical convergence. A textbook describing many different algorithms is Kloeden & Platen (1995).
     Methods include the Euler–Maruyama method, Milstein method and Runge–Kutta method (SDE).
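A bare-bones version of the Euler–Maruyama method mentioned above is sketched below for a scalar SDE of the form dX = a(X, t) dt + b(X, t) dW; the mean-reverting test equation and its parameters are illustrative choices only.

import numpy as np

def euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    # Simulate one path of dX = drift(x, t) dt + diffusion(x, t) dW.
    dt = t_end / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    t = 0.0
    for i in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))        # Brownian increment over dt
        x[i + 1] = x[i] + drift(x[i], t) * dt + diffusion(x[i], t) * dw
        t += dt
    return x

# Illustrative test problem: a mean-reverting (Ornstein-Uhlenbeck) process.
rng = np.random.default_rng(4)
theta, mean, vol = 1.5, 0.0, 0.3
path = euler_maruyama(lambda x, t: theta * (mean - x),
                      lambda x, t: vol,
                      x0=1.0, t_end=5.0, n_steps=5000, rng=rng)
print(round(path[-1], 4))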


     Use in Physics
In physics, SDEs are typically written in the Langevin form and referred to as "the Langevin equation." For example, a general coupled set of first-order SDEs is often written in the form:

dx_i/dt = f_i(x) + Σ_m g_i^m(x) η_m(t)

where x = (x_1, …, x_k) is the set of unknowns, the f_i and g_i^m are arbitrary functions and the η_m are random functions of time, often referred to as "noise terms". This form is usually usable because there are standard techniques for transforming higher-order equations into several coupled first-order equations by introducing new unknowns. If the g_i^m are constants, the system is said to be subject to additive noise, otherwise it is said to be subject to multiplicative noise. This term is somewhat misleading as it has come to mean the general case even though it appears to imply the limited case in which g(x) ∝ x. Additive noise is the simpler of the two cases; in that situation the correct solution can often be found using ordinary calculus and in particular the ordinary chain rule of calculus. However, in the case of multiplicative noise, the Langevin equation is not a well-defined entity on its own, and it must be specified whether the Langevin equation should be interpreted as an Ito SDE or a Stratonovich SDE.
     In physics, the main method of solution is to find the probability distribution function as a function of time using the
     equivalent Fokker-Planck equation (FPE). The Fokker-Planck equation is a deterministic partial differential
     equation. It tells how the probability distribution function evolves in time similarly to how the Schrödinger equation
     gives the time evolution of the quantum wave function or the diffusion equation gives the time evolution of chemical
     concentration. Alternatively numerical solutions can be obtained by Monte Carlo simulation. Other techniques
     include the path integration that draws on the analogy between statistical physics and quantum mechanics (for
     example, the Fokker-Planck equation can be transformed into the Schrödinger equation by rescaling a few variables)
     or by writing down ordinary differential equations for the statistical moments of the probability distribution function.


     Use in probability and mathematical finance
The notation used in probability theory (and in many applications of probability theory, for instance mathematical finance) is slightly different. This notation makes the exotic nature of the random function of time in the physics formulation more explicit. It is also the notation used in publications on numerical methods for solving stochastic differential equations. In strict mathematical terms, the noise term η_m(t) cannot be chosen as a usual function, but only as a generalized function. The mathematical formulation treats this complication with less ambiguity than the physics formulation.
A typical equation is of the form

dXt = μ(Xt, t) dt + σ(Xt, t) dBt

where B denotes a Wiener process (standard Brownian motion). This equation should be interpreted as an informal way of expressing the corresponding integral equation

X_{t+s} − X_t = ∫_t^{t+s} μ(Xu, u) du + ∫_t^{t+s} σ(Xu, u) dBu

     The equation above characterizes the behavior of the continuous time stochastic process Xt as the sum of an ordinary
     Lebesgue integral and an Itō integral. A heuristic (but very helpful) interpretation of the stochastic differential
     equation is that in a small time interval of length δ the stochastic process Xt changes its value by an amount that is
     normally distributed with expectation μ(Xt, t) δ and variance σ(Xt, t)² δ and is independent of the past behavior of the
     process. This is so because the increments of a Wiener process are independent and normally distributed. The
     function μ is referred to as the drift coefficient, while σ is called the diffusion coefficient. The stochastic process Xt is
     called a diffusion process, and is usually a Markov process.
     The formal interpretation of an SDE is given in terms of what constitutes a solution to the SDE. There are two main
     definitions of a solution to an SDE, a strong solution and a weak solution. Both require the existence of a process Xt
     that solves the integral equation version of the SDE. The difference between the two lies in the underlying
probability space (Ω, F, Pr). A weak solution consists of a probability space and a process that satisfies the integral
     equation, while a strong solution is a process that satisfies the equation and is defined on a given probability space.
An important example is the equation for geometric Brownian motion

dXt = μ Xt dt + σ Xt dBt

which is the equation for the dynamics of the price of a stock in the Black–Scholes options pricing model of financial mathematics.
     There are also more general stochastic differential equations where the coefficients μ and σ depend not only on the
     present value of the process Xt, but also on previous values of the process and possibly on present or previous values
     of other processes too. In that case the solution process, X, is not a Markov process, and it is called an Itō process and
not a diffusion process. When the coefficients depend only on present and past values of X, the defining equation is
     called a stochastic delay differential equation.


     Existence and uniqueness of solutions
As with deterministic ordinary and partial differential equations, it is important to know whether a given SDE has a solution, and whether or not it is unique. The following is a typical existence and uniqueness theorem for Itō SDEs taking values in n-dimensional Euclidean space Rn and driven by an m-dimensional Brownian motion B; the proof may be found in Øksendal (2003, §5.2).
Let T > 0, and let

μ : Rn × [0, T] → Rn,
σ : Rn × [0, T] → Rn×m,

be measurable functions for which there exist constants C and D such that

|μ(x, t)| + |σ(x, t)| ≤ C(1 + |x|),
|μ(x, t) − μ(y, t)| + |σ(x, t) − σ(y, t)| ≤ D|x − y|,

for all t ∈ [0, T] and all x and y ∈ Rn, where

|σ|² = Σ_{i=1..n} Σ_{j=1..m} |σ_ij|².

Let Z be a random variable that is independent of the σ-algebra generated by Bs, s ≥ 0, and with finite second moment:

E[ |Z|² ] < +∞.

Then the stochastic differential equation/initial value problem

dXt = μ(Xt, t) dt + σ(Xt, t) dBt   for t ∈ [0, T],
X0 = Z,

has a Pr-almost surely unique t-continuous solution (t, ω) |→ Xt(ω) such that X is adapted to the filtration FtZ generated by Z and Bs, s ≤ t, and

E[ ∫_0^T |Xt|² dt ] < +∞.

     References
     • Adomian, George (1983). Stochastic systems. Mathematics in Science and Engineering (169). Orlando, FL:
       Academic Press Inc..
     • Adomian, George (1986). Nonlinear stochastic operator equations. Orlando, FL: Academic Press Inc..
     • Adomian, George (1989). Nonlinear stochastic systems theory and applications to physics. Mathematics and its
       Applications (46). Dordrecht: Kluwer Academic Publishers Group.
     • Øksendal, Bernt K. (2003). Stochastic Differential Equations: An Introduction with Applications. Berlin:
       Springer. ISBN 3-540-04758-1.
     • Teugels, J. and Sund B. (eds.) (2004). Encyclopedia of Actuarial Science. Chichester: Wiley. pp. 523–527.
     • C. W. Gardiner (2004). Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences.
       Springer. p. 415.
     • Thomas Mikosch (1998). Elementary Stochastic Calculus: with Finance in View. Singapore: World Scientific
       Publishing. p. 212. ISBN 981-02-3543-7.
• Seifedine Kadry (2007). A Solution of Linear Stochastic Differential Equation. USA: WSEAS Transactions on Mathematics, April 2007. p. 618. ISSN 1109-2769.
• Bachelier, L. (1900). Théorie de la spéculation (in French), PhD Thesis. NUMDAM: http://archive.numdam.org/ARCHIVE/ASENS/ASENS_1900_3_17_/ASENS_1900_3_17__21_0/ASENS_1900_3_17__21_0.pdf. In English in the 1971 book 'The Random Character of the Stock Market', Ed. P.H. Cootner.


• P.E. Kloeden and E. Platen (1995). Numerical Solution of Stochastic Differential Equations. Springer.



     Brownian model of financial markets
     The Brownian motion models for financial markets are based on the work of Robert C. Merton and Paul A.
Samuelson, as extensions to the one-period market models of Harry Markowitz and William Sharpe, and are
     concerned with defining the concepts of financial assets and markets, portfolios, gains and wealth in terms of
     continuous-time stochastic processes.
     Under this model, these assets have continuous prices evolving continuously in time and are driven by Brownian
     motion processes.[1] This model requires an assumption of perfectly divisible assets and a frictionless market (i.e.
     that no transaction costs occur either for buying or selling). Another assumption is that asset prices have no jumps,
     that is there are no surprises in the market. This last assumption is removed in jump diffusion models.


     Financial market processes
     Consider a financial market consisting of             financial assets, where one of these assets, called a bond or
     money-market, is risk free while the remaining      assets, called stocks, are risky.

     Definition
     A financial market is defined as                                   :
     1. A probability space
     2. A time interval
     3. A      -dimensional Brownian process                                                            adapted to the augmented
        filtration
     4. A measurable risk-free money market rate process
     5. A measurable mean rate of return process                                               .
     6. A measurable dividend rate of return process                                                .

     7. A measurable volatility process                                     such that                                   .

     8. A measurable, finite variation, singularly continuous stochastic
     9. The initial conditions given by

     The augmented filtration
     Let              be a probability space, and a                                                           be D-dimensional
     Brownian motion stochastic process, with the natural filtration:


     If     are the measure 0 (i.e. null under measure   ) subsets of             , then define the augmented filtration:


     The difference between                                   and                                  is that the latter is both
     left-continuous, in the sense that:




     and right-continuous, such that:




    while the former is only left-continuous.[2]


    Bond
    A share of a bond (money market) has price                            at time        with                , is continuous,
                             adapted, and has finite variation. Because it has finite variation, it can be decomposed into
    an absolutely continuous part            and a singularly continuous part                , by Lebesgue's decomposition
    theorem. Define:
                                      and




    resulting in the SDE:


    which gives:



    Thus, it can be easily seen that if            is absolutely continuous (i.e.               ), then the price of the bond
    evolves like the value of a risk-free savings account with instantaneous interest rate                 , which is random,
    time-dependent and         measurable.

    Stocks
    Stock prices are modeled as being similar to that of bonds, except with a randomly fluctuating component (called its
    volatility). As a premium for the risk originating from these random fluctuations, the mean rate of return of a stock is
    higher than that of a bond.
    Let                     be the strictly positive prices per share of the         stocks, which are continuous stochastic
    processes satisfying:




    Here,                           gives the volatility of the   -th stock, while          is its mean rate of return.
    In order for an arbitrage-free pricing scenario,       must be as defined above. The solution to this is:



    and the discounted stock prices are:




Note that the contribution due to the discontinuities in the bond price does not appear in this equation.
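An added simulation sketch of these stock dynamics is given below: several price processes driven by a D-dimensional Brownian motion through an N-by-D volatility matrix, with mean rates of return b. Constant coefficients and an exact log-price update are simplifying assumptions made purely for this illustration; in the model above the coefficients are general stochastic processes.

import numpy as np

rng = np.random.default_rng(5)
b = np.array([0.05, 0.07, 0.04])            # mean rates of return (illustrative)
sigma = np.array([[0.20, 0.05],             # N x D volatility matrix (illustrative)
                  [0.10, 0.25],
                  [0.15, 0.15]])
n_stocks, d_brownian = sigma.shape

t_end, n_steps = 1.0, 252
dt = t_end / n_steps
s = np.full(n_stocks, 100.0)                # initial stock prices

for _ in range(n_steps):
    dw = rng.normal(0.0, np.sqrt(dt), size=d_brownian)      # D-dim Brownian increment
    # For constant coefficients: d log S_n = (b_n - 0.5 * sum_d sigma_nd^2) dt + sum_d sigma_nd dW_d
    s = s * np.exp((b - 0.5 * (sigma ** 2).sum(axis=1)) * dt + sigma @ dw)

print(np.round(s, 2))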


    Dividend rate
    Each stock may have an associated dividend rate process                  giving the rate of dividend payment per unit price of
    the stock at time   . Accounting for this in the model, gives the yield process              :




    Portfolio and gain processes

    Definition
    Consider a financial market                                       .
    A portfolio process                       for this market is an            measurable,           valued process such that:

                                                             , almost surely,


                                                                          , almost surely, and


                                                        , almost surely.

The gains process for this portfolio is:



We say that the portfolio is self-financed if:

                                 .

    It turns out that for a self-financed portfolio, the appropriate value of         is determined from                          and
    therefore sometimes      is referred to as the portfolio process. Also,                  implies borrowing money from the
    money-market, while              implies taking a short position on the stock.
    The term                            in the SDE of           is the risk premium process, and it is the compensation
    received in return for investing in the    -th stock.

    Motivation
    Consider time intervals                                               , and let          be the number of shares of asset
                   , held in a portfolio during time interval at time                                           . To avoid the case
    of insider trading (i.e. foreknowledge of the future), it is required that      is                  measurable.
    Therefore, the incremental gains at each trading interval from such a portfolio is:




    and         is the total gain over time          , while the total value of the portfolio is                            .

    Define                    , let the time partition go to zero, and substitute for                as defined earlier, to get the
    corresponding SDE for the gains process. Here              denotes the dollar amount invested in asset            at time    , not
    the number of shares held.


    Income and wealth processes

    Definition
    Given a financial market             , then a cumulative income process                             is a semimartingale and
    represents the income accumulated over time             , due to sources other than the investments in the             assets
    of the financial market.
    A wealth process         is then defined as:


    and represents the total wealth of an investor at time                . The portfolio is said to be        -financed if:




    The corresponding SDE for the wealth process, through appropriate substitutions, becomes:

                                                                                                                                     .

    Note, that again in this case, the value of     can be determined from                          .


    Viable markets
    The standard theory of mathematical finance is restricted to viable financial markets, i.e. those in which there are no
opportunities for arbitrage. If such opportunities exist, it implies the possibility of making an arbitrarily large
    risk-free profit.


    Definition
    In a financial market           , a self-financed portfolio process      is said to be an arbitrage opportunity if the
    associated gains process                  , almost surely and                        strictly. A market         in which no
    such portfolio exists is said to be viable.

    Implications
    In a viable market        , there exists a        adapted process                             such that for almost every
                 :

                                                             .

This is called the market price of risk and relates the premium for the -th stock with its volatility .
Conversely, if there exists a D-dimensional process such that it satisfies the above requirement, and:




                                                                                           ,

    then the market is viable.
    Also, a viable market           can have only one money-market (bond) and hence only one risk-free rate. Therefore, if
    the    -th stock entails no risk (i.e.                           ) and pays no dividend (i.e.                ), then its rate
    of return is equal to the money market rate (i.e.                        ) and its price tracks that of the bond (i.e.
                               ).


    Standard financial market

    Definition
    A financial market            is said to be standard if:
           (i) It is viable.
           (ii) The number of stocks              is not greater than the dimension       of the underlying Brownian motion process
                    .
           (iii) The market price of risk process              satisfies:

                                                        , almost surely.


           (iv)   The      positive     process                                                                                        is   a

           martingale.

    Comments
    In case the number of stocks            is greater than the dimension           , in violation of point (ii), from linear algebra, it
can be seen that there are stocks whose volatilities (given by the vector ) are linear combinations of the volatilities of other stocks (because the rank of is ). Therefore, the stocks can be
    replaced by         equivalent mutual funds.
    The standard martingale measure                   on          for the standard market, is defined as:
                                                                     .
    Note that     and          are absolutely continuous with respect to each other, i.e. they are equivalent. Also, according to
    Girsanov's theorem,

                                                           ,

    is a      -dimensional Brownian motion process on the filtration                                          with respect to      .


    Complete financial markets
    A complete financial market is one that allows effective hedging of the risk inherent in any investment strategy.


    Definition
    Let       be a standard financial market, and               be an         -measurable random variable, such that:

                                                  .


                                              ,

    The market           is said to be complete if every such               is financeable, i.e. if there is an       -financed portfolio
    process                                 , such that its associated wealth process             satisfies
                           , almost surely.


    Motivation
    If a particular investment strategy calls for a payment                     at time       , the amount of which is unknown at time
            , then a conservative strategy would be to set aside an amount                                               in order to cover the

    payment. However, in a complete market it is possible to set aside less capital (viz.                         ) and invest it so that at time
      it has grown to match the size of .


    Corollary
A standard financial market is complete if and only if , and the volatility process is
    non-singular for almost every                        , with respect to the Lebesgue measure.


    Notes
[1] Tsekov, Roumen (2010) (pdf). Brownian Markets (http://arxiv.org/ftp/arxiv/papers/1010/1010.2061.pdf). Retrieved October 13, 2010.
    [2] Karatzas, Ioannis; Shreve, Steven E. (1991). Brownian motion and stochastic calculus. New York: Springer-Verlag. ISBN 0-387-97655-8.



    References
    Karatzas, Ioannis; Shreve, Steven E. (1998). Methods of mathematical finance. New York: Springer.
    ISBN 0-387-94839-2.
    Korn, Ralf; Korn, Elke (2001). Option pricing and portfolio optimization: modern methods of financial mathematics.
    Providence, R.I.: American Mathematical Society. ISBN 0-8218-2123-7.
    Merton, R. C. (1 August 1969). "Lifetime Portfolio Selection under Uncertainty: the Continuous-Time Case". The
    Review of Economics and Statistics 51 (3): 247–257. doi:10.2307/1926560. ISSN 00346535. JSTOR 1926560.
Merton, R.C. (1970). "Optimum consumption and portfolio rules in a continuous-time model" (http://www.math.uwaterloo.ca/~mboudalh/Merton1971.pdf). Journal of Economic Theory 3. Retrieved 2009-05-29.



     Stochastic volatility
     Stochastic volatility models are used in the field of mathematical finance to evaluate derivative securities, such as
     options. The name derives from the models' treatment of the underlying security's volatility as a random process,
     governed by state variables such as the price level of the underlying security, the tendency of volatility to revert to
     some long-run mean value, and the variance of the volatility process itself, among others.
Stochastic volatility models are one approach to resolving a shortcoming of the Black–Scholes model. In particular,
models of the Black–Scholes type assume that the underlying volatility is constant over the life of the derivative, and
unaffected by changes in the price level of the underlying security. However, such models cannot explain
long-observed features of the implied volatility surface such as volatility smile and skew, which indicate that implied
volatility does tend to vary with respect to strike price and expiry. By assuming that the volatility of the underlying
price is a stochastic process rather than a constant, it becomes possible to model derivatives more accurately.


     Basic model
Starting from a constant volatility approach, assume that the derivative's underlying price follows a standard model
for geometric Brownian motion:

    dS_t = \mu S_t\,dt + \sigma S_t\,dW_t

where \mu is the constant drift (i.e. expected return) of the security price S_t, \sigma is the constant volatility, and dW_t
is a standard Wiener process with zero mean and unit rate of variance. The explicit solution of this stochastic
differential equation is

    S_t = S_0\, e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W_t}.
The maximum likelihood estimator of the constant volatility \sigma for given stock prices S_{t_0}, S_{t_1}, \ldots, S_{t_n} at different times
t_0 < t_1 < \cdots < t_n is

    \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} \frac{(\ln S_{t_i} - \ln S_{t_{i-1}})^2}{t_i - t_{i-1}} \;-\; \frac{1}{n}\,\frac{(\ln S_{t_n} - \ln S_{t_0})^2}{t_n - t_0};

its expectation value is E[\hat{\sigma}^2] = \frac{n-1}{n}\sigma^2.
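For illustration, the following is a minimal Python sketch (not part of the original article) of this maximum likelihood estimate computed from observed prices and observation times; the function name and the simulated example at the bottom are ours.

import numpy as np

def mle_constant_volatility(prices, times):
    """Maximum likelihood estimate of sigma under geometric Brownian motion.

    prices: observed stock prices S_{t_0}, ..., S_{t_n}
    times:  the corresponding observation times t_0 < t_1 < ... < t_n
    """
    prices = np.asarray(prices, dtype=float)
    times = np.asarray(times, dtype=float)
    log_returns = np.diff(np.log(prices))   # ln S_{t_i} - ln S_{t_{i-1}}
    dt = np.diff(times)                     # t_i - t_{i-1}
    n = len(log_returns)
    term1 = np.sum(log_returns ** 2 / dt) / n                     # squared log returns scaled by time step
    term2 = (np.log(prices[-1]) - np.log(prices[0])) ** 2 / (times[-1] - times[0]) / n  # removes the estimated drift
    return np.sqrt(term1 - term2)

if __name__ == "__main__":
    # Example: daily observations over one year of a simulated GBM path with sigma = 0.2.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 253)
    dt = np.diff(t)
    true_sigma, mu, s0 = 0.2, 0.05, 100.0
    increments = (mu - 0.5 * true_sigma**2) * dt + true_sigma * np.sqrt(dt) * rng.standard_normal(len(dt))
    s = s0 * np.exp(np.concatenate(([0.0], np.cumsum(increments))))
    print(mle_constant_volatility(s, t))    # should be close to 0.2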

This basic model with constant volatility \sigma is the starting point for non-stochastic volatility models such as
Black–Scholes and Cox–Ross–Rubinstein.
For a stochastic volatility model, replace the constant volatility \sigma with a function \nu_t that models the variance of
S_t. This variance function is also modeled as Brownian motion, and the form of \nu_t depends on the particular SV
model under study.

    dS_t = \mu S_t\,dt + \sqrt{\nu_t}\,S_t\,dW_t
    d\nu_t = \alpha_{S,t}\,dt + \beta_{S,t}\,dB_t

where \alpha_{S,t} and \beta_{S,t} are some functions of \nu_t, and dB_t is another standard Gaussian that is correlated with dW_t
with constant correlation factor \rho.


     Heston model
The popular Heston model is a commonly used SV model in which the randomness of the variance process varies as
the square root of the variance. In this case, the differential equation for the variance takes the form:

    d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\sqrt{\nu_t}\,dB_t

where \omega is the mean long-term volatility, \theta is the rate at which the volatility reverts toward its long-term mean, \xi
is the volatility of the volatility process, and dB_t is, like dW_t, a Gaussian with zero mean and unit standard
deviation. However, dW_t and dB_t are correlated with the constant correlation value \rho.
In other words, the Heston SV model assumes that the variance is a random process that
1. exhibits a tendency to revert towards a long-term mean \omega at a rate \theta,
2. exhibits a volatility proportional to the square root of its level,
3. and whose source of randomness is correlated (with correlation \rho) with the randomness of the underlying's price
   process.
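As an illustration of these dynamics, the short Python sketch below simulates one correlated price/variance path with a full-truncation Euler scheme. It is only a sketch under the assumptions stated in the comments; the parameter values and function name are ours, not from the article.

import numpy as np

def simulate_heston(s0=100.0, v0=0.04, mu=0.05, theta=1.5, omega=0.04,
                    xi=0.3, rho=-0.7, T=1.0, steps=252, seed=0):
    """Euler (full-truncation) simulation of one Heston path.

    dS = mu*S dt + sqrt(v)*S dW,  dv = theta*(omega - v) dt + xi*sqrt(v) dB,
    with corr(dW, dB) = rho. Negative variances are truncated at zero.
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    s, v = np.empty(steps + 1), np.empty(steps + 1)
    s[0], v[0] = s0, v0
    for i in range(steps):
        z1, z2 = rng.standard_normal(2)
        dw = np.sqrt(dt) * z1
        db = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)   # correlated shock
        v_pos = max(v[i], 0.0)
        v[i + 1] = v[i] + theta * (omega - v_pos) * dt + xi * np.sqrt(v_pos) * db
        s[i + 1] = s[i] * np.exp((mu - 0.5 * v_pos) * dt + np.sqrt(v_pos) * dw)
    return s, v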


     CEV Model
The CEV model describes the relationship between volatility and price, introducing stochastic volatility:

    dS_t = \mu S_t\,dt + \sigma S_t^{\gamma}\,dW_t

Conceptually, in some markets volatility rises when prices rise (e.g. commodities), so \gamma > 1. In other markets,
volatility tends to rise as prices fall, modelled with \gamma < 1.
     Some argue that because the CEV model does not incorporate its own stochastic process for volatility, it is not truly
     a stochastic volatility model. Instead, they call it a local volatility model.


     SABR volatility model
The SABR model (Stochastic Alpha, Beta, Rho) describes a single forward F (related to any asset e.g. an index,
interest rate, bond, currency or equity) under stochastic volatility \sigma:

    dF_t = \sigma_t F_t^{\beta}\,dW_t
    d\sigma_t = \alpha\sigma_t\,dZ_t

The initial values F_0 and \sigma_0 are the current forward price and volatility, whereas W_t and Z_t are two correlated
Wiener processes (i.e. Brownian motions) with correlation coefficient \rho, -1 < \rho < 1. The constant parameters
\beta, \alpha are such that 0 \le \beta \le 1 and \alpha \ge 0.
The main feature of the SABR model is that it is able to reproduce the volatility smile observed in markets.


     GARCH model
     The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model is another popular model for
     estimating stochastic volatility. It assumes that the randomness of the variance process varies with the variance, as
     opposed to the square root of the variance as in the Heston model. The standard GARCH(1,1) model has the
following form for the variance differential:

    d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\nu_t\,dB_t
     The GARCH model has been extended via numerous variants, including the NGARCH, TGARCH, IGARCH,
     LGARCH, EGARCH, GJR-GARCH, etc.
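The form above is the continuous-time (diffusion) version. For readers more familiar with the discrete-time formulation, the following minimal Python sketch simulates returns under the standard discrete GARCH(1,1) recursion; the parameter values are illustrative assumptions only.

import numpy as np

def simulate_garch11(omega=1e-6, alpha=0.09, beta=0.9, n=1000, seed=0):
    """Simulate returns r_t = sigma_t * z_t with the GARCH(1,1) variance recursion
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2."""
    rng = np.random.default_rng(seed)
    var = omega / (1.0 - alpha - beta)     # start at the unconditional variance
    returns, variances = np.empty(n), np.empty(n)
    for t in range(n):
        variances[t] = var
        returns[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * returns[t] ** 2 + beta * var
    return returns, variances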


     3/2 model
The 3/2 model is similar to the Heston model, but assumes that the randomness of the variance process varies with
\nu_t^{3/2}. The form of the variance differential is:

    d\nu_t = \nu_t(\omega - \theta\nu_t)\,dt + \xi\nu_t^{3/2}\,dB_t.

However, the meaning of the parameters is different from the Heston model: here both the mean-reversion and the
volatility-of-variance parameters are stochastic quantities, given by \theta\nu_t and \xi\nu_t respectively.


     Chen model
In interest rate modelling, Lin Chen developed in 1994 the first stochastic mean and stochastic volatility model, the
Chen model. Specifically, the dynamics of the instantaneous interest rate are given by the following stochastic
differential equations:

    dr_t = \kappa(\theta_t - r_t)\,dt + \sqrt{r_t}\,\sqrt{\sigma_t}\,dW_t,
    d\theta_t = \nu(\zeta - \theta_t)\,dt + \alpha\sqrt{\theta_t}\,dW_t,
    d\sigma_t = \mu(\beta - \sigma_t)\,dt + \eta\sqrt{\sigma_t}\,dW_t.


     Calibration
     Once a particular SV model is chosen, it must be calibrated against existing market data. Calibration is the process of
     identifying the set of model parameters that are most likely given the observed data. One popular technique is to use
     Maximum Likelihood Estimation (MLE). For instance, in the Heston model, the set of model parameters
can be estimated by applying an MLE algorithm such as the Powell Directed Set method [1] to
     observations of historic underlying security prices.
In this case, one starts with an estimate for the parameter set, computes the residual errors when applying the historic
price data to the resulting model, and then adjusts the parameters to try to minimize these errors. Once the calibration
has been performed, it is standard practice to re-calibrate the model periodically.
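As a rough illustration of such a calibration loop (a simplified least-squares variant rather than the full MLE procedure described above), the Python sketch below fits model parameters by minimizing squared pricing errors with SciPy's Powell method; model_price and the data arrays are hypothetical placeholders supplied by the user.

import numpy as np
from scipy.optimize import minimize

def calibrate(initial_params, market_prices, strikes, maturities, model_price):
    """Least-squares calibration sketch.

    model_price(params, strike, maturity) is a user-supplied pricing function
    (e.g. a Heston pricer); params is the vector of model parameters.
    """
    def objective(params):
        model = np.array([model_price(params, k, t)
                          for k, t in zip(strikes, maturities)])
        return np.sum((model - market_prices) ** 2)   # residual sum of squares

    result = minimize(objective, initial_params, method="Powell")
    return result.x, result.fun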


     References
     •     Stochastic Volatility and Mean-variance Analysis [2], Hyungsok Ahn, Paul Wilmott, (2006).
     •     A closed-form solution for options with stochastic volatility [3], SL Heston, (1993).
     •     Inside Volatility Arbitrage [4], Alireza Javaheri, (2005).
     •     Accelerating the Calibration of Stochastic Volatility Models [5], Kilin, Fiodar (2006).
•     Lin Chen (1996). Stochastic Mean and Stochastic Volatility – A Three-Factor Model of the Term Structure of
      Interest Rates and Its Application to the Pricing of Interest Rate Derivatives. Blackwell Publishers.


     References
[1]   http://www.library.cornell.edu/nr/bookcpdf.html
[2]   http://www.wilmott.com/detail.cfm?articleID=245
[3]   http://www.javaquant.net/papers/Heston-original.pdf
[4]   http://www.amazon.com/s?platform=gurupa&url=index%3Dblended&keywords=inside+volatility+arbitrage
[5]   http://ssrn.com/abstract=982221



    Black–Scholes
    The Black–Scholes model /ˌblækˈʃoʊlz/[1] or Black–Scholes–Merton is a mathematical model of a financial market
    containing certain derivative investment instruments. From the model, one can deduce the Black–Scholes formula,
    which gives the price of European-style options. The formula led to a boom in options trading and legitimised
scientifically the activities of the Chicago Board Options Exchange and other options markets around the world.[2] It
    is widely used by options market participants.[3]:751 Many empirical tests have shown the Black–Scholes price is
    “fairly close” to the observed prices, although there are well-known discrepancies such as the “option
    smile”.[3]:770–771
    The model was first articulated by Fischer Black and Myron Scholes in their 1973 paper, “The Pricing of Options
    and Corporate Liabilities", published in the Journal of Political Economy. They derived a partial differential
    equation, now called the Black–Scholes equation, which governs the price of the option over time. The key idea
    behind the derivation was to hedge perfectly the option by buying and selling the underlying asset in just the right
    way and consequently "eliminate risk". This hedge is called delta hedging and is the basis of more complicated
    hedging strategies such as those engaged in by Wall Street investment banks. The hedge implies there is only one
    right price for the option and it is given by the Black–Scholes formula.
    Robert C. Merton was the first to publish a paper expanding the mathematical understanding of the options pricing
    model and coined the term Black–Scholes options pricing model. Merton and Scholes received the 1997 Nobel Prize
    in Economics (The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel) for their work.
    Though ineligible for the prize because of his death in 1995, Black was mentioned as a contributor by the Swedish
    academy.[4]


    Assumptions
    The Black–Scholes model of the market for a particular stock makes the following explicit assumptions:
    •   There is no arbitrage opportunity (i.e., there is no way to make a riskless profit).
    •   It is possible to borrow and lend cash at a known constant risk-free interest rate.
    •   It is possible to buy and sell any amount, even fractional, of stock (this includes short selling).
    •   The above transactions do not incur any fees or costs (i.e., frictionless market).
    •   The stock price follows a geometric Brownian motion with constant drift and volatility.
    •   The underlying security does not pay a dividend.[5]
    From these assumptions, Black and Scholes showed that “it is possible to create a hedged position, consisting of a
    long position in the stock and a short position in the option, whose value will not depend on the price of the stock.”[6]
    Several of these assumptions of the original model have been removed in subsequent extensions of the model.
    Modern versions account for changing interest rates (Merton, 1976), transaction costs and taxes (Ingersoll, 1976),
    and dividend payout.[7]


    Notation
    Let
               , be the price of the stock (please note inconsistencies as below).
                      , the price of a derivative as a function of time and stock price.
                      the price of a European call option and            the price of a European put option.
                , the strike of the option.
               , the annualized risk-free interest rate, continuously compounded (the force of interest).
               , the drift rate of     , annualized.
              , the volatility of the stock's returns; this is the square root of the quadratic variation of the stock's log price
            process.
               , a time in years; we generally use: now=0, expiry=T.
               , the value of a portfolio.
Finally we will use N(x), which denotes the standard normal cumulative distribution function,

    N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-z^2/2}\,dz,

and N'(x), which denotes the standard normal probability density function,

    N'(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.


    Inconsistencies
    The reader is warned of the inconsistent notation that appears in this article. Thus the letter              is used as:
    1.    a constant denoting the current price of the stock
    2.    a real variable denoting the price at an arbitrary time
    3.    a random variable denoting the price at maturity
    4.    a stochastic process denoting the price at an arbitrary time
    It is also used in the meaning of (4) with a subscript denoting time, but here the subscript is merely a mnemonic.
    In the partial derivatives, the letters in the numerators and denominators are, of course, real variables, and the partial
    derivatives themselves are, initially, real functions of real variables. But after the substitution of a stochastic process
    for one of the arguments they become stochastic processes.
    The Black–Scholes PDE is, initially, a statement about the stochastic process           , but when      is reinterpreted as a
    real variable, it becomes an ordinary PDE. It is only then that we can ask about its solution.
    The parameter that appears in the discrete-dividend model and the elementary derivation is not the same as the
    parameter   that appears elsewhere in the article. For the relationship between them see Geometric Brownian
    motion.


    The Black–Scholes equation
    As above, the Black–Scholes equation is a partial differential
    equation, which describes the price of the option over time. The key
    idea behind the equation is that one can perfectly hedge the option by
    buying and selling the underlying asset in just the right way and
    consequently “eliminate risk". This hedge, in turn, implies that there is
    only one right price for the option, as returned by the Black–Scholes
formula given in the next section. The equation:

    \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0

(Figure: Simulated geometric Brownian motions with parameters from market data.)




    Derivation
    The following derivation is given in Hull's Options, Futures, and Other Derivatives.[8]:287–288 That, in turn, is based
    on the classic argument in the original Black–Scholes paper.
    Per the model assumptions above, the price of the underlying asset (typically a stock) follows a geometric Brownian
motion. That is,

    \frac{dS}{S} = \mu\,dt + \sigma\,dW
    where W is Brownian motion. Note that W, and consequently its infinitesimal increment dW, represents the only
    source of uncertainty in the price history of the stock. Intuitively, W(t) is a process that "wiggles up and down" in
    such a random way that its expected change over any time interval is 0. (In addition, its variance over time T is equal
    to T; see Wiener process: Basic properties); a good discrete analogue for W is a simple random walk. Thus the above
equation states that the infinitesimal rate of return on the stock has an expected value of \mu\,dt and a variance of
\sigma^2\,dt.
The payoff of an option V(S,T) at maturity is known. To find its value at an earlier time we need to know how V
evolves as a function of S and t. By Itō's lemma for two variables we have

    dV = \left(\frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S\frac{\partial V}{\partial S}\,dW
Now consider a certain portfolio, called the delta-hedge portfolio, consisting of being short one option and long
\partial V/\partial S shares at time t. The value of these holdings is

    \Pi = -V + \frac{\partial V}{\partial S}S

Over the time period [t, t+\Delta t], the total profit or loss from changes in the values of the holdings is:

    \Delta\Pi = -\Delta V + \frac{\partial V}{\partial S}\,\Delta S
Now discretize the equations for dS/S and dV by replacing differentials with deltas:

    \Delta S = \mu S\,\Delta t + \sigma S\,\Delta W
    \Delta V = \left(\frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)\Delta t + \sigma S\frac{\partial V}{\partial S}\,\Delta W

and appropriately substitute them into the expression for \Delta\Pi:

    \Delta\Pi = \left(-\frac{\partial V}{\partial t} - \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)\Delta t
Notice that the \Delta W term has vanished. Thus uncertainty has been eliminated and the portfolio is effectively
riskless. The rate of return on this portfolio must be equal to the rate of return on any other riskless instrument;
otherwise, there would be opportunities for arbitrage. Now assuming the risk-free rate of return is r we must have
over the time period [t, t+\Delta t]

    \Delta\Pi = r\Pi\,\Delta t = r\left(-V + \frac{\partial V}{\partial S}S\right)\Delta t

If we now equate our two formulas for \Delta\Pi we obtain:

    \left(-\frac{\partial V}{\partial t} - \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)\Delta t = r\left(-V + \frac{\partial V}{\partial S}S\right)\Delta t

Simplifying, we arrive at the celebrated Black–Scholes partial differential equation:

    \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0
With the assumptions of the Black–Scholes model, this second-order partial differential equation holds for any type
of option as long as its price function is twice differentiable with respect to S and once with respect to t.
Different pricing formulae for various options will arise from the choice of payoff function at expiry and appropriate
boundary conditions.
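One quick way to see the equation "in action" is to check numerically that the closed-form call price of the next section satisfies it. The Python sketch below does this with central finite differences; the sample parameter values and step size are arbitrary choices of ours.

from math import exp, log, sqrt
from scipy.stats import norm

def bs_call(s, t, k=100.0, r=0.05, sigma=0.2, T=1.0):
    """Black–Scholes call price as a function of spot s and calendar time t."""
    tau = T - t
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return norm.cdf(d1) * s - norm.cdf(d2) * k * exp(-r * tau)

def pde_residual(s=90.0, t=0.3, k=100.0, r=0.05, sigma=0.2, T=1.0, h=1e-3):
    """Evaluate V_t + 0.5*sigma^2*s^2*V_ss + r*s*V_s - r*V; should be close to zero."""
    v = bs_call(s, t, k, r, sigma, T)
    v_t = (bs_call(s, t + h, k, r, sigma, T) - bs_call(s, t - h, k, r, sigma, T)) / (2 * h)
    v_s = (bs_call(s + h, t, k, r, sigma, T) - bs_call(s - h, t, k, r, sigma, T)) / (2 * h)
    v_ss = (bs_call(s + h, t, k, r, sigma, T) - 2 * v + bs_call(s - h, t, k, r, sigma, T)) / h**2
    return v_t + 0.5 * sigma**2 * s**2 * v_ss + r * s * v_s - r * v

# print(pde_residual())  # prints a number close to zero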


    Black–Scholes formula
    The Black–Scholes formula calculates the price of European put and
    call options. This price is consistent with the Black–Scholes equation
    as above; this follows since the formula can be obtained by solving the
    equation for the corresponding terminal and boundary conditions.
    The value of a call option for a non-dividend paying underlying stock
in terms of the Black–Scholes parameters is:

    C(S,t) = N(d_1)\,S - N(d_2)\,Ke^{-r(T-t)}
    d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}
    d_2 = d_1 - \sigma\sqrt{T-t}

(Figure: A European call valued using the Black–Scholes pricing equation for varying asset price S and
time-to-expiry T. In this particular example, the strike price is set to unity.)

The price of a corresponding put option based on put–call parity is:

    P(S,t) = Ke^{-r(T-t)} - S + C(S,t) = N(-d_2)\,Ke^{-r(T-t)} - N(-d_1)\,S
    For both, as above:


•   N(·) is the cumulative distribution function of the standard normal distribution
•   T − t is the time to maturity
•   S is the spot price of the underlying asset
•   K is the strike price
•   r is the risk free rate (annual rate, expressed in terms of continuous compounding)
•   σ is the volatility of returns of the underlying asset
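The closed-form prices above translate directly into code. The following is a minimal Python sketch (function names are ours) using the standard normal CDF from SciPy.

from math import exp, log, sqrt
from scipy.stats import norm

def black_scholes(s, k, r, sigma, tau, kind="call"):
    """Black–Scholes price of a European option on a non-dividend-paying stock.

    s: spot price, k: strike, r: risk-free rate, sigma: volatility,
    tau: time to maturity T - t (in years)."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    if kind == "call":
        return norm.cdf(d1) * s - norm.cdf(d2) * k * exp(-r * tau)
    return norm.cdf(-d2) * k * exp(-r * tau) - norm.cdf(-d1) * s

# Example: black_scholes(100, 100, 0.05, 0.2, 1.0) is approximately 10.45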


    Alternative formulation
Introducing some auxiliary variables allows the formula to be simplified and reformulated in a more intuitive form:

    C(F,\tau) = D\left[N(d_+)F - N(d_-)K\right]
    d_\pm = \frac{1}{\sigma\sqrt{\tau}}\left[\ln\frac{F}{K} \pm \frac{\sigma^2\tau}{2}\right]

The auxiliary variables are:
•   \tau = T - t is the time to expiry (remaining time, backwards time)
•   D = e^{-r\tau} is the discount factor
•   F = e^{r\tau}S = S/D is the forward price of the underlying asset, and
with d+ = d1 and d− = d2 to clarify notation.
Given put–call parity, which is expressed in these terms as:

    C - P = D(F - K) = S - DK

the price of a put option is:

    P(F,\tau) = D\left[N(-d_-)K - N(-d_+)F\right]

If one uses spot S instead of forward F, in d_\pm there is instead a factor of (r \pm \tfrac{1}{2}\sigma^2)\tau, which can be interpreted
as a drift factor in the risk-neutral measure for appropriate numéraire (see below).

    Interpretation
    The Black–Scholes formula can be interpreted fairly easily, with the main subtlety the interpretation of the
    (and a fortiori   ) terms, and why they are different.[9]
    The formula can be interpreted by first decomposing a call option into the difference of two binary options: an
    asset-or-nothing call minus a cash-or-nothing call (long an asset-or-nothing call, short a cash-or-nothing call). A call
    option exchanges cash for an asset at expiry, while an asset-or-nothing call just yields the asset (with no cash in
    exchange) and a cash-or-nothing call just yields cash (with no asset in exchange). The Black–Scholes formula is a
    difference of two terms, and these two terms equal the value of the binary call options. These binary options are
    much less frequently traded than vanilla call options, but are easier to analyze.
    Thus the formula:


    breaks up as:


    where                   is the present value of an asset-or-nothing call and                   is the present value of a
    cash-or-nothing call. The D factor is for discounting, because the expiration date is in future, and removing it
    changes present value to future value (value at expiry). Thus          is the future value of an asset-or-nothing
    call and               is the future value of a cash-or-nothing call. In risk-neutral terms, these are the expected value
    of the asset and the expected value of the cash in the risk-neutral measure.


The naive – and not quite correct – interpretation of these terms is that N(d_+)F is the probability of the option
expiring in the money, N(d_+), times the value of the underlying at expiry F, while N(d_-)K is the probability
of the option expiring in the money, N(d_-), times the value of the cash at expiry K. This is obviously incorrect, as
either both binaries expire in the money or both expire out of the money (either cash is exchanged for asset or it is
not), but the probabilities N(d_+) and N(d_-) are not equal. In fact, d_\pm can be interpreted as measures of
moneyness and N(d_\pm) as probabilities of expiring ITM, suitably interpreted, as discussed below. Simply put, the
interpretation of the cash option, N(d_-)K, is correct, as the value of the cash is independent of movements of the
underlying, and thus can be interpreted as a simple product of "probability times value", while the N(d_+)F is
more complicated, as the probability of expiring in the money and the value of the asset at expiry are not
independent.[9]
The use of d_- for moneyness rather than the simple moneyness \frac{1}{\sigma\sqrt{\tau}}\ln\frac{F}{K} – in other words, the
reason for the \tfrac{1}{2}\sigma^2 factor – is due to the difference between the median and mean of the log-normal distribution; it
is the same factor as in Itō's lemma applied to geometric Brownian motion. Further, another way to see that the naive
interpretation is incorrect is that replacing N(d+) by N(d−) in the formula yields a negative value for
out-of-the-money call options.[9]:6
    In detail, the terms                  are the probabilities of the option expiring in-the-money under the equivalent
    exponential martingale probability measure (numéraire=stock) and the equivalent martingale probability measure
    (numéraire=risk free asset), respectively.[9] The risk neutral probability density for the stock price     is



    where                 is defined as above.
    Specifically,         is the probability that the call will be exercised provided one assumes that the asset drift is the
    risk-free rate.         , however, does not lend itself to a simple probability interpretation.                is correctly
    interpreted as the present value, using the risk-free interest rate, of the expected asset price at expiration, given that
    the asset price at expiration is above the exercise price.[10] For related discussion – and graphical representation –
    see section "Interpretation" under Datar–Mathews method for real option valuation.
    The equivalent martingale probability measure is also called the risk-neutral probability measure. Note that both of
    these are probabilities in a measure theoretic sense, and neither of these is the true probability of expiring
    in-the-money under the real probability measure. To calculate the probability under the real (“physical”) probability
    measure, additional information is required—the drift term in the physical measure, or equivalently, the market price
    of risk.


    Derivation
    We now show how to get from the general Black–Scholes PDE to a specific valuation for an option. Consider as an
    example the Black–Scholes price of a call option, for which the PDE above has boundary conditions




    The last condition gives the value of the option at the time that the option matures. The solution of the PDE gives the
    value of the option at any earlier time,                              . To solve the PDE we recognize that it is a
    Cauchy–Euler equation which can be transformed into a diffusion equation by introducing the change-of-variable
    transformation




    Then the Black–Scholes PDE becomes a diffusion equation




    The terminal condition                                    now becomes an initial condition


    Using the standard method for solving a diffusion equation we have




    which, after some manipulations, yields


    where




    Reverting         to the original set of variables yields the above stated solution to the Black–Scholes equation.

    Other derivations
    Above we used the method of arbitrage-free pricing (“delta-hedging”) to derive the Black–Scholes PDE, and then
solved the PDE to get the valuation formula. It is also possible to derive the latter directly using a risk-neutrality
    argument.[9] This method gives the price as the expectation of the option payoff under a particular probability
    measure, called the risk-neutral measure, which differs from the real world measure. For the underlying logic see
    section "risk neutral valuation" under Rational pricing as well as section "Derivatives pricing: the Q world" under
    Mathematical finance; for detail, once again, see Hull.[8]:307–309


    The Greeks
    “The Greeks” measure the sensitivity to change of the option price under a slight change of a single parameter while
    holding the other parameters fixed. Formally, they are partial derivatives of the option price with respect to the
    independent variables (technically, one Greek, gamma, is a partial derivative of another Greek, called delta).
    The Greeks are not only important for the mathematical theory of finance, but for those actively involved in trading.
Financial institutions will typically set limits for the Greeks that their traders cannot exceed. Delta is the most
    important Greek and traders will zero their delta at the end of the day. Gamma and vega are also important but not as
    closely monitored.
    The Greeks for Black–Scholes are given in closed form below. They can be obtained by straightforward
    differentiation of the Black–Scholes formula.[11]


                  What       Calls                                                          Puts

                  delta      N(d_1)                                                         N(d_1) - 1 = -N(-d_1)

                  gamma      \frac{N'(d_1)}{S\sigma\sqrt{T-t}}                              (same as for calls)

                  vega       S\,N'(d_1)\sqrt{T-t}                                           (same as for calls)

                  theta      -\frac{S N'(d_1)\sigma}{2\sqrt{T-t}} - rKe^{-r(T-t)}N(d_2)     -\frac{S N'(d_1)\sigma}{2\sqrt{T-t}} + rKe^{-r(T-t)}N(-d_2)

                  rho        K(T-t)e^{-r(T-t)}N(d_2)                                        -K(T-t)e^{-r(T-t)}N(-d_2)

    Note that the gamma and vega formulas are the same for calls and puts. This can be seen directly from put–call
    parity, since the difference of a put and a call is a forward, which is linear in S and independent of σ (so the gamma
    and vega of a forward vanish).
    In practice, some sensitivities are usually quoted in scaled-down terms, to match the scale of likely changes in the
    parameters. For example, rho is often reported multiplied by 10,000 (1bp rate change), vega by 100 (1 vol point
    change), and theta by 365 or 252 (1 day decay based on either calendar days or trading days per year).
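For illustration, the closed-form expressions in the table can be coded up directly. The Python sketch below (names ours) returns the un-scaled Greeks of a European call.

from math import exp, log, sqrt
from scipy.stats import norm

def call_greeks(s, k, r, sigma, tau):
    """Closed-form Black–Scholes Greeks for a European call (per-unit changes,
    i.e. without the market scaling conventions mentioned above)."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    pdf_d1 = norm.pdf(d1)
    return {
        "delta": norm.cdf(d1),
        "gamma": pdf_d1 / (s * sigma * sqrt(tau)),
        "vega":  s * pdf_d1 * sqrt(tau),
        "theta": -s * pdf_d1 * sigma / (2 * sqrt(tau)) - r * k * exp(-r * tau) * norm.cdf(d2),
        "rho":   k * tau * exp(-r * tau) * norm.cdf(d2),
    }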


    Extensions of the model
    The above model can be extended for variable (but deterministic) rates and volatilities. The model may also be used
    to value European options on instruments paying dividends. In this case, closed-form solutions are available if the
    dividend is a known proportion of the stock price. American options and options on stocks paying a known cash
    dividend (in the short term, more realistic than a proportional dividend) are more difficult to value, and a choice of
    solution techniques is available (for example lattices and grids).


    Instruments paying continuous yield dividends
    For options on indexes, it is reasonable to make the simplifying assumption that dividends are paid continuously, and
    that the dividend amount is proportional to the level of the index.
The dividend payment paid over the time period [t, t+dt] is then modelled as

    q\,S_t\,dt

for some constant q (the dividend yield).
Under this formulation the arbitrage-free price implied by the Black–Scholes model can be shown to be

    C(S,t) = e^{-r(T-t)}\left[F N(d_1) - K N(d_2)\right]

and

    P(S,t) = e^{-r(T-t)}\left[K N(-d_2) - F N(-d_1)\right]

where now

    F = e^{(r-q)(T-t)}\,S

is the modified forward price that occurs in the terms d_1, d_2:

    d_1 = \frac{\ln(F/K) + \tfrac{1}{2}\sigma^2(T-t)}{\sigma\sqrt{T-t}}

and

    d_2 = d_1 - \sigma\sqrt{T-t}.
    Exactly the same formula is used to price options on foreign exchange rates, except that now q plays the role of the
    foreign risk-free interest rate and S is the spot exchange rate. This is the Garman–Kohlhagen model (1983).
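As an illustration of the continuous-yield adjustment (and, with q read as the foreign rate, of the Garman–Kohlhagen model), the following Python sketch prices a call under a constant dividend yield; the function name is ours.

from math import exp, log, sqrt
from scipy.stats import norm

def bs_call_with_yield(s, k, r, q, sigma, tau):
    """European call under Black–Scholes with continuous dividend yield q
    (for FX, q is the foreign risk-free rate and s the spot exchange rate)."""
    f = s * exp((r - q) * tau)                           # modified forward price
    d1 = (log(f / k) + 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return exp(-r * tau) * (f * norm.cdf(d1) - k * norm.cdf(d2))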


    Instruments paying discrete proportional dividends
    It is also possible to extend the Black–Scholes framework to options on instruments paying discrete proportional
    dividends. This is useful when the option is struck on a single stock.
    A typical model is to assume that a proportion        of the stock price is paid out at pre-determined times            .
    The price of the stock is then modelled as


    where       is the number of dividends that have been paid by time       .
    The price of a call option on such a stock is again


    where now


    is the forward price for the dividend paying stock.


    American options
    The problem of finding the price of an American option is related to the optimal stopping problem of finding the
    time to execute the option. Since the American option can be exercised at any time before the expiration date, the
Black–Scholes equation becomes an inequality of the form

    \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV \le 0  [12]

With the terminal and (free) boundary conditions V(S,T) = H(S) and V(S,t) \ge H(S), where H(S)
denotes the payoff at stock price S.
    In general this inequality does not have a closed form solution, though an American call with no dividends is equal
    to a European call and the Roll-Geske-Whaley method provides a solution for an American call with one
    dividend.[13][14]
    Barone-Adesi and Whaley[15] is a further approximation formula. Here, the stochastic differential equation (which is
    valid for the value of any derivative) is split into two components: the European option value and the early exercise
    premium. With some assumptions, a quadratic equation that approximates the solution for the latter is then obtained.
    This solution involves finding the critical value,     , such that one is indifferent between early exercise and holding
    to maturity.[16][17]
    Bjerksund and Stensland[18] provide an approximation based on an exercise strategy corresponding to a trigger price.
    Here, if the underlying asset price is greater than or equal to the trigger price it is optimal to exercise, and the value
    must equal           , otherwise the option “boils down to: (i) a European up-and-out call option… and (ii) a rebate
    that is received at the knock-out date if the option is knocked out prior to the maturity date.” The formula is readily
modified for the valuation of a put option, using put–call parity. This approximation is computationally inexpensive
and the method is fast, with evidence indicating that the approximation may be more accurate in pricing long-dated
options than Barone-Adesi and Whaley.[19]
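Since no closed form is available in general, American options are commonly valued on a lattice. The following is a minimal Cox–Ross–Rubinstein binomial sketch for an American put, offered as an illustrative alternative to the analytic approximations discussed above, not as an implementation of them; the function name and default step count are ours.

import numpy as np

def american_put_crr(s, k, r, sigma, tau, steps=500):
    """Cox–Ross–Rubinstein binomial tree for an American put."""
    dt = tau / steps
    u = np.exp(sigma * np.sqrt(dt))           # up factor
    d = 1.0 / u                               # down factor
    disc = np.exp(-r * dt)
    p = (np.exp(r * dt) - d) / (u - d)        # risk-neutral up probability
    # Stock prices and payoffs at maturity.
    prices = s * u ** np.arange(steps, -1, -1) * d ** np.arange(0, steps + 1)
    values = np.maximum(k - prices, 0.0)
    # Step backwards, comparing continuation value with early exercise.
    for i in range(steps - 1, -1, -1):
        prices = s * u ** np.arange(i, -1, -1) * d ** np.arange(0, i + 1)
        values = disc * (p * values[:-1] + (1.0 - p) * values[1:])
        values = np.maximum(values, k - prices)
    return values[0]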


    Black–Scholes in practice
    The Black–Scholes model disagrees with reality in a number of ways,
    some significant. It is widely employed as a useful approximation, but
    proper application requires understanding its limitations – blindly
    following the model exposes the user to unexpected risk.
    Among the most significant limitations are:
    • the underestimation of extreme moves, yielding tail risk, which can
      be hedged with out-of-the-money options;
    • the assumption of instant, cost-less trading, yielding liquidity risk,
      which is difficult to hedge;
    • the assumption of a stationary process, yielding volatility risk,
      which can be hedged with volatility hedging;
    • the assumption of continuous time and continuous trading, yielding
      gap risk, which can be hedged with Gamma hedging.
In short, while in the Black–Scholes model one can perfectly hedge options by simply Delta hedging, in practice there
are many other sources of risk.

(Figure: The normality assumption of the Black–Scholes model does not capture extreme movements such as stock
market crashes.)

Results using the Black–Scholes model differ from real world prices because of simplifying assumptions of the
model. One significant
limitation is that in reality security prices do not follow a strict stationary log-normal process, nor is the risk-free
interest rate actually known (nor is it constant over time). The variance has been observed to be non-constant, leading
to models such as GARCH being used to model volatility changes. Pricing discrepancies between empirical prices and
the Black–Scholes model have long been observed in options that are far out-of-the-money, corresponding to extreme
price changes; such events would be very rare if returns were lognormally distributed, but are observed much more
often in practice.

    Nevertheless, Black–Scholes pricing is widely used in practice,[20][3]:751 for it is:
    • easy to calculate
    • a useful approximation, particularly when analyzing the direction in which prices move when crossing critical
      points
    • a robust basis for more refined models
• reversible, as the model's original output, price, can be used as an input and one of the other variables solved
  for; the implied volatility calculated in this way is often used to quote option prices (that is, as a quoting
  convention)
    The first point is self-evidently useful. The others can be further discussed:
    Useful approximation: although volatility is not constant, results from the model are often helpful in setting up
    hedges in the correct proportions to minimize risk. Even when the results are not completely accurate, they serve as a
    first approximation to which adjustments can be made.
    Basis for more refined models: The Black–Scholes model is robust in that it can be adjusted to deal with some of its
    failures. Rather than considering some parameters (such as volatility or interest rates) as constant, one considers
    them as variables, and thus added sources of risk. This is reflected in the Greeks (the change in option value for a
    change in these parameters, or equivalently the partial derivatives with respect to these variables), and hedging these
    Greeks mitigates the risk caused by the non-constant nature of these parameters. Other defects cannot be mitigated
    by modifying the model, however, notably tail risk and liquidity risk, and these are instead managed outside the
    model, chiefly by minimizing these risks and by stress testing.


Explicit modeling: this feature means that, rather than assuming a volatility a priori and computing prices from it, one
    can use the model to solve for volatility, which gives the implied volatility of an option at given prices, durations and
    exercise prices. Solving for volatility over a given set of durations and strike prices one can construct an implied
    volatility surface. In this application of the Black–Scholes model, a coordinate transformation from the price domain
    to the volatility domain is obtained. Rather than quoting option prices in terms of dollars per unit (which are hard to
    compare across strikes and tenors), option prices can thus be quoted in terms of implied volatility, which leads to
    trading of volatility in option markets.
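To illustrate "solving for volatility", the sketch below inverts the Black–Scholes formula numerically with a bracketing root-finder. It relies on the black_scholes call pricer sketched earlier, and the bracket [1e-6, 5.0] is an arbitrary assumption of ours.

from scipy.optimize import brentq

def implied_volatility(price, s, k, r, tau):
    """Volatility that makes the Black–Scholes call price match an observed price."""
    objective = lambda sigma: black_scholes(s, k, r, sigma, tau) - price
    return brentq(objective, 1e-6, 5.0)   # assumes the root lies in (1e-6, 5.0)

# Example: implied_volatility(10.45, 100, 100, 0.05, 1.0) is approximately 0.2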


    The volatility smile
    One of the attractive features of the Black–Scholes model is that the parameters in the model (other than the
    volatility) — the time to maturity, the strike, the risk-free interest rate, and the current underlying price – are
    unequivocally observable. All other things being equal, an option's theoretical value is a monotonic increasing
    function of implied volatility.
    By computing the implied volatility for traded options with different strikes and maturities, the Black–Scholes model
    can be tested. If the Black–Scholes model held, then the implied volatility for a particular stock would be the same
    for all strikes and maturities. In practice, the volatility surface (the 3D graph of implied volatility against strike and
    maturity) is not flat.
    The typical shape of the implied volatility curve for a given maturity depends on the underlying instrument. Equities
    tend to have skewed curves: compared to at-the-money, implied volatility is substantially higher for low strikes, and
    slightly lower for high strikes. Currencies tend to have more symmetrical curves, with implied volatility lowest
    at-the-money, and higher volatilities in both wings. Commodities often have the reverse behavior to equities, with
    higher implied volatility for higher strikes.
    Despite the existence of the volatility smile (and the violation of all the other assumptions of the Black–Scholes
    model), the Black–Scholes PDE and Black–Scholes formula are still used extensively in practice. A typical approach
    is to regard the volatility surface as a fact about the market, and use an implied volatility from it in a Black–Scholes
    valuation model. This has been described as using "the wrong number in the wrong formula to get the right
    price."[21] This approach also gives usable values for the hedge ratios (the Greeks).
    Even when more advanced models are used, traders prefer to think in terms of volatility as it allows them to evaluate
    and compare options of different maturities, strikes, and so on.


    Valuing bond options
    Black–Scholes cannot be applied directly to bond securities because of pull-to-par. As the bond reaches its maturity
    date, all of the prices involved with the bond become known, thereby decreasing its volatility, and the simple
    Black–Scholes model does not reflect this process. A large number of extensions to Black–Scholes, beginning with
    the Black model, have been used to deal with this phenomenon.[22] See Bond option: Valuation.


    Interest-rate curve
    In practice, interest rates are not constant – they vary by tenor, giving an interest rate curve which may be
    interpolated to pick an appropriate rate to use in the Black–Scholes formula. Another consideration is that interest
rates vary over time. This volatility may make a significant contribution to the price, especially of long-dated
options. This is analogous to the inverse relationship between bond prices and interest rates.


    Short stock rate
    It is not free to take a short stock position. Similarly, it may be possible to lend out a long stock position for a small
    fee. In either case, this can be treated as a continuous dividend for the purposes of a Black–Scholes valuation,
    provided that there is no glaring asymmetry between the short stock borrowing cost and the long stock lending
    income.


    Criticism
    Espen Gaarder Haug and Nassim Nicholas Taleb argue that the Black–Scholes model merely recast existing widely
    used models in terms of practically impossible "dynamic hedging" rather than "risk," to make them more compatible
    with mainstream neoclassical economic theory.[23] Similar arguments were made in an earlier paper by Emanuel
    Derman and Nassim Taleb.[24] In response, Paul Wilmott has defended the model.[20][25]
    Jean-Philippe Bouchaud argues: Reliance on models based on incorrect axioms has clear and large effects. The
    Black–Scholes model,[26] for example, which was invented in 1973 to price options, is still used extensively. But it
    assumes that the probability of extreme price changes is negligible, when in reality, stock prices are much jerkier
    than this. Unwarranted use of the model spiralled into the worldwide October 1987 crash; the Dow Jones index
dropped 23% in a single day, dwarfing recent market hiccups. Using the Student's t-distribution in place of the
normal distribution as the basis for the valuation of options can better take into account extreme events.


    Notes
    [1] "Scholes" (http:/ / www. merriam-webster. com/ dictionary/ scholes). . Retrieved March 26, 2012.
    [2] MacKenzie, Donald (2006). An Engine, Not a Camera: How Financial Models Shape Markets. Cambridge, MA: MIT Press.
        ISBN 0-262-13460-8.
    [3] Bodie, Zvi; Alex Kane, Alan J. Marcus (2008). Investments (7th ed.). New York: McGraw-Hill/Irwin. ISBN 978-0-07-326967-2.
    [4] "Nobel prize foundation, 1997 Press release" (http:/ / nobelprize. org/ nobel_prizes/ economics/ laureates/ 1997/ press. html). October 14,
        1997. . Retrieved March 26, 2012.
    [5] Although the original model assumed no dividends, trivial extensions to the model can accommodate a continuous dividend yield factor.
[6] Black, Fischer; Scholes, Myron (1973). "The Pricing of Options and Corporate Liabilities". Journal of Political Economy 81 (3): 637–654. doi:10.1086/260062.
[7] Merton, Robert (1973). "Theory of Rational Option Pricing". Bell Journal of Economics and Management Science 4 (1): 141–183.
    doi:10.2307/3003143.
    [8] Hull, John C. (2008). Options, Futures and Other Derivatives (7 ed.). Prentice Hall. ISBN 0-13-505283-1.
    [9] Nielsen, Lars Tyge (1993). "Understanding N(d1) and N(d2): Risk-Adjusted Probabilities in the Black-Scholes Model" (http:/ / www.
        ltnielsen. com/ wp-content/ uploads/ Understanding. pdf). Revue Finance (Journal of the French Finance Association) 14 ( 1 (http:/ / www.
        affi. asso. fr/ TPL_CODE/ TPL_REVUE/ PAR_TPL_IDENTIFIANT/ 53/ 193-publications. htm)): 95–106. . Retrieved 2012 Dec 8, earlier
        circulated as INSEAD Working Paper 92/71/FIN (http:/ / librarycatalogue. insead. edu/ bib/ 972) (1992); abstract (http:/ / www. ltnielsen.
        com/ papers/ understanding-nd1-and-nd2-risk-adjusted-probabilities-in-the-black-scholes-model) and link to article, published article (http:/ /
        www. affi. asso. fr/ TPL_CODE/ TPL_REVUEARTICLEDOWNLOAD/ PAR_TPL_IDENTIFIANT/ 187/ 193-publications. htm).
    [10] Don Chance (June 3, 2011). "Derivation and Interpretation of the Black–Scholes Model" (http:/ / www. bus. lsu. edu/ academics/ finance/
        faculty/ dchance/ Instructional/ TN99-02. pdf) (PDF). . Retrieved March 27, 2012.
[11] Although with significant algebra; see, for example, Hong-Yi Chen, Cheng-Few Lee and Weikang Shih (2010). Derivations and
    Applications of Greek Letters: Review and Integration, Handbook of Quantitative Finance and Risk Management, III:491–503.
    [12] André Jaun. "The Black-Scholes equation for American options" (http:/ / www. lifelong-learners. com/ opt/ com/ SYL/ s6node6. php). .
        Retrieved May 5, 2012.
    [13] Bernt Ødegaard (2003). "Extending the Black Scholes formula" (http:/ / finance. bi. no/ ~bernt/ gcc_prog/ recipes/ recipes/ node9.
        html#SECTION00920000000000000000). . Retrieved May 5, 2012.
    [14] Don Chance (2008). "Closed-Form American Call Option Pricing: Roll-Geske-Whaley" (http:/ / www. bus. lsu. edu/ academics/ finance/
        faculty/ dchance/ Instructional/ TN98-01. pdf). . Retrieved May 16, 2012.
[15] Giovanni Barone-Adesi and Robert E. Whaley (June 1987). "Efficient analytic approximation of American option values" (http://ideas.repec.org/a/bla/jfinan/v42y1987i2p301-20.html). Journal of Finance 42 (2): 301–20.

    [16] Bernt Ødegaard (2003). "A quadratic approximation to American prices due to Barone-Adesi and Whaley" (http:/ / finance. bi. no/ ~bernt/
        gcc_prog/ recipes/ recipes/ node13. html). . Retrieved June 25, 2012.
    [17] Don Chance (2008). "Approximation Of American Option Values: Barone-Adesi-Whaley" (http:/ / www. bus. lsu. edu/ academics/ finance/
        faculty/ dchance/ Instructional/ TN98-02. pdf). . Retrieved June 25, 2012.
    [18] Petter Bjerksund and Gunnar Stensland, 2002. Closed Form Valuation of American Options (http:/ / brage. bibsys. no/ nhh/ bitstream/
        URN:NBN:no-bibsys_brage_22301/ 1/ bjerksund petter 0902. pdf)
    [19] American options (http:/ / www. global-derivatives. com/ index. php?option=com_content& task=view& id=14)
    [20] Paul Wilmott (2008): In defence of Black Scholes and Merton (http:/ / www. wilmott. com/ blogs/ paul/ index. cfm/ 2008/ 4/ 29/
        Science-in-Finance-IX-In-defence-of-Black-Scholes-and-Merton), Dynamic hedging and further defence of Black-Scholes (http:/ / www.
        wilmott. com/ blogs/ paul/ index. cfm/ 2008/ 7/ 23/ Science-in-Finance-X-Dynamic-hedging-and-further-defence-of-BlackScholes)
    [21] Riccardo Rebonato (1999). Volatility and correlation in the pricing of equity, FX and interest-rate options. Wiley. ISBN 0-471-89998-4.
    [22] Kalotay, Andrew (November 1995). "The Problem with Black, Scholes et al." (http:/ / kalotay. com/ sites/ default/ files/ private/
        BlackScholes. pdf) (PDF). Derivatives Strategy. .
    [23] Espen Gaarder Haug and Nassim Nicholas Taleb (2011). Option Traders Use (very) Sophisticated Heuristics, Never the
        Black–Scholes–Merton Formula (http:/ / papers. ssrn. com/ sol3/ papers. cfm?abstract_id=1012075). Journal of Economic Behavior and
        Organization, Vol. 77, No. 2, 2011
    [24] Emanuel Derman and Nassim Taleb (2005). The illusions of dynamic replication (http:/ / www. ederman. com/ new/ docs/
        qf-Illusions-dynamic. pdf), Quantitative Finance, Vol. 5, No. 4, August 2005, 323–326
    [25] See also: Doriana Ruffinno and Jonathan Treussard (2006). Derman and Taleb’s The Illusions of Dynamic Replication: A Comment (http:/ /
        wayback. archive. org/ web/ */ http:/ / www. bu. edu/ econ/ workingpapers/ papers/ RuffinoTreussardDT. pdf), WP2006-019, Boston
        University - Department of Economics.
    [26] Bouchaud, Jean-Philippe (October 30, 2008). "Economics needs a scientific revolution" (http:/ / arxiv. org/ abs/ 0810. 5306v1). Nature 455:
        1181. doi:10.1038/4551181a. .



    References

    Primary references
    • Black, Fischer; Myron Scholes (1973). "The Pricing of Options and Corporate Liabilities". Journal of Political
      Economy 81 (3): 637–654. doi:10.1086/260062. (http://links.jstor.org/sici?sici=0022-3808(197305/
      06)81:3<637:TPOOAC>2.0.CO;2-P) (Black and Scholes' original paper.)
    • Merton, Robert C. (1973). "Theory of Rational Option Pricing". Bell Journal of Economics and Management
      Science (The RAND Corporation) 4 (1): 141–183. doi:10.2307/3003143. JSTOR 3003143. (http://links.jstor.
      org/sici?sici=0005-8556(197321)4:1<141:TOROP>2.0.CO;2-0&origin=repec)
    • Hull, John C. (1997). Options, Futures, and Other Derivatives. Prentice Hall. ISBN 0-13-601589-1.


    Historical and sociological aspects
    • Bernstein, Peter (1992). Capital Ideas: The Improbable Origins of Modern Wall Street. The Free Press.
      ISBN 0-02-903012-9.
    • MacKenzie, Donald (2003). "An Equation and its Worlds: Bricolage, Exemplars, Disunity and Performativity in
      Financial Economics". Social Studies of Science 33 (6): 831–868. doi:10.1177/0306312703336002. (http://sss.
      sagepub.com/cgi/content/abstract/33/6/831)
    • MacKenzie, Donald; Yuval Millo (2003). "Constructing a Market, Performing Theory: The Historical Sociology
      of a Financial Derivatives Exchange". American Journal of Sociology 109 (1): 107–145. doi:10.1086/374404.
      (http://www.journals.uchicago.edu/AJS/journal/issues/v109n1/060259/brief/060259.abstract.html)
    • MacKenzie, Donald (2006). An Engine, not a Camera: How Financial Models Shape Markets. MIT Press.
      ISBN 0-262-13460-8.


    Further reading
    • Haug, E. G (2007). "Option Pricing and Hedging from Theory to Practice". Derivatives: Models on Models.
      Wiley. ISBN 978-0-470-01322-9. The book gives a series of historical references supporting the theory that
      option traders use much more robust hedging and pricing principles than the Black, Scholes and Merton model.
    • Triana, Pablo (2009). Lecturing Birds on Flying: Can Mathematical Theories Destroy the Financial Markets?.
      Wiley. ISBN 978-0-470-40675-5. The book takes a critical look at the Black, Scholes and Merton model.


    External links

    Discussion of the model
    • Ajay Shah. Black, Merton and Scholes: Their work and its consequences. Economic and Political Weekly,
      XXXII(52):3337–3342, December 1997 link (http://www.mayin.org/ajayshah/PDFDOCS/Shah1997_bms.
      pdf)
    • Inside Wall Street's Black Hole (http://www.portfolio.com/news-markets/national-news/portfolio/2008/02/
      19/Black-Scholes-Pricing-Model?print=true) by Michael Lewis, March 2008 Issue of portfolio.com
    • Whither Black–Scholes? (http://www.forbes.com/opinions/2008/04/07/
      black-scholes-options-oped-cx_ptp_{0}408black.html) by Pablo Triana, April 2008 Issue of Forbes.com
    • Black Scholes model lecture (http://wikilecture.org/Black_Scholes) by Professor Robert Shiller from Yale
    • The mathematical equation that caused the banks to crash (http://www.guardian.co.uk/science/2012/feb/12/
      black-scholes-equation-credit-crunch) by Ian Stewart in The Observer, February 12, 2012


    Derivation and solution
    • Derivation of the Black–Scholes Equation for Option Value (http://www.sjsu.edu/faculty/watkins/blacksch.
      htm), Prof. Thayer Watkins
    • Solution of the Black–Scholes Equation Using the Green's Function (http://www.physics.uci.edu/~silverma/
      bseqn/bs/bs.html), Prof. Dennis Silverman
    • Solution via risk neutral pricing or via the PDE approach using Fourier transforms (http://homepages.nyu.edu/
      ~sl1544/KnownClosedForms.pdf) (includes discussion of other option types), Simon Leger
    • Step-by-step solution of the Black–Scholes PDE (http://planetmath.org/encyclopedia/
      AnalyticSolutionOfBlackScholesPDE.html), planetmath.org.
    • On the Black–Scholes Equation: Various Derivations (http://www.stanford.edu/~japrimbs/Publications/
      OnBlackScholesEq.pdf), Manabu Kishimoto
    • The Black–Scholes Equation (http://terrytao.wordpress.com/2008/07/01/the-black-scholes-equation/)
      Expository article by mathematician Terence Tao.


    Revisiting the model
    • When You Cannot Hedge Continuously: The Corrections to Black–Scholes (http://www.ederman.com/new/
      docs/risk-non_continuous_hedge.pdf), Emanuel Derman
    • Arbitrage and Stock Option Pricing: A Fresh Look At The Binomial Model (http://www.soa.org/library/
      newsletters/risks-and-rewards/2011/august/rar-2011-iss58-joss.pdf)


    Computer implementations
    • Black–Scholes in Multiple Languages (http://www.espenhaug.com/black_scholes.html), espenhaug.com
    • Chicago Option Pricing Model (Graphing Version) (http://sourceforge.net/projects/chipricingmodel/),
      sourceforge.net


    • Black-Scholes-Merton Implied Volatility Surface Model (Java) (https://github.com/OpenGamma/
      OG-Platform/blob/master/projects/OG-Analytics/src/com/opengamma/analytics/financial/model/volatility/
      surface/BlackScholesMertonImpliedVolatilitySurfaceModel.java), github.com


    Historical
    • Trillion Dollar Bet (http://www.pbs.org/wgbh/nova/stockmarket/)—Companion Web site to a Nova episode
      originally broadcast on February 8, 2000. "The film tells the fascinating story of the invention of the
      Black–Scholes Formula, a mathematical Holy Grail that forever altered the world of finance and earned its
      creators the 1997 Nobel Prize in Economics."
    • BBC Horizon (http://www.bbc.co.uk/science/horizon/1999/midas.shtml) A TV-programme on the so-called
      Midas formula and the bankruptcy of Long-Term Capital Management (LTCM)
    • BBC News Magazine (http://www.bbc.co.uk/news/magazine-17866646) Black-Scholes: The maths formula
      linked to the financial crash (April 27, 2012 article)



    Black model
    The Black model (sometimes known as the Black-76 model) is a variant of the Black–Scholes option pricing
    model. Its primary applications are for pricing bond options, interest rate caps / floors, and swaptions. It was first
    presented in a paper written by Fischer Black in 1976.
    Black's model can be generalized into a class of models known as log-normal forward models, also referred to as
    LIBOR market model.


    The Black formula
    The Black formula is similar to the Black–Scholes formula for valuing stock options except that the spot price of the
    underlying is replaced by a discounted futures price F.
Suppose there is constant risk-free interest rate r and the futures price F(t) of a particular underlying is log-normal
with constant volatility σ. Then the Black formula states the price for a European call option of maturity T on a
futures contract with strike price K and delivery date T' (with T' \ge T) is

    c = e^{-rT}\left[F N(d_1) - K N(d_2)\right]

The corresponding put price is

    p = e^{-rT}\left[K N(-d_2) - F N(-d_1)\right]

where

    d_1 = \frac{\ln(F/K) + \tfrac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}
    d_2 = d_1 - \sigma\sqrt{T}

and N(.) is the cumulative normal distribution function.
Note that T' does not appear in the formulae even though it could be greater than T. This is because futures contracts are marked to market and so the payoff is realized when the option is exercised. If we consider an option on a forward contract expiring at time T' > T, the payoff does not occur until T'. Thus the discount factor $e^{-rT}$ is replaced by $e^{-rT'}$, since one must take into account the time value of money. The difference in the two cases is clear from the derivation below.
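As a rough illustration, the formulas above can be coded directly. The following Python sketch assumes the call and put prices stated above; the function name black76 and its argument layout are illustrative, not taken from any particular library.

```python
# Hedged sketch of the Black (1976) formula as stated above.
from math import log, sqrt, exp
from statistics import NormalDist

def black76(F, K, T, r, sigma, call=True):
    """Price of a European option on a futures price F under the Black model."""
    N = NormalDist().cdf                      # standard normal CDF
    d1 = (log(F / K) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    df = exp(-r * T)                          # discount factor e^{-rT}
    if call:
        return df * (F * N(d1) - K * N(d2))
    return df * (K * N(-d2) - F * N(-d1))

# Example: call on a futures price of 100, strike 95, one year, r = 3%, sigma = 20%
# print(black76(100.0, 95.0, 1.0, 0.03, 0.20))
```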


    Derivation and assumptions
    The Black formula is easily derived from use of Margrabe's formula, which in turn is a simple, but clever,
    application of the Black–Scholes formula.
The payoff of the call option on the futures contract is max(0, F(T) − K). We can consider this an exchange (Margrabe) option by considering the first asset to be $e^{-r(T-t)}F(t)$ and the second asset to be the riskless bond paying off $1 at time T. Then the call option is exercised at time T when the first asset is worth more than K riskless bonds. The assumptions of Margrabe's formula are satisfied with these assets.
The only remaining thing to check is that the first asset is indeed an asset. This can be seen by considering a portfolio formed at time 0 by going long a forward contract with delivery date T and long F(0) riskless bonds (note that under the deterministic interest rate, the forward and futures prices are equal, so there is no ambiguity here). Then at any time t you can unwind your obligation for the forward contract by shorting another forward with the same delivery date to get the difference in forward prices, but discounted to present value: $e^{-r(T-t)}\left[F(t) - F(0)\right]$. Liquidating the F(0) riskless bonds, each of which is worth $e^{-r(T-t)}$, results in a net payoff of $e^{-r(T-t)}F(t)$.


    External links
    Discussion
    • Bond Options, Caps and the Black Model [1] Dr. Milica Cudina, University of Texas at Austin
    Online tools
    • Caplet And Floorlet Calculator [2] Dr. Shing Hing Man, Thomson-Reuters' Risk Management
    • 'Greeks' Calculator using the Black model [3], Razvan Pascalau, Univ. of Alabama


    References
    • Black, Fischer (1976). The pricing of commodity contracts, Journal of Financial Economics, 3, 167-179.
    • Garman, Mark B. and Steven W. Kohlhagen (1983). Foreign currency option values, Journal of International
      Money and Finance, 2, 231-237.
• Miltersen, K., Sandmann, K. and Sondermann, D. (1997): "Closed Form Solutions for Term Structure Derivatives with Log-Normal Interest Rates", Journal of Finance, 52(1), 409-430.


    References
[1] https://www.ma.utexas.edu/users/mcudina/Lecture24_3.pdf
[2] http://lombok.demon.co.uk/financialTap/options/bond/shortterm
[3] http://www.cba.ua.edu/~rpascala/greeks2/GFOPMForm.php



    Black–Derman–Toy model
    In finance, the Black–Derman–Toy model (BDT) is a popular short rate model used in the pricing of bond options,
    swaptions and other interest rate derivatives. It is a one-factor model; that is, a single stochastic factor – the short
    rate – determines the future evolution of all interest rates. It was the first model to combine the mean-reverting
behaviour of the short rate with the lognormal distribution,[1] and is still widely used.[2][3]
    The model was introduced by Fischer Black, Emanuel Derman, and Bill Toy. It was first developed for in-house use
    by Goldman Sachs in the 1980s and was published in the Financial Analysts Journal in 1990. A personal account of
    the development of the model is provided in one of the chapters in Emanuel Derman's memoir "My Life as a
    Quant."[4]
Under BDT, using a binomial lattice, one calibrates the model parameters to fit both the current term structure of interest rates (yield curve) and the volatility structure for interest rate caps (usually as implied by the Black-76 prices for each component caplet). Using the calibrated lattice one can then value a variety of more complex interest-rate-sensitive securities and interest rate derivatives. Calibration here means that:
1. We assume the probability of an up move is 50%.
2. For each input spot rate, we: (a) iteratively adjust the rate at the top-most node at the current time-step i; (b) find all other nodes in the time-step, where these are linked to the node immediately above via $0.5\,\ln(r_u/r_d) = \sigma_i\sqrt{\Delta t}$; (c) discount recursively through the tree, from the time-step in question back to the first node in the tree; (d) repeat this until the calculated spot rate (i.e. the discount factor at the first node in the tree) equals the assumed spot rate.
3. Once solved, we retain these known short rates and proceed to the next time-step (i.e. input spot rate), "growing" the tree until it incorporates the full input yield curve (a sketch of this recipe is given below).
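A minimal sketch of the calibration recipe above, assuming constant short-rate volatility and 50/50 branching; the function calibrate_bdt, its inputs, and the use of plain bisection are illustrative choices, not the original implementation.

```python
# Illustrative BDT calibration for a constant-volatility tree.
import math

def calibrate_bdt(spot_rates, sigma, dt):
    """spot_rates[i]: continuously compounded zero rate for maturity (i + 1) * dt."""
    tree = [[spot_rates[0]]]                      # step 0: one node, fixed by the first spot rate
    for i in range(1, len(spot_rates)):
        target = math.exp(-spot_rates[i] * (i + 1) * dt)    # zero-coupon price to reproduce

        def model_price(r_top):
            # nodes at step i obey ln(r_u / r_d) = 2 * sigma * sqrt(dt)
            rates = [r_top * math.exp(-2 * sigma * math.sqrt(dt) * j) for j in range(i + 1)]
            values = [math.exp(-r * dt) for r in rates]     # unit payoff one step later
            for k in range(i - 1, -1, -1):                  # discount back with 50/50 probabilities
                values = [0.5 * (values[j] + values[j + 1]) * math.exp(-tree[k][j] * dt)
                          for j in range(k + 1)]
            return values[0]

        lo, hi = 1e-9, 1.0                                  # assumes the top rate lies in (0, 100%)
        for _ in range(100):                                # simple bisection on the top-node rate
            mid = 0.5 * (lo + hi)
            if model_price(mid) > target:
                lo = mid                                    # model price decreases in the rate
            else:
                hi = mid
        r_top = 0.5 * (lo + hi)
        tree.append([r_top * math.exp(-2 * sigma * math.sqrt(dt) * j) for j in range(i + 1)])
    return tree
```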
Although initially developed for a lattice-based environment, the model has been shown to imply the following continuous stochastic differential equation:[5][1]

$$d\ln r_t = \left[\theta_t + \frac{\sigma'_t}{\sigma_t}\,\ln r_t\right]dt + \sigma_t\,dW_t$$

where
$r_t$ = the instantaneous short rate at time t
$\theta_t$ = a time-dependent drift, determined by calibration to the initial term structure
$\sigma_t$ = the instantaneous short-rate volatility
$W_t$ = a standard Brownian motion under a risk-neutral probability measure; $dW_t$ its differential.
For constant (time-independent) short-rate volatility, $\sigma_t = \sigma$, the model is:

$$d\ln r_t = \theta_t\,dt + \sigma\,dW_t$$
One reason that the model remains popular is that the "standard" root-finding algorithms – such as Newton's method (the secant method) or bisection – are very easily applied to the calibration.[6] Relatedly, the model was originally described in algorithmic language, and not using stochastic calculus or martingales.[7]


    References
[1] http://janroman.dhis.org/finance/Interest%20Rates/3%20interest%20rates%20models.pdf
[2] http://books.google.com/books?id=GnR3g9lvwfkC&pg=PP1&dq=Fixed+income+analysis+By+Frank+J.+Fabozzi,+Mark+Jonathan+Paul+Anson&ei=tpTVS7LjKILYNoPk7I8I&cd=1#v=snippet&q=Black-Derman-Toy&f=false
[3] http://www.soa.org/library/professional-actuarial-specialty-guides/professional-actuarial-specialty-guides/2003/september/spg0308alm.pdf
[4] http://www.ederman.com/new/index.html
[5] http://help.derivativepricing.com/2327.htm
[6] http://www.cfapubs.org/toc/rf/2001/2001/4
[7] http://www.ederman.com/new/docs/fen-interview.html

    • Benninga, S.; Wiener, Z. (1998). "Binomial Term Structure Models" (http://pluto.mscc.huji.ac.il/~mswiener/
      research/Benninga73.pdf). Mathematica in Education and Research: vol.7 No. 3.
    • Black, F.; Derman, E. and Toy, W. (January–February 1990). "A One-Factor Model of Interest Rates and Its
      Application to Treasury Bond Options" (http://savage.wharton.upenn.edu/FNCE-934/syllabus/papers/
      Black_Derman_Toy_FAJ_90.pdf). Financial Analysts Journal: 24–32.
    • Boyle, P.; Tan, K. and Tian, W. (2001). "Calibrating the Black–Derman–Toy model: some theoretical results"
      (http://belkcollegeofbusiness.uncc.edu/wtian1/bdt.pdf). Applied Mathematical Finance: 8, 27–48.
    • Hull, J. (2008). "The Black, Derman, and Toy Model" (http://www.rotman.utoronto.ca/~hull/TechnicalNotes/
      TechnicalNote23.pdf). Technical Note No. 23, Options, Futures, and Other Derivatives.
    • Klose, C.; Li C. Y. (2003). "Implementation of the Black, Derman and Toy Model" (http://www.lcy.net/files/
      BDT_Seminar_Paper.pdf). Seminar Financial Engineering, University of Vienna.


    External links
    • Online: Black-Derman-Toy short rate tree generator (http://lombok.demon.co.uk/financialTap/interestrates/
      bdtshortrates) Dr. Shing Hing Man, Thomson-Reuters' Risk Management
    • Online: Pricing A Bond Using the BDT Model (http://lombok.demon.co.uk/financialTap/interestrates/
      bdtbond) Dr. Shing Hing Man, Thomson-Reuters' Risk Management



    Cox–Ingersoll–Ross model
    In mathematical finance, the Cox–Ingersoll–Ross model (or CIR
    model) describes the evolution of interest rates. It is a type of "one
    factor model" (short rate model) as it describes interest rate movements
    as driven by only one source of market risk. The model can be used in
    the valuation of interest rate derivatives. It was introduced in 1985 by
    John C. Cox, Jonathan E. Ingersoll and Stephen A. Ross as an
    extension of the Vasicek model.


    The model                                                                               Three trajectories of CIR Processes

    The CIR model specifies that the instantaneous interest rate follows the
stochastic differential equation, also named the CIR process:

$$dr_t = a(b - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$$

where $W_t$ is a Wiener process modelling the random market risk factor.
The drift factor, $a(b - r_t)$, is exactly the same as in the Vasicek model. It ensures mean reversion of the interest rate towards the long-run value b, with speed of adjustment governed by the strictly positive parameter a.
The standard deviation factor, $\sigma\sqrt{r_t}$, avoids the possibility of negative interest rates for all positive values of a and b. An interest rate of zero is also precluded if the condition

$$2ab \ge \sigma^2$$
    is met. More generally, when the rate is at a low level (close to zero), the standard deviation also becomes close to
    zero, which dampens the effect of the random shock on the rate. Consequently, when the rate gets close to zero, its
    evolution becomes dominated by the drift factor, which pushes the rate upwards (towards equilibrium).
    The same process is used in the Heston model to model stochastic volatility.
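A minimal simulation sketch of the dynamics above using a simple Euler discretization; the max(r, 0) truncation inside the square root is an assumed, common convenience, not part of the model specification.

```python
# Euler discretization of dr = a(b - r) dt + sigma * sqrt(r) dW (illustrative sketch).
import math, random

def simulate_cir(r0, a, b, sigma, dt, n_steps, seed=0):
    rng = random.Random(seed)
    r, path = r0, [r0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))                 # Brownian increment over dt
        r = r + a * (b - r) * dt + sigma * math.sqrt(max(r, 0.0)) * dW
        path.append(r)
    return path
```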


    Future distribution
The distribution of future values of a CIR process can be computed in closed form:

$$r_{t+T} = \frac{Y}{2c},$$

where $c = \frac{2a}{(1 - e^{-aT})\,\sigma^2}$, and Y is a non-central chi-squared distribution with $\frac{4ab}{\sigma^2}$ degrees of freedom and non-centrality parameter $2c\,r_t\,e^{-aT}$.
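A sketch of sampling this transition law directly with NumPy's non-central chi-squared generator; the function name and argument layout are illustrative, but the parameters follow the formulas above.

```python
# Exact sampling of the CIR transition distribution stated above (illustrative sketch).
import numpy as np

def sample_cir_exact(r_t, a, b, sigma, T, size=1, rng=None):
    rng = rng or np.random.default_rng()
    c = 2 * a / ((1 - np.exp(-a * T)) * sigma**2)
    df = 4 * a * b / sigma**2                          # degrees of freedom
    nonc = 2 * c * r_t * np.exp(-a * T)                # non-centrality parameter
    return rng.noncentral_chisquare(df, nonc, size) / (2 * c)
```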

    Bond pricing
Under the no-arbitrage assumption, a bond may be priced using this interest rate process. The bond price is exponential affine in the interest rate:

$$P(t,T) = A(t,T)\,e^{-B(t,T)\,r_t}$$

where

$$A(t,T) = \left[\frac{2h\,e^{(a+h)(T-t)/2}}{2h + (a+h)\left(e^{(T-t)h} - 1\right)}\right]^{2ab/\sigma^2}, \qquad B(t,T) = \frac{2\left(e^{(T-t)h} - 1\right)}{2h + (a+h)\left(e^{(T-t)h} - 1\right)}, \qquad h = \sqrt{a^2 + 2\sigma^2}.$$

    Extensions
    Time varying functions replacing coefficients can be introduced in the model in order to make it consistent with a
    pre-assigned term structure of interest rates and possibly volatilities. The most general approach is in Maghsoodi
    (1996). A more tractable approach is in Brigo and Mercurio (2001b) where an external time-dependent shift is added
    to the model for consistency with an input term structure of rates. A significant extension of the CIR model to the
case of stochastic mean and stochastic volatility is given by Lin Chen (1996) and is known as the Chen model. A CIR
    process is a special case of a basic affine jump diffusion, which still permits a closed-form expression for bond
    prices.


    References
    • Hull, John C. (2003). Options, Futures and Other Derivatives. Upper Saddle River, NJ: Prentice Hall.
      ISBN 0-13-009056-5.
    • Cox, J.C., J.E. Ingersoll and S.A. Ross (1985). "A Theory of the Term Structure of Interest Rates". Econometrica
      53: 385–407. doi:10.2307/1911242.
    • Maghsoodi, Y. (1996). "Solution of the extended CIR Term Structure and Bond Option Valuation". Mathematical
      Finance (6): 89–109.
• Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models — Theory and Practice with Smile, Inflation and Credit (2nd ed., 2006). Springer Verlag. ISBN 978-3-540-22149-4.
    • Brigo, Damiano and Fabio Mercurio (2001b). "A deterministic-shift extension of analytically tractable and
      time-homogeneous short rate models". Finance & Stochastics 5 (3): 369–388.



    Monte Carlo method
    Monte Carlo methods (or Monte Carlo experiments) are a class of computational algorithms that rely on repeated
    random sampling to compute their results. Monte Carlo methods are often used in computer simulations of physical
    and mathematical systems. These methods are most suited to calculation by a computer and tend to be used when it
    is infeasible to compute an exact result with a deterministic algorithm.[1] This method is also used to complement
    theoretical derivations.
    Monte Carlo methods are especially useful for simulating systems with many coupled degrees of freedom, such as
    fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model). They are used
    to model phenomena with significant uncertainty in inputs, such as the calculation of risk in business. They are
    widely used in mathematics, for example to evaluate multidimensional definite integrals with complicated boundary
    conditions. When Monte Carlo simulations have been applied in space exploration and oil exploration, their
    predictions of failures, cost overruns and schedule overruns are routinely better than human intuition or alternative
    "soft" methods.[2]
The term "Monte Carlo method" was coined in the 1940s by John von Neumann, Stanislaw Ulam and Nicholas Metropolis,
    while they were working on nuclear weapon projects (Manhattan Project) in the Los Alamos National Laboratory. It
    was named after the Monte Carlo Casino, a famous casino where Ulam's uncle often gambled away his money.[3]


    Introduction
    Monte Carlo methods vary, but tend to follow a particular pattern:
    1. Define a domain of possible inputs.
    2. Generate inputs randomly from a probability distribution over the
       domain.
    3. Perform a deterministic computation on the inputs.
    4. Aggregate the results.
    For example, consider a circle inscribed in a unit square. Given that the
    circle and the square have a ratio of areas that is π/4, the value of π can
    be approximated using a Monte Carlo method:[4]
1. Draw a square on the ground, then inscribe a circle within it.
2. Uniformly scatter some objects of uniform size (grains of rice or sand) over the square.
3. Count the number of objects inside the circle and the total number of objects.
4. The ratio of the two counts is an estimate of the ratio of the two areas, which is π/4. Multiply the result by 4 to estimate π.

[Figure: Monte Carlo method applied to approximating the value of π using 30,000 random points.]
    In this procedure the domain of inputs is the square that circumscribes our circle. We generate random inputs by
    scattering grains over the square then perform a computation on each input (test whether it falls within the circle).
    Finally, we aggregate the results to obtain our final result, the approximation of π.
There are two important considerations here. First, if grains are purposefully dropped into only the center of the circle, they are not uniformly distributed, so the approximation will be poor. Second, there should be a large number of inputs: the approximation is generally poor if only a few grains are randomly dropped into the whole square. On average, the approximation improves as more grains are dropped.
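The four steps above translate almost literally into code; the following Python sketch uses random points in place of scattered grains (the function name and sample size are illustrative).

```python
# Estimate pi by the inscribed-circle procedure described above.
import random

def estimate_pi(n_points=100_000, seed=1):
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)   # point in the circumscribing square
        if x * x + y * y <= 1.0:                        # inside the inscribed circle
            inside += 1
    return 4.0 * inside / n_points                      # ratio of areas is pi/4

# print(estimate_pi())   # roughly 3.14 for large n_points
```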


    History
    Before the Monte Carlo method was developed, simulations tested a previously understood deterministic problem
    and statistical sampling was used to estimate uncertainties in the simulations. Monte Carlo simulations invert this
    approach, solving deterministic problems using a probabilistic analog (see Simulated annealing).
An early variant of the Monte Carlo method can be seen in Buffon's needle experiment, in which π can be estimated by dropping needles on a floor made of parallel strips of wood. In the 1930s, Enrico Fermi first
    experimented with the Monte Carlo method while studying neutron diffusion, but did not publish anything on it.[3]
    In 1946, physicists at Los Alamos Scientific Laboratory were investigating radiation shielding and the distance that
    neutrons would likely travel through various materials. Despite having most of the necessary data, such as the
    average distance a neutron would travel in a substance before it collided with an atomic nucleus, and how much
    energy the neutron was likely to give off following a collision, the Los Alamos physicists were unable to solve the
    problem using conventional, deterministic mathematical methods. Stanisław Ulam had the idea of using random
    experiments. He recounts his inspiration as follows:
                 The first thoughts and attempts I made to practice [the Monte Carlo Method] were suggested by a
                 question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The
                 question was what are the chances that a Canfield solitaire laid out with 52 cards will come out
                 successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I
                 wondered whether a more practical method than "abstract thinking" might not be to lay it out say one
                 hundred times and simply observe and count the number of successful plays. This was already possible
                to envisage with the beginning of the new era of fast computers, and I immediately thought of problems
                of neutron diffusion and other questions of mathematical physics, and more generally how to change
                processes described by certain differential equations into an equivalent form interpretable as a
                succession of random operations. Later [in 1946], I described the idea to John von Neumann, and we
                began to plan actual calculations.
                       –Stanisław Ulam[5]
    Being secret, the work of von Neumann and Ulam required a code name. Von Neumann chose the name Monte
    Carlo. The name refers to the Monte Carlo Casino in Monaco where Ulam's uncle would borrow money to
gamble.[1][6][7] Using lists of "truly random" numbers was extremely slow, but von Neumann developed a
    way to calculate pseudorandom numbers, using the middle-square method. Though this method has been criticized
    as crude, von Neumann was aware of this: he justified it as being faster than any other method at his disposal, and
    also noted that when it went awry it did so obviously, unlike methods that could be subtly incorrect.
    Monte Carlo methods were central to the simulations required for the Manhattan Project, though severely limited by
    the computational tools at the time. In the 1950s they were used at Los Alamos for early work relating to the
    development of the hydrogen bomb, and became popularized in the fields of physics, physical chemistry, and
    operations research. The Rand Corporation and the U.S. Air Force were two of the major organizations responsible
    for funding and disseminating information on Monte Carlo methods during this time, and they began to find a wide
    application in many different fields.
    Uses of Monte Carlo methods require large amounts of random numbers, and it was their use that spurred the
    development of pseudorandom number generators, which were far quicker to use than the tables of random numbers
    that had been previously used for statistical sampling.


    Definitions
    There is no consensus on how Monte Carlo should be defined. For example, Ripley[8] defines most probabilistic
    modeling as stochastic simulation, with Monte Carlo being reserved for Monte Carlo integration and Monte Carlo
    statistical tests. Sawilowsky[9] distinguishes between a simulation, a Monte Carlo method, and a Monte Carlo
    simulation: a simulation is a fictitious representation of reality, a Monte Carlo method is a technique that can be used
    to solve a mathematical or statistical problem, and a Monte Carlo simulation uses repeated sampling to determine the
    properties of some phenomenon (or behavior). Examples:
    • Simulation: Drawing one pseudo-random uniform variable from the interval (0,1] can be used to simulate the
      tossing of a coin: If the value is less than or equal to 0.50 designate the outcome as heads, but if the value is
      greater than 0.50 designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation.
    • Monte Carlo method: The area of an irregular figure inscribed in a unit square can be determined by throwing
      darts at the square and computing the ratio of hits within the irregular figure to the total number of darts thrown.
      This is a Monte Carlo method of determining area, but not a simulation.
    • Monte Carlo simulation: Drawing a large number of pseudo-random uniform variables from the interval (0,1],
      and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation
      of the behavior of repeatedly tossing a coin.
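As a concrete rendering of the Monte Carlo simulation example in the last bullet above, one might write the following sketch (the function name and sample size are illustrative).

```python
# Monte Carlo simulation of repeated coin tossing, as described above.
import random

def simulate_coin_tosses(n=10_000, seed=2):
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if rng.uniform(0.0, 1.0) <= 0.5)
    return heads / n          # should be close to 0.5 for large n
```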
    Kalos and Whitlock[4] point out that such distinctions are not always easy to maintain. For example, the emission of
    radiation from atoms is a natural stochastic process. It can be simulated directly, or its average behavior can be
    described by stochastic equations that can themselves be solved using Monte Carlo methods. "Indeed, the same
    computer code can be viewed simultaneously as a 'natural simulation' or as a solution of the equations by natural
    sampling."


    Monte Carlo and random numbers
    Monte Carlo simulation methods do not always require truly random numbers to be useful — while for some
    applications, such as primality testing, unpredictability is vital.[10] Many of the most useful techniques use
    deterministic, pseudorandom sequences, making it easy to test and re-run simulations. The only quality usually
    necessary to make good simulations is for the pseudo-random sequence to appear "random enough" in a certain
    sense.
What this means depends on the application, but typically the sequence should pass a series of statistical tests. One of the simplest and most common tests is to check that the numbers are uniformly distributed, or follow another desired distribution, when a large enough number of elements of the sequence is considered.
    Sawilowsky lists the characteristics of a high quality Monte Carlo simulation:[9]
    • the (pseudo-random) number generator has certain characteristics (e.g., a long “period” before the sequence
      repeats)
    • the (pseudo-random) number generator produces values that pass tests for randomness
    • there are enough samples to ensure accurate results
    • the proper sampling technique is used
    • the algorithm used is valid for what is being modeled
    • it simulates the phenomenon in question.
    Pseudo-random number sampling algorithms are used to transform uniformly distributed pseudo-random numbers
    into numbers that are distributed according to a given probability distribution.
    Low-discrepancy sequences are often used instead of random sampling from a space as they ensure even coverage
    and normally have a faster order of convergence than Monte Carlo simulations using random or pseudorandom
    sequences. Methods based on their use are called quasi-Monte Carlo methods.


    Monte Carlo simulation versus "what if" scenarios
    There are ways of using probabilities that are definitely not Monte Carlo simulations—for example, deterministic
    modeling using single-point estimates. Each uncertain variable within a model is assigned a “best guess” estimate.
    Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded.[11]
By contrast, Monte Carlo simulations sample a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring.[12] For example, a comparison of a spreadsheet cost construction model run using traditional "what if" scenarios, and then run again with Monte Carlo simulation and triangular probability distributions, shows that the Monte Carlo analysis has a narrower range than the "what if" analysis. This is because the "what if" analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance).
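A toy sketch of this comparison; the cost items and their triangular ranges are made-up values purely for illustration.

```python
# Three-point "what if" scenarios versus a Monte Carlo run over triangular distributions.
import random

items = {                     # (best case, most likely, worst case) per cost item -- made up
    "labour":    (90, 100, 130),
    "materials": (40,  50,  80),
}

what_if = {case: sum(t[i] for t in items.values())
           for i, case in enumerate(["best", "likely", "worst"])}

rng = random.Random(3)
samples = [sum(rng.triangular(lo, hi, mode) for lo, mode, hi in items.values())
           for _ in range(10_000)]
samples.sort()

print(what_if)                                   # e.g. {'best': 130, 'likely': 150, 'worst': 210}
print(samples[500], samples[9500])               # central 90% of the Monte Carlo outcomes
```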


    Applications
    Monte Carlo methods are especially useful for simulating phenomena with significant uncertainty in inputs and
    systems with a large number of coupled degrees of freedom. Areas of application include:


    Physical sciences
    Monte Carlo methods are very important in computational physics, physical chemistry, and related applied fields,
    and have diverse applications from complicated quantum chromodynamics calculations to designing heat shields and
    aerodynamic forms. In statistical physics Monte Carlo molecular modeling is an alternative to computational
    molecular dynamics, and Monte Carlo methods are used to compute statistical field theories of simple particle and
    polymer systems.[13] Quantum Monte Carlo methods solve the many-body problem for quantum systems. In
    experimental particle physics, Monte Carlo methods are used for designing detectors, understanding their behavior
    and comparing experimental data to theory. In astrophysics, they are used in such diverse manners as to model both
    the evolution of galaxies[14] and the transmission of microwave radiation through a rough planetary surface.[15]
    Monte Carlo methods are also used in the ensemble models that form the basis of modern weather forecasting.


    Engineering
    Monte Carlo methods are widely used in engineering for sensitivity analysis and quantitative probabilistic analysis in
    process design. The need arises from the interactive, co-linear and non-linear behavior of typical process
    simulations. For example,
    • in microelectronics engineering, Monte Carlo methods are applied to analyze correlated and uncorrelated
      variations in analog and digital integrated circuits.
    • in geostatistics and geometallurgy, Monte Carlo methods underpin the design of mineral processing flowsheets
      and contribute to quantitative risk analysis.
    • in wind energy yield analysis, the predicted energy output of a wind farm during its lifetime is calculated giving
      different levels of uncertainty (P90, P50, etc.)
    • impacts of pollution are simulated[16] and diesel compared with petrol.[17]
    • In autonomous robotics, Monte Carlo localization can determine the position of a robot. It is often applied to
      stochastic filters such as the Kalman filter or Particle filter that forms the heart of the SLAM (Simultaneous
      Localization and Mapping) algorithm.


    Computational biology
Monte Carlo methods are used in computational biology, for example for Bayesian inference in phylogeny.
Biological systems such as proteins,[18] membranes,[19] and images of cancer[20] are being studied by means of computer simulations.
    The systems can be studied in the coarse-grained or ab initio frameworks depending on the desired accuracy.
    Computer simulations allow us to monitor the local environment of a particular molecule to see if some chemical
    reaction is happening for instance. We can also conduct thought experiments when the physical experiments are not
    feasible, for instance breaking bonds, introducing impurities at specific sites, changing the local/global structure, or
    introducing external fields.


    Computer Graphics
    Path Tracing, occasionally referred to as Monte Carlo Ray Tracing, renders a 3D scene by randomly tracing samples
    of possible light paths. Repeated sampling of any given pixel will eventually cause the average of the samples to
    converge on the correct solution of the rendering equation, making it one of the most physically accurate 3D
    graphics rendering methods in existence.


    Applied statistics
    In applied statistics, Monte Carlo methods are generally used for two purposes:
    1. To compare competing statistics for small samples under realistic data conditions. Although Type I error and
       power properties of statistics can be calculated for data drawn from classical theoretical distributions (e.g., normal
curve, Cauchy distribution) for asymptotic conditions (i.e., infinite sample size and infinitesimally small treatment
       effect), real data often do not have such distributions.[21]
    2. To provide implementations of hypothesis tests that are more efficient than exact tests such as permutation tests
       (which are often impossible to compute) while being more accurate than critical values for asymptotic
       distributions.


    Monte Carlo methods are also a compromise between approximate randomization and permutation tests. An
    approximate randomization test is based on a specified subset of all permutations (which entails potentially
    enormous housekeeping of which permutations have been considered). The Monte Carlo approach is based on a
    specified number of randomly drawn permutations (exchanging a minor loss in precision if a permutation is drawn
    twice – or more frequently—for the efficiency of not having to track which permutations have already been
    selected).
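A minimal sketch of the randomly-drawn-permutation approach described above, here for a two-sample difference-in-means test; the data layout and the +1 correction in the p-value are illustrative conventions.

```python
# Monte Carlo permutation test based on a fixed number of random permutations.
import random

def mc_permutation_test(x, y, n_perm=10_000, seed=4):
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # one randomly drawn permutation
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)              # Monte Carlo p-value
```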


    Games
Monte Carlo methods have recently been incorporated in algorithms for playing games that have outperformed previous algorithms in games such as Go, Tantrix, Battleship, and Havannah. These algorithms employ Monte Carlo tree search. Possible moves are organized in a tree and a large number of random simulations are used to estimate the long-term potential of each move. A black box simulator represents the opponent's moves.

In November 2011, a Tantrix-playing robot named FullMonte, which employs the Monte Carlo method, played and beat the previous world champion Tantrix robot (Goodbot) quite easily. In a 200-game match FullMonte won 58.5%, lost 36%, and drew 5.5% without ever running over the fifteen-minute time limit.

In games like Battleship, where there is only limited knowledge of the state of the system (i.e., the positions of the ships), a belief state is constructed consisting of probabilities for each state, and then initial states are sampled for running simulations. The belief state is updated as the game proceeds, as in the figure. On a 10 × 10 grid, in which the total possible number of moves is 100, one algorithm sank all the ships 50 moves faster, on average, than random play.[22]

[Figure: Monte Carlo tree search applied to a game of Battleship. Initially the algorithm takes random shots, but as possible states are eliminated, the shots can be more selective. As a crude example, if a ship is hit (figure A), then adjacent squares become much higher priorities (figures B and C).]


Design and visuals
Monte Carlo methods are also efficient in solving coupled integral differential equations of radiation fields and energy transport, and thus these methods have been used in global illumination computations that produce photo-realistic images of virtual 3D models, with applications in video games, architecture, design, computer generated films, and cinematic special effects.[23]


    Finance and business
    Monte Carlo methods in finance are often used to calculate the value of companies, to evaluate investments in
    projects at a business unit or corporate level, or to evaluate financial derivatives. They can be used to model project
    schedules, where simulations aggregate estimates for worst-case, best-case, and most likely durations for each task to
    determine outcomes for the overall project.


    Telecommunications
When planning a wireless network, the design must be proved to work for a wide variety of scenarios that depend
    mainly on the number of users, their locations and the services they want to use. Monte Carlo methods are typically
    used to generate these users and their states. The network performance is then evaluated and, if results are not
    satisfactory, the network design goes through an optimization process.


    Use in mathematics
    In general, Monte Carlo methods are used in mathematics to solve various problems by generating suitable random
    numbers and observing that fraction of the numbers that obeys some property or properties. The method is useful for
    obtaining numerical solutions to problems too complicated to solve analytically. The most common application of
    the Monte Carlo method is Monte Carlo integration.


    Integration
Deterministic numerical integration algorithms work well in a small number of dimensions, but encounter two problems when the functions have many variables. First, the number of function evaluations needed increases rapidly with the number of dimensions. For example, if 10 evaluations provide adequate accuracy in one dimension, then 10^100 points are needed for 100 dimensions—far too many to be computed. This is called the curse of dimensionality. Second, the boundary of a multidimensional region may be very complicated, so it may not be feasible to reduce the problem to a series of nested one-dimensional integrals.[24] 100 dimensions is by no means unusual, since in many physical problems, a "dimension" is equivalent to a degree of freedom.

Monte Carlo methods provide a way out of this exponential increase in computation time. As long as the function in question is reasonably well-behaved, it can be estimated by randomly selecting points in 100-dimensional space, and taking some kind of average of the function values at these points. By the central limit theorem, this method displays $1/\sqrt{N}$ convergence—i.e., quadrupling the number of sampled points halves the error, regardless of the number of dimensions.[24]

[Figure: Monte Carlo integration works by comparing random points with the value of the function.]

A refinement of this method, known as importance sampling in statistics, involves sampling the points randomly, but more frequently where the integrand is large. To do this precisely one would have to already know the integral, but one can approximate the integral by an integral of a similar function or use adaptive routines such as stratified sampling, recursive stratified sampling, adaptive umbrella sampling[25][26] or the VEGAS algorithm.

[Figure: Errors reduce by a factor of $1/\sqrt{N}$.]
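A sketch of plain (uniform) Monte Carlo integration over the unit hypercube, illustrating the $1/\sqrt{N}$ behaviour discussed above; the test integrand is an arbitrary, illustrative choice.

```python
# Plain Monte Carlo integration of f over [0, 1]^dim with uniform random points.
import math, random

def mc_integrate(f, dim, n_samples, seed=5):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]    # uniform point in the unit hypercube
        total += f(x)
    return total / n_samples                      # the volume of the unit hypercube is 1

# Example: integrate exp(-|x|^2) over [0,1]^10; quadrupling n roughly halves the error.
f = lambda x: math.exp(-sum(t * t for t in x))
# print(mc_integrate(f, 10, 40_000))
```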


    A similar approach, the quasi-Monte Carlo method, uses low-discrepancy sequences. These sequences "fill" the area
    better and sample the most important points more frequently, so quasi-Monte Carlo methods can often converge on
    the integral more quickly.
    Another class of methods for sampling points in a volume is to simulate random walks over it (Markov chain Monte
    Carlo). Such methods include the Metropolis-Hastings algorithm, Gibbs sampling and the Wang and Landau
    algorithm.


Simulation and optimization
    Another powerful and very popular application for random numbers in numerical simulation is in numerical
    optimization. The problem is to minimize (or maximize) functions of some vector that often has a large number of
    dimensions. Many problems can be phrased in this way: for example, a computer chess program could be seen as
    trying to find the set of, say, 10 moves that produces the best evaluation function at the end. In the traveling
    salesman problem the goal is to minimize distance traveled. There are also applications to engineering design, such
    as multidisciplinary design optimization.
    The traveling salesman problem is what is called a conventional optimization problem. That is, all the facts
    (distances between each destination point) needed to determine the optimal path to follow are known with certainty
    and the goal is to run through the possible travel choices to come up with the one with the lowest total distance.
    However, let's assume that instead of wanting to minimize the total distance traveled to visit each desired destination,
    we wanted to minimize the total time needed to reach each destination. This goes beyond conventional optimization
    since travel time is inherently uncertain (traffic jams, time of day, etc.). As a result, to determine our optimal path we
would want to use simulation optimization to first understand the range of potential times it could take to go from
    one point to another (represented by a probability distribution in this case rather than a specific distance) and then
    optimize our travel decisions to identify the best path to follow taking that uncertainty into account.


    Inverse problems
    Probabilistic formulation of inverse problems leads to the definition of a probability distribution in the model space.
    This probability distribution combines prior information with new information obtained by measuring some
    observable parameters (data). As, in the general case, the theory linking data with model parameters is nonlinear, the
    posterior probability in the model space may not be easy to describe (it may be multimodal, some moments may not
    be defined, etc.).
    When analyzing an inverse problem, obtaining a maximum likelihood model is usually not sufficient, as we
    normally also wish to have information on the resolution power of the data. In the general case we may have a large
    number of model parameters, and an inspection of the marginal probability densities of interest may be impractical,
    or even useless. But it is possible to pseudorandomly generate a large collection of models according to the posterior
    probability distribution and to analyze and display the models in such a way that information on the relative
    likelihoods of model properties is conveyed to the spectator. This can be accomplished by means of an efficient
    Monte Carlo method, even in cases where no explicit formula for the a priori distribution is available.
    The best-known importance sampling method, the Metropolis algorithm, can be generalized, and this gives a method
    that allows analysis of (possibly highly nonlinear) inverse problems with complex a priori information and data with
    an arbitrary noise distribution.[27][28]
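In the spirit of the Metropolis approach described above, a minimal random-walk Metropolis sampler might look as follows; the proposal scale and the log-posterior interface are assumptions for illustration, not a specific published implementation.

```python
# Random-walk Metropolis sampling of an (unnormalised) posterior over model parameters.
import math, random

def metropolis(log_post, x0, step=0.5, n_samples=10_000, seed=6):
    rng = random.Random(seed)
    x, lp = list(x0), log_post(x0)
    samples = []
    for _ in range(n_samples):
        prop = [xi + rng.gauss(0.0, step) for xi in x]     # symmetric Gaussian proposal
        lp_prop = log_post(prop)
        if math.log(rng.random() + 1e-300) < lp_prop - lp: # accept with prob min(1, ratio)
            x, lp = prop, lp_prop
        samples.append(list(x))
    return samples
```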


    Computational mathematics
    Monte Carlo methods are useful in many areas of computational mathematics, where a "lucky choice" can find the
    correct result. A classic example is Rabin's algorithm for primality testing: for any n that is not prime, a random x has
    at least a 75% chance of proving that n is not prime. Hence, if n is not prime, but x says that it might be, we have
    observed at most a 1-in-4 event. If 10 different random x say that "n is probably prime" when it is not, we have
observed a one-in-a-million event. In general, a Monte Carlo algorithm of this kind produces one kind of answer with a guarantee (n is composite, and x proves it so) and another kind without a guarantee, but with a bound on how often that answer is wrong: in this case, at most 25% of the time. See also Las Vegas algorithm for a related, but different, idea.
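One common realisation of such a test is a Miller–Rabin round with random bases; the sketch below is illustrative, treating a composite declaration as certain and a "probably prime" answer as the unguaranteed one.

```python
# Monte Carlo primality test (Miller-Rabin with k random bases).
import random

def is_probably_prime(n, k=10, rng=None):
    rng = rng or random.Random()
    if n in (2, 3):
        return True
    if n < 2 or n % 2 == 0:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1                                  # n - 1 = d * 2**s with d odd
    for _ in range(k):
        x = rng.randrange(2, n - 1)             # random candidate witness
        y = pow(x, d, n)
        if y in (1, n - 1):
            continue
        for _ in range(s - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False                        # x is a witness: n is definitely composite
    return True                                 # probably prime (error probability <= 4**-k)
```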


    Notes
[1]   Hubbard 2007
    [2]   Hubbard 2009
    [3]   Metropolis 1987
    [4]   Kalos & Whitlock 2008
    [5]   Eckhardt 1987
    [6]   Grinstead & Snell 1997
    [7]   Anderson 1986
    [8] Ripley 1987
    [9] Sawilowsky 2003
    [10] Davenport 1992
    [11] Vose 2000, p. 13
    [12] Vose 2000, p. 16
    [13] Baeurle 2009
    [14] MacGillivray & Dodd 1982
    [15] Golden 1979
    [16] Int Panis et al. 2001
    [17] Int Panis et al. 2002
[18] Ojeda et al. 2009
    [19] Milik & Skolnick 1993
    [20] Forastero et al. 2010
    [21] Sawilowsky & Fahoome 2003
    [22] Silver & Veness 2010
    [23] Szirmay-Kalos 2008
    [24] Press et al. 1996
    [25] MEZEI, M (31 December 1986). "Adaptive umbrella sampling: Self-consistent determination of the non-Boltzmann bias". Journal of
        Computational Physics 68 (1): 237–248. Bibcode 1987JCoPh..68..237M. doi:10.1016/0021-9991(87)90054-4.
    [26] Bartels, Christian; Karplus, Martin (31 December 1997). "Probability Distributions for Complex Systems: Adaptive Umbrella Sampling of
        the Potential Energy". The Journal of Physical Chemistry B 102 (5): 865–880. doi:10.1021/jp972280j.
    [27] Mosegaard & Tarantola 1995
[28] Tarantola 2005, http://www.ipgp.fr/~tarantola/Files/Professional/Books/index.html



    References
    • Anderson, H.L. (1986). "Metropolis, Monte Carlo and the MANIAC" (http://library.lanl.gov/cgi-bin/
      getfile?00326886.pdf). Los Alamos Science 14: 96–108.
    • Baeurle, Stephan A. (2009). "Multiscale modeling of polymer materials using field-theoretic methodologies: A
      survey about recent developments". Journal of Mathematical Chemistry 46 (2): 363–426.
      doi:10.1007/s10910-008-9467-3.
    • Berg, Bernd A. (2004). Markov Chain Monte Carlo Simulations and Their Statistical Analysis (With Web-Based
      Fortran Code). Hackensack, NJ: World Scientific. ISBN 981-238-935-0.
    • Binder, Kurt (1995). The Monte Carlo Method in Condensed Matter Physics. New York: Springer.
      ISBN 0-387-54369-4.


    • Caflisch, R. E. (1998). Monte Carlo and quasi-Monte Carlo methods. Acta Numerica. 7. Cambridge University
      Press. pp. 1–49.
• Davenport, J. H. (1992). "Primality testing revisited". Proceedings ISSAC '92: Papers from the International Symposium on Symbolic and Algebraic Computation: 123–129. doi:10.1145/143242.143290. ISBN 0-89791-489-9.
    • Doucet, Arnaud; Freitas, Nando de; Gordon, Neil (2001). Sequential Monte Carlo methods in practice. New
      York: Springer. ISBN 0-387-95146-6.
    • Eckhardt, Roger (1987). "Stan Ulam, John von Neumann, and the Monte Carlo method" (http://www.lanl.gov/
      history/admin/files/Stan_Ulam_John_von_Neumann_and_the_Monte_Carlo_Method.pdf). Los Alamos
      Science, Special Issue (15): 131–137.
    • Fishman, G. S. (1995). Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer.
      ISBN 0-387-94527-X.
    • C. Forastero and L. Zamora and D. Guirado and A. Lallena (2010). "A Monte Carlo tool to simulate breast cancer
      screening programmes". Phys. In Med. And Biol. 55 (17): 5213. Bibcode 2010PMB....55.5213F.
      doi:10.1088/0031-9155/55/17/021.
    • Golden, Leslie M. (1979). "The Effect of Surface Roughness on the Transmission of Microwave Radiation
      Through a Planetary Surface". Icarus 38 (3): 451. Bibcode 1979Icar...38..451G.
      doi:10.1016/0019-1035(79)90199-4.
    • Gould, Harvey; Tobochnik, Jan (1988). An Introduction to Computer Simulation Methods, Part 2, Applications to
      Physical Systems. Reading: Addison-Wesley. ISBN 0-201-16504-X.
    • Grinstead, Charles; Snell, J. Laurie (1997). Introduction to Probability. American Mathematical Society.
      pp. 10–11.
    • Hammersley, J. M.; Handscomb, D. C. (1975). Monte Carlo Methods. London: Methuen. ISBN 0-416-52340-4.
    • Hartmann, A.K. (2009). Practical Guide to Computer Simulations (http://www.worldscibooks.com/physics/
      6988.html). World Scientific. ISBN 978-981-283-415-7.
    • Hubbard, Douglas (2007). How to Measure Anything: Finding the Value of Intangibles in Business. John Wiley &
      Sons. p. 46.
    • Hubbard, Douglas (2009). The Failure of Risk Management: Why It's Broken and How to Fix It. John Wiley &
      Sons.
    • Kahneman, D.; Tversky, A. (1982). Judgement under Uncertainty: Heuristics and Biases. Cambridge University
      Press.
    • Kalos, Malvin H.; Whitlock, Paula A. (2008). Monte Carlo Methods. Wiley-VCH. ISBN 978-3-527-40760-6.
    • Kroese, D. P.; Taimre, T.; Botev, Z.I. (2011). Handbook of Monte Carlo Methods (http://www.
      montecarlohandbook.org). New York: John Wiley & Sons. p. 772. ISBN 0-470-17793-4.
    • MacGillivray, H. T.; Dodd, R. J. (1982). "Monte-Carlo simulations of galaxy systems" (http://www.
      springerlink.com/content/rp3g1q05j176r108/fulltext.pdf). Astrophysics and Space Science (Springer
      Netherlands) 86 (2).
    • MacKeown, P. Kevin (1997). Stochastic Simulation in Physics. New York: Springer. ISBN 981-3083-26-3.
    • Metropolis, N. (1987). "The beginning of the Monte Carlo method" (http://library.lanl.gov/la-pubs/00326866.
      pdf). Los Alamos Science (1987 Special Issue dedicated to Stanisław Ulam): 125–130.
    • Metropolis, Nicholas; Rosenbluth, Arianna W.; Rosenbluth, Marshall N.; Teller, Augusta H.; Teller, Edward
      (1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics 21 (6): 1087.
      Bibcode 1953JChPh..21.1087M. doi:10.1063/1.1699114.
    • Metropolis, N.; Ulam, S. (1949). "The Monte Carlo Method". Journal of the American Statistical Association
      (American Statistical Association) 44 (247): 335–341. doi:10.2307/2280232. JSTOR 2280232. PMID 18139350.
    • M. Milik and J. Skolnick (Jan 1993). "Insertion of peptide chains into lipid membranes: an off-lattice Monte
      Carlo dynamics model". Proteins 15 (1): 10–25. doi:10.1002/prot.340150104. PMID 8451235.


    • Mosegaard, Klaus; Tarantola, Albert (1995). "Monte Carlo sampling of solutions to inverse problems". J.
      Geophys. Res. 100 (B7): 12431–12447. Bibcode 1995JGR...10012431M. doi:10.1029/94JB03097.
    • P. Ojeda and M. Garcia and A. Londono and N.Y. Chen (Feb 2009). "Monte Carlo Simulations of Proteins in
      Cages: Influence of Confinement on the Stability of Intermediate States". Biophys. Jour. (Biophysical Society) 96
      (3): 1076–1082. Bibcode 2009BpJ....96.1076O. doi:10.1529/biophysj.107.125369.
    • Int Panis L; De Nocker L, De Vlieger I, Torfs R (2001). "Trends and uncertainty in air pollution impacts and
      external costs of Belgian passenger car traffic International". Journal of Vehicle Design 27 (1–4): 183–194.
      doi:10.1504/IJVD.2001.001963.
    • Int Panis L, Rabl A, De Nocker L, Torfs R (2002). P. Sturm. ed. "Diesel or Petrol ? An environmental comparison
      hampered by uncertainty". Mitteilungen Institut für Verbrennungskraftmaschinen und Thermodynamik
      (Technische Universität Graz Austria) Heft 81 Vol 1: 48–54.
    • Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (1996) [1986]. Numerical
      Recipes in Fortran 77: The Art of Scientific Computing. Fortran Numerical Recipes. 1 (Second ed.). Cambridge
      University Press. ISBN 0-521-43064-X.
    • Ripley, B. D. (1987). Stochastic Simulation. Wiley & Sons.
    • Robert, C. P.; Casella, G. (2004). Monte Carlo Statistical Methods (2nd ed.). New York: Springer.
      ISBN 0-387-21239-6.
    • Rubinstein, R. Y.; Kroese, D. P. (2007). Simulation and the Monte Carlo Method (2nd ed.). New York: John
      Wiley & Sons. ISBN 978-0-470-17793-8.
    • Savvides, Savvakis C. (1994). "Risk Analysis in Investment Appraisal". Project Appraisal Journal 9 (1).
      doi:10.2139/ssrn.265905.
    • Sawilowsky, Shlomo S.; Fahoome, Gail C. (2003). Statistics via Monte Carlo Simulation with Fortran. Rochester
      Hills, MI: JMASM. ISBN 0-9740236-0-4.
    • Sawilowsky, Shlomo S. (2003). "You think you've got trivials?" (http://education.wayne.edu/jmasm/
      sawilowsky_effect_size_debate.pdf). Journal of Modern Applied Statistical Methods 2 (1): 218–225.
    • Silver, David; Veness, Joel (2010). "Monte-Carlo Planning in Large POMDPs" (http://books.nips.cc/papers/
      files/nips23/NIPS2010_0740.pdf). In Lafferty, J.; Williams, C. K. I.; Shawe-Taylor, J. et al.. Advances in
      Neural Information Processing Systems 23. Neural Information Processing Systems Foundation.
    • Szirmay-Kalos, László (2008). Monte Carlo Methods in Global Illumination - Photo-realistic Rendering with
      Randomization. VDM Verlag Dr. Mueller e.K.. ISBN 978-3-8364-7919-6.
    • Tarantola, Albert (2005). Inverse Problem Theory (http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/
      SIAM/index.html). Philadelphia: Society for Industrial and Applied Mathematics. ISBN 0-89871-572-5.
    • Vose, David (2008). Risk Analysis, A Quantitative Guide (Third ed.). John Wiley & Sons.


    External links
    • Overview and reference list (http://mathworld.wolfram.com/MonteCarloMethod.html), Mathworld
    • "Café math : Monte Carlo Integration" (http://cafemath.kegtux.org/mathblog/article.php?page=MonteCarlo.
      php) : A blog article describing Monte Carlo integration (principle, hypothesis, confidence interval)
• Feynman-Kac models and particle Monte Carlo algorithms (http://www.math.u-bordeaux1.fr/~delmoral/simulinks.html) Website on the applications of particle Monte Carlo methods in signal processing, rare event simulation, molecular dynamics, financial mathematics, optimal control, computational physics, and biology.
    • Introduction to Monte Carlo Methods (http://www.phy.ornl.gov/csep/CSEP/MC/MC.html), Computational
      Science Education Project
    • The Basics of Monte Carlo Simulations (http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html),
      University of Nebraska-Lincoln


    • Introduction to Monte Carlo simulation (http://office.microsoft.com/en-us/excel-help/
      introduction-to-monte-carlo-simulation-HA010282777.aspx) (for Microsoft Excel), Wayne L. Winston
    • Monte Carlo Simulation for MATLAB and Simulink (http://www.mathworks.com/discovery/
      monte-carlo-simulation.html)
    • Monte Carlo Methods – Overview and Concept (http://www.brighton-webs.co.uk/montecarlo/concept.htm),
      brighton-webs.co.uk
    • Molecular Monte Carlo Intro (http://www.cooper.edu/engineering/chemechem/monte.html), Cooper Union
    • Monte Carlo techniques applied in physics (http://personal-pages.ps.ic.ac.uk/~achremos/Applet1-page.htm)
    • Monte Carlo Method Example (http://waqqasfarooq.com/waqqasfarooq/index.php?option=com_content&
view=article&id=47:monte-carlo&catid=34:statistics&Itemid=53), a step-by-step guide to creating a Monte Carlo Excel spreadsheet
    • Approximate And Double Check Probability Problems Using Monte Carlo method (http://orcik.net/
      programming/approximate-and-double-check-probability-problems-using-monte-carlo-method/) at Orcik Dot
      Net



    Article Sources and Contributors
    Time series  Source: http://en.wikipedia.org/w/index.php?oldid=527126170  Contributors: 1ForTheMoney, Abeliavsky, Adrianafraj, Aegis Maelstrom, Albmont, Aleksd, Andreas4965, Andycjp,
    Apdevries, Arthena, Babbage, Boxplot, Btyner, Burhem, Calair, Charles Matthews, Charmi99, Cherkash, Chire, Chris the speller, ChrisGualtieri, Coginsys, CommodiCast, Cottrellnc, Cpdo,
    Cwdegier, DARTH SIDIOUS 2, Dankonikolic, Dcljr, Dekart, Discospinster, Dkondras, Dr ahmed1010, Drrho, Edison, ElKevbo, Eliz81, Esoterum, FBmotion, Funandtrvl, G716, Gandalf61, Gap,
    Gary King, Giftlite, Hellopeopleofdetroit, Helwr, Hoo man, Instinct, Jimmaths, Joel7687, John Cumbers, Jugander, Keithljelp, Kiefer.Wolfowitz, Kku, Kuru, Kv75, Lambiam, Ldecola, Luyima,
    Mathaddins, Melcombe, Merlion444, Michael Hardy, Mihal Orela, Mm100100, Modargo, Mwtoews, Nbarth, Nialsh, Nono64, Nutcracker, Oli Filth, PAR, Pak21, Piotrus, Pucicu,
    QualitycontrolUS, Qwfp, Rbonvall, Requestion, Rgclegg, Rich Farmbrough, Rinconsoleao, SShearman, Sandman888, SchreiberBike, Scientio, Scwarebang, Sick Rantorum, Spangineer,
    Statoman71, Susko, Taxman, The enemies of god, Thulka, Tobacman, Topbanana, Truswalu, Twilight Nightmare, Unyoyega, VictorAnyakin, Visu dreamz, Wavelength, Wile E. Heresiarch,
    Wyllium, Zheric, Zipircik, Zvika, 258 anonymous edits

    Forecasting  Source: http://en.wikipedia.org/w/index.php?oldid=519725225  Contributors: 05proedl, 2001:620:8:3F42:8000:0:0:267, 2405:B000:600:262:0:0:36:7D,
    2604:2000:FFC0:E0:4997:D9E0:5C06:C26, 2620:0:1040:407:BD4E:66C3:7379:96B8, Abtinb, Agraefe, Andyjsmith, Apdevries, Arpabr, Bateni, Beinhauer, Bernburgerin, Bjmedeiros,
    Blockright, Bongwarrior, BrutForce, Cheapskate08, Chyen, Cinsing3, Ckatz, CommodiCast, Constancek, DMacks, Dancter, Dassiebtekreuz, Dbachmann, Drbreznjev, Econ2010, Ehrenberg-Bass,
    Elonka, Enygma.911, Fcaster, Federalist51, Fuzfadal, Giftlite, Gorgalore, Hubbardaie, IPWAI, Icseaturtles, Igoldste, Jan7436790, Jaygary, Joel B. Lewis, Johnchasenz, Jonathanmoyer, Jpo,
    Katonal, Kesten, Kimleonard, Kneale, Kris103045, Kuru, Kxjtaz, Lammidhania, LandalaEng, Lotje, Luk, M gerzon, Mack2, Maqayum, Markchockal, Martinbueno, Mato, Melcombe, Michael
    Hardy, Moonriddengirl, MrOllie, Mrsaad31, NeilN, Neo-Jay, Pdcook, Phanerozoic, Philip Trueman, Pilgaard, R'n'B, Ricky@36, Rigadoun, Rjhyndman, Rjnicholas, Rohrbeck, Salamurai,
    Saxifrage, Shadowjams, ShelfSkewed, Skizzik, SlackerMom, Spiderwriter, Spilla, Starflixx, SueHay, SupplyChainAnalyticsGuru, Tbhotch, The Anome, The Transhumanist, Thegreatdr, Tony
    Myers, TravellingThru, Truswalu, Tuduser, Usability Tester 6, Vddku, Vermorel, WikiSlasher, Wimpie2, Yamara, Zagothal, 143 anonymous edits

    Stationary process  Source: http://en.wikipedia.org/w/index.php?oldid=526151236  Contributors: Abdull, Ahmadyan, Andres r5, BenFrantzDale, Billkamp, Charmi99, Citation inspector,
    Dakshayani, Dicklyon, Docu, Dodothegoof, Duoduoduo, Dysprosia, Edward, Gareth Jones, Giftlite, J heisenberg, Jamelan, Jmath666, Linas, Luvrboy1, Mani1, Melcombe, Michael Hardy,
    Mwilde, Nbarth, Protonk, Quantling, Radagast83, Rgclegg, Rumping, Sekarnet, Shivakaul, Sterrys, Tekhnofiend, Tesi1700, Tsirel, Weregerbil, WikHead, WikiPuppies, Zvika, 42 anonymous
    edits

    Stochastic process  Source: http://en.wikipedia.org/w/index.php?oldid=526021993  Contributors: 123forman, A. Pichler, AManWithNoPlan, Abu badali, Afasmit, Alanb, Alinja, Andre Engels,
    Andycjp, Arakunem, Arcfrk, Bdmy, BitchX, Boostat, Brian0918, Brickc1, Brunopimentel, Bryan Derksen, CSTAR, Chaos, Charles Matthews, Compsonheir, Cpiral, Crowsnest, Cuinuc, Damian
    Yerrick, Danog, Deflective, Dmbandr, Dr. Universe, Dysprosia, Edwinstearns, Elf, EncMstr, Examtester, Fangz, Feinstein, Fresheneesz, Gaius Cornelius, Gene.arboit, Giftlite, Goodralph, Hairer,
    Hari, Hyperbola, Inald, J heisenberg, J04n, J8079s, JWSchmidt, Javierluraschi, Jaxelrod, Jeff3000, Jjalexand, Jonathan de Boyne Pollard, Kaba3, Kbdank71, Keenforever, Kilmer-san, Kiril
    Simeonovski, Kku, Kwamikagami, Kwertii, LOL, Lambiam, LiDaobing, Ligulem, Linas, LoveMonkey, Lucianosilvajp, Mack2, MarkSweep, Markhebner, MathMartin, MattieTK, Mdd,
    Melcombe, Mets501, Michael Hardy, Miguel, Mokgen, Msh210, Nakon, Ncmathsadist, Nick Number, Oleg Alexandrov, Orangelightning, Pete142, Peterius, Phil Boswell, Phys,
    PierreYvesLouis, Populus, Qwfp, Razvanux, Reywas92, Rjwilmsi, Romanm, Rossami, Rxnt, Salgueiro, Schmock, SgtThroat, Sligocki, Sodin, Spalding, Spmeyn, Star General, Stephen Bain,
    Sullivan.t.j, TedPavlic, That Guy, From That Show!, TheObtuseAngleOfDoom, Thuycis, Tim Starling, Toby, Toby Bartels, Tomi, Tonymarsh, Tpb, Tsirel, Uli.loewe, Unyoyega, William Avery,
    Zalle, Zenomax, Zmoboros, 143 anonymous edits

    Covariance  Source: http://en.wikipedia.org/w/index.php?oldid=526499509  Contributors: 16@r, Abovechief, AlphaPyro, Amonet, Ancheta Wis, Antandrus, Anturtle, Ap, Asymmetric, Awaterl,
    AxelBoldt, Bender2k14, Bsilverthorn, Btyner, CKCortez, CenturionZ 1, Chronosilence, Ciphers, Closedmouth, Coppertwig, Cruise, Dag Hovland, Daniel5Ko, David.lijin.zhang, Den fjättrade
    ankan, Didickman, Dod1, Duoduoduo, Eamon Nerbonne, Edcarrochio, Ericd, Esnascosta, EtudiantEco, Ewlyahoocom, Fangz, Felix Hoffmann, Fundamental metric tensor, Fæ, Gauge, Gene
    Nygaard, Giftlite, Gjshisha, Glahaye, Glenn, Gruntler, Guslacerda, HammerHeadHuman, Hess88, Ht686rg90, Ikelos, J.delanoy, Jabbba, Jmath666, Johnbibby, Jorgenumata, Jugander,
    Keshavanvrajan, KoenDelaere, Kristian Joensen, Kwertii, Kzollman, LOL, LaymanDon, Lbelcour, Lensi, Lindabrock, Looxix, MarkSweep, Melcombe, Michael Hardy, Mtroffaes, Naught101,
    Nijdam, Nwalfield, Oleg Alexandrov, Pak21, Patrick, Pgan002, PhotoBox, Pirsq, Policron, Prax54, Q4444q, Qwfp, RL0919, Ralfreinermueller, Rgclegg, RichardSocher, Rock69, Rocketman768,
    Romanpoet, Saric, SimonP, Skagedal, Soredewa, Spireguy, Sportyfelix, Sstrader, Sterrys, Stpasha, Sullivan.t.j, Sławomir Biały, Talgalili, TedPavlic, Tfkhang, The Thing That Should Not Be,
    Thorwald, Tokigun, Tomeasy, Tomi, Traviscj, Usna71, VKokielov, Versus22, Wafulz, WhiteHatLurker, Wikomidia, Wmahan, Zath42, Zundark, Борис Пряха, มือใหม่, 182 anonymous edits

    Autocovariance  Source: http://en.wikipedia.org/w/index.php?oldid=527456165  Contributors: Charles Matthews, Ciphers, Den fjättrade ankan, First Harmonic, GuidoGer, Hooperbloob, Ilmari
    Karonen, KoenDelaere, Michael Hardy, Nuwewsco, Rgclegg, Riyad parvez, Shaolin128, Sterrys, Tomaschwutz, Ztran, 20 anonymous edits

    Autocorrelation  Source: http://en.wikipedia.org/w/index.php?oldid=520399079  Contributors: ARTE, Adambro, Adrian1906, Albmont, AxelBoldt, BD2412, Bookandcoffee, Btyner, Carvas,
    Charles Matthews, Cmglee, Coltsjoe, Conversion script, Cottrellnc, Counterfact, Damian Yerrick, Dankonikolic, DavidCBryant, Den fjättrade ankan, Dicklyon, Douglas R. White, DrBob,
    Esaintpierre, Esprit15d, Evildeathmath, Favonian, Fredvanner, Gcjblack, Ghaly, Giftlite, Gomm, Graeme Bartlett, HamburgerRadio, Hannes Eder, Hgfernan, Hobbema, II MusLiM HyBRiD II,
    Isnow, J S Lundeen, Jackzhp, Jeff3000, John Quiggin, Jrmanning, Jschwa1, Juliancolton, Just Another Dan, Landroni, Larryisgood, Lunch, Maartend8, Manoguru, Mathstat, Mcld, Mebden,
    Melcombe, Merovingian, Metricslover, Michael Hardy, Mild Bill Hiccup, Mollwollfumble, Mtalexan, Mwtoews, Nova77, Oleg Alexandrov, Omegatron, PAR, Paresnah, Parijata, Pgabolde,
    Pgan002, PhnomPencil, Qraquen, Quaresalas, Qwfp, Raker, Rbj, Rgclegg, Rich Farmbrough, Rjwilmsi, Robgomez, Rogerbrent, SamuelRiv, ShelfSkewed, Sigmundur, Slippens, Smack,
    Smmurphy, Soultaco, Sumit Dutta, Thermochap, Tlroche, Tomaschwutz, Tony999, Twelvethirteen, Uncle Dick, UncleDouggie, Vdm, Vini.rn, Virga, Viswaprabha, WavePart, Who, Wikih 66,
    William James Croft, Wricardoh, 148 anonymous edits

    Cross-correlation  Source: http://en.wikipedia.org/w/index.php?oldid=518373671  Contributors: Alankjackson, AllHailZeppelin, Andysac12, Apalaria, Belizefan, BenFrantzDale, Bob K,
    Cdmurray 80, Charles Matthews, Cmglee, Cooli46, Credan, Da nuke, Dankonikolic, Den fjättrade ankan, Deville, Dfrankow, Drizzd, EffeX2, Eteq, Gellule, Ghaly, Giftlite, Gunnar Larsson,
    HenningThielemann, Ht686rg90, Indeterminate, Inner Earth, Jamelan, Kku, ManMachine1984, Mbr5002, Melcombe, Michael Hardy, MichaelRW1, Momet, MusicScience, Muskr, Mysid,
    Natalya, North8000, Omegatron, Optical3d, PAR, PEHowland, Parodi, Richard Giuly, RoccoM, SebastianHelm, Slava Evgrafov, Snowolf, Sss41, Sunev, Sławomir Biały, The Thing That Should
    Not Be, Tomaschwutz, Unexpect, Vini.rn, Voidxor, Vossman, Wernervb, Wikipelli, Will Hage, 85 anonymous edits

    White noise  Source: http://en.wikipedia.org/w/index.php?oldid=524894983  Contributors: 1984, AManWithNoPlan, Acdx, Admrboltz, Aduvall, Aitias, Aka042, Alan012, Alansohn,
    Alextrevelian 006, Amontero, Ancheta Wis, Andre Engels, Andrewferrier, Antikon, Army1987, ArnoldReinhold, Ashkax, Baccyak4H, BenFrantzDale, Bihzad, Binksternet, Bjh21, Bloodshedder,
    Breno, Byrned, CBM, CTF83!, Canderson7, Carlb, Cburnett, Charles Matthews, Chotchki, ColonelHamilton, CompRhetoric, Correogsk, Craigy144, Crazyvas, Cronholm144, Cstrobel, Ctyonahl,
    DNewhall, Damian Yerrick, Darac, DavidWBrooks, Degsart, DerGolgo, Deville, Dinkelaker, DragonHawk, Drewcifer3000, Drizzd, Duckdad, Dzenanz, Eastpoint, Eastpoints, Edcolins, Edude7,
    Edward Z. Yang, El C, Encyclopedia a-z, Estevezj, EugeneZelenko, Everyking, Fg2, FiP, Freezing the mainstream, FrummerThanThou, Furrykef, GabeIglesia, Gadfium, Ghaly, Giftlite, Gillis,
    Giraffedata, Glane23, Gobonobo, GoingBatty, Goodnightmush, Goosegoosegoose, Gracefool, Grebmops, Greglocock, Hairhorn, Hannolancia, Hastur chief, HenningThielemann, Heron, Hess8,
    Hyacinth, Ian Dunster, Ig0r, In base 4, Inky, Ipatrol, Ismartman, James T Curran, JarlaxleArtemis, Jim.henderson, Jiy, Jmath666, Johnleemk, Jonathan.s.kt, Jonfischeruk, Julesd, KF, KRS,
    Keenan Pepper, Ken Holden, Kenneth M Burke, Kenyon, Kkumpf, Koavf, Krukouski, Kvng, Landroni, Lenilucho, Liftarn, Light current, LightxYagami, Lindosland, LokiClock, Lola Voss,
    Lowellian, Luna Santin, MPerel, Madbadger, Magic Window, Maksim-e, Martynas Patasius, Mathieu Perrin, Mattyblueboy, Melcombe, Michael Hardy, Miquonranger03, Mohan1986, Mrh30,
    Mwilde, Nasa-verve, Natesibas, Nbhatla, Netoholic, Numbo3, OlEnglish, Oli Filth, Omegatron, Ossipewsk, PAR, Persian Poet Gal, Phansen, Philopedia, Phoxware, Pinethicket, Pne,
    Prestonmcconkie, Pzavon, Quazar777, Qwfp, Ravius, Razorflame, Reedbeta, Rich Farmbrough, Rjwilmsi, Rnt20, Robofish, Rod57, Rogerborg, Saucemaster, Scapermoya, Scarian, Scarpy,
    Schmock, SchreiberBike, Scientizzle, Sean.hoyland, Search4Lancer, Sepia tone, Shanes, Skysmith, Slysplace, Smithderek2000, SpaceFlight89, Spectrogram, Splash, Stephan Leeds, Steve2011,
    Stumpfoot, Superbeecat, Synocco, Tbackstr, That Guy, From That Show!, The Anome, The Nixinator, The Thing That Should Not Be, TheFlyingWallaby, Themfromspace, Thorseth, Tim1357,
    Tom Jenkins, TomViza, Torzsmokus, Tsujigiri, Tulameen, Ulric1313, Ultimus, Verbalcontract, Whitenoise2010, Wiki Wikardo, Wiki alf, Wikimalte, Wooddoo-eng, Wuthering123, Xezbeth,
    Z2trillion, Zanimum, ZildjianAVC, Zoeb, Zvika, 273 anonymous edits

    Random walk  Source: http://en.wikipedia.org/w/index.php?oldid=525899241  Contributors: 12345shibow, 2001:250:1006:6191:4046:FA24:94D5:92C8, A. B., Achituv, Alexedmonds, Alksub,
    Allaei, Alro, Amalas, Amatulic, Andonic, Areldyb, Arnaudf, Arno Matthias, ArnoldReinhold, Belasted, Bender235, Bereziny, Bfinn, Bkessler, Bkkbrad, Bongwarrior, Boud, Btyner, Catapult,
    Centrx, Charles Matthews, Chhe, Chris 73, Chris the speller, ChrisGualtieri, CiaPan, Ciphergoth, Cliff, Cobaltcigs, Complexica, Compsim, Connelly, Courcelles, Cutler, Cybercobra,
    DanielWenger, Danielgenin, Dankonikolic, David-Sarah Hopwood, Deor, Dfg13, Dmcq, Dolohov, Easwarno1, Edward, Ehrenkater, Elf, Eliezg, Ethanmdavidson, Finog, Flomenbom,
    Gadykozma, Galentx, GeeJo, Giftlite, GrafZahl, Gregfitzy, GregorB, Halaster, Harry.cook2, Headbomb, Henrygb, Ignacio Icke, Illusiwind, Ino5hiro, Ixfd64, Ixnayonthetimmay, Jakohn, Jason
    Quinn, Jason Recliner, Esq., Jheald, JohnBlackburne, JohnCD, JohnOwens, Jwrosenzweig, Karol Langner, Kelisi, Kghose, KoenDelaere, Koertefa, Lambiam, Lan56, Lensi, LuminaryJanitor,
    Lupin, MGarcia, Ma8e, Mack2, MadN!sh, Markhebner, Masterpiece2000, Melcombe, Michael Hardy, Miguel, Mikeeg555, Morn, Mortezaaa, NawlinWiki, Nbarth, Neilc, Niall2, Nuwanda7,
    OdedSchramm, Oleg Alexandrov, Oli Filth, Ordnascrazy, PAR, Paranoid, Paul A, Pfortuny, Pgreenfinch, Pranshus, Puffin, Purpy Pupple, Qrancik, R'n'B, R3m0t, Rafi Neal, Randomwalk2035,
    Ricardogpn, Richard.decal, Rjwilmsi, RobinK, Run54, SJP, Salgueiro, Sam Hocevar, SamuelScarano, Schmock, Sephrial, Silly rabbit, Simetrical, Special-T, Srleffler, Stasyuha, Steevven1,
    Stephen richlen, Sviemeister, Swpb, Tardis, Technopilgrim, ThorinMuglindir, TomyDuby, Trumpsternator, Tsaitgaist, Urdutext, V8rik, Wdechent, Weishaupt00, Wereon, Whiteglitter79, Wile E.
    Heresiarch, Yqwen, Zephyris, ZezzaMTE, Zorodius, Zundark, 221 anonymous edits

    Brownian motion  Source: http://en.wikipedia.org/w/index.php?oldid=526205489  Contributors: 1234567890qwertyuiop, 1exec1, 2D, A. Pichler, AC+79 3888, Aannoonn, Abbas.nayebi,
    Abp35, Adesai, Afasmit, Aim Here, Aitias, Alex Bakharev, AlexWelens, Alfio, Andonic, Andrei Stroe, AndreiDukhin, Androstachys, Aristophanes68, Arjun01, Arkuat, Arthena, Ashmoo,
    Awickert, Baddevil49, Barras, Bass fishing physicist, BenFrantzDale, Benbest, Bender235, Bernhlav, Bethjaneway, Bjcairns, Bolekilolek, Borgx, Bplohr, BrightStarSky, Calabe1992, Cdang,
    Cesiumfrog, ChE Fundamentalist, Chandlerburr, Charles Matthews, Cheesy Yeast, Chester Markel, Chhe, Cliff, Colonies Chris, Conversion script, Cutler, Cybercobra, D.H, DGG, DSP-user,
    DVdm, Da Joe, Da5id403, Darkfrog24, Darryl Ring, David Cooke, Davy p, Discospinster, Dismas, Drmies, Dssamuels, Dungodung, Dysprosia, ERobson, Edgar181, Edgarswooth, Edwinhubbel,
    Eean, Emperorbma, Epbr123, Eric Forste, Eseijo, Etxrge, Examtester, Faradayplank, FariX, Fieldday-sunday, Fij, Flyocean, Fopnor, Gdr, Ge-lun, George100, Gianluigi, Giftlite, Giovanni33,
    Gjshisha, Glycoform, Gogo Dodo, Goodnightmush, Gut Monk, Hadal, HamatoKameko, HappyApple, HarryHenryGebel, Headbomb, Henning Makholm, Heron, Hesperian, Hhhippo,
    History2007, HolcmanDavid, Hyacinth, Ianml, InverseHypercube, Isnow, Itub, JHUastro, JSquish, Jakohn, Jamesooders, Jellybeanbelly, Jim Sukwutput, Jimbobjones123, Jj137, Joanjoc, Joe
    Schmedley, John gordon 54, JohnInDC, Joke137, Josce, Jszymon, Julesd, Kappa, Kareeser, Karol Langner, Kaslanidi, Keenan Pepper, Klye Mif, KnowledgeOfSelf, Koavf, Krauss, Kt123456789,
    Kyeseven, L33tminion, LOL, LachlanA, Lee Carre, Lensovet, Lightmouse, Lilcoolio, Ling.Nut, LizardJr8, Logicus, Looie496, Lookang, Luk, MER-C, Machine Elf 1735, MagnInd, Marco Polo,
    MarkGallagher, Marquez, Matthew Woodcraft, Mdd, Meigel, Melchoir, Melcombe, Memming, Metromoxie, Mgiganteus1, Michael Hardy, Michael welcome back, Microinjection, Miguel,
    Mike409, Mintwins2006, Mipmip, Mirv, MithrandirMage, Mohsin2511, Molerat, Morbid-o, MrOllie, Msh210, Murtasa, Mydogategodshat, Myria, NHSavage, Natalie Erin, Nate1481, NeilN,
    Netalarm, Nimbusania, Niri.M, Nk, Nsda, Ognir, OliAtlason, Olivier, Omnipaedista, Omsserpent, Orange Suede Sofa, Ozarfreo, Ozzy667, PAR, Paffordmd72, Palica, Papershark, PaulLowrance,
    Pete212, Pfeiferwalter, Phthoggos, Polisher of Cobwebs, Postdlf, PouyaDT, Ppearle1, Praseodymium, Purpy Pupple, Quaeler, Quatschman, QuiteUnusual, Qwfp, R.e.b., Ragityman, Raphael s,
    Razorflame, RexNL, Rgclegg, Rich Farmbrough, Richard.decal, Richerman, Richwil, Rjwilmsi, RockMagnetist, Rolf Mayo, Rossal17, Rscotta, S9440996i, Serek, Shadow1, Shimbels, Sillybilly,
    Siryendor, Slawekb, Smallweed, SpaceFlight89, SpikeMolec, Squidonius, Srnec, StaticGull, StaticSan, Stephen B Streater, Steved424, Stevenj, Sullivan.t.j, Sundar, Super7luv, Svein Olav
    Nyberg, TCHJ3K, Tagus, Telemakh0s, Teol, Terra-kun, The Sklad, TheTito, Thepasta, ThorinMuglindir, Thoroughbred Phoenix, Tocharianne, Tomaxer, Triwbe, UberCryxic, Ucucha, Vanished
    user, Vikasbahar, Viridiflavus, Vsmith, Vyruss, WOT, Walterpfeifer, Widr, Wknight94, Wormy, Wricardoh, Wsiegmund, XJamRastafire, Yamamoto Ichiro, Zercam, Zorodius, Zzzzort, 447
    anonymous edits

    Wiener process  Source: http://en.wikipedia.org/w/index.php?oldid=524769946  Contributors: Aanderson@amherst.edu, AaronKauf, Alexf, Amatulic, Ambrose.chongo, Andrewpmack,
    Awaterl, Bender235, Berland, Bmju, C45207, CBM, CSTAR, Charles Matthews, Cyp, DaniScoreKeeper, David Underdown, Delius, Deodar, Digfarenough, Dmytro, Dylan Lake, El Roih, Elf,
    Fintor, Forwardmeasure, Gala.martin, Giftlite, Hanmyo, Jackzhp, James T Curran, JeffreyBenjaminBrown, Jmnbatista, Joke137, JonAWellner, Jujutacular, Lambiam, Mbell, Mediran, Melcombe,
    Michael Hardy, MisterSheik, Oleg Alexandrov, Oli Filth, PeR, Phaedo1732, Pokedork7876, Ptrf, R.e.b., RobertHannah89, Ru elio, Sandym, Schmock, Soap, Speedplane, Spinningspark, Stephen
    B Streater, Sullivan.t.j, Sławomir Biały, Tristanreid, Tsirel, Warbler271, Wenerman, Wikomidia, Wysinwygaa, Zvika, 88 anonymous edits

    Autoregressive model  Source: http://en.wikipedia.org/w/index.php?oldid=515144402  Contributors: Albmont, Everettr2, Jackzhp, Jean-Frédéric, John of Reading, Kate 85b, Kku, Melcombe,
    Memming, MrFelicity, N419BH, Nova77, Pavon, Rajmylove123, Schmock, Shantham11, Sympatycznyfacet, Tomaschwutz, Wiklimatik, Wpollard, Xian.ch, Zvika, 53 anonymous edits

    Moving average  Source: http://en.wikipedia.org/w/index.php?oldid=525622319  Contributors: A bit iffy, Adouglass, Ain92, Amatulic, Amitchaudhary, Arthena, Beetstra, Berland, Bil1, Btm,
    Btyner, Carmitsp, Cemsbr, Chikichiki, Chipmunck, Cjs, Ckatz, CliffC, DARTH SIDIOUS 2, Daniel Quinlan, Dark Mage, DeluxNate, DerBorg, Dickreuter, DragonHawk, Econotechie, Ekotkie,
    Epbr123, Esanchez7587, Euku, Falk Lieder, Feco, Fenice, Foobarhoge, Gakmo, Gandalf61, Giftlite, Gleb, Glennimoss, GraemeL, Grin, Hdante, HenningThielemann, HimanshuJains, Hu12,
    Investorhenry, JLaTondre, Jamelan, JenyaTsoy, Jianingy, Jitendralovekar, Karol Langner, Kazov, Kevin Ryde, Kwertii, Lambiam, Lamro, Landroni, Lenrius, Leszek0140, Lukas227,
    Makeemlighter, Mandarax, Manicstreetpreacher, Martinkv, Materialscientist, Mathaddins, Maxlittle2007, Mazin07, Mehtagauravs, Melcombe, Merosonox, Michael Hardy, Michaelzeng7,
    MilfordMark, Mir76, Mkacholia, Mwtoews, Nanzeng, Naught101, Naveensakhamuri, Netkinetic, Neurowiki, Nikai, Ninly, Paradoctor, PhilKnight, Pim2009, Pisquared8, Pleclech, Qwfp, R. S.
    Shaw, Rabarberski, Rainypearl, Ramorum, Rentier, Requestion, Richard n, Robomojo, Ron shelf, SLi, Satori Son, Seaphoto, Sid1138, Siebrand, Soluch, Stocksnet, TechAnalyster, Teorth,
    Thelb4, Time9, TomyDuby, Tony1, Tonyho, Towel401, Tradermatt, Tristanreid, USTUNINSEL, Utcursch, VladimirKorablin, Wa03, Wai Wai, Wavelength, Wayne Slam, WikHead,
    Wikinotforcommercialuse, Wikiolap, William Avery, Yahya Abdal-Aziz, דניאל צבי, 251 anonymous edits

    Autoregressive–moving-average model  Source: http://en.wikipedia.org/w/index.php?oldid=525946532  Contributors: Abeliavsky, AdamSmithee, Albmont, AnibalSolon, BlaiseFEgan, Btyner,
    Charles Matthews, Cheesekeeper, Chem1, Christiaan.Brown, Clausen, CommodiCast, Damian Yerrick, DaveDixon, Den fjättrade ankan, Dicklyon, Diegotorquemada, Dimmutool, Doobliebop,
    Dr.kenyiu, EtudiantEco, Giftlite, Heptor, Jackzhp, Jaerom Darkwind, Jamelan, John Quiggin, Karada, Kiefer.Wolfowitz, Kku, KyraVixen, Landroni, Leycrtw1, Loeffler, Loew Galitz, Lunch,
    Maechler, Makesship, Mbergins, MehdiPedia, Melcombe, Merovingian, Michael Hardy, Michelino12, Mr.ogren, Nutcracker, Olaf, Oleg Alexandrov, Oli Filth, PAR, Pyclanmap, Rgclegg,
    Robinh, Roland Longbow, Rushbugled13, Sagiariel, Sandman888, Schmock, Shape84, Silly rabbit, Snooper77, TheSeven, Thomasmeeks, ThreeBlindMice, Tony1, Tristanreid, Unyoyega,
    Wavelength, Whertusky, Whisky brewer, Wikigronk, Wile E. Heresiarch, Yoderj, Zvika, ماني, 401 anonymous edits

    Fourier transform  Source: http://en.wikipedia.org/w/index.php?oldid=526165366  Contributors: 9258fahsflkh917fas, A Doon, A. Pichler, Abecedare, Adam.stinchcombe, Admartch,
    Adoniscik, Ahoerstemeier, Akbg, Alejo2083, AliceNovak, Alipson, Amaher, AnAj, Andrei Polyanin, Andres, Angalla, Anna Lincoln, Ap, Army1987, Arondals, Asmeurer, Astronautameya,
    Avicennasis, AxelBoldt, Barak Sh, Bci2, Bdmy, BehzadAhmadi, BenFrantzDale, BigJohnHenry, Bo Jacoby, Bob K, Bobblewik, Bobo192, BorisG, Brews ohare, Bugnot, Bumm13, Burhem,
    Butala, Bwb1729, CSTAR, Caio2112, Cassandra B, Catslash, Cburnett, CecilWard, Ch mad, Charles Matthews, Chris the speller, ChrisGualtieri, ClickRick, Cmghim925, Complexica,
    Compsonheir, Coppertwig, CrisKatz, Crisófilax, DX-MON, Da nuke, DabMachine, Daqu, David R. Ingham, DavidCBryant, Dcirovic, Demosta, Dhabih, Discospinster, DmitTrix, Dmmaus,
    Dougweller, Dr.enh, DrBob, Drew335, Drilnoth, Dysprosia, EconoPhysicist, Ed g2s, Eliyak, Elkman, Enochlau, Epzcaw, Favonian, Feline Hymnic, Feraudyh, Fizyxnrd, Forbes72, Foxj,
    Fr33kman, Frappyjohn, Fred Bradstadt, Fropuff, Futurebird, Gaidheal1, Gaius Cornelius, Gareth Owen, Geekdiva, Giftlite, Giovannidimauro, Glenn, GuidoGer, GyroMagician, H2g2bob,
    HappyCamper, Heimstern, HenningThielemann, Herr Lip, Hesam7, HirsuteSimia, Hrafeiro, Ht686rg90, I am a mushroom, Igny, Iihki, Ivan Shmakov, Iwfyita, Jaakobou, Jdorje, Jhealy, Jko,
    Joerite, JohnBlackburne, JohnOFL, JohnQPedia, Joriki, Justwantedtofixonething, KHamsun, KYN, Keenan Pepper, Kevmitch, Kostmo, Kri, Kunaporn, Larsobrien, Linas, LokiClock, Loodog,
    Looxix, Lovibond, Luciopaiva, Lupin, M1ss1ontomars2k4, Manik762007, MathKnight, Maxim, Mckee, Mct mht, Metacomet, Michael Hardy, Mikeblas, Mikiemike, Millerdl, Moxfyre, Mr. PIM,
    NTUDISP, Naddy, NameIsRon, NathanHagen, Nbarth, NickGarvey, Nihil, Nishantjr, Njerseyguy, Nk, Nmnogueira, NokMok, NotWith, Nscozzaro, Od Mishehu, Offsure, Oleg Alexandrov, Oli
    Filth, Omegatron, Oreo Priest, Ouzel Ring, PAR, Pak21, Papa November, Paul August, Pedrito, Pete463251, Petergans, Phasmatisnox, Phils, PhotoBox, PigFlu Oink, Poincarecon, Pol098,
    Policron, PsiEpsilon, PtDw832, Publichealthguru, Quietbritishjim, Quintote, Qwfp, R.e.b., Rainwarrior, Rbj, Red Winged Duck, Riesz, Rifleman 82, Rijkbenik, Rjwilmsi, RobertHannah89, Rror,
    Rs2, Rurz2007, SKvalen, Safenner1, Sai2020, Sandb, Sbyrnes321, SchreiberBike, SebastianHelm, Sepia tone, Sgoder, Sgreddin, Shreevatsa, Silly rabbit, Slawekb, SlimDeli, Snigbrook, Snoyes,
    Sohale, Soulkeeper, SpaceFlight89, Spanglej, Sprocedato, Stausifr, Stevan White, Stevenj, Stpasha, StradivariusTV, Sun Creator, Sunev, Sverdrup, Sylvestersteele, Sławomir Biały, THEN WHO
    WAS PHONE?, TYelliot, Tabletop, Tahome, TakuyaMurata, TarryWorst, Tetracube, The Thing That Should Not Be, Thenub314, Thermochap, Thinking of England, Tim Goodwyn, Tim
    Starling, Tinos, Tobias Bergemann, Tobych, TranceThrust, Tunabex, Ujjalpatra, User A1, Vadik wiki, Vasiľ, Verdy p, VeryNewToThis, VictorAnyakin, Vidalian Tears, Vnb61, Voronwae,
    WLior, Waldir, Wavelength, Wiki Edit Testing, WikiDao, Wile E. Heresiarch, Writer130, Wwheaton, Ybhatti, YouRang?, Zoz, Zvika, 589 anonymous edits

    Spectral density  Source: http://en.wikipedia.org/w/index.php?oldid=526372725  Contributors: Abdull, Adoniscik, Arjayay, Benjamin.friedrich, Cburnett, Cihan, Claidheamohmor, DanielCD,
    DavidCary, Delius, Dfrankow, Dicklyon, Dlituiev, Eg-T2g, Ego White Tray, Eteq, Fdimer, Giftlite, Girlphysicist, Halpaugh, Helptry, IRWolfie-, IanOfNorwich, Jamelan, Jayman3000, Jendem,
    Jonverve, Karol Langner, Kate, Khazar2, Kyda sh, Light current, Looxix, Mange01, MarmotteNZ, Mcld, Melcombe, Memming, Michael Hardy, Moala, Nscozzaro, Numbo3, Oleg Alexandrov,
    Omegatron, PAR, Paclopes, Physicistjedi, Porqin, Rjwilmsi, Rs2, Ryanrs, SCEhardt, Santhosh.481, Sigmout, Silly rabbit, SimonP, Skarebo, Sonoma-rich, Srleffler, Sterrys, Supten, TedPavlic,
    The Anome, ToLam, Tomaschwutz, TutterMouse, Wahying, Wikiwooroo, Wolfgang Noichl, Wwheaton, Zoicon5, Zvika, Zylorian, ^musaz, 163 anonymous edits

    Signal processing  Source: http://en.wikipedia.org/w/index.php?oldid=518097800  Contributors: 10metreh, Adoniscik, Alexkin, Alinja, Altenmann, Andrus Kallastu, Antandrus, Bact, Bethnim,
    Biasoli, Borgx, Brews ohare, Burhem, Cdworetzky, Cedars, Charles Matthews, Chester Markel, Christopherlin, Conversion script, Cookie90, Dicklyon, DrKiernan, Dwachter, EagleFan,
    Emperorbma, Firefly322, Funandtrvl, Geoeg, Giftlite, Glenn, Glrx, Grebaldar, Groovamos, GyroMagician, Helwr, Heron, Hezarfenn, Hu12, Isheden, Ixfd64, Jamilsamara, Janki, Jennifer
    Levinson, Jim1138, Jimmaths, Johnuniq, Jovial4u, Jusdafax, Kjkolb, Kvng, Light current, Llorenzi, Luckyz, MER-C, Mako098765, Mange01, Mathuranathan, Mblumber, Mingramh,
    Miscanalysis, Monig1, Muriel Gottrop, Nabla, Ninly, Northumbrian, Oicumayberight, Oleg Alexandrov, OrgasGirl, Redgecko, Rememberway, Robert.Harker, Robsavoie, Rockyrackoon,
    Saddhiyama, SamShearman, Sandeeppalakkal, Sergio.ballestrero, Sfeldm2, Shyamal, Sl, Smack, Spinningspark, The Photon, Tonto Kowalski, TransporterMan, Triddle, Vertium, Vishalbhardwaj,
    Voidxor, Wavelength, Wiki contributor 21, Wikihitech, Yk Yk Yk, Yswismer, Zvika, Ömer Cengiz Çelebi, 86 anonymous edits

    Autoregressive conditional heteroskedasticity  Source: http://en.wikipedia.org/w/index.php?oldid=527454000  Contributors: 4l3xlk, 8ung3st, Abeliavsky, Albmont, Bbllueataznee, Bobo192,
    Bondegezou, Brenoneri, Btyner, Charles Matthews, Christiaan.Brown, Christopher Connor, Cje, Cp111, Cronholm144, Davezes, DeadEyeArrow, Den fjättrade ankan, Ebraminio, Erxnmedia,
    Finnancier, Fyyer, Gaius Cornelius, GeorgeBoole, Hascott, Inferno, Lord of Penguins, Irbisgreif, JavOs, Jni, John Quiggin, Kevinhsun, Kizor, Kwertii, LDLee, Ladypine, Landroni, Loodog,
    Magicmike, Melcombe, Merube 89, Michael Hardy, Nelson50, Nutcracker, Personline, Philip Trueman, Pitchfork4, Pontus, Protonk, Qwfp, Rgclegg, Rich Farmbrough, Ronnotel, Ryanrs,
    Schmock, Sigmundur, Talgalili, Whisky brewer, Wile E. Heresiarch, Wtmitchell, Xian.ch, Xieyihui, Zootm, 166 anonymous edits

    Autoregressive integrated moving average  Source: http://en.wikipedia.org/w/index.php?oldid=525918887  Contributors: 72Dino, Abeliavsky, Aetheling, Alain, Albmont, Ary29, Benjamin
    Mako Hill, Bkwillwm, Cheesekeeper, Chris-gore, CommodiCast, Cpdo, Damiano.varagnolo, Den fjättrade ankan, Dfrankow, Gak, Hirak 99, Hschliebs, Hu12, J heisenberg, JA(000)Davidson,
    Jctull, Jeff G., Lunch, MER-C, Maechler, Marion.cuny, Melcombe, Michael Hardy, Nutcracker, Panicpgh, Rgclegg, Schmock, ShelfSkewed, Silly rabbit, Siroxo, Testrider, 34 anonymous edits

    Volatility (finance)  Source: http://en.wikipedia.org/w/index.php?oldid=527348650  Contributors: A quant, Alfazagato, Allens, Andrewpmack, Arthena, AvicAWB, Baeksu, Bdemeshev,
    Bender235, Berklabsci, Biem, BrokenSegue, Brw12, Btyner, Byelf2007, Calair, Canadaduane, Catofgrey, Charles Matthews, Christofurio, Cleared as filed, Complexfinance, Cumulant,
    D15724C710N, DocendoDiscimus, DominicConnor, Eeekster, Ehdr, Emperorbma, Enchanter, Eric Kvaalen, Favonian, FinancePublisher, Finnancier, Fredbauder, Fvasconcellos, GraemeL,
    Helon, Hobsonlane, Honza Záruba, Hu12, Infinity0, Innohead, Jerryseinfeld, Jewzip, Jimmy Pitt, John Fader, Jphillips, KKramer, Karol Langner, Kingpin13, KnightRider, Kyng, LARS, Lajiri,
    Lamro, Landroni, Marishik88, Martinp, Mayerwin, Melcombe, Merube 89, Michael Hardy, Nbarth, Nburden, OccamTheRazor, Orzetto, Pablete85, PaulTheOctopus, Pcb21, PeterM, Pgreenfinch,
    Phil Boswell, Pontus, Pt johnston, Qarakesek, Quaeler, Qwfp, Ronnotel, Rrenicker1, Ryguasu, S.Örvarr.S, SKL, ShaunMacPherson, Statoman71, Swerfvalk, Taral, Tarc87, Tassedethe, Tedder,
    That Guy, From That Show!, Thobitz, Time9, Tradedia, UnitedStatesian, UweD, Volatility Professor, Volatiller, Walkerma, Wongm, Wushi-En, Yurik, Zhenqinli, 216 anonymous edits

    Stable distribution  Source: http://en.wikipedia.org/w/index.php?oldid=523582468  Contributors: 28bytes, Aastrup, Alidev, AndrewHowse, Arthena, Avraham, Baccyak4H, Bdmy, Beland,
    Benwing, Btyner, Charles Matthews, Confluent, Derek farn, Eggstone, Eric Kvaalen, Fluffernutter, Gaius Cornelius, Giftlite, Gill7da, Hans Adler, Intervallic, J heisenberg, J6w5,
    JA(000)Davidson, Janlo, JoeKearney, John Nolan, Jérôme, LachlanA, Lpele, Melcombe, Michael Hardy, Msfr, Mveil4483, Myasuda, Nbarth, Nowaket, Nschuma, Oekaki, Ott2, PAR, Ptrf,
    PyonDude, Qwfp, Rich Farmbrough, Rjwilmsi, Rlendog, Rob Langford, Stpasha, Sun Creator, That Guy, From That Show!, TomyDuby, Wainson, Wikibob, William Avery, Іванко1, 105
    anonymous edits

    Mathematical finance  Source: http://en.wikipedia.org/w/index.php?oldid=523469709  Contributors: A.j.g.cairns, Acroterion, Ahd2007, Ahoerstemeier, Albertod4, Allemandtando, Amckern,
    Angelachou, Arthur Rubin, Author007, Avraham, Ayonbd2000, Baoura, Beetstra, Billolik, Brad7777, Btyner, Burakg, Burlywood, CapitalR, Celuici, Cfries, Charles Matthews, Christoff pale,
    Christofurio, Ciphers, Colonel Warden, Cursive, DMCer, DocendoDiscimus, DominicConnor, Drootopula, DuncanHill, Dysprosia, Edward, Elwikipedista, Eric Kvaalen, Evercat, Eweinber,
    FF2010, Fastfission, Feco, Financestudent, Fintor, Flowanda, Gabbe, Gary King, Gene Nygaard, Giftlite, Giganut, HGB, Halliron, Hannibal19, HappyCamper, Headbomb, Hroðulf, Hu12,
    Hégésippe Cormier, JBellis, JYolkowski, Jackol, Jamesfranklingresham, Jimmaths, Jmnbatista, JohnBlackburne, JonHarder, JonMcLoone, Jonhol, Jrtayloriv, Kaslanidi, Kaypoh, Kimys,
    Kolmogorov Complexity, Kuru, Lamro, Langostas, Limit-theorem, Looxix, MER-C, MM21, Mav, Mic, Michael Hardy, Michaltomek, Mikaey, Minesweeper, MrOllie, Msh210,
    Mydogategodshat, Nikossskantzos, Niuer, Nparikh, Oleg Alexandrov, Onyxxman, Optakeover, Paul A, Pcb21, PhotoBox, Pnm, Portutusd, Ppntori, Punanimal, Quantchina, Quantnet, Ralphpukei,
    Rasmus Faber, Rhobite, Riskbooks, Rodo82, Ronnotel, Ruud Koot, SUPER-QUANT-HERO, Sardanaphalus, Sentriclecub, Silly rabbit, SkyWalker, Smaines, Smesh, Stanislav87, SymmyS,
    Tassedethe, Taxman, Template namespace initialisation script, Tesscass, Tigergb, Timorrill, Timwi, Uxejn, Vabramov, Vasquezomlin, WebScientist, Willsmith, Woohookitty, Xiaobajie,
    YUL89YYZ, Yunli, Zfeinst, Zfr, 257 anonymous edits

    Stochastic differential equation  Source: http://en.wikipedia.org/w/index.php?oldid=524436107  Contributors: AaronKauf, Agmpinia, BenFrantzDale, Bender235, Benjamin.friedrich,
    Bistromathic, Btyner, Cradel, Danyaljj, Dgw, F=q(E+v^B), FilipeS, Firsfron, Foxj, Fuhghettaboutit, Gaius Cornelius, Giftlite, Hairer, Hectorthebat, Innerproduct, Kiefer.Wolfowitz,
    KimYungTae, Kupirijo, LARS, LachlanA, Ladnunwala, Lhfriedman, Lkinkade, Marie Poise, Mathsfreak, McLowery, Melcombe, Michael Hardy, Moskvax, Nilanjansaha27, OliAtlason,
    Phys0111, Piloter, Rikimaru, RockMagnetist, Ronnotel, Sandym, SgtThroat, Shawn@garbett.org, Shawnc, Siroxo, Sullivan.t.j, The Man in Question, UffeHThygesen, Vovchyck, 51 anonymous
    edits

    Brownian model of financial markets  Source: http://en.wikipedia.org/w/index.php?oldid=509889166  Contributors: BD2412, Chris the speller, Ensign beedrill, Financestudent, Giganut,
    GoingBatty, LilHelpa, Rich Farmbrough, Rjwilmsi, SpacemanSpiff, Tassedethe, Woohookitty, Zfeinst, 35 anonymous edits

    Stochastic volatility  Source: http://en.wikipedia.org/w/index.php?oldid=526032749  Contributors: A. Pichler, Asperal, Benna, Bluegrass, Btyner, Chiinners, DominicConnor, Enchanter,
    Finnancier, Firsfron, Froufrou07, Hopefully acceptable username, Hulbert88, Leifern, Lmatt, Mevalabadie, Michael Hardy, Mu, Roadrunner, Ronnotel, Seanhunter, Teich50, Ulner, Wavelength,
    Willsmith, Woohookitty, Zfeinst, 45 anonymous edits

    Black–Scholes  Source: http://en.wikipedia.org/w/index.php?oldid=527347371  Contributors: -Ozone-, A Train, A bit iffy, A. Pichler, A3 nm, Aberdeen01, Abpai, Adapter9, AdrianTM,
    Akrouglo, Alan ffm, Aldanor, Alpesh24, Altenmann, Andrew Gray, AndrewHowse, Andycjp, Antoniekotze, Aquishix, Arkachatterjea, Artoasis, Artoffacts, Avikar1, Beetstra, Bender235,
    BetterToBurnOut, Betterfinance, Bitalgowarrior, Bobo192, Brianboonstra, Btyner, C S, CSWarren, Calltech, Charles Matthews, Charlesreid1, Chinacat2002, Chrisvls, Cibergili, Coder Dan,
    CoolGuy, Cretog8, Crocefisso, Crougeaux, Crowsnest, Csl77, Css, Cst17, Curtis23, Cyclist, Daida, Dan Rayn, Dan131m, DancingPhilosopher, Danfuzz, DavidRF, Dcsohl, Deflective, Dino,
    Domino, Dpwkbw, Dr.007, Drusus 0, DudeOnTheStreet, Dudegalea, Echion2, EconomistBR, Edouard.darchimbaud, Edward, EdwardLockhart, Elkman, EmmetCaulfield, Enchanter, Enpc,
    Ercolev, Ernie shoemaker, Etrigan, Favonian, Feco, Fenice, FilipeS, Finnancier, Fintor, Fish147, France3470, Fsiler, GRBerry, Galizur, Gaschroeder, Gauge, Geschichte, Giftlite, Gnixon,
    GoldenPi, Goliat 43, Goodnightmush, Grafen, Gretchen, Gulbenk, Guy M, HLwiKi, Hadal, Hadlock, Hu12, HyDeckar, Ikelos, Indoorworkbench, IronGargoyle, Islandbay, IstvanWolf, JaGa,
    Jaccard, JahJah, Jameslwoodward, JanSuchy, Jerzy, Jigneshkerai89, Jitse Niesen, Jlharrington, Jmnbatista, John, Johnbywater, JustAGal, Justin73, JzG, Kaslanidi, Kbrose, Kcordina,
    Khagansama, Kimbly, Kirti Khandelwal, Kohzy, Kungfuadam, Kwamikagami, Kwertii, Landroni, Lehalle, Leifern, Leonard G., Leyanese, Lithium6, Lmatt, Lxtko2, Makreel, Marudubshinki,
    Materialscientist, Mathiastck, Melcombe, Mets501, Mewbutton, Mgiganteus1, Michael C Price, Michael Hardy, Michael Slone, Mikc75, Minhaj.shaik, Mr Ape, MrOllie, Mrseacow,
    Mydogategodshat, Naddy, Nbarth, Nicolas1981, Nixdorf, Nkojuharov, Nohat, Notary137, Novolucidus, Odont, Ogmark, Oleg Alexandrov, Oli Filth, Olivier, Pameis3, Parsiad.azimzadeh, Pcb21,
    PeterM, Petrus, Pgreenfinch, PleaseKING, Pls, Pontus, Pps, Prasantapalwiki, Pretzelpaws, Protonk, Quarague, RJN, Rajnr, Razorflame, Rjwilmsi, Roadrunner, Roberto.croce, Ronnotel,
    RussNelson, S2000magician, SDC, SPat, Saibod, Schmock, Scrutchfield, Sebculture, Sgcook, Silly rabbit, Smaines, Smallbones, Smallbones11, Spliffy, Splintax, Sreejithk2000, Statoman71,
    Stephen B Streater, Stevenmitchell, Stochastic Financial Model, Supergrane, Sysfx, Tangotango, Taphuri, Tarotcards, Tawker, Tedickey, The Anome, TheObtuseAngleOfDoom, Thrymr, Tiles,
    Tinz, Tomeasy, Tony1, Tristanreid, Typewritten, Ulner, Viz, Vonfraginoff, Vvarkey, Waagh, WallStGolfer31, WebScientist, Wile E. Heresiarch, William Avery, Williamv1138, Willsmith,
    Wlmsears, Wurmli, Yassinemaaroufi, Yill577, Zfeinst, Zophar, Zzyzx11, 754 anonymous edits

    Black model  Source: http://en.wikipedia.org/w/index.php?oldid=519475158  Contributors: Captain Disdain, Cfries, DanielCD, Dori, DudeOnTheStreet, Edward, Feco, Felix Forgeron,
    Finnancier, Fintor, GeneralBob, Hu12, Jimmycorp, Lmatt, Materialscientist, Michael Hardy, Oleg Alexandrov, Oli Filth, Pcb21, PtolemyIV, Renamed user 4, Roadrunner, Samcol1492, Sgcook,
    Ulner, の ぼ り ん, 71 anonymous edits

    Black–Derman–Toy model  Source: http://en.wikipedia.org/w/index.php?oldid=515144794  Contributors: AdamSmithee, Bryan Derksen, Christofurio, Danielfranciscook, Finnancier, Fintor,
    GTBacchus, Gabbe, Gurch, Hmains, John Quiggin, Lcynet, PatrickFlaherty, Pleclech, Rich Farmbrough, RobJ1981, Schmock, Tony1, 13 anonymous edits

    Cox–Ingersoll–Ross model  Source: http://en.wikipedia.org/w/index.php?oldid=527348445  Contributors: A. Pichler, AdamSmithee, Amakuha, Andreas4965, Arthena, Bankert, Finnancier,
    Fintor, Forwardmeasure, Gred925, Hwansokcho, JonathanIwiki, Lamro, Melcombe, Michael Hardy, Piloter, Schmock, Sgmu, 22 anonymous edits

    Monte Carlo method  Source: http://en.wikipedia.org/w/index.php?oldid=527276077  Contributors: *drew, 6StringJazzer, A.Cython, ABCD, Aardvark92, Adfred123, Aferistas, Agilemolecule,
    Akanksh, Alanbly, Albmont, AlexBIOSS, AlexandreCam, AlfredR, Alliance09, Altenmann, Amritkar, Andkore, Andrea Parri, Andreas Kaufmann, Angelbo, Aniu, Apanag, Aspuru,
    Astridpowers, Atlant, Avalcarce, Avicennasis, Aznrocket, BAxelrod, BConleyEEPE, Banano03, Banus, Bduke, Beatnik8983, Behinddemlune, BenFrantzDale, BenTrotsky, Bender235,
    Bensaccount, Bgwhite, BillGosset, Bkell, Blotwell, Bmaddy, Bobo192, Boffob, Bomazi, Boredzo, Broquaint, Btyner, C628, CRGreathouse, Caiaffa, Canterbury Tail, Charles Matthews,
    ChicagoActuary, Chip123456, Chrisriche, Cibergili, Cm the p, Colonies Chris, Compsim, Coneslayer, Cretog8, Criter, Crougeaux, Cybercobra, Cython1, DMG413, Damistmu, Darkrider2k6,
    Datta research, Davnor, Ddcampayo, Ddxc, Denis.arnaud, Dhatfield, DianeSteele, Digemedi, Dmcq, Dogface, Donner60, Download, Dratman, Drewnoakes, Drsquirlz, Ds53, Duck ears,
    Duncharris, Dylanwhs, EEMIV, ERosa, Edstat, Edward, EldKatt, Elpincha, Elwikipedista, Eudaemonic3, Ezrakilty, Fastfission, Fintor, Flammifer, Fritsebits, Frozen fish, Furrykef, G716, Gareth
    Griffith-Jones, Giftlite, Gilliam, Glosser.ca, Goudzovski, GraemeL, GrayCalhoun, Greenyoda, Grestrepo, Gruntfuterk, Gtrmp, Gökhan, Hanksname, Hawaiian717, Headbomb, Herath.sanjeewa,
    Here, Hokanomono, Hu12, Hubbardaie, Hugh.medal, ILikeThings, IanOsgood, Inrad, Ironboy11, Itub, J.Dong820, J00tel, Jackal irl, Jacobleonardking, Jamesscottbrown, Janpedia, JavaManAz,
    Jayjaybillings, Jeffq, Jitse Niesen, Joey0084, John, John Vandenberg, JohnOwens, Jorgenumata, Jsarratt, Jugander, Jérôme, K.lee, KSmrq, KaHa242, Karol Langner, Kenmckinley, Kibiusa,
    Kimys, Knordlun, Kroese, Kummi, Kuru, Lambyte, Lee Sau Dan, LeoTrottier, Lerdthenerd, Levin, Lexor, Lhynard, LizardJr8, LoveMonkey, M-le-mot-dit, Magioladitis, Malatesta, Male1979,
    ManchotPi, Marcofalcioni, Marie Poise, Mark Foskey, Martinp, Martombo, Masatran, Mathcount, MaxHD, Maxal, Maxentrope, Maylene, Mblumber, Mbmneuro, Mbryantuk, Melcombe,
    Michael Hardy, MicioGeremia, Mikael V, Misha Stepanov, Mlpearc, Mnath, Moink, MoonDJ, MrOllie, Mtford, Nagasaka, Nanshu, Narayanese, Nasarouf, Nelson50, Nosophorus, Nsaa, Nuno
    Tavares, Nvartaniucsd, Oceans and oceans, Ohnoitsjamie, Oli Filth, Oneboy, Orderud, OrgasGirl, Ott2, P99am, Paul August, PaulieG, PaulxSA, Pbroks13, Pcb21, Pdelmoral, Pete.Hurd,
    PeterBoun, Pgreenfinch, Philopp, Phluid61, PhnomPencil, Pibwl, Pinguin.tk, PlantTrees, Pne, Pol098, Popsracer, Poupoune5, Qadro, Quantumelfmage, Quentar, Qwfp, Qxz, R'n'B, RWillwerth,
    Ramin Nakisa, Rcsprinter123, Redgolpe, Renesis, RexJacobus, Reza1615, Rich Farmbrough, Richie Rocks, Rinconsoleao, Rjmccall, Rjwilmsi, Robma, RockMagnetist, Rodo82, Ronnotel, Ronz,
    Rs2, Rygo796, SKelly1313, Saltwolf, Sam Korn, Samratvishaljain, Sankyo68, Sergio.ballestrero, Shacharg, Shreevatsa, Sjoemetsa, Snegtrail, Snoyes, Somewherepurple, Spellmaster, Splash6,
    Spotsaurian, SpuriousQ, Stefanez, Stefanomione, StewartMH, Stimpy, Storm Rider, Superninja, Sweetestbilly, Tarantola, Taxman, Tayste, Techhead7890, Tesi1700, Theron110, Thirteenity,
    ThomasNichols, Thr4wn, Thumperward, Tiger Khan, Tim Starling, Tom harrison, TomFitzhenry, Tooksteps, Toughpkh, Trebor, Twooars, Tyrol5, UBJ 43X, Urdutext, Uwmad, Vgarambone,
    Vipuser, Vividstage, VoseSoftware, Wavelength, Wile E. Heresiarch, William Avery, X-men2011, Yoderj, Zarniwoot, Zoicon5, Zr40, Zuidervled, Zureks, Мурад 97, 471 anonymous edits

    Image Sources, Licenses and Contributors
    Image:Random-data-plus-trend-r2.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random-data-plus-trend-r2.png  License: GNU Free Documentation License  Contributors:
    Adam majewski, Maksim, Rainald62, WikipediaMaster
    File:Tuberculosis incidence US 1953-2009.png  Source: http://en.wikipedia.org/w/index.php?title=File:Tuberculosis_incidence_US_1953-2009.png  License: Creative Commons
    Attribution-Sharealike 3.0  Contributors: User:Ldecola
    File:Stationarycomparison.png  Source: http://en.wikipedia.org/w/index.php?title=File:Stationarycomparison.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    Protonk (talk)
    Image:Acf new.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Acf_new.svg  License: Creative Commons Attribution-Sharealike 2.5  Contributors: Acf.svg: Jeremy Manning
    derivative work: Jrmanning (talk)
    File:Comparison_convolution_correlation.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Comparison_convolution_correlation.svg  License: Creative Commons
    Attribution-Sharealike 3.0  Contributors: Cmglee
    File:Comparison convolution correlation.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Comparison_convolution_correlation.svg  License: Creative Commons
    Attribution-Sharealike 3.0  Contributors: Cmglee
    Image:white-noise.png  Source: http://en.wikipedia.org/w/index.php?title=File:White-noise.png  License: GNU Free Documentation License  Contributors: Omegatron
    Image:Noise.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Noise.jpg  License: GNU Free Documentation License  Contributors: A1
    Image:simulation-filter.png  Source: http://en.wikipedia.org/w/index.php?title=File:Simulation-filter.png  License: GNU Free Documentation License  Contributors: Bapho, Maksim, 1
    anonymous edits
    Image:whitening-filter.png  Source: http://en.wikipedia.org/w/index.php?title=File:Whitening-filter.png  License: GNU Free Documentation License  Contributors: Maksim, Mdd, 1 anonymous
    edits
    Image:Random Walk example.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Random_Walk_example.svg  License: GNU Free Documentation License  Contributors: Morn (talk)
    File:2D Random Walk 400x400.ogv  Source: http://en.wikipedia.org/w/index.php?title=File:2D_Random_Walk_400x400.ogv  License: Creative Commons Attribution-Sharealike 3.0
     Contributors: Purpy Pupple
    Image:Flips.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Flips.svg  License: Public Domain  Contributors: Flips.png: en:User:Whiteglitter79 derivative work: Tsaitgaist (talk)
    Image:Random walk in2D closeup.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in2D_closeup.png  License: Public Domain  Contributors: Darapti, Jahobr, Oleg
    Alexandrov
    Image:random walk in2D.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in2D.png  License: Public Domain  Contributors: Darapti, Jahobr, Oleg Alexandrov
    Image:Random walk 2000000.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_2000000.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    Purpy Pupple
    Image:Walk3d 0.png  Source: http://en.wikipedia.org/w/index.php?title=File:Walk3d_0.png  License: Creative Commons Attribution-ShareAlike 3.0 Unported  Contributors: Darapti, Karelj,
    Zweistein
    Image:Brownian hierarchical.png  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_hierarchical.png  License: GNU Free Documentation License  Contributors: Anarkman,
    Darapti, Di Gama, 1 anonymous edits
    File:Antony Gormley Quantum Cloud 2000.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Antony_Gormley_Quantum_Cloud_2000.jpg  License: Creative Commons Attribution
    2.0  Contributors: User:FlickrLickr
    File:Random walk in1d.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in1d.jpg  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    User:Flomenbom
    File:Brownian motion large.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_motion_large.gif  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    User:Lookang
    File:Brownianmotion5particles150frame.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Brownianmotion5particles150frame.gif  License: Creative Commons Attribution-Sharealike
    3.0  Contributors: User:Lookang
    Image:Brownian hierarchical.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_hierarchical.svg  License: Public Domain  Contributors: Di Gama, JonMcLoone, Perditax
    Image:Wiener process 3d.png  Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_3d.png  License: GNU Free Documentation License  Contributors: Original uploader was
    Sullivan.t.j at English Wikipedia.
    Image:PerrinPlot2.svg  Source: http://en.wikipedia.org/w/index.php?title=File:PerrinPlot2.svg  License: Public Domain  Contributors: J. B. Perrin, SVG drawing by User:MiraiWarren
    File:Diffusion of Brownian particles.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Diffusion_of_Brownian_particles.svg  License: Creative Commons Zero  Contributors:
    InverseHypercube
    File:Brownian motion gamboge.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_motion_gamboge.jpg  License: Public Domain  Contributors: Bernard H. Lavenda
    File:Brownian Motion.ogv  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_Motion.ogv  License: Creative Commons Attribution-Sharealike 3.0  Contributors: DSP-user (talk)
    19:43, 26 January 2011 (UTC)
    Image:BMonSphere.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:BMonSphere.jpg  License: Creative Commons Attribution-Sharealike 2.5  Contributors: Thomas Steiner
    Image:wiener_process_zoom.png  Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_zoom.png  License: GNU Free Documentation License  Contributors: User:PeR
    Image:Wiener_process_3d.png  Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_3d.png  License: GNU Free Documentation License  Contributors: Original uploader
    was Sullivan.t.j at English Wikipedia.
    File:ArTimeSeries.svg  Source: http://en.wikipedia.org/w/index.php?title=File:ArTimeSeries.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
    File:AutocorrTimeAr.svg  Source: http://en.wikipedia.org/w/index.php?title=File:AutocorrTimeAr.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
    File:AutoCorrAR.svg  Source: http://en.wikipedia.org/w/index.php?title=File:AutoCorrAR.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
    File:MovingAverage.GIF  Source: http://en.wikipedia.org/w/index.php?title=File:MovingAverage.GIF  License: Public Domain  Contributors: Victorv
    Image:Weighted moving average weights N=15.png  Source: http://en.wikipedia.org/w/index.php?title=File:Weighted_moving_average_weights_N=15.png  License: Creative Commons
    Attribution-ShareAlike 3.0 Unported  Contributors: Joxemai, Kevin Ryde
    Image:Exponential moving average weights N=15.png  Source: http://en.wikipedia.org/w/index.php?title=File:Exponential_moving_average_weights_N=15.png  License: Creative Commons
    Attribution-ShareAlike 3.0 Unported  Contributors: Joxemai, Kevin Ryde
    Image:Function ocsillating at 3 hertz.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Function_ocsillating_at_3_hertz.svg  License: Creative Commons Attribution-Sharealike 3.0
     Contributors: Thenub314
    Image:Onfreq.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Onfreq.svg  License: GNU Free Documentation License  Contributors: Original: Nicholas Longo, SVG conversion:
    DX-MON (Richard Mant)
    Image:Offfreq.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Offfreq.svg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Thenub314
    Image:Fourier transform of oscillating function.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Fourier_transform_of_oscillating_function.svg  License: Creative Commons
    Attribution-Sharealike 3.0  Contributors: Thenub314
    File:Rectangular function.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Rectangular_function.svg  License: GNU Free Documentation License  Contributors: Aflafla1, Axxgreazz,
    Bender235, Darapti, Omegatron
    File:Sinc function (normalized).svg  Source: http://en.wikipedia.org/w/index.php?title=File:Sinc_function_(normalized).svg  License: GNU Free Documentation License  Contributors: Aflafla1,
    Bender235, Juiced lemon, Krishnavedala, Omegatron, Pieter Kuiper
    File:Signal processing system.png  Source: http://en.wikipedia.org/w/index.php?title=File:Signal_processing_system.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    User:Brews ohare
    Image:Vix.png  Source: http://en.wikipedia.org/w/index.php?title=File:Vix.png  License: Public Domain  Contributors: Artur adib
    Image:Levy distributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_distributionPDF.png  License: Public Domain  Contributors: User:PAR
    Image:Levyskew distributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_distributionPDF.png  License: GNU Free Documentation License  Contributors:
    EugeneZelenko, Joxemai, PAR, 1 anonymous edits
    Image:Levy distributionCDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_distributionCDF.png  License: GNU Free Documentation License  Contributors:
    EugeneZelenko, Joxemai, PAR, 1 anonymous edits
    Image:Levyskew distributionCDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_distributionCDF.png  License: GNU Free Documentation License  Contributors:
    EugeneZelenko, Joxemai, PAR, 1 anonymous edits
    Image:Levy LdistributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_LdistributionPDF.png  License: GNU Free Documentation License  Contributors:
    EugeneZelenko, Joxemai, PAR, 1 anonymous edits
    Image:Levyskew LdistributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_LdistributionPDF.png  License: GNU Free Documentation License  Contributors:
    EugeneZelenko, Joxemai, PAR, 1 anonymous edits
    Image:Stockpricesimulation.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Stockpricesimulation.jpg  License: Public Domain  Contributors: Roberto Croce
    File:European Call Surface.png  Source: http://en.wikipedia.org/w/index.php?title=File:European_Call_Surface.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors:
    Parsiad.azimzadeh
    File:Crowd outside nyse.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Crowd_outside_nyse.jpg  License: Public Domain  Contributors: AnRo0002, Echtner, Fnfd, Gribeco,
    Gryffindor, Hystrix, Infrogmation, J 1982, Romary, Skeezix1000, Soerfm, Spuk968, Yerpo, 6 anonymous edits
    Image:SQRTDiffusion.png  Source: http://en.wikipedia.org/w/index.php?title=File:SQRTDiffusion.png  License: GNU Free Documentation License  Contributors: Thomas Steiner
    Image:Pi 30K.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Pi_30K.gif  License: Creative Commons Attribution 3.0  Contributors: CaitlinJo
    Image:Monte carlo method.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Monte_carlo_method.svg  License: Public Domain  Contributors: --pbroks13talk? Original uploader was
    Pbroks13 at en.wikipedia
    File:Monte-carlo2.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Monte-carlo2.gif  License: unknown  Contributors: Zureks
    File:Monte-Carlo method (errors).png  Source: http://en.wikipedia.org/w/index.php?title=File:Monte-Carlo_method_(errors).png  License: unknown  Contributors: Zureks

    License
    Creative Commons Attribution-Share Alike 3.0 Unported
    http://creativecommons.org/licenses/by-sa/3.0/

Time series and forecasting from wikipedia

  • 1. Time Series and Forecasting Compiled by M.Barros, D.Sc. December 12th, 2012 Source: Wikipedia PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Tue, 11 Dec 2012 03:49:39 UTC
  • 2. Contents Articles Time series 1 Forecasting 8 Stationary process 14 Stochastic process 16 Covariance 20 Autocovariance 24 Autocorrelation 25 Cross-correlation 31 White noise 35 Random walk 41 Brownian motion 55 Wiener process 66 Autoregressive model 74 Moving average 80 Autoregressive–moving-average model 86 Fourier transform 90 Spectral density 110 Signal processing 116 Autoregressive conditional heteroskedasticity 118 Autoregressive integrated moving average 122 Volatility (finance) 124 Stable distribution 129 Mathematical finance 137 Stochastic differential equation 141 Brownian model of financial markets 145 Stochastic volatility 151 Black–Scholes 154 Black model 168 Black–Derman–Toy model 170 Cox–Ingersoll–Ross model 172 Monte Carlo method 173 References Article Sources and Contributors 185
  • 3. Image Sources, Licenses and Contributors 188 Article Licenses License 190 AVAILABLE FREE OF CHARGE AT: www.mbarros.com http://mbarrosconsultoria.blogspot.com http://mbarrosconsultoria2.blogspot.com
  • 4. Time series 1 Time series In statistics, signal processing, pattern recognition, econometrics, mathematical finance, Weather forecasting, Earthquake prediction, Electroencephalography, Control engineering and Communications engineering a time series is a sequence of data points, measured typically at successive time instants spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the Nile River at Aswan. Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other Time series: random data plus trend, with best-fit line and different characteristics of the data. Time series forecasting is smoothings the use of a model to predict future values based on previously observed values. Time series are very frequently plotted via line charts. Time series data have a natural temporal ordering. This makes time series analysis distinct from other common data analysis problems, in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see time reversibility.) Methods for time series analyses may be divided into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and recently wavelet analysis; the latter include auto-correlation and cross-correlation analysis. Additionally time series analysis techniques may be divided into parametric and non-parametric methods. The parametric approaches assume that the underlying stationary Stochastic process has a certain structure which can be described using a small number of parameters (for example, using an autoregressive or moving average model). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure. Additionally methods of time series analysis may be divided into linear and non-linear, univariate and multivariate. Time series analysis can be applied to: • real-valued, continuous data • discrete numeric data • discrete symbolic data (i.e. sequences of characters, such as letters and words in English language[1]).
  • 5. Time series 2 Analysis There are several types of data analysis available for time series which are appropriate for different purposes. In the context of statistics, econometrics, quantitative finance, seismology, meteorology, geophysics the primary goal of time series analysis is forecasting, in the context of signal processing, control engineering and communication engineering it is used for signal detection and estimation while in the context of data mining, pattern recognition and machine learning time series analysis can be used for clustering, classification, query by content, anomaly detection as well as forecasting. Exploratory analysis The clearest way to examine a regular time series manually is with a line chart such as the one shown for tuberculosis in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as +/- 10%, with 'surges' in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic. Other techniques Tuberculosis incidence US 1953-2009 include: • Autocorrelation analysis to examine serial dependence • Spectral analysis to examine cyclic behaviour which need not be related to seasonality. For example, sun spot activity varies over 11 year cycles.[2][3] Other common examples include celestial phenomena, weather patterns, neural activity, commodity prices, and economic activity. • Separation into components representing trend, seasonality, slow and fast variation, cyclical irregular: see decomposition of time series • Simple properties of marginal distributions Prediction and forecasting • Fully formed statistical models for stochastic simulation purposes, so as to generate alternative versions of the time series, representing what might happen over non-specific time-periods in the future • Simple or fully formed statistical models to describe the likely outcome of the time series in the immediate future, given knowledge of the most recent outcomes (forecasting). • Forecasting on time series is usually done using automated statistical software packages and programming languages, such as R (programming language), S (programming language), SAS (software), SPSS, Minitab and many others.
  • 6. Time series 3 Classification • Assigning time series pattern to a specific category, for example identify a word based on series of hand movements in Sign language See main article: Statistical classification Regression analysis • Estimating future value of a signal based on its previous behavior, e.g. predict the price of AAPL stock based on its previous price movements for that hour, day or month, or predict position of Apollo 11 spacecraft at a certain future moment based on its current trajectory (i.e. time series of its previous locations).[4] • Regression analysis is usually based on statistical interpretation of time series properties in time domain, pioneered by statisticians George Box and Gwilym Jenkins in the 50s: see Box–Jenkins See main article: Regression analysis Signal Estimation • This approach is based on Harmonic analysis and filtering of signals in Frequency domain using Fourier transform, and Spectral density estimation, the development of which was significantly accelerated during World War II by mathematician Norbert Weiner, electrical engineers Rudolf E. Kálmán, Dennis Gabor and others for filtering signal from noise and predicting signal value at a certain point in time, see Kalman Filter, Estimation theory and Digital Signal Processing Models Models for time series data can have many forms and represent different stochastic processes. When modeling variations in the level of a process, three broad classes of practical importance are the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models. These three classes depend linearly[5] on previous data points. Combinations of these ideas produce autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA) model generalizes the former three. Extensions of these classes to deal with vector-valued data are available under the heading of multivariate time-series models and sometimes the preceding acronyms are extended by including an initial "V" for "vector". An additional set of extensions of these models is available for use where the observed time-series is driven by some "forcing" time-series (which may not have a causal effect on the observed series): the distinction from the multivariate case is that the forcing series may be deterministic or under the experimenter's control. For these models, the acronyms are extended with a final "X" for "exogenous". Non-linear dependence of the level of a series on previous data points is of interest, partly because of the possibility of producing a chaotic time series. However, more importantly, empirical investigations can indicate the advantage of using predictions derived from non-linear models, over those from linear models, as for example in nonlinear autoregressive exogenous models. Among other types of non-linear time series models, there are models to represent the changes of variance along time (heteroskedasticity). These models represent autoregressive conditional heteroskedasticity (ARCH) and the collection comprises a wide variety of representation (GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc.). Here changes in variability are related to, or predicted by, recent past values of the observed series. This is in contrast to other possible representations of locally varying variability, where the variability might be modelled as being driven by a separate time-varying process, as in a doubly stochastic model. 
In recent work on model-free analyses, wavelet transform based methods (for example locally stationary wavelets and wavelet decomposed neural networks) have gained favor. Multiscale (often referred to as multiresolution) techniques decompose a given time series, attempting to illustrate time dependence at multiple scales. See also
  • 7. Time series 4 Markov switching multifractal (MSMF) techniques for modeling volatility evolution. Notation A number of different notations are in use for time-series analysis. A common notation specifying a time series X that is indexed by the natural numbers is written X = {X1, X2, ...}. Another common notation is Y = {Yt: t ∈ T}, where T is the index set. Conditions There are two sets of conditions under which much of the theory is built: • Stationary process • Ergodic process However, ideas of stationarity must be expanded to consider two important ideas: strict stationarity and second-order stationarity. Both models and applications can be developed under each of these conditions, although the models in the latter case might be considered as only partly specified. In addition, time-series analysis can be applied where the series are seasonally stationary or non-stationary. Situations where the amplitudes of frequency components change with time can be dealt with in time-frequency analysis which makes use of a time–frequency representation of a time-series or signal.[6] Models The general representation of an autoregressive model, well known as AR(p), is where the term εt is the source of randomness and is called white noise. It is assumed to have the following characteristics: • • • With these assumptions, the process is specified up to second-order moments and, subject to conditions on the coefficients, may be second-order stationary. If the noise also has a normal distribution, it is called normal or Gaussian white noise. In this case, the AR process may be strictly stationary, again subject to conditions on the coefficients. Tools for investigating time-series data include: • Consideration of the autocorrelation function and the spectral density function (also cross-correlation functions and cross-spectral density functions) • Scaled cross- and auto-correlation functions[7] • Performing a Fourier transform to investigate the series in the frequency domain • Use of a filter to remove unwanted noise • Principal components analysis (or empirical orthogonal function analysis) • Singular spectrum analysis • "Structural" models:
  • 8. Time series 5 • General State Space Models • Unobserved Components Models • Machine Learning • Artificial neural networks • Support Vector Machine • Fuzzy Logic • Hidden Markov model • Control chart • Shewhart individuals control chart • CUSUM chart • EWMA chart • Detrended fluctuation analysis • Dynamic time warping • Dynamic Bayesian network • Time-frequency analysis techniques: • Fast Fourier Transform • Continuous wavelet transform • Short-time Fourier transform • Chirplet transform • Fractional Fourier transform • Chaotic analysis • Correlation dimension • Recurrence plots • Recurrence quantification analysis • Lyapunov exponents • Entropy encoding Measures Time series metrics or features that can be used for time series classification or regression analysis[8]: • Univariate linear measures • Moment (mathematics) • Spectral band power • Spectral edge frequency • Accumulated Energy (signal processing) • Characteristics of the autocorrelation function • Hjorth parameters • FFT parameters • Autoregressive model parameters • Univariate non-linear measures • Measures based on the correlation sum • Correlation dimension • Correlation integral • Correlation density • Correlation entropy
  • 9. Time series 6 • Approximate Entropy[9] • Sample Entropy • Fourier entropy • Wavelet entropy • Rényi entropy • Higher-order methods • Marginal predictability • Dynamical similarity index • State space dissimilarity measures • Lyapunov exponent • Permutation methods • Local flow • Other univariate measures • Algorithmic complexity • Kolmogorov complexity estimates • Hidden Markov Model states • Surrogate time series and surrogate correction • Loss of recurrence (degree of non-stationarity) • Bivariate linear measures • Maximum linear cross-correlation • Linear Coherence (signal processing) • Bivariate non-linear measures • Non-linear interdependence • Dynamical Entrainment (physics) • Measures for Phase synchronization • Similarity measures[10]: • Dynamic Time Warping • Hidden Markov Models • Edit distance • Total correlation • Newey–West estimator • Prais-Winsten transformation • Data as Vectors in a Metrizable Space • Minkowski distance • Mahalanobis distance • Data as Time Series with Envelopes • Global Standard Deviation • Local Standard Deviation • Windowed Standard Deviation • Data Interpreted as Stochastic Series • Pearson product-moment correlation coefficient • Spearman's rank correlation coefficient • Data Interpreted as a Probability Distribution Function • Kolmogorov-Smirnov test • Cramér-von Mises criterion
  • 10. Time series 7 References [1] Lin, Jessica and Keogh, Eamonn and Lonardi, Stefano and Chiu, Bill. A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003. url: http:/ / doi. acm. org/ 10. 1145/ 882082. 882086 [2] Bloomfield, P. (1976). Fourier analysis of time series: An introduction. New York: Wiley. [3] Shumway, R. H. (1988). Applied statistical time series analysis. Englewood Cliffs, NJ: Prentice Hall. [4] Lawson, Charles L., Hanson, Richard, J. (1987). Solving Least Squares Problems. Society for Industrial and Applied Mathematics, 1987. [5] Gershenfeld, N. (1999). The nature of mathematical modeling. p.205-08 [6] Boashash, B. (ed.), (2003) Time-Frequency Signal Analysis and Processing: A Comprehensive Reference, Elsevier Science, Oxford, 2003 ISBN ISBN 0-08-044335-4 [7] Nikolić D, Muresan RC, Feng W, Singer W (2012) Scaled correlation analysis: a better way to compute a cross-correlogram. European Journal of Neuroscience, pp. 1–21, doi:10.1111/j.1460-9568.2011.07987.x http:/ / www. danko-nikolic. com/ wp-content/ uploads/ 2012/ 03/ Scaled-correlation-analysis. pdf [8] Mormann, Florian and Andrzejak, Ralph G. and Elger, Christian E. and Lehnertz, Klaus. 'Seizure prediction: the long and winding road. Brain, 2007,130 (2): 314-33.url : http:/ / brain. oxfordjournals. org/ content/ 130/ 2/ 314. abstract [9] Land, Bruce and Elias, Damian. Measuring the "Complexity" of a time series. URL: http:/ / www. nbb. cornell. edu/ neurobio/ land/ PROJECTS/ Complexity/ [10] Ropella, G.E.P.; Nag, D.A.; Hunt, C.A.; , "Similarity measures for automated comparison of in silico and in vitro experimental results," Engineering in Medicine and Biology Society, 2003. Proceedings of the 25th Annual International Conference of the IEEE , vol.3, no., pp. 2933- 2936 Vol.3, 17-21 Sept. 2003 doi: 10.1109/IEMBS.2003.1280532 URL: http:/ / ieeexplore. ieee. org/ stamp/ stamp. jsp?tp=& arnumber=1280532& isnumber=28615 Further reading • Bloomfield, P. (1976). Fourier analysis of time series: An introduction. New York: Wiley. • Box, George; Jenkins, Gwilym (1976), Time series analysis: forecasting and control, rev. ed., Oakland, California: Holden-Day • Brillinger, D. R. (1975). Time series: Data analysis and theory. New York: Holt, Rinehart. & Winston. • Brigham, E. O. (1974). The fast Fourier transform. Englewood Cliffs, NJ: Prentice-Hall. • Elliott, D. F., & Rao, K. R. (1982). Fast transforms: Algorithms, analyses, applications. New York: Academic Press. • Gershenfeld, Neil (2000), The nature of mathematical modeling, Cambridge: Cambridge Univ. Press, ISBN 978-0-521-57095-4, OCLC 174825352 • Hamilton, James (1994), Time Series Analysis, Princeton: Princeton Univ. Press, ISBN 0-691-04289-6 • Jenkins, G. M., & Watts, D. G. (1968). Spectral analysis and its applications. San Francisco: Holden-Day. • Priestley, M. B. (1981). Spectral Analysis and Time Series. London: Academic Press. ISBN 978-0-12-564901-8 • Shasha, D. (2004), High Performance Discovery in Time Series, Berlin: Springer, ISBN 0-387-00857-8 • Shumway, R. H. (1988). Applied statistical time series analysis. Englewood Cliffs, NJ: Prentice Hall. • Wiener, N.(1964). Extrapolation, Interpolation, and Smoothing of Stationary Time Series.The MIT Press. • Wei, W. W. (1989). Time series analysis: Univariate and multivariate methods. New York: Addison-Wesley. • Weigend, A. S., and N. A. Gershenfeld (Eds.) 
(1994) Time Series Prediction: Forecasting the Future and Understanding the Past. Proceedings of the NATO Advanced Research Workshop on Comparative Time Series Analysis (Santa Fe, May 1992) MA: Addison-Wesley. • Durbin J., and Koopman S.J. (2001) Time Series Analysis by State Space Methods. Oxford University Press.
  • 11. Time series 8 External links • A First Course on Time Series Analysis (http://statistik.mathematik.uni-wuerzburg.de/timeseries/) - an open source book on time series analysis with SAS • Introduction to Time series Analysis (Engineering Statistics Handbook) (http://www.itl.nist.gov/div898/ handbook/pmc/section4/pmc4.htm) - a practical guide to Time series analysis • MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases (http://www.jstatsoft.org/ v33/i05/paper) Forecasting Forecasting is the process of making statements about events whose actual outcomes (typically) have not yet been observed. A commonplace example might be estimation of some variable of interest at some specified future date. Prediction is a similar, but more general term. Both might refer to formal statistical methods employing time series, cross-sectional or longitudinal data, or alternatively to less formal judgemental methods. Usage can differ between areas of application: for example, in hydrology, the terms "forecast" and "forecasting" are sometimes reserved for estimates of values at certain specific future times, while the term "prediction" is used for more general estimates, such as the number of times floods will occur over a long period. Risk and uncertainty are central to forecasting and prediction; it is generally considered good practice to indicate the degree of uncertainty attaching to forecasts. In any case, the data must be up to date in order for the forecast to be as accurate as possible.[1] Although quantitative analysis can be very precise, it is not always appropriate. Some experts in the field of forecasting have advised against the use of mean square error to compare forecasting methods.[2] Categories of forecasting methods Qualitative vs. quantitative methods Qualitative forecasting techniques are subjective, based on the opinion and judgment of consumers, experts; appropriate when past data is not available. It is usually applied to intermediate-long range decisions. Examples of qualitative forecasting methods are: informed opinion and judgment, the Delphi method, market research, historical life-cycle analogy. Quantitative forecasting models are used to estimate future demands as a function of past data; appropriate when past data are available. The method is usually applied to short-intermediate range decisions. Examples of quantitative forecasting methods are: last period demand, simple and weighted moving averages (N-Period), simple exponential smoothing, multiplicative seasonal indexes.
  • 12. Forecasting 9 Naïve approach Naïve forecasts are the most cost-effective and efficient objective forecasting model, and provide a benchmark against which more sophisticated models can be compared. For stable time series data, this approach says that the forecast for any period equals the previous period's actual value. Reference class forecasting Reference class forecasting was developed by Oxford professor Bent Flyvbjerg to eliminate or reduce bias in forecasting by focusing on distributional information about past, similar outcomes to that being forecasted.[3] Daniel Kahneman, Nobel Prize winner in economics, calls Flyvbjerg's counsel to use reference class forecasting to de-bias forecasts, "the single most important piece of advice regarding how to increase accuracy in forecasting.”[4] Time series methods Time series methods use historical data as the basis of estimating future outcomes. • Moving average • Weighted moving average • Kalman filtering • Exponential smoothing • Autoregressive moving average (ARMA) • Autoregressive integrated moving average (ARIMA) e.g. Box-Jenkins • Extrapolation • Linear prediction • Trend estimation • Growth curve Causal / econometric forecasting methods Some forecasting methods use the assumption that it is possible to identify the underlying factors that might influence the variable that is being forecast. For example, including information about weather conditions might improve the ability of a model to predict umbrella sales. This is a model of seasonality which shows a regular pattern of up and down fluctuations. In addition to weather, seasonality can also be due to holidays and customs such as predicting that sales in college football apparel will be higher during football season as opposed to the off season.[5] Casual forecasting methods are also subject to the discretion of the forecaster. There are several informal methods which do not have strict algorithms, but rather modest and unstructured guidance. One can forecast based on, for example, linear relationships. If one variable is linearly related to the other for a long enough period of time, it may be beneficial to predict such a relationship in the future. This is quite different from the aforementioned model of seasonality whose graph would more closely resemble a sine or cosine wave. The most important factor when performing this operation is using concrete and substantiated data. Forecasting off of another forecast produces inconclusive and possibly erroneous results. Such methods include: • Regression analysis includes a large group of methods that can be used to predict future values of a variable using information about other variables. These methods include both parametric (linear or non-linear) and non-parametric techniques. • Autoregressive moving average with exogenous inputs (ARMAX)[6]
  • 13. Forecasting 10 Judgmental methods Judgmental forecasting methods incorporate intuitive judgements, opinions and subjective probability estimates. • Composite forecasts • Delphi method • Forecast by analogy • Scenario building • Statistical surveys • Technology forecasting Artificial intelligence methods • Artificial neural networks • Group method of data handling • Support vector machines Often these are done today by specialized programs loosely labeled • Data mining • Machine Learning • Pattern Recognition Other methods • Simulation • Prediction market • Probabilistic forecasting and Ensemble forecasting Forecasting accuracy The forecast error is the difference between the actual value and the forecast value for the corresponding period. where E is the forecast error at period t, Y is the actual value at period t, and F is the forecast for period t. Measures of aggregate error: Mean absolute error (MAE) Mean Absolute Percentage Error (MAPE) Mean Absolute Deviation (MAD) Percent Mean Absolute Deviation (PMAD) Mean squared error (MSE) Root Mean squared error (RMSE) Forecast skill (SS) Average of Errors (E)
  • 14. Forecasting 11 Business forecasters and practitioners sometimes use different terminology in the industry. They refer to the PMAD as the MAPE, although they compute this as a volume weighted MAPE. For more information see Calculating demand forecast accuracy. Reference class forecasting was developed to increase forecasting accuracy by framing the forecasting problem so as to take into account available distributional information.[7] Daniel Kahneman, winner of the Nobel Prize in economics, calls the use of reference class forecasting "the single most important piece of advice regarding how to increase accuracy in forecasting.”[8] Forecasting accuracy, in contrary to belief, cannot be increased by the addition of experts in the subject area relevant to the phenomenon to be forecast.[9] See also • Calculating demand forecast accuracy • Consensus forecasts • Forecast error • Predictability • Prediction intervals, similar to confidence intervals • Reference class forecasting Applications of forecasting The process of climate change and increasing energy prices has led to the usage of Egain Forecasting of buildings. The method uses forecasting to reduce the energy needed to heat the building, thus reducing the emission of greenhouse gases. Forecasting is used in the practice of Customer Demand Planning in every day business forecasting for manufacturing companies. Forecasting has also been used to predict the development of conflict situations. Experts in forecasting perform research that use empirical results to gauge the effectiveness of certain forecasting models.[10] Research has shown that there is little difference between the accuracy of forecasts performed by experts knowledgeable of the conflict situation of interest and that performed by individuals who knew much less.[11] Similarly, experts in some studies argue that role thinking does not contribute to the accuracy of the forecast.[12] The discipline of demand planning, also sometimes referred to as supply chain forecasting, embraces both statistical forecasting and a consensus process. An important, albeit often ignored aspect of forecasting, is the relationship it holds with planning. Forecasting can be described as predicting what the future will look like, whereas planning predicts what the future should look like.[13][14] There is no single right forecasting method to use. Selection of a method should be based on your objectives and your conditions (data etc.).[15] A good place to find a method, is by visiting a selection tree. An example of a selection tree can be found here.[16] Forecasting has application in many situations: • Supply chain management - Forecasting can be used in Supply Chain Management to make sure that the right product is at the right place at the right time. Accurate forecasting will help retailers reduce excess inventory and therefore increase profit margin. Studies have shown that extrapolations are the least accurate, while company earnings forecasts are the most reliable.[17] Accurate forecasting will also help them meet consumer demand. • Economic forecasting • Earthquake prediction • Egain Forecasting • Land use forecasting • Player and team performance in sports • Political Forecasting • Product forecasting • Sales Forecasting • Technology forecasting
  • 15. Forecasting 12 • Telecommunications forecasting • Transport planning and Transportation forecasting • Weather forecasting, Flood forecasting and Meteorology Limitations As proposed by Edward Lorenz in 1963, long range weather forecasts, those made at a range of two weeks or more, are impossible to definitively predict the state of the atmosphere, owing to the chaotic nature of the fluid dynamics equations involved. Extremely small errors in the initial input, such as temperatures and winds, within numerical models doubles every five days.[18] References [1] Scott Armstrong, Fred Collopy, Andreas Graefe and Kesten C. Green (2010 (last updated)). "Answers to Frequently Asked Questions" (http:/ / qbox. wharton. upenn. edu/ documents/ mktg/ research/ FAQ. pdf). . [2] J. Scott Armstrong and Fred Collopy (1992). "Error Measures For Generalizing About Forecasting Methods: Empirical Comparisons" (http:/ / marketing. wharton. upenn. edu/ ideas/ pdf/ armstrong2/ armstrong-errormeasures-empirical. pdf). International Journal of Forecasting 8: 69–80. . [3] Flyvbjerg, B. (2008). "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice" (http:/ / www. sbs. ox. ac. uk/ centres/ bt/ Documents/ Curbing Optimism Bias and Strategic Misrepresentation. pdf). European Planning Studies 16 (1): 3–21. . [4] Daniel Kahneman, 2011, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux), p. 251 [5] Nahmias, Steven (2009). Production and Operations Analysis. [6] Ellis, Kimberly (2008). Production Planning and Inventory Control Virginia Tech. McGraw Hill. ISBN 978-0-390-87106-0. [7] Flyvbjerg, B. (2008) "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice." (http:/ / www. sbs. ox. ac. uk/ centres/ bt/ Documents/ Curbing Optimism Bias and Strategic Misrepresentation. pdf) European Planning Studies,16 (1), 3-21.] [8] Daniel Kahneman (2011) Thinking, Fast and Slow (New York: Farrar, Straus and Giroux) (p. 251) [9] J. Scott Armstrong (1980). "The Seer-Sucker Theory: The Value of Experts in Forecasting" (http:/ / www. forecastingprinciples. com/ paperpdf/ seersucker. pdf). Technology Review: 16–24. . [10] J. Scott Armstrong, Kesten C. Green and Andreas Graefe (2010). "Answers to Frequently Asked Questions" (http:/ / qbox. wharton. upenn. edu/ documents/ mktg/ research/ FAQ. pdf). . [11] Kesten C. Greene and J. Scott Armstrong (2007). "The Ombudsman: Value of Expertise for Forecasting Decisions in Conflicts" (http:/ / marketing. wharton. upenn. edu/ documents/ research/ Value of expertise. pdf). Interfaces (INFORMS) 0: 1–12. . [12] Kesten C. Green and J. Scott Armstrong (1975). "Role thinking: Standing in other people’s shoes to forecast decisions in conflicts" (http:/ / www. forecastingprinciples. com/ paperpdf/ Escalation Bias. pdf). Role thinking: Standing in other people’s shoes to forecast decisions in conflicts 39: 111–116. . [13] "FAQ" (http:/ / www. forecastingprinciples. com/ index. php?option=com_content& task=view& id=3& Itemid=3). Forecastingprinciples.com. 1998-02-14. . Retrieved 2012-08-28. [14] Kesten C. Greene and J. Scott Armstrong. 2015.pdf "Structured analogies for forecasting" (http:/ / www. qbox. wharton. upenn. edu/ documents/ mktg/ research/ INTFOR3581 - Publication%) (PDF). qbox.wharton.upenn.edu. 2015.pdf. [15] "FAQ" (http:/ / www. forecastingprinciples. com/ index. php?option=com_content& task=view& id=3& Itemid=3#D. _Choosing_the_best_method). Forecastingprinciples.com. 
1998-02-14. . Retrieved 2012-08-28. [16] "Selection Tree" (http:/ / www. forecastingprinciples. com/ index. php?option=com_content& task=view& id=17& Itemid=17). Forecastingprinciples.com. 1998-02-14. . Retrieved 2012-08-28. [17] J. Scott Armstrong (1983). "Relative Accuracy of Judgmental and Extrapolative Methods in Forecasting Annual Earnings" (http:/ / www. forecastingprinciples. com/ paperpdf/ Monetary Incentives. pdf). Journal of Forecasting 2: 437–447. . [18] Cox, John D. (2002). Storm Watchers. John Wiley & Sons, Inc.. pp. 222–224. ISBN 0-471-38108-X. • Armstrong, J. Scott (ed.) (2001) (in English). Principles of forecasting: a handbook for researchers and practitioners. Norwell, Massachusetts: Kluwer Academic Publishers. ISBN 0-7923-7930-6. • Flyvbjerg, Bent, 2008, "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice," European Planning Studies, vol. 16, no. 1, January, pp. 3-21. (http://www.sbs.ox.ac. uk/centres/bt/Documents/Curbing Optimism Bias and Strategic Misrepresentation.pdf) • Ellis, Kimberly (2010) (in English). Production Planning and Inventory Control. McGraw-Hill. ISBN 0-412-03471-9.
  • 16. Forecasting 13 • Geisser, Seymour (1 June 1993) (in English). Predictive Inference: An Introduction. Chapman & Hall, CRC Press. ISBN 0-390-87106-0. • Gilchrist, Warren (1976) (in English). Statistical Forecasting. London: John Wiley & Sons. ISBN 0-471-99403-0. • Hyndman, R.J., Koehler, A.B (2005) "Another look at measures of forecast accuracy" (http://www. robjhyndman.com/papers/mase.pdf), Monash University note. • Makridakis, Spyros; Wheelwright, Steven; Hyndman, Rob J. (1998) (in English). Forecasting: methods and applications (http://www.robjhyndman.com/forecasting/). New York: John Wiley & Sons. ISBN 0-471-53233-9. • Kress, George J.; Snyder, John (30 May 1994) (in English). Forecasting and market analysis techniques: a practical approach. Westport, Connecticut, London: Quorum Books. ISBN 0-89930-835-X. • Rescher, Nicholas (1998) (in English). Predicting the future: An introduction to the theory of forecasting. State University of New York Press. ISBN 0-7914-3553-9. • Taesler, R. (1990/91) Climate and Building Energy Management. Energy and Buildings, Vol. 15-16, pp 599 – 608. • Turchin, P. (2007) "Scientific Prediction in Historical Sociology: Ibn Khaldun meets Al Saud". In: History & Mathematics: Historical Dynamics and Development of Complex Societies. (http://edurss.ru/cgi-bin/db. pl?cp=&page=Book&id=53185&lang=en&blang=en&list=Found) Moscow: KomKniga. ISBN 978-5-484-01002-8 • Sasic Kaligasidis, A et al. (2006) Upgraded weather forecast control of building heating systems. p. 951 ff in Research in Building Physics and Building Engineering Paul Fazio (Editorial Staff), ISBN 0-415-41675-2 • United States Patent 6098893 Comfort control system incorporating weather forecast data and a method for operating such a system (Inventor Stefan Berglund) External links • Forecasting Principles: "Evidence-based forecasting" (http://www.forecastingprinciples.com) • International Institute of Forecasters (http://www.forecasters.org) • Introduction to Time series Analysis (Engineering Statistics Handbook) (http://www.itl.nist.gov/div898/ handbook/pmc/section4/pmc4.htm) - A practical guide to Time series analysis and forecasting • Time Series Analysis (http://www.statsoft.com/textbook/sttimser.html) • Global Forecasting with IFs (http://www.ifs.du.edu) • Earthquake Electromagnetic Precursor Research (http://www.quakefinder.com)
  • 17. Stationary process 14 Stationary process In mathematics, a stationary process (or strict(ly) stationary process or strong(ly) stationary process) is a stochastic process whose joint probability distribution does not change when shifted in time or space. Consequently, parameters such as the mean and variance, if they exist, also do not change over time or position. Stationarity is used as a tool in time series analysis, where the raw data are often transformed to become stationary; for example, economic data are often seasonal and/or dependent on a non-stationary price level. An important type of non-stationary process that does not include a trend-like behavior is the cyclostationary process. Note that a "stationary process" is not the same thing as a "process with a stationary distribution". Indeed there are further possibilities for confusion with the use of "stationary" in the context of stochastic processes; for example a "time-homogeneous" Markov chain is sometimes said to have "stationary transition probabilities". On the other hand, all stationary Markov random processes are time-homogeneous. Definition Formally, let be a stochastic process and let represent the cumulative distribution function of the joint distribution of at times . Then, is said to be stationary if, for all , for all , and for all , Since does not affect , is not a function of time. Examples As an example, white noise is stationary. The sound of a cymbal clashing, if hit only once, is not stationary because the acoustic power of the clash (and hence its variance) diminishes with time. However, it would be possible to invent a stochastic process describing when the cymbal is hit, such that the overall response would form a stationary process. An example of a discrete-time stationary process where the sample space is also discrete (so that the random variable may take one of N possible values) is a Bernoulli scheme. Other examples of a discrete-time stationary process with continuous sample space include some autoregressive and moving average processes which are both subsets of the autoregressive moving average model. Models with a Two simulated time series processes, one non-trivial autoregressive component may be either stationary or stationary the other non-stationary. The Augmented Dickey–Fuller test is reported for non-stationary, depending on the parameter values, and important each process and non-stationarity cannot be non-stationary special cases are where unit roots exist in the model. rejected for the second process. Let Y be any scalar random variable, and define a time-series { Xt }, by . Then { Xt } is a stationary time series, for which realisations consist of a series of constant values, with a different constant value for each realisation. A law of large numbers does not apply on this case, as the limiting value of an average from a single realisation takes the random value determined by Y, rather than taking the expected value of Y. As a further example of a stationary process for which any single realisation has an apparently noise-free structure, let Y have a uniform distribution on (0,2π] and define the time series { Xt } by
  • 18. Stationary process 15 Then { Xt } is strictly stationary. Weaker forms of stationarity Weak or wide-sense stationarity A weaker form of stationarity commonly employed in signal processing is known as weak-sense stationarity, wide-sense stationarity (WSS) or covariance stationarity. WSS random processes only require that 1st moment and covariance do not vary with respect to time. Any strictly stationary process which has a mean and a covariance is also WSS. So, a continuous-time random process x(t) which is WSS has the following restrictions on its mean function and autocovariance function The first property implies that the mean function mx(t) must be constant. The second property implies that the covariance function depends only on the difference between and and only needs to be indexed by one variable rather than two variables. Thus, instead of writing, the notation is often abbreviated and written as: This also implies that the autocorrelation depends only on , since When processing WSS random signals with linear, time-invariant (LTI) filters, it is helpful to think of the correlation function as a linear operator. Since it is a circulant operator (depends only on the difference between the two arguments), its eigenfunctions are the Fourier complex exponentials. Additionally, since the eigenfunctions of LTI operators are also complex exponentials, LTI processing of WSS random signals is highly tractable—all computations can be performed in the frequency domain. Thus, the WSS assumption is widely employed in signal processing algorithms. Second-order stationarity The case of second-order stationarity arises when the requirements of strict stationarity are only applied to pairs of random variables from the time-series. The definition of second order stationarity can be generalized to Nth order (for finite N) and strict stationary means stationary of all orders. A process is second order stationary if the first and second order density functions satisfy for all , , and . Such a process will be wide sense stationary if the mean and correlation functions are finite. A process can be wide sense stationary without being second order stationary.
  • 19. Stationary process 16 Other terminology The terminology used for types of stationarity other than strict stationarity can be rather mixed. Some examples follow. • Priestley[1][2] uses stationary up to order m if conditions similar to those given here for wide sense stationarity apply relating to moments up to order m. Thus wide sense stationarity would be equivalent to "stationary to order 2", which is different from the definition of second-order stationarity given here. • Honarkhah[3] also uses the assumption of stationarity in the context of multiple-point geostatistics, where higher n-point statistics are assumed to be stationary in the spatial domain. References [1] Priestley, M.B. (1981) Spectral Analysis and Time Series, Academic Press. ISBN 0-12-564922-3 [2] Priestley, M.B. (1988) Non-linear and Non-stationary Time Series Analysis, Academic Press. ISBN 0-12-564911-8 [3] Honarkhah, M and Caers, J, 2010, Stochastic Simulation of Patterns Using Distance-Based Pattern Modeling (http:/ / dx. doi. org/ 10. 1007/ s11004-010-9276-7), Mathematical Geosciences, 42: 487 - 517 External links • Spectral decomposition of a random function (Springer) (http://eom.springer.de/s/s086360.htm) Stochastic process In probability theory, a stochastic process (pronunciation: /stəʊˈkæstɪk/), or sometimes random process (widely used) is a collection of random variables; this is often used to represent the evolution of some random value, or system, over time. This is the probabilistic counterpart to a deterministic process (or deterministic system). Instead of describing a process which can only evolve in one way (as in the case, for example, of solutions of an ordinary differential equation), in a stochastic or random process there is some indeterminacy: even if the initial condition (or starting point) is known, there are several (often infinitely many) directions in which the process may evolve. In the simple case of discrete time, a stochastic process amounts to a sequence of random variables known as a time series (for example, see Markov chain). Another basic type of a stochastic process is a random field, whose domain is a region of space, in other words, a random function whose arguments are drawn from a range of continuously changing values. One approach to stochastic processes treats them as functions of one or several deterministic arguments (inputs, in most cases regarded as time) whose values (outputs) are random variables: non-deterministic (single) quantities which have certain probability distributions. Random variables corresponding to various times (or points, in the case of random fields) may be completely different. The main requirement is that these different random quantities all have the same type. Type refers to the codomain of the function. Although the random values of a stochastic process at different times may be independent random variables, in most commonly considered situations they exhibit complicated statistical correlations. Familiar examples of processes modeled as stochastic time series include stock market and exchange rate fluctuations, signals such as speech, audio and video, medical data such as a patient's EKG, EEG, blood pressure or temperature, and random movement such as Brownian motion or random walks. Examples of random fields include static images, random terrain (landscapes), wind waves or composition variations of a heterogeneous material.
  • 20. Stochastic process 17 Formal definition and basic properties Definition Given a probability space and a measurable space , an S-valued stochastic process is a collection of S-valued random variables on , indexed by a totally ordered set T ("time"). That is, a stochastic process X is a collection where each is an S-valued random variable on . The space S is then called the state space of the process. Finite-dimensional distributions Let X be an S-valued stochastic process. For every finite subset , the k-tuple is a random variable taking values in . The distribution of this random variable is a probability measure on . This is called a finite-dimensional distribution of X. Under suitable topological restrictions, a suitably "consistent" collection of finite-dimensional distributions can be used to define a stochastic process (see Kolmogorov extension in the next section). Construction In the ordinary axiomatization of probability theory by means of measure theory, the problem is to construct a sigma-algebra of measurable subsets of the space of all functions, and then put a finite measure on it. For this purpose one traditionally uses a method called Kolmogorov extension.[1] There is at least one alternative axiomatization of probability theory by means of expectations on C-star algebras of random variables. In this case the method goes by the name of Gelfand–Naimark–Segal construction. This is analogous to the two approaches to measure and integration, where one has the choice to construct measures of sets first and define integrals later, or construct integrals first and define set measures as integrals of characteristic functions. Kolmogorov extension The Kolmogorov extension proceeds along the following lines: assuming that a probability measure on the space of all functions exists, then it can be used to specify the joint probability distribution of finite-dimensional random variables . Now, from this n-dimensional probability distribution we can deduce an (n − 1)-dimensional marginal probability distribution for . Note that the obvious compatibility condition, namely, that this marginal probability distribution be in the same class as the one derived from the full-blown stochastic process, is not a requirement. Such a condition only holds, for example, if the stochastic process is a Wiener process (in which case the marginals are all gaussian distributions of the exponential class) but not in general for all stochastic processes. When this condition is expressed in terms of probability densities, the result is called the Chapman–Kolmogorov equation. The Kolmogorov extension theorem guarantees the existence of a stochastic process with a given family of finite-dimensional probability distributions satisfying the Chapman–Kolmogorov compatibility condition.
  • 21. Stochastic process 18 Separability, or what the Kolmogorov extension does not provide Recall that in the Kolmogorov axiomatization, measurable sets are the sets which have a probability or, in other words, the sets corresponding to yes/no questions that have a probabilistic answer. The Kolmogorov extension starts by declaring to be measurable all sets of functions where finitely many coordinates are restricted to lie in measurable subsets of . In other words, if a yes/no question about f can be answered by looking at the values of at most finitely many coordinates, then it has a probabilistic answer. In measure theory, if we have a countably infinite collection of measurable sets, then the union and intersection of all of them is a measurable set. For our purposes, this means that yes/no questions that depend on countably many coordinates have a probabilistic answer. The good news is that the Kolmogorov extension makes it possible to construct stochastic processes with fairly arbitrary finite-dimensional distributions. Also, every question that one could ask about a sequence has a probabilistic answer when asked of a random sequence. The bad news is that certain questions about functions on a continuous domain don't have a probabilistic answer. One might hope that the questions that depend on uncountably many values of a function be of little interest, but the really bad news is that virtually all concepts of calculus are of this sort. For example: 1. boundedness 2. continuity 3. differentiability all require knowledge of uncountably many values of the function. One solution to this problem is to require that the stochastic process be separable. In other words, that there be some countable set of coordinates whose values determine the whole random function f. The Kolmogorov continuity theorem guarantees that processes that satisfy certain constraints on the moments of their increments have continuous modifications and are therefore separable. Filtrations Given a probability space , a filtration is a weakly increasing collection of sigma-algebras on , , indexed by some totally ordered set T, and bounded above by . I.e. for with s < t, . A stochastic process X on the same time set T is said to be adapted to the filtration if, for every , is [2] -measurable. The natural filtration Given a stochastic process , the natural filtration for (or induced by) this process is the filtration where is generated by all values of up to time s = t. I.e. . A stochastic process is always adapted to its natural filtration. Classification Stochastic processes can be classified according to the cardinality of its index set (usually interpreted as time) and state space.
  • 22. Stochastic process 19 Discrete time and discrete states If both and belong to , the set of natural numbers, then we have models which lead to Markov chains. For example: (a) If means the bit (0 or 1) in position of a sequence of transmitted bits, then can be modelled as a Markov chain with 2 states. This leads to the error correcting viterbi algorithm in data transmission. (b) If means the combined genotype of a breeding couple in the th generation in a inbreeding model, it can be shown that the proportion of heterozygous individuals in the population approaches zero as goes to ∞.[3] Continuous time and continuous state space The paradigm of continuous stochastic process is that of the Wiener process. In its original form the problem was concerned with a particle floating on a liquid surface, receiving "kicks" from the molecules of the liquid. The particle is then viewed as being subject to a random force which, since the molecules are very small and very close together, is treated as being continuous and, since the particle is constrained to the surface of the liquid by surface tension, is at each point in time a vector parallel to the surface. Thus the random force is described by a two component stochastic process; two real-valued random variables are associated to each point in the index set, time, (note that since the liquid is viewed as being homogeneous the force is independent of the spatial coordinates) with the domain of the two random variables being R, giving the x and y components of the force. A treatment of Brownian motion generally also includes the effect of viscosity, resulting in an equation of motion known as the Langevin equation.[4] Discrete time and continuous state space If the index set of the process is N (the natural numbers), and the range is R (the real numbers), there are some natural questions to ask about the sample sequences of a process {Xi}i ∈ N, where a sample sequence is {Xi(ω)}i ∈ N. 1. What is the probability that each sample sequence is bounded? 2. What is the probability that each sample sequence is monotonic? 3. What is the probability that each sample sequence has a limit as the index approaches ∞? 4. What is the probability that the series obtained from a sample sequence from converges? 5. What is the probability distribution of the sum? Main applications of discrete time continuous state stochastic models include Markov chain Monte Carlo (MCMC) and the analysis of Time Series. Continuous time and discrete state space Similarly, if the index space I is a finite or infinite interval, we can ask about the sample paths {Xt(ω)}t ∈ I 1. What is the probability that it is bounded/integrable/continuous/differentiable...? 2. What is the probability that it has a limit at ∞ 3. What is the probability distribution of the integral? References [1] Karlin, Samuel & Taylor, Howard M. (1998). An Introduction to Stochastic Modeling, Academic Press. ISBN 0-12-684887-4. [2] Durrett, Rick. Probability: Theory and Examples. Fourth Edition. Cambridge: Cambridge University Press, 2010. [3] Allen, Linda J. S., An Introduction to Stochastic Processes with Applications to Biology, 2th Edition, Chapman and Hall, 2010, ISBN 1-4398-1882-7 [4] Gardiner, C. Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences, 3th Edition, Springer, 2004, ISBN 3540208828
  • 23. Stochastic process 20 Further reading • Wio, S. Horacio, Deza, R. Roberto & Lopez, M. Juan (2012). An Introduction to Stochastic Processes and Nonequilibrium Statistical Physics. World Scientific Publishing. ISBN 978-981-4374-78-1. • Papoulis, Athanasios & Pillai, S. Unnikrishna (2001). Probability, Random Variables and Stochastic Processes. McGraw-Hill Science/Engineering/Math. ISBN 0-07-281725-9. • Boris Tsirelson. "Lecture notes in Advanced probability theory" (http://www.webcitation.org/5cfvVZ4Kd). • Doob, J. L. (1953). Stochastic Processes. Wiley. • Klebaner, Fima C. (2011). Introduction to Stochastic Calculus With Applications. Imperial College Press. ISBN 1-84816-831-4. • Bruce Hajek (July 2006). "An Exploration of Random Processes for Engineers" (http://www.ifp.uiuc.edu/ ~hajek/Papers/randomprocesses.html). • "An 8 foot tall Probability Machine (named Sir Francis) comparing stock market returns to the randomness of the beans dropping through the quincunx pattern" (http://www.youtube.com/watch?v=AUSKTk9ENzg). Index Funds Advisors IFA.com (http://www.ifa.com). • "Popular Stochastic Processes used in Quantitative Finance" (http://www.sitmo.com/article/ popular-stochastic-processes-in-finance/). sitmo.com. • "Addressing Risk and Uncertainty" (http://www.goldsim.com/Content.asp?PageID=455). Covariance In probability theory and statistics, covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive.[1] In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables. The magnitude of the covariance is not that easy to interpret. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation. A distinction must be made between (1) the covariance of two random variables, which is a population parameter that can be seen as a property of the joint probability distribution, and (2) the sample covariance, which serves as an estimated value of the parameter. Definition The covariance between two jointly distributed real-valued random variables x and y with finite second moments is defined[2] as where E[x] is the expected value of x, also known as the mean of x. By using the linearity property of expectations, this can be simplified to For random vectors and (of dimension m and n respectively) the m×n covariance matrix is equal to
  • 24. Covariance 21 where mT is the transpose of the vector (or matrix) m. The (i,j)-th element of this matrix is equal to the covariance Cov(xi, yj) between the i-th scalar component of x and the j-th scalar component of y. In particular, Cov(y, x) is the transpose of Cov(x, y). For a vector of m jointly distributed random variables with finite second moments, its covariance matrix is defined as Random variables whose covariance is zero are called uncorrelated. The units of measurement of the covariance Cov(x, y) are those of x times those of y. By contrast, correlation coefficients, which depend on the covariance, are a dimensionless measure of linear dependence. (In fact, correlation coefficients can simply be understood as a normalized version of covariance.) Properties • Variance is a special case of the covariance when the two variables are identical: • If x, y, w, and v are real-valued random variables and a, b, c, d are constant ("constant" in this context means non-random), then the following facts are a consequence of the definition of covariance: For sequences x1, ..., xn and y1, ..., ym of random variables, we have For a sequence x1, ..., xn of random variables, and constants a1, ..., an, we have A more general identity for covariance matrices Let be a random vector, let denote its covariance matrix, and let be a matrix that can act on . The result of applying this matrix to is a new vector with covariance matrix . This is a direct result of the linearity of expectation and is useful when applying a linear transformation, such as a whitening transformation, to a vector.
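The linear-transformation identity above lends itself to a quick numerical check. A minimal Python/NumPy sketch (the random vector, its covariance matrix and the matrix A below are all illustrative choices) estimates the covariance matrix of x, applies a linear map, and compares the result with A Σ A^T:

import numpy as np

rng = np.random.default_rng(1)

# Illustrative 3-component random vector with a made-up covariance structure.
n = 100_000
X = rng.multivariate_normal(mean=[0.0, 1.0, -1.0],
                            cov=[[2.0, 0.5, 0.0],
                                 [0.5, 1.0, 0.3],
                                 [0.0, 0.3, 1.5]],
                            size=n)

Sigma = np.cov(X, rowvar=False)      # sample covariance matrix of x
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])     # an arbitrary matrix acting on x

Y = X @ A.T                          # apply the linear transformation to every sample
Sigma_Y = np.cov(Y, rowvar=False)

# cov(Ax) should agree with A Sigma A^T up to sampling error.
print(np.allclose(Sigma_Y, A @ Sigma @ A.T, atol=0.2))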
  • 25. Covariance 22 Uncorrelatedness and independence If x and y are independent, then their covariance is zero. This follows because under independence, The converse, however, is not generally true. For example, let x be uniformly distributed in [-1, 1] and let y = x2. Clearly, x and y are dependent, but In this case, the relationship between y and x is non-linear, while correlation and covariance are measures of linear dependence between two variables. Still, as in the example, if two variables are uncorrelated, that does not imply that they are independent. Relationship to inner products Many of the properties of covariance can be extracted elegantly by observing that it satisfies similar properties to those of an inner product: 1. bilinear: for constants a and b and random variables x, y, z, σ(ax + by, z) = a σ(x, z) + b σ(y, z); 2. symmetric: σ(x, y) = σ(y, x); 3. positive semi-definite: σ2(x) = σ(x, x) ≥ 0, and σ(x, x) = 0 implies that x is a constant random variable (K). In fact these properties imply that the covariance defines an inner product over the quotient vector space obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant. (This identification turns the positive semi-definiteness above into positive definiteness.) That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the L2 inner product of real-valued functions on the sample space. As a result for random variables with finite variance, the inequality holds via the Cauchy–Schwarz inequality. Proof: If σ2(y) = 0, then it holds trivially. Otherwise, let random variable Then we have QED.
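The uniform example above is easy to reproduce numerically; in the following short sketch (sample size and seed are arbitrary) the estimated covariance is essentially zero even though y is a deterministic function of x:

import numpy as np

rng = np.random.default_rng(2)

x = rng.uniform(-1.0, 1.0, size=1_000_000)   # x uniform on [-1, 1]
y = x ** 2                                   # y depends on x completely, but non-linearly

# Estimate Cov(x, y) = E[xy] - E[x]E[y]; the result is close to zero.
print(np.mean(x * y) - np.mean(x) * np.mean(y))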
  • 26. Covariance 23 Calculating the sample covariance The sample covariance of N observations of K variables is the K-by-K matrix with the entries , which is an estimate of the covariance between variable j and variable k. The sample mean and the sample covariance matrix are unbiased estimates of the mean and the covariance matrix of the random vector , a row vector whose jth element (j = 1, ..., K) is one of the random variables. The reason the sample covariance matrix has in the denominator rather than is essentially that the population mean is not known and is replaced by the sample mean . If the population mean is known, the analogous unbiased estimate is given by Comments The covariance is sometimes called a measure of "linear dependence" between the two random variables. That does not mean the same thing as in the context of linear algebra (see linear dependence). When the covariance is normalized, one obtains the correlation coefficient. From it, one can obtain the Pearson coefficient, which gives us the goodness of the fit for the best possible linear function describing the relation between the variables. In this sense covariance is a linear gauge of dependence. References [1] http:/ / mathworld. wolfram. com/ Covariance. html [2] Oxford Dictionary of Statistics, Oxford University Press, 2002, p. 104. External links • Hazewinkel, Michiel, ed. (2001), "Covariance" (http://www.encyclopediaofmath.org/index.php?title=p/ c026800), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4 • MathWorld page on calculating the sample covariance (http://mathworld.wolfram.com/Covariance.html) • Covariance Tutorial using R (http://www.r-tutor.com/elementary-statistics/numerical-measures/covariance)
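As a small numerical companion to the sample covariance section above, the following Python/NumPy sketch (the data are randomly generated for illustration) computes the K-by-K matrix with the N − 1 denominator and checks it against NumPy's built-in estimator:

import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=(200, 4))    # N = 200 observations of K = 4 variables (made-up data)

def sample_cov(X):
    # Unbiased sample covariance matrix with the 1/(N - 1) normalisation.
    N = X.shape[0]
    centred = X - X.mean(axis=0)    # subtract the sample mean of each variable
    return centred.T @ centred / (N - 1)

print(np.allclose(sample_cov(data), np.cov(data, rowvar=False)))   # True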
• 27. Autocovariance 24
Autocovariance
In statistics, given a real stochastic process X(t), the autocovariance is the covariance of the variable against a time-shifted version of itself. If the process has the mean E[Xt] = μt, then the autocovariance is given by CXX(t, s) = E[(Xt − μt)(Xs − μs)] = E[Xt Xs] − μt μs, where E is the expectation operator.
Stationarity
If X(t) is a stationary process, then the following are true: μt = μs = μ for all t, s; and CXX(t, s) = CXX(s − t) = CXX(τ), where τ = s − t is the lag time, or the amount of time by which the signal has been shifted. As a result, the autocovariance becomes CXX(τ) = E[(X(t) − μ)(X(t + τ) − μ)] = E[X(t) X(t + τ)] − μ² = RXX(τ) − μ², where RXX represents the autocorrelation in the signal processing sense.
Normalization
When normalized by dividing by the variance σ², the autocovariance C becomes the autocorrelation coefficient function c,[1] c(τ) = CXX(τ)/σ². The autocovariance function is itself a version of the autocorrelation function with the mean level removed. If the signal has a mean of 0, the autocovariance and autocorrelation functions are identical.[1] However, often the autocovariance is called autocorrelation even if this normalization has not been performed. The autocovariance can be thought of as a measure of how similar a signal is to a time-shifted version of itself, with an autocovariance of σ² indicating perfect correlation at that lag. The normalisation with the variance will put this into the range [−1, 1].
Properties
The autocovariance of a linearly filtered process {Yt}, with Yt = Σk ak Xt+k, is CYY(τ) = Σk Σl ak al CXX(τ + k − l).
  • 28. Autocovariance 25 References • P. G. Hoel, Mathematical Statistics, Wiley, New York, 1984. • Lecture notes on autocovariance from WHOI [2] [1] Westwick, David T. (2003). Identification of Nonlinear Physiological Systems. IEEE Press. pp. 17–18. ISBN 0-471-27456-9. [2] http:/ / w3eos. whoi. edu/ 12. 747/ notes/ lect06/ l06s02. html Autocorrelation Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them. It is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal which has been buried under noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals. Definitions A plot showing 100 random numbers with a "hidden" sine function, and an autocorrelation Different fields of study define (correlogram) of the series on the bottom. autocorrelation differently, and not all of these definitions are equivalent. In some fields, the term is used interchangeably with autocovariance. Statistics In statistics, the autocorrelation of a random process describes the correlation between values of the process at different times, as a function of the two times or of the time difference. Let X be some repeatable process, and i be some point in time after the Visual comparison of convolution, cross-correlation and autocorrelation. start of that process. (i may be an integer for a discrete-time process or a real number for a continuous-time process.) Then Xi is the value (or realization) produced by a given run of the process at time i. Suppose that the process is further known to have defined values for mean μi and variance σi2 for all times i. Then the definition of the autocorrelation between times s and t is where "E" is the expected value operator. Note that this expression is not well-defined for all time series or processes, because the variance may be zero (for a constant process) or infinite. If the function R is well-defined, its
  • 29. Autocorrelation 26 value must lie in the range [−1, 1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation. If Xt is a second-order stationary process then the mean μ and the variance σ2 are time-independent, and further the autocorrelation depends only on the difference between t and s: the correlation depends only on the time-distance between the pair of values but not on their position in time. This further implies that the autocorrelation can be expressed as a function of the time-lag, and that this would be an even function of the lag τ = s − t. This gives the more familiar form and the fact that this is an even function can be stated as It is common practice in some disciplines, other than statistics and time series analysis, to drop the normalization by σ2 and use the term "autocorrelation" interchangeably with "autocovariance". However, the normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations. Signal processing In signal processing, the above definition is often used without the normalization, that is, without subtracting the mean and dividing by the variance. When the autocorrelation function is normalized by mean and variance, it is sometimes referred to as the autocorrelation coefficient.[1] Given a signal , the continuous autocorrelation is most often defined as the continuous cross-correlation integral of with itself, at lag . where represents the complex conjugate and represents convolution. For a real function, . The discrete autocorrelation at lag for a discrete signal is The above definitions work for signals that are square integrable, or square summable, that is, of finite energy. Signals that "last forever" are treated instead as random processes, in which case different definitions are needed, based on expected values. For wide-sense-stationary random processes, the autocorrelations are defined as For processes that are not stationary, these will also be functions of , or . For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The autocorrelation of an ergodic process is sometimes defined as or equated to[1] These definitions have the advantage that they give sensible well-defined single-parameter results for periodic functions, even when those functions are not the output of stationary ergodic processes. Alternatively, signals that last forever can be treated by a short-time autocorrelation function analysis, using finite time integrals. (See short-time Fourier transform for a related process.)
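A minimal sketch of the normalised (statistical) autocorrelation estimate at a few lags, applied to an illustrative first-order autoregressive series whose lag-k autocorrelation is known to be about 0.7^k; the series, its coefficient and the seed are invented for demonstration:

import numpy as np

rng = np.random.default_rng(4)

# Illustrative series: x[t] = 0.7 * x[t-1] + noise.
n = 50_000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()

def autocorr(x, lag):
    # Autocorrelation coefficient estimate: mean removed, normalised by the variance.
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / (len(x) * x.var())

for k in (1, 2, 5):
    print(k, autocorr(x, k))    # roughly 0.70, 0.49, 0.17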
  • 30. Autocorrelation 27 Multi-dimensional autocorrelation is defined similarly. For example, in three dimensions the autocorrelation of a square-summable discrete signal would be When mean values are subtracted from signals before computing an autocorrelation function, the resulting function is usually called an auto-covariance function. Properties In the following, we will describe properties of one-dimensional autocorrelations only, since most properties are easily transferred from the one-dimensional case to the multi-dimensional cases. • A fundamental property of the autocorrelation is symmetry, , which is easy to prove from the definition. In the continuous case, the autocorrelation is an even function when is a real function, and the autocorrelation is a Hermitian function when is a complex function. • The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for any delay , . This is a consequence of the Cauchy–Schwarz inequality. The same result holds in the discrete case. • The autocorrelation of a periodic function is, itself, periodic with the same period. • The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for all ) is the sum of the autocorrelations of each function separately. • Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation. • The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a Dirac delta function) at and will be absolutely 0 for all other . • The Wiener–Khinchin theorem relates the autocorrelation function to the power spectral density via the Fourier transform: • For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the Wiener–Khinchin theorem can be re-expressed in terms of real cosines only:
  • 31. Autocorrelation 28 Efficient computation For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high computational efficiency. The brute force method based on the definition can be used. For example, to calculate the autocorrelation of , we employ the usual multiplication method with right shifts: 231 ×231 ________ 231 693 462 _____________ 2 9 14 9 2 Thus the required autocorrelation is (2,9,14,9,2). In this calculation we do not perform the carry-over operation during addition because the vector has been defined over a field of real numbers. Note that we can halve the number of operations required by exploiting the inherent symmetry of the autocorrelation. While the brute force algorithm is order n2, several efficient algorithms exist which can compute the autocorrelation in order n log(n). For example, the Wiener–Khinchin theorem allows computing the autocorrelation from the raw data X(t) with two Fast Fourier transforms (FFT)[2]: FR(f) = FFT[X(t)] S(f) = FR(f) FR*(f) R(τ) = IFFT[S(f)] where IFFT denotes the inverse Fast Fourier transform. The asterisk denotes complex conjugate. Alternatively, a multiple τ correlation can be performed by using brute force calculation for low τ values, and then progressively binning the X(t) data with a logarithmic density to compute higher values, resulting in the same n log(n) efficiency, but with lower memory requirements. Estimation For a discrete process of length defined as with known mean and variance, an estimate of the autocorrelation may be obtained as for any positive integer . When the true mean and variance are known, this estimate is unbiased. If the true mean and variance of the process are not known there are a several possibilities: • If and are replaced by the standard formulae for sample mean and sample variance, then this is a biased estimate. • A periodogram-based estimate replaces in the above formula with . This estimate is always biased; [3][4] however, it usually has a smaller mean square error. • Other possibilities derive from treating the two portions of data and separately and calculating separate sample means and/or sample variances for use in defining the estimate. The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of , then form a function which is a valid autocorrelation in the sense that it is possible to define a theoretical process having
  • 32. Autocorrelation 29 exactly that autocorrelation. Other estimates can suffer from the problem that, if they are used to calculate the variance of a linear combination of the 's, the variance calculated may turn out to be negative. Regression analysis In regression analysis using time series data, autocorrelation of the errors is a problem. Autocorrelation of the errors, which themselves are unobserved, can generally be detected because it produces autocorrelation in the observable residuals. (Errors are also known as "error terms" in econometrics.) Autocorrelation violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated. While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive. The traditional test for the presence of first-order autocorrelation is the Durbin–Watson statistic or, if the explanatory variables include a lagged dependent variable, Durbin's h statistic. A more flexible test, covering autocorrelation of higher orders and applicable whether or not the regressors include lags of the dependent variable, is the Breusch–Godfrey test. This involves an auxiliary regression, wherein the residuals obtained from estimating the model of interest are regressed on (a) the original regressors and (b) k lags of the residuals, where k is the order of the test. The simplest version of the test statistic from this auxiliary regression is TR2, where T is the sample size and R2 is the coefficient of determination. Under the null hypothesis of no autocorrelation, this statistic is asymptotically distributed as with k degrees of freedom. Responses to nonzero autocorrelation include generalized least squares and the Newey–West HAC estimator (Heteroskedasticity and Autocorrelation Consistent).[5] Applications • One application of autocorrelation is the measurement of optical spectra and the measurement of very-short-duration light pulses produced by lasers, both using optical autocorrelators. • Autocorrelation is used to analyze Dynamic light scattering data, which notably enables to determine the particle size distributions of nanometer-sized particles or micelles suspended in a fluid. A laser shining into the mixture produces a speckle pattern that results from the motion of the particles. Autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the viscosity of the fluid, the sizes of the particles can be calculated. • The Small-angle X-ray scattering intensity of a nanostructured system is the Fourier transform of the spatial autocorrelation function of the electron density. • In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic field. • In signal processing, autocorrelation can give information about repeating events like musical beats (for example, to determine tempo) or pulsar frequencies, though it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone. • In music recording, autocorrelation is used as a pitch detection algorithm prior to vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies.[6] • Autocorrelation in space rather than time, via the Patterson function, is used by X-ray diffractionists to help recover the "Fourier phase information" on atom positions not available through diffraction alone. 
• In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population. • The SEQUEST algorithm for analyzing mass spectra makes use of autocorrelation in conjunction with cross-correlation to score the similarity of an observed spectrum to an idealized spectrum representing a peptide.
  • 33. Autocorrelation 30 • In Astrophysics, auto-correlation is used to study and characterize the spatial distribution of galaxies in the Universe and in multi-wavelength observations of Low Mass X-ray Binaries. • In panel data, spatial autocorrelation refers to correlation of a variable with itself through space. • In analysis of Markov chain Monte Carlo data, autocorrelation must be taken into account for correct error determination. References [1] Patrick F. Dunn, Measurement and Data Analysis for Engineering and Science, New York: McGraw–Hill, 2005 ISBN 0-07-282538-3 [2] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Upper Saddle River, NJ: Prentice–Hall, 1994. [3] Spectral analysis and time series, M.B. Priestley (London, New York : Academic Press, 1982) [4] Percival, Donald B.; Andrew T. Walden (1993). Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques. Cambridge University Press. pp. 190–195. ISBN 0-521-43541-2. [5] Christopher F. Baum (2006). An Introduction to Modern Econometrics Using Stata (http:/ / books. google. com/ ?id=acxtAylXvGMC& pg=PA141& dq=newey-west-standard-errors+ generalized-least-squares). Stata Press. ISBN 1-59718-013-0. . [6] Tyrangiel, Josh (2009-02-05). "Auto-Tune: Why Pop Music Sounds Perfect" (http:/ / www. time. com/ time/ magazine/ article/ 0,9171,1877372,00. html). Time Magazine. . External links • Weisstein, Eric W., " Autocorrelation (http://mathworld.wolfram.com/Autocorrelation.html)" from MathWorld. • Autocorrelation articles in Comp.DSP (DSP usenet group). (http://www.dsprelated.com/comp.dsp/keyword/ Autocorrelation.php) • GPU accelerated calculation of autocorrelation function. (http://www.iop.org/EJ/abstract/1367-2630/11/9/ 093024/)
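Referring back to the "Efficient computation" section above, the FFT route (FR = FFT[X], S = FR·FR*, R = IFFT[S]) can be sketched in a few lines of Python/NumPy. Zero-padding is added so the result is the linear rather than circular autocorrelation, and the output is checked against the (2, 9, 14, 9, 2) example worked out above; the helper name is ours.

import numpy as np

def autocorr_fft(x):
    # Linear autocorrelation via the Wiener-Khinchin route, in O(n log n).
    n = len(x)
    f = np.fft.fft(x, 2 * n)          # zero-pad to avoid circular wrap-around
    s = f * np.conj(f)                # power spectrum S(f) = FR(f) FR*(f)
    r = np.fft.ifft(s).real           # R(tau) = IFFT[S(f)]
    return np.concatenate((r[-(n - 1):], r[:n]))   # lags -(n-1), ..., 0, ..., n-1

print(np.round(autocorr_fft(np.array([2.0, 3.0, 1.0]))))   # [ 2.  9. 14.  9.  2.]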
  • 34. Cross-correlation 31 Cross-correlation In signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time-lag applied to one of them. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long-signal for a shorter, known feature. It also has applications in pattern recognition, single particle analysis, electron tomographic averaging, cryptanalysis, and neurophysiology. For continuous functions, f and g, the cross-correlation is defined as: where f * denotes the complex conjugate of f. Similarly, for discrete functions, the cross-correlation is defined as: The cross-correlation is similar in nature to the convolution of two functions. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero unless the signal is a trivial zero signal. In probability theory and statistics, correlation is always used to include a standardising factor in such a way that correlations have values between −1 and +1, Visual comparison of convolution, cross-correlation and autocorrelation. and the term cross-correlation is used for referring to the correlation corr(X, Y) between two random variables X and Y, while the "correlation" of a random vector X is considered to be the correlation matrix (matrix of correlations) between the scalar elements of X. If and are two independent random variables with probability density functions f and g, respectively, then the probability density of the difference is formally given by the cross-correlation (in the signal-processing sense) ; however this terminology is not used in probability and statistics. In contrast, the convolution (equivalent to the cross-correlation of f(t) and g(−t) ) gives the probability density function of the sum . Explanation As an example, consider two real valued functions and differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much must be shifted along the x-axis to make it identical to . The formula essentially slides the function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of is maximized. This is because when peaks (positive areas) are aligned, they make a large contribution to the integral. Similarly, when troughs (negative areas) align, they also make a positive contribution to the integral because the product of two negative numbers is positive. With complex-valued functions and , taking the conjugate of ensures that aligned peaks (or aligned troughs) with imaginary components will contribute positively to the integral. In econometrics, lagged cross-correlation is sometimes referred to as cross-autocorrelation[1]
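The discrete definition can be spelled out directly in code. The following Python/NumPy sketch (arrays chosen arbitrarily) evaluates (f ⋆ g)[n] = Σm f*[m] g[m + n] over all lags and compares it with NumPy's built-in routine:

import numpy as np

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])        # arbitrary example sequences

def xcorr(f, g):
    # Discrete cross-correlation (f * g)[n] = sum_m conj(f[m]) g[m + n].
    lags = range(-(len(f) - 1), len(g))
    return np.array([sum(np.conj(f[m]) * g[m + n]
                         for m in range(len(f)) if 0 <= m + n < len(g))
                     for n in lags])

print(xcorr(f, g))
print(np.correlate(g, f, mode="full"))   # same values, same lag ordering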
  • 35. Cross-correlation 32 Properties • The correlation of functions f(t) and g(t) is equivalent to the convolution of f *(−t) and g(t).  I.e.: • If f is Hermitian, then • • Analogous to the convolution theorem, the cross-correlation satisfies: where denotes the Fourier transform, and an asterisk again indicates the complex conjugate. Coupled with fast Fourier transform algorithms, this property is often exploited for the efficient numerical computation of cross-correlations. (see circular cross-correlation) • The cross-correlation is related to the spectral density. (see Wiener–Khinchin theorem) • The cross correlation of a convolution of f and h with a function g is the convolution of the correlation of f and g with the kernel h: Normalized cross-correlation For image-processing applications in which the brightness of the image and template can vary due to lighting and exposure conditions, the images can be first normalized. This is typically done at every step by subtracting the mean and dividing by the standard deviation. That is, the cross-correlation of a template, with a subimage is . where is the number of pixels in and , is the average of f and is standard deviation of f. In functional analysis terms, this can be thought of as the dot product of two normalized vectors. That is, if and then the above sum is equal to where is the inner product and is the L² norm. Thus, if f and t are real matrices, their normalized cross-correlation equals the cosine of the angle between the unit vectors F and T, being thus 1 if and only if F equals T multiplied by a positive scalar. Normalized correlation is one of the methods used for template matching, a process used for finding incidences of a pattern or object within an image. It is also the 2-dimensional version of Pearson product-moment correlation coefficient.
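A hedged one-dimensional sketch of the normalised cross-correlation used for template matching: the template and every window of the signal are centred by their means and scaled by their standard deviations, so each score is a correlation coefficient in [−1, 1]. The signal, template and the position of the hidden copy are invented for illustration.

import numpy as np

def normalized_xcorr(signal, template):
    # Normalised cross-correlation of a template against every window of a 1-D signal.
    n = len(template)
    t = (template - template.mean()) / (template.std() * n)
    scores = []
    for start in range(len(signal) - n + 1):
        window = signal[start:start + n]
        w = (window - window.mean()) / window.std()
        scores.append(float(np.dot(t, w)))
    return np.array(scores)

rng = np.random.default_rng(5)
template = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
signal = rng.normal(scale=0.1, size=30)
signal[12:17] += template               # hide a copy of the template in the noise (illustrative)
print(normalized_xcorr(signal, template).argmax())   # 12, where the template was inserted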
  • 36. Cross-correlation 33 Time series analysis In time series analysis, as applied in statistics, the cross correlation between two time series describes the normalized cross covariance function. Let represent a pair of stochastic processes that are jointly wide sense stationary. Then the cross covariance is given by [2] where and are the means of and respectively. The cross correlation function is the normalized cross-covariance function. where and are the standard deviations of processes and respectively. Note that if for all t, then the cross correlation function is simply the autocorrelation function. Scaled correlation In the analysis of time series scaled correlation can be applied to reveal cross-correlation exclusively between fast components of the signals, the contributions of slow components being removed.[3] Time delay analysis Cross-correlations are useful for determining the time delay between two signals, e.g. for determining time delays for the propagation of acoustic signals across a microphone array.[4][5] After calculating the cross-correlation between the two signals, the maximum (or minimum if the signals are negatively correlated) of the cross-correlation function indicates the point in time where the signals are best aligned, i.e. the time delay between the two signals is determined by the argument of the maximum, or arg max of the cross-correlation, as in References [1] Campbell, Lo, and MacKinlay 1996: The Econometrics of Financial Markets, NJ: Princeton University Press. [2] von Storch, H.; F. W Zwiers (2001). Statistical analysis in climate research. Cambridge Univ Pr. ISBN 0-521-01230-9. [3] Nikolić D, Muresan RC, Feng W, Singer W (2012) Scaled correlation analysis: a better way to compute a cross-correlogram. European Journal of Neuroscience, pp. 1–21, doi:10.1111/j.1460-9568.2011.07987.x http:/ / www. danko-nikolic. com/ wp-content/ uploads/ 2012/ 03/ Scaled-correlation-analysis. pdf [4] Rhudy, Matthew; Brian Bucci, Jeffrey Vipperman, Jeffrey Allanach, and Bruce Abraham (November 2009). "Microphone Array Analysis Methods Using Cross-Correlations". Proceedings of 2009 ASME International Mechanical Engineering Congress, Lake Buena Vista, FL. [5] Rhudy, Matthew (November 2009). "Real Time Implementation of a Military Impulse Classifier". University of Pittsburgh, Master's Thesis.
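The time-delay estimate described above reduces to a few lines: correlate the two signals and read off the arg max. In this Python/NumPy sketch the 25-sample delay and the noise level are arbitrary.

import numpy as np

rng = np.random.default_rng(6)

n, true_delay = 1000, 25
x = rng.normal(size=n)                                     # reference signal
y = np.roll(x, true_delay) + 0.1 * rng.normal(size=n)      # delayed, noisy copy

corr = np.correlate(y, x, mode="full")                     # lags run from -(n-1) to n-1
lags = np.arange(-(n - 1), n)
print(lags[corr.argmax()])                                 # 25, the estimated delay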
  • 37. Cross-correlation 34 External links • Cross Correlation from Mathworld (http://mathworld.wolfram.com/Cross-Correlation.html) • http://scribblethink.org/Work/nvisionInterface/nip.html • http://www.phys.ufl.edu/LIGO/stochastic/sign05.pdf • http://www.staff.ncl.ac.uk/oliver.hinton/eee305/Chapter6.pdf • http://www.springerlink.com/content/0150455247264825/fulltext.pdf
• 38. White noise 35
White noise
Colors of noise: White · Pink · Red (Brownian) · Grey
White noise is a random signal or process with a flat power spectral density. In other words, the signal contains equal power within a fixed bandwidth at any center frequency. White noise draws its name from white light, in which the power spectral density of the light is distributed over the visible band in such a way that the eye's three color receptors (cones) are approximately equally stimulated.
In the statistical sense, a time series rt is characterized as having weak white noise if {rt} is a sequence of serially uncorrelated random variables with zero mean and finite variance. Strong white noise also has the quality of being independent and identically distributed, which implies no autocorrelation. In particular, if rt is normally distributed with mean zero and standard deviation σ, the series is called a Gaussian white noise.[1]
An infinite-bandwidth white noise signal is a purely theoretical construction. The bandwidth of white noise is limited in practice by the mechanism of noise generation, by the transmission medium and by finite observation capabilities. A random signal is considered "white noise" if it is observed to have a flat spectrum over a medium's widest possible bandwidth.
White noise in a spatial context
While it is usually applied in the context of frequency domain signals, the term white noise is also commonly applied to a noise signal in the spatial domain. In this case, it has an autocorrelation which can be represented by a delta function over the relevant space dimensions. The signal is then "white" in the spatial frequency domain (this is equally true for signals in the angular frequency domain, e.g., the distribution of a signal across all angles in the night sky).
Statistical properties
An example realization of a Gaussian white noise process.
The image to the right displays a finite-length, discrete-time realization of a white noise process generated from a computer. Being uncorrelated in time does not restrict the values a signal can take. Any distribution of values is possible (although it must have zero DC component). Even a binary signal which can only take on the values 1 or −1 will be white if the sequence is statistically uncorrelated. Noise having a continuous distribution, such as a normal distribution, can of course be white. It is often incorrectly assumed that Gaussian noise (i.e., noise with a Gaussian amplitude distribution — see normal distribution) is necessarily white noise, yet neither property implies the other. Gaussianity refers to the probability distribution with respect to the value, in this context the probability of the signal reaching an amplitude, while the term 'white' refers to the way the signal power is distributed over time or among frequencies.
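A minimal sketch that draws a Gaussian white noise realisation and checks its two defining properties empirically: the sample autocorrelation is negligible at nonzero lags, and the periodogram is roughly flat across frequency bands. The length, seed and number of bands are arbitrary.

import numpy as np

rng = np.random.default_rng(7)
sigma, n = 1.0, 100_000
w = rng.normal(0.0, sigma, size=n)       # Gaussian white noise realisation

# Sample autocorrelation: about 1 at lag 0 and about 0 at other lags.
w0 = w - w.mean()
acf = [np.dot(w0[:n - k], w0[k:]) / (n * w0.var()) for k in range(4)]
print(np.round(acf, 3))                  # approx [1, 0, 0, 0]

# Periodogram: band averages are all close to sigma**2 (flat power spectrum).
psd = np.abs(np.fft.rfft(w)) ** 2 / n
print([float(np.round(b.mean(), 2)) for b in np.array_split(psd[1:], 4)])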
  • 39. White noise 36 We can therefore find Gaussian white noise, but also Poisson, Cauchy, etc. white noises. Thus, the two words "Gaussian" and "white" are often both specified in mathematical models of systems. Gaussian white noise is a good approximation of many real-world situations and generates mathematically tractable models. These models are used so frequently that the term additive white Gaussian noise has a standard abbreviation: AWGN. White noise is the generalized mean-square derivative of the Wiener process or Brownian motion. Spectrogram of pink noise (left) and white noise (right), shown with linear frequency axis (vertical). Applications White noise is commonly used in the production of electronic music, usually either directly or as an input for a filter to create other types of noise signal. It is used extensively in audio synthesis, typically to recreate percussive instruments such as cymbals which have high noise content in their frequency domain. It is also used to generate impulse responses. To set up the equalization (EQ) for a concert or other performance in a venue, a short burst of white or pink noise is sent through the PA system and monitored from various points in the venue so that the engineer can tell if the acoustics of the building naturally boost or cut any frequencies. The engineer can then adjust the overall equalization to ensure a balanced mix. White noise can be used for frequency response testing of amplifiers and electronic filters. It is not used for testing loudspeakers as its spectrum contains too great an amount of high frequency content. Pink noise is used for testing transducers such as loudspeakers and microphones. White noise is used as the basis of some random number generators. For example, Random.org uses a system of atmospheric antennae to generate random digit patterns from white noise. White noise is a common synthetic noise source used for sound masking by a tinnitus masker.[2] White noise machines and other white noise sources are sold as privacy enhancers and sleep aids and to mask tinnitus.[3] Alternatively, the use of an FM radio tuned to unused frequencies ("static") is a simpler and more cost-effective source of white noise.[4] However, white noise generated from a common commercial radio receiver tuned to an unused frequency is extremely vulnerable to being contaminated with spurious signals, such as adjacent radio stations, harmonics from non-adjacent radio stations, electrical equipment in the vicinity of the receiving antenna causing interference, or even atmospheric events such as solar flares and especially lightning. The effects of white noise upon cognitive function are mixed. Recently, a small study found that white noise background stimulation improves cognitive functioning among secondary students with Attention deficit hyperactivity disorder (ADHD), while decreasing performance of non-ADHD students.[5][6] Other work indicates it is effective in improving the mood and performance of workers by masking background office noise,[7] but decreases cognitive performance in complex card sorting tasks.[8]
  • 40. White noise 37 Mathematical definition White random vector A random vector is a white random vector if and only if its mean vector and autocorrelation matrix are the following: That is, it is a zero mean random vector, and its autocorrelation matrix is a multiple of the identity matrix. When the autocorrelation matrix is a multiple of the identity, we say that it has spherical correlation. White random process (white noise) A continuous time random process where is a white noise process if and only if its mean function and autocorrelation function satisfy the following: i.e. it is a zero mean process for all time and has infinite power at zero time shift since its autocorrelation function is the Dirac delta function. The above autocorrelation function implies the following power spectral density: since the Fourier transform of the delta function is equal to 1. Since this power spectral density is the same at all frequencies, we call it white as an analogy to the frequency spectrum of white light. A generalization to random elements on infinite dimensional spaces, such as random fields, is the white noise measure. Random vector transformations Two theoretical applications using a white random vector are the simulation and whitening of another arbitrary random vector. To simulate an arbitrary random vector, we transform a white random vector with a carefully chosen matrix. We choose the transformation matrix so that the mean and covariance matrix of the transformed white random vector matches the mean and covariance matrix of the arbitrary random vector that we are simulating. To whiten an arbitrary random vector, we transform it by a different carefully chosen matrix so that the output random vector is a white random vector. These two ideas are crucial in applications such as channel estimation and channel equalization in communications and audio. These concepts are also used in data compression.
  • 41. White noise 38 Simulating a random vector Suppose that a random vector has covariance matrix . Since this matrix is Hermitian symmetric and positive semidefinite, by the spectral theorem from linear algebra, we can diagonalize or factor the matrix in the following way. where is the orthogonal matrix of eigenvectors and is the diagonal matrix of eigenvalues. Thus, the inverse equation also holds. We can simulate the 1st and 2nd moment properties of this random vector with mean and covariance matrix via the following transformation of a white vector of unit variance: where Thus, the output of this transformation has expectation and covariance matrix Random signal transformations We cannot extend the same two concepts of simulating and whitening to the case of continuous time random signals or processes. For simulating, we create a filter into which we feed a white noise signal. We choose the filter so that the output signal simulates the 1st and 2nd moments of any arbitrary random process. For whitening, we feed any arbitrary random signal into a specially chosen filter so that the output of the filter is a white noise signal. Simulating a continuous-time random signal White noise can simulate any wide-sense stationary, continuous-time random process with constant mean and covariance function White noise fed into a linear, time-invariant filter to simulate the 1st and 2nd moments of and power spectral density an arbitrary random process. We can simulate this signal using frequency domain techniques. Because is Hermitian symmetric and positive semi-definite, it follows that is real and can be factored as if and only if satisfies the Paley-Wiener criterion. If is a rational function, we can then factor it into pole-zero form as
  • 42. White noise 39 Choosing a minimum phase so that its poles and zeros lie inside the left half s-plane, we can then simulate with as the transfer function of the filter. We can simulate by constructing the following linear, time-invariant filter where is a continuous-time, white-noise signal with the following 1st and 2nd moment properties: Thus, the resultant signal has the same 2nd moment properties as the desired signal . Whitening a continuous-time random signal Suppose we have a wide-sense stationary, continuous-time random process defined with the same mean , covariance function , and power spectral density as above. We can whiten this signal using An arbitrary random process x(t) fed into a linear, time-invariant filter that whitens x(t) to frequency domain techniques. We create white noise at the output. factor the power spectral density as described above. Choosing the minimum phase so that its poles and zeros lie inside the left half s-plane, we can then whiten with the following inverse filter We choose the minimum phase filter so that the resulting inverse filter is stable. Additionally, we must be sure that is strictly positive for all so that does not have any singularities. The final form of the whitening procedure is as follows: so that is a white noise random process with zero mean and constant, unit power spectral density Note that this power spectral density corresponds to a delta function for the covariance function of .
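Referring back to the "Simulating a random vector" section above, the recipe for the discrete case can be sketched directly: diagonalise the target covariance matrix as K = E Λ E^T, transform a unit-variance white random vector by E Λ^(1/2), and add the desired mean. The target mean and covariance below are made-up values.

import numpy as np

rng = np.random.default_rng(8)

mu = np.array([1.0, -2.0, 0.5])                 # desired mean (illustrative)
K = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])                 # desired covariance matrix (illustrative)

lam, E = np.linalg.eigh(K)                      # K = E diag(lam) E^T
A = E @ np.diag(np.sqrt(lam))                   # "colouring" transformation

n = 200_000
white = rng.normal(size=(3, n))                 # white vector: zero mean, unit variance
samples = mu[:, None] + A @ white               # transformed samples

print(np.round(samples.mean(axis=1), 2))        # close to mu
print(np.round(np.cov(samples), 2))             # close to K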
  • 43. White noise 40 Generation White noise may be generated digitally with a digital signal processor, microprocessor, or microcontroller. Generating white noise typically entails feeding an appropriate stream of random numbers to a digital-to-analog converter. The quality of the white noise will depend on the quality of the algorithm used.[9] References [1] Diebold, Frank (2007). Elements of Forecasting (Fourth ed.). [2] Jastreboff, P. J. (2000). "Tinnitus Habituation Therapy (THT) and Tinnitus Retraining Therapy (TRT)". Tinnitus Handbook. San Diego: Singular. pp. 357–376. [3] López, HH; Bracha, AS; Bracha, HS (September 2002). "Evidence based complementary intervention for insomnia" (http:/ / cogprints. org/ 5032/ 1/ 2002_H. M. J_White-noise_for_PTSD. pdf). Hawaii Med J 61 (9): 192, 213. PMID 12422383. . [4] Noell, Courtney A; William L Meyerhoff (2003-02). "Tinnitus. Diagnosis and treatment of this elusive symptom". Geriatrics 58 (2): 28–34. ISSN 0016-867X. PMID 12596495. [5] Soderlund, Goran; Sverker Sikstrom, Jan Loftesnes, Edmund Sonuga Barke (2010). "The effects of background white noise on memory performance in inattentive school children". Behavioral and Brain Functions 6 (1): 55. [6] Söderlund, Göran; Sverker Sikström, Andrew Smart (2007). "Listen to the noise: Noise is beneficial for cognitive performance in ADHD.". Journal of Child Psychology and Psychiatry 48 (8): 840–847. doi:10.1111/j.1469-7610.2007.01749.x. ISSN 0021-9630. [7] Loewen, Laura J.; Peter Suedfeld (1992-05-01). "Cognitive and Arousal Effects of Masking Office Noise" (http:/ / eab. sagepub. com/ content/ 24/ 3/ 381. abstract). Environment and Behavior 24 (3): 381–395. doi:10.1177/0013916592243006. . Retrieved 2011-10-28. [8] Baker, Mary Anne; Dennis H. Holding (1993-07). "The effects of noise and speech on cognitive task performance.". Journal of General Psychology 120 (3): 339–355. ISSN 0022-1309. [9] Matt Donadio. "How to Generate White Gaussian Noise" (http:/ / www. dspguru. com/ dsp/ howtos/ how-to-generate-white-gaussian-noise). . Retrieved 2012-09-19. External links • Meaning of a White-Noise Process (http://www.digitalsignallabs.com/white.pdf) - "proper" definition of the term white noise
• 44. Random walk 41
Random walk
A random walk is a mathematical formalization of a path that consists of a succession of random steps. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the financial status of a gambler can all be modeled as random walks, although they may not be truly random in reality. The term random walk was first introduced by Karl Pearson in 1905.[1] Random walks have been used in many fields: ecology, economics, psychology, computer science, physics, chemistry, and biology.[2][3][4][5][6][7][8][9] Random walks explain the observed behaviors of processes in these fields, and thus serve as a fundamental model for the recorded stochastic activity.
Example of eight random walks in one dimension starting at 0. The plot shows the current position on the line (vertical axis) versus the time steps (horizontal axis).
Various different types of random walks are of interest. Often, random walks are assumed to be Markov chains or Markov processes, but other, more complicated walks are also of interest. Some random walks are on graphs, others on the line, in the plane, or in higher dimensions, while some random walks are on groups. Random walks also vary with regard to the time parameter. Often, the walk is in discrete time, and indexed by the natural numbers, as in X0, X1, X2, .... However, some walks take their steps at random times, and in that case the position Xt is defined for the continuum of times t ≥ 0. Specific cases or limits of random walks include the Lévy flight. Random walks are related to the diffusion models and are a fundamental topic in discussions of Markov processes. Several properties of random walks, including dispersal distributions, first-passage times and encounter rates, have been extensively studied.
An animated example of a Brownian motion-like random walk with toroidal boundary conditions.
• 45. Random walk 42
Lattice random walk
A popular random walk model is that of a random walk on a regular lattice, where at each step the location jumps to another site according to some probability distribution. In a simple random walk, the location can only jump to neighboring sites of the lattice. In a simple symmetric random walk on a locally finite lattice, the probabilities of the location jumping to each one of its immediate neighbours are the same. The best studied example is the random walk on the d-dimensional integer lattice (sometimes called the hypercubic lattice) Zd.
One-dimensional random walk
An elementary example of a random walk is the random walk on the integer number line, Z, which starts at 0 and at each step moves +1 or −1 with equal probability. This walk can be illustrated as follows. A marker is placed at zero on the number line and a fair coin is flipped. If it lands on heads, the marker is moved one unit to the right. If it lands on tails, the marker is moved one unit to the left. After five flips, the marker could now be on 1, −1, 3, −3, 5, or −5. Five flips with three heads and two tails, in any order, will land the marker on 1. There are 10 ways of landing on 1 (by flipping three heads and two tails), 10 ways of landing on −1 (by flipping three tails and two heads), 5 ways of landing on 3 (by flipping four heads and one tail), 5 ways of landing on −3 (by flipping four tails and one head), 1 way of landing on 5 (by flipping five heads), and 1 way of landing on −5 (by flipping five tails). See the figure below for an illustration of the possible outcomes of 5 flips.
All possible random walk outcomes after 5 flips of a fair coin
To define this walk formally, take independent random variables Z1, Z2, ..., where each variable is either 1 or −1, with a 50% probability for either value, and set S0 = 0 and Sn = Z1 + Z2 + ... + Zn. The series {Sn} is called the simple random walk on Z. This series (the sum of the sequence of −1s and 1s) gives the distance walked, if each part of the walk is of length one. The expectation E[Sn] of Sn is zero. That is, the mean of all coin flips approaches zero as the number of flips increases. This follows by the finite additivity property of expectation: E[Sn] = E[Z1] + E[Z2] + ... + E[Zn] = 0. A similar calculation, using the independence of the random variables and the fact that E[Zn²] = 1, shows that E[Sn²] = n. This hints that E[|Sn|], the expected translation distance after n steps, should be of the order of √n. In fact, E[|Sn|] approaches √(2n/π) as n → ∞.
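A short simulation of the simple random walk just defined, checking that the mean displacement is near zero, the mean squared displacement is near n, and the mean absolute displacement is near √(2n/π); the number of walks and steps are arbitrary.

import numpy as np

rng = np.random.default_rng(9)

n_steps, n_walks = 1000, 20_000
steps = rng.choice([-1, 1], size=(n_walks, n_steps))   # Z_1, ..., Z_n for each walk
S_n = steps.sum(axis=1)                                # end position S_n of each walk

print(S_n.mean())                                      # close to 0     (E[S_n] = 0)
print((S_n ** 2).mean())                               # close to 1000  (E[S_n^2] = n)
print(np.abs(S_n).mean(), np.sqrt(2 * n_steps / np.pi))  # both about 25.2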
  • 46. Random walk 43 Derivation of dispersion proportionality If we have the situation where the probabilities of moving either left or right are equal, and , the probability of taking steps to the right out of a total of steps is given by since there are possible ways of taking and steps to the right and left, respectively. The probability of taking any of these independent steps is 1/2, and so we have the product . Now, the expectation value of taking steps is It is generally the case that . Note the identity we have used with the binomial coefficient . We use it again below. We must then calculate the expectation value of : It is generally the case that . The dispersion is This result shows that diffusion is ineffective for mixing because of the way the square root behaves for large . How many times will a random walk cross a boundary line if permitted to continue walking forever? A simple random walk on will cross every point an infinite number of times. This result has many names: the level-crossing phenomenon, recurrence or the gambler's ruin. The reason for the last name is as follows: a gambler with a finite amount of money will always lose when playing a fair game against a bank with an infinite amount of money. The gambler's money will perform a random walk, and it will reach zero at some point, and the game will be over. If a and b are positive integers, then the expected number of steps until a one-dimensional simple random walk starting at 0 first hits b or −a is ab. The probability that this walk will hit b before −a is , which can be derived from the fact that simple random walk is a martingale. Some of the results mentioned above can be derived from properties of Pascal's triangle. The number of different walks of n steps where each step is +1 or −1 is clearly 2n. For the simple random walk, each of these walks are
  • 47. Random walk 44 equally likely. In order for Sn to be equal to a number k it is necessary and sufficient that the number of +1 in the walk exceeds those of −1 by k. Thus, the number of walks which satisfy is precisely the number of ways of choosing (n + k)/2 elements from an n element set (for this to be non-zero, it is necessary that n + k be an even number), which is an entry in Pascal's triangle denoted by . Therefore, the probability that is equal to Pascal's triangle in terms of factorials and using Stirling's formula, one can obtain good estimates for these probabilities for large values of . This relation with Pascal's triangle is demonstrated for small values of n. At zero turns, the only possibility will be to remain at zero. However, at one turn, there is one chance of landing on −1 or one chance of landing on 1. At two turns, a marker at 1 could move to 2 or back to zero. A marker at −1, could move to −2 or back to zero. Therefore, there is one chance of landing on −2, two chances of landing on zero, and one chance of landing on 2. n −5 −4 −3 −2 −1 0 1 2 3 4 5 1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 The central limit theorem and the law of the iterated logarithm describe important aspects of the behavior of simple random walk on . In particular, the former entails that as n increases, the probabilities (proportional to the numbers in each row) approach a normal distribution. As a Markov chain A one-dimensional random walk can also be looked at as a Markov chain whose state space is given by the integers For some number p satisfying , the transition probabilities (the probability Pi,j of moving from state i to state j) are given by Gaussian random walk A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets. The Black–Scholes formula for modeling option prices, for example, uses a gaussian random walk as an underlying assumption. Here, the step size is the inverse cumulative normal distribution where 0 ≤ z ≤ 1 is a uniformly distributed random number, and μ and σ are the mean and standard deviations of the normal distribution, respectively. For steps distributed according to any distribution with zero mean and a finite variance (not necessarily just a normal distribution), the root mean square translation distance after n steps is
  • 48. Random walk 45 Higher dimensions Imagine now a drunkard walking randomly in an idealized city. The city is effectively infinite and arranged in a square grid, and at every intersection, the drunkard chooses one of the four possible routes (including the one he came from) with equal probability. Formally, this is a random walk on the set of all points in the plane with integer coordinates. Will the drunkard ever get back to his home from the bar? It turns out that he will. This is the high dimensional equivalent of the level crossing problem discussed above. The probability of returning to the origin decreases as the number of dimensions increases. In three dimensions, the probability decreases to roughly 34%. A derivation, along with values of p(d) are discussed here: Pólya's Random Walk Constants [10]. The trajectory of a random walk is the collection of sites it visited, considered as a set with disregard to when the walk arrived at the point. Random walk in two dimensions In one dimension, the trajectory is simply all points between the minimum height the walk achieved and the maximum (both are, on average, on the order of √n). In higher dimensions the set has interesting geometric properties. In fact, one gets a discrete fractal, that is a set which exhibits stochastic self-similarity on large scales, but on small scales one can observe "jaggedness" resulting from the grid on which the walk is performed. The two books of Lawler referenced below are a good source on this topic.
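Returning to the Gaussian random walk of the previous section, the step sizes can be drawn as the inverse cumulative normal of uniform variates exactly as described there (drawing normal variates directly would be equivalent). The values of μ, σ and the horizon are illustrative, and SciPy is assumed to be available for the inverse CDF.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(10)

mu, sigma = 0.0, 1.0                    # step-size mean and standard deviation (illustrative)
n_steps, n_walks = 250, 10_000

z = rng.uniform(size=(n_walks, n_steps))        # uniform variates in [0, 1)
steps = norm.ppf(z, loc=mu, scale=sigma)        # inverse cumulative normal step sizes
endpoints = steps.sum(axis=1)                   # final position of each walk

# Root mean square translation distance after n steps should be sigma * sqrt(n).
print(np.sqrt(np.mean(endpoints ** 2)), sigma * np.sqrt(n_steps))   # both about 15.8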
• 50. Random walk 46
Random walk in two dimensions with more, and smaller, steps.
Random walk in two dimensions with two million even smaller steps. This image was generated in such a way that points that are more frequently traversed are darker. In the limit, for very small steps, one obtains Brownian motion.
  • 50. Random walk 47 Three random walks in three dimensions Relation to Wiener process A Wiener process is a stochastic process with similar behaviour to Brownian motion, the physical phenomenon of a minute particle diffusing in a fluid. (Sometimes the Wiener process is called "Brownian motion", although this is strictly speaking a confusion of a model with the phenomenon being modeled.) A Wiener process is the scaling limit of random walk in dimension 1. This means that if you take a random walk with very small steps you get an approximation to a Wiener process (and, less accurately, to Brownian motion). To be more precise, if the step size is ε, one needs to take a walk of length L/ε2 to approximate a Wiener process walk of length L. As the step size tends to 0 (and the number of steps increases proportionally) random walk converges to a Wiener Simulated steps approximating a Wiener process in process in an appropriate sense. Formally, if B is the space of all two dimensions paths of length L with the maximum topology, and if M is the space of measure over B with the norm topology, then the convergence is in the space M. Similarly, a Wiener process in several dimensions is the scaling limit of random walk in the same number of dimensions. A random walk is a discrete fractal (a function with integer dimensions; 1, 2, ...), but a Wiener process trajectory is a true fractal, and there is a connection between the two. For example, take a random walk until it hits a circle of radius r times the step length. The average number of steps it performs is r2. This fact is the discrete version of the fact that a Wiener process walk is a fractal of Hausdorff dimension 2. In two dimensions, the average number of points the same random walk has on the boundary of its trajectory is r4/3. This corresponds to the fact that the boundary of the trajectory of a Wiener process is a fractal of dimension 4/3, a
  • 51. Random walk 48 fact predicted by Mandelbrot using simulations but proved only in 2000 by Lawler, Schramm and Werner.[11] A Wiener process enjoys many symmetries random walk does not. For example, a Wiener process walk is invariant to rotations, but random walk is not, since the underlying grid is not (random walk is invariant to rotations by 90 degrees, but Wiener processes are invariant to rotations by, for example, 17 degrees too). This means that in many cases, problems on random walk are easier to solve by translating them to a Wiener process, solving the problem there, and then translating back. On the other hand, some problems are easier to solve with random walks due to its discrete nature. Random walk and Wiener process can be coupled, namely manifested on the same probability space in a dependent way that forces them to be quite close. The simplest such coupling is the Skorokhod embedding, but other, more precise couplings exist as well. The convergence of a random walk toward the Wiener process is controlled by the central limit theorem. For a particle in a known fixed position at t = 0, the theorem tells us that after a large number of independent steps in the random walk, the walker's position is distributed according to a normal distribution of total variance: where t is the time elapsed since the start of the random walk, is the size of a step of the random walk, and is the time elapsed between two successive steps. This corresponds to the Green function of the diffusion equation that controls the Wiener process, which demonstrates that, after a large number of steps, the random walk converges toward a Wiener process. In 3D, the variance corresponding to the Green's function of the diffusion equation is: By equalizing this quantity with the variance associated to the position of the random walker, one obtains the equivalent diffusion coefficient to be considered for the asymptotic Wiener process toward which the random walk converges after a large number of steps: (valid only in 3D) Remark: the two expressions of the variance above correspond to the distribution associated to the vector that links the two ends of the random walk, in 3D. The variance associated to each component , or is only one third of this value (still in 3D).
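The displayed formulas in the paragraph above did not survive extraction. A plausible reconstruction, writing ε for the step size and δt for the time between successive steps (notation assumed here, not taken from the source), is that after a time t the walker's position is approximately normal with total variance

\sigma^2 = \frac{t}{\delta t}\,\varepsilon^2 ,

while the Green's function of the diffusion equation in 3D has variance

\sigma^2 = 6\,D\,t ,

so equating the two gives the equivalent diffusion coefficient

D = \frac{\varepsilon^2}{6\,\delta t} \qquad \text{(valid only in 3D)} ,

and each Cartesian component of the displacement then has variance 2Dt, one third of the total, as stated in the closing remark.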
  • 52. Random walk 49 Anomalous diffusion In disordered systems such as porous media and fractals may not be proportional to but to . The [12] exponent is called the anomalous diffusion exponent and can be larger or smaller than 2. Applications The following are some applications of random walk: • In economics, the "random walk hypothesis" is used to model shares prices and other factors. Empirical studies found some deviations from this theoretical model, especially in short term and long term correlations. See share prices. • In population genetics, random walk describes the statistical properties of genetic drift • In physics, random walks are used as simplified models of physical Brownian motion and diffusion such as the random movement of molecules in liquids and gases. See for example diffusion-limited aggregation. Also in physics, random walks and some of the self interacting walks play a role in quantum field theory. • In mathematical ecology, random walks are used to describe individual animal movements, to empirically support processes of biodiffusion, and occasionally to model population dynamics. Antony Gormley's Quantum Cloud sculpture in • In polymer physics, random walk describes an ideal chain. It is the London was designed by a computer using a random walk algorithm. simplest model to study polymers. • In other fields of mathematics, random walk is used to calculate solutions to Laplace's equation, to estimate the harmonic measure, and for various constructions in analysis and combinatorics. • In computer science, random walks are used to estimate the size of the Web. In the World Wide Web conference-2006 [13], bar-yossef et al. published their findings and algorithms for the same. • In image segmentation, random walks are used to determine the labels (i.e., "object" or "background") to associate with each pixel.[14] This algorithm is typically referred to as the random walker segmentation algorithm. In all these cases, random walk is often substituted for Brownian motion. • In brain research, random walks and reinforced random walks are used to model cascades of neuron firing in the brain. • In vision science, fixational eye movements are well described by a random walk.[15] • In psychology, random walks explain accurately the relation between the time needed to make a decision and the probability that a certain decision will be made.[16] • Random walks can be used to sample from a state space which is unknown or very large, for example to pick a random page off the internet or, for research of working conditions, a random worker in a given country. • When this last approach is used in computer science it is known as Markov Chain Monte Carlo or MCMC for short. Often, sampling from some complicated state space also allows one to get a probabilistic estimate of the space's size. The estimate of the permanent of a large matrix of zeros and ones was the first major problem tackled using this approach. • Random walks have also been used to sample massive online graphs such as online social networks. • In wireless networking, a random walk is used to model node movement. • Motile bacteria engage in a biased random walk. • Random walks are used to model gambling.
  • 53. Random walk 50 • In physics, random walks underlie the method of Fermi estimation. Variants of random walks A number of types of stochastic processes have been considered that are similar to the pure random walks but where the simple structure is allowed to be more generalized. The pure structure can be characterized by the steps being being defined by independent and identically distributed random variables. Random walk on graphs A random walk of length k on a possibly infinite graph G with a root 0 is a stochastic process with random variables such that and is a vertex chosen uniformly at random from the neighbors of . Then the number is the probability that a random walk of length k starting at v ends at w. In particular, if G is a graph with root 0, is the probability that a -step random walk returns to 0. Assume now that our city is no longer a perfect square grid. When our drunkard reaches a certain junction he picks between the various available roads with equal probability. Thus, if the junction has seven exits the drunkard will go to each one with probability one seventh. This is a random walk on a graph. Will our drunkard reach his home? It turns out that under rather mild conditions, the answer is still yes. For example, if the lengths of all the blocks are between a and b (where a and b are any two finite positive numbers), then the drunkard will, almost surely, reach his home. Notice that we do not assume that the graph is planar, i.e. the city may contain tunnels and bridges. One way to prove this result is using the connection to electrical networks. Take a map of the city and place a one ohm resistor on every block. Now measure the "resistance between a point and infinity". In other words, choose some number R and take all the points in the electrical network with distance bigger than R from our point and wire them together. This is now a finite electrical network and we may measure the resistance from our point to the wired points. Take R to infinity. The limit is called the resistance between a point and infinity. It turns out that the following is true (an elementary proof can be found in the book by Doyle and Snell): Theorem: a graph is transient if and only if the resistance between a point and infinity is finite. It is not important which point is chosen if the graph is connected. In other words, in a transient system, one only needs to overcome a finite resistance to get to infinity from any point. In a recurrent system, the resistance from any point to infinity is infinite. This characterization of recurrence and transience is very useful, and specifically it allows us to analyze the case of a city drawn in the plane with the distances bounded. A random walk on a graph is a very special case of a Markov chain. Unlike a general Markov chain, random walk on a graph enjoys a property called time symmetry or reversibility. Roughly speaking, this property, also called the principle of detailed balance, means that the probabilities to traverse a given path in one direction or in the other have a very simple connection between them (if the graph is regular, they are just equal). This property has important consequences. Starting in the 1980s, much research has gone into connecting properties of the graph to random walks. 
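The drunkard-on-a-graph construction above is straightforward to simulate: at each vertex, pick a neighbour uniformly at random. The sketch below (a minimal illustration on a made-up graph; none of it comes from the source) estimates the probability that a walk of length k started at the root ends back at the root, the return probability discussed above.

```python
import random

# A small undirected graph given as adjacency lists (hypothetical example).
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [1, 2],
}

def return_probability(graph, root, k, n_walks=100000):
    """Estimate the probability that a k-step random walk started at `root`
    ends back at `root`, choosing a uniformly random neighbour at each step."""
    hits = 0
    for _ in range(n_walks):
        v = root
        for _ in range(k):
            v = random.choice(graph[v])
        if v == root:
            hits += 1
    return hits / n_walks

if __name__ == "__main__":
    for k in (2, 4, 8):
        print(k, return_probability(graph, 0, k))
```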
In addition to the electrical network connection described above, there are important connections to isoperimetric inequalities, to functional inequalities such as the Sobolev and Poincaré inequalities, and to properties of solutions of Laplace's equation. A significant portion of this research has focused on Cayley graphs of finitely generated groups. For example, the proof by Dave Bayer and Persi Diaconis that 7 riffle shuffles are enough to mix a pack of cards (see more details under shuffle) is in effect a result about random walk on the group Sn, and the proof uses the group structure in an essential way. In many cases these discrete results carry over to, or are derived from, manifolds and Lie groups. A good reference for random walk on graphs is the online book by Aldous and Fill [17]; for groups see the book of Woess. If the transition kernel is itself random (based on a random environment) then the random walk is
  • 54. Random walk 51 called a "random walk in random environment". When the law of the random walk includes the randomness of , the law is called the annealed law; on the other hand, if is seen as fixed, the law is called a quenched law. See the book of Hughes or the lecture notes of Zeitouni. We can think about choosing every possible edge with the same probability as maximizing uncertainty (entropy) locally. We could also do it globally – in maximal entropy random walk (MERW) [18] we want all paths to be equally probable, or in other words: for each two vertexes, each path of given length is equally probable. This random walk has much stronger localization properties. Self-interacting random walks There are a number of interesting models of random paths in which each step depends on the past in a complicated manner. All are more complex for solving analytically than the usual random walk; still, the behavior of any model of a random walker is obtainable using computers. Examples include: • The self-avoiding walk (Madras and Slade 1996).[19] The self-avoiding walk of length n on Z^d is the random n-step path which starts at the origin, makes transitions only between adjacent sites in Z^d, never revisits a site, and is chosen uniformly among all such paths. In two dimensions, due to self-trapping, a typical self-avoiding walk is very short,[20] while in higher dimension it grows beyond all bounds. This model has often been used in polymer physics (since the 1960s). • The loop-erased random walk (Gregory Lawler).[21][22] • The reinforced random walk (Robin Pemantle 2007).[23] • The exploration process. • The multiagent random walk.[24] Long-range correlated walks Long-range correlated time series are found in many biological, climatological and economic systems. • Heartbeat records[25] • Non-coding DNA sequences[26] • Volatility time series of stocks[27] • Temperature records around the globe[28] Heterogeneous random walks in one dimension Heterogeneous random walks in one dimension can have either discrete time or continuous time. The interval is also either discrete or continuous, and it is either finite or without bounds. In a discrete system, the connections are among adjacent states. The dynamics are either Markovian, Semi-Markovian, or even not-Markovian depending on the model. Heterogeneous random walks in 1D have jump probabilities that depend on the location in the system, and/or different jumping time (JT) probability density functions (PDFs) that depend on Figure 1 A part of a semi-Markovian chain in 1D with directional JT-PDFs. A way for simulating the location in the system. such a random walk is when first drawing a Known important results in simple systems include: random number out of a uniform distribution that determines the propagation direction according to • In a symmetric Markovian random walk, the Green's function (also the transition probabilities, and then drawing a termed the PDF of the walker) for occupying state i is a Gaussian in random time out of the relevant JT-PDF. the position and has a variance that scales like the time. This result holds in a system with discrete time and space, yet also in a system with continuous time and space. This result is for systems without bounds.
  • 55. Random walk 52 • When there is a simple bias in the system (i.e. a constant force is applied on the system in a particular direction), the average distance of the random walker from its starting position is linear with time. • When trying reaching a distance L from the starting position in a finite interval of length L with a constant force, the time for reaching this distance is exponential with the length L: , when moving against the force, and is linear with the length L: , when moving with the force. Without force: . In a completely heterogeneous semi Markovian random walk in a discrete system of L (>1) states, the Green's function was found in Laplace space[29][30][31] (the Laplace transform of a function is defined with, ). Here, the system is defined through the jumping time (JT) PDFs: connecting state i with state j (the jump is from state i). The solution is based on the path representation of the Green's function, calculated when including all the path probability density functions of all lengths: (1) Here, and . Also, in Eq. (1), (2) and, (3) with, (4) and, (5) For L=1, . The symbol [L/2] that appears in the upper bound in the in eq. (3) is the floor operation (round towards zero). Finally, the factor in eq. (1) has the same form as in in eqs. (3)-(5), yet it is calculated on a lattice . Lattice is constructed from the original lattice by taking out from it the states i and j and the states between them, and then connecting the obtained two fragments. For cases in which a fragment is a single state, this fragment is excluded; namely, lattice is the longer fragment. When each fragment is a single state, . Equations (1)-(5) hold for any 1D semi-Markovian random walk in a L-state chain, and form the most general solution in an explicit form for random walks in 1d.
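The statement above that a constant bias produces an average displacement linear in time is easy to check by simulation. The sketch below (my own illustration; the bias value and step counts are arbitrary) averages many biased ±1 walks: the mean displacement grows roughly like (2p - 1)·n, while an unbiased walk's mean stays near zero and only its spread grows, on the order of √n.

```python
import random

def mean_displacement(p_right, n_steps, n_walks=2000):
    """Average end position of a +/-1 walk that steps right with probability p_right."""
    total = 0
    for _ in range(n_walks):
        x = 0
        for _ in range(n_steps):
            x += 1 if random.random() < p_right else -1
        total += x
    return total / n_walks

if __name__ == "__main__":
    for n in (100, 200, 400, 800):
        # With p_right = 0.6 the biased mean grows roughly like 0.2 * n (linear in time),
        # while the unbiased (0.5) mean stays near zero.
        print(n, mean_displacement(0.6, n), mean_displacement(0.5, n))
```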
  • 56. Random walk 53 References [1] Pearson, K. (1905). The Problem of the Random Walk. Nature. 72, 294. [2] Van Kampen N. G., Stochastic Processes in Physics and Chemistry, revised and enlarged edition (North-Holland, Amsterdam) 1992. [3] Redner S., A Guide to First-Passage Process (Cambridge University Press, Cambridge, UK) 2001. [4] Goel N. W. and Richter-Dyn N., Stochastic Models in Biology (Academic Press, New York) 1974. [5] Doi M. and Edwards S. F., The Theory of Polymer Dynamics (Clarendon Press, Oxford) 1986 [6] De Gennes P. G., Scaling Concepts in Polymer Physics (Cornell University Press, Ithaca and London) 1979. [7] Risken H., The Fokker–Planck Equation (Springer, Berlin) 1984. [8] Weiss G. H., Aspects and Applications of the Random Walk (North-Holland, Amsterdam) 1994. [9] Cox D. R., Renewal Theory (Methuen, London) 1962. [10] http:/ / mathworld. wolfram. com/ PolyasRandomWalkConstants. html [11] Dana Mackenzie, Taking the Measure of the Wildest Dance on Earth (http:/ / www. sciencemag. org/ content/ 290/ 5498/ 1883. full), Science, Vol. 290, no. 5498, pp. 1883–1884. [12] D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (http:/ / havlin. biu. ac. il/ Shlomo Havlin books_d_r. php), Cambridge University Press, 2000. [13] http:/ / www2006. org/ [14] Leo Grady (2006): "Random Walks for Image Segmentation" (http:/ / www. cns. bu. edu/ ~lgrady/ grady2006random. pdf), IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1768–1783, Vol. 28, No. 11 [15] Ralf Engbert, Konstantin Mergenthaler, Petra Sinn, and Arkady Pikovsk: "An integrated model of fixational eye movements and microsaccades" (http:/ / www. pnas. org/ content/ early/ 2011/ 08/ 17/ 1102730108. full. pdf) [16] Nosofsky, 1997 (http:/ / 66. 102. 1. 104/ scholar?num=100& hl=en& lr=& safe=off& q=cache:0cCSw8zlPjoJ:oz. ss. uci. edu/ 237/ readings/ EBRW_nosofsky_1997. pdf+ decision+ random+ walk) [17] http:/ / stat-www. berkeley. edu/ users/ aldous/ RWG/ book. html [18] http:/ / arxiv. org/ abs/ 0810. 4113 [19] Neal Madras and Gordon Slade (1996), The Self-Avoiding Walk, Birkhäuser Boston. ISBN 0-8176-3891-1. [20] S. Hemmer and P. C. Hemmer (1984), "An average self-avoiding random walk on the square lattice lasts 71 steps", J. Chem. Phys. 81: 584, Bibcode 1984JChPh..81..584H, doi:10.1063/1.447349 Papercore summary (http:/ / papercore. org/ Hemmer1984) [21] Gregory Lawler (1996). Intersection of random walks, Birkhäuser Boston. ISBN 0-8176-3892-X. [22] Gregory Lawler, Conformally Invariant Processes in the Plane, book.ps (http:/ / www. math. cornell. edu/ ~lawler/ book. ps). [23] Robin Pemantle (2007), A survey of random processes with reinforcement (http:/ / www. emis. de/ journals/ PS/ images/ getdoc9b04. pdf?id=432& article=94& mode=pdf). [24] Alamgir, M and von Luxburg, U (2010). "Multi-agent random walks for local clustering on graphs" (http:/ / www. kyb. mpg. de/ fileadmin/ user_upload/ files/ publications/ attachments/ AlamgirLuxburg2010_[0]. pdf), IEEE 10th International Conference on Data Mining (ICDM), 2010, pp. 18-27. [25] C.-K. Peng, J. Mietus, J. M. Hausdorff, S. Havlin, H. E. Stanley, A. L. Goldberger (1993). "Long-range anticorrelations and non-gaussian behavior of the heartbeat" (http:/ / havlin. biu. ac. il/ Publications. php?keyword=Long-range+ anticorrelations+ and+ non-gaussian+ behavior+ of+ the+ heartbeat& year=*& match=all). Phys. Rev. Lett. 70 (9): 1343–6. Bibcode 1993PhRvL..70.1343P. doi:10.1103/PhysRevLett.70.1343. PMID 10054352. . [26] C.-K. 
Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H. E. Stanley (1992). "Long-range correlations in nucleotide sequences" (http:/ / havlin. biu. ac. il/ Publications. php?keyword=Long-range+ correlations+ in+ nucleotide+ sequences& year=*& match=all). Nature 356 (6365): 168–70. Bibcode 1992Natur.356..168P. doi:10.1038/356168a0. PMID 1301010. . [27] Y. Liu, P. Cizeau, M. Meyer, C.-K. Peng, H. E. Stanley (1997). "Correlations in economic time series". Physica A 245 (3–4): 437. doi:10.1016/S0378-4371(97)00368-3. [28] E. Koscielny-Bunde, A. Bunde, S. Havlin, H. E. Roman, Y. Goldreich, H.-J. Schellenhuber (1998). "Indication of a universal persistence law governing atmospheric variability" (http:/ / havlin. biu. ac. il/ Publications. php?keyword=Indication+ of+ a+ universal+ persistence+ law+ governing+ atmospheric+ variability& year=*& match=all). Phys. Rev. Lett. 81 (3): 729. Bibcode 1998PhRvL..81..729K. doi:10.1103/PhysRevLett.81.729. . [29] Flomenbom O. and Klafter J., Phys. Rev. Lett., 95 (2005) 098105-1 [30] Flomenbom O. and Silbey R. J., J. Chem. Phys. 127, 034102 (2007). [31] Flomenbom O and Silbey RJ, Phys. Rev. E 76, 041101 (2007).
  • 57. Random walk 54 Bibliography • David Aldous and Jim Fill, Reversible Markov Chains and Random Walks on Graphs, http://stat-www.berkeley. edu/users/aldous/RWG/book.html • Doyle, Peter G.; Snell, J. Laurie (1984). Random walks and electric networks. Carus Mathematical Monographs. 22. Mathematical Association of America. arXiv:math.PR/0001057. ISBN 978-0-88385-024-4. MR920811 • William Feller (1968), An Introduction to Probability Theory and its Applications (Volume 1). ISBN 0-471-25708-7 Chapter 3 of this book contains a thorough discussion of random walks, including advanced results, using only elementary tools. • Barry D. Hughes (1996), Random walks and random environments, Oxford University Press. ISBN 0-19-853789-1 • James Norris (1998), Markov Chains, Cambridge University Press. ISBN 0-521-63396-6 • Springer (http://www.springerlink.com/(brnqxc55mlvpxs452ufzp555)/app/home/contribution. asp?referrer=parent&backto=issue,13,13;journal,798,1099;linkingpublicationresults,1:100442,1) Pólya (1921), "Über eine Aufgabe der Wahrscheinlichkeitsrechnung betreffend die Irrfahrt im Strassennetz", Mathematische Annalen, 84(1-2):149–160, March 1921. • Pal Révész (1990), Random walk in random and non-random environments, World Scientific Pub Co. ISBN 981-02-0237-7 • Wolfgang Woess (2000), Random walks on infinite graphs and groups, Cambridge tracts in mathematics 138, Cambridge University Press. ISBN 0-521-55292-3 • Mackenzie, Dana, "Taking the Measure of the Wildest Dance on Earth" (http://www.sciencemag.org/cgi/ content/full/sci;290/5498/1883), Science, Vol. 290, 8 December 2000. • G. Weiss Aspects and Applications of the Random Walk, North-Holland, 1994. • D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (http://havlin.biu. ac.il/Shlomo Havlin books_d_r.php), Cambridge University Press, 2000. • "Numb3rs Blog." Department of Mathematics. 29 April 2006. Northeastern University. 12 December 2007 http:// www.atsweb.neu.edu/math/cp/blog/?id=137&month=04&year=2006&date=2006-04-29. External links • Pólya's Random Walk Constants (http://mathworld.wolfram.com/PolyasRandomWalkConstants.html) • Random walk in Java Applet (http://vlab.infotech.monash.edu.au/simulations/swarms/random-walk/)
  • 58. Brownian motion 55 Brownian motion Brownian motion or pedesis (from Greek: πήδησις Pɛɖeːsɪs "leaping") is the presumably random moving of particles suspended in a fluid (a liquid or a gas) resulting from their bombardment by the fast-moving atoms or molecules in the gas or liquid. The term "Brownian motion" can also refer to the mathematical model used to describe such random movements, which is often called a particle theory.[1] In 1827 the biologist Robert Brown, looking through a microscope at pollen grains in water, noted that the grains moved through the water but was not able to determine the mechanisms that caused this motion. The direction of the force of atomic bombardment is constantly changing, and at different times the pollen grain is hit more on one side This is a simulation of the Brownian motion of a than another, leading to the seemingly random nature of the motion. big particle (dust particle) that collides with a large set of smaller particles (molecules of a gas) This type of phenomenon is named in Brown's honor. which move with different velocities in different The mathematical model of Brownian motion has several real-world random directions. applications. Stock market fluctuations are often cited, although Benoit Mandelbrot rejected its applicability to stock price movements in part because these are discontinuous.[2] Brownian motion is among the simplest of the continuous-time stochastic (or probabilistic) processes, and it is a limit of both simpler and more complicated stochastic processes (see random walk and Donsker's theorem). This universality is closely related to the universality of the normal distribution. In both cases, it is often mathematical convenience rather than the accuracy of the models that motivates their use. This is because Brownian motion, whose time derivative is everywhere infinite, is an idealised approximation to actual random physical processes, which always have a finite time scale. This is a simulation of the Brownian motion of 5 particles (yellow) that collide with a large set of 800 particles. The yellow particles leave 5 blue trails of random motion and one of them has a red velocity vector.
Figure: three different views of Brownian motion, with 32 steps, 256 steps, and 2048 steps denoted by progressively lighter colors. Figure: a single realisation of three-dimensional Brownian motion for times 0 ≤ t ≤ 2.
History The Roman Lucretius's scientific poem "On the Nature of Things" (c. 60 BC) has a remarkable description of the Brownian motion of dust particles, which he uses as a proof of the existence of atoms: "Observe what happens when sunbeams are admitted into a building and shed light on its shadowy places. You will see a multitude of tiny particles mingling in a multitude of ways... their dancing is an actual indication of underlying movements of matter that are hidden from our sight... It originates with the atoms which move of themselves [i.e., spontaneously]. Then those small compound bodies that are least removed from the impetus of the atoms are set in motion by the impact of their invisible blows and in turn cannon against slightly larger bodies. So the movement mounts up from the atoms and gradually emerges to the level of our senses, so that those bodies are in motion that we see in sunbeams, moved by blows that remain invisible." (Figure: reproduced from Jean Baptiste Perrin's book Les Atomes, three tracings of the motion of colloidal particles of radius 0.53 µm, as seen under the microscope; successive positions every 30 seconds are joined by straight line segments, and the mesh size is 3.2 µm.[3]) Although the mingling motion of dust particles is caused largely by air currents, the glittering, tumbling motion of small dust particles is indeed caused chiefly by true Brownian dynamics. Jan Ingenhousz had described the irregular motion of coal dust particles on the surface of alcohol in 1785; nevertheless, the discovery is often credited to the botanist Robert Brown in 1827. Brown was studying pollen grains of the plant Clarkia pulchella suspended in water under a microscope when he observed minute particles, ejected by the pollen grains, executing a jittery motion. By repeating the experiment with particles of inorganic matter he was able to rule out that the motion was life-related, although its origin was yet to be explained. The first person to describe the mathematics behind Brownian motion was Thorvald N. Thiele, in a paper on the method of least squares published in 1880. This was followed independently by Louis Bachelier in 1900 in his PhD thesis "The theory of speculation", in which he presented a stochastic analysis of the stock and option markets. Albert Einstein (in one of his 1905 papers) and Marian Smoluchowski (1906) brought the solution of the problem to the attention of physicists, and presented it as a way to indirectly confirm the existence of atoms and molecules. Their equations describing Brownian motion were subsequently verified by the experimental work of Jean Baptiste Perrin in 1913. Einstein's theory There are two parts to Einstein's theory: the first part consists in the formulation of a diffusion equation for Brownian particles, in which the diffusion coefficient is related to the mean squared displacement of a Brownian particle, while the second part consists in relating the diffusion coefficient to measurable physical quantities. In this way Einstein was able to determine the size of atoms and how many atoms there are in a mole, or the molecular weight in grams, of a gas. In accordance with Avogadro's law, the volume occupied by a mole is the same for all ideal gases: 22.414 liters at standard temperature and pressure.
The number of atoms contained in this volume is referred to as Avogadro's number, and the determination of this number is tantamount to the knowledge of the mass of an atom since the latter is obtained by dividing the mass of a mole of the gas by Avogadro's number.
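In symbols (a restatement of the sentence above, with notation chosen here): if M is the mass of a mole of the gas and N_A is Avogadro's number, the mass of a single atom is

m = \frac{M}{N_A} .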
  • 61. Brownian motion 58 The first part of Einstein's argument was to determine how far a Brownian particle travels in a given time interval. Classical mechanics is unable to determine this distance because of the enormous number of bombardments a Brownian particle will undergo, roughly of the order of 1021 collisions per second.[4] Thus Einstein was led to consider the collective motion of Brownian particles. He showed that if ρ(x,t) is the density of Brownian particles at point x at time t, then ρ satisfies the diffusion equation: The characteristic bell-shaped curves of the diffusion of Brownian particles. The distribution begins as a Dirac delta function, indicating that all the particles are located at the origin at time t=0, and for increasing times they become flatter and flatter until the distribution becomes uniform in the asymptotic time limit. where: D is the mass diffusivity. Assuming that all the particles start from the origin at the initial time t=0, the diffusion equation has the solution This expression allowed Einstein to calculate the moments directly. The first moment is seen to vanish, meaning that the Brownian particle is equally likely to move to the left as it is to move to the right. The second moment is, however, non-vanishing, being given by This expresses the mean squared displacement in terms of the time elapsed and the diffusivity. From this expression Einstein argued that the displacement of a Brownian particle is not proportional to the elapsed time, but rather to its square root.[5] His argument is based on a conceptual switch from the "ensemble" of Brownian particles to the "single" Brownian particle: we can speak of the relative number of particles at a single instant just as well as of the time it takes a Brownian particle to reach a given point.[6] The second part of Einstein's theory relates the diffusion constant to physically measurable quantities, such as the mean squared displacement of a particle in a given time interval. This result enables the experimental determination of Avogadro's number and therefore the size of molecules. Einstein analyzed a dynamic equilibrium being established between opposing forces. The beauty of his argument is that the final result does not depend upon which forces are involved in setting up the dynamic equilibrium. In his original treatment, Einstein considered an osmotic pressure experiment, but the same conclusion can be reached in other ways. Consider, for instance, particles suspended in a viscous fluid in a gravitational field. Gravity tends to make the particles settle, whereas diffusion acts to homogenize them, driving them into regions of smaller concentration. Under the action of gravity, a particle acquires a downward speed of , where is the mass of the particle, is the acceleration due to gravity, and is the particle's mobility in the fluid. George Stokes had shown that the mobility for a spherical particle with radius is , where is the dynamic viscosity of the fluid. In a state of dynamic equilibrium, the particles are distributed according to the barometric distribution where is the difference in density of particles separated by a height difference of , is Boltzmann's constant (namely, the ratio of the universal gas constant, , to Avogadro's number, ), and is the absolute temperature. It is Avogadro's number that is to be determined.
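Several displayed equations in the passage above were lost in extraction; the following is a plausible reconstruction of their standard forms (the notation is assumed here, not taken from the source). The density ρ(x, t) of Brownian particles satisfies the diffusion equation

\frac{\partial \rho}{\partial t} = D\,\frac{\partial^2 \rho}{\partial x^2} ,

and if all N particles start from the origin at t = 0,

\rho(x, t) = \frac{N}{\sqrt{4\pi D t}}\, e^{-x^2/(4 D t)} ,

so the first moment vanishes while the second moment (the mean squared displacement) is

\overline{x^2} = 2\,D\,t .

For the second part of the argument, a particle of mass m settling under gravity acquires the drift speed v = \mu m g, with Stokes mobility \mu = 1/(6\pi\eta r) for a sphere of radius r in a fluid of dynamic viscosity \eta, and the barometric distribution takes the form

\rho = \rho_0\, e^{-m g h/(k_B T)} ,

with k_B = R/N_A Boltzmann's constant and T the absolute temperature.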
  • 62. Brownian motion 59 Dynamic equilibrium is established because the more that particles are pulled down by gravity, the greater is the tendency for the particles to migrate to regions of lower concentration. The flux is given by Fick's law, where . Introducing the formula for , we find that In a state of dynamical equilibrium, this speed must also be equal to The equilibrium distribution for particles of gamboge shows the tendency for granules to . Notice that both expressions for are proportional to move to regions of lower concentration when , reflecting how the derivation is independent of the type of forces affected by gravity. considered. Equating these two expressions yields a formula for the diffusivity: Here the first equality follows from the first part of Einstein's theory, the third equality follows from the definition of Boltzmann's constant as , and the fourth equality follows from Stokes' formula for the mobility. By measuring the mean squared displacement over a time interval along with the universal gas constant , the temperature , the viscosity , and the particle radius , Avogadro's number can be determined. The type of dynamical equilibrium proposed by Einstein was not new. It had been pointed out previously by J. J. Thomson[7] in his series of lectures at Yale University in May 1903 that the dynamic equilibrium between the velocity generated by a concentration gradient given by Fick's law and the velocity due to the variation of the partial pressure caused when ions are set in motion "gives us a method of determining Avogadro's Constant which is independent of any hypothesis as to the shape or size of molecules, or of the way in which they act upon each other".[7] An identical expression to Einstein's formula for the diffusion coefficient was also found by Walther Nernst in 1888[8] in which he expressed the diffusion coefficient as the ratio of the osmotic pressure to the ratio of the frictional force and the velocity to which it gives rise. The former was equated to the law of van 't Hoff while the latter was given by Stokes's law. He writes for the diffusion coefficient , where is the osmotic pressure and is the ratio of the frictional force to the molecular viscosity which he assumes is given by Stokes's formula for the viscosity. Introducing the ideal gas law per unit volume for the osmotic pressure, the formula becomes identical to that of Einstein's.[9] The use of Stokes's law in Nernst's case, as well as in Einstein and Smoluchowski, is not strictly applicable since it does not apply to the case where the radius of the sphere is small in comparison with the mean free path.[10] At first the predictions of Einstein's formula were seemingly refuted by a series of experiments by Svedberg in 1906 and 1907, which gave displacements of the particles as 4 to 6 times the predicted value, and by Henri in 1908 who found displacements 3 times greater than Einstein's formula predicted.[11] But Einstein's predictions were finally confirmed in a series of experiments carried out by Chaudesaigues in 1908 and Perrin in 1909. The confirmation of Einstein's theory constituted empirical progress for the kinetic theory of heat. In essence, Einstein showed that the motion can be predicted directly from the kinetic model of thermal equilibrium. The importance of the theory lay in the fact that it confirmed the kinetic theory's account of the second law of thermodynamics as being an essentially statistical law.[12]
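The chain of equalities for the diffusivity referred to above is likewise missing its displayed form. In its standard statement (again a reconstruction, continuing the notation of the previous block),

D = \frac{\overline{x^2}}{2t} = \mu\, k_B T = \frac{k_B T}{6\pi\eta r} = \frac{R T}{6\pi\eta r\, N_A} ,

so that measuring the mean squared displacement over a time interval t, together with the gas constant R, the temperature T, the viscosity η and the particle radius r, determines Avogadro's number as

N_A = \frac{R T\, t}{3\pi\eta r\, \overline{x^2}} .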
  • 63. Brownian motion 60 Intuitive metaphor Consider a large balloon of 100 metres in diameter. Imagine this large balloon in a football stadium. The balloon is so large that it lies on top of many members of the crowd. Because they are excited, these fans hit the balloon at different times and in different directions with the motions being completely random. In the end, the balloon is pushed in random directions, so it should not move on average. Consider now the force exerted at a certain time. We might have 20 supporters pushing right, and 21 other supporters pushing left, where each supporter is exerting equivalent amounts of force. In this case, the forces exerted towards the left and the right are imbalanced in favor of the left; the balloon will move slightly to the left. This type of imbalance exists at all times, and it causes random motion of the balloon. If we look at this situation from far above, so that we cannot see the supporters, we see the large balloon as a small object animated by erratic movement. Brownian motion model of the trajectory of a particle of dye in water. Consider the particles emitted by Brown's pollen grain moving randomly in water: we know that a water molecule is about 0.1 by 0.2 nm in size, whereas the particles which Brown observed were of the order of a few micrometres in size (these are not to be confused with the actual pollen particle which is about 100 micrometres). So a particle from the pollen may be likened to the balloon, and the water molecules to the fans, except that in this case the balloon is surrounded by fans. The Brownian motion of a particle in a liquid is thus due to the instantaneous imbalance in the combined forces exerted by collisions of the particle with the much smaller liquid molecules (which are in random thermal motion) surrounding it. An animation of the Brownian motion concept [13] is available as a Java applet. Theory Smoluchowski model Smoluchowski's theory of Brownian motion[14] starts from the same premise as that of Einstein and derives the same probability distribution for the displacement of a Brownian particle along the in time . He therefore gets the same expression for the mean squared displacement: . However, when he relates it to a particle of mass moving at a velocity which is the result of a frictional force governed by Stokes's law, he finds where is the viscosity coefficient, and is the radius of the particle. Associating the kinetic energy with the thermal energy , the expression for the mean squared displacement is 64/27 times that found by Einstein. The fraction 27/64 was commented on by Arnold Sommerfeld in his necrology on Smoluckowski: "The numerical coefficient of Einstein, which differs from Smoluchowski by 27/64 can only be put in doubt."[15]
  • 64. Brownian motion 61 Smoluchowski[16] attempts to answer the question of why a Brownian particle should be displaced by bombardments of smaller particles when the probabilities for striking it in the forward and rear directions are equal. In order to do so, he uses, unknowingly, the ballot theorem, first proved by W.A. Whitworth in 1887.[17] The ballot theorem states that if a candidate A scores votes and candidate B scores that the probability throughout the counting that A will have more votes than B is or , no matter how large the total number of votes may be. In other words, if one candidate has an edge on the other candidate he will tend to keep that edge even though there is nothing favoring either candidate on a ballot extraction. If the probability of gains and losses follows a binomial distribution, with equal a priori probabilities of , the mean total gain is If is large enough so that Stirling's approximation can be used in the form then the expected total gain will be showing that it increases as the square root of the total population. Suppose that a Brownian particle of mass is surrounded by lighter particles of mass which are traveling at a speed . Then, reasons Smoluchowski, in any collision between a surrounding and Brownian particles, the velocity transmitted to the latter will be . This ratio is of the order of cm/sec. But we also have to take into consideration that in a gas there will be more than collisions in a second, and even greater in a liquid where we expect that there will be collision in one second. Some of these collisions will tend to accelerate the Brownian particle; others will tend to decelerate it. If there is a mean excess of one kind of collision or the other to be of the order of to collisions in one second, then velocity of the Brownian particle may be anywhere between 10 to a 1000 cm/sec. Thus, even though there are equal probabilities for forward and backward collisions there will be a net tendency to keep the Brownian particle in motion, just as the ballot theorem predicts. These orders of magnitude are not exact because they don't take into consideration the velocity of the Brownian particle, , which depends on the collisions that tend to accelerate and decelerate it. The larger is, the greater will be the collisions that will retard it so that the velocity of a Brownian particle can never increase without limit. Could a such a process occur, it would be tantamount to a perpetual motion of the second type. And since equipartition of energy applies, the kinetic energy of the Brownian particle, , will be equal, on the average, to the kinetic energy of the surrounding fluid particle, . In 1906 Smoluchowski published a one-dimensional model to describe a particle undergoing Brownian motion.[18] The model assumes collisions with M ≫ m where M is the test particle's mass and m the mass of one of the individual particles composing the fluid. It is assumed that the particle collisions are confined to one dimension and that it is equally probable for the test particle to be hit from the left as from the right. It is also assumed that every collision always imparts the same magnitude of . If is the number of collisions from the right and the number of collisions from the left then after N collisions the particle's velocity will have changed by . The multiplicity is then simply given by:
  • 65. Brownian motion 62 and the total number of possible states is given by . Therefore the probability of the particle being hit from the right times is: As a result of its simplicity, Smoluchowski's 1D model can only qualitatively describe Brownian motion. For a realistic particle undergoing Brownian motion in a fluid many of the assumptions cannot be made. For example, the assumption that on average there occurs an equal number of collisions from the right as from the left falls apart once the particle is in motion. Also, there would be a distribution of different possible s instead of always just one in a realistic situation. Modeling using differential equations The equations governing Brownian motion relate slightly differently to each of the two definitions of Brownian motion given at the start of this article. Mathematics In mathematics, Brownian motion is described by the Wiener process; a continuous-time stochastic process named in honor of Norbert Wiener. It is one of the best known Lévy processes (càdlàg stochastic processes with stationary independent increments) and occurs frequently in pure and applied mathematics, economics and physics. The Wiener process is characterised by four facts: 1. 2. is almost surely continuous 3. has independent increments 4. (for ). denotes the normal distribution with expected value μ and variance σ2. The An animated example of a Brownian motion-like random walk on a torus. In the condition that it has independent increments scaling limit, random walk approaches the Wiener process according to Donsker's means that if theorem. then and are independent random variables. An alternative characterisation of the Wiener process is the so-called Lévy characterisation that says that the Wiener process is an almost surely continuous martingale with and quadratic variation . A third characterisation is that the Wiener process has a spectral representation as a sine series whose coefficients are independent random variables. This representation can be obtained using the Karhunen–Loève theorem. The Wiener process can be constructed as the scaling limit of a random walk, or other discrete-time stochastic processes with stationary independent increments. This is known as Donsker's theorem. Like the random walk, the Wiener process is recurrent in one or two dimensions (meaning that it returns almost surely to any fixed
  • 66. Brownian motion 63 neighborhood of the origin infinitely often) whereas it is not recurrent in dimensions three and higher. Unlike the random walk, it is scale invariant. The time evolution of the position of the Brownian particle itself can be described approximately by a Langevin equation, an equation which involves a random force field representing the effect of the thermal fluctuations of the solvent on the Brownian particle. On long timescales, the mathematical Brownian motion is well described by a Langevin equation. On small timescales, inertial effects are prevalent in the Langevin equation. However the mathematical Brownian motion is exempt of such inertial effects. Note that inertial effects have to be considered in the Langevin equation, otherwise the equation becomes singular. so that simply removing the inertia term from this equation would not yield an exact description, but rather a singular behavior in which the particle doesn't move at all. Physics The diffusion equation yields an approximation of the time evolution of the probability density function associated to the position of the particle going under a Brownian movement under the physical definition. The approximation is valid on short timescales. The time evolution of the position of the Brownian particle itself is best described using Langevin equation, an equation which involves a random force field representing the effect of the thermal fluctuations of the solvent on the particle. The displacement of a particle undergoing Brownian motion is obtained by solving the diffusion equation under appropriate boundary conditions and finding the rms of the solution. This shows that the displacement varies as the square root of the time (not linearly), which explains why previous experimental results concerning the velocity of Brownian particles gave nonsensical results. A linear time dependence was incorrectly assumed. At very short time scales, however, the motion of a particle is dominated by its inertia and its displacement will be linearly dependent on time: Δx = vΔt. So the instantaneous velocity of the Brownian motion can be measured as v = Δx/Δt, when Δt << τ, where τ is the momentum relaxation time. In 2010, the instantaneous velocity of a Brownian particle (a glass microsphere trapped in air with an optical tweezer) was measured successfully.[19] The velocity data verified the Maxwell-Boltzmann velocity distribution, and the equipartition theorem for a Brownian particle. The Brownian motion can be modeled by a random walk.[20] Random walks in porous media or fractals are anomalous.[21] In the general case, Brownian motion is a non-Markov random process and described by stochastic integral equations.[22] Lévy characterisation The French mathematician Paul Lévy proved the following theorem, which gives a necessary and sufficient condition for a continuous Rn-valued stochastic process X to actually be n-dimensional Brownian motion. Hence, Lévy's condition can actually be used as an alternative definition of Brownian motion. Let X = (X1, ..., Xn) be a continuous stochastic process on a probability space (Ω, Σ, P) taking values in Rn. Then the following are equivalent: 1. X is a Brownian motion with respect to P, i.e., the law of X with respect to P is the same as the law of an n-dimensional Brownian motion, i.e., the push-forward measure X∗(P) is classical Wiener measure on C0([0, +∞); Rn). 2. both 1. X is a martingale with respect to P (and its own natural filtration); and 2. 
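The ballistic-to-diffusive crossover described above, in which inertia matters only below the momentum relaxation time τ = m/γ, can be illustrated with a small Euler–Maruyama integration of an underdamped Langevin equation. This is a minimal sketch under arbitrary parameter choices (units with k_BT = m = γ = 1), not the source's own model: at times much shorter than τ the mean squared displacement grows roughly like (k_BT/m)t² (ballistic), and at long times like 2(k_BT/γ)t (diffusive).

```python
import math
import random

def langevin_msd(t_final, dt=1e-3, gamma=1.0, kT=1.0, m=1.0, n_paths=500):
    """Mean squared displacement at time t_final for the underdamped Langevin
    equation m dv = -gamma*v dt + sqrt(2*gamma*kT) dW, integrated by Euler-Maruyama,
    with the velocity initialised from the Maxwell distribution."""
    n_steps = int(round(t_final / dt))
    kick = math.sqrt(2.0 * gamma * kT * dt)   # standard deviation of the noise increment
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        v = random.gauss(0.0, math.sqrt(kT / m))   # thermal initial velocity
        for _ in range(n_steps):
            v += (-gamma * v * dt + kick * random.gauss(0.0, 1.0)) / m
            x += v * dt
        total += x * x
    return total / n_paths

if __name__ == "__main__":
    tau = 1.0   # momentum relaxation time m/gamma for these parameters
    for t in (0.01 * tau, 10.0 * tau):
        # Expected: about (kT/m) * t**2 for t << tau, about 2*(kT/gamma)*t for t >> tau.
        print(t, langevin_msd(t))
```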
for all 1 ≤ i, j ≤ n, Xi(t)Xj(t) −δijt is a martingale with respect to P (and its own natural filtration), where δij denotes the Kronecker delta.
  • 67. Brownian motion 64 Riemannian manifold The infinitesimal generator (and hence characteristic operator) of a Brownian motion on Rn is easily calculated to be ½Δ, where Δ denotes the Laplace operator. This observation is useful in defining Brownian motion on an m-dimensional Riemannian manifold (M, g): a Brownian motion on M is defined to be a diffusion on M whose characteristic operator in local coordinates xi, 1 ≤ i ≤ m, is given by ½ΔLB, where ΔLB is the Laplace–Beltrami operator given in local coordinates by Brownian motion on a 2-sphere where [gij] = [gij]−1 in the sense of the inverse of a square matrix. Gravitational motion In stellar dynamics, a massive body (star, black hole, etc.) can experience Brownian motion as it responds to gravitational forces from surrounding stars.[23] The rms velocity of the massive object, of mass , is related to the rms velocity of the background stars by where is the mass of the background stars. The gravitational force from the massive object causes nearby stars to move faster than they otherwise would, increasing both and .[23] The Brownian velocity of Sgr A*, the supermassive black hole at the center of the Milky Way galaxy, is predicted from this formula to be less than 1 km s−1.[24] Narrow escape The Narrow escape problem is a ubiquitous problem in biology, biophysics and cellular biology which has the following formulation: a Brownian particle (ion, molecule, or protein) is confined to a bounded domain (a compartment or a cell) by a reflecting boundary, except for a small window through which it can escape. The narrow escape problem is that of calculating the mean escape time. This time diverges as the window shrinks, thus rendering the calculation a singular perturbation problem. References [1] Mörters, Peter; Peres, Yuval (25 May 2008). Brownian Motion (http:/ / www. stat. berkeley. edu/ ~peres/ bmbook. pdf). . Retrieved 25 May 2008. [2] Mandelbrot, B.; Hudson, R. (2004). The (Mis)behavior of Markets: A Fractal View of Risk, Ruin, and Reward. New York: Basic Books. ISBN 0465043550. [3] Perrin, 1914, p. 115 (http:/ / www. archive. org/ stream/ atomsper00perruoft#page/ 115/ mode/ 1up) [4] Chandrasekhar, S. (1943). "Stochastic problems in physics and astronomy". Reviews of Modern Physics 15 (1): 1–89. doi:10.1103/RevModPhys.15.1. MR0008130. [5] A. Einstein, Investigations of the Theory of Brownian Movement (Dover, 1956). [6] Lavenda, Bernard H. (1985). Nonequilibrium Statistical Thermodynamics. John Wiley & Sons Inc.. pp. 20. ISBN 0-471-90670-0. [7] "Electricity and Matter" (Yale University Press, New Haven, 1904), pp 80-83 [8] Zeit. Phys. Chem. 9 (1888) 613. [9] J. Leveugle, La Relativite', Poincaré' et Einstein, Planck, Hilbert (L'Harmattan, Paris, 2004) p. 181
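The local-coordinate formula for the Laplace–Beltrami operator referred to above was dropped in extraction. In its standard form (a reconstruction, using the usual notation with g = [g_ij] and summation over repeated indices 1 ≤ i, j ≤ m),

\Delta_{\mathrm{LB}} f = \frac{1}{\sqrt{\det g}}\; \partial_i\!\left( \sqrt{\det g}\; g^{ij}\, \partial_j f \right) ,

so that Brownian motion on (M, g) is the diffusion whose characteristic operator in local coordinates is ½Δ_LB.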
  • 68. Brownian motion 65 [10] J.E.S. Townsend, "Electricity in Gases" (Clarendon Press, Oxford, 1915), p. 254. [11] See P. Clark 1976, p. 97 [12] See P. Clark 1976 for this whole paragraph [13] http:/ / galileo. phys. virginia. edu/ classes/ 109N/ more_stuff/ Applets/ brownian/ brownian. html [14] Smoluchowski, M. (1906), Bull. Int. de l'Acad. des Sci. de Cracovie: 202 [15] Sommerfeld, A. (15 November 1917), Phys. Zeit.: 535 [16] loc. cite p. 577 [17] Whithworth, W. A. (1965), Choice and Chance, Hafner Pub. Co. [18] Smoluchowski, M. (1906), "Zur kinetischen Theorie der Brownschen Molekularbewegung und der Suspensionen", Annalen der Physik 326 (14): 756–780, Bibcode 1906AnP...326..756V, doi:10.1002/andp.19063261405 [19] Li, Tongcang; Kheifets, Simon; Medellin, David; Raizen, Mark (June 2010). "Measurement of the instantaneous velocity of a Brownian particle". Science 328 (5986): 1673–1675. Bibcode 2010Sci...328.1673L. doi:10.1126/science.1189403. [20] Weiss, G. H. (1994). Aspects and applications of the random walk. North Holland. [21] Ben-Avraham, D.; Havlin, S. (2000). Diffusion and reaction in disordered systems (http:/ / havlin. biu. ac. il/ Shlomo Havlin books_d_r. php). Cambridge University Press. . [22] Morozov, A. N.; Skripkin, A. V. (2011). "Spherical particle Brownian motion in viscous medium as non-Markovian random process". Physics Letters A 375 (46): 4113–4115. Bibcode 2011PhLA..375.4113M. doi:10.1016/j.physleta.2011.10.001. [23] Merritt, D.; Berczik, P.; Laun, F. (February 2007), "Brownian motion of black holes in dense nuclei", The Astronomical Journal 133 (2): 553–563, arXiv:astro-ph/0408029, Bibcode 2007AJ....133..553M, doi:10.1086/510294 [24] Reid, M. J.; Brunthaler, A. (December 2004), "The Proper Motion of Sagittarius A*. II. The Mass of Sagittarius A*", The Astrophysical Journal 616 (2): 872–884, arXiv:astro-ph/0408107, Bibcode 2004ApJ...616..872R, doi:10.1086/424960 Further reading • Brown, Robert, "A brief account of microscopical observations made in the months of June, July and August, 1827, on the particles contained in the pollen of plants; and on the general existence of active molecules in organic and inorganic bodies." Phil. Mag. 4, 161–173, 1828. (PDF version of original paper including a subsequent defense by Brown of his original observations, Additional remarks on active molecules.) (http:// sciweb.nybg.org/science2/pdfs/dws/Brownian.pdf) • Chaudesaigues, M. (1908). "Le mouvement brownien et la formule d'Einstein". Comptes Rendus 147: 1044–6. • Clark, P. (1976) 'Atomism versus thermodynamics' in Method and appraisal in the physical sciences, Colin Howson (Ed), Cambridge University Press 1976 • Einstein, A. (1905), "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen." (http://www.physik.uni-augsburg.de/annalen/history/ einstein-papers/1905_17_549-560.pdf), Annalen der Physik 17: 549–560, Bibcode 1905AnP...322..549E, doi:10.1002/andp.19053220806 • Einstein, A. "Investigations on the Theory of Brownian Movement". New York: Dover, 1956. ISBN 0-486-60304-0 (http://lorentz.phl.jhu.edu/AnnusMirabilis/AeReserveArticles/eins_brownian.pdf) • Henri, V(1908) 'Etudes cinematographique du mouvement brownien' Comptes Rendus 146 pp 1024–6 • Lucretius, 'On The Nature of Things.', translated by William Ellery Leonard. 
( on-line version (http:// onlinebooks.library.upenn.edu/webbin/gutbook/lookup?num=785), from Project Gutenberg. see the heading 'Atomic Motions'; this translation differs slightly from the one quoted). • Pearle, P., Collett, B., Bart, K., Bilderback, D., Newman, D., and Samuels, S. (2010) What Brown saw and you can too (http://ajp.aapt.org/resource/1/ajpias/v78/i12/p1278_s1). Am. J. Phys. 78: 1278-1289. • Nelson, Edward, Dynamical Theories of Brownian Motion (1967)   (PDF version of this out-of-print book, from the author's webpage.) (http://www.math.princeton.edu/~nelson/books.html) • J. Perrin, "Mouvement brownien et réalité moléculaire". Ann. Chim. Phys. 8ième série 18, 5–114 (1909). See also Perrin's book "Les Atomes" (1914). • Ruben D. Cohen (1986) "Self Similarity in Brownian Motion and Other Ergodic Phenomena", Journal of Chemical Education 63, pp. 933–934 (http://rdcohen.50megs.com/BrownianMotion.pdf) • Smoluchowski, M. (1906), "Zur kinetischen Theorie der Brownschen Molekularbewegung und der Suspensionen" (http://gallica.bnf.fr/ark:/12148/bpt6k15328k/f770.chemindefer), Annalen der Physik 21: 756–780,
  • 69. Brownian motion 66 Bibcode 1906AnP...326..756V, doi:10.1002/andp.19063261405 • Svedberg, T. Studien zur Lehre von den kolloiden Losungen 1907 • Theile, T. N. Danish version: "Om Anvendelse af mindste Kvadraters Methode i nogle Tilfælde, hvor en Komplikation af visse Slags uensartede tilfældige Fejlkilder giver Fejlene en ‘systematisk’ Karakter". French version: "Sur la compensation de quelques erreurs quasi-systématiques par la méthodes de moindre carrés" published simultaneously in Vidensk. Selsk. Skr. 5. Rk., naturvid. og mat. Afd., 12:381–408, 1880. External links • Brownian motion java simulation (http://galileo.phys.virginia.edu/classes/109N/more_stuff/Applets/ brownian/applet.html) • Article for the school-going child (http://xxx.imsc.res.in/abs/physics/0412132) • Einstein on Brownian Motion (http://www.bun.kyoto-u.ac.jp/~suchii/einsteinBM.html) • Brownian Motion, "Diverse and Undulating" (http://arxiv.org/abs/0705.1951) • Discusses history, botany and physics of Brown's original observations, with videos (http://physerver.hamilton. edu/Research/Brownian/index.html) • "Einstein's prediction finally witnessed one century later" (http://www.gizmag.com/ einsteins-prediction-finally-witnessed/16212/) : a test to observe the velocity of Brownian motion • "Café math : brownian motion (Part 1)" (http://cafemath.kegtux.org/mathblog/article. php?page=BrownianMotion.php) : A blog article describing brownian motion (definition, symmetries, simulation) Wiener process In mathematics, the Wiener process is a continuous-time stochastic process named in honor of Norbert Wiener. It is often called standard Brownian motion, after Robert Brown. It is one of the best known Lévy processes (càdlàg stochastic processes with stationary independent increments) and occurs frequently in pure and applied mathematics, economics, quantitative finance and physics. The Wiener process plays an important role both in pure and applied mathematics. In pure mathematics, the Wiener process gave rise to the study of continuous time A single realization of a one-dimensional Wiener process martingales. It is a key process in terms of which more complicated stochastic processes can be described. As such, it plays a vital role in stochastic calculus, diffusion processes and even potential theory. It is the driving process of Schramm–Loewner
  • 70. Wiener process 67 evolution. In applied mathematics, the Wiener process is used to represent the integral of a Gaussian white noise process, and so is useful as a model of noise in electronics engineering, instrument errors in filtering theory and unknown forces in control theory. The Wiener process has applications throughout the mathematical sciences. In physics it is used to study Brownian motion, the diffusion of minute particles suspended in fluid, and other types of diffusion via the Fokker–Planck and Langevin equations. It also forms the basis for the rigorous path integral formulation of quantum mechanics (by the Feynman–Kac formula, a solution to the Schrödinger equation can be represented in terms of the Wiener process) and the A single realization of a three-dimensional Wiener process study of eternal inflation in physical cosmology. It is also prominent in the mathematical theory of finance, in particular the Black–Scholes option pricing model. Characterizations of the Wiener process The Wiener process Wt is characterized by three properties:[1] 1. W0 = 0 2. The function t → Wt is almost surely everywhere continuous 3. Wt has independent increments with Wt−Ws ~ N(0, t−s) (for 0 ≤ s < t). N(μ, σ2) denotes the normal distribution with expected value μ and variance σ2. The condition that it has independent increments means that if 0 ≤ s1 < t1 ≤ s2 < t2 then Wt1−Ws1 and Wt2−Ws2 are independent random variables, and the similar condition holds for n increments. An alternative characterization of the Wiener process is the so-called Lévy characterization that says that the Wiener process is an almost surely continuous martingale with W0 = 0 and quadratic variation [Wt, Wt] = t (which means that Wt2−t is also a martingale). A third characterization is that the Wiener process has a spectral representation as a sine series whose coefficients are independent N(0,1) random variables. This representation can be obtained using the Karhunen–Loève theorem. The Wiener process can be constructed as the scaling limit of a random walk, or other discrete-time stochastic processes with stationary independent increments. This is known as Donsker's theorem. Like the random walk, the Wiener process is recurrent in one or two dimensions (meaning that it returns almost surely to any fixed neighborhood of the origin infinitely often) whereas it is not recurrent in dimensions three and higher. Unlike the random walk, it is scale invariant, meaning that is a Wiener process for any nonzero constant α. The Wiener measure is the probability law on the space of continuous functions g, with g(0) = 0, induced by the Wiener process. An integral based on Wiener measure may be called a Wiener integral.
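The three defining properties above translate directly into a simulation recipe: partition [0, T] into small steps and accumulate independent N(0, dt) increments, starting from W0 = 0 (this also mirrors the random-walk scaling limit of Donsker's theorem). A minimal Python sketch, with all function and variable names illustrative:

```python
import numpy as np

def simulate_wiener(T=1.0, n_steps=1000, seed=0):
    """Simulate one path of a standard Wiener process on [0, T].

    Uses the defining property that increments over disjoint intervals
    are independent N(0, dt) random variables, with W_0 = 0.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    increments = rng.normal(loc=0.0, scale=np.sqrt(dt), size=n_steps)
    w = np.concatenate(([0.0], np.cumsum(increments)))  # W_0 = 0
    t = np.linspace(0.0, T, n_steps + 1)
    return t, w

t, w = simulate_wiener()
```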
  • 71. Wiener process 68 Properties of a one-dimensional Wiener process Basic properties The unconditional probability density function at a fixed time t: The expectation is zero: The variance, using the computational formula, is t: The covariance and correlation: The results for the expectation and variance follow immediately from the definition that increments have a normal distribution, centered at zero. Thus The results for the covariance and correlation follow from the definition that non-overlapping increments are independent, of which only the property that they are uncorrelated is used. Suppose that t1 < t2. Substitute the simple identity : Since W(t1) = W(t1)−W(t0) and W(t2)−W(t1), are independent, Thus Running maximum The joint distribution of the running maximum and Wt is To get the unconditional distribution of , integrate over −∞ < w ≤ m And the expectation[2]
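For reference, the standard forms of the quantities discussed in this subsection (stated here as a sketch, in the conventions used elsewhere in this article) are:

\[
f_{W_t}(x) = \frac{1}{\sqrt{2\pi t}}\,e^{-x^2/(2t)},\qquad
\mathbb{E}[W_t]=0,\qquad \operatorname{Var}(W_t)=t,
\]
\[
\operatorname{cov}(W_s,W_t)=\min(s,t),\qquad
\operatorname{corr}(W_s,W_t)=\sqrt{\frac{\min(s,t)}{\max(s,t)}},
\]

and, for the running maximum \(M_t=\max_{0\le s\le t}W_s\),

\[
f_{M_t,W_t}(m,w)=\frac{2(2m-w)}{t\sqrt{2\pi t}}\,e^{-(2m-w)^2/(2t)}\quad (m\ge 0,\ w\le m),\qquad
\mathbb{P}(M_t\ge m)=2\,\mathbb{P}(W_t\ge m),\qquad
\mathbb{E}[M_t]=\sqrt{\frac{2t}{\pi}} .
\]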
  • 72. Wiener process 69 Self-similarity Brownian scaling For every c>0 the process is another Wiener process. Time reversal The process for 0 ≤ t ≤ 1 is distributed like Wt for 0 ≤ t ≤ 1. Time inversion The process is another Wiener process. A class of Brownian martingales If a polynomial p(x, t) satisfies the PDE then the stochastic process is a martingale. Example: is a martingale, which shows that the quadratic variation of W on [0, t] is equal to t. It follows that the expected time of first exit of W from (−c, c) is equal to More generally, for every polynomial p(x, t) the following stochastic process is a martingale: where a is the polynomial Example: the process is a martingale, which shows that the quadratic variation of the martingale on [0, t] is equal to About functions p(xa, t) more general than polynomials, see local martingales. Some properties of sample paths The set of all functions w with these properties is of full Wiener measure. That is, a path (sample function) of the Wiener process has all these properties almost surely. Qualitative properties • For every ε > 0, the function w takes both (strictly) positive and (strictly) negative values on (0, ε). • The function w is continuous everywhere but differentiable nowhere (like the Weierstrass function). • Points of local maximum of the function w are a dense countable set; the maximum values are pairwise different; each local maximum is sharp in the following sense: if w has a local maximum at t then The same holds for local minima.
  • 73. Wiener process 70 • The function w has no points of local increase, that is, no t > 0 satisfies the following for some ε in (0, t): first, w(s) ≤ w(t) for all s in (t − ε, t), and second, w(s) ≥ w(t) for all s in (t, t + ε). (Local increase is a weaker condition than that w is increasing on (t − ε, t + ε).) The same holds for local decrease. • The function w is of unbounded variation on every interval. • Zeros of the function w are a nowhere dense perfect set of Lebesgue measure 0 and Hausdorff dimension 1/2 (therefore, uncountable). Quantitative properties Law of the iterated logarithm Modulus of continuity Local modulus of continuity: Global modulus of continuity (Lévy): Local time The image of the Lebesgue measure on [0, t] under the map w (the pushforward measure) has a density Lt(·). Thus, for a wide class of functions f (namely: all continuous functions; all locally integrable functions; all non-negative measurable functions). The density Lt is (more exactly, can and will be chosen to be) continuous. The number Lt(x) is called the local time at x of w on [0, t]. It is strictly positive for all x of the interval (a, b) where a and b are the least and the greatest value of w on [0, t], respectively. (For x outside this interval the local time evidently vanishes.) Treated as a function of two variables x and t, the local time is still continuous. Treated as a function of t (while x is fixed), the local time is a singular function corresponding to a nonatomic measure on the set of zeros of w. These continuity properties are fairly non-trivial. Consider that the local time can also be defined (as the density of the pushforward measure) for a smooth function. Then, however, the density is discontinuous, unless the given function is monotone. In other words, there is a conflict between good behavior of a function and good behavior of its local time. In this sense, the continuity of the local time of the Wiener process is another manifestation of non-smoothness of the trajectory.
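The transformations and limit laws referred to on this and the preceding page have the following standard forms (a sketch, not a complete treatment). Brownian scaling and time inversion assert that

\[
V_t=\tfrac{1}{\sqrt{c}}\,W_{ct}\ (c>0)\qquad\text{and}\qquad V_t=t\,W_{1/t}\ (V_0=0)
\]

are again Wiener processes, and time reversal asserts that \(V_t=W_1-W_{1-t}\) on [0, 1] is distributed like \(W_t\) on [0, 1]. If a polynomial \(p(x,t)\) satisfies \(\partial_t p+\tfrac12\,\partial_{xx}p=0\), then \(p(W_t,t)\) is a martingale; the example \(W_t^2-t\) gives the quadratic variation \([W,W]_t=t\) and the expected exit time \(c^2\) from \((-c,c)\). The quantitative properties read

\[
\limsup_{t\to\infty}\frac{|W_t|}{\sqrt{2t\log\log t}}=1\ \text{a.s.},\qquad
\limsup_{h\to 0^+}\frac{|W_{t+h}-W_t|}{\sqrt{2h\log\log(1/h)}}=1\ \text{a.s. (fixed }t\text{)},\qquad
\limsup_{h\to 0^+}\ \sup_{0\le t\le 1-h}\frac{|W_{t+h}-W_t|}{\sqrt{2h\log(1/h)}}=1\ \text{a.s.}
\]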
Related processes

The stochastic process defined by Xt = μt + σWt is called a Wiener process with drift μ and infinitesimal variance σ2. These processes exhaust continuous Lévy processes.

Two random processes on the time interval [0, 1] appear, roughly speaking, when conditioning the Wiener process to vanish on both ends of [0, 1]. With no further conditioning, the process takes both positive and negative values on [0, 1] and is called the Brownian bridge. Conditioned also to stay positive on (0, 1), the process is called the Brownian excursion.[3] In both cases a rigorous treatment involves a limiting procedure, since the formula P(A|B) = P(A ∩ B)/P(B) does not apply when P(B) = 0.

[Figure: Brownian motion on a special manifold, the surface of a sphere. The generator of a Brownian motion is ½ times the Laplace–Beltrami operator.]

A geometric Brownian motion can be written St = S0 exp((μ − σ2/2)t + σWt). It is a stochastic process which is used to model processes that can never take on negative values, such as the value of stocks.

The stochastic process e−t W(e2t) is distributed like the (stationary) Ornstein–Uhlenbeck process.

The time of hitting a single point x > 0 by the Wiener process is a random variable with the Lévy distribution. The family of these random variables (indexed by all positive numbers x) is a left-continuous modification of a Lévy process. The right-continuous modification of this process is given by times of first exit from closed intervals [0, x].

The local time Lt(0), treated as a random function of t, is a random process distributed like the running maximum max0≤s≤t Ws. The local time Lt(x), treated as a random function of x (while t is constant), is a random process described by the Ray–Knight theorems in terms of Bessel processes.

Brownian martingales

Let A be an event related to the Wiener process (more formally: a set, measurable with respect to the Wiener measure, in the space of functions), and Xt the conditional probability of A given the Wiener process on the time interval [0, t] (more formally: the Wiener measure of the set of trajectories whose concatenation with the given partial trajectory on [0, t] belongs to A). Then the process Xt is a continuous martingale. Its martingale property follows immediately from the definitions, but its continuity is a very special fact – a special case of a general theorem stating that all Brownian martingales are continuous. A Brownian martingale is, by definition, a martingale adapted to the Brownian filtration; and the Brownian filtration is, by definition, the filtration generated by the Wiener process.
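The drifted Wiener process and the geometric Brownian motion described under Related processes lend themselves to direct simulation from a single underlying Wiener path. A minimal Python sketch; the parameter names mu, sigma and s0 are illustrative assumptions, not standard library names:

```python
import numpy as np

# Build one Wiener path, then a drifted version and a geometric Brownian motion
# driven by the same path (illustrative parameter values).
rng = np.random.default_rng(42)
T, n = 1.0, 1000
dt = T / n
t = np.linspace(0.0, T, n + 1)
w = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))))

mu, sigma, s0 = 0.05, 0.2, 100.0
x_drift = mu * t + sigma * w                                 # Wiener process with drift
s_gbm = s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w)   # geometric Brownian motion
```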
Integrated Brownian motion

The time-integral of the Wiener process is called integrated Brownian motion or the integrated Wiener process. It arises in many applications and can be shown to be normally distributed with zero mean and variance t3/3, calculated using the fact that the covariance of the Wiener process is min(s, t).

Time change

Every continuous martingale (starting at the origin) is a time-changed Wiener process. Example: 2Wt = V(4t), where V is another Wiener process (different from W but distributed like W). Example: Wt2 − t = V(A(t)), where A(t) = 4∫0t Ws2 ds and V is another Wiener process. In general, if M is a continuous martingale, then Mt − M0 = V(A(t)), where A(t) is the quadratic variation of M on [0, t] and V is a Wiener process.

Corollary (see also Doob's martingale convergence theorems). Let Mt be a continuous martingale, and let M−∞ = lim inf t→∞ Mt and M+∞ = lim sup t→∞ Mt. Then only the following two cases are possible: either both limits are finite and equal (so Mt converges), or M−∞ = −∞ and M+∞ = +∞; other cases (such as M+∞ = +∞ with M−∞ finite) are of probability 0. In particular, a nonnegative continuous martingale has a finite limit (as t → ∞) almost surely. Everything stated in this subsection for martingales also holds for local martingales.

Change of measure

A wide class of continuous semimartingales (especially, of diffusion processes) is related to the Wiener process via a combination of time change and change of measure. Using this fact, the qualitative properties stated above for the Wiener process can be generalized to a wide class of continuous semimartingales.

Complex-valued Wiener process

The complex-valued Wiener process may be defined as a complex-valued random process of the form Zt = Xt + iYt, where Xt and Yt are independent real-valued Wiener processes.[4]
  • 76. Wiener process 73 Self-similarity Brownian scaling, time reversal, time inversion: the same as in the real-valued case. Rotation invariance: for every complex number c such that |c|=1 the process cZt is another complex-valued Wiener process. Time change If f is an entire function then the process is a time-changed complex-valued Wiener process. Example. where and U is another complex-valued Wiener process. In contrast to the real-valued case, a complex-valued martingale is generally not a time-changed complex-valued Wiener process. For example, the martingale 2Xt + iYt is not (here Xt, Yt are independent Wiener processes, as before). Notes [1] Durrett 1996, Sect. 7.1 [2] Shreve, Steven E (2008). Stochastic Calculus for Finance II: Continuous Time Models. Springer. pp. 114. ISBN 978-0-387-40101-0. [3] Vervaat, W. (1979). "A relation between Brownian bridge and Brownian excursion". Ann. Prob. 7 (1): 143–149. JSTOR 2242845. [4] Navarro-moreno, J.; Estudillo-martinez, M.D; Fernandez-alcala, R.M.; Ruiz-molina, J.C., "Estimation of Improper Complex-Valued Random Signals in Colored Noise by Using the Hilbert Space Theory" (http:/ / ieeexplore. ieee. org/ Xplore/ login. jsp?url=http:/ / ieeexplore. ieee. org/ iel5/ 18/ 4957623/ 04957648. pdf?arnumber=4957648& authDecision=-203), IEEE Transactions on Information Theory 55 (6): 2859–2867, doi:10.1109/TIT.2009.2018329, , retrieved 2010-03-30 References • Kleinert, Hagen, Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th edition, World Scientific (Singapore, 2004); Paperback ISBN 981-238-107-4 (also available online: PDF-files (http://www.physik.fu-berlin.de/~kleinert/b5)) • Stark,Henry, John W. Woods, Probability and Random Processes with Applications to Signal Processing, 3rd edition, Prentice Hall (New Jersey, 2002); Textbook ISBN 0-13-020071-9 • Durrett, R. (2000) Probability: theory and examples,4th edition. Cambridge University Press, ISBN 0-521-76539-0 • Daniel Revuz and Marc Yor, Continuous martingales and Brownian motion, second edition, Springer-Verlag 1994. External links • Brownian motion java simulation (http://galileo.phys.virginia.edu/classes/109N/more_stuff/Applets/ brownian/applet.html) • Article for the school-going child (http://xxx.imsc.res.in/abs/physics/0412132) • Einstein on Brownian Motion (http://www.bun.kyoto-u.ac.jp/~suchii/einsteinBM.html) • Brownian Motion, "Diverse and Undulating" (http://arxiv.org/abs/0705.1951) • Short Movie on Brownian Motion (http://www.composite-agency.com/brownian_movement.htm) • Discusses history, botany and physics of Brown's original observations, with videos (http://physerver.hamilton. edu/Research/Brownian/index.html) • "Einstein's prediction finally witnessed one century later" (http://www.gizmag.com/ einsteins-prediction-finally-witnessed/16212/) : a test to observe the velocity of Brownian motion
  • 77. Autoregressive model 74 Autoregressive model In statistics and signal processing, an autoregressive (AR) model is a type of random process which is often used to model and predict various types of natural phenomena. The autoregressive model is one of a group of linear prediction formulas that attempt to predict an output of a system based on the previous outputs. Definition The notation AR(p) indicates an autoregressive model of order p. The AR(p) model is defined as where are the parameters of the model, is a constant (often omitted for simplicity) and is white noise. An autoregressive model can thus be viewed as the output of an all-pole infinite impulse response filter whose input is white noise. Some constraints are necessary on the values of the parameters of this model in order that the model remains wide-sense stationary. For example, processes in the AR(1) model with |φ1| ≥ 1 are not stationary. More generally, for an AR(p) model to be wide-sense stationary, the roots of the polynomial must lie within the unit circle, i.e., each root must satisfy . Characteristic polynomial The autocorrelation function of an AR(p) process can be expressed as where are the roots of the polynomial The autocorrelation function of an AR(p) process is a sum of decaying exponentials. • Each real root contributes a component to the autocorrelation function that decays exponentially. • Similarly, each pair of complex conjugate roots contributes an exponentially damped oscillation. Graphs of AR(p) processes The simplest AR process is AR(0), which has no dependence between the terms. Only the error/innovation/noise term contributes to the output of the process, so in the figure, AR(0) corresponds to white noise. For an AR(1) process with a positive , only the previous term in the process and the noise term contribute to the output. If is close to 0, then the process still looks like white noise, but as approaches 1, the output gets a larger contribution from the previous term relative to the noise. This results in a "smoothing" or integration of the output, similar AR(0), AR(1), and AR(2) processes with white to a low pass filter. noise
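A minimal simulation sketch of the AR(p) recursion just defined, with zero initial conditions assumed and all names illustrative; generating paths for different coefficient choices reproduces the white-noise, "smoothed", and oscillating behaviours described above:

```python
import numpy as np

def simulate_ar(phi, n, c=0.0, sigma=1.0, seed=0):
    """Simulate an AR(p) process X_t = c + sum_i phi_i * X_{t-i} + eps_t,
    with eps_t ~ N(0, sigma^2) white noise (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    x = np.zeros(n + p)                  # zero initial conditions
    eps = rng.normal(0.0, sigma, size=n + p)
    for t in range(p, n + p):
        past = x[t - p:t][::-1]          # X_{t-1}, ..., X_{t-p}
        x[t] = c + np.dot(phi, past) + eps[t]
    return x[p:]

x_white = simulate_ar([], 500)           # AR(0): pure white noise
x_smooth = simulate_ar([0.9], 500)       # AR(1), phi near 1: low-pass behaviour
x_osc = simulate_ar([0.5, -0.4], 500)    # AR(2) with sign changes favoured
```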
  • 78. Autoregressive model 75 For an AR(2) process, the previous two terms and the noise term contribute to the output. If both and are positive, the output will resemble a low pass filter, with the high frequency part of the noise decreased. If is positive while is negative, then the process favors changes in sign between terms of the process. The output oscillates. Example: An AR(1)-process An AR(1)-process is given by: where is a white noise process with zero mean and variance . (Note: The subscript on has been dropped.) The process is wide-sense stationary if since it is obtained as the output of a stable filter whose input is white noise. (If then has infinite variance, and is therefore not wide sense stationary.) Consequently, assuming , the mean is identical for all values of t. If the mean is denoted by , it follows from that and hence In particular, if , then the mean is 0. The variance is where is the standard deviation of . This can be shown by noting that and then by noticing that the quantity above is a stable fixed point of this relation. The autocovariance is given by It can be seen that the autocovariance function decays with a decay time (also called time constant) of [to see this, write where is independent of . Then note that and match this to the exponential decay law ]. The spectral density function is the Fourier transform of the autocovariance function. In discrete terms this will be the discrete-time Fourier transform: This expression is periodic due to the discrete nature of the , which is manifested as the cosine term in the denominator. If we assume that the sampling time ( ) is much smaller than the decay time ( ), then we can use a continuum approximation to : which yields a Lorentzian profile for the spectral density:
  • 79. Autoregressive model 76 where is the angular frequency associated with the decay time . An alternative expression for can be derived by first substituting for in the defining equation. Continuing this process N times yields For N approaching infinity, will approach zero and: It is seen that is white noise convolved with the kernel plus the constant mean. If the white noise is a Gaussian process then is also a Gaussian process. In other cases, the central limit theorem indicates that will be approximately normally distributed when is close to one. Calculation of the AR parameters There are many ways to estimate the coefficients, such as the ordinary least squares procedure, method of moments (through Yule Walker equations), or Markov chain Monte Carlo methods. The AR(p) model is given by the equation It is based on parameters where i = 1, ..., p. There is a direct correspondence between these parameters and the covariance function of the process, and this correspondence can be inverted to determine the parameters from the autocorrelation function (which is itself obtained from the covariances). This is done using the Yule-Walker equations. Yule-Walker equations The Yule-Walker equations are the following set of equations. where m = 0, ..., p, yielding p + 1 equations. Here is the autocorrelation function of Xt, is the standard deviation of the input noise process, and is the Kronecker delta function. Because the last part of an individual equation is non-zero only if m = 0, the set of equations can be solved by representing the equations for m > 0 in matrix form, thus getting the equation which can be solved for all The remaining equation for m = 0 is which, once are known, can be solved for
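For the AR(1) case discussed above, the standard closed forms (a sketch in the same notation) are

\[
X_t = c + \varphi X_{t-1} + \varepsilon_t,\qquad
\mu=\frac{c}{1-\varphi},\qquad
\operatorname{Var}(X_t)=\frac{\sigma_\varepsilon^2}{1-\varphi^2},\qquad
B_n=\frac{\sigma_\varepsilon^2}{1-\varphi^2}\,\varphi^{|n|},
\]

with autocovariance decay time \(\tau=-1/\ln|\varphi|\). The Yule–Walker system itself can be solved numerically from sample autocovariances; a Python sketch (illustrative, not any particular package's estimator):

```python
import numpy as np

def yule_walker(x, p):
    """Estimate AR(p) coefficients and innovation variance from a series x
    via the Yule-Walker equations, using sample autocovariances (sketch)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    gamma = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    R = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    r = gamma[1:p + 1]
    phi = np.linalg.solve(R, r)          # AR coefficients
    sigma2 = gamma[0] - np.dot(phi, r)   # innovation variance (m = 0 equation)
    return phi, sigma2

# Quick check on a simulated AR(2) series with phi = (0.5, -0.3), sigma = 1.
rng = np.random.default_rng(1)
eps = rng.normal(size=2000)
x = np.zeros(2000)
for t in range(2, 2000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + eps[t]
phi_hat, s2_hat = yule_walker(x, p=2)    # should be close to (0.5, -0.3) and 1.0
```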
An alternative formulation is in terms of the autocorrelation function. The AR parameters are determined by the first p+1 elements of the autocorrelation function. The full autocorrelation function can then be derived by recursively calculating ρ(τ) = Σk=1..p φk ρ(τ − k).[1]

Examples of some low-order AR(p) processes

• p=1
  • γ1 = φ1γ0
  • Hence ρ1 = γ1/γ0 = φ1
• p=2
  • The Yule-Walker equations for an AR(2) process are γ1 = φ1γ0 + φ2γ1 and γ2 = φ1γ1 + φ2γ0
  • Remember that γ−1 = γ1
  • Using the first equation yields ρ1 = γ1/γ0 = φ1/(1 − φ2)
  • Using the recursion formula yields ρ2 = γ2/γ0 = (φ12 − φ22 + φ2)/(1 − φ2)

Estimation of AR parameters

The above equations (the Yule-Walker equations) provide several routes to estimating the parameters of an AR(p) model, by replacing the theoretical covariances with estimated values. Some of these variants can be described as follows:

• Estimation of autocovariances or autocorrelations. Here each of these terms is estimated separately, using conventional estimates. There are different ways of doing this and the choice between these affects the properties of the estimation scheme. For example, negative estimates of the variance can be produced by some choices.
• Formulation as a least squares regression problem in which an ordinary least squares prediction problem is constructed, basing prediction of values of Xt on the p previous values of the same series. This can be thought of as a forward-prediction scheme. The normal equations for this problem can be seen to correspond to an approximation of the matrix form of the Yule-Walker equations in which each appearance of an autocovariance of the same lag is replaced by a slightly different estimate.
• Formulation as an extended form of the ordinary least squares prediction problem. Here two sets of prediction equations are combined into a single estimation scheme and a single set of normal equations. One set is the set of forward-prediction equations and the other is a corresponding set of backward-prediction equations, relating to the backward representation of the AR model. Here prediction of values of Xt would be based on the p future values of the same series. This way of estimating the AR parameters is due to Burg[2] and is called the Burg method.[3] Burg and later authors called these particular estimates "maximum entropy estimates",[4] but the reasoning behind this applies to the use of any set of estimated AR parameters. Compared to the estimation scheme using only the forward prediction equations, different estimates of the autocovariances are produced, and the estimates have different stability properties. Burg estimates are particularly associated with maximum entropy spectral estimation.[5]
  • 81. Autoregressive model 78 Other possible approaches to estimation include maximum likelihood estimation. Two distinct variants of maximum likelihood are available: in one (broadly equivalent to the forward prediction least squares scheme) the likelihood function considered is that corrresponding to the conditional distribution of later values in the series given the initial p values in the series; in the second, the likelihood function considered is that corrresponding to the unconditional joint distribution of all the values in the observed series. Substantial differences in the results of these approaches can occur if the observed series is short, or if the process is close to non-stationarity. Spectrum The power spectral density of an AR(p) process with noise variance is[1] AR(0) For white noise (AR(0)) AR(1) For AR(1) • If there is a single spectral peak at f=0, often referred to as red noise. As becomes nearer 1, there is stronger power at low frequencies, i.e. larger time lags. • If there is a minimum at f=0, often referred to as blue noise AR(2) AR(2) processes can be split into three groups depending on the characteristics of their roots: • When , the process has a pair of complex-conjugate roots, creating a mid-frequency peak at: Otherwise the process has real roots, and: • When it acts as a low-pass filter on the white noise with a spectral peak at • When it acts as a high-pass filter on the white noise with a spectral peak at . The process is stable when the roots are within the unit circle, or equivalently when the coefficients are in the triangle . The full PSD function can be expressed in real form as:
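The spectral shapes described above follow from the AR transfer function. A numerical Python sketch (normalization conventions for the PSD vary; this one omits any sampling-interval factor, and all names are illustrative):

```python
import numpy as np

def ar_spectrum(phi, sigma2=1.0, n_freq=512):
    """Power spectral density of an AR(p) process with coefficients phi and
    noise variance sigma2 on frequencies [0, 1/2] (sketch; normalization
    depends on the convention used)."""
    phi = np.asarray(phi, dtype=float)
    freqs = np.linspace(0.0, 0.5, n_freq)
    k = np.arange(1, len(phi) + 1)
    # 1 - sum_k phi_k * exp(-2 pi i k f) evaluated at each frequency
    denom = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k)) @ phi
    return freqs, sigma2 / np.abs(denom) ** 2

# AR(1) with phi > 0 concentrates power at low frequencies ("red noise");
# phi < 0 concentrates it near f = 1/2 ("blue noise").
f_red, s_red = ar_spectrum([0.8])
f_blue, s_blue = ar_spectrum([-0.8])
```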
  • 82. Autoregressive model 79 Implementations in statistics packages • R, the stats package includes an ar function.[6] • Matlab and Octave: the TSA toolbox contains several estimation functions for uni-variate, multivariate and adaptive autoregressive models.[7] Notes [1] Von Storch, H.; F. W Zwiers (2001). Statistical analysis in climate research. Cambridge Univ Pr. ISBN 0-521-01230-9. [2] Burg, J. P. (1968). "A new analysis technique for time series data". In Modern Spectrum Analysis (Edited by D. G. Childers), NATO Advanced Study Institute of Signal Processing with emphasis on Underwater Acoustics. IEEE Press, New York. [3] Brockwell, Peter J.; Dahlhaus, Rainer; Trindade, A. Alexandre (2005). "Modified Burg Algorithms for Multivariate Subset Autoregression" (http:/ / www3. stat. sinica. edu. tw/ statistica/ oldpdf/ A15n112. pdf). Statistica Sinica 15: 197–213. . [4] Burg, J.P. (1967) "Maximum Entropy Spectral Analysis", Proceedings of the 37th Meeting of the Society of Exploration Geophysicists, Oklahoma City, Oklahoma. [5] Bos, R.; De Waele, S.; Broersen, P. M. T. (2002). "Autoregressive spectral estimation by application of the burg algorithm to irregularly sampled data". IEEE Transactions on Instrumentation and Measurement 51 (6): 1289. doi:10.1109/TIM.2002.808031. [6] "Fit Autoregressive Models to Time Series" (http:/ / finzi. psych. upenn. edu/ R/ library/ stats/ html/ ar. html) (in R) [7] "Time Series Analysis toolbox for Matlab and Octave" (http:/ / pub. ist. ac. at/ ~schloegl/ matlab/ tsa/ ) References • Mills, Terence C. (1990) Time Series Techniques for Economists. Cambridge University Press • Percival, Donald B. and Andrew T. Walden. (1993) Spectral Analysis for Physical Applications. Cambridge University Press • Pandit, Sudhakar M. and Wu, Shien-Ming. (1983) Time Series and System Analysis with Applications. John Wiley & Sons • Yule, G. Udny (1927) "On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers" (http://visualiseur.bnf.fr/Visualiseur?Destination=Gallica&O=NUMM-56031), Philosophical Transactions of the Royal Society of London, Ser. A, Vol. 226, 267–298.] • Walker, Gilbert (1931) "On Periodicity in Series of Related Terms" (http://visualiseur.bnf.fr/ Visualiseur?Destination=Gallica&O=NUMM-56224), Proceedings of the Royal Society of London, Ser. A, Vol. 131, 518–532. External links • AutoRegression Analysis (AR) by Paul Bourke (http://paulbourke.net/miscellaneous/ar/)
  • 83. Moving average 80 Moving average In statistics, a moving average, also called rolling average, rolling mean or running average, is a type of finite impulse response filter used to analyze a set of data points by creating a series of averages of different subsets of the full data set. Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward", that is excluding the first number of the series and including the next number following the original subset in the series. This creates a new subset of numbers, which is averaged. This process is repeated over the entire data series. The plot line connecting all the (fixed) averages is the moving average. A moving average is a set of numbers, each of which is the average of the corresponding subset of a larger set of datum points. A moving average may also use unequal weights for each datum value in the subset to emphasize particular values in the subset. A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. For example, it is often used in technical analysis of financial data, like stock prices, returns or trading volumes. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. Mathematically, a moving average is a type of convolution and so it can be viewed as an example of a low-pass filter used in signal processing. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data. Simple moving average In financial applications a simple moving average (SMA) is the unweighted mean of the previous n datum points. However, in science and engineering the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time. An example of a simple unweighted running mean for a n-day sample of closing price is the mean of the previous n days' closing prices. If those prices are then the formula is When calculating successive values, a new value comes into the sum and an old value drops out, meaning a full summation each time is unnecessary for this simple case, The period selected depends on the type of movement of interest, such as short, intermediate, or long term. In financial terms moving average levels can be interpreted as support in a rising market, or resistance in a falling market. If the data used are not centred around the mean, a simple moving average lags behind the latest datum point by half the sample width. An SMA can also be disproportionately influenced by old datum points dropping out or new data coming in. One characteristic of the SMA is that if the data have a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.[1]
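A trailing simple moving average, as used in the financial convention described above, can be written as a short convolution. A Python sketch (names illustrative):

```python
import numpy as np

def simple_moving_average(prices, n):
    """Trailing n-point simple moving average: the unweighted mean of the
    previous n values, computed here as a convolution (sketch)."""
    prices = np.asarray(prices, dtype=float)
    window = np.ones(n) / n
    return np.convolve(prices, window, mode="valid")

p = np.array([11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0])
print(simple_moving_average(p, 5))  # [13. 14. 15.]
```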
  • 84. Moving average 81 For a number of applications it is advantageous to avoid the shifting induced by using only 'past' data. Hence a central moving average can be computed, using data equally spaced either side of the point in the series where the mean is calculated. This requires using an odd number of datum points in the sample window. Cumulative moving average In a cumulative moving average, the data arrive in an ordered datum stream and the statistician would like to get the average of all of the data up until the current datum point. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an unweighted average of the sequence of i values x1, ..., xi up to the current time: The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the number of datum points every time a new datum point arrived. However, it is possible to simply update cumulative average as a new value xi+1 becomes available, using the formula: where can be taken to be equal to 0. Thus the current cumulative average for a new datum point is equal to the previous cumulative average plus the difference between the latest datum point and the previous average divided by the number of points received so far. When all of the datum points arrive (i = N), the cumulative average will equal the final average. The derivation of the cumulative average formula is straightforward. Using and similarly for i + 1, it is seen that Solving this equation for CAi+1 results in: Weighted moving average A weighted average is any average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the moving average is the convolution of the datum points with a fixed weighting function. One application is removing pixelisation from a digital graphical image. In technical analysis of financial data, a weighted moving average (WMA) has the specific meaning of weights that decrease in arithmetical progression.[2] In an n-day WMA the latest day has weight n, the second latest n − 1, etc., down to one.
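The incremental update described above, CA(i+1) = CA(i) + (x(i+1) − CA(i)) / (i+1), avoids storing the whole history. A Python sketch of that recurrence (names illustrative):

```python
def update_cma(cma_prev, x_new, i):
    """Update the cumulative moving average when the (i+1)-th value arrives,
    where cma_prev is the average of the first i values (CA_0 taken as 0)."""
    return cma_prev + (x_new - cma_prev) / (i + 1)

cma = 0.0
for i, x in enumerate([2.0, 4.0, 6.0, 8.0]):
    cma = update_cma(cma, x, i)
print(cma)  # 5.0, the mean of the four values
```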
  • 85. Moving average 82 The denominator is a triangle number equal to In the more general case the denominator will always be the sum of the individual weights. When calculating the WMA across successive values, the difference between the numerators of WMAM+1 and WMAM is npM+1 − pM − ... − pM−n+1. If we denote the sum pM + ... + pM−n+1 by TotalM, then WMA weights n = 15 The graph at the right shows how the weights decrease, from highest weight for the most recent datum points, down to zero. It can be compared to the weights in the exponential moving average which follows. Exponential moving average An exponential moving average (EMA), also known as an exponentially weighted moving average (EWMA),[3] is a type of infinite impulse response filter that applies weighting factors which decrease exponentially. The weighting for each older datum point decreases exponentially, never reaching zero. The graph at right shows an example of the weight decrease. The EMA for a series Y may be calculated recursively: for EMA weights N=15 Where: • The coefficient α represents the degree of weighting decrease, a constant smoothing factor between 0 and 1. A higher α discounts older observations faster. Alternatively, α may be expressed in terms of N time periods, where α = 2/(N+1). For example, N = 19 is equivalent to α = 0.1. The half-life of the weights (the interval over which the weights decrease by a factor of two) is approximately N/2.8854 (within 1% if N > 5). • Yt is the value at a time period t. • St is the value of the EMA at any time period t. S1 is undefined. S1 may be initialized in a number of different ways, most commonly by setting S1 to Y1, though other techniques exist, such as setting S1 to an average of the first 4 or 5 observations. The prominence of the S1 initialization's effect on the resultant moving average depends on α; smaller α values make the choice of S1 relatively more important than larger α values, since a higher α discounts older observations faster. This formulation is according to Hunter (1986).[4] By repeated application of this formula for different times, we can eventually write St as a weighted sum of the datum points Yt, as: for any suitable k = 0, 1, 2, ... The weight of the general datum point is . [5] An alternate approach by Roberts (1959) uses Yt in lieu of Yt−1 :
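The arithmetically weighted WMA above, with weights n, n−1, ..., 1 and a triangle-number denominator, can be computed directly; a Python sketch (names illustrative):

```python
import numpy as np

def weighted_moving_average(prices, n):
    """Trailing weighted moving average with arithmetically decreasing
    weights n, n-1, ..., 1 (the technical-analysis WMA); sketch."""
    prices = np.asarray(prices, dtype=float)
    weights = np.arange(n, 0, -1)            # n for the latest value, down to 1
    denom = n * (n + 1) / 2                  # triangle number
    out = [np.dot(weights, prices[i:i + n][::-1]) / denom
           for i in range(len(prices) - n + 1)]
    return np.array(out)

p = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(weighted_moving_average(p, 3))  # [2.333..., 3.333..., 4.333...]
```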
  • 86. Moving average 83 This formula can also be expressed in technical analysis terms as follows, showing how the EMA steps towards the latest datum point, but only by a proportion of the difference (each time): Expanding out each time results in the following power series, showing how the weighting factor on each datum point p1, p2, etc., decreases exponentially: where • is • is • and so on , since . This is an infinite sum with decreasing terms. The N periods in an N-day EMA only specify the α factor. N is not a stopping point for the calculation in the way it is in an SMA or WMA. For sufficiently large N, The first N datum points in an EMA represent about 86% of the total weight in the calculation[6]: i.e. simplified,[7] tends to . The power formula above gives a starting value for a particular day, after which the successive days formula shown first can be applied. The question of how far back to go for an initial value depends, in the worst case, on the data. Large price values in old data will affect on the total even if their weighting is very small. If prices have small variations then just the weighting can be considered. The weight omitted by stopping after k terms is which is i.e. a fraction out of the total weight. For example, to have 99.9% of the weight, set above ratio equal to 0.1% and solve for k: terms should be used. Since approaches as N increases,[8] this simplifies to approximately[9]
  • 87. Moving average 84 for this example (99.9% weight). Modified moving average A modified moving average (MMA), running moving average (RMA), or smoothed moving average is defined as: In short, this is exponential moving average, with . Application to measuring computer performance Some computer performance metrics, e.g. the average process queue length, or the average CPU utilization, use a form of exponential moving average. Here is defined as a function of time between two readings. An example of a coefficient giving bigger weight to the current reading, and smaller weight to the older readings is where time for readings tn is expressed in seconds, and is the period of time in minutes over which the reading is said to be averaged (the mean lifetime of each reading in the average). Given the above definition of , the moving average can be expressed as For example, a 15-minute average L of a process queue length Q, measured every 5 seconds (time difference is 5 seconds), is computed as Other weightings Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each time period in proportion to its trading volume. A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[10] (a central moving average). The symmetric weight coefficients are -3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3. Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data. A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice. On this point, the French version of this article discusses the spectral effects of 3 kinds of means (cumulative, exponential, Gaussian).
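The exponential moving average described in the preceding sections is a one-line recursion. A Python sketch using the form S_t = αY_t + (1 − α)S_{t−1} (the Roberts-style recursion mentioned above), with S_1 initialized to Y_1 as one common choice; names are illustrative:

```python
import numpy as np

def ema(y, alpha):
    """Exponential moving average S_t = alpha*Y_t + (1-alpha)*S_{t-1},
    initialized with S_1 = Y_1 (one common choice; see text)."""
    y = np.asarray(y, dtype=float)
    s = np.empty_like(y)
    s[0] = y[0]
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1.0 - alpha) * s[t - 1]
    return s

# alpha = 2/(N+1) is the usual "N-period" parameterization, e.g. N = 19 -> alpha = 0.1.
# For readings spaced dt seconds apart averaged over roughly tau seconds (as in the
# performance-metric example above), a time-based choice such as
# alpha = 1 - exp(-dt / tau) is often used; treat that exact form as an assumption here.
print(ema([1.0, 2.0, 3.0, 4.0], alpha=0.5))  # [1.  1.5  2.25  3.125]
```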
  • 88. Moving average 85 Moving median From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over n time points: where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle. Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed. However, the normal distribution does not place high probability on very large deviations from the trend which explains why such deviations will have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be Laplace distributed, then the moving median is statistically optimal.[11] For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean. When the simple moving median above is central, the smoothing is identical to the median filter which has applications in, for example, image signal processing. Notes and references [1] Statistical Analysis, Ya-lun Chou, Holt International, 1975, ISBN 0-03-089422-0, section 17.9. [2] "Weighted Moving Averages: The Basics" (http:/ / www. investopedia. com/ articles/ technical/ 060401. asp). Investopedia. . [3] http:/ / lorien. ncl. ac. uk/ ming/ filter/ filewma. htm [4] NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing (http:/ / www. itl. nist. gov/ div898/ handbook/ pmc/ section4/ pmc431. htm) at the National Institute of Standards and Technology [5] NIST/SEMATECH e-Handbook of Statistical Methods: EWMA Control Charts (http:/ / www. itl. nist. gov/ div898/ handbook/ pmc/ section3/ pmc324. htm) at the National Institute of Standards and Technology [6] The denominator on the left-hand side should be unity, and the numerator will become the right-hand side (geometric series), . [7] Because (1+x/n)n becomes ex for large n. [8] It means -> 0, and the Taylor series of tends to . [9] loge(0.001) / 2 = -3.45 [10] Spencer's 15-Point Moving Average — from Wolfram MathWorld (http:/ / mathworld. wolfram. com/ Spencers15-PointMovingAverage. html) [11] G.R. Arce, "Nonlinear Signal Processing: A Statistical Approach", Wiley:New Jersey, USA, 2005.
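A sketch of the centered moving median discussed in the last section above, illustrating its robustness to a single shock (names illustrative):

```python
import numpy as np

def moving_median(x, n):
    """Centered moving median over windows of odd length n (sketch);
    more robust to shocks and outliers than the moving mean."""
    x = np.asarray(x, dtype=float)
    half = n // 2
    return np.array([np.median(x[i - half:i + half + 1])
                     for i in range(half, len(x) - half)])

x = np.array([1.0, 1.0, 1.0, 100.0, 1.0, 1.0, 1.0])  # a single shock
print(moving_median(x, 3))  # [1. 1. 1. 1. 1.] -- the shock is rejected
```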
  • 89. Autoregressivemoving-average model 86 Autoregressive–moving-average model In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic-process in terms of two polynomials, one for the auto-regression and the second for moving averages. The general ARMA model was described in the 1951 thesis of Peter Whittle, Hypothesis testing in time series analysis, and it was popularized in the 1971 book by George E. P. Box and Gwilym Jenkins. Given a time series of data Xt, the ARMA model is a tool for understanding and, perhaps, predicting future values in this series. The model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. The model is usually then referred to as the ARMA(p,q) model where p is the order of the autoregressive part and q is the order of the moving average part (as defined below). Autoregressive model The notation AR(p) refers to the autoregressive model of order p. The AR(p) model is written where are parameters, is a constant, and the random variable is white noise. An autoregressive model is essentially an all-pole infinite impulse response filter with some additional interpretation placed on it. Some constraints are necessary on the values of the parameters of this model in order that the model remains stationary. For example, processes in the AR(1) model with |φ1| ≥ 1 are not stationary. Moving-average model The notation MA(q) refers to the moving average model of order q: where the θ1, ..., θq are the parameters of the model, μ is the expectation of (often assumed to equal 0), and the , ,... are again, white noise error terms. The moving-average model is essentially a finite impulse response filter with some additional interpretation placed on it. Autoregressive–moving-average model The notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms. This model contains the AR(p) and MA(q) models, The general ARMA model was described in the 1951 thesis of Peter Whittle, who used mathematical analysis (Laurent series and Fourier analysis) and statistical inference.[1][2] ARMA models were popularized by a 1971 book by George E. P. Box and Jenkins, who expounded an iterative (Box–Jenkins) method for choosing and estimating them. This method was useful for low-order polynomials (of degree three or less).[3]
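For reference, the defining equations sketched above take the following standard forms, in the same notation and with white-noise errors \(\varepsilon_t\):

\[
\text{AR}(p):\ \ X_t = c + \sum_{i=1}^{p}\varphi_i X_{t-i} + \varepsilon_t,
\qquad
\text{MA}(q):\ \ X_t = \mu + \varepsilon_t + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i},
\]
\[
\text{ARMA}(p,q):\ \ X_t = c + \varepsilon_t + \sum_{i=1}^{p}\varphi_i X_{t-i} + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i}.
\]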
  • 90. Autoregressivemoving-average model 87 Note about the error terms The error terms are generally assumed to be independent identically distributed random variables (i.i.d.) sampled from a normal distribution with zero mean: ~ N(0,σ2) where σ2 is the variance. These assumptions may be weakened but doing so will change the properties of the model. In particular, a change to the i.i.d. assumption would make a rather fundamental difference. Specification in terms of lag operator In some texts the models will be specified in terms of the lag operator L. In these terms then the AR(p) model is given by where represents the polynomial The MA(q) model is given by where θ represents the polynomial Finally, the combined ARMA(p, q) model is given by or more concisely, Alternative notation Some authors, including Box, Jenkins & Reinsel[4] use a different convention for the autoregression coefficients. This allows all the polynomials involving the lag operator to appear in a similar form throughout. Thus the ARMA model would be written as
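In lag-operator form the same model reads as follows (a sketch; the sign convention on the moving-average polynomial is precisely what the alternative notation above changes):

\[
\Big(1-\sum_{i=1}^{p}\varphi_i L^i\Big) X_t
= \Big(1+\sum_{i=1}^{q}\theta_i L^i\Big)\varepsilon_t,
\qquad\text{or concisely}\qquad \varphi(L)\,X_t = \theta(L)\,\varepsilon_t ,
\]

while the Box–Jenkins–style convention writes both lag polynomials with minus signs, \(\theta(L) = 1 - \sum_{i=1}^{q}\theta_i L^i\), so that all polynomials involving the lag operator appear in a similar form.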
  • 91. Autoregressivemoving-average model 88 Fitting models ARMA models in general can, after choosing p and q, be fitted by least squares regression to find the values of the parameters which minimize the error term. It is generally considered good practice to find the smallest values of p and q which provide an acceptable fit to the data. For a pure AR model the Yule-Walker equations may be used to provide a fit. Finding appropriate values of p and q in the ARMA(p,q) model can be facilitated by plotting the partial autocorrelation functions for an estimate of p, and likewise using the autocorrelation functions for an estimate of q. Further information can be gleaned by considering the same functions for the residuals of a model fitted with an initial selection of p and q. Brockwell and Davis[5] (p. 273) recommend using AICc for finding p and q. Implementations in statistics packages • In R, the arima function (in standard package stats) is documented in ARIMA Modelling of Time Series [6]. Extension packages contain related and extended functionality, e.g., the tseries package includes an arma function, documented in "Fit ARMA Models to Time Series" [7]; the fracdiff package [8] contains fracdiff() for fractionally integrated ARMA processes, etc. The CRAN task view on Time Series [9] contains links to most of these. • Mathematica has a complete library of time series functions including ARMA[10] • MATLAB includes a function ar to estimate AR models, see here for more details [11]. • IMSL Numerical Libraries are libraries of numerical analysis functionality including ARMA and ARIMA procedures implemented in standard programming languages like C, Java, C# .NET, and Fortran. • gretl can also estimate ARMA models, see here where it's mentioned [12]. • GNU Octave can estimate AR models using functions from the extra package octave-forge [13]. • Stata includes the function arima which can estimate ARMA and ARIMA models. see here for more details [14] • SuanShu is a Java library of numerical methods, including comprehensive statistics packages, in which univariate/multivariate ARMA, ARIMA, ARMAX, etc. models are implemented in an object-oriented approach. These implementations are documented in "SuanShu, a Java numerical and statistical library" [15]. • SAS has a econometric package, ETS, that estimates ARIMA models see here for more details [16]. Applications ARMA is appropriate when a system is a function of a series of unobserved shocks (the MA part) as well as its own behavior. For example, stock prices may be shocked by fundamental information as well as exhibiting technical trending and mean-reversion effects due to market participants. Generalizations The dependence of Xt on past values and the error terms εt is assumed to be linear unless specified otherwise. If the dependence is nonlinear, the model is specifically called a nonlinear moving average (NMA), nonlinear autoregressive (NAR), or nonlinear autoregressive–moving-average (NARMA) model. Autoregressive–moving-average models can be generalized in other ways. See also autoregressive conditional heteroskedasticity (ARCH) models and autoregressive integrated moving average (ARIMA) models. If multiple time series are to be fitted then a vector ARIMA (or VARIMA) model may be fitted. If the time-series in question exhibits long memory then fractional ARIMA (FARIMA, sometimes called ARFIMA) modelling may be appropriate: see Autoregressive fractionally integrated moving average. 
If the data is thought to contain seasonal effects, it may be modeled by a SARIMA (seasonal ARIMA) or a periodic ARMA model.
  • 92. Autoregressivemoving-average model 89 Another generalization is the multiscale autoregressive (MAR) model. A MAR model is indexed by the nodes of a tree, whereas a standard (discrete time) autoregressive model is indexed by integers. Note that the ARMA model is a univariate model. Extensions for the multivariate case are the Vector Autoregression (VAR) and Vector Autoregression Moving-Average (VARMA). Autoregressive–moving-average model with exogenous inputs model (ARMAX model) The notation ARMAX(p, q, b) refers to the model with p autoregressive terms, q moving average terms and b exogenous inputs terms. This model contains the AR(p) and MA(q) models and a linear combination of the last b terms of a known and external time series . It is given by: where are the parameters of the exogenous input . Some nonlinear variants of models with exogenous variables have been defined: see for example Nonlinear autoregressive exogenous model. Statistical packages implement the ARMAX model through the use of "exogenous" or "independent" variables. Care must be taken when interpreting the output of those packages, because the estimated parameters usually (for example, in R[17] and gretl) refer to the regression: where mt incorporates all exogenous (or independent) variables: References [1] Hannan, Edward James (1970). Multiple time series. Wiley series in probability and mathematical statistics. New York: John Wiley and Sons. [2] Whittle, P. (1951). Hypothesis Testing in Time Series Analysis. Almquist and Wicksell. Whittle, P. (1963). Prediction and Regulation. English Universities Press. ISBN 0-8166-1147-5. Republished as: Whittle, P. (1983). Prediction and Regulation by Linear Least-Square Methods. University of Minnesota Press. ISBN 0-8166-1148-3. [3] Hannan & Deistler (1988, p. 227): Hannan, E. J.; Deistler, Manfred (1988). Statistical theory of linear systems. Wiley series in probability and mathematical statistics. New York: John Wiley and Sons. [4] George Box, Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control, third edition. Prentice-Hall, 1994. [5] Brockwell, P.J., and Davis, R.A. Time Series: Theory and Methods, 2nd ed. Springer, 2009. [6] http:/ / search. r-project. org/ R/ library/ stats/ html/ arima. html [7] http:/ / finzi. psych. upenn. edu/ R/ library/ tseries/ html/ arma. html [8] http:/ / cran. r-project. org/ web/ packages/ fracdiff [9] http:/ / cran. r-project. org/ web/ views/ TimeSeries. html [10] Time series features in Mathematica (http:/ / www. wolfram. com/ products/ applications/ timeseries/ features. html) [11] http:/ / www. mathworks. de/ help/ toolbox/ ident/ ref/ arx. html [12] http:/ / constantdream. wordpress. com/ 2008/ 03/ 16/ gnu-regression-econometrics-and-time-series-library-gretl/ [13] http:/ / octave. sourceforge. net/ [14] http:/ / www. stata. com/ help. cgi?arima [15] http:/ / www. numericalmethod. com/ javadoc/ suanshu/ [16] http:/ / support. sas. com/ rnd/ app/ ets/ proc/ ets_arima. html [17] ARIMA Modelling of Time Series (http:/ / search. r-project. org/ R/ library/ stats/ html/ arima. html), R documentation • Mills, Terence C. Time Series Techniques for Economists. Cambridge University Press, 1990. • Percival, Donald B. and Andrew T. Walden. Spectral Analysis for Physical Applications. Cambridge University Press, 1993.
  • 93. Fourier transform 90 Fourier transform The Fourier transform, named after Joseph Fourier, is a mathematical transform with many applications in physics and engineering.  Very commonly it transforms a mathematical function of time,     into a new function, sometimes denoted by     or   whose argument is frequency with units of cycles or radians per second.  The new function is then known as the Fourier transform and/or the frequency spectrum of the function   The Fourier transform is also a reversible operation. Thus, given the function     one can determine the original function,   (See Fourier inversion theorem.)     and     are also respectively known as time domain and frequency domain representations of the same "event".  Most often perhaps,     is a real-valued function, and     is complex valued, where a complex number describes both the amplitude and phase of a corresponding frequency component.  In general,     is also complex, such as the analytic representation of a real-valued function. The term "Fourier transform" refers to both the transform operation and to the complex-valued function it produces. In the case of a periodic function (for example, a continuous but not necessarily sinusoidal musical sound), the Fourier transform can be simplified to the calculation of a discrete set of complex amplitudes, called Fourier series coefficients. Also, when a time-domain function is sampled to facilitate storage or computer-processing, it is still possible to recreate a version of the original Fourier transform according to the Poisson summation formula, also known as discrete-time Fourier transform. These topics are addressed in separate articles. For an overview of those and other related operations, refer to Fourier analysis or List of Fourier-related transforms. Definition There are several common conventions for defining the Fourier transform ƒ̂ of an integrable function ƒ: R → C (Kaiser 1994, p. 29), (Rahman 2011, p. 11). This article will use the definition: ,   for every real number ξ. When the independent variable x represents time (with SI unit of seconds), the transform variable ξ represents frequency (in hertz). Under suitable conditions, ƒ is determined by ƒ̂ via the inverse transform:   for every real number x. The statement that ƒ can be reconstructed from ƒ̂ is known as the Fourier integral theorem, and was first introduced in Fourier's Analytical Theory of Heat (Fourier 1822, p. 525), (Fourier & Freeman 1878, p. 408), although what would be considered a proof by modern standards was not given until much later (Titchmarsh 1948, p. 1). The functions ƒ and ƒ̂ often are referred to as a Fourier integral pair or Fourier transform pair (Rahman 2011, p. 10). For other common conventions and notations, including using the angular frequency ω instead of the frequency ξ, see Other conventions and Other notations below. The Fourier transform on Euclidean space is treated separately, in which the variable x often represents position and ξ momentum.
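Stated explicitly in the convention adopted here (ordinary frequency ξ, in hertz when x is in seconds), the transform pair referred to above is:

\[
\hat f(\xi)=\int_{-\infty}^{\infty} f(x)\,e^{-2\pi i x\xi}\,dx,
\qquad
f(x)=\int_{-\infty}^{\infty}\hat f(\xi)\,e^{2\pi i x\xi}\,d\xi .
\]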
  • 94. Fourier transform 91 Introduction The motivation for the Fourier transform comes from the study of Fourier series. In the study of Fourier series, complicated but periodic functions are written as the sum of simple waves mathematically represented by sines and cosines. The Fourier transform is an extension of the Fourier series that results when the period of the represented function is lengthened and allowed to approach infinity.(Taneja 2008, p. 192) Due to the properties of sine and cosine, it is possible to recover the amplitude of each wave in a Fourier series using an integral. In many cases it is desirable to use Euler's formula, which states that e2πiθ= cos(2πθ) + i sin(2πθ), to write Fourier series in terms of the basic waves e2πiθ. This has the advantage of simplifying many of the formulas involved, and provides a formulation for Fourier series that more closely resembles the definition followed in this article. Re-writing sines and cosines as complex exponentials makes it necessary for the Fourier coefficients to be complex valued. The usual interpretation of this complex number is that it gives both the amplitude (or size) of the wave present in the function and the phase (or the initial angle) of the wave. These complex exponentials sometimes contain negative "frequencies". If θ is measured in seconds, then the waves e2πiθ and e−2πiθ both complete one cycle per second, but they represent different frequencies in the Fourier transform. Hence, frequency no longer measures the number of cycles per unit time, but is still closely related. There is a close connection between the definition of Fourier series and the Fourier transform for functions ƒ which are zero outside of an interval. For such a function, we can calculate its Fourier series on any interval that includes the points where ƒ is not identically zero. The Fourier transform is also defined for such a function. As we increase the length of the interval on which we calculate the Fourier series, then the Fourier series coefficients begin to look like the Fourier transform and the sum of the Fourier series of ƒ begins to look like the inverse Fourier transform. To explain this more precisely, suppose that T is large enough so that the interval [−T/2,T/2] contains the interval on which ƒ is not identically zero. Then the n-th series coefficient cn is given by: Comparing this to the definition of the Fourier transform, it follows that cn = ƒ̂(n/T) since ƒ(x) is zero outside [−T/2,T/2]. Thus the Fourier coefficients are just the values of the Fourier transform sampled on a grid of width 1/T. As T increases the Fourier coefficients more closely represent the Fourier transform of the function. Under appropriate conditions, the sum of the Fourier series of ƒ will equal the function ƒ. In other words, ƒ can be written: where the last sum is simply the first sum rewritten using the definitions ξn = n/T, and Δξ = (n + 1)/T − n/T = 1/T. This second sum is a Riemann sum, and so by letting T → ∞ it will converge to the integral for the inverse Fourier transform given in the definition section. Under suitable conditions this argument may be made precise (Stein & Shakarchi 2003). In the study of Fourier series the numbers cn could be thought of as the "amount" of the wave present in the Fourier series of ƒ. 
Similarly, as seen above, the Fourier transform can be thought of as a function that measures how much of each individual frequency is present in our function ƒ, and we can recombine these waves by using an integral (or "continuous sum") to reproduce the original function.
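The relationship c_n = ƒ̂(n/T) described above can be checked numerically. The sketch below (NumPy; the normalization of the series coefficient is an assumption, since the displayed formula did not survive extraction) uses the rectangular pulse supported on [−1/2, 1/2], whose transform is sinc(ξ) = sin(πξ)/(πξ).

import numpy as np

# For a function supported in [-1/2, 1/2], the Fourier series coefficients on
# [-T/2, T/2], normalized here as c_n = integral f(x) exp(-2*pi*i*n*x/T) dx
# (an assumed normalization), coincide with samples f_hat(n/T) of the
# continuous transform; for the rectangular pulse, f_hat(xi) = sin(pi*xi)/(pi*xi).

T = 8.0
x = np.linspace(-T / 2, T / 2, 400001)
f = np.where(np.abs(x) <= 0.5, 1.0, 0.0)          # rectangular pulse

for n in range(5):
    c_n = np.trapz(f * np.exp(-2j * np.pi * n * x / T), x)
    print(f"n = {n}:  c_n = {c_n.real:+.4f}   f_hat(n/T) = {np.sinc(n / T):+.4f}")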
  • 95. Fourier transform 92 Example The following images provide a visual illustration of how the Fourier transform measures whether a frequency is present in a particular function. The function depicted ƒ(t) = cos(6πt) e-πt2 oscillates at 3 hertz (if t measures seconds) and tends quickly to 0. (The second factor in this equation is an envelope function that shapes the continuous sinusoid into a short pulse. Its general form is a Gaussian function). This function was specially chosen to have a real Fourier transform which can easily be plotted. The first image contains its graph. In order to calculate ƒ̂(3) we must integrate e−2πi(3t)ƒ(t). The second image shows the plot of the real and imaginary parts of this function. The real part of the integrand is almost always positive, because when ƒ(t) is negative, the real part of e−2πi(3t) is negative as well. Because they oscillate at the same rate, when ƒ(t) is positive, so is the real part of e−2πi(3t). The result is that when you integrate the real part of the integrand you get a relatively large number (in this case 0.5). On the other hand, when you try to measure a frequency that is not present, as in the case when we look at ƒ̂(5), the integrand oscillates enough so that the integral is very small. The general situation may be a bit more complicated than this, but this in spirit is how the Fourier transform measures how much of an individual frequency is present in a function ƒ(t). Original function showing Real and imaginary parts of Real and imaginary parts of Fourier transform with 3 and 5 oscillation 3 hertz. integrand for Fourier transform integrand for Fourier transform hertz labeled. at 3 hertz at 5 hertz Properties of the Fourier transform Here we assume ƒ(x), g(x) and h(x) are integrable functions, are Lebesgue-measurable on the real line, and satisfy: We denote the Fourier transforms of these functions by , and respectively. Basic properties The Fourier transform has the following basic properties: (Pinsky 2002). Linearity For any complex numbers a and b, if then Translation For any real number x0, if then Modulation For any real number ξ0 if then Scaling For a non-zero real number a, if h(x) = ƒ(ax), then      The case a = -1 leads to the time-reversal property, which states: if then
  • 96. Fourier transform 93 Conjugation If then In particular, if ƒ is real, then one has the reality condition And if ƒ is purely imaginary, then Uniform continuity and the Riemann–Lebesgue lemma The Fourier transform may be defined in some cases for non-integrable functions, but the Fourier transforms of integrable functions have several strong properties. The Fourier transform, ƒ̂, of any integrable function ƒ is uniformly continuous and (Katznelson 1976). By the Riemann–Lebesgue lemma (Stein & Weiss 1971), However, ƒ̂ need not be integrable. For example, the Fourier transform The rectangular function is Lebesgue integrable. of the rectangular function, which is integrable, is the sinc function, which is not Lebesgue integrable, because its improper integrals behave analogously to the alternating harmonic series, in converging to a sum without being absolutely convergent. It is not generally possible to write the inverse transform as a Lebesgue integral. However, when both ƒ and ƒ̂ are integrable, the inverse equality holds almost everywhere. That is, the Fourier transform is injective on The sinc function, which is the Fourier transform L1(R). (But if ƒ is continuous, then equality holds for every x.) of the rectangular function, is bounded and continuous, but not Lebesgue integrable. Plancherel theorem and Parseval's theorem Let ƒ(x) and g(x) be integrable, and let ƒ̂(ξ) and be their Fourier transforms. If ƒ(x) and g(x) are also square-integrable, then we have Parseval's theorem (Rudin 1987, p. 187): where the bar denotes complex conjugation. The Plancherel theorem, which is equivalent to Parseval's theorem, states (Rudin 1987, p. 186): The Plancherel theorem makes it possible to extend the Fourier transform, by a continuity argument, to a unitary operator on L2(R). On L1(R)∩L2(R), this extension agrees with original Fourier transform defined on L1(R), thus enlarging the domain of the Fourier transform to L1(R) + L2(R) (and consequently to Lp(R) for 1 ≤ p ≤ 2). The Plancherel theorem has the interpretation in the sciences that the Fourier transform preserves the energy of the original quantity. Depending on the author either of these theorems might be referred to as the Plancherel theorem or as Parseval's theorem. See Pontryagin duality for a general formulation of this concept in the context of locally compact abelian groups.
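The worked example of the previous section and Parseval's theorem can both be verified numerically. The following Python sketch (grid sizes and frequency ranges are arbitrary choices) evaluates ƒ̂(3) and ƒ̂(5) for ƒ(t) = cos(6πt) e^(−πt²) and compares the energies of ƒ and ƒ̂.

import numpy as np

# f(t) = cos(6*pi*t) * exp(-pi*t^2): the transform is about 0.5 at 3 Hz and
# essentially 0 at 5 Hz, and the energies of f and f_hat agree (Parseval).

t = np.linspace(-6.0, 6.0, 24001)
f = np.cos(6 * np.pi * t) * np.exp(-np.pi * t**2)

def f_hat(xi):
    return np.trapz(f * np.exp(-2j * np.pi * t * xi), t)

print("|f_hat(3)| =", abs(f_hat(3.0)))            # ~0.5
print("|f_hat(5)| =", abs(f_hat(5.0)))            # ~0

xi = np.linspace(-8.0, 8.0, 801)
F = np.array([f_hat(v) for v in xi])
print("time-domain energy     :", np.trapz(np.abs(f)**2, t))
print("frequency-domain energy:", np.trapz(np.abs(F)**2, xi))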
  • 97. Fourier transform 94 Poisson summation formula The Poisson summation formula (PSF) is an equation that relates the Fourier series coefficients of the periodic summation of a function to values of the function's continuous Fourier transform. It has a variety of useful forms that are derived from the basic one by application of the Fourier transform's scaling and time-shifting properties. The frequency-domain dual of the standard PSF is also called discrete-time Fourier transform, which leads directly to: • a popular, graphical, frequency-domain representation of the phenomenon of aliasing, and • a proof of the Nyquist-Shannon sampling theorem. Convolution theorem The Fourier transform translates between convolution and multiplication of functions. If ƒ(x) and g(x) are integrable functions with Fourier transforms ƒ̂(ξ) and respectively, then the Fourier transform of the convolution is given by the product of the Fourier transforms ƒ̂(ξ) and (under other conventions for the definition of the Fourier transform a constant factor may appear). This means that if: where ∗ denotes the convolution operation, then: In linear time invariant (LTI) system theory, it is common to interpret g(x) as the impulse response of an LTI system with input ƒ(x) and output h(x), since substituting the unit impulse for ƒ(x) yields h(x) = g(x). In this case, represents the frequency response of the system. Conversely, if ƒ(x) can be decomposed as the product of two square integrable functions p(x) and q(x), then the Fourier transform of ƒ(x) is given by the convolution of the respective Fourier transforms and . Cross-correlation theorem In an analogous manner, it can be shown that if h(x) is the cross-correlation of ƒ(x) and g(x): then the Fourier transform of h(x) is: As a special case, the autocorrelation of function ƒ(x) is: for which
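A quick numerical check of the convolution theorem, here with two Gaussians so that every quantity can also be written in closed form (a NumPy sketch; the grid spacing is an arbitrary choice):

import numpy as np

# Convolution theorem with f = g = exp(-pi*x^2): the transform of the
# convolution h = f*g equals the product of the individual transforms.

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
f = np.exp(-np.pi * x**2)
g = np.exp(-np.pi * x**2)
h = np.convolve(f, g, mode="same") * dx           # numerical (f*g)(x)

def ft(values, xi):
    return np.trapz(values * np.exp(-2j * np.pi * x * xi), x)

for xi in (0.0, 0.5, 1.0):
    print(f"xi = {xi}:  FT(f*g) = {ft(h, xi).real:.6f}   "
          f"f_hat*g_hat = {(ft(f, xi) * ft(g, xi)).real:.6f}")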
  • 98. Fourier transform 95 Eigenfunctions One important choice of an orthonormal basis for L2(R) is given by the Hermite functions where are the "probabilist's" Hermite polynomials, defined by Under this convention for the Fourier transform, we have that In other words, the Hermite functions form a complete orthonormal system of eigenfunctions for the Fourier transform on L2(R) (Pinsky 2002). However, this choice of eigenfunctions is not unique. There are only four different eigenvalues of the Fourier transform (±1 and ±i) and any linear combination of eigenfunctions with the same eigenvalue gives another eigenfunction. As a consequence of this, it is possible to decompose L2(R) as a direct sum of four spaces H0, H1, H2, and H3 where the Fourier transform acts on Hek simply by multiplication by ik. This approach to define the Fourier transform is due to N. Wiener (Duoandikoetxea 2001). Among other properties, Hermite functions decrease exponentially fast in both frequency and time domains and they are used to define a generalization of the Fourier transform, namely the fractional Fourier transform used in time-frequency analysis (Boashash 2003). Fourier transform on Euclidean space The Fourier transform can be in any arbitrary number of dimensions n. As with the one-dimensional case, there are many conventions. For an integrable function ƒ(x), this article takes the definition: where x and ξ are n-dimensional vectors, and x · ξ is the dot product of the vectors. The dot product is sometimes written as . All of the basic properties listed above hold for the n-dimensional Fourier transform, as do Plancherel's and Parseval's theorem. When the function is integrable, the Fourier transform is still uniformly continuous and the Riemann–Lebesgue lemma holds. (Stein & Weiss 1971) Uncertainty principle Generally speaking, the more concentrated ƒ(x) is, the more spread out its Fourier transform ƒ̂(ξ) must be. In particular, the scaling property of the Fourier transform may be seen as saying: if we "squeeze" a function in x, its Fourier transform "stretches out" in ξ. It is not possible to arbitrarily concentrate both a function and its Fourier transform. The trade-off between the compaction of a function and its Fourier transform can be formalized in the form of an uncertainty principle by viewing a function and its Fourier transform as conjugate variables with respect to the symplectic form on the time–frequency domain: from the point of view of the linear canonical transformation, the Fourier transform is rotation by 90° in the time–frequency domain, and preserves the symplectic form. Suppose ƒ(x) is an integrable and square-integrable function. Without loss of generality, assume that ƒ(x) is normalized: It follows from the Plancherel theorem that ƒ̂(ξ) is also normalized.
  • 99. Fourier transform 96 The spread around x = 0 may be measured by the dispersion about zero (Pinsky 2002, p. 131) defined by In probability terms, this is the second moment of |ƒ(x)|2 about zero. The Uncertainty principle states that, if ƒ(x) is absolutely continuous and the functions x·ƒ(x) and ƒ′(x) are square integrable, then    (Pinsky 2002). The equality is attained only in the case (hence ) where σ > 0 is 2 arbitrary and C1 is such that ƒ is L –normalized (Pinsky 2002). In other words, where ƒ is a (normalized) Gaussian function with variance σ2, centered at zero, and its Fourier transform is a Gaussian function with variance 1/σ2. In fact, this inequality implies that: for any   in R  (Stein & Shakarchi 2003, p. 158). In quantum mechanics, the momentum and position wave functions are Fourier transform pairs, to within a factor of Planck's constant. With this constant properly taken into account, the inequality above becomes the statement of the Heisenberg uncertainty principle (Stein & Shakarchi 2003, p. 158). A stronger uncertainty principle is the Hirschman uncertainty principle which is expressed as: where H(p) is the differential entropy of the probability density function p(x): where the logarithms may be in any base which is consistent. The equality is attained for a Gaussian, as in the previous case. Spherical harmonics Let the set of homogeneous harmonic polynomials of degree k on Rn be denoted by Ak. The set Ak consists of the solid spherical harmonics of degree k. The solid spherical harmonics play a similar role in higher dimensions to the Hermite polynomials in dimension one. Specifically, if ƒ(x) = e−π|x|2P(x) for some P(x) in Ak, then . Let the set Hk be the closure in L2(Rn) of linear combinations of functions of the form ƒ(|x|)P(x) where P(x) is in Ak. The space L2(Rn) is then a direct sum of the spaces Hk and the Fourier transform maps each space Hk to itself and is possible to characterize the action of the Fourier transform on each space Hk (Stein & Weiss 1971). Let ƒ(x) = ƒ0(|x|)P(x) (with P(x) in Ak), then where Here J(n + 2k − 2)/2 denotes the Bessel function of the first kind with order (n + 2k − 2)/2. When k = 0 this gives a useful formula for the Fourier transform of a radial function (Grafakos 2004).
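Returning to the uncertainty principle discussed above, the Gaussian minimizer can be checked numerically. In the sketch below the lower bound is taken to be 1/(16π²), the constant that corresponds to this article's convention (stated here as an assumption, since the displayed inequality did not survive extraction).

import numpy as np

# For the L2-normalized Gaussian f(x) = 2**0.25 * exp(-pi*x^2), which is its
# own Fourier transform, the product of the dispersions about zero of f and
# f_hat attains the assumed lower bound 1/(16*pi^2).

x = np.linspace(-10.0, 10.0, 40001)
f = 2**0.25 * np.exp(-np.pi * x**2)
f_hat = f.copy()                                  # the same Gaussian, read on a xi-grid

print("||f||^2 =", np.trapz(np.abs(f)**2, x))     # ~1, i.e. f is normalized

D_time = np.trapz(x**2 * np.abs(f)**2, x)
D_freq = np.trapz(x**2 * np.abs(f_hat)**2, x)     # same grid reused for xi
print("product of dispersions:", D_time * D_freq)
print("1/(16*pi^2)           :", 1 / (16 * np.pi**2))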
  • 100. Fourier transform 97 Restriction problems In higher dimensions it becomes interesting to study restriction problems for the Fourier transform. The Fourier transform of an integrable function is continuous and the restriction of this function to any set is defined. But for a square-integrable function the Fourier transform could be a general class of square integrable functions. As such, the restriction of the Fourier transform of an L2(Rn) function cannot be defined on sets of measure 0. It is still an active area of study to understand restriction problems in Lp for 1 < p < 2. Surprisingly, it is possible in some cases to define the restriction of a Fourier transform to a set S, provided S has non-zero curvature. The case when S is the unit sphere in Rn is of particular interest. In this case the Tomas-Stein restriction theorem states that the restriction of the Fourier transform to the unit sphere in Rn is a bounded operator on Lp provided 1 ≤ p ≤ (2n + 2) / (n + 3). One notable difference between the Fourier transform in 1 dimension versus higher dimensions concerns the partial sum operator. Consider an increasing collection of measurable sets ER indexed by R ∈ (0,∞): such as balls of radius R centered at the origin, or cubes of side 2R. For a given integrable function ƒ, consider the function ƒR defined by: Suppose in addition that ƒ ∈ Lp(Rn). For n = 1 and 1 < p < ∞, if one takes ER = (−R, R), then ƒR converges to ƒ in Lp as R tends to infinity, by the boundedness of the Hilbert transform. Naively one may hope the same holds true for n > 1. In the case that ER is taken to be a cube with side length R, then convergence still holds. Another natural candidate is the Euclidean ball ER = {ξ : |ξ| < R}. In order for this partial sum operator to converge, it is necessary that the multiplier for the unit ball be bounded in Lp(Rn). For n ≥ 2 it is a celebrated theorem of Charles Fefferman that the multiplier for the unit ball is never bounded unless p = 2 (Duoandikoetxea 2001). In fact, when p ≠ 2, this shows that not only may ƒR fail to converge to ƒ in Lp, but for some functions ƒ ∈ Lp(Rn), ƒR is not even an element of Lp. Fourier transform on other function spaces The definition of the Fourier transform by the integral formula is valid for Lebesgue integrable functions ƒ; that is, ƒ ∈ L1(R). The image of L1 a subset of the space C0(R) of continuous functions that tend to zero at infinity (the Riemann–Lebesgue lemma), although it is not the entire space. Indeed, there is no simple characterization of the image. It is possible to extend the definition of the Fourier transform to other spaces of functions. Since compactly supported smooth functions are integrable and dense in L2(R), the Plancherel theorem allows us to extend the definition of the Fourier transform to general functions in L2(R) by continuity arguments. Further : L2(R) → L2(R) is a unitary operator (Stein & Weiss 1971, Thm. 2.3). In particular, the image of L2(R) is itself under the Fourier transform. The Fourier transform in L2(R) is no longer given by an ordinary Lebesgue integral, although it can be computed by an improper integral, here meaning that for an L2 function ƒ, where the limit is taken in the L2 sense. Many of the properties of the Fourier transform in L1 carry over to L2, by a suitable limiting argument. The definition of the Fourier transform can be extended to functions in Lp(R) for 1 ≤ p ≤ 2 by decomposing such functions into a fat tail part in L2 plus a fat body part in L1. 
In each of these spaces, the Fourier transform of a function in Lp(R) is in Lq(R), where q = p/(p − 1) is the Hölder conjugate of p, by the Hausdorff–Young inequality. However, except for p = 2, the image is not easily characterized. Further extensions become more technical. The Fourier transform of functions in Lp for the range 2 < p < ∞ requires the study of distributions (Katznelson 1976). In fact, it can be shown that there are functions in Lp with p > 2 for which the Fourier transform is not
  • 101. Fourier transform 98 defined as a function (Stein & Weiss 1971). Tempered distributions One might consider enlarging the domain of the Fourier transform from L1+L2 by considering generalized functions, or distributions. A distribution on R is a continuous linear functional on the space Cc(R) of compactly supported smooth functions, equipped with a suitable topology. The strategy is then to consider the action of the Fourier transform on Cc(R) and pass to distributions by duality. The obstruction to do this is that the Fourier transform does not map Cc(R) to Cc(R). In fact the Fourier transform of an element in Cc(R) can not vanish on an open set; see the above discussion on the uncertainty principle. The right space here is the slightly larger Schwartz functions. The Fourier transform is an automorphism on the Schwartz space, as a topological vector space, and thus induces an automorphism on its dual, the space of tempered distributions(Stein & Weiss 1971). The tempered distribution include all the integrable functions mentioned above, as well as well-behaved functions of polynomial growth and distributions of compact support. For the definition of the Fourier transform of a tempered distribution, let f and g be integrable functions, and let and be their Fourier transforms respectively. Then the Fourier transform obeys the following multiplication formula (Stein & Weiss 1971), Every integrable function ƒ defines (induces) a distribution Tƒ by the relation    for all Schwartz functions φ. So it makes sense to define Fourier transform of Tƒ by    for all Schwartz functions φ. Extending this to all tempered distributions T gives the general definition of the Fourier transform. Distributions can be differentiated and the above mentioned compatibility of the Fourier transform with differentiation and convolution remains true for tempered distributions. Generalizations Fourier–Stieltjes transform The Fourier transform of a finite Borel measure μ on Rn is given by (Pinsky 2002, p. 256): This transform continues to enjoy many of the properties of the Fourier transform of integrable functions. One notable difference is that the Riemann–Lebesgue lemma fails for measures (Katznelson 1976). In the case that dμ= ƒ(x) dx, then the formula above reduces to the usual definition for the Fourier transform of ƒ. In the case that μ is the probability distribution associated to a random variable X, the Fourier-Stieltjes transform is closely related to the characteristic function, but the typical conventions in probability theory take eix·ξ instead of e−2πix·ξ (Pinsky 2002). In the case when the distribution has a probability density function this definition reduces to the Fourier transform applied to the probability density function, again with a different choice of constants. The Fourier transform may be used to give a characterization of measures. Bochner's theorem characterizes which functions may arise as the Fourier–Stieltjes transform of a positive measure on the circle (Katznelson 1976). Furthermore, the Dirac delta function is not a function but it is a finite Borel measure. Its Fourier transform is a constant function (whose specific value depends upon the form of the Fourier transform used).
  • 102. Fourier transform 99 Locally compact abelian groups The Fourier transform may be generalized to any locally compact abelian group. A locally compact abelian group is an abelian group which is at the same time a locally compact Hausdorff topological space so that the group operation is continuous. If G is a locally compact abelian group, it has a translation invariant measure μ, called Haar measure. For a locally compact abelian group G, the set of irreducible, i.e. one-dimensional, unitary representations are called its characters. With its natural group structure and the topology of pointwise convergence, the set of characters is itself a locally compact abelian group, called the Pontryagin dual of G. For a function ƒ in L1(G), its Fourier transform is defined by (Katznelson 1976): The Riemann-Lebesgue lemma holds in this case; is a function vanishing at infinity on . Gelfand transform The Fourier transform is also a special case of Gelfand transform. In this particular context, it is closely related to the Pontryagin duality map defined above. Given an abelian locally compact Hausdorff topological group G, as before we consider space L1(G), defined using a Haar measure. With convolution as multiplication, L1(G) is an abelian Banach algebra. It also has an involution * given by Taking the completion with respect to the largest possibly C*-norm gives its enveloping C*-algebra, called the group C*-algebra C*(G) of G. (Any C*-norm on L1(G) is bounded by the L1 norm, therefore their supremum exists.) Given any abelian C*-algebra A, the Gelfand transform gives an isomorphism between A and C0(A^), where A^ is the multiplicative linear functionals, i.e. one-dimensional representations, on A with the weak-* topology. The map is simply given by It turns out that the multiplicative linear functionals of C*(G), after suitable identification, are exactly the characters of G, and the Gelfand transform, when restricted to the dense subset L1(G) is the Fourier-Pontryagin transform. Non-abelian groups The Fourier transform can also be defined for functions on a non-abelian group, provided that the group is compact. Removing the assumption that the underlying group is abelian, irreducible unitary representations need not always be one-dimensional. This means the Fourier transform on a non-abelian group takes values as Hilbert space operators (Hewitt & Ross 1970, Chapter 8). The Fourier transform on compact groups is a major tool in representation theory (Knapp 2001) and non-commutative harmonic analysis. Let G be a compact Hausdorff topological group. Let Σ denote the collection of all isomorphism classes of finite-dimensional irreducible unitary representations, along with a definite choice of representation U(σ) on the Hilbert space Hσ of finite dimension dσ for each σ ∈ Σ. If μ is a finite Borel measure on G, then the Fourier–Stieltjes transform of μ is the operator on Hσ defined by where is the complex-conjugate representation of U(σ) acting on Hσ. If μ is absolutely continuous with respect to the left-invariant probability measure λ on G, represented as for some ƒ ∈ L1(λ), one identifies the Fourier transform of ƒ with the Fourier–Stieltjes transform of μ.
  • 103. Fourier transform 100 The mapping defines an isomorphism between the Banach space M(G) of finite Borel measures (see rca space) and a closed subspace of the Banach space C∞(Σ) consisting of all sequences E = (Eσ) indexed by Σ of (bounded) linear operators Eσ: Hσ → Hσ for which the norm is finite. The "convolution theorem" asserts that, furthermore, this isomorphism of Banach spaces is in fact an isometric isomorphism of C* algebras into a subspace of C∞(Σ). Multiplication on M(G) is given by convolution of measures and the involution * defined by and C∞(Σ) has a natural C*-algebra structure as Hilbert space operators. The Peter-Weyl theorem holds, and a version of the Fourier inversion formula (Plancherel's theorem) follows: if ƒ ∈ L2(G), then where the summation is understood as convergent in the L2 sense. The generalization of the Fourier transform to the noncommutative situation has also in part contributed to the development of noncommutative geometry. In this context, a categorical generalization of the Fourier transform to noncommutative groups is Tannaka-Krein duality, which replaces the group of characters with the category of representations. However, this loses the connection with harmonic functions. Alternatives In signal processing terms, a function (of time) is a representation of a signal with perfect time resolution, but no frequency information, while the Fourier transform has perfect frequency resolution, but no time information: the magnitude of the Fourier transform at a point is how much frequency content there is, but location is only given by phase (argument of the Fourier transform at a point), and standing waves are not localized in time – a sine wave continues out to infinity, without decaying. This limits the usefulness of the Fourier transform for analyzing signals that are localized in time, notably transients, or any signal of finite extent. As alternatives to the Fourier transform, in time-frequency analysis, one uses time-frequency transforms or time-frequency distributions to represent signals in a form that has some time information and some frequency information – by the uncertainty principle, there is a trade-off between these. These can be generalizations of the Fourier transform, such as the short-time Fourier transform or fractional Fourier transform, or other functions to represent signals, as in wavelet transforms and chirplet transforms, with the wavelet analog of the (continuous) Fourier transform being the continuous wavelet transform. (Boashash 2003). Applications Analysis of differential equations Fourier transforms and the closely related Laplace transforms are widely used in solving differential equations. The Fourier transform is compatible with differentiation in the following sense: if ƒ(x) is a differentiable function with Fourier transform ƒ̂(ξ), then the Fourier transform of its derivative is given by 2πiξ ƒ̂(ξ). This can be used to transform differential equations into algebraic equations. This technique only applies to problems whose domain is the whole set of real numbers. By extending the Fourier transform to functions of several variables partial differential equations with domain Rn can also be translated into algebraic equations.
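The differentiation rule quoted above (the transform of ƒ′ is 2πiξ ƒ̂(ξ)) is the basis of spectral differentiation. A minimal NumPy sketch, which treats the FFT as a periodic surrogate for the transform on a wide interval (the interval length and grid size are arbitrary choices, adequate here because ƒ decays rapidly):

import numpy as np

# Spectral differentiation: transform, multiply by 2*pi*i*xi, invert.

N, L = 2048, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
f = np.exp(-np.pi * x**2)

xi = np.fft.fftfreq(N, d=L / N)                   # frequencies in cycles per unit x
df_spectral = np.fft.ifft(2j * np.pi * xi * np.fft.fft(f)).real

df_exact = -2 * np.pi * x * np.exp(-np.pi * x**2)
print("max error:", np.max(np.abs(df_spectral - df_exact)))   # close to machine precision

The same device turns constant-coefficient differential equations into algebraic equations for ƒ̂, as described above.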
  • 104. Fourier transform 101 Fourier transform spectroscopy The Fourier transform is also used in nuclear magnetic resonance (NMR) and in other kinds of spectroscopy, e.g. infrared (FTIR). In NMR an exponentially shaped free induction decay (FID) signal is acquired in the time domain and Fourier-transformed to a Lorentzian line-shape in the frequency domain. The Fourier transform is also used in magnetic resonance imaging (MRI) and mass spectrometry. Quantum mechanics and signal processing In quantum mechanics, Fourier transforms of solutions to the Schrödinger equation are known as momentum space (or k space) wave functions. They display the amplitudes for momenta. Their absolute square is the probabilities of momenta. This is valid also for classical waves treated in signal processing, such as in swept frequency radar where data is taken in frequency domain and transformed to time domain, yielding range. The absolute square is then the power. Other notations Other common notations for ƒ̂(ξ) include: Denoting the Fourier transform by a capital letter corresponding to the letter of function being transformed (such as ƒ(x) and F(ξ)) is especially common in the sciences and engineering. In electronics, the omega (ω) is often used instead of ξ due to its interpretation as angular frequency, sometimes it is written as F(jω), where j is the imaginary unit, to indicate its relationship with the Laplace transform, and sometimes it is written informally as F(2πƒ) in order to use ordinary frequency. The interpretation of the complex function ƒ̂(ξ) may be aided by expressing it in polar coordinate form in terms of the two real functions A(ξ) and φ(ξ) where: is the amplitude and is the phase (see arg function). Then the inverse transform can be written: which is a recombination of all the frequency components of ƒ(x). Each component is a complex sinusoid of the form e2πixξ  whose amplitude is A(ξ) and whose initial phase angle (at x = 0) is φ(ξ). The Fourier transform may be thought of as a mapping on function spaces. This mapping is here denoted and is used to denote the Fourier transform of the function ƒ. This mapping is linear, which means that can also be seen as a linear transformation on the function space and implies that the standard notation in linear algebra of applying a linear transformation to a vector (here the function ƒ) can be used to write instead of . Since the result of applying the Fourier transform is again a function, we can be interested in the value of this function evaluated at the value ξ for its variable, and this is denoted either as or as . Notice that in the former case, it is implicitly understood that is applied first to ƒ and then the resulting function is evaluated at ξ, not the other way around. In mathematics and various applied sciences it is often necessary to distinguish between a function ƒ and the value of ƒ when its variable equals x, denoted ƒ(x). This means that a notation like formally can be interpreted as
  • 105. Fourier transform 102 the Fourier transform of the values of ƒ at x. Despite this flaw, the previous notation appears frequently, often when a particular function or a function of a particular variable is to be transformed. For example, is sometimes used to express that the Fourier transform of a rectangular function is a sinc function, or is used to express the shift property of the Fourier transform. Notice, that the last example is only correct under the assumption that the transformed function is a function of x, not of x0. Other conventions The Fourier transform can also be written in terms of angular frequency:   ω = 2πξ whose units are radians per second. The substitution ξ = ω/(2π) into the formulas above produces this convention: Under this convention, the inverse transform becomes: Unlike the convention followed in this article, when the Fourier transform is defined this way, it is no longer a unitary transformation on L2(Rn). There is also less symmetry between the formulas for the Fourier transform and its inverse. Another convention is to split the factor of (2π)n evenly between the Fourier transform and its inverse, which leads to definitions: Under this convention, the Fourier transform is again a unitary transformation on L2(Rn). It also restores the symmetry between the Fourier transform and its inverse. Variations of all three conventions can be created by conjugating the complex-exponential kernel of both the forward and the reverse transform. The signs must be opposites. Other than that, the choice is (again) a matter of convention. Summary of popular forms of the Fourier transform ordinary frequency ξ (hertz) unitary angular frequency ω (rad/s) non-unitary unitary
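The summary table above lost its formulas during extraction. For reference, the three conventions it tabulates are, in the notation of this article (a reconstruction of the standard formulas, not a verbatim copy of the original table):

ordinary frequency ξ (hertz), unitary:
\hat f(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi}\, dx, \qquad f(x) = \int_{-\infty}^{\infty} \hat f(\xi)\, e^{2\pi i x \xi}\, d\xi

angular frequency ω (rad/s), non-unitary:
\hat f(\omega) = \int_{-\infty}^{\infty} f(x)\, e^{-i\omega x}\, dx, \qquad f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat f(\omega)\, e^{i\omega x}\, d\omega

angular frequency ω (rad/s), unitary:
\hat f(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i\omega x}\, dx, \qquad f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(\omega)\, e^{i\omega x}\, d\omega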
  • 106. Fourier transform 103 As discussed above, the characteristic function of a random variable is the same as the Fourier–Stieltjes transform of its distribution measure, but in this context it is typical to take a different convention for the constants. Typically characteristic function is defined . As in the case of the "non-unitary angular frequency" convention above, there is no factor of 2π appearing in either of the integral, or in the exponential. Unlike any of the conventions appearing above, this convention takes the opposite sign in the exponential. Tables of important Fourier transforms The following tables record some closed form Fourier transforms. For functions ƒ(x), g(x) and h(x) denote their Fourier transforms by ƒ̂, , and respectively. Only the three most common conventions are included. It may be useful to notice that entry 105 gives a relationship between the Fourier transform of a function and the original function, which can be seen as relating the Fourier transform and its inverse. Functional relationships The Fourier transforms in this table may be found in Erdélyi (1954) or Kammler (2000, appendix). Function Fourier transform Fourier transform Fourier transform Remarks unitary, ordinary unitary, angular frequency non-unitary, angular frequency frequency Definition 101 Linearity 102 Shift in time domain 103 Shift in frequency domain, dual of 102 104 Scaling in the time domain. If is large, then is concentrated around 0 and spreads out and flattens. 105 Duality. Here needs to be calculated using the same method as Fourier transform column. Results from swapping "dummy" variables of and or or . 106 107 This is the dual of 106 108 The notation denotes the convolution of and — this rule is the convolution theorem 109 This is the dual of 108
  • 107. Fourier transform 104 110 For a purely real Hermitian symmetry. indicates the complex conjugate. 111 For a purely real , and are purely real even functions. even function 112 For a purely real , and are purely imaginary odd functions. odd function Square-integrable functions The Fourier transforms in this table may be found in (Campbell & Foster 1948), (Erdélyi 1954), or the appendix of (Kammler 2000). Function Fourier transform Fourier transform Fourier transform Remarks unitary, ordinary unitary, angular frequency non-unitary, angular frequency frequency 201 The rectangular pulse and the normalized sinc function, here defined as sinc(x) = sin(πx)/(πx) 202 Dual of rule 201. The rectangular function is an ideal low-pass filter, and the sinc function is the non-causal impulse response of such a filter. 203 The function tri(x) is the triangular function 204 Dual of rule 203. 205 The function u(x) is the Heaviside unit step function and a>0. 206 This shows that, for the unitary Fourier transforms, the Gaussian function exp(−αx2) is its own Fourier transform for some choice of α. For this to be integrable we must have Re(α)>0. 207 For a>0. That is, the Fourier transform of a decaying exponential function is a Lorentzian function. 208 Hyperbolic secant is its own Fourier transform 209 is the Hermite's polynomial. If then the Gauss-Hermite functions are eigenfunctions of the Fourier transform operator. For a derivation, see Hermite polynomial. The formula reduces to 206 for .
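Entry 201 of the table above (rectangular pulse ↔ normalized sinc, in the unitary, ordinary-frequency convention) can be checked numerically; the sketch below relies on the fact that NumPy's sinc is the normalized one, sin(πx)/(πx) (the grid size and test frequencies are arbitrary choices).

import numpy as np

# Entry 201: the transform of rect(x) (=1 for |x| <= 1/2) is the normalized
# sinc function, matching np.sinc.

x = np.linspace(-2.0, 2.0, 400001)
rect = np.where(np.abs(x) <= 0.5, 1.0, 0.0)

for xi in (0.5, 1.0, 1.5, 2.5):
    numeric = np.trapz(rect * np.exp(-2j * np.pi * x * xi), x).real
    print(f"xi = {xi}:  numerical = {numeric:+.4f}   sinc(xi) = {np.sinc(xi):+.4f}")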
  • 108. Fourier transform 105 Distributions The Fourier transforms in this table may be found in (Erdélyi 1954) or the appendix of (Kammler 2000). Function Fourier transform Fourier transform Fourier transform Remarks unitary, ordinary frequency unitary, angular frequency non-unitary, angular frequency 301 The distribution δ(ξ) denotes the Dirac delta function. 302 Dual of rule 301. 303 This follows from 103 and 301. 304 This follows from rules 101 and 303 using Euler's formula: 305 This follows from 101 and 303 using 306 307 308 Here, n is a natural number and is the n-th distribution derivative of the Dirac delta function. This rule follows from rules 107 and 301. Combining this rule with 101, we can transform all polynomials. 309 Here sgn(ξ) is the sign function. Note that 1/x is not a distribution. It is necessary to use the Cauchy principal value when testing against Schwartz functions. This rule is useful in studying the Hilbert transform. 310 1/xn is the homogeneous distribution defined by the distributional derivative
  • 109. Fourier transform 106 311 This formula is valid for 0 > α > −1. For α > 0 some singular terms arise at the origin that can be found by differentiating 318. If Re α > −1, then is a locally integrable function, and so a tempered distribution. The function is a holomorphic function from the right half-plane to the space of tempered distributions. It admits a unique meromorphic extension to a tempered distribution, also denoted for α ≠ −2, −4, ... (See homogeneous distribution.) 312 The dual of rule 309. This time the Fourier transforms need to be considered as Cauchy principal value. 313 The function u(x) is the Heaviside unit step function; this follows from rules 101, 301, and 312. 314 This function is known as the Dirac comb function. This result can be derived from 302 and 102, together with the fact that as distributions. 315 The function J0(x) is the zeroth order Bessel function of first kind. 316 This is a generalization of 315. The function Jn(x) is the n-th order Bessel function of first kind. The function Tn(x) is the Chebyshev polynomial of the first kind. 317 is the Euler–Mascheroni constant.
  • 110. Fourier transform 107 318 This formula is valid for 1 > α > 0. Use differentiation to derive formula for higher exponents. is the Heaviside function. Two-dimensional functions Function Fourier transform Fourier transform Fourier transform unitary, ordinary frequency unitary, angular frequency non-unitary, angular frequency 400 401 402 Remarks To 400: The variables ξx, ξy, ωx, ωy, νx and νy are real numbers. The integrals are taken over the entire plane. To 401: Both functions are Gaussians, which may not have unit volume. To 402: The function is defined by circ(r)=1 0≤r≤1, and is 0 otherwise. This is the Airy distribution, and is expressed using J1 (the order 1 Bessel function of the first kind). (Stein & Weiss 1971, Thm. IV.3.3) Formulas for general n-dimensional functions Function Fourier transform Fourier transform Fourier transform unitary, ordinary frequency unitary, angular frequency non-unitary, angular frequency 500 501 502 Remarks To 501: The function χ[0,1] is the indicator function of the interval [0, 1]. The function Γ(x) is the gamma function. The function Jn/2 + δ is a Bessel function of the first kind, with order n/2 + δ. Taking n = 2 and δ = 0 produces 402. (Stein & Weiss 1971, Thm. 4.15) To 502: See Riesz potential. The formula also holds for all α ≠ −n, −n − 1, ... by analytic continuation, but then the function and its Fourier transforms need to be understood as suitably regularized tempered distributions. See homogeneous distribution.
• 111. Fourier transform 108 References
• Boashash, B., ed. (2003), Time-Frequency Signal Analysis and Processing: A Comprehensive Reference, Oxford: Elsevier Science, ISBN 0-08-044335-4.
• Bochner, S.; Chandrasekharan, K. (1949), Fourier Transforms, Princeton University Press.
• Bracewell, R. N. (2000), The Fourier Transform and Its Applications (3rd ed.), Boston: McGraw-Hill, ISBN 0-07-116043-4.
• Campbell, George; Foster, Ronald (1948), Fourier Integrals for Practical Applications, New York: D. Van Nostrand Company, Inc.
• Duoandikoetxea, Javier (2001), Fourier Analysis, American Mathematical Society, ISBN 0-8218-2172-5.
• Dym, H.; McKean, H. (1985), Fourier Series and Integrals, Academic Press, ISBN 978-0-12-226451-1.
• Erdélyi, Arthur, ed. (1954), Tables of Integral Transforms, 1, New York: McGraw-Hill.
• Fourier, J. B. Joseph (1822), Théorie Analytique de la Chaleur [1], Paris: Chez Firmin Didot, père et fils.
• Fourier, J. B. Joseph; Freeman, Alexander, translator (1878), The Analytical Theory of Heat [2], The University Press.
• Grafakos, Loukas (2004), Classical and Modern Fourier Analysis, Prentice-Hall, ISBN 0-13-035399-X.
• Hewitt, Edwin; Ross, Kenneth A. (1970), Abstract harmonic analysis. Vol. II: Structure and analysis for compact groups. Analysis on locally compact Abelian groups, Die Grundlehren der mathematischen Wissenschaften, Band 152, Berlin, New York: Springer-Verlag, MR0262773.
• Hörmander, L. (1976), Linear Partial Differential Operators, Volume 1, Springer-Verlag, ISBN 978-3-540-00662-6.
• James, J. F. (2011), A Student's Guide to Fourier Transforms (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-17683-5.
• Kaiser, Gerald (1994), A Friendly Guide to Wavelets [3], Birkhäuser, ISBN 0-8176-3711-7.
• Kammler, David (2000), A First Course in Fourier Analysis, Prentice Hall, ISBN 0-13-578782-3.
• Katznelson, Yitzhak (1976), An Introduction to Harmonic Analysis, Dover, ISBN 0-486-63331-4.
• Knapp, Anthony W. (2001), Representation Theory of Semisimple Groups: An Overview Based on Examples [4], Princeton University Press, ISBN 978-0-691-09089-4.
• Pinsky, Mark (2002), Introduction to Fourier Analysis and Wavelets [5], Brooks/Cole, ISBN 0-534-37660-6.
• Polyanin, A. D.; Manzhirov, A. V. (1998), Handbook of Integral Equations, Boca Raton: CRC Press, ISBN 0-8493-2876-4.
• Rahman, Matiur (2011), Applications of Fourier Transforms to Generalized Functions [6], WIT Press, ISBN 1845645642.
• Rudin, Walter (1987), Real and Complex Analysis (3rd ed.), Singapore: McGraw-Hill, ISBN 0-07-100276-6.
• Stein, Elias; Shakarchi, Rami (2003), Fourier Analysis: An Introduction [7], Princeton University Press, ISBN 0-691-11384-X.
• Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces [8], Princeton, N.J.: Princeton University Press, ISBN 978-0-691-08078-9.
• Taneja, H. C. (2008), "Chapter 18: Fourier integrals and Fourier transforms" [9], Advanced Engineering Mathematics, Volume 2, New Delhi, India: I. K. International Pvt Ltd, ISBN 8189866567.
• Titchmarsh, E. (1948), Introduction to the theory of Fourier integrals (2nd ed.), Oxford University: Clarendon Press (published 1986), ISBN 978-0-8284-0324-5.
• Wilson, R. G. (1995), Fourier Series and Optical Transform Techniques in Contemporary Optics, New York: Wiley, ISBN 0-471-30357-7.
• Yosida, K. (1968), Functional Analysis, Springer-Verlag, ISBN 3-540-58654-7.
  • 112. Fourier transform 109 External links • The Discrete Fourier Transformation (DFT): Definition and numerical examples [10] — A Matlab tutorial • The Fourier Transform Tutorial Site [11] (thefouriertransform.com) • Fourier Series Applet [12] (Tip: drag magnitude or phase dots up or down to change the wave form). • Stephan Bernsee's FFTlab [13] (Java Applet) • Stanford Video Course on the Fourier Transform [14] • Hazewinkel, Michiel, ed. (2001), "Fourier transform" [15], Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4 • Weisstein, Eric W., "Fourier Transform [16]" from MathWorld. • The DFT “à Pied”: Mastering The Fourier Transform in One Day [17] at The DSP Dimension • An Interactive Flash Tutorial for the Fourier Transform [18] References [1] http:/ / books. google. com/ books?id=TDQJAAAAIAAJ& pg=PA525& dq=%22c%27est-%C3%A0-dire+ qu%27on+ a+ l%27%C3%A9quation%22& hl=en& sa=X& ei=SrC7T9yKBorYiALVnc2oDg& sqi=2& ved=0CEAQ6AEwAg#v=onepage& q=%22c%27est-%C3%A0-dire%20qu%27on%20a%20l%27%C3%A9quation%22& f=false [2] http:/ / books. google. com/ books?id=-N8EAAAAYAAJ& pg=PA408& dq=%22that+ is+ to+ say,+ that+ we+ have+ the+ equation%22& hl=en& sa=X& ei=F667T-u5I4WeiALEwpHXDQ& ved=0CDgQ6AEwAA#v=onepage& q=%22that%20is%20to%20say%2C%20that%20we%20have%20the%20equation%22& f=false [3] http:/ / books. google. com/ books?id=rfRnrhJwoloC& pg=PA29& dq=%22becomes+ the+ Fourier+ %28integral%29+ transform%22& hl=en& sa=X& ei=osO7T7eFOqqliQK3goXoDQ& ved=0CDQQ6AEwAA#v=onepage& q=%22becomes%20the%20Fourier%20%28integral%29%20transform%22& f=false [4] http:/ / books. google. com/ ?id=QCcW1h835pwC [5] http:/ / books. google. com/ books?id=tlLE4KUkk1gC& pg=PA256& dq=%22The+ Fourier+ transform+ of+ the+ measure%22& hl=en& sa=X& ei=w8e7T43XJsiPiAKZztnRDQ& ved=0CEUQ6AEwAg#v=onepage& q=%22The%20Fourier%20transform%20of%20the%20measure%22& f=false [6] http:/ / books. google. com/ books?id=k_rdcKaUdr4C& pg=PA10 [7] http:/ / books. google. com/ books?id=FAOc24bTfGkC& pg=PA158& dq=%22The+ mathematical+ thrust+ of+ the+ principle%22& hl=en& sa=X& ei=Esa7T5PZIsqriQKluNjPDQ& ved=0CDQQ6AEwAA#v=onepage& q=%22The%20mathematical%20thrust%20of%20the%20principle%22& f=false [8] http:/ / books. google. com/ books?id=YUCV678MNAIC& dq=editions:xbArf-TFDSEC& source=gbs_navlinks_s [9] http:/ / books. google. com/ books?id=X-RFRHxMzvYC& pg=PA192& dq=%22The+ Fourier+ integral+ can+ be+ regarded+ as+ an+ extension+ of+ the+ concept+ of+ Fourier+ series%22& hl=en& sa=X& ei=D4rDT_vdCueQiAKF6PWeCA& ved=0CDQQ6AEwAA#v=onepage& q=%22The%20Fourier%20integral%20can%20be%20regarded%20as%20an%20extension%20of%20the%20concept%20of%20Fourier%20series%22& f=false [10] http:/ / www. nbtwiki. net/ doku. php?id=tutorial:the_discrete_fourier_transformation_dft [11] http:/ / www. thefouriertransform. com [12] http:/ / www. westga. edu/ ~jhasbun/ osp/ Fourier. htm [13] http:/ / www. dspdimension. com/ fftlab/ [14] http:/ / www. academicearth. org/ courses/ the-fourier-transform-and-its-applications [15] http:/ / www. encyclopediaofmath. org/ index. php?title=p/ f041150 [16] http:/ / mathworld. wolfram. com/ FourierTransform. html [17] http:/ / www. dspdimension. com/ admin/ dft-a-pied/ [18] http:/ / www. fourier-series. com/ f-transform/ index. html
  • 113. Spectral density 110 Spectral density In statistical signal processing, statistics, and physics, the spectrum of a time-series or signal is a positive real function of a frequency variable associated with a stationary stochastic process, or a deterministic function of time, which has dimensions of power per hertz (Hz), or energy per hertz. Intuitively, the spectrum decomposes the content of a stochastic process into different frequencies present in that process, and helps identify periodicities. More specific terms which are used are the power spectrum, spectral density, power spectral density, or energy spectral density. Explanation In physics, the signal is usually a wave, such as an electromagnetic wave, random vibration, or an acoustic wave. The spectral density of the wave, when multiplied by an appropriate factor, will give the power carried by the wave, per unit frequency, known as the power spectral density (PSD) of the signal. Power spectral density is commonly expressed in watts per hertz (W/Hz).[1] For voltage signals, it is customary to use units of V2Hz−1 for PSD, and V2sHz−1 for ESD.[2] For random vibration analysis, units of g2Hz−1 are sometimes used for acceleration spectral density.[3] Although it is not necessary to assign physical dimensions to the signal or its argument, in the following discussion the terms used will assume that the signal varies in time. Preliminary conventions on notations for time series The phrase time series has been defined as "... a collection of observations made sequentially in time."[4] But it is also used to refer to a stochastic process that would be the underlying theoretical model for the process that generated the data (and thus include consideration of all the other possible sequences of data that might have been observed, but weren't). Furthermore, time can be either continuous or discrete. There are, therefore, four different but closely related definitions and formulas for the power spectrum of a time series. If (discrete time) or (continuous time) is a stochastic process, we will refer to a possible time series of data coming from it as a sample or path or signal of the stochastic process. To avoid confusion, we will reserve the word process for a stochastic process, and use one of the words signal, or sample, to refer to a time series of data. For X any random variable, standard notations of angle brackets or E will be used for ensemble average, also known as statistical expectation, and Var for the theoretical variance. Motivating example Suppose , from to is a time series (discrete time) with zero mean. Suppose that it is a sum of a finite number of periodic components (all frequencies are positive): The variance of is, by definition, . If these data were samples taken from an electrical signal, this would be its average power (power is energy per unit time, so it is analogous to variance if energy is analogous to the amplitude squared). Now, for simplicity, suppose the signal extends infinitely in time, so we pass to the limit as . If the average power is bounded, which is almost always the case in reality, then the following limit exists and is the variance of the data.
  • 114. Spectral density 111 Again, for simplicity, we will pass to continuous time, and assume that the signal extends infinitely in time in both directions. Then these two formulas become and But obviously the root mean square of either or is , so the variance of is and that of is . Hence, the power of which comes from the component with frequency is . All these contributions add up to the power of . Then the power as a function of frequency is obviously , and its statistical cumulative distribution function will be is a step function, monotonically non-decreasing. Its jumps occur at the the frequencies of the periodic components of , and the value of each jump is the power or variance of that component. The variance is the covariance of the data with itself. If we now consider the same data but with a lag of , we can take the covariance of with , and define this to be the autocorrelation function of the signal (or data) : When it exists, it is an even function of . If the average power is bounded, then exists everywhere, is finite, and is bounded by , which is the power or variance of the data. It is elementary to show that can be decomposed into periodic components with the same periods as : This is in fact the spectral decomposition of over the different frequencies, and is obviously related to the distribution of power of over the frequencies: the amplitude of a frequency component of is its contribution to the power of the signal. Definition Energy spectral density The energy spectral density describes how the energy of a signal or a time series is distributed with frequency. Here, the term energy is used in the generalized sense of signal processing. This energy spectral density is most suitable for transients, i.e., pulse-like signals, having a finite total energy; mathematically, we require that the signal is described by a square-integrable function. In this case, the energy spectral density of the signal is the square of the magnitude of the Fourier transform of the signal where is the angular frequency, (i.e., times the ordinary frequency ) and is the Fourier transform of , and is its complex conjugate. As is always the case, the multiplicative factor of is not
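The decomposition of the variance over frequencies described above can be seen directly in a periodogram. A NumPy sketch (the sampling rate, component frequencies, amplitudes and noise level are arbitrary illustrative choices):

import numpy as np

# A zero-mean signal made of two sinusoids plus a little noise: the
# periodogram peaks at 5 Hz and 12.5 Hz, and its integral over frequency
# approximately recovers the total variance A1^2/2 + A2^2/2 + sigma^2.

rng = np.random.default_rng(0)
fs = 100.0                                        # samples per second
t = np.arange(0.0, 200.0, 1 / fs)
x = 2.0 * np.sin(2 * np.pi * 5.0 * t) + 1.0 * np.sin(2 * np.pi * 12.5 * t)
x = x + 0.1 * rng.standard_normal(t.size)

N = t.size
X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(N, d=1 / fs)
psd = np.abs(X)**2 / (fs * N)                     # one common normalization (doubled below for one-sided use)

print("sample variance        :", np.var(x))
print("2 * integral of the PSD:", 2 * np.trapz(psd, freqs))   # approximately the same
print("peak frequencies (Hz)  :", freqs[np.argsort(psd)[-2:]])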
  • 115. Spectral density 112 universally agreed upon, but rather depends on the particular normalizing constants used in the definition of the various Fourier transforms. As an example, if represents the potential (in volts) of an electrical signal propagating across a transmission line, then the units of measure for spectral density would appear as volt2×seconds2, which is per se not yet dimensionally correct for an spectral energy density in the sense of the physical sciences. However, after dividing by the characteristic impedance Z (in ohms) of the transmission line, the dimensions of would become volt2×seconds2 per ohm, which is equivalent to joules per hertz, the SI unit for spectral energy density as defined in the physical sciences. This definition generalizes in a straightforward manner to a discrete signal with an infinite number of values such as a signal sampled at discrete times : where is the discrete Fourier transform of In the mathematical sciences, the sampling interval is often set to one. It is needed, however, to keep the correct physical units and to ensure that we recover the continuous case in the limit Power spectral density The above definition of energy spectral density is most suitable for transients, i.e., pulse-like signals, for which the Fourier transforms of the signals exist. For continued signals that describe, for example, stationary physical processes, it makes more sense to define a power spectral density (PSD), which describes how the power of a signal or time series is distributed over the different frequencies, as in the simple example given previously. Here, power can be the actual physical power, or more often, for convenience with abstract signals, can be defined as the squared value of the signal. (Statisticians study the variance of a set of data, but because of the analogy with electrical signals, still refer to it as the power spectrum). The total power of the signal will be a time average since power has units of energy/time: the power of the signal will be given by , if it exists, and the power of a signal may be finite even if the energy is infinite. If the following normalized Fourier transform exists, , which is not necessarily the case, we can define the power spectral density as[5][6] Remark: Many signals of interest are not integrable and the non-normalized (=ordinary) Fourier transform of the signal does not exist. Some authors (e.g. Risken[7] ) still use the non-normalized Fourier transform in a formal way to formulate a definition of the power spectral density , where is the Dirac delta function. Such formal statements may be sometimes useful to guide the intuition, but should always be used with utmost care. Using such formal reasoning, one may already guess that for a stationary random process, the power spectral density and the autocorrelation function of this signal should be a Fourier transform pair. Provided that is absolutely integrable, which is not always true, then
  • 116. Spectral density 113 A deep theorem that was worked out by Norbert Wiener and Aleksandr Khinchin (the Wiener–Khinchin theorem) makes sense of this formula for any wide-sense stationary process under weaker hypotheses: does not need to be absolutely integrable, it only needs to exist. But the integral can no longer be interpreted as usual. The formula also makes sense if interpreted as involving distributions (in the sense of Laurent Schwartz, not in the sense of a statistical Cumulative distribution function) instead of functions. If is continuous, Bochner's theorem can be used to prove that its Fourier transform exists as a positive measure, whose distribution function is F (but not necessarily as a function and not necessarily possessing a probability density). Many authors use this equality to actually define the power spectral density.[8] The power of the signal in a given frequency band can be calculated by integrating over positive and negative frequencies, where F is the integrated spectrum whose derivative is f. More generally, similar techniques may be used to estimate a time-varying spectral density. The definition of the power spectral density generalizes in a straightforward manner to finite time-series with , such as a signal sampled at discrete times for a total measurement period . . In a real-world application, one would typically average this single-measurement PSD over several repetitions of the measurement to obtain a more accurate estimate of the theoretical PSD of the physical process underlying the individual measurements. This computed PSD is sometimes called periodogram. One can prove that this periodogram converges to the true PSD when the averaging time interval T goes to infinity (Brown & Hwang[9]) to approach the Power Spectral Density (PSD). If two signals both possess power spectra (the correct terminology), then a cross-power spectrum can be calculated by using their cross-correlation function. Properties of the power spectral density Some properties of the PSD include:[10] • the spectrum of a real valued process is symmetric: , or in other words, it is an even function • if the process is continuous and purely indeterministic, the autocovariance function can be reconstructed by using the Inverse Fourier transform • it describes the distribution of the variance over frequency. In particular, • it is a linear function of the autocovariance function If is decomposed into two functions , then where The integrated spectrum or power spectral distribution is defined as[11]
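The discrete analogue of the Wiener–Khinchin relation quoted above can be verified exactly on a finite sample: the periodogram equals the Fourier transform of the biased sample autocovariance taken over all available lags. A NumPy sketch (sample length and random seed are arbitrary):

import numpy as np

# Discrete check: the periodogram of a zero-mean sample equals the Fourier
# transform of its biased sample autocovariance over lags -(N-1)..(N-1).

rng = np.random.default_rng(1)
N = 512
x = rng.standard_normal(N)
x = x - x.mean()

acov = np.correlate(x, x, mode="full") / N        # biased autocovariance, lags -(N-1)..(N-1)
lags = np.arange(-(N - 1), N)
freqs = np.arange(N) / N                          # cycles per sample

periodogram = np.abs(np.fft.fft(x))**2 / N
via_acov = np.array([np.sum(acov * np.exp(-2j * np.pi * f * lags)) for f in freqs]).real

print("max difference:", np.max(np.abs(periodogram - via_acov)))   # near machine precision

Averaging such periodograms over segments or repeated measurements, as mentioned above, is what reduces the variance of the estimate in practice.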
  • 117. Spectral density 114 Cross-spectral density "Just as the Power Spectral Density (PSD) is the Fourier transform of the auto-covariance function we may define the Cross Spectral Density (CSD) as the Fourier transform of the cross-covariance function."[12] The PSD is a special case of the cross spectral density (CPSD) function, defined between two signals xn and yn as Estimation The goal of spectral density estimation is to estimate the spectral density of a random signal from a sequence of time samples. Depending on what is known about the signal, estimation techniques can involve parametric or non-parametric approaches, and may be based on time-domain or frequency-domain analysis. For example, a common parametric technique involves fitting the observations to an autoregressive model. A common non-parametric technique is the periodogram. The spectral density is usually estimated using Fourier transform methods, but other techniques such as Welch's method and the maximum entropy method can also be used. Properties • The spectral density of and the autocorrelation of form a Fourier transform pair (for PSD versus ESD, different definitions of autocorrelation function are used). • One of the results of Fourier analysis is Parseval's theorem which states that the area under the energy spectral density curve is equal to the area under the square of the magnitude of the signal, the total energy: The above theorem holds true in the discrete cases as well. A similar result holds for power: the area under the power spectral density curve is equal to the total signal power, which is , the autocorrelation function at zero lag. This is also (up to a constant which depends on the normalization factors chosen in the definitions employed) the variance of the data comprising the signal. Related concepts • Most "frequency" graphs really display only the spectral density. Sometimes the complete frequency spectrum is graphed in two parts, "amplitude" versus frequency (which is the spectral density) and "phase" versus frequency (which contains the rest of the information from the frequency spectrum). cannot be recovered from the spectral density part alone — the "temporal information" is lost. • The spectral centroid of a signal is the midpoint of its spectral density function, i.e. the frequency that divides the distribution into two equal parts. • The spectral edge frequency of a signal is an extension of the previous concept to any proportion instead of two equal parts. • Spectral density is a function of frequency, not a function of time. However, the spectral density of small "windows" of a longer signal may be calculated, and plotted versus time associated with the window. Such a graph is called a spectrogram. This is the basis of a number of spectral analysis techniques such as the short-time Fourier transform and wavelets. • In radiometry and colorimetry (or color science more generally), the spectral power distribution (SPD) of a light source is a measure of the power carried by each frequency or "color" in a light source. The light spectrum is usually measured at points (often 31) along the visible spectrum, in wavelength space instead of frequency space,
  • 118. Spectral density 115 which makes it not strictly a spectral density. Some spectrophotometers can measure increments as fine as one to two nanometers. Values are used to calculate other specifications and then plotted to demonstrate the spectral attributes of the source. This can be a helpful tool in analyzing the color characteristics of a particular source. Applications Electrical engineering The concept and use of the power spectrum of a signal is fundamental in electrical engineering, especially in electronic communication systems, including radio communications, radars, and related systems, plus passive [remote sensing] technology. Much effort has been expended and millions of dollars spent on developing and producing electronic instruments called "spectrum analyzers" for aiding electrical engineers and technicians in observing and measuring the power spectra of signals. The cost of a spectrum analyzer varies depending on its frequency range, its bandwidth, and its accuracy. The higher the frequency range (S-band, C-band, X-band, Ku-band, K-band, Ka-band, etc.), the more difficult the components are to make, assemble, and test and the more expensive the spectrum analyzer is. Also, the wider the bandwidth that a spectrum analyzer possesses, the more costly that it is, and the capability for more accurate measurements increases costs as well. The spectrum analyzer measures the magnitude of the short-time Fourier transform (STFT) of an input signal. If the signal being analyzed can be considered a stationary process, the STFT is a good smoothed estimate of its power spectral density. These devices work in low frequencies and with small bandwidths. Coherence See Coherence (signal processing) for use of the cross-spectral density. References [1] Gérard Maral (2003). VSAT Networks (http:/ / books. google. com/ books?id=CMx5HQ1Mr_UC& pg=PR20& dq="power+ spectral+ density"+ W/ Hz& lr=& as_brr=0& ei=VYwvSImyA4L4sQPxxJXzAg& sig=-bko0DhmJwzISN6PcHszF9E3qUE#PPR20,M1). John Wiley and Sons. ISBN 0-470-86684-5. . [2] Michael Peter Norton and Denis G. Karczub (2003). Fundamentals of Noise and Vibration Analysis for Engineers (http:/ / books. google. com/ books?id=jDeRCSqtev4C& pg=PA352& dq="power+ spectral+ density"+ "energy+ spectral+ density"& lr=& as_brr=3& ei=i3IvSLL6H4-KsgPfze13& sig=RJgA8uGocYf5d6mC6rKKS-X_2bc). Cambridge University Press. ISBN 0-521-49913-5. . [3] Alessandro Birolini (2007). Reliability Engineering (http:/ / books. google. com/ books?id=xPIW3AI9tdAC& pg=PA83& dq=acceleration-spectral-density+ g+ hz& as_brr=3& ei=q24xSpKOBZXkzASPrs39BQ). Springer. p. 83. ISBN 978-3-540-49388-4. . [4] C. Chatfield (1989). The Analysis of Time Series—An Introduction (fourth ed.). Chapman and Hall, London. p. 1. ISBN 0-412-31820-2. [5] Fred Rieke, William Bialek, and David Warland (1999). Spikes: Exploring the Neural Code (Computational Neuroscience). MIT Press. ISBN 978-0262681087. [6] Scott Millers and Donald Childers (2012). Probability and random processes. Academic Press. [7] Hannes Risken (1996). The Fokker–Planck Equation: Methods of Solution and Applications (http:/ / books. google. com/ books?id=MG2V9vTgSgEC& pg=PA30) (2nd ed.). Springer. p. 30. ISBN 9783540615309. . [8] Dennis Ward Ricker (2003). Echo Signal Processing (http:/ / books. google. com/ books?id=NF2Tmty9nugC& pg=PA23& dq="power+ spectral+ density"+ "energy+ spectral+ density"& lr=& as_brr=3& ei=HZMvSPSWFZyStwPWsfyBAw& sig=1ZZcHwxXkErvNXtAHv21ijTXoP8#PPA23,M1). Springer. ISBN 1-4020-7395-X. . 
[9] Robert Grover Brown & Patrick Y. C. Hwang (1997). Introduction to Random Signals and Applied Kalman Filtering (http://www.amazon.com/dp/0471128392). John Wiley & Sons. ISBN 0-471-12839-2. [10] von Storch, H.; Zwiers, F. W. (2001). Statistical Analysis in Climate Research. Cambridge University Press. ISBN 0-521-01230-9. [11] Wilbur B. Davenport and William L. Root (1987). An Introduction to the Theory of Random Signals and Noise. IEEE Press, New York. ISBN 0-87942-235-1. [12] http://www.fil.ion.ucl.ac.uk/~wpenny/course/course.html, chapter 7.
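To make the estimation discussion above concrete, here is a minimal Python sketch comparing a raw single-measurement periodogram with Welch's averaged estimate on a synthetic signal. NumPy and SciPy, the sampling rate and the test signal are assumptions of this illustration, not anything specified in the article.

```python
import numpy as np
from scipy.signal import periodogram, welch

# Synthetic signal: a 50 Hz sinusoid buried in white noise (illustrative values).
fs = 1000.0                       # assumed sampling frequency in Hz
t = np.arange(0, 10.0, 1.0 / fs)  # 10 s of data
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 50.0 * t) + rng.normal(scale=1.0, size=t.size)

# Raw periodogram: a noisy, single-measurement PSD estimate.
f_raw, psd_raw = periodogram(x, fs=fs)

# Welch's method: average the periodograms of overlapping segments,
# trading frequency resolution for a lower-variance estimate.
f_w, psd_w = welch(x, fs=fs, nperseg=1024)

# Parseval-style sanity check: integrating the one-sided PSD over frequency
# should roughly recover the signal's variance (its total power).
print("variance of x:       ", np.var(x))
print("integrated Welch PSD:", np.trapz(psd_w, f_w))
```

Averaging over segments, or over repeated measurements as described above, is what turns the erratic single periodogram into a usable estimate of the theoretical PSD.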
  • 119. Signal processing 116 Signal processing Signal processing is an area of systems engineering, electrical engineering and applied mathematics that deals with operations on or analysis of signals, or measurements of time-varying or spatially varying physical quantities. Signals of interest can include sound, images, and sensor Signal transmission using electronic signal processing. Transducers convert signals from other physical waveforms to electrical current or voltage waveforms, which then are data, for example biological data such processed, transmitted as electromagnetic waves, received and converted by another as electrocardiograms, control system transducer to final form. signals, telecommunication transmission signals, and many others. Typical operations and applications The goals of signal processing can roughly be divided into the following categories. • Signal acquisition and reconstruction, which involves measuring a physical signal, storing it, and possibly later rebuilding the original signal or an approximation thereof. For digital systems, this typically includes sampling and quantization. • Quality improvement, such as noise reduction, image enhancement, and echo cancellation. • Signal compression, including audio compression, image compression, and video compression. • Feature extraction, such as image understanding and speech recognition. In communication systems, signal processing may occur at OSI layer 1, the Physical Layer (modulation, equalization, multiplexing, etc.) in the seven layer OSI model, as well as at OSI layer 6, the Presentation Layer (source coding, including analog-to-digital conversion and data compression). History According to Alan V. Oppenheim and Ronald W. Schafer, the principles of signal processing can be found in the classical numerical analysis techniques of the 17th century. They further state that the "digitalization" or digital refinement of these techniques can be found in the digital control systems of the 1940s and 1950s.[1] Mathematical methods applied in signal processing • Linear signals and systems, and transform theory • System identification and classification • Calculus • Differential equations • Vector spaces and Linear algebra • Functional analysis • Probability and stochastic processes • Detection theory • Estimation theory • Optimization • Programming
  • 120. Signal processing 117 • Numerical methods • Iterative methods Categories of signal processing Analog signal processing Analog signal processing is for signals that have not been digitized, as in legacy radio, telephone, radar, and television systems. This involves linear electronic circuits such as passive filters, active filters, additive mixers, integrators and delay lines. It also involves non-linear circuits such as compandors, multiplicators (frequency mixers and voltage-controlled amplifiers), voltage-controlled filters, voltage-controlled oscillators and phase-locked loops. Discrete time signal processing Discrete time signal processing is for sampled signals that are considered as defined only at discrete points in time, and as such are quantized in time, but not in magnitude. Analog discrete-time signal processing is a technology based on electronic devices such as sample and hold circuits, analog time-division multiplexers, analog delay lines and analog feedback shift registers. This technology was a predecessor of digital signal processing (see below), and is still used in advanced processing of gigahertz signals. The concept of discrete-time signal processing also refers to a theoretical discipline that establishes a mathematical basis for digital signal processing, without taking quantization error into consideration. Digital signal processing Digital signal processing is the processing of digitised discrete time sampled signals. Processing is done by general-purpose computers or by digital circuits such as ASICs, field-programmable gate arrays or specialized digital signal processors (DSP chips). Typical arithmetical operations include fixed-point and floating-point, real-valued and complex-valued, multiplication and addition. Other typical operations supported by the hardware are circular buffers and look-up tables. Examples of algorithms are the Fast Fourier transform (FFT), finite impulse response (FIR) filter, Infinite impulse response (IIR) filter, and adaptive filters such as the Wiener and Kalman filters. Fields of signal processing • Statistical signal processing – analyzing and extracting information from signals and noise based on their stochastic properties • Audio signal processing – for electrical signals representing sound, such as speech or music • Speech signal processing – for processing and interpreting spoken words • Image processing – in digital cameras, computers and various imaging systems • Video processing – for interpreting moving pictures • Array processing – for processing signals from arrays of sensors • Time-frequency analysis – for processing non-stationary signals[2] • Filtering – used in many fields to process signals • Seismic signal processing • Data mining • Financial signal processing
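The digital signal processing operations listed above (FFT analysis, FIR filtering) can be illustrated with a short Python/SciPy sketch; the filter length, cutoff frequency and test signal are arbitrary choices made for this example only.

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 500.0                             # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)
# Test signal: a 5 Hz component plus 60 Hz interference (illustrative).
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

# Design a 101-tap FIR low-pass filter with a 20 Hz cutoff and apply it.
taps = firwin(numtaps=101, cutoff=20.0, fs=fs)
y = lfilter(taps, 1.0, x)

# FFT of input and output confirms the 60 Hz component is attenuated.
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
X, Y = np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(y))
for f0 in (5.0, 60.0):
    i = np.argmin(np.abs(freqs - f0))
    print(f"{f0:>4} Hz: input magnitude {X[i]:8.1f}, filtered magnitude {Y[i]:8.1f}")
```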
  • 121. Signal processing 118 Notes and references [1] Oppenheim, Alan V.; Schafer, Ronald W. (1975). Digital Signal Processing. Prentice Hall. p. 5. ISBN 0-13-214635-5. [2] Boashash, Boualem, ed. (2003). Time frequency signal analysis and processing a comprehensive reference (1 ed.). Amsterdam: Elsevier. ISBN 0-08-044335-4. External links • Signal Processing for Communications (http://www.sp4comm.org/) — free online textbook by Paolo Prandoni and Martin Vetterli (2008) • Scientists and Engineers Guide to Digital Signal Processing (http://www.dspguide.com) — free online textbook by Stephen Smith Autoregressive conditional heteroskedasticity In econometrics, AutoRegressive Conditional Heteroskedasticity (ARCH) models are used to characterize and model observed time series. They are used whenever there is reason to believe that, at any point in a series, the terms will have a characteristic size, or variance. In particular ARCH models assume the variance of the current error term or innovation to be a function of the actual sizes of the previous time periods' error terms: often the variance is related to the squares of the previous innovations. Such models are often called ARCH models (Engle, 1982),[1] although a variety of other acronyms are applied to particular structures of model which have a similar basis. ARCH models are employed commonly in modeling financial time series that exhibit time-varying volatility clustering, i.e. periods of swings followed by periods of relative calm. ARCH(q) model Specification Suppose one wishes to model a time series using an ARCH process. Let denote the error terms (return residuals, with respect to a mean process) i.e. the series terms. These are split into a stochastic piece and a time-dependent standard deviation characterizing the typical size of the terms so that The random variable is a strong White noise process. The series is modelled by where and . An ARCH(q) model can be estimated using ordinary least squares. A methodology to test for the lag length of ARCH errors using the Lagrange multiplier test was proposed by Engle (1982). This procedure is as follows: 1. Estimate the best fitting autoregressive model AR(q) . 2. Obtain the squares of the error and regress them on a constant and q lagged values: where q is the length of ARCH lags. 3. The null hypothesis is that, in the absence of ARCH components, we have for all . The alternative hypothesis is that, in the presence of ARCH components, at least one of the estimated coefficients
  • 122. Autoregressive conditional heteroskedasticity 119 must be significant. In a sample of T residuals under the null hypothesis of no ARCH errors, the test statistic TR² follows distribution with q degrees of freedom. If TR² is greater than the Chi-square table value, we reject the null hypothesis and conclude there is an ARCH effect in the ARMA model. If TR² is smaller than the Chi-square table value, we do not reject the null hypothesis. GARCH If an autoregressive moving average model (ARMA model) is assumed for the error variance, the model is a generalized autoregressive conditional heteroskedasticity (GARCH, Bollerslev(1986)) model. In that case, the GARCH(p, q) model (where p is the order of the GARCH terms and q is the order of the ARCH terms ) is given by Generally, when testing for heteroskedasticity in econometric models, the best test is the White test. However, when dealing with time series data, this means to test for ARCH errors (as described above) and GARCH errors (below). Prior to GARCH there was EWMA which has now been superseded by GARCH, although some people utilise both. GARCH(p, q) model specification The lag length p of a GARCH(p, q) process is established in three steps: 1. Estimate the best fitting AR(q) model . 2. Compute and plot the autocorrelations of by 3. The asymptotic, that is for large samples, standard deviation of is . Individual values that are larger than this indicate GARCH errors. To estimate the total number of lags, use the Ljung-Box test until the value of these are less than, say, 10% significant. The Ljung-Box Q-statistic follows distribution with n degrees of freedom if the squared residuals are uncorrelated. It is recommended to consider up to T/4 values of n. The null hypothesis states that there are no ARCH or GARCH errors. Rejecting the null thus means that there are existing such errors in the conditional variance. NGARCH Nonlinear GARCH (NGARCH) also known as Nonlinear Asymmetric GARCH(1,1) (NAGARCH) was introduced by Engle and Ng in 1993. . For stock returns, parameter is usually estimated to be positive; in this case, it reflects the leverage effect, signifying that negative returns increase future volatility by a larger amount than positive returns of the same magnitude.[2][3] This model shouldn't be confused with the NARCH model, together with the NGARCH extension, introduced by Higgins and Bera in 1992.
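The testing and estimation steps described for the basic ARCH and GARCH models above can be sketched in Python. The code below simulates a GARCH(1,1) series, performs Engle's Lagrange-multiplier regression of squared residuals on their own lags (the TR² statistic), and fits a GARCH(1,1) model with the third-party arch package; the package choice, the parameter values and the lag length q = 5 are all assumptions of this example.

```python
import numpy as np
from scipy import stats
from arch import arch_model

# Simulate a GARCH(1,1) process: sigma_t^2 = w + a*eps_{t-1}^2 + b*sigma_{t-1}^2.
rng = np.random.default_rng(42)
n, w, a, b = 2000, 0.1, 0.10, 0.85           # illustrative parameter values
eps, sigma2 = np.empty(n), np.empty(n)
sigma2[0] = w / (1 - a - b)                  # unconditional variance
eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, n):
    sigma2[t] = w + a * eps[t - 1] ** 2 + b * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# Engle's LM test by hand: regress eps_t^2 on a constant and q lagged squares.
q = 5
e2 = eps ** 2
Y = e2[q:]
X = np.column_stack([np.ones(n - q)] + [e2[q - i:n - i] for i in range(1, q + 1)])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta
r2 = 1.0 - resid @ resid / ((Y - Y.mean()) @ (Y - Y.mean()))
lm = (n - q) * r2                            # T * R^2, chi-square with q df under H0
print(f"LM statistic = {lm:.1f}, p-value = {1 - stats.chi2.cdf(lm, df=q):.4f}")

# Fit a GARCH(1,1) model by quasi-maximum likelihood.
res = arch_model(eps, mean="Zero", vol="Garch", p=1, q=1).fit(disp="off")
print(res.params)
```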
  • 123. Autoregressive conditional heteroskedasticity 120 IGARCH Integrated Generalized Autoregressive Conditional Heteroskedasticity IGARCH is a restricted version of the GARCH model, where the persistent parameters sum up to one, and therefore there is a unit root in the GARCH process. The condition for this is . EGARCH The exponential general autoregressive conditional heteroskedastic (EGARCH) model by Nelson (1991) is another form of the GARCH model. Formally, an EGARCH(p,q): where , is the conditional variance, , , , and are coefficients, and may be a standard normal variable or come from a generalized error distribution. The formulation for allows the sign and the magnitude of to have separate effects on the volatility. This is [4] particularly useful in an asset pricing context. Since may be negative there are no (fewer) restrictions on the parameters. GARCH-M The GARCH-in-mean (GARCH-M) model adds a heteroskedasticity term into the mean equation. It has the specification: The residual is defined as QGARCH The Quadratic GARCH (QGARCH) model by Sentana (1995) is used to model symmetric effects of positive and negative shocks. In the example of a GARCH(1,1) model, the residual process is where is i.i.d. and
  • 124. Autoregressive conditional heteroskedasticity 121 GJR-GARCH Similar to QGARCH, The Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model by Glosten, Jagannathan and Runkle (1993) also models asymmetry in the ARCH process. The suggestion is to model where is i.i.d., and where if , and if . TGARCH model The Threshold GARCH (TGARCH) model by Zakoian (1994) is similar to GJR GARCH, and the specification is one on conditional standard deviation instead of conditional variance: where if , and if . Likewise, if , and if . fGARCH Hentschel's fGARCH model,[5] also known as Family GARCH, is an omnibus model that nests a variety of other popular symmetric and asymmetric GARCH models including APARCH, GJR, AVGARCH, NGARCH, etc. References [1] Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation (http:/ / www. jstor. org/ stable/ 10. 2307/ 1912773) Robert F. Engle, Econometrica , Vol. 50, No. 4 (Jul., 1982), pp. 987-1007. Published by: The Econometric Society Stable URL: http:/ / www. jstor. org/ stable/ 1912773 [2] Engle, R.F.; Ng, V.K. (1991). "Measuring and testing the impact of news on volatility" (http:/ / papers. ssrn. com/ sol3/ papers. cfm?abstract_id=262096). Journal of Finance 48 (5): 1749–1778. . [3] Posedel, Petra (2006). "Analysis Of The Exchange Rate And Pricing Foreign Currency Options On The Croatian Market: The Ngarch Model As An Alternative To The Black Scholes Model" (http:/ / www. ijf. hr/ eng/ FTP/ 2006/ 4/ posedel. pdf). Financial Theory and Practice 30 (4): 347–368. . [4] St. Pierre, Eilleen F (1998): Estimating EGARCH-M Models: Science or Art, The Quarterly Review of Economics and Finance, Vol. 38, No. 2, pp. 167-180 (http:/ / dx. doi. org/ 10. 1016/ S1062-9769(99)80110-0) [5] Hentschel, Ludger (1995). All in the family Nesting symmetric and asymmetric GARCH models (http:/ / www. personal. anderson. ucla. edu/ rossen. valkanov/ hentschel_1995. pdf), Journal of Financial Economics, Volume 39, Issue 1, Pages 71-104 • Bollerslev, Tim (1986). "Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics, 31:307-327 • Bollerslev, Tim (2008). Glossary to ARCH (GARCH) (ftp://ftp.econ.au.dk/creates/rp/08/rp08_49.pdf), working paper • Enders, W. (1995). Applied Econometrics Time Series, John-Wiley & Sons, 139-149, ISBN 0-471-11163-5 • Engle, Robert F. (1982). "Autoregressive Conditional Heteroscedasticity with Estimates of Variance of United Kingdom Inflation", Econometrica 50:987-1008. (the paper which sparked the general interest in ARCH models) • Engle, Robert F. (2001). "GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics", Journal of Economic Perspectives 15(4):157-168. (a short, readable introduction) Preprint (http://pages.stern.nyu.edu/ ~rengle/Garch101.doc) • Engle, R.F. (1995) ARCH: selected readings. Oxford University Press. ISBN 0-19-877432-X • Gujarati, D. N. (2003) Basic Econometrics, 856-862 • Hacker, R. S. and Hatemi-J, A. (2005). A Test for Multivariate ARCH Effects (http://ideas.repec.org/a/taf/ apeclt/v12y2005i7p411-417.html), Applied Economics Letters, 12(7), 411–417.
  • 125. Autoregressive conditional heteroskedasticity 122 • Nelson, D. B. (1991). "Conditional heteroskedasticity in asset returns: A new approach", Econometrica 59: 347-370. Autoregressive integrated moving average In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. These models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting). They are applied in some cases where data show evidence of non-stationarity, where an initial differencing step (corresponding to the "integrated" part of the model) can be applied to remove the non-stationarity. The model is generally referred to as an ARIMA(p,d,q) model where p, d, and q are non-negative integers that refer to the order of the autoregressive, integrated, and moving average parts of the model respectively. ARIMA models form an important part of the Box-Jenkins approach to time-series modelling. When one of the three terms is zero, it's usual to drop "AR", "I" or "MA". For example, ARIMA(0,1,0) is I(1), and ARIMA(0,0,1) is MA(1). Definition Given a time series of data where is an integer index and the are real numbers, then an ARMA(p,q) model is given by: where is the lag operator, the are the parameters of the autoregressive part of the model, the are the parameters of the moving average part and the are error terms. The error terms are generally assumed to be independent, identically distributed variables sampled from a normal distribution with zero mean. Assume now that the polynomial has a unitary root of multiplicity d. Then it can be rewritten as: An ARIMA(p,d,q) process expresses this polynomial factorisation property, and is given by: and thus can be thought as a particular case of an ARMA(p+d,q) process having the auto-regressive polynomial with some roots in the unity. For this reason every ARIMA model with d>0 is not wide sense stationary. Other special forms The explicit identification of the factorisation of the autoregression polynomial into factors as above, can be extended to other cases, firstly to apply to the moving average polynomial and secondly to include other special factors. For example, having a factor in a model is one way of including a non-stationary seasonality of period s into the model. Another example is the factor , which includes a (non-stationary) seasonality of period 12. The effect of the first type of factor is to allow each season's value to drift separately over time, whereas with the second type values for adjacent seasons move together.
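A minimal Python counterpart to the ARIMA(p,d,q) definition above, using the statsmodels package (an assumption of this sketch, distinct from the R and SAS implementations discussed later in this article): a stationary ARMA(1,1) series is simulated, integrated once, and an ARIMA(1,1,1) model is fitted to the cumulative series.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

# Simulate a stationary ARMA(1,1) series: (1 - 0.6 L) x_t = (1 + 0.3 L) e_t.
np.random.seed(0)
arma = ArmaProcess(ar=[1, -0.6], ma=[1, 0.3])
x = arma.generate_sample(nsample=500)

# Integrate once, so the cumulative series is ARIMA(1,1,1) and non-stationary.
y = np.cumsum(x)

# Fitting with d = 1 differences the series internally before estimating the ARMA terms.
fit = ARIMA(y, order=(1, 1, 1)).fit()
print(fit.summary())
print(fit.forecast(steps=10))   # forecasts of the non-stationary series
```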
  • 126. Autoregressive integrated moving average 123 Identification and specification of appropriate factors in an ARIMA model can be an important step in modelling as it can allow a reduction in the overall number of parameters to be estimated, while allowing the imposition on the model of types of behaviour that logic and experience suggest should be there. Forecasts using ARIMA models ARIMA models are used for observable non-stationary processes that have some clearly identifiable trends: • a constant trend (i.e. zero average) is modeled by • a linear trend (i.e. linear growth behavior) is modeled by • a quadratic trend (i.e. quadratic growth behavior) is modeled by In these cases the ARIMA model can be viewed as a "cascade" of two models. The first is non-stationary: while the second is wide-sense stationary: Now standard forecasts techniques can be formulated for the process , and then (having the sufficient number of initial conditions) can be forecast via opportune integration steps. Examples Some well-known special cases arise naturally. For example, an ARIMA(0,1,0) model is given by: which is simply a random walk. A number of variations on the ARIMA model are commonly used. For example, if multiple time series are used then the can be thought of as vectors and a VARIMA model may be appropriate. Sometimes a seasonal effect is suspected in the model. For example, consider a model of daily road traffic volumes. Weekends clearly exhibit different behaviour from weekdays. In this case it is often considered better to use a SARIMA (seasonal ARIMA) model than to increase the order of the AR or MA parts of the model. If the time-series is suspected to exhibit long-range dependence then the parameter may be replaced by certain non-integer values in an autoregressive fractionally integrated moving average model, which is also called a Fractional ARIMA (FARIMA or ARFIMA) model. Implementations in statistics packages Various packages that apply methodology like Box-Jenkins parameter optimization are available to find the right parameters for the ARIMA model. • In R, the standard stats package includes an arima function, is documented in "ARIMA Modelling of Time Series" [6]. Besides the ARIMA(p,d,q) part, the function also includes seasonal factors, an intercept term, and exogenous variables (xreg, called "external regressors"). The CRAN task view on Time Series [9] is the reference with many more links. • The "forecast" [1] package in R can automatically select an ARIMA model for a given time series with the auto.arima() function. The package can also simulate seasonal and non-seasonal ARIMA models with its simulate.Arima() function. It also has a function Arima(), which is a wrapper for the arima from the "stats" package. • SAS(R) of "SAS Institute Inc." [2] includes extensive ARIMA processing in its Econometric and Time Series Analysis system: SAS/ETS.
  • 127. Autoregressive integrated moving average 124 • Stata includes ARIMA modelling (using its arima command) as of Stata 9. • SAP(R) "SAP" [3] allows creating models like ARIMA by using native, predictive algorithms and by employing algorithms from R. References • Mills, Terence C. (1990) Time Series Techniques for Economists. Cambridge University Press • Percival, Donald B. and Andrew T. Walden. (1993) Spectral Analysis for Physical Applications. Cambridge University Press. External links • The US Census Bureau uses ARIMA for "seasonally adjusted" data (programs, docs, and papers here) [4] References [1] http:/ / cran. r-project. org/ web/ packages/ forecast/ index. html [2] http:/ / www. sas. com [3] http:/ / www. sap. com/ solutions/ analytics/ business-intelligence/ predictive-analysis/ index. epx [4] http:/ / www. census. gov/ srd/ www/ x12a/ Volatility (finance) In finance, volatility is a measure for variation of price of a financial instrument over time. Historic volatility is derived from time series of past market prices. An implied volatility is derived from the market price of a market traded derivative (in particular an option). The symbol σ is used for volatility, and corresponds to standard deviation, which should not be confused with the similarly named variance, which is instead the square, σ2. Volatility terminology Volatility as described here refers to the actual current volatility of a financial instrument for a specified period (for example 30 days or 90 days). It is the volatility of a financial instrument based on historical prices over the specified period with the last observation the most recent price. This phrase is used particularly when it is wished to distinguish between the actual current volatility of an instrument and • actual historical volatility which refers to the volatility of a financial instrument over a specified period but with the last observation on a date in the past • actual future volatility which refers to the volatility of a financial instrument over a specified period starting at the current time and ending at a future date (normally the expiry date of an option) • historical implied volatility which refers to the implied volatility observed from historical prices of the financial instrument (normally options) • current implied volatility which refers to the implied volatility observed from current prices of the financial instrument • future implied volatility which refers to the implied volatility observed from future prices of the financial instrument For a financial instrument whose price follows a Gaussian random walk, or Wiener process, the width of the distribution increases as time increases. This is because there is an increasing probability that the instrument's price will be farther away from the initial price as time increases. However, rather than increase linearly, the volatility increases with the square-root of time as time increases, because some fluctuations are expected to cancel each other out, so the most likely deviation after twice the time will not be twice the distance from zero.
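The square-root-of-time behaviour described above is easy to verify by simulation; the following Python sketch uses an invented 1% daily standard deviation and a 252-step horizon purely for illustration.

```python
import numpy as np

# For a Gaussian random walk, the standard deviation of the cumulative change
# grows like the square root of elapsed time, not linearly in time.
rng = np.random.default_rng(1)
n_paths, n_steps, daily_sigma = 20_000, 252, 0.01   # assumed values
steps = rng.normal(scale=daily_sigma, size=(n_paths, n_steps))
cum = np.cumsum(steps, axis=1)

for k in (1, 4, 16, 64, 252):
    observed = cum[:, k - 1].std()
    print(f"after {k:3d} steps: observed {observed:.4f}, sqrt-of-time rule {daily_sigma * np.sqrt(k):.4f}")
```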
  • 128. Volatility (finance) 125 Since observed price changes do not follow Gaussian distributions, others such as the Lévy distribution are often used.[1] These can capture attributes such as "fat tails". Volatility and Liquidity Much research has been devoted to modeling and forecasting the volatility of financial returns, and yet few theoretical models explain how volatility comes to exist in the first place. Roll (1984) shows that volatility is affected by market microstructure.[2] Glosten and Milgrom (1985) shows that at least one source of volatility can be explained by the liquidity provision process. When market makers infer the possibility of adverse selection, they adjust their trading ranges, which in turn increases the band of price oscillation.[3] Volatility for investors Investors care about volatility for five reasons:- 1) The wider the swings in an investment's price, the harder emotionally it is to not worry; 2) When certain cash flows from selling a security are needed at a specific future date, higher volatility means a greater chance of a shortfall; 3) Higher volatility of returns while saving for retirement results in a wider distribution of possible final portfolio values; 4) Higher volatility of return when retired gives withdrawals a larger permanent impact on the portfolio's value; 5) Price volatility presents opportunities to buy assets cheaply and sell when overpriced.[4] In today's markets, it is also possible to trade volatility directly, through the use of derivative securities such as options and variance swaps. See Volatility arbitrage. Volatility versus direction Volatility does not measure the direction of price changes, merely their dispersion. This is because when calculating standard deviation (or variance), all differences are squared, so that negative and positive differences are combined into one quantity. Two instruments with different volatilities may have the same expected return, but the instrument with higher volatility will have larger swings in values over a given period of time. For example, a lower volatility stock may have an expected (average) return of 7%, with annual volatility of 5%. This would indicate returns from approximately negative 3% to positive 17% most of the time (19 times out of 20, or 95% via a two standard deviation rule). A higher volatility stock, with the same expected return of 7% but with annual volatility of 20%, would indicate returns from approximately negative 33% to positive 47% most of the time (19 times out of 20, or 95%). These estimates assume a normal distribution; in reality stocks are found to be leptokurtotic. Volatility over time Although the Black Scholes equation assumes predictable constant volatility, this is not observed in real markets, and amongst the models are Bruno Dupire's Local Volatility, Poisson Process where volatility jumps to new levels with a predictable frequency, and the increasingly popular Heston model of Stochastic Volatility.[5] It is common knowledge that types of assets experience periods of high and low volatility. That is, during some periods prices go up and down quickly, while during other times they barely move at all. Periods when prices fall quickly (a crash) are often followed by prices going down even more, or going up by an unusual amount. Also, a time when prices rise quickly (a possible bubble) may often be followed by prices going up
  • 129. Volatility (finance) 126 even more, or going down by an unusual amount. The converse behavior, 'doldrums', can last for a long time as well. Most typically, extreme movements do not appear 'out of nowhere'; they are presaged by larger movements than usual. This is termed autoregressive conditional heteroskedasticity. Of course, whether such large movements have the same direction, or the opposite, is more difficult to say. And an increase in volatility does not always presage a further increase—the volatility may simply go back down again. Mathematical definition The annualized volatility σ is the standard deviation of the instrument's yearly logarithmic returns.[6] The generalized volatility σT for time horizon T in years is expressed as: Therefore, if the daily logarithmic returns of a stock have a standard deviation of σSD and the time period of returns is P, the annualized volatility is A common assumption is that P = 1/252 (there are 252 trading days in any given year). Then, if σSD = 0.01 the annualized volatility is The monthly volatility (i.e., T = 1/12 of a year) would be The formula used above to convert returns or volatility measures from one time period to another assume a particular underlying model or process. These formulas are accurate extrapolations of a random walk, or Wiener process, whose steps have finite variance. However, more generally, for natural stochastic processes, the precise relationship between volatility measures for different time periods is more complicated. Some use the Lévy stability exponent α to extrapolate natural processes: If α = 2 you get the Wiener process scaling relation, but some people believe α < 2 for financial activities such as stocks, indexes and so on. This was discovered by Benoît Mandelbrot, who looked at cotton prices and found that they followed a Lévy alpha-stable distribution with α = 1.7. (See New Scientist, 19 April 1997.) Crude volatility estimation Using a simplification of the formulae above it is possible to estimate annualized volatility based solely on approximate observations. Suppose you notice that a market price index, which has a current value near 10,000, has moved about 100 points a day, on average, for many days. This would constitute a 1% daily movement, up or down. To annualize this, you can use the "rule of 16", that is, multiply by 16 to get 16% as the annual volatility. The rationale for this is that 16 is the square root of 256, which is approximately the number of trading days in a year (252). This also uses the fact that the standard deviation of the sum of n independent variables (with equal standard deviations) is √n times the standard deviation of the individual variables. Of course, the average magnitude of the observations is merely an approximation of the standard deviation of the market index. Assuming that the market index daily changes are normally distributed with mean zero and standard deviation σ, the expected value of the magnitude of the observations is √(2/π)σ = 0.798σ. The net effect is that this
  • 130. Volatility (finance) 127 crude approach underestimates the true volatility by about 20%. Estimate of compound annual growth rate (CAGR) Consider the Taylor series: Taking only the first two terms one has: Realistically, most financial assets have negative skewness and leptokurtosis, so this formula tends to be over-optimistic. Some people use the formula: for a rough estimate, where k is an empirical factor (typically five to ten). Criticisms Despite their sophisticated composition, critics claim the predictive power of most volatility forecasting models is similar to that of plain-vanilla measures, such as simple past volatility.[7][8] Other works have agreed, but claim critics failed to correctly implement the more [9] complicated models. Some practitioners and portfolio managers Performance of VIX (left) compared to past volatility (right) as 30-day volatility seem to completely ignore or dismiss predictors, for the period of Jan 1990-Sep 2009. Volatility is measured as the standard volatility forecasting models. For deviation of S&P500 one-day returns over a month's period. The blue lines indicate linear regressions, resulting in the correlation coefficients r shown. Note that VIX has virtually example, Nassim Taleb famously titled the same predictive power as past volatility, insofar as the shown correlation coefficients one of his Journal of Portfolio are nearly identical. Management papers "We Don't Quite Know What We are Talking About When We Talk About Volatility".[10] In a similar note, Emanuel Derman expressed his disillusion with the enormous supply of empirical models unsupported by theory.[11] He argues that, while "theories are attempts to uncover the hidden principles underpinning the world around us, as Albert Einstein did with his theory of relativity", we should remember that "models are metaphors -- analogies that describe one thing relative to another".
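The annualization arithmetic and the "rule of 16" described above can be condensed into a few lines of Python; the synthetic daily returns, the 252-day convention and the two-term growth approximation (arithmetic mean return minus half the variance) are stated here as assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic daily log returns with roughly 1% daily standard deviation.
daily_returns = rng.normal(loc=0.0003, scale=0.01, size=252)

sigma_daily = daily_returns.std(ddof=1)
sigma_annual = sigma_daily * np.sqrt(252)        # annualized volatility (P = 1/252)
sigma_monthly = sigma_daily * np.sqrt(252 / 12)  # monthly volatility (T = 1/12)

print(f"daily volatility      : {sigma_daily:.4%}")
print(f"annualized volatility : {sigma_annual:.4%}")   # roughly 16 x daily ("rule of 16")
print(f"monthly volatility    : {sigma_monthly:.4%}")

# Crude compound-growth estimate: arithmetic mean return minus half the variance,
# per day, then annualized (a common two-term approximation).
growth_approx = (daily_returns.mean() - 0.5 * sigma_daily**2) * 252
print(f"approximate annual log growth: {growth_approx:.4%}")
```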
  • 131. Volatility (finance) 128 References [1] http:/ / www. wilmottwiki. com/ wiki/ index. php/ Levy_distribution [2] Roll, R. (1984): "A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market", Journal of Finance 39 (4), 1127-1139 [3] Glosten, L. R. and P. R. Milgrom (1985): "Bid, Ask and Transaction Prices in a Specialist Market with Heterogeneously Informed Traders", Journal of Financial Economics 14 (1), 71-100 [4] Investment Risks - Price Volatility (http:/ / www. retailinvestor. org/ risk. html#volatility) [5] http:/ / www. wilmottwiki. com/ wiki/ index. php?title=Volatility [6] "Calculating Historical Volatility: Step-by-Step Example" (http:/ / www. lfrankcabrera. com/ calc-hist-vol. pdf). 2011-07-14. . Retrieved 2011-07-15. [7] Cumby, R.; Figlewski, S.; Hasbrouck, J. (1993). "Forecasting Volatility and Correlations with EGARCH models". Journal of Derivatives 1 (2): 51–63. doi:10.3905/jod.1993.407877. [8] Jorion, P. (1995). "Predicting Volatility in Foreign Exchange Market". Journal of Finance 50 (2): 507–528. JSTOR 2329417. [9] Andersen, Torben G.; Bollerslev, Tim (1998). "Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts". International Economic Review 39 (4): 885–905. JSTOR 2527343. [10] Goldstein, Daniel and Taleb, Nassim, (March 28, 2007) "We Don't Quite Know What We are Talking About When We Talk About Volatility" (http:/ / papers. ssrn. com/ sol3/ papers. cfm?abstract_id=970480). Journal of Portfolio Management 33 (4), 2007. [11] Derman, Emanuel (2011): Models.Behaving.Badly: Why Confusing Illusion With Reality Can Lead to Disaster, on Wall Street and in Life”, Ed. Free Press. External links • Graphical Comparison of Implied and Historical Volatility (http://training.thomsonreuters.com/video/v. php?v=273), video • An introduction to volatility and how it can be calculated in excel, by Dr A. A. Kotzé (http://quantonline.co.za/ Articles/article_volatility.htm) • Interactive Java Applet " What is Historic Volatility? (http://www.frog-numerics.com/ifs/ifs_LevelA/ HistVolaBasic.html)" • Diebold, Francis X.; Hickman, Andrew; Inoue, Atsushi & Schuermannm, Til (1996) "Converting 1-Day Volatility to h-Day Volatility: Scaling by sqrt(h) is Worse than You Think" (http://citeseer.ist.psu.edu/244698. html) • A short introduction to alternative mathematical concepts of volatility (http://staff.science.uva.nl/~marvisse/ volatility.html) • Volatility estimation from predicted return density (http://www.macroaxis.com/invest/market/ GOOG--symbolVolatility) Example based on Google daily return distribution using standard density function • Research paper including excerpt from report entitled Identifying Rich and Cheap Volatility (http://www. iijournals.com/doi/abs/10.3905/JOT.2010.5.2.035) Excerpt from Enhanced Call Overwriting, a report by Ryan Renicker and Devapriya Mallick at Lehman Brothers (2005).
• 132. Stable distribution 129 Stable [Figure: probability density functions — symmetric α-stable distributions with unit scale factor; skewed centered stable distributions with unit scale factor]
• 133. Stable distribution 130 [Figure: cumulative distribution functions — CDFs for symmetric α-stable distributions; CDFs for skewed centered stable distributions]
Parameters: α ∈ (0, 2] — stability parameter; β ∈ [−1, 1] — skewness parameter (note that the distribution's skewness itself is undefined); c ∈ (0, ∞) — scale parameter; μ ∈ (−∞, ∞) — location parameter
Support: x ∈ R, or x ∈ [μ, +∞) if α < 1 and β = 1, or x ∈ (−∞, μ] if α < 1 and β = −1
PDF: not analytically expressible, except for some parameter values
CDF: not analytically expressible, except for certain parameter values
Mean: μ when α > 1, otherwise undefined
Median: μ when β = 0, otherwise not analytically expressible
Mode: μ when β = 0, otherwise not analytically expressible
Variance: 2c² when α = 2, otherwise infinite
Skewness: 0 when α = 2, otherwise undefined
Ex. kurtosis: 0 when α = 2, otherwise undefined
Entropy: not analytically expressible, except for certain parameter values
MGF: undefined
CF: exp[itμ − |ct|^α (1 − iβ sgn(t) Φ)], where Φ = tan(πα/2) for α ≠ 1 and Φ = −(2/π) log|t| for α = 1
  • 134. Stable distribution 131 In probability theory, a random variable is said to be stable (or to have a stable distribution) if it has the property that a linear combination of two independent copies of the variable has the same distribution, up to location and scale parameters. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution. The importance of stable probability distributions is that they are "attractors" for properly normed sums of independent and identically-distributed (iid) random variables. The normal distribution is one family of stable distributions. By the classical central limit theorem the properly normed sum of a set of random variables, each with finite variance, will tend towards a normal distribution as the number of variables increases. Without the finite variance assumption the limit may be a stable distribution. Stable distributions that are non-normal are often called stable Paretian distributions, after Vilfredo Pareto. Umarov, Tsallis, Gell-Mann and Steinberg have defined q-analogs of all symmetric stable distributions which recover the usual symmetric stable distributions in the limit of q → 1.[1] Definition A non-degenerate distribution is a stable distribution if it satisfies the following property: Let X1 and X2 be independent copies of a random variable X. Then X is said to be stable if for any constants a > 0 and b > 0 the random variable aX1 + bX2 has the same distribution as cX + d for some constants c > 0 and d. The distribution is said to be strictly stable if this holds with d = 0 (Nolan 2009). Since the normal distribution, the Cauchy distribution, and the Lévy distribution all have the above property, it follows that they are special cases of stable distributions. Such distributions form a four-parameter family of continuous probability distributions parametrized by location and scale parameters μ and c, respectively, and two shape parameters β and α, roughly corresponding to measures of asymmetry and concentration, respectively (see the figures). Although the probability density function for a general stable distribution cannot be written analytically, the general characteristic function can be. Any probability distribution is determined by its characteristic function φ(t) by: A random variable X is called stable if its characteristic function can be written as (see Nolan (2009) and Voit (2003, § 5.4.3)) where sgn(t) is just the sign of t and Φ is given by for all α except α = 1 in which case: μ ∈ R is a shift parameter, β ∈ [−1, 1], called the skewness parameter, is a measure of asymmetry. Notice that in this context the usual skewness is not well defined, as for α < 2 the distribution does not admit 2nd or higher moments, and the usual skewness definition is the 3rd central moment. In the simplest case β = 0, the characteristic function is just a stretched exponential function; the distribution is symmetric about μ and is referred to as a (Lévy) symmetric alpha-stable distribution, often abbreviated SαS. When α < 1 and β = 1, the distribution is supported by [μ, ∞). The parameter |c| > 0 is a scale factor which is a measure of the width of the distribution and α is the exponent or index of the distribution and specifies the asymptotic behavior of the distribution for α < 2.
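The defining stability property above can be checked numerically. The sketch below uses SciPy's levy_stable distribution (a tool assumed for this illustration, not cited by the article) to draw from a symmetric α-stable law and compares the sum of two independent copies with a single copy rescaled by c = 2^(1/α).

```python
import numpy as np
from scipy.stats import levy_stable

alpha, beta = 1.7, 0.0            # illustrative parameters: symmetric, heavy-tailed
rng = np.random.default_rng(3)
n = 100_000

x1 = levy_stable.rvs(alpha, beta, size=n, random_state=rng)
x2 = levy_stable.rvs(alpha, beta, size=n, random_state=rng)

# Strict stability with a = b = 1: X1 + X2 has the same law as 2**(1/alpha) * X.
s = x1 + x2
x_scaled = 2 ** (1 / alpha) * levy_stable.rvs(alpha, beta, size=n, random_state=rng)

# The empirical quantiles of the two samples should roughly agree.
for q in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"q={q:.2f}: sum {np.quantile(s, q):8.3f}   rescaled copy {np.quantile(x_scaled, q):8.3f}")
```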
  • 135. Stable distribution 132 Parameterizations The above definition is only one of the parameterizations in use for stable distributions; it is the most common but is not continuous in the parameters. For example, for the case α = 1 we could replace Φ by: (Nolan 2009) and μ by This parameterization has the advantage that we may define a standard distribution using and The pdf for all α will then have the following standardization property: Applications Stable distributions owe their importance in both theory and practice to the generalization of the central limit theorem to random variables without second (and possibly first) order moments and the accompanying self-similarity of the stable family. It was the seeming departure from normality along with the demand for a self-similar model for financial data (i.e. the shape of the distribution for yearly asset price changes should resemble that of the constituent daily or monthly price changes) that led Benoît Mandelbrot to propose that cotton prices follow an alpha-stable distribution with α equal to 1.7. Lévy distributions are frequently found in analysis of critical behavior and financial data (Voit 2003, § 5.4.3). They are also found in spectroscopy as a general expression for a quasistatically pressure-broadened spectral line (Peach 1981, § 4.5). The statistics of solar flares are described by a non-Gaussian distribution. The solar flare statistics were shown to be describable by a Lévy distribution and it was assumed that intermittent solar flares perturb the intrinsic fluctuations in Earth’s average temperature. The end result of this perturbation is that the statistics of the temperature anomalies inherit the statistical structure that was evident in the intermittency of the solar flare data. [2] Lévy distribution of solar flare waiting time events (time between flare events) was demonstrated for CGRO BATSE hard x-ray solar flares December 2001. Analysis of the Lévy statistical signature revealed that two different memory signatures were evident; one related to the solar cycle and the second whose origin appears to be associated with a localized or combination of localized solar active region effects. [3]
  • 136. Stable distribution 133 Properties • All stable distributions are infinitely divisible. • With the exception of the normal distribution (α = 2), stable distributions are leptokurtotic and heavy-tailed distributions. • Closure under convolution Stable distributions are closed under convolution for a fixed value of α. Since convolution is equivalent to multiplication of the Fourier-transformed function, it follows that the product of two stable characteristic functions with the same α will yield another such characteristic function. The product of two stable characteristic functions is given by: Since Φ is not a function of the μ, c or β variables it follows that these parameters for the convolved function are given by: In each case, it can be shown that the resulting parameters lie within the required intervals for a stable distribution. The distribution A stable distribution is therefore specified by the above four parameters. It can be shown that any non-degenerate stable distribution has a smooth (infinitely differentiable) density function.(Nolan 2009, Theorem 1.9) If denotes the density of X and Y is the sum of independent copies of X: then Y has the density with The asymptotic behavior is described, for α< 2, by: (Nolan 2009, Theorem 1.12) where Γ is the Gamma function (except that when α < 1 and β = ±1, the tail vanishes to the left or right, resp., of μ). This "heavy tail" behavior causes the variance of stable distributions to be infinite for all α < 2. This property is illustrated in the log-log plots below. When α = 2, the distribution is Gaussian (see below), with tails asymptotic to exp(−x2/4c2)/(2c√π).
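The infinite variance for α < 2 noted above shows up directly in simulation: sample variances of stable draws fail to settle down as the sample grows, unlike the Gaussian case. The sketch below again leans on SciPy's levy_stable as an assumed implementation, with an arbitrary α = 1.5.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(5)
alpha = 1.5                        # illustrative heavy-tailed case (variance infinite)

for n in (10**3, 10**4, 10**5):
    x = levy_stable.rvs(alpha, 0.0, size=n, random_state=rng)
    g = rng.standard_normal(n)     # finite-variance benchmark (the alpha = 2 case)
    print(f"n={n:>7}: stable sample variance {x.var():12.1f}   normal sample variance {g.var():6.3f}")
```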
  • 137. Stable distribution 134 Special cases There is no general analytic solution for the form of p(x). There are, however three special cases which can be expressed in terms of elementary functions as can be seen by inspection of the characteristic function. • For α = 2 the distribution reduces to a Gaussian distribution with variance σ2 = 2c2 and mean μ; the skewness parameter β has no effect (Nolan 2009) (Voit 2003, § 5.4.3). • For α = 1 and β = 0 the distribution reduces to a Cauchy distribution with scale parameter c and shift parameter μ (Voit 2003, § 5.4.3) Log-log plot of symmetric centered stable distribution PDF's showing the power law (Nolan 2009). behavior for large x. The power law behavior is evidenced by the straight-line appearance of the PDF for large x, with the slope equal to −(α+1). (The only exception is for α = 2, in • For α =1/2 and β = 1 the black, which is a normal distribution.) distribution reduces to a Lévy distribution with scale parameter c and shift parameter μ. (Peach 1981, § 4.5)(Nolan 2009) Note that the above three distributions are also connected, in the following way: A standard Cauchy random variable can be viewed as a mixture of Gaussian random variables (all with mean zero), with the variance being drawn from a standard Lévy distribution. And in fact this is a special case of a more general theorem which allows any symmetric alpha-stable distribution to be viewed in this way (with the alpha parameter of the mixture distribution equal to Log-log plot of skewed centered stable distribution PDF's showing the power law twice the alpha parameter of the behavior for large x. Again the slope of the linear portions is equal to −(α+1) mixing distribution—and the beta parameter of the mixing distribution always equal to one). A general closed form expression for stable PDF's with rational values of α has been given by Zolotarev[4] in terms of Meijer G-functions. For simple rational numbers, the closed form expression is often in terms of less complicated special functions. Lee(Lee 2010, § 2.4) has listed a number of closed form expressions having rather simple expressions in terms of special functions. In the table below, PDF's expressible by elementary functions are indicated by an E and those given by Lee that are expressible by special functions are indicated by an s.
• 138. Stable distribution 135
α      1/3   1/2   2/3   1     4/3   3/2   2
β=0    s     s     s     E     s     s     E
β=1    s     E     s           s     s
Some of the special cases are known by particular names:
• For α = 1 and β = 1, the distribution is a Landau distribution which has a specific usage in physics under this name.
• For α = 3/2 and β = 0 the distribution reduces to a Holtsmark distribution with scale parameter c and shift parameter μ. (Lee 2010, § 2.4)
Also, in the limit as c approaches zero or as α approaches zero the distribution will approach a Dirac delta function δ(x−μ).
A generalized central limit theorem
Another important property of stable distributions is the role that they play in a generalized central limit theorem. The central limit theorem states that the sum of a number of independent and identically distributed (i.i.d.) random variables with finite variances will tend to a normal distribution as the number of variables grows. A generalization due to Gnedenko and Kolmogorov states that the sum of a number of random variables with power-law tail distributions decreasing as |x|^(−α−1) where 0 < α < 2 (and therefore having infinite variance) will tend to a stable distribution as the number of variables grows. (Voit 2003, § 5.4.3)
Series representation
The stable distribution can be restated as the real part of a simpler integral: (Peach 1981, § 4.5) Expressing the second exponential as a Taylor series, we have: where . Reversing the order of integration and summation, and carrying out the integration yields: which will be valid for x ≠ μ and will converge for appropriate values of the parameters. (Note that the n = 0 term, which yields a delta function in x−μ, has therefore been dropped.) Expressing the first exponential as a series will yield another series in positive powers of x−μ which is generally less useful.
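The generalized central limit theorem stated above can also be illustrated numerically: normalized sums of symmetric variables with a power-law tail (tail index α = 1.5 in this invented example) retain heavy tails instead of becoming Gaussian. The Pareto construction and all sample sizes below are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(11)
alpha = 1.5                          # tail index in (0, 2): infinite variance
n_terms, n_sums = 200, 20_000

# Symmetric power-law variables: a random sign times a Pareto(alpha) magnitude,
# so the tail decays like |x|^(-alpha - 1).
signs = rng.choice([-1.0, 1.0], size=(n_sums, n_terms))
magnitudes = (1.0 - rng.random((n_sums, n_terms))) ** (-1.0 / alpha)
x = signs * magnitudes

# Stable-law normalization is n**(1/alpha); the classical CLT would use n**(1/2).
sums = x.sum(axis=1) / n_terms ** (1.0 / alpha)
g = rng.standard_normal(n_sums)

for q in (0.99, 0.999):
    print(f"{q:.3f} quantile: normalized sums {np.quantile(sums, q):8.2f}   standard normal {np.quantile(g, q):5.2f}")
```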
  • 139. Stable distribution 136 References [1] Umarov, Sabir; Tsallis, Constantino, Gell-Mann, Murray and Steinberg, Stanly (2010). "Generalization of symmetric α-stable Lévy distributions for q>1". J Math Phys. (American Institute of Physics) 51 (3). arXiv:0911.2009. Bibcode 2010JMP....51c3502U. doi:10.1063/1.3305292. PMC 2869267. PMID 20596232. [2] Scafetta, N., Bruce, J.W., Is climate sensitive to solar variability? Physics Today, 60, 50-51 (2008) (http:/ / www. fel. duke. edu/ ~scafetta/ pdf/ opinion0308. pdf). [3] Leddon, D., A statistical Study of Hard X-Ray Solar Flares (http:/ / www. library. unt. edu/ theses/ open/ 20013/ leddon_deborah/ dissertation. pdf) [4] Zolotarev, V.M. (1995). "On Representation of Densities of Stable Laws by Special Functions" (http:/ / epubs. siam. org/ tvp/ resource/ 1/ tprbau/ v39/ i2/ p354_s1). Theory Probab. Appl. (SIAM) 39 (2): 354–362. . Retrieved 2011-08-15. • Feller, W. (1971) An Introduction to Probability Theory and Its Applications, Volume 2. Wiley. ISBN 0-471-25709-5 • Gnedenko, B. V.; Kolmogorov, A. N. (1954). Limit Distributions for Sums of Independent Random Variables. Addison-Wesley. • Ibragimov, I.; Linnik, Yu (1971). Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff Publishing Groningen, The Netherlands. • Lee, W.H. (2010). Continuous and discrete properties of stochastic processes (http://etheses.nottingham.ac.uk/ 1194/) (PhD thesis). The University of Nottingham. • Matsui, M.; Takemura, A.. "Some improvements in numerical evaluation of symmetric stable density and its derivatives" (http://www.e.u-tokyo.ac.jp/cirje/research/dp/2004/2004cf292.pdf) (PDF). CIRGE Discussion paper. Retrieved July 13, 2005. • Nolan, John P. (2009). "Stable Distributions: Models for Heavy Tailed Data" (http://academic2.american.edu/ ~jpnolan/stable/chap1.pdf) (PDF). Retrieved 2009-02-21. • Peach, G. (1981). "Theory of the pressure broadening and shift of spectral lines" (http://journalsonline.tandf.co. uk/openurl.asp?genre=article&eissn=1460-6976&volume=30&issue=3&spage=367). Advances in Physics 30 (3): 367–474. Bibcode 1981AdPhy..30..367P. doi:10.1080/00018738100101467. • Rachev, S.; Mittnik, S. (2000). "Stable Paretian Models in Finance". Wiley. ISBN 978-0-471-95314-2. • Samorodnitsky, G.; Taqqu, M. (1994). "Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance (Stochastic Modeling Series)". Chapman and Hall/CRC. ISBN 978-0-412-05171-5. • Voit, Johannes (2003). The Statistical Mechanics of Financial Markets (Texts and Monographs in Physics). Springer-Verlag. ISBN 3-540-00978-7. • Zolotarev, V.M. (1986). One-dimensional Stable Distributions. American Mathematical Society. External links • PlanetMath (http://planetmath.org/encyclopedia/StrictlyStableRandomVariable.html) stable random variable article • John P. Nolan (http://academic2.american.edu/~jpnolan/stable/stable.html) page on stable distributions • stable distributions (http://www.gnu.org/software/gsl/manual/gsl-ref. 
html#The-Levy-alpha_002dStable-Distributions) in GNU Scientific Library — Reference Manual • Applications (http://www.mathestate.com/tools/Financial/map/Overview.html) of stable laws in finance. • fBasics (http://cran.r-project.org/web/packages/fBasics/index.html) R package with functions to compute stable density, distribution function, quantile function and generate random variates. • STBL (http://math.bu.edu/people/mveillet/html/alphastablepub.html) MATLAB package which includes functions to compute stable densities, CDFs and inverse CDFs. Also can fit stable distributions to data, and generate stable random variables. • StableDistribution (http://reference.wolfram.com/mathematica/ref/StableDistribution.html) is fully supported in Mathematica since version 8.
  • 140. Mathematical finance 137 Mathematical finance Mathematical finance is a field of applied mathematics, concerned with financial markets. The subject has a close relationship with the discipline of financial economics, which is concerned with much of the underlying theory. Generally, mathematical finance will derive and extend the mathematical or numerical models without necessarily establishing a link to financial theory, taking observed market prices as input. Thus, for example, while a financial economist might study the structural reasons why a company may have a certain share price, a financial mathematician may take the share price as a given, and attempt to use stochastic calculus to obtain the fair value of derivatives of the stock (see: Valuation of options; Financial modeling). The fundamental theorem of arbitrage-free pricing is one of the key theorems in mathematical finance, while the Black–Scholes equation and formula are amongst the key results. In terms of practice, mathematical finance also overlaps heavily with the field of computational finance (also known as financial engineering). Arguably, these are largely synonymous, although the latter focuses on application, while the former focuses on modeling and derivation (see: Quantitative analyst), often by help of stochastic asset models. In general, there exist two separate branches of finance that require advanced quantitative techniques: derivatives pricing on the one hand, and risk- and portfolio management on the other hand. These are discussed below. Many universities around the world now offer degree and research programs in mathematical finance; see Master of Mathematical Finance. History: Q versus P There exist two separate branches of finance that require advanced quantitative techniques: derivatives pricing on the one hand, and risk and portfolio management on the other hand. One of the main differences is that they use different probabilities, namely the risk-neutral probability, denoted by "Q", and the actual probability, denoted by "P". Derivatives pricing: the Q world The goal of derivatives pricing is to determine the fair price of a given security in terms of more liquid securities whose price is determined by the law of supply and demand. The meaning of "fair" depends, of course, on whether one considers buying or selling the security. Examples of securities being priced are plain vanilla and exotic options, convertible bonds, etc. Once a fair price has been determined, the sell-side trader can make a market on the security. Therefore, derivatives pricing is a complex "extrapolation" exercise to define the current market value of a security, which is then used by the sell-side community. Derivatives pricing: the Q world Goal "extrapolate the present" Environment risk-neutral probability Processes continuous-time martingales Dimension low Tools Ito calculus, PDE’s Challenges calibration Business sell-side Quantitative derivatives pricing was initiated by Louis Bachelier in The Theory of Speculation (published 1900), with the introduction of the most basic and most influential of processes, the Brownian motion, and its applications
  • 141. Mathematical finance 138 to the pricing of options. Bachelier modeled the time series of changes in the logarithm of stock prices as a random walk in which the short-term changes had a finite variance. This causes longer-term changes to follow a Gaussian distribution. Bachelier's work, however, was largely unknown outside academia. The theory remained dormant until Fischer Black and Myron Scholes, along with fundamental contributions by Robert C. Merton, applied the second most influential process, the geometric Brownian motion, to option pricing. For this M. Scholes and R. Merton were awarded the 1997 Nobel Memorial Prize in Economic Sciences. Black was ineligible for the prize because of his death in 1995. The next important step was the fundamental theorem of asset pricing by Harrison and Pliska (1981), according to which the suitably normalized current price P0 of a security is arbitrage-free, and thus truly fair, only if there exists a stochastic process Pt with constant expected value which describes its future evolution: (1 ) A process satisfying (1) is called a "martingale". A martingale does not reward risk. Thus the probability of the normalized security price process is called "risk-neutral" and is typically denoted by the blackboard font letter " ". The relationship (1) must hold for all times t: therefore the processes used for derivatives pricing are naturally set in continuous time. The quants who operate in the Q world of derivatives pricing are specialists with deep knowledge of the specific products they model. Securities are priced individually, and thus the problems in the Q world are low-dimensional in nature. Calibration is one of the main challenges of the Q world: once a continuous-time parametric process has been calibrated to a set of traded securities through a relationship such as (1), a similar relationship is used to define the price of new derivatives. The main quantitative tools necessary to handle continuous-time Q-processes are Ito’s stochastic calculus and partial differential equations (PDE’s). Risk and portfolio management: the P world Risk and portfolio management aims at modelling the probability distribution of the market prices of all the securities at a given future investment horizon. This "real" probability distribution of the market prices is typically denoted by the blackboard font letter " ", as opposed to the "risk-neutral" probability " " used in derivatives pricing. Based on the P distribution, the buy-side community takes decisions on which securities to purchase in order to improve the prospective profit-and-loss profile of their positions considered as a portfolio. Risk and portfolio management: the P world Goal "model the future" Environment real probability Processes discrete-time series Dimension large Tools multivariate statistics Challenges estimation Business buy-side
  • 142. Mathematical finance 139 The quantitative theory of risk and portfolio management started with the mean-variance framework of Harry Markowitz (1952), who caused a shift away from the concept of trying to identify the best individual stock for investment. Using a linear regression strategy to understand and quantify the risk (i.e. variance) and return (i.e. mean) of an entire portfolio of stocks, bonds, and other securities, an optimization strategy was used to choose a portfolio with largest mean return subject to acceptable levels of variance in the return. Next, breakthrough advances were made with the Capital Asset Pricing Model (CAPM) and the Arbitrage Pricing Theory (APT) developed by Treynor (1962), Mossin (1966), William Sharpe (1964), Lintner (1965) and Ross (1976). For their pioneering work, Markowitz and Sharpe, along with Merton Miller, shared the 1990 Nobel Memorial Prize in Economic Sciences, for the first time ever awarded for a work in finance. The portfolio-selection work of Markowitz and Sharpe introduced mathematics to the "black art" of investment management. With time, the mathematics has become more sophisticated. Thanks to Robert Merton and Paul Samuelson, one-period models were replaced by continuous time, Brownian-motion models, and the quadratic utility function implicit in mean–variance optimization was replaced by more general increasing, concave utility functions.[1] Furthermore, in more recent years the focus shifted toward estimation risk, i.e., the dangers of incorrectly assuming that advanced time series analysis alone can provide completely accurate estimates of the market parameters [2] Much effort has gone into the study of financial markets and how prices vary with time. Charles Dow, one of the founders of Dow Jones & Company and The Wall Street Journal, enunciated a set of ideas on the subject which are now called Dow Theory. This is the basis of the so-called technical analysis method of attempting to predict future changes. One of the tenets of "technical analysis" is that market trends give an indication of the future, at least in the short term. The claims of the technical analysts are disputed by many academics. Criticism Over the years, increasingly sophisticated mathematical models and derivative pricing strategies have been developed, but their credibility was damaged by the financial crisis of 2007–2010. Contemporary practice of mathematical finance has been subjected to criticism from figures within the field notably by Nassim Nicholas Taleb, a professor of financial engineering at Polytechnic Institute of New York University, in his book The Black Swan[3] and Paul Wilmott. Taleb claims that the prices of financial assets cannot be characterized by the simple models currently in use, rendering much of current practice at best irrelevant, and, at worst, dangerously misleading. Wilmott and Emanuel Derman published the Financial Modelers' Manifesto in January 2008[4] which addresses some of the most serious concerns. Bodies such as the Institute for New Economic Thinking are now attempting to establish more effective theories and methods.[5] In general, modeling the changes by distributions with finite variance is, increasingly, said to be inappropriate.[6] In the 1960s it was discovered by Benoît Mandelbrot that changes in prices do not follow a Gaussian distribution, but are rather modeled better by Lévy alpha-stable distributions. 
The scale of change, or volatility, depends on the length of the time interval raised to a power slightly greater than 1/2. Large changes up or down are more likely than a Gaussian distribution with an estimated standard deviation would suggest.[3] See also Financial models with long-tailed distributions and volatility clustering.
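For illustration, the difference in tail behaviour can be checked numerically. The following Python sketch (parameter values are illustrative, not taken from the text) compares the probability of a move larger than four scale units under a Gaussian law and under a symmetric alpha-stable law with alpha = 1.7, using SciPy's levy_stable distribution.

    from scipy.stats import norm, levy_stable

    alpha, beta = 1.7, 0.0      # illustrative stable parameters; alpha < 2 gives heavy tails
    x = 4.0                     # a "large" move, in units of the (unit) scale parameter

    p_gauss = norm.sf(x)                         # Gaussian tail probability P(X > 4)
    p_stable = levy_stable.sf(x, alpha, beta)    # alpha-stable tail probability, same scale
    print(f"Gaussian: {p_gauss:.2e}   stable (alpha={alpha}): {p_stable:.2e}")

The stable tail probability comes out substantially larger, which is the sense in which large moves are "more likely" than under a fitted Gaussian.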
  • 143. Mathematical finance 140 Mathematical finance articles See also Outline of finance: § Financial mathematics; § Mathematical tools; § Derivatives pricing. Mathematical tools • Asymptotic analysis • Mathematical models • Stochastic calculus • Brownian motion • Lévy process • Calculus • Monte Carlo method • Stochastic differential equations • Copulas • Numerical analysis • Stochastic volatility • Numerical partial differential equations • Crank–Nicolson method • Finite difference method • Differential equations • Real analysis • Value at risk • Expected value • Partial differential equations • Volatility • ARCH model • GARCH model • Ergodic theory • Probability • Feynman–Kac formula • Probability distributions • Binomial distribution • Log-normal distribution • Fourier transform • Quantile functions • Heat equation • Gaussian copulas • Radon–Nikodym derivative • Girsanov's theorem • Risk-neutral measure • Itô's lemma • Martingale representation theorem Derivatives pricing • The Brownian Motion Model of • Options • Interest rate derivatives Financial Markets • Put–call parity (Arbitrage • Black model • Rational pricing assumptions relationships for options) • caps and floors • Risk neutral valuation • Intrinsic value, Time value • swaptions • Arbitrage-free pricing • Moneyness • Bond options • Forward Price Formula • Pricing models • Short-rate models • Futures contract pricing • Black–Scholes model • Rendleman-Bartter model • Swap Valuation • Black model • Vasicek model • Binomial options model • Ho-Lee model • Monte Carlo option model • Hull–White model • Implied volatility, Volatility smile • Cox–Ingersoll–Ross model • SABR Volatility Model • Black–Karasinski model • Markov Switching Multifractal • Black–Derman–Toy model • The Greeks • Kalotay–Williams–Fabozzi model • Finite difference methods for • Longstaff–Schwartz model option pricing • Chen model • Vanna Volga method • Forward rate-based models • Trinomial tree • LIBOR market model • Garman-Kohlhagen model (Brace–Gatarek–Musiela Model, BGM) • Optimal stopping (Pricing of • Heath–Jarrow–Morton Model (HJM) American options)
  • 144. Mathematical finance 141 Notes [1] Karatzas, Ioannis; Shreve, Steve (1998). Methods of Mathematical Finance. Secaucus, NJ, USA: Springer-Verlag New York, Incorporated. ISBN 9780387948393. [2] Meucci, Attilio (2005). Risk and Asset Allocation. Springer. ISBN 9783642009648. [3] Taleb, Nassim Nicholas (2007). The Black Swan: The Impact of the Highly Improbable. Random House Trade. ISBN 978-1-4000-6351-2. [4] "Financial Modelers' Manifesto" (http:/ / www. wilmott. com/ blogs/ paul/ index. cfm/ 2009/ 1/ 8/ Financial-Modelers-Manifesto). Paul Wilmott's Blog. January 8, 2009. . Retrieved June 1, 2012. [5] Gillian Tett (April 15, 2010). "Mathematicians must get out of their ivory towers" (http:/ / www. ft. com/ cms/ s/ 0/ cfb9c43a-48b7-11df-8af4-00144feab49a. html). Financial Times. . [6] Svetlozar T. Rachev, Frank J. Fabozzi, Christian Menn (2005). Fat-Tailed and Skewed Asset Return Distributions: Implications for Risk Management, Portfolio Selection, and Option Pricing. John Wiley and Sons. ISBN 978-0471718864. References • Harold Markowitz, Portfolio Selection, Journal of Finance, 7, 1952, pp. 77–91 • William Sharpe, Investments, Prentice-Hall, 1985 • Attilio Meucci, versus Q: Differences and Commonalities between the Two Areas of Quantitative Finance (http:// ssrn.com/abstract=1717163''P), GARP Risk Professional, February 2011, pp. 41-44 Stochastic differential equation A stochastic differential equation (SDE) is a differential equation in which one or more of the terms is a stochastic process, resulting in a solution which is itself a stochastic process. SDE are used to model diverse phenomena such as fluctuating stock prices or physical system subject to thermal fluctuations. Typically, SDEs incorporate white noise which can be thought of as the derivative of Brownian motion (or the Wiener process); however, it should be mentioned that other types of random fluctuations are possible, such as jump processes. Background The earliest work on SDEs was done to describe Brownian motion in Einstein's famous paper, and at the same time by Smoluchowski. However, one of the earlier works related to Brownian motion is credited to Bachelier (1900) in his thesis 'Theory of Speculation'. This work was followed upon by Langevin. Later Itō and Stratonovich put SDEs on more solid mathematical footing. Terminology In physical science, SDEs are usually written as Langevin equations. These are sometimes confusingly called "the Langevin equation" even though there are many possible forms. These consist of an ordinary differential equation containing a deterministic part and an additional random white noise term. A second form is the Smoluchowski equation and, more generally, the Fokker-Planck equation. These are partial differential equations that describe the time evolution of probability distribution functions. The third form is the stochastic differential equation that is used most frequently in mathematics and quantitative finance (see below). This is similar to the Langevin form, but it is usually written in differential form. SDEs come in two varieties, corresponding to two versions of stochastic calculus.
  • 145. Stochastic differential equation 142 Stochastic Calculus Brownian motion or the Wiener process was discovered to be exceptionally complex mathematically. The Wiener process is nowhere differentiable; thus, it requires its own rules of calculus. There are two dominating versions of stochastic calculus, the Ito stochastic calculus and the Stratonovich stochastic calculus. Each of the two has advantages and disadvantages, and newcomers are often confused whether the one is more appropriate than the other in a given situation. Guidelines exist (e.g. Øksendal, 2003) and conveniently, one can readily convert an Ito SDE to an equivalent Stratonovich SDE and back again. Still, one must be careful which calculus to use when the SDE is initially written down. Numerical Solutions Numerical solution of stochastic differential equations and especially stochastic partial differential equations is a young field relatively speaking. Almost all algorithms that are used for the solution of ordinary differential equations will work very poorly for SDEs, having very poor numerical convergence. A textbook describing many different algorithms is Kloeden & Platen (1995). Methods include the Euler–Maruyama method, Milstein method and Runge–Kutta method (SDE). Use in Physics In physics, SDEs are typically written in the Langevin form and referred to as "the Langevin equation." For example, a general coupled set of first-order SDEs is often written in the form: where is the set of unknowns, the and are arbitrary functions and the are random functions of time, often referred to as "noise terms". This form is usually usable because there are standard techniques for transforming higher-order equations into several coupled first-order equations by introducing new unknowns. If the are constants, the system is said to be subject to additive noise, otherwise it is said to be subject to multiplicative noise. This term is somewhat misleading as it has come to mean the general case even though it appears to imply the limited case where : . Additive noise is the simpler of the two cases; in that situation the correct solution can often be found using ordinary calculus and in particular the ordinary chain rule of calculus. However, in the case of multiplicative noise, the Langevin equation is not a well-defined entity on its own, and it must be specified whether the Langevin equation should be interpreted as an Ito SDE or a Stratonovich SDE. In physics, the main method of solution is to find the probability distribution function as a function of time using the equivalent Fokker-Planck equation (FPE). The Fokker-Planck equation is a deterministic partial differential equation. It tells how the probability distribution function evolves in time similarly to how the Schrödinger equation gives the time evolution of the quantum wave function or the diffusion equation gives the time evolution of chemical concentration. Alternatively numerical solutions can be obtained by Monte Carlo simulation. Other techniques include the path integration that draws on the analogy between statistical physics and quantum mechanics (for example, the Fokker-Planck equation can be transformed into the Schrödinger equation by rescaling a few variables) or by writing down ordinary differential equations for the statistical moments of the probability distribution function.
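As a concrete companion to the numerical-solution discussion above, here is a minimal Euler–Maruyama sketch for a generic Itô SDE dX = a(X, t) dt + b(X, t) dW; the Ornstein–Uhlenbeck example at the bottom (an additive-noise, Langevin-type equation) and its parameters are illustrative choices, not taken from the text.

    import numpy as np

    def euler_maruyama(a, b, x0, t_grid, rng=np.random.default_rng(0)):
        """Simulate one path of dX = a(X, t) dt + b(X, t) dW on the given time grid."""
        x = np.empty(len(t_grid))
        x[0] = x0
        for i in range(1, len(t_grid)):
            dt = t_grid[i] - t_grid[i - 1]
            dW = rng.normal(0.0, np.sqrt(dt))        # Brownian increment over the step
            x[i] = x[i - 1] + a(x[i - 1], t_grid[i - 1]) * dt + b(x[i - 1], t_grid[i - 1]) * dW
        return x

    # Example: Ornstein–Uhlenbeck process, dX = -theta * X dt + sigma dW
    theta, sigma = 2.0, 0.5
    t = np.linspace(0.0, 5.0, 1001)
    path = euler_maruyama(lambda x, t_: -theta * x, lambda x, t_: sigma, x0=1.0, t_grid=t)
    print(path[-1])

The Milstein scheme adds a correction term involving the derivative of b, which matters only for state-dependent (multiplicative) noise.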
  • 146. Stochastic differential equation 143 Use in probability and mathematical finance The notation used in probability theory (and in many applications of probability theory, for instance mathematical finance) is slightly different. This notation makes the exotic nature of the random function of time in the physics formulation more explicit. It is also the notation used in publications on numerical methods for solving stochastic differential equations. In strict mathematical terms, can not be chosen as a usual function, but only as a generalized function. The mathematical formulation treats this complication with less ambiguity than the physics formulation. A typical equation is of the form where denotes a Wiener process (Standard Brownian motion). This equation should be interpreted as an informal way of expressing the corresponding integral equation The equation above characterizes the behavior of the continuous time stochastic process Xt as the sum of an ordinary Lebesgue integral and an Itō integral. A heuristic (but very helpful) interpretation of the stochastic differential equation is that in a small time interval of length δ the stochastic process Xt changes its value by an amount that is normally distributed with expectation μ(Xt, t) δ and variance σ(Xt, t)² δ and is independent of the past behavior of the process. This is so because the increments of a Wiener process are independent and normally distributed. The function μ is referred to as the drift coefficient, while σ is called the diffusion coefficient. The stochastic process Xt is called a diffusion process, and is usually a Markov process. The formal interpretation of an SDE is given in terms of what constitutes a solution to the SDE. There are two main definitions of a solution to an SDE, a strong solution and a weak solution. Both require the existence of a process Xt that solves the integral equation version of the SDE. The difference between the two lies in the underlying probability space (Ω F, Pr). A weak solution consists of a probability space and a process that satisfies the integral equation, while a strong solution is a process that satisfies the equation and is defined on a given probability space. An important example is the equation for geometric Brownian motion which is the equation for the dynamics of the price of a stock in the Black Scholes options pricing model of financial mathematics. There are also more general stochastic differential equations where the coefficients μ and σ depend not only on the present value of the process Xt, but also on previous values of the process and possibly on present or previous values of other processes too. In that case the solution process, X, is not a Markov process, and it is called an Itō process and not a diffusion process. When the coefficients depends only on present and past values of X, the defining equation is called a stochastic delay differential equation.
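Several displayed equations in this passage were lost in extraction. For reference, the "typical equation" and its integral form are usually written as follows (a standard reconstruction consistent with the drift mu and diffusion sigma named in the text):

    dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t ,

    X_{t+s} - X_t = \int_t^{t+s} \mu(X_u, u)\,du + \int_t^{t+s} \sigma(X_u, u)\,dW_u ,

and the geometric Brownian motion cited for the Black–Scholes model is

    dS_t = \mu S_t\,dt + \sigma S_t\,dW_t .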
  • 147. Stochastic differential equation 144 Existence and uniqueness of solutions As with deterministic ordinary and partial differential equations, it is important to know whether a given SDE has a solution, and whether or not it is unique. The following is a typical existence and uniqueness theorem for Itō SDEs taking values in n-dimensional Euclidean space Rn and driven by an m-dimensional Brownian motion B; the proof may be found in Øksendal (2003, §5.2). Let T > 0, and let be measurable functions for which there exist constants C and D such that for all t ∈ [0, T] and all x and y ∈ Rn, where Let Z be a random variable that is independent of the σ-algebra generated by Bs, s ≥ 0, and with finite second moment: Then the stochastic differential equation/initial value problem has a Pr-almost surely unique t-continuous solution (t, ω) |→ Xt(ω) such that X is adapted to the filtration FtZ generated by Z and Bs, s ≤ t, and References • Adomian, George (1983). Stochastic systems. Mathematics in Science and Engineering (169). Orlando, FL: Academic Press Inc.. • Adomian, George (1986). Nonlinear stochastic operator equations. Orlando, FL: Academic Press Inc.. • Adomian, George (1989). Nonlinear stochastic systems theory and applications to physics. Mathematics and its Applications (46). Dordrecht: Kluwer Academic Publishers Group. • Øksendal, Bernt K. (2003). Stochastic Differential Equations: An Introduction with Applications. Berlin: Springer. ISBN 3-540-04758-1. • Teugels, J. and Sund B. (eds.) (2004). Encyclopedia of Actuarial Science. Chichester: Wiley. pp. 523–527. • C. W. Gardiner (2004). Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences. Springer. p. 415. • Thomas Mikosch (1998). Elementary Stochastic Calculus: with Finance in View. Singapore: World Scientific Publishing. p. 212. ISBN 981-02-3543-7. • Seifedine Kadry, (2007). A Solution of Linear Stochastic Differential Equation. USA: WSEAS TRANSACTIONS on MATHEMATICS, April 2007.. p. 618. ISSN 1109-2769. • Bachelier, L., (1900). Théorie de la speculation (in French), PhD Thesis. NUMDAM: http://archive.numdam. org/ARCHIVE/ASENS/ASENS_1900_3_17_/ASENS_1900_3_17__21_0/ASENS_1900_3_17__21_0.pdf.& #32;In English in 1971 book 'The Random Character of the Stock Market' Eds. P.H. Cootner.
  • 148. Stochastic differential equation 145 • P.E. Kloeden and E. Platen, (1995). Numerical Solution of Stochastic Differential Equations,. Springer,. Brownian model of financial markets The Brownian motion models for financial markets are based on the work of Robert C. Merton and Paul A. Samuelson, as extensions to the one-period market models of Harold Markowitz and William Sharpe, and are concerned with defining the concepts of financial assets and markets, portfolios, gains and wealth in terms of continuous-time stochastic processes. Under this model, these assets have continuous prices evolving continuously in time and are driven by Brownian motion processes.[1] This model requires an assumption of perfectly divisible assets and a frictionless market (i.e. that no transaction costs occur either for buying or selling). Another assumption is that asset prices have no jumps, that is there are no surprises in the market. This last assumption is removed in jump diffusion models. Financial market processes Consider a financial market consisting of financial assets, where one of these assets, called a bond or money-market, is risk free while the remaining assets, called stocks, are risky. Definition A financial market is defined as : 1. A probability space 2. A time interval 3. A -dimensional Brownian process adapted to the augmented filtration 4. A measurable risk-free money market rate process 5. A measurable mean rate of return process . 6. A measurable dividend rate of return process . 7. A measurable volatility process such that . 8. A measurable, finite variation, singularly continuous stochastic 9. The initial conditions given by The augmented filtration Let be a probability space, and a be D-dimensional Brownian motion stochastic process, with the natural filtration: If are the measure 0 (i.e. null under measure ) subsets of , then define the augmented filtration: The difference between and is that the latter is both left-continuous, in the sense that: and right-continuous, such that:
  • 149. Brownian model of financial markets 146 while the former is only left-continuous.[2] Bond A share of a bond (money market) has price at time with , is continuous, adapted, and has finite variation. Because it has finite variation, it can be decomposed into an absolutely continuous part and a singularly continuous part , by Lebesgue's decomposition theorem. Define: and resulting in the SDE: which gives: Thus, it can be easily seen that if is absolutely continuous (i.e. ), then the price of the bond evolves like the value of a risk-free savings account with instantaneous interest rate , which is random, time-dependent and measurable. Stocks Stock prices are modeled as being similar to that of bonds, except with a randomly fluctuating component (called its volatility). As a premium for the risk originating from these random fluctuations, the mean rate of return of a stock is higher than that of a bond. Let be the strictly positive prices per share of the stocks, which are continuous stochastic processes satisfying: Here, gives the volatility of the -th stock, while is its mean rate of return. In order for an arbitrage-free pricing scenario, must be as defined above. The solution to this is: and the discounted stock prices are: Note that the contribution due to the discontinuites in the bond price does not appear in this equation.
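The bond and stock dynamics referred to above also lost their formulas. In the Karatzas–Shreve style of notation this section follows, they are commonly written as (a reconstruction offered for orientation; symbols may differ slightly from the original):

    dS_0(t) = S_0(t)\,[\,r(t)\,dt + dA(t)\,], \qquad S_0(0) = 1,

    dS_n(t) = S_n(t)\Big[\,b_n(t)\,dt + \sum_{d=1}^{D} \sigma_{n,d}(t)\,dW_d(t)\Big], \qquad n = 1,\dots,N,

with solution

    S_n(t) = S_n(0)\,\exp\!\Big(\int_0^t \big(b_n(s) - \tfrac{1}{2}\textstyle\sum_{d} \sigma_{n,d}^2(s)\big)\,ds + \int_0^t \textstyle\sum_{d} \sigma_{n,d}(s)\,dW_d(s)\Big).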
  • 150. Brownian model of financial markets 147 Dividend rate Each stock may have an associated dividend rate process giving the rate of dividend payment per unit price of the stock at time . Accounting for this in the model, gives the yield process : Portfolio and gain processes Definition Consider a financial market . A portfolio process for this market is an measurable, valued process such that: , almost surely, , almost surely, and , almost surely. The gains process for this porfolio is: We say that the porfolio is self-financed if: . It turns out that for a self-financed portfolio, the appropriate value of is determined from and therefore sometimes is referred to as the portfolio process. Also, implies borrowing money from the money-market, while implies taking a short position on the stock. The term in the SDE of is the risk premium process, and it is the compensation received in return for investing in the -th stock. Motivation Consider time intervals , and let be the number of shares of asset , held in a portfolio during time interval at time . To avoid the case of insider trading (i.e. foreknowledge of the future), it is required that is measurable. Therefore, the incremental gains at each trading interval from such a portfolio is: and is the total gain over time , while the total value of the portfolio is . Define , let the time partition go to zero, and substitute for as defined earlier, to get the corresponding SDE for the gains process. Here denotes the dollar amount invested in asset at time , not the number of shares held.
  • 151. Brownian model of financial markets 148 Income and wealth processes Definition Given a financial market , then a cumulative income process is a semimartingale and represents the income accumulated over time , due to sources other than the investments in the assets of the financial market. A wealth process is then defined as: and represents the total wealth of an investor at time . The portfolio is said to be -financed if: The corresponding SDE for the wealth process, through appropriate substitutions, becomes: . Note, that again in this case, the value of can be determined from . Viable markets The standard theory of mathematical finance is restricted to viable financial markets, i.e. those in which there are no opportunities for arbitrage. If such opportunities exists, it implies the possibility of making an arbitrarily large risk-free profit. Definition In a financial market , a self-financed portfolio process is said to be an arbitrage opportunity if the associated gains process , almost surely and strictly. A market in which no such portfolio exists is said to be viable. Implications In a viable market , there exists a adapted process such that for almost every : . This is called the market price of risk and relates the premium for the -the stock with its volatility . Conversely, if there exists a D-dimensional process such that it satifies the above requirement, and: , then the market is viable. Also, a viable market can have only one money-market (bond) and hence only one risk-free rate. Therefore, if the -th stock entails no risk (i.e. ) and pays no dividend (i.e. ), then its rate of return is equal to the money market rate (i.e. ) and its price tracks that of the bond (i.e. ).
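The market-price-of-risk condition quoted above lost its equation. In the usual formulation it states that there is an adapted process theta(t) such that, for almost every t (again a standard reconstruction):

    b_n(t) + \delta_n(t) - r(t) = \sum_{d=1}^{D} \sigma_{n,d}(t)\,\theta_d(t), \qquad n = 1,\dots,N .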
  • 152. Brownian model of financial markets 149 Standard financial market Definition A financial market is said to be standard if: (i) It is viable. (ii) The number of stocks is not greater than the dimension of the underlying Brownian motion process . (iii) The market price of risk process satisfies: , almost surely. (iv) The positive process is a martingale. Comments In case the number of stocks is greater than the dimension , in violation of point (ii), from linear algebra, it can be seen that there are stocks whose volatilies (given by the vector ) are linear combination of the volatilities of other stocks (because the rank of is ). Therefore, the stocks can be replaced by equivalent mutual funds. The standard martingale measure on for the standard market, is defined as: . Note that and are absolutely continuous with respect to each other, i.e. they are equivalent. Also, according to Girsanov's theorem, , is a -dimensional Brownian motion process on the filtration with respect to . Complete financial markets A complete financial market is one that allows effective hedging of the risk inherent in any investment strategy. Definition Let be a standard financial market, and be an -measurable random variable, such that: . , The market is said to be complete if every such is financeable, i.e. if there is an -financed portfolio process , such that its associated wealth process satisfies , almost surely.
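The standard martingale measure and the Girsanov change of measure referred to in this section are typically written as (reconstruction for reference):

    Z(t) = \exp\!\Big(-\!\int_0^t \theta(s)\cdot dW(s) - \tfrac{1}{2}\int_0^t \|\theta(s)\|^2\,ds\Big), \qquad \frac{d\mathbb{P}_0}{d\mathbb{P}} = Z(T),

and, by Girsanov's theorem, W_0(t) = W(t) + \int_0^t \theta(s)\,ds is a D-dimensional Brownian motion under \mathbb{P}_0.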
  • 153. Brownian model of financial markets 150 Motivation If a particular investment strategy calls for a payment at time , the amount of which is unknown at time , then a conservative strategy would be to set aside an amount in order to cover the payment. However, in a complete market it is possible to set aside less capital (viz. ) and invest it so that at time it has grown to match the size of . Corollary A standard financial market is complete if and only if , and the volalatily process is non-singular for almost every , with respect to the Lebesgue measure. Notes [1] Tsekov, Roumen (2010) (pdf). Brownian Markets (http:/ / arxiv. org/ ftp/ arxiv/ papers/ 1010/ 1010. 2061. pdf). . Retrieved October 13, 2010. [2] Karatzas, Ioannis; Shreve, Steven E. (1991). Brownian motion and stochastic calculus. New York: Springer-Verlag. ISBN 0-387-97655-8. References Karatzas, Ioannis; Shreve, Steven E. (1998). Methods of mathematical finance. New York: Springer. ISBN 0-387-94839-2. Korn, Ralf; Korn, Elke (2001). Option pricing and portfolio optimization: modern methods of financial mathematics. Providence, R.I.: American Mathematical Society. ISBN 0-8218-2123-7. Merton, R. C. (1 August 1969). "Lifetime Portfolio Selection under Uncertainty: the Continuous-Time Case". The Review of Economics and Statistics 51 (3): 247–257. doi:10.2307/1926560. ISSN 00346535. JSTOR 1926560. Merton, R.C. (1970). "Optimum consumption and portfolio rules in a continuous-time model" (http:/ / www. math. uwaterloo.ca/~mboudalh/Merton1971.pdf) (w). Journal of Economic Theory 3. Retrieved 2009-05-29.
  • 154. Stochastic volatility 151 Stochastic volatility Stochastic volatility models are used in the field of mathematical finance to evaluate derivative securities, such as options. The name derives from the models' treatment of the underlying security's volatility as a random process, governed by state variables such as the price level of the underlying security, the tendency of volatility to revert to some long-run mean value, and the variance of the volatility process itself, among others. Stochastic volatility models are one approach to resolve a shortcoming of the Black–Scholes model. In particular, these models assume that the underlying volatility is constant over the life of the derivative, and unaffected by the changes in the price level of the underlying security. However, these models cannot explain long-observed features of the implied volatility surface such as volatility smile and skew, which indicate that implied volatility does tend to vary with respect to strike price and expiry. By assuming that the volatility of the underlying price is a stochastic process rather than a constant, it becomes possible to model derivatives more accurately. Basic model Starting from a constant volatility approach, assume that the derivative's underlying price follows a standard model for geometric brownian motion: where is the constant drift (i.e. expected return) of the security price , is the constant volatility, and is a standard Wiener process with zero mean and unit rate of variance. The explicit solution of this stochastic differential equation is . The Maximum likelihood estimator to estimate the constant volatility for given stock prices at different times is its expectation value is . This basic model with constant volatility is the starting point for non-stochastic volatility models such as Black–Scholes and Cox–Ross–Rubinstein. For a stochastic volatility model, replace the constant volatility with a function , that models the variance of . This variance function is also modeled as brownian motion, and the form of depends on the particular SV model under study. where and are some functions of and is another standard gaussian that is correlated with with constant correlation factor .
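The maximum-likelihood estimator mentioned above lost its formula in extraction. Under the constant-volatility model, log-returns over equal steps of length dt are i.i.d. normal with variance sigma^2 dt, so a simple estimate can be sketched as follows (the price series here is made up):

    import numpy as np

    prices = np.array([100.0, 101.2, 99.8, 102.5, 103.1, 101.9])   # illustrative closes
    dt = 1.0 / 252.0                                               # daily steps, in years

    log_returns = np.diff(np.log(prices))
    # Under geometric Brownian motion, Var(log-return) = sigma^2 * dt.
    sigma_hat = np.sqrt(log_returns.var(ddof=1) / dt)
    print("annualized volatility estimate:", sigma_hat)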
  • 155. Stochastic volatility 152 Heston model The popular Heston model is a commonly used SV model, in which the randomness of the variance process varies as the square root of variance. In this case, the differential equation for variance takes the form: where is the mean long-term volatility, is the rate at which the volatility reverts toward its long-term mean, is the volatility of the volatility process, and is, like , a gaussian with zero mean and unit standard deviation. However, and are correlated with the constant correlation value . In other words, the Heston SV model assumes that the variance is a random process that 1. exhibits a tendency to revert towards a long-term mean at a rate , 2. exhibits a volatility proportional to the square root of its level 3. and whose source of randomness is correlated (with correlation ) with the randomness of the underlying's price processes. CEV Model The CEV model describes the relationship between volatility and price, introducing stochastic volatility: Conceptually, in some markets volatility rises when prices rise (e.g. commodities), so . In other markets, volatility tends to rise as prices fall, modelled with . Some argue that because the CEV model does not incorporate its own stochastic process for volatility, it is not truly a stochastic volatility model. Instead, they call it a local volatility model. SABR volatility model The SABR model (Stochastic Alpha, Beta, Rho) describes a single forward (related to any asset e.g. an index, interest rate, bond, currency or equity) under stochastic volatility : The initial values and are the current forward price and volatility, whereas and are two correlated Wiener processes (i.e. Brownian motions) with correlation coefficient . The constant parameters are such that . The main feature of the SABR model is to be able to reproduce the smile effect of the volatility smile. GARCH model The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model is another popular model for estimating stochastic volatility. It assumes that the randomness of the variance process varies with the variance, as opposed to the square root of the variance as in the Heston model. The standard GARCH(1,1) model has the following form for the variance differential: The GARCH model has been extended via numerous variants, including the NGARCH, TGARCH, IGARCH, LGARCH, EGARCH, GJR-GARCH, etc.
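The defining equations of the models above were stripped in extraction. For orientation, they are commonly written as follows (a reconstruction in one standard notation; the symbols need not match the original article):

    Heston variance:   d\nu_t = \kappa(\theta - \nu_t)\,dt + \xi\sqrt{\nu_t}\,dB_t
    CEV:               dS_t = \mu S_t\,dt + \sigma S_t^{\gamma}\,dW_t
    SABR:              dF_t = \sigma_t F_t^{\beta}\,dW_t, \quad d\sigma_t = \alpha\,\sigma_t\,dZ_t, \quad dW_t\,dZ_t = \rho\,dt
    GARCH diffusion:   d\nu_t = \kappa(\theta - \nu_t)\,dt + \xi\,\nu_t\,dB_t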
  • 156. Stochastic volatility 153 3/2 model The 3/2 model is similar to the Heston model, but assumes that the randomness of the variance process varies with . The form of the variance differential is: . However the meaning of the parameters is different from Heston model. In this model both, mean reverting and volatility of variance parameters, are stochastic quantities given by and respectively. Chen model In interest rate modelings, Lin Chen in 1994 developed the first stochastic mean and stochastic volatility model, Chen model. Specifically, the dynamics of the instantaneous interest rate are given by following the stochastic differential equations: , , . Calibration Once a particular SV model is chosen, it must be calibrated against existing market data. Calibration is the process of identifying the set of model parameters that are most likely given the observed data. One popular technique is to use Maximum Likelihood Estimation (MLE). For instance, in the Heston model, the set of model parameters can be estimated applying an MLE algorithm such as the Powell Directed Set method [1] to observations of historic underlying security prices. In this case, you start with an estimate for , compute the residual errors when applying the historic price data to the resulting model, and then adjust to try to minimize these errors. Once the calibration has been performed, it is standard practice to re-calibrate the model periodically. References • Stochastic Volatility and Mean-variance Analysis [2], Hyungsok Ahn, Paul Wilmott, (2006). • A closed-form solution for options with stochastic volatility [3], SL Heston, (1993). • Inside Volatility Arbitrage [4], Alireza Javaheri, (2005). • Accelerating the Calibration of Stochastic Volatility Models [5], Kilin, Fiodar (2006). • Lin Chen (1996). Stochastic Mean and Stochastic Volatility -- A Three-Factor Model of the Term Structure of Interest Rates and Its Application to the Pricing of Interest Rate Derivatives. Blackwell Publishers.. Blackwell Publishers. References [1] http:/ / www. library. cornell. edu/ nr/ bookcpdf. html [2] http:/ / www. wilmott. com/ detail. cfm?articleID=245 [3] http:/ / www. javaquant. net/ papers/ Heston-original. pdf [4] http:/ / www. amazon. com/ s?platform=gurupa& url=index%3Dblended& keywords=inside+ volatility+ arbitrage [5] http:/ / ssrn. com/ abstract=982221
  • 157. BlackScholes 154 Black–Scholes The Black–Scholes model /ˌblækˈʃoʊlz/[1] or Black–Scholes–Merton is a mathematical model of a financial market containing certain derivative investment instruments. From the model, one can deduce the Black–Scholes formula, which gives the price of European-style options. The formula led to a boom in options trading and legitimised scientifically the activities of the Chicago Board Options Exchange and other options markets around the world.[2] lt is widely used by options market participants.[3]:751 Many empirical tests have shown the Black–Scholes price is “fairly close” to the observed prices, although there are well-known discrepancies such as the “option smile”.[3]:770–771 The model was first articulated by Fischer Black and Myron Scholes in their 1973 paper, “The Pricing of Options and Corporate Liabilities", published in the Journal of Political Economy. They derived a partial differential equation, now called the Black–Scholes equation, which governs the price of the option over time. The key idea behind the derivation was to hedge perfectly the option by buying and selling the underlying asset in just the right way and consequently "eliminate risk". This hedge is called delta hedging and is the basis of more complicated hedging strategies such as those engaged in by Wall Street investment banks. The hedge implies there is only one right price for the option and it is given by the Black–Scholes formula. Robert C. Merton was the first to publish a paper expanding the mathematical understanding of the options pricing model and coined the term Black–Scholes options pricing model. Merton and Scholes received the 1997 Nobel Prize in Economics (The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel) for their work. Though ineligible for the prize because of his death in 1995, Black was mentioned as a contributor by the Swedish academy.[4] Assumptions The Black–Scholes model of the market for a particular stock makes the following explicit assumptions: • There is no arbitrage opportunity (i.e., there is no way to make a riskless profit). • It is possible to borrow and lend cash at a known constant risk-free interest rate. • It is possible to buy and sell any amount, even fractional, of stock (this includes short selling). • The above transactions do not incur any fees or costs (i.e., frictionless market). • The stock price follows a geometric Brownian motion with constant drift and volatility. • The underlying security does not pay a dividend.[5] From these assumptions, Black and Scholes showed that “it is possible to create a hedged position, consisting of a long position in the stock and a short position in the option, whose value will not depend on the price of the stock.”[6] Several of these assumptions of the original model have been removed in subsequent extensions of the model. Modern versions account for changing interest rates (Merton, 1976), transaction costs and taxes (Ingersoll, 1976), and dividend payout.[7]
  • 158. BlackScholes 155 Notation Let , be the price of the stock (please note inconsistencies as below). , the price of a derivative as a function of time and stock price. the price of a European call option and the price of a European put option. , the strike of the option. , the annualized risk-free interest rate, continuously compounded (the force of interest). , the drift rate of , annualized. , the volatility of the stock's returns; this is the square root of the quadratic variation of the stock's log price process. , a time in years; we generally use: now=0, expiry=T. , the value of a portfolio. Finally we will use which denotes the standard normal cumulative distribution function, . which denotes the standard normal probability density function, . Inconsistencies The reader is warned of the inconsistent notation that appears in this article. Thus the letter is used as: 1. a constant denoting the current price of the stock 2. a real variable denoting the price at an arbitrary time 3. a random variable denoting the price at maturity 4. a stochastic process denoting the price at an arbitrary time It is also used in the meaning of (4) with a subscript denoting time, but here the subscript is merely a mnemonic. In the partial derivatives, the letters in the numerators and denominators are, of course, real variables, and the partial derivatives themselves are, initially, real functions of real variables. But after the substitution of a stochastic process for one of the arguments they become stochastic processes. The Black–Scholes PDE is, initially, a statement about the stochastic process , but when is reinterpreted as a real variable, it becomes an ordinary PDE. It is only then that we can ask about its solution. The parameter that appears in the discrete-dividend model and the elementary derivation is not the same as the parameter that appears elsewhere in the article. For the relationship between them see Geometric Brownian motion.
  • 159. BlackScholes 156 The Black–Scholes equation As above, the Black–Scholes equation is a partial differential equation, which describes the price of the option over time. The key idea behind the equation is that one can perfectly hedge the option by buying and selling the underlying asset in just the right way and consequently “eliminate risk". This hedge, in turn, implies that there is only one right price for the option, as returned by the Black–Scholes formula given in the next section. The Equation: Simulated geometric Brownian motions with parameters from market data Derivation The following derivation is given in Hull's Options, Futures, and Other Derivatives.[8]:287–288 That, in turn, is based on the classic argument in the original Black–Scholes paper. Per the model assumptions above, the price of the underlying asset (typically a stock) follows a geometric Brownian motion. That is, where W is Brownian motion. Note that W, and consequently its infinitesimal increment dW, represents the only source of uncertainty in the price history of the stock. Intuitively, W(t) is a process that "wiggles up and down" in such a random way that its expected change over any time interval is 0. (In addition, its variance over time T is equal to T; see Wiener process: Basic properties); a good discrete analogue for W is a simple random walk. Thus the above equation states that the infinitesimal rate of return on the stock has an expected value of μ dt and a variance of . The payoff of an option at maturity is known. To find its value at an earlier time we need to know how evolves as a function of and . By Itō's lemma for two variables we have Now consider a certain portfolio, called the delta-hedge portfolio, consisting of being short one option and long shares at time . The value of these holdings is Over the time period , the total profit or loss from changes in the values of the holdings is: Now discretize the equations for dS/S and dV by replacing differentials with deltas: and appropriately substitute them into the expression for :
  • 160. BlackScholes 157 Notice that the term has vanished. Thus uncertainty has been eliminated and the portfolio is effectively riskless. The rate of return on this portfolio must be equal to the rate of return on any other riskless instrument; otherwise, there would be opportunities for arbitrage. Now assuming the risk-free rate of return is we must have over the time period If we now equate our two formulas for we obtain: Simplifying, we arrive at the celebrated Black–Scholes partial differential equation: With the assumptions of the Black–Scholes model, this second order partial differential equation holds for any type of option as long as its price function is twice differentiable with respect to and once with respect to . Different pricing formulae for various options will arise from the choice of payoff function at expiry and appropriate boundary conditions. Black–Scholes formula The Black–Scholes formula calculates the price of European put and call options. This price is consistent with the Black–Scholes equation as above; this follows since the formula can be obtained by solving the equation for the corresponding terminal and boundary conditions. The value of a call option for a non-dividend paying underlying stock in terms of the Black–Scholes parameters is: A European call valued using the Black-Scholes pricing equation for varying asset price S and time-to-expiry T. In this particular example, the strike price is set to unity. The price of a corresponding put option based on put-call parity is: For both, as above:
  • 161. BlackScholes 158 • is the cumulative distribution function of the standard normal distribution • is the time to maturity • is the spot price of the underlying asset • is the strike price • is the risk free rate (annual rate, expressed in terms of continuous compounding) • is the volatility of returns of the underlying asset Alternative formulation Introducing some auxiliary variables allows the formula to be simplified and reformulated in a more intuitive form: The auxiliary variables are: • is the time to expiry (remaining time, backwards time) • is the discount factor • is the forward price of the underlying asset, and with d+ = d1 and d− = d2 to clarify notation. Given put-call parity, which is expressed in these terms as: the price of a put option is: If one uses spot S instead of forward F,, in there is instead a factor of which can be interpreted as a drift factor in the risk-neutral measure for appropriate numéraire (see below). Interpretation The Black–Scholes formula can be interpreted fairly easily, with the main subtlety the interpretation of the (and a fortiori ) terms, and why they are different.[9] The formula can be interpreted by first decomposing a call option into the difference of two binary options: an asset-or-nothing call minus a cash-or-nothing call (long an asset-or-nothing call, short a cash-or-nothing call). A call option exchanges cash for an asset at expiry, while an asset-or-nothing call just yields the asset (with no cash in exchange) and a cash-or-nothing call just yields cash (with no asset in exchange). The Black–Scholes formula is a difference of two terms, and these two terms equal the value of the binary call options. These binary options are much less frequently traded than vanilla call options, but are easier to analyze. Thus the formula: breaks up as: where is the present value of an asset-or-nothing call and is the present value of a cash-or-nothing call. The D factor is for discounting, because the expiration date is in future, and removing it changes present value to future value (value at expiry). Thus is the future value of an asset-or-nothing call and is the future value of a cash-or-nothing call. In risk-neutral terms, these are the expected value of the asset and the expected value of the cash in the risk-neutral measure.
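The call and put pricing formulas referenced above lost their displayed mathematics. With the parameter definitions just given, the standard statements are (reconstruction for reference):

    C(S, t) = S\,N(d_1) - K e^{-r(T-t)}\,N(d_2), \qquad P(S, t) = K e^{-r(T-t)}\,N(-d_2) - S\,N(-d_1),

    d_1 = \frac{\ln(S/K) + \big(r + \tfrac{1}{2}\sigma^2\big)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}.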
  • 162. BlackScholes 159 The naive – and not quite correct – interpretation of these terms is that is the probability of the option expiring in the money times the value of the underlying at expiry F, while is the probability of the option expiring in the money times the value of the cash at expiry K. This is obviously incorrect, as either both binaries expire in the money or both expire out of the money (either cash is exchanged for asset or it is not), but the probabilities and are not equal. In fact, can be interpreted as measures of moneyness and as probabilities of expiring ITM, suitably interpreted, as discussed below. Simply put, the interpretation of the cash option, is correct, as the value of the cash is independent of movements of the underlying, and thus can be interpreted as a simple product of "probability times value", while the is more complicated, as the probability of expiring in the money and the value of the asset at expiry are not independent.[9] for moneyness rather than the simple moneyness The use of d− – in other words, the reason for the factor – is due to the difference between the median and mean of the log-normal distribution; it is the same factor as in Itō's lemma applied to geometric Brownian motion. Further, another way to see that the naive interpretation is incorrect is that replacing N(d+) by N(d−) in the formula yields a negative value for out-of-the-money call options.[9]:6 In detail, the terms are the probabilities of the option expiring in-the-money under the equivalent exponential martingale probability measure (numéraire=stock) and the equivalent martingale probability measure (numéraire=risk free asset), respectively.[9] The risk neutral probability density for the stock price is where is defined as above. Specifically, is the probability that the call will be exercised provided one assumes that the asset drift is the risk-free rate. , however, does not lend itself to a simple probability interpretation. is correctly interpreted as the present value, using the risk-free interest rate, of the expected asset price at expiration, given that the asset price at expiration is above the exercise price.[10] For related discussion – and graphical representation – see section "Interpretation" under Datar–Mathews method for real option valuation. The equivalent martingale probability measure is also called the risk-neutral probability measure. Note that both of these are probabilities in a measure theoretic sense, and neither of these is the true probability of expiring in-the-money under the real probability measure. To calculate the probability under the real (“physical”) probability measure, additional information is required—the drift term in the physical measure, or equivalently, the market price of risk. Derivation We now show how to get from the general Black–Scholes PDE to a specific valuation for an option. Consider as an example the Black–Scholes price of a call option, for which the PDE above has boundary conditions The last condition gives the value of the option at the time that the option matures. The solution of the PDE gives the value of the option at any earlier time, . To solve the PDE we recognize that it is a Cauchy–Euler equation which can be transformed into a diffusion equation by introducing the change-of-variable transformation
  • 163. BlackScholes 160 Then the Black–Scholes PDE becomes a diffusion equation The terminal condition now becomes an initial condition Using the standard method for solving a diffusion equation we have which, after some manipulations, yields where Reverting to the original set of variables yields the above stated solution to the Black–Scholes equation. Other derivations Above we used the method of arbitrage-free pricing (“delta-hedging”) to derive the Black–Scholes PDE, and then solved the PDE to get the valuation formula. It is also possible to derive the latter directly using a Risk neutrality argument.[9] This method gives the price as the expectation of the option payoff under a particular probability measure, called the risk-neutral measure, which differs from the real world measure. For the underlying logic see section "risk neutral valuation" under Rational pricing as well as section "Derivatives pricing: the Q world" under Mathematical finance; for detail, once again, see Hull.[8]:307–309 The Greeks “The Greeks” measure the sensitivity to change of the option price under a slight change of a single parameter while holding the other parameters fixed. Formally, they are partial derivatives of the option price with respect to the independent variables (technically, one Greek, gamma, is a partial derivative of another Greek, called delta). The Greeks are not only important for the mathematical theory of finance, but for those actively involved in trading. Financial institutions will typically set limits for the Greeks that their trader cannot exceed. Delta is the most important Greek and traders will zero their delta at the end of the day. Gamma and vega are also important but not as closely monitored. The Greeks for Black–Scholes are given in closed form below. They can be obtained by straightforward differentiation of the Black–Scholes formula.[11]
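The table that follows lost its formulas in extraction. The standard closed forms, consistent with N and N' as defined in the Notation section, are (reconstruction):

    Delta:  call N(d_1), put N(d_1) - 1
    Gamma:  \frac{N'(d_1)}{S\,\sigma\sqrt{T-t}}   (same for calls and puts)
    Vega:   S\,N'(d_1)\sqrt{T-t}   (same for calls and puts)
    Theta:  call -\frac{S\,N'(d_1)\,\sigma}{2\sqrt{T-t}} - r K e^{-r(T-t)} N(d_2),  put -\frac{S\,N'(d_1)\,\sigma}{2\sqrt{T-t}} + r K e^{-r(T-t)} N(-d_2)
    Rho:    call K (T-t) e^{-r(T-t)} N(d_2),  put -K (T-t) e^{-r(T-t)} N(-d_2)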
  • 164. BlackScholes 161 What Calls Puts delta gamma vega theta rho Note that the gamma and vega formulas are the same for calls and puts. This can be seen directly from put–call parity, since the difference of a put and a call is a forward, which is linear in S and independent of σ (so the gamma and vega of a forward vanish). In practice, some sensitivities are usually quoted in scaled-down terms, to match the scale of likely changes in the parameters. For example, rho is often reported multiplied by 10,000 (1bp rate change), vega by 100 (1 vol point change), and theta by 365 or 252 (1 day decay based on either calendar days or trading days per year). Extensions of the model The above model can be extended for variable (but deterministic) rates and volatilities. The model may also be used to value European options on instruments paying dividends. In this case, closed-form solutions are available if the dividend is a known proportion of the stock price. American options and options on stocks paying a known cash dividend (in the short term, more realistic than a proportional dividend) are more difficult to value, and a choice of solution techniques is available (for example lattices and grids). Instruments paying continuous yield dividends For options on indexes, it is reasonable to make the simplifying assumption that dividends are paid continuously, and that the dividend amount is proportional to the level of the index. The dividend payment paid over the time period is then modelled as for some constant (the dividend yield). Under this formulation the arbitrage-free price implied by the Black–Scholes model can be shown to be and where now is the modified forward price that occurs in the terms : and
  • 165. BlackScholes 162 Exactly the same formula is used to price options on foreign exchange rates, except that now q plays the role of the foreign risk-free interest rate and S is the spot exchange rate. This is the Garman–Kohlhagen model (1983). Instruments paying discrete proportional dividends It is also possible to extend the Black–Scholes framework to options on instruments paying discrete proportional dividends. This is useful when the option is struck on a single stock. A typical model is to assume that a proportion of the stock price is paid out at pre-determined times . The price of the stock is then modelled as where is the number of dividends that have been paid by time . The price of a call option on such a stock is again where now is the forward price for the dividend paying stock. American options The problem of finding the price of an American option is related to the optimal stopping problem of finding the time to execute the option. Since the American option can be exercised at any time before the expiration date, the Black-Scholes equation becomes an inequality of the form [12] With the terminal and (free) boundary conditions: and where denotes the payoff at stock price In general this inequality does not have a closed form solution, though an American call with no dividends is equal to a European call and the Roll-Geske-Whaley method provides a solution for an American call with one dividend.[13][14] Barone-Adesi and Whaley[15] is a further approximation formula. Here, the stochastic differential equation (which is valid for the value of any derivative) is split into two components: the European option value and the early exercise premium. With some assumptions, a quadratic equation that approximates the solution for the latter is then obtained. This solution involves finding the critical value, , such that one is indifferent between early exercise and holding to maturity.[16][17] Bjerksund and Stensland[18] provide an approximation based on an exercise strategy corresponding to a trigger price. Here, if the underlying asset price is greater than or equal to the trigger price it is optimal to exercise, and the value must equal , otherwise the option “boils down to: (i) a European up-and-out call option… and (ii) a rebate that is received at the knock-out date if the option is knocked out prior to the maturity date.” The formula is readily modified for the valuation of a put option, using put call parity. This approximation is computationally inexpensive and the method is fast, with evidence indicating that the approximation may be more accurate in pricing long dated options than Barone-Adesi and Whaley. [19]
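As a concrete example of the lattice methods mentioned above for American options, here is a minimal Cox–Ross–Rubinstein binomial sketch with an early-exercise check at every node; the inputs at the bottom are illustrative.

    import numpy as np

    def american_put_crr(S0, K, r, sigma, T, steps=500):
        """Price an American put on a non-dividend stock with a CRR binomial tree."""
        dt = T / steps
        u = np.exp(sigma * np.sqrt(dt))
        d = 1.0 / u
        p = (np.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
        disc = np.exp(-r * dt)

        # Terminal stock prices and payoffs.
        j = np.arange(steps + 1)
        S = S0 * u**j * d**(steps - j)
        V = np.maximum(K - S, 0.0)

        # Backward induction, comparing continuation value with immediate exercise.
        for n in range(steps - 1, -1, -1):
            j = np.arange(n + 1)
            S = S0 * u**j * d**(n - j)
            V = np.maximum(disc * (p * V[1:] + (1 - p) * V[:-1]), K - S)
        return V[0]

    print(american_put_crr(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0))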
  • 166. BlackScholes 163 Black–Scholes in practice The Black–Scholes model disagrees with reality in a number of ways, some significant. It is widely employed as a useful approximation, but proper application requires understanding its limitations – blindly following the model exposes the user to unexpected risk. Among the most significant limitations are: • the underestimation of extreme moves, yielding tail risk, which can be hedged with out-of-the-money options; • the assumption of instant, cost-less trading, yielding liquidity risk, which is difficult to hedge; • the assumption of a stationary process, yielding volatility risk, which can be hedged with volatility hedging; • the assumption of continuous time and continuous trading, yielding gap risk, which can be hedged with Gamma hedging. In short, while in the Black–Scholes model one can perfectly hedge options by simply Delta hedging, in practice there are many other The normality assumption of the Black–Scholes sources of risk. model does not capture extreme movements such Results using the Black–Scholes model differ from real world prices as stock market crashes. because of simplifying assumptions of the model. One significant limitation is that in reality security prices do not follow a strict stationary log-normal process, nor is the risk-free interest actually known (and is not constant over time). The variance has been observed to be non-constant leading to models such as GARCH to model volatility changes. Pricing discrepancies between empirical and the Black–Scholes model have long been observed in options that are far out-of-the-money, corresponding to extreme price changes; such events would be very rare if returns were lognormally distributed, but are observed much more often in practice. Nevertheless, Black–Scholes pricing is widely used in practice,[20][3]:751 for it is: • easy to calculate • a useful approximation, particularly when analyzing the direction in which prices move when crossing critical points • a robust basis for more refined models • reversible, as the model's original output -- price -- can be used as an input and one of the other variables solved for; the implied volatility calculated in this way is often used to quote option prices (that is, as a quoting convention) The first point is self-evidently useful. The others can be further discussed: Useful approximation: although volatility is not constant, results from the model are often helpful in setting up hedges in the correct proportions to minimize risk. Even when the results are not completely accurate, they serve as a first approximation to which adjustments can be made. Basis for more refined models: The Black–Scholes model is robust in that it can be adjusted to deal with some of its failures. Rather than considering some parameters (such as volatility or interest rates) as constant, one considers them as variables, and thus added sources of risk. This is reflected in the Greeks (the change in option value for a change in these parameters, or equivalently the partial derivatives with respect to these variables), and hedging these Greeks mitigates the risk caused by the non-constant nature of these parameters. Other defects cannot be mitigated by modifying the model, however, notably tail risk and liquidity risk, and these are instead managed outside the model, chiefly by minimizing these risks and by stress testing.
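The "reversible" property listed above can be made concrete: because the option value is a monotone increasing function of volatility, the implied volatility can be recovered from a quoted price with a one-dimensional root search. A minimal sketch, reusing the bs_price function from the earlier example; the bisection bracket and tolerance are arbitrary illustrative choices.

def implied_vol(price, S, K, T, r, q=0.0, call=True, tol=1e-8):
    """Back out the Black-Scholes implied volatility by bisection.

    Assumes the quoted price lies within the no-arbitrage bounds, so that
    a root exists inside the bracket (1e-6, 5.0).
    """
    lo, hi = 1e-6, 5.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_price(S, K, T, r, mid, q, call) < price:
            lo = mid              # model price too low, so volatility too low
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

Computing this quantity for traded options across different strikes and maturities is how the implied volatility surface discussed below is built.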
  • 167. BlackScholes 164 Explicit modeling: this feature mean that, rather than assuming a volatility a priori and computing prices from it, one can use the model to solve for volatility, which gives the implied volatility of an option at given prices, durations and exercise prices. Solving for volatility over a given set of durations and strike prices one can construct an implied volatility surface. In this application of the Black–Scholes model, a coordinate transformation from the price domain to the volatility domain is obtained. Rather than quoting option prices in terms of dollars per unit (which are hard to compare across strikes and tenors), option prices can thus be quoted in terms of implied volatility, which leads to trading of volatility in option markets. The volatility smile One of the attractive features of the Black–Scholes model is that the parameters in the model (other than the volatility) — the time to maturity, the strike, the risk-free interest rate, and the current underlying price – are unequivocally observable. All other things being equal, an option's theoretical value is a monotonic increasing function of implied volatility. By computing the implied volatility for traded options with different strikes and maturities, the Black–Scholes model can be tested. If the Black–Scholes model held, then the implied volatility for a particular stock would be the same for all strikes and maturities. In practice, the volatility surface (the 3D graph of implied volatility against strike and maturity) is not flat. The typical shape of the implied volatility curve for a given maturity depends on the underlying instrument. Equities tend to have skewed curves: compared to at-the-money, implied volatility is substantially higher for low strikes, and slightly lower for high strikes. Currencies tend to have more symmetrical curves, with implied volatility lowest at-the-money, and higher volatilities in both wings. Commodities often have the reverse behavior to equities, with higher implied volatility for higher strikes. Despite the existence of the volatility smile (and the violation of all the other assumptions of the Black–Scholes model), the Black–Scholes PDE and Black–Scholes formula are still used extensively in practice. A typical approach is to regard the volatility surface as a fact about the market, and use an implied volatility from it in a Black–Scholes valuation model. This has been described as using "the wrong number in the wrong formula to get the right price."[21] This approach also gives usable values for the hedge ratios (the Greeks). Even when more advanced models are used, traders prefer to think in terms of volatility as it allows them to evaluate and compare options of different maturities, strikes, and so on. Valuing bond options Black–Scholes cannot be applied directly to bond securities because of pull-to-par. As the bond reaches its maturity date, all of the prices involved with the bond become known, thereby decreasing its volatility, and the simple Black–Scholes model does not reflect this process. A large number of extensions to Black–Scholes, beginning with the Black model, have been used to deal with this phenomenon.[22] See Bond option: Valuation. Interest-rate curve In practice, interest rates are not constant – they vary by tenor, giving an interest rate curve which may be interpolated to pick an appropriate rate to use in the Black–Scholes formula. Another consideration is that interest rates vary over time. 
This volatility may make a significant contribution to the price, especially for long-dated options. The effect is analogous to the inverse relationship between interest rates and bond prices.
  • 168. BlackScholes 165 Short stock rate It is not free to take a short stock position. Similarly, it may be possible to lend out a long stock position for a small fee. In either case, this can be treated as a continuous dividend for the purposes of a Black–Scholes valuation, provided that there is no glaring asymmetry between the short stock borrowing cost and the long stock lending income. Criticism Espen Gaarder Haug and Nassim Nicholas Taleb argue that the Black–Scholes model merely recast existing widely used models in terms of practically impossible "dynamic hedging" rather than "risk," to make them more compatible with mainstream neoclassical economic theory.[23] Similar arguments were made in an earlier paper by Emanuel Derman and Nassim Taleb.[24] In response, Paul Wilmott has defended the model.[20][25] Jean-Philippe Bouchaud argues: Reliance on models based on incorrect axioms has clear and large effects. The Black–Scholes model,[26] for example, which was invented in 1973 to price options, is still used extensively. But it assumes that the probability of extreme price changes is negligible, when in reality, stock prices are much jerkier than this. Unwarranted use of the model spiralled into the worldwide October 1987 crash; the Dow Jones index dropped 23% in a single day, dwarfing recent market hiccups. Using the Student's t-distribution in place of the normal distribution as basis for the valuation of options can better take in account extreme events. Notes [1] "Scholes" (http:/ / www. merriam-webster. com/ dictionary/ scholes). . Retrieved March 26, 2012. [2] MacKenzie, Donald (2006). An Engine, Not a Camera: How Financial Models Shape Markets. Cambridge, MA: MIT Press. ISBN 0-262-13460-8. [3] Bodie, Zvi; Alex Kane, Alan J. Marcus (2008). Investments (7th ed.). New York: McGraw-Hill/Irwin. ISBN 978-0-07-326967-2. [4] "Nobel prize foundation, 1997 Press release" (http:/ / nobelprize. org/ nobel_prizes/ economics/ laureates/ 1997/ press. html). October 14, 1997. . Retrieved March 26, 2012. [5] Although the original model assumed no dividends, trivial extensions to the model can accommodate a continuous dividend yield factor. [6] Black, Fischer; Scholes, Myron. "The Pricing of Options and Corporate Liabilities". Journal of Political Economy 81 (3): 637–654. [7] Merton, Robert. "Theory of Rational Option Pricing". Bell Journal of Economics and Management Science 4 (1): 141–183. doi:10.2307/3003143.. [8] Hull, John C. (2008). Options, Futures and Other Derivatives (7 ed.). Prentice Hall. ISBN 0-13-505283-1. [9] Nielsen, Lars Tyge (1993). "Understanding N(d1) and N(d2): Risk-Adjusted Probabilities in the Black-Scholes Model" (http:/ / www. ltnielsen. com/ wp-content/ uploads/ Understanding. pdf). Revue Finance (Journal of the French Finance Association) 14 ( 1 (http:/ / www. affi. asso. fr/ TPL_CODE/ TPL_REVUE/ PAR_TPL_IDENTIFIANT/ 53/ 193-publications. htm)): 95–106. . Retrieved 2012 Dec 8, earlier circulated as INSEAD Working Paper 92/71/FIN (http:/ / librarycatalogue. insead. edu/ bib/ 972) (1992); abstract (http:/ / www. ltnielsen. com/ papers/ understanding-nd1-and-nd2-risk-adjusted-probabilities-in-the-black-scholes-model) and link to article, published article (http:/ / www. affi. asso. fr/ TPL_CODE/ TPL_REVUEARTICLEDOWNLOAD/ PAR_TPL_IDENTIFIANT/ 187/ 193-publications. htm). [10] Don Chance (June 3, 2011). "Derivation and Interpretation of the Black–Scholes Model" (http:/ / www. bus. lsu. edu/ academics/ finance/ faculty/ dchance/ Instructional/ TN99-02. pdf) (PDF). . 
Retrieved March 27, 2012. [11] Although with significant algebra; see, for example, Hong-Yi Chen, Cheng-Few Lee and Weikang Shih (2010). Derivations and Applications of Greek Letters: Review and Integration (https:/ / docs. google. com/ viewer?a=v& q=cache:ai5xEtbLxCIJ:centerforpbbefr. rutgers. edu/ TaipeiPBFR& D/ 01-16-09%20papers/ 5-4%20Greek%20letters. doc+ Derivations+ and+ Applications+ of+ Greek+ Letters+ –+ Review+ and+ Integration& hl=en& pid=bl& srcid=ADGEEShU4q28apOYjO-BmqXOJTOHj2BG0BgnxtLn-ccCfh27FYlCDla0nspYCidFFFWiPfYjM2PTT0_109Lth79rFwKsenMFpawjU9BtpBSQO81hUj0 sig=AHIEtbREe6Jg8SlzylhuYC9xEoG0eG3dGg), Handbook of Quantitative Finance and Risk Management, III:491–503. [12] André Jaun. "The Black-Scholes equation for American options" (http:/ / www. lifelong-learners. com/ opt/ com/ SYL/ s6node6. php). . Retrieved May 5, 2012. [13] Bernt Ødegaard (2003). "Extending the Black Scholes formula" (http:/ / finance. bi. no/ ~bernt/ gcc_prog/ recipes/ recipes/ node9. html#SECTION00920000000000000000). . Retrieved May 5, 2012. [14] Don Chance (2008). "Closed-Form American Call Option Pricing: Roll-Geske-Whaley" (http:/ / www. bus. lsu. edu/ academics/ finance/ faculty/ dchance/ Instructional/ TN98-01. pdf). . Retrieved May 16, 2012. [15] Giovanni Barone-Adesi and Robert E Whaley (June 1987). "Efficient analytic approximation of American option values" (http:/ / ideas. repec. org/ a/ bla/ jfinan/ v42y1987i2p301-20. html). Journal of Finance 42 (2): 301-20. .
  • 169. BlackScholes 166 [16] Bernt Ødegaard (2003). "A quadratic approximation to American prices due to Barone-Adesi and Whaley" (http:/ / finance. bi. no/ ~bernt/ gcc_prog/ recipes/ recipes/ node13. html). . Retrieved June 25, 2012. [17] Don Chance (2008). "Approximation Of American Option Values: Barone-Adesi-Whaley" (http:/ / www. bus. lsu. edu/ academics/ finance/ faculty/ dchance/ Instructional/ TN98-02. pdf). . Retrieved June 25, 2012. [18] Petter Bjerksund and Gunnar Stensland, 2002. Closed Form Valuation of American Options (http:/ / brage. bibsys. no/ nhh/ bitstream/ URN:NBN:no-bibsys_brage_22301/ 1/ bjerksund petter 0902. pdf) [19] American options (http:/ / www. global-derivatives. com/ index. php?option=com_content& task=view& id=14) [20] Paul Wilmott (2008): In defence of Black Scholes and Merton (http:/ / www. wilmott. com/ blogs/ paul/ index. cfm/ 2008/ 4/ 29/ Science-in-Finance-IX-In-defence-of-Black-Scholes-and-Merton), Dynamic hedging and further defence of Black-Scholes (http:/ / www. wilmott. com/ blogs/ paul/ index. cfm/ 2008/ 7/ 23/ Science-in-Finance-X-Dynamic-hedging-and-further-defence-of-BlackScholes) [21] Riccardo Rebonato (1999). Volatility and correlation in the pricing of equity, FX and interest-rate options. Wiley. ISBN 0-471-89998-4. [22] Kalotay, Andrew (November 1995). "The Problem with Black, Scholes et al." (http:/ / kalotay. com/ sites/ default/ files/ private/ BlackScholes. pdf) (PDF). Derivatives Strategy. . [23] Espen Gaarder Haug and Nassim Nicholas Taleb (2011). Option Traders Use (very) Sophisticated Heuristics, Never the Black–Scholes–Merton Formula (http:/ / papers. ssrn. com/ sol3/ papers. cfm?abstract_id=1012075). Journal of Economic Behavior and Organization, Vol. 77, No. 2, 2011 [24] Emanuel Derman and Nassim Taleb (2005). The illusions of dynamic replication (http:/ / www. ederman. com/ new/ docs/ qf-Illusions-dynamic. pdf), Quantitative Finance, Vol. 5, No. 4, August 2005, 323–326 [25] See also: Doriana Ruffinno and Jonathan Treussard (2006). Derman and Taleb’s The Illusions of Dynamic Replication: A Comment (http:/ / wayback. archive. org/ web/ */ http:/ / www. bu. edu/ econ/ workingpapers/ papers/ RuffinoTreussardDT. pdf), WP2006-019, Boston University - Department of Economics. [26] Bouchaud, Jean-Philippe (October 30, 2008). "Economics needs a scientific revolution" (http:/ / arxiv. org/ abs/ 0810. 5306v1). Nature 455: 1181. doi:10.1038/4551181a. . References Primary references • Black, Fischer; Myron Scholes (1973). "The Pricing of Options and Corporate Liabilities". Journal of Political Economy 81 (3): 637–654. doi:10.1086/260062. (http://links.jstor.org/sici?sici=0022-3808(197305/ 06)81:3<637:TPOOAC>2.0.CO;2-P) (Black and Scholes' original paper.) • Merton, Robert C. (1973). "Theory of Rational Option Pricing". Bell Journal of Economics and Management Science (The RAND Corporation) 4 (1): 141–183. doi:10.2307/3003143. JSTOR 3003143. (http://links.jstor. org/sici?sici=0005-8556(197321)4:1<141:TOROP>2.0.CO;2-0&origin=repec) • Hull, John C. (1997). Options, Futures, and Other Derivatives. Prentice Hall. ISBN 0-13-601589-1. Historical and sociological aspects • Bernstein, Peter (1992). Capital Ideas: The Improbable Origins of Modern Wall Street. The Free Press. ISBN 0-02-903012-9. • MacKenzie, Donald (2003). "An Equation and its Worlds: Bricolage, Exemplars, Disunity and Performativity in Financial Economics". 
Social Studies of Science 33 (6): 831–868. doi:10.1177/0306312703336002. (http://sss. sagepub.com/cgi/content/abstract/33/6/831) • MacKenzie, Donald; Yuval Millo (2003). "Constructing a Market, Performing Theory: The Historical Sociology of a Financial Derivatives Exchange". American Journal of Sociology 109 (1): 107–145. doi:10.1086/374404. (http://www.journals.uchicago.edu/AJS/journal/issues/v109n1/060259/brief/060259.abstract.html) • MacKenzie, Donald (2006). An Engine, not a Camera: How Financial Models Shape Markets. MIT Press. ISBN 0-262-13460-8.
  • 170. BlackScholes 167 Further reading • Haug, E. G (2007). "Option Pricing and Hedging from Theory to Practice". Derivatives: Models on Models. Wiley. ISBN 978-0-470-01322-9. The book gives a series of historical references supporting the theory that option traders use much more robust hedging and pricing principles than the Black, Scholes and Merton model. • Triana, Pablo (2009). Lecturing Birds on Flying: Can Mathematical Theories Destroy the Financial Markets?. Wiley. ISBN 978-0-470-40675-5. The book takes a critical look at the Black, Scholes and Merton model. External links Discussion of the model • Ajay Shah. Black, Merton and Scholes: Their work and its consequences. Economic and Political Weekly, XXXII(52):3337–3342, December 1997 link (http://www.mayin.org/ajayshah/PDFDOCS/Shah1997_bms. pdf) • Inside Wall Street's Black Hole (http://www.portfolio.com/news-markets/national-news/portfolio/2008/02/ 19/Black-Scholes-Pricing-Model?print=true) by Michael Lewis, March 2008 Issue of portfolio.com • Whither Black–Scholes? (http://www.forbes.com/opinions/2008/04/07/ black-scholes-options-oped-cx_ptp_{0}408black.html) by Pablo Triana, April 2008 Issue of Forbes.com • Black Scholes model lecture (http://wikilecture.org/Black_Scholes) by Professor Robert Shiller from Yale • The mathematical equation that caused the banks to crash (http://www.guardian.co.uk/science/2012/feb/12/ black-scholes-equation-credit-crunch) by Ian Stewart in The Observer, February 12, 2012 Derivation and solution • Derivation of the Black–Scholes Equation for Option Value (http://www.sjsu.edu/faculty/watkins/blacksch. htm), Prof. Thayer Watkins • Solution of the Black–Scholes Equation Using the Green's Function (http://www.physics.uci.edu/~silverma/ bseqn/bs/bs.html), Prof. Dennis Silverman • Solution via risk neutral pricing or via the PDE approach using Fourier transforms (http://homepages.nyu.edu/ ~sl1544/KnownClosedForms.pdf) (includes discussion of other option types), Simon Leger • Step-by-step solution of the Black–Scholes PDE (http://planetmath.org/encyclopedia/ AnalyticSolutionOfBlackScholesPDE.html), planetmath.org. • On the Black–Scholes Equation: Various Derivations (http://www.stanford.edu/~japrimbs/Publications/ OnBlackScholesEq.pdf), Manabu Kishimoto • The Black–Scholes Equation (http://terrytao.wordpress.com/2008/07/01/the-black-scholes-equation/) Expository article by mathematician Terence Tao. 
Revisiting the model • When You Cannot Hedge Continuously: The Corrections to Black–Scholes (http://www.ederman.com/new/ docs/risk-non_continuous_hedge.pdf), Emanuel Derman • Arbitrage and Stock Option Pricing: A Fresh Look At The Binomial Model (http://www.soa.org/library/ newsletters/risks-and-rewards/2011/august/rar-2011-iss58-joss.pdf) Computer implementations • Black–Scholes in Multiple Languages (http://www.espenhaug.com/black_scholes.html), espenhaug.com • Chicago Option Pricing Model (Graphing Version) (http://sourceforge.net/projects/chipricingmodel/), sourceforge.net
  • 171. BlackScholes 168 • Black-Scholes-Merton Implied Volatility Surface Model (Java) (https://github.com/OpenGamma/ OG-Platform/blob/master/projects/OG-Analytics/src/com/opengamma/analytics/financial/model/volatility/ surface/BlackScholesMertonImpliedVolatilitySurfaceModel.java), github.com Historical • Trillion Dollar Bet (http://www.pbs.org/wgbh/nova/stockmarket/)—Companion Web site to a Nova episode originally broadcast on February 8, 2000. "The film tells the fascinating story of the invention of the Black–Scholes Formula, a mathematical Holy Grail that forever altered the world of finance and earned its creators the 1997 Nobel Prize in Economics." • BBC Horizon (http://www.bbc.co.uk/science/horizon/1999/midas.shtml) A TV-programme on the so-called Midas formula and the bankruptcy of Long-Term Capital Management (LTCM) • BBC News Magazine (http://www.bbc.co.uk/news/magazine-17866646) Black-Scholes: The maths formula linked to the financial crash (April 27, 2012 article) Black model The Black model (sometimes known as the Black-76 model) is a variant of the Black–Scholes option pricing model. Its primary applications are for pricing bond options, interest rate caps / floors, and swaptions. It was first presented in a paper written by Fischer Black in 1976. Black's model can be generalized into a class of models known as log-normal forward models, also referred to as LIBOR market model. The Black formula The Black formula is similar to the Black–Scholes formula for valuing stock options except that the spot price of the underlying is replaced by a discounted futures price F. Suppose there is constant risk-free interest rate r and the futures price F(t) of a particular underlying is log-normal with constant volatility σ. Then the Black formula states the price for a European call option of maturity T on a futures contract with strike price K and delivery date T' (with ) is The corresponding put price is where and N(.) is the cumulative normal distribution function. Note that T' doesn't appear in the formulae even though it could be greater than T. This is because futures contracts are marked to market and so the payoff is realized when the option is exercised. If we consider an option on a forward contract expiring at time T' > T, the payoff doesn't occur until T' . Thus the discount factor is replaced by since one must take into account the time value of money. The difference in the two cases is clear from the derivation below.
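Before turning to that derivation, the Black formula can be sketched in a few lines of Python; the function name is illustrative, and the constant risk-free rate r discounts over the option maturity T, as in the text above.

from math import exp, log, sqrt
from statistics import NormalDist

def black76(F, K, T, r, sigma, call=True):
    """Black (1976) price of a European option on a futures price F,
    discounted at the constant risk-free rate r over the option maturity T."""
    N = NormalDist().cdf
    d1 = (log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    df = exp(-r * T)              # discount over the option maturity
    if call:
        return df * (F * N(d1) - K * N(d2))
    return df * (K * N(-d2) - F * N(-d1))

For an option on a forward contract with delivery at T' > T, the discount factor exp(-r*T) would be replaced by exp(-r*T'), as discussed above.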
  • 172. Black model 169 Derivation and assumptions The Black formula is easily derived from use of Margrabe's formula, which in turn is a simple, but clever, application of the Black–Scholes formula. The payoff of the call option on the futures contract is max (0, F(T) - K). We can consider this an exchange (Margrabe) option by considering the first asset to be and the second asset to be the riskless bond paying off $1 at time T. Then the call option is exercised at time T when the first asset is worth more than K riskless bonds. The assumptions of Margrabe's formula are satisfied with these assets. The only remaining thing to check is that the first asset is indeed an asset. This can be seen by considering a portfolio formed at time 0 by going long a forward contract with delivery date T and short F(0) riskless bonds (note that under the deterministic interest rate, the forward and futures prices are equal so there is no ambiguity here). Then at any time t you can unwind your obligation for the forward contract by shorting another forward with the same delivery date to get the difference in forward prices, but discounted to present value: . Liquidating the F(0) riskless bonds, each of which is worth , results in a net payoff of . External links Discussion • Bond Options, Caps and the Black Model [1] Dr. Milica Cudina, University of Texas at Austin Online tools • Caplet And Floorlet Calculator [2] Dr. Shing Hing Man, Thomson-Reuters' Risk Management • 'Greeks' Calculator using the Black model [3], Razvan Pascalau, Univ. of Alabama References • Black, Fischer (1976). The pricing of commodity contracts, Journal of Financial Economics, 3, 167-179. • Garman, Mark B. and Steven W. Kohlhagen (1983). Foreign currency option values, Journal of International Money and Finance, 2, 231-237. • Miltersen, K., Sandmann, K. et Sondermann, D., (1997): "Closed Form Solutions for Term Structure Derivates with Log-Normal Interest Rates", Journal of Finance, 52(1), 409-430. References [1] https:/ / www. ma. utexas. edu/ users/ mcudina/ Lecture24_3. pdf [2] http:/ / lombok. demon. co. uk/ financialTap/ options/ bond/ shortterm [3] http:/ / www. cba. ua. edu/ ~rpascala/ greeks2/ GFOPMForm. php
  • 173. BlackDermanToy model 170 Black–Derman–Toy model In finance, the Black–Derman–Toy model (BDT) is a popular short rate model used in the pricing of bond options, swaptions and other interest rate derivatives. It is a one-factor model; that is, a single stochastic factor – the short rate – determines the future evolution of all interest rates. It was the first model to combine the mean-reverting behaviour of the short rate with the lognormal distribution, [1] and is still widely used. [2][3] The model was introduced by Fischer Black, Emanuel Derman, and Bill Toy. It was first developed for in-house use by Goldman Sachs in the 1980s and was published in the Financial Analysts Journal in 1990. A personal account of the development of the model is provided in one of the chapters in Emanuel Derman's memoir "My Life as a Quant."[4] Under BDT, using a binomial lattice, one calibrates the model parameters to fit both the current term structure of interest rates (yield curve), and the volatility structure for interest rate caps (usually as implied by the Black-76-prices for each component caplet). Using the calibrated lattice one can then value a variety of more complex interest-rate sensitive securities and interest rate derivatives. Calibration here means that: (1) we assume the probability of an up move = 50%; (2) for each input spot rate, we: (a) iteratively adjust the rate at the top-most node at the current time-step, i; (b) find all other nodes in the time-step, where these are linked to the node immediately above via 0.5 ln (ru/rd) = σi sqrt(Δt); (c) discount recursively through the tree, from to the time-step in question to the first node in the tree; (d) repeat this until the calculated spot-rate (i.e. the discount factor at the first node in the tree) equals the assumed spot-rate; (3) Once solved, we retain these known short rates, and proceed to the next time-step (i.e. input spot-rate), "growing" the tree until it incorporates the full input yield-curve. Although initially developed for a lattice-based environment, the model has been shown to imply the following continuous stochastic differential equation:[5][1] where, = the instantaneous short rate at time t = value of the underlying asset at option expiry = instant short rate volatility = a standard Brownian motion under a Risk-neutral probability measure; its differential. For constant (time independent) short rate volatility, , the model is: One reason that the model remains popular, is that the "standard" Root-finding algorithms – such as Newton's method (the secant method) or bisection – are very easily applied to the calibration.[6] Relatedly, the model was originally described in algorithmic language, and not using stochastic calculus or martingales. [7]
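The calibration recipe above can be written out compactly. The sketch below uses forward induction with Arrow–Debreu state prices instead of repeatedly discounting backward through the tree (an equivalent way of matching each input spot rate), and it assumes continuously compounded zero yields, 50/50 branching and a given per-step short-rate volatility; these are illustrative implementation choices rather than part of the original specification.

from math import exp, sqrt

def bdt_tree(spot_yields, vols, dt):
    """Calibrate a Black-Derman-Toy short-rate lattice (illustrative sketch).

    spot_yields[i] is the continuously compounded zero yield for maturity
    (i + 1) * dt, vols[i] is the short-rate volatility used at step i, and
    adjacent nodes are linked by r_up = r_down * exp(2 * vol * sqrt(dt)).
    """
    n = len(spot_yields)
    tree = []                     # tree[i][j] = short rate at step i, node j
    Q = [1.0]                     # Arrow-Debreu prices of the current step's nodes
    for i in range(n):
        target = exp(-spot_yields[i] * (i + 1) * dt)   # zero-coupon price to match
        spread = exp(2.0 * vols[i] * sqrt(dt))

        def price(r0):
            # value of the zero maturing one step later, given the lowest-node rate
            return sum(q * exp(-r0 * spread ** j * dt) for j, q in enumerate(Q))

        lo, hi = 1e-10, 1.0       # bisection bracket for the lowest short rate
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if price(mid) > target:
                lo = mid          # rate too low, so the model price is too high
            else:
                hi = mid
        r0 = 0.5 * (lo + hi)
        rates = [r0 * spread ** j for j in range(i + 1)]
        tree.append(rates)

        # forward induction: roll the Arrow-Debreu prices to the next step
        newQ = [0.0] * (i + 2)
        for j, q in enumerate(Q):
            df = exp(-rates[j] * dt)
            newQ[j] += 0.5 * q * df       # down branch
            newQ[j + 1] += 0.5 * q * df   # up branch
        Q = newQ
    return tree

Given the calibrated short-rate tree, interest-rate-sensitive securities are then valued by ordinary backward induction through the lattice.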
  • 174. BlackDermanToy model 171 References [1] http:/ / janroman. dhis. org/ finance/ Interest%20Rates/ 3%20interest%20rates%20models. pdf [2] http:/ / books. google. com/ books?id=GnR3g9lvwfkC& pg=PP1& dq=Fixed+ income+ analysis+ By+ Frank+ J. + Fabozzi,+ Mark+ Jonathan+ Paul+ Anson& ei=tpTVS7LjKILYNoPk7I8I& cd=1#v=snippet& q=Black-Derman-Toy& f=false [3] http:/ / www. soa. org/ library/ professional-actuarial-specialty-guides/ professional-actuarial-specialty-guides/ 2003/ september/ spg0308alm. pdf [4] http:/ / www. ederman. com/ new/ index. html [5] http:/ / help. derivativepricing. com/ 2327. htm [6] http:/ / www. cfapubs. org/ toc/ rf/ 2001/ 2001/ 4 [7] http:/ / www. ederman. com/ new/ docs/ fen-interview. html • Benninga, S.; Wiener, Z. (1998). "Binomial Term Structure Models" (http://pluto.mscc.huji.ac.il/~mswiener/ research/Benninga73.pdf). Mathematica in Education and Research: vol.7 No. 3. • Black, F.; Derman, E. and Toy, W. (January–February 1990). "A One-Factor Model of Interest Rates and Its Application to Treasury Bond Options" (http://savage.wharton.upenn.edu/FNCE-934/syllabus/papers/ Black_Derman_Toy_FAJ_90.pdf). Financial Analysts Journal: 24–32. • Boyle, P.; Tan, K. and Tian, W. (2001). "Calibrating the Black–Derman–Toy model: some theoretical results" (http://belkcollegeofbusiness.uncc.edu/wtian1/bdt.pdf). Applied Mathematical Finance: 8, 27–48. • Hull, J. (2008). "The Black, Derman, and Toy Model" (http://www.rotman.utoronto.ca/~hull/TechnicalNotes/ TechnicalNote23.pdf). Technical Note No. 23, Options, Futures, and Other Derivatives. • Klose, C.; Li C. Y. (2003). "Implementation of the Black, Derman and Toy Model" (http://www.lcy.net/files/ BDT_Seminar_Paper.pdf). Seminar Financial Engineering, University of Vienna. External links • Online: Black-Derman-Toy short rate tree generator (http://lombok.demon.co.uk/financialTap/interestrates/ bdtshortrates) Dr. Shing Hing Man, Thomson-Reuters' Risk Management • Online: Pricing A Bond Using the BDT Model (http://lombok.demon.co.uk/financialTap/interestrates/ bdtbond) Dr. Shing Hing Man, Thomson-Reuters' Risk Management
  • 175. CoxIngersollRoss model 172 Cox–Ingersoll–Ross model In mathematical finance, the Cox–Ingersoll–Ross model (or CIR model) describes the evolution of interest rates. It is a type of "one factor model" (short rate model) as it describes interest rate movements as driven by only one source of market risk. The model can be used in the valuation of interest rate derivatives. It was introduced in 1985 by John C. Cox, Jonathan E. Ingersoll and Stephen A. Ross as an extension of the Vasicek model. The model Three trajectories of CIR Processes The CIR model specifies that the instantaneous interest rate follows the stochastic differential equation, also named the CIR process: where Wt is a Wiener process modelling the random market risk factor. The drift factor, a(b − rt), is exactly the same as in the Vasicek model. It ensures mean reversion of the interest rate towards the long run value b, with speed of adjustment governed by the strictly positive parameter a. The standard deviation factor, , avoids the possibility of negative interest rates for all positive values of a and b. An interest rate of zero is also precluded if the condition is met. More generally, when the rate is at a low level (close to zero), the standard deviation also becomes close to zero, which dampens the effect of the random shock on the rate. Consequently, when the rate gets close to zero, its evolution becomes dominated by the drift factor, which pushes the rate upwards (towards equilibrium). The same process is used in the Heston model to model stochastic volatility. Future distribution The distribution of future values of a CIR process can be computed in closed form: , where , and Y is a non-central Chi-Squared distribution with degrees of freedom and non-centrality parameter . Bond pricing Under the no-arbitrage assumption, a bond may be priced using this interest rate process. The bond price is exponential affine in the interest rate: Extensions Time varying functions replacing coefficients can be introduced in the model in order to make it consistent with a pre-assigned term structure of interest rates and possibly volatilities. The most general approach is in Maghsoodi (1996). A more tractable approach is in Brigo and Mercurio (2001b) where an external time-dependent shift is added to the model for consistency with an input term structure of rates. A significant extension of the CIR model to the
  • 176. CoxIngersollRoss model 173 case of stochastic mean and stochastic volatility is given by Lin Chen(1996) and is known as Chen model. A CIR process is a special case of a basic affine jump diffusion, which still permits a closed-form expression for bond prices. References • Hull, John C. (2003). Options, Futures and Other Derivatives. Upper Saddle River, NJ: Prentice Hall. ISBN 0-13-009056-5. • Cox, J.C., J.E. Ingersoll and S.A. Ross (1985). "A Theory of the Term Structure of Interest Rates". Econometrica 53: 385–407. doi:10.2307/1911242. • Maghsoodi, Y. (1996). "Solution of the extended CIR Term Structure and Bond Option Valuation". Mathematical Finance (6): 89–109. • Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models — Theory and Practice with Smile, Inflation and Credit (2nd ed. 2006 ed.). Springer Verlag. ISBN 978-3-540-22149-4. • Brigo, Damiano and Fabio Mercurio (2001b). "A deterministic-shift extension of analytically tractable and time-homogeneous short rate models". Finance & Stochastics 5 (3): 369–388. Monte Carlo method Monte Carlo methods (or Monte Carlo experiments) are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo methods are often used in computer simulations of physical and mathematical systems. These methods are most suited to calculation by a computer and tend to be used when it is infeasible to compute an exact result with a deterministic algorithm.[1] This method is also used to complement theoretical derivations. Monte Carlo methods are especially useful for simulating systems with many coupled degrees of freedom, such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model). They are used to model phenomena with significant uncertainty in inputs, such as the calculation of risk in business. They are widely used in mathematics, for example to evaluate multidimensional definite integrals with complicated boundary conditions. When Monte Carlo simulations have been applied in space exploration and oil exploration, their predictions of failures, cost overruns and schedule overruns are routinely better than human intuition or alternative "soft" methods.[2] The Monte Carlo method was coined in the 1940s by John von Neumann, Stanislaw Ulam and Nicholas Metropolis, while they were working on nuclear weapon projects (Manhattan Project) in the Los Alamos National Laboratory. It was named after the Monte Carlo Casino, a famous casino where Ulam's uncle often gambled away his money.[3]
Introduction
Monte Carlo methods vary, but tend to follow a particular pattern:
1. Define a domain of possible inputs.
2. Generate inputs randomly from a probability distribution over the domain.
3. Perform a deterministic computation on the inputs.
4. Aggregate the results.
For example, consider a circle inscribed in a unit square. Given that the circle and the square have a ratio of areas that is π/4, the value of π can be approximated using a Monte Carlo method:[4]
1. Draw a square on the ground, then inscribe a circle within it.
2. Uniformly scatter some objects of uniform size (grains of rice or sand) over the square.
3. Count the number of objects inside the circle and the total number of objects.
4. The ratio of the two counts is an estimate of the ratio of the two areas, which is π/4. Multiply the result by 4 to estimate π.
[Figure: Monte Carlo method applied to approximating the value of π. After placing 30,000 random points, the estimate for π is within 0.07% of the actual value; this happens with an approximate probability of 20%.]
In this procedure the domain of inputs is the square that circumscribes our circle. We generate random inputs by scattering grains over the square, then perform a computation on each input (test whether it falls within the circle). Finally, we aggregate the results to obtain our final result, the approximation of π.
If grains are purposefully dropped into only the center of the circle, they are not uniformly distributed, so our approximation is poor. Second, there should be a large number of inputs. The approximation is generally poor if only a few grains are randomly dropped into the whole square. On average, the approximation improves as more grains are dropped.
History
Before the Monte Carlo method was developed, simulations tested a previously understood deterministic problem and statistical sampling was used to estimate uncertainties in the simulations. Monte Carlo simulations invert this approach, solving deterministic problems using a probabilistic analog (see Simulated annealing). An early variant of the Monte Carlo method can be seen in the Buffon's needle experiment, in which π can be estimated by dropping needles on a floor made of parallel strips of wood. In the 1930s, Enrico Fermi first experimented with the Monte Carlo method while studying neutron diffusion, but did not publish anything on it.[3]
In 1946, physicists at Los Alamos Scientific Laboratory were investigating radiation shielding and the distance that neutrons would likely travel through various materials. Despite having most of the necessary data, such as the average distance a neutron would travel in a substance before it collided with an atomic nucleus, and how much energy the neutron was likely to give off following a collision, the Los Alamos physicists were unable to solve the problem using conventional, deterministic mathematical methods. Stanisław Ulam had the idea of using random experiments. He recounts his inspiration as follows:
The first thoughts and attempts I made to practice [the Monte Carlo Method] were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully?
After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than "abstract thinking" might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible
  • 178. Monte Carlo method 175 to envisage with the beginning of the new era of fast computers, and I immediately thought of problems of neutron diffusion and other questions of mathematical physics, and more generally how to change processes described by certain differential equations into an equivalent form interpretable as a succession of random operations. Later [in 1946], I described the idea to John von Neumann, and we began to plan actual calculations. –Stanisław Ulam[5] Being secret, the work of von Neumann and Ulam required a code name. Von Neumann chose the name Monte Carlo. The name refers to the Monte Carlo Casino in Monaco where Ulam's uncle would borrow money to gamble.[1][6][7] Using lists of "truly" random random numbers was extremely slow, but von Neumann developed a way to calculate pseudorandom numbers, using the middle-square method. Though this method has been criticized as crude, von Neumann was aware of this: he justified it as being faster than any other method at his disposal, and also noted that when it went awry it did so obviously, unlike methods that could be subtly incorrect. Monte Carlo methods were central to the simulations required for the Manhattan Project, though severely limited by the computational tools at the time. In the 1950s they were used at Los Alamos for early work relating to the development of the hydrogen bomb, and became popularized in the fields of physics, physical chemistry, and operations research. The Rand Corporation and the U.S. Air Force were two of the major organizations responsible for funding and disseminating information on Monte Carlo methods during this time, and they began to find a wide application in many different fields. Uses of Monte Carlo methods require large amounts of random numbers, and it was their use that spurred the development of pseudorandom number generators, which were far quicker to use than the tables of random numbers that had been previously used for statistical sampling. Definitions There is no consensus on how Monte Carlo should be defined. For example, Ripley[8] defines most probabilistic modeling as stochastic simulation, with Monte Carlo being reserved for Monte Carlo integration and Monte Carlo statistical tests. Sawilowsky[9] distinguishes between a simulation, a Monte Carlo method, and a Monte Carlo simulation: a simulation is a fictitious representation of reality, a Monte Carlo method is a technique that can be used to solve a mathematical or statistical problem, and a Monte Carlo simulation uses repeated sampling to determine the properties of some phenomenon (or behavior). Examples: • Simulation: Drawing one pseudo-random uniform variable from the interval (0,1] can be used to simulate the tossing of a coin: If the value is less than or equal to 0.50 designate the outcome as heads, but if the value is greater than 0.50 designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation. • Monte Carlo method: The area of an irregular figure inscribed in a unit square can be determined by throwing darts at the square and computing the ratio of hits within the irregular figure to the total number of darts thrown. This is a Monte Carlo method of determining area, but not a simulation. • Monte Carlo simulation: Drawing a large number of pseudo-random uniform variables from the interval (0,1], and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation of the behavior of repeatedly tossing a coin. 
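These distinctions can be illustrated in a few lines of Python: the first function is a Monte Carlo method (it estimates an area, the quarter circle whose ratio to the unit square is π/4), while the second is a Monte Carlo simulation of repeatedly tossing a fair coin. The fixed seeds are only there to make the sketches reproducible.

import random

def estimate_pi(n=30000, rng=random.Random(42)):
    """Monte Carlo method: estimate pi from the fraction of uniform points
    in the unit square that fall inside the inscribed quarter circle."""
    inside = sum(1 for _ in range(n)
                 if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * inside / n

def coin_toss_simulation(n=10000, rng=random.Random(1)):
    """Monte Carlo simulation: long-run fraction of heads when a fair coin
    toss is simulated by drawing uniform variables from (0, 1]."""
    heads = sum(1 for _ in range(n) if rng.random() <= 0.5)
    return heads / n              # approaches 0.5 as n grows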
Kalos and Whitlock[4] point out that such distinctions are not always easy to maintain. For example, the emission of radiation from atoms is a natural stochastic process. It can be simulated directly, or its average behavior can be described by stochastic equations that can themselves be solved using Monte Carlo methods. "Indeed, the same computer code can be viewed simultaneously as a 'natural simulation' or as a solution of the equations by natural sampling."
  • 179. Monte Carlo method 176 Monte Carlo and random numbers Monte Carlo simulation methods do not always require truly random numbers to be useful — while for some applications, such as primality testing, unpredictability is vital.[10] Many of the most useful techniques use deterministic, pseudorandom sequences, making it easy to test and re-run simulations. The only quality usually necessary to make good simulations is for the pseudo-random sequence to appear "random enough" in a certain sense. What this means depends on the application, but typically they should pass a series of statistical tests. Testing that the numbers are uniformly distributed or follow another desired distribution when a large enough number of elements of the sequence are considered is one of the simplest, and most common ones. Sawilowsky lists the characteristics of a high quality Monte Carlo simulation:[9] • the (pseudo-random) number generator has certain characteristics (e.g., a long “period” before the sequence repeats) • the (pseudo-random) number generator produces values that pass tests for randomness • there are enough samples to ensure accurate results • the proper sampling technique is used • the algorithm used is valid for what is being modeled • it simulates the phenomenon in question. Pseudo-random number sampling algorithms are used to transform uniformly distributed pseudo-random numbers into numbers that are distributed according to a given probability distribution. Low-discrepancy sequences are often used instead of random sampling from a space as they ensure even coverage and normally have a faster order of convergence than Monte Carlo simulations using random or pseudorandom sequences. Methods based on their use are called quasi-Monte Carlo methods. Monte Carlo simulation versus "what if" scenarios There are ways of using probabilities that are definitely not Monte Carlo simulations—for example, deterministic modeling using single-point estimates. Each uncertain variable within a model is assigned a “best guess” estimate. Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded.[11] By contrast, Monte Carlo simulations sample probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring.[12] For example, a comparison of a spreadsheet cost construction model run using traditional “what if” scenarios, and then run again with Monte Carlo simulation and Triangular probability distributions shows that the Monte Carlo analysis has a narrower range than the “what if” analysis. This is because the “what if” analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance). Applications Monte Carlo methods are especially useful for simulating phenomena with significant uncertainty in inputs and systems with a large number of coupled degrees of freedom. Areas of application include: Physical sciences Monte Carlo methods are very important in computational physics, physical chemistry, and related applied fields, and have diverse applications from complicated quantum chromodynamics calculations to designing heat shields and aerodynamic forms. 
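A toy version of the comparison just described, assuming a hypothetical cost model with three uncertain line items, each given best, most likely and worst estimates and modelled with a triangular distribution; all figures are invented for illustration.

import random

def simulate_total_cost(n=10000, rng=random.Random(0)):
    """Monte Carlo simulation of a toy cost model with three line items,
    each described by (best, most likely, worst) triangular estimates."""
    items = [(8, 10, 15), (4, 5, 9), (18, 20, 30)]    # hypothetical estimates
    totals = []
    for _ in range(n):
        totals.append(sum(rng.triangular(lo, hi, mode)
                          for lo, mode, hi in items))
    totals.sort()
    return {"best guess": sum(mode for _, mode, _ in items),
            "p10": totals[int(0.10 * n)],
            "p50": totals[int(0.50 * n)],
            "p90": totals[int(0.90 * n)]}

The single-point "best guess" returns one number, while the simulated totals give the spread of possible outcomes, summarized here by the 10th, 50th and 90th percentiles.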
In statistical physics Monte Carlo molecular modeling is an alternative to computational molecular dynamics, and Monte Carlo methods are used to compute statistical field theories of simple particle and polymer systems.[13] Quantum Monte Carlo methods solve the many-body problem for quantum systems. In experimental particle physics, Monte Carlo methods are used for designing detectors, understanding their behavior
  • 180. Monte Carlo method 177 and comparing experimental data to theory. In astrophysics, they are used in such diverse manners as to model both the evolution of galaxies[14] and the transmission of microwave radiation through a rough planetary surface.[15] Monte Carlo methods are also used in the ensemble models that form the basis of modern weather forecasting. Engineering Monte Carlo methods are widely used in engineering for sensitivity analysis and quantitative probabilistic analysis in process design. The need arises from the interactive, co-linear and non-linear behavior of typical process simulations. For example, • in microelectronics engineering, Monte Carlo methods are applied to analyze correlated and uncorrelated variations in analog and digital integrated circuits. • in geostatistics and geometallurgy, Monte Carlo methods underpin the design of mineral processing flowsheets and contribute to quantitative risk analysis. • in wind energy yield analysis, the predicted energy output of a wind farm during its lifetime is calculated giving different levels of uncertainty (P90, P50, etc.) • impacts of pollution are simulated[16] and diesel compared with petrol.[17] • In autonomous robotics, Monte Carlo localization can determine the position of a robot. It is often applied to stochastic filters such as the Kalman filter or Particle filter that forms the heart of the SLAM (Simultaneous Localization and Mapping) algorithm. Computational biology Monte Carlo methods are used in computational biology, such for as Bayesian inference in phylogeny. Biological systems such as proteins[18] membranes,[19] images of cancer,[20] are being studied by means of computer simulations. The systems can be studied in the coarse-grained or ab initio frameworks depending on the desired accuracy. Computer simulations allow us to monitor the local environment of a particular molecule to see if some chemical reaction is happening for instance. We can also conduct thought experiments when the physical experiments are not feasible, for instance breaking bonds, introducing impurities at specific sites, changing the local/global structure, or introducing external fields. Computer Graphics Path Tracing, occasionally referred to as Monte Carlo Ray Tracing, renders a 3D scene by randomly tracing samples of possible light paths. Repeated sampling of any given pixel will eventually cause the average of the samples to converge on the correct solution of the rendering equation, making it one of the most physically accurate 3D graphics rendering methods in existence. Applied statistics In applied statistics, Monte Carlo methods are generally used for two purposes: 1. To compare competing statistics for small samples under realistic data conditions. Although Type I error and power properties of statistics can be calculated for data drawn from classical theoretical distributions (e.g., normal curve, Cauchy distribution) for asymptotic conditions (i. e, infinite sample size and infinitesimally small treatment effect), real data often do not have such distributions.[21] 2. To provide implementations of hypothesis tests that are more efficient than exact tests such as permutation tests (which are often impossible to compute) while being more accurate than critical values for asymptotic distributions.
  • 181. Monte Carlo method 178 Monte Carlo methods are also a compromise between approximate randomization and permutation tests. An approximate randomization test is based on a specified subset of all permutations (which entails potentially enormous housekeeping of which permutations have been considered). The Monte Carlo approach is based on a specified number of randomly drawn permutations (exchanging a minor loss in precision if a permutation is drawn twice – or more frequently—for the efficiency of not having to track which permutations have already been selected). Games Monte Carlo methods have recently been incorporated in algorithms for playing games that have outperformed previous algorithms in games such as Go, Tantrix, Battleship, and Havannah. These algorithms employ Monte Carlo tree search. Possible moves are organized in a tree and a large number of random simulations are used to estimate the long-term potential of each move. A black box simulator represents the opponent's moves. In November 2011, a Tantrix playing robot named FullMonte, which employs the Monte Carlo method, played and beat the previous world champion Tantrix robot (Goodbot) quite easily. In a 200 game match FullMonte won 58.5%, lost 36%, and drew 5.5% without ever running over the fifteen minute time limit. In games like Battleship, where there is only limited knowledge of the state of the system (i.e., the positions of the ships), a belief state is constructed consisting of probabilities for each state and then initial states are sampled for running simulations. The belief state is updated as the game proceeds, as in the figure. On a 10 x 10 grid, in which the total possible number of moves is 100, one algorithm sank all the ships Monte Carlo tree search applied to a game of 50 moves faster, on average, than random play.[22] Battleship. Initially the algorithm takes random shots, but as possible states are eliminated, the shots can be more selective. As a crude example, Design and visuals if a ship is hit (figure A), then adjacent squares Monte Carlo methods are also efficient in solving coupled integral become much higher priorities (figures B and C). differential equations of radiation fields and energy transport, and thus these methods have been used in global illumination computations that produce photo-realistic images of virtual 3D models, with applications in video games, architecture, design, computer generated films, and cinematic special effects.[23]
Finance and business
Monte Carlo methods in finance are often used to calculate the value of companies, to evaluate investments in projects at a business unit or corporate level, or to evaluate financial derivatives. They can be used to model project schedules, where simulations aggregate estimates for worst-case, best-case, and most likely durations for each task to determine outcomes for the overall project.
Telecommunications
When planning a wireless network, design must be proved to work for a wide variety of scenarios that depend mainly on the number of users, their locations and the services they want to use. Monte Carlo methods are typically used to generate these users and their states. The network performance is then evaluated and, if results are not satisfactory, the network design goes through an optimization process.
Use in mathematics
In general, Monte Carlo methods are used in mathematics to solve various problems by generating suitable random numbers and observing that fraction of the numbers that obeys some property or properties. The method is useful for obtaining numerical solutions to problems too complicated to solve analytically. The most common application of the Monte Carlo method is Monte Carlo integration.
Integration
Deterministic numerical integration algorithms work well in a small number of dimensions, but encounter two problems when the functions have many variables. First, the number of function evaluations needed increases rapidly with the number of dimensions. For example, if 10 evaluations provide adequate accuracy in one dimension, then 10^100 points are needed for 100 dimensions—far too many to be computed. This is called the curse of dimensionality. Second, the boundary of a multidimensional region may be very complicated, so it may not be feasible to reduce the problem to a series of nested one-dimensional integrals.[24] 100 dimensions is by no means unusual, since in many physical problems, a "dimension" is equivalent to a degree of freedom.
[Figure: Monte Carlo integration works by comparing random points with the value of the function; errors reduce by a factor of 1/√N.]
Monte Carlo methods provide a way out of this exponential increase in computation time. As long as the function in question is reasonably well-behaved, it can be estimated by randomly selecting points in 100-dimensional space, and taking some kind of average of the function values at these points. By the central limit theorem, this method displays 1/√N convergence—i.e., quadrupling the number of sampled points halves the error, regardless of the number of dimensions.[24]
A refinement of this method, known as importance sampling in statistics, involves sampling the points randomly, but more frequently where the integrand is large. To do this precisely one would have to already know the integral, but one can approximate the integral by an integral of a similar function or use adaptive routines such as stratified sampling, recursive stratified sampling, adaptive umbrella sampling[25][26] or the VEGAS algorithm.
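A minimal sketch of plain Monte Carlo integration over the unit hypercube, returning both the estimate and its standard error; as noted above, the error shrinks like 1/√N regardless of the dimension. The example function and dimension are arbitrary illustrations.

import random
from math import sqrt

def mc_integrate(f, dim, n=100000, rng=random.Random(0)):
    """Estimate the integral of f over the unit hypercube [0, 1]^dim,
    together with the standard error of the estimate."""
    total = total_sq = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]
        fx = f(x)
        total += fx
        total_sq += fx * fx
    mean = total / n
    var = total_sq / n - mean * mean      # sample variance of f over the cube
    return mean, sqrt(max(var, 0.0) / n)  # error decreases like 1/sqrt(n)

# Example: integrate sum(x_i^2) over the 10-dimensional unit cube
# (the exact value is 10/3).
estimate, std_err = mc_integrate(lambda x: sum(xi * xi for xi in x), dim=10)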
  • 183. Monte Carlo method 180 A similar approach, the quasi-Monte Carlo method, uses low-discrepancy sequences. These sequences "fill" the area better and sample the most important points more frequently, so quasi-Monte Carlo methods can often converge on the integral more quickly. Another class of methods for sampling points in a volume is to simulate random walks over it (Markov chain Monte Carlo). Such methods include the Metropolis-Hastings algorithm, Gibbs sampling and the Wang and Landau algorithm. Simulation - Optimization Another powerful and very popular application for random numbers in numerical simulation is in numerical optimization. The problem is to minimize (or maximize) functions of some vector that often has a large number of dimensions. Many problems can be phrased in this way: for example, a computer chess program could be seen as trying to find the set of, say, 10 moves that produces the best evaluation function at the end. In the traveling salesman problem the goal is to minimize distance traveled. There are also applications to engineering design, such as multidisciplinary design optimization. The traveling salesman problem is what is called a conventional optimization problem. That is, all the facts (distances between each destination point) needed to determine the optimal path to follow are known with certainty and the goal is to run through the possible travel choices to come up with the one with the lowest total distance. However, let's assume that instead of wanting to minimize the total distance traveled to visit each desired destination, we wanted to minimize the total time needed to reach each destination. This goes beyond conventional optimization since travel time is inherently uncertain (traffic jams, time of day, etc.). As a result, to determine our optimal path we would want to use simulation - optimization to first understand the range of potential times it could take to go from one point to another (represented by a probability distribution in this case rather than a specific distance) and then optimize our travel decisions to identify the best path to follow taking that uncertainty into account. Inverse problems Probabilistic formulation of inverse problems leads to the definition of a probability distribution in the model space. This probability distribution combines prior information with new information obtained by measuring some observable parameters (data). As, in the general case, the theory linking data with model parameters is nonlinear, the posterior probability in the model space may not be easy to describe (it may be multimodal, some moments may not be defined, etc.). When analyzing an inverse problem, obtaining a maximum likelihood model is usually not sufficient, as we normally also wish to have information on the resolution power of the data. In the general case we may have a large number of model parameters, and an inspection of the marginal probability densities of interest may be impractical, or even useless. But it is possible to pseudorandomly generate a large collection of models according to the posterior probability distribution and to analyze and display the models in such a way that information on the relative likelihoods of model properties is conveyed to the spectator. This can be accomplished by means of an efficient Monte Carlo method, even in cases where no explicit formula for the a priori distribution is available. 
The best-known importance sampling method, the Metropolis algorithm, can be generalized, and this gives a method that allows analysis of (possibly highly nonlinear) inverse problems with complex a priori information and data with an arbitrary noise distribution.[27][28]
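As a concrete illustration of this kind of posterior sampling, here is a minimal random-walk Metropolis sketch in Python (standard library only). The bimodal log_posterior, step size, chain length and burn-in are assumed toy choices that merely stand in for a real inverse problem; this is the basic accept/reject mechanism the text refers to, not the generalized algorithm of [27][28].

```python
import math
import random

def log_posterior(m):
    """Toy unnormalized log-posterior over one model parameter m.

    A two-mode mixture, standing in for the possibly multimodal posterior
    of an inverse problem; the guard avoids log(0) far out in the tails.
    """
    p = math.exp(-0.5 * ((m - 1.0) / 0.3) ** 2) \
        + 0.5 * math.exp(-0.5 * ((m + 2.0) / 0.5) ** 2)
    return math.log(max(p, 1e-300))

def metropolis(log_p, m0, n_steps, step=0.5, seed=1):
    """Random-walk Metropolis: propose m' = m + step * N(0, 1) and accept
    with probability min(1, p(m') / p(m)); otherwise keep the current m."""
    rng = random.Random(seed)
    m, lp = m0, log_p(m0)
    chain = []
    for _ in range(n_steps):
        m_new = m + step * rng.gauss(0.0, 1.0)
        lp_new = log_p(m_new)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            m, lp = m_new, lp_new
        chain.append(m)  # the current state is recorded either way
    return chain

if __name__ == "__main__":
    chain = metropolis(log_posterior, m0=0.0, n_steps=50_000)
    kept = chain[5_000:]  # discard an arbitrary burn-in
    print("posterior mean ~", sum(kept) / len(kept))
    print("P(m > 0)       ~", sum(1 for m in kept if m > 0) / len(kept))
```

The resulting collection of sampled models can then be summarized or displayed, as described above, to convey the relative likelihoods of different model properties.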
• 184. Monte Carlo method 181
Computational mathematics
Monte Carlo methods are useful in many areas of computational mathematics, where a "lucky choice" can find the correct result. A classic example is Rabin's algorithm for primality testing: for any n that is not prime, a random x has at least a 75% chance of proving that n is not prime. Hence, if n is not prime but x says that it might be, we have observed at most a 1-in-4 event. If 10 different random x say that "n is probably prime" when it is not, we have observed a one-in-a-million event. In general, a Monte Carlo algorithm of this kind produces one answer with a guarantee (n is composite, and x proves it so) and another answer without a guarantee, but with a bound on how often that answer is wrong—in this case at most 25% of the time. See also the Las Vegas algorithm for a related, but different, idea.
Notes
[1] Hubbard 2007 [2] Hubbard 2009 [3] Metropolis 1987 [4] Kalos & Whitlock 2008 [5] Eckhardt 1987 [6] Grinstead & Snell 1997 [7] Anderson 1986 [8] Ripley 1987 [9] Sawilowsky 2003 [10] Davenport 1992 [11] Vose 2000, p. 13 [12] Vose 2000, p. 16 [13] Baeurle 2009 [14] MacGillivray & Dodd 1982 [15] Golden 1979 [16] Int Panis et al. 2001 [17] Int Panis et al. 2002 [18] Ojeda et al. 2009 [19] Milik & Skolnick 1993 [20] Forastero et al. 2010 [21] Sawilowsky & Fahoome 2003 [22] Silver & Veness 2010 [23] Szirmay-Kalos 2008 [24] Press et al. 1996
[25] Mezei, M. (31 December 1986). "Adaptive umbrella sampling: Self-consistent determination of the non-Boltzmann bias". Journal of Computational Physics 68 (1): 237–248. Bibcode 1987JCoPh..68..237M. doi:10.1016/0021-9991(87)90054-4.
[26] Bartels, Christian; Karplus, Martin (31 December 1997). "Probability Distributions for Complex Systems: Adaptive Umbrella Sampling of the Potential Energy". The Journal of Physical Chemistry B 102 (5): 865–880. doi:10.1021/jp972280j.
[27] Mosegaard & Tarantola 1995 [28] Tarantola 2005, http://www.ipgp.fr/~tarantola/Files/Professional/Books/index.html
References
• Anderson, H. L. (1986). "Metropolis, Monte Carlo and the MANIAC" (http://library.lanl.gov/cgi-bin/getfile?00326886.pdf). Los Alamos Science 14: 96–108.
• Baeurle, Stephan A. (2009). "Multiscale modeling of polymer materials using field-theoretic methodologies: A survey about recent developments". Journal of Mathematical Chemistry 46 (2): 363–426. doi:10.1007/s10910-008-9467-3.
• Berg, Bernd A. (2004). Markov Chain Monte Carlo Simulations and Their Statistical Analysis (With Web-Based Fortran Code). Hackensack, NJ: World Scientific. ISBN 981-238-935-0.
• Binder, Kurt (1995). The Monte Carlo Method in Condensed Matter Physics. New York: Springer. ISBN 0-387-54369-4.
• 185. Monte Carlo method 182
• Caflisch, R. E. (1998). Monte Carlo and quasi-Monte Carlo methods. Acta Numerica. 7. Cambridge University Press. pp. 1–49.
• Davenport, J. H. "Primality testing revisited". Proceedings ISSAC '92: Papers from the international symposium on Symbolic and algebraic computation: 123–129. doi:10.1145/143242.143290. ISBN 0-89791-489-9.
• Doucet, Arnaud; Freitas, Nando de; Gordon, Neil (2001). Sequential Monte Carlo methods in practice. New York: Springer. ISBN 0-387-95146-6.
• Eckhardt, Roger (1987). "Stan Ulam, John von Neumann, and the Monte Carlo method" (http://www.lanl.gov/history/admin/files/Stan_Ulam_John_von_Neumann_and_the_Monte_Carlo_Method.pdf). Los Alamos Science, Special Issue (15): 131–137.
• Fishman, G. S. (1995). Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer. ISBN 0-387-94527-X.
• Forastero, C.; Zamora, L.; Guirado, D.; Lallena, A. (2010). "A Monte Carlo tool to simulate breast cancer screening programmes". Phys. Med. Biol. 55 (17): 5213. Bibcode 2010PMB....55.5213F. doi:10.1088/0031-9155/55/17/021.
• Golden, Leslie M. (1979). "The Effect of Surface Roughness on the Transmission of Microwave Radiation Through a Planetary Surface". Icarus 38 (3): 451. Bibcode 1979Icar...38..451G. doi:10.1016/0019-1035(79)90199-4.
• Gould, Harvey; Tobochnik, Jan (1988). An Introduction to Computer Simulation Methods, Part 2, Applications to Physical Systems. Reading: Addison-Wesley. ISBN 0-201-16504-X.
• Grinstead, Charles; Snell, J. Laurie (1997). Introduction to Probability. American Mathematical Society. pp. 10–11.
• Hammersley, J. M.; Handscomb, D. C. (1975). Monte Carlo Methods. London: Methuen. ISBN 0-416-52340-4.
• Hartmann, A. K. (2009). Practical Guide to Computer Simulations (http://www.worldscibooks.com/physics/6988.html). World Scientific. ISBN 978-981-283-415-7.
• Hubbard, Douglas (2007). How to Measure Anything: Finding the Value of Intangibles in Business. John Wiley & Sons. p. 46.
• Hubbard, Douglas (2009). The Failure of Risk Management: Why It's Broken and How to Fix It. John Wiley & Sons.
• Kahneman, D.; Tversky, A. (1982). Judgement under Uncertainty: Heuristics and Biases. Cambridge University Press.
• Kalos, Malvin H.; Whitlock, Paula A. (2008). Monte Carlo Methods. Wiley-VCH. ISBN 978-3-527-40760-6.
• Kroese, D. P.; Taimre, T.; Botev, Z. I. (2011). Handbook of Monte Carlo Methods (http://www.montecarlohandbook.org). New York: John Wiley & Sons. p. 772. ISBN 0-470-17793-4.
• MacGillivray, H. T.; Dodd, R. J. (1982). "Monte-Carlo simulations of galaxy systems" (http://www.springerlink.com/content/rp3g1q05j176r108/fulltext.pdf). Astrophysics and Space Science (Springer Netherlands) 86 (2).
• MacKeown, P. Kevin (1997). Stochastic Simulation in Physics. New York: Springer. ISBN 981-3083-26-3.
• Metropolis, N. (1987). "The beginning of the Monte Carlo method" (http://library.lanl.gov/la-pubs/00326866.pdf). Los Alamos Science (1987 Special Issue dedicated to Stanisław Ulam): 125–130.
• Metropolis, Nicholas; Rosenbluth, Arianna W.; Rosenbluth, Marshall N.; Teller, Augusta H.; Teller, Edward (1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics 21 (6): 1087. Bibcode 1953JChPh..21.1087M. doi:10.1063/1.1699114.
• Metropolis, N.; Ulam, S. (1949). "The Monte Carlo Method". Journal of the American Statistical Association (American Statistical Association) 44 (247): 335–341. doi:10.2307/2280232. JSTOR 2280232. PMID 18139350.
• Milik, M.; Skolnick, J. (Jan 1993). "Insertion of peptide chains into lipid membranes: an off-lattice Monte Carlo dynamics model". Proteins 15 (1): 10–25. doi:10.1002/prot.340150104. PMID 8451235.
• 186. Monte Carlo method 183
• Mosegaard, Klaus; Tarantola, Albert (1995). "Monte Carlo sampling of solutions to inverse problems". J. Geophys. Res. 100 (B7): 12431–12447. Bibcode 1995JGR...10012431M. doi:10.1029/94JB03097.
• Ojeda, P.; Garcia, M.; Londono, A.; Chen, N. Y. (Feb 2009). "Monte Carlo Simulations of Proteins in Cages: Influence of Confinement on the Stability of Intermediate States". Biophys. J. (Biophysical Society) 96 (3): 1076–1082. Bibcode 2009BpJ....96.1076O. doi:10.1529/biophysj.107.125369.
• Int Panis, L.; De Nocker, L.; De Vlieger, I.; Torfs, R. (2001). "Trends and uncertainty in air pollution impacts and external costs of Belgian passenger car traffic". International Journal of Vehicle Design 27 (1–4): 183–194. doi:10.1504/IJVD.2001.001963.
• Int Panis, L.; Rabl, A.; De Nocker, L.; Torfs, R. (2002). P. Sturm, ed. "Diesel or Petrol? An environmental comparison hampered by uncertainty". Mitteilungen Institut für Verbrennungskraftmaschinen und Thermodynamik (Technische Universität Graz, Austria) Heft 81 Vol 1: 48–54.
• Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (1996) [1986]. Numerical Recipes in Fortran 77: The Art of Scientific Computing. Fortran Numerical Recipes. 1 (Second ed.). Cambridge University Press. ISBN 0-521-43064-X.
• Ripley, B. D. (1987). Stochastic Simulation. Wiley & Sons.
• Robert, C. P.; Casella, G. (2004). Monte Carlo Statistical Methods (2nd ed.). New York: Springer. ISBN 0-387-21239-6.
• Rubinstein, R. Y.; Kroese, D. P. (2007). Simulation and the Monte Carlo Method (2nd ed.). New York: John Wiley & Sons. ISBN 978-0-470-17793-8.
• Savvides, Savvakis C. (1994). "Risk Analysis in Investment Appraisal". Project Appraisal Journal 9 (1). doi:10.2139/ssrn.265905.
• Sawilowsky, Shlomo S.; Fahoome, Gail C. (2003). Statistics via Monte Carlo Simulation with Fortran. Rochester Hills, MI: JMASM. ISBN 0-9740236-0-4.
• Sawilowsky, Shlomo S. (2003). "You think you've got trivials?" (http://education.wayne.edu/jmasm/sawilowsky_effect_size_debate.pdf). Journal of Modern Applied Statistical Methods 2 (1): 218–225.
• Silver, David; Veness, Joel (2010). "Monte-Carlo Planning in Large POMDPs" (http://books.nips.cc/papers/files/nips23/NIPS2010_0740.pdf). In Lafferty, J.; Williams, C. K. I.; Shawe-Taylor, J. et al. Advances in Neural Information Processing Systems 23. Neural Information Processing Systems Foundation.
• Szirmay-Kalos, László (2008). Monte Carlo Methods in Global Illumination – Photo-realistic Rendering with Randomization. VDM Verlag Dr. Mueller e.K. ISBN 978-3-8364-7919-6.
• Tarantola, Albert (2005). Inverse Problem Theory (http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/SIAM/index.html). Philadelphia: Society for Industrial and Applied Mathematics. ISBN 0-89871-572-5.
• Vose, David (2008). Risk Analysis, A Quantitative Guide (Third ed.). John Wiley & Sons.
External links
• Overview and reference list (http://mathworld.wolfram.com/MonteCarloMethod.html), MathWorld
• "Café math : Monte Carlo Integration" (http://cafemath.kegtux.org/mathblog/article.php?page=MonteCarlo.php): a blog article describing Monte Carlo integration (principle, hypothesis, confidence interval)
• Feynman-Kac models and particle Monte Carlo algorithms (http://www.math.u-bordeaux1.fr/~delmoral/simulinks.html), a website on the applications of particle Monte Carlo methods in signal processing, rare event simulation, molecular dynamics, financial mathematics, optimal control, computational physics, and biology
• Introduction to Monte Carlo Methods (http://www.phy.ornl.gov/csep/CSEP/MC/MC.html), Computational Science Education Project
• The Basics of Monte Carlo Simulations (http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html), University of Nebraska-Lincoln
• 187. Monte Carlo method 184
• Introduction to Monte Carlo simulation (http://office.microsoft.com/en-us/excel-help/introduction-to-monte-carlo-simulation-HA010282777.aspx) (for Microsoft Excel), Wayne L. Winston
• Monte Carlo Simulation for MATLAB and Simulink (http://www.mathworks.com/discovery/monte-carlo-simulation.html)
• Monte Carlo Methods – Overview and Concept (http://www.brighton-webs.co.uk/montecarlo/concept.htm), brighton-webs.co.uk
• Molecular Monte Carlo Intro (http://www.cooper.edu/engineering/chemechem/monte.html), Cooper Union
• Monte Carlo techniques applied in physics (http://personal-pages.ps.ic.ac.uk/~achremos/Applet1-page.htm)
• Monte Carlo Method Example (http://waqqasfarooq.com/waqqasfarooq/index.php?option=com_content&view=article&id=47:monte-carlo&catid=34:statistics&Itemid=53), a step-by-step guide to creating a Monte Carlo Excel spreadsheet
• Approximate And Double Check Probability Problems Using Monte Carlo method (http://orcik.net/programming/approximate-and-double-check-probability-problems-using-monte-carlo-method/) at Orcik Dot Net
  • 188. Article Sources and Contributors 185 Article Sources and Contributors Time series  Source: http://en.wikipedia.org/w/index.php?oldid=527126170  Contributors: 1ForTheMoney, Abeliavsky, Adrianafraj, Aegis Maelstrom, Albmont, Aleksd, Andreas4965, Andycjp, Apdevries, Arthena, Babbage, Boxplot, Btyner, Burhem, Calair, Charles Matthews, Charmi99, Cherkash, Chire, Chris the speller, ChrisGualtieri, Coginsys, CommodiCast, Cottrellnc, Cpdo, Cwdegier, DARTH SIDIOUS 2, Dankonikolic, Dcljr, Dekart, Discospinster, Dkondras, Dr ahmed1010, Drrho, Edison, ElKevbo, Eliz81, Esoterum, FBmotion, Funandtrvl, G716, Gandalf61, Gap, Gary King, Giftlite, Hellopeopleofdetroit, Helwr, Hoo man, Instinct, Jimmaths, Joel7687, John Cumbers, Jugander, Keithljelp, Kiefer.Wolfowitz, Kku, Kuru, Kv75, Lambiam, Ldecola, Luyima, Mathaddins, Melcombe, Merlion444, Michael Hardy, Mihal Orela, Mm100100, Modargo, Mwtoews, Nbarth, Nialsh, Nono64, Nutcracker, Oli Filth, PAR, Pak21, Piotrus, Pucicu, QualitycontrolUS, Qwfp, Rbonvall, Requestion, Rgclegg, Rich Farmbrough, Rinconsoleao, SShearman, Sandman888, SchreiberBike, Scientio, Scwarebang, Sick Rantorum, Spangineer, Statoman71, Susko, Taxman, The enemies of god, Thulka, Tobacman, Topbanana, Truswalu, Twilight Nightmare, Unyoyega, VictorAnyakin, Visu dreamz, Wavelength, Wile E. Heresiarch, Wyllium, Zheric, Zipircik, Zvika, 258 anonymous edits Forecasting  Source: http://en.wikipedia.org/w/index.php?oldid=519725225  Contributors: 05proedl, 2001:620:8:3F42:8000:0:0:267, 2405:B000:600:262:0:0:36:7D, 2604:2000:FFC0:E0:4997:D9E0:5C06:C26, 2620:0:1040:407:BD4E:66C3:7379:96B8, Abtinb, Agraefe, Andyjsmith, Apdevries, Arpabr, Bateni, Beinhauer, Bernburgerin, Bjmedeiros, Blockright, Bongwarrior, BrutForce, Cheapskate08, Chyen, Cinsing3, Ckatz, CommodiCast, Constancek, DMacks, Dancter, Dassiebtekreuz, Dbachmann, Drbreznjev, Econ2010, Ehrenberg-Bass, Elonka, Enygma.911, Fcaster, Federalist51, Fuzfadal, Giftlite, Gorgalore, Hubbardaie, IPWAI, Icseaturtles, Igoldste, Jan7436790, Jaygary, Joel B. 
Lewis, Johnchasenz, Jonathanmoyer, Jpo, Katonal, Kesten, Kimleonard, Kneale, Kris103045, Kuru, Kxjtaz, Lammidhania, LandalaEng, Lotje, Luk, M gerzon, Mack2, Maqayum, Markchockal, Martinbueno, Mato, Melcombe, Michael Hardy, Moonriddengirl, MrOllie, Mrsaad31, NeilN, Neo-Jay, Pdcook, Phanerozoic, Philip Trueman, Pilgaard, R'n'B, Ricky@36, Rigadoun, Rjhyndman, Rjnicholas, Rohrbeck, Salamurai, Saxifrage, Shadowjams, ShelfSkewed, Skizzik, SlackerMom, Spiderwriter, Spilla, Starflixx, SueHay, SupplyChainAnalyticsGuru, Tbhotch, The Anome, The Transhumanist, Thegreatdr, Tony Myers, TravellingThru, Truswalu, Tuduser, Usability Tester 6, Vddku, Vermorel, WikiSlasher, Wimpie2, Yamara, Zagothal, 143 anonymous edits Stationary process  Source: http://en.wikipedia.org/w/index.php?oldid=526151236  Contributors: Abdull, Ahmadyan, Andres r5, BenFrantzDale, Billkamp, Charmi99, Citation inspector, Dakshayani, Dicklyon, Docu, Dodothegoof, Duoduoduo, Dysprosia, Edward, Gareth Jones, Giftlite, J heisenberg, Jamelan, Jmath666, Linas, Luvrboy1, Mani1, Melcombe, Michael Hardy, Mwilde, Nbarth, Protonk, Quantling, Radagast83, Rgclegg, Rumping, Sekarnet, Shivakaul, Sterrys, Tekhnofiend, Tesi1700, Tsirel, Weregerbil, WikHead, WikiPuppies, Zvika, 42 anonymous edits Stochastic process  Source: http://en.wikipedia.org/w/index.php?oldid=526021993  Contributors: 123forman, A. Pichler, AManWithNoPlan, Abu badali, Afasmit, Alanb, Alinja, Andre Engels, Andycjp, Arakunem, Arcfrk, Bdmy, BitchX, Boostat, Brian0918, Brickc1, Brunopimentel, Bryan Derksen, CSTAR, Chaos, Charles Matthews, Compsonheir, Cpiral, Crowsnest, Cuinuc, Damian Yerrick, Danog, Deflective, Dmbandr, Dr. Universe, Dysprosia, Edwinstearns, Elf, EncMstr, Examtester, Fangz, Feinstein, Fresheneesz, Gaius Cornelius, Gene.arboit, Giftlite, Goodralph, Hairer, Hari, Hyperbola, Inald, J heisenberg, J04n, J8079s, JWSchmidt, Javierluraschi, Jaxelrod, Jeff3000, Jjalexand, Jonathan de Boyne Pollard, Kaba3, Kbdank71, Keenforever, Kilmer-san, Kiril Simeonovski, Kku, Kwamikagami, Kwertii, LOL, Lambiam, LiDaobing, Ligulem, Linas, LoveMonkey, Lucianosilvajp, Mack2, MarkSweep, Markhebner, MathMartin, MattieTK, Mdd, Melcombe, Mets501, Michael Hardy, Miguel, Mokgen, Msh210, Nakon, Ncmathsadist, Nick Number, Oleg Alexandrov, Orangelightning, Pete142, Peterius, Phil Boswell, Phys, PierreYvesLouis, Populus, Qwfp, Razvanux, Reywas92, Rjwilmsi, Romanm, Rossami, Rxnt, Salgueiro, Schmock, SgtThroat, Sligocki, Sodin, Spalding, Spmeyn, Star General, Stephen Bain, Sullivan.t.j, TedPavlic, That Guy, From That Show!, TheObtuseAngleOfDoom, Thuycis, Tim Starling, Toby, Toby Bartels, Tomi, Tonymarsh, Tpb, Tsirel, Uli.loewe, Unyoyega, William Avery, Zalle, Zenomax, Zmoboros, 143 anonymous edits Covariance  Source: http://en.wikipedia.org/w/index.php?oldid=526499509  Contributors: 16@r, Abovechief, AlphaPyro, Amonet, Ancheta Wis, Antandrus, Anturtle, Ap, Asymmetric, Awaterl, AxelBoldt, Bender2k14, Bsilverthorn, Btyner, CKCortez, CenturionZ 1, Chronosilence, Ciphers, Closedmouth, Coppertwig, Cruise, Dag Hovland, Daniel5Ko, David.lijin.zhang, Den fjättrade ankan, Didickman, Dod1, Duoduoduo, Eamon Nerbonne, Edcarrochio, Ericd, Esnascosta, EtudiantEco, Ewlyahoocom, Fangz, Felix Hoffmann, Fundamental metric tensor, Fæ, Gauge, Gene Nygaard, Giftlite, Gjshisha, Glahaye, Glenn, Gruntler, Guslacerda, HammerHeadHuman, Hess88, Ht686rg90, Ikelos, J.delanoy, Jabbba, 
Jmath666, Johnbibby, Jorgenumata, Jugander, Keshavanvrajan, KoenDelaere, Kristian Joensen, Kwertii, Kzollman, LOL, LaymanDon, Lbelcour, Lensi, Lindabrock, Looxix, MarkSweep, Melcombe, Michael Hardy, Mtroffaes, Naught101, Nijdam, Nwalfield, Oleg Alexandrov, Pak21, Patrick, Pgan002, PhotoBox, Pirsq, Policron, Prax54, Q4444q, Qwfp, RL0919, Ralfreinermueller, Rgclegg, RichardSocher, Rock69, Rocketman768, Romanpoet, Saric, SimonP, Skagedal, Soredewa, Spireguy, Sportyfelix, Sstrader, Sterrys, Stpasha, Sullivan.t.j, Sławomir Biały, Talgalili, TedPavlic, Tfkhang, The Thing That Should Not Be, Thorwald, Tokigun, Tomeasy, Tomi, Traviscj, Usna71, VKokielov, Versus22, Wafulz, WhiteHatLurker, Wikomidia, Wmahan, Zath42, Zundark, Борис Пряха, มือใหม่, 182 anonymous edits Autocovariance  Source: http://en.wikipedia.org/w/index.php?oldid=527456165  Contributors: Charles Matthews, Ciphers, Den fjättrade ankan, First Harmonic, GuidoGer, Hooperbloob, Ilmari Karonen, KoenDelaere, Michael Hardy, Nuwewsco, Rgclegg, Riyad parvez, Shaolin128, Sterrys, Tomaschwutz, Ztran, 20 anonymous edits Autocorrelation  Source: http://en.wikipedia.org/w/index.php?oldid=520399079  Contributors: ARTE, Adambro, Adrian1906, Albmont, AxelBoldt, BD2412, Bookandcoffee, Btyner, Carvas, Charles Matthews, Cmglee, Coltsjoe, Conversion script, Cottrellnc, Counterfact, Damian Yerrick, Dankonikolic, DavidCBryant, Den fjättrade ankan, Dicklyon, Douglas R. White, DrBob, Esaintpierre, Esprit15d, Evildeathmath, Favonian, Fredvanner, Gcjblack, Ghaly, Giftlite, Gomm, Graeme Bartlett, HamburgerRadio, Hannes Eder, Hgfernan, Hobbema, II MusLiM HyBRiD II, Isnow, J S Lundeen, Jackzhp, Jeff3000, John Quiggin, Jrmanning, Jschwa1, Juliancolton, Just Another Dan, Landroni, Larryisgood, Lunch, Maartend8, Manoguru, Mathstat, Mcld, Mebden, Melcombe, Merovingian, Metricslover, Michael Hardy, Mild Bill Hiccup, Mollwollfumble, Mtalexan, Mwtoews, Nova77, Oleg Alexandrov, Omegatron, PAR, Paresnah, Parijata, Pgabolde, Pgan002, PhnomPencil, Qraquen, Quaresalas, Qwfp, Raker, Rbj, Rgclegg, Rich Farmbrough, Rjwilmsi, Robgomez, Rogerbrent, SamuelRiv, ShelfSkewed, Sigmundur, Slippens, Smack, Smmurphy, Soultaco, Sumit Dutta, Thermochap, Tlroche, Tomaschwutz, Tony999, Twelvethirteen, Uncle Dick, UncleDouggie, Vdm, Vini.rn, Virga, Viswaprabha, WavePart, Who, Wikih 66, William James Croft, Wricardoh, 148 anonymous edits Cross-correlation  Source: http://en.wikipedia.org/w/index.php?oldid=518373671  Contributors: Alankjackson, AllHailZeppelin, Andysac12, Apalaria, Belizefan, BenFrantzDale, Bob K, Cdmurray 80, Charles Matthews, Cmglee, Cooli46, Credan, Da nuke, Dankonikolic, Den fjättrade ankan, Deville, Dfrankow, Drizzd, EffeX2, Eteq, Gellule, Ghaly, Giftlite, Gunnar Larsson, HenningThielemann, Ht686rg90, Indeterminate, Inner Earth, Jamelan, Kku, ManMachine1984, Mbr5002, Melcombe, Michael Hardy, MichaelRW1, Momet, MusicScience, Muskr, Mysid, Natalya, North8000, Omegatron, Optical3d, PAR, PEHowland, Parodi, Richard Giuly, RoccoM, SebastianHelm, Slava Evgrafov, Snowolf, Sss41, Sunev, Sławomir Biały, The Thing That Should Not Be, Tomaschwutz, Unexpect, Vini.rn, Voidxor, Vossman, Wernervb, Wikipelli, Will Hage, 85 anonymous edits White noise  Source: http://en.wikipedia.org/w/index.php?oldid=524894983  Contributors: 1984, AManWithNoPlan, Acdx, Admrboltz, Aduvall, Aitias, Aka042, Alan012, 
Alansohn, Alextrevelian 006, Amontero, Ancheta Wis, Andre Engels, Andrewferrier, Antikon, Army1987, ArnoldReinhold, Ashkax, Baccyak4H, BenFrantzDale, Bihzad, Binksternet, Bjh21, Bloodshedder, Breno, Byrned, CBM, CTF83!, Canderson7, Carlb, Cburnett, Charles Matthews, Chotchki, ColonelHamilton, CompRhetoric, Correogsk, Craigy144, Crazyvas, Cronholm144, Cstrobel, Ctyonahl, DNewhall, Damian Yerrick, Darac, DavidWBrooks, Degsart, DerGolgo, Deville, Dinkelaker, DragonHawk, Drewcifer3000, Drizzd, Duckdad, Dzenanz, Eastpoint, Eastpoints, Edcolins, Edude7, Edward Z. Yang, El C, Encyclopedia a-z, Estevezj, EugeneZelenko, Everyking, Fg2, FiP, Freezing the mainstream, FrummerThanThou, Furrykef, GabeIglesia, Gadfium, Ghaly, Giftlite, Gillis, Giraffedata, Glane23, Gobonobo, GoingBatty, Goodnightmush, Goosegoosegoose, Gracefool, Grebmops, Greglocock, Hairhorn, Hannolancia, Hastur chief, HenningThielemann, Heron, Hess8, Hyacinth, Ian Dunster, Ig0r, In base 4, Inky, Ipatrol, Ismartman, James T Curran, JarlaxleArtemis, Jim.henderson, Jiy, Jmath666, Johnleemk, Jonathan.s.kt, Jonfischeruk, Julesd, KF, KRS, Keenan Pepper, Ken Holden, Kenneth M Burke, Kenyon, Kkumpf, Koavf, Krukouski, Kvng, Landroni, Lenilucho, Liftarn, Light current, LightxYagami, Lindosland, LokiClock, Lola Voss, Lowellian, Luna Santin, MPerel, Madbadger, Magic Window, Maksim-e, Martynas Patasius, Mathieu Perrin, Mattyblueboy, Melcombe, Michael Hardy, Miquonranger03, Mohan1986, Mrh30, Mwilde, Nasa-verve, Natesibas, Nbhatla, Netoholic, Numbo3, OlEnglish, Oli Filth, Omegatron, Ossipewsk, PAR, Persian Poet Gal, Phansen, Philopedia, Phoxware, Pinethicket, Pne, Prestonmcconkie, Pzavon, Quazar777, Qwfp, Ravius, Razorflame, Reedbeta, Rich Farmbrough, Rjwilmsi, Rnt20, Robofish, Rod57, Rogerborg, Saucemaster, Scapermoya, Scarian, Scarpy, Schmock, SchreiberBike, Scientizzle, Sean.hoyland, Search4Lancer, Sepia tone, Shanes, Skysmith, Slysplace, Smithderek2000, SpaceFlight89, Spectrogram, Splash, Stephan Leeds, Steve2011, Stumpfoot, Superbeecat, Synocco, Tbackstr, That Guy, From That Show!, The Anome, The Nixinator, The Thing That Should Not Be, TheFlyingWallaby, Themfromspace, Thorseth, Tim1357, Tom Jenkins, TomViza, Torzsmokus, Tsujigiri, Tulameen, Ulric1313, Ultimus, Verbalcontract, Whitenoise2010, Wiki Wikardo, Wiki alf, Wikimalte, Wooddoo-eng, Wuthering123, Xezbeth, Z2trillion, Zanimum, ZildjianAVC, Zoeb, Zvika, 273 anonymous edits Random walk  Source: http://en.wikipedia.org/w/index.php?oldid=525899241  Contributors: 12345shibow, 2001:250:1006:6191:4046:FA24:94D5:92C8, A. 
B., Achituv, Alexedmonds, Alksub, Allaei, Alro, Amalas, Amatulic, Andonic, Areldyb, Arnaudf, Arno Matthias, ArnoldReinhold, Belasted, Bender235, Bereziny, Bfinn, Bkessler, Bkkbrad, Bongwarrior, Boud, Btyner, Catapult, Centrx, Charles Matthews, Chhe, Chris 73, Chris the speller, ChrisGualtieri, CiaPan, Ciphergoth, Cliff, Cobaltcigs, Complexica, Compsim, Connelly, Courcelles, Cutler, Cybercobra, DanielWenger, Danielgenin, Dankonikolic, David-Sarah Hopwood, Deor, Dfg13, Dmcq, Dolohov, Easwarno1, Edward, Ehrenkater, Elf, Eliezg, Ethanmdavidson, Finog, Flomenbom, Gadykozma, Galentx, GeeJo, Giftlite, GrafZahl, Gregfitzy, GregorB, Halaster, Harry.cook2, Headbomb, Henrygb, Ignacio Icke, Illusiwind, Ino5hiro, Ixfd64, Ixnayonthetimmay, Jakohn, Jason Quinn, Jason Recliner, Esq., Jheald, JohnBlackburne, JohnCD, JohnOwens, Jwrosenzweig, Karol Langner, Kelisi, Kghose, KoenDelaere, Koertefa, Lambiam, Lan56, Lensi, LuminaryJanitor, Lupin, MGarcia, Ma8e, Mack2, MadN!sh, Markhebner, Masterpiece2000, Melcombe, Michael Hardy, Miguel, Mikeeg555, Morn, Mortezaaa, NawlinWiki, Nbarth, Neilc, Niall2, Nuwanda7, OdedSchramm, Oleg Alexandrov, Oli Filth, Ordnascrazy, PAR, Paranoid, Paul A, Pfortuny, Pgreenfinch, Pranshus, Puffin, Purpy Pupple, Qrancik, R'n'B, R3m0t, Rafi Neal, Randomwalk2035, Ricardogpn, Richard.decal, Rjwilmsi, RobinK, Run54, SJP, Salgueiro, Sam Hocevar, SamuelScarano, Schmock, Sephrial, Silly rabbit, Simetrical, Special-T, Srleffler, Stasyuha, Steevven1, Stephen richlen, Sviemeister, Swpb, Tardis, Technopilgrim, ThorinMuglindir, TomyDuby, Trumpsternator, Tsaitgaist, Urdutext, V8rik, Wdechent, Weishaupt00, Wereon, Whiteglitter79, Wile E. Heresiarch, Yqwen, Zephyris, ZezzaMTE, Zorodius, Zundark, 221 anonymous edits
  • 189. Article Sources and Contributors 186 Brownian motion  Source: http://en.wikipedia.org/w/index.php?oldid=526205489  Contributors: 1234567890qwertyuiop, 1exec1, 2D, A. Pichler, AC+79 3888, Aannoonn, Abbas.nayebi, Abp35, Adesai, Afasmit, Aim Here, Aitias, Alex Bakharev, AlexWelens, Alfio, Andonic, Andrei Stroe, AndreiDukhin, Androstachys, Aristophanes68, Arjun01, Arkuat, Arthena, Ashmoo, Awickert, Baddevil49, Barras, Bass fishing physicist, BenFrantzDale, Benbest, Bender235, Bernhlav, Bethjaneway, Bjcairns, Bolekilolek, Borgx, Bplohr, BrightStarSky, Calabe1992, Cdang, Cesiumfrog, ChE Fundamentalist, Chandlerburr, Charles Matthews, Cheesy Yeast, Chester Markel, Chhe, Cliff, Colonies Chris, Conversion script, Cutler, Cybercobra, D.H, DGG, DSP-user, DVdm, Da Joe, Da5id403, Darkfrog24, Darryl Ring, David Cooke, Davy p, Discospinster, Dismas, Drmies, Dssamuels, Dungodung, Dysprosia, ERobson, Edgar181, Edgarswooth, Edwinhubbel, Eean, Emperorbma, Epbr123, Eric Forste, Eseijo, Etxrge, Examtester, Faradayplank, FariX, Fieldday-sunday, Fij, Flyocean, Fopnor, Gdr, Ge-lun, George100, Gianluigi, Giftlite, Giovanni33, Gjshisha, Glycoform, Gogo Dodo, Goodnightmush, Gut Monk, Hadal, HamatoKameko, HappyApple, HarryHenryGebel, Headbomb, Henning Makholm, Heron, Hesperian, Hhhippo, History2007, HolcmanDavid, Hyacinth, Ianml, InverseHypercube, Isnow, Itub, JHUastro, JSquish, Jakohn, Jamesooders, Jellybeanbelly, Jim Sukwutput, Jimbobjones123, Jj137, Joanjoc, Joe Schmedley, John gordon 54, JohnInDC, Joke137, Josce, Jszymon, Julesd, Kappa, Kareeser, Karol Langner, Kaslanidi, Keenan Pepper, Klye Mif, KnowledgeOfSelf, Koavf, Krauss, Kt123456789, Kyeseven, L33tminion, LOL, LachlanA, Lee Carre, Lensovet, Lightmouse, Lilcoolio, Ling.Nut, LizardJr8, Logicus, Looie496, Lookang, Luk, MER-C, Machine Elf 1735, MagnInd, Marco Polo, MarkGallagher, Marquez, Matthew Woodcraft, Mdd, Meigel, Melchoir, Melcombe, Memming, Metromoxie, Mgiganteus1, Michael Hardy, Michael welcome back, Microinjection, Miguel, Mike409, Mintwins2006, Mipmip, Mirv, MithrandirMage, Mohsin2511, Molerat, Morbid-o, MrOllie, Msh210, Murtasa, Mydogategodshat, Myria, NHSavage, Natalie Erin, Nate1481, NeilN, Netalarm, Nimbusania, Niri.M, Nk, Nsda, Ognir, OliAtlason, Olivier, Omnipaedista, Omsserpent, Orange Suede Sofa, Ozarfreo, Ozzy667, PAR, Paffordmd72, Palica, Papershark, PaulLowrance, Pete212, Pfeiferwalter, Phthoggos, Polisher of Cobwebs, Postdlf, PouyaDT, Ppearle1, Praseodymium, Purpy Pupple, Quaeler, Quatschman, QuiteUnusual, Qwfp, R.e.b., Ragityman, Raphael s, Razorflame, RexNL, Rgclegg, Rich Farmbrough, Richard.decal, Richerman, Richwil, Rjwilmsi, RockMagnetist, Rolf Mayo, Rossal17, Rscotta, S9440996i, Serek, Shadow1, Shimbels, Sillybilly, Siryendor, Slawekb, Smallweed, SpaceFlight89, SpikeMolec, Squidonius, Srnec, StaticGull, StaticSan, Stephen B Streater, Steved424, Stevenj, Sullivan.t.j, Sundar, Super7luv, Svein Olav Nyberg, TCHJ3K, Tagus, Telemakh0s, Teol, Terra-kun, The Sklad, TheTito, Thepasta, ThorinMuglindir, Thoroughbred Phoenix, Tocharianne, Tomaxer, Triwbe, UberCryxic, Ucucha, Vanished user, Vikasbahar, Viridiflavus, Vsmith, Vyruss, WOT, Walterpfeifer, Widr, Wknight94, Wormy, Wricardoh, Wsiegmund, XJamRastafire, Yamamoto Ichiro, Zercam, Zorodius, Zzzzort, 447 anonymous edits Wiener process  Source: http://en.wikipedia.org/w/index.php?oldid=524769946  Contributors: [email protected], AaronKauf, Alexf, Amatulic, Ambrose.chongo, Andrewpmack, 
Awaterl, Bender235, Berland, Bmju, C45207, CBM, CSTAR, Charles Matthews, Cyp, DaniScoreKeeper, David Underdown, Delius, Deodar, Digfarenough, Dmytro, Dylan Lake, El Roih, Elf, Fintor, Forwardmeasure, Gala.martin, Giftlite, Hanmyo, Jackzhp, James T Curran, JeffreyBenjaminBrown, Jmnbatista, Joke137, JonAWellner, Jujutacular, Lambiam, Mbell, Mediran, Melcombe, Michael Hardy, MisterSheik, Oleg Alexandrov, Oli Filth, PeR, Phaedo1732, Pokedork7876, Ptrf, R.e.b., RobertHannah89, Ru elio, Sandym, Schmock, Soap, Speedplane, Spinningspark, Stephen B Streater, Sullivan.t.j, Sławomir Biały, Tristanreid, Tsirel, Warbler271, Wenerman, Wikomidia, Wysinwygaa, Zvika, 88 anonymous edits Autoregressive model  Source: http://en.wikipedia.org/w/index.php?oldid=515144402  Contributors: Albmont, Everettr2, Jackzhp, Jean-Frédéric, John of Reading, Kate 85b, Kku, Melcombe, Memming, MrFelicity, N419BH, Nova77, Pavon, Rajmylove123, Schmock, Shantham11, Sympatycznyfacet, Tomaschwutz, Wiklimatik, Wpollard, Xian.ch, Zvika, 53 anonymous edits Moving average  Source: http://en.wikipedia.org/w/index.php?oldid=525622319  Contributors: A bit iffy, Adouglass, Ain92, Amatulic, Amitchaudhary, Arthena, Beetstra, Berland, Bil1, Btm, Btyner, Carmitsp, Cemsbr, Chikichiki, Chipmunck, Cjs, Ckatz, CliffC, DARTH SIDIOUS 2, Daniel Quinlan, Dark Mage, DeluxNate, DerBorg, Dickreuter, DragonHawk, Econotechie, Ekotkie, Epbr123, Esanchez7587, Euku, Falk Lieder, Feco, Fenice, Foobarhoge, Gakmo, Gandalf61, Giftlite, Gleb, Glennimoss, GraemeL, Grin, Hdante, HenningThielemann, HimanshuJains, Hu12, Investorhenry, JLaTondre, Jamelan, JenyaTsoy, Jianingy, Jitendralovekar, Karol Langner, Kazov, Kevin Ryde, Kwertii, Lambiam, Lamro, Landroni, Lenrius, Leszek0140, Lukas227, Makeemlighter, Mandarax, Manicstreetpreacher, Martinkv, Materialscientist, Mathaddins, Maxlittle2007, Mazin07, Mehtagauravs, Melcombe, Merosonox, Michael Hardy, Michaelzeng7, MilfordMark, Mir76, Mkacholia, Mwtoews, Nanzeng, Naught101, Naveensakhamuri, Netkinetic, Neurowiki, Nikai, Ninly, Paradoctor, PhilKnight, Pim2009, Pisquared8, Pleclech, Qwfp, R. S. 
Shaw, Rabarberski, Rainypearl, Ramorum, Rentier, Requestion, Richard n, Robomojo, Ron shelf, SLi, Satori Son, Seaphoto, Sid1138, Siebrand, Soluch, Stocksnet, TechAnalyster, Teorth, Thelb4, Time9, TomyDuby, Tony1, Tonyho, Towel401, Tradermatt, Tristanreid, USTUNINSEL, Utcursch, VladimirKorablin, Wa03, Wai Wai, Wavelength, Wayne Slam, WikHead, Wikinotforcommercialuse, Wikiolap, William Avery, Yahya Abdal-Aziz, 251 ,‫ דניאל צבי‬anonymous edits Autoregressive–moving-average model  Source: http://en.wikipedia.org/w/index.php?oldid=525946532  Contributors: Abeliavsky, AdamSmithee, Albmont, AnibalSolon, BlaiseFEgan, Btyner, Charles Matthews, Cheesekeeper, Chem1, Christiaan.Brown, Clausen, CommodiCast, Damian Yerrick, DaveDixon, Den fjättrade ankan, Dicklyon, Diegotorquemada, Dimmutool, Doobliebop, Dr.kenyiu, EtudiantEco, Giftlite, Heptor, Jackzhp, Jaerom Darkwind, Jamelan, John Quiggin, Karada, Kiefer.Wolfowitz, Kku, KyraVixen, Landroni, Leycrtw1, Loeffler, Loew Galitz, Lunch, Maechler, Makesship, Mbergins, MehdiPedia, Melcombe, Merovingian, Michael Hardy, Michelino12, Mr.ogren, Nutcracker, Olaf, Oleg Alexandrov, Oli Filth, PAR, Pyclanmap, Rgclegg, Robinh, Roland Longbow, Rushbugled13, Sagiariel, Sandman888, Schmock, Shape84, Silly rabbit, Snooper77, TheSeven, Thomasmeeks, ThreeBlindMice, Tony1, Tristanreid, Unyoyega, Wavelength, Whertusky, Whisky brewer, Wikigronk, Wile E. Heresiarch, Yoderj, Zvika, ‫ 401 ,ﻣﺎﻧﻲ‬anonymous edits Fourier transform  Source: http://en.wikipedia.org/w/index.php?oldid=526165366  Contributors: 9258fahsflkh917fas, A Doon, A. Pichler, Abecedare, Adam.stinchcombe, Admartch, Adoniscik, Ahoerstemeier, Akbg, Alejo2083, AliceNovak, Alipson, Amaher, AnAj, Andrei Polyanin, Andres, Angalla, Anna Lincoln, Ap, Army1987, Arondals, Asmeurer, Astronautameya, Avicennasis, AxelBoldt, Barak Sh, Bci2, Bdmy, BehzadAhmadi, BenFrantzDale, BigJohnHenry, Bo Jacoby, Bob K, Bobblewik, Bobo192, BorisG, Brews ohare, Bugnot, Bumm13, Burhem, Butala, Bwb1729, CSTAR, Caio2112, Cassandra B, Catslash, Cburnett, CecilWard, Ch mad, Charles Matthews, Chris the speller, ChrisGualtieri, ClickRick, Cmghim925, Complexica, Compsonheir, Coppertwig, CrisKatz, Crisófilax, DX-MON, Da nuke, DabMachine, Daqu, David R. Ingham, DavidCBryant, Dcirovic, Demosta, Dhabih, Discospinster, DmitTrix, Dmmaus, Dougweller, Dr.enh, DrBob, Drew335, Drilnoth, Dysprosia, EconoPhysicist, Ed g2s, Eliyak, Elkman, Enochlau, Epzcaw, Favonian, Feline Hymnic, Feraudyh, Fizyxnrd, Forbes72, Foxj, Fr33kman, Frappyjohn, Fred Bradstadt, Fropuff, Futurebird, Gaidheal1, Gaius Cornelius, Gareth Owen, Geekdiva, Giftlite, Giovannidimauro, Glenn, GuidoGer, GyroMagician, H2g2bob, HappyCamper, Heimstern, HenningThielemann, Herr Lip, Hesam7, HirsuteSimia, Hrafeiro, Ht686rg90, I am a mushroom, Igny, Iihki, Ivan Shmakov, Iwfyita, Jaakobou, Jdorje, Jhealy, Jko, Joerite, JohnBlackburne, JohnOFL, JohnQPedia, Joriki, Justwantedtofixonething, KHamsun, KYN, Keenan Pepper, Kevmitch, Kostmo, Kri, Kunaporn, Larsobrien, Linas, LokiClock, Loodog, Looxix, Lovibond, Luciopaiva, Lupin, M1ss1ontomars2k4, Manik762007, MathKnight, Maxim, Mckee, Mct mht, Metacomet, Michael Hardy, Mikeblas, Mikiemike, Millerdl, Moxfyre, Mr. 
PIM, NTUDISP, Naddy, NameIsRon, NathanHagen, Nbarth, NickGarvey, Nihil, Nishantjr, Njerseyguy, Nk, Nmnogueira, NokMok, NotWith, Nscozzaro, Od Mishehu, Offsure, Oleg Alexandrov, Oli Filth, Omegatron, Oreo Priest, Ouzel Ring, PAR, Pak21, Papa November, Paul August, Pedrito, Pete463251, Petergans, Phasmatisnox, Phils, PhotoBox, PigFlu Oink, Poincarecon, Pol098, Policron, PsiEpsilon, PtDw832, Publichealthguru, Quietbritishjim, Quintote, Qwfp, R.e.b., Rainwarrior, Rbj, Red Winged Duck, Riesz, Rifleman 82, Rijkbenik, Rjwilmsi, RobertHannah89, Rror, Rs2, Rurz2007, SKvalen, Safenner1, Sai2020, Sandb, Sbyrnes321, SchreiberBike, SebastianHelm, Sepia tone, Sgoder, Sgreddin, Shreevatsa, Silly rabbit, Slawekb, SlimDeli, Snigbrook, Snoyes, Sohale, Soulkeeper, SpaceFlight89, Spanglej, Sprocedato, Stausifr, Stevan White, Stevenj, Stpasha, StradivariusTV, Sun Creator, Sunev, Sverdrup, Sylvestersteele, Sławomir Biały, THEN WHO WAS PHONE?, TYelliot, Tabletop, Tahome, TakuyaMurata, TarryWorst, Tetracube, The Thing That Should Not Be, Thenub314, Thermochap, Thinking of England, Tim Goodwyn, Tim Starling, Tinos, Tobias Bergemann, Tobych, TranceThrust, Tunabex, Ujjalpatra, User A1, Vadik wiki, Vasiľ, Verdy p, VeryNewToThis, VictorAnyakin, Vidalian Tears, Vnb61, Voronwae, WLior, Waldir, Wavelength, Wiki Edit Testing, WikiDao, Wile E. Heresiarch, Writer130, Wwheaton, Ybhatti, YouRang?, Zoz, Zvika, 589 anonymous edits Spectral density  Source: http://en.wikipedia.org/w/index.php?oldid=526372725  Contributors: Abdull, Adoniscik, Arjayay, Benjamin.friedrich, Cburnett, Cihan, Claidheamohmor, DanielCD, DavidCary, Delius, Dfrankow, Dicklyon, Dlituiev, Eg-T2g, Ego White Tray, Eteq, Fdimer, Giftlite, Girlphysicist, Halpaugh, Helptry, IRWolfie-, IanOfNorwich, Jamelan, Jayman3000, Jendem, Jonverve, Karol Langner, Kate, Khazar2, Kyda sh, Light current, Looxix, Mange01, MarmotteNZ, Mcld, Melcombe, Memming, Michael Hardy, Moala, Nscozzaro, Numbo3, Oleg Alexandrov, Omegatron, PAR, Paclopes, Physicistjedi, Porqin, Rjwilmsi, Rs2, Ryanrs, SCEhardt, Santhosh.481, Sigmout, Silly rabbit, SimonP, Skarebo, Sonoma-rich, Srleffler, Sterrys, Supten, TedPavlic, The Anome, ToLam, Tomaschwutz, TutterMouse, Wahying, Wikiwooroo, Wolfgang Noichl, Wwheaton, Zoicon5, Zvika, Zylorian, ^musaz, 163 anonymous edits Signal processing  Source: http://en.wikipedia.org/w/index.php?oldid=518097800  Contributors: 10metreh, Adoniscik, Alexkin, Alinja, Altenmann, Andrus Kallastu, Antandrus, Bact, Bethnim, Biasoli, Borgx, Brews ohare, Burhem, Cdworetzky, Cedars, Charles Matthews, Chester Markel, Christopherlin, Conversion script, Cookie90, Dicklyon, DrKiernan, Dwachter, EagleFan, Emperorbma, Firefly322, Funandtrvl, Geoeg, Giftlite, Glenn, Glrx, Grebaldar, Groovamos, GyroMagician, Helwr, Heron, Hezarfenn, Hu12, Isheden, Ixfd64, Jamilsamara, Janki, Jennifer Levinson, Jim1138, Jimmaths, Johnuniq, Jovial4u, Jusdafax, Kjkolb, Kvng, Light current, Llorenzi, Luckyz, MER-C, Mako098765, Mange01, Mathuranathan, Mblumber, Mingramh, Miscanalysis, Monig1, Muriel Gottrop, Nabla, Ninly, Northumbrian, Oicumayberight, Oleg Alexandrov, OrgasGirl, Redgecko, Rememberway, Robert.Harker, Robsavoie, Rockyrackoon, Saddhiyama, SamShearman, Sandeeppalakkal, Sergio.ballestrero, Sfeldm2, Shyamal, Sl, Smack, Spinningspark, The Photon, Tonto Kowalski, TransporterMan, Triddle, Vertium, Vishalbhardwaj, Voidxor, Wavelength, Wiki contributor 21, Wikihitech, Yk Yk Yk, Yswismer, Zvika, Ömer Cengiz 
Çelebi, 86 anonymous edits Autoregressive conditional heteroskedasticity  Source: http://en.wikipedia.org/w/index.php?oldid=527454000  Contributors: 4l3xlk, 8ung3st, Abeliavsky, Albmont, Bbllueataznee, Bobo192, Bondegezou, Brenoneri, Btyner, Charles Matthews, Christiaan.Brown, Christopher Connor, Cje, Cp111, Cronholm144, Davezes, DeadEyeArrow, Den fjättrade ankan, Ebraminio, Erxnmedia, Finnancier, Fyyer, Gaius Cornelius, GeorgeBoole, Hascott, Inferno, Lord of Penguins, Irbisgreif, JavOs, Jni, John Quiggin, Kevinhsun, Kizor, Kwertii, LDLee, Ladypine, Landroni, Loodog, Magicmike, Melcombe, Merube 89, Michael Hardy, Nelson50, Nutcracker, Personline, Philip Trueman, Pitchfork4, Pontus, Protonk, Qwfp, Rgclegg, Rich Farmbrough, Ronnotel, Ryanrs, Schmock, Sigmundur, Talgalili, Whisky brewer, Wile E. Heresiarch, Wtmitchell, Xian.ch, Xieyihui, Zootm, 166 anonymous edits Autoregressive integrated moving average  Source: http://en.wikipedia.org/w/index.php?oldid=525918887  Contributors: 72Dino, Abeliavsky, Aetheling, Alain, Albmont, Ary29, Benjamin Mako Hill, Bkwillwm, Cheesekeeper, Chris-gore, CommodiCast, Cpdo, Damiano.varagnolo, Den fjättrade ankan, Dfrankow, Gak, Hirak 99, Hschliebs, Hu12, J heisenberg, JA(000)Davidson, Jctull, Jeff G., Lunch, MER-C, Maechler, Marion.cuny, Melcombe, Michael Hardy, Nutcracker, Panicpgh, Rgclegg, Schmock, ShelfSkewed, Silly rabbit, Siroxo, Testrider, 34 anonymous edits Volatility (finance)  Source: http://en.wikipedia.org/w/index.php?oldid=527348650  Contributors: A quant, Alfazagato, Allens, Andrewpmack, Arthena, AvicAWB, Baeksu, Bdemeshev, Bender235, Berklabsci, Biem, BrokenSegue, Brw12, Btyner, Byelf2007, Calair, Canadaduane, Catofgrey, Charles Matthews, Christofurio, Cleared as filed, Complexfinance, Cumulant, D15724C710N, DocendoDiscimus, DominicConnor, Eeekster, Ehdr, Emperorbma, Enchanter, Eric Kvaalen, Favonian, FinancePublisher, Finnancier, Fredbauder, Fvasconcellos, GraemeL,
  • 190. Article Sources and Contributors 187 Helon, Hobsonlane, Honza Záruba, Hu12, Infinity0, Innohead, Jerryseinfeld, Jewzip, Jimmy Pitt, John Fader, Jphillips, KKramer, Karol Langner, Kingpin13, KnightRider, Kyng, LARS, Lajiri, Lamro, Landroni, Marishik88, Martinp, Mayerwin, Melcombe, Merube 89, Michael Hardy, Nbarth, Nburden, OccamTheRazor, Orzetto, Pablete85, PaulTheOctopus, Pcb21, PeterM, Pgreenfinch, Phil Boswell, Pontus, Pt johnston, Qarakesek, Quaeler, Qwfp, Ronnotel, Rrenicker1, Ryguasu, S.Örvarr.S, SKL, ShaunMacPherson, Statoman71, Swerfvalk, Taral, Tarc87, Tassedethe, Tedder, That Guy, From That Show!, Thobitz, Time9, Tradedia, UnitedStatesian, UweD, Volatility Professor, Volatiller, Walkerma, Wongm, Wushi-En, Yurik, Zhenqinli, 216 anonymous edits Stable distribution  Source: http://en.wikipedia.org/w/index.php?oldid=523582468  Contributors: 28bytes, Aastrup, Alidev, AndrewHowse, Arthena, Avraham, Baccyak4H, Bdmy, Beland, Benwing, Btyner, Charles Matthews, Confluent, Derek farn, Eggstone, Eric Kvaalen, Fluffernutter, Gaius Cornelius, Giftlite, Gill7da, Hans Adler, Intervallic, J heisenberg, J6w5, JA(000)Davidson, Janlo, JoeKearney, John Nolan, Jérôme, LachlanA, Lpele, Melcombe, Michael Hardy, Msfr, Mveil4483, Myasuda, Nbarth, Nowaket, Nschuma, Oekaki, Ott2, PAR, Ptrf, PyonDude, Qwfp, Rich Farmbrough, Rjwilmsi, Rlendog, Rob Langford, Stpasha, Sun Creator, That Guy, From That Show!, TomyDuby, Wainson, Wikibob, William Avery, Іванко1, 105 anonymous edits Mathematical finance  Source: http://en.wikipedia.org/w/index.php?oldid=523469709  Contributors: A.j.g.cairns, Acroterion, Ahd2007, Ahoerstemeier, Albertod4, Allemandtando, Amckern, Angelachou, Arthur Rubin, Author007, Avraham, Ayonbd2000, Baoura, Beetstra, Billolik, Brad7777, Btyner, Burakg, Burlywood, CapitalR, Celuici, Cfries, Charles Matthews, Christoff pale, Christofurio, Ciphers, Colonel Warden, Cursive, DMCer, DocendoDiscimus, DominicConnor, Drootopula, DuncanHill, Dysprosia, Edward, Elwikipedista, Eric Kvaalen, Evercat, Eweinber, FF2010, Fastfission, Feco, Financestudent, Fintor, Flowanda, Gabbe, Gary King, Gene Nygaard, Giftlite, Giganut, HGB, Halliron, Hannibal19, HappyCamper, Headbomb, Hroðulf, Hu12, Hégésippe Cormier, JBellis, JYolkowski, Jackol, Jamesfranklingresham, Jimmaths, Jmnbatista, JohnBlackburne, JonHarder, JonMcLoone, Jonhol, Jrtayloriv, Kaslanidi, Kaypoh, Kimys, Kolmogorov Complexity, Kuru, Lamro, Langostas, Limit-theorem, Looxix, MER-C, MM21, Mav, Mic, Michael Hardy, Michaltomek, Mikaey, Minesweeper, MrOllie, Msh210, Mydogategodshat, Nikossskantzos, Niuer, Nparikh, Oleg Alexandrov, Onyxxman, Optakeover, Paul A, Pcb21, PhotoBox, Pnm, Portutusd, Ppntori, Punanimal, Quantchina, Quantnet, Ralphpukei, Rasmus Faber, Rhobite, Riskbooks, Rodo82, Ronnotel, Ruud Koot, SUPER-QUANT-HERO, Sardanaphalus, Sentriclecub, Silly rabbit, SkyWalker, Smaines, Smesh, Stanislav87, SymmyS, Tassedethe, Taxman, Template namespace initialisation script, Tesscass, Tigergb, Timorrill, Timwi, Uxejn, Vabramov, Vasquezomlin, WebScientist, Willsmith, Woohookitty, Xiaobajie, YUL89YYZ, Yunli, Zfeinst, Zfr, 257 anonymous edits Stochastic differential equation  Source: http://en.wikipedia.org/w/index.php?oldid=524436107  Contributors: AaronKauf, Agmpinia, BenFrantzDale, Bender235, Benjamin.friedrich, Bistromathic, Btyner, Cradel, Danyaljj, Dgw, F=q(E+v^B), FilipeS, Firsfron, Foxj, Fuhghettaboutit, 
Gaius Cornelius, Giftlite, Hairer, Hectorthebat, Innerproduct, Kiefer.Wolfowitz, KimYungTae, Kupirijo, LARS, LachlanA, Ladnunwala, Lhfriedman, Lkinkade, Marie Poise, Mathsfreak, McLowery, Melcombe, Michael Hardy, Moskvax, Nilanjansaha27, OliAtlason, Phys0111, Piloter, Rikimaru, RockMagnetist, Ronnotel, Sandym, SgtThroat, [email protected], Shawnc, Siroxo, Sullivan.t.j, The Man in Question, UffeHThygesen, Vovchyck, 51 anonymous edits Brownian model of financial markets  Source: http://en.wikipedia.org/w/index.php?oldid=509889166  Contributors: BD2412, Chris the speller, Ensign beedrill, Financestudent, Giganut, GoingBatty, LilHelpa, Rich Farmbrough, Rjwilmsi, SpacemanSpiff, Tassedethe, Woohookitty, Zfeinst, 35 anonymous edits Stochastic volatility  Source: http://en.wikipedia.org/w/index.php?oldid=526032749  Contributors: A. Pichler, Asperal, Benna, Bluegrass, Btyner, Chiinners, DominicConnor, Enchanter, Finnancier, Firsfron, Froufrou07, Hopefully acceptable username, Hulbert88, Leifern, Lmatt, Mevalabadie, Michael Hardy, Mu, Roadrunner, Ronnotel, Seanhunter, Teich50, Ulner, Wavelength, Willsmith, Woohookitty, Zfeinst, 45 anonymous edits Black–Scholes  Source: http://en.wikipedia.org/w/index.php?oldid=527347371  Contributors: -Ozone-, A Train, A bit iffy, A. Pichler, A3 nm, Aberdeen01, Abpai, Adapter9, AdrianTM, Akrouglo, Alan ffm, Aldanor, Alpesh24, Altenmann, Andrew Gray, AndrewHowse, Andycjp, Antoniekotze, Aquishix, Arkachatterjea, Artoasis, Artoffacts, Avikar1, Beetstra, Bender235, BetterToBurnOut, Betterfinance, Bitalgowarrior, Bobo192, Brianboonstra, Btyner, C S, CSWarren, Calltech, Charles Matthews, Charlesreid1, Chinacat2002, Chrisvls, Cibergili, Coder Dan, CoolGuy, Cretog8, Crocefisso, Crougeaux, Crowsnest, Csl77, Css, Cst17, Curtis23, Cyclist, Daida, Dan Rayn, Dan131m, DancingPhilosopher, Danfuzz, DavidRF, Dcsohl, Deflective, Dino, Domino, Dpwkbw, Dr.007, Drusus 0, DudeOnTheStreet, Dudegalea, Echion2, EconomistBR, Edouard.darchimbaud, Edward, EdwardLockhart, Elkman, EmmetCaulfield, Enchanter, Enpc, Ercolev, Ernie shoemaker, Etrigan, Favonian, Feco, Fenice, FilipeS, Finnancier, Fintor, Fish147, France3470, Fsiler, GRBerry, Galizur, Gaschroeder, Gauge, Geschichte, Giftlite, Gnixon, GoldenPi, Goliat 43, Goodnightmush, Grafen, Gretchen, Gulbenk, Guy M, HLwiKi, Hadal, Hadlock, Hu12, HyDeckar, Ikelos, Indoorworkbench, IronGargoyle, Islandbay, IstvanWolf, JaGa, Jaccard, JahJah, Jameslwoodward, JanSuchy, Jerzy, Jigneshkerai89, Jitse Niesen, Jlharrington, Jmnbatista, John, Johnbywater, JustAGal, Justin73, JzG, Kaslanidi, Kbrose, Kcordina, Khagansama, Kimbly, Kirti Khandelwal, Kohzy, Kungfuadam, Kwamikagami, Kwertii, Landroni, Lehalle, Leifern, Leonard G., Leyanese, Lithium6, Lmatt, Lxtko2, Makreel, Marudubshinki, Materialscientist, Mathiastck, Melcombe, Mets501, Mewbutton, Mgiganteus1, Michael C Price, Michael Hardy, Michael Slone, Mikc75, Minhaj.shaik, Mr Ape, MrOllie, Mrseacow, Mydogategodshat, Naddy, Nbarth, Nicolas1981, Nixdorf, Nkojuharov, Nohat, Notary137, Novolucidus, Odont, Ogmark, Oleg Alexandrov, Oli Filth, Olivier, Pameis3, Parsiad.azimzadeh, Pcb21, PeterM, Petrus, Pgreenfinch, PleaseKING, Pls, Pontus, Pps, Prasantapalwiki, Pretzelpaws, Protonk, Quarague, RJN, Rajnr, Razorflame, Rjwilmsi, Roadrunner, Roberto.croce, Ronnotel, RussNelson, S2000magician, SDC, SPat, Saibod, Schmock, Scrutchfield, Sebculture, Sgcook, Silly rabbit, Smaines, 
Smallbones, Smallbones11, Spliffy, Splintax, Sreejithk2000, Statoman71, Stephen B Streater, Stevenmitchell, Stochastic Financial Model, Supergrane, Sysfx, Tangotango, Taphuri, Tarotcards, Tawker, Tedickey, The Anome, TheObtuseAngleOfDoom, Thrymr, Tiles, Tinz, Tomeasy, Tony1, Tristanreid, Typewritten, Ulner, Viz, Vonfraginoff, Vvarkey, Waagh, WallStGolfer31, WebScientist, Wile E. Heresiarch, William Avery, Williamv1138, Willsmith, Wlmsears, Wurmli, Yassinemaaroufi, Yill577, Zfeinst, Zophar, Zzyzx11, 754 anonymous edits Black model  Source: http://en.wikipedia.org/w/index.php?oldid=519475158  Contributors: Captain Disdain, Cfries, DanielCD, Dori, DudeOnTheStreet, Edward, Feco, Felix Forgeron, Finnancier, Fintor, GeneralBob, Hu12, Jimmycorp, Lmatt, Materialscientist, Michael Hardy, Oleg Alexandrov, Oli Filth, Pcb21, PtolemyIV, Renamed user 4, Roadrunner, Samcol1492, Sgcook, Ulner, の ぼ り ん, 71 anonymous edits Black–Derman–Toy model  Source: http://en.wikipedia.org/w/index.php?oldid=515144794  Contributors: AdamSmithee, Bryan Derksen, Christofurio, Danielfranciscook, Finnancier, Fintor, GTBacchus, Gabbe, Gurch, Hmains, John Quiggin, Lcynet, PatrickFlaherty, Pleclech, Rich Farmbrough, RobJ1981, Schmock, Tony1, 13 anonymous edits Cox–Ingersoll–Ross model  Source: http://en.wikipedia.org/w/index.php?oldid=527348445  Contributors: A. Pichler, AdamSmithee, Amakuha, Andreas4965, Arthena, Bankert, Finnancier, Fintor, Forwardmeasure, Gred925, Hwansokcho, JonathanIwiki, Lamro, Melcombe, Michael Hardy, Piloter, Schmock, Sgmu, 22 anonymous edits Monte Carlo method  Source: http://en.wikipedia.org/w/index.php?oldid=527276077  Contributors: *drew, 6StringJazzer, A.Cython, ABCD, Aardvark92, Adfred123, Aferistas, Agilemolecule, Akanksh, Alanbly, Albmont, AlexBIOSS, AlexandreCam, AlfredR, Alliance09, Altenmann, Amritkar, Andkore, Andrea Parri, Andreas Kaufmann, Angelbo, Aniu, Apanag, Aspuru, Astridpowers, Atlant, Avalcarce, Avicennasis, Aznrocket, BAxelrod, BConleyEEPE, Banano03, Banus, Bduke, Beatnik8983, Behinddemlune, BenFrantzDale, BenTrotsky, Bender235, Bensaccount, Bgwhite, BillGosset, Bkell, Blotwell, Bmaddy, Bobo192, Boffob, Bomazi, Boredzo, Broquaint, Btyner, C628, CRGreathouse, Caiaffa, Canterbury Tail, Charles Matthews, ChicagoActuary, Chip123456, Chrisriche, Cibergili, Cm the p, Colonies Chris, Compsim, Coneslayer, Cretog8, Criter, Crougeaux, Cybercobra, Cython1, DMG413, Damistmu, Darkrider2k6, Datta research, Davnor, Ddcampayo, Ddxc, Denis.arnaud, Dhatfield, DianeSteele, Digemedi, Dmcq, Dogface, Donner60, Download, Dratman, Drewnoakes, Drsquirlz, Ds53, Duck ears, Duncharris, Dylanwhs, EEMIV, ERosa, Edstat, Edward, EldKatt, Elpincha, Elwikipedista, Eudaemonic3, Ezrakilty, Fastfission, Fintor, Flammifer, Fritsebits, Frozen fish, Furrykef, G716, Gareth Griffith-Jones, Giftlite, Gilliam, Glosser.ca, Goudzovski, GraemeL, GrayCalhoun, Greenyoda, Grestrepo, Gruntfuterk, Gtrmp, Gökhan, Hanksname, Hawaiian717, Headbomb, Herath.sanjeewa, Here, Hokanomono, Hu12, Hubbardaie, Hugh.medal, ILikeThings, IanOsgood, Inrad, Ironboy11, Itub, J.Dong820, J00tel, Jackal irl, Jacobleonardking, Jamesscottbrown, Janpedia, JavaManAz, Jayjaybillings, Jeffq, Jitse Niesen, Joey0084, John, John Vandenberg, JohnOwens, Jorgenumata, Jsarratt, Jugander, Jérôme, K.lee, KSmrq, KaHa242, Karol Langner, Kenmckinley, Kibiusa, Kimys, Knordlun, 
Kroese, Kummi, Kuru, Lambyte, Lee Sau Dan, LeoTrottier, Lerdthenerd, Levin, Lexor, Lhynard, LizardJr8, LoveMonkey, M-le-mot-dit, Magioladitis, Malatesta, Male1979, ManchotPi, Marcofalcioni, Marie Poise, Mark Foskey, Martinp, Martombo, Masatran, Mathcount, MaxHD, Maxal, Maxentrope, Maylene, Mblumber, Mbmneuro, Mbryantuk, Melcombe, Michael Hardy, MicioGeremia, Mikael V, Misha Stepanov, Mlpearc, Mnath, Moink, MoonDJ, MrOllie, Mtford, Nagasaka, Nanshu, Narayanese, Nasarouf, Nelson50, Nosophorus, Nsaa, Nuno Tavares, Nvartaniucsd, Oceans and oceans, Ohnoitsjamie, Oli Filth, Oneboy, Orderud, OrgasGirl, Ott2, P99am, Paul August, PaulieG, PaulxSA, Pbroks13, Pcb21, Pdelmoral, Pete.Hurd, PeterBoun, Pgreenfinch, Philopp, Phluid61, PhnomPencil, Pibwl, Pinguin.tk, PlantTrees, Pne, Pol098, Popsracer, Poupoune5, Qadro, Quantumelfmage, Quentar, Qwfp, Qxz, R'n'B, RWillwerth, Ramin Nakisa, Rcsprinter123, Redgolpe, Renesis, RexJacobus, Reza1615, Rich Farmbrough, Richie Rocks, Rinconsoleao, Rjmccall, Rjwilmsi, Robma, RockMagnetist, Rodo82, Ronnotel, Ronz, Rs2, Rygo796, SKelly1313, Saltwolf, Sam Korn, Samratvishaljain, Sankyo68, Sergio.ballestrero, Shacharg, Shreevatsa, Sjoemetsa, Snegtrail, Snoyes, Somewherepurple, Spellmaster, Splash6, Spotsaurian, SpuriousQ, Stefanez, Stefanomione, StewartMH, Stimpy, Storm Rider, Superninja, Sweetestbilly, Tarantola, Taxman, Tayste, Techhead7890, Tesi1700, Theron110, Thirteenity, ThomasNichols, Thr4wn, Thumperward, Tiger Khan, Tim Starling, Tom harrison, TomFitzhenry, Tooksteps, Toughpkh, Trebor, Twooars, Tyrol5, UBJ 43X, Urdutext, Uwmad, Vgarambone, Vipuser, Vividstage, VoseSoftware, Wavelength, Wile E. Heresiarch, William Avery, X-men2011, Yoderj, Zarniwoot, Zoicon5, Zr40, Zuidervled, Zureks, Мурад 97, 471 anonymous edits
Image Sources, Licenses and Contributors
Image:Random-data-plus-trend-r2.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random-data-plus-trend-r2.png  License: GNU Free Documentation License  Contributors: Adam majewski, Maksim, Rainald62, WikipediaMaster
File:Tuberculosis incidence US 1953-2009.png  Source: http://en.wikipedia.org/w/index.php?title=File:Tuberculosis_incidence_US_1953-2009.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors: User:Ldecola
File:Stationarycomparison.png  Source: http://en.wikipedia.org/w/index.php?title=File:Stationarycomparison.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Protonk (talk)
Image:Acf new.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Acf_new.svg  License: Creative Commons Attribution-Sharealike 2.5  Contributors: Acf.svg: Jeremy Manning; derivative work: Jrmanning (talk)
File:Comparison convolution correlation.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Comparison_convolution_correlation.svg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Cmglee
Image:white-noise.png  Source: http://en.wikipedia.org/w/index.php?title=File:White-noise.png  License: GNU Free Documentation License  Contributors: Omegatron
Image:Noise.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Noise.jpg  License: GNU Free Documentation License  Contributors: A1
Image:simulation-filter.png  Source: http://en.wikipedia.org/w/index.php?title=File:Simulation-filter.png  License: GNU Free Documentation License  Contributors: Bapho, Maksim, 1 anonymous edits
Image:whitening-filter.png  Source: http://en.wikipedia.org/w/index.php?title=File:Whitening-filter.png  License: GNU Free Documentation License  Contributors: Maksim, Mdd, 1 anonymous edits
Image:Random Walk example.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Random_Walk_example.svg  License: GNU Free Documentation License  Contributors: Morn (talk)
File:2D Random Walk 400x400.ogv  Source: http://en.wikipedia.org/w/index.php?title=File:2D_Random_Walk_400x400.ogv  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Purpy Pupple
Image:Flips.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Flips.svg  License: Public Domain  Contributors: Flips.png: en:User:Whiteglitter79; derivative work: Tsaitgaist (talk)
Image:Random walk in2D closeup.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in2D_closeup.png  License: Public Domain  Contributors: Darapti, Jahobr, Oleg Alexandrov
Image:Random walk in2D.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in2D.png  License: Public Domain  Contributors: Darapti, Jahobr, Oleg Alexandrov
Image:Random walk 2000000.png  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_2000000.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Purpy Pupple
Image:Walk3d 0.png  Source: http://en.wikipedia.org/w/index.php?title=File:Walk3d_0.png  License: Creative Commons Attribution-ShareAlike 3.0 Unported  Contributors: Darapti, Karelj, Zweistein
Image:Brownian hierarchical.png  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_hierarchical.png  License: GNU Free Documentation License  Contributors: Anarkman, Darapti, Di Gama, 1 anonymous edits
File:Antony Gormley Quantum Cloud 2000.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Antony_Gormley_Quantum_Cloud_2000.jpg  License: Creative Commons Attribution 2.0  Contributors: User:FlickrLickr
File:Random walk in1d.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Random_walk_in1d.jpg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: User:Flomenbom
File:Brownian motion large.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_motion_large.gif  License: Creative Commons Attribution-Sharealike 3.0  Contributors: User:Lookang
File:Brownianmotion5particles150frame.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Brownianmotion5particles150frame.gif  License: Creative Commons Attribution-Sharealike 3.0  Contributors: User:Lookang
Image:Brownian hierarchical.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_hierarchical.svg  License: Public Domain  Contributors: Di Gama, JonMcLoone, Perditax
Image:Wiener process 3d.png  Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_3d.png  License: GNU Free Documentation License  Contributors: Original uploader was Sullivan.t.j at English Wikipedia
Image:PerrinPlot2.svg  Source: http://en.wikipedia.org/w/index.php?title=File:PerrinPlot2.svg  License: Public Domain  Contributors: J. B. Perrin; SVG drawing by User:MiraiWarren
File:Diffusion of Brownian particles.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Diffusion_of_Brownian_particles.svg  License: Creative Commons Zero  Contributors: InverseHypercube
File:Brownian motion gamboge.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_motion_gamboge.jpg  License: Public Domain  Contributors: Bernard H. Lavenda
File:Brownian Motion.ogv  Source: http://en.wikipedia.org/w/index.php?title=File:Brownian_Motion.ogv  License: Creative Commons Attribution-Sharealike 3.0  Contributors: DSP-user (talk) 19:43, 26 January 2011 (UTC)
Image:BMonSphere.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:BMonSphere.jpg  License: Creative Commons Attribution-Sharealike 2.5  Contributors: Thomas Steiner
Image:Wiener process zoom.png  Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_zoom.png  License: GNU Free Documentation License  Contributors: User:PeR
File:ArTimeSeries.svg  Source: http://en.wikipedia.org/w/index.php?title=File:ArTimeSeries.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
File:AutocorrTimeAr.svg  Source: http://en.wikipedia.org/w/index.php?title=File:AutocorrTimeAr.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
File:AutoCorrAR.svg  Source: http://en.wikipedia.org/w/index.php?title=File:AutoCorrAR.svg  License: Creative Commons Attribution 3.0  Contributors: Tomaschwutz
File:MovingAverage.GIF  Source: http://en.wikipedia.org/w/index.php?title=File:MovingAverage.GIF  License: Public Domain  Contributors: Victorv
Image:Weighted moving average weights N=15.png  Source: http://en.wikipedia.org/w/index.php?title=File:Weighted_moving_average_weights_N=15.png  License: Creative Commons Attribution-ShareAlike 3.0 Unported  Contributors: Joxemai, Kevin Ryde
Image:Exponential moving average weights N=15.png  Source: http://en.wikipedia.org/w/index.php?title=File:Exponential_moving_average_weights_N=15.png  License: Creative Commons Attribution-ShareAlike 3.0 Unported  Contributors: Joxemai, Kevin Ryde
Image:Function ocsillating at 3 hertz.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Function_ocsillating_at_3_hertz.svg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Thenub314
Image:Onfreq.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Onfreq.svg  License: GNU Free Documentation License  Contributors: Original: Nicholas Longo; SVG conversion: DX-MON (Richard Mant)
Image:Offfreq.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Offfreq.svg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Thenub314
Image:Fourier transform of oscillating function.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Fourier_transform_of_oscillating_function.svg  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Thenub314
File:Rectangular function.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Rectangular_function.svg  License: GNU Free Documentation License  Contributors: Aflafla1, Axxgreazz, Bender235, Darapti, Omegatron
File:Sinc function (normalized).svg  Source: http://en.wikipedia.org/w/index.php?title=File:Sinc_function_(normalized).svg  License: GNU Free Documentation License  Contributors: Aflafla1, Bender235, Juiced lemon, Krishnavedala, Omegatron, Pieter Kuiper
File:Signal processing system.png  Source: http://en.wikipedia.org/w/index.php?title=File:Signal_processing_system.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors: User:Brews ohare
Image:Vix.png  Source: http://en.wikipedia.org/w/index.php?title=File:Vix.png  License: Public Domain  Contributors: Artur adib
Image:Levy distributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_distributionPDF.png  License: Public Domain  Contributors: User:PAR
Image:Levyskew distributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_distributionPDF.png  License: GNU Free Documentation License  Contributors: EugeneZelenko, Joxemai, PAR, 1 anonymous edits
Image:Levy distributionCDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_distributionCDF.png  License: GNU Free Documentation License  Contributors: EugeneZelenko, Joxemai, PAR, 1 anonymous edits
Image:Levyskew distributionCDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_distributionCDF.png  License: GNU Free Documentation License  Contributors: EugeneZelenko, Joxemai, PAR, 1 anonymous edits
Image:Levy LdistributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levy_LdistributionPDF.png  License: GNU Free Documentation License  Contributors: EugeneZelenko, Joxemai, PAR, 1 anonymous edits
Image:Levyskew LdistributionPDF.png  Source: http://en.wikipedia.org/w/index.php?title=File:Levyskew_LdistributionPDF.png  License: GNU Free Documentation License  Contributors: EugeneZelenko, Joxemai, PAR, 1 anonymous edits
Image:Stockpricesimulation.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Stockpricesimulation.jpg  License: Public Domain  Contributors: Roberto Croce
File:European Call Surface.png  Source: http://en.wikipedia.org/w/index.php?title=File:European_Call_Surface.png  License: Creative Commons Attribution-Sharealike 3.0  Contributors: Parsiad.azimzadeh
File:Crowd outside nyse.jpg  Source: http://en.wikipedia.org/w/index.php?title=File:Crowd_outside_nyse.jpg  License: Public Domain  Contributors: AnRo0002, Echtner, Fnfd, Gribeco, Gryffindor, Hystrix, Infrogmation, J 1982, Romary, Skeezix1000, Soerfm, Spuk968, Yerpo, 6 anonymous edits
Image:SQRTDiffusion.png  Source: http://en.wikipedia.org/w/index.php?title=File:SQRTDiffusion.png  License: GNU Free Documentation License  Contributors: Thomas Steiner
Image:Pi 30K.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Pi_30K.gif  License: Creative Commons Attribution 3.0  Contributors: CaitlinJo
Image:Monte carlo method.svg  Source: http://en.wikipedia.org/w/index.php?title=File:Monte_carlo_method.svg  License: Public Domain  Contributors: Pbroks13 (talk); original uploader was Pbroks13 at en.wikipedia
File:Monte-carlo2.gif  Source: http://en.wikipedia.org/w/index.php?title=File:Monte-carlo2.gif  License: unknown  Contributors: Zureks
File:Monte-Carlo method (errors).png  Source: http://en.wikipedia.org/w/index.php?title=File:Monte-Carlo_method_(errors).png  License: unknown  Contributors: Zureks
License
Creative Commons Attribution-Share Alike 3.0 Unported
//creativecommons.org/licenses/by-sa/3.0/