Arif Zainurrohman
Published in Nerd For Tech
6 min read · May 16, 2021

Time Series Analysis — Introduction

Weather, stock markets, and heartbeats. They all form time series. If you’re interested in diverse data and forecasting the future, you’re interested in time series analysis.

Time series data spans a wide range of disciplines and use cases. It can be anything from customer purchase histories to conductance measurements of a nano-electronic system to digital recordings of human language. A recurring point is that time series analysis applies to a surprisingly diverse set of data.

Overview

Time Series

Time series analysis is the endeavor of extracting meaningful summary and statistical information from points arranged in chronological order. It is done to diagnose past behavior as well as to predict future behavior.

Innovations in time series analysis result from new ways of collecting, recording, and visualizing data. Next, we briefly discuss the emergence of time series analysis in a variety of applications.

Examples of Time Series

Australian monthly red wine sales
CO2 Series from Mauna Loa, Hawaii

The Origins of Statistical Time Series Analysis

Statistics is a very young science. Progress in statistics, data analysis, and time series has always depended strongly on when, where, and how data was available and in what quantity.

The emergence of time series analysis as a discipline is linked not only to developments in probability theory but equally to the development of analysis.

Objectives of Time Series Analysis

Before we can do time series analysis, however, it is necessary to set up a hypothetical probability model to represent the data. After an appropriate model has been chosen, it is then possible to estimate parameters, check for goodness of fit to the data, and possibly use the fitted model to enhance our understanding of the mechanism generating the series.

Once a satisfactory model has been developed, it may be used in a variety of ways depending on the particular field of application.

The model may be used simply to provide a compact description of the data. For example, a series such as the accidental deaths data can be represented as the sum of a specified trend, a seasonal term, and a random term. For the interpretation of economic statistics such as unemployment figures, it is important to recognize the presence of seasonal components and to remove them so as not to confuse them with long-term trends.

Some Simple Time Series Models

An important part of the analysis of a time series is the selection of a suitable probability model (or class of models) for the data. To allow for the possibly unpredictable nature of future observations, it is natural to suppose that each observation $x_t$ is a realized value of a certain random variable $X_t$.

1. Some Zero-Mean Models

A. iid noise

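Perhaps the simplest model for a time series is one in which there is no trend or seasonal component and in which the observations are simply independent and identically distributed (iid) random variables with zero mean. We refer to such a sequence of random variables $X_1, X_2, \ldots$ as iid noise. By definition, we can write, for any positive integer $n$ and real numbers $x_1, \ldots, x_n$,

$$P[X_1 \le x_1, \ldots, X_n \le x_n] = P[X_1 \le x_1] \cdots P[X_n \le x_n] = F(x_1) \cdots F(x_n),$$

where $F(\cdot)$ is the cumulative distribution function shared by the identically distributed random variables $X_1, X_2, \ldots$.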

B. A binary process

As an example of iid noise, consider the sequence of iid random variables $\{X_t, t = 1, 2, \ldots\}$ with

$$P[X_t = 1] = p, \qquad P[X_t = -1] = 1 - p,$$

where $p = 0.5$. The time series obtained by tossing a penny repeatedly and scoring +1 for each head and −1 for each tail is usually modeled as a realization of this process.

C. Random walk

The random walk $\{S_t, t = 0, 1, 2, \ldots\}$ (starting at zero) is obtained by cumulatively summing (or “integrating”) iid random variables. Thus a random walk with zero mean is obtained by defining $S_0 = 0$ and

$$S_t = X_1 + X_2 + \cdots + X_t, \qquad t = 1, 2, \ldots,$$

where $\{X_t\}$ is iid noise. If $\{X_t\}$ is the binary process described above, then $\{S_t, t = 0, 1, 2, \ldots\}$ is called a simple symmetric random walk.
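The three zero-mean models above are easy to simulate. The following is a minimal sketch using only the Python standard library; the function names, the choice of a standard normal distribution for the iid noise, and the fixed seed are illustrative assumptions, not something the models themselves prescribe.

```python
import random
from itertools import accumulate


def iid_noise(n, rng):
    """n independent draws from a standard normal distribution (zero mean)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def binary_process(n, rng, p=0.5):
    """n independent values: +1 with probability p, -1 with probability 1 - p."""
    return [1 if rng.random() < p else -1 for _ in range(n)]


def random_walk(steps):
    """Cumulative sums S_1, ..., S_n of the steps, preceded by S_0 = 0."""
    return [0] + list(accumulate(steps))


rng = random.Random(42)           # fixed seed (arbitrary) for reproducibility
noise = iid_noise(5, rng)         # iid noise
tosses = binary_process(10, rng)  # penny tosses scored +1 / -1
walk = random_walk(tosses)        # simple symmetric random walk

print(noise)
print(tosses)
print(walk)
```

Each value of `walk` differs from the previous one by exactly one step, and the last value equals the sum of all tosses, which is the "cumulative summing" described above.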

2. Models with Trend and Seasonality

In several time series there is a clear trend in the data. When an increasing trend is apparent, a zero-mean model for the data is clearly inappropriate. A series with a trend but no apparent periodic component suggests trying a model of the form

$$X_t = m_t + Y_t,$$

where $m_t$ is a slowly changing function known as the trend component and $Y_t$ has zero mean. A useful technique for estimating $m_t$ is the method of least squares.
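As an illustration of least squares trend estimation, the sketch below assumes the simplest possible trend, a straight line a + b·t with t = 1, …, n, and picks a and b to minimize the sum of squared differences between the observations and the line. The linear form, the function name, and the sample values are assumptions made for this example; a different series might call for a quadratic or other trend function.

```python
def fit_linear_trend(x):
    """Least-squares estimates (a, b) for the trend a + b*t, with t = 1, ..., n."""
    n = len(x)
    times = range(1, n + 1)
    t_mean = (n + 1) / 2
    x_mean = sum(x) / n
    # Slope = sum of cross-deviations / sum of squared time deviations.
    s_tx = sum((t - t_mean) * (xi - x_mean) for t, xi in zip(times, x))
    s_tt = sum((t - t_mean) ** 2 for t in times)
    b = s_tx / s_tt
    a = x_mean - b * t_mean
    return a, b


# Hypothetical observations, invented for illustration only.
series = [3.1, 3.4, 4.2, 4.4, 5.1, 5.4, 6.2, 6.4]

a, b = fit_linear_trend(series)
trend = [a + b * t for t in range(1, len(series) + 1)]
# What is left after subtracting the fitted trend estimates the zero-mean part.
residuals = [xi - mi for xi, mi in zip(series, trend)]

print(f"a = {a:.3f}, b = {b:.3f}")
print([round(r, 3) for r in residuals])
```

The residuals of a least squares line with an intercept always sum to zero (up to rounding), which matches the requirement that the remaining component has zero mean.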

A General Approach to Time Series Modeling

Before introducing the ideas of dependence and stationarity, it helps to outline a general approach, which gives an overview of how the various ideas of time series analysis fit together.

Plot the series and examine the main features of the graph, checking in particular whether there is a trend, a seasonal component, any apparent sharp changes in behavior, and any outlying observations.

Stationary Models and the Autocorrelation Function

In practical problems we do not start with a model, but with observed data $\{x_1, x_2, \ldots, x_n\}$. To assess the degree of dependence in the data and to select a model, one of the important tools we can use is the sample autocorrelation function (sample ACF) of the data. If we believe that the data are realized values of a stationary time series $\{X_t\}$, then the sample ACF will provide us with an estimate of the ACF of $\{X_t\}$.

This estimate may suggest which of the many possible stationary time series models is a suitable candidate for representing the dependence in the data.
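To make the sample ACF concrete, here is a small sketch that computes it directly from its usual definition: the sample autocovariance at lag h (with divisor n) divided by the sample autocovariance at lag 0. The function name and the toy series are assumptions for illustration; in practice a statistics library would normally be used instead.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0, ..., max_lag (requires max_lag < len(x)).

    Assumes the series is not constant; otherwise the lag-0 autocovariance is zero.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    gamma0 = sum(d * d for d in dev) / n  # sample variance with divisor n
    acf = []
    for h in range(max_lag + 1):
        # Sample autocovariance at lag h, also with divisor n.
        gamma_h = sum(dev[t + h] * dev[t] for t in range(n - h)) / n
        acf.append(gamma_h / gamma0)
    return acf


# Toy series that flips sign at every step, so neighbours are negatively related.
alternating = [1, -1] * 5
print(sample_acf(alternating, 3))
```

For this alternating series the lag-1 autocorrelation is strongly negative and the lag-2 autocorrelation strongly positive, which is the kind of pattern that hints at the dependence structure a candidate model has to reproduce.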

Estimation and Elimination of Trend and Seasonal Components

The first step in the analysis of any time series is to plot the data. If there are any apparent discontinuities in the series, such as a sudden change of level, it may be advisable to analyze the series by first breaking it into homogeneous segments.

If there are outlying observations, they should be studied carefully to check whether there is any justification for discarding them (for example if an observation has been incorrectly recorded). Inspection of a graph may also suggest the possibility of representing the data as a realization of the process (the classical decomposition model)

$$X_t = m_t + s_t + Y_t,$$

where $m_t$ is a slowly changing function known as a trend component, $s_t$ is a function with known period $d$ referred to as a seasonal component, and $Y_t$ is a random noise component that is stationary.

If the seasonal and noise fluctuations appear to increase with the level of the process, then a preliminary transformation of the data is often used to make the transformed data more compatible with the model.
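The decomposition and the preliminary transformation can be sketched together. The example below assumes a logarithm as the transformation, a centered moving average over one full period as the trend estimate, and per-season averages of the detrended values as the seasonal estimate. These are common choices but not the only ones, and the function names and the quarterly series are invented for illustration.

```python
import math


def moving_average_trend(x, d):
    """Centered moving average over one period d; None where the window does not fit."""
    n = len(x)
    q = d // 2
    trend = [None] * n
    for t in range(q, n - q):
        if d % 2 == 0:
            # Even period: half weight on the two end points of the window.
            total = 0.5 * x[t - q] + sum(x[t - q + 1:t + q]) + 0.5 * x[t + q]
        else:
            total = sum(x[t - q:t + q + 1])
        trend[t] = total / d
    return trend


def seasonal_component(x, trend, d):
    """Average detrended value per position in the period, shifted to sum to zero.

    Assumes every position in the period has at least one trend estimate.
    """
    sums = [0.0] * d
    counts = [0] * d
    for t, (xi, mi) in enumerate(zip(x, trend)):
        if mi is not None:
            sums[t % d] += xi - mi
            counts[t % d] += 1
    w = [s / c for s, c in zip(sums, counts)]
    w_mean = sum(w) / d
    return [wk - w_mean for wk in w]


# Hypothetical quarterly series whose seasonal swings grow with the level.
factors = [1.2, 0.9, 0.8, 1.1]
x = [(10 + t) * factors[t % 4] for t in range(16)]

d = 4
log_x = [math.log(v) for v in x]  # the log turns growing swings into additive ones
trend = moving_average_trend(log_x, d)
seasonal = seasonal_component(log_x, trend, d)
noise = [xi - mi - seasonal[t % d]
         for t, (xi, mi) in enumerate(zip(log_x, trend)) if mi is not None]

print([round(s, 3) for s in seasonal])
print([round(e, 3) for e in noise])
```

The first and last `d // 2` positions have no trend estimate because the moving window does not fit there, so the estimated noise sequence is shorter than the original series.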

Testing the Estimated Noise Sequence

The objective of the data transformations is to produce a series with no apparent deviations from stationarity, and in particular with no apparent trend or seasonality.

Assuming that this has been done, the next step is to model the estimated noise sequence (i.e., the residuals obtained either by differencing the data or by estimating and subtracting the trend and seasonal components).
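Differencing, the first of the two routes to residuals mentioned above, can be shown in a few lines. This sketch assumes a known period of 4 and a linear trend; the helper name and the series are made up for the example.

```python
def difference(x, lag=1):
    """Lag differences x[t] - x[t - lag]; the result is `lag` values shorter."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Hypothetical series: linear trend 2*t plus a seasonal pattern of period 4.
seasonal = [3, -1, -4, 2]
x = [2 * t + seasonal[t % 4] for t in range(12)]

deseasonalized = difference(x, lag=4)          # removes the period-4 component
residuals = difference(deseasonalized, lag=1)  # removes what is left of the trend

print(deseasonalized)
print(residuals)
```

Because this toy series contains no noise, the lag-4 difference leaves only the constant trend increment and the second difference leaves zeros. With real data the result would be the estimated noise sequence to be modeled next.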

If there is no dependence between these residuals, then we can regard them as observations of independent random variables, and there is no further modeling to be done except to estimate their mean and variance. However, if there is significant dependence among the residuals, then we need to look for a more complex stationary time series model for the noise that accounts for the dependence. This will be to our advantage since dependence means in particular that past observations of the noise sequence can assist in predicting future values.
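One way to check for dependence among residuals, sketched here under stated assumptions, is to look at their sample autocorrelations. For iid noise, roughly 95% of them should lie within ±1.96/√n, and the Ljung-Box statistic should be approximately chi-squared distributed with as many degrees of freedom as lags used. The function names, the choice of 20 lags, and the simulated residuals are illustrative assumptions, not a prescribed procedure.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0, ..., max_lag (series must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    gamma0 = sum(d * d for d in dev) / n
    return [sum(dev[t + h] * dev[t] for t in range(n - h)) / n / gamma0
            for h in range(max_lag + 1)]


def iid_diagnostics(residuals, max_lag=20):
    """Return (number of lags outside +-1.96/sqrt(n), Ljung-Box statistic)."""
    n = len(residuals)
    rho = sample_acf(residuals, max_lag)[1:]  # drop lag 0, which is always 1
    bound = 1.96 / math.sqrt(n)
    outside = sum(1 for r in rho if abs(r) > bound)
    q_lb = n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))
    return outside, q_lb


# 0.95 quantile of the chi-squared distribution with 20 degrees of freedom.
CHI2_95_DF20 = 31.41

rng = random.Random(0)  # arbitrary seed; simulated residuals for illustration only
simulated = [rng.gauss(0.0, 1.0) for _ in range(200)]

outside, q_lb = iid_diagnostics(simulated)
print(f"lags outside bounds: {outside} of 20")
print(f"Ljung-Box Q = {q_lb:.2f} (compare with {CHI2_95_DF20})")
```

Many autocorrelations outside the bounds, or a statistic well above the quantile, would suggest the residuals are not iid and that a more complex stationary model for the noise is worth fitting.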

References

Practical Time Series Analysis: Prediction with Statistics and Machine Learning

Introduction to Time Series and Forecasting, Second Edition
