In a previous tutorial, we discussed the basics of time series and time series analysis. We looked at how to convert data into time series data and analyse this in R. In this tutorial, we'll go into more depth and look at time series decomposition.
We’ll first recap the components of a time series and then discuss the moving average concept. After that we’ll focus on two time series decomposition methods: a simple method based on moving averages and the local regression method.
You can download the data files for this tutorial here
Components of Time Series
We know that there are four time series components, of which trend and seasonality are the main ones. We can assume one of two models for a time series: the additive model or the multiplicative model. When we assume the additive model, the data at any period t, that is ‘Yt’, is the sum of the seasonal ‘St’, trend ‘Tt’ and error ‘Rt’ components at period t:
Yt = St + Tt + Rt
where
Yt : Time series value at period t
St : Seasonal component at period t
Tt : Trend-cycle component at period t
Rt : Remainder (or irregular or error) component at period t
Alternatively, a multiplicative model would be written as:
Yt = St × Tt × Rt
In the multiplicative model, we assume that Yt is the product of the components St, Tt and Rt. When the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series, the additive model is more appropriate than the multiplicative model.
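To make the two models concrete, here is a minimal R sketch that builds a series under each of them. All component values are synthetic and chosen purely for illustration; they are not taken from the tutorial's data.

```r
# Hypothetical components for three years of monthly data (illustrative only)
t <- 1:36
trend    <- 100 + 2 * t                  # Tt: steady upward trend
seasonal <- 10 * sin(2 * pi * t / 12)    # St: pattern repeating every 12 months
set.seed(1)
remainder <- rnorm(36, mean = 0, sd = 1) # Rt: random noise

# Additive model: Yt = St + Tt + Rt
y_add <- seasonal + trend + remainder

# Multiplicative model: Yt = St * Tt * Rt, with the seasonal and remainder
# components expressed as factors around 1 rather than offsets around 0
seasonal_f  <- 1 + seasonal / 100
remainder_f <- 1 + remainder / 100
y_mult <- seasonal_f * trend * remainder_f

# Taking logs turns the multiplicative model into an additive one
log_sum <- log(seasonal_f) + log(trend) + log(remainder_f)
all.equal(log(y_mult), log_sum)
```

In the additive series the seasonal swing stays the same size as the trend rises, whereas in the multiplicative series it grows with the level of the series.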
Understanding Moving Averages
In time series analysis, the moving average method is a common approach for estimating the trend in a time series. So let's understand how moving averages are calculated. Moving averages are averages of consecutive observations, calculated over overlapping subgroups of fixed length. They smooth the time series by filtering out random fluctuations. The period of the moving average depends on the type of data. For non-seasonal data, a shorter length, typically a 3-period or 5-period moving average, is used.
For seasonal data, the length equals the number of observations in one seasonal cycle: 12 for monthly data, 4 for quarterly data, etc.
When calculating a moving average of period 3, the first 2 moving averages cannot be calculated because there are not enough earlier values. The moving average for day 3 is the average of the values at days 1, 2 and 3. The moving average for day 4 is the average of the values at days 2, 3 and 4. Similarly, for a moving average of period 5, the first four moving averages are not calculated.
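The calculation described above can be sketched in R with the filter function from the stats package. The daily values here are made up for illustration.

```r
# Hypothetical daily values (illustrative only)
x <- c(10, 12, 11, 15, 14, 18, 17)

# Moving average of period 3: each value is the mean of the current and the
# two preceding observations. sides = 1 means only past values are used.
ma3 <- stats::filter(x, rep(1 / 3, 3), sides = 1)
ma3   # days 1 and 2 are NA; day 3 is (10 + 12 + 11) / 3 = 11

# Moving average of period 5: the first four values are NA
ma5 <- stats::filter(x, rep(1 / 5, 5), sides = 1)
ma5
```

The call is written as stats::filter so that it still refers to the base R function if another package with its own filter function, such as dplyr, is loaded.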
Time Series Decomposition – Simple Method
Let's now try to understand the first technique of time series decomposition. Decomposition is a statistical method that breaks a time series down into its components. The three basic steps to decompose a time series using the simple method are:
1) Estimating the trend
2) Eliminating the trend
3) Estimating seasonality
To find the trend, we compute moving averages covering one season. We then eliminate the trend component from the original time series by calculating Yt minus Tt, where Tt is the trend value. Thirdly, to estimate the seasonal component for a given period (for example, a given month), we average the de-trended values for that period across all years. We then adjust these seasonal indexes to ensure that they add to zero. Finally, the remainder component is calculated by subtracting the estimated seasonal and trend-cycle components from the original series.
Let’s consider an example. Suppose we have monthly time series data for three years: 2014, 2015 and 2016. First, we calculate the moving average. We use 13 values to capture the trend in the yearly data, that is, the previous 6 months, the current month and the following 6 months. (In a centred 12-month average such as the one R's decompose function uses, the two end values get half weight so that the window spans exactly one year.) This gives us the trend component. After that, we remove the trend component Tt from the original time series Yt. Finally, the seasonal index for July is the average of all the de-trended July values in the data, that is, the average of the de-trended values for July 2014, July 2015 and July 2016. Note that this moving average is slightly different from the one we discussed earlier, because it uses values both before and after the period being averaged. As a result, no trend estimate is available for the first 6 and last 6 months of the series.
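The three steps can be sketched by hand in R as follows. This is a minimal illustration on a synthetic monthly series for 2014 to 2016, not the tutorial's data. It assumes an additive model and a centred 12-month moving average with half weights on the two end values.

```r
# Hypothetical monthly series for 2014-2016 (synthetic, illustrative only)
set.seed(42)
t <- 1:36
y <- ts(100 + 2 * t + 10 * sin(2 * pi * t / 12) + rnorm(36),
        start = c(2014, 1), frequency = 12)

# Step 1: estimate the trend with a centred moving average over one year.
# 13 values are used; the two end values get half weight, so weights sum to 1.
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(y, w, sides = 2)

# Step 2: remove the trend from the original series
detrended <- y - trend

# Step 3: average the de-trended values month by month, then shift the
# twelve indexes so that they add to zero
idx <- tapply(as.numeric(detrended), as.integer(cycle(y)), mean, na.rm = TRUE)
idx <- as.numeric(idx - mean(idx))
seasonal <- ts(rep(idx, times = 3), start = c(2014, 1), frequency = 12)

# Remainder: what is left after removing trend and seasonality
remainder <- y - trend - seasonal

# Compare the hand-computed seasonal component with the built-in function
all.equal(as.numeric(seasonal), as.numeric(decompose(y)$seasonal))
```

The trend and remainder are NA for the first and last six months, because the centred window needs six observations on each side.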
Case Study
Let’s consider a case study of monthly sales data for three years from 2013 to 2015. The objective of the study is to apply decomposition methods and analyse each component of the time series separately. We have 36 records with year, month, and sales as the variables of the study.
Data Snapshot
Here is a snapshot of the data. We have three columns representing three variables: Year and Month are time variables, whereas Sales is our time series of interest.
Time Series Decomposition in R – Simple Method
First, we import our data using the read.csv function. As discussed in the previous tutorial, we’ll use the ‘ts’ function in R to convert a variable from a data frame to a time series object. We specify the time scale, that is, the year and month of the first and last observations, through the start and end arguments. frequency=12 tells R that we have monthly data. Once the sales variable has been converted to a time series object, we perform a classical seasonal decomposition by moving averages using the decompose function. We then plot the decomposed data using the plot function in R.
# Time Series Decomposition