by Anais Dotis-Georgiou

Getting started with time series analysis

Feature | Jul 08, 2021 | 6 mins
Analytics, Data Science, Databases

Time series analysis involves identifying attributes of your time series data, such as trend and seasonality, by measuring statistical properties.


From stock market analysis to economic forecasting, earthquake prediction, and industrial process and quality control, time series analysis has countless applications that enterprises of all kinds rely on to detect trends, develop forecasts, and improve outcomes. In the past year, using time series modeling to manage responses to the pandemic has definitely been one of the most urgent applications of time series analysis.

Time series analysis involves identifying attributes of your time series data, such as trend and seasonality, by measuring statistical properties such as covariance and autocorrelation. Once the attributes of observed time series data are identified, they can be interpreted, integrated with other data, and used for anomaly detection, forecasting, and other machine learning tasks.

Programming languages used for time series analysis and data science include Python, R, Java, Flux, and others. Learning how time series pertains to data science is a great place to start whether you’re interested in becoming a data scientist or simply need to perform time series forecasting or anomaly detection for your use case.

Storing and visualizing time series data

As the Internet of Things (IoT) plays a larger role in all of our lives and as industrial IoT technologies increasingly depend on time series analysis to achieve operational efficiencies and enable predictive maintenance, the ability to scalably ingest, store, and analyze time series data has become a necessity within data infrastructures. 

To ingest and manage time series data, a purpose-built time series platform with built-in UI and analytics capabilities can go a long way in preparing an organization to handle time series data and run data modeling and online machine learning workloads. An effective purpose-built time series database should enable users to automatically retire old data, easily downsample data to lower-resolution data, and transform time series on a schedule in preparation for future analysis.
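
Downsampling, for example, replaces high-resolution points with aggregates over coarser intervals. Here is a minimal sketch of the idea using pandas rather than any particular time series database API; the one-second cpu_usage readings and the 5-minute window are hypothetical.

```python
# A minimal downsampling sketch with pandas, not a specific time series database API;
# the one-second "cpu_usage" readings and the 5-minute window are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
idx = pd.date_range("2021-07-08", periods=3600, freq="s")   # one hour at 1-second resolution
cpu_usage = pd.Series(50 + rng.normal(0, 5, idx.size), index=idx)

# Aggregate to 5-minute means: 3,600 raw points become 12 lower-resolution points
downsampled = cpu_usage.resample("5min").mean()
print(downsampled)
```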

Another necessity, since time series analysis is based on data plotted against time, is to visualize the data—often in real time—to observe any patterns that might occur over time. An effective purpose-built UI should facilitate cross-collaboration with teams working on time series in different time zones, efficiently render visualizations that represent millions of time series points, and easily enable users to take corrective action in response to their time series data. 

Attributes of time series data

Time series data can be understood through three components or characteristics: 

  • Trend refers to any systematic change in the level of a series—i.e., its long-term direction. Both the direction and slope (rate of change) of a trend may remain constant or change throughout the course of the series.
  • Seasonality refers to a repeating pattern of increase and decrease in the series that occurs consistently throughout its duration. Seasonality is commonly thought of as a cyclical or repeating pattern within a period of one year, but seasons aren’t confined to a yearly time scale. Seasons can exist in the nanosecond range as well.
  • Residuals refer to what’s left after you remove the seasonality and trend from the data, as the decomposition sketch below illustrates.
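
To make these components concrete, here is a minimal decomposition sketch, assuming pandas and statsmodels are available; the synthetic monthly series is hypothetical and stands in for your own data.

```python
# A minimal sketch, assuming pandas and statsmodels are installed;
# the synthetic monthly series is hypothetical.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(1)
t = np.arange(48)                                      # four years of monthly points
idx = pd.date_range("2017-01-01", periods=48, freq="MS")
series = pd.Series(100 + 0.5 * t                       # upward trend
                   + 10 * np.sin(2 * np.pi * t / 12)   # yearly seasonality
                   + rng.normal(0, 2, t.size),         # noise (residuals)
                   index=idx)

# Split the series into the three components described above
result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())
print(result.seasonal.head())
print(result.resid.dropna().head())
```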

In a time series, the independent variable is often time itself, which is used to develop forecasts. Before you can forecast, you need to determine whether the time series is “stationary” and whether it exhibits seasonality.

A time series is stationary if its mean and variance remain constant regardless of changes in the independent variable of time itself. Covariance is frequently used as a measure of the stationarity of a series, while autocorrelation is frequently used to identify seasonality. Autocorrelation measures the similarity of observations between a time series and a delayed, or lagged, copy of that time series.
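
As an illustration, the sketch below checks both properties on a synthetic series, assuming pandas and statsmodels are installed; the 12-step seasonal pattern and the series itself are hypothetical.

```python
# A minimal sketch, assuming pandas and statsmodels are installed;
# the synthetic 12-step seasonal series is hypothetical.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(42)
t = np.arange(120)
series = pd.Series(10 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, t.size))

# Autocorrelation: similarity between the series and a copy of itself lagged by 12 steps
print(f"Autocorrelation at lag 12: {series.autocorr(lag=12):.2f}")   # a high value hints at seasonality

# Augmented Dickey-Fuller test, a common stationarity check: a low p-value suggests stationarity
adf_stat, p_value, *_ = adfuller(series)
print(f"ADF statistic: {adf_stat:.2f}, p-value: {p_value:.3f}")
```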

Classical time series models

The first step in performing time series forecasting is to learn about various algorithms and methods that exist to help you achieve your goal. Always research the underlying statistical assumptions of the algorithm you choose, and verify whether or not your data violates those assumptions. Classical time series forecasting models fall into three broad categories:

  • Autoregressive models represent a type of random process and are most commonly applied to time series in economics, nature, and other domains. Forecasts from autoregressive models depend linearly on past observations and a stochastic term.
  • Moving-average models are commonly used to model univariate time series; the forecast depends linearly on the residual errors from previous forecasts. These models assume that your time series is stationary.
  • Exponential smoothing models are used for univariate time series. The forecasts are an exponentially weighted sum of past observations.

The attributes of your time series data, as well as your use case, help you determine which time series forecasting model to use.
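
As a rough illustration of how such models are fit in practice, here is a minimal sketch using statsmodels (assumed to be installed); the synthetic trending, seasonal series and the chosen model orders are hypothetical, not recommendations.

```python
# A minimal sketch, assuming statsmodels is installed; the synthetic series
# and the model orders below are hypothetical, not recommendations.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(0)
t = np.arange(96)
y = pd.Series(50 + 0.4 * t + 8 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, t.size))

# ARIMA combines autoregressive (p) and moving-average (q) terms, with differencing (d)
arima_fit = ARIMA(y, order=(2, 1, 1)).fit()
print(arima_fit.forecast(steps=12))

# Exponential smoothing: forecasts are an exponentially weighted sum of past observations
es_fit = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=12).fit()
print(es_fit.forecast(12))
```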

[ Also on InfoWorld: Visualizing time series data ]

Methods of time series analysis

Different time series analysis methods serve different purposes. For example:

  • Spectral analysis is widely used in fields such as geophysics, oceanography, atmospheric science, astronomy, and engineering. It allows you to discover underlying periodicities in time series data. The spectral density can be estimated using an object known as a periodogram, which is the squared correlation between the time series and sine/cosine waves at the different frequencies spanned by the series (see the periodogram sketch after this list).
  • Wavelet analysis is used for signal processing. A wavelet is a function that is localized in time and frequency, generally with a zero mean, which makes it a tool for decomposing a signal by both location and frequency.
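
For spectral analysis, a periodogram can be computed in a few lines. Here is a minimal sketch with scipy (assumed to be installed); the 5 Hz toy signal and 100 Hz sampling rate are hypothetical.

```python
# A minimal periodogram sketch, assuming scipy is installed;
# the 5 Hz toy signal and 100 Hz sampling rate are hypothetical.
import numpy as np
from scipy.signal import periodogram

fs = 100.0                                 # sampling frequency in Hz
t = np.arange(0, 10, 1 / fs)               # 10 seconds of samples
rng = np.random.default_rng(3)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * rng.normal(size=t.size)

# The periodogram estimates spectral density; peaks reveal underlying periodicities
freqs, power = periodogram(signal, fs=fs)
print(f"Dominant frequency: {freqs[np.argmax(power)]:.1f} Hz")   # ~5 Hz
```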

Anais Dotis-Georgiou is a developer advocate for InfluxData with a passion for making data beautiful with the use of data analytics, AI, and machine learning. She takes the data that she collects and applies a mix of research, exploration, and engineering to translate the data into something of function, value, and beauty. When she is not behind a screen, you can find her outside drawing, stretching, boarding, or chasing after a soccer ball.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.