Choosing the “right” demand forecasting model

Nov 28, 2018  |  5 min read

When building demand forecasts for consumer goods, there are a variety of algorithms you can use, from longstanding best practices to cutting-edge methodologies. While each has its pros and cons, every method ultimately uses historical data to try to predict future demand. The complexity, assumptions, and types of data inputs used in a given model type (and how they are weighted) will vary, but the basic ingredients are similar across the board.

So with all these choices, it can be difficult to know which methodology will work best for your products. In this post, we’ll review different classes of commonly used modeling methods, including simple historical averages, classical time series with regressors, and the new classes of machine learning and artificial intelligence methodologies. Our goal is to give you a taste of where each really shines, as well as the tradeoffs that go along with them.

1. Historical Average

Historical average models smooth out demand by taking averages over different historical periods. This is one of the simplest classes of forecasting models, and tends to work well for long-term or high-level planning, where basic trends over time are most important to understand and the daily or weekly variation can be de-emphasized.
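As a rough illustration of how the averaging window controls the amount of smoothing, here is a minimal Python sketch of a simple moving average forecast. The function name and the weekly sales figures are hypothetical, made up for this example.

```python
def moving_average_forecast(history, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if not 1 <= window <= len(history):
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window


# Hypothetical weekly unit sales for one product (illustrative numbers only).
weekly_units = [90, 110, 95, 105, 100, 120, 110, 130]

# A short window follows the most recent weeks closely.
short_window = moving_average_forecast(weekly_units, window=2)

# A long window smooths over the whole history.
long_window = moving_average_forecast(weekly_units, window=8)

print(short_window)  # 120.0
print(long_window)   # 107.5
```

The longer window gives a steadier number for high-level planning, at the cost of paying less attention to the most recent weeks.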

Pros: The primary advantage of a historical average model over other methodologies is its simplicity; this type of model can be easily implemented using almost any tool, including Excel and Tableau, making it minimal work to compare results in different systems or debug calculation errors.

Cons: Historical averages are anchored to their previous periods and react to changes slowly. They are also overly responsive to outliers in the data: because these models do not use regressors, a one-off spike (for example, from a promotion) cannot be attributed to its cause and is simply averaged into the forecast.
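Both weaknesses are easy to see with a toy example. The sketch below uses a hypothetical four-period trailing mean and made-up demand numbers; it is meant only to show the behavior, not to represent any particular product.

```python
def trailing_mean(history, window=4):
    """Forecast the next period as the mean of the last `window` periods."""
    return sum(history[-window:]) / window


# Demand steps up from 100 to 200 and stays there (hypothetical numbers).
level_shift = [100, 100, 100, 100, 200, 200]
lagging_forecast = trailing_mean(level_shift)  # averages 100, 100, 200, 200

# Steady demand of 100 with a single one-off spike to 500.
one_off_spike = [100, 100, 100, 100, 100, 500]
inflated_forecast = trailing_mean(one_off_spike)  # averages 100, 100, 100, 500

print(lagging_forecast)   # 150.0, still well below the new level of 200
print(inflated_forecast)  # 200.0, double the typical demand of 100
```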

Example model types: Simple Moving Averages, Holt-Winters Exponential Smoothing

Best for: Products with low forecastability. These products may be sold regularly, but have low daily/weekly volume, so their intermittent demand isn’t easily predicted and there is little value in improving the forecast. High-level budgeting and planning processes can also use these methods, since short-term and seasonal variations are smoothed out.

2. Time Series with Added Regressors

This class features regression models that are fit by comparing different historical time periods. These models can also incorporate other inputs, such as price promotions, seasonality, and weather patterns.
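To make the idea of regressors concrete, here is a minimal sketch of an ordinary least-squares fit with two hypothetical inputs: a promotion flag and an in-season flag. It is a plain regression without any autoregressive terms, so it is much simpler than a seasonal ARIMAX model, and the data is synthetic and noise-free so that the fitted effects are easy to check. All names and numbers are assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Each row is [intercept, promo_flag, in_season_flag] for one week.
# Synthetic demand: 100 baseline, +40 when on promotion, +25 when in season.
features = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0], [1, 0, 1]]
units = [100, 140, 125, 165, 100, 125]

baseline, promo_effect, season_effect = fit_ols(features, units)

# Forecast a future in-season week with a planned promotion.
next_week = [1, 1, 1]
forecast = sum(b * x for b, x in zip((baseline, promo_effect, season_effect), next_week))
print(round(baseline), round(promo_effect), round(season_effect), round(forecast))
```

Because the inputs are known ahead of time (a planned promotion, the calendar), the fitted effects can be applied directly to future weeks. The singular-system check is also a small reminder of the multicollinearity problem: if two regressors carry the same information, the fit breaks down.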

Pros: Models in this class are especially adept at predicting products that have consistent variation, such as established seasonality trends. These models can also take into account additional factors known to influence demand, such as promotional and price activity.

Cons: Regression models expect consistent variation in the data, which means that datasets with a limited history force the model to make a fit on too few data points. Time series must be inspected for proper assumptions, such as stationarity and homoscedasticity. Regressors must be scaled and pre-selected to prevent issues such as multicollinearity.

Example model types: Seasonal ARIMAX, Generalized Additive Models

Best for: Products with well-defined seasonality or changes in demand, e.g., swimsuits and winter coats, and products for which the effects of promotional activity can be easily captured using regressors.

3. Machine Learning/Artificial Intelligence

Machine learning models are a broad set of methodologies that use more complex mathematical techniques to select variables and optimize fit in instances where there may be complicated interactions between features. These can be powerful choices, but you’ll want to make sure they have the integrated, transparent, and actionable properties described in our white paper.

Pros: These types of models are great for discovering non-linear and complex relationships in your data, without needing to preselect the exact model type or make assumptions about external factors. Instead of explicitly weighting variables or variable interactions, many of these methods allow you to determine variable importance without worrying about the effects of multicollinearity.
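As a toy illustration of picking up a non-linear relationship without specifying its shape in advance, here is a minimal sketch of gradient boosting with squared-error loss and one-split trees ("stumps") on a single feature. The price and sales numbers are hypothetical, and the function names are made up for this example; in practice you would reach for an established library rather than hand-rolled code like this.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on x that minimizes squared error."""
    candidates = sorted(set(xs))
    if len(candidates) < 2:
        raise ValueError("need at least two distinct x values to split")
    best = None
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(xs, ys, rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Hypothetical data: demand drops sharply once the price goes above 5.
prices = [3, 4, 5, 6, 7, 8]
units = [200, 195, 190, 90, 85, 80]

model = fit_boosted_stumps(prices, units)
print(predict(model, 4), predict(model, 7))
```

Nothing in the code says where the price cliff is; the splits are chosen from the data. The flip side is that with only six points the model is effectively memorizing them, which is the overfitting risk that comes with this flexibility.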

Cons: The challenge with machine learning is that it requires a lot of input data and a fair amount of investment in setup and maintenance. The results of these types of models can also be more complicated to interpret correctly, and the models can be prone to overfitting.

Example model types: Random Forest, Gradient Boosted Machines, Neural Networks (LSTM-RNN, CNN), Support Vector Machines

Best for: Products sold widely and very frequently that may have nuanced buying patterns given the scope and volume of sales.

Which model is best?

While the increased complexity of AI/ML models can yield more accurate demand forecasts in some cases, tradeoffs in the difficulty of implementation mean that the nuances of your products determine which model will yield the best results for you. So, how do you go about choosing the right model, or set of models, for your business?
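One common way to compare candidates is to backtest them: hold out the most recent periods, forecast each one using only the data before it, and compare the errors. The sketch below assumes two very simple hypothetical candidates and a made-up sales series, and uses mean absolute error as the score; any of these choices could reasonably differ in your setting.

```python
def mean_absolute_error(actuals, forecasts):
    """Average size of the forecast misses, in units."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def naive_last_value(history):
    """Forecast the next period as the most recent observation."""
    return history[-1]


def moving_average(history, window=4):
    """Forecast the next period as the mean of the last `window` observations."""
    return sum(history[-window:]) / window


def rolling_backtest(series, model, holdout):
    """Score one-step-ahead forecasts for the last `holdout` periods."""
    start = len(series) - holdout
    forecasts = [model(series[:t]) for t in range(start, len(series))]
    return mean_absolute_error(series[start:], forecasts)


# Hypothetical weekly sales that alternate between 100 and 120 units.
sales = [100, 120, 100, 120, 100, 120, 100, 120, 100, 120]

candidates = {"naive": naive_last_value, "moving_average": moving_average}
scores = {
    name: rolling_backtest(sales, model, holdout=4)
    for name, model in candidates.items()
}
best = min(scores, key=scores.get)

print(scores)  # {'naive': 20.0, 'moving_average': 10.0}
print(best)    # moving_average
```

The same loop works for more sophisticated models, as long as each one is only ever given the data that would have been available at forecast time.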

Weighing the pros and cons of each model outlined above is a start, but your product set might not neatly fit into a single model type. To work around this limitation, it’s also possible to build an ensemble model, which takes models from multiple different classes and averages across them. This approach has the benefit of incorporating effects generated from each methodology; similar effects are strengthened, while inconsistent effects are canceled out.
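In its simplest form, the averaging step of an ensemble can be sketched in a few lines of Python. The model names, forecast values, and weights below are hypothetical placeholders; how to choose the weights (for example, from backtest accuracy) is a separate decision that this sketch does not address.

```python
def ensemble_forecast(forecasts, weights=None):
    """Combine per-model forecasts with an equal or weighted average."""
    if weights is None:
        weights = {name: 1.0 for name in forecasts}
    total_weight = sum(weights[name] for name in forecasts)
    weighted_sum = sum(forecasts[name] * weights[name] for name in forecasts)
    return weighted_sum / total_weight


# Hypothetical next-week forecasts for one product from three model classes.
model_forecasts = {
    "historical_average": 110.0,
    "regression": 125.0,
    "gradient_boosting": 119.0,
}

equal_weighted = ensemble_forecast(model_forecasts)

# Trust the regression model twice as much as the others (an assumed weighting).
custom_weights = {"historical_average": 1.0, "regression": 2.0, "gradient_boosting": 1.0}
custom_weighted = ensemble_forecast(model_forecasts, custom_weights)

print(equal_weighted)   # 118.0
print(custom_weighted)  # 119.75
```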

[Figure: Model classes]

If you want to discuss your forecasting needs in greater detail with one of our data scientists, drop us a line at inquiries@alloy.ai. Be sure to also keep an eye on our blog, where we’ll continue to publish posts that explore different aspects of the demand forecasting process.

Posted by Alloy