More accurate demand forecasts, powered by big data and machine learning, can generate millions in additional revenue for brands.
In our forecasting white paper, we shared the three principles of modern forecasting: use an integrated approach, keep the methodology transparent, and make results actionable. In this blog post, we’ll discuss why the second principle, keeping methodology transparent, is especially important as executives look to make decisions based on demand forecasts.
Grasping the data science behind forecasts might sound intimidating, but it’s important to remember the scope of the transparency — it’s about understanding why a model produced certain results. As a business leader, you don’t need to evaluate the inner workings of forecasting algorithms, but you should understand how the decisions and tradeoffs made while designing a model impact its outcome. When everyone understands how a forecasting model works — its limitations, biases, margin of error, etc. — you can make the smartest decisions possible based on the results.
Below, we’ll walk through five probing questions you can ask to improve your understanding of how forecasting models work and of the risks and opportunities associated with your forecasts. By taking the time to become familiar with these concepts, you’ll be able to make decisions based on forecasts with greater confidence and accuracy.
1. How are we measuring the quality of our forecasts?
This is the overarching point to keep in mind when looking at forecasting model results: how close does the forecast get us to where we want to be? A model designed for a perishable product line, such as produce, will likely be very different from one that supports seasonal apparel, such as mittens. Both models might be statistically sound and high-quality, so it’s not enough to rely on that standard alone. Instead, you need to think about the forecast’s results at a macro level. As an executive, your industry expertise provides helpful context for framing this evaluation. Examples of factors to consider include:
- How are we evaluating and reporting error? Keep the principle of Goodhart’s law in mind: “When a measure becomes a target, it ceases to be a good measure.” For instance, if you make “zero out-of-stocks” a goal for your supply chain, you could meet that target by keeping very high inventory levels, which might be a net negative for the brand.
- How are we evaluating and reporting uncertainty? Every statistical model has a certain margin of error associated with it, and it’s important to know what that margin is for your demand forecasts. How off-base could your model be, and what would that look like in terms of real business outcomes?
- How do the machine learning forecasts compare to baselines? This is where experience is especially relevant. How much of an improvement does the forecast represent over less-sophisticated models? How does it compare to the intuition of domain experts?
Once you understand how the quality of the forecast’s output is being judged, you can better anticipate where the potential pitfalls are and where deviations from the forecast are most likely to occur.
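To make the error and baseline questions concrete, here is a minimal Python sketch. The sales figures are invented for illustration, and the "naive" baseline (repeat last week's sales) is just one simple yardstick you might choose, not a recommended standard.

```python
from statistics import mean

def mean_absolute_error(actual, forecast):
    """Average absolute gap between actual and forecast units."""
    return mean(abs(a - f) for a, f in zip(actual, forecast))

def forecast_bias(actual, forecast):
    """Average signed error; a positive value means the forecast runs high."""
    return mean(f - a for a, f in zip(actual, forecast))

# Hypothetical weekly unit sales and a model forecast for the same weeks.
actual = [100, 120, 90, 110, 130]
model_forecast = [105, 115, 95, 108, 126]

# Naive baseline: predict each week as the previous week's actual sales.
# The first week has no predecessor, so both forecasts are scored on weeks 2-5.
naive_forecast = actual[:-1]
scored_actual = actual[1:]

model_mae = mean_absolute_error(scored_actual, model_forecast[1:])
naive_mae = mean_absolute_error(scored_actual, naive_forecast)
model_bias = forecast_bias(scored_actual, model_forecast[1:])

# Share of the baseline's error that the model removes.
improvement = 1 - model_mae / naive_mae
```

Reporting the signed bias alongside the absolute error helps guard against the Goodhart's law trap: a forecast can hit an error target while consistently running high or low.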
2. Which features are included in our algorithm, and how are we choosing them?
Advanced forecasting is kind of like flying an airplane: there’s a whole host of levers and controls that can be adjusted to affect performance. It’s important to understand the different factors that influence an algorithm’s output, and to evaluate those individually, in addition to looking at the model’s end result. Two examples of factors to consider are:
- Promotional deals: Discounts can have a big impact on overall sales, so if your company runs promos frequently, it’s important to include this information as a factor in your model.
- Unconstrained demand: If one of your products is chronically out of stock for the last few days of every week, then sales data likely isn’t an accurate picture of what true demand looks like.
With a better understanding of the factors that are affecting a given model’s output, you can make decisions that take into account potential shortcomings or pitfalls of that model.
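As a rough illustration of the unconstrained demand problem, the sketch below replaces sales on stockout days with the average of in-stock sales for the same weekday. The record layout and the fill-in rule are assumptions made for this example; real demand-unconstraining methods are usually more sophisticated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (weekday, units_sold, in_stock_all_day)
records = [
    ("Fri", 40, True), ("Sat", 22, False), ("Sun", 10, False),
    ("Fri", 44, True), ("Sat", 60, True), ("Sun", 50, True),
]

def estimate_unconstrained(records):
    """Estimate demand per day, correcting days where stock ran out."""
    # Collect sales from days when the product stayed in stock.
    in_stock_sales = defaultdict(list)
    for weekday, units, in_stock in records:
        if in_stock:
            in_stock_sales[weekday].append(units)

    estimates = []
    for weekday, units, in_stock in records:
        if in_stock or not in_stock_sales[weekday]:
            # Trust the observed sales, or keep them if there is no reference.
            estimates.append(units)
        else:
            # Demand was at least what sold; use the in-stock average if higher.
            estimates.append(max(units, mean(in_stock_sales[weekday])))
    return estimates

demand = estimate_unconstrained(records)
```

Here the two stockout days are lifted to the level seen on comparable in-stock days, which is closer to what customers actually wanted to buy.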
3. Why did the algorithm make the prediction that it did?
This question is a direct corollary to the previous question. Once you understand the factors that are influencing the model’s results, you can then evaluate the outcome to determine if that combination of factors matches your understanding of the industry. For instance, is it realistic that local temperatures would have a significant effect on consumer electronics sales? Or is it more likely that cooler temperatures correspond with the holiday shopping season, and that’s the real driver of increased sales?
Once you understand the impact that different features have on the model’s performance, you can make decisions about how to adjust your strategy and what levers you can pull to create the desired outcome.
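The temperature example can be sketched with a few lines of Python. The weekly figures below are made up so that sales depend only on the holiday season; the point is to show how a feature can look influential overall and yet have no relationship with sales once the real driver is held fixed.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly data: (avg_temperature_c, is_holiday_season, units_sold)
weeks = [
    (20, False, 100), (22, False, 102), (24, False, 102), (26, False, 100),
    (2, True, 200), (4, True, 202), (6, True, 202), (8, True, 200),
]

def temp_sales_correlation(rows):
    """Correlation between temperature and units sold for the given weeks."""
    return pearson([t for t, _, _ in rows], [s for _, _, s in rows])

# Across all weeks, colder weather appears to drive sales strongly.
overall = temp_sales_correlation(weeks)

# Within each season, the apparent temperature effect disappears.
in_season = temp_sales_correlation([w for w in weeks if w[1]])
off_season = temp_sales_correlation([w for w in weeks if not w[1]])
```

Splitting the data by a suspected driver is only a quick sanity check, but it is the kind of check that domain knowledge can prompt.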
4. Is the model underfitting or overfitting?
Since no machine learning model is perfect, creating the best possible fit is often an exercise in balancing underfitting against overfitting.
An algorithm that’s underfit (also referred to as having high bias) will perform about as well in real life as it did against training data, but in both cases the error rate will be fairly high. Often the model is too simple to account for variations in the input data.
An algorithm that’s overfit (also referred to as having high variance) performs very well on training data, but has a higher error rate on real-life data. The model may be too attuned to patterns in the data it was trained on that don’t generalize well to real situations.
In general, dialing down bias will increase variance, and vice versa. That’s why the quest to minimize both values is an exercise in balance. As an executive, it’s important that you understand the tradeoffs between underfitting and overfitting in the context of the decisions your forecast is being used to make. For instance, if the goal of the forecast is to predict sales of a newly launched product, you might accept a higher bias in your model, since there isn’t a lot of historical data to work from.
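One way to see the underfit and overfit patterns is to compare error on training data with error on data the model has not seen. The sketch below uses a simple nearest-neighbor averaging model chosen purely for illustration (it is not a suggestion for how forecasts should be built), with invented data: a steady upward trend plus alternating noise in the training weeks. For simplicity, the held-out points are the noise-free trend.

```python
from statistics import mean

def knn_predict(train, x, k):
    """Average the y values of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return mean(y for _, y in nearest)

def mean_abs_error(train, points, k):
    """Average absolute error of the k-neighbor model on the given points."""
    return mean(abs(y - knn_predict(train, x, k)) for x, y in points)

# Hypothetical training data: trend of 10 units per week plus alternating noise.
noise = [8, -8] * 5
train = [(week, 10 * week + noise[week]) for week in range(10)]

# Held-out points between the training weeks, following the underlying trend.
holdout = [(week + 0.5, 10 * (week + 0.5)) for week in range(9)]

# k=1 memorizes the training data; k=len(train) always predicts the overall mean.
results = {}
for label, k in [("overfit", 1), ("balanced", 2), ("underfit", len(train))]:
    results[label] = (
        mean_abs_error(train, train, k),    # error on training data
        mean_abs_error(train, holdout, k),  # error on unseen data
    )
```

In this toy setup, the overfit setting has no error on training data but a noticeable error on held-out data, while the underfit setting shows similarly high error on both.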
5. What is the quality of our training data?
Remember the common saying “garbage in, garbage out”? In data science, it refers to the fact that an algorithm can only be as good as the data it was trained on, which is why it’s important to understand the quality of that data. Some questions you can ask to better understand training data quality are:
- How granular is the training data? If you don’t have sufficiently granular training data, then your model will likely be underfit, since it hasn’t considered realistically complex data.
- How well-labeled is the data? Any incorrectly labeled data will “teach” the model to make incorrect associations. For instance, if promotions ran during the training period but weren’t flagged as such in the data, the model may predict sales spikes at seemingly random intervals.
- Is the data balanced? If your data is overwhelmingly from a certain type of product (e.g., seasonally-influenced), sales channel or region, your model may overfit to that and produce results that don’t make sense in a broader context.
- Is the necessary metadata included? If stores have closed or products have been phased out, but the model isn’t provided that information, it may overfit to irrelevant information that happens to correlate with these changes, or underfit by ignoring them entirely.
The goal is for training data to reflect real life as closely as possible, so your model is set up to make solid forecasts. When you’re considering a model’s output, it’s important to understand how the quality and availability of training data may have impacted performance.
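A couple of these questions can be turned into quick checks. The sketch below measures how balanced a hypothetical training set is across regions and how often the promotion label is missing. The field names and the 70% threshold are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical training rows; None marks a missing promotion label.
rows = [
    {"region": "north", "on_promo": True},
    {"region": "north", "on_promo": None},
    {"region": "north", "on_promo": False},
    {"region": "north", "on_promo": False},
    {"region": "south", "on_promo": None},
]

def category_shares(rows, field):
    """Fraction of rows that fall into each value of the given field."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def missing_rate(rows, field):
    """Fraction of rows where the given field has no value."""
    return sum(row[field] is None for row in rows) / len(rows)

region_shares = category_shares(rows, "region")

# Flag any region supplying more than 70% of the rows (arbitrary threshold).
dominant_regions = [r for r, share in region_shares.items() if share > 0.7]

promo_missing = missing_rate(rows, "on_promo")
```

Checks like these will not fix the data, but they give you specific numbers to discuss with your data science team.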
• • •
Knowledge is power
Forecasting models can involve a lot of complex data science, but by asking the right questions, executives can gain a better understanding of how the models work. This understanding empowers you to make decisions based on your forecasts with greater confidence. It also provides the opportunity down the road to collaborate more closely with your data science team on future forecast development.
In the coming weeks, we’ll continue to explore different aspects of the forecasting process in detail, such as an evaluation of which forecast models work best in different situations. In the meantime, make sure to familiarize yourself with all three key principles of modern demand forecasting, discussed in our recent white paper.