When it comes to demand forecasting, incorporating out-of-stock (OOS) data is a complex subject that we first addressed in a prior blog post. It covered the three primary methods for incorporating OOS when developing your demand forecast: exclusion, imputation, and flagging.
To recap, exclusion is simple and makes for a clean time series, but it leaves gaps that may lead to under-prediction. Imputation fills those gaps with an estimate of what demand would have been were there no out-of-stock, but requires a good understanding of true demand. Flagging acknowledges OOS as a special category, but functions most effectively when there are consistent causes.
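As a quick refresher, here is a minimal sketch of what the three methods do to the same week of data. The sales figures, the OOS days, and the use of a plain in-stock average as the imputed value are all illustrative assumptions, not a recommended estimator.

```python
from statistics import mean

# Hypothetical daily unit sales for one store/SKU.
# Days 3 and 4 (0-indexed) were out of stock, so observed sales collapse.
sales = [20, 22, 21, 0, 3, 23, 19]
oos = [False, False, False, True, True, False, False]

# Exclusion: drop the OOS days, which leaves gaps (None) in the series.
excluded = [None if is_oos else units for units, is_oos in zip(sales, oos)]

# Imputation: replace OOS days with an estimate of true demand.
# Assumption: the estimate is simply the mean of the in-stock days.
in_stock_mean = mean(units for units, is_oos in zip(sales, oos) if not is_oos)
imputed = [in_stock_mean if is_oos else units for units, is_oos in zip(sales, oos)]

# Flagging: keep the observed sales and attach a 0/1 indicator to each day.
flagged = [(units, int(is_oos)) for units, is_oos in zip(sales, oos)]
```

Exclusion shortens the usable history, imputation keeps it continuous, and flagging keeps the raw observations while leaving it to the model to interpret the indicator.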
Clearly, each method captures different nuances of out-of-stocks, and different methods are preferred in particular circumstances. Here, we’ll first consider how to choose the right one based on the characteristics of the OOS, including its cause, the product type, and the location type. The method you choose then has implications for the type of data you need and the demand forecasting models you can use, which we’ll discuss in the second half of the post.
What is the cause of the out-of-stock?
Let’s begin by considering causation. The reasons behind an out-of-stock may range widely, from insufficient coordination between manufacturers and retailers, to extraordinary and unexpected demand, or even poor replenishment models.
Consider some sample scenarios:
- Say you had a one-time service disruption that caused a single late delivery and led to an out-of-stock. In this case, it’s reasonable to use exclusion and omit this event when developing your demand forecast. It doesn’t provide much insight, as long as it doesn’t happen more than once or twice.
- On the other hand, if the out-of-stock happened because of a special scenario that could occur again in the future, such as selling out due to a promotional offer, it’s best to use flagging since it can indicate the root cause. You can then incorporate these exceptional situations into your sales data for greater forecasting accuracy and simulate how they will affect future demand planning.
- Now let’s say that, in contrast to our first example, shelves were empty because of recurrent late deliveries or another frequent cause. If demand is regular and predictable, the best method for forecasting is imputation. You can estimate what demand would have been using historical data from previous, similar periods when out-of-stocks did not occur, and avoid leaving gaps in the data.
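The three scenarios above can be read as a simple decision rule. The sketch below is one possible encoding: the function name, its parameters, and the "review" fallback for cases the scenarios do not cover are all hypothetical.

```python
def choose_oos_method(recurring: bool, event_driven: bool, regular_demand: bool) -> str:
    """Map the cause of an out-of-stock to a handling method.

    recurring:      the cause has happened, or is expected to happen, repeatedly
    event_driven:   the cause is an identifiable event, such as a promotion
    regular_demand: demand for the item is regular and predictable
    """
    if not recurring:
        # One-time disruption, e.g. a single late delivery.
        return "exclusion"
    if event_driven:
        # Repeatable special scenario whose root cause is worth recording.
        return "flagging"
    if regular_demand:
        # Frequent cause, but true demand can be estimated from history.
        return "imputation"
    # Not covered by the scenarios above (assumption): needs a closer look.
    return "review"


print(choose_oos_method(recurring=True, event_driven=True, regular_demand=False))
```

In practice the inputs are rarely clean booleans, so treat this as a way to structure the conversation rather than an automated rule.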
What types of products or locations were out-of-stock?
Also important is the fact that out-of-stocks lead to different customer reactions depending on the product type and channel characteristics, so these too must be considered.
For example, we know that 40% of shoppers will substitute one brand’s product for another when they encounter an out-of-stock in an online store, like Amazon. So in an instance with these particulars (online OOS; obvious substitutions), we would choose imputation as the best methodology as it’s easy to make an accurate estimate of what demand would have been by comparing data from the products being substituted.
In general, imputation is ideal for products with regular demand, where it’s easy to track downward or upward sales trends.
Exclusion is best for low-volume products such as new luxury items, or for high-volatility products like non-tradable food, where out-of-stocks are relatively infrequent. Low volume and high volatility make it difficult to estimate likely demand, and the occasional OOS offers little insight worth modeling.
Flagging is applicable when out-of-stocks happen intermittently because product or store demand is affected by occasional events, such as inclement weather or a local sporting event. With clear OOS drivers, the flags are useful for simulating the impact on future demand if a similar event recurs.
After you’ve determined which method to use (exclusion, imputation, or flagging), you need to ensure you have the right data and forecasting models in place to support it. Working in this order, rather than letting the data you have and your forecasting model dictate how you account for out-of-stocks, helps ensure your forecast reflects true demand as closely as possible. You’ll notice that, often, the more work you do up front to take out-of-stocks into account, the bigger the payoff in terms of what you can do with the result.
Imputation: the most flexibility
If you’ve determined you can use imputation, your options are pretty open. It only requires aggregate sales data as you can still make a good estimate of what true demand would be at all times, as long as you have sufficient historical data. More granular data can help you more accurately estimate true demand by comparing to similar locations and events, but is not mandatory.
In addition, imputation leaves you with a clean, continuous time series, so you have a lot of flexibility to apply different forecasting methodologies. Historical average, time series with added regressors, and machine learning techniques can all work.
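As a rough illustration of imputing from history, the sketch below replaces each OOS day with the average of in-stock days that fall on the same weekday. The function name, the data, and the same-weekday rule are assumptions chosen for simplicity. A real estimate might also draw on similar locations or events.

```python
from collections import defaultdict
from statistics import mean


def impute_by_weekday(sales, oos):
    """Replace OOS days with the mean of in-stock days in the same weekday slot.

    Assumption: the series is daily with no missing dates, so positions
    seven apart share a weekday (position i is slot i % 7).
    """
    in_stock = defaultdict(list)
    for i, (units, is_oos) in enumerate(zip(sales, oos)):
        if not is_oos:
            in_stock[i % 7].append(units)

    result = []
    for i, (units, is_oos) in enumerate(zip(sales, oos)):
        if is_oos and in_stock.get(i % 7):
            result.append(mean(in_stock[i % 7]))
        else:
            # In-stock days, and OOS days with no comparable history, stay as observed.
            result.append(units)
    return result


# Three hypothetical weeks of daily sales; index 12 was an out-of-stock day.
sales = [10, 12, 11, 13, 20, 30, 25,
         11, 13, 12, 14, 22, 0, 26,
         9, 11, 10, 12, 18, 32, 24]
oos = [False] * 21
oos[12] = True

imputed = impute_by_weekday(sales, oos)
```

The output has the same length as the input and no gaps, which is what keeps the full range of forecasting models available.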
Exclusion: the most restrictive
In contrast, because exclusion creates gaps in the data, you need to be very careful what you use it with. You want to be able to pinpoint specific out-of-stock events that are one-time anomalies to exclude, so you should use very granular store/SKU data at the daily level. If the data is at a higher level, you could be inadvertently excluding good data from your input.
Furthermore, some historical average models, like exponential smoothing, don’t behave well with data gaps and cannot be used with exclusion. Similarly, time series models typically rely on repeatability (e.g. seasonality, last week/last year comparisons) to identify patterns. If you exclude a lot of data, the reference periods those comparisons depend on may be missing, which limits the models you can use. Machine learning methods can still work, but generally the more data you feed into them the better, so exclusion is not ideal.
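To make the granularity point concrete, the sketch below compares excluding a single store/day record with excluding the whole day, which is effectively what happens when only aggregate data is available. The rows, store names, and dates are invented for illustration.

```python
# Hypothetical store/SKU records: (date, store, units_sold, was_out_of_stock).
rows = [
    ("2024-03-01", "A", 12, False),
    ("2024-03-01", "B", 0, True),
    ("2024-03-01", "C", 15, False),
    ("2024-03-02", "A", 11, False),
    ("2024-03-02", "B", 14, False),
    ("2024-03-02", "C", 16, False),
]

# Granular exclusion: drop only the store/day that was actually out of stock.
granular_kept = [row for row in rows if not row[3]]

# Aggregate exclusion: any date with an OOS anywhere is dropped entirely.
oos_dates = {row[0] for row in rows if row[3]}
aggregate_kept = [row for row in rows if row[0] not in oos_dates]

# Valid observations that were thrown away by working at the higher level.
good_rows_lost = len(granular_kept) - len(aggregate_kept)
print(good_rows_lost)
```

Here one out-of-stock at store B costs two perfectly good observations from stores A and C once the data is only available at the daily total.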
Flagging: somewhere in between
Like exclusion, flagging requires granular sales data so that you can add the right categorical indicator variable to each individual out-of-stock occurrence; otherwise, you have to know what fraction of locations or products were OOS. However, if you only have aggregate sales data, there may be ways to intelligently fill in the gaps and get to more granular data, which Alloy can help with.
In terms of forecasting models, flagging works well with both time series with added regressors and machine learning models. For example, seasonal ARIMAX models, which compare different historical time periods and take additional factors into account, can easily make use of the out-of-stock flag. Machine learning systems can also incorporate flagged data, though some of the flagged events could cause a downward bias in the overall estimate (in this case, combining flagging with imputation can help correct for it).
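To show how an out-of-stock flag enters a model as a regressor, here is a deliberately simple stand-in: ordinary least squares with a single 0/1 indicator, written in plain Python. It is not a seasonal ARIMAX model, and the function name and data are hypothetical. The point is only that the flag gets its own coefficient, so baseline demand is not dragged down by the OOS days.

```python
def fit_flag_regression(sales, flags):
    """Fit sales = intercept + slope * flag by ordinary least squares."""
    n = len(sales)
    mean_flag = sum(flags) / n
    mean_sales = sum(sales) / n
    sxx = sum((f - mean_flag) ** 2 for f in flags)
    sxy = sum((f - mean_flag) * (s - mean_sales) for f, s in zip(flags, sales))
    slope = sxy / sxx  # undefined if every flag is identical (sxx == 0)
    intercept = mean_sales - slope * mean_flag
    return intercept, slope


# Hypothetical daily sales with two flagged out-of-stock days.
sales = [20, 22, 21, 5, 23, 19, 4, 21]
flags = [0, 0, 0, 1, 0, 0, 1, 0]

baseline, oos_effect = fit_flag_regression(sales, flags)
print(round(baseline, 2), round(oos_effect, 2))
```

With one binary regressor, the intercept is the average of the unflagged days and the slope is the gap between flagged and unflagged averages. To simulate a future event, set the flag to 1 for the affected days and add the fitted effect to the baseline.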
IMPLICATIONS FOR CHOSEN METHOD
There’s a lot to consider when deciding how to account for out-of-stocks when preparing a demand forecast: the cause, the product, the location, the data requirements, and the implications for your forecasting model. You can also try hybrid options, as we’ve alluded to above, where you combine multiple methods to balance out the biases each one can create.
However, we believe the effort is critical to forecast accuracy. Out-of-stocks create a difference between true demand (what consumers would actually like to purchase) and what you observe at the point of sale. Thus, to use true demand for forecasting, you must ensure you’ve accounted for OOS in the data that you’re feeding into your forecasting model. Data preparation is the first step in a best-in-class forecasting process, and thus the foundation for everything that comes after.