Exposé, an Australia-based data analytics company, published a use case in which they analyze the benefits of a custom-made machine learning solution. The only piece of data in their report [PDF] was a graph that shows the observed and the predicted values of the series.
Graphs like this one provide an easy-to-digest overview of the data but are meaningless with respect to our ability to judge model accuracy. When predicting the values of a time series, it is customary to use all the available data to predict the next step. In such cases, “predicting” that the next value will equal the last available one results in an impressive correlation. Below, for example, is my “prediction” of the Apple stock price. In my model, I “predict” tomorrow’s price to be equal to today’s closing price plus random noise.
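For the curious, here is a minimal sketch of that pseudo-model in Python. I substitute a synthetic random walk for the real AAPL closing prices, since nothing in the argument depends on the particular series:

```python
import numpy as np

rng = np.random.default_rng(42)

# A synthetic random walk stands in for the actual closing prices;
# the argument does not depend on the particular series.
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))

# The "model": tomorrow's price equals today's close plus random noise.
predicted = prices[:-1] + rng.normal(0, 1, size=len(prices) - 1)
actual = prices[1:]

# The naive lag-1 "prediction" tracks the series almost perfectly.
print(np.corrcoef(actual, predicted)[0, 1])  # ~0.99
```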
Look how impressive my prediction is!
I’m not saying that Exposé constructed a nonsense model. I have no idea what their model is. I do say, however, that their communication is meaningless. In many time series, such as consumption dynamics, stock prices, etc., each value is a function of the previous ones. Thus, the “null hypothesis” of any modeling attempt should be that of a random walk, which means that we should compare not the actual values but the changes. And if we do that, we will see the real nature of the model. Below is such a graph for my pseudo-model (zoomed in on the last 20 points).
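In code, with the same synthetic setup as above, comparing the day-to-day changes instead of the raw values tells a very different story:

```python
import numpy as np

rng = np.random.default_rng(42)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
predicted = prices[:-1] + rng.normal(0, 1, size=len(prices) - 1)
actual = prices[1:]

# Raw values: the lag-1 "prediction" looks spectacular.
print(np.corrcoef(actual, predicted)[0, 1])                    # ~0.99

# Day-to-day changes: the "model" knows nothing about the next move.
print(np.corrcoef(np.diff(actual), np.diff(predicted))[0, 1])  # ~0.0
```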
Suddenly, my bluff is evident.
To sum up, a direct comparison of observed and predicted time series can only be used as a starting point for a more detailed analysis. Without such an analysis, this comparison is nothing but a meaningless illustration.
Hi Boris,
I noted your comments re our published case study.
The case study and the included graph of model performance are intended as a high-level communications piece aimed at business users. The graph demonstrates that the model captures maxima and minima of approximately the same magnitude at the same points as the true series, important properties if the model were used for business planning. A differenced series would be more difficult for the intended audience to interpret and would remove the ability to compare magnitudes.
On the NULL model – Even without differencing, our model may be quickly distinguished from a null martingale model by noting that turning points occur at the same time as actual turning points, a property that does not occur in the null model.
> On the NULL model – Even without differencing, our model may be quickly distinguished from a null martingale model by noting that turning points occur at the same time as actual turning points, a property that does not occur in the null model.
This might be the case. I can only judge using the information that I have, and what you say is not *evident* from the graph, because the granularity isn’t clear. Look at how, in my first graph, the scale and the resolution are such that you can’t see the obvious lag between the two curves. Without more information, even my graph looks as if the turning points of the two curves occur together.
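To illustrate, here is one way to expose such a lag numerically (using the same synthetic pseudo-model as in the post), something the zoomed-out graph hides: cross-correlate the changes at several shifts.

```python
import numpy as np

rng = np.random.default_rng(42)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
predicted = prices[:-1] + rng.normal(0, 1, size=len(prices) - 1)
actual = prices[1:]

# Cross-correlate the changes at several lags: the correlation peaks
# at lag 1, exposing the "prediction" as a shifted copy of the series.
a, p = np.diff(actual), np.diff(predicted)
for lag in range(4):
    r = np.corrcoef(a[: len(a) - lag], p[lag:])[0, 1]
    print(f"lag {lag}: r = {r:.2f}")  # peak at lag 1, ~0 elsewhere
```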
As I said,
> I’m not saying that Exposé constructed a nonsense model. I have no idea what their model is.
It’s not a scientific paper; you are free to use any graph you want. My point is that the graph you included in your publication has no actual meaning without more details, which the “high-level communications piece” report didn’t provide. I will be the first to admit how hard it is to compose a high-level document that is both numerically accurate and not intimidating or too technical. Certainly, finding mistakes in the work of others is easier than producing flawless work of one’s own. However, writing (and reading) criticism, even when it’s nitpicking, is a useful training and educational tool.