1. Understand marginal distributions of return and risk among stocks
Sand Storm! 2010, a Blu-ray animation from DiligentInvestor, has three acts. Act I, Scene 1 is Marginal Distributions Remain Nearly Normal.
By Marginal Distributions* we mean the sketches at right and bottom showing the number of stocks having a value near that point on each axis. The blocky histograms are the counts in uniformly spaced cells (measurements). The smooth curves are the best-fit Gaussian distributions (theory). When the histogram and the curve almost match, we say that the actual (but unknown) distribution that underlies the histogram is likely to be nearly normal. Watch out for two kinds of departure from theory: symmetric kurtosis, where the histogram has a much flatter top or a much more peaked top than the curve, and asymmetric skew, where the histogram bulges out on one side of the mean. These occur when things are not normal, as noted below.
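The comparison between a blocky histogram and its smooth Gaussian curve can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and the "best fit" is taken to be the Gaussian with the sample mean and standard deviation, which may differ from the fitting method actually used for the animation.

```python
import math
import random
import statistics

def marginal_histogram(values, n_cells=12):
    """Count values in uniformly spaced cells spanning the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_cells
    counts = [0] * n_cells
    for v in values:
        # Clamp the index so the maximum value lands in the last cell.
        i = min(int((v - lo) / width), n_cells - 1)
        counts[i] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_cells)]
    return centers, counts, width

def gaussian_expected_counts(values, centers, width):
    """Expected count per cell for a Gaussian with the sample mean and std dev."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    n = len(values)
    norm = sigma * math.sqrt(2.0 * math.pi)
    return [n * width * math.exp(-0.5 * ((c - mu) / sigma) ** 2) / norm
            for c in centers]

# Synthetic stand-in for one axis of the scatterplot (not real stock data).
rng = random.Random(2010)
sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]

centers, counts, width = marginal_histogram(sample)
expected = gaussian_expected_counts(sample, centers, width)
for c, obs, exp in zip(centers, counts, expected):
    print(f"cell at {c:6.2f}: observed {obs:4d}, gaussian {exp:7.1f}")
```

When the two columns track each other closely, the underlying distribution is plausibly near normal; large gaps at the peak or on one side are the kurtosis and skew departures described above.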
The coordinate space where we describe stock behavior is probably the most important feature of the DiligentInvestor econometric model. In this normal space, the statistical mean and standard deviation actually work the way they are defined for a normal (Gaussian) distribution; in non-normal spaces, they do not. Home prices are an example of a non-normal space: it is better to speak of a median price for a home than an average price, because the average (arithmetic mean) is not a good estimate of the expected value when prices do not follow a normal distribution. Neither do the raw return and risk among stocks, hence the transformations of the raw data into qReturn and pRisk here. Linear operators work better in this normal space.
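A small numerical experiment shows why the mean misleads in a non-normal space and behaves well after a suitable transformation. The sketch below uses synthetic log-normal "prices" and a plain logarithm as the transformation; both are assumptions for illustration, and the actual qReturn and pRisk transformations are not reproduced here.

```python
import math
import random
import statistics

rng = random.Random(7)

# Synthetic log-normal values: the logarithm of each one is Gaussian.
raw = [math.exp(rng.gauss(12.0, 0.8)) for _ in range(5000)]
logged = [math.log(x) for x in raw]

raw_mean, raw_median = statistics.fmean(raw), statistics.median(raw)
log_mean, log_median = statistics.fmean(logged), statistics.median(logged)

# In the raw space the long upper tail pulls the mean well above the median.
print(f"raw space: mean {raw_mean:12.0f}   median {raw_median:12.0f}")
# In the transformed space the mean and the median nearly coincide.
print(f"log space: mean {log_mean:12.3f}   median {log_median:12.3f}")
```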
Application of this concept to stocks first appeared in the Screener as described in Scatterplot Chart, particularly in its Distribution section and the companion historical Distribution Chart, which illustrated marginal histograms and fitted Gaussian distributions for a dataset collected at the end of May 2007. At that time, there was little evidence that these two coordinate transformations, into qReturn vertically and pRisk horizontally, would survive different market conditions. One purpose of the Sand Storm! 2010 animation is to show you positive supporting evidence collected since that time, extending across the most dramatic change in market conditions in recent memory. A graphics animation showing the methods by which we find Return and Risk for the Selected Stock appears in Act III.
The format changes a little bit from that historical chart to the Blu-ray graphics described here, as shown in this Distribution Snapshot illustrating a frame from Scene 1 of the animation. Here we do not distinguish between cyclical or seasonal stocks and non-cyclicals, so each stock appears only as a black dot. Where a stock has a Recent Close above its significance envelope, a green background appears; where it is below its envelope, the background is red, as in the buy low - sell high discussion. New to this format, an orange background appears briefly behind a stock with a return that was going up and now is going down, or a blue background if its return was going down and now is going up. The relative counts of stocks with these conditions are shown by the small bar charts at the lower left of the plot. Additional minor differences from the Screener include modifying the coordinate system to apply throughout the entire duration of the animation without losing too many stocks at the edges.
The animation lets you see how the entire market and your Selected Stock change with time. The two axes have identical scales, so a change of position over time is a velocity. The position coordinates in normal space start at the origin, shown by the intersection of the central horizontal and vertical lines at the mean values for each axis. Normal space has a metric, shown by the neighboring lines one standard deviation away from the means. Since a standard deviation means the same thing and has the same size in both coordinates, we can speak of a stock as moving at a speed reported as standard deviations per year, in any direction. The market itself is characterized by the average velocity of all the stocks at a given date, shown by the length of the arrow in the upper left corner. It is blue when average returns are going up, orange when returns are going down. The arrow points toward the direction of the average change in qReturn and pRisk, like a tidal current marker on a nautical chart, at some angle with respect to North and East.
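The velocity idea can be made concrete with a short sketch. Everything here is illustrative: the tickers and positions are hypothetical, positions are assumed to be already expressed in standard-deviation units as (pRisk, qReturn) pairs, East is taken as increasing pRisk and North as increasing qReturn, and an average qReturn change of exactly zero is arbitrarily colored orange.

```python
import math

def velocity(p0, p1, years):
    """Velocity between two (pRisk, qReturn) positions, in std devs per year."""
    return ((p1[0] - p0[0]) / years, (p1[1] - p0[1]) / years)

def average_velocity(before, after, years):
    """Mean velocity over all stocks present at both dates."""
    vs = [velocity(before[s], after[s], years) for s in before if s in after]
    n = len(vs)
    return (sum(v[0] for v in vs) / n, sum(v[1] for v in vs) / n)

# Hypothetical positions one week apart (not real stock data).
week = 1.0 / 52.0
before = {"AAA": (0.00, 0.00), "BBB": (1.00, -0.50), "CCC": (-0.50, 0.80)}
after = {"AAA": (0.02, 0.01), "BBB": (1.01, -0.47), "CCC": (-0.50, 0.82)}

vx, vy = average_velocity(before, after, week)
speed = math.hypot(vx, vy)                  # arrow length, std devs per year
heading = math.degrees(math.atan2(vy, vx))  # degrees counter-clockwise from East
colour = "blue" if vy > 0 else "orange"     # average returns rising or falling

print(f"speed {speed:.2f} sd/yr, heading {heading:.0f} deg from East, {colour}")
```

Because both axes share one metric, the Euclidean length from `math.hypot` is a meaningful speed; with unequal scales the same calculation would mix incompatible units.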
This format also has a year hand just below the global average velocity arrow, rotating counter-clockwise (astronomical convention). Above it is a tic-mark at January first, so the snapshot shows the end of the first week in January 2007. To see an example clip of this Act I, Scene 1 from the Blu-ray animation, play the Act I, Scene 1 MPEG-2 file on a high resolution monitor. The rest of the page you are reading now digs deeper into the graphics illustrating the statistical analysis of these normal distributions.
Mean
Return: The central horizontal line is the origin at the global mean value of qReturn, measured for all the stocks over all the trading dates in the dataset. The short horizontal line through the peak of the smooth Gaussian curve on the right side of the scatterplot chart marks the local mean value of qReturn, measured for all stocks at the date indicated by the clock. By comparing the locations of the two means, local and global, you can see how much the market has drifted away from its typical distribution at this date.
Risk: The central vertical line is the origin at the global mean value of pRisk, measured for all the stocks over all the trading dates in the dataset. The short vertical line through the peak of the smooth Gaussian curve on the bottom of the scatterplot chart marks the local mean value of pRisk, measured for all stocks at the date indicated by the clock. As with qReturn, you can see the drift at this date. For these drifts, the best unit of measurement in normal space is the standard deviation. The local mean rarely moves more than one standard deviation away from the global mean, for both variables.
Standard Deviation
The global standard deviations for all data are represented by the outer pairs of lines, plus and minus one global standard deviation away from the central mean lines crossing at the origin. The local standard deviations for the indicated date are shown as yellow bars extending symmetrically about the local means. On a date when the global and local mean coincide, you can easily see whether the local standard deviation is different from the global standard deviation on that date. Part of the purpose of building these transformations from Return to qReturn and Risk to pRisk is to make sure that these local and global standard deviations in each variable remain nearly the same over time, the mark of a good metric. The local standard deviations are particularly useful in connection with the kurtosis.
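The local-versus-global comparison for both the mean and the standard deviation can be sketched as follows. The dataset is hypothetical, the function name is made up for this example, and the population standard deviation is an assumption; the source does not say which estimator is used.

```python
import statistics

def local_vs_global(all_values, values_at_date):
    """Return (mean drift in global std devs, local/global std dev ratio)."""
    g_mean = statistics.fmean(all_values)
    g_sd = statistics.pstdev(all_values)
    l_mean = statistics.fmean(values_at_date)
    l_sd = statistics.pstdev(values_at_date)
    return (l_mean - g_mean) / g_sd, l_sd / g_sd

# Hypothetical qReturn values: three dates with four stocks each.
by_date = {
    "date 1": [-1.0, 0.0, 0.5, 1.0],
    "date 2": [-0.5, 0.5, 1.0, 1.5],
    "date 3": [-2.0, -1.0, 0.0, 0.5],
}
# "Global" means pooled over all stocks and all dates.
pooled = [v for vals in by_date.values() for v in vals]

results = {d: local_vs_global(pooled, vals) for d, vals in by_date.items()}
for d, (drift, ratio) in results.items():
    print(f"{d}: mean drift {drift:+.2f} sd, local/global sd ratio {ratio:.2f}")
```

A drift well inside one standard deviation and a ratio near 1.0 on every date are what a good metric, in the sense used above, would show.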
Kurtosis
The mean and standard deviation describe a normal distribution. Kurtosis and skew measure departure from a normal distribution. Kurtosis looks for symmetrical departure, and is shown by the bar nearer to the center of the chart than the yellow standard deviation bar, colored either blue or orange. For these animated scatterplot charts, the width of the kurtosis bar is made equal to the width of the local standard deviation bar when the kurtosis has the value appropriate to a normal distribution, 3.0. If the observed distribution has a sharper peak than a normal distribution, the kurtosis bar is narrower than the standard deviation bar, and colored blue. If the observed distribution is flatter than normal, less peaked, the kurtosis bar is wider than the standard deviation bar, and colored orange. Notice that the onset of the collapse in the market occurs when the qReturn kurtosis bar on the right changes from blue to orange, indicating a newly flattened distribution, that is, a change from leptokurtic to platykurtic qReturn.
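For readers who want to compute the quantity behind the kurtosis bar, here is a minimal sketch. It assumes the conventional moment definition (fourth central moment divided by the squared variance, which gives 3.0 for a normal distribution); the source does not state which estimator is used, the function names are hypothetical, and a value of exactly 3.0 is arbitrarily colored orange. The bar width is not modeled, because only its value at 3.0 is given in the text.

```python
import statistics

def kurtosis(values):
    """Moment kurtosis m4 / m2**2; a normal distribution gives 3.0."""
    mu = statistics.fmean(values)
    m2 = statistics.fmean([(v - mu) ** 2 for v in values])
    m4 = statistics.fmean([(v - mu) ** 4 for v in values])
    return m4 / (m2 * m2)

def kurtosis_bar_colour(k, normal=3.0):
    """Blue for a sharper peak than normal, orange for a flatter one."""
    return "blue" if k > normal else "orange"

# Two tiny hypothetical samples: one flat-topped, one sharply peaked.
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]
peaked = [-3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0]

for name, data in (("flat", flat), ("peaked", peaked)):
    k = kurtosis(data)
    print(f"{name}: kurtosis {k:.2f}, bar {kurtosis_bar_colour(k)}")
```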
Skew
The smallest bar in the marginal distribution figures, furthest from the edge of the chart, represents asymmetric departure from normal, or skew. It extends from the local mean line toward the densest part of the distribution, or mode, where the histogram shows the largest values. The color of the skew bar is green if the skew is positive (longer tail on the side of larger values) and red if it is negative (longer tail on the side of smaller values). You might note that qReturn shows both positive and negative skew over time, while pRisk has only negative skew so far. Also, qReturn shows kurtosis both flatter and more peaked than normal, while pRisk seems only to be more peaked: even the logarithm of raw risk does not completely eliminate the long tail toward low risk. Have investors sought too much safety in these low-risk stocks?
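The sign convention for the skew bar can be checked with a short sketch. It assumes the conventional moment definition of skewness (third central moment divided by the variance raised to the power 1.5); the source does not state the estimator, the function names and sample data are hypothetical, and a skew of exactly zero is arbitrarily colored red. The bar length is not modeled.

```python
import statistics

def skewness(values):
    """Moment skewness m3 / m2**1.5; a symmetric distribution gives 0."""
    mu = statistics.fmean(values)
    m2 = statistics.fmean([(v - mu) ** 2 for v in values])
    m3 = statistics.fmean([(v - mu) ** 3 for v in values])
    return m3 / m2 ** 1.5

def skew_bar_colour(s):
    """Green for a longer tail toward larger values, red toward smaller."""
    return "green" if s > 0 else "red"

# Hypothetical samples: a long tail toward large values, and its mirror image.
right_tail = [0.0, 0.0, 0.0, 1.0, 5.0]
left_tail = [-v for v in right_tail]

for name, data in (("right tail", right_tail), ("left tail", left_tail)):
    s = skewness(data)
    print(f"{name}: skew {s:+.2f}, bar {skew_bar_colour(s)}")
```

With a long tail toward larger values the mean sits above the mode, so a bar drawn from the mean toward the mode points toward smaller values even though the skew is positive.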
To summarize the technical aspects of the scatterplot chart, the DiligentInvestor econometric model provides a normal space that illustrates individual stocks in the context of most of the other stocks in the market. Observations during 164 weeks of extreme behavior show that the marginal distributions of qReturn and pRisk remain nearly normal:
| | Kurtosis min | Kurtosis mean | Kurtosis max | Skew min | Skew mean | Skew max |
| qReturn | 2.48 | 2.87 | 3.35 | -0.37 | -0.01 | 0.35 |
| pRisk | 2.98 | 3.70 | 4.39 | -0.31 | -0.21 | -0.05 |
If an even closer approach to normal behavior is needed, attention to risk is always worthwhile. Here the pRisk kurtosis suggests admixture of the present log-normal distribution in raw risk with another and longer-tailed distribution, even more peaked than the anti-log of the logistic distribution. Alternatively, one might pursue the idea that perhaps a few dozen of the stocks are somehow different from the others, and should be identified as responsible for the peculiar histogram bins, to be analyzed separately.
*Technical note: Distribution means slightly different things, depending on context. A distribution function or cumulative distribution function starts at negative infinity with value zero and ends at positive infinity with unit value. A frequency distribution or data density distribution starts near zero value where the data begin at the left or bottom, and ends where the data end near zero value at the right or the top. We always mean the latter, that is, a histogram if derived from data, or a density distribution function if derived from theory. Unlike all of the many other uses of the word "marginal" by economists, here it literally means "drawn in the margins of the page."
© Copyright 2010 DiligentInvestor