Lies, Damned Lies, and Statistics

George Friedman
Editor, This Week in Geopolitics


Economics and finance are thought to be more predictable than other disciplines such as politics because they are quantifiable. This is debatable. The extent to which quantitative economic analysis is possible depends on the relationship between the numbers and reality. There is a relationship, but it is more tenuous than might be thought.

I am not talking about the possibility that economic statistics are manipulated for political reasons.

This is certainly the case in some countries, such as China, but less so, in my opinion, in Europe and the United States. But I want to set that possibility aside and examine the validity of the data assuming that everyone, everywhere, is honest.

Let’s get started with This Week in Geopolitics.

Gauging the Economy

Countries consist of millions of people who conduct trillions of economic transactions each year.

Countless products are produced, warehoused, sold, and consumed. Value is created and destroyed. The vastness of economic activity in even the smallest countries, let alone countries like the United States with over 300 million people, makes it impossible to count each economic transaction. And so it becomes necessary to find a more practical method of measurement.

Consider employment figures in the US. The country has some 160 million workers. Employment and unemployment are measured not by counting them all but by sampling: surveying a small group chosen to be representative, then generalizing from it to the whole. The Bureau of Labor Statistics maintains a list of 60,000 households it calls monthly to determine who is employed, who isn’t, who has gotten a job that month, who had their hours cut or increased, and so on. Each month one quarter of this sample group is replaced, and after eight months, those who were dropped return to the rotation. This is where employment and unemployment numbers come from.
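The idea of generalizing from a sample can be sketched in a few lines of Python. This is a toy simulation, not the Bureau's method: the 5% "true" rate is invented for illustration, and each respondent is treated as an independent individual rather than a member of a household. It shows only the ideal case, where the sample is perfectly representative and everyone answers honestly.

```python
import random

def estimate_rate(true_rate, sample_size, rng):
    """Survey `sample_size` simulated workers and return the share found unemployed."""
    unemployed = sum(1 for _ in range(sample_size) if rng.random() < true_rate)
    return unemployed / sample_size

TRUE_RATE = 0.05      # hypothetical "real" unemployment rate, invented for illustration
SAMPLE_SIZE = 60_000  # the sample size cited above, treated here as individuals

rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = [estimate_rate(TRUE_RATE, SAMPLE_SIZE, rng) for _ in range(5)]
for month, est in enumerate(estimates, start=1):
    print(f"month {month}: estimated rate = {est:.2%}")
```

Even in this idealized setting the estimate moves slightly from one draw to the next, although nothing in the simulated economy has changed.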

I don’t mean to demonize the Bureau of Labor Statistics. It employs only about 2,500 people who must select a representative sample of the population, contact them directly, and develop and apply statistical analysis to get final numbers. It is a huge task for so few people. And even if they do their jobs perfectly, their findings still may not be all that reliable.

This is in part because it’s unclear just how accurately the sample group represents the rest of the country. The group needs to correspond to the US labor market, but it’s very difficult to find 60,000 households that actually do. Compare this with, say, presidential polling, which uses a notably smaller sample (about 1,500–2,000 people). With few exceptions, those polled have a binary choice.

Tracking the changes is simple. Employment appears to be binary but it’s not. Vacation, sick leave, maternity leave, post-graduate education—all these affect the yes-or-no question of “Are you employed?”

Another reason to question the accuracy of employment statistics comes down to inputs. The samples used are built only with people who are prepared to share their economic data with the government.

Samples obviously exclude the kinds of economic activity and employment meant to evade taxes.

This raises an important question: How do we know the variables that make up a statistically valid household? Anyone who has done survey research knows that the creation of the model against which the sample is managed, alongside screening for dishonesty from the respondents, is a nightmare. There is a (sometimes true) belief that larger sample sizes yield more accurate results, so there is a tendency to create larger samples. A larger group providing information that must be shaped into a binary from a multivariate data set can create chaos, especially for a staff that obviously doesn’t have time for intense questioning.
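Why the belief in larger samples is only sometimes true can be illustrated with a small simulation. The sketch below rests on made-up assumptions: a true unemployment rate of 5%, and 2% of employed respondents being recorded as unemployed (for instance, because off-the-books work goes unreported). The function name and both rates are hypothetical.

```python
import random

def biased_survey(true_rate, misreport_rate, sample_size, rng):
    """Estimate an unemployment rate when some employed respondents
    are recorded as unemployed."""
    counted_unemployed = 0
    for _ in range(sample_size):
        unemployed = rng.random() < true_rate
        if not unemployed and rng.random() < misreport_rate:
            unemployed = True  # employed, but the survey records otherwise
        counted_unemployed += unemployed
    return counted_unemployed / sample_size

TRUE_RATE = 0.05       # hypothetical true rate
MISREPORT_RATE = 0.02  # hypothetical share of employed people miscounted

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {n: biased_survey(TRUE_RATE, MISREPORT_RATE, n, rng)
           for n in (1_000, 10_000, 100_000)}
for n, est in results.items():
    print(f"sample of {n:>7}: estimate = {est:.2%} (true rate {TRUE_RATE:.0%})")
```

As the sample grows, the estimates settle down, but they settle near 6.9% (0.05 + 0.95 × 0.02) rather than the true 5%. A larger sample reduces random noise; it does nothing about a systematic distortion in the answers.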

The Moment of Truth

One way to adjust for this is to compare the sample with the real-world outcome. In political polling, there is the moment of truth for pollsters: the election. But for statisticians calculating employment figures, there is no such moment when the truth is revealed and the method adjusted. Political pollsters generate numbers but they also generate a margin of error, between 2 and 5 percentage points. A three-point margin of error creates a six-point range. A seven-point lead might mean a lead of somewhere between 4 and 10 points.
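The arithmetic here can be checked with a short sketch. It uses the standard textbook formula for the 95% margin of error of a share from a simple random sample, which is an assumption on my part rather than something pollsters are quoted as using above; real polls apply weighting and other adjustments. The function names are illustrative.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for a share estimated
    from a simple random sample (worst case at share = 0.5)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def lead_range(lead_points, moe_points):
    """The arithmetic used above: the lead plus or minus the margin of error."""
    return lead_points - moe_points, lead_points + moe_points

for n in (1_500, 2_000):
    print(f"n = {n}: margin of error ~ {margin_of_error(n):.1%}")

low, high = lead_range(7, 3)
print(f"a 7-point lead with a 3-point margin spans {low} to {high} points")
```

For samples of 1,500 to 2,000 people the formula gives a margin a little over two points, the low end of the range cited above. Strictly speaking, the uncertainty on the gap between two candidates is larger than the margin on either candidate's share, so the 4-to-10 range is, if anything, optimistic.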

Employment figures are never published with a margin of error. This is not because statisticians at the Bureau of Labor Statistics believe they are spot on but because they don’t know what the margin of error is. They have never had a moment of truth that reveals the relationship between the sample and the whole, and because they cannot measure error rates empirically, they cannot project them. The errors are there, but we don’t know what they are. The results are presented without mention of uncertainty, giving the impression that the error range doesn’t exist when in fact it’s simply unknown. But we know that the larger and more complex the sample and the more complex the question (however seemingly simple), the greater the range.

Unemployment is relatively simple, especially compared to measuring gross domestic product.

GDP is staggeringly difficult, and it draws from numerous sources to try to aggregate economic productivity for the entire United States. In all economic statistics, there is an inherent margin of error, which makes economic forecasting difficult. It is not impossible, and correctives can be undertaken, but the fact is that no nation, even a totally honest one, knows precisely how the economy is doing. The best measure is to ask a small-business man how it is doing, and he will say that it is steady, up a little, rocking, weak, or disastrous. Ask a few thousand and you will get a sense of the economy’s status, and a sense is the best we can do.

It reminds me of the old joke: What is the definition of a lie in economics? A decimal point. I may have made up that joke but I can’t remember.
