2. The ideology behind this phrase
Numbers and formulas are supposed to represent objective scientific data that cannot be denied and that have been examined by intelligent and experienced experts. Now, the complete liar wants his forgeries to look undeniably scientific, so why not use the magic of numbers that the not-so-math-literate masses could never deny? They say that statistics don't lie, and while that may be true, liars do use statistics. So it is with much that you read and hear. Averages and relationships and trends and graphs are not always what they seem. There may be more in them than meets the eye, and there may be a good deal less.
The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify. Statistical methods and statistical terms are necessary in reporting the mass data of social and economic trends, business conditions, opinion polls, the census. But without writers who use the words with honesty and understanding and readers who know what they mean, the result can only be semantic nonsense.
3. Use of this phrase in various places:-
In advertising, politics and other forms of propaganda:
This phrase covers all instances of Artistic License Statistics, where statistics are used to deceive people as to the truth. The problem is that people do not pay attention to the context, just the numbers. For example, the statement "Brand X is 84% fat-free" sounds good until you realize that this means the food product is 16% fat by weight. Also, "fastest growing" could mean that there used to be one customer and then there were five more, making a five-hundred percent increase. You should also notice Absolute Comparatives: it's "fastest growing", but compared to when, or to what? The whole business of throwing percentages at people in advertising, politics and other forms of propaganda is almost destined for this kind of abuse. Absolute measures are more likely to be understood accurately, and thus are less likely to be used in advertising.
In causal links between things which are not related:
Bogus uses of statistics are intended to imply a causal link between two elements when they are not linked, the link is questionable, or the link is opposite to what is implied. A beautiful example? "Coca-Cola causes drowning." By looking at statistics on drowning and Coca-Cola sales, you can see a link: more people go swimming on hot days, and more people buy Coke on hot days. Likewise, birth rates per head of population are higher in areas where there are more storks, because birth rates are always higher in rural areas, which is where one finds the Delivery Stork.
Correlation does not equal causation; if it did, then we might also conclude that global warming is caused by a decline in the pirate population and that 100% of Homo sapiens who consume dihydrogen monoxide will cease vital functions and decompose. Also be aware of the Law of Very Large Numbers: any fraction of a very large number is likely to be a large number, no matter how small the fraction is. It is estimated that 2,135,000 Americans have used cocaine (including crack) in the past month. But that's only 0.7% of the population! So, is this a lot of people, or not?
In making things more remarkable than they really are:
You should also be on the lookout for the related effect where things are made more remarkable than they really are. The odds that any given ticket will win a raffle may be very small, but it is certain that one will be a winner. You'd notice being dealt a royal flush in spades at poker, but the odds of it happening are exactly the same as those for being dealt any other hand of five specified cards.
You can prove anything using statistics:
Statistics are like studies: who made them and who paid for them matters a lot. Want to prove that video games cause violence? Get a group of scientists who are already on board with this and don't mind the lack of ethics. Have them draw from a very small pool of test subjects who are known to display violent behavior: mental hospitals, prisons, schools for children with behavior disorders, what have you. Do some generic tests that are guaranteed to show up positive, come up with numbers, and presto, instant headline: "Recent test shows 77% of subjects become more violent after playing Mortal Kombat." Most people won't bother reading the article the whole way through and will just look at the headline. This works with anything from comic books and rock music to watching Brokeback Mountain or voting for specific parties, basically anything.
4. There are a lot of examples which attest to this phrase and prove this statement right
There are examples from nearly every field in which statistics have been used to draw false or incomplete conclusions. Some of them are:-
1. On a historical note:-
Something of a historical subversion: During World War II, the Royal Air Force wanted to add more armor to their planes, but because of weight limits they needed to know which places needed the armor most. So they examined the planes after they came back and counted how often bullet holes were found in certain areas... and then placed armor in the places that showed the fewest bullet holes. They reasoned that any place that did have bullet holes was a place where a plane could be hit and still fly. This was supported by the fact that no plane that came back had holes where the gas tank was, because planes whose tank was hit would explode and not come back.
2. Ridiculous Conclusions:-
It's a bit like the statistics on shark shows. You are more likely to die on the toilet than be eaten by a shark. When you compare how much time you spend around sharks versus how much time you spend around toilets... really, the toilet has time to plan out its move in advance. Same deal with most accidents occurring in the home. Considering that you spend the majority of your time in your home, this should come as no surprise to anyone.
The same goes for the claim that most vehicular accidents occur near the home (some say within 25 miles of your home). This is because most people do most of their driving near their homes, not because the home or the surrounding area is more dangerous than areas distant from the home.
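The point about driving near home can be checked with a little arithmetic. The sketch below uses made-up accident counts and mileages (they are assumptions for illustration, not real data) to show how most accidents can happen near home even when each mile driven near home is safer than a mile driven far away.

```python
# Hypothetical figures, invented for illustration only (not real data):
# accidents and miles driven, split by distance from home.
DRIVING = {
    "near home": {"accidents": 80, "miles": 8_000_000},
    "far from home": {"accidents": 20, "miles": 1_000_000},
}


def share_of_accidents(zone):
    """Fraction of all accidents that happened in the given zone."""
    total = sum(entry["accidents"] for entry in DRIVING.values())
    return DRIVING[zone]["accidents"] / total


def accidents_per_million_miles(zone):
    """Accident rate adjusted for how much driving is done in the zone."""
    entry = DRIVING[zone]
    return entry["accidents"] / entry["miles"] * 1_000_000


if __name__ == "__main__":
    for zone in DRIVING:
        print(zone, share_of_accidents(zone), accidents_per_million_miles(zone))
```

With these assumed numbers, four out of five accidents happen near home, yet the rate per mile far from home is twice as high. The headline share says where people drive, not where driving is dangerous.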
3. For Doctors:-
"Nine out of Ten Doctors Agree" that the phrase "Nine out of Ten Doctors Agree" has been practically a stock phrase in advertising since the early 20th century. "Nine out of ten dentists recommend Trident for their patients who chew gum." The tenth dentist was insistent that his patients never chew gum at all, but, unsurprisingly, Trident didn't want you to know about that.
One interesting case happened in Portugal, where two ads were being broadcast on national TV during the same period (and sometimes even in the same commercial break) claiming, respectively, that "90% of dentists use toothpaste X" and "8 out of 10 dentists recommend toothpaste Y to their family". Together, if you stop to think about it, they imply something is not quite right about those professionals' concern for their own families... or that an awful lot of dentists are unmarried orphans, and hence can't recommend it to a family they haven't got.
4. Ad Campaign
In Montreal, there was an ad campaign run by a gum company whose gum came in round shapes instead of the usual square shapes. The ad said, "100% of people who chew square gum die."
5. Casinos:-
Many casinos like to advertise their slot machines with lines like "Up To 99% Payout!" to make it sound like the player has a good chance to win. First, "up to" means the payout could be 1% for all you know (although laws usually set a minimum). Second, even a 99% payout means that for every $100 you put in the machine, on average, you'll get $99 back, i.e. you still lose. That 99% payout is also an average that is based on something like one million pulls (plays) on the machine. If you play 100 times on one slot machine, you're not getting a representative sample of that average.
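To see why 100 plays are not a representative sample, here is a minimal simulation sketch. The machine is entirely hypothetical: it is assumed to pay $99 on a $1 play with probability 1/100 and nothing otherwise, which gives a 99% long-run payout. Real machines have far more complicated pay tables, so treat this only as an illustration of the averaging argument.

```python
import random

# Hypothetical machine, invented for illustration: each $1 play pays $99
# with probability 1/100 and nothing otherwise.
BET = 1.0
PRIZE = 99.0
WIN_PROBABILITY = 0.01


def expected_payout_fraction():
    """Long-run fraction of the money wagered that the machine pays back."""
    return PRIZE * WIN_PROBABILITY / BET


def simulated_payout_fraction(plays, rng):
    """Fraction of the money wagered that comes back over a number of plays."""
    wins = sum(1 for _ in range(plays) if rng.random() < WIN_PROBABILITY)
    return wins * PRIZE / (plays * BET)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the run is repeatable
    print("advertised payout:", expected_payout_fraction())
    print("one million plays:", simulated_payout_fraction(1_000_000, rng))
    for _ in range(5):
        print("session of 100 plays:", simulated_payout_fraction(100, rng))
```

Under these assumptions the million-play figure sits close to 0.99, while a 100-play session can only return multiples of 0.99 of the stake: often nothing at all, sometimes twice what was put in. The advertised average describes the machine, not your evening.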
6. Programs On TV
Programs on Animal Planet are fond of citing how Americans spend more money annually on cat or dog food than on baby food. This is depicted as evidence that Americans pamper their pets like babies, but overlooks several facts: that pets eat pet food for their entire lives, whereas babies only eat baby food for about a year and a half, and that many families have more than one pet at a time, but relatively few have more than one child of an age to eat baby food at the same time.
7. Car Insurance
Ever wonder how all car insurance companies manage to advertise that people who switch to them from a competitor save an average of some impressive sum? It's because the sample population, people who switch, is almost entirely composed of people who are going to save a big chunk of money by doing so, or else why would they bother to switch? Since no record is kept of the percentage of people who would not save any money and therefore don't switch, the cited statistic has almost no meaning.
8. A Paradox:-
Simpson's Paradox is when data shows one trend, but dividing it into categories shows the opposite trend. In the classic hospital example, hospital 1 has a higher overall death rate, but if the patients are split into categories based on severity of injury, it has a lower death rate in each category. The same goes for good doctors and bad doctors, as told in the book SuperFreakonomics. Good doctors are generally given tougher cases while bad doctors are given easier cases.
However, if you look at death rates, you see that some doctors have higher death rates, and these are usually the good doctors. Patients with serious cases are more likely to die, so good doctors lose more of their patients than, say, the doctor who cures hiccups. The lesson is that you can be fairly certain that the doctor you receive at a hospital is competent enough to be assigned to you.
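The hospital version of the paradox is easy to reproduce with a small table. The counts below are invented for illustration (they are assumptions, not data from any real hospital); the sketch simply computes death rates within each severity category and overall.

```python
# Hypothetical counts, invented for illustration: (deaths, patients)
# for each hospital and each severity category.
CASES = {
    "hospital 1": {"mild": (1, 100), "severe": (90, 900)},
    "hospital 2": {"mild": (18, 900), "severe": (15, 100)},
}


def category_rate(hospital, severity):
    """Death rate within one severity category of one hospital."""
    deaths, patients = CASES[hospital][severity]
    return deaths / patients


def overall_rate(hospital):
    """Death rate of a hospital with all categories pooled together."""
    deaths = sum(d for d, _ in CASES[hospital].values())
    patients = sum(p for _, p in CASES[hospital].values())
    return deaths / patients


if __name__ == "__main__":
    for hospital in CASES:
        print(hospital,
              "mild:", category_rate(hospital, "mild"),
              "severe:", category_rate(hospital, "severe"),
              "overall:", overall_rate(hospital))
```

With these assumed counts, hospital 1 has the lower death rate among mild cases (1% against 2%) and among severe cases (10% against 15%), yet the higher rate overall (9.1% against 3.3%), purely because it takes on far more severe cases.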
5. How to ensure that we do not lie with the help of statistics
There are a number of issues which need to be addressed in order to ensure that errors, both intentional and unintentional, associated with the interpretation of statistics are minimized.
1. Sampling Biases
Response Bias: Tendency for people to over- or under-state the truth
Non-response: People who complete surveys are systematically different from those who fail to respond. Accessibility/Pride.
Representative Sample: One where all sources of bias have been removed. (Literary Digest)
Questionnaire wording/Interviewer effects
Recall Bias: Tendency for one group to remember prior exposure in retrospective studies
2. Well-Chosen Average
Arithmetic Mean: Evenly distributes the total among individuals. Can be unrepresentative when measurements are highly skewed right. (e.g. per capita income)
Median: Value dividing distribution into two equal parts. 50th percentile. (e.g. median household income)
Mode: Most frequently observed outcome (rarely reported with numeric data)
3. Little Figures Not There
Small samples: Estimators with large standard errors can provide seemingly very strong effects
Low incidence rates: Need very large samples for meaningful estimates of low frequency events
Significance levels/margins of error: Measures of the strength and precision of inference
Ranges: Report ranges or standard deviations along with means (e.g. normal ranges)
Inferring among individuals versus populations
Clearly label chart axes
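The small-sample point in the list above can be illustrated with a short simulation. It assumes a yes/no outcome with a true rate of 50% (so there is no real effect to find) and uses the usual standard error of a proportion, sqrt(p(1-p)/n); the study sizes and the number of repeated studies are arbitrary choices for the sketch.

```python
import math
import random


def standard_error(proportion, sample_size):
    """Standard error of an estimated proportion for a simple random sample."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)


def observed_proportions(sample_size, studies, rng, true_proportion=0.5):
    """Run many imaginary studies and return the proportion each one observes."""
    results = []
    for _ in range(studies):
        hits = sum(1 for _ in range(sample_size) if rng.random() < true_proportion)
        results.append(hits / sample_size)
    return results


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the run is repeatable
    for size in (10, 2000):
        outcomes = observed_proportions(size, studies=200, rng=rng)
        print("n =", size,
              "standard error:", round(standard_error(0.5, size), 4),
              "lowest:", min(outcomes), "highest:", max(outcomes))
```

Among the tiny studies, some will report rates far from 50% by chance alone, and those are the ones that make headlines. The large studies stay close to the true value.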
4. Much Ado About Nothing
Probable Error: Bound that the estimation error stays within with probability 0.5. If the estimator is approximately normal, PE is approximately 0.675 standard errors. (Old school)
Margin of Error: Bound that the estimation error stays within with probability 0.95. If the estimator is approximately normal, ME is approximately 2 standard errors.
Clinical (practical) significance: In very large samples an effect may be significant statistically, but not in a practical sense. Report confidence intervals as well as P-values.
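The two multipliers quoted above come straight from the normal distribution, and a few lines are enough to check them. This sketch uses `statistics.NormalDist` from the Python standard library (3.8 or later); the poll figures in the usage lines are hypothetical.

```python
from statistics import NormalDist


def error_multiplier(coverage):
    """Number of standard errors that covers the given central probability
    of a normally distributed estimator."""
    return NormalDist().inv_cdf(0.5 + coverage / 2)


def interval(estimate, standard_error, coverage=0.95):
    """Estimate plus or minus the matching number of standard errors."""
    half_width = error_multiplier(coverage) * standard_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    print("probable error multiplier:", error_multiplier(0.5))
    print("margin of error multiplier:", error_multiplier(0.95))
    # Hypothetical poll: 52% support with a standard error of 1.58 points.
    print("95% interval:", interval(0.52, 0.0158))
```

The multipliers come out at about 0.674 and 1.96, which is where the rounded "0.675" and "2" come from. In the hypothetical poll, the 95% interval still includes 50%, so a headline of "majority support" would be claiming more than the data shows.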
5. Eye-Catching Graphs
Choice of ranges on graphs can have huge impact on interpretation (e.g. percent change)
Choice of proportion of y-axis to x-axis can distort as well (very easy to do with modern software)
Bar charts can also be distorted by starting the axis at a positive value and/or treating an artificial baseline as 0
6. 1-D Pictures
Bar Charts and Pictorial Graphs should have areas proportional to values (only make comparisons in one dimension)
7. Semiattached Figure
Target Population: Group we want to make inferences about
Study Population: Group or items that the experiment or survey is conducted on
When comparative studies are conducted among products, treatments, or groups: what is the comparison product, treatment, or group?
Control for all other potential risk factors when studying effects of factors
8. Causal Relationships
Correlation does not imply causation
Elements of causal relationships
a. Association between Y and X
b. Clear time ordering (X precedes Y)
c. Removal of alternative explanations (controlling for other factors)
d. Dose-Response (when possible)
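Element (c) can be illustrated with the Coca-Cola and drowning example from earlier. The sketch below generates made-up daily data in which temperature drives both Coke sales and drownings, and neither affects the other. All coefficients and noise levels are assumptions chosen for illustration. It then compares the raw correlation with the partial correlation after controlling for temperature.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_days(days, rng):
    """Made-up data: temperature drives both series, which never interact."""
    temps = [rng.uniform(10, 35) for _ in range(days)]
    coke_sales = [100 + 10 * t + rng.gauss(0, 30) for t in temps]
    drownings = [0.2 * t + rng.gauss(0, 1) for t in temps]
    return temps, coke_sales, drownings


if __name__ == "__main__":
    temps, coke_sales, drownings = simulate_days(2000, random.Random(11))
    print("raw correlation:", pearson(coke_sales, drownings))
    print("controlling for temperature:",
          partial_correlation(coke_sales, drownings, temps))
```

In this simulated world the raw correlation between Coke sales and drownings is strong, but it shrinks to roughly zero once temperature is accounted for, which is what removing an alternative explanation looks like in practice.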
In the end, statistics are not lies and statistics don't lie: people lie about the statistic itself or how it is interpreted. Some don't lie; they are simply ignorant, as are most members of the public in terms of statistical interpretation. See Logical Fallacies and Critical Research Failure. Put another way, by baseball announcer Vin Scully:
"People use statistics the way a drunk uses a lamp post: for support, not illumination."