
Interpreting Correlations

A correlation simply means the data falls in a pattern: in this case, higher SAT scores are often associated with better performance in college. Correlations do not tell us about causation, and they cannot predict individual behavior. Even if that correlation holds, someone with a perfect score can still drop out of college, and somebody with a very low SAT score can become valedictorian. The correlation just tells us that a pattern exists.

Correlation Coefficients

Correlations can be tricky, though, precisely because they don’t indicate causation. Correlations come from graphing data from two variables on a coordinate plane, then calculating what is known as the Pearson correlation coefficient (usually represented by the letter r), which measures how closely related the two variables are. The correlation coefficient can tell us that the two variables are strongly positively correlated, strongly negatively correlated, or not correlated at all (and everything in between).

Correlation coefficients, or r-values, are represented as numbers ranging from negative 1 to positive 1. Think of it as a spectrum. An r-value of –1 means there is a very strong (actually perfect) negative correlation, zero means there is no correlation at all, and positive one means there is a perfect positive correlation.

A strong correlation is generally considered anything with an r-value between 0.7 and 1 (or between –0.7 and –1 on the negative side). An absolute value below 0.7 means there is some correlation, but the closer the r-value gets to zero, the weaker the correlation.
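
To make the idea concrete, here is a minimal Python sketch of how an r-value might be computed and read against that 0.7 rule of thumb. The SAT scores and first-year GPAs below are invented for illustration, and NumPy’s corrcoef function is just one of several ways to get the coefficient.

    # Minimal sketch: compute Pearson's r for two made-up variables.
    import numpy as np

    sat_scores = np.array([1050, 1180, 1230, 1310, 1400, 1490])  # hypothetical students
    first_year_gpas = np.array([2.6, 2.9, 3.1, 3.0, 3.6, 3.8])

    # np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r.
    r = np.corrcoef(sat_scores, first_year_gpas)[0, 1]
    print(f"r = {r:.2f}")

    # A rough reading of the strength, using the 0.7 rule of thumb above.
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= 0.7 else "weaker (closer to zero means weaker still)"
    print(f"{direction} correlation, {strength}")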

Perfect correlations, either positive or negative, mean that the two variables are directly linked: Either one rises exactly in step with the other, or one declines exactly as the other rises. It’s kind of difficult to produce examples of perfect correlation in everyday life because most phenomena have some variation, even if slight. The number of miles you drive and the amount of gas you use would have a near-perfect positive correlation, since the amount of gas used increases steadily as the number of miles driven increases. This might not be a perfect correlation, though, because several factors influence how much gas you are using at any given moment. (Are you coasting on the highway? Idling at a light? Using the air conditioning?)

A strong negative correlation exists between, for example, the average temperature in winter and the amount of energy used to heat a home. As the temperature increases (assuming it’s still cold enough for houses to require heating), the energy used to heat homes decreases. Again, this isn’t a perfect correlation, because other factors might affect how much energy a given home uses.

What about correlations that are only somewhat strong? Let’s look at feet for an example. Have you noticed that your tall friends mostly wear a bigger shoe size than you do? Or maybe you are the tall friend, and you feel like you have clown feet when you stand next to your friends. Most of us would probably guess that shoe size and height are correlated, but we can look at the data to prove it. Here’s a graph from StatCrunch showing height on the y-axis (vertical) and shoe size on the x-axis (horizontal).

This type of graph is called a scatterplot. Each point on the graph represents one data point, or one person with a shoe size of x and a height of y. You can see that there are points all over the graph, but that they follow an upward pattern as your eye moves to the right. Based on this pattern, we can surmise that there is a positive correlation between height and shoe size. It turns out that, when a linear regression model is run on a calculator or computer, the r-value is .6222.[lix] Remember that the closer the number is to zero, the less strong the correlation, and an r-value of one means there is a perfect correlation. This correlation coefficient tells us that height and shoe size do have a positive correlation, but that it’s not perfect—it’s not even considered statistically “strong” since it is below 0.7. In real-world terms, this means most people who are tall have big feet, but there are plenty of outliers. You could be a shorter person with unusually large feet for your size, or a tall person who happens to wear the smallest shoes among your friends. Either of these would be acceptable, and even expected, with a correlation coefficient of .6222.
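
If you wanted to reproduce a plot like that yourself, a short Python sketch using matplotlib and SciPy would do it. The shoe-size and height pairs below are invented stand-ins, not the StatCrunch data set, so the printed r will differ from the .6222 reported above.

    # Sketch: scatterplot of shoe size vs. height plus the correlation coefficient.
    # The data points are made up; substitute real measurements to reproduce r = .6222.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    shoe_size = np.array([6, 7, 7.5, 8, 8.5, 9, 9.5, 10, 11, 12, 13])
    height_in = np.array([62, 64, 63, 66, 65, 70, 68, 71, 72, 73, 75])

    r, p_value = stats.pearsonr(shoe_size, height_in)
    print(f"r = {r:.4f}")

    plt.scatter(shoe_size, height_in)   # each point is one person
    plt.xlabel("Shoe size")
    plt.ylabel("Height (inches)")
    plt.show()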

A classic example of a strong (but not perfect) negative correlation is car price and age of the car. As cars get older, their cost goes down; a new car will usually cost you substantially more than a ten-year-old one. Just as a positive correlation means that the two variables increase together, a negative correlation means that as one variable (age) increases, the other (cost) decreases. The strength of this correlation varies by car brand, though. Some cars are known to “hold their value” better than others, meaning their value declines less rapidly. Once again, the correlation can tell us what the trend is or what we can expect to be the case, but it allows plenty of room for outliers.

The British website Auto Express gives us two examples of graphs showing car depreciation. The first graph (Car B) is for a typical car that loses much of its value within a year or two of being purchased:

The second graph (Car A) shows a more linear progression for a car that holds on to its value better, as certain brands are known to do.

Both of these graphs show a negative correlation between the variables: As the years since purchase increase, the value of the car decreases. Car A shows an almost linear relationship, with the value decreasing pretty steadily over time, while Car B’s value drops almost instantly.

Finally, let’s look at two variables that have no correlation. Imagine if the two variables we looked at were height and number of pets owned. We would most likely find no correlation there. The graph might look something like this, with each point again representing a single person’s data:


You can tell from looking that there is no correlation between those variables, which makes sense. Height has nothing to do with how many pets a person owns.

Misleading Correlations

Interestingly, we sometimes find a correlation where there is none. Imagine in the example above that my sample (the people I queried) happened to show that greater height was associated with more pets. We would have to think carefully about that correlation and whether it made sense. Was the sample set appropriate? Was the data accurate? Could I repeat the study and get comparable results? Answering these questions would reveal that the correlation was most likely just a fluke. Maybe my sample size was too small, and I happened to get a handful of tall people who have a lot of pets and a bunch of short people with no pets. This inappropriate sampling could suggest a correlation that wouldn’t exist if we sampled a larger population.
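
A quick simulation shows how easily a small sample can produce this kind of fluke. The Python sketch below uses made-up distributions for height and pet ownership, draws thousands of eight-person samples of the two unrelated variables, and counts how often the r-value looks “strong” purely by chance.

    # Simulate the fluke: unrelated variables, tiny samples, occasional "strong" r.
    import numpy as np

    rng = np.random.default_rng(0)
    trials, flukes = 10_000, 0
    for _ in range(trials):
        height = rng.normal(67, 3, size=8)   # eight people's heights (inches)
        pets = rng.poisson(1.5, size=8)      # pet counts, unrelated to height
        if pets.std() > 0:                   # skip the rare all-identical sample
            r = np.corrcoef(height, pets)[0, 1]
            if abs(r) >= 0.7:
                flukes += 1

    print(f"{flukes / trials:.1%} of tiny samples showed |r| >= 0.7 by chance alone")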

Spurious correlations can be fun, but they can also be misleading. Let’s look at some fun ones first. There’s an entire website, written by Tyler Vigen, devoted to strong-looking correlations between variables that have nothing to do with each other. He examines, for instance, the correlation between yogurt consumption and Google searches for “i cant even”:


As we can see in the graph, those two variables look pretty well correlated—they would likely have a high r-value if he calculated it. He even used artificial intelligence (AI) to come up with an explanation for why this supposed correlation exists: “It’s simple. As yogurt consumption rose, so did our tolerance for the sour and curdled aspects of life. It’s as if the active cultures in the yogurt fermented a newfound ability to handle all the whey-ward frustrations. So next time you’re feeling moody, just grab a spoon and dairy yourself to a better mood. Remember, when life gives you lemons, make fro-yo!”

Vigen has even linked each of his graphs to an AI-generated “research” paper. He describes his process on the linked website Spurious Scholar:

Step 1: Gather a bunch of data.
Step 2: Dredge that data to find random correlations between variables.
Step 3: Calculate the correlation coefficient, confidence interval, and p-value to see if the connection is statistically significant.
Step 4: If it is, have a large language model draft a research paper.
Step 5: Remind everyone that these papers are AI-generated and are not real. Seriously, just pick one and read the lit review section.
Step 6: . . . publish.

The note after Step 1 claims that he has 25,156 variables in his database. After Step 2, he describes data dredging: “‘Dredging data’ means taking one variable and correlating it against every other variable just to see what sticks. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.”

Data dredging goes by another name: p-hacking. The term refers to a study’s p-value, which is the probability of getting results at least as extreme as the study’s purely by chance. It is used to indicate how statistically significant a result is; the lower the p-value, the greater the statistical significance. P-hacking means dredging a data set until you find something that is statistically significant, whether that thing makes any sense or not. Tyler Vigen’s “spurious correlations” prove the point that p-hacking is both possible, given enough data, and dangerous, as the results can be incredibly misleading.
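
To see how little it takes, here is a hedged Python sketch of data dredging on a much smaller scale than Vigen’s database: one target series is correlated against a thousand random candidates, and roughly five percent of them clear the conventional p < 0.05 bar purely by chance. All the series are pure noise.

    # Toy data dredge: correlate one series against many random ones and keep
    # whatever looks "statistically significant."
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    years = 15
    target = rng.normal(size=years)          # stand-in for, say, yearly yogurt consumption

    hits = 0
    for _ in range(1_000):
        candidate = rng.normal(size=years)   # an unrelated random series
        r, p = stats.pearsonr(target, candidate)
        if p < 0.05:
            hits += 1

    print(f"{hits} of 1,000 random variables correlated 'significantly' with the target")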

Correlation Does Not Equal Causation

It is also possible for a correlation to be real but not causal. In other words, the two variables are indeed linked, but the cause is a third variable that wasn’t measured. For example, according to the website Scribbr, ice cream sales and violent crime rates are closely correlated. One might draw incorrect conclusions based on that fact: Maybe eating ice cream leads people to commit crimes, or maybe criminals like to eat ice cream after committing a crime. Both of these seem highly unlikely, though. The correlation exists because a third variable, heat, affects both ice cream sales and violent crimes. When temperatures increase, both of these also increase. So although ice cream sales and violent crimes have a correlation, they don’t have a causal relationship.
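
A small simulation makes the third-variable problem visible. In the Python sketch below, daily temperature drives both ice cream sales and crime counts (with made-up coefficients and noise); the two outcomes end up correlated even though neither causes the other, and the correlation largely disappears once the effect of heat is removed.

    # Confounding sketch: heat drives both variables, so they correlate with
    # each other despite having no causal link. All numbers are invented.
    import numpy as np

    rng = np.random.default_rng(7)
    days = 365
    temp = rng.normal(65, 15, size=days)                    # daily temperature (F)
    ice_cream = 2.0 * temp + rng.normal(0, 20, size=days)   # sales rise with heat
    crime = 0.5 * temp + rng.normal(0, 8, size=days)        # incidents rise with heat

    print("ice cream vs. crime: r =", round(np.corrcoef(ice_cream, crime)[0, 1], 2))

    # Remove the heat effect from each variable and correlate the leftovers.
    ic_resid = ice_cream - np.polyval(np.polyfit(temp, ice_cream, 1), temp)
    cr_resid = crime - np.polyval(np.polyfit(temp, crime, 1), temp)
    print("after removing heat: r =", round(np.corrcoef(ic_resid, cr_resid)[0, 1], 2))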

Finally, another problem we encounter when we look at correlated variables is lack of understanding about directionality. For example, researchers have known for many years that depression and vitamin D levels are negatively correlated. In other words, people with low vitamin D levels are often depressed. But researchers are still unclear about causality. As a 2020 mega-review in the Indian Journal of Psychological Medicine put it: “Overall findings were that there is a relationship between vitamin D and depression, though the directionality of this association remains unclear.”

Determining causality may seem like a minor point, but it’s critical in deciding on a course of action. Doctors may realize depression and vitamin D are linked, but it’s unclear if increasing serum vitamin D levels will ease depression symptoms. Should doctors tell their patients to take vitamin D? Or should they focus on other approaches to treating depression and other potential causes of vitamin D deficiency? Causality matters on both an individual and population level. More research is necessary to determine which direction the causality goes and thus what treatments are warranted.

Program Evaluation

If you’re a Gen Xer, you probably remember Joe Camel. Joe Camel appeared in Camel cigarette ads from 1988 through 1997. He was supposed to be cool with his cigarette and masculine outfits, often alongside the tagline “smooth character.” His job was to entice people to smoke, thereby boosting sales of Camel cigarettes. Joe came under fire in the 1990s, with at least one study showing that the character was as recognizable to six-year-olds as the Disney Channel logo was. In 1997, after years of court battles, R.J. Reynolds Tobacco Company voluntarily retired Joe.

The pressure to ban Joe Camel was part of a national panic about teen smoking. Studies showed that teen smoking declined in the 1970s and 1980s but began to rise in the 1990s. Data from high school seniors showed that, in 1990, 19.4 percent of them were “current smokers” (defined as having smoked in the last thirty days). By 1997, that rate had risen to 24.5 percent. Studies have shown that most adult smokers began smoking when they were teens; very few adults pick up smoking as a new habit. All sorts of programs emerged in the 1990s and 2000s to try to reduce or prevent teen smoking, as that was seen as the key to lowering smoking rates overall. If you remember assemblies in school, ad campaigns, or public service announcements about the dangers of smoking, you were the target of one of these programs.

Billions of dollars are spent each year on large-scale programs like the ones to prevent teen smoking. But are these dollars being put to good use? Knowing whether or not these programs are effective is critically important. Nonprofits and governmental agencies do not want to waste billions on initiatives that aren’t making a difference. This leads us to another important use for statistical analysis: program evaluation.

Program evaluation, broadly speaking, is the process of figuring out if a program has done what it was created to do. An evaluation of teen smoking prevention programs would tell us whether or not fewer teens smoked, meaning if those billions of dollars spent were worthwhile. A program evaluation might also tell us if certain parts of a program are effective (certain ad campaigns, for example) and if other parts need to be tweaked or discontinued.

A good program evaluation is a thorough, systematic process that uses data to make a determination. According to a guide published by the US Department of Education:

A well-thought-out evaluation can identify barriers to program effectiveness, as well as catalysts for program successes. Program evaluation begins with outlining the framework for the program, determining questions about program milestones and goals, identifying what data address the questions, and choosing the appropriate analytical method to address the questions. By the end, an evaluation should provide easy-to-understand findings, as well as recommendations or possible actions.

Let’s stick with the campaign to reduce teen smoking for now. Program evaluations happened at many points in the process and looked at many different aspects of the campaign. A summative evaluation might tell us how effective the overall campaign was to reduce teen smoking, but other tools were used along the way to tweak the campaign, changing tactics and emphasizing new strategies.

To evaluate the campaign’s effectiveness, study designers had to first identify the outcomes they wanted (fewer teens initiating smoking, for example). They had to figure out how they were going to collect the data, and also what kind of data it was going to be. Quantitative data is numbers: Are there actually fewer teens who smoke? Qualitative or categorical data measures things numbers alone can’t measure: which ad campaign teens remember seeing, for example. Both are useful, but study designers need to be clear on which type of data will give them the information they want. Program evaluators might also decide to use randomized controlled trials (RCTs) to evaluate a program’s effectiveness. You’ve probably heard of RCTs most often in connection with new drugs or medical treatments, as researchers try to determine how effective they are.
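
For the quantitative side, here is a hedged sketch of the kind of comparison an evaluator might run after an RCT or similar design: smoking rates among students in schools assigned to a program versus comparison schools, checked with a standard chi-square test. All counts are invented for illustration; real evaluations use far more careful models.

    # Hypothetical evaluation comparison: smoking counts in program vs. comparison schools.
    from scipy.stats import chi2_contingency

    # rows: program group, comparison group; columns: current smokers, non-smokers
    table = [[110, 890],   # 11.0% current smokers among surveyed program-school students
             [145, 855]]   # 14.5% among comparison-school students

    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"p = {p_value:.4f}")  # a small p-value suggests the gap is unlikely to be chance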

According to a meta-analysis published by the National Institutes of Health, the teen smoking campaign involved, among other approaches, three major school initiatives: an “information deficit” model in which school-age children were taught about the effects and risks of tobacco, an “affective education” model that emphasized self-esteem and developing values, and a “social influence resistance” model that taught teens how to resist social influences, which included ad campaigns and peer pressure.

Think back to your years in school. Did you have a health or drug-education class that taught you about the effects of tobacco and other drugs? Do you remember attempts to build your self-esteem and influence your health outcomes, like wellness classes or meetings with school counselors? If so, these were likely part of that educational approach. Several studies published in the 1990s and early 2000s showed that, of the three approaches, the social influence resistance model was the most effective. In other words, teaching teens to identify and resist peer and societal pressure had the largest impact (in terms of educational programs) on preventing smoking.

Other aspects of the teen smoking campaign included laws targeted at sales of tobacco to teens, penalties for breaking these laws, advertising restrictions, counter-marketing campaigns (ad campaigns about the dangers of smoking, for example), and other community-based interventions. Of these approaches, preventing sales to minors proved to be one of the least effective models. Kids who want tobacco (or alcohol, for that matter) have a way of getting it from older friends and relatives! As of 2004, meta-analyses revealed that the most effective approach was a combined one:

[The] CDC recommends several components as critical in a comprehensive youth tobacco control program, all of which have parallels in efforts to reduce underage drinking. These components include implementing effective community-based and school-based interventions in a social context that is being hit with a strong media campaign (aimed at some set of “core values”) and with an effort to vigorously enforce existing policies regarding the purchase, possession, and use of the substance.

Without comprehensive data to back up claims, the smoking prevention campaign might have been abandoned after a few years, or ineffective aspects of the program might have continued while others were terminated. What if someone thought that simply preventing the sale of tobacco to minors would stop all youth from smoking, for example? What if no other intervention programs existed because someone believed so strongly in the power of the law to change behavior? Without an effort to study the data and systematically analyze the effectiveness of the program, smoking rates today might be as high as or higher than they were in the 1990s. As it turns out, in 2023, only two out of every one hundred high school students reported smoking cigarettes in the past thirty days.

Evaluating Healthcare Initiatives

Program evaluation is a critical part of many healthcare initiatives as well. Formative evaluations, meaning ongoing, mid-process ones rather than retrospective ones, have helped shape the Affordable Care Act, first passed in 2010 under President Obama. The ACA attempted a massive reform of healthcare, mandating universal coverage for individuals and attempting to curb costs from providers and insurance companies. Whether or not it succeeded in meeting those goals has been widely debated, with political views often complicating the picture.

Data tells us that the ACA did succeed in getting more people insured: According to the US Census Bureau, 26.4 million Americans remained uninsured in 2022 versus 2013’s 45.2 million.

Data also shows that millions more people have access to affordable care, particularly preventive services, and that health disparities between racial and ethnic groups have declined. Despite these results, legal challenges to the ACA still abound. While the numbers are indisputable, many people oppose the higher premiums individuals have to pay, as well as tax increases that have helped fund expanded care.

This debate over the ACA highlights two potential difficulties of program evaluation. The first is that defining what you are measuring and how to measure it is critically important. Do more people have insurance now than did in 2013? Yes. Are health outcomes in the US measurably better than they were in 2013? That’s an entirely different question that would need to be answered with a different set of data. And that’s not an easy thing to measure. In evaluating a program, evaluators need to define the outcome they are looking for, figure out how to assess it, and then collect that data. For a massive program like the Affordable Care Act, any kind of evaluation is an enormous undertaking.

The other difficulty program evaluators often stumble upon is a political one. The debate over the ACA hinges on political affiliation, with most liberals supporting it and many conservatives opposing it. With so much data out there and so many potential questions to answer, conflicts over program effectiveness abound. Whatever your argument, you can often find the data to back it up, particularly with such a large-scale program as the ACA. While numbers don’t lie, questions can be tweaked or asked in certain ways to get answers that can back up different viewpoints.

So how can you as the consumer of information ensure that you’re not being swayed by a particular point of view? Try to make sure the source you get your information from is neutral and doesn’t have a vested interest in one particular viewpoint. If you’re reading a study that says smoking is actually good for your lungs, for example, ask yourself who paid for that study and who reported it. Was it a major tobacco company that ran the study? Or was it a government health initiative or nonprofit? Always check the source; after all, some advertisements from the 1930s through the 1950s did indeed tout the health benefits of smoking. Ads like the following one were paid for by Philip Morris, the tobacco company.

Also ask yourself what question the information you’re hearing is answering, because the headline might be misleading. One could come up with different evaluations of the ACA based on if one asked about increased coverage, positive health outcomes, or increased taxes, for example. All of these questions are important, but none should be viewed in isolation. Seek out data and analyses from neutral sources and use your own critical thinking skills to make decisions. Just like program evaluators, try not to be swayed by what you want to be true, but look to the actual evidence for information.

Common Fears and How Statistics Prove Us Wrong

Do you have an irrational fear of shark attacks? Or of air travel? Or of getting stuck in an elevator? We all have fears and neuroses that sometimes govern our behavior. And sometimes we feel so strongly about them that we’re truly convinced something terrible is going to happen. I cannot get on that plane, we think to ourselves, because I’m convinced it will go down. Statistics, however, paint a very different picture for many of these fears. If we look at the facts, we may be able to talk ourselves out of some of our most prevalent fears.

Here’s a look at five common fears and misconceptions and what statistics can actually tell us about them.

Shark Attacks

Here’s an image that might scare you:

You’re probably thinking you shouldn’t swim in Florida, right? And surfing looks pretty dangerous too.

What this graphic doesn’t mention is that those percentages are out of a total of sixty-nine shark bites in 2023. And that’s worldwide. Only thirty-six of those bites occurred in the United States. So the 44 percent of bites that occurred in Florida means about fifteen people. Now think about the millions of people who visit Florida beaches every year. Fifteen people being bitten in a full year is an exceedingly small fraction of the total.

According to the University of Florida’s International Shark Attack File (ISAF), only ten of the sixty-nine bites resulted in death, two of which were in the US. The Florida Museum of Natural History (Florida has a higher-than-average interest in shark attacks, it seems) calculates this as a one in 11.5 million chance that a beachgoer in the US will be attacked by a shark, and less than a one in 264 million chance that a beachgoer will be killed by a shark.
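
The arithmetic behind a figure like “one in 11.5 million” is just attacks divided by exposures. The short Python sketch below uses round, assumed numbers (fifteen Florida bites against a notional hundred million beach visits) rather than the ISAF’s actual inputs, simply to show the shape of the calculation.

    # Back-of-the-envelope risk: bites divided by beach visits.
    # Both numbers below are assumptions for illustration, not ISAF figures.
    florida_bites = 15
    beach_visits = 100_000_000   # notional annual Florida beach visits

    print(f"about 1 bite per {beach_visits / florida_bites:,.0f} visits")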

If you’re still scared, it might help more (unless you’re in Australia) to know that the largest share of those ten deaths, forty percent, occurred in Australia, with three of them happening on a remote area of coast known for great surfing. According to ISAF, that coast is home to a large population of seals and white sharks. Sharks like to eat seals, and a surfer flopping about in the water looks an awful lot like a seal to a hungry shark.

When shark bites do occur, they’re big news, and that often stokes fear in people. But they’re big news precisely because they’re so rare. Think about it: Do you hear about every fender bender on the major highway nearby? No, because they probably happen almost daily. So the next time you hesitate before taking a plunge into the ocean, remind yourself that you’re thinking about shark attacks because they’re so rare. I can’t promise that you won’t get bitten, but the odds are in your favor that you’ll be fine.

Flying

The fear of flying, also known as aerophobia or aviaphobia, is pretty common. It’s so common, in fact, that it’s listed as a specific phobia in the DSM-5, the definitive guide to psychological disorders. NPR’s LifeKit podcast did an episode on it, Time magazine has an article on it, and Reddit is filled with threads offering advice to get over it. You may go out of your way to avoid flying, perhaps even driving cross country to see a relative instead of taking a five-hour flight.

By now you have probably heard that flying is the safest form of travel. Let’s look at the numbers to try to convince you just how much safer it is than driving. According to a Harvard University study, your chances of dying in a plane crash are one in eleven million. That’s pretty close to your risk of being bitten by a shark in the waters off the United States. By comparison, your odds of dying in a cataclysmic storm are one in 20,098. You are more likely to win the lottery than you are to die in a plane crash.

To someone with aerophobia, though, these statistics usually don’t mean much. Rationally, they know that flying is incredibly safe, but irrational fears take over. David Ropeik, instructor of risk communication at Harvard’s School of Public Health, argues that “risk perception is not just a matter of facts.” He points out that all sorts of other factors go into assessing how risky a situation is. For example, maybe a plane crashed recently and you heard about it on the news, so the risk is at the forefront of your mind. Maybe you had a near crash or other bad flying experience once or heard a story from a friend about a pilot who couldn’t do his job safely. Maybe you know that airlines have been struggling financially, so you’re worried about pilots being over-extended and airlines cutting safety measures. Any of these concerns, all based on truths, could make flying seem much riskier to you than the numbers say it is.

If you’re still unconvinced that getting on a plane is safe, many travel magazines offer tips on getting over your fear of flying, and mental health experts are trained to address phobias. But perhaps thinking about the numbers will help calm your fears just a little bit. Once again, numbers don’t lie, and comfort can be found in knowing how unlikely your fear is to come true.

Getting Kidnapped by a Stranger

If you have kids, you know the feeling that they are the single most important thing in your life. It’s understandable that any kind of threat to your children is terrifying. And there are threats out there, but many of the things we think are major threats are actually very rare. Let’s look at kidnappings, for example. This is many parents’ biggest fear, possibly stoked by the years-long campaign that put pictures of missing children on the back of milk cartons. If you are a parent who grew up during these years, you probably spent at least an hour each week staring at the latest picture while you ate your cereal, with little else to occupy your attention.

There are a disturbing number of kidnappings each year in the United States, but the overwhelming majority of them are parental kidnappings. Parental kidnappings, in which one parent absconds with the child without permission (often in fights over custody), are still serious crimes, but they are not the same as a kidnapping by a stranger. Sources put the number of kidnappings per year by strangers between about one hundred and three hundred. If we go with the lower end of that range, that’s about twice the number of worldwide shark attacks per year.

A kidnapping, like a shark attack, is another case of an event being in the news because it is so rare. If it happened every day, it wouldn’t be news. The news also often reports on “missing children,” a phrase that can stop every parent in their tracks. But 95 percent of the time, missing children are children who have run away. This is still a traumatic event for their family, but again, it doesn’t mean they went missing because a stranger abducted them.

Unless you have a volatile relationship with a family member or co-parent of your child, your child is highly unlikely to be kidnapped. That doesn’t mean it never happens, of course, but that the odds are very much in your favor. Again, exercise common sense, but also take comfort in statistics.

Getting Trapped in an Elevator

This one is a bit more complicated. You are most likely not going to get stuck in an elevator, but someone who works in a building with a finicky elevator might tell you otherwise. According to multiple sources, on average, elevators break down once every hundred thousand rides. The chance of an elevator breaking down on any single ride is thus about 0.001 percent. That’s really low. If you only ride elevators occasionally, you don’t have much to worry about.

However, Elevating Studio, a company that seeks to make elevator riding more efficient, points out that this small risk can add up if you are a frequent elevator user. Let’s imagine you live or work in a high-rise. If so, you might use the elevator eight times a day. While each trip has a low probability of getting you stuck, you’re increasing your odds by riding the elevator so frequently. Elevating Studio calculates that, over forty years of working in a high-rise, you have about a six percent chance of getting stuck at some point in your career. If you also live in a building with an elevator, your risk rises to twelve percent over a forty-year period. That’s not high, but it’s not minuscule either.
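
The math behind that compounding is the complement rule: if each ride has probability p of a breakdown, the chance of at least one over n rides is 1 - (1 - p)^n. The Python sketch below shows the formula with placeholder inputs; Elevating Studio’s six and twelve percent figures rest on their own assumptions, which aren’t spelled out here, so the output of this sketch is only illustrative and is very sensitive to the per-ride rate and ride count assumed.

    # How a small per-ride risk compounds over many rides.
    def p_stuck_at_least_once(p_per_ride: float, n_rides: int) -> float:
        return 1 - (1 - p_per_ride) ** n_rides

    # Placeholder assumptions: 8 rides per workday, 250 workdays a year, 40 years.
    n = 8 * 250 * 40
    p = p_stuck_at_least_once(1 / 100_000, n)
    print(f"{n:,} rides -> {p:.0%} chance of at least one breakdown")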

As with other risks, the risk of getting stuck in an elevator may feel greater to you if you know of someone who was stuck in one. Maybe it was even in your building. And older, less well-maintained elevators are in fact more likely to break down. So if you’re a little wary of the rickety elevator in your pre-war high-rise, well, you might have good reason to be.

The good news is that it’s highly unlikely for an elevator accident to cause death. You know those scenes in movies where elevators go into free fall, with the hero or heroine barely managing to stop it before it crashes to the bottom of the shaft? Don’t worry, that’s a Hollywood trope. Elevators have all sorts of built-in mechanisms, including several cables (not just one or two) to prevent them from falling. According to Wikipedia, most deaths caused by elevators were in mines or construction sites, where an accident, fire, or other serious malfunction occurred. Though there have been a few deaths and serious injuries in residential or commercial elevators, the incidence remains extremely low.

Quicksand

Maybe this one only strikes fear in the hearts of people of a certain generation. Classic action movies seemingly wanted people to believe that quicksand was a serious threat to our lives. There is even a 1950 movie called Quicksand that portrays the protagonist’s descent into a life of crime. According to the trope, quicksand might appear anywhere (especially in a desert), and you’ll be powerless against it as it sucks you down below the surface, with bystanders unable to pull you out to safety.

At one point in American film history, quicksand appeared in nearly three percent of movies. It appeared on Gilligan’s Island, The Swiss Family Robinson, The Lone Ranger, and even The Lucy Show. Dan Engber, a Slate columnist who contributed to a 2013 RadioLab episode on quicksand in Hollywood, went through old movies to figure out just how prevalent quicksand was and came up with the following chart:

Reality (not necessarily statistics, because there aren’t enough of them about quicksand) proves our fears to be irrational. First of all, most people rarely encounter large pools of quicksand, which is really just a fine mixture of sand, clay, and water. Think about when you stand in a tidal pool on the beach and you sink into the sand a few inches, maybe even up to your calves. That’s a version of quicksand. Real quicksand can be stickier and deeper than that, but that’s the idea. Quicksand cannot, however, suck you under. According to The Encyclopedia Britannica, “quicksand is denser than the human body. People and animals can get stuck in it, but they don’t get sucked down to the bottom; they float on the surface.” So even if you do encounter quicksand, you’re not going to drown in it the way old heroes and villains did.

You now know a bit about how statistics show up in our daily lives, hopefully more than you did when you started reading this book. You might have been operating for the past ten, twenty, or thirty years thinking that math was just something you learned in school. You might have been one of those students who whined, “But when am I ever going to need to know this?” to their teacher. While you don’t need to remember a ton of formulas and procedures to get by in life, it can benefit you to have a basic understanding of how numbers affect us.

Pretty much everything we do can be turned into a data point, and those data points can all be analyzed to make better sense of our lives. Statistical measures help us make better decisions, fine-tune processes, and assess what we have been doing. Knowing a little bit about where statistical analyses come from helps us think more critically and become aware of false claims around us. Just recently, the Wall Street Journal reported on the closure of nineteen academic journals due to fake statistics. The publisher of these journals, Wiley, has retracted 11,300 articles over the past two years as they discovered fraudulent data and incorrect conclusions. Retractions can result from an honest misinterpretation of data or poor review process. In this case, however, these articles were linked to “paper mills,” companies that produce “academic” papers with fraudulent data, false authorship, plagiarism, or other violations of academic standards.

So why do paper mills, whose whole purpose is publishing false data, exist? The underlying answer is profit. They can make money because people in certain industries, especially academic ones, often gain status or income from publishing. People and institutions that might not otherwise be able to achieve that status can pay to put their name on something that isn’t authentic, responsible research. This practice is becoming ever easier with the rise of artificial intelligence. Publishers are realizing that this is a growing problem, and they need to get savvier about recognizing fraudulent work.

The fact that fraudulent academic work is such a problem speaks to how important data is in our lives. Those who use fake data or slap their name on a plagiarized report do so because they know how influential data can be. You, the consumer of information, are now armed with several tools to help you make sense of data, to recognize when something seems off (there’s a correlation between yogurt consumption and Google searches for “I can’t even”? Hmm, that doesn’t seem right). You know what a difference data can make in our lives, and you can begin to train your eye to find good data.

So now is the time to go forth and put your faith in numbers and science, but only once you have verified that the numbers and science you’re looking at come from trusted, unbiased sources, and that they are the results of rigorous statistical analysis. In other words, don’t base your decisions on the taste of one or even four cups of tea, but rather on an appropriate sample that gives you reliable and valid data.