Cause and Effect

What causes one organization to succeed and another to fail? This question drives the academic discipline of business management, and the answer can be informed by a great deal of research. Ultimately, though, there is no scripted answer to this question that can be definitively taught in a course. In other words, it is an unstructured problem that we continually strive to solve but never completely succeed in answering.

Therefore, business management (and particularly the sub-discipline of information systems) is actually a very creative discipline. There are no rules other than to obey the law and act ethically. Organizations thus have wide latitude to develop creative ways to achieve above-average gains in a market. There are typically two methods that managers employ to test creative ideas: (1) drawing on their past experience (and that of their coworkers and employees) as a basis for new ideas, or (2) gathering data about their performance and using the data to determine cause and effect.

The past experience of smart business managers is not something to ignore. Some people have made incredible lives for themselves from one good idea. But if managers don’t adapt and change, that one good idea may never produce good results again, and it may be a terrible solution to future problems. That’s because the past experience of just one person is very limited. Even the experience of two, three, or one thousand employees is quite limited. In addition, our personal experiences are interpreted (or often misinterpreted) through the narrow lens of our own biases, beliefs, values, and desires. Beliefs can be distorted very easily. The truth can be stretched until it is almost more false than true.

The second method, gathering and examining data, is a much more productive way to make wise business decisions. Good data are very valuable, and accurate and timely data don’t lie. These data aren’t biased. They don’t care about gender, race, religion, or politics. The data simply represent an objective view of the facts. Therefore, many of the best business decisions are based on accurate and timely data. Perhaps most importantly, data allow us to examine cause and effect. Accurately explaining the causes of each effect is how theory is formed and true knowledge is discovered.

The effect that we are interested in is business success. That’s obvious (even though success can be broken down into many specific and measurable outcomes). Less obvious is the cause of success. This is where you have to be careful. Data allow us to measure hypothesized causes and desired successes. However, data cannot determine the true cause of each effect. Data only give you support for a theorized cause-and-effect relationship. For example, consider this chart depicting accurate data:

Figure 1.1: Organic Food and Autism Chart

Does organic food cause autism? Probably not. In fact, this ridiculous chart was made as an example of how badly data can be misinterpreted by someone who doesn’t know how to use them. So, why is there such a strong relationship between organic food sales and autism in the figure above? Two quantities that both trend upward over the same years will be strongly correlated whether or not one has anything to do with the other.

As you may remember from prior statistics courses, the only way to truly establish causality is through a randomized experiment with treatments. We call the data generated from those experiments primary data. However, that is not the purpose of data analytics. Data analytics is based on the collection of secondary data, which are data that were generated previously for one purpose but that the data scientist will use for another purpose. Even though we can’t establish causality with secondary data, we will still imply causality as long as we can come up with a theory to explain the causal relationship. Therefore, it is still very worthwhile to determine which variables are highly correlated with positive outcomes. Technically, identifying a high correlation is not the same as establishing a cause, but for the purposes of this textbook, we will imply that we are establishing cause and effect.
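To see how a chart like Figure 1.1 can arise, it may help to reproduce the pattern with made-up numbers. The short Python sketch below is only an illustration under stated assumptions: the two series (`series_a` and `series_b`) are hypothetical, synthetic values that each grow over 20 years with their own independent random noise, and the helper `pearson` is written here for the example rather than taken from any library. Neither series influences the other, yet their correlation is very high simply because both rise over time.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(20)

# Hypothetical, synthetic data: two unrelated quantities that both grow
# over time, each with its own independent noise.
series_a = [100 + 10 * t + rng.gauss(0, 5) for t in years]
series_b = [50 + 3 * t + rng.gauss(0, 2) for t in years]

# Correlation of the raw yearly levels (dominated by the shared upward trend).
r_levels = pearson(series_a, series_b)

# Correlation of year-over-year changes, which removes the shared trend.
change_a = [later - earlier for earlier, later in zip(series_a, series_a[1:])]
change_b = [later - earlier for earlier, later in zip(series_b, series_b[1:])]
r_changes = pearson(change_a, change_b)

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

In this sketch the correlation of the levels comes out close to 1, while the correlation of the year-over-year changes is much weaker. Comparing changes rather than levels is one simple check, not a proof of anything, but it shows why a high correlation in secondary data still needs a theory behind it before it is read as cause and effect.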

In summary, data can be valuable, but they can also be dangerous if “the wrong hands” interpret them. The “right hands” to interpret data are the people who understand data and help create the theories that explain organizational success. However, teaching theories and data interpretation is not the purpose of this book. Rather, the focus will be on teaching you the proper techniques to analyze and make use of data in order to build valuable theory and, more importantly, to use that knowledge in a machine learning environment. It’s up to you (on your own or in future courses) to learn the theories that (1) identify the relevant variables and (2) explain the relationships between those variables that establish cause and effect.