Wrong data is worse than no data

Yesterday, someone asked me:

If most companies have a data team and they use data to make their decisions, why do some of them make the right decisions and others don’t?

My quick answer was:

Just because they have data doesn’t mean they use it to make decisions

Sounds easy enough, but as with everything, it is not that easy. Before I move further, let me give you a bit of context on why my opinion could matter to you.

Before becoming a Scrum Master, I was a BI engineer. My job was to give people KPIs and reports to help them arrive at the right decisions. During that decade I saw a lot of how data can be misused and how people can completely disregard what the data tells them. And to be fair, take this with a grain of salt: use analytical thinking and don’t just take my word for it. Don’t fall for the appeal to authority fallacy.

OK, let’s start


All of this assumes that there is data available. With that in mind, I’ve created the following tree.

We have data available
  • We use data to make our decision (1)
      • We have the correct data (3)
      • We have incorrect data (4)
  • We DO NOT use data to make our decision (2)

If you have data, you use data and you have the correct data, there is no problem. That would be branch 3. This is the trifecta, the perfect case.

Champagne would fall from the heavens. Doors would open. Velvet ropes would part.

Nicolas Cage in “Gone in 60 Seconds”

The problem is with the other 2 branches.

We DO NOT use data to make our decision

This is the most common case. Decisions are made based on intuition even though there is data available.

This could come from a wide variety of factors:

  • Ego
  • Distrust of the data
  • Ego
  • Dunning–Kruger effect, or
  • Any number of other biases and fallacies that we have out there.

The worst problem is the next one

We make decisions using incorrect data

This is a dangerous one, and very common in the IT world.

Most of us in IT have some knowledge of mathematics, and that is where it gets dangerous. Because of that, we think we understand how to create KPIs. They are simple calculations: as long as it is a number, we can do “math” on it.

The example I am going to give is extreme. It never happened, but it shows why creating the wrong KPI and making decisions based on it is bad.

My BI team used to create ETL (extract, transform, load) mappings to bring data into the data warehouse. We worked together for a couple of years and accumulated plenty of data about how many story points (SPs) each of those ETLs took.

We could simply divide the total number of SPs by the number of ETLs we did and come up with a KPI called avg SP/ETL. We could then use this metric to pre-estimate all future ETLs because math and history and <<insert a reason here>>.

As you can understand, that number tells you nothing about any individual ETL, but in the wrong hands, it tells you everything. Decisions can and will be made based on it. What is the problem with that? When a decision comes from someone else’s intuition, we are 100% aware of it and can question it. In this case, we trust that the decision is correct because it is backed by data.
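To make that a bit more concrete, here is a minimal sketch in Python. The story-point values are made up purely for illustration (remember, this never happened), and the names `past_etl_story_points` and `estimation_errors` are my own, not from any real tool. It computes the avg SP/ETL and then checks how far that single number would have been from each ETL we had already done.

```python
from statistics import mean, stdev

# Hypothetical story points for past ETL mappings (invented numbers,
# not real team data).
past_etl_story_points = [1, 2, 2, 3, 5, 8, 13, 21]

# The "KPI": total SPs divided by the number of ETLs.
avg_sp_per_etl = mean(past_etl_story_points)

# How spread out the real values are around that average.
spread = stdev(past_etl_story_points)


def estimation_errors(history, estimate):
    """Return actual minus estimate for every past ETL."""
    return [actual - estimate for actual in history]


errors = estimation_errors(past_etl_story_points, avg_sp_per_etl)
worst_miss = max(abs(error) for error in errors)

print(f"avg SP/ETL: {avg_sp_per_etl:.2f}")
print(f"standard deviation: {spread:.2f}")
print(f"worst miss when estimating with the average: {worst_miss:.2f}")
```

With numbers like these, the spread is about as large as the average itself, so the KPI looks precise while saying very little about the next ETL. That is the trap: the number is easy to compute and easy to believe.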

This is why having the wrong data is way worse than having no data at all.

Cheers,

