Asking Great Data Science Questions
One of the most important parts of working in a data science team is discovering great questions. To ask great questions you have to understand critical thinking. Critical thinking is not about being hostile or disapproving. It’s about finding the critical questions.
These questions will help you shake out the bad assumptions and false conclusions that can keep your team from making real discoveries.
2. Apply Critical Thinking
Harness the power of questions
Your team needs to be comfortable in a world of uncertainty, arguments, questions, and reasoning. When you think about it, data science already gives you a lot of the information. You’ll have the reports that show buying trends. There will be terabytes of data on product ratings. Your team needs to take this information and ask the interesting questions that create valuable insights.
A good question will challenge your thinking. It’s not easily dismissed or ignored. It forces you to unravel what you already neatly understood. It requires a lot more work than just listening passively.
Pan of gold
The critical in critical thinking is about finding the critical questions. These are the questions that might chip away at the foundation of the idea.
It’s about your ability to pick apart the conclusions that are part of an accepted belief.
Here’s an example:
At the end of the month, the data analysts ran a report that showed a 10% increase in sales. It’s very easy here to make a quick judgment. The lower prices encouraged more people to buy shoes. The higher shoe sales made up for the discounted prices. It looks like the promotions worked. More people bought shoes and the company earned greater revenue. Many teams would leave it at that. Your data science team would want to apply their critical thinking. Remember, it’s not about good or bad; it’s about asking the critical questions. How do we know that the jump in revenue was related to the promotion? Maybe if you hadn’t had the promotion, the same number of people would have bought shoes.
What data would show a strong connection between the promotion and sales? You can even ask essential questions. Do your promotions work? Would the same number of people have bought shoes? Everyone assumes that promotions work. That’s why many companies have them. Does that mean that they work for your website? These questions open up a whole new area for the research lead. When you just accept that promotions work, everything is easy. These worked, so let’s do more promotions. Instead, the research lead has to go in a different direction.
How do we show that these promotions work? Should we look at the revenue from the one-day event? Did customers buy things that were on sale? Was it simply a matter of bringing more people to the website? This technique is often called panning for gold. It’s a reference to an early mining technique in which miners would sift through sand looking for gold. The sand here is all the questions that your team asks. The research lead works with the team to find the gold nuggets that are worth exploring. The point of panning for gold is that you’ll have a lot of wasted material.
There’ll be a lot of sand for every nugget of gold. It takes patience to sift through that many questions. Don’t be afraid to ask the big whys.
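One of the promotion questions above, whether the one-day event stands out from ordinary days at all, can be sketched in a few lines. This is only a minimal illustration: the revenue figures, the variable names, and the `lift_vs_baseline` helper are all hypothetical and do not come from any real report. Even a large gap only says the day was unusual. It does not say the promotion caused it.

```python
from statistics import mean, stdev

# Hypothetical daily revenue (in dollars) on ordinary days, plus the promotion day.
baseline_days = [10200, 9800, 10500, 9900, 10100, 10400, 9700]
promo_day = 11000

def lift_vs_baseline(baseline, observed):
    """Return (relative lift, z-score) of one observed day against baseline days."""
    avg = mean(baseline)
    spread = stdev(baseline)          # sample standard deviation of ordinary days
    lift = (observed - avg) / avg     # how far above the average, as a fraction
    z = (observed - avg) / spread     # how unusual the day is versus normal variation
    return lift, z

lift, z = lift_vs_baseline(baseline_days, promo_day)
print(f"lift: {lift:.1%}, z-score: {z:.2f}")
```

A result like this is a starting point for more questions, such as whether customers actually bought the items that were on sale.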
Focus on reasoning
That’s why a key part of critical thinking is understanding the reasoning behind these ideas. Reasoning is the beliefs, evidence, experience, and values that support conclusions about the data. It’s important to always keep track of everyone’s reasoning when working on a data science team.
Everyone on your team should question their own ideas. Everyone should come up with interesting questions and explore the weak points in their own arguments.
A University of California physicist named Richard Muller spent years arguing against global climate change. Much of his work was funded by the gas and oil industry.
Later, his own research found very strong evidence of global temperature increases. He concluded that he was wrong and that humans are to blame for climate change. Muller saw that the facts against him were too strong to ignore, so he changed his mind. He didn’t do it in a quiet way or seem ashamed. Instead, he wrote a long op-ed piece in the New York Times that outlined his initial arguments and why his new findings showed that he was wrong. That’s how you should apply critical thinking in your data science team.
Run question meetings
This is sometimes called a question-first approach. These meetings are about creating the maximum number of questions.
Identify question types
If you run an effective question meeting, then you’ll likely get a lot of good questions. That’s a good thing. Remember, you want your team to be panning for gold. They should be going through dozens of questions before they find a few that they want to explore. The more ideas you can expose, the better. Then you can decide which ones are best to explore. Just like the early miners who panned for gold, you want to be able to sort out the gold from the sand. You want to know how to separate good questions from those that you can leave behind.
You don’t want your team asking too many open questions. It’ll make everyone spend too much time questioning and not enough time sorting through the data. On the flip side, you don’t want the team asking too many closed-ended questions. Then the team will spend too much time asking smaller, easier questions without looking at the big picture. Once you’ve identified whether your question is open or closed, you’ll want to figure out if it’s essential. When it’s essential, it gets to the essence of an assumption, idea, or challenge.
Organize your questions
Below them, you can use yellow notes for non-essential questions. Remember that these are questions that address smaller issues. They’re usually closed questions with a quicker answer. Finally, you can use white or purple stickies for results. These are little data points that the team discovered which might help address the question. There are five benefits to having a question wall. This will help your team stay organized and even prioritize their highest-value questions.
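If your team keeps its question wall in a shared file instead of on a physical wall, the same organization can be sketched as a small data structure. This is a rough sketch under assumptions: the `Note` and `QuestionWall` names, the `KINDS` labels, and the sample notes are all hypothetical, and the sticky-note colors are left out on purpose.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: three kinds of notes, following the wall described above.
KINDS = ("essential", "non_essential", "result")

@dataclass
class Note:
    text: str
    kind: str              # one of KINDS
    is_open: bool = False  # open vs. closed question; not meaningful for results

@dataclass
class QuestionWall:
    notes: List[Note] = field(default_factory=list)

    def add(self, text: str, kind: str, is_open: bool = False) -> Note:
        """Add a note to the wall, rejecting kinds the wall does not know."""
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind}")
        note = Note(text, kind, is_open)
        self.notes.append(note)
        return note

    def by_kind(self, kind: str) -> List[Note]:
        """Return the notes of one kind, in the order they were added."""
        return [n for n in self.notes if n.kind == kind]

wall = QuestionWall()
wall.add("Do our promotions work?", "essential", is_open=True)
wall.add("Did customers buy things that were on sale?", "non_essential")
wall.add("Sales rose 10% in the monthly report", "result")

for kind in KINDS:
    print(kind, [n.text for n in wall.by_kind(kind)])
```

Grouping by kind keeps the essential questions visible, which is the main reason to have the wall in the first place.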
Create question trees
Remember, data science uses the scientific method to explore your data. That means that most of your data science will be empirical. Your team will ask a few questions, then gather the data, then react to the data and ask a series of new questions.
When you use a question tree, it will reflect how the team has learned. At the same time, it will show the rest of the organization your progress.
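A question tree like the one described above can be sketched as a simple nested structure, where each answer the team reacts to becomes a new branch. This is a hypothetical illustration: the `QuestionNode` class and its `ask` and `outline` methods are made-up names, and the sample questions are borrowed from the promotion example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QuestionNode:
    text: str
    children: List["QuestionNode"] = field(default_factory=list)

    def ask(self, text: str) -> "QuestionNode":
        """Attach a follow-up question under this one and return it."""
        child = QuestionNode(text)
        self.children.append(child)
        return child

    def outline(self, depth: int = 0) -> List[str]:
        """Return the tree as indented lines, two spaces per level."""
        lines = ["  " * depth + self.text]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines

root = QuestionNode("Do our promotions work?")
revenue = root.ask("Should we look at the revenue from the one-day event?")
revenue.ask("Did customers buy things that were on sale?")
root.ask("Was it simply a matter of bringing more people to the website?")

print("\n".join(root.outline()))
```

The indented outline is one way to show the rest of the organization which questions led to which follow-ups.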
Find new questions
You want to focus your questions on six key areas. The six key areas are questions that clarify key terms, root out assumptions, find errors, see other causes, uncover misleading statistics, and highlight missing data. If you discuss these six areas, then you’re bound to come up with at least a few questions.
4. Challenge the Team
Clarify key terms
You need to carefully look at the reasoning behind your ideas and then question it. That way you’ll have a better understanding of everyone’s ideas.
Root out assumptions
Remember that correlation doesn’t necessarily mean causation. The key is to focus on identifying where the assumptions are. An assumption that’s accepted as fact might cause a chain reaction of flawed reasoning. Also keep in mind that an assumption isn’t just an error to be corrected. It’s more like an avenue to explore.
There are key phrases that you might want to clarify. There are also assumptions that might connect incorrect reasoning to false conclusions. Once you peel back these assumptions and clarify the language, you should be left with the bare reasoning. In many ways, now you’re asking more difficult questions.
In fact, your data science team might be one of the few groups in the organization that’s interested in questioning well-established facts. When you’re in a data science team, each time you encounter a fact, you should start with three questions. Should we believe it? Is there evidence to support it? How good is the evidence? Evidence is well-established data that you can use to prove a larger fact. Still, you shouldn’t just think of evidence as proving or disproving the facts. Instead, try to think of the evidence as being stronger or weaker.
The important thing to remember is that facts are not always chiseled in marble. Facts can change as the evidence gets stronger or weaker. When you’re working in a data science team, don’t be afraid to question the evidence. Often it will be a great source of new insights.
See the other causes
It’s easy to say that correlation doesn’t imply causation. It’s not always easy to see it in practice. Often you see cause and effect and there’s no reason to question how they relate. Sometimes it’s difficult to see that an outcome that happens after something is different from an outcome that happens because of something.
If they don’t make sense, then you should investigate the connection. Some of your best questions might come from eliminating these rival causes and finding an actual cause.
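One way to probe a rival cause is to compare like with like. The sketch below is an assumption-heavy illustration, not a method taken from any real analysis: the daily records are invented, holidays are a made-up rival cause, and `promo_gap` is a hypothetical helper. It compares promotion days with other days overall, then again within holidays and within ordinary days.

```python
from statistics import mean

# Hypothetical daily records: (promotion running, holiday, units sold).
days = [
    (True, True, 120), (True, True, 118), (True, True, 121), (True, False, 101),
    (False, True, 119), (False, False, 100), (False, False, 99), (False, False, 102),
]

def promo_gap(records):
    """Mean units sold on promotion days minus mean on non-promotion days."""
    with_promo = [units for promo, _, units in records if promo]
    without_promo = [units for promo, _, units in records if not promo]
    return mean(with_promo) - mean(without_promo)

overall_gap = promo_gap(days)                               # all days together
holiday_gap = promo_gap([d for d in days if d[1]])          # holidays only
ordinary_gap = promo_gap([d for d in days if not d[1]])     # ordinary days only

print(f"overall gap:  {overall_gap:.1f}")
print(f"holiday gap:  {holiday_gap:.1f}")
print(f"ordinary gap: {ordinary_gap:.1f}")
```

In this invented data, most promotions happen to fall on holidays, so the overall gap looks large while the gap within each group nearly disappears. That pattern would suggest the holiday, not the promotion, deserves a closer look.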
Uncover misleading statistics
When you’re in a question meeting, your team should closely evaluate statistical data. They should question the data and be skeptical of statistics from outside the team. A person on your data science team suggests that as many as half of your customers run with their friends. The best way to sort this out is to separate the statistic from the story. With the running shoe website, you had two stories. One says that customers like their friends to save money. The other says that customers run with their friends.
Highlight missing information
The first thing is to try to understand the reason that information is missing. Maybe there was no time or limited space in their report.
5. Avoid Obstacles
Overcome question bias
Questions are at the heart of getting insights from your data. It takes courage to ask a good question.