From the course: Problem Identification and Solution Design for Data Scientists
Focusing on inclusion, not exclusion
- [Instructor] I want to focus now on some specific issues to consider when speaking with subject matter experts. There are two broad topics I'd like to discuss with them: oddities in the data, and which variables are likely to be useful in the model. Let's start with oddities in the data. Now, you haven't done a full audit of the data yet. That belongs to the data understanding phase of CRISP-DM, and we're not there yet. But by the time you have your first serious sit-down with your SME, you've likely seen some data. Plus, you'll meet with them again after the data understanding phase. So what should you talk about? You're looking for odd little patterns that you, the project sponsor, and your IT contacts can't explain, but that you think the SME might be able to explain from their point of view. I had a client nickname this the Quirk Report once, and the name stuck; I've used it for years. If you have eliminated every other method to resolve the mysteries…