From the course: Problem Identification and Solution Design for Data Scientists
Beware too much prototyping
- [Narrator] I've been in this business long enough to see different styles emerge, but one trend that has come with increasingly complex algorithms, and that I think we have to be cautious about, is rushing to an opaque, so-called black box modeling technique right away just to see how it goes. What I find problematic with this is that you learn almost nothing about the data this way. Remember what the experiment is for. It's simply to see if you can get the model to run at all, so manually playing around with a bunch of hyperparameters makes no sense at this stage, and we've already discussed that you can't trust measures of accuracy to be stable this early in the game. The approach that I would use would be to run a simple model like CART, and I really mean something as straightforward as CART, which is classification and regression trees, to see if it picks up a few variables. If you're not familiar with CART, I've got courses about decision tree algorithms in the library. Then make…
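To make the "run a simple model like CART and see if it picks up a few variables" idea concrete, here is a minimal sketch in plain Python. It is not the narrator's own code: the synthetic data, the function names (`gini`, `best_split`, `grow`), and the depth limit of 2 are all assumptions made for illustration. It grows a shallow classification tree using Gini impurity, the splitting criterion CART uses for classification, and reports only which variables the tree chose to split on. In practice you would more likely reach for a library implementation such as scikit-learn's `DecisionTreeClassifier`.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best binary
    split, or None if no feature has two distinct values."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def grow(rows, labels, depth, max_depth, used):
    """Recursively split the data, counting each chosen feature in `used`."""
    if depth == max_depth or gini(labels) == 0.0:
        return
    split = best_split(rows, labels)
    if split is None or split[2] >= gini(labels):
        return  # no split reduces impurity
    j, t, _ = split
    used[j] += 1
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    for part in (left, right):
        grow([r for r, _ in part], [y for _, y in part], depth + 1, max_depth, used)


# Hypothetical data: five variables, but only variables 0 and 2 drive the label.
rng = random.Random(0)
rows = [[rng.random() for _ in range(5)] for _ in range(200)]
labels = [1 if r[0] > 0.5 and r[2] > 0.3 else 0 for r in rows]

used = Counter()
grow(rows, labels, depth=0, max_depth=2, used=used)
print("variables picked up:", sorted(used))
```

Note that the sketch reports no accuracy figure and tunes nothing, which matches the point above: at this stage the useful output is which variables a simple, readable model latches onto, not a score.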