From the course: Machine Learning with Data Reduction in Excel, R, and Power BI

Analyzing potential model dimensions

- [Instructor] Once we have an idea of what models we want to use on our data set, whether it's a clustering algorithm or an anomaly detection one, we want to think about what variables or fields we want to include in the model. Examples of methods for selecting fields to include or remove from a model include high correlation filters, low variance filters, and analysis of missing data points. In Excel, I've created a data set that includes all the weather data for Denver in a single year in the form of almost 400 records. I've also included several fields of interest. In this course, we created clustering and anomaly detection algorithms using two variables or dimensions in our models. However, we can include more of them. So how do we decide which ones to include? One way is through high correlation filters. Let's create a correlation table on the current dimensions of the model to see how this works. We'll select the data…
