From the course: Python for Time Series Forecasting
Why set the datetime column as index? - Python Tutorial
- [Instructor] Temporal column to index. This means placing the datetime column on the leftmost side of the data frame. Why is this important? Because if you want to apply some specific functions, the datetime must be set as the index, provided it uniquely identifies each row, which it does in this case, since each row corresponds to one hour. Let's see some of the tricks. Right now, we can access the datetime column, and if we want to convert it to periods, we need to go through the dt accessor and then call to_period with Q. But what if we had the datetime column as the index? We can do that now with the function set_index, and let's save the result as df_idx. Now, if I want to apply the same function, I just need to access the index directly, without using dt, because with dt I get an error. The index already has the datetime properties, so I can call to_period with Q directly. You can clearly see how, with a datetime index, we can avoid the dt accessor: one key shortcut. Now, which other properties can we enjoy from this functionality? Let's say that, in the data frame that represents the energy demand in California, I want to see a chart of the total by quarter. This means resampling by quarter and calculating the sum on the column that contains the energy demand value. We execute. Here we get a small warning because we need to specify whether we want the date at the end or the start of the quarter. Let's put it at the start of the quarter, and here we get a Pythonic solution, a one-liner to get the total energy demand by quarter in California. Additionally, we could even plot this information by using the plot function with a line. Furthermore, because we have the datetime in the index, we can keep only the data through 2024, because 2025 is not complete, by using the loc property and slicing up to 2024. Now we execute, and here we can see a better representation. Do not worry about the functions I have applied here.
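As a rough sketch of the dt accessor shortcut described above, the snippet below converts timestamps to quarterly periods twice: once from a regular column and once from the index. The column names `datetime` and `demand` and the four sample rows are assumptions made up for illustration; they are not the course's dataset.

```python
import pandas as pd

# Hypothetical sample data; the course's energy-demand file is not reproduced here.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2024-01-15 08:00",
        "2024-05-20 09:00",
        "2024-08-03 10:00",
        "2024-11-30 11:00",
    ]),
    "demand": [10.0, 12.0, 11.0, 13.0],
})

# As a regular column, datetime methods are reached through the .dt accessor.
periods_from_column = df["datetime"].dt.to_period("Q")

# After set_index, the same method is available on the index itself.
df_idx = df.set_index("datetime")
periods_from_index = df_idx.index.to_period("Q")

# A DatetimeIndex has no .dt accessor, so df_idx.index.dt raises AttributeError.
index_has_dt = hasattr(df_idx.index, "dt")
```

Both calls produce the same quarterly periods; the only difference is whether the `.dt` step is needed.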
These functions will be explained in detail in future lessons. This is just a demonstration of how, because the data frame has a datetime index, you can enjoy many of these functionalities at once. In short, as a best practice, whenever you have a data frame with a datetime column that uniquely identifies each row, as it does in this case, make sure that it is always on the leftmost side of the data frame, which is the index.
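The workflow from this demonstration can be sketched roughly as follows: check that the datetime column is unique, set it as the index, total the demand by quarter, and keep only the data through 2024. The hourly data here is synthetic (a constant demand of 1.0 per hour), and the names `datetime` and `demand` are assumptions rather than the course's actual column names.

```python
import pandas as pd

# Synthetic hourly data spanning late 2024 and early 2025 (not the course dataset).
hours = pd.date_range("2024-10-01 00:00", "2025-02-28 23:00", freq="h")
df = pd.DataFrame({"datetime": hours, "demand": 1.0})

# Only use the column as the index if it uniquely identifies each row.
assert df["datetime"].is_unique
df_idx = df.set_index("datetime")

# Total demand per quarter; "QS" labels each bucket with the quarter's start date.
quarterly = df_idx["demand"].resample("QS").sum()

# Partial-string slicing on the datetime index keeps every row through the end of 2024.
quarterly_through_2024 = df_idx.loc[:"2024", "demand"].resample("QS").sum()

# Optional, requires matplotlib:
# quarterly_through_2024.plot(kind="line")
```

With this synthetic data, each quarterly total simply equals the number of hours in that quarter, which makes the result easy to check by hand.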