I have a pandas DataFrame holding more than a million records. One of its columns is a datetime. A sample of my data looks like this:
time,x,y,z
2015-05-01 10:00:00,111,222,333
2015-05-01 10:00:03,112,223,334
...
I need to efficiently get the records that fall within a specific time period. The following naive approach is very time-consuming:
new_df = df[(df["time"] > start_time) & (df["time"] < end_time)]
I know that in a DBMS such as MySQL, indexing the time field makes it efficient to fetch records for a given time period.
My questions are:
- Does indexing in pandas, such as
df.index = df.time
make the slicing faster?
- If the answer to the first question is "No", what is the common, efficient way to get the records within a specific time period in pandas?
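To make the comparison concrete, here is a minimal sketch of the two variants in question. The synthetic data, the 3-second spacing, the row count, and the `start_time` / `end_time` values are assumptions chosen for illustration, not my real data, and the sketch only shows the two selection styles rather than measuring them.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumed: one row every 3 seconds).
n = 100_000
df = pd.DataFrame({
    "time": pd.date_range("2015-05-01 10:00:00", periods=n,
                          freq=pd.Timedelta(seconds=3)),
    "x": np.arange(n),
    "y": np.arange(n) + 1,
    "z": np.arange(n) + 2,
})

# Hypothetical period of interest.
start_time = pd.Timestamp("2015-05-01 12:00:00")
end_time = pd.Timestamp("2015-05-01 13:00:00")

# Variant 1: boolean mask over the column (compares every row).
by_mask = df[(df["time"] > start_time) & (df["time"] < end_time)]

# Variant 2: use the time column as a sorted DatetimeIndex,
# then select the period with label-based slicing.
indexed = df.set_index("time").sort_index()
by_index = indexed.loc[start_time:end_time]
```

Note that label slicing with `.loc` includes both endpoints, while the mask above uses strict inequalities, so the two results differ by the boundary rows. Timing each variant with `timeit` on the real data would show whether the index actually helps.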