Pandas
Pandas is a Python library that provides data structures and tools for data analysis.
Loading CSV files
import pandas
df1 = pandas.read_csv("filename.csv")
df1
pandas.read_csv() returns a DataFrame.
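A minimal sketch to check the return type without needing a file on disk. The CSV text and the name df_example are made up for illustration; io.StringIO stands in for a real file path.

```python
import io
import pandas

# Hypothetical CSV content held in memory, so the example runs without a file.
csv_text = "ID,Name,City\n1,Ann,Oslo\n2,Bob,Rome\n"

# read_csv accepts a path or any file-like object; the first line becomes the header.
df_example = pandas.read_csv(io.StringIO(csv_text))

print(type(df_example))          # the DataFrame class
print(df_example.shape)          # (rows, columns)
print(list(df_example.columns))  # column labels taken from the header line
```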
CSV file without header
df = pandas.read_csv("csv_without_header.csv", header=None)
df.columns = ["col1", "col2",...]
df
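As a sketch of what header=None does: without a header row, pandas labels the columns 0, 1, 2, and so on. The names parameter of read_csv is an alternative to assigning df.columns afterwards. The data and column names below are made up for illustration.

```python
import io
import pandas

# Hypothetical headerless CSV content, held in memory.
raw = "1,Ann,Oslo\n2,Bob,Rome\n"

# With header=None the first line is treated as data and columns get integer labels.
no_header = pandas.read_csv(io.StringIO(raw), header=None)
print(list(no_header.columns))

# Alternative: supply the column names while reading.
named = pandas.read_csv(io.StringIO(raw), header=None, names=["ID", "Name", "City"])
print(list(named.columns))
```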
Setting index
df.set_index("ID")
# returns a new DataFrame with ID as the index; df itself is unchanged
df.set_index("ID", inplace=True)
# modifies df in place and returns None; no new DataFrame is created
The in-place form has a catch. By default (drop=True), ID is moved out of the columns and into the index, so if a different index is set later, the ID values are discarded. Passing drop=False keeps ID as a regular column as well:
df.set_index("ID", inplace=True, drop=False)
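A small sketch of the difference, using a made-up frame (people and its values are hypothetical). It uses the non-in-place form so each step can be inspected separately.

```python
import pandas

# Hypothetical data for illustration.
people = pandas.DataFrame(
    {"ID": [10, 20], "Name": ["Ann", "Bob"], "City": ["Oslo", "Rome"]}
)

# Default drop=True: ID leaves the columns and becomes the index.
dropped = people.set_index("ID")
# Setting another index replaces the ID index, so the ID values are gone.
rekeyed = dropped.set_index("Name")
print(list(rekeyed.columns))

# drop=False: ID is used as the index but also stays as a column.
kept = people.set_index("ID", drop=False)
kept_rekeyed = kept.set_index("Name")
print(list(kept_rekeyed.columns))
```

If the index has already been set with the default, df.reset_index() moves it back into the columns before another index is chosen.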
Accessing DataFrames
Using labels
# syntax df.loc[row_label_start:row_label_end, col_label_start:col_label_end]
# both ends of a label slice are included
df1.loc[2:3, "Address":"City"]
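A sketch of label-based slicing on a made-up frame (df_loc and its contents are hypothetical). The row labels here are 1 to 4, so 2:3 selects the rows labelled 2 and 3, not positions.

```python
import pandas

# Hypothetical data with explicit row labels 1..4.
df_loc = pandas.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cid", "Dee"],
        "Address": ["1 Elm St", "2 Oak St", "3 Ash St", "4 Fir St"],
        "City": ["Oslo", "Rome", "Lima", "Bern"],
        "Zip": ["0150", "00100", "15001", "3000"],
    },
    index=[1, 2, 3, 4],
)

# Label slices include both endpoints: rows 2 and 3, columns Address through City.
subset = df_loc.loc[2:3, "Address":"City"]
print(subset)
```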
Using integer positions
# syntax df1.iloc[row_index_start: row_index_end, col_index_start: col_index_end]
df1.iloc[2:3, 3:5]
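A sketch of position-based slicing with iloc on a made-up frame (df_pos is hypothetical). Positions start at 0 and the end of each slice is excluded, as with Python lists, so 2:3 is a single row and 3:5 is two columns.

```python
import pandas

# Hypothetical 5x6 frame; each cell holds row*10 + column so positions are easy to read.
df_pos = pandas.DataFrame(
    [[r * 10 + c for c in range(6)] for r in range(5)],
    columns=list("abcdef"),
)

# Row at position 2 only; columns at positions 3 and 4 ("d" and "e").
subset_pos = df_pos.iloc[2:3, 3:5]
print(subset_pos)
```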