Using Pandas to Read a CSV File
01 Dec 2018
In my quest to get up to speed on Python, I came across a problem where I needed to read a CSV file and import it into a database. Below are the steps I took to install pandas as well as a quick example of how to use the package.
First, I created a virtual environment to use for the project from a terminal window:
python3 -m venv pandas
source pandas/bin/activate
Next, I installed pandas using pip:
pip install pandas
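A quick way to confirm the install worked is to import the package from the Python interpreter inside the activated environment. This is only a small sanity check that I am adding here as a sketch; it assumes the virtual environment from the previous step is still active.

```python
import pandas as pd

# If the import succeeds, pandas is available in the active environment.
# The version string shows which release pip installed.
print(pd.__version__)
```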
Below is a sample Python file that I created while playing with the pandas package. The code comments explain what the code does.
import pandas as pd
# Read a csv file from the internet and print out the dataframe
url_csv = 'https://vincentarelbundock.github.io/Rdatasets/csv/boot/amis.csv'
df = pd.read_csv(url_csv, index_col=0)
print(df.head())
# Iterate over all the rows and print out the value in the speed column
for index, row in df.iterrows():
    print(index, row['speed'])
# Read a csv file from the internet and print out the first 6 columns of the first 5 rows (head() returns 5 rows by default)
csv_url = 'http://vincentarelbundock.github.io/Rdatasets/csv/carData/MplsStops.csv'
df = pd.read_csv(csv_url, index_col='idNum')
print(df.iloc[:, 0:6].head())
# Read specific columns from the csv file and print out the first 5 rows from the dataframe
df = pd.read_csv(csv_url, index_col='idNum', usecols=['idNum', 'date', 'problem', 'MDC'])
print(df.iloc[:, 0:6].head())
# Get a tuple representing the dimensions of the frame (rows, columns)
print(df.shape)
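The same options can be tried without a network connection, since read_csv also accepts a file-like object. The sketch below is my own illustration, not part of the sample file above: the rows in csv_text are invented, and only the column names are borrowed from the example.

```python
import io

import pandas as pd

# Invented sample rows; only the column names mirror the example above
csv_text = """idNum,date,problem,MDC
a1,2017-01-01,suspicious,MDC
a2,2017-01-02,traffic,other
a3,2017-01-03,suspicious,MDC
"""

# StringIO wraps the text in a file-like object that read_csv can consume
df = pd.read_csv(
    io.StringIO(csv_text),
    index_col='idNum',
    usecols=['idNum', 'date', 'problem'],
)

# idNum becomes the index, so only date and problem count as columns
print(df.shape)

# Count how often each value appears in the problem column
problem_counts = df['problem'].value_counts()
print(problem_counts)
```

Note that the column named in index_col has to be listed in usecols as well, otherwise pandas cannot find it when building the index.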
Pandas turned out to be a very lightweight and easy-to-use library. I was up and running in under 10 minutes.
I found two very useful articles on using pandas at the links below: