Quick start to Pandas
Prerequisite: NumPy
What is Pandas?
- Pandas is a third-party package that has to be installed explicitly
- Built on top of the NumPy package
- Used for data analysis
- More feature-rich than NumPy for working with labeled, tabular data
- Open source and free
- Can be installed from the Python Package Index (pip install pandas) or through Conda (conda install pandas)
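Once the package is installed, a quick way to confirm that it imports correctly is to print its version. This is only a minimal sketch; the version numbers printed depend on your environment.

```python
import numpy as np
import pandas as pd

# Both packages expose their version as a string.
print(f"pandas version = {pd.__version__}")
print(f"numpy version = {np.__version__}")
```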
Pandas has two main data structures: Series and DataFrame.
Series:
- A Series is a one-dimensional labeled array
- Builds an index array to store the labels and a values array to store the data
- Internally uses a NumPy array
- Value mutable but not size mutable
eg:
import pandas as pd
s1 = pd.Series([10, 20, 30, 40, 50])
print(s1)
print(f"shape = {s1.shape}")
print(f"size = {s1.size}")
print(f"dimensions = {s1.ndim}")
print(f"data type = {s1.dtype}")
print(id(s1))
print(id(s1[0]))
print(type(s1[0]))
output:
0    10
1    20
2    30
3    40
4    50
dtype: int64
shape = (5,)
size = 5
dimensions = 1
data type = int64
1751034379184
1751154510320
<class 'numpy.int64'>
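The points about the index array, the values array and value mutability can be sketched as follows. This is a small illustration, not a complete tour of the Series API; the variable names `index_labels` and `values` are made up for this example.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

# The labels and the data are held separately.
index_labels = list(s.index)   # default integer labels 0..4
values = s.to_numpy()          # the data as a NumPy array

# Values are mutable: overwrite the element with label 2.
s.loc[2] = 99

print(index_labels)
print(type(values))
print(s.tolist())
print(len(s))  # changing a value does not change the length
```

Using `.loc` makes it explicit that the assignment goes through the index label rather than the position.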

DataFrame:
- A two-dimensional labeled data structure
- Stores the data in a table-like structure
- Organized in rows and columns
- Every column is a Series
eg:
import pandas as pd
patients = [
{"name": "p1", "bp": 80, "temperature": 37, "infected": 1},
{"name": "p2", "bp": 50, "temperature": 33, "infected": 0},
{"name": "p3", "bp": 100, "temperature": 34, "infected": 1},
{"name": "p4", "bp": 75, "temperature": 35, "infected": 0}
]
df = pd.DataFrame(patients)
print(type(df['name']))
print(df.describe())
print('-' * 50)
print(df.info())
output:
<class 'pandas.core.series.Series'>
               bp  temperature  infected
count    4.000000     4.000000   4.00000
mean    76.250000    34.750000   0.50000
std     20.564938     1.707825   0.57735
min     50.000000    33.000000   0.00000
25%     68.750000    33.750000   0.00000
50%     77.500000    34.500000   0.50000
75%     85.000000    35.500000   1.00000
max    100.000000    37.000000   1.00000
--------------------------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   name         4 non-null      object
 1   bp           4 non-null      int64
 2   temperature  4 non-null      int64
 3   infected     4 non-null      int64
dtypes: int64(3), object(1)
memory usage: 256.0+ bytes
None
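To illustrate that every column is a Series, here is a short sketch that pulls columns and a row out of a smaller version of the patients table. The names `bp`, `subset` and `first_row` are made up for this example.

```python
import pandas as pd

patients = [
    {"name": "p1", "bp": 80, "temperature": 37, "infected": 1},
    {"name": "p2", "bp": 50, "temperature": 33, "infected": 0},
]
df = pd.DataFrame(patients)

bp = df["bp"]                 # one column name -> Series
subset = df[["name", "bp"]]   # list of column names -> DataFrame
first_row = df.iloc[0]        # one row by position -> Series labeled by column names

print(type(bp).__name__)
print(type(subset).__name__)
print(first_row["name"])
print(bp.mean())
```

Selecting with a single name gives back a Series, while selecting with a list of names keeps the result as a DataFrame.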

A DataFrame comes with many attributes, such as:
- size
- ndim
- index
- values
- shape
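A minimal sketch of reading these attributes, using a small made-up table with two numeric columns:

```python
import pandas as pd

df = pd.DataFrame({"bp": [80, 50, 100, 75], "temperature": [37, 33, 34, 35]})

print(f"size = {df.size}")      # total number of cells (rows * columns)
print(f"ndim = {df.ndim}")      # number of axes
print(f"shape = {df.shape}")    # (rows, columns)
print(f"index = {df.index}")    # row labels
print(df.values)                # the data as a NumPy array
```

Attributes are read without parentheses, unlike methods such as `head()`.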
A DataFrame also comes with various methods, such as:
- head()
- tail()
- describe()
- info()
- pivot() and melt() (for reshaping; a DataFrame has no reshape() method)
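A short sketch of a few of these methods on the patients table from earlier. The names `first_two`, `last_two` and `mean_bp` are made up for this example.

```python
import pandas as pd

patients = [
    {"name": "p1", "bp": 80, "temperature": 37, "infected": 1},
    {"name": "p2", "bp": 50, "temperature": 33, "infected": 0},
    {"name": "p3", "bp": 100, "temperature": 34, "infected": 1},
    {"name": "p4", "bp": 75, "temperature": 35, "infected": 0},
]
df = pd.DataFrame(patients)

first_two = df.head(2)   # first 2 rows (head() defaults to 5)
last_two = df.tail(2)    # last 2 rows (tail() defaults to 5)

# describe() returns a DataFrame of summary statistics for the numeric columns.
mean_bp = df.describe().loc["mean", "bp"]

print(first_two)
print(last_two)
print(mean_bp)
```

Note that `info()` prints its report directly and returns `None`, which is why `print(df.info())` shows a trailing `None` in the output above.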
I hope this blog helped you understand the basics of Pandas.
Thank you for reading. If you find anything that needs correcting, please post it in the comments section.
Credit goes to: Amit Kulkarini