Pandas to Parquet
To write a pandas DataFrame to a Parquet file, use the DataFrame.to_parquet() method. Apache Parquet is a columnar storage file format that originated in the Hadoop ecosystem and is compatible with most of the data processing frameworks in that environment.
Here is a simple example:
import pandas as pd
# Create a DataFrame.
df = pd.DataFrame({
'column1': ['value1', 'value2', 'value3'],
'column2': ['value4', 'value5', 'value6']
})
# Write DataFrame to Parquet.
df.to_parquet('df.parquet')
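to_parquet() also accepts keyword arguments such as compression and index. The short sketch below is illustrative only (the gzip file name is just an example):
# Write with explicit options (a minimal sketch; the file name is illustrative)
df.to_parquet(
    'df_gzip.parquet',
    compression='gzip',  # default compression is 'snappy'; use None to disable it
    index=False          # do not store the DataFrame index in the file
)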
To read a Parquet file back into a pandas DataFrame, you can use the read_parquet()
function.
df = pd.read_parquet('df.parquet')
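Because Parquet is a columnar format, read_parquet() can also load just a subset of columns instead of the whole file. A brief sketch using the column names from the example above:
# Read only 'column1'; the remaining columns are not loaded
df_subset = pd.read_parquet('df.parquet', columns=['column1'])
print(df_subset)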
You may need to install pyarrow or fastparquet, since one of these libraries is required as the engine for the functions above:
pip install pyarrow
or
pip install fastparquet
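Both functions also take an engine parameter if you want to pick the library explicitly rather than relying on the default 'auto' (which tries pyarrow first, then fastparquet). A minimal sketch, assuming pyarrow is installed:
# Explicitly select the Parquet engine instead of letting pandas choose
df.to_parquet('df.parquet', engine='pyarrow')
df = pd.read_parquet('df.parquet', engine='pyarrow')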