Pandas to Parquet


To write a pandas DataFrame to a Parquet file, use the to_parquet() method. Parquet is a columnar storage file format that originated in the Hadoop ecosystem and is supported by most data processing frameworks there, which makes it an efficient choice for storing and exchanging tabular data.

Here is a simple example:

python

import pandas as pd

# Create a DataFrame.
df = pd.DataFrame({
    'column1': ['value1', 'value2', 'value3'],
    'column2': ['value4', 'value5', 'value6']
})

# Write DataFrame to Parquet.
df.to_parquet('df.parquet')
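
to_parquet() also accepts optional parameters such as engine, compression, and index. Here is a minimal sketch of how they might be combined (the file name is just a placeholder):

python

import pandas as pd

df = pd.DataFrame({
    'column1': ['value1', 'value2', 'value3'],
    'column2': ['value4', 'value5', 'value6']
})

# Write with an explicit engine, snappy compression, and without the index column.
df.to_parquet('df_compressed.parquet', engine='pyarrow', compression='snappy', index=False)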

To read a Parquet file back into a pandas DataFrame, you can use the read_parquet() function.

python
df = pd.read_parquet('df.parquet')
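
Because Parquet stores data by column, read_parquet() can load just the columns you need, which saves memory and I/O on wide files. A small sketch, assuming the df.parquet file written above:

python

import pandas as pd

# Read only column1; the other columns are never loaded from disk.
df_subset = pd.read_parquet('df.parquet', columns=['column1'])
print(df_subset)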

Both functions rely on a Parquet engine, so you may need to install either pyarrow or fastparquet for them to work:

bash
pip install pyarrow

or

bash
pip install fastparquet
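
If both engines are installed, pandas chooses one automatically (the engine parameter defaults to 'auto', which tries pyarrow first and falls back to fastparquet). You can also pin the engine explicitly; a minimal sketch with placeholder file names:

python

import pandas as pd

df = pd.DataFrame({'column1': ['value1', 'value2', 'value3']})

# Force the fastparquet engine for both writing and reading.
df.to_parquet('df_fastparquet.parquet', engine='fastparquet')
df_back = pd.read_parquet('df_fastparquet.parquet', engine='fastparquet')
print(df_back)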


Conclusion:

Unogeeks is the No.1 IT Training Institute for Python Training. Anyone disagree? Please drop in a comment.

You can check out our other latest blogs on Python here – Python Blogs

You can check out our Best In Class Python Training Details here – Python Training

💬 Follow & Connect with us:

———————————-

For Training inquiries:

Call/Whatsapp: +91 73960 33555

Mail us at: info@unogeeks.com

Our Website ➜ https://unogeeks.com

Follow us:

Instagram: https://www.instagram.com/unogeeks

Facebook: https://www.facebook.com/UnogeeksSoftwareTrainingInstitute

Twitter: https://twitter.com/unogeeks

