What is pandas?
If you’ve ever worked with data, I’m sure that you’ve heard of pandas (a.k.a. the “Python Data Analysis Library”). For those of you who haven’t, pandas is a “fast, powerful, flexible and easy to use open-source data analysis and manipulation tool, built on top of the Python programming language.” Pandas is one of the most widely used tools for data wrangling (also known as data munging). It is open source, free to use (under a BSD license), and was originally created by Wes McKinney (Twitter | GitHub | LinkedIn).
Pandas has become popular because it makes it so easy to work with data using Python. One of its most popular features lets users take data (from a CSV file or a SQL database, for example) and create a Python object with rows and columns called a DataFrame. A DataFrame looks very similar to a table in common data tools (an Excel sheet, or a data frame in the R programming language), and it makes working with data a lot easier than looping over lists and/or dictionaries with for loops or list comprehensions.
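To make the DataFrame idea concrete, here is a minimal sketch of loading CSV data. The column names and values are made up for illustration, and the CSV lives in an in-memory string so the snippet runs on its own; with a real file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content (illustrative data, not from a real dataset).
csv_text = """name,city,sales
Ana,Lisbon,120
Ben,Austin,95
Chloe,Lyon,143
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)                  # rows and columns, much like a spreadsheet table
print(df.shape)            # (number of rows, number of columns)
print(df["sales"].mean())  # a column is a Series with built-in statistics
```

Each column of the DataFrame is a Series, which is why calls like df["sales"].mean() work without writing a loop.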
As of April 2020, it is estimated that the number of pandas users exceeds 10 million.
Pandas offers flexible and efficient data structures (DataFrame and Series being the most popular), streamlined data representation and an extensive set of features that make handling large datasets relatively easy.
Pandas provides streamlined forms of data representation, which make data easier to analyze and understand. Simpler data representations tend to make data science projects easier to get right.
An extensive set of features
What makes pandas so powerful? Its extensive set of features! With a large set of commands and features to draw on, analyzing data using Python has never been easier.
More work done with less writing
One of the best things about pandas is that you write less code yet achieve more. What would take multiple lines in Python without any supporting libraries can often be done in one or two lines with pandas. This significantly shortens the process of handling data, and the time you save can go into the analysis itself!
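As a rough illustration of "less code", the sketch below totals sales per city twice: once by hand with a dictionary, and once with pandas. The records are invented for the example.

```python
import pandas as pd

# Hypothetical records (illustrative data only).
records = [
    {"city": "Lisbon", "sales": 120},
    {"city": "Austin", "sales": 95},
    {"city": "Lisbon", "sales": 80},
]

# Plain Python: accumulate the totals by hand.
totals = {}
for row in records:
    totals[row["city"]] = totals.get(row["city"], 0) + row["sales"]

# pandas: the same aggregation as a single expression.
totals_pd = pd.DataFrame(records).groupby("city")["sales"].sum()

print(totals)
print(totals_pd)
```

The gap grows as the question gets more complicated: adding a second grouping key or a different statistic is a small change to the pandas line, but a larger rewrite of the loop.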
Flexible and customizable data
Pandas’ large feature set lets you customize, edit and pivot data however you need. This helps you get the most out of your data.
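Here is a small sketch of what pivoting and editing can look like in practice. The table and its column names are assumptions made for the example; it reshapes one row per city and quarter into one row per city, then adds a derived column.

```python
import pandas as pd

# Hypothetical long-format data: one row per (city, quarter).
df = pd.DataFrame({
    "city": ["Lisbon", "Lisbon", "Austin", "Austin"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales": [120, 80, 95, 110],
})

# Pivot: cities become rows, quarters become columns.
wide = df.pivot(index="city", columns="quarter", values="sales")

# Edit: add a derived column computed from the existing ones.
wide["total"] = wide["Q1"] + wide["Q2"]

print(wide)
```

DataFrame.pivot expects each (city, quarter) pair to appear only once; when there are duplicates that need to be aggregated, pivot_table is the usual alternative.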
Made for Python
Python has become one of the most sought-after programming languages in the world because of its extensive array of features and the productivity it offers. Since pandas is a Python library, you can use it alongside a number of other important and powerful Python libraries. Some of these include SciPy, NumPy, and matplotlib.
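A quick sketch of that interoperability, using NumPy as the example: NumPy functions can be applied directly to a pandas Series, and a Series can be converted to a plain NumPy array when another library expects one. The values are arbitrary.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0], name="x")

# NumPy universal functions accept a Series and return a Series,
# keeping the index and name.
roots = np.sqrt(s)

# Convert to a plain NumPy array for libraries that expect one.
arr = s.to_numpy()

print(roots)
print(type(arr))
```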
Thank you to the pandas core developers!
Please join me in thanking all the maintainers and core developers of pandas. Without you, none of this would be possible and it wouldn’t be so easy to work with data using Python!
Here are the top 10 maintainers and core developers. Check out their great work and reach out to thank them personally:
- Wes McKinney | GitHub | Twitter | LinkedIn
- Jeff Reback | GitHub | Twitter | LinkedIn
- Brock Mendel | GitHub | LinkedIn
- Joris Van den Bossche | GitHub | LinkedIn
- Tom Augspurger | GitHub | Twitter | LinkedIn
- gfyoung | GitHub
- Phillip Cloud | GitHub | Twitter | LinkedIn
- sinhrks | GitHub | Twitter
- Adam Klein | GitHub | Twitter | LinkedIn
- Matthew Roeschke | GitHub | LinkedIn
Claim your contribution to pandas on OpenTeams
Have you ever contributed to pandas? What about any other project? Made a PR? Written documentation? Answered questions? Regardless of how you contributed, go to pandas’ OpenTeams page by clicking here (or another project page) and claim your contribution. In doing so, you will get recognized for the great work you’ve done!
If you liked this, click the 👏 below so other people will see this here on Medium.
Thanks for reading!