In this video we benchmark the speed of several Python pandas alternatives on a large dataset. We look at four libraries: Dask, Modin, Ray, and Vaex. Pandas is a very popular library among data scientists who code in Python, and other libraries claim to be faster than it. We put them to the test to see which is the fastest!
Timeline:
00:00 Intro
00:30 Setup
03:05 Pandas
05:54 Ray
10:24 Dask
13:30 Modin
15:45 Vaex
18:45 Summary
Follow me on Twitch for live coding streams: www.twitch.tv/medallionstallion_
My other videos:
Speed Up Your Pandas Code: • Make Your Pandas Code Lightning Fast
Intro to Pandas video: • A Gentle Introduction to Pandas Data Analy...
Exploratory Data Analysis Video: • Exploratory Data Analysis with Pandas Python
Working with Audio data in Python: • Audio Data Processing in Python
Efficient Pandas Dataframes: • Speed Up Your Pandas Dataframes
YouTube: youtube.com/@robmulla?sub_confirmation=1
Discord: discord.gg/HZszek7DQc
Twitch: www.twitch.tv/medallionstallion_
Twitter: twitter.com/Rob_Mulla
Kaggle: www.kaggle.com/robikscube
#python #pandas #datascience #dataengineering