Up & Running with Polars
Impress your colleagues as you become a Polars pro with this comprehensive course enjoyed by over 2,000 learners to date!
Hi - I'm Liam and I'm the best person in the world to learn Polars with. In this course I show you how to take advantage of Polars - the fast-growing open source dataframe library that is becoming the go-to choice for data scientists in Python. I have been a Polars contributor for two years, with a focus on making Polars accessible to new users, and I was one of the first people to put Polars into real-world production. I keep this course up to date with new releases of Polars - currently version 1.16.
"A thorough introduction to Polars" - Ritchie Vink, creator of Polars
"Thank you for your great work with this course - I've optimized some code thanks to it already!" - Maiia Bocharova
"5 stars based on the value this course added to my everyday working life, and for the quality of the instructor, Liam Brannigan." - Michael Purtell
The course is for data scientists who have some familiarity with a dataframe library like Pandas but who want to move to Polars because it is easier to write and faster to run. The materials are Jupyter notebooks that examine each topic in depth, and each notebook comes with a set of exercises to help you develop your understanding of the core concepts. For many key topics this course is the only documentation available to learners, drawing on my hundreds of hours working with the Polars source code.
The course introduces the syntax of Polars and shows you the many ways that Polars allows you to produce queries that are easy to read and write. However, the course also delves deeper to help you understand and exploit the hidden algorithms that drive the outstanding performance of Polars in the real world.
By the end of the course you will have optimised ways to:
- load and transform your data from CSV, Excel, Parquet, cloud storage or a database
- visualise your outputs with Matplotlib, Seaborn, Plotly, hvPlot & Altair
- prepare your data for machine learning pipelines with sklearn
- run your analysis in parallel
- understand optimal patterns for building queries
- work with larger-than-memory datasets
- carry out aggregations on your data
- combine your datasets with joins and concatenations
- work with nested dtypes including lists and structs
- optimise the speed and memory usage of your queries
- work with string and categorical data