Talks: Hacking `import` for speed: how we wrote a GPU accelerator for pandas

Saturday - May 18th, 2024 10:30 a.m.-11 a.m. in Ballroom BC

Presented by:

Description

Python’s import system is eminently hackable. Hacking it is often a tool of last resort, but it can be extremely powerful. In this talk, we’ll describe our ambitious effort to hack `import pandas` so that large parts of pandas run on the GPU using cuDF, a GPU DataFrame library.

We’ll cover the basics of import hacking and other tricks like Pythonic proxy patterns. We’ll show how we use these more dynamic features of Python to effectively accelerate any code that uses pandas, including third-party libraries. We’ll also get into the technical and social problems that currently necessitate these sophisticated solutions, and share some thoughts on solving them. It will be a story of successes, failures, wishes and tears, and excursions into exciting parts of Python many developers may not have encountered before!
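
To give a flavor of the two techniques named above, here is a minimal sketch of an import hook combined with a proxy module. It is not the cuDF implementation. Every name in it (`tinyframe`, `fast_backend`, `slow_backend`, `ProxyModule`, `RedirectFinder`) is hypothetical and invented for illustration. The only assumption about behavior is the general idea: try an accelerated backend first, and fall back to the original library when the accelerated one lacks a feature.

```python
import importlib.abc
import importlib.util
import sys
import types

calls = []  # records which backend served each call, for demonstration only


def _fast_total(xs):
    calls.append("fast.total")
    return sum(xs)


def _slow_total(xs):
    calls.append("slow.total")
    return sum(xs)


def _slow_mean(xs):
    calls.append("slow.mean")
    return sum(xs) / len(xs)


# Hypothetical in-memory backends. The "fast" one implements only total().
fast_backend = types.ModuleType("fast_backend")
fast_backend.total = _fast_total

slow_backend = types.ModuleType("slow_backend")
slow_backend.total = _slow_total
slow_backend.mean = _slow_mean


class ProxyModule(types.ModuleType):
    """Module whose missing attributes are looked up on fast, then slow."""

    def __getattr__(self, attr):
        # __getattr__ runs only when normal attribute lookup fails.
        if not attr.startswith("__"):
            for backend in (fast_backend, slow_backend):
                if hasattr(backend, attr):
                    return getattr(backend, attr)
        raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")


class RedirectFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Meta path finder that answers imports of one name with a ProxyModule."""

    def __init__(self, name):
        self.name = name

    def find_spec(self, fullname, path, target=None):
        if fullname == self.name:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the regular finders handle everything else

    def create_module(self, spec):
        return ProxyModule(spec.name)

    def exec_module(self, module):
        pass  # nothing to execute; attributes are resolved lazily


# Install the hook ahead of the default finders, then import as usual.
sys.meta_path.insert(0, RedirectFinder("tinyframe"))

import tinyframe  # resolved by RedirectFinder, not from disk

tinyframe.total([1, 2, 3])  # served by fast_backend
tinyframe.mean([2, 4])      # fast_backend lacks mean(), so slow_backend serves it
```

Code that writes `import tinyframe` needs no changes to pick up the faster path, which is the property that makes this approach attractive for third-party libraries.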

This talk is for the Pythonista interested in the import system and how to hack it for performance. It is also for developers interested in how to speed up, without code changes, the vast ecosystem built on top of libraries like NumPy and pandas.