Title: Apache Arrow and the Future of Data Frames
Speaker: Wes McKinney, Director, Ursa Labs
Date: July 8, 2020
In this talk I will discuss the background and motivation for the Apache Arrow project, which contains a columnar in-memory data standard and an expanding set of supporting libraries for a variety of programming languages. We will look at the relationship between data frame libraries and database systems and explore the ways in which analytics systems are likely to evolve to be more “Arrow-native” over the coming years.
Director, Ursa Labs
Wes McKinney is an open source software developer focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow, his current focus. He authored two editions of the reference book Python for Data Analysis. Wes is a Member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is the director of Ursa Labs, a not-for-profit development group focused on data science tools for Python and R powered by Apache Arrow, built in partnership with RStudio. Previously, he worked for Two Sigma, Cloudera, and AQR Capital Management, and he was co-founder and CEO of the startup DataPad.
Two Sigma Investments; ACM Practitioner Board
Larisa Sawyer is a software engineering manager and Vice President at Two Sigma Investments. Her educational background is in Computer Science and Applied Mathematics. The opportunity to blend math and CS drew her to the realm of finance. Her career began at investment banks, building algorithmic trading platforms. Larisa has been at Two Sigma for the past seven years, and has worked on distributed time series analysis and platform technologies to increase research productivity and collaboration. Larisa also serves on the ACM Practitioner Board, as well as the advisory board for Data Clinic, Two Sigma’s data and tech for good program that leverages employees’ data science skills and technological know-how to support charities and non-profits.