Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

How we built, broke, and re-built our ASOF joins — 6x faster, half the memory of pandas, and scaled to a distributed cluster.

Native Extensions via Stable C ABI, Live Query Dashboard, and 2-5x faster Parquet Reads on Nested Types

Row-wise, async, generator, and batch UDFs in Daft — one decorator, zero boilerplate, local or distributed.

Daft User Defined Functions (UDFs) let you run custom Python inside a distributed DataFrame pipeline. Choose from row-wise, async, generator, and batch UDFs.

Daft Observability Roadmap: metrics, OTEL integration, real-time dashboards, and DataFrame APIs for debugging and monitoring distributed pipelines.

Daft v0.7.4 completes its arrow-rs migration, adds Apache OpenDAL storage support, Flight shuffle for Flotilla, and a full observability stack.

Daft v0.7.3 adds distributed observability with df.metrics via OTEL, nightly builds, and native Lance vector search.

daft.File brings lazy, distributed handling for audio, video, PDFs, and code to Daft DataFrames. One interface, local or remote.

Today, we're introducing updates to the Daft OSS governance model that define new roles for contributors and maintainers with expanded permissions.

Learn from the ByteDance Volcengine LAS Team how to optimize Daft UDFs on Ray. Discover the formula to evenly distribute data across actors.