Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Filter millions of files by path, size, and content type before opening any of them. Cheap operations first, expensive operations on the survivors.
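
As a rough illustration of that ordering, here is a minimal sketch in plain standard-library Python. It does not use Daft's API; the helper names (`cheap_filter`, `expensive_match`, `find_matches`), the `docs` subdirectory, and the 1 KiB size cap are all hypothetical choices made for the example. The metadata checks run first and never open a file, and only the survivors are read.

```python
import mimetypes
from pathlib import Path


def cheap_filter(root, subdir="docs", max_bytes=1024, content_type="text/plain"):
    """Yield paths that pass metadata-only checks; no file is opened here."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Path check: compare path components only.
        if subdir not in path.relative_to(root).parts:
            continue
        # Size check: a single stat call, no read.
        if path.stat().st_size > max_bytes:
            continue
        # Content-type check: guessed from the file name, not the bytes.
        guessed, _ = mimetypes.guess_type(path.name)
        if guessed != content_type:
            continue
        yield path


def expensive_match(path, needle):
    """Open and read the whole file; only survivors reach this step."""
    return needle in path.read_text(encoding="utf-8")


def find_matches(root, needle):
    """Run the cheap filters first, then the expensive check on what is left."""
    return [p.name for p in cheap_filter(root) if expensive_match(p, needle)]
```

Calling `find_matches(some_directory, "needle")` reads only the small text files under a `docs` folder; everything else is rejected from metadata alone.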

Daft's query dashboard now shows you exactly where time is going. Slow operators light up red, completed nodes turn green, and arrows trace the data flow through your pipeline. No more guessing which

Row-wise, generator, async, and stateful UDFs: one notebook, one dataset, runnable side by side.

Run GPU models on millions of rows without OOM. Real patterns from ByteDance, Essential AI, and more.

Turn any Python class into a distributed operator. Hold models, connections, and clients across rows with one decorator.

Row-wise, async, generator, and batch UDFs in Daft: one decorator, zero boilerplate, local or distributed.

Daft user-defined functions (UDFs) let you run custom Python inside a distributed DataFrame pipeline. Choose from row-wise, async, generator, and batch UDFs.

Daft v0.7.4 completes its arrow-rs migration, adds Apache OpenDAL storage support, Flight shuffle for Flotilla, and a full observability stack.

daft.File brings lazy, distributed handling for audio, video, PDFs, and code to Daft DataFrames. One interface, local or remote.