Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Engineering

March 11, 2026

Daft v0.7.5: A Plugin System, 5x Faster Parquet, and a Real-Time Query Debugger

Native extensions via a stable C ABI, a live query dashboard, and 2-5x faster Parquet reads on nested types.

Engineering

March 2, 2026

How We're Making Observability Better in Daft

Daft Observability Roadmap: metrics, OTEL integration, real-time dashboards, and DataFrame APIs for debugging and monitoring distributed pipelines.

Engineering

Product

February 26, 2026

Daft v0.7.4: Arrow-rs, OpenDAL, Flight Shuffle, and Better Metrics

Daft v0.7.4 completes its arrow-rs migration and adds Apache OpenDAL storage support, Flight shuffle for Flotilla, and a full observability stack.

Engineering

February 18, 2026

Daft v0.7.3: OTEL for Flotilla, Nightly Builds, and Lance NN Search

Daft v0.7.3 adds distributed observability with df.metrics via OTEL, nightly builds, and native Lance vector search.

Engineering

Product

February 17, 2026

Introducing daft.File: Work with Any File, Anywhere

daft.File brings lazy, distributed handling for audio, video, PDFs, and code to Daft DataFrames. One interface, local or remote.

Engineering

February 6, 2026

Tuning Daft's Distributed UDFs: Lessons from ByteDance

Learn from the ByteDance Volcengine LAS Team how to optimize Daft UDFs on Ray. Discover the formula for distributing data evenly across actors.

Engineering

January 12, 2026

Introducing Dynamic Batching: Auto-Tuning for Daft Pipelines

Manually tuning batch sizes is hard, so I implemented dynamic batching to never deal with it again.

Engineering

December 15, 2025

How We Use AI Coding Agents

Our engineering team's best practices for working with AI coding agents.

Engineering

Case Studies

December 11, 2025

How Sourcetable Built the World's First AI Spreadsheet with Daft

Sourcetable CTO Andy Grosser discusses their data infrastructure choices and why reliability and scale drove their architecture decisions.
