Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Engineering
March 2, 2026

How We're Making Observability Better in Daft

Daft Observability Roadmap: metrics, OTEL integration, real-time dashboards, and DataFrame APIs for debugging and monitoring distributed pipelines.

Engineering
Product
February 26, 2026

Daft v0.7.4: Arrow-rs, OpenDAL, Flight Shuffle, and Better Metrics

Another killer release featuring arrow-rs migration, Apache OpenDAL support, Flight shuffle, better metrics, and Tencent Cloud COS integration.

Engineering
February 18, 2026

Daft v0.7.3: OTEL for Flotilla, Nightly Builds, and Lance NN Search

Daft v0.7.3 adds distributed observability with df.metrics via OTEL, nightly builds, and native Lance vector search.

Product
Engineering
February 17, 2026

daft.File: Work with Any File, Anywhere

Distributed Random Access for Audio, Video, Documents, and Code

Engineering
February 6, 2026

Tuning Daft's Distributed UDFs: Lessons from ByteDance

Engineering
January 12, 2026

Distributed Model Inference with Daft

A deep dive into Daft’s distributed execution engine, Flotilla, for multimodal data pipelines

Engineering
January 12, 2026

Introducing Dynamic Batching: Auto-Tuning for Daft Pipelines

I Got Tired of Tuning Batch Sizes, So I Made Them Tune Themselves

Engineering
December 15, 2025

How We Use AI Coding Agents

Our engineering team's best practices for working with AI coding agents.

Case Studies
Engineering
December 11, 2025

How Sourcetable Built the World's First AI Spreadsheet with Daft

Sourcetable CTO Andy Grosser discusses their data infrastructure choices and why reliability and scale drove their architecture decisions.

Engineering
Case Studies
December 2, 2025

Processing 99% of U.S. Caselaw for Under $1 in the Common Pile

How Teraflop AI processed 7 million court documents and 40 million pages spanning 365 years of U.S. caselaw for under a dollar using Daft.

Engineering
December 2, 2025

Multimodal Structured Outputs: Evaluating VLM Image Understanding at Scale

Leveraging ablation for contrastive image understanding evaluation in Daft

Engineering
November 4, 2025

Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale

A new inference backend that maximizes batch inference throughput.

Get updates, contribute code, or say hi.
GitHub Discussions Forums
The Distributed Data Community Slack