Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

GPU Inference with @daft.cls
Product
March 23, 2026

Run GPU models on millions of rows without OOM. Real patterns from ByteDance, Essential AI, and more.

Why I Joined Eventual
Team
December 3, 2025

Sam Stokes shares why he joined Eventual, the company behind Daft, and what excites him about helping build our large-scale data processing platform.

Multimodal Structured Outputs: Evaluating VLM Image Understanding at Scale
Engineering
December 2, 2025

Leveraging ablation for contrastive image understanding evaluation in Daft

Processing 99% of U.S. Caselaw for Under $1 in the Common Pile
Engineering
Case Studies
December 2, 2025

How Teraflop AI processed 7 million court documents and 40 million pages spanning 365 years of U.S. caselaw for under a dollar using Daft.

Prompting with DataFrames: Massively Parallel LLM Generation is Here
Product
November 14, 2025

Discover how Daft's prompt function revolutionizes LLM workflows with massively parallel context engineering on DataFrames.

Agentic systems are just query engines for unstructured data
Thought Leadership
November 12, 2025

Explores how agentic AI systems act as declarative query engines, revealing how reasoning and orchestration transform unstructured data.

Fall 2025 Review: OSS Updates | UDFs, Functions, & daft.File
Product
November 7, 2025

Daft Fall 2025: AI Functions, improved UDFs, faster vLLM inference, and the new daft.File VideoFile subtype, plus a Bigtable sink and Common Crawl loader.

Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
Engineering
November 4, 2025

Learn how Dynamic Prefix Bucketing reduces LLM batch inference time, improves throughput, and unlocks faster multimodal processing at scale.

Simplifying Voice AI Analytics with Daft: Transcription, Summaries, and Embeddings at Scale
Tutorials
October 29, 2025

Build a Voice AI analytics pipeline with Daft and Faster-Whisper to convert raw audio into searchable transcripts, summaries, and embeddings at scale.

Using PyTorch DataLoaders to Streamline Multimodal Data
Tutorials
October 22, 2025

Learn how PyTorch's DataLoader streamlines deep learning pipelines by efficiently loading and shuffling data in batches.
