Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Product

March 23, 2026

GPU Inference with @daft.cls

Run GPU models on millions of rows without OOM. Real patterns from ByteDance, Essential AI, and more.

Team

December 3, 2025

Why I Joined Eventual

Sam Stokes shares why he joined Eventual, the company behind Daft, and what excites him about helping build our large scale data processing platform.

Engineering

December 2, 2025

Multimodal Structured Outputs: Evaluating VLM Image Understanding at Scale

Leveraging ablation for contrastive image understanding evaluation in Daft

Engineering

Case Studies

December 2, 2025

Processing 99% of U.S. Caselaw for Under $1 in the Common Pile

How Teraflop AI processed 7 million court documents and 40 million pages spanning 365 years of U.S. caselaw for under a dollar using Daft.

Product

November 14, 2025

Prompting with DataFrames: Massively Parallel LLM Generation is Here

Discover how Daft's prompt function revolutionizes LLM workflows with massively parallel context engineering on DataFrames.

Thought Leadership

November 12, 2025

Agentic systems are just query engines for unstructured data

Explores how agentic AI systems act as declarative query engines, revealing how reasoning and orchestration transform unstructured data.

Product

November 7, 2025

Fall 2025 Review: OSS Updates | UDFs, Functions, & daft.File

Daft Fall 2025: AI Functions, improved UDFs, faster vLLM inference, and new daft.File VideoFile subtype - plus Bigtable sink and Common Crawl loader.

Engineering

November 4, 2025

Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale

Learn how Dynamic Prefix Bucketing reduces LLM batch inference time, improves throughput, and unlocks faster multimodal processing at scale.

Tutorials

October 29, 2025

Simplifying Voice AI Analytics with Daft: Transcription, Summaries, and Embeddings at Scale

Build a Voice AI analytics pipeline with Daft and Faster-Whisper to convert raw audio into searchable transcripts, summaries, and embeddings at scale.

Tutorials

October 22, 2025

Using PyTorch DataLoaders to Streamline Multimodal Data

Learn how PyTorch's DataLoader streamlines deep learning pipelines by efficiently loading and shuffling data in batches.
