Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Announcements
June 24, 2025

Eventual Raises $30M to Build the Future of Data

December 24, 2025

Knowledge curation (not search) is the AI big data problem

The next context engineering breakthrough looks a lot less like Google (Information Retrieval) and a lot more like Wikipedia (Knowledge Curation).

Engineering
December 15, 2025

How We Use AI Coding Agents

Our engineering team's best practices for working with AI coding agents.

Case Studies
Engineering
December 11, 2025

How Sourcetable Built the World's First AI Spreadsheet with Daft

Sourcetable CTO Andy Grosser discusses their data infrastructure choices and why reliability and scale drove their architecture decisions.

Team
December 3, 2025

Why I Joined Eventual

Sam Stokes on his decision to join Eventual

Engineering
Case Studies
December 2, 2025

Processing 99% of U.S. Caselaw for Under $1 in the Common Pile

How Teraflop AI processed 7 million court documents and 40 million pages spanning 365 years of U.S. caselaw for under a dollar using Daft.

Engineering
December 2, 2025

Multimodal Structured Outputs: Evaluating VLM Image Understanding at Scale

Leveraging ablation for contrastive image understanding evaluation in Daft

Product
November 14, 2025

Prompting with DataFrames: Massively Parallel LLM Generation is Here

Structure and scale LLM inference with prompt, a new function for Daft DataFrames.

Engineering
November 12, 2025

Agentic systems are just query engines for unstructured data

A systems engineer’s view of the new AI stack

Product
November 7, 2025

Fall 2025 Review: Daft Open Source Updates

Highlights from new AI Functions, updated UDFs, and daft.File upgrades.

Engineering
November 4, 2025

Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale

A new inference backend that maximizes batch inference throughput.

Tutorials
October 29, 2025

Simplifying Voice AI Analytics with Daft: Transcription, Summaries, and Embeddings at Scale

How Daft simplifies voice AI analytics pipelines for meeting summaries, subtitle translations, and more.

Tutorials
October 22, 2025

Using PyTorch DataLoaders to Streamline Multimodal Data

Learn how PyTorch's DataLoader streamlines deep learning pipelines by efficiently loading and shuffling data in batches.
