Welcome to the Daft blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Daft Extensions Featuring daft-h3: Native Rust Performance, Community Owned
Engineering
May 4, 2026

Daft now supports native extensions via Apache Arrow's C Data Interface. daft-h3 is the first community extension — 9 Rust-native H3 geospatial functions, 3–16x faster than Python UDFs.

Daft v0.7.10: 30 contributors, 41 new features, distributed asof joins
Engineering
May 2, 2026

30 contributors shipped Daft v0.7.10 — the most participation in any Daft release to date. The result: 41 new features and functions across distributed joins, duplicate detection, temporal arithmetic,

Audio transcription at scale with daft.AudioFile
Engineering
April 29, 2026

How to transcribe thousands of audio files with Whisper using daft.AudioFile — handling resampling, silence splitting, and worker-resident model loading without the boilerplate.

Image Embeddings: Tutorial & Examples
Engineering
April 27, 2026

Learn about the concept of image embeddings, their various use cases, and best practices for handling them in data processing workflows.

Multimodal Embeddings: Tutorial & Examples
Engineering
April 15, 2026

Learn multimodal embedding techniques for cross-modal search, recommendation systems, and content moderation applications.

Daft v0.7.9: Temporal Arithmetic, Video Frame Decoding, and Native UUID
Engineering
April 13, 2026

Migrating ETL workloads from Spark means hitting gaps in date arithmetic — functions like `date_add`, `date_diff`, and epoch conversions that Spark users take for granted. Daft v0.7.9 closes that gap

Daft v0.7.5: A Plugin System, 5x Faster Parquet, and a Real-Time Query Debugger
Engineering
March 11, 2026

Native extensions via a stable C ABI, a live query dashboard, and 2-5x faster Parquet reads on nested types.

How We're Making Observability Better in Daft
Engineering
March 2, 2026

Daft Observability Roadmap: metrics, OTEL integration, real-time dashboards, and DataFrame APIs for debugging and monitoring distributed pipelines.

Daft v0.7.4: Arrow-rs, OpenDAL, Flight Shuffle, and Better Metrics
Engineering
Product
February 26, 2026

Daft v0.7.4 completes its arrow-rs migration, adds Apache OpenDAL storage support, Flight shuffle for Flotilla, and a full observability stack.
