Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Daft makes it easy to express these pipelines end-to-end, while seamlessly scaling them up to handle massive workloads.

Learn how to achieve near-100% GPU utilization when processing millions of text documents with Qwen3 embeddings.

A Streaming Solution

An adventure in AI and data engineering to analyze developers across GitHub.

Learn how Daft integrates with DeepSeek SmallPond 3FS to deliver faster file access and efficient data handling for modern workloads.

A SQL API that lets users interact with their data in a new but familiar way. Learn how Daft-SQL brings fast, scalable querying to multimodal workloads, helping teams explore large datasets efficiently with a distributed engine.

Discover how Daft reads Delta Lake tables efficiently, giving teams fast access to large datasets and seamless integration into data workflows.

Learn how adversarial file reading speeds up data ingestion at scale, enabling fast conversion from thousands of CSVs into efficient Parquet files.

This guide shows how Apache Parquet boosts read performance, lowers storage use, and supports efficient workflows for large analytical datasets.