From raw multimodal data to training-ready datasets. In one pipeline, at any scale.

Process video, images, audio, and sensor data alongside structured metadata in a single dataframe.
Run GPU inference and embedding alongside CPU decoding and filtering in one pipeline. Daft handles the scheduling and batching; no glue code required.
Same operations you use in Pandas or Spark: filter, transform, aggregate, write. No new framework to learn.
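To make that concrete, here is a minimal sketch of one such pipeline. The bucket paths and column names (`url`, `label`) are placeholders, not a real dataset:

```python
import daft
from daft import col

# Read structured metadata from Parquet (path is a placeholder).
df = daft.read_parquet("s3://my-bucket/metadata.parquet")

# Transform: turn URL strings into a native image column (CPU download + decode).
df = df.with_column("image", col("url").url.download().image.decode())

# Filter: ordinary predicates over structured columns.
df = df.where(col("label") != "unknown")

# Aggregate: the same groupby/agg verbs as Pandas or Spark.
counts = df.groupby("label").agg(col("image").count())
counts.show()

# Write: materialize a training-ready dataset.
df.write_parquet("s3://my-bucket/curated/")
```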
[1] Native model operators
Embeddings, LLM extraction, and structured outputs as first-class operations. Plug in models from OpenAI, Hugging Face, or your own.
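The built-in operator names vary by Daft version, so the sketch below takes the bring-your-own-model route: a stateful UDF that loads a Hugging Face sentence-transformers model once per worker and embeds a text column. The model name and embedding size (float32, 384 for `all-MiniLM-L6-v2`) are assumptions of this example:

```python
import daft

# Stateful UDF: the model is loaded once in __init__, then reused for every batch.
# Embedding dtype/size match all-MiniLM-L6-v2; adjust for your own model.
@daft.udf(return_dtype=daft.DataType.embedding(daft.DataType.float32(), 384))
class EmbedText:
    def __init__(self):
        from sentence_transformers import SentenceTransformer
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, texts: daft.Series):
        # Daft hands the UDF a whole column batch, not single rows.
        return self.model.encode(texts.to_pylist()).tolist()

df = daft.from_pydict({"caption": ["a cat on a mat", "a dog in fog"]})
df = df.with_column("embedding", EmbedText(df["caption"]))
```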
[2] Multimodal column types
Images, video, audio, text, and embeddings as native column types. Decode, transform, and filter them like any other column.
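A short sketch of images as a column type. The URL is a placeholder, and `resize` is one of several image expressions; exact method names may shift between versions:

```python
import daft

df = daft.from_pydict({"url": ["https://example.com/cat.jpg"]})  # placeholder URL

# Decoding yields a column with Daft's native Image dtype.
df = df.with_column("image", df["url"].url.download().image.decode())
print(df.schema())

# Image expressions chain like any other column operation.
df = df.with_column("thumb", df["image"].image.resize(128, 128))
```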
[3] Local-to-production consistency
Define pipelines once. Run them on your laptop or scale across a cluster. Same code, no rewrites.
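In practice that means flipping the runner, not rewriting the pipeline. A sketch with a placeholder Ray address; run locally, you skip the runner call and Daft uses its default runner on your machine's cores:

```python
import daft

# On a cluster, select the Ray runner *before* building the pipeline.
# Locally, omit this line. (Address is a placeholder; needs `pip install "daft[ray]"`.)
# daft.context.set_runner_ray(address="ray://head-node:10001")

df = (
    daft.read_parquet("s3://my-bucket/data/*.parquet")  # placeholder path
    .where(daft.col("split") == "train")
)
df.collect()  # identical code on a laptop or a cluster
```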
[4] Managed UDF runtime
Automatic batching, retries, and error handling for model UDFs. Zero-copy execution powered by Apache Arrow.
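The runtime hands UDFs whole column batches backed by Arrow buffers rather than Python rows. A minimal sketch; how retries and batch sizes are configured depends on your Daft version, so none of that is shown here:

```python
import daft
import numpy as np

# The UDF receives a daft.Series (an Arrow-backed batch), not one row at a time.
@daft.udf(return_dtype=daft.DataType.float64())
def sigmoid(scores: daft.Series):
    arr = scores.to_arrow().to_numpy(zero_copy_only=False)
    return 1.0 / (1.0 + np.exp(-arr))

df = daft.from_pydict({"score": [-2.0, 0.0, 2.0]})
df = df.with_column("prob", sigmoid(df["score"]))
df.show()
```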
[5] Lower memory footprint
Run the same queries with 5x less memory than alternatives. Jobs that would OOM on Spark or Pandas just work.
[6] Built in Rust for speed
Daft's core is written in Rust. Decode video, run transforms, and join multimodal data at TB scale without paying Python overhead.
Tony Wang
Data @ Anthropic, PhD @ Stanford
Patrick Ames
Principal Engineer @ Amazon
Ritvik Kapila
ML Research @ Essential AI
Maurice Weber
PhD AI Researcher @ Together AI
Alexander Filipchik
Head of Infrastructure @ Atoms