Vision Data Foundry
We build high-fidelity environments, evals, and supervised datasets to push the frontier of artificial visual intelligence.
Textual reasoning has been riding an exponential. Domains like SWE, law, and mathematics have entered a new era of productivity.
Beyond the incredible work from the model labs, this progress is owed to an equally impressive mobilization at the data layer. Great companies have rallied to push this modality to the limit, hand-crafting learning signal with the same vigor and attention to detail that goes into training models.
Visual intelligence needs to have this moment.
Current vision benchmarks saturate fast, and exciting demos capture our imagination every day. Frontier systems can clearly climb this hill. Despite this, primitive vision capabilities and pixel-in-the-loop tasks are embarrassingly behind in real-world scenarios.
The problem boils down to a shallow data layer that's trying to feed a deep and wide modality. We exist to bridge this gap.
Our team experienced this while building Chunkr. So far our work has lived in that realm, where we have partnered with great ML teams to push the limits of document AI. We are now expanding to cover more high-value vision use cases.



Products
The data we build runs from core primitives like spatial analysis and form understanding to long-horizon tasks like frontend engineering and solving visual puzzles.
Our engineers take pride in delivering fresh, diverse data with meaningful learning signal. Every single data point is backed by in-house training and thorough QA.
Benchmarking
Public and private evaluations for measuring frontier vision capabilities.

Bespoke Services
Beyond our off-the-shelf offerings, we frequently embed with ML teams as research partners to build custom supervised datasets, helping them unlock targeted vision capabilities that fit their systems.
Process
Alignment
Co-design a schema, distribution, and annotation guide, and surface edge cases.
Tooling
Stand up the custom QA, annotation tooling, and ML-assist systems.
Golden set
Deliver a small golden dataset so you can confirm the recipe before volume.
Scale
Ramp to full production, shaping the distribution to your spec.
Delivery
Stream in data as it clears QA, with ML-backed evaluation and learnings.
Work with us
We partner with startups, enterprises, and model labs, crafting data for a variety of architectures and training paradigms.
- 01
Set up a short call or talk async over email to dig into the vision capabilities you want to unlock.
- 02
We point you to off-the-shelf data packets with samples and benchmarks, or draft a proposal for a custom engagement.
- 03
Once fit is confirmed, we handle quick licensing paperwork and get started. Everything happens through our platform and marketplace.
Careers
We are hiring engineers and researchers who care about vision, data, and model behavior.
If this excites you, send an email to careers@floatingpoint.ai with two or three sentences on why you'd be a good fit. Include your resume and proof of work: anything that shows what you've built.
The best way to get our attention is to pitch a single-task eval that stress-tests frontier vision capabilities.

