awesome-training-data

Curated list of Awesome Training Data! (Data Labeling, Annotation, Discovery, Workflow, etc.)

Maintained by Diffgram

Contributions welcome!

Open Source Platforms

  • Diffgram Training Data (Data Labeling, Annotation, Workflow) for all Data Types (Image, Video, 3D, Text, Geo, Audio, more) at scale.
  • CVAT Computer Vision Annotation Tool

Training Data Integrity & Agent-Generated Data

  • DOS (dos-kernel) Trust kernel for fleets of AI agents. It verifies what an agent actually did from evidence the agent cannot forge (git ancestry, test results); its reward() verdict gates which agent trajectories may enter a training set, rejecting "resolved" claims that the evidence refutes.
  • Cleanlab Data-centric AI library for finding label errors and data quality issues in training sets.

Closed Source Platforms

Communities

Books

Blogs