Awesome List of Datasets
PyTorch
: High quality training and validation data for AI applications
: ML data management platform
: Save time by creating and managing your training data, people, and processes in a single place
EleutherAI is a grassroots AI research group that aims to democratize and open-source AI research.
: The Pile is an 825 GiB diverse, open-source language modelling dataset that combines 22 smaller, high-quality datasets.