Conference paper (published)

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Public Deposited

Creator
  • Laurençon, Hugo
  • Saulnier, Lucile
  • Wang, Thomas
  • Akiki, Christopher
  • Villanova del Moral, Albert
  • Le Scao, Teven
  • van Strien, Daniel
2022
