Massive Public Domain Database Opens Doors for AI Development

AI, Artificial Intelligence, Public Domain
  • 📰 WIRED

A new database containing millions of public domain books is being released, providing researchers and developers with a wealth of data for training AI models. The initiative aims to democratize access to high-quality training data, leveling the playing field for smaller players in the AI industry.

Around five times the size of the Books3 dataset, the Institutional Data Initiative's database includes a vast collection of public domain works spanning genres, decades, and languages. Classics from Shakespeare, Dickens, and Dante are featured alongside lesser-known texts like Czech math textbooks and Welsh dictionaries.

Greg Leppert, executive director of the Institutional Data Initiative, sees the project as a way to level the playing field for smaller AI players and individual researchers. He believes the public domain database, when combined with licensed materials, can be used to build powerful AI models. He compares it to the open-source operating system Linux: companies would still need additional data to differentiate themselves, but the initiative provides a foundational resource.

Microsoft, a supporter of the project, emphasizes its commitment to creating accessible data pools for AI startups that are managed in the public interest. Although Microsoft uses publicly available data in its own models, it does not necessarily plan to replace all proprietary data with public domain alternatives. OpenAI has also said it is delighted to support the project.




