Harvard University's Institutional Data Initiative (IDI) has created a vast database of public-domain books, around five times the size of the Books3 dataset, to provide open access to high-quality training material for AI development. The initiative aims to level the playing field by giving smaller companies and individual researchers access to resources previously available only to tech giants. Microsoft, OpenAI, and Google are supporting the project, recognizing the value of accessible data for AI innovation.
Around five times the size of the notorious Books3 dataset that was used to train AI models like Meta’s Llama, the Institutional Data Initiative's database spans genres, decades, and languages, with classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries.
The Institutional Data Initiative has asked Google to work together on public distribution, but the details are still being hammered out. In a statement, Kent Walker, Google's president of global affairs, said the company was "proud to support" the project.