Across many fields of science, researchers’ support for sharing data has increased. But given the potential cost and complexity, many are apprehensive about a new NIH policy. Science offers this guide as researchers prepare to plunge in.
(Figure source: Digital Science report, https://doi.org/10.6084/m9.figshare.21276984.v5)

Sharing data could also help curtail duplicative efforts to collect them. That could save time and money for smaller labs in particular, says Crystal Rogers, a cell and developmental biologist at the University of California, Davis.
“Maybe this policy will even the playing field,” she says. “It will democratize opportunities.”

Research policymakers have gradually expanded requirements to cover grants of all sizes, across all fields of science, and provided more specifics about when and how data should be shared.

Existing data can help researchers generate hypotheses, design clinical trials, and teach. And by pooling smaller data sets, scientists can conduct meta-analyses that can produce robust or intriguing findings, says Maryann Martone, a neuroscientist at the University of California, San Diego. One example: animal studies conducted in the 1990s on treatments for spinal cord injury. The results from the individual studies were inconsistent and never published. But a 2021 analysis of pooled data from 1125 animals produced a significant correlation: Animals with blood pressure levels within a certain window during spine surgery fared better, a finding that [...]

For the researchers who share their data, one benefit is increased citations to papers for which data are provided. Papers that provided a link to data gained 25% more citations on average than those that did not, according to a 2020 study of more than 50,000 articles in the PLOS and BMC journals.

Even as more funders expect grantees to provide data, a lack of professional rewards may be responsible for widespread noncompliance. Sharing typically doesn’t count for much in tenure and promotion reviews, for example. Academic institutions should encourage departments to develop policies for providing such rewards, according to the Association of American Universities and the Association of Public and Land-grant Universities.

It may be hard to overcome fears that researchers who share data won’t get proper credit from others, or may even get scooped. “How do you make sure that somebody doesn’t grab that data and publish it as their own in some minor journal?” worries cancer physician-scientist Jan Grimm of Memorial Sloan Kettering Cancer Center.
Advocates for data sharing have called for publishers to discourage such behavior by requiring authors who use data generated by other scientists to name them as “data authors.”

Scientists may come to see data sharing as a useful burden, like peer review, says Tim Vines, founder of a data search tool called DataSeer. “Peer review is very annoying, but many people say: ‘It improves my manuscripts.’ Researchers accept that. We need to bring [data] sharing to that level.”

Many U.S. funders already have sharing policies. NIH has been a leader of such efforts, rolling out a 1996 policy for its grantees in human genome sequencing and expanding it in 2003 to cover all large projects. Now, the agency is going further.

NIH’s new policy “strongly encourages” researchers to deposit project data in repositories where other researchers have free access. The data should be “of sufficient quality to validate and replicate research findings,” the policy says. Data should be deposited when a journal article about them is published or the grant ends, whichever comes first. And the policy extends to unpublished findings, including negative results.

“We really wanted to catalyze the research community through a more ubiquitous data-sharing policy,” says Lyric Jorgenson, acting director of NIH’s Office of Science Policy, who oversaw development of the policy.

NIH’s push could test the kinds of changes that all federally funded researchers will need to make by December 2025, when a policy issued by the White House Office of Science and Technology Policy (OSTP) takes effect. Like NIH’s policy, OSTP’s requires that all data “underlying” peer-reviewed scholarly papers be made publicly available for free when papers are published. The policy is significantly more stringent than the current requirement at a key agency it covers, the National Science Foundation (NSF), which only requires data sharing within “a reasonable time period.” NSF and other U.S.
research-funding agencies are expected to propose details this year and next about how they will implement OSTP’s policy.

NIH’s policy requires investigators submitting a grant proposal to include a two-page data-management plan listing the types of data they will produce, the software or tools needed to use the data, and the publicly accessible repositories where they will be stored. When submitting the data, researchers will need to include “metadata,” or details of how the data were collected. And they may need to reformat data to fit a repository’s standards. These steps are meant to make the data comply with international guidelines called the FAIR principles, a name that stands for “findable, accessible, interoperable, and reusable.”

Supporters of sharing also call for something encouraged, but not required, by NIH: choosing repositories that attach digital object identifiers (DOIs) to data sets that are different from those used to identify the associated papers. The DOIs, which are unique, permanent serial numbers, will make it easier for other researchers to find relevant data. DOIs will also identify each data set as an independent scholarly contribution, enabling researchers to claim credit for generating and sharing the data.

Some researchers who recently began to share more data under existing NIH policies say the process can be very time-consuming. Relabeling, reformatting, and otherwise preparing all the underlying data collected by co-authors on a paper can take half a day, says Florian Krammer, a virologist at the Icahn School of Medicine at Mount Sinai. It is work he typically does on a weekend. His data manager needs another full day to upload the data to databases. “I think a lot of people don’t realize how much work it is,” he says.

Others point out that if researchers develop plans for data sharing at the start of a project, the costs may go down.
“The time decreases the better managed your lab is, because things are documented from the get-go instead of at the very end,” UC San Diego’s Martone says. Stacey Schultz-Cherry, a virologist at St. Jude Children’s Research Hospital, puts it this way: “We’re all going to grumble, but in the long run it’s really going to benefit science.”

Biomedical research often involves human subjects who may not have agreed to having their data shared, even if the data have been stripped of identifying information. NIH’s policy allows exceptions for such data. But the agency expects that, when possible, consent forms for new studies will ask participants to agree to share their deidentified data. Although some institutional ethics boards have opposed such broad consent, “I think there’s an understanding that this is what the community is moving towards,” says physician Ida Sim of UC San Francisco. She is a co-founder of Vivli, a repository that some institutions plan to use to share participant-level data from clinical trials.

Clinical trial researchers are known for keeping their data under wraps because they’re concerned the data won’t be analyzed properly, or because they’re still writing papers. Many already ignore a 2016 NIH rule requiring that summary data be posted in the federal ClinicalTrials.gov database no later than 1 year after the trial’s primary completion date. But the NIH data-sharing policy is already “being taken as a serious mandate” by those scientists, Sim says. “I am pleased with how much of a cultural change this has catalyzed.”

Because biomedical disciplines create and use data differently, NIH says it chose to provide flexibility by not packing its policy with detailed requirements. In particular, it does not specify how much data researchers must share from a given data set. Do they need to deposit an entire video of dividing cells or a molecular marker infiltrating a tumor, which could be gigabytes of data, or just the still images presented in papers?
“Many of us do not fully understand at what level, from raw to fully processed and grouped, NIH expects data to be shared,” says cardiovascular disease researcher Curt Sigmund of the Medical College of Wisconsin. The answer, NIH’s Jorgenson says, is that each discipline will need to work out the “granularity” required to reproduce a paper’s findings.

In practice, NIH program managers will review an investigator’s sharing plan when a grant proposal is submitted and check progress reports to be sure the plan is being followed. The agency could terminate a grant for noncompliance, although that rarely happens for violations of other NIH policies. But those who don’t share data could be barred from receiving a new grant, Jorgenson says. The data-sharing policy’s goals will likely be achieved “in stages and steps,” she adds. “We did not want to set the bar so high that we create disincentives for anyone to participate.”

Many funders and journals have struggled to enforce their own sharing requirements. Confirming whether authors shared all data supporting an article can require a close, time-consuming examination, Vines says. Publishers receive no extra revenue for the added effort.

To avoid data sharing that is incomplete or poorly done, funders and institutions may need to not only threaten researchers with sticks, but also offer them carrots, in the form of technical support and training, says Dylan Ruediger, a project manager at Ithaka S+R, a higher education research and consulting organization. He managed a project in fields as disparate as agronomy, nuclear imaging, and polar science to examine barriers to data sharing. “Complying with mandates to deposit data is not the same thing as creating an ecosystem that’s really well adapted to help researchers reuse data,” Ruediger says. “That’s a very different kind of challenge.”

Currently, many researchers store data primarily on their personal computers.
Sharing data will mean shifting them to one of several possible homes: a repository at the researcher’s institution; a discipline-based one, such as OpenNeuro, which holds brain-imaging data, or NIH’s ImmPort, which stores immunology data; or a general repository, such as figshare or Zenodo. Many repositories will need improvements to make it easier to deposit, find, and retrieve data, experts say.

To help navigate this new terrain, some universities are beefing up staff who can help, such as IT specialists and librarians who specialize in data. “We’re reprioritizing some of the things that we’re doing in the library to accommodate these requests,” says Vicki Coleman, dean of library services at North Carolina A&T State University, a historically Black research institution. She says the library will shift staffing away from its traditional reference desk, a trend underway at other universities as well.

These data experts often have clever ways to adapt commonly used information-management tools to the needs of specific research fields. Many universities, for example, now offer faculty members training in using Jupyter Notebooks, an open-source web application designed to make it easier to share data. The extra staffing and training should address a concern Ruediger found among participants in his project to encourage data sharing: “a sense that the challenges they were facing were unique and idiosyncratic to them.”

Scientists say it can be difficult to estimate the cost of cleaning and preparing data for use outside their team. For example, Krammer of Mount Sinai estimates data sharing will eat up at least 10% of his funding. Hiring a data manager might cost $100,000 per year, although not all labs will need one.

NIH says researchers applying for a grant can add costs for data managers, staff time to prepare data, and repository fees. But because NIH has a strict dollar limit for many grants, data-sharing costs may cut into the funds available for research.
“If you’re loading up your grant budget with data-sharing and management costs, is that going to take away from the funds for doing the science?” says David Kennedy, vice president of the Council on Governmental Relations, which represents major research universities.

Universities, for their part, will have to pay for campuswide services supporting data sharing, such as librarians and subscriptions to repositories. Institutions can bill these “indirect” costs to the overhead that funders provide with grants. But those reimbursements are capped. Although universities have long tapped their own revenues to help cover the indirect costs of research, some worry data sharing will become another “unfunded mandate” from the federal government.

The costs per institution will exceed $1 million a year, split between overhead and investigators’ budgets, according to an initial analysis based on a survey of 34 Council on Governmental Relations members. That could be a special burden for smaller institutions, Kennedy says. That is “a huge concern,” Jorgenson acknowledges. “We do not want to exacerbate inequities in the funding structure.”

Many repositories lack sustainable business models. Discipline-specific ones are typically supported by grants for individual projects that don’t assure funding after the grant ends. NIH’s and OSTP’s policies don’t spell out for how long data must be stored and shared; Jorgenson says the agency “will be collecting lots of information” to inform a more specific policy on this.

Skeptics say the benefits are yet to be demonstrated. Krammer says funders should collect and analyze data about whether the new push is producing the intended effects. “There needs to be an evaluation after 2 years, 5 years, to look at what type of data is [...]”

Supporters of data sharing agree, and think the results will bear them out. “We need some real demonstrations of how this level of data sharing can drive the discovery engine,” UC San Francisco’s Sim says. “I don’t think we’re there yet.
But it’s kind of like everyone’s hopped in the car, and we’re starting the engine.”