That Open Letter

Warning: wall of text/rant. 

Earlier this week I had the good fortune to hear about a seminar on campus at the last minute. The head of the NIH National Institute of Environmental Health Sciences (NIEHS) Office of Scientific Information Management (Dr. Allen Dearry) was giving a talk to environmental health researchers - "Towards Biomedical Research as Digital Enterprise". Dr. Dearry's talk was fine - an introduction to new and impending data stewardship expectations out of NIH (and many other agencies), funding opportunities from NIH to support these goals, and a discussion of the new (to me at least) Precision Medicine movement. There were plenty of questions from the audience, some of them lobbed by my prickly self, about the data stewardship elements of the talk, what kind of support and guidance NIH would offer, and what it all meant for the researcher. Unfortunately, the speaker was hard pressed to answer most of the questions with any real level of clarity. What is data? What should be shared and how? What resources were available? What standards should be used? These questions were all met with a smile, a shrug of shoulders, and promises that more information would be forthcoming. But, as we've heard so many times before, the agency couldn't be responsible for making this happen - the research communities need to step up to the plate and sort it out.

I'm being hard on Dr. Dearry not because his talk was especially problematic or unique - he is just a convenient and recent example. His talk was fine - it offered information on what NIH is doing to address and support these mandates (some stuff) and he was quite honest about how soon we might expect to see real guidance (5-10 years). It is exactly what I have come to expect over the past few years when discussing the impact of and support for the famous OSTP mandate from 2013 and previous mandates (explicit and implicit) for data sharing and curation from funding agencies. Communities of researchers are expected to apply their best practices to enable data curation, data sharing, and all that other data stuff and they should do it because it is the right thing to do - agencies can't force the issue - the researchers need to do this themselves. This is what we've heard over and over. But this is a false dichotomy that has been perpetuated for too long - it is not a choice between an agency "forcing" research communities to be better data stewards and research communities self-organizing to do the same. There is a middle ground that is not being explored, and this lack of exploration comes at the expense of a potentially more thorough and holistic suite of support services for meeting NIH (and all of those other agency) goals and the goals of open science.

So what is that middle ground? Well, for starters, it would be helpful to see some more meaningful engagement from these agencies with communities that are trying to provide guidance and services - something I don't feel like I've seen much of from where I sit. If the agencies aren't going to offer guidance, and the researchers are looking for guidance, maybe there is someone out there who can help kick-start the process. And, it turns out, there is such a someone - in fact, there are many of them. 

Some of these someones already exist in areas of research that have historically been better at data stewardship. We point to them all the time - "Why can't you be more like the astrophysicists?" we say. Great, let's figure out how to make some of those practices extensible and identify which practices are simply a function of the type of science that community does. Let's have a nuanced discussion.

We can't forget our friends in the commercial world, many of whom are providing high quality services and products to facilitate data stewardship and open science in a way that is effective and easy for researchers to incorporate into existing workflows (I'm looking at you, figshare and the Center for Open Science). 

Last, we have a growing community in academia, in libraries, IT organizations, and research offices, that is trying to develop service profiles to help meet these data stewardship needs. These communities are providing guidance on best practices, repositories for data and other digital assets (often at very low cost to users, if any), and a host of other services to facilitate data stewardship and open science. This community is growing and is excited and seems to have a pretty good handle on how to approach this problem. But this community can only get so far by waiting to be invited to the table and waiting to see what guidance comes down the road. I feel strongly about this because this is the community in which I sit and this is the community that is primed to really make a difference to the data stewardship needs of our researchers.

I'd like to call on that last community (and the others if they're up for it) to step up to the table and assert our role in this grand challenge. We're doing this already in our own ways with local projects, regional collaborations, and national grants. But after so many years of, to quote a speaker (sorry I don't remember which one!) at the recent RDAP meeting, "We Exist!", we still are relative unknowns. Luckily some of this is happening and it seems like there is momentum around asserting ourselves into this space. I'm less than a day past the DataQ editor's meeting and, as I mentioned, a few weeks out from RDAP, where there seems to be a common understanding of many of these issues. But what's next?

We've had calls for open letters to the funding agencies to ask them to acknowledge the value of libraries in the data management process. I'd like to call for an open letter that is not apologetic and isn't asking for inclusion, but rather a letter that asserts the value of the data management communities in libraries (and affiliates) to the process of opening science. A letter that points out that agencies have failed to acknowledge that value thus far. A letter that points out that by pushing off responsibility for providing guidance for data management issues onto the research communities, and by offering a level of financial support that, to many of us, seems misdirected or too small to matter, they have shirked their duties and have created, unnecessarily, a landscape of confusion and frustration.

tl;dr - maybe next time an agency representative comes to campus to talk about data stewardship, they could invite to the table the people who are already providing these services to the campus community.