The Model We Use for Research Data Services

Over the past few months I've been reconsidering my long-standing assertions/assumptions about the necessity of library involvement in research data services. For the whole of my admittedly short career in the research data services world, I've convinced myself, and I'm not alone, that libraries have a natural place in data services due to our long-standing tradition of making information accessible to others. It is a common refrain, but I'm not really convinced by it anymore.

I want to be careful about what I'm *not* saying here. I'm not saying that libraries don't or shouldn't have a place in data services. I'm not saying that libraries are doing it wrong or badly. And I'm not saying that everyone should drop everything and do something different. 

What I am saying is that I think there are other models for how research data services could be provided at a university and very few of them, as best I can tell, have been tested. As someone at a university that may be in the (questionably) luxurious state of having an existing data services program *and* an opportunity to rethink how we structure data services, I figured this would be a good time to try to take these assumptions apart a bit and examine the pieces.

I know that the research data services program I helped build from scratch at my previous place of work was modeled on successful programs I saw elsewhere (Cornell, Minnesota), and the research data services at my current place of work have tried to move in that direction, too. That model, the now classic trio of Libraries, Research Office, and Central Computing, is useful and logical in many ways. But the truly logical (I think) units in that trio are the Research Office (for compliance) and Central Computing (for infrastructure) - the Library fits in that trio for less obvious reasons. Though one of the reasons libraries have fit themselves into this space is very important: libraries have asserted themselves in this space and filled a vacuum. Few were stepping up to the plate way back when, and libraries took on the task of predicting and filling a void. Someone has to actually *do* this stuff, and libraries have stepped up in a major way.

But what other models could there be? Well, I think it helps to consider the components that are required. In my estimation these include:

  • Computing Infrastructure - "Obvious" stuff including storage and backup/replication, but also discovery, hosting, online tools, etc.
  • Compliance Infrastructure - Making sure researchers do all the required data management things so the money stuff keeps flowing
  • Outreach and Education - Facilitating data activities and helping researchers understand best practices
  • Coordination - Central body to ensure services are being provided in a useful way and that researcher needs are being met

I don't think I've left anything major out of that list. And I don't think there is anything in that list that specifically calls for the library to be involved. Now, depending on your institution, the library actually might be the best unit to fill one or all of those roles. On the flip side, depending on your institution, the library may not be the best unit to fill any of those roles. What would that look like? Here are some examples one might consider at a large institution like mine - all without libraries:

  1. College level outreach, education, and computing infrastructure coordinated by the research office
  2. College level outreach and education, centralized computing infrastructure, coordination by the research office
  3. Research office coordinated outreach and education with centralized computing infrastructure

Of course, each of these models has its own problems including but not limited to issues of trust, recognition of competency of service providers, costs, etc. But those issues do not necessarily go away when the library is involved. 

I'll also note that there are institutions that are doing great research data services work that do not include libraries - look at domain repositories like some of the NASA DAACs, or at other long-standing data providers like the National Weather Service*.

Where does that leave us? Not sure where this leaves you, dear reader, but it makes me want to step back and think about how the services that are needed by researchers could be provided differently at my institution. Should we try to lean more heavily on our university colleges? The Research Office? Computing? I think the answer to all of these might be yes. Should our goal in the library be to focus our role around the repository aspect of our work, outreach and education, coordination?

Discuss, please. Help me think through this beast.


*pretty sure neither of these explicitly includes libraries in their research data services/curation/sharing, but correct me if I'm wrong