530 million songs. 90 years of high-definition video. 250,000 Libraries of Congress. That’s how much data we produce every day—2.5 exabytes according to Northeastern University. I guess that’s not surprising, given the amount of activity that goes on in social media, websites, email messages and texting.
Much of that data, though, is personal and ephemeral: videos, photos, tweets and stories that can be passed along and deleted without any thought or care about accuracy or archiving.
But in the scholarly community, a similar and perhaps more significant explosion of digital data is occurring. Here the stakes may be much higher. Without trusted stewardship, data from research will not be effectively collected and preserved for reuse. And when that happens, research innovation and advancement slow significantly.
This is new territory in many ways. Data have been collected and preserved for thousands of years, but never at the volume we see today, nor with some of the deliberate (and in some cases, legally mandated) intentions for reuse.