Dark Data and Open Notebook Science
I ran across an interesting Wired article discussing the merits of a new initiative: Open Notebook Science. This touches directly on the issue I brought up in a previous post.
So what happens to all the research that doesn’t yield a dramatic outcome — or, worse, the opposite of what researchers had hoped? It ends up stuffed in some lab drawer. The result is a vast body of squandered knowledge that represents a waste of resources and a drag on scientific progress. This information — call it dark data — must be set free.
It is a well-written article, if a bit short, explaining why all research data should be open for public viewing. The “dark data”, results that never saw the limelight because they failed, were inconclusive or even ran contrary to the project's aims, may be important to other research labs. Your trash may be my treasure.
In a similar vein, there is a movement called Open Notebook Science, championed by chemist Jean-Claude Bradley. This movement wants to make science completely open and transparent by having scientists post their notebooks. Complete access to notebooks, including the good, the bad and the ugly, would move science forward at a striking pace. I happen to agree with them. It is the next revolution in science. That said, it might take years to happen. There is a lot of politics and money tied up in the current system. Old habits die hard.
More links:
Jean-Claude Bradley’s blog
UsefulChem on Open Notebook Science
Science in the Open - blog on Open Science
Jeremiah Faith’s Open Notebook - Graduate student at Boston University
Jeremiah’s thoughts on Open Science
November 1st, 2007 at 10:34 pm
It depends on how “dark data” is defined. If the study is well designed and carried out, then no matter what, I think the data is valuable to the community. But there is a saying in the community that it takes 7 tries to get a study designed right. So what happened to the data collected in the previous 6 studies? Normally, lessons learned from those 6 studies will be summarized in some way in the publication of the 7th study. I think it’s more important that the researcher who got the “dark data” doesn’t give up easily, looks into what went wrong and tries again.
November 3rd, 2007 at 9:44 am
That’s a very good point. Most published articles probably contain the lessons the researcher learned and present a concise, streamlined view of the data collection process.
But I know for a fact that our lab generates a ton of data, much of which ends up being unrelated to our study or blind alleys. These may be crucially important for other labs but are of little use to us. Most of this data will never see the light of day because it is unpublishable or does not have supporting data.
November 3rd, 2007 at 11:39 am
Zach,
You really hit the nail on the head. Most data generated in the lab never gets communicated to researchers who might be able to use it.
To address Ning’s point, it is not a question of replacing formal publication with Open Notebook Science, but of adding another scientific communication channel.
Zach, thanks for promoting the ONS concept. I hope you’ll be able to share some of your work in that way at some point.
By the way, I saw you were linking to Cameron Neylon’s blog. He gave a wonderful talk at Drexel yesterday; we recorded it and I’ll post it on my blog this weekend.