Saturday, April 18, 2009

Putting Your Data on The Web



I just watched a talk by Tim Berners-Lee (who invented the World Wide Web) that re-inspired me to put the data I collect on the web, so other people can have access to it.

I heard at one point that in psychology you were required to keep your data for five years after publication in case anybody else wanted to look at it. I think the expectation is that if someone asks you for your data, you are supposed to give it to them so they can duplicate your statistical analyses, or whatever.

However, now that data is, to a great degree, stored on and collected with computers, there's no reason to get rid of your data. And if you are going to give it away anyhow, why not just let anyone download it, rather than having to email you?

I run psychological experiments, and I want to start putting my data on the web. I keep a website (http://www.jimdavies.org/), and on it each paper I write gets its own web page (e.g., http://jimdavies.org/research/publications/ijcai/2001/davies2001.html). But now I'm thinking that these pages should also link to the data collected. Not only that, but to everything I can put up there to help someone who wants to know more about what we did. For example, we collect data on computers, and we have to write programs to collect it. We use a piece of software called E-Prime to do it (other labs use SuperLab). I also want to put the E-Prime source code up there, in case someone wants to replicate the experiment.

I also do AI work, and I think it's a good idea to put your AIs on the web too. I have not done this. I seem to remember being warned not to do this when I was in graduate school, but right now I can't remember any reasons that outweighed the potential benefits that society might stand to gain from the access.

I plan to put my data on the web, and I hope all scientists reading this blog will do the same.

The other thing is that I think journals and conferences should maintain websites of the data behind the papers they publish. This would force, or at least encourage, data sharing, and it would also put the data safely in the hands of a big organization. When a scientist dies, who will maintain their website?

I might just write a letter to the Cognitive Science Society suggesting this.
