Unilever Centre for Molecular Informatics
 

petermr’s blog

A Scientist and the Web

 

Open Data - the time has come

The term “Open Data” is now coming into common use, and we (Blue Obelisk) are trying to define it (our mantra being ODOSOS: Open Data, Open Source, Open Standards). The term was not commonly used two years ago, although the concept is general enough to have been important before then. In the last 12-15 months it has been used a lot, particularly in techie weblogs and meetings. The idea is potentially very much broader and looks set to become very important.

The earliest references I can find are:
  • Jim Kent on the human genome.
  • An Open Data Consortium, founded in ca. 2003 and seemingly concerned with geospatial data.
  • A presentation by Simon St. Laurent, undated but apparently a few years old, with a strong XML flavour.

I became concerned about Open Data in ca. 2003-2004, and Henry and I published a Manifesto for Open Chemistry in 2004. I followed this up in 2005 with several mails (example) and presentations to JISC, OAI, STM Publishers, etc., in which I used the term “Open Data”. Late in 2005 SPARC set up an Open Data list with me as moderator. Science Commons started in Dec 2004.

In 2005 the term started to emerge, possibly independently, in the XML/tech area, as in XTech 2005. It is now a hot topic among the Tims (Bray and O’Reilly).

There seem to be several related threads:
  • scientific data deemed to belong to the commons (e.g. the human genome)
  • infrastructural data essential for scientific endeavour (e.g. GIS)
  • data published in scientific articles which are factual and therefore not copyrightable
  • data as opposed to software, which are therefore not covered by Open Source licenses and are potentially capable of being misappropriated (this is a very general idea)
I think the current usages are sufficiently close that we should try to bring them together. Comments here would be useful. Maybe a Wikipedia article would help?

7 Responses to “Open Data - the time has come”

  1. Peter, thank you so much for this list; it is very helpful, indeed.

  2. panlibus says:

    Open Data - oh yes…

    Lorcan Dempsey points to Peter Murray-Rust’s recent post on Open Data; a topic dear to our collective heart here at Talis. We’ve been looking for a while at some of the ways that data might be freely, fairly and……

  3. Pascal says:

    Peter,
    you missed my favorite link:
    http://www.opendatafoundation.org
    We hope this will mark a new beginning for collaborative efforts towards open standards and open tools.
    *P

  4. Keith G Jeffery says:

    Peter, all:
    Eric Zimmerman kindly pointed me to this blog. Although the term open data is rather new, the concept is rather old. The International Geophysical Year of 1957-8 caused the setting up of several world data centres and - more importantly - set standards for descriptive metadata to be used for data exchange and utilisation.

    Somewhat surprisingly, commerce and industry have made more progress in this field, with metadata (and exchanged data) in e.g. supply chains being particularly effective - but proprietary to a group of companies. Many different metadata standards for ‘open data’, commonly organised by domain of interest, have been developed over the last 50 years.

    There is a standard (technically an EU recommendation to member states) for metadata and data describing research - a standard format to describe projects, persons, organisations, products, patents, publications, facilities, equipment, funding etc. It is named CERIF (details under http://www.eurocris.org). This is really useful for understanding the context of an open dataset - and usually helps with issues like provenance too.

    And finally a plea: please make open data metadata formal; that is - unlike Dublin Core - it should be machine-understandable as well as machine-readable; then it will scale (automated processes can be used rather than requiring human browsing).

  5. Science as a common good….

    At this very moment a conference is being held in Washington on Science Commons, in other words on free and open access to scientific data (which some call open data), or on the common good that is science. I…

  6. [...] Chemical informatics is beginning to embrace the concepts of Open Source and Open Data already in widespread use elsewhere. This shift will bring into sharp focus the need for robust and open methods for accurately encoding molecular structure. Existing technologies have not kept up with the chemists themselves, as the axial chirality problem demonstrates. Future articles in this series will show how FlexMol can offer a solution to this and other important molecular representation problems. [...]
