Main Page
Murray-Rust Research Group
We're using cutting-edge informatics and software engineering to develop the future of knowledge-driven scientific research in chemistry and related subjects.
We do this by building and using software and systems for the representation, extraction and processing of scientific data across chemistry, materials science and solid-state physics. We're particularly interested in the following areas:
Chemical data representation and semantics
We are the lead developers of:
- CML, the Chemical Markup Language, and Jumbo, its corresponding Java library (Chem4Word project announced 2008-08-18);
- CMLCryst, the application of CML to crystallography (announced at IUCrXXI, Osaka, 2008);
- Golem, an ontology language and toolkit for building CML-processing applications.
We are also very interested in using Semantic Web technologies - RDF, RDF Schema, SPARQL and OWL - to represent and mine scientific data and observations; this cuts across our work on the chemical literature, polymer informatics, and computational chemistry and physics.
Scientific publication and scientific literature
Most scientific data is lost during publication. To create a global knowledge base, the current process must change dramatically. This requires the development of repositories to store scientific information; automated overlay journals and post-hoc extraction of experimental data and semantic content from the literature; and tools that allow working scientists to make use of the data we liberate.
Our projects in this area include:
- CrystalEye, an automatically-extracted, highly interactive, rich repository and index of published crystallographic measurements;
- OSCAR, a toolkit for chemical computational linguistics, chemical named entity recognition, and extraction and validation of experimental measurements from the text of journal articles;
- SPECTRa-T, a proof-of-concept system to build a semantic data repository by text mining of chemical theses;
- SPECTRa, tools to simplify the deposition of chemistry data into (perhaps institutional) repositories, in order to promote Open Data.
We're advocates of Open Data and of openness in scientific communication; as you can see, our group homepage is a wiki, and many of us maintain blogs:
- A Scientist and the Web - Peter Murray-Rust
- Coding Trombonist - Jim Downing
- Brighten the Corners - Andrew Walkingshaw
- Staudinger's Semantic Molecules - Nico Adams
- Teaching computers to read chemistry papers - Peter Corbett
- Ramblings - Joe Townsend
- The CML Blog
Polymer informatics
We are building class-leading tools and ontologies for the prediction, simulation and representation of polymers.
Materials informatics and simulation
Through the original WWMM, as well as Joe Townsend's computational crystallography and Nick Day's NMR prediction, we have worked extensively on "black-box" approaches to computational chemistry and physics. At the moment, we have two major projects:
- Dr Volker Thome leads work on computational combinatoric mineralogy;
- we are partners in MaterialsGrid, which is developing a database and service for high-throughput ab initio materials simulation.
Further information
We maintain several chemistry web services - have a look!
