Data mining bibliographic records
This was a fun one. Adam Stevenson and I decided to download all the bibliographic records of the past 40 years that had the keyword “ceramics” in the title, abstract, or journal title, and analyse them. That’s 253 000 papers, in case you wondered. This project, which was initially just an excuse to get started with Python, turned out to be really interesting. We gave a few examples of what you can do with this simple approach, such as following the evolution of techniques or tracking the interest in various materials. We also made some beautiful networks of related concepts.
One of the most interesting findings was the use of positive keywords, which has been rising consistently for the past 40 years. Hype, anyone? Maybe it’s time to start publishing negative results too.
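For the curious, here is a minimal sketch of how this kind of keyword tracking could be done in Python. It is an illustration, not the code used for the paper: the record layout (dictionaries with `year`, `title` and `abstract`), the function name `keyword_share_by_year`, the list of "positive" words and the toy records are all made up for the example.

```python
from collections import defaultdict

def keyword_share_by_year(records, keywords):
    """Return {year: fraction of records whose title or abstract mentions any keyword}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    lowered = [k.lower() for k in keywords]
    for rec in records:
        # Simple case-insensitive substring search over title + abstract.
        text = (rec.get("title", "") + " " + rec.get("abstract", "")).lower()
        totals[rec["year"]] += 1
        if any(k in text for k in lowered):
            hits[rec["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Made-up toy records, only there to show the mechanics.
toy_records = [
    {"year": 1980, "title": "Sintering of alumina", "abstract": "Density was measured."},
    {"year": 1980, "title": "Zirconia toughening", "abstract": "A novel mechanism is proposed."},
    {"year": 2010, "title": "Novel route to porous ceramics", "abstract": "Excellent strength."},
    {"year": 2010, "title": "Remarkable creep resistance", "abstract": "Promising results."},
]

# Hypothetical list of "positive" words; the real list would need more thought.
positive_words = ["novel", "excellent", "remarkable", "promising"]

share = keyword_share_by_year(toy_records, positive_words)
for year, fraction in share.items():
    print(year, f"{fraction:.0%}")
```

Normalising by the number of papers per year matters here, since the total number of publications grows a lot over 40 years. The same function works for following a technique or a material: just swap the keyword list.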
The American Ceramic Society (ACerS) wrote a nice article about our paper.
- Deville, S. & Stevenson, A. J. Mapping Ceramics Research and Its Evolution. J. Am. Ceram. Soc. 98, 2324–2332 (2015).
Data mining properties of ice-templated materials
A different type of data mining. Although the data set was a lot smaller here (a few hundred papers), collecting the data took me much longer. The idea was to collect all the processing parameters and freezing conditions, as well as all the properties and characteristics of ice-templated materials, and see the range of properties we could get. The second idea, much more ambitious, was to try to find new correlations.
Kristen Scotti (from Northwestern University) liked the idea a lot, and repeated the analysis on a much larger dataset. She also made the data available on a website (freezecasting.net), so that anyone is free to explore it. Very cool.
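To give a flavour of what looking at the range of a property and hunting for correlations can look like once the data is in a table, here is a small Python sketch. Everything in it is an assumption made for illustration: the two columns (solid loading and porosity), the numbers, and the choice of a plain Pearson coefficient. The values are invented, and are not taken from the paper or from freezecasting.net.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented toy rows: (solid loading in vol%, porosity in %).
toy_samples = [(10, 88), (20, 79), (30, 68), (40, 61)]

solid_loading = [row[0] for row in toy_samples]
porosity = [row[1] for row in toy_samples]

# Range of the property across the collected samples.
porosity_range = (min(porosity), max(porosity))

# Strength of the linear relationship between parameter and property.
r = pearson(solid_loading, porosity)

print("porosity range:", porosity_range)
print("correlation:", round(r, 3))
```

With real data collected from hundreds of papers, the hard part is not this calculation but getting consistent values and units into the table in the first place.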