January 4, 2015 § 3 Comments
I must confess: I am a lazy person. I hate to spend unnecessary time on tasks, in particular mundane or recurrent ones. When doing science, even though we are constantly exploring new ideas or novel methods, there is a fairly high number of recurring activities, from literature review to data analysis or writing. Being (barely) part of the computer-native generation, I am very fond of any tool that can help me save time in my academic workflow (and improve the reproducibility of our science). Although I keep an eye open for new options, I have developed a relatively steady set of practices and tools over the years, which help me save a lot of time and concentrate on the tasks I enjoy. So here they are; I hope you will discover a few new ones.
Essential tools I use: Google account (Google Docs, Google Drive, Google Scholar), Feedly, IFTTT, IPython notebook, Twitter, Mendeley, Pandoc, ORCID.
Tasks I automated: literature survey, citation formatting, reading lists, data analysis, email, writing.
Disclaimer: I am not sure whether to consider myself a geek or not. When it comes to automation, many options require advanced control of the tools we use, i.e., being a power user. Almost all of the solutions I have listed below have a very low entry barrier (except IPython, for which you need to be familiar with … Python !), and can be set up rapidly by anyone.
Literature review is an essential activity of academic research. I have already covered the topic here, so here is a quick breakdown of the tools I use.
- RSS feeds (free): to keep track of all new articles from a given number of journals. After the death of Google Reader, I settled on Feedly (free). Good enough for now.
- Google Scholar (free) alerts. Google Scholar has become an essential part of my workflow, to keep track of what is going on in the world of peer-reviewed science. The most useful part of it is the e-mail alerts. I set up a couple of search alerts, based on keywords relevant to my research. They come almost daily to my inbox. I only wish I could combine several alerts into one, which would help me reduce the number of emails I get.
- Twitter (free). I am a Twitter addict. And one of the many reasons is that it helps me stumble upon new content in the world of science. Although plenty can be done with the basic Twitter tools (hashtags, lists, etc.), you can build a few more elaborate tools. A while ago, I set up a Twitter bot based on PubMed, which automatically posts tweets with links to new papers on a given topic. More explanations here. Twitter can also be combined with IFTTT (free) for a number of tasks. If you do not want to be involved with the Twitter API, you can do some basic tracking with IFTTT, such as automatically listing tweets with a given hashtag (15 tweets max per search), and saving the output to a Google Doc file. I just set up a number of tasks based on this; I will let you know how it goes in a while.
Reading is another essential, recurring task of my workflow. I read mostly two kinds of documents: peer-reviewed papers, and articles. All my papers are automatically organized in Mendeley (free), thanks to the watched folder (I download every article into a specific folder, the content of which is automatically added to Mendeley). For all other articles, I tend to send everything to Instapaper (free), which I like a lot (it removes all the clutter). This can be done directly from Feedly. With IFTTT, I can also automatically send links from tweets I have favorited to Instapaper.
To keep track of articles you particularly enjoyed or found relevant, you can automatically create a listing of liked or archived Instapaper articles in Google Docs. Mostly future proof, I guess.
I also set up an IFTTT recipe to send to Instapaper links from tweets with a given hashtag (e.g., #ipython).
E-mail is like peer-review or democracy. It is the best solution until we find something better, and it’s quite clear that it’s here to stay. I work hard to be close to Inbox Zero, which I usually achieve. Rules are nevertheless a very powerful tool to automate the wonderful task of dealing with your email. Kind of obvious, but super efficient.
Duh ! If you do not back up your data, expect a slow, painful death in the near future. You will have deserved it.
Data analysis, file organization
- Tags vs. folders. Should you organize your files ? Even though there are a lot of tagging solutions for files out there (one comes with OS X), I still use folders. You can automate some of your file management with Hazel ($29), for instance. The only automation I use is the watched folder for updating my Mendeley library, as discussed above.
- Data analysis. There are probably a number of recurring experiments in your workflow. And there is a good chance that you end up with CSV files containing your data. If that’s the case, it would be a good idea to get rid of Excel and move to IPython, and in particular the IPython notebook (free). @ajsteven130 turned me into a Python fan, and for me, there is no way back. I have just completed a project (i.e., a paper) for which I did the entire analysis in the notebook, and it is just too good. It is also a big win for reproducibility and sharing what you did. More here.
- Getting values from plots. I use Graphclick (OS X). This little gem automatically extracts the values from plots when you don’t have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn’t been updated for years, but does the job perfectly. Ridiculously cheap ($8).
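To give an idea of what moving a recurring CSV analysis from Excel to Python can look like, here is a minimal sketch using only the standard library. The column names and the values are made up for illustration, and this is not the analysis from my paper; in a notebook you would more likely read a real file and plot the result.

```python
import csv
import io
import statistics


def summarize_column(csv_text, column):
    """Return the count, mean and sample standard deviation of one numeric column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows where the cell is missing or empty
    values = [float(row[column]) for row in reader if (row.get(column) or "").strip()]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Hypothetical measurements; in practice the text would come from
# open("my_experiment.csv").read() (file name is an assumption)
sample = """sample,porosity
A,62.0
B,64.0
C,66.0
"""

print(summarize_column(sample, "porosity"))
```

Once a function like this lives in a notebook, rerunning the analysis on the next batch of experiments is a matter of changing one file name, which is where the time saving and the reproducibility come from.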
Whether it’s papers, grant applications, or reports, we spend a fair amount of time writing. Even though papers do not write themselves, there are a number of things that can be automated to help you concentrate on the content.
- Scheduling time for writing. This is not really an automation solution, but I settled on this routine a while ago (2 years ago, maybe ?). Whenever I have something to write, which is pretty much all the time, I block a dedicated amount of time in my day to write. No matter what. I am a morning person when it comes to writing, so I write for 1h (or more) every day, first thing when I get to the lab, and it makes a huge difference at the end of the week. Given the amount of writing I will have to deal with this year, I certainly plan to keep this approach. If 1h per day scares you, try 20 min. At the end of the week you will end up with close to 2 hours ! Big win.
- Incorporating references in your writing, and formatting the references. If you’re not using a reference manager, you’re doing it wrong. Period. There are plenty of options out there, so you don’t have any decent excuse. I settled on Mendeley many years ago, and am not planning to change since they gave me a shirt (private joke here). Bonus point for syncing my library (including PDFs) between the various computers I use.
- If some part of your writing involves repetitive expressions, it might be a good idea to use text replacement software such as TextExpander ($35) and the like. I don’t. Yet.
- Conversion. I am still chasing the « One file to rule them all » dream: one master file for all kinds of outputs, from PDF to HTML, XML, and so on. I became a big fan of Markdown (a very simple markup language) for the first draft, and am seriously considering it as my master format, relying on Pandoc (free) for all the conversions.
- Solutions for collaborative writing. As soon as you are not alone on a writing project, you have several options to collaborate. And no, emailing the files back and forth to your colleagues is not a suitable option. Depending on your colleagues and your geekiness level, you have many options, including Google Docs (excellent for comments and review mode), GitHub in combination with Markdown, Overleaf (free, for LaTeX fans), etc. Bonus point for Mendeley for automatically populating your library with the references cited in the file you received and not in your library yet. Very useful.
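The « One file to rule them all » idea from the conversion item above can be sketched in a few lines of Python that call Pandoc once per output format. This is only a sketch under assumptions: the file name `draft.md` and the list of formats are hypothetical, Pandoc must be installed and on the path, and PDF output additionally needs a LaTeX engine. It relies on Pandoc guessing the output format from the extension of the file given to `-o`.

```python
import subprocess
from pathlib import Path


def pandoc_command(source, target_format):
    """Build the pandoc command line that converts `source` to `target_format`."""
    source = Path(source)
    # Pandoc infers the output format from the extension of the -o file
    output = source.with_suffix("." + target_format)
    return ["pandoc", str(source), "-o", str(output)]


def convert_all(source, formats=("html", "docx", "pdf")):
    """Convert one master Markdown file to every requested format."""
    for fmt in formats:
        # check=True raises an error if pandoc fails for one of the formats
        subprocess.run(pandoc_command(source, fmt), check=True)


# Show the command that would be run for a hypothetical master file
print(pandoc_command("draft.md", "html"))
```

Calling `convert_all("draft.md")` would then regenerate every output from the single Markdown source each time the draft changes.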
Updating your CV
Most academics love (and are often asked) to have an up-to-date list of their achievements. You have many options here. The solution requiring the least effort is to sign up for a Google Scholar account. It seems to be becoming one of the standards today, along with ORCID (free). Bonus point for keeping track of citations to your work if you are addicted to metrics. Alternative solution if you need a PDF with a list of your papers: keep track of your papers in Mendeley, get a bib file from it with all your papers, and use this with your favorite LaTeX template.
- Password management. Automation AND safety. I use 1Password ($50) and it does the job perfectly.
- Keeping track of papers you’ve reviewed. I just came across IFTTT (you have probably guessed that by now), and made a recipe involving Gmail and Google Drive. All incoming emails in my inbox with « review » in their title are listed in a Google Doc automatically. Tons of variations possible based on this workflow. Get creative.
Anything you use that I missed ? Let us know in the comments.
December 22, 2013 § 3 Comments
2013 is the year when I finally bought a MacBook Air as my main machine. I have thus been able to shift to an Apple-only software environment, although I still have a Thinkpad in the lab (yeah, don’t ask).
Like any academic, my four main activities are reading, writing, compiling data, and preparing figures.
My main requirement is that I need to keep my machines in sync. This includes 2 laptops (Apple and PC) and 1 desktop (Apple). The tricky part is that I cannot go online in the lab with the MBA, for corporate reasons (this is a Windows environment). I thus rely on a few pieces of software able to sync from behind the firewall (and no, I cannot use Dropbox on my PC at work), and exchange a few files, when needed, by Bluetooth.
Here are my most-used apps for 2013, in no particular order.
- 1Password. Takes care of all my password needs and more. I have only 50-character passwords now.
- Keynote. I gave up on PowerPoint when moving to the MBA, and enjoy using Keynote so far. My needs are very basic, as most of my slides usually have just a title and one figure.
- Alfred. I use it mostly as a launcher for apps. I use Spotlight a lot, but Alfred appears in the center of my screen (yes, it’s silly, I know) and the text is larger.
- Dropbox. Of course. If you don’t use it yet, please sign up with this link, this will earn me some extra space.
Writing and code
- iA Writer. For all my drafts. A perfect distraction-free environment. Love the font.
- Sublime Text. For coding. This is an outstanding piece of software. I don’t even understand how people can code without it. Also great for LaTeX writing, once you’ve set up a few snippets.
- TexPad. I wrote a few long, structured documents this year (such as my habilitation), which was a good excuse to get back to LaTeX. TexPad is a real pleasure to use. The interface is uncluttered and does the job perfectly. Mendeley automatically generates a .bib file of my library, which is super convenient.
- MS Word. Alas. I only use it to prepare the final version of manuscripts and exchange with co-authors. One thing I like, though, is the revision mode.
- F.lux. This little gem automatically adjusts the color of the screen. Warm at night and bright during the day. I cannot use a computer where it’s not installed. This is the first piece of software I actually install on a new machine.
- Adobe Illustrator. For all my figure needs when preparing manuscripts. Been using it for years. Keeps getting better. Just love it.
- Picasa. To keep track of all the images on my PC and some shared resources on the internal network, without actually organizing them. A time saver.
- Fiji and ImageJ. Fulfill all my image analysis needs. Even better since you can use Python with them.
References and Science stuff
- Mendeley. My reference manager of choice. I use it constantly. Tends to be a bit slow when running a search (>2k papers). Just perfect for preparing manuscripts. I know, I know, Elsevier owns it now, but it’s just too useful for me. The competition is getting fierce, though, which is good, with the release of Papers 3 and Readcube. The automatic bib file saving is a must-have for me.
- Simplenote and Notational Velocity. I’m throwing everything here: notes, to-do lists, recipes, drafts. The killer feature is the shortcut (Shift+Cmd+E) to open the file in an external editor (iA Writer for me). Using Markdown for the drafts. I’m very keen on keeping everything in text files, to ensure readability in the long term.
- Graphclick. This little gem automatically extracts the values from plots when you don’t have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn’t been updated for years, but does the job perfectly. Ridiculously cheap.
- Gephi. I had fun with networks recently (more info coming soon, hopefully). Beware, this is a mesmerizing piece of software. Be ready to waste a lot of time.
- Mediawiki. We finally set up a wiki for the lab last year, and have used Mediawiki. Does the job perfectly.
- Doodle. To find a date for meetings. Does the job simply and perfectly.
- Instapaper. For my casual (i.e., not papers) reading needs. I sometimes send full-text papers to it, and it’s actually a pleasure to read them in this context. I use the bookmarklet to save articles during the day, and read everything on my iPad.
- Twitter. Been using it more and more, but this might be a story for another post. I tweet here @devillesylvain.
Next for 2014 ?
Who knows what 2014 will be made of ? Pretty much all my needs are fulfilled now, so I am not really looking for anything special. A few apps are on my radar nevertheless, and could be possible new additions to my workflow.
- Scrivener. For complex documents such as review papers. I downloaded the trial version and started playing with it. Seems to be very powerful. Make sure to check out the tutorials.
- Mindnode. For mind mapping. I’m a visual type of person.
- WriterPro. The new version of iA Writer. I don’t care about the workflow thing, but the syntax highlighting could be a game changer for my academic writing, as I am not a native English speaker and am still working hard on improving my writing.
Got any advice ? Let me know in the comments if I missed anything.
April 10, 2013 § 5 Comments
My twitter feed is on fire, since the announcement of Elsevier having bought Mendeley, after a few months of rampant rumors. “Elsevier is evil ! They will shut down Mendeley ! Mendeley lost its soul ! We should in no way contribute to Elsevier’s business and benefits”. These are a few of the reactions that quickly followed the announcement. What should I do ? Should I care ?
Elsevier has an awful track record: from fake journals to insane profits on journal bundles, to name a few. Everybody agrees on that, and for sure they realized it and are trying to make up for it, somehow. Now that they own Mendeley, they are going to do all sorts of crazy things. Maybe, maybe not, time will tell. Mr Gunn seems confident at this point. Others much less, to say the least.
I have a different take on the current events. I am usually a very pragmatic guy. I used to use Endnote, like everybody else a few years ago when there were no alternatives. Their habit was to update the software every year, although I never found any significant improvement in the updates. I remember that sometimes the update was WORSE than the previous version, breaking my library. And I had to pay $100, give or take, to update. Every year, although I quickly gave up on the updates. No PDF organization, no way to perform full text search. No sync. Quite rough.
Then Papers came out. And it was awesome. Finally a decent PDF organizer, and one that quickly improved. Not having the choice of my OS (Win), I had to give up on Papers when I came back from the US. Too bad. A Windows version has been developed since, but I had already given up. It has since been bought by Springer, and I’m not sure Springer is any better than Elsevier.
And then I came across Mendeley. It more or less provides everything I need: easy import (I love the DOI lookup), easy organization, full text search, cross-platform sync. I’ve paid for a data plan for a while to have all my files synced between my laptop and desktop computer (Dropbox is not allowed where I work). Works flawlessly. Excellent for inserting bibliographies in the papers I write. Automatic BibTeX file creation when I need to use LaTeX. If only they could provide the abbreviated journal name, that would be perfect. I now throw into it every interesting paper I come across, whether it’s directly related to my interests or not. It is thus becoming my personal, curated papers database. I have very quickly come to get a lot of value from this software.
And now it belongs to Elsevier. Well, I try not to submit papers anymore to Elsevier journals (although Acta Materialia is a solid journal in my field), and I avoid reviewing for them. I use Scopus less and less since Google Scholar has become extensive. I get little or no value from Elsevier’s products. But Mendeley is different. As I said, I get a lot of value from it right now, and I don’t mind paying $5 a month for my data plan, it’s worth it. My files are synced across all of my computers. If the situation turns ugly, I don’t lose anything but the time spent migrating to another platform. So for now, I’ll stick to Mendeley, and see what happens.
March 14, 2013 § Leave a comment
Based on my Twitter feed, there were two main pieces of news yesterday: the election of an old dude in Rome, and the not very classy decision of Google to kill Reader in a few months. As you can guess, I am much more concerned about the second one, for my daily work routine. I have expressed my love for RSS previously. As of today, my strategy hasn’t changed. RSS is still the best way, by far, to keep track of new articles.
Many people today are claiming that RSS is dead, and that Twitter will do the job instead. Not at all, as far as I am concerned. I use the two very differently. I use Twitter to discover recommendations and keep track of the scientific buzz around. The constant flow of tweets is nevertheless a guarantee that I will miss some stuff. It’s ok. It’s in the very nature of Twitter. When it comes to tracking new articles in journals, Twitter just doesn’t do the job. I use (mostly) Google Scholar to search for articles on a topic in which I have some interest. Something specific. But it’s definitely not a tool for systematic tracking of new papers. My RSS feed currently comprises around 50 journals, 30 blogs, and roughly 40 RSS feeds of Scopus search results or equivalent. Since October 2008, I have read over 300k items in Reader. The counter has been stuck at 300k for over a year, actually. My current feed provides about 3k items per month (I used to have many more). I spend about 10-15 min per day keeping track of new articles, and usually discover 2 or 3 new papers of interest to me that are not directly related to my specific niche (freezing !). If I needed to visit every single journal website to get the same information… well, there’s just no way. RSS is still the best choice. No question.
My second constraint is that during my day, I use 2 different computers, a phone and an iPad to check on my RSS feed, depending on where I am and what I do. Reader was providing a flawless solution for the sync. There will be another one soon, that’s ok.
The only question left now is: how long will Google Scholar survive? Reader was much more useful to me, and I guess I’m not the only one like this in the academic world. There are now ads in Scholar. I don’t see why they should even bother to keep working on it, unless they have some long-term plans for it that go beyond the simple search engine it is today. By which I mean an iTunes store-like system for academic papers, for instance.
Will I survive ? Of course, because I don’t have the choice. I will export my RSS feed to another service and keep using it. I will miss the convenience of Google Reader until a better solution comes up. Good bye, you’ve served me well.
January 29, 2013 § Leave a comment
I’m currently wrapping up a long review paper (>10k words) that should hopefully be published this September. As usual, as a non-native speaker, I ran into many common grammar and style mistakes. Luckily, I have a native speaker next door, and he’s patient enough to correct most of my mistakes. He’s my first secret weapon. The second one is this little gem, called The Elements of Style (4th Edition), by William Strunk Jr. and E. B. White. This book is probably the best money I’ve ever spent on a book.
So without further ado, here are my top ten mistakes, that I’ve learned to correct thanks to my two secret weapons:
- You should place a comma after abbreviations like i.e., e.g., etc.
- If you enumerate several terms with a single conjunction, use a comma after each term. Example: “… bla bla bla in materials science, chemistry, and life science”. Same if you enumerate with “or”.
- Put statements in positive forms. It is much stronger.
- Omit needless words. For some reason, we French people seem to be using a lot of these. So here you go, go and mercilessly chase expressions like “the reason why is that”, “the question as to whether”, etc.
- “Due to” is a synonym for “attributable to”. Avoid using it for “owing to” or “because of”.
- “Interesting”. It might be interesting to you, but not to everyone else. Remove it. Just remove it.
- “Type” is not a synonym for “kind of”. So get it straight.
- “While”. Use it only where you can replace it with “during the time that”.
- Don’t say “very unique”. “unique” is good enough.
- Split infinitive: when you put an adverb between “to” and the verb. I used this form a lot and thought it was cool. Apparently it’s not. Don’t say: “to thoroughly investigate”, say: “to investigate thoroughly”.
This is just the top ten. The entire book is full of stuff like this. Go and get it. And don’t lend it to anyone, you’d never get it back. Do you have another one? Share it in the comments.
August 23, 2012 § Leave a comment
Almost back to the lab. It’s been a good summer with the boys, mostly at home. Reading books, papers and blog posts when I had free time. Which does not occur so often with children less than 5 years old, as anyone in the same situation can testify.
A lot of heated discussions are occurring online now about open access and data mining. While some benefits are straightforward in certain domains such as genetics or chemistry, this is a brand new world to explore. I came across the fascinating comments by Philip Ball on chematica, a network of the transformations that link chemical species. Chemistry is not really my cup of tea, and I don’t have any of the coding abilities, unlike prominent data miners like Peter Murray-Rust. One thing I have, though, is a Mendeley library stuffed with papers (over 1400 as of today). Since my main focus now is on this ice-templating thing, I have a bit more than 350 papers on this topic alone.
In addition, I am also fascinated by issues related to presenting data, aka the visual display of quantitative information, as described by Tufte, among many others. I’ve played with Wordle before; it’s all over the internet now. Wordles are beautiful clouds of keywords, where the size of each word reflects how often it occurs in a list or a text. You have a good example with the display of keywords in the right column of the blog page.
Today, I did some quick and dirty analysis of my collection of papers. Exporting the Mendeley data to a bib file, I compiled lists of the titles of the papers in my library. I used the freely available Wordle website. The whole process was really fast, like 15 minutes or so. The first result I got is shown below.
Well, as you can expect, since I am interested in porous ceramic materials templated by ice crystals, these keywords are obviously dominating the wordle. In the upper right you can find “zirconia”, reminiscent of my PhD on the low temperature degradation of zirconia-containing ceramics. This was in the pre-Mendeley years, so I don’t have many papers left on this topic.
Things get more interesting if I restrict the analysis to the titles of the papers related to ice-templating. I got about 340 of them. I’ve followed really closely the ceramic domain, and much less the polymer field. Polymers are thus largely under-represented in the following analysis, although ice-templated polymers came first.
The first obvious observation is the absolute domination of “freeze”, “casting”, “porous” and “ceramics”. They are in almost every title. So if you want to be original, don’t come up with a paper entitled “freeze casting of porous ceramics”. The other dominant keywords are “structure” and “properties”, which is a pretty good image of the current approach to the phenomenon. Freeze whatever you have and look at the structure and properties. Not groundbreaking, most of the time. But the underlying mechanisms are so complex that very few people are willing to tackle them. “Tissue” and “scaffolds” are pretty strong too, and tissue engineering has indeed been one of the main focuses so far in terms of potential applications. “Ice” is less prominent than “freeze”, which reflects how people are currently describing the process: “freeze-casting” instead of “ice-templating”. I am not a big fan of “freeze-casting”, since it was originally used to describe the processing of dense materials. Although pretty much everyone is doing porous materials, “freeze-casting” still dominates. “Ice-templating” excludes all solvents other than water, so it’s not perfect either.
I also did the same analysis compiling all the abstracts. This is much closer to mining the full text of the papers. The output is much more balanced.
“Pore”, “porous”, “structure” and “freeze” still dominate, but the relative occurrences of other keywords are much more balanced. Since people tend to report almost exclusively positive results, we get a lot of “increased”, “high”, “new”, “novel”, “potential”, “significantly” and “significant”, better represented than “low” and “decreased”. “Defects” is noticeably absent, although it remains a major issue of the process. “Control” is missing from the wordle (well, not really missing, but it’s really tiny), a fair representation of the majority of the papers, where people exert no control whatsoever. Freeze and see.
“Properties” is relatively large, although people are almost exclusively looking at mechanical properties (hence the presence of “MPa”). People became interested only very recently in other properties, such as conductivity or piezoelectricity.
Regarding materials, “silica” and “alumina” are the only ones found here. A lot of room for testing other materials, and therefore other properties. “Water” and “camphene” are of similar size, as people are equally interested in both solvents.
Missing keywords are equally interesting. “Colloids” is hardly visible, although everyone is dealing with colloidal suspensions. Ceramists are usually talking about slurries instead of colloidal suspensions, which is why we get “slurry” and “slurries” instead. Maybe. I still believe we have a lot to learn if we look at the colloid science papers.
“Interface” is the other elephant in the room. The control of the process largely depends on controlling the interface, and is something that people have largely ignored so far.
Without digging too much into the details, this quick and simple analysis is very informative. Having followed the domain very closely for the past 5 or 6 years, I find the keyword clouds obtained here very representative of the current state of the art. I’d love to extend this analysis to the full text of the papers, although I will need different tools to do it. Maybe I should get access to the Mendeley API. They are responding to over 100 million calls to their database each month, they can surely afford a few more. In the meantime, I’ll try to apply the same analysis to different domains, using Google Scholar or Scopus and Mendeley. More later if I’m successful.
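For the record, the counting behind a title wordle, as described above, can be sketched in a few lines of Python. This is not what I actually did (I pasted the titles into the Wordle website); it is a rough sketch that assumes one `title = {...}` field per line in the bib file, and the two entries and the stopword list below are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:  title = {{Some title}},  (but not "booktitle = ...")
TITLE_LINE = re.compile(r"^\s*title\s*=\s*(.+)$", re.IGNORECASE | re.MULTILINE)

# Small hand-picked stopword list (an assumption, extend as needed)
STOPWORDS = {"of", "the", "and", "in", "a", "an", "for", "by", "on", "with", "to"}


def title_word_counts(bib_text):
    """Count how often each word occurs in the title fields of a BibTeX export."""
    counts = Counter()
    for match in TITLE_LINE.finditer(bib_text):
        # Keeping only runs of letters drops braces, quotes, commas and hyphens
        words = re.findall(r"[a-z]+", match.group(1).lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


# Two made-up entries standing in for a real Mendeley export
sample_bib = """
@article{a,
  title = {{Freeze casting of porous ceramics}},
  year = {2008}
}
@article{b,
  title = {Ice-templated porous alumina structures},
  year = {2010}
}
"""

counts = title_word_counts(sample_bib)
print(counts.most_common(5))
```

The same function would work on abstracts by changing the field name in the pattern, although abstracts that span several lines in the bib file would need a more careful parser.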
Funny coincidence, this month’s issue of Nature Materials was released today while I was playing around with this analysis. Check out the front cover…
April 2, 2012 § Leave a comment
Google is now tracking the metrics of journals. They chose the h5-index, which is basically the h-index restricted to articles published in the last 5 years. The search function works with keywords, as you can guess. So if you search for materials science journals with the keyword “materials”, you will only get journals whose names include “materials”, and miss journals like Nano Letters, ACS Nano, and others.
If you click on the h5 link, you get a list of the top cited papers for that journal. Neat. The 2007 graphene paper by Geim and Novoselov in Nature Materials has already been cited >5600 times. Holy cow.
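If the definition is not familiar: the h-index of a set of papers is the largest number h such that h of the papers have at least h citations each, and the h5 version applies this to the papers a journal published in the last 5 years. A minimal sketch in Python, with made-up citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            # Papers are sorted, so no later paper can satisfy the condition
            break
    return h


# Hypothetical citation counts for the papers a journal published
# in the last 5 years (restricting to that window is what makes it an h5)
recent_papers = [10, 8, 5, 4, 3]
print(h_index(recent_papers))
```

Here four papers have at least 4 citations each, but there are not five papers with at least 5, so the index is 4.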