The Automated Academic
January 4, 2015
I must confess: I am a lazy person. I hate to spend unnecessary time on tasks, in particular mundane or recurrent ones. When doing science, even though we are constantly exploring new ideas or novel methods, there is a fairly high number of recurring activities, from literature review to data analysis or writing. Being (barely) part of the computer native generation, I am very fond of any tool that can help me save time in my academic workflow (and improve the reproducibility of our science). Although I keep an eye open for new options, I have developed a relatively steady set of practices and tools over the years, which help me save a lot of time and concentrate on the tasks I enjoy. So here they are; I hope you will learn a few new ones.
Essential tools I use: Google account (Google Docs, Google Drive, Google Scholar), Feedly, IFTTT, IPython notebook, Twitter, Mendeley, Pandoc, ORCID.
Tasks I automated: literature survey, citations formatting, reading lists, data analysis, email, writing.
Disclaimer: I am not sure whether to consider myself a geek or not. When it comes to automation, many options require an advanced command of the tools we use, i.e., being a power user. Almost all of the solutions I have listed below have a very low entry barrier (except IPython, for which you need to be familiar with … Python !), and can be set up rapidly by anyone.
Literature review is an essential activity of academic research. I have already covered the topic here, so here is a quick breakdown of the tools I use.
- RSS feeds (free): to keep track of all new articles from a given set of journals. After the death of Google Reader, I settled on Feedly (free). Good enough for now.
- Google Scholar (free) alerts. Google Scholar has become an essential part of my workflow, to keep track of what is going on in the world of peer-reviewed science. Its most useful feature is the e-mail alerts. I set up a couple of search alerts, based on keywords relevant to my research. They come almost daily to my inbox. I only wish I could combine several alerts into one, which would help me reduce the number of emails I get.
- Twitter (free). I am a Twitter addict. And one of the many reasons is that it helps me stumble upon new content in the world of science. Although plenty can be done with the basic Twitter tools (hashtags, lists, etc.), you can build a few more elaborate tools. A while ago, I set up a Twitter bot based on PubMed, which automatically posts tweets with links to new papers on a given topic. More explanations here. Twitter can also be combined with IFTTT (free) for a number of tasks. If you do not want to be involved with the Twitter API, you can do some basic tracking with IFTTT, such as automatically listing tweets with a given hashtag (15 tweets max per search), and saving the output to a Google Doc file. I just set up a number of tasks based on this, I will let you know how it goes in a while.
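To give an idea of what sits behind a PubMed-based bot like the one mentioned above, here is a minimal sketch in Python. It is not the code of my bot: the function names, the 140-character limit and the link format are assumptions made for the illustration. It only builds the search URL for the NCBI E-utilities and formats a tweet; fetching the results and posting them are left out.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(topic, days=1, max_results=15):
    """Return an ESearch URL for papers on `topic` added in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": topic,
        "reldate": days,       # only records from the last `days` days
        "datetype": "edat",    # ... counted from their Entrez entry date
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


def format_tweet(title, pmid, limit=140):
    """Build a tweet from a title and a PubMed ID, trimming the title to fit.

    Assumption: the link counts at its full length (Twitter actually
    shortens links, so this is a conservative simplification).
    """
    link = "https://pubmed.ncbi.nlm.nih.gov/{}/".format(pmid)
    room = limit - len(link) - 1  # one space between title and link
    if len(title) > room:
        title = title[: room - 3].rstrip() + "..."
    return title + " " + link


print(build_pubmed_query("open access", days=2))
```

Running the query once a day and tweeting only the IDs you have not seen before is enough for a basic bot.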
Reading is another essential, recurring task of my workflow. I read mostly two kinds of documents: peer-reviewed papers, and articles. All my papers are automatically organized in Mendeley (free), thanks to the watched folder (I download every article to a specific folder, the content of which is automatically added to Mendeley). For all other articles, I tend to send everything to Instapaper (free), which I like a lot (it removes all the clutter). This can be done directly from Feedly. With IFTTT, I can also automatically send links from tweets I have favorited to Instapaper.
To keep track of articles I particularly enjoyed or found relevant, you can automatically create a listing of liked or archived Instapaper articles in Google Docs. Mostly future proof, I guess.
I also set up an IFTTT recipe to send to Instapaper links from tweets with a given hashtag (e.g., #ipython).
E-mail is like peer-review or democracy. It is the best solution until we find something better, and it’s quite clear that it’s here to stay. I work hard to be close to Inbox Zero, which I usually achieve. Rules are nevertheless a very powerful tool to automate the wonderful task of dealing with your email. Kind of obvious, but super efficient.
Duh ! If you do not back up your data, expect a slow, painful death in the near future. You will have deserved it.
Data analysis, file organization
- Tags vs. folders. Should you organize your files ? Even though there are a lot of tagging solutions for files out there (one comes with OS X), I still use folders. You can automate some of your file management with Hazel ($29), for instance. The only automation I use is the watched folder for updating my Mendeley library, as discussed above.
- Data analysis. There are probably a number of recurring experiments in your workflow. And there is a good chance that you end up with CSV files containing your data. If that’s the case, it would be a good idea to get rid of Excel and move to IPython, and in particular the IPython notebook (free). @ajsteven130 turned me into a Python fan, and for me, there is no way back. I have just completed a project (i.e., a paper) for which I did the entire analysis in the notebook, and it is just too good. It is also a big win for reproducibility and sharing what you did. More here.
- Getting values from plots. I use Graphclick (OS X). This little gem automatically extracts the values from plots when you don’t have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn’t been updated for years, but does the job perfectly. Ridiculously cheap ($8).
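As an illustration of the CSV-to-notebook step, here is a minimal sketch of the kind of cell you could run. The data, the column names and the `summarize` function are made up for the example, and it sticks to the Python standard library; in practice many people would reach for pandas for this.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical data, standing in for a CSV file exported from an experiment.
RAW = """condition,value
A,41.2
A,39.8
A,40.5
B,55.1
B,57.3
B,56.0
"""


def summarize(csv_file, group_col, value_col):
    """Return {group: (mean, sample standard deviation, n)} for one column.

    Each group needs at least two values, otherwise stdev() raises an error.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[group_col]].append(float(row[value_col]))
    return {
        name: (mean(values), stdev(values), len(values))
        for name, values in groups.items()
    }


# With a real file: summarize(open("results.csv", newline=""), ...)
summary = summarize(io.StringIO(RAW), "condition", "value")
for name, (avg, sd, n) in sorted(summary.items()):
    print("{}: {:.2f} +/- {:.2f} (n={})".format(name, avg, sd, n))
```

The point is less the statistics than the fact that the whole analysis lives in one file you can rerun on the next batch of data and share with your co-authors.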
Whether it’s papers, grant applications, or reports, we spend a fair amount of time writing. Even though papers do not write themselves, there are a number of things that can be automated to help you concentrate on the content.
- Scheduling time for writing. This is not really an automation solution, but I settled on this routine a while ago (2 years ago, maybe ?). Whenever I have something to write, which is pretty much all the time, I block a dedicated amount of time in my day to write. No matter what. I am a morning person when it comes to writing, so I write for 1h (or more) every day, first thing when I get to the lab, and it makes a huge difference at the end of the week. Given the amount of writing I will have to deal with this year, I certainly plan to keep this approach. If 1h per day scares you, try 20 min. At the end of the week you will end up with about 2 hours ! Big win.
- Incorporating references in your writing, and formatting the references. If you’re not using a reference manager, you’re doing it wrong. Period. There are plenty of options out there, so you don’t have any decent excuse. I settled on Mendeley many years ago, and am not planning to change since they gave me a shirt (private joke here). Bonus point for syncing my library (including PDFs) between the various computers I use.
- If some part of your writing involves repetitive expressions, it might be a good idea to use text replacement software such as TextExpander ($35) and the like. I don’t. Yet.
- Conversion. I am still chasing the « One file to rule them all » dream: one master file for all kinds of outputs, from PDF to html, xml, and so on. I became a big fan of Markdown (a very simple markup language) for the first draft, and am seriously considering it as my master format, relying on Pandoc (free) for all the conversions.
- Solutions for collaborative writing. As soon as you are not alone on a writing project, you have several options to collaborate. And no, emailing the files back and forth to your colleagues is not a suitable option. Depending on your colleagues and your geekiness level, you have many options, including Google Docs (excellent for comments and review mode), GitHub in combination with Markdown, Overleaf (free, for LaTeX fans), etc. Bonus point for Mendeley, which automatically populates your library with the references cited in a file you received that are not in your library yet. Very useful.
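For the « One file to rule them all » idea, here is a minimal sketch of how a Markdown master file could be fanned out to several formats with Pandoc. The file names and the two helper functions are hypothetical. The sketch only prints the commands by default; pass `run=True` to execute them, which requires Pandoc to be installed (and a LaTeX installation for the PDF output).

```python
import subprocess


def pandoc_command(source, target, standalone=True):
    """Build the argument list for converting `source` to `target` with Pandoc.

    Pandoc infers the output format from the extension of the file given to -o.
    """
    cmd = ["pandoc", source, "-o", target]
    if standalone:
        cmd.append("--standalone")  # emit a complete document, not a fragment
    return cmd


def convert_all(source, targets, run=False):
    """Return one command per target; execute them only when run=True."""
    commands = [pandoc_command(source, t) for t in targets]
    if run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # needs pandoc on the PATH
    return commands


# Hypothetical file names, for illustration only.
for cmd in convert_all("draft.md", ["draft.html", "draft.docx", "draft.pdf"]):
    print(" ".join(cmd))
```

Keeping the conversion in a small script means every output is regenerated from the same source, so the Word file you send to a co-author never drifts from the PDF.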
Updating your CV
Most academics love (and are often asked) to have an up-to-date list of their achievements. You have many options here. The solution requiring the least effort is to sign up for a Google Scholar account. It seems to have become one of the standards today, along with ORCID (free). Bonus point for keeping track of citations to your work if you are addicted to metrics. Alternative solution if you need a PDF with a list of your papers: keep track of your papers in Mendeley, get a bib file from it with all your papers, and use this with your favorite LaTeX template.
- Password management. Automation AND safety. I use 1Password ($50) and it does the job perfectly.
- Keeping track of papers you’ve reviewed. I just came across IFTTT (you have probably guessed that by now), and made a recipe involving Gmail and Google Drive. All incoming emails in my inbox with « review » in their subject line are listed in a Google Doc automatically. Tons of variations possible based on this workflow. Get creative.
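The logic of that recipe is simple enough to sketch in a few lines of Python. This is not what IFTTT runs, just an illustration of the rule under my own assumptions: the messages are invented, and the function names and the tab-separated output are arbitrary choices.

```python
from datetime import date


def matches_rule(subject, keyword="review"):
    """True when the keyword appears in the subject line, ignoring case."""
    return keyword.lower() in subject.lower()


def log_rows(messages, keyword="review"):
    """Turn matching (date, sender, subject) tuples into tab-separated rows."""
    return [
        "{}\t{}\t{}".format(received.isoformat(), sender, subject)
        for received, sender, subject in messages
        if matches_rule(subject, keyword)
    ]


# Hypothetical messages, for illustration only.
inbox = [
    (date(2015, 1, 2), "editor@journal.example", "Invitation to review manuscript"),
    (date(2015, 1, 3), "colleague@lab.example", "Lunch on Friday?"),
    (date(2015, 1, 4), "editor@journal.example", "Thank you for your Review"),
]

rows = log_rows(inbox)
for row in rows:
    print(row)
```

Note that a plain substring match also catches subjects such as « Preview of the next issue », so expect a few false positives in the log.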
Anything you use that I missed ? Let us know in the comments.