The Automated Academic

I must confess: I am a lazy person. I hate to spend unnecessary time on tasks, in particular mundane or recurrent ones. When doing science, even though we are constantly exploring new ideas or novel methods, there is a fairly high number of recurring activities, from literature review to data analysis or writing. Being (barely) part of the computer-native generation, I am very fond of any tool that can help me save time in my academic workflow (and improve the reproducibility of our science). Although I keep an eye open for new options, I have developed a relatively steady set of practices and tools over the years, which help me save a lot of time and concentrate on the tasks I enjoy. So here they are; I hope you will learn new ones here.

TL;DR

Essential tools I use: Google account (Google Docs, Google Drive, Google Scholar), Feedly, IFTTT, IPython notebook, Twitter, Mendeley, Pandoc, ORCID.
Tasks I automated: literature survey, citation formatting, reading lists, data analysis, email, writing.

Disclaimer: I am not sure whether to consider myself a geek or not. When it comes to automation, many options require advanced control of the tools we use, i.e., being a power user. Almost all of the solutions I have listed below have a very low entry barrier (except IPython, for which you need to be familiar with … Python !), and can be set up rapidly by anyone.

Literature review

Literature review is an essential activity of academic research. I have already covered the topic here, so here is a quick breakdown of the tools I use.

  • RSS feeds (free): to keep track of all new articles from a given number of journals. After the death of Google Reader, I settled on Feedly (free). Good enough for now.
  • Google Scholar (free) alerts. Google Scholar has become an essential part of my workflow, to keep track of what is going on in the world of peer-reviewed science. The most useful part of it is the e-mail alerts. I set up a couple of search alerts, based on keywords relevant to my research. They come almost daily to my inbox. I only wish I could combine several alerts into one, which would help me reduce the number of emails I get.
  • Twitter (free). I am a Twitter addict. And one of the many reasons is that it helps me stumble upon new content in the world of science. Although plenty can be done with the basic Twitter tools (hashtags, lists, etc.), you can build a few more elaborate tools. A while ago, I set up a Twitter bot based on PubMed, which automatically posts tweets with links to new papers on a given topic. More explanations here. Twitter can also be combined with IFTTT (free) for a number of tasks. If you do not want to be involved with the Twitter API, you can do some basic tracking with IFTTT, such as automatically listing tweets with a given hashtag (15 tweets max per search), and saving the output to a Google Doc file. I just set up a number of tasks based on this; I will let you know how it goes in a while.

Reading

Reading is another essential, recurring task of my workflow. I read mostly two kinds of documents: peer-reviewed papers, and articles. All my papers are automatically organized in Mendeley (free), thanks to the watched folder (I download every article to a specific folder, the content of which is automatically added to Mendeley). For all other articles, I tend to send everything to Instapaper (free), which I like a lot (it removes all the clutter). This can be done directly from Feedly. With IFTTT, I can also automatically send links from tweets I have favorited to Instapaper.
To keep track of articles you particularly enjoyed or found relevant, you can automatically create a listing of liked or archived Instapaper articles in Google Docs. Mostly future proof, I guess.
I also set up an IFTTT recipe to send to Instapaper links from tweets with a given hashtag (e.g. #ipython).

E-mail

E-mail is like peer-review or democracy. It is the best solution until we find something better, and it’s quite clear that it’s here to stay. I work hard to be close to Inbox Zero, which I usually achieve. Rules are nevertheless a very powerful tool to automate the wonderful task of dealing with your email. Kind of obvious, but super efficient.

Backup

Duh ! If you do not back up your data, expect a slow, painful death in the near future. You will have deserved it.

Data analysis, file organization

  • Tags vs. folders. Should you organize your files ? Even though there are a lot of tagging solutions for files out there (one comes with OS X), I still use folders. You can automate some of your file management with Hazel ($29) for instance. The only automation I use is the watched folder for updating my Mendeley library, as discussed above.
  • Data analysis. There are probably a number of recurring experiments in your workflow. And there is a good chance that you end up with CSV files containing your data. If that’s the case, it would be a good idea to get rid of Excel and move to IPython, and in particular the IPython notebook (free). @ajsteven130 turned me into a Python fan, and for me, there is no way back. I have just completed a project (i.e., a paper) for which I did the entire analysis in the notebook, and it is just too good. It is also a big win for reproducibility and sharing what you did. More here.
  • Getting values from plots. I use Graphclick (OS X). This little gem automatically extracts the values from plots when you don’t have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn’t been updated for years, but does the job perfectly. Ridiculously cheap ($8).
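
To make the data analysis point a bit more concrete, here is a minimal sketch of the kind of CSV crunching that fits in a single notebook cell. It is only an illustration: the column names (sample, temperature_c, porosity) and the values are invented, and it sticks to the Python standard library so it runs anywhere.

```python
import csv
import io
import statistics

# Hypothetical data: in practice this would be a CSV file exported from an instrument.
RAW_CSV = """sample,temperature_c,porosity
A,-10,0.52
A,-20,0.48
B,-10,0.61
B,-20,0.57
"""


def summarize(csv_text, group_col, value_col):
    """Return {group: (mean, n)} for value_col, grouped by group_col."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {name: (statistics.mean(vals), len(vals)) for name, vals in groups.items()}


summary = summarize(RAW_CSV, "sample", "porosity")
for name, (mean, n) in sorted(summary.items()):
    print(f"{name}: mean porosity = {mean:.2f} (n = {n})")
```

Once the analysis lives in a function like this, rerunning it on next month's experiment is a matter of pointing it at a new file, which is most of the reproducibility win.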

Writing

Whether it’s papers, grant applications, or reports, we spend a fair amount of time writing. Even though papers do not write themselves, there are a number of things that can be automated to help you concentrate on the content.

  • Scheduling time for writing. This is not really an automation solution, but I settled on this routine a while ago (2 years ago, maybe ?). Whenever I have something to write, which is, pretty much, all the time, I block a dedicated amount of time in my day to write. No matter what. I am a morning person when it comes to writing, so I write 1h (or more) every day first thing when I get to the lab, and it makes a huge difference at the end of the week. Given the amount of writing I will have to deal with this year, I certainly plan to keep this approach. If 1h per day scares you, try 20 min. At the end of the week you will end up with 2 hours ! Big win.
  • Incorporating references in your writing, and formatting the references. If you’re not using a reference manager, you’re doing it wrong. Period. There are plenty of options out there, so you don’t have any decent excuse. I settled on Mendeley many years ago, and am not planning to change since they gave me a shirt (private joke here). Bonus point for syncing my library (including PDFs) between the various computers I use.
  • If some part of your writing involves repetitive expressions, it might be a good idea to use text replacement software such as TextExpander ($35) and the like. I don’t. Yet.
  • Conversion. I am still chasing the « One file to rule them all » dream: one master file for all kinds of outputs, from PDF to html, xml, and so on. I became a big fan of Markdown (a very simple markup language) for the first draft, and am seriously considering it as my master format, relying on Pandoc (free) for all the conversions.
  • Solutions for collaborative writing. As soon as you are not alone on a writing project, you have several options to collaborate. And no, emailing the files back and forth to your colleagues is not a suitable option. Depending on your colleagues and your geekiness level, you have many options, including Google Docs (excellent for comments and review mode), GitHub in combination with Markdown, Overleaf (free, for LaTeX fans), etc. Bonus point for Mendeley for automatically populating your library with the references cited in a file you received that are not in your library yet. Very useful.
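
For the conversion step mentioned above, a rough sketch of what « one master file, many outputs » could look like when driving Pandoc from Python. The file name paper.md and the list of target formats are assumptions for the example; Pandoc itself has to be installed separately, and PDF output additionally needs a LaTeX engine on your machine.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source, target):
    """Build the pandoc call; pandoc infers the formats from the file extensions."""
    return ["pandoc", "--standalone", str(source), "-o", str(target)]


def convert_all(source, extensions=(".html", ".docx", ".pdf")):
    """Convert one Markdown master file to several formats.

    The conversions only run when pandoc is on the PATH and the source file
    exists; otherwise the planned commands are returned without being run.
    """
    source = Path(source)
    commands = [pandoc_command(source, source.with_suffix(ext)) for ext in extensions]
    if shutil.which("pandoc") is None or not source.exists():
        return commands
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return commands


# "paper.md" is a hypothetical file name.
for cmd in convert_all("paper.md"):
    print(" ".join(cmd))
```

Each printed line is exactly what you would type in a terminal, so you can start by copy-pasting the commands and only script them once the routine has settled.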

Updating your CV

Most academics love (and are often asked) to have an up-to-date list of their achievements. You have many options here. The solution requiring the least effort is to sign up for a Google Scholar account. It seems to be becoming one of the standards today, along with ORCID (free). Bonus point for keeping track of citations to your work if you are addicted to metrics. Alternative solution if you need a PDF with a list of your papers: keep track of your papers in Mendeley, export a bib file containing all of them, and use it with your favorite LaTeX template.

Other

  • Password management. Automation AND safety. I use 1Password ($50) and it does the job perfectly.
  • Keeping track of papers you’ve reviewed. I just came across IFTTT (you have probably guessed that by now), and made a recipe involving Gmail and Google Drive. All incoming emails in my inbox with « review » in their title are listed in a Google Doc automatically. Tons of variations possible based on this workflow. Get creative.
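
If you would rather script the review log than rely on IFTTT, the logic behind such a recipe is short. The sketch below is not the IFTTT recipe itself, just the same idea in plain Python: the inbox is a hypothetical list of (date, sender, subject) tuples, and the output is CSV text that any spreadsheet can import.

```python
import csv
import io

# Hypothetical inbox: (date, sender, subject) tuples standing in for real messages.
INBOX = [
    ("2014-01-07", "editor@journal-a.example", "Invitation to review manuscript JA-123"),
    ("2014-01-09", "colleague@lab.example", "Lunch on Friday?"),
    ("2014-01-15", "editor@journal-b.example", "Thank you for your Review"),
]


def matching_rows(messages, keyword="review"):
    """Keep messages whose subject contains the keyword (case-insensitive)."""
    return [m for m in messages if keyword.lower() in m[2].lower()]


def to_csv(rows):
    """Render the log as CSV text, ready to import into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["date", "sender", "subject"])
    writer.writerows(rows)
    return buffer.getvalue()


review_log = matching_rows(INBOX)
print(to_csv(review_log))
```

Note that a plain substring match will also catch subjects containing « preview », so expect to prune the odd false positive by hand.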

Anything you use that I missed ? Let us know in the comments.

My twitter achievements

Tweeps, and in particular scientists, love discussing why they use twitter. They also usually discuss it… on twitter of course ! Trying to convince people already on twitter to use twitter is an interesting recursive situation, but people not on the network are very often dubious about the benefits. One of the questions I get asked quite often is the following: can you give me some practical examples of things that happened because you were on twitter ?

Earlier this week, I was invited to a PhD viva at the Collège de France. The work (biomineralization of bone) was loosely related to my direct research interests (freezing and self-assembly) but brilliant, and I really enjoyed that day. The jury was eclectic and we had a good scientific discussion. While enjoying the post-viva champagne on the rooftop – the view over Paris is truly outstanding.

Room with a few. Not bad.

I’ll take a position there any day, not even asking for an office, the terrace will be fine – I learned the reason I was there that day. When some of the work was published in Nature Materials last year, I tweeted about it, like I do when I see papers which I find of interest, and that tweet showed up on the altmetrics page of the paper. That’s how they realized I could be interested in participating in the jury.

That case was just one more example of things that happened to me through twitter. For the sake of giving simple, practical examples of similar situations, here is a quick summary of what I would call my twitter achievements:
– invited to a PhD viva.
– co-authored a review paper with authors I’ve never met in real life. The paper is on the verge of being accepted in a prestigious journal (fingers crossed).
– shared a few beers and nice meals in Paris with a few CNRS colleagues whom I met on twitter.
– wrote an op-ed in Le Monde (online edition) to discuss science communication in France and the use of social media.
– wrote an article in Rue89 (a mainstream media outlet in France, online only) on open access, following a comment I tweeted about one of their papers.

So there you go. Simple examples. Share yours in the comments.

Top 10 twitter hashtags for scientists

Twitter has become an integral part of my daily activities as a scientist, from keeping up to date with science news to getting advice and building a network of collaborators. It takes time to get started with twitter, and the benefits (in my case) only appeared after quite a long period of practice. Beyond your own list of people you follow, hashtags are a terrific way of finding great content on twitter. Here’s my top 10 list of science-related hashtags that I use and check regularly.

  1. #icanhazpdf. Don’t have access to the paper you want to read ? Ask your kind fellows on twitter if anyone haz access to it. Be aware that you’re crossing a line here. Read this first.
  2. #scholarsunday. Scholars don’t work on Sunday, right? RIGHT ? It’s therefore a great time to update your following list and find great people to follow on twitter. #Scholarsunday will point you to relevant accounts.
  3. #OA and #openscience. Open-access and open science related tweets. Very active.
  4. #scicomm. Science communication tweets. Also very active.
  5. #figureclub. Designing a figure for your next manuscript and want to get some feedback ? Tweet your figure with #figureclub. I wish more people were using it.
  6. #realtimechem. A very popular hashtag for chemists. There’s even a twitter account for it now: @RealTimeChem
  7. #chemophobia. Self-explanatory. Keep track of irrational chemophobic tweets.
  8. #acwri , #acwrimo, and #GetYourManuscriptOut. Procrastinating ? Need motivation to get your manuscript written ? You’re not alone.
  9. #SciArt. The intersection of science and art. I use it to tweet scientific pictures which are simply beautiful.
  10. #ecrchat, #phdchat. Generic chit-chat on early-career research and PhD.

Bonus:  #emojiresearch, #overlyhonestmethods. And finally, when you just need some distractions from your intense brain activity, just head towards these popular hashtags, for a different way to explain your research, or science-related semi-private jokes. Have fun.

Share your favorites in the comments !

On this #overlyhonestmethods thing

The #overlyhonestmethods hashtag has been crazily popular since it started two days ago. Thousands of tweets are flying around, in a beautiful pluridisciplinary ballet. Scientists from all over the planet are cranking out their witty jokes as fast as they can, in an interesting mix of behind-the-scenes insights and private jokes. If you’re a scientist yourself, you can often tell whether it’s a witty joke or a scientific confession. If you’re not, it can be a different story. Reading non-scientists’ tweets and a few comments on non-scientific sites, like here and there, I realized many people took this hashtag as a confession from scientists. That’s not exactly the case. First hint: scientists are humans too, and as incredible as it may sound, some have a solid sense of humor.

In every place I’ve been, people are working way too much. Scientists are passionate workaholics. Working around the clock. On weekends. At night because that’s the only time when the equipment is available. Skipping lunch. So yes, sometimes we need to relieve some of the pressure. We get tired. Our caffeine intake is high, but it’s not because we don’t have anything better to do than hang around that coffee table the whole day.

Some of these tweets revealed the frustration we all share, in particular regarding some weird conventions of the writing style or the trending topics.

@eperlste: We used jargon instead of plain English to prove that a decade of grad school and postdoc made us smart. #overlyhonestmethods

@biochembelle: We decided to use Technique Y because it’s new and sexy, plus hot and cool. And because we could. #overlyhonestmethods

@Bashir_Course9: method isn’t described here b/c this High Impact Report is 200 words. see Supplement Appendix L for vague description #overlyhonestmethods

@ProfLikeSubst: This paper represents just the sexiest stuff we could skim from the data. The carcass paper will be dumped somewhere #overlyhonestmethods

Science is also expensive, most of the time. And we have to adapt our dream experiment to the practical and financial reality of the lab. We often have to improvise. And no, we don’t have access to each and every article ever published. Paywalls are still a major source of grief for most of us. My tweet on open-access got >230 RT and >100 faves (and counting). I guess I struck a sensitive point, here. If it helps us get the message to the public, it’s all good news. A few major outlets mentioned it (here and here).

@KayLa_D_87: Compound Q was excluded from study, because it was expensive. #overlyhonestmethods

@talesfromlabs: Compound A was preferred to B because there was leftovers from post-doc who left three years ago. (also B costs $$$) #overlyhonestmethods

@paulcoxon: The beam shutter was held stable by an in-house built support made from BluTak & the top off an old Biro #overlyhonestmethods

I have my share of stories like this. When we are working on the beamline at the ESRF, we work around the clock. If our in-house setup breaks at 3AM, we don’t go to bed and come back in the morning after a good night’s sleep. We fix it. With whatever we have lying around. Oh, and these in situ freezing experiments we did ? The molds were actually straws from orange juice packs. Perfect diameter, ideal thickness. Why bother ordering expensive technical ones ? The orange juice was actually really bad. And Blu Tack is a scientist’s best friend.

Science is hard. Often frustrating. People are moving in and out, and it’s sometimes difficult to keep track. It often starts from a failed experiment or a mistake, and then it takes a long time to understand what’s going on, so that superstition can be invoked at some point, until we figure out why it’s working this way.

@JacquelynGill: The microbalance was so temperamental that an undergrad named it “Larry” in order to yell at it more effectively. #overlyhonestmethods

@AnneOsterrieder: We don’t know how this method was performed because the PhD student’s lab book is written in a foreign language. #overlyhonestmethods

@aivelo: I’m ready to surrender and write “For no apparent reasons, my method works with only half of the samples.” #overlyhonestmethods

@james_gilbert: Apparatus was placed on the 2nd shelf up, approx 1 foot left of the spider plant. Results were irreproducible elsewhere #overlyhonestmethods

@researchremix: Data are available upon request because then we can tidy the spreadsheet only if absolutely necessary #overlyhonestmethods

@AkshatRathi: It took 10 years of work to write this 6-page long paper, but you wouldn’t be able to guess that from reading it. #overlyhonestmethods

A large number of tweets also revolved around the never-ending chase for grants.

@multisensebrain: Our results have significant implications for that we are seeking grant funding for #overlyhonestmethods

@dbmoore: Our study used string theory, global warming, and big data because that’s where the grant money is #overlyhonestmethods

@peds_id_doc: We’re submitting this half-finished experiment for publication because we ran out of grant money. #overlyhonestmethods

@paulcoxon: This work was made possible by @EPSRC Grant #1234 & @eBay from where we scrounged parts to repair our ancient apparatus #overlyhonestmethods

Finally, some tweets revealed some of the privileges we have. Working in funny locations. Whenever we want. Choosing the people we work with. And considering the many sacrifices we accept otherwise, I don’t feel spoiled doing it.

@Gomblemomble: This part of the experimental work was carried out in Western Australia, because my supervisor has a friend there. #OverlyhonestMethods

So should we be worried about the way science is done ? Were these last two days a massive confession of scientific misconduct ? Not really. Partially maybe, at least that’s my feeling.

@Crommunist: I’m just making up a lot of these. #OverlyHonestMethods

We, as scientists, all know it’s a messy business. And guess what, science is performed by humans. Alive. Who sometimes go to the restroom, enjoy their weekend at home and have babies. But somehow it works and we make progress overall.

@researchremix: The data is old because in between writing the first draft and doing the revisions I had a baby #overlyhonestmethods