Three years of open access efforts: preprints are my future

December 3, 2016 § 1 Comment

Like many people, I started to boycott Elsevier 3 years ago. I went for the full boycott: I pledged not to publish any paper in Elsevier journals anymore, and not to review for any of their journals. I declined the first invitation to review on November 25, 2013. I am an established scientist with a wonderful permanent position at the CNRS, almost 60 papers already published, and enough grants secured for a few years. It was therefore much easier for me to do than for someone still trying to get a place in the sun.

Although I did not keep track of every paper I declined to review, I probably turned down 30 to 40 papers, give or take. Most editors (not all) quickly removed me from their reviewer database, so I stopped receiving invitations from them. Others did not, so I kept declining and sending the same message for 3 years. I receive more papers than I can review anyway, so it did not change my overall reviewing activity.

I had a small paragraph (I found one on the internet somewhere and adapted it; can't remember where, sorry about that) that I always sent to the editor whenever I declined a review, explaining why I did so. Of the 40 or so papers I declined to review, I got feedback about my message only twice.

One editor-in-chief emailed me once, and was rather sympathetic to my cause (he was himself publishing some of his papers in open access journals). He told me he had never received such straightforward and strongly worded comments on this topic before, even though some (many?) people had discussed it with him. I understood that these people were scared of being blacklisted by the journal.

The other feedback was from an editor-in-chief I know personally … as he is my former PhD supervisor. Of course he disagreed with me, but we had a good discussion on the topic. I didn't convince him to resign from the journal.

Other than this: no feedback whatsoever. None. No one cared. As far as I can tell, it did not make any difference. And I assume that I was probably the only one declining reviews for these reasons (the materials science community is not exactly at the forefront of open access efforts).

The other side of the boycott was my own papers. A few Elsevier journals are quite important in my domain. Before starting to boycott them, I had published 13 papers in Elsevier journals (Acta Materialia, Biomaterials, Journal of the European Ceramic Society). The last paper I published in an Elsevier journal was in 2012, and unlike many other scientists who pledged to boycott them, I did not publish with them again after that. There are many reasons to break such a boycott (mainly: not putting students or postdocs in a difficult position by excluding a relevant journal for their paper).

Whenever we had a paper ready for submission, we had to choose a journal. Although I always raised the question and explained my reasons, I never forced my co-authors to comply with my own choices. Their response was so-so. Most of them did not care too much, although they understood my point. We always found a good solution (in terms of journal). The two main issues discussed were (1) why just Elsevier, and not Wiley (I published 15 papers in the Journal of the American Ceramic Society, published by Wiley), Springer, etc., which are for-profit publishers too? and (2) the APC costs. On this last point, I was in a rather good position, having a large grant where APCs are an eligible cost. However, this also means not using this money for something else in the lab. As this fantastic grant is coming to an end, I may have to reconsider my opinion on this, though.

I also recently asked an editor to make my paper open access. YMMV, but hey, if you don't ask, you'll never know. In this case, I accepted an invitation on the condition that the paper be made open access, and the editor kindly accepted (the publisher agreed to make a few papers, which they deemed important enough, open access every year). Very nice (I still have to write the paper, though). This will not work for most papers, although APCs can sometimes be waived if you have good reasons.

I therefore experimented with a few open access journals, with various degrees of satisfaction. The open access journals I submitted to were either not-for-profit or society journals (PLOS One, Science and Technology of Advanced Materials, Materials, Inorganics), or open access mega journals from the big players (Scientific Reports, ACS Omega). We also published a few other papers in paywalled journals, and made the preprints available for them.

I did not spend too much on APCs. I paid them for PLOS One (happily), Scientific Reports (not happily), and Science and Technology of Advanced Materials (twice; reasonable APCs), and that's it. The APCs were waived in Materials (the paper was an invited review). We also had a feature paper in a paywalled journal that was made open to anyone (without us actually asking, which was very nice). The APCs of our latest paper (in ACS Omega) were reduced from $2000 to $0! A $500 transfer discount (the paper was rejected from another ACS journal), plus two $750 waivers offered by the ACS because I had previously published two other papers in one of their journals (Langmuir). Overall, it was thus not a huge amount spent on APCs during these three years.

Although I initially quite liked the idea of these mega journals, I have a different opinion today, after a few years of seeing what they publish. In some of these mega journals, there are a lot of so-so, or frankly terrible, papers (won't name, won't shame). In others (e.g. PLOS One), our community is not publishing, so I almost never found anything relevant in them (we published in PLOS One because I wanted biologists to see this paper, which was about antifreeze proteins. And they found it.).

Overall, I still believe in the value of journals, for the filtering they provide (or that authors provide by choosing to submit to them). Even though I use Google Scholar and the like for keeping track of what is published (through keywords and alerts), I also follow a number of journals to see what the different communities are up to (e.g. Langmuir, Soft Matter, etc.). I cannot achieve this with the mega journals. There is just too much noise, and too many communities publishing in these journals.

Open access journals initially also tried to differentiate themselves by providing new services to authors, such as altmetrics. However, this is not the case anymore today, as pretty much all journals are jumping on the train (I like to know how many times my papers were downloaded, even though it is sometimes a bit depressing). In my own experience, it is difficult to tell whether our papers received more attention because they were not behind paywalls, although I'd like to believe so. But hey, the idea is to make everything accessible. Who knows when and how a paper will be useful to someone and make an impact? Nobody has any answer to this question (which is a good thing, I believe).

In the meantime, preprints have attracted considerable attention and are developing rapidly. Although physicists have used arXiv forever, chemists (ChemRxiv), biologists (bioRxiv), and many others (SocArXiv) are now joining the game, and journals are increasingly open to preprints (of course). Elsevier now has a RoMEO-green policy regarding preprints for its journals. As more and more people learn about preprints, they also head to these servers when looking for a paper they don't have access to (search engines point to them, too). This is therefore a very cost-effective solution for making papers available right now. Feel free to argue in the comments below.

A number of other openness initiatives, besides papers, have also gained a lot of steam recently. I am talking here about data and figures, of course. I have become a huge fan of services like Figshare or GitHub. There is as much value (if not more) in sharing data and code (and giving them DOIs to get citations and keep track of their use) as in just publishing a paper. Even if you are not convinced by this, just think about your h-index: people are more likely to cite your paper if you give them stuff (tools, data) they can reuse. As increasingly avid users of image analysis, we now provide our code (Python) whenever we publish a paper (2 papers so far, here and here, and more coming soon). The code is a Jupyter notebook with Python code and explanations inside, trying to explain as precisely as possible what we did so that people can check, replicate, reuse, or iterate if they are interested. Based on the download counts, it proved almost as popular as the paper. This one was accessed 1589 times and downloaded 219 times (while the paper itself was accessed 3064 times to date)! I was positively surprised by this. It also initiated a new project and collaboration on open data (in the pipe, be patient). I am certainly going to continue in this direction for the foreseeable future.

Besides code and data, I found another very interesting use for Figshare (or anything similar you'd like): claiming the copyright of my own figures, so that their reuse (by yourself or someone else) is easy and does not depend on publishers. I thus started to upload a number of figures to Figshare (before submitting the paper). No editor has complained about this so far (I suspect editors actually like it, since they like to have a clear view of which license is used). This is not very useful for simple plots: as long as you provide the data, they can easily be replotted in most cases. For complex plots or drawings and images that took a lot of time and effort, I found this idea very exciting and incredibly simple to implement. It takes 2 minutes per item to upload and tag it on Figshare.

Based on this analysis, where do I stand today?

  • Regarding the Elsevier boycott: I will do my best to avoid them, but if the community we are targeting is publishing (reading) in an Elsevier journal, so be it. Like I said, Elsevier is RoMEO-green on preprints, so we can make the paper available at no cost, and for me, that's good enough for now. Our main criterion for selecting a journal is (and has always been): which community do we target? Who do we think will be most interested by our paper?
  • Reviewing: because nobody cared about my boycott in these journals, I am not declining reviews anymore (I am not accepting ALL reviews either, so don’t send me everything). There’s no reason I can’t kill papers like everybody else, right?
  • Whenever I give a talk, I always mention on my slides if the papers are open access. I see more and more people doing this. It raises awareness among those not convinced yet.
  • Preprints: yes, yes, and yes. This is now my number one criterion. If a journal does not allow preprints and is not open access, it will probably be a no-go. In the short term, I believe preprints are the easiest way to make papers available at no cost (the cost of running arXiv is not negligible, but the cost per paper is incredibly low compared to the typical APC).
  • Data, code, presentations: Figshare! I love it. We now always release the code we develop, even if I am not a good coder (ahem). The feedback on the data/code we have released so far has been excellent. I have also started to share the slides of my talks, with very good feedback.
  • Keeping the copyright of my own figures using Figshare (or something similar if you don't like Figshare): I'll try to do this as much as possible. I love the idea and its simplicity. Figshare items can be embargoed, so this is not an issue in principle if you have a super fancy paper coming up.
  • Mega journals of for-profit publishers: I most likely won't publish with them anymore. Besides the APC issue (I am not going to pay $5k for a paper), I just find too much noise in these journals. It has become very clear that this is just another way for them to make money. Other mega journals: the same reasoning applies.
  • Educate our students about the publishing system, so that they can make their own choices, knowing how it works. This will take a generation or two, so we’ll have to be patient.

Even if you do not want to pay to make your papers open, there is therefore a lot you can do today to make your papers and their code/data available. Even though it's nice to see individuals fighting for this, I believe that the most efficient way to change the system is for funders to require open access. The ERC does this now. Other funders are joining the trend. Even reluctant academics will change their habits, because they won't have the choice. And this can actually happen rapidly. The journals will have to adapt, somehow.

That’s my position today. Feel free to argue in the comments or on Twitter.

Writing academic papers in plain text with Markdown and Jupyter notebook

July 17, 2015 § 4 Comments

TL;DR

My new workflow for writing academic papers involves the Jupyter notebook for data analysis and generating the figures, Markdown for writing the paper, and Pandoc for generating the final output. Works great!

Long version

As academics, writing is one of our core activities. Writing academic papers is not quite like writing blog posts or tweets. The text is structured, and includes figures, lots of maths (usually), and many citations. Everyone has their own workflow, which usually involves Word or LaTeX at some point, as well as some reference management solution. I have been rethinking my writing workflow recently, and came up with a new solution that meets a number of requirements I have:

  • future proof. I do not want to depend on a file format that might become obsolete.
  • lightweight.
  • one master file for all kinds of outputs (PDF, DOC, but eventually HTML, etc.).
  • able to deal with citation management automatically (of course).
  • able to update the paper (including plots) as revisions are required, with a minimal amount of effort (I told you I was lazy).
  • open source tools are a bonus.
  • strongly tied to my data analysis workflow (more on that later).

After playing around with a couple of tools, I experimented with a nice solution for our latest paper, and will share it here in case anyone else is interested.

This paper was particularly well suited to my new workflow. What we did was data-mine 120+ papers for process parameters and properties of materials, to extract trends and look at the relative influence of the various parameters on the properties of the material. The data in that case was a big CSV file, with hundreds of lines. Each data point was labelled with its bib key (e.g. Deville2006), which turned out to be super convenient later.

Data analysis

I became a big fan of the Jupyter notebook for our data analysis. The main selling points for me were the following:

  • document how the analysis was done (future proof). The mix of Markdown, LaTeX, and code is a game changer for me.
  • ability to easily change the format of the output (plots) depending on the journal requirements and my own preferences.
  • ability to instantaneously update plots in the final paper with new data. As I run the notebook, the figures are generated and saved in a folder.
  • ability to share how the analysis was done, so as to provide a reproducible paper. The notebook of our latest paper is hosted on FigShare along with the raw data, with its own DOI (you can cite it if you reuse it).
  • ability to generate the bibliography automatically. As each data point in my CSV file comes with its bib key, I can track exactly which references were used for a plot. This was particularly useful when writing that particular paper. After each plot, where data are coming from many different papers, I can generate a list of the bib keys used for the plot, and copy/paste that list into the paper. Boom !

All the analysis was done in a Jupyter notebook, which I later uploaded to FigShare when the paper was published. The notebook generates the figures with a consistent style, as well as the lists of bib keys. This turned out to be the biggest time saver here.

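To give you a rough idea, here is a minimal sketch of the simple function I use to generate the list of bib keys (the real version is in the notebook on FigShare; this one assumes the data sits in a pandas DataFrame with a bibkey column, and the example keys are made up):

import pandas as pd

def bibkey_list(data, mask=None):
    # collect the bib keys of the data points used in a plot,
    # deduplicate them, and format them as a Pandoc citation list
    keys = data.loc[mask, "bibkey"] if mask is not None else data["bibkey"]
    unique_keys = sorted(set(keys.dropna()))
    return "[" + "; ".join("@" + key for key in unique_keys) + "]"

# made-up example:
data = pd.DataFrame({"bibkey": ["Deville2006", "Deville2006", "Munch2009"],
                     "porosity": [0.55, 0.62, 0.48]})
print(bibkey_list(data, data["porosity"] > 0.4))
# prints: [@Deville2006; @Munch2009]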

When I run it for a figure, I get a citation list like the one above. Now I just have to copy that list and paste it directly into the Markdown file of the paper. Very cool.


Writing the paper

I am a big fan of LaTeX for long documents (PhD manuscript, etc.), but not so much for regular academic papers. I am not a physicist, so my papers are usually light in terms of maths. I chose to write everything in Markdown, which is something like LaTeX for dummies. It is a very, very simple markup syntax, very popular for blogging, among other uses. The files are plain text files, which is certainly the most future-proof solution I can think of. The syntax is dead simple; you will get it in literally 5 minutes.

I do all my writing in Sublime Text, boosted with a couple of packages. Of particular interest in this case: SmartMarkdown, and PandocAcademic (not mandatory, though).

Bibliography

I use Mendeley for my reference management. My favorite function is the automatic generation of a bib file, which I can use for my LaTeX or Markdown writing later on.

Getting the final version

What do you do with the Markdown file, then? The one tool that glues everything together is Pandoc, dubbed the "Swiss army knife" of document converters. It is a simple but extremely powerful command line tool. In my case, it takes the Markdown file and converts it into a Word or PDF document (or many other formats if you need them). The beauty of it is of course the generation of the bibliography and the incorporation of figures and beautifully typeset equations. You can run Pandoc from the command line directly. Here is the typical command line for what I want to do:

pandoc -s paper.md -t docx -o paper.docx --filter pandoc-citeproc --bibliography=library.bib --csl=iop-numerics.csl

Pandoc takes the paper.md file and the library.bib file for the bibliography, uses pandoc-citeproc and the iop-numerics.csl file to format the bibliography, and creates the paper.docx file for me. Easy!

Putting everything together

So I have everything I need now. Here is how it works.

  • The Jupyter notebook generates the figures and saves them in a folder.
  • The Markdown file starts with a few lines of YAML metadata, which I use to provide the title, authors, affiliations, and date.


---
title: A meta-analysis of the mechanical properties of ice-templated ceramics and metals
author: Sylvain Deville^1^\footnote{Corresponding author – Sylvain.Deville@saint-gobain.com}, Sylvain Meille^2^, Jordi Seuba^1^
abstract: Ice templating, also known as freeze casting, is a popular shaping route for macroporous materials. bla bla bla. We hope these results will be a helpful guide to anyone interested in such materials.
include-before: ^1^ Laboratoire de Synthèse et Fonctionnalisation des Céramiques, UMR3080 CNRS/Saint-Gobain, 84306 Cavaillon, France. \newline ^2^ Université de Lyon, INSA-Lyon, MATEIS CNRS UMR5510, F-69621 Villeurbanne, France \newline \newline Keywords 10.03 Ceramics, 20.04 Crystal growth, 30.05 Mechanical properties
date: \today
---

  • the text itself is formatted in Markdown. Note how the citations are used in the text, and how I point to the figure file: Markdown uses references to folders and files that are relative to the Markdown file itself.

# Introduction
Ice templating, or freeze casting[@Deville2008b], has become a popular shaping route for all kinds of macroporous materials. The process is based on the segregation of matter (particles or solute) by growing crystals in a suspension or solution (Fig. 1). After complete solidification, the solvent crystals are removed by sublimation. The porosity obtained is thus an almost direct replica of the solvent crystals.

![Principles of ice-templating. The colloidal suspension is frozen, the solvent crystals are then sublimated, and the resulting green body sintered.](../figures/ice_templating_principles.png)

Ice templating has been applied to all classes of materials, but particularly ceramics over the past 15 years. Although a few review papers [@Deville2008b; @Deville2010a; @Wegst2010; @Li2012b; @Deville2013b; @Fukushima2014; @Pawelec2014b] have been published, they mostly focus on the underlying principles. Little can be found on the range of properties that could be achieved.

Here is what the PDF looks like.

(screenshot of the typeset PDF)


  • You can build from the command line. You can also do everything from Sublime Text: just set the user settings of the SmartMarkdown package to automatically use the bib file (generated by Mendeley, for instance) and the CSL file (depending on which journal I submit to). You can also provide Pandoc with a LaTeX template if you want to.

"pandoc_args_pdf": ["--latex-engine=/usr/texbin/pdflatex", "-V", "--bibliography=/Users/sylvaindeville/Desktop/library.bib", "--csl=iop-numerics.csl", "--filter=/usr/local/bin/pandoc-citeproc", "--template=/Users/sylvaindeville/Documents/pandoc/templates/latex2.template"],

To build the final version, I either run Pandoc from the command line, or hit Shift+Cmd+P in ST and pick "Pandoc: render PDF", and Pandoc generates the final document for me, with the correctly formatted bibliography and the figures in place. That's it! I also saved the Pandoc command line arguments (as a text file) in the folder where the Markdown file is, so that I do not depend on Sublime Text in case I change my mind, and do not have to remember the exact command line to type (lazy, I told you).

Summary of the tools you need

  • A valid Python and Jupyter notebook installation, if you are doing your data analysis with it.
  • Pandoc.
  • A valid LaTeX installation.
  • A bib file for your bibliography.
  • A CSL file for the bibliography styles you want to use. Get the one you need here.
  • A text editor. Many choices available.

Total cost: $0.

Final Thoughts

It took a while to get everything in place and working, but I am happy with it now. This workflow was particularly suitable for this paper, since all the data analysis was done in the Jupyter notebook and there were many citations (in particular for each plot) that I did not want to input manually. During the review of the paper, one of the referees mentioned a couple of papers that we had not found initially. I updated the CSV file with the new data, ran the notebook, and the figures were instantaneously updated. Rebuild the final file from the updated Markdown file, and boom. Very little friction indeed.

A common question is about co-writing/proofreading when the paper is collaborative. In that case, I wrote almost everything. The other authors just sent me their parts in plain text and I pasted them in. I used the PDF for proofreading, and everyone annotated the PDF files. If I am in charge of the paper, I choose the tools. Deal with it.

Future improvements

I still have to copy/paste the list of bib keys corresponding to the figures into the Markdown file. Ideally, the list would be automatically generated within the Markdown file, so that there is even less friction in the whole process. I am not quite sure how to do this. Any suggestion is welcome.

If you want more control over the pagination of your output files, you can tell Pandoc to use a template (many journals provide LaTeX templates, for instance, at least in physics). I did not try, as the pagination requirements for submission are very minimal. The whole idea of a master text file is to *not* have to deal with this sort of thing.

Finally, some version control (e.g. with GitHub) would be nice.


Update 20/07/15

  • Added some Jupyter screenshots.
  • I forgot to mention the main limitation (for me) of this approach: Pandoc does not do cross-references. The impossibility of automatically referencing figures and equations is thus the main limitation. That is a trade-off I can accept for now, as I usually have a limited number of equations and figures. Overall, I prefer to save time on reference management rather than on cross-references to figures and equations. YMMV.

OS X text editors for academics

May 29, 2015 § 3 Comments

Most of our time, as academics, is spent writing on our computers: papers, dissertations, grant applications, reviews, but also code. The geek in me has been somewhat obsessed with finding the best tools for these various jobs, and I have spent a fair amount of time testing different solutions. My writing activities fit into three different groups:

  • note-taking. I do this in raw text or Markdown, a lightweight and future-proof solution, for which I'm not dependent on a proprietary file format. That also includes the few blog posts I write every once in a while.
  • scientific writing. I do it mostly in Markdown these days (more on this in another blog post), but also in LaTeX when I collaborate with hardcore physicists or for longer projects.
  • code. I am not an expert, far from it, but I am using coding more and more for my research (mostly image analysis). Most of it is done with Jupyter (formerly the IPython notebook), but a text editor is also a light IDE, convenient for small projects, or when the text editing abilities of Jupyter are limiting for what I do (e.g. snippets, multi-line editing, etc.).

If you are not sure which one to use, here is a (non-exhaustive) list of software I came across and that you can test.

Free

Commercial

  • Sublime Text  Unlimited free trial, $70 for a license.
  • BBEdit  Free trial, $50 for a license.
  • SubEthaEdit  Has been designed specifically for collaborative work. Get it from the App Store ($30), no free trial.
  • Chocolat  Projects are super easy: just drag a folder onto it. Free trial, $50 otherwise. Bonus point for the name.
  • Textastic  Get it from the App Store. Free trial, license for $9.

There are also lots of specialized text editors for plain text, Markdown (Marked, Mou, Byword, Macchiato), or TeX (TexShop, TexPad). But why bother with specialized software when one editor can take care of everything, right?

Finally, a different category is dedicated to authors of long, elaborate documents (Ulysses, Scrivener). Multi-file writing projects (e.g. your PhD manuscript) can be handled by most text editors like Sublime Text, though.

I settled on the following 2 or 3 years ago to fulfill my needs:

  • iA Writer Pro for all the basic raw text writing and note-taking, used in combination with Simplenote/Notational Velocity. I can open each note from Notational Velocity (Cmd+Shift+E). Super convenient. The full screen, text-only mode is perfect to make the most of my 11″ screen. Word counting is helpful, too.
  • Sublime Text for everything else, that is, paper writing, or coding when I am not using Jupyter. With proper LaTeX, Python, and Pandoc installations, I can build anything from ST with a shortcut and get the PDF or anything else out of it. I will describe my new workflow for paper writing in another blog post. I could replace iA Writer with ST, of course, but they fit slightly different purposes. The syntax highlighting of iA Writer, for instance, is very useful for the non-native English writer that I am.
  • For collaborative writing, I use online solutions, depending on who my co-authors are. Overleaf is excellent for LaTeX writing. When my colleagues are less inclined to advanced tools, I stick to Google Docs, which is perfect for grant proposals with little formatting, figures, and just a few references.

Here is a pro tip to conclude: if you are working on a multi-file LaTeX project, add the following line to each of your individual chapter files so that you can build your project (PDF) from any file (assuming main.tex is your main file) by hitting Cmd+B:

%!TEX root = main.tex

The Automated Academic

January 4, 2015 § 5 Comments

I must confess: I am a lazy person. I hate spending unnecessary time on tasks, in particular mundane or recurrent ones. When doing science, even though we are constantly exploring new ideas or novel methods, there is a fairly high number of recurring activities, from literature review to data analysis or writing. Being (barely) part of the computer-native generation, I am very fond of any tool that can help me save time in my academic workflow (and improve the reproducibility of our science). Although I keep an open eye for new options, I have developed a relatively steady set of practices and tools over the years, which help me save a lot of time and concentrate on the tasks I enjoy. So here they are; I hope you will learn a few new ones here.

TL;DR

Essential tools I use: Google account (Google Docs, Google Drive, Google Scholar), Feedly, IFTTT, IPython notebook, Twitter, Mendeley, Pandoc, ORCID.
Tasks I automated: literature survey, citations formatting, reading lists, data analysis, email, writing.

Disclaimer: I am not sure whether to consider myself a geek or not. When it comes to automation, many options require advanced control of the tools we use, i.e., being a power user. Almost all of the solutions I have listed below have a very low entry barrier (except IPython, for which you need to be familiar with… Python!), and can be set up rapidly by anyone.

Literature review

Literature review is an essential activity of academic research. I have already covered the topic here, so here is a quick breakdown of the tools I use.

  • RSS feeds (free): to keep track of all new articles from a given number of journals. After the death of Google Reader, I settled on Feedly (free). Good enough for now.
  • Google Scholar (free) alerts. Google Scholar has become an essential part of my workflow, to keep track of what is going on in the world of peer-reviewed science. The most useful part of it is the e-mail alerts. I set up a couple of search alerts, based on keywords relevant to my research. They arrive almost daily in my inbox. I only wish I could combine several alerts into one, which would help me reduce the number of emails I get.
  • Twitter (free). I am a Twitter addict. And one of the many reasons is that it helps me stumble upon new content in the world of science. Although plenty can be done with the basic Twitter tools (hashtags, lists, etc.), you can build a few more elaborate tools. A while ago, I set up a Twitter bot based on PubMed, which automatically posts tweets with links to new papers on a given topic (a minimal sketch of the idea follows this list). More explanations here. Twitter can also be combined with IFTTT (free) for a number of tasks. If you do not want to get involved with the Twitter API, you can do some basic tracking with IFTTT, such as automatically listing tweets with a given hashtag (15 tweets max per search) and saving the output to a Google Doc file. I just set up a number of tasks based on this; I will let you know how it goes in a while.
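For the curious, here is a minimal sketch of how such a bot can work, assuming new papers come from a PubMed RSS search feed and are posted with the tweepy library; the feed URL and credentials below are placeholders, not my actual setup:

import feedparser  # parses the PubMed RSS feed
import tweepy      # talks to the Twitter API

FEED_URL = "http://pubmed.example/rss/your-saved-search"  # placeholder

# placeholder credentials from a Twitter developer account
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

already_posted = set()  # in practice, persist this between runs

for entry in feedparser.parse(FEED_URL).entries:
    if entry.link not in already_posted:
        # truncate the title so that title + link fit in a tweet
        api.update_status(entry.title[:110] + " " + entry.link)
        already_posted.add(entry.link)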

Reading

Reading is another essential, recurring task of my workflow. I read mostly two kinds of documents: peer-reviewed papers, and articles. All my papers are automatically organized in Mendeley (free), thanks to the watched folder (I download every article into a specific folder, the content of which is automatically added to Mendeley). For all other articles, I tend to send everything to Instapaper (free), which I like a lot (it removes all the clutter). This can be done directly from Feedly. With IFTTT, I can also automatically send links in tweets I have favorited to Instapaper.
To keep track of articles I particularly enjoyed or found relevant, you can automatically export the list of liked or archived articles in Instapaper to Google Docs. Mostly future proof, I guess.
I also set up an IFTTT recipe to send to Instapaper links from tweets with a given hashtag (e.g. #ipython).

E-mail

E-mail is like peer review or democracy: it is the best solution until we find something better, and it's quite clear that it's here to stay. I work hard to stay close to Inbox Zero, which I usually achieve. Rules are nevertheless a very powerful tool to automate the wonderful task of dealing with your email. Kind of obvious, but super efficient.

Backup

Duh! If you do not back up your data, expect a slow, painful death in the near future. You will have deserved it.

Data analysis, file organization

  • Tags vs. folders. Should you organize your files? Even though there are a lot of tagging solutions for files out there (it comes with OS X), I still use folders. You can automate some of your file management with Hazel ($29), for instance. The only automation I use is the watched folder for updating my Mendeley library, as discussed above.
  • Data analysis. There are probably a number of recurring experiments in your workflow, and there is a good chance that you end up with CSV files containing your data. If that's the case, it would be a good idea to get rid of Excel and move to IPython, and in particular the IPython notebook (free). @ajsteven130 turned me into a Python fan, and for me, there is no way back. I just completed a project (i.e., a paper) for which I did the entire analysis in the notebook, and it is just too good. It is also a big win for reproducibility and for sharing what you did (see the sketch after this list). More here.
  • Getting values from plots. I use GraphClick (OS X). This little gem automatically extracts the values from plots when you don't have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn't been updated for years, but does the job perfectly. Ridiculously cheap ($8).
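To give you an idea of the Excel-to-notebook move, here is a minimal sketch of the kind of CSV analysis I mean (the file name and column names are made up for the example):

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("experiments.csv")  # hypothetical raw data file

# summary statistics for every column, instead of scrolling through a spreadsheet
print(data.describe())

# average the replicate runs and plot the trend
summary = data.groupby("temperature")["yield_stress"].mean()
summary.plot(marker="o")
plt.xlabel("Temperature (°C)")
plt.ylabel("Mean yield stress (MPa)")
plt.savefig("yield_stress_vs_temperature.png", dpi=300)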

Writing

Whether it’s papers, grant applications, or reports, we spend a fair amount of time writing. Even though papers do not write themselves, there are a number of things that can be automated to help you concentrate on the content.

  • Scheduling time for writing. This is not really automation, but I settled on this routine a while ago (2 years ago, maybe?). Whenever I have something to write, which is pretty much all the time, I block a dedicated amount of time in my day to write. No matter what. I am a morning person when it comes to writing, so I write 1 hour (or more) every day, first thing when I get to the lab, and it makes a huge difference at the end of the week. Given the amount of writing I will have to deal with this year, I certainly plan to keep this approach. If 1 hour per day scares you, try 20 minutes: at the end of the week you will still end up with about 2 hours! Big win.
  • Incorporating references into your writing, and formatting them. If you're not using a reference manager, you're doing it wrong. Period. There are plenty of options out there, so you don't have any decent excuse. I settled on Mendeley many years ago, and am not planning to change since they gave me a shirt (private joke here). Bonus point for syncing my library (including PDFs) between the various computers I use.
  • If some part of your writing involves repetitive expressions, it might be a good idea to use a text replacement tool such as TextExpander ($35) and the like. I don't. Yet.
  • Conversion. I am still chasing the "one file to rule them all" dream: one master file for all kinds of outputs, from PDF to HTML, XML, and so on. I became a big fan of Markdown (a very simple markup language) for first drafts, and am seriously considering it as my master format, relying on Pandoc (free) for all the conversions.
  • Solutions for collaborative writing. As soon as you are not alone on a writing project, you have several options to collaborate. And no, emailing the files back and forth to your colleagues is not a suitable option. Depending on your colleagues and your geekiness level, you have many options, including Google Docs (excellent for comments and review mode), GitHub in combination with Markdown, Overleaf (free) (for LaTeX fans), etc. Bonus point for Mendeley, which automatically populates your library with the references cited in a file you received that are not in your library yet. Very useful.

Updating your CV

Most academics love (and are often asked) to have an up-to-date list of their achievements. You have many options here. The solution requiring the lowest amount of effort is to sign up for a Google Scholar account. It seems to have become one of the standards today, along with ORCID (free). Bonus point for keeping track of citations to your work if you are addicted to metrics. An alternative solution if you need a PDF with a list of your papers: keep track of your papers in Mendeley, get a bib file with all your papers from it, and use this with your favorite LaTeX template.

Other

  • Password management. Automation AND safety. I use 1Password ($50) and it does the job perfectly.
  • Keeping track of the papers you've reviewed. I just came across IFTTT (you have probably guessed that by now), and made a recipe involving Gmail and Google Drive. All incoming emails in my inbox with "review" in their title are automatically listed in a Google Doc. Tons of variations are possible based on this workflow. Get creative.

Anything you use that I missed ? Let us know in the comments.

My Must-Have Apps for Science, 2013 Edition

December 22, 2013 § 3 Comments

2013 is the year when I finally bought a MacBook Air as my main machine. I have thus been able to shift to an Apple-only software environment, although I still have a ThinkPad in the lab (yeah, don't ask).

Like any academic, I have four main activities: reading, writing, compiling data, and preparing figures.

My main requirement is that I need to keep my machines in sync. This includes 2 laptops (Apple and PC) and 1 desktop (Apple). The tricky part is that I cannot go online in the lab with the MBA, for corporate reasons (this is a Windows environment). I thus rely on a few pieces of software able to sync from behind the firewall (and no, I cannot use Dropbox on my PC at work), and exchange a few files, when needed, over Bluetooth.

Here are my most-used apps for 2013, in no particular order.

Main

  • 1Password. Takes care of all my password needs and more. I only have 50-character passwords now.
  • Keynote. I gave up on PowerPoint when moving to the MBA, and have enjoyed using Keynote so far. My needs are very basic, as most of my slides usually have just a title and one figure.
  • Alfred. I use it mostly as a launcher for apps. I use Spotlight a lot, but Alfred appears in the center of my screen (yes, it's silly, I know) and the text is larger.

Writing and code

  • iA Writer. For all my drafts. A perfect distraction-free environment. Love the font.
  • Sublime Text. For coding. This is an outstanding piece of software. I don't even understand how people can code without it. Also great for LaTeX writing, once you've set up a few snippets.
  • TexPad. I wrote a few long and structured documents this year (such as my habilitation), which was a good excuse to get back to LaTeX. TexPad is a real pleasure to use. The interface is uncluttered and does the job perfectly. Mendeley automatically generates a .bib file of my library, which is super convenient.
  • MS Word, alas. I only use it to prepare the final version of manuscripts and exchange them with co-authors. One thing I like, though, is the revision mode.
  • F.lux. This little gem automatically adjusts the color of the screen: warm at night and bright during the day. I cannot use a computer where it's not installed. This is the first piece of software I install on a new machine.

Image Editing

  • Adobe Illustrator. For all my figure needs when preparing manuscripts. Been using it for years. Keeps getting better. Just love it.
  • Picasa. To keep track of all the images on my PC and some shared resources on the internal network, without actually organizing them. A time saver.
  • Fiji and ImageJ. Fulfill all my image analysis needs. Even better since you can use Python with them.

References and Science stuff

  • Mendeley. My reference manager of choice. I use it constantly. It tends to be a bit slow when running searches (>2k papers). Just perfect for preparing manuscripts. I know, I know, Elsevier owns it now, but it's just too useful for me. The competition is getting fierce, though, which is good, with the release of Papers 3 and ReadCube. The automatic bib file saving is a must-have for me.
  • Simplenote and Notational Velocity. I throw everything in here: notes, to-do lists, recipes, drafts. The killer feature is the shortcut (Shift+Cmd+E) to open a file in an external editor (iA Writer for me). I use Markdown for the drafts. I'm very keen on keeping everything in text files, to ensure long-term readability.
  • GraphClick. This little gem automatically extracts the values from plots when you don't have access to the raw data. Super useful when compiling data from the literature, for instance. It hasn't been updated for years, but does the job perfectly. Ridiculously cheap.
  • Gephi. I had fun with networks recently (more info coming soon, hopefully). Beware, this is a mesmerizing piece of software. Be ready to waste a lot of time.
  • MediaWiki. We finally set up a wiki for the lab last year, using MediaWiki. Does the job perfectly.

Online tools

  • Doodle. To find a date for meetings. Does the job simply and perfectly.
  • Instapaper. For my casual (i.e., not papers) reading needs. I sometimes send full-text papers to it, and reading them there is actually a pleasure. I use the snippet to save articles during the day, and read everything on my iPad.
  • Twitter. I have been using it more and more, but this might be a story for another post. I tweet at @devillesylvain.

Next for 2014 ?

Who knows what 2014 will be made of? Pretty much all my needs are fulfilled now, so I am not really looking for anything special. A few apps are on my radar nevertheless, and could be possible new additions to my workflow.

  • Scrivener. For complex documents such as review papers. I downloaded the trial version and started playing with it. Seems to be very powerful. Make sure to check out the tutorials.
  • Mindnode. For mind mapping. I’m a visual type of person.
  • WriterPro. The new version of iA Writer. I don't care about the workflow thing, but the syntax highlighting could be a game changer for my academic writing, as I am not a native English speaker and am still working hard on improving my writing.

Got any advice? Let me know in the comments if I missed anything.

To #mendelete or not to #mendelete ?

April 10, 2013 § 5 Comments

My Twitter feed has been on fire since the announcement that Elsevier has bought Mendeley, after a few months of rampant rumors. "Elsevier is evil! They will shut down Mendeley! Mendeley lost its soul! We should in no way contribute to Elsevier's business and profits." These are a few of the reactions that quickly followed the announcement. What should I do? Should I care?

Elsevier has an awful track record: from fake journals to insane profits on journal bundles, to name a few examples. Everybody agrees on that, and for sure they have realized it and are trying to make up for it, somehow. Now that they own Mendeley, they are going to do all sorts of crazy things. Maybe, maybe not; time will tell. Mr. Gunn seems confident at this point. Others much less, to say the least.

I have a different take on the current events. I am usually a very pragmatic guy. I used to use EndNote, like everybody else a few years ago when there were no alternatives. Their habit was to update the software every year, although I never found any significant improvement in the updates. I remember that sometimes an update was WORSE than the previous version, breaking my library. And I had to pay $100, give or take, to update. Every year. I quickly gave up on the updates. No PDF organization, no way to perform full-text search, no sync. Quite rough.

Then Papers came out. And it was awesome. Finally a decent PDF organizer, one that quickly improved. Not having the choice of my OS (Windows), I had to give up on Papers when I came back from the US. Too bad. A Windows version has been developed since, but I had already given up. It has been bought by Springer since then, and I'm not sure Springer is any better than Elsevier.

And then I came across Mendeley. It more or less provides everything I need: easy import (I love the DOI look-up), easy organization, full-text search, cross-platform sync. I have paid for a data plan for a while to have all my files synced between my laptop and desktop computers (Dropbox is not allowed where I work). Works flawlessly. Excellent for inserting the bibliography in papers I write. Automatic BibTeX file creation when I need to use LaTeX. If only they could provide the abbreviated journal names, it would be perfect. I now throw in it every interesting paper I come across, whether it is directly related to my interests or not. It is thus becoming my personal, curated papers database. The value I get from this software has very quickly become huge.

And now it belongs to Elsevier. Well, I try not to submit papers to Elsevier journals anymore (although Acta Materialia is a solid journal in my field), and I avoid reviewing for them. I use Scopus less and less since Google Scholar has become extensive. I get little or no value from Elsevier's products. But Mendeley is different. As I said, I get a lot of value from it right now, and I don't mind paying $5 a month for my data plan; it's worth it. My files are synced across all my computers. If the situation turns ugly, I lose nothing but the time spent migrating to another platform. So for now, I'll stick with Mendeley, and see what happens.

Google killing Reader (I will survive)

March 14, 2013 § Leave a comment

Based on my Twitter feed, there were two main news items yesterday: the election of an old dude in Rome, and the not very classy decision of Google to kill Reader in a few months. As you can guess, I am much more concerned by the second one, for my daily work routine. I have expressed my love for RSS previously. As of today, my strategy hasn't changed. RSS is still the best way, by far, to keep track of new articles.

Many people today are claiming that RSS is dead, and that Twitter will do the job instead. Not at all, as far as I am concerned. I use them very differently. I use Twitter to discover recommendations and keep track of the scientific buzz around. The constant flow of tweets nevertheless guarantees that I will miss some stuff. It's OK. It's in the very nature of Twitter. When it comes to tracking new articles in journals, Twitter just doesn't do the job. I use (mostly) Google Scholar to search for articles on a topic in which I have some interest. Something specific. But it's definitely not a tool for the systematic tracking of new papers. My RSS feed currently comprises around 50 journals, 30 blogs, and roughly 40 RSS feeds of Scopus search results or equivalent. Since October 2008, I have read over 300k items in Reader. The counter has actually been stuck at 300k for over a year. My current feed provides about 3k items per month (I used to have much more). I spend about 10-15 minutes per day keeping track of new articles, and usually discover 2 or 3 new papers of interest to me, not directly related to my specific niche (freezing!). If I had to visit every single journal website to get the same information… well, there's just no way. RSS is still the best choice. No question.

My second constraint is that during my day, I use 2 different computers, a phone, and an iPad to check my RSS feeds, depending on where I am and what I am doing. Reader provided a flawless solution for the sync. There will be another one soon; that's OK.

The only question left now is: how long will Google Scholar survive? Reader was much more useful to me, and I guess I'm not the only one like this in the academic world. There are now ads in Scholar. I don't see why they should even bother to keep working on it, unless they have some long-term plans for it that go beyond the simple search engine it is today. By which I mean an iTunes Store-like system for academic papers, for instance.

Will I survive? Of course, because I don't have the choice. I will export my RSS feeds to another service and keep using them. I will miss the convenience of Google Reader until a better solution comes up. Goodbye, you've served me well.
