Yihui Xie

The Spreadsheet Lady

Yihui Xie / 2017-12-08


rOpenSci published an interview with Jenny Bryan today, and I enjoyed reading it. It is very much worth reading as a profile of an R developer, because it shows a few aspects and roles of a software developer that are often less known to the general public: a software developer as a (former) teacher, a researcher, a collaborator, and a mother.

I found that Jenny and I have several things in common (yes, you may have discovered that I rarely talk to my colleagues, so I didn’t realize these things until today). For example, both of us thought we’d make more substantial contributions on the software tools side, although we were statisticians by training. I also enjoy teaching, but I feel teaching is less efficient than writing good software tools: teaching 50 students a semester is too slow. A MOOC sounds like a better way to go, but those Simply Statistics guys have already done an excellent job of promoting some of the software tools I created through Coursera, so I don’t really need to worry much about it.

In terms of collaboration, she told us an interesting fact:

[…] I would work with people in genomics and if I’m completely honest with myself, often my biggest contribution to the paper would be getting all the datasets and analyses organized. […] And I was like, I have a PhD in stats, why is this my main contribution?

That also reveals one of the reasons why I don’t want to work in academia: I simply wouldn’t be able to survive, because I cannot make contributions on statistical methodology or theory. I’m not good at or interested in making these kinds of contributions. Jenny is a “Spreadsheet Lady”, and I’m probably a “Markdown Man”. RStudio is a perfect place for us. I don’t need to write grant proposals. If I’m interested in working on anything, it often only needs a 5-min video chat or an email before I’m able to actually work on it. The total time I spend every week on meetings is about 15 minutes (and sometimes the meeting could be cancelled).

On work-life balance, we are also similar. I often start working after 9pm and go to bed at 1am. That is the best work time for software engineers with kids. RStudio does not care “when” I work. It is always about the quality of the work instead of how many hours you spend working (or pretending to work). Occasionally I just don’t work at all for a whole day if I feel burned out, and oftentimes I still work on weekends (even late into the night), although in the latter case, it is usually because the project is too interesting and I cannot stop.

It turned out that I was also an Emacs/ESS zealot like Jenny before I discovered RStudio. I tried to teach Emacs to other people and often saw puzzled faces. The same thing happened when I tried to promote Sweave or LaTeX.

There is one difference between us, though: I’m still blindly trying to reach inbox zero. Normally I have 30 emails in my inbox: five from 2011, one from 2013, three from 2016, and the rest from this year. Since 2005, I have sent 17,322 email threads (each thread may contain multiple replies) and received far more than that (nearly 100,000). I learned a few months ago from a book that Jöns Jacob Berzelius (1779–1848), a super hard-working chemist, received 7150 letters and sent 3250 in his life. Then I started to doubt whether inbox zero was still meaningful. My contribution to this world would be nowhere near Berzelius’s, and Berzelius didn’t have to process emails.

P.S. Finally, rOpenSci fixed the links to its blog posts! There is no longer a duplicated /blog/blog/ in the URLs. Good job! But I’d use .Rprofile instead of .rprofile.