I may want to add a subtitle “Why R-Forge Must Die” (thinking of Barry Rowlingson’s talk earlier this year). I have been a GitHub user for two years, and I was mainly influenced by Hadley. Now I even feel a little bit addicted to GitHub (its slogan is “social coding”), because it is really convenient for collaboration and makes me more productive.
As some readers may know, I started a new package, knitr, last month, and this time I decided to try the power of social networks like Google+ and Twitter (something I used to stay away from); so far I’m pretty satisfied with my attempts. I call GitHub the “Facebook” of programmers, and it is also very powerful for connecting programmers and users. There are a few features that I think R programmers may want to try (I use knitr as an example):
- Browse R code online: I hate checking out a whole package to read its source code, and R-Forge is clumsy (following the steps of sourceforge) in this aspect; the code on GitHub is highlighted and easy to thumb through. Besides, you can browse each commit and see what was changed (the difference is highlighted).
- Issues: instead of writing your own TODO list which is often forgotten, both your users and you can create issues, and you can have discussions there; when you have got a fix, you can write a commit message like “fixed #46” in Git and issue 46 will be closed automatically. This is a super cool little feature to me. What is more, there will be a reference to the commit which fixed the issue, so you can come back in the future and see how the issue was fixed. Currently knitr has got 50 issues in total, with 24 of them from users.
- Inline comments: you can discuss code directly along the lines; this is a super super cool feature. Ramnath started to contribute to the code theme feature recently after he forked my repository, and the original author (i.e. me) can go to the fork and check the changes; each change can be commented on, e.g. we had quite a few discussions in this commit. This feels like sitting together with another programmer and pointing at the code with a pen, saying “I like this and you may need to revise that, …”. In comparison, the traditional way of collaboration is usually through email: patches are sent back and forth, which is way less straightforward. When Ramnath and I feel the work is mature, he can simply send me a pull request, and all the changes can be merged back into my repository. Another example: I saw the blog post by Songpants the other day, and I suggested he move the work to GitHub so I could make suggestions closer to the R code, and now the code is happily sitting on GitHub (so are my comments).
- Wiki: it makes it so easy to quickly set up a documentation page; I have not done it for knitr yet, but I did it for the formatR package. It looks better than R’s documentation, right? Again, other people can collaborate with you in editing the wiki pages. The other way to make your documentation better is to write vignettes in Sweave, which usually takes a lot of effort (wrestling with LaTeX) unless you use LyX+knitr like me; I feel vignettes are easy to make, but this is another story.
- Stay tuned with a package: you can watch a repository (use the button in the top-right) so that you can read the updates of a package in the dashboard; alternatively, you can follow a GitHub user like you follow people on Twitter.
- GitHub pages: this is probably the coolest feature; you can use another branch (called gh-pages) in your Git repository to build your website based on Jekyll. I made the website for knitr in this way, because I really want to make knitr a beautiful package, so everything has to be beautiful, and R documentation was ruled out. Of course, Hadley is a pioneer in documenting R packages with websites. In the future, I may want to develop a package based on knitr which turns R documentation into a website automatically (with examples parsed and evaluated, plots inserted), so you can host it on GitHub or somewhere else. This is only an idea at the moment, and feel free to contact me if you are interested.
I cannot say I’m already an efficient R programmer, but GitHub did make me much more efficient.
The world has changed. You can feel it on GitHub. You can smell it on Google+. The knitr package, as an alternative tool to Sweave, has features that you have been longing for, and features that you might have never imagined. Thumb through the PDF manual to see some of them.
Currently this package is still a beta version, so I’m looking for feedback from early birds on:
- is the PDF documentation confusing in any place? e.g. you have no idea how to install the package because it was not mentioned in the manual;
- does the website look ugly in your browser? (I know it does with IE under Windows) I used a font from Google Font API, and it does not seem to be consistent across different web browsers/OS’es;
- what kind of difficulties did you have in switching from Sweave/pgfSweave/whatever-Sweave to knitr?
- do you like the idea of putting R code/output in a shaded frame in LaTeX? is the default shading (rgb(.97, .97, .97)) too dark or too light? how about the highlighting theme?
- have you ever tried to hack at Sweave? I’d love to listen to your stories;
- what else do you expect from knitr?
Feel free to file a bug report in the Issues page if you find any problems or have any suggestions. I appreciate your efforts in making this knitr package even neater!
A couple of days ago we released a package named fun to CRAN, but I did not dare to send an announcement to r-packages@r-project.org as usual. This package is a collection of some classical computer games (e.g. the Mine sweeper and Five in a row) as well as other funny stuff. Some examples:
## install.packages('fun')
library(fun)
## the game needs an interactive screen device; use the Xlib type on non-Windows systems
if (.Platform$OS.type == "windows") x11() else x11(type = "Xlib")
mine_sweeper()

Mine Sweeper in R
library(fun)
gomoku()

Five in a row in R
You can take a look at the list of functions in this package by reading the HTML help page (go to help.start()), and I also need to mention the demos, e.g. see demo('TurtleGraphics') for a demo of Turtle graphics (how many people know the old Logo programming language?), and demo(package = 'fun') for a list of all demos in this package.
demo('RealTurtle', package = 'fun')

A turtle drawn in R
Although these topics are not new, they can still be good programming exercises.
We started writing this package more than two years ago, but it was almost forgotten later until a few days ago someone mentioned the game “Five in a row” in our web forum. This forum is almost the Chinese version of R-help, and it is not unusual for people to bring forward all kinds of funny ideas with R. If you are at useR! 2011 right now, you probably have heard from George Zhang about the Chinese R conferences these years, and this forum has been the sponsor and organizer ever since the first conference (which I initiated). However, please do not get a wrong impression that Chinese useRs are doing mine sweepers with R every day.
Feel free to share with us if you have more fun. The developers’ page is at: https://github.com/yihui/fun
P. S. This package may remind some people of the sudoku package (e.g. Joshua Wiley has noticed it), and some people may even remember this:
library(fortunes)
fortune('sudoku')
About half a year ago, I wrote a post on the configuration of (pgf)Sweave and LyX, which was intended to save us some effort in going through all the details of the configuration. Now many things have changed: LyX 2.0 has internal support for Sweave, and fortunately I have been in touch with the developers on this feature (thanks to Gregor); meanwhile, there have also been many changes in the pgfSweave package. In all, we have a number of new features which we should definitely make use of.
New Features
A list of new features as far as I can remember:
- support for Sweave in LyX 2.0 is internal, so there is no need to modify the preferences file manually (the converters have been defined internally)
- most importantly, Sweave becomes an independent module now in LyX, which means you can use it with arbitrary layouts!
- we can see the messages during compilation in LyX 2.0 (View–>View Messages), which is really really helpful and I would strongly recommend you to turn on this option when compiling Sweave documents, because you will know which code chunk goes wrong in case of any errors (in the past, you only got an annoying error dialog box which told you almost nothing about the error)
- pgfSweave is faster: it uses the GNU make utility to compile graphics, and you can use multi cores if you like; the compilation becomes 3 steps (pdflatex, make, then pdflatex); other nice features include: the R code is put in an environment Hinput now so you can customize it in LaTeX preamble; there will be no longer a huge gap between the R code and the output (fixed by Liang Qi)…
- tikzDevice has better support for multi-byte characters (using UTF8)
I have been working on improving the Sweave module and adding a new pgfSweave module to LyX, and now I have basically finished what I planned to do. See the ticket #7555 for details. To sum up,
- LaTeX will not complain about not being able to find Sweave.sty; I used several tricks to guarantee this — even in the worst case, LaTeX can still use the hard-coded Sweave style;
- Spaces and dots in path names or filenames will no longer be a problem;
- the pgfSweave module is also working now;
- you can export the reformatted R code in a LyX document with the pgfSweave module;
I have also tried to document all the cool bells and whistles in two examples, sweave.lyx and pgfsweave.lyx. Or you can directly read the PDF documents, sweave.pdf.tar.xz and pgfsweave.pdf.
Try the (pgf)Sweave Module
It seems several people are interested in testing the two modules, and it is actually very easy under Linux. So far I have had no luck with Windows to build LyX from source (I tried once; it took me days to compile and ended up with errors).
- check out the source code of LyX:
svn co svn://svn.lyx.org/lyx/lyx-devel/trunk lyx-devel
- (cd lyx-devel) apply my patch sweave-patch.diff to the svn source you checked out just now:
patch -p0 -i /path/to/sweave-patch.diff
- build LyX
./autogen.sh
./configure
make
sudo make install
Done.
Currently the patch is still waiting on LyX Trac. If you run into any problems before the developers begin to look at the patch, please let me know and we will try to make the two modules more stable and useful. But first of all, please remember to keep all your software packages up-to-date: R 2.13.0, pgfSweave 1.2.1 (run update.packages() in R as frequently as you can) and LaTeX package pgf 2.10 (very important for externalization).
I remember that a few weeks ago there was a challenge on the R-help list to draw the prime symbol in R graphics. In LaTeX, we simply write $X'$ or $X^\prime$. R has rough support for math expressions (see demo(plotmath)), which is certainly unsatisfactory for LaTeX users. In fact we can write native LaTeX code in R plots via the tikzDevice package! Why bother to use all kinds of tricks to cheat R?
Here is an example per request of a reader of my blog:
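The idea can be sketched roughly as follows, assuming the tikzDevice package and a working LaTeX installation are available. The file name prime.tex and the helper draw_prime_plot() are made up for illustration and are not taken from the original example. By default tikzDevice hands the text to LaTeX unchanged, so the labels can be written as ordinary LaTeX math:

```r
# Axis labels written as plain LaTeX math; backslashes are doubled in R strings
prime_labels <- c(x = "$X'$", y = "$X^\\prime$")

# Hypothetical helper: draw a scatter plot with LaTeX labels into a TikZ file
draw_prime_plot <- function(file = "prime.tex") {
  if (!requireNamespace("tikzDevice", quietly = TRUE)) {
    stop("the tikzDevice package is required for this example")
  }
  tikzDevice::tikz(file, width = 4, height = 4, standAlone = TRUE)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(rnorm(50), rnorm(50),
       xlab = prime_labels[["x"]], ylab = prime_labels[["y"]])
  invisible(file)
}

# draw_prime_plot()  # then compile prime.tex with pdflatex
```

With standAlone = TRUE the output is a complete LaTeX document; without it, the file is meant to be included in another document via \input{}.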
For those who have been struggling with the installation of GGobi and the rggobi package under Windows: a major change in GGobi 2.1.9 is that GTK+ is now bundled with GGobi, so the installation of GTK+ is no longer required (I recommend uninstalling it if it is not used elsewhere in your system); besides, the rggobi package, which interfaces R to GGobi, is now built against GGobi 2.1.9 on CRAN too. You might know that the Windows binary of rggobi was not available on CRAN in the past (Prof Ripley kindly provided the binary), but now things have changed. Hopefully this can make our life with GGobi easier.
You may use install.packages('rggobi') to install the new version of rggobi from CRAN.
Also note that if you are a user of the RGtk2 package, you don’t need a standalone installation of GTK+ either if you have already installed GGobi 2.1.9, because the path of GGobi will be written in the PATH variable of your system and RGtk2 can load the required dll’s from GGobi’s directory.
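If you want to check whether GGobi’s directory really made it into PATH, something like the sketch below may help. The helper names split_path() and find_in_path() are made up here, and the assumption that the directory name contains “ggobi” depends on where you installed it:

```r
# Split a PATH-style string into its individual directories
split_path <- function(path = Sys.getenv("PATH"), sep = .Platform$path.sep) {
  dirs <- strsplit(path, sep, fixed = TRUE)[[1]]
  dirs[nzchar(dirs)]
}

# Return the PATH entries whose name mentions a keyword (case-insensitive)
find_in_path <- function(keyword, path = Sys.getenv("PATH"),
                         sep = .Platform$path.sep) {
  dirs <- split_path(path, sep)
  dirs[grepl(keyword, dirs, ignore.case = TRUE)]
}

# Which PATH entries of the current session mention GGobi?
find_in_path("ggobi")
```

An empty result suggests that PATH was not updated for the current R session; restarting R after installing GGobi is worth a try.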
A new version of the formatR package is available on CRAN now (binary packages are still on the way). There are three major updates:
- the inline comments will also be preserved in most cases (in earlier versions, only single lines of comments were preserved)
- tidy.source() gained a new argument 'text' to accept a character vector as the source code
- multi-byte characters are supported in the formatR() GUI now (sorry, this is not completely true in 0.1-6; it has been fixed in 0.1-7)
The first feature is a request from Cameron, which is actually from another request of another user. I also feel this is a necessary feature even from the first version of this package, but dealing with inline comments is not as easy as the single lines of comments, and it can be dangerous. Please read the help page of the function tidy.source() for all the dark and dirty tricks for preserving R comments when formatting the R code. Here is a quick example:
> library(formatR)
> src = c("# a single line of comments is preserved",
+ "1+1", "if(TRUE){",
+ paste("x=1 ", "# comments begin with at least 2 spaces!"),
+ "}else{", "x=2;print('Oh no... ask the right bracket to go away!')}",
+ "1*3 # this comment will be dropped!")
>
> ## source code
> cat(src, sep = "\n")
# a single line of comments is preserved
1+1
if(TRUE){
x=1  # comments begin with at least 2 spaces!
}else{
x=2;print('Oh no... ask the right bracket to go away!')}
1*3 # this comment will be dropped!
We can reformat the code as:
> ## the formatted version
> tidy.source(text = src)
# a single line of comments is preserved
1 + 1
if (TRUE) {
    x = 1 # comments begin with at least 2 spaces!
} else {
    x = 2
    print("Oh no... ask the right bracket to go away!")
}
1 * 3
R’s default theme of the HTML help pages is too plain for me to read, but we can easily modify the theme, which is essentially a CSS file. You can find the file under:
file.path(R.home('doc'), 'html', 'R.css')
Simply replace this file with my version:
Download R.css (1K), which looks like:
Of course you can design your own R.css if you know CSS.
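The replacement can also be done from within R. Below is a minimal sketch; the function name replace_r_css() is made up for illustration, and you may need administrator privileges to write to R’s installation directory. It keeps a backup of the original file so that you can go back to the default theme:

```r
# Replace R's help stylesheet with a custom one, keeping a backup of the original.
# `new_css` is the path to the stylesheet you downloaded or wrote yourself;
# `target` defaults to the location mentioned above.
replace_r_css <- function(new_css,
                          target = file.path(R.home("doc"), "html", "R.css")) {
  stopifnot(file.exists(new_css), file.exists(target))
  backup <- paste0(target, ".bak")
  # never overwrite an existing backup, so the original theme is not lost
  if (!file.exists(backup)) file.copy(target, backup)
  file.copy(new_css, target, overwrite = TRUE)
}

# replace_r_css("/path/to/downloaded/R.css")
```

The function returns TRUE if the copy succeeded; to restore the default theme, copy R.css.bak back over R.css.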
Last year I posted an animation created in R to celebrate the new year, and this year I’ve got a more fabulous animation. Unfortunately our Lord of CRAN (Kurt) has been out of office for several days, so I’m unable to publish my animation package on CRAN as scheduled. Anyway, for those who are curious about the new version of the animation package, you can download version 2.0-0 from my GitHub development page.
The above animation comes from the demo('fireworks') in the R package animation 2.0-0. Thanks for the contribution of Weicheng Zhu.
Another demo I did not mention on this Christmas was demo('Xmas2'):
Thanks for the contribution of Jing Jiao.
There are a whole bunch of new features in the animation package v2.0-0 (this version is a milestone), but I will keep silent for the time being, since it is not published on CRAN yet.
It is well-known that R has several graphics devices — either the screen devices (X11(), windows(), …) or the off-screen devices (pdf(), png(), …). We can query the default graphics device in options():
getOption('device')
In a non-interactive session, the default device is pdf(). This is why Sweave has to create a file named Rplots.pdf no matter if you want it or not when you run Sweave on an Rnw file which has code chunks creating plots. Such a behaviour is annoying to me — the PDF file is not only unnecessary, but also time-consuming (creating this PDF file is completely a waste of time). Is there a way to set a “null” device? (like the /dev/null for *nix users) The answer is yes, but not so obvious. I have not found the device below documented anywhere:
options(device = function(...) {
.Call("R_GD_nullDevice", PACKAGE = "grDevices")
})
This device can speed up Sweave a lot when there are many plots to draw. Here is a comparison:
x = rnorm(1000)
system.time({
.Call("R_GD_nullDevice", PACKAGE = "grDevices")
replicate(500, plot(x, pch = 1:21))
dev.off()
})
# user system elapsed
# 1.51 0.02 1.53
system.time({
pdf(file.path(tempdir(), "Rplots.pdf"))
replicate(500, plot(x, pch = 1:21))
dev.off()
})
# user system elapsed
# 47.81 0.20 48.10
One thing I don’t understand in Sweave is that it evaluates the code chunk twice if its Sweave options contain fig=TRUE. I think this might be a waste of time as well, and this is why I like pgfSweave, which has both the mechanism of caching R objects (using cacheSweave) and a smart way to cache graphics (using pgf).
WARNING: this null device may not work with plots that contain (math) expressions! (take a look at demo(plotmath) in case you do not know what expressions are in R graphics)
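For comparison, the documented pdf() device can play a similar role: according to its help page, no external file is created when file = NULL. The sketch below only illustrates that alternative; the helper name with_null_device() is made up, and it has not been timed against the undocumented device above:

```r
# Hypothetical helper: evaluate plotting code on a PDF device that writes no file
with_null_device <- function(expr) {
  pdf(file = NULL)    # documented: no external file is created when file is NULL
  on.exit(dev.off())  # always close the device, even if the plotting code fails
  force(expr)         # the plotting code is evaluated here, on the device just opened
  invisible(NULL)
}

with_null_device(plot(rnorm(100)))

# To use it as the default device in a non-interactive session:
# options(device = function(...) pdf(file = NULL))
```

Because the argument is evaluated lazily, the plot is drawn only after the device has been opened, so no Rplots.pdf should appear in the working directory.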


