It is not uncommon to see messy R code that is barely human-readable, like this:
# rotation of the word "Animation"
# in a loop; change the angle and color
# step by step
for (i in 1:360) {
# redraw the plot again and again
plot(1,ann=FALSE,type="n",axes=FALSE)
# rotate; use rainbow() colors
text(1,1,"Animation",srt=i,col=rainbow(360)[i],cex=7*i/360)
# pause for a while
Sys.sleep(0.01)}
Reading unformatted R code is clearly a pain, but on the other hand, it is natural for us to be lazy. I don’t care about adding spaces or indentation to my raw R code; I’ll concentrate on programming first and format my code later. The R package ‘formatR’ is intended to help us format our messy R code.
# formatR optionally depends on gWidgetsRGtk2
# please use the latest version of R (>=2.12.0)
install.packages('formatR')
library(formatR)
formatR()
## you will get an error if the package gWidgetsRGtk2 is not installed;
## then you need to install it
install.packages('gWidgetsRGtk2')
formatR('RGtk2')
Then you can either paste your code into the text box or click the “Open” button to open an existing R code file. Click the “Convert” button and you are done!
There are several options in the “Preferences” panel, e.g. you can specify whether to keep comments or blank lines, or specify the width of the formatted R code.
No matter how messy your code looks, formatR can make it tidy and structured as long as there are no syntax errors in it. If you prefer the command line interface, you may want to take a look at the function tidy.source() in this package.
Note that multi-byte characters (say, Chinese) are also supported in the GUI.
Here are some (trivial) R tips for the course Stat 511. I’ll keep updating this post until the semester is over.
Formatting R Code
I submitted an R package named formatR to CRAN yesterday. This package should be easier to use than the code below, because it has a GUI to tidy your R code. Install it with install.packages('formatR').
Reading code is a pain, but well-formatted code might alleviate the pain a little. The function tidy.source() in the animation package can help us format our R code automatically. By default it reads the code in the clipboard, parses it and returns the well-formatted code. You have options to keep or remove comments and blank lines, set the width of the code, and so on. Spaces and indentation are added automatically, which saves us the time of typing spaces and paying attention to indentation.
## install.packages('animation') if it is not installed yet
library(animation)
## copy some R code somewhere and type:
tidy.source()
## or specify the path of your code file
tidy.source(file.path(system.file(package = "graphics"), "demo", "image.R"))
## can also use a URL
tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R')
## remove blank lines
tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R',
keep.blank.line = FALSE)
## remove comments
tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R',
keep.comment = FALSE)
After a few hours’ work, I modified the function tidy.source() in the animation package so that it can preserve complete comment lines. See the tidy.source() wiki page for an example.
tidy.source <- function(source = "clipboard", keep.comment = TRUE,
    keep.blank.line = FALSE, begin.comment, end.comment, ...) {
    # parse and deparse the code
    tidy.block = function(block.text) {
        exprs = parse(text = block.text)
        n = length(exprs)
        res = character(n)
        # seq_len() skips the loop when there are no expressions
        for (i in seq_len(n)) {
            dep = paste(deparse(exprs[i]), collapse = "\n")
            # strip the leading 'expression(' and the trailing ')'
            res[i] = substring(dep, 12, nchar(dep) - 1)
        }
        return(res)
    }
    text.lines = readLines(source, warn = FALSE)
    if (keep.comment) {
        # identifier for comments
        identifier = function() paste(sample(LETTERS), collapse = "")
        if (missing(begin.comment))
            begin.comment = identifier()
        if (missing(end.comment))
            end.comment = identifier()
        # remove leading and trailing white spaces
        text.lines = gsub("^[[:space:]]+|[[:space:]]+$", "",
            text.lines)
        # make sure the identifiers are not in the code
        # or the original code might be modified
        while (length(grep(sprintf("%s|%s", begin.comment, end.comment),
            text.lines))) {
            begin.comment = identifier()
            end.comment = identifier()
        }
        head.comment = substring(text.lines, 1, 1) == "#"
        # add identifiers to comment lines to cheat R parser
        if (any(head.comment)) {
            text.lines[head.comment] = gsub("\"", "\'", text.lines[head.comment])
            text.lines[head.comment] = sprintf("%s=\"%s%s\"",
                begin.comment, text.lines[head.comment], end.comment)
        }
        # keep blank lines?
        blank.line = text.lines == ""
        # && because both sides are single logical values
        if (any(blank.line) && keep.blank.line)
            text.lines[blank.line] = sprintf("%s=\"%s\"", begin.comment,
                end.comment)
        text.tidy = tidy.block(text.lines)
        # remove the identifiers
        text.tidy = gsub(sprintf("%s = \"|%s\"", begin.comment,
            end.comment), "", text.tidy)
    }
    else {
        text.tidy = tidy.block(text.lines)
    }
    cat(paste(text.tidy, collapse = "\n"), "\n", ...)
    invisible(text.tidy)
}
Note that inline comments will still be removed. I don’t want to spend any more time dealing with inline comments.
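The comment-preserving trick is easier to see on a tiny input. The following is only a minimal sketch of the idea, not the function itself: the input lines and the fixed `marker` are made up for illustration, whereas the real function generates random identifiers and uses separate begin and end markers.

```r
# Sketch of the comment-masking trick; `marker` is a hypothetical fixed
# identifier (the full function generates random ones instead).
code <- c("# add one to x", "y=x+1")
marker <- "COMMENTMARKER"

# Turn each full-line comment into an assignment of a string, so that
# parse() keeps it instead of discarding it.
is.comment <- substring(code, 1, 1) == "#"
code[is.comment] <- gsub("\"", "'", code[is.comment])
code[is.comment] <- sprintf("%s=\"%s\"", marker, code[is.comment])

# Parse and deparse every expression to normalise the spacing.
exprs <- parse(text = code)
tidy <- vapply(seq_along(exprs), function(i) {
    paste(deparse(exprs[[i]]), collapse = "\n")
}, character(1))

# Unmask the comments again.
tidy <- sub(sprintf("^%s = \"(.*)\"$", marker), "\\1", tidy)
cat(tidy, sep = "\n")
```

The comment line comes back unchanged, while the assignment is printed with spaces around its operators. An inline comment such as `y=x+1 # note` would still be lost, because only lines that start with `#` are masked.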
I’m going to give a talk at CUEB on some topics in statistics, at the invitation of the Association of Statistics of CUEB. I’ve prepared two main topics: one is about the jokes in Prof. Gary’s gallery of statistics jokes, and the other is about some tools for statistical research. Below are the materials for this talk:
Slides for “Jokes in Statistics” (English, PDF by LaTeX):
Download the file here
Slides for “A Leisure Look on Some Tools for Statistics” (Chinese, PDF by PowerPoint):
Download the file here
R code for my talk (most of it contains somewhat interesting animations):
Download the file here
If there are any errors in these materials, please tell me (by email at x@y with x = xieyihui & y = gmail.com, or leave a message here directly). Thanks!
P.S. The date of this talk has now been set: Nov 1st, 2007. For details please refer to: http://cos.name/cn/topic/8122
I wrote this demo about three months ago when I was illustrating the “Gradient Descent” algorithm in the class “Data Mining & Machine Learning”. I like to combine iterations (loops) with animated pictures, because it’s simple and instructive, and of course it’s easy in R: just use Sys.sleep() to control the pause between the steps of your demonstration, and low-level graphics functions such as lines(), points(), rect(), polygon() and segments() to illustrate the process of your algorithm. To understand the figure below, you need to know what a contour plot is.
The code for the above example is as follows:
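What follows is only a minimal sketch of the approach described above, not the original demo code. The objective `f`, its gradient `grad`, the starting point, the step size and the helper names `grad_descent()` and `animate_path()` are all assumptions chosen for illustration.

```r
# Hypothetical objective and its gradient (assumptions for illustration).
f <- function(x, y) x^2 + 2 * y^2
grad <- function(x, y) c(2 * x, 4 * y)

# Run gradient descent from `start`; return the visited points as a matrix
# with one row per iterate.
grad_descent <- function(start, rate = 0.1, steps = 30) {
    path <- matrix(NA_real_, nrow = steps + 1, ncol = 2)
    path[1, ] <- start
    for (i in seq_len(steps)) {
        path[i + 1, ] <- path[i, ] - rate * grad(path[i, 1], path[i, 2])
    }
    path
}

# Draw the contour plot once, then add one segment per iteration,
# pausing briefly so the descent appears as an animation.
animate_path <- function(path, pause = 0.1) {
    x <- seq(-3, 3, length.out = 60)
    y <- seq(-3, 3, length.out = 60)
    contour(x, y, outer(x, y, f), xlab = "x", ylab = "y")
    points(path[1, 1], path[1, 2], pch = 19)
    for (i in seq_len(nrow(path) - 1)) {
        segments(path[i, 1], path[i, 2], path[i + 1, 1], path[i + 1, 2],
            col = "red")
        points(path[i + 1, 1], path[i + 1, 2], pch = 19, cex = 0.6)
        Sys.sleep(pause)
    }
}

path <- grad_descent(c(-2.5, 2.5))
# Only animate in an interactive session, where a graphics window is visible.
if (interactive()) animate_path(path)
```

Keeping the computation in `grad_descent()` separate from the drawing in `animate_path()` means the iterates can be inspected or reused without opening a graphics device.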
This is my function for tidying up R code:
Download the file here
Actually it’s quite easy, though I didn’t know it before. When R was upgraded from 2.4.1 to 2.5.0, the function source() was also modified. In the past I used source(my_source_file, echo = TRUE, prompt = "") to “tidy up” my code, because it is not convenient for me to type every space between operators; what’s more, I have no fixed rules for breaking a line or indenting properly. Thus I need a function to “tidy up” my code automatically.
After reading the source code of the function source(), I quickly found that the most critical function is parse(), which turns your code file into neat expressions; the rest of the work is just extracting substrings.
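# note: choose.files() is only available on Windows
tidy.source = function(file = choose.files()) {
    exprs = parse(file)
    # seq_along() skips the loop when the file has no expressions
    for (i in seq_along(exprs)) {
        dep = paste(deparse(exprs[i]), collapse = "\n")
        # strip the leading 'expression(' and the trailing ')'
        dep = substring(dep, 12, nchar(dep) - 1)
        cat(dep, "\n")
    }
}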
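As a hedged variation on the same idea: indexing the parsed expressions with `[[ ]]` instead of `[ ]` yields calls rather than one-element expression vectors, so deparse() does not add the `expression( )` wrapper and no substring step is needed. A minimal sketch, with `messy` as a made-up input string:

```r
# A made-up one-line snippet with no spacing or indentation.
messy <- "for(i in 1:3){x=i^2;print( x )}"

# parse() turns the text into an expression vector; deparse() on each
# element regenerates the code with R's own spacing and indentation.
exprs <- parse(text = messy)
tidy <- unlist(lapply(seq_along(exprs), function(i) deparse(exprs[[i]])))

cat(tidy, sep = "\n")
```

Like the function above, this drops comments, because parse() discards them before deparse() ever sees the code.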


