Apr 13, 2010

It is not uncommon to see messy R code that is barely human-readable, like this:

 # rotation of the word "Animation"
# in a loop; change the angle and color
# step by step
for (i in 1:360) {
 # redraw the plot again and again
plot(1,ann=FALSE,type="n",axes=FALSE)
# rotate; use rainbow() colors
text(1,1,"Animation",srt=i,col=rainbow(360)[i],cex=7*i/360)
# pause for a while
Sys.sleep(0.01)}

Reading unformatted R code is clearly a pain, but on the other hand, it is natural for us to be lazy. I don’t want to worry about adding spaces or indentation to my raw R code; I’d rather concentrate on programming first and format my code later. The R package ‘formatR’ is intended to help us format our messy R code.

# formatR optionally depends on gWidgetsRGtk2
# please use the latest version of R (>=2.12.0)
install.packages('formatR')
library(formatR)
formatR()

## you will get an error if the package gWidgetsRGtk2 is not installed;
## then you need to install it
install.packages('gWidgetsRGtk2')
formatR('RGtk2')

Then you can either paste your code into the text box or click the “Open” button to open an existing R code file. Click the “Convert” button and you are done!

formatR: unformatted R code

formatR: tidy R code

There are several options in the “Preferences” panel, e.g. you can specify whether to keep comments or blank lines, or specify the width of the formatted R code.

No matter how messy your code looks, formatR can make it tidy and structured as long as there are no syntax errors in it. If you prefer the command line interface, take a look at the function tidy.source() in this package.

Note that multi-byte characters (say, Chinese) are also supported in the GUI.

Mar 23, 2010

Here are some (trivial) R tips from the course Stat 511. I’ll update this post until the semester is over.

  1. Formatting R Code

  2. I submitted an R package named formatR to CRAN yesterday. This package should be easier to use than the code below, because it has a GUI to tidy your R code. Install it with install.packages('formatR').

    Reading code is a pain, but well-formatted code can alleviate the pain a little. The function tidy.source() in the animation package formats R code automatically. By default it reads the code on the clipboard, parses it, and returns the well-formatted code. You can keep or remove comments and blank lines, set the width of the code, and so on. Spaces and indentation are added automatically, which saves us the time of typing spaces and minding the indentation.

    ## install.packages('animation') if it is not installed yet
    library(animation)
    ## copy some R code somewhere and type:
    tidy.source()
    ## or specify the path of your code file
    tidy.source(file.path(system.file(package = "graphics"), "demo", "image.R"))
    ## can also use a URL
    tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R')
    ## remove blank lines
    tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R',
               keep.blank.line = FALSE)
    ## remove comments
    tidy.source('http://www.public.iastate.edu/~dnett/S511/twofactor.R',
               keep.comment = FALSE)
    
Mar 31, 2009

After a few hours’ work, I modified the function tidy.source() in the animation package so that it can preserve complete comment lines. See the tidy.source() wiki page for an example.

Download the R code here
tidy.source <- function(source = "clipboard", keep.comment = TRUE,
  keep.blank.line = FALSE, begin.comment, end.comment, ...) {
  # parse and deparse the code
  tidy.block = function(block.text) {
      exprs = parse(text = block.text)
      n = length(exprs)
      res = character(n)
      for (i in seq_len(n)) {
        dep = paste(deparse(exprs[i]), collapse = "\n")
        res[i] = substring(dep, 12, nchar(dep) - 1)
      }
      return(res)
  }
  text.lines = readLines(source, warn = FALSE)
  if (keep.comment) {
      # identifier for comments
      identifier = function() paste(sample(LETTERS), collapse = "")
      if (missing(begin.comment))
        begin.comment = identifier()
      if (missing(end.comment))
        end.comment = identifier()
      # remove leading and trailing white spaces
      text.lines = gsub("^[[:space:]]+|[[:space:]]+$", "",
        text.lines)
      # make sure the identifiers are not in the code
      # or the original code might be modified
      while (length(grep(sprintf("%s|%s", begin.comment, end.comment),
        text.lines))) {
        begin.comment = identifier()
        end.comment = identifier()
      }
      head.comment = substring(text.lines, 1, 1) == "#"
      # add identifiers to comment lines to cheat R parser
      if (any(head.comment)) {
        text.lines[head.comment] = gsub("\"", "\'", text.lines[head.comment])
        text.lines[head.comment] = sprintf("%s=\"%s%s\"",
          begin.comment, text.lines[head.comment], end.comment)
      }
      # keep blank lines?
      blank.line = text.lines == ""
      if (any(blank.line) && keep.blank.line)
        text.lines[blank.line] = sprintf("%s=\"%s\"", begin.comment,
          end.comment)
      text.tidy = tidy.block(text.lines)
      # remove the identifiers
      text.tidy = gsub(sprintf("%s = \"|%s\"", begin.comment,
        end.comment), "", text.tidy)
  }
  else {
      text.tidy = tidy.block(text.lines)
  }
  cat(paste(text.tidy, collapse = "\n"), "\n", ...)
  invisible(text.tidy)
}

Note that inline comments will still be removed. I don’t want to spend any more time on dealing with them.
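The masking trick in the function above can be seen in isolation in the following minimal sketch. It assumes a fixed, hypothetical marker (COMMENTMASK) instead of the random identifiers, and uses the same marker at both ends of the string; it also shows that an inline comment does not survive the parse step.

```r
# Two lines of messy code: one full-line comment, one statement with an
# inline comment
lines <- c("# add two numbers", "z=1+2 # inline note")

# Hypothetical fixed marker; the function above draws random letters instead
id <- "COMMENTMASK"

# Disguise full-line comments as string assignments so the parser keeps them
is_comment <- substring(lines, 1, 1) == "#"
lines[is_comment] <- sprintf("%s=\"%s%s\"", id, lines[is_comment], id)

# Parse and deparse each statement; deparse() inserts the standard spacing
exprs <- parse(text = lines, keep.source = FALSE)
tidy <- vapply(exprs, function(e) paste(deparse(e), collapse = "\n"),
               character(1))

# Remove the mask: after deparsing it reads `COMMENTMASK = "` ... `COMMENTMASK"`
tidy <- gsub(sprintf("%s = \"|%s\"", id, id), "", tidy)

cat(tidy, sep = "\n")
```

The full-line comment comes back unchanged, while the statement is reformatted and its inline comment is gone, because the parser never sees it as part of an expression.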

Oct 04, 2007

I’m going to give a talk at CUEB on some topics in statistics, at the invitation of the Association of Statistics of CUEB. I’ve prepared two main topics: one is about the jokes from Prof. Gary’s gallery of statistics jokes, and the other is about some tools for statistical research. Below are the materials for this talk:

Slides for “Jokes in Statistics” (English, PDF by LaTeX):

Download the file here

Slides for “A Leisure Look on Some Tools for Statistics” (Chinese, PDF by PowerPoint):

Download the file here

R code for my talk (most of it contains somewhat interesting animations):

Download the file here

If there are any errors in these materials, please tell me (by email at x@y with x = xieyihui and y = gmail.com, or leave a message here directly). Thanks!

P.S. The date of this talk has now been decided: Nov 1st, 2007. For details please refer to: http://cos.name/cn/topic/8122

Sep 06, 2007

I wrote this demo about three months ago to illustrate the “Gradient Descent” algorithm in the “Data Mining & Machine Learning” class. I like to combine iterations (or loops) with animated pictures because it is simple and heuristic, and of course it is easy in R: just use Sys.sleep() to control the timing of each step of the demonstration, and low-level graphics functions such as lines(), points(), rect(), polygon() and segments() to illustrate the process of the algorithm. To understand the figure below, you need to know what a contour plot is.

Process of Minimization by Gradient Descent (2D)

The code for the above example is as follows:
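As a minimal sketch of the idea (not the original demo code), assuming a hypothetical objective f(x, y) = x^2 + 2y^2, a fixed step size, and an arbitrary starting point, gradient descent on a contour plot could be animated like this:

```r
# Hypothetical objective and its gradient (assumptions for illustration only)
f <- function(x, y) x^2 + 2 * y^2
grad <- function(x, y) c(2 * x, 4 * y)

# Plain gradient descent with a fixed step size gamma;
# returns a two-column matrix of all visited points
grad_desc <- function(start = c(-3, 3), gamma = 0.05, tol = 1e-6,
                      max_iter = 200) {
  path <- matrix(start, nrow = 1)
  for (i in seq_len(max_iter)) {
    cur <- path[nrow(path), ]
    nxt <- cur - gamma * grad(cur[1], cur[2])
    path <- rbind(path, nxt)
    # stop once the objective barely changes
    if (abs(f(nxt[1], nxt[2]) - f(cur[1], cur[2])) < tol) break
  }
  unname(path)
}

path <- grad_desc()

# Animate only in an interactive session: draw the contours once,
# then add one arrow per iteration with a short pause in between
if (interactive()) {
  x <- seq(-4, 4, length.out = 50)
  contour(x, x, outer(x, x, f), nlevels = 20)
  for (i in 2:nrow(path)) {
    arrows(path[i - 1, 1], path[i - 1, 2], path[i, 1], path[i, 2],
           length = 0.05, col = "red")
    Sys.sleep(0.1)
  }
}
```

The descent itself is kept separate from the drawing so that the path can be inspected or reused without opening a graphics device.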

Aug 12, 2007

This is my function for tidying up R code:

Download the file here

Actually it’s quite easy, though I didn’t know it before. When R was upgraded from 2.4.1 to 2.5.0, the function source() was also modified. In the past I used source(my_source_file, echo = TRUE, prompt = "") to “tidy up” my code, because it’s not convenient for me to type every space between operators. What’s more, I have no fixed rules for breaking lines or indenting properly. Thus I needed a function to “tidy up” my code automatically.

After reading the source code of the function source(), I quickly found that the most critical function is parse(), which can turn your code file into neat expressions; the rest of the work is just extracting substrings.

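tidy.source = function(file = choose.files()) {
   # choose.files() is only available on Windows
   exprs = parse(file)
   for (i in seq_along(exprs)) {
       # deparse() returns "expression(...)"; strip that wrapper
       dep = paste(deparse(exprs[i]), collapse = "\n")
       dep = substring(dep, 12, nchar(dep) - 1)
       cat(dep, "\n")
   }
}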
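
To see why the substring starts at character 12, here is a small sketch of the parse/deparse round trip on a made-up two-statement snippet (the snippet and variable names are only for illustration). It also shows an alternative that avoids the wrapper by extracting the call with `[[`.

```r
# A tiny, deliberately unformatted snippet
code <- "x=1+2\ny<-c(1,2,3)"
exprs <- parse(text = code, keep.source = FALSE)

# Subsetting with [ keeps an expression vector, so deparse() wraps the
# statement in "expression(...)"
dep <- paste(deparse(exprs[1]), collapse = "\n")
print(dep)

# "expression(" is 11 characters long: keep from character 12 onwards
# and drop the closing ")"
tidy1 <- substring(dep, 12, nchar(dep) - 1)
print(tidy1)

# Alternative: [[ extracts the call itself, so there is nothing to strip
tidy2 <- paste(deparse(exprs[[2]]), collapse = "\n")
print(tidy2)
```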
WWW.YIHUI.NAME XIE@YIHUI.NAME © 2007 - 2012 by Yihui Xie