Oct 10, 2009

Today Romain Francois posted an interesting topic on the R-help list, and you can read his blog post for more details: celebrating R commit #50000. 50000 is certainly not a small number; we owe the R core members a big “thank you” for their great efforts on this fantastic statistical language over these 13 years. When I saw Romain’s data, I suddenly remembered a question I asked one of Prof Ripley’s students a couple of years ago: does Prof Ripley ever sleep? He answered “No!”. No wonder we see Prof Ripley so frequently on the R-help/devel mailing lists. If you have stayed on the R-help list long enough, you surely know several facts, e.g. Martin Maechler will arrive in less than 3 minutes if you dare call an R package a “library”, and you will get “Ripleyed” if you are not careful enough when posting your R code.

> library(fortunes)
> fortune("Ripleyed")

And the fear of getting Ripleyed on the mailing list also makes me think, read,
and improve before submitting half baked questions to the list.
 -- Eric Kort
 R-help (January 2006)
Mar 16, 2009

Today Ruya Gokhan Kocer asked me how to use the R function identify() with off-screen graphics devices. It is actually pretty easy: call identify(pos = TRUE) on an on-screen device first, keep the list it returns, and reuse that list on the off-screen device. For example,

Download the file here
# open a windows device
x11()
x = rnorm(20)
y = rnorm(20)
plot(x, y)
# identify 5 points
id = identify(x, y, n = 5, pos = TRUE)

# $ind
# [1]  2  6 10 14 16
#
# $pos
# [1] 1 1 4 4 1

# then open a bitmap device
png("identify.png")
plot(x, y)
# reuse the indices and label positions recorded from the mouse clicks above
text(x[id$ind], y[id$ind], id$ind, pos = id$pos)
dev.off()
Sep 06, 2008

The other day I sent a small assignment to a group of people so that they could “play” with statistics and become more interested in the subject. The data provided to them is:

Download the file here (104K)

The data-generating process was quite simple: first I generated 20000 random numbers (10000 rows, 2 columns) from N(0, 1), and then added 10000 rows of numbers which lie exactly on a circle; finally I provided the data in a randomized order so that people cannot easily discover the pattern just from the numbers.
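The process described above can be sketched roughly as follows. This is not the original script: the seed, the circle's radius `r`, its center at the origin and the uniform spread of the angles are all assumptions made for illustration.

```r
set.seed(2008)                        # arbitrary seed, for reproducibility only
n <- 10000

# 10000 rows x 2 columns of N(0, 1) noise (20000 random numbers in total)
noise <- matrix(rnorm(2 * n), ncol = 2)

# 10000 points lying exactly on a circle; radius and center are assumed
r <- 0.5
theta <- runif(n, 0, 2 * pi)
circle <- cbind(r * cos(theta), r * sin(theta))

# stack both parts and shuffle the rows so the pattern is hidden in the numbers
dat <- rbind(noise, circle)
dat <- dat[sample(nrow(dat)), ]
```

With default plotting settings, plot(dat) on such data tends to show little more than a dense blob in the middle.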

The question is, how to reveal the particular pattern in this “pile of sand”? Let’s look at the original plot:

The original scatter plot

What can we observe from this scatter plot? Perhaps nothing but “a pile of sand”. However, if we choose alternative ways to create the plot again, things will be completely different. Here are my approaches:

Dec 24, 2007

On some operating systems, a few R graphics devices might not be available, so we should check the capabilities of the current R build before writing code that creates image files; otherwise there may be errors. The function for this is simply capabilities().

I didn’t notice this and was wondering why there were errors in the check summary of my R package “animation”. Now I understand the reason. Thus I’ll modify the function savePNG() a little.
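A minimal sketch of such a check is given below. It is not the actual change made to savePNG(); the fallback to pdf() and the file names are assumptions for illustration.

```r
# does this R build support the png() device?
has_png <- unname(capabilities("png"))

# hypothetical output location in the temporary directory
out_file <- file.path(tempdir(), if (has_png) "demo.png" else "demo.pdf")

if (has_png) {
    png(out_file)
} else {
    # assumed fallback: the pdf() device is part of the base grDevices package
    pdf(out_file)
}
plot(rnorm(20), rnorm(20))
dev.off()
```

Calling capabilities() without arguments returns a named logical vector covering all the features it knows about, which is handy for a quick look at what a given machine supports.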

Dec 22, 2007
NOTE: this illusion has been implemented in the R package “animation” v0.1-4. The function is vi.lilac.chaser().

Yihui @ Dec 25, 2007

This was a sudden idea that came to my mind yesterday. Actually, some optical illusions can be created very easily using the R graphics system. Here is one example I wrote yesterday:

Download the file here
# By Yihui XIE, Dec 22, 2007 www.yihui.name
op = par(bg = "gray", mar = rep(2, 4), xpd = NA)
# 16 distinct angles: 0 and 2 * pi are the same position, so drop the endpoint
x = seq(0, 2 * pi, length.out = 17)[-17]
invisible(replicate(100, {
    for (i in seq_along(x)) {
        # empty canvas without axes or annotations
        plot(1, xlim = c(-1, 1), ylim = c(-1, 1), axes = FALSE, ann = FALSE,
            type = "n")
        # draw all points on the circle except the i-th one
        points(sin(x[-i]), cos(x[-i]), col = "magenta", cex = 7,
            pch = 19)
        points(0, 0, pch = "+", cex = 5, lwd = 2)
        Sys.sleep(0.05)
    }
}))
par(op)

Focus your eyes on the center “+” for a few seconds, and you will find that the “circling” point changes its color (to green). Perhaps I’ll write a package for these illusions next year.

Oct 25, 2007
Note: the website introduced below has been moved to http://animation.yihui.name.

This afternoon I went to the Beijing Customs to give a lecture on sampling techniques as well as my R program. Actually I didn’t make any preparations until late this morning. After I finished my lunch, I made some animated pictures to illustrate four kinds of sampling methods: simple random sampling, stratified sampling, cluster sampling and systematic sampling.

Sampling Survey

After I came back to school, I added these animations to my little project “Animated Statistics Using R”. You may see them here.
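For readers who just want the idea without the animations, the four sampling schemes mentioned above can be sketched in a few lines of base R. This is not the code behind the animations; the population `pop`, its strata and clusters, and the sample size are hypothetical.

```r
set.seed(1)
# hypothetical population: 100 units, 4 strata of 25 units, 10 clusters of 10 units
pop <- data.frame(id = 1:100, stratum = rep(1:4, each = 25),
    cluster = rep(1:10, times = 10))
n <- 20

# simple random sampling: n units drawn without replacement
srs <- sample(pop$id, n)

# stratified sampling: n/4 units drawn separately within each stratum
strat <- unlist(lapply(split(pop$id, pop$stratum), sample, size = n/4))

# cluster sampling: draw 2 clusters and take every unit in them
cl <- sample(unique(pop$cluster), 2)
clus <- pop$id[pop$cluster %in% cl]

# systematic sampling: random start, then every k-th unit
k <- nrow(pop)/n
start <- sample(k, 1)
sys <- seq(start, nrow(pop), by = k)
```

All four samples have the same size here, which makes it easy to compare how differently they spread over the population.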

Oct 04, 2007

I’m going to give a talk at CUEB on some topics in statistics at the invitation of the Association of Statistics of CUEB, and I’ve mainly prepared two topics for them: one is about jokes from Prof. Gary’s gallery of statistics jokes, and the other is about some tools for statistical research. Below are the materials for this talk:

Slides for “Jokes in Statistics” (English, PDF by LaTeX):

Download the file here

Slides for “A Leisure Look on Some Tools for Statistics” (Chinese, PDF by PowerPoint):

Download the file here

R code for my talk (most of the scripts contain somewhat interesting animations):

Download the file here

If there are any errors in these materials, please tell me (by email at x@y with x = xieyihui & y = gmail.com, or leave a message here directly). Thanks!

P.S. The time of this talk has been decided now: Nov 1st, 2007. For details please refer to: http://cos.name/bbs/read.php?tid=8122

Sep 14, 2007

Many graphics functions accept the parameter alpha, which is usually used to specify semi-transparent colors. However, such colors can only be displayed on certain devices, as stated in the help page of rgb():

Semi-transparent colors (0 < alpha < 1) are supported only on some devices: at the time of writing only on the pdf and (on MacOS X) quartz devices as well as several third-party devices such as those in packages Cairo, cairoDevice, JavaGD and RSvgDevice.

Here is an example illustrating semi-transparent colors in a pdf device:
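A minimal sketch along these lines (not the original code of this post; the file name, the data and the chosen color are assumptions):

```r
# hypothetical output file in the temporary directory
out <- file.path(tempdir(), "alpha-demo.pdf")

# a blue that is 20% opaque; rgb() appends the alpha channel to the hex code
col <- rgb(0, 0, 1, alpha = 0.2)

pdf(out)
x <- rnorm(2000)
y <- rnorm(2000)
# overlapping points build up darker regions where the data are dense
plot(x, y, pch = 19, col = col)
dev.off()
```

Because each point is mostly transparent, overplotting shows up as darker areas instead of a solid block of color.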

Sep 06, 2007

I wrote this demo about three months ago to illustrate the “Gradient Descent” algorithm in the “Data Mining & Machine Learning” class. I like to combine iterations (or loops) with animated pictures because it’s simple and heuristic, and of course it’s easy in R: just use Sys.sleep() to control the time between the steps of your demonstration, and some low-level graphics functions such as lines(), points(), rect(), polygon() and segments() to illustrate the process of your algorithm. To understand the figure below, you need to know what a contour plot is.

Process of Minimization by Gradient Descent (2D)

The code for the above example is as follows:
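A minimal sketch of the idea, which is not the original code of this demo: the objective function `f`, its gradient, the starting point and the step size `gamma` are all assumptions chosen to keep the example short.

```r
# assumed objective function and its gradient
f <- function(x, y) x^2 + 2 * y^2
grad <- function(p) c(2 * p[1], 4 * p[2])

gamma <- 0.1       # assumed step size
p <- c(-3, 3)      # assumed starting point

# contour plot of f as the background
xs <- seq(-4, 4, length.out = 50)
ys <- seq(-4, 4, length.out = 50)
contour(xs, ys, outer(xs, ys, f), xlab = "x", ylab = "y")
points(p[1], p[2], pch = 19)

for (i in 1:50) {
    # move against the gradient
    p_new <- p - gamma * grad(p)
    # draw this step on top of the contour plot
    segments(p[1], p[2], p_new[1], p_new[2], col = "red")
    p <- p_new
    # pause between steps only when someone is watching
    if (interactive()) Sys.sleep(0.1)
}
```

Each red segment is one iteration, so the path can be seen crossing the contour lines towards the minimum at (0, 0).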

WWW.YIHUI.NAME XIE@YIHUI.NAME © 2007 - 2010 by Yihui Xie