R Tips in Stat 511

Here are some (trivial) R tips for the course Stat 511. I’ll update this post until the semester is over.

Formatting R Code

Reading code is a pain, but well-formatted code might alleviate the pain a little bit. The function `tidy.source()` in the formatR package (renamed `tidy_source()` in later versions of the package) can format our R code automatically. By default it reads the code from the clipboard, parses it and returns the well-formatted code. There are options to keep or remove comments and blank lines, to set the width of the code, and so on. Spaces and indentation are added automatically, which saves us the time of typing spaces and paying attention to indentation.
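As a minimal sketch, assuming a recent formatR in which the function is called `tidy_source()`, the code can also be passed as a character vector through the `text` argument instead of the clipboard. The messy input lines below are made up for illustration.

```r
library(formatR)

# Hypothetical badly formatted code, one statement per element
messy <- c(
  "x=c(1,2,3)  # some data",
  "if(mean(x)>1){print('large')}else{print('small')}"
)

# output = FALSE suppresses printing; the tidied code is returned in $text.tidy
res <- tidy_source(text = messy, width.cutoff = 60, output = FALSE)
cat(res$text.tidy, sep = "\n")
```

With the default `source = "clipboard"`, calling `tidy_source()` with no arguments formats whatever code was last copied.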

Approximating Rationals by Fractions

We often deal with matrices like `C(X'X)^{-1}X'` in 511 and may wonder what on earth they are. If we compute `solve(t(X) %*% X) %*% t(X)` directly (or use the generalized inverse `ginv()` in MASS), we often end up with a lot of decimals, which makes it difficult to see what these numbers really mean. The function `fractions()` in the MASS package can approximate such decimals by fractions.
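A small sketch of the idea, assuming a hypothetical design matrix for a simple linear regression on `x = 1, 2, 3` (not an example from the course):

```r
library(MASS)

# Hypothetical design matrix: intercept column plus x = 1, 2, 3
X <- cbind(1, c(1, 2, 3))

# (X'X)^{-1} X' printed as decimals is hard to read
A <- solve(t(X) %*% X) %*% t(X)
print(A)

# The same matrix shown as fractions (4/3, 1/3, -2/3 in the first row)
print(fractions(A))
```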

Jittered Strip Chart

A strip chart is a common tool for comparing batches. When points overlap in the plot, we may “jitter” them by adding a little noise to the data. The R function `jitter()` can be applied to the data directly, but `stripchart()` already supports jittered points through its `method = "jitter"` argument.
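A minimal sketch with simulated data (the data frame `scores` and its columns are hypothetical); the values are rounded on purpose so that many points coincide:

```r
set.seed(511)

# Hypothetical data: three groups with rounded (hence heavily tied) values
scores <- data.frame(
  group = rep(c("A", "B", "C"), each = 20),
  value = round(rnorm(60, mean = rep(c(5, 6, 7), each = 20)))
)

# method = "jitter" spreads tied points sideways; `jitter` sets the amount
stripchart(value ~ group, data = scores, method = "jitter",
           jitter = 0.2, vertical = TRUE, pch = 1)
```

Compare with `method = "overplot"` (the default) to see how many points are hidden without jittering.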

Testing `C beta = d` in a Linear Model

Base R does not provide a general test of `C beta = d` for the coefficients of a linear model, but we can use the function `glh.test()` in the gmodels package to do it. If you take a look at its source code, you will find, unsurprisingly, that it is essentially the code on page 7 of slide set 9 of Dr Nettleton’s lecture notes.
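A rough sketch of a call, assuming the gmodels package is installed. The model and the matrix `C` below are illustrative choices built on the built-in `mtcars` data, not taken from the lecture notes.

```r
library(gmodels)

# Illustrative model: mpg regressed on weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Each row of C is one linear combination of (Intercept, wt, hp);
# here we test H0: beta_wt = 0 and beta_hp = 0 simultaneously
C <- rbind(c(0, 1, 0),
           c(0, 0, 1))
res <- glh.test(fit, C, d = c(0, 0))
print(res)

# For this particular C and d the test coincides with the overall F test
print(summary(fit)$fstatistic)
```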

Demo for the F Distribution

I created a dynamic demo to illustrate the power of the F test; see the post “Demonstrating the Power of F Test with gWidgets”. Play with it and have fun!

Tricks in `read.table()`

Many people do not realize that `read.table()` can convert the data types of columns, and always convert each column by hand after reading the data.
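A sketch of that post hoc style, using a small made-up data set passed through the `text` argument so that no file is needed (the column names are hypothetical):

```r
# Hypothetical data; in practice this would be a file on disk
raw <- "id group y
1 1 2.3
2 1 3.1
3 2 4.0
4 2 5.2"

d <- read.table(text = raw, header = TRUE)
str(d)  # id and group are both read as integers

# Column-by-column conversion after the fact
d$id <- as.character(d$id)
d$group <- factor(d$group)
str(d)
```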

But in fact, we can specify the types of columns while reading the data, using the `colClasses` argument.
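A minimal sketch of reading with `colClasses`, again on a small made-up data set (the column names are hypothetical):

```r
# Hypothetical data; in practice this would be a file on disk
raw <- "id group y
1 1 2.3
2 1 3.1
3 2 4.0
4 2 5.2"

# One class per column, in the order the columns appear in the data
d2 <- read.table(text = raw, header = TRUE,
                 colClasses = c("character", "factor", "numeric"))
str(d2)
```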

There are other tricks in `read.table()` but I find this one the most useful. Check the many arguments (22 at the time of writing) in `?read.table` if you want to know more magic (e.g. how to specify the first column in the data file as the row names).
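For the row-names case mentioned above, a short sketch using the `row.names` argument (the data are made up):

```r
# Hypothetical data whose first column holds labels rather than values
raw <- "name height weight
alice 165 55
bob 180 75"

# row.names = 1 uses the first column as row names instead of as a variable
d3 <- read.table(text = raw, header = TRUE, row.names = 1)
print(rownames(d3))
print(d3["bob", "height"])
```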

Demo for Newton’s Method

There is a function `newton.method()` in the animation package which shows the detailed iterations of Newton’s method as an animation.
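The animation itself is not reproduced here. As a rough stand-in, the iterations it displays can be traced in base R. The helper `newton_trace()` below is a hypothetical name and is not part of the animation package; it assumes the derivative is supplied by hand.

```r
# Hypothetical helper (not from the animation package): records every iterate
newton_trace <- function(f, fprime, x0, tol = 1e-8, max_iter = 50) {
  x <- x0
  path <- x
  for (i in seq_len(max_iter)) {
    x_new <- x - f(x) / fprime(x)      # Newton update
    path <- c(path, x_new)
    if (abs(x_new - x) < tol) break    # stop once the step is tiny
    x <- x_new
  }
  path
}

# Solve x^2 - 4 = 0 starting from x0 = 10
path <- newton_trace(function(x) x^2 - 4, function(x) 2 * x, x0 = 10)
print(path)
```

In the animation package the corresponding call is along the lines of `newton.method(function(x) x^2 - 4, init = 10)`; check `?newton.method` for the exact arguments in your installed version.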

I hope this is useful for understanding iterative algorithms.

Misc Tips

Some little tips:

1. `unname()`: to remove the names of objects

       x = c(a = 1, b = 2)
       x
       # a b
       # 1 2
       unname(x)  ## x = unname(x) if one wants to replace x
       # [1] 1 2
Published under (CC) BY-NC-SA in categories R language  Statistics