Sweave

Transition from Sweave to knitr

2012-02-24

Before version 1.0, knitr was compatible with Sweave to ease the transition, but that compatibility was dropped as of v1.0 to make the package (much) easier to maintain. If you have an Rnw document written for Sweave, the first step is to call Sweave2knitr() on it, and knitr will automatically correct the syntax (mainly chunk options, e.g. results=hide should be results='hide', and eval=true should be eval=TRUE).

library(knitr)
Sweave2knitr('old-document.Rnw') # you will get old-document-knitr.Rnw by default
# see ?Sweave2knitr for details
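
To preview a conversion without writing any file, the text argument of Sweave2knitr() can be used instead of a file name. The following is only a minimal sketch: the three-line chunk is a made-up example, and the exact converted text and messages may vary with the knitr version.

```r
library(knitr)

# A hypothetical Sweave-style chunk, one element per line
old <- c("<<summary, results=hide, eval=true>>=", "summary(cars)", "@")

# With text supplied, the converted lines are returned rather than written to a file
new <- Sweave2knitr(text = old)
cat(new, sep = "\n")
```

This is a convenient way to check how a particular chunk header will be rewritten before converting a whole document.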

New syntax for chunk options

By default, knitr uses a new syntax to parse chunk options (in chunk headers <<>>=), which is similar to R function arguments. This gives us much more power than the old Sweave syntax: you can use arbitrary R objects in chunk options and make use of the full power of R. Here is a trivial example of setting graphical parameters for base R graphics:

<<par-hook, cache = FALSE>>=
knit_hooks$set(pars = function(before, options, envir) {
  if (before) graphics::par(options$pars)
})
@

<<use-pars, pars = list(mar=c(4, 4, .1, .1), mgp=c(2, 1, 0))>>=
plot(mtcars[, 1:2])
@

First we set a chunk hook named pars, which passes the chunk option pars to par(); then we give a list of parameters to the chunk option pars, and the hook applies it before the chunk is evaluated (hence if (before)). This keeps the long and boring code for setting graphical parameters out of the output.

The chunk options can even make use of objects in a chunk dynamically. Here is another example of setting the caption for the figure environment:

<<setup, cache = FALSE>>=
opts_knit$set(eval.after = 'fig.cap') # evaluate fig.cap after the chunk
@

<<t-test, fig.cap = paste("The P-value is", t.test(x)$p.value)>>=
x = rnorm(100)
boxplot(x)
@

By default, all chunk options are evaluated before a chunk is executed (e.g. the pars option in the first example), but we can also postpone the evaluation of some options by setting the package option eval.after. In this example, the figure caption is dynamically generated from the P-value of a t-test on the object x in the chunk.

Neither of the above cases is possible in Sweave: on the one hand, it is impossible to write literal commas in chunk options because commas are reserved as separators between options; on the other, Sweave options support only three types of objects (logical, numeric and character values), and all of them must be scalars. The root reason is that Sweave parses the options with text string operations such as strsplit(), whereas knitr treats them as formal arguments of a function (see ?formals), so you can use any valid R expression.
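
The difference between the two parsing strategies can be sketched in base R. This is an illustration only, not the actual code of either package: the option string opts is a made-up example with the chunk label left out, and wrapping it in alist() is just one assumed way of treating options as function arguments.

```r
# Hypothetical chunk options (label omitted for simplicity)
opts <- "pars = list(mar = c(4, 4, .1, .1)), fig.cap = 'a, b'"

# Text-based splitting on commas cuts through the list and the string,
# giving more pieces than the two options actually present
naive <- strsplit(opts, ",")[[1]]
length(naive)

# Parsing the same text as unevaluated function arguments keeps each option whole
parsed <- eval(parse(text = paste0("alist(", opts, ")")))
names(parsed)

# Each option is an R expression that can be evaluated when needed
eval(parsed$pars)$mar
```

Because the options stay unevaluated after parsing, they can be evaluated either before or after the chunk runs, which is what makes a setting like eval.after possible.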

The new syntax is more consistent with R syntax, so you do not have to remember any new rules. In the old syntax, you must not quote character strings or write literal commas; in the new syntax, you write character strings exactly as you do in R.

Other compatibility issues with Sweave

Note that most of the issues described in this section can be solved automatically by Sweave2knitr().

Some features of Sweave were dropped in knitr and some were changed, including:

For logical options, only TRUE/FALSE/T/F are supported (the first two are recommended), and true/false will not work, e.g., eval=FALSE is OK, and eval=false is not (unless there is an R object named false which happens to take a logical value).

Chunk reference using <<chunk-label>> is still available, and there are other approaches for reusing chunks, e.g., use the new option ref.label or the function run_chunk(); chunk references can be recursive (see the demo chunk reference).
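
The remark about eval=false follows from options being ordinary R expressions: false is just a symbol that R tries to look up. A small base R sketch of that lookup, where the environment env and the variable names are assumptions made for illustration:

```r
# What eval=false amounts to once the option is parsed as R code: a bare symbol
opt <- quote(false)

# Evaluate in a fresh environment where no object named false exists
env <- new.env()
first_try <- tryCatch(eval(opt, env), error = function(e) "lookup failed")
first_try

# Once an object named false holds a logical value, the same option evaluates fine
assign("false", FALSE, envir = env)
second_try <- eval(opt, env)
second_try
```

Relying on such an object is fragile, which is why writing TRUE or FALSE explicitly is the recommended form.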

Besides, the LaTeX style file Sweave.sty was dropped as well. It has caused too much confusion for users because it is shipped with R rather than with LaTeX. knitr has built-in styles that use standard LaTeX packages, and users are free to change them using hooks.

For inline R code as in \Sexpr{}, knitr will automatically format the results: numbers that are too big or too small will be written in scientific notation (e.g. $3.14 \times 10^{-5}$ instead of the less readable 3.14e-05; see output demo). We can also call purl() to extract R code, which is similar to Stangle().
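
To make the scientific notation format concrete, here is a rough base R sketch of turning a number into that kind of LaTeX string. The helper sci_latex() is hypothetical and is not knitr's implementation; it assumes a single non-zero number and ignores edge cases such as a mantissa that rounds up to 10.

```r
# Hypothetical helper: write a non-zero number as mantissa times a power of ten
sci_latex <- function(x, digits = 3) {
  exponent <- as.integer(floor(log10(abs(x))))  # power of ten
  mantissa <- signif(x / 10^exponent, digits)   # leading significant digits
  sprintf("$%s \\times 10^{%d}$", format(mantissa), exponent)
}

cat(sci_latex(3.14e-05), "\n")
```

In a real document no such helper is needed, since the inline output is formatted automatically.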

Some answers to the Sweave FAQs in vignette('Sweave', package='utils') are different in knitr:

Note that knitr does not check the encoding of the input document using Sweave's tricks such as \usepackage[foo]{inputenc}. You have to specify the encoding explicitly if you are not using the native encoding of your system, e.g. knit('foo.Rnw', encoding = 'GBK').

Compatibility with pgfSweave

dev='tikz' in knitr means tikz=TRUE in pgfSweave. external=TRUE was implemented differently: the cache of tikz graphics is moved to the R level instead of relying on the LaTeX package tikz. If a tikz plot is externalized, knitr will try to compile it to PDF immediately and use \includegraphics{filename} to insert it into the output; in comparison, external=FALSE uses \input{filename.tikz}. This frees users from the GNU make utility and from having to understand tikz externalization.

Compatibility with cacheSweave

Cache was implemented in knitr with functions in base R (e.g., lazyLoad()) and does not rely on other add-on packages. For cached chunks, the results from the last run will be loaded and written into the output, which is more consistent with the default behavior of R code (otherwise users may wonder why print(x) does not produce any output for cached chunks); plots in cached chunks will still be in the output as well. However, bear in mind that not all side-effects can be cached; see the cache page.

Sweave as a subset of knitr

The design of knitr is highly modular, so even if you want to go back to the Sweave style, you are always free to do so with a single function call in a chunk:

<<setup, include=FALSE, cache=FALSE>>=
render_sweave()
@

This tells knitr that you miss the good old Sweave.sty and the Sinput/Soutput environments, and knitr is ready to use them for your output. It is just a matter of output hooks, which are orthogonal to the other steps in the whole process.