Yihui Xie

GLM, Logistic Regression & Statistical Computation

谢益辉 / 2007-04-09

I have noticed that few (domestic) researchers pay attention to statistical computation when they apply models to their data. This is quite dangerous.

In this morning’s multivariate statistics class, Li gave a lecture on logistic regression. In my opinion, statistics majors should not start with this specific regression. Instead, I believe generalized linear models (GLM) should be emphasized first; otherwise the same series of questions, such as how the parameters are estimated, will come up again when studying Poisson regression, negative binomial regression, etc. In a word, the general framework is much more important than specific cases.

Well, let’s get back to the topic. In this class, teacher Du raised a good question: how does the computer arrive at the final estimates? It is indeed critical. I know many people use logistic regression, but few of them know (or have even heard of) what Fisher scoring or the Newton-Raphson scheme is.
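To make the question concrete, here is a minimal sketch of the Newton-Raphson iteration for a logistic regression with an intercept and a single predictor. It is only an illustration under my own assumptions, not how any particular software is implemented: the function name `fit_logistic_newton`, the toy data, the tolerance and the iteration cap are all made up for this example. For the logit link the observed and expected information matrices coincide, so in this special case Newton-Raphson and Fisher scoring take the same steps.

```python
import math


def fit_logistic_newton(x, y, tol=1e-10, max_iter=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    Hypothetical helper for illustration only. Returns (b0, b1, iterations).
    """
    b0, b1 = 0.0, 0.0  # start from the null model
    for iteration in range(1, max_iter + 1):
        g0 = g1 = 0.0          # score vector (gradient of the log-likelihood)
        i00 = i01 = i11 = 0.0  # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)  # variance of a Bernoulli(p) response
            g0 += yi - p
            g1 += (yi - p) * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        # Solve the 2x2 system  information * step = score  in closed form.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1, iteration
    raise RuntimeError("Newton-Raphson did not converge")


# Toy data (made up); the two classes overlap, so the MLE is finite.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

b0, b1, n_iter = fit_logistic_newton(x, y)
print("intercept:", b0, "slope:", b1, "iterations:", n_iter)
```

Note that the sketch has no safeguard such as step halving, and with perfectly separated data the estimates would diverge instead of converging. These are exactly the kinds of details that stay hidden when we just click a button.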

Yes, we rely too much on computers. As Guo Zhigang (an author of a book on logistic regression) told his readers, “Just pass this question to your computer/software, and it’ll tell you the result”. OK, this is the era of IT; computers are almighty; just click a button.

God knows whether the result is true or not! Or rather, I’d believe he doesn’t know either, because he has never attended a class on GLM.