This wiki is meant for people who are comfortable in Matlab and need some of the functionality of R. It assumes only minimal knowledge of R and keeps the use of R to a minimum, which means that all data pre-processing must occur outside of R.
This wiki will cover the following relevant topics in R.
R can read most text files quite easily. The text files can come with any combination of headers and/or indexes:
Header1 | Header2 | … | Headern |
Data1 | Data2 | … | Datan |
or
Header1 | Header2 | … | Headern | |
Index | Data1 | Data2 | … | Datan |
or
Data1 | Data2 | … | Datan |
So, in order to read data into R, you need to write out a file in that format.
If you are going to do any statistics, make sure that you remove outliers and omitted points BEFORE loading the data into R, because this guide does not cover how to omit points once you are in R.
Entering data into R is quite easy. Type:
DataArray <- read.table("filename", header=TRUE)
or
DataArray <- read.table("filename", header=FALSE)
Once the data is in R, you can refer to parts of the data like this:
DataArray$Header1
There is a way to avoid typing DataArray$ every time, but I don't recommend it for our purposes.
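The steps above can be sketched end to end. The file name "mydata.txt" and its contents are just an example; in practice you would export the file from Matlab.

```r
# Write a small whitespace-delimited file with headers (as you would
# export from Matlab), then read it back into R.
writeLines(c("Header1 Header2",
             "1.0 2.0",
             "3.0 4.0"), "mydata.txt")

DataArray <- read.table("mydata.txt", header = TRUE)

# Columns are accessed with the $ operator:
DataArray$Header1   # the first column: 1 3
```

read.table splits on whitespace by default, so a simple space- or tab-delimited export is enough.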
This is exceedingly easy in R, once you know what all the variables are.
Let y and x be column vectors of data values that are all the same length. The linear regression of these vectors is:
fitted.model <- lm(y ~ x, data=data.frame)
This model estimates an intercept as well (lm includes one by default).
fitted.model <- lm(y ~ x - 1, data=data.frame)
This model removes the intercept, forcing the fit through the origin.
Note that x and y must be the names of columns in the data frame, i.e. data$x and data$y must exist.
If this is not the case, change the variable names in the formula.
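A minimal sketch of the above, using synthetic data so the expected coefficients are known. The column names x and y are examples; match them to your own headers.

```r
# Synthetic data lying exactly on the line y = 2x + 5.
dat <- data.frame(x = 1:10)
dat$y <- 2 * dat$x + 5

# Default fit: intercept is estimated automatically.
fitted.model <- lm(y ~ x, data = dat)
coef(fitted.model)   # (Intercept) = 5, x = 2

# Through-the-origin fit: the "- 1" drops the intercept term.
no.intercept <- lm(y ~ x - 1, data = dat)
```

The second fit has a single coefficient (the slope), since no intercept is estimated.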
Similarly to linear regression, all we have to do now is change our data and our formula.
Example:
fitted.model <- lm(y ~ x1 + x2, data=data.frame)
In this case, the data.frame has columns titled y, x1, and x2. It can have more columns than that as well.
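A quick sketch with two predictors; again the column names y, x1, and x2 are examples, and the data are synthetic so the true coefficients are known.

```r
# Synthetic data: y = 3 + 2*x1 - 0.5*x2 exactly.
df <- data.frame(x1 = 1:20, x2 = (1:20)^2)
df$y <- 3 + 2 * df$x1 - 0.5 * df$x2

fitted.model <- lm(y ~ x1 + x2, data = df)
coef(fitted.model)   # intercept = 3, x1 = 2, x2 = -0.5
```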
Similarly to linear regression, all we have to do now is change our formula, which is the first argument. We can transform the variables on BOTH sides of the formula. Note that, unlike lm, nls requires the formula to contain parameters to estimate, along with starting values for them.
Examples:
fitted.model <- nls(y ~ a*x1*log(x1), data=data.frame, start=list(a=1))
or
fitted.model <- nls(log(y) ~ a*x1*exp(x1), data=data.frame, start=list(a=1))
For parameter estimation, use the following code (choose starting values for which the model is defined; here b = 0 would make log(b*x2) undefined):
fitted.model <- nls(y ~ a*x1*log(b*x2), data=data.frame, start=list(a=1, b=1))
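A sketch of parameter estimation on synthetic data generated with known a and b, plus a little noise (nls is not meant for exact zero-residual data). The variable names are examples.

```r
# Generate data from y = 2 * x1 * log(3 * x2) with small noise.
set.seed(1)
df <- data.frame(x1 = runif(50, 1, 2), x2 = runif(50, 1, 2))
df$y <- 2 * df$x1 * log(3 * df$x2) + rnorm(50, sd = 0.01)

# Starting values a = 1, b = 1 keep log(b*x2) defined.
fitted.model <- nls(y ~ a * x1 * log(b * x2), data = df,
                    start = list(a = 1, b = 1))
coef(fitted.model)   # close to a = 2, b = 3
```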
To do a regression that is both multiple and non-linear, the generalization is the same as going from linear regression to multiple linear regression.
Example:
fitted.model <- nls(y^p[2] ~ (p[1]*x1)^p[2] + (p[3]*x2)^p[2], data=data.frame, start=list(p=c(start values for p)))
In order to output the results of your regression, there are MANY possibilities. I recommend:
summary(fitted.model)
This is a comprehensive summary of the entire fit.
Other possibilities are:
formula(fitted.model)
This extracts the model formula.
deviance(fitted.model)
Residual sum of squares of the model.
coef(fitted.model)
Extracts the estimated regression coefficients.
fitted.model
This prints a concise version of the model.
resid(fitted.model)
Displays the residuals.
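The extractor functions above can all be applied to the same fitted model; a small sketch on synthetic data:

```r
# A simple fit to demonstrate the extractor functions.
set.seed(1)
df <- data.frame(x = 1:10)
df$y <- df$x + rnorm(10)
fitted.model <- lm(y ~ x, data = df)

summary(fitted.model)    # full summary of the fit
formula(fitted.model)    # y ~ x
deviance(fitted.model)   # residual sum of squares
coef(fitted.model)       # estimated coefficients
resid(fitted.model)      # one residual per data point
```

Note that deviance() is exactly the sum of the squared residuals returned by resid().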
Shiny provides interactive R widgets on websites: