R is Not So Hard! A Tutorial, Part 2

October 30, 2012

(This article was originally published at The Analysis Factor, and syndicated at StatsBlogs.)

by David Lillis, PhD

In Part 1 we installed R and used it to create a variable and summarize it using a few simple commands. Today let’s re-create that variable and also create a second variable, and see what we can do with them.

As before, we take height to be a variable that describes the heights (in cm) of ten people. Type the following code at the R command line to create this variable.

height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)

Now let’s take weight to be a variable that describes the weights (in kg) of the same ten people. Copy and paste the following code into the R command line to create the weight variable.

weight = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

Both variables are now stored in the R workspace. To view them, enter:

height

weight
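Before plotting, it can help to confirm that the two variables line up. The following is a minimal sketch using base R functions; the particular checks are chosen here for illustration and are not part of the original tutorial.

```r
# Re-create the two variables from the tutorial
height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
weight = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# Both variables should describe the same ten people
length(height)
length(weight)

# Quick numerical summaries of each variable
summary(height)
summary(weight)
```

If the two lengths differ, plot() will stop with an error, so this is a cheap check to run first.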

We can now create a simple plot of the two variables as follows:

plot(weight, height)

However, this is a rather simple plot and we can embellish it a little. Type the following code at the R command line:

plot(weight, height, pch = 16, cex = 1.3, col = "red", main = "MY FIRST PLOT USING R", xlab = "WEIGHT (kg)", ylab = "HEIGHT (cm)")

[Note: R is very picky about the quotation marks you use. If the font displaying this post shows the opening and closing quotation marks facing in different directions, the code won't work in R. Both must look the same: plain straight marks. You may have to retype them within R rather than copying and pasting.]

In the above code, the argument pch = 16 creates solid dots, while cex = 1.3 creates dots that are 1.3 times bigger than the default (where cex = 1). More about these arguments later.
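To get a feel for pch and cex, we can vary them and compare the result. This is only a sketch: the values pch = 17 and cex = 2 and the colour "blue" are arbitrary choices made here for illustration, not settings from the original tutorial.

```r
# Re-create the two variables from the tutorial
height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
weight = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# pch = 17 draws solid triangles instead of solid dots;
# cex = 2 makes each symbol twice the default size
plot(weight, height, pch = 17, cex = 2, col = "blue",
     xlab = "WEIGHT (kg)", ylab = "HEIGHT (cm)")
```

Trying a few different values of pch in turn is a quick way to see which plotting symbols are available.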

Now let’s perform a linear regression of height on weight by entering the following at the command line:

lm(height ~ weight)

The formula height ~ weight tells R to model height as a function of weight. From the output we see that the intercept is 98.0054 and the slope is 0.9528. By the way, lm stands for “linear model”.
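For readers curious where these two numbers come from, the sketch below stores the fitted model in an object and reproduces the coefficients by hand with the usual least-squares formulas. The object names fit, slope and intercept are chosen here for illustration and do not appear in the original tutorial.

```r
# Re-create the two variables from the tutorial
height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
weight = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# Store the fitted model so that it can be reused later
fit = lm(height ~ weight)
coef(fit)

# The same estimates from the least-squares formulas:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope = cov(weight, height) / var(weight)
intercept = mean(height) - slope * mean(weight)

# Rounded to four decimals, these should agree with the lm output
round(c(intercept, slope), 4)
```

Storing the model in an object also lets us pass it to other functions, such as summary(fit), instead of retyping the coefficients.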

Finally, we can add a best fit line to our plot by adding the following text at the command line:

abline(98.0054, 0.9528)
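Rather than typing the coefficients by hand, we can also give abline() a fitted model directly. A minimal sketch, assuming the model is stored in an object called fit (a name chosen here for illustration):

```r
# Re-create the two variables from the tutorial
height = c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
weight = c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# Draw the scatterplot first; abline() adds to an existing plot
plot(weight, height, pch = 16, cex = 1.3, col = "red",
     main = "MY FIRST PLOT USING R",
     xlab = "WEIGHT (kg)", ylab = "HEIGHT (cm)")

# Fit the regression and draw its line using the model's own coefficients
fit = lm(height ~ weight)
abline(fit)
```

This avoids copying rounded numbers from the output, so the line always matches the fitted model exactly.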

None of this was so difficult!

In Part 3 we will look again at regression and create more sophisticated plots.

About the Author: David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.

