SESSION SIX

Table of Contents


Required files:



Plots: plot(x,y) gives a scatterplot if x is continuous, and a box-and-whisker plot if x is

a factor. Some people prefer the alternative syntax plot(y~x) using ‘tilde’ as in a model

formula.

Type of plot: Options include lines type="l" or null (axes only) type="n".

Lines: lines(x,y) plots a smooth function of y against x using the x and y values provided.

You might prefer lines(y~x).

Line types: Useful with multiple line plots, lty=2 (an option in plot or lines).

Points: points(x,y) adds another set of data points to a plot. You might prefer

points(y~x).

Plotting characters for different data sets: pch=2 or pch="*" (an option in points

or plot)

Setting non-default limits to the x or y axis scales uses xlim=c(0,25) and/or ylim=c(0,1)

as an option in plot.

Plots with Two Variables

plot(2:7,4:9)

y=c(2,3,4,5)

barplot(y)


Plotting with two continuous explanatory variables: scatterplots


data1<-read.table("scatter1.txt",header=T)

attach(data1)

names(data1)



Producing the scatterplot


plot(xv,ys,col="red")

plot(xv,ys,col='red',xlab='Explanatory variable',ylab='Response variable')

abline(lm(ys~xv))




data2<-read.table("scatter2.txt",header=T)

attach(data2)

names(data2)


points(xv2,ys2,col="blue")

abline(lm(ys2~xv2))


plot(c(xv,xv2),c(ys,ys2),xlab="x",ylab="y",type="n")

points(xv,ys,col="red")

points(xv2,ys2,col="blue")

abline(lm(ys~xv))

abline(lm(ys2~xv2))



range(c(xv,xv2))

range(c(ys,ys2))


plot(c(xv,xv2),c(ys,ys2),xlim=c(0,100),ylim=c(0,80),xlab="x",ylab="y",type="n")

points(xv,ys,col="red")

points(xv2,ys2,col="blue")

abline(lm(ys~xv))

abline(lm(ys2~xv2))


1 black (the default)

2 red

3 green

4 blue

5 pale blue

6 purple

Identifying individuals in scatterplots


data<-read.table("sleep.txt",header=T)

attach(data)

Subject<-factor(Subject)

plot(Days,Reaction,col=as.numeric(Subject),pch=as.numeric(Subject))



Using a third variable to label a scatterplot


data<-read.table("pgr.txt",header=T)

attach(data)

names(data)



plot(hay,pH)

text(hay, pH, labels=round(FR, 2), pos=1, offset=0.5,cex=0.7)


plot(hay,pH,pch=16,col=ifelse(FR>median(FR),"red","black"))



Adding text to scatterplots

text(0.8,45,"(b)")



map.places <-read.csv("map.places.csv",header=T)

attach(map.places)

names(map.places)


map.data<-read.csv("bowens.csv",header=T)

attach(map.data)

names(map.data)




nn<-ifelse(north<60,north+100,north)


plot(c(20,100),c(60,110),type="n",xlab="", ylab="")


for (i in 1:length(wanted)){

ii <- which(place == as.character(wanted[i]))

text(east[ii], nn[ii], as.character(place[ii]), cex = 0.6) }


Drawing mathematical functions


curve(x^3-3*x, -2, 2)



Here is the more cumbersome code to do the same thing using plot:

x<-seq(-2,2,0.01)

y<-x^3-3*x

plot(x,y,type="l")



You can easily add extra graphical objects to plots:

rect rectangles

arrows arrows and bars

polygon more complicated straight-sided shapes


plot(0:10,0:10,xlab="",ylab="",xaxt="n",yaxt="n",type="n")

rect(6,6,9,9)##### rect(xleft, ybottom, xright, ytop)


corners<-function(){

coos<-c(unlist(locator(1)),unlist(locator(1)))

rect(coos[1],coos[2],coos[3],coos[4])

}

corners()



Drawing arrows as in the diagram above is straightforward. The syntax for the arrows

function to draw a line from the point (x0, y0) to the point (x1, y1) with the arrowhead, by

default, at the ‘second’ end (x1, y1) is:

### arrows(x0, y0, x1, y1) ###


arrows(1,1,3,8) ### Thus, to draw an arrow from (1,1) to (3,8) with the head at (3,8) type


arrows(1,9,5,9,code=3) ### A double-headed arrow from (1,9) to (5,9) is produced by adding code=3


arrows(4,1,4,6,code=3,angle=90)#### A vertical bar with two square ends (e.g. like an error bar) uses angle = 90 instead of the default angle = 30)


click.arrows<-function(){

coos<-c(unlist(locator(1)),unlist(locator(1)))

arrows(coos[1],coos[2],coos[3],coos[4])

}

click.arrows()


locations<-locator(6)

class(locations)

locations$x

locations$y


Now we draw the lavender-coloured polygon like this

polygon(locations,col="lavender")


xv<-seq(-3,3,0.01)

yv<-dnorm(xv)

plot(xv,yv,type="l")

Then fill the area to the left of xv≤−1 in red:

polygon(c(xv[xv<=-1],-1),c(yv[xv<=-1],yv[xv==-3]),col="red")



Smooth curves


yA =482xe0_045x_ yB =518xe0_055x_

The first decision to be made is the range of x values for the plot. In our case this is easy

because we know from the literature that the minimum value of x is 0 and the maximum

value of x is 100. Next we need to generate about 100 values of x at which to calculate and

plot the smoothed values of y:


xv<-0:100


Next, calculate vectors containing the values of yA and yB at each of these x values:


yA<-482*xv*exp(-0.045*xv)


yB<-518*xv*exp(-0.055*xv)



plot(c(xv,xv),c(yA,yB),xlab="stock",ylab="recruits",type="n")




We want to draw the smooth curve for yA as a solid blue line (lty = 1, col = "blue"),

lines(xv,yA,lty=1,col="blue")

and the curve for yB as a navy blue dotted line (lty = 2, col = "navy"),

lines(xv,yB,lty=2,col="navy")



Fitting non-linear parametric curves through a scatterplot


Here is a set of data showing a response variable, y (recruits to a fishery), as a function of

a continuous explanatory variable x (the size of the fish stock):

rm(x,y)

info<-read.table("plotfit.txt",header=T)

attach(info)

names(info)


plot(x,y,xlab="stock",ylab="recruits",pch=16)



model<-nls(y~a*x*exp(-b*x),start=list(a=500,b=0.05))


lines(xv,predict(model,list(x=xv)),lty=2)





Fitting non-parametric curves through a scatterplot

lowess (a non-parametric curve fitter);

loess (a modelling tool);

gam (fits generalized additive models; p. 611);

lm for polynomial regression (fit a linear model involving powers of x).



data<-read.table("jaws.txt",header=T)

attach(data)

names(data)



par(mfrow=c(1,1))

Two plots side by side is

par(mfrow=c(1,2))

A panel of four plots in a 2×2 square is

par(mfrow=c(2,2))



par(mfrow=c(2,2))

plot(age,bone,pch=16)

text(45,20,"lowess",pos=2)

lines(lowess(age,bone))


plot(age,bone,pch=16)

text(45,20,"loess",pos=2)

model<-loess(bone~age)

xv<-0:50

yv<-predict(model,data.frame(age=xv))

lines(xv,yv)


plot(age,bone,pch=16)

text(45,20,"gam",pos=2)

library(mgcv)

model<-gam(bone~s(age))

yv<-predict(model,list(age=xv))

lines(xv,yv)


plot(age,bone,pch=16)

text(45,20,"polynomial",pos=2)

model<-lm(bone~age+I(age^2)+I(age^3))

yv<-predict(model,list(age=xv))

lines(xv,yv)



Joining the dots


smooth<-read.table("smoothing.txt",header=T)

attach(smooth)

names(smooth)



If you do not order the x values, and just use the lines function, this is what happens:

plot(x,y,pch=16)

lines(x,y)

Categorical variables are factors with two or more levels. Our first example

uses the factor called month (with levels 1 to 12) to investigate weather patterns at

Silwood Park:


weather<-read.table("SilwoodWeather.txt",header=T)

attach(weather)

names(weather)



month<-factor(month)

plot(month,upper)

plot(month,upper,ylab="daily maximum temperature",xlab="month")



trial<-read.table("compexpt.txt",header=T)

attach(trial)

names(trial)

plot(clipping,biomass,xlab="treatment",ylab="yield")




Boxplots with notches to indicate significant differences


The size of the notch increases with the magnitude of the interquartile range and declines with

the square root of the replication, like this:

where IQR is the interquartile range and n is the replication per sample.



par(mfrow=c(1,2))

boxplot(biomass~clipping)

boxplot(biomass~clipping,notch=T)


means<-tapply(biomass,clipping,mean)


par(mfrow=c(1,1))

barplot(means,xlab="treatment",ylab="yield")




Plots for multiple comparisons


data<-read.table("box.txt",header=T)

attach(data)

names(data)


plot(response~factor(fact))


index<-order(tapply(response,fact,mean))

ordered<-factor(rep(index,rep(20,8)))

boxplot(response~ordered,notch=T,names=as.character(index),

xlab="ranked treatments",ylab="response")

model<-aov(response~factor(fact))

summary(model)

plot(TukeyHSD(model))





Plots for Single Samples


Histograms


hist(rpois(1000,1.7),

main="",xlab="random numbers from a Poisson with mean 1.7")


hist(rpois(1000,1.7),breaks=seq(-0.5,9.5,1),

main="",xlab="random numbers from a Poisson with mean 1.7")


Overlaying histograms with smooth density functions


y<-rnbinom(158,mu=1.5,size=1)

bks<- -0.5:(max(y)+0.5)

hist(y,bks,main="")




To get the best fit of a density function for this histogram we should estimate the

parameters of our particular sample of negative binomially distributed counts:

mean(y)

var(y)

mean(y)^2/(var(y)-mean(y))



xs<-0:11

ys<-dnbinom(xs,size=1.2788,mu=1.772)

lines(xs,ys*158)



Density estimation for continuous variables



library(MASS)

attach(faithful)

(max(eruptions)-min(eruptions))/(2*(1+log(length(eruptions),base=2)))


par(mfrow=c(1,2))

hist(eruptions,15,freq=FALSE,main="",col=27)

lines(density(eruptions,width=0.6,n=200))

truehist(eruptions,nbins=15,col=27)

lines(density(eruptions,n=200))



Index plots


response<-read.table("das.txt",header=T)

plot(response$y)


which(response$y > 15)

response$y[50]

response$y[50]<-2.179386


plot(response$y)



Time series plots


data(UKLungDeaths)

ts.plot(ldeaths, mdeaths, fdeaths, xlab="year", ylab="deaths", lty=c(1:3))



data(sunspots)

plot(sunspots)


class(sunspots)

is.ts(sunspots)



Pie charts


data<-read.csv("piedata.csv",header=T)

data

pie(data$amounts,labels=as.character(data$names))



The stripchart function


data(OrchardSprays)

with(OrchardSprays,

stripchart(decrease ~ treatment,

ylab = "decrease", vertical = TRUE, log = "y"))



Plots with multiple variables



The pairs function

ozonedata<-read.table("c:\\temp\\ozone.data.txt",header=T)

attach(ozonedata)

names(ozonedata)


pairs(ozonedata,panel=panel.smooth)



The coplot function


coplot(ozone~wind|temp,panel = panel.smooth)


Interaction plots


yields<-read.table("splityield.txt",header=T)

attach(yields)

names(yields)


interaction.plot(fertilizer,irrigation, yield)





Special Plots


Trellis graphics


data<-read.table("panels.txt",header=T)

attach(data)

names(data)


xyplot(weight ~ age|gender)


By using function below generate 5 graph with “panels.txt”


barchart for barplots;

bwplot for box-and-whisker plots;

densityplot for kernel density plots;

dotplot for dot plots;

histogram for panels of histograms;

qqmath for quantile plots against mathematical distributions;

stripplot for a one-dimensional scatterplot;

qq for a QQ plot for comparing two distributions;

xyplot for a scatterplot;

levelplot for creating level plots (similar to image plots);

contourplot for contour plots;

cloud for three-dimensional scatterplots;

wireframe for 3D surfaces (similar to persp plots);

splom for a scatterplot matrix;

parallel for creating parallel coordinate plots;

rfs to produce a residual and fitted value plot (see also oneway);

tmd for a Tukey mean–difference plot.



data<-read.table("daphnia.txt",header=T)

attach(data)

names(data)



library(lattice)

trellis.par.set(col.whitebg())

bwplot(Growth.rate~Water+Daphnia|Detergent)




Design plots


plot.design(Growth.rate~Water*Detergent*Daphnia)


plot.design(Growth.rate~Water*Detergent*Daphnia,fun="sd")



Effect sizes


install.packages("effects")

library(effects)

model<-lm(Growth.rate~Water*Detergent*Daphnia)

daph.effects<-all.effects(model)

plot(daph.effects,"Water:Detergent:Daphnia")




Bubble plots

ddd<-read.table("pgr.txt",header=T)

attach(ddd)

names(ddd)

bubble.plot(hay,pH,FR)



Plots with many identical values


numbers<-read.table("longdata.txt",header=T)

attach(numbers)

names(numbers)


sunflowerplot(xlong,ylong)