\documentclass[11pt]{article} % \VignetteIndexEntry{Getting Started with Spatstat} <>= options(SweaveHooks=list(fig=function() par(mar=c(1,1,1,1)))) @ \usepackage{graphicx} \usepackage{anysize} \marginsize{2cm}{2cm}{2cm}{2cm} \newcommand{\pkg}[1]{\texttt{#1}} \newcommand{\bold}[1]{{\textbf {#1}}} \newcommand{\R}{{\sf R}} \newcommand{\spst}{\pkg{spatstat}} \newcommand{\Spst}{\pkg{Spatstat}} \begin{document} \bibliographystyle{plain} \thispagestyle{empty} \SweaveOpts{eps=TRUE} \setkeys{Gin}{width=0.6\textwidth} <>= library(spatstat) spatstat.options(image.colfun=function(n) { grey(seq(0,1,length=n)) }) sdate <- read.dcf(file = system.file("DESCRIPTION", package = "spatstat"), fields = "Date") sversion <- read.dcf(file = system.file("DESCRIPTION", package = "spatstat"), fields = "Version") options(useFancyQuotes=FALSE) @ \title{Getting started with \texttt{spatstat}} \author{Adrian Baddeley, Rolf Turner and Ege Rubak} \date{For \spst\ version \texttt{\Sexpr{sversion}}} \maketitle Welcome to \spst, a package in the \R\ language for analysing spatial point patterns. This document will help you to get started with \spst. It gives you a quick overview of \spst, and some cookbook recipes for doing basic calculations. \section*{What kind of data does \spst\ handle?} \Spst\ is mainly designed for analysing \emph{spatial point patterns}. For example, suppose you are an ecologist studying plant seedlings. You have pegged out a $10 \times 10$ metre rectangle for your survey. Inside the rectangle you identify all the seedlings of the species you want, and record their $(x,y)$ locations. You can plot the $(x,y)$ locations: <>= data(redwood) plot(redwood, pch=16, main="") @ This is a \emph{spatial point pattern} dataset. Methods for analysing this kind of data are summarised in the highly recommended book by Diggle \cite{digg03}, or our own book \cite{baddrubaturn15}, or other references in the bibliography below. \nocite{handbook10,bivapebegome08} Alternatively the points could be locations in one dimension (such as road accidents recorded on a road network) or in three dimensions (such as cells observed in 3D microscopy). You might also have recorded additional information about each seedling, such as its height, or the number of fronds. Such information, attached to each point in the point pattern, is called a \emph{mark} variable. For example, here is a stand of pine trees, with each tree marked by its diameter at breast height (dbh). The circle radii represent the dbh values (not to scale). <>= data(longleaf) plot(longleaf, main="") @ You might also have recorded supplementary data, such as the terrain elevation, which might serve as explanatory variables. These data can be in any format. \Spst\ does not usually provide capabilities for analysing such data in their own right, but \spst\ does allow such explanatory data to be taken into account in the analysis of a spatial point pattern. \Spst\ is \underline{\bf not} designed to handle point data where the $(x,y)$ locations are fixed (e.g.\ temperature records from the state capital cities in Australia) or where the different $(x,y)$ points represent the same object at different times (e.g.\ hourly locations of a tiger shark with a GPS tag). These are different statistical problems, for which you need different methodology. \section*{What can \spst\ do?} \Spst\ supports a very wide range of popular techniques for statistical analysis for spatial point patterns, for example \begin{itemize} \item kernel estimation of density/intensity \item quadrat counting and clustering indices \item detection of clustering using Ripley's $K$-function \item spatial logistic regression \item model-fitting \item Monte Carlo tests \end{itemize} as well as some advanced statistical techniques. \Spst\ is one of the largest packages available for \R, containing over 1000 commands. It is the product of 25 years of software development by leading researchers in spatial statistics. \section*{How do I start using \spst?} \begin{enumerate} \item Install \R\ on your computer \begin{quote} Go to \texttt{r-project.org} and follow the installation instructions. \end{quote} \item Install the \spst\ package in your \R\ system \begin{quote} Start \R\ and type \verb!install.packages("spatstat")!. If that doesn't work, go to \texttt{r-project.org} to learn how to install Contributed Packages. \end{quote} \item Start \R\ \item Type \texttt{library(spatstat)} to load the package. \item Type \texttt{help(spatstat)} for information. \end{enumerate} \section*{How do I get my data into \spst?} <>= data(finpines) mypattern <- unmark(finpines) mydata <- round(as.data.frame(finpines), 2) @ Here is a cookbook example. Suppose you've recorded the $(x,y)$ locations of seedlings, in an Excel spreadsheet. You should also have recorded the dimensions of the survey area in which the seedlings were mapped. \begin{enumerate} \item In Excel, save the spreadsheet into a comma-separated values (CSV) file. \item Start \R\ \item Read your data into \R\ using \texttt{read.csv}. \begin{quote} If your CSV file is called \texttt{myfile.csv} then you could type something like <>= mydata <- read.csv("myfile.csv") @ to read the data from the file and save them in an object called \texttt{mydata} (or whatever you want to call it). You may need to set various options inside the \texttt{read.csv()} command to get this to work for your file format: type \texttt{help(read.csv)} for information. \end{quote} \item Check that \texttt{mydata} contains the data you expect. \begin{quote} For example, to see the first few rows of data from the spreadsheet, type <<>>= head(mydata) @ To select a particular column of data, you can type \texttt{mydata[,3]} to extract the third column, or \verb!mydata$x! to extract the column labelled \texttt{x}. \end{quote} \item Type \texttt{library(spatstat)} to load the \spst\ package \item Now convert the data to a point pattern object using the \spst\ command \texttt{ppp}. \begin{quote} Suppose that the \texttt{x} and \texttt{y} coordinates were stored in columns 3 and 7 of the spreadsheet. Suppose that the sampling plot was a rectangle, with the $x$ coordinates ranging from 100 to 200, and the $y$ coordinates ranging from 10 to 90. Then you would type <>= mypattern <- ppp(mydata[,3], mydata[,7], c(100,200), c(10,90)) @ The general form is <>= ppp(x.coordinates, y.coordinates, x.range, y.range) @ Note that this only stores the seedling locations. If you have additional columns of data (such as seedling height, seedling sex, etc) these can be added as \emph{marks}, later. \end{quote} \item Check that the point pattern looks right by plotting it: <>= plot(mypattern) @ \item Now you are ready to do some statistical analysis. Try the following: \begin{itemize} \item Basic summary of data: type <>= summary(mypattern) @ \item Ripley's $K$-function: <>= options(SweaveHooks=list(fig=function() par(mar=rep(4,4)+0.1))) @ <>= plot(Kest(mypattern)) @ For more information, type \texttt{help(Kest)} \item Envelopes of $K$-function: <>= plot(envelope(mypattern,Kest)) @ <>= env <- envelope(mypattern,Kest, nsim=39) @ <>= plot(env, main="envelope(mypattern, Kest)") @ <>= options(SweaveHooks=list(fig=function() par(mar=c(1,1,1,1)))) @ For more information, type \texttt{help(envelope)} \item kernel smoother of point density: <>= plot(density(mypattern)) @ For more information, type \texttt{help(density.ppp)} \end{itemize} \item Next if you have additional columns of data recording (for example) the seedling height and seedling sex, you can add these data as \emph{marks}. Suppose that columns 5 and 9 of the spreadsheet contained such values. Then do something like <>= marks(mypattern) <- mydata[, c(5,9)] @ <>= mypattern <-finpines @ Now you can try things like the kernel smoother of mark values: <>= plot(Smooth(mypattern)) @ \setkeys{Gin}{width=0.8\textwidth} <>= plot(Smooth(mypattern, sigma=1.2), main="Smooth(mypattern)") @ \setkeys{Gin}{width=0.4\textwidth} \item You are airborne! Now look at the book \cite{baddrubaturn15} for more hints. \end{enumerate} \section*{How do I find out which command to use?} Information sources for \spst\ include: \begin{itemize} \item the Quick Reference guide: a list of the most useful commands. \begin{quote} To view the quick reference guide, start \R, then type \texttt{library(spatstat)} and then \texttt{help(spatstat)}. Alternatively you can download a pdf of the Quick Reference guide from the website \texttt{www.spatstat.org} \end{quote} \item online help: \begin{quote} The online help files are useful --- they give detailed information and advice about each command. They are available when you are running \spst. To get help about a particular command \texttt{blah}, type \texttt{help(blah)}. There is a graphical help interface, which you can start by typing \texttt{help.start()}. Alternatively you can download a pdf of the entire manual (1000 pages!) from the website \texttt{www.spatstat.org}. \end{quote} \item vignettes: \begin{quote} \Spst\ comes installed with several `vignettes' (introductory documents with examples) which can be accessed using the graphical help interface. They include a document about \texttt{Handling shapefiles}. \end{quote} \item book: \begin{quote} Our book \cite{baddrubaturn15} contains a complete course on \texttt{spatstat}. \end{quote} \item website: \begin{quote} Visit the \spst\ package website \texttt{www.spatstat.org} \end{quote} \item forums: \begin{quote} Join the forum \texttt{R-sig-geo} by visiting \texttt{r-project.org}. Then email your questions to the forum. Alternatively you can ask the authors of the \spst\ package (their email addresses are given in the package documentation). \end{quote} \end{itemize} \begin{thebibliography}{10} % \bibitem{badd10wshop} % A. Baddeley. % \newblock Analysing spatial point patterns in {{R}}. % \newblock Technical report, CSIRO, 2010. % \newblock Version 4. % \newblock URL \texttt{https://research.csiro.au/software/r-workshop-notes/} % \bibitem{baddrubaturn15} A. Baddeley, E. Rubak, and R. Turner. \newblock {\em Spatial Point Patterns: Methodology and Applications with {{R}}}. \newblock Chapman \& Hall/CRC Press, 2015. \bibitem{bivapebegome08} R. Bivand, E.J. Pebesma, and V. G{\'{o}}mez-Rubio. \newblock {\em Applied spatial data analysis with {R}}. \newblock Springer, 2008. \bibitem{cres93} N.A.C. Cressie. \newblock {\em Statistics for Spatial Data}. \newblock {John Wiley and Sons}, {New York}, second edition, 1993. \bibitem{digg03} P.J. Diggle. \newblock {\em Statistical Analysis of Spatial Point Patterns}. \newblock Hodder Arnold, London, second edition, 2003. \bibitem{fortdale05} M.J. Fortin and M.R.T. Dale. \newblock {\em Spatial analysis: a guide for ecologists}. \newblock Cambridge University Press, Cambridge, UK, 2005. \bibitem{fothroge09handbook} A.S. Fotheringham and P.A. Rogers, editors. \newblock {\em The {SAGE} {H}andbook on {S}patial {A}nalysis}. \newblock SAGE Publications, London, 2009. \bibitem{gaetguyo09} C. Gaetan and X. Guyon. \newblock {\em Spatial statistics and modeling}. \newblock Springer, 2009. \newblock Translated by Kevin Bleakley. \bibitem{handbook10} A.E. Gelfand, P.J. Diggle, M. Fuentes, and P. Guttorp, editors. \newblock {\em Handbook of Spatial Statistics}. \newblock CRC Press, 2010. \bibitem{illietal08} J. Illian, A. Penttinen, H. Stoyan, and D. Stoyan. \newblock {\em Statistical Analysis and Modelling of Spatial Point Patterns}. \newblock John Wiley and Sons, Chichester, 2008. \bibitem{mollwaag04} J. M{\o}ller and R.P. Waagepetersen. \newblock {\em Statistical Inference and Simulation for Spatial Point Processes}. \newblock Chapman and Hall/CRC, Boca Raton, 2004. \bibitem{pfeietal08} D.U. Pfeiffer, T. Robinson, M. Stevenson, K. Stevens, D. Rogers, and A. Clements. \newblock {\em Spatial analysis in epidemiology}. \newblock Oxford University Press, Oxford, UK, 2008. \bibitem{wallgotw04} L.A. Waller and C.A. Gotway. \newblock {\em Applied spatial statistics for public health data}. \newblock Wiley, 2004. \end{thebibliography} \end{document}