--- title: "Plotting with the `distfreereg` Package" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Plotting with the `distfreereg` Package} %\VignetteEngine{knitr::rmarkdown_notangle} %\VignetteEncoding{UTF-8} linkcolor: blue link-citations: true csl: american-statistical-association.csl bibliography: bibliography.bib --- \def\eqdef{\mathrel{\raise0.4pt\hbox{$:$}\hskip-2pt=}} ```{r setup, include = FALSE} knitr::opts_chunk$set(echo = TRUE, out.width = "75%", fig.dim = c(6,5), fig.align = "center") is_check <- ("CheckExEnv" %in% search()) || any(c("_R_CHECK_TIMINGS_", "_R_CHECK_LICENSE_") %in% names(Sys.getenv())) knitr::opts_chunk$set(eval = !is_check) library(distfreereg) ``` # Introduction Plotting methods for `distfreereg` and `compare` objects are intended to provide sensible defaults, almost all of which can be modified easily as needed. While the examples below are separated into `distfreereg` and `compare` classes, the way modifications are made in each case is similar. # Plotting `distfreereg` Objects The following code creates the `distfreereg` object with a fairly simple setup that is used in most of the examples that follow. ```{r dfr, message = FALSE} set.seed(20240214) n <- 1e2 true_mean <- function(X, theta) theta[1] + theta[2]*X[,1] theta <- c(2,5) X <- matrix(runif(n, min = 1, max = 100)) Y <- true_mean(X, theta) + rnorm(n) dfr <- distfreereg(Y = Y, X = X, test_mean = true_mean, covariance = list(Sigma = 1), theta_init = rep(1, length(theta))) ``` ## Density Plots The default plot displays the estimated density of the simulated statistics: ```{r} plot(dfr) ``` The estimated density of the simulated statistic is plotted, as is a vertical line showing the observed statistic. The upper tail is shaded, and the p-value (the area of the shaded region) is shown. A 95% simultaneous confidence band for the estimated density function is plotted, as well. ### Modifying the P-Value Label The plot above is not ideal because the p-value label overlaps the density curve. The specifications for this label can be modified using the `text_args` argument, the elements of which are passed to `graphics::text()`. The default placement is vertically centered on the left side of the vertical line. By using the `text_args` argument, the text can be printed on the right side of the line and shifted down a bit. Note that only the elements to change need to be specified in the list supplied to `text_args`. ```{r} plot(dfr, text_args = list(adj = c(0, 0.5), y = 0.5)) ``` One special value, `text_args = FALSE`, prevents the label from being printed. ```{r} plot(dfr, text_args = FALSE) ``` ### Selecting the Statistic to Plot The `stat` argument can be used to produce the corresponding plot for the CvM statistic. ```{r} plot(dfr, stat = "CvM", text_args = FALSE) ``` ### Modifying the Density Calculation and Plotting The calculation of the density curve can be modified using `density_args`, whose elements are passed to `density()`. To modify the appearance of the curve once the density curve has been estimated, pass arguments to `plot.default()` via the `...` argument. The example below shows how to modify the line type. ```{r} plot(dfr, text_args = FALSE, lty = 2) ``` ### Modifying the P-Value Line and Shading The appearance of the vertical line can be modified using the `abline_args` argument, whose elements are passed to `abline()`. As with the text label, the line can be omitted by setting `abline_args` equal to `FALSE`. ```{r} plot(dfr, text_args = FALSE, abline_args = FALSE) ``` The shading under the curve is produced by calling `polygon()`, and can be modified by passing arguments to that function via `polygon_args`. It can also be omitted by setting `polygon_args = FALSE`. ```{r} plot(dfr, text_args = FALSE, polygon_args = list(density = 10)) ``` As a convenience, the shading color can be changed using the `shade_col` argument. This is equivalent to modifying the `col` argument of `polygon()`. ```{r} plot(dfr, text_args = FALSE, shade_col = rgb(0.5, 0.5, 0.8, 0.5)) ``` ### Borders The default behavior is to omit the border of the shaded region. As seen above, this is notable when the vertical line is omitted. This can be changed by setting `border = NULL`, its default value in `polygon()`, which (usually) results in a border. ```{r} plot(dfr, text_args = FALSE, abline_args = FALSE, polygon_args = list(border = NULL)) ``` ### Output In case the values used to create a plot are useful for further calculation, each call to `plot.distfreereg()` invisibly returns these values in a list with either two or three elements. The first two elements, `x` and `y`, contain the coordinates of the curve. If confidence bands are plotted, then a third element named `confband` is also returned. ```{r distfreereg_returned_values} output <- plot(dfr, confband_args = NULL, text_args = FALSE) names(output) ``` The $y$-values of the curves that determine the confidence band are saved in the `cb_lower` and `cb_upper` elements, while the $x$-values are saved in `w`. ```{r} names(output$confband) ``` ## Diagnostic Plots Below are examples of two diagnostic plots available through `plot.distfreereg()`. ### Residual Plots A useful diagnostic plot displays the transformed residuals ordered according to the `res_order` element of the `distfreereg` object. ```{r residual_plot} plot(dfr, which = "residuals") ``` As with the density plot, all options can be modified if needed by including additional arguments for `plot()`. ```{r modified_residual_plot} plot(dfr, which = "residuals", main = "New Title", lty = "dashed") ``` ### Empirical Partial Sum Process Plots Another useful diagnostic plot displays the values of the empirical partial sum process, which can be modified as expected. ```{r epsp_plot} plot(dfr, which = "epsp") plot(dfr, which = "epsp", xlab = "i", col = "red") ``` # Plotting `compare` Objects Most of the examples of plot modifications apply to `compare` objects, as well. These are neverthless illustrated explicitly below. ## Setup The following code creates the `compare` object with a fairly simple setup that is used in all of the examples that follow. ```{r reference_object, message = FALSE} set.seed(20240920) n <- 100 func <- function(X, theta) theta[1] + theta[2]*X[,1] + theta[3]*X[,2] theta <- c(2,5,-1) X <- matrix(rexp(2*n), nrow = n) cdfr <- compare(theta = theta, true_mean = func, test_mean = func, true_X = X, true_covariance = list(Sigma = 3), X = X, covariance = list(Sigma = 3), prog = Inf, theta_init = rep(1, length(theta))) ``` ## CDF Plots By default, `plot.compare()` displays the graphs of the estimated cumulative distribution functions of the observed and simulated statistics. ```{r cdf} plot(cdfr) ``` ### Curves The appearance of the function curves can be modified using the `curve_args` argument. Note that this passes values to `lines()`, not `curve()`. The value of `curve_args` must be a list. If an argument of `lines()` is an element of this list, then its value is passed to the calls for both curves. For example, the width of both curves can be changed as follows. ```{r modify_curves} plot(cdfr, curve_args = list(lwd = 3)) ``` To change a property of only one curve, two special (named) elements of this list are available: "`obs`" and "`mcsim`". Each of these, if present, must be a list. Their elements are passed to the `lines()` call of the corresponding curve. The following example shows how to change the thickness of both curves but the style of only the observed statistics curve. ```{r modify_curves_differently} plot(cdfr, curve_args = list(lwd = 3, obs = list(lty = 4))) ``` ### Legend The argument `legend` can be used to modify the default behavior of the legend. ```{r modify_legend} plot(cdfr, legend = list(title = "A Title", bg = "grey")) ``` While not recommended, it can be omitted by setting `legend` to "`FALSE`". ```{r omit_legend} plot(cdfr, legend = FALSE) ``` ### Horizontal Lines The horizontal dashed lines are plotted by default to mimic the default behavior of `plot.ecdf()`. These can be modified using the `hlines` argument, whose value is a list of arguments to pass to `abline()`. ```{r modify_horizontal_lines} plot(cdfr, hlines = list(lty = 1, lwd = 3)) ``` ### Confidence Bands Confidence bands are not plotted by default, but they can be included with default values as follows. ```{r confidence_bands} plot(cdfr, confband_args = NULL) ``` ## Density Plots Estimated density curves can be plotted using the `which` argument. ```{r density} plot(cdfr, which = "dens") ``` The area under each density curve is shaded by default. These can be modified, either together or separately, as can be done with `curve_args`. The example below shows how to change the density of the shading for both curves but the color and angle only of the observed statistics curve. ```{r modify_shading} plot(cdfr, which = "dens", poly = list(density = 20, obs = list(col = rgb(0.5,0.2,0.2,0.2), angle = -45))) ``` Other options, such as `curve_args`, operate here as described above in the discussion of CDF plotting. ## Q--Q Plots The other plots available are Q--Q plots: one compares observed and simulated statistics, and the other compares p-values to uniform quantiles. ```{r qqplots} plot(cdfr, which = "qq") plot(cdfr, which = "qqp") ``` Both of these plots accept optional lists of arguments to pass to `qqplot()` ```{r modify_qqplots} plot(cdfr, which = "qq", conf.level = 0.95) ``` The diagonal line can be modified using the `qqline` argument, whose elements are passed to `abline()`. ```{r modify_diagonal_line} plot(cdfr, which = "qqp", qqline = list(lwd = 3)) ``` ## Multiple `compare` Objects It might be useful to compare the observed statistics in two `compare` objects. This can be done by supplying both objects to `compare()`. To illustrate, we first create a second `compare` object: ```{r second_object, message = FALSE} set.seed(20240920) n <- 100 func <- function(X, theta) theta[1] + theta[2]*X[,1] theta <- c(7,3) X <- matrix(rexp(n), nrow = n) cdfr2 <- compare(theta = theta, true_mean = func, test_mean = func, true_X = X, true_covariance = list(Sigma = 3), X = X, covariance = list(Sigma = 3), prog = Inf, theta_init = rep(1, length(theta))) ``` The following call compares the observed statistics from the two `compare` objects. ```{r compare_two_objects} plot(cdfr, cdfr2) ``` ## Output In case the values used to create a plot are useful for further calculation, each call to `plot.compare()` invisibly returns these values in a list. The Q--Q plots both return the value returned by `qqplot()` itself. The other plots return values corresponding to their curves, including confidence bands, if plotted. ```{r compare_returned_values} output <- plot(cdfr, confband_args = NULL) names(output) ``` The $y$-values of the curves that determine the confidence band are saved in the `cb_lower` and `cb_upper` elements, while the $x$-values are saved in `w`. ```{r} names(output$confband_observed) ```