\documentclass[nojss]{jss} %\documentclass[article]{jss} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % \usepackage{amssymb}% % \newcommand{\N}{\mathbb N} \newcommand{\R}{\mathbb R} \newcommand{\B}{\mathbb B} % \newcommand\smkreis[1]{{(M#1)}} % ------------------------------------------------------------------------------- \SweaveOpts{keep.source=TRUE} % ------------------------------------------------------------------------------- %\VignetteIndexEntry{R Package distrMod: S4 Classes and Methods for Probability Models} %\VignetteDepends{distr,distrEx,distrMod} %\VignetteKeywords{probability models, minimum criterion estimators, maximum likelihood estimators, minimum distance estimators, S4 classes, S4 methods} %\VignettePackage{distrMod} %% almost as usual \author{Matthias Kohl\\FH Furtwangen\And Peter Ruckdeschel\\ Carl von Ossietzky University, Oldenburg}%Fraunhofer ITWM Kaiserslautern} \title{\proglang{R} Package~\pkg{distrMod}: \proglang{S}4 Classes and Methods for Probability Models} %% for pretty printing and a nice hypersummary also set: \Plainauthor{Matthias Kohl, Peter Ruckdeschel} %% comma-separated \Plaintitle{R Package distrMod: S4 Classes and Methods for Probability Models} %% without formatting \Shorttitle{R Package distrMod} %% a short title (if necessary) %% an abstract and keywords \Abstract{ This vignette is published as \citet{distrMod}. Package~\pkg{distrMod} provides an object oriented (more specifically \proglang{S}4-style) implementation of probability models. Moreover, it contains functions and methods to compute minimum criterion estimators -- in particular, maximum likelihood and minimum distance estimators. 
} \Keywords{probability models, minimum criterion estimators, minimum distance estimators, maximum likelihood estimators, \proglang{S}4 classes, \proglang{S}4 methods} \Plainkeywords{probability models, minimum criterion estimators, maximum likelihood estimators, minimum distance estimators, S4 classes, S4 methods} %% without formatting %% at least one keyword must be supplied %% publication information %% NOTE: Typically, this can be left commented and will be filled out by the technical editor %% \Volume{13} %% \Issue{9} %% \Month{September} %% \Year{2004} %% \Submitdate{2004-09-29} %% \Acceptdate{2004-09-29} %% The address of (at least) one author should be given %% in the following format: \Address{ Matthias Kohl\\ Hochschule Furtwangen\\ Fakult\"at Maschinenbau und Verfahrenstechnik\\ Jakob-Kienzle-Strasse 17\\ 78054 Villingen-Schwenningen, Germany \\ E-mail: \email{Matthias.Kohl@hs-furtwangen.de}\\ \bigskip \\ Peter Ruckdeschel\\ Institute for Mathematics\\ School of Mathematics and Sciences\\ Carl von Ossietzky University Oldenburg, Postfach 25 03\\ 26111 Oldenburg, Germany\\ E-mail: \email{peter.ruckdeschel@uni-oldenburg.de} } %% It is also possible to add a telephone and fax number %% before the e-mail in the following format: %% Telephone: +43/1/31336-5053 %% Fax: +43/1/31336-734 %% for those who use Sweave please include the following line (with % symbols): %% need no \usepackage{Sweave.sty} %% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} %---------------------------------------------------------------------------- <>= library(distrMod) distrModoptions(show.details="minimal") options(prompt = "R> ", continue = "+ ", width = 70, useFancyQuotes = FALSE) @ %---------------------------------------------------------------------------- \section{Introduction} % ----------------------------------------------------------------------------- \subsection[Aims of package distrMod]{Aims of package \pkg{distrMod}} % 
-----------------------------------------------------------------------------
\textbf{What is \pkg{distrMod}? }
It is an extension package for the statistical software \proglang{R} \citep{RCore} and is the latest member of a family of packages which we call the {\tt distr} family. The family so far consists of packages~\pkg{distr}, \pkg{distrEx}, \pkg{distrEllipse}, \pkg{distrSim}, \pkg{distrTEst}, \pkg{distrTeach}, and \pkg{distrDoc}; see~\citet{distr} and \citet{distrTeach}.\\
Package~\pkg{distrMod} makes extensive use of the distribution classes of package~\pkg{distr} as well as the functions and methods of package~\pkg{distrEx}. Its purpose is to extend support in base \proglang{R} for distributions and in particular for parametric modelling by an ``object oriented'' implementation of probability models via several new \proglang{S}4 classes and methods; see Section~\ref{OOPinR} and \citet{Ch98} for more details. In addition, it includes functions and methods to compute minimum criterion estimators -- in particular, maximum likelihood (ML[E]) (i.e.\ minimum negative log-likelihood) and minimum distance estimators (MDE).\\
Admittedly, \pkg{distrMod} is not the first package to provide infrastructure for ML estimation; we compete in some sense with such prominent functions as \code{fitdistr} from package~\pkg{MASS} \citep{V:R:02} and, already using the \proglang{S}4 paradigm, \code{mle} from package~\pkg{stats4} \citep{RCore}.\\
Our implementation, however, goes beyond the scope of these packages, as we work with distribution objects and have quite general methods available to operate on these objects.

\textbf{Who should use it? }
It is aimed at users who want to use non-standard parametric models, allowing them to either explore these models or fit them to data by non-standard techniques.
The user will receive standardized output on which she/he may apply standard \proglang{R} functions like \code{plot}, \code{show}, \code{confint}, \code{profile}.\\ By non-standard parametric models we mean models not in the list of explicit models covered by \code{fitdistr}; that is, \code{"Poisson"}, \code{"beta"}, \code{"cauchy"}, \code{"chi-squared"}, \code{"exponential"}, \code{"gamma"}, \code{"geometric"}, \code{"lognormal"}, \code{"logistic"}, \code{"negative binomial"}, \code{"normal"}, \code{"f"}, \code{"t"}, \code{"weibull"}. Standard as well as non-standard models can easily be implemented based on the infrastructure provided by packages \pkg{distr} and \pkg{distrEx}. We will demonstrate this using examples $\smkreis{2}$ and $\smkreis{4}$ specified in Section~\ref{examples}.\\ Non-standard techniques may include minimum criterion estimation, minimum distance estimation, a particular optimization routine not covered by \code{optim}/\code{optimize} in the MLE case, or some explicit expression for the MLE not covered by the standard class-room examples. Non-standard techniques may also stand for estimation of a (differentiable) function of the parameter as illustrated in example $\smkreis{3}$.\\ Despite this flexibility, we need not modify our code to cover all this. In short, we are able to implement {\bf one} static algorithm which by \proglang{S}4 method dispatch dynamically takes care of various models and optimization techniques, thus avoiding redundancy and simplifying maintenance. We will explain this more precisely in Section~\ref{OOApproach}.\\ All information relevant for a specific parametric model is grouped within an object of class \code{ParamFamily} or subclasses for which it may for instance be of interest to explore the (perhaps automatically derived, as in the case of example~$\smkreis{2}$) score function and the corresponding Fisher information. 
The return value of the model fit, an estimate of the parameter, is an object of class \code{Estimator} or subclasses for which one may want to have confidence intervals, some profiling, etc. For objects of these classes we provide various methods for standard \proglang{R} functions; see Sections~\ref{modelsec} and \ref{estsec} for more details. \textbf{Availability } The current version of package~\pkg{distrMod} is 2.8 and can be found on the Comprehensive \proglang{R} Archive Network at \url{https://CRAN.R-project.org/package=distrMod}. The development version of the distr-family is located at R-Forge; see~\citet{RForge}. % ----------------------------------------------------------------------------- \subsection{Running examples}\label{examples} % ----------------------------------------------------------------------------- For illustrating the functionality of \pkg{distrMod}, we will use four running examples for each of which we assume i.i.d.\ observations $X_i$ ($i=1,\ldots,n$, $n\in\N$) distributed according to the respective $P_\theta$: \begin{itemize} \item[$\smkreis{1}$] the one-dimensional normal location family ${\cal P}:=\{P_\theta\,|\, \theta\in\R\}$ with $P_\theta = {\cal N}(\theta,1)$. This model is $L_2$-differentiable (i.e.\ smoothly parametrized) with scores $\Lambda_\theta(x)=x-\theta$. \item[$\smkreis{2}$] a one-dimensional location and scale family ${\cal P}:=\{P_{\theta}\,|\, \theta=(\mu,\sigma)'\in\R\times(0,\infty)\}$ with some non-standard $P_\theta$. More precisely we assume, \begin{equation} X_i=\mu+\sigma V_i \qquad \mbox{for}\quad V_i\stackrel{\rm i.i.d}{\sim} P \end{equation} where $P=P_{\theta_0}$ ($\theta_0=(0,1)'$) is the following central distribution \begin{equation} P(dx)=p(x)\,dx,\qquad \;\;p(x)\propto e^{-|x|^3} \end{equation} ${\cal P}$ is $L_2$-differentiable with scores $\Lambda_\theta(x)= (3\,{\rm sign}(y) y^2, 3 |y|^3-1)/\sigma$ for $y=(x-\mu)/\sigma$. 
\item[$\smkreis{3}$] the gamma family ${\cal P}:=\{P_{\theta}={\rm gamma}(\theta)\,|\, \theta=(\beta,\xi)' \in(0,\infty)^2\}$ for scale parameter $\beta$ and shape parameter $\xi$. This model is $L_2$-differentiable with scores $\Lambda_\theta(x)= \big(\frac{y - \xi}\beta, \log(y) - (\log \Gamma)'(\xi)\big)$ for $y=x/\beta$ and
\item[$\smkreis{4}$] a censored Poisson family: ${\cal P}:=\{P_{\theta}\,|\, \theta\in(0,\infty)\}$ where $P_{\theta}={\cal L}_\theta(X | X>1)$ for $X \sim {\rm Pois}(\theta)$, that is, we only observe counts larger than or equal to $2$ in a Poisson model. This model is $L_2$-differentiable with scores $\Lambda_\theta(x)= \frac{x}{\theta} - \frac{1- e^{-\theta}}{1-(1+\theta)\, e^{-\theta}}\;$.
\end{itemize}
We will estimate $\theta$ from $X_1,\ldots, X_n$ with mean squared error (MSE) as risk. This makes the MLE asymptotically optimal. Other considerations, in particular robustness issues, suggest that one should also look at alternatives. For the sake of this paper, we will limit ourselves to one alternative in each model. In model~$\smkreis{1}$ we will use the median as most-robust estimator, in model~$\smkreis{2}$ we will look at the very robust estimator $\theta_r=({\rm median},{\rm mad})$ (mad = suitably standardized MAD), while in models~$\smkreis{3}$ and $\smkreis{4}$ we use minimum distance estimators (MDE) with respect to the Cram\'er-von-Mises distance.\\
The four examples were chosen for the following reasons:\newline
In Example~$\smkreis{1}$, nothing has to be redefined. Estimation by MDE or MLE is straightforward: We define an object of class \code{NormLocationFamily} and generate some data.
%----------------------------------------------------------------------------
<>=
(N <- NormLocationFamily(mean = 3))
x <- r(N)(20)
@
%----------------------------------------------------------------------------
We compute the MLE and the Cram\'er-von-Mises MDE using some (preliminary) method for the computation of the asymptotic covariance of the MDE.
%----------------------------------------------------------------------------
<>=
MLEstimator(x, N)
MDEstimator(x, N, distance = CvMDist,
            asvar.fct = distrMod:::.CvMMDCovariance)
@
%----------------------------------------------------------------------------
Example~$\smkreis{2}$ illustrates the use of a ``parametric group model'' in the sense of \citet[Section~1.3, pp.~19--26]{Leh:83}, and as this model is quite non-standard, we use it to demonstrate some capabilities of our generating functions. Example~$\smkreis{3}$ illustrates the use of a predefined \proglang{S}4 class; specifically, class \code{GammaFamily}. In this case there are various equivalent parameterizations, which in our setup can easily be transformed into each other; see Section~\ref{ParamFamP}. Example~$\smkreis{4}$, also available in package~\pkg{distrMod} as demo \code{censoredPois}, illustrates a situation where we have to set up a model completely anew.
% -----------------------------------------------------------------------------
\subsection{Organization of the paper}
% -----------------------------------------------------------------------------
We first explain some aspects of the specific way object orientation (OO) is realized in \proglang{R}. We then present the new model \proglang{S}4 classes and demonstrate how package~\pkg{distrMod} can be used to compute minimum criterion estimators. The global options which may be set in our package and some general programming practices are given in the appendix.
% -----------------------------------------------------------------------------
\section[Object orientation in S4]{Object orientation in \proglang{S}4} \label{OOPinR}
% -----------------------------------------------------------------------------
In \proglang{R}, OO is realized in the \proglang{S}3 class concept as introduced in \citet{Cham:93a,Cham:93b} and by its successor, the \proglang{S}4 class concept, as developed in \citet{Ch98,Ch99,Ch01} and described in detail in \citet{Ch08}. Of course, \citet[Section~5]{RLangDef} may also serve as a reference.\\
An account of some of the differences to standard OO may be found in \citet{ChL01}, \citet{Beng:03}, and \citet{Ch06}.\\
Using the terminology of \citet{Beng:03}, mainstream software engineering (e.g.~\proglang{C++}) uses \emph{COOP\/} (class-object-oriented programming) style whereas the \proglang{S}3/\proglang{S}4 concept of \proglang{R} uses \emph{FOOP\/} (function-object-oriented programming) style or, according to \citet{Ch06}, at least \emph{F+COOP} (i.e.\ both styles).\\
In COOP style, methods providing access to or manipulation of an object are part of the object, while in FOOP style, they are not, but belong to \emph{generic functions\/} -- abstract functions which allow for arguments of varying type/class. A dispatching mechanism then decides at run-time which method best fits the \emph{signature\/} of the function, that is, the types/classes of (a certain subset of) its arguments. \proglang{C++} has a similar concept, ``overloaded functions'', as discussed by \citet[Section 4.6.6]{Stro:92}.
\par
In line with the different design of OO within \proglang{R}, some notions have different names in the \proglang{R} context as well. This is in part justified by slightly different meanings; e.g., members in \proglang{R} are called \emph{slots}, and constructors are called \emph{generating functions}.
In the case of the latter, the notion does mean something similar but not identical to a constructor: a generating function according to \citet{Ch01} is a user-friendly wrapper for a call to \code{new()}, the actual constructor in the \proglang{S}4 system. In general it does not have the same flexibility as the full-fledged constructor in that some calling possibilities will still be reserved to a call to \code{new()}.
\par
Following the (partial) FOOP style of \proglang{R}, we sometimes have to deviate from best practice in mainstream OO, namely documenting the methods of each class hierarchy together as a group. Instead we document the corresponding particular methods in the help file for the corresponding generic.\\
Although the use of OO in the \proglang{R} context will certainly not gain benefits from object identity, information hiding, and encapsulation, the mere use of inheritance and polymorphism does provide advantages:\newline
Polymorphism is a very important feature in interactively used languages, as the user will not have to remember a lot of different function names but instead can say \code{plot} to many different objects of classes among which there need not be any inheritance structure. On the other hand, inheritance makes it possible to have a general (default) code which applies if nothing else is known, while still any user may register his or her own particular method for a derived class, without intervention by the authors of the class and generic function definitions. Of course, this could also be achieved by functional arguments, but using method dispatch we have much more control over the input and output types of the corresponding function. This is important, as common \proglang{R} functions have type checking neither for input arguments nor for return values. In addition to simple type checking we could even impose some refined checking by means of \proglang{S}4 validity checking.
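To make the FOOP style and the dispatching mechanism concrete, consider the following minimal sketch; the class \code{NormLoc} and the generic \code{scores} are made up for illustration and are not part of \pkg{distrMod}:
<<OOsketch, eval = FALSE>>=
## hypothetical example: a generic function plus one registered method
setClass("NormLoc", representation(theta = "numeric"))
setGeneric("scores", function(object, x) standardGeneric("scores"))
setMethod("scores", signature(object = "NormLoc", x = "numeric"),
          function(object, x) x - object@theta)  ## scores of N(theta, 1)
scores(new("NormLoc", theta = 3), c(2, 4))  ## dispatch selects this method
@
The method is not part of the object; it is selected at run-time according to the signature \code{(NormLoc, numeric)}, and a method for another class could be registered later without touching this code.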
% -----------------------------------------------------------------------------
\section[S4 classes: Models and parameters]{\proglang{S}4 classes: Models and parameters}\label{modelsec}
% -----------------------------------------------------------------------------
% -----------------------------------------------------------------------------
\subsection{Model classes}
% -----------------------------------------------------------------------------
\textbf{Models in Statistics and in \proglang{R} }
In Statistics, a probability model, or model for short, is a family of probability distributions; more precisely, it is a subset ${\cal P}\subset {\cal M}_1({\cal A})$ of the set of all probability measures on some sample space $(\Omega,{\cal A})$. In case we are dealing with a parametric model, there is a finite-dimensional parameter domain $\Theta$ (usually an open subset of $\R^k$) and a mapping $\theta\mapsto P_\theta$, assigning to each parameter $\theta\in\Theta$ a corresponding member of the family ${\cal P}$. If this parametrization is smooth, more specifically $L_2$-differentiable, see~\citet[Section~2.3]{Ri94}, we additionally have an $L_2$-derivative $\Lambda_\theta$ for each $\theta\in\Theta$; that is, some random variable (RV) in $L_2(P_\theta)$, and its corresponding (co)variance, the Fisher information ${\cal I}_\theta$. In most cases, $\Lambda_\theta=\frac{d}{d\theta} \log p_\theta$ (the classical scores) for $p_\theta$ the density of $P_\theta$ w.r.t.\ Lebesgue or counting measure.
\par
One of the strengths of \proglang{R} (or more accurately of \proglang{S}) right from the introduction of \proglang{S}3 in \citet{B:C:W:88} is that models, more specifically [generalized] linear models (see functions \code{lm} and \code{glm} in package \pkg{stats}), may be explicitly formulated in terms of the language.
The key advantage of this is the grouping of relevant information, re-usability, and of course the formula interface (see \code{formula} in package \pkg{stats}) by which computations \emph{on} the model are possible in \proglang{S}.\\
From a mathematical point of view however, these models are somewhat incomplete: In the case of \code{lm}, there is an implicit assumption of Gaussian errors, while in the case of \code{glm} only a limited number of explicit families and explicit link functions are ``hard-coded''. So in fact, again the user will not enter any distributional assumption.\\
Other models like the more elementary location and scale family (with general central distribution) so far have not even been implemented.
\par
With our distribution classes available from package \pkg{distr} we go ahead in this direction in package \pkg{distrMod}, although admittedly, we have not yet implemented any regression model or integrated any formula interface; this will hopefully be done in the future.

\textbf{Packages \pkg{distr} and \pkg{distrEx} }
Much of our infrastructure relies on our \proglang{R} packages \pkg{distr} and \pkg{distrEx} available on \href{https://cran.r-project.org/}{\tt CRAN}. Package \pkg{distr}, see~\citet{distr,distrTeach}, aims to provide a conceptual treatment of distributions by means of \proglang{S}4 classes. A mother class \code{Distribution} is introduced with slots for a parameter and for functions \code{r}, \code{d}, \code{p} and \code{q} for simulation, for evaluation of density, c.d.f.\ and quantile function of the corresponding distribution, respectively. All distributions of the \pkg{stats} package are implemented as subclasses of either \code{AbscontDistribution} or \code{DiscreteDistribution}, which themselves are again subclasses of \code{UnivariateDistribution}. As usual in stochastics, we identify distributions with RVs distributed accordingly.
By means of these classes, we may automatically generate new objects of these classes for the laws of RVs under standard univariate mathematical transformations and under standard bivariate arithmetical operations acting on independent RVs. Here is a short example: We create objects of ${\cal N}\,(2,1.69)$ and ${\rm Pois}\,(1.2)$ and convolve an affine transformation of them.
%----------------------------------------------------------------------------
<>=
library(distr)
N <- Norm(mean = 2, sd = 1.3)
P <- Pois(lambda = 1.2)
Z <- 2*N + 3 + P
Z
plot(Z, cex.inner = 0.9)
@
%----------------------------------------------------------------------------
The new distribution has corresponding slots \code{r}, \code{d}, \code{p} and \code{q}.
%----------------------------------------------------------------------------
<>=
p(Z)(0.4)
q(Z)(0.3)
## in RStudio or Jupyter IRKernel, use q.l(.)(.) instead of q(.)(.)
r(Z)(5)
@
%----------------------------------------------------------------------------
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=12cm]{distrMod-exam1.pdf}%
\caption{Plot of \code{Z}, an object of class \code{AbscontDistribution}.}
\end{center}
\end{figure}%
\par
Package~\pkg{distrEx} extends \pkg{distr} by covering statistical functionals like expectation, variance or the median evaluated at distributions, as well as distances between distributions and basic support for multivariate and conditional distributions. E.g., using the distributions generated above, we can write
%----------------------------------------------------------------------------
<>=
library(distrEx)
E(N)
E(P)
E(Z)
E(Z, fun = function(x) sin(x))
@
where \code{E(N)} and \code{E(P)} return the analytic value whereas the last two calls invoke some numerical computations.

\textbf{Models in \pkg{distrMod} }
Based on class \code{Distribution} of package \pkg{distr} and its subclasses we define classes for families of probability measures in package \pkg{distrMod}.
So far, we specialized this to parametric families of probability measures in class \code{ParamFamily}; see~Figure~\ref{figPF}. The concept, however, also allows the derivation of subclasses for other (e.g.\ semiparametric) families of probability measures. In the case of $L_2$-differentiable parametric families we introduce several additional slots for scores $\Lambda_\theta$ and Fisher information ${\cal I}_\theta$. In particular, slot \code{L2deriv} for the score function is of class \code{EuclRandVarList}, a class defined in package~\pkg{RandVar} \citep{RandVar}.
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=12cm]{ProbFamily.pdf}%
\caption{\label{figPF}Inheritance relations and slots of the corresponding \mbox{(sub-)}classes for \code{ProbFamily} where we do not repeat inherited slots.}
\end{center}
\end{figure}%
The mother class \code{ProbFamily} is virtual and objects can only be created for all derived classes.
\par
Class \code{ParamFamily} and all its subclasses have pairs of slots: actual value slots and functional slots, the latter following the COOP paradigm. The actual value slots like \code{distribution}, \code{param}, \code{L2deriv}, and \code{FisherInfo} are used for computations at a certain value of the parameter, while functional slots like \code{modifyParam}, \code{L2deriv.fct}, and \code{FisherInfo.fct} provide mappings $\Theta\to {\cal M}_1(\B)$, $\theta\mapsto P_\theta$, $\Theta \to \bigcup_{\theta\in\Theta} L_2(P_\theta)$, $\theta \mapsto \Lambda_\theta$, and $\Theta \to \R^{k \times k}$, $\theta \mapsto {\cal I}_\theta$, respectively, and are needed to modify the actual parameter of the model, or to move the model from one parameter value to another. The different modifications required after a change of the parameter are grouped in the \proglang{S}4 method \code{modifyModel}.

\textbf{Generating functions }
Generating objects of class \code{L2ParamFamily} and derived classes involves filling a considerable number of slots.
Hence, for convenience, there are several user-friendly generating functions as displayed in Table~\ref{TabPF}. \begin{table}[!ht] \begin{center}\small \begin{tabular}{|r|l|} \hline \textbf{Name} & \textbf{Family}\\ \hline \code{ParamFamily} & general parametric family\\ \code{L2ParamFamily} & general $L_2$ differentiable parametric family \\ \code{L2LocationFamily} & general $L_2$ differentiable location family \\ \code{L2LocationUnknownScaleFamily} & general $L_2$ differentiable location family\\ & with unknown scale (nuisance parameter)\\ \code{L2ScaleFamily} & general $L_2$ differentiable scale family\\ \code{L2ScaleUnknownLocationFamily} & general $L_2$ differentiable scale family\\ & with unknown location (nuisance parameter)\\ \code{L2LocationScaleFamily} & general $L_2$ differentiable location and\\ & scale family\\ \code{BetaFamily} & beta family\\ \code{BinomFamily} & binomial family\\ \code{GammaFamily} & gamma family\\ \code{PoisFamily} & Poisson family\\ \code{ExpScaleFamily} & exponential scale family\\ \code{LnormScaleFamily} & log-normal scale family\\ \code{NormLocationFamily} & normal location family\\ \code{NormLocationUnknownScaleFamily} & normal location family with\\ & unknown scale (nuisance parameter)\\ \code{NormScaleFamily} & normal scale family\\ \code{NormScaleUnknownLocationFamily} & normal scale family with\\ & unknown location (nuisance parameter)\\ \code{NormLocationScaleFamily} & normal location and scale family\\ \hline \end{tabular} \caption{\label{TabPF}Generating functions for \code{ParamFamily} and derived classes.} \end{center} \end{table}% \textbf{Examples } In order to follow our running example~$\smkreis{2}$, consider the following code: we first define the (non-standard) central distribution \code{myD} and then generate the location and scale model. 
For the central distribution, the corresponding standardizing constant could be expressed in closed form in terms of the gamma function, but instead we present the more general approach in which (by argument \code{withS}) standardization to mass $1$ is enforced by numerical integration.
<>=
myD <- AbscontDistribution(d = function(x) exp(-abs(x)^3),
                           withS = TRUE)
@
The logarithmic derivative of the density $d/dx\,\log p(x)$ is determined by numerical differentiation (but could have been given explicitly, too, by specifying argument \code{LogDeriv}). The $L_2$-derivative is calculated automatically using this derivative, and the corresponding Fisher information is then determined by numerical integration (but could also be specified explicitly using argument \code{FisherInfo.0}); see also the help file for function \code{L2LocationScaleFamily}. In this case we have checked accuracy to be of order $10^{-6}$.
<>=
(myFam <- L2LocationScaleFamily(loc = 3, scale = 2,
              name = "location and scale family",
              centraldistribution = myD))
@
In the already implemented gamma family of example~$\smkreis{3}$, a corresponding object for this model is readily defined as
<>=
(G <- GammaFamily(scale = 1, shape = 2))
@
In example~$\smkreis{4}$, we have to set up the model completely anew. Still, it is not too complicated, as we may use the generating function \code{L2ParamFamily} as illustrated in the following code: \label{codePois}
<>=
CensoredPoisFamily <- function(lambda = 1, trunc.pt = 2){
  name <- "Censored Poisson family"
  distribution <- Truncate(Pois(lambda = lambda), lower = trunc.pt)
  param0 <- lambda
  names(param0) <- "lambda"
  param <- ParamFamParameter(name = "positive mean", main = param0,
                             fixed = c(trunc.pt = trunc.pt))
  modifyParam <- function(theta){
      Truncate(Pois(lambda = theta), lower = trunc.pt)}
  startPar <- function(x,...) c(.Machine$double.eps, max(x))
  makeOKPar <- function(param){
      if(param <= 0) return(.Machine$double.eps)
      return(param)
  }
  L2deriv.fct <- function(param){
      lambda <- main(param)
      fct <- function(x){}
      body(fct) <- substitute({
          x/lambda - ppois(trunc.pt-1, lambda = lambda, lower.tail = FALSE)/
                     ppois(trunc.pt, lambda = lambda, lower.tail = FALSE)},
          list(lambda = lambda))
      return(fct)
  }
  res <- L2ParamFamily(name = name, distribution = distribution,
                       param = param, modifyParam = modifyParam,
                       L2deriv.fct = L2deriv.fct,
                       startPar = startPar, makeOKPar = makeOKPar)
  res@fam.call <- substitute(CensoredPoisFamily(lambda = l, trunc.pt = t),
                             list(l = lambda, t = trunc.pt))
  return(res)
}
(CP <- CensoredPoisFamily(3, 2))
@
Function \code{ParamFamParameter()} generates the parameter of this class, and ``movements'' of the model when changing the parameter are realized in function \code{modifyParam}. More on this will be described in Section~\ref{ParamFamP}. Functions \code{startPar} and \code{makeOKPar} are helper functions for estimation which are discussed at the end of Section~\ref{estims}. The more difficult parts in the implementation concern the distribution of the observations -- here package~\pkg{distr} with its powerful methods for automatic generation of image distributions is very helpful -- and the $L_2$-derivative as a function, \code{L2deriv.fct}. Here some off-hand calculations are inevitable.

\textbf{Plotting }
There are also quite flexible \code{plot} methods for objects of class \code{ParamFamily}. In the case of \code{L2ParamFamily} the default plot consists of density, cumulative distribution function (cdf), quantile function and scores. An example is given in Figure~\ref{figmyFam}; for details see the help page of \code{plot} in \pkg{distrMod}.
<>=
layout(matrix(c(1,1,2,2,3,3,4,4,4,5,5,5), nrow = 2, byrow = TRUE))
plot(myFam, mfColRow = FALSE, cex.inner = 1,
     inner = c("density", "cdf", "quantile function",
               "location part", "scale part"))
@
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=12cm]{distrMod-plotFam.pdf}%
\caption{\label{figmyFam}Plot of \code{L2ParamFamily} object.}
\end{center}
\end{figure}%
% -----------------------------------------------------------------------------
\subsection[Parameter in a parametric family: class ParamFamParameter]%
{Parameter in a parametric family: class \code{ParamFamParameter}}
\label{ParamFamP}
% -----------------------------------------------------------------------------
At first glance, the fact that the distribution classes of package \pkg{distr} already contain a slot \code{param} which is either \code{NULL} or of \proglang{S}4 class \code{Parameter} could lead us to question why we need the infrastructure provided by package \pkg{distrMod}. In fact, for class \code{ParamFamily}, we define the \proglang{S}4 class \code{ParamFamParameter} as a subclass of class \code{Parameter} since we additionally want to allow for partitions and transformations.\\
As always, there is a generating function with the same name for this class; that is, a function \code{ParamFamParameter()} which already showed up in the code for the implementation of model~$\smkreis{4}$.\\
This new class is needed since in many (estimation) applications, it is not the whole parameter of a parametric family which is of interest, but rather parts of it, while the rest of it is either known and fixed or has to be estimated as a nuisance parameter. In other situations, we are interested in a (smooth) transformation of the parameter.
All these cases are realized in the design of class \code{ParamFamParameter} which has slots \code{name}, the name of the parameter, \code{main}, the interesting aspect of the parameter, \code{nuisance}, an unknown part of the parameter of secondary interest, but which has to be estimated, for example for confidence intervals, and \code{fixed}, a known and fixed part of the parameter (see also Figure~\ref{figPFParam}). \begin{figure}[!ht] \begin{center} \includegraphics[width=4cm]{ParamFamParameter.pdf}% \caption{\label{figPFParam}Inheritance relations and slots of \code{ParamFamParameter} where we do not repeat inherited slots.} \end{center} \end{figure}% In addition, it has a slot \code{trafo} (an abbreviation of ``transformation'') which is also visible in class \code{Estimate} (see Figure~\ref{figEst}). Slot \code{trafo} for instance may be used to realize partial influence curves, see~\citet[Definition~4.2.10]{Ri94}, where one is only interested in some possibly lower dimensional smooth (not necessarily linear or coordinate-wise) aspect/transformation $\tau$ of the parameter~$\theta$.\\ To be coherent with the corresponding {\em nuisance} implementation, we use the following convention: The full parameter $\theta$ is split up coordinate-wise into a main parameter $\theta'$, a nuisance parameter $\theta''$, and a fixed, known part $\theta'''$.\\ Without loss of generality, we restrict ourselves to the case that transformation $\tau$ only acts on the main parameter $\theta'$ -- in case we want to transform the whole parameter, we have to assume both nuisance parameter $\theta''$ and known part of the parameter $\theta'''$ have length zero. 
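As a small illustration of this convention, a parameter object with all three parts might be generated as follows; the particular values are made up, and the precise calling conventions can be found in the help file of \code{ParamFamParameter}:
<<PFPsketch, eval = FALSE>>=
## sketch: main part of interest, nuisance part to be estimated,
## and a known, fixed part
P <- ParamFamParameter(name = "location and scale",
                       main = c(loc = 0),
                       nuisance = c(scale = 1),
                       fixed = c(trunc.pt = 2))
main(P)      ## accessor for the main part theta'
nuisance(P)  ## accessor for the nuisance part theta''
fixed(P)     ## accessor for the fixed part theta'''
@
The accessor functions \code{main}, \code{nuisance}, and \code{fixed} return the respective coordinates of the full parameter $\theta$.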
\textbf{Implementation } Slot \code{trafo} can contain either a (constant) matrix $D_\theta$ or a function $$\tau\colon \Theta' \to \tilde \Theta,\qquad \theta \mapsto \tau(\theta)$$% mapping the main parameter $\theta'$ to some range $\tilde\Theta$.\\ If {\em slot value} \code{trafo} is a function, besides $\tau(\theta)$, it will also return the corresponding derivative matrix $\frac{\partial}{\partial \theta}\tau(\theta)$. More specifically, the return value of this function is a list with entries \code{fval}, the function value $\tau(\theta)$, and \code{mat}, the derivative matrix.\\ In the case that \code{trafo} is a matrix $D$, we interpret it as such a derivative matrix $\frac{\partial}{\partial \theta}\tau(\theta)$, and correspondingly, $\tau(\theta)$ is the linear mapping $\tau(\theta)=D\,\theta$.\\ According to the signature, the return value of accessor function/method~\code{trafo} varies. For signatures \code{(ParamFamily,ParamFamParameter)}, \code{(Estimate,ParamFamParameter)}, and \newline\code{(Estimate,missing)}, the result is a list with entries \code{fct}, the function $\tau$, and \code{mat}, the matrix $\frac{\partial}{\partial \theta}\tau(\theta)$. Function $\tau$ will then return a list with entries \code{fval} and \code{mat} mentioned above. For signatures \code{(ParamFamily,missing)} and \code{(ParamFamParameter,missing)}, \code{trafo} will just return the corresponding matrix. \textbf{Movements in parameter space and model space } Our implementation of models has both function components providing mappings $\theta\mapsto {\rm Component}(\theta)$ (like \code{L2deriv.fct}) and components evaluated at an actual parameter value $\theta$ (like \code{L2deriv}). When we ``move'' a model object from $\theta$ to $\theta'$, i.e.\ when we change the reference parameter of this model object from $\theta$ to $\theta'$, the latter components have to be modified accordingly. 
To this end, there are \code{modifyModel} methods for classes \code{L2ParamFamily}, \code{L2LocationFamily}, \code{L2ScaleFamily}, \code{L2LocationScaleFamily}, \code{ExpScaleFamily}, and \code{GammaFamily}, where the second argument to dispatch on has to be of class \code{ParamFamParameter}; this will probably be extended in the future. The code for example~$\smkreis{3}$, that is, for signature \code{model="GammaFamily"}, for instance, reads
<>=
setMethod("modifyModel", signature(model = "GammaFamily",
          param = "ParamFamParameter"),
    function(model, param, ...){
        M <- modifyModel(as(model, "L2ParamFamily"), param = param,
                         .withCall = FALSE)
        M@L2derivSymm <- FunSymmList(OddSymmetric(SymmCenter =
                                                  prod(main(param))),
                                     NonSymmetric())
        class(M) <- class(model)
        return(M)
    })
@
Internally, the default \code{modifyModel} method makes use of slots \code{modifyParam} (to move the distribution of the observations) and \code{L2deriv.fct} (to move the $L_2$-derivative). For internal reasons, these two functions have different implementations of the parameter as argument: \code{L2deriv.fct}, only used internally by our routines, requires an instance of class \code{ParamFamParameter}. In contrast to this, \code{modifyParam} uses a representation of the parameter as slot \code{main} of class \code{ParamFamParameter}; that is, simply as a numeric vector, because its results are passed on to non-\pkg{distrMod} code. This inconsistency, which also holds for the passed-on functional slots \code{makeOKPar} and \code{startPar}, might confuse some users and will hence probably be changed in a subsequent package version.
\textbf{Example } Our implementation of the gamma family follows the parametrization of the \proglang{R} core functions \code{d/p/q/rgamma}. Hence, we use the parameters \code{("scale","shape")} (in this order). Package~\pkg{MASS} \citep{V:R:02} however uses the (equivalent) parametrization \code{("shape","rate")}, where \code{rate=1/scale}.
To be able to compare the results obtained for the MLE by \code{fitdistr} and by our package, we therefore use the corresponding transformation $\tau({\rm scale},{\rm shape})=({\rm shape},1/{\rm scale})$. More specifically, this can be done as follows
<>=
mtrafo <- function(x){
    nms0 <- c("scale","shape")
    nms <- c("shape","rate")
    fval0 <- c(x[2], 1/x[1])
    names(fval0) <- nms
    mat0 <- matrix(c(0, -1/x[1]^2, 1, 0), nrow = 2,
                   dimnames = list(nms,nms0))
    list(fval = fval0, mat = mat0)
}
(G.MASS <- GammaFamily(scale = 1, shape = 2, trafo = mtrafo))
@
% -----------------------------------------------------------------------------
\section{Minimum criterion estimation}\label{estsec}
% -----------------------------------------------------------------------------
\subsection[Implementations in R so far]%
{Implementations in \proglang{R} so far}
\label{OOApproach}
% -----------------------------------------------------------------------------
To better appreciate the generality of our object oriented approach, let us contrast it with the two already mentioned implementations:
\textbf{fitdistr } Function \code{fitdistr} comes with arguments: \code{x}, \code{densfun}, \code{start} (and \code{...}) and returns an object of \proglang{S}3-class \code{fitdistr} which is a list with components \code{estimate}, \code{sd}, and \code{loglik}.
% -----------------------------------------------------------------------------
%\begin{Rem}%\rm\small
As starting estimator \code{start} in case~$\smkreis{2}$, we select median and MAD, where for the latter we need a consistency constant to obtain a consistent estimate for scale. In package \pkg{distrMod} this is adjusted automatically.\\
Due to symmetry about the location parameter, this constant is just the inverse of the upper quartile of the central distribution, obtainable via
<>=
(mad.const <- 1/q(myD)(0.75))
## in RStudio or Jupyter IRKernel, use q.l(.)(.) instead of q(.)(.)
@
%\end{Rem}
% -----------------------------------------------------------------------------
Then, function~\code{fitdistr} from package~\pkg{MASS} may be called as
<>=
set.seed(19)
x <- r(distribution(myFam))(50)
mydf <- function(x, loc, scale){
    y <- (x-loc)/scale
    exp(-abs(y)^3)/scale
}
Med <- median(x)
MAD <- mad(x, constant = mad.const)
c(Med, MAD)
fitdistr(x, mydf, start = list("loc" = Med, "scale" = MAD))
@
\textbf{mle } Function \code{mle} from package~\pkg{stats4}, on the other hand, has arguments \code{minuslogl} (without data argument!), \code{start}, \code{method} (and \code{...}) and returns an object of \proglang{S}4-class \code{mle} with slots \code{call}, \code{coef}, \code{fullcoef}, \code{vcov}, \code{min}, \code{details}, \code{minuslogl}, and \code{method}. The MLE for model $\smkreis{2}$ may then be computed via
<>=
ll <- function(loc,scale){-sum(log(mydf(x,loc,scale)))}
mle(ll, start = list("loc" = Med, "scale" = MAD))
@
%\begin{Rem}%\rm\small
There are further packages with implementations for MLE like \pkg{bbmle} \citep{bbmle}, \pkg{fitdistrplus} \citep{fitdistrplus}, and \pkg{maxLik} \citep{maxLik}. Package \pkg{bbmle} provides modifications and extensions for the \code{mle} classes of the \pkg{stats4} package. The implementation is very similar to package \pkg{stats4} and the computation of the MLE is based on function \code{optim}. As the name of package \pkg{fitdistrplus} already suggests, it contains an extension of the function \code{fitdistr} with a very similar implementation. That is, there are \code{if} clauses analogous to those in function \code{fitdistr} and if none of the special cases apply, numerical optimization via \code{optim} is performed. Going beyond \code{fitdistr}, a user-supplied function can be used for optimization.
Package \pkg{maxLik} includes tools for computing MLEs by means of numerical optimization via functions \code{optim} and \code{nlm}.\\
There is also at least one package which can be used to compute MDEs. With function \code{mde} of package \pkg{actuar} \citep{actuar} one can minimize the Cram\'er-von-Mises distance for individual and grouped data. Moreover, for grouped data a modified $\chi^2$-statistic as well as the squared difference between the theoretical and empirical limited expected value can be chosen. The optimization is again performed by function \code{optim}.
%\end{Rem}
% -----------------------------------------------------------------------------
\subsection[Estimators in package distrMod]%
{Estimators in package~\pkg{distrMod}}\label{estims}
% -----------------------------------------------------------------------------
In our framework, it is possible to implement a general, dispatchable minimum criterion estimator (MCE); that is, an estimator minimizing a criterion incorporating the observations / the empirical distribution and the model. Examples of MCEs comprise MLEs (with the negative log-likelihood as criterion) and MDEs, where the criterion is some distance between the empirical distribution and $P_\theta$.
\par
We have implemented MCEs in the form of function \code{MCEstimator}, and also provide functions \code{MDEstimator} and \code{MLEstimator} -- essentially just wrapper functions for \code{MCEstimator}.\\
\code{MCEstimator} takes as arguments the observations \code{x}, the parametric family \code{ParamFamily}, the criterion, and a starting parameter \code{startPar} (and again some optional ones). It returns an object of class \code{MCEstimate} (a subclass of class \code{Estimate}) with slots \code{estimate}, \code{criterion}, \code{samplesize}, \code{asvar} (and some more); for more detailed information see the help page of class \code{MCEstimate}.
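As a quick illustration of these wrapper functions, the following sketch (the data and starting value are made up for illustration; \code{PoisFamily()} is the generating function for the Poisson family, and \code{CvMDist} is the Cram\'er-von-Mises distance from package~\pkg{distrEx}, attached by \pkg{distrMod}) computes a minimum distance estimator in a Poisson model:
<>=
## sketch: Cramer-von-Mises MDE for simulated Poisson data
xp <- rpois(50, lambda = 3)
MDEstimator(x = xp, ParamFamily = PoisFamily(lambda = 2.5),
            distance = CvMDist)
@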
\textbf{The main achievement} of our approach is that estimators are available for \emph{any} object of class \code{ParamFamily} (e.g., Poisson, beta, gamma and many more). At the same time, using \proglang{S}4 inheritance, this generality does not hinder specialization to particular models: internal dispatch at run time according to argument \code{ParamFamily} allows the definition of new specialized methods {\em without modifying existing code}, a feature whose importance should not be underestimated when we are dealing with distributed collaborative package development.\\
Specialized methods for the computation of these MCEs can easily be established as described below. In particular, this can be done externally to our package, and nonetheless all our infrastructure is fully available right from the beginning; i.e., a unified interfacing/calling function, unified plotting, printing, interface to confidence intervals etc.
\textbf{Limitations of the other implementations } In principle, the last few points could already have been realized within the older \proglang{S}3 approach as realized in \code{fitdistr}. However, there are some subtle limitations which apply to function \code{fitdistr} and function \code{mle} as well as to the other implementations: while the general \proglang{R} optimization routines/interfaces \code{optim}/\code{optimize}/\code{nlm} clearly stand out for their generality and their implementation quality, ``general'' numerical optimization to determine an MLE in almost all cases is bound to the use of these routines.\\
There are cases, however, where either other optimization techniques could be more adequate (e.g.\ in estimating the integer-valued degrees-of-freedom parameter in a $\chi^2$-distribution), or where numerical optimization is not necessary, because optima can be determined analytically.
The first case may to a certain extent be handled using package \pkg{fitdistrplus} where one can specify a user-supplied function for optimization. However, for the second case it would be necessary to change existing code; e.g., convince the authors Brian Ripley and Bill Venables to append another \code{if}-clause in \code{fitdistr}.\\
The above is also true for MDEs. The numerical optimization in the case of function \code{mde} of package \pkg{actuar} is restricted to \code{optim} and it would be necessary to change the existing code if one, for example, wants to use another distance (e.g.\ Kolmogorov distance).
\par
As already mentioned, the approach realized in package~\pkg{distrMod} does not have these limitations. As an example, one can quite easily insert ``intermediate'' layers like group models; e.g., location (and/or scale) models. In the \proglang{S}4-inheritance structure for class \code{ParamFamily}, one can define a corresponding subclass ${\cal S}$ and ``intermediate'' general methods for ${\cal S}$ which are more special than \code{optim}/\code{optimize}/\code{nlm}, but still apply by default for more than one model, and which -- if desired -- could then again be overridden by more special methods applying to subclasses of ${\cal S}$.\\
The main function for this purpose is \code{MCEstimator}. As an example, we can use the negative log-likelihood as criterion; i.e., compute the maximum likelihood estimator.
<>=
negLoglikelihood <- function(x, Distribution){
    res <- -sum(log(Distribution@d(x)))
    names(res) <- "Negative Log-Likelihood"
    return(res)
}
MCEstimator(x = x, ParamFamily = myFam, criterion = negLoglikelihood)
@
The user can then specialize the behaviour of \code{MCEstimator} on two layers: instance-individual or class-individual.
\textbf{Instance-individually } Using the first layer, we may specify model-individual starting values/search intervals by slot \code{startPar} of class \code{ParamFamily}, pass on special control parameters to functions \code{optim}/\code{optimize} by a \code{...} argument in function \code{MCEstimator}, and we may enforce valid parameter values by specifying function slot \code{makeOKPar} of class \code{ParamFamily}. In addition, one can specify a penalty value penalizing invalid parameter values. For an example, see the code for example~$\smkreis{4}$ on page~\pageref{codePois}.
\textbf{Class-individually } In some situations, one would rather like to define rules for groups of models or to be even more flexible. This can be achieved using the class-individual layer: In order to use method dispatch to find the ``right'' function to determine the MCE, we define subclasses to class \code{ParamFamily} as, e.g., in the case of class \code{PoisFamily}. In general, these subclasses will not have any new slots, but merely serve as the basis for a refined method dispatching. As an example, the code to define class \code{PoisFamily} simply is
<>=
setClass("PoisFamily", contains = "L2ParamFamily")
@
For group models, like the location and scale model, there may be additional slots and intermediate classes; e.g.,
<>=
setClass("NormLocationFamily", contains = "L2LocationFamily")
@
Specialized methods may then be defined for these subclasses.
So far, in package~\pkg{distrMod} we have particular \code{validParameter} methods for classes \code{ParamFamily}, \code{L2ScaleFamily}, \code{L2LocationFamily} and \code{L2LocationScaleFamily}; e.g., the code for signature \code{L2ScaleFamily} simply is
<>=
setMethod("validParameter", signature(object = "L2ScaleFamily"),
    function(object, param, tol=.Machine$double.eps){
        if(is(param,"ParamFamParameter")) param <- main(param)
        if(!all(is.finite(param))) return(FALSE)
        if(length(param)!=1) return(FALSE)
        return(param > tol)
    })
@
Class-individual routines are realized by calling \code{mceCalc} and \code{mleCalc} within functions \newline\code{MCEstimator} and \code{MLEstimator}, respectively; e.g.,
<>=
setMethod("mleCalc", signature(x = "numeric",
          PFam = "NormLocationScaleFamily"),
    function(x, PFam){
        n <- length(x)
        c(mean(x), sqrt((n-1)/n)*sd(x))
    })
@
and the maximum likelihood estimator in example~$\smkreis{2}$ can easily be computed as follows.
<>=
MLEstimator(x = x, ParamFamily = myFam)
@
Similarly, we could evaluate the Kolmogorov-MDE in this model:
<>=
MDEstimator(x = x, ParamFamily = myFam, distance = KolmogorovDist)
@
Note that the last two calls would be identical (only replacing argument \code{myFam}) for examples~$\smkreis{3}$ and $\smkreis{4}$.
\textbf{Functions mceCalc and mleCalc } Both \code{mceCalc} and \code{mleCalc} dispatch according to their arguments \code{x} and \code{PFam}. For \code{mceCalc}, so far there is only a method for signature \code{(numeric,ParamFamily)}, while \code{mleCalc} already has several particular methods for argument \code{PFam} of classes \code{ParamFamily}, \code{BinomFamily}, \code{PoisFamily}, \code{NormLocationFamily}, \newline\code{NormScaleFamily}, and \code{NormLocationScaleFamily}. To date, in both \code{mceCalc} and \code{mleCalc}, argument \code{x} must inherit from class \code{numeric}, but we plan to allow for more general classes (e.g.\ \code{data.frame}s) in subsequent versions.
Note that for technical reasons, \code{mleCalc} must have an extra \code{...} argument to cope with different calls from \code{MLEstimator}. Additional arguments are of course possible. The return value must be a list with prescribed structure. To this end, function \code{meRes()} can be used as a helper to produce this structure. For example, the \code{mleCalc}-method for signature \code{numeric,NormScaleFamily} is
<>=
setMethod("mleCalc", signature(x = "numeric", PFam = "NormScaleFamily"),
    function(x, PFam, ...){
        n <- length(x)
        theta <- sqrt((n - 1)/n) * sd(x)
        mn <- mean(distribution(PFam))
        ll <- -sum(dnorm(x, mean = mn, sd = theta, log = TRUE))
        names(ll) <- "neg.Loglikelihood"
        crit.fct <- function(sd)
            -sum(dnorm(x, mean = mn, sd = sd, log = TRUE))
        param <- ParamFamParameter(name = "scale parameter",
                                   main = c("sd" = theta))
        if(!hasArg(Infos)) Infos <- NULL
        return(meRes(x, theta, ll, param, crit.fct, Infos = Infos))
    })
@
\textbf{Coercion to class mle } We also provide a coercion to class \code{mle} from package~\pkg{stats4}, hence making profiling by the \code{profile}-method therein possible. In order to be able to do so, we need to fill a functional slot \code{criterion.fct} of class \code{MCEstimate}. In many examples this is straightforward, but in higher dimensions, helper function \code{get.criterion.fct} can be useful; e.g., it handles the general case for signature \code{ParamFamily}.\\
The values of functions \code{MCEstimator}, \code{MDEstimator}, and \code{MLEstimator} are objects of \proglang{S}4-class \code{MCEstimate} which inherits from \proglang{S}4-class \code{Estimate}; see~Figure~\ref{figEst} for more details.
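This coercion might be used as in the following sketch (reusing the data \code{x} and family \code{myFam} from above; the variable names are ours, and we assume the standard \code{as()} coercion mechanism applies, with slot \code{criterion.fct} filled as just described):
<>=
## sketch: coerce an "MCEstimate" object to class "mle" of stats4,
## so that the stats4 profiling machinery becomes available
mle.res <- as(MLEstimator(x = x, ParamFamily = myFam), "mle")
pr <- stats4::profile(mle.res)
@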
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=6cm]{Estimate.pdf}%
\caption{\label{figEst}Inheritance relations and slots of the corresponding \mbox{(sub-)}classes for \code{Estimate} where we do not repeat inherited slots.}
\end{center}
\end{figure}
% -----------------------------------------------------------------------------
\subsection{Confidence intervals and profiling}\label{sec.ci}
% -----------------------------------------------------------------------------
We also provide particular methods for functions \code{confint} and, as already mentioned in the previous section, \code{profile} of package \pkg{stats}. Moreover, by adding an argument \code{method} to the signature of \code{confint}, we gain more flexibility. Note that the addition of an extra argument was only possible by masking method \code{confint}. However, this masking is done in such a way that, when called with the corresponding arguments, it exactly reproduces the \pkg{stats} behaviour. This additional argument is for example used in package~\pkg{ROptEst} \citep{ROptEst} where one can determine robust (asymptotic) confidence intervals which are uniformly valid on neighborhoods and may incorporate bias in various ways. Also, in principle, one-sided confidence intervals are possible, as well as confidence intervals produced by other techniques like bootstrap.
\par
To make confidence intervals available for objects of class \code{MCEstimate}, there is a method \code{confint}, which produces confidence intervals (of class \code{Confint}). As an example consider the following: We first generate some data and determine the Cram\'er-von-Mises MDE as starting estimator for \code{fitdistr}. We then compute the MLE using \code{fitdistr} and \code{MLEstimator}.
<>=
set.seed(19)
y <- rgamma(50, scale = 3, shape = 2)
(MDest <- MDEstimator(x = y, ParamFamily = G.MASS, distance = CvMDist))
fitdistr(x = y, densfun = "gamma",
         start = list("shape" = estimate(MDest)[1],
                      "rate" = estimate(MDest)[2]))
(res <- MLEstimator(x = y, ParamFamily = G.MASS))
(ci <- confint(res))
@
And we can do some profiling. The results are given in Figure~\ref{figProfile}.
<>=
par(mfrow=c(2,1))
plot(profile(res))
@
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=11cm]{distrMod-profile.pdf}%
\caption{\label{figProfile}Profiling: behavior of objective function near the solution.}
\end{center}
\end{figure}
Note again that only minimal changes in the preceding \pkg{distrMod}-code would be necessary to also apply the code to examples~$\smkreis{3}$ and $\smkreis{4}$.
% -----------------------------------------------------------------------------
\subsection{Customizing the level of detail in output}
% -----------------------------------------------------------------------------
For class \code{Confint} as well as for class \code{Estimate} we have particular \code{show} and \code{print} methods where the amount of output may be scaled. This scaling can either be done by setting global options with \code{distrModoptions} (see Appendix~\ref{distrModO}), or, in the case of \code{print}, locally by the extra argument \code{show.details}. The default value of \code{show.details} is \code{"maximal"}.\\
Note that this departure from functional programming style is necessary, as \code{show} does not allow for additional arguments.
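For instance, reusing the estimate \code{res} and the confidence interval \code{ci} computed above, the output may be reduced locally to the bare minimum as follows (a sketch; see Table~\ref{TabShow} for what each level shows):
<>=
## sketch: local scaling of the output via argument show.details
print(res, show.details = "minimal")
print(ci, show.details = "minimal")
@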
\begin{table}[!ht]
\begin{center}\small
\begin{tabular}{|c|l|l|}
\hline
\textbf{Value of \code{show.details}} & \textbf{Object of class \code{MCEstimate}} & \textbf{Object of class \code{Confint}}\\
\hline
\code{"minimal"} & parameter estimates and & lower and upper confidence \\
 & estimated standard errors & limits for each parameter \\
\code{"medium"} & call, sample size, asymptotic & call, sample size, and type \\
 & (co)variance, and criterion & of estimator \\
\code{"maximal"} & untransformed estimate, & transformation, and \\
 & asymptotic (co)variance of & derivative matrix \\
 & untransformed estimate, & \\
 & transformation, and derivative & \\
 & matrix & \\
\hline
\end{tabular}
\caption{\label{TabShow}The level of detail in output. In case of \code{"medium"} and \code{"maximal"} only the additional output is specified.}
\end{center}
\end{table}%
% -----------------------------------------------------------------------------
\section{Conclusions}
% -----------------------------------------------------------------------------
Package \pkg{distrMod} represents the most flexible implementation of minimum criterion estimators for univariate distributions, including maximum likelihood and minimum distance estimators, available in \proglang{R} so far. These estimators are provided for any object of \proglang{S}4 class \code{ParamFamily} and by \proglang{S}4 inheritance can be adapted to particular models without modifying existing code. In addition, the package contains infrastructure in terms of unified interfacing/calling functions, unified plotting and printing, an interface to confidence intervals, and more.
%------------------------------------------------------------------------------
\bibliography{distrMod}
%------------------------------------------------------------------------------
\appendix
\section{Global options}\label{distrModO}
%------------------------------------------------------------------------------
Analogously to the \code{options} command in \proglang{R} package~\pkg{base} one may specify a number of global ``constants'' to be used within the package via \code{distrModoptions}/\code{getdistrModOption}. These include
\begin{itemize}
\item \code{use.generalized.inverse.by.default} which is a logical variable giving the default value for argument \code{generalized} of our method \code{solve} in package~\pkg{distrMod}. This argument decides whether our method \code{solve} is to use generalized inverses if the original \code{solve}-method from package~\pkg{base} fails. If the option is set to \code{FALSE}, in the case of failure, and unless argument \code{generalized} is explicitly set to \code{TRUE}, \code{solve} will throw an error, matching the behavior of the \pkg{base} method. The default value of the option is \code{TRUE}.
\item \code{show.details} controls the level of detail of method \code{show} for objects of the classes of the \pkg{distr} family of packages. Possible values are
\begin{itemize}
\item \code{"maximal"}: {all information is shown}
\item \code{"minimal"}: {only the most important information is shown}
\item \code{"medium"}: {somewhere in the middle; see actual \code{show}-methods for details.}
\end{itemize}
The default value is \code{"maximal"}. For more details see Table~\ref{TabShow}.
\end{itemize}
%------------------------------------------------------------------------------
\section{Following good programming practices}
% -----------------------------------------------------------------------------
There are several new \proglang{S}4 classes in package~\pkg{distrMod}.
With respect to inspection and modification of the class slots we follow the suggestion of Section~3.4 in \citet{Ge08}. That is, there are accessor functions/methods for all slots included in the new classes. Moreover, we implemented replacement functions/methods for those slots which are intended to be modifiable by the user.\\
One could also use the \code{@}-operator to modify or access slots. However, this operator relies on the implementation details of the class, and hence, this may lead to difficulties if the class implementation has to be changed. In addition, as no checking is invoked, one may easily produce inconsistent objects.
\par
We also implemented generating functions for all non-virtual classes which should ease the definition of objects for the user; see~\citet[Section~1.6]{Ch98}. These functions in most cases have the same name as the class. By providing convenience via generating functions rather than via new \code{initialize}-methods, we retain the advantage that the default \code{initialize}-methods called by means of \code{new} remain valid and can be used for programming.
\par
Finally, there are \code{show}-methods for all new \proglang{S}4 classes which display the essential information of instantiated objects; see \citet[Section~1.6]{Ch98}.
% -----------------------------------------------------------------------------
\section*{Acknowledgments}
%------------------------------------------------------------------------------
Both authors contributed equally to this work. We thank the referees and the associate editor for their helpful comments.
% -----------------------------------------------------------------------------
<>=
options("prompt"=">")
@
% -----------------------------------------------------------------------------
\end{document}