% \VignetteIndexEntry{Fitting the Nelson--Siegel--Svensson model with Differential Evolution} % \VignetteKeyword{heuristics} % \VignetteKeyword{Differential Evolution} % \VignetteKeyword{yield curve} % \VignetteKeyword{optimize} % \SweaveOpts{keep.source = FALSE} \documentclass[a4paper,11pt]{article} \usepackage[left=2.5cm,top=2cm,bottom=3cm,right=3.5cm]{geometry} \usepackage[noae]{Sweave} \usepackage[scaled=0.9]{inconsolata} \usepackage{mathptmx} \usepackage{amsmath,amstext} \usepackage{natbib} \usepackage{framed} \usepackage{url} \usepackage{color} \definecolor{grau2}{rgb}{.2,.2,.2} \definecolor{grau7}{rgb}{.7,.7,.7} % define *Sweave* layout \DefineVerbatimEnvironment{Sinput}{Verbatim}{} \DefineVerbatimEnvironment{Soutput}{Verbatim}{frame=single,xleftmargin=0em,% formatcom=\color{grau2},rulecolor=\color{grau7}} \DefineVerbatimEnvironment{Scode}{Verbatim}{xleftmargin=2em} \fvset{listparameters={\setlength{\topsep}{0pt}}} \renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}} \usepackage{hyperref} <>= options(continue = " ", digits = 5) pv <- packageVersion("NMOF") pv <- gsub("(.*)[.](.*)", "\\1-\\2", pv) @ \begin{document} {\raggedright{\LARGE Fitting the Nelson--Siegel--Svensson model\\[0.2ex]% with Differential Evolution}}\hspace*{\fill} {\footnotesize Package version \Sexpr{pv}}\medskip \noindent Enrico Schumann\\ \noindent \texttt{es@enricoschumann.net}\\ \bigskip \section{Introduction} \noindent In this tutorial we look into fitting the Nelson--Siegel--Svensson (NSS) model to data; for more details, please see \citep{Gilli2011b}. Further information can be found in \citet{Gilli2010h} and \citet{Gilli2010i}. We start by attaching the package. Since we will use a stochastic technique for optimisation, we should be running several restarts (see \citealp[Chapter~12]{Gilli2011b}, for a discussion). The variable \texttt{nRuns} sets the number of restarts for the examples to come. We set it to only two here to keep the build-time of the package acceptable; increase it to check the stochastics of the solutions. We set a seed to make the computations reproducible. <<>>= require("NMOF") nRuns <- 2L set.seed(112233) @ \section{Fitting the NS model to given zero rates} \subsection*{The NS model} We create a `true' yield curve \texttt{yM} with given parameters \texttt{betaTRUE}. The times-to-payment, measured in years, are collected in the vector \texttt{tm}. <>= tm <- c(c(1, 3, 6, 9)/12, 1:10) betaTRUE <- c(6, 3, 8, 1) yM <- NS(betaTRUE, tm) par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") @ The aim is to fit a smooth curve through these points. Since we have used the model to create the points, we should be able to obtain a perfect fit. We start with the objective function \texttt{OF}. It takes two arguments: \texttt{param}, which is a candidate solution (a numeric vector), and the list \texttt{data}, which holds all other variables. It returns the maximum absolute difference between a vector of observed (`market') yields \texttt{yM}, and the model's yields for parameters \texttt{param}. <<>>= OF <- function(param, data) { y <- data$model(param, data$tm) maxdiff <- y - data$yM maxdiff <- max(abs(maxdiff)) if (is.na(maxdiff)) maxdiff <- 1e10 maxdiff } @ We have a added a crude but effective safeguard against `strange' parameter values that lead to \texttt{NA} values: the objective function returns a large positive value. We minimise, and hence parameters that produce \texttt{NA} values are marked as bad. In this first example, we set up \texttt{data} as follows: <<>>= data <- list(yM = yM, tm = tm, model = NS, ww = 0.1, min = c( 0,-15,-30, 0), max = c(15, 30, 30,10)) @ We add a \texttt{model} (a function; in this case \texttt{NS}) that describes the mapping from parameters to a yield curve, and vectors \texttt{min} and \texttt{max} that we will later use as constraints. \texttt{ww} is a penalty weight, explained below. \texttt{OF} will take a candidate solution \texttt{param}, transform this solution via \texttt{data\$model} into yields, and compare these yields with \texttt{yM}, which here means to compute the maximum absolute difference. <<>>= param1 <- betaTRUE ## the solution... OF(param1, data) ## ...gives 0 param2 <- c(5.7, 3, 8, 2) ## anything else OF(param2, data) ## ... gives a postive number @ We can also compare the solutions in terms of yield curves. <>= par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") lines(tm, NS(param1, tm), col = "blue") lines(tm, NS(param2, tm), col = "red") legend(x = "topright", legend = c("true yields", "param1", "param2"), col = c("black", "blue", "red"), pch = c(1, NA, NA), lty = c(0, 1, 1)) @ We generally want to obtain parameters such that certain constraints are met. We include these through a penalty function. <<>>= penalty <- function(mP, data) { minV <- data$min maxV <- data$max ww <- data$ww ## if larger than maxV, element in A is positiv A <- mP - as.vector(maxV) A <- A + abs(A) ## if smaller than minV, element in B is positiv B <- as.vector(minV) - mP B <- B + abs(B) ## beta 1 + beta2 > 0 C <- ww*((mP[1L, ] + mP[2L, ]) - abs(mP[1L, ] + mP[2L, ])) A <- ww * colSums(A + B) - C A } @ We already have \texttt{data}, so let us see what the function does to solutions that violate a constraint. Suppose we have a population \texttt{mP} of three solutions (the \texttt{m} in \texttt{mP} is to remind us that we deal with a matrix). <<>>= param1 <- c( 6, 3, 8, -1) param2 <- c( 6, 3, 8, 1) param3 <- c(-1, 3, 8, 1) mP <- cbind(param1,param2,param3) rownames(mP) <- c("b1","b2","b3","lambda") mP @ The first and the third solution violate the constraints. In the first solution, $\lambda$ is negative; in the third solution, $\beta_1$ is negative. <<>>= penalty(mP,data) @ The parameter \texttt{ww} controls how heavily we penalise. <<>>= data$ww <- 0.5 penalty(mP,data) @ For valid solutions, the penalty should be zero. <<>>= param1 <- c( 6, 3, 8, 1) param2 <- c( 6, 3, 8, 1) param3 <- c( 2, 3, 8, 1) mP <- cbind(param1, param2, param3) rownames(mP) <- c("b1","b2","b3","lambda") penalty(mP, data) @ Note that \texttt{penalty} works on the complete population at once; there is no need to loop over the solutions. So we can run a test. We start by defining the parameters of DE. Note in particular that we pass the penalty function, and that we set \texttt{loopPen} to \texttt{FALSE}. <<>>= algo <- list(nP = 100L, ## population size nG = 500L, ## number of generations F = 0.50, ## step size CR = 0.99, ## prob. of crossover min = c( 0,-15,-30, 0), max = c(15, 30, 30,10), pen = penalty, repair = NULL, loopOF = TRUE, ## loop over popuation? yes loopPen = FALSE, ## loop over popuation? no loopRepair = TRUE, ## loop over popuation? yes printBar = FALSE) @ \texttt{DEopt} is then called with the objective function \texttt{OF}, the list \texttt{data}, and the list \texttt{algo}. <<>>= sol <- DEopt(OF = OF, algo = algo, data = data) @ To check whether the objective function works properly, we compare the maximum error with the returned objective function value -- they should be the same. <<>>= max( abs(data$model(sol$xbest, tm) - data$model(betaTRUE, tm)) ) sol$OFvalue @ As a benchmark, we run the function \texttt{nlminb} from the \textsf{stats} package. This is not a fair test: \texttt{nlminb} is not appropriate for such problems. (But then, if we found that it performed better than DE, we would have a strong indication that something is wrong with our implementation of DE.) We use a random starting value \texttt{s0}. <<>>= s0 <- algo$min + (algo$max - algo$min) * runif(length(algo$min)) sol2 <- nlminb(s0, OF, data = data, lower = data$min, upper = data$max, control = list(eval.max = 50000L, iter.max = 50000L)) @ Again, we compare the returned objective function value and the maximum error. <<>>= max( abs(data$model(sol2$par, tm) - data$model(betaTRUE,tm)) ) sol2$objective @ To compare our two solutions (DE and \texttt{nlminb}), we can plot them together with the true yields curve. But it is important to stress that the results of both algorithms are stochastic: in the case of DE because it deliberately uses randomness; in the case of \texttt{nlminb} because we set the starting value randomly. To get more meaningful results we should run both algorithms several times. To keep the build-time of the vignette down, we only run both methods once. But increase \texttt{nRuns} for more restarts. <>= par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") algo$printDetail <- FALSE for (i in seq_len(nRuns)) { sol <- DEopt(OF = OF, algo = algo, data = data) lines(tm, data$model(sol$xbest,tm), col = "blue") s0 <- algo$min + (algo$max-algo$min) * runif(length(algo$min)) sol2 <- nlminb(s0, OF, data = data, lower = data$min, upper = data$max, control = list(eval.max = 50000L, iter.max = 50000L)) lines(tm,data$model(sol2$par,tm), col = "darkgreen", lty = 2) } legend(x = "topright", legend = c("true yields", "DE", "nlminb"), col = c("black","blue","darkgreen"), pch = c(1, NA, NA), lty = c(0, 1, 2)) @ It is no error that there tpyically appears to be only one curve for DE: there are, in fact, \texttt{nRuns} lines, but they are printed on top of each other. \subsection*{Other constraints} The parameter constraints on the NS (and NSS) model are to make sure that the resulting zero rates are nonnegative. But in fact, they do not guarantee positive rates. <>= tm <- seq(1, 10, length.out = 100) ## 1 to 10 years betaTRUE <- c(3, -2, -8, 1.5) ## 'true' parameters yM <- NS(betaTRUE, tm) par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") abline(h = 0) @ This is really a made-up example, but nevertheless we may want to include safeguards against such parameter vectors: we could include just one constraint that all rates are greater than zero. This can be done, again, with a penalty function. <<>>= penalty2 <- function(param, data) { y <- data$model(param, data$tm) maxdiff <- abs(y - abs(y)) sum(maxdiff) * data$ww } @ Check: <<>>= penalty2(c(3, -2, -8, 1.5),data) @ This penalty function only works for a single solution, so it is actually simplest to write it directly into the objective function. <<>>= OFa <- function(param,data) { y <- data$model(param,data$tm) aux <- y - data$yM res <- max(abs(aux)) ## compute the penalty aux <- y - abs(y) ## aux == zero for nonnegative y aux <- -sum(aux) * data$ww res <- res + aux if (is.na(res)) res <- 1e10 res } @ So just as a numerical test: suppose the above parameters were true, and interest rates were negative. <>= algo$pen <- NULL; data$yM <- yM; data$tm <- tm par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") abline(h = 0) sol <- DEopt(OF = OFa, algo = algo, data = data) lines(tm,data$model(sol$xbest,tm), col = "blue") legend(x = "topleft", legend = c("true yields", "DE (constrained)"), col = c("black", "blue"), pch = c(1, NA, NA), lty = c(0, 1, 2)) @ \section{Fitting the NSS model to given zero rates} There is little that we need to change if we want to use the NSS model instead. We just have to pass a different \texttt{model} to the objective function (and change the \texttt{min}/\texttt{max}-vectors). An example follows. Again, we fix true parameters and try to recover them. <<>>= tm <- c(c(1, 3, 6, 9)/12, 1:10) betaTRUE <- c(5,-2,5,-5,1,6) yM <- NSS(betaTRUE, tm) @ The lists \texttt{data} and \texttt{algo} are almost the same as before; the objective function stays exactly the same. <<>>= data <- list(yM = yM, tm = tm, model = NSS, min = c( 0,-15,-30,-30, 0,5), max = c(15, 30, 30, 30, 5, 10), ww = 1) algo <- list(nP = 100L, nG = 500L, F = 0.50, CR = 0.99, min = c( 0,-15,-30,-30, 0,5), max = c(15, 30, 30, 30, 5, 10), pen = penalty, repair = NULL, loopOF = TRUE, loopPen = FALSE, loopRepair = TRUE, printBar = FALSE, printDetail = FALSE) @ It remains to run the algorithm. (Again, we check the returned objective function value.) <<>>= sol <- DEopt(OF = OF, algo = algo, data = data) max( abs(data$model(sol$xbest, tm) - data$model(betaTRUE, tm)) ) sol$OFvalue @ We compare the results with \texttt{nlminb}. <<>>= s0 <- algo$min + (algo$max - algo$min) * runif(length(algo$min)) sol2 <- nlminb(s0,OF,data = data, lower = data$min, upper = data$max, control = list(eval.max = 50000L, iter.max = 50000L)) max( abs(data$model(sol2$par, tm) - data$model(betaTRUE, tm)) ) sol2$objective @ Finally, we compare the yield curves resulting from several runs. (Recall that the number of runs is controlled by \texttt{nRuns}, which we have set initially.) <>= par(ps = 11, bty = "n", las = 1, tck = 0.01, mgp = c(3, 0.2, 0), mar = c(4, 4, 1, 1)) plot(tm, yM, xlab = "maturities in years", ylab = "yields in %") for (i in seq_len(nRuns)) { sol <- DEopt(OF = OF, algo = algo, data = data) lines(tm, data$model(sol$xbest,tm), col = "blue") s0 <- algo$min + (algo$max - algo$min) * runif(length(algo$min)) sol2 <- nlminb(s0, OF, data = data, lower = data$min, upper = data$max, control = list(eval.max = 50000L, iter.max = 50000L)) lines(tm, data$model(sol2$par,tm), col = "darkgreen", lty = 2) } legend(x = "topright", legend = c("true yields", "DE", "nlminb"), col = c("black","blue","darkgreen"), pch = c(1,NA,NA), lty = c(0,1,2), bg = "white") @ \section{Fitting the NSS model to given bond prices} \begin{framed} \noindent The section was removed to reduce the build-time of the package. The examples were moved to the `NMOF manual' (see \url{http://enricoschumann.net/NMOF.htm}). The code is in the subdirectory \texttt{NMOFex}; to show the code in \textsf{R}, you can use the function \texttt{system.file}. <>= whereToLook <- system.file("NMOFex/NMOFman.R", package = "NMOF") file.show(whereToLook, title = "NMOF examples") @ \end{framed} \section{Fitting the NSS model to given yields-to-maturity} \begin{framed} \noindent The section was removed to reduce the build-time of the package. The examples were moved to the `NMOF manual' (see \url{http://enricoschumann.net/NMOF.htm}). The code is in the subdirectory \texttt{NMOFex}; to show the code in \textsf{R}, you can use the function \texttt{system.file}. <>= whereToLook <- system.file("NMOFex/NMOFman.R", package = "NMOF") file.show(whereToLook, title = "NMOF examples") @ \end{framed} \bibliographystyle{plainnat} \bibliography{NMOF} \end{document}