Bootstrap implementation
Jaime Mosquera
2023-10-23
Source:vignettes/Bootstrap_implementation.Rmd
Bootstrap_implementation.Rmd
Computation of standard errors
Objects of maxlogL
class (outputs from
maxlogL
and maxlogLreg
) stores the estimated
parameters of probability density/mass functions by Maximum Likelihood.
The variance-covariance matrix is computed from Fisher information
matrix, which is obtained by means of the Inverse Hessian matrix of
estimators:
\[\begin{equation} Var(\hat{\boldsymbol{\theta}}) = \mathcal{J}^{-1}(\hat{\boldsymbol{\theta}}) = C(\hat{\boldsymbol{\theta}}), \end{equation}\]
where \(\mathcal{J}(\hat{\boldsymbol{\theta}})\) is the observed Fisher Information Matrix. Hence, the standard errors can be calculated as the square root of the diagonal elements of matrix \(C\), as follows:
\[\begin{equation} SE(\hat{\boldsymbol{\theta}}) = \sqrt{C_{jj}(\hat{\boldsymbol{\theta}})}, \end{equation}\]
To install the package, type the following commands:
if (!require('devtools')) install.packages('devtools')
devtools::install_github('Jaimemosg/EstimationTools', force = TRUE)
In EstimationTools Hessian matrix is computed in the following way:
- If
StdE_Method = optim
, it is estimated through theoptim
function (with optionhessian = TRUE
under the hood inmaxlogL
ormaxlogLreg
function). - If the previous implementation fails or if the user chooses
StdE_Method = numDeriv
, it is calculated withhessian
function from numDeriv package. - If both of the previous methods fail, then standard errors are
computed by bootstrapping with the function
bootstrap_maxlogL
.
Additionally, EstimationTools allows implementation of bootstrap for standard error estimation, even if the Hessian computation does not fail.
Standard Error with maxlogL
function
Lets fit the following distribution:
\[ \begin{aligned} X &\sim N(\mu, \:\sigma^2) \\ \mu &= 160 \quad (\verb|mean|) \\ \sigma &= 6 \quad (\verb|sd|) \end{aligned} \]
The following chunk illustrates the fitting with Hessian computation
via optim
:
library(EstimationTools)
x <- rnorm(n = 10000, mean = 160, sd = 6)
theta_1 <- maxlogL(x = x, dist = 'dnorm', control = list(trace = 1),
link = list(over = "sd", fun = "log_link"),
fixed = list(mean = 160))
#> 0: 43065.241: 1.00000
#> 1: 32420.631: 2.00000
#> 2: 32167.874: 1.91435
#> 3: 32017.524: 1.74893
#> 4: 32006.735: 1.78620
#> 5: 32006.535: 1.78186
#> 6: 32006.534: 1.78171
#> 7: 32006.534: 1.78171
summary(theta_1)
#> _______________________________________________________________
#> Optimization routine: nlminb
#> Standard Error calculation: Hessian from optim
#> _______________________________________________________________
#> AIC BIC
#> 64013.07 64013.07
#> _______________________________________________________________
#> Estimate Std. Error Z value Pr(>|z|)
#> sd 5.940 0.042 141.4 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators
#> ---
## Hessian
print(theta_1$fit$hessian)
#> [,1]
#> [1,] 566.8292
## Standard errors
print(theta_1$fit$StdE)
#> [1] 0.04200238
print(theta_1$outputs$StdE_Method)
#> [1] "Hessian from optim"
Note that Hessian was computed with no issues. Now, lets check the
aforementioned feature in maxlogL
: the user can implement
bootstrap algorithm available in bootstrap_maxlogL
function. To illustrate this, we are going to create another object
theta_2
:
# Bootstrap
theta_2 <- maxlogL(x = x, dist = 'dnorm', control = list(trace = 1),
link = list(over = "sd", fun = "log_link"),
fixed = list(mean = 160))
#> 0: 43065.241: 1.00000
#> 1: 32420.631: 2.00000
#> 2: 32167.874: 1.91435
#> 3: 32017.524: 1.74893
#> 4: 32006.735: 1.78620
#> 5: 32006.535: 1.78186
#> 6: 32006.534: 1.78171
#> 7: 32006.534: 1.78171
bootstrap_maxlogL(theta_2, R = 200)
#>
#> ...Bootstrap computation of Standard Error. Please, wait a few minutes...
#>
#> --> Done <---
summary(theta_2)
#> _______________________________________________________________
#> Optimization routine: nlminb
#> Standard Error calculation: Bootstrap
#> _______________________________________________________________
#> AIC BIC
#> 64013.07 64013.07
#> _______________________________________________________________
#> Estimate Std. Error Z value Pr(>|z|)
#> sd 5.94003 0.04418 134.4 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators
#> ---
## Hessian
print(theta_2$fit$hessian)
#> [,1]
#> [1,] 566.8292
## Standard errors
print(theta_2$fit$StdE)
#> [1] 0.04418072
print(theta_2$outputs$StdE_Method)
#> [1] "Bootstrap"
Notice that Standard Errors calculated with optim
(\(0.042002\)) and those calculated with
bootstrap implementation (\(0.044181\))
are approximately equals, but no identical.