Maximum Likelihood Estimation for parametric linear regression models

Function to compute maximum likelihood estimators (MLE) of regression parameters of any distribution implemented in R with covariates (linear predictors).

Usage

maxlogLreg(
  formulas,
  y_dist,
  support = NULL,
  data = NULL,
  subset = NULL,
  fixed = NULL,
  link = NULL,
  optimizer = "nlminb",
  start = NULL,
  lower = NULL,
  upper = NULL,
  inequalities = NULL,
  control = NULL,
  silent = FALSE,
  StdE_method = c("optim", "numDeriv"),
  ...
)

Arguments

formulas

a list of formula objects. Each element must have an ~, with the terms on the right separated by + operators. The response variable on the left side is optional. Linear predictor of each parameter must be specified with the name of the parameter followed by the suffix '.fo'. See the examples below for further illustration.

y_dist

a formula object that specifies the distribution of the response variable. On the left side of ~ must be the response, and in the right side must be the name o de probability density/mass function. See the section Details and the examples below for further illustration.

support

a list with the following entries:

interval: a two dimensional atomic vector indicating the set of possible values of a random variable having the distribution specified in y_dist.
type: character indicating if distribution has a discrete or a continous random variable.

data

an optional data frame containing the variables in the model. If data is not specified, the variables are taken from the environment from which maxlogLreg is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

fixed

a list with fixed/known parameters of distribution of interest. Fixed parameters must be passed with its name and its value (known).

link

a list with names of parameters to be linked, and names of the link function object. For names of parameters, please visit documentation of density/mass function. There are three link functions available: log_link, logit_link and NegInv_link. Take into account: the order used in argument over corresponds to the order in argument link.

optimizer

a length-one character vector with the name of optimization routine. nlminb, optim and DEoptim are available; nlminb is the default routine.

start

a numeric vector with initial values for the parameters to be estimated. Zero is the default value.

lower

a numeric vector with lower bounds, with the same lenght of argument start (for box-constrained optimization). -Inf is the default value.

upper

a numeric vector with upper bounds, with the same lenght of argument start (for box-constrained optimization). Inf is the default value.

inequalities

a character vector with the inequality constrains for the distribution parameters.

control

control parameters of the optimization routine. Please, visit documentation of selected optimizer for further information.

silent

logical. If TRUE, warnings of maxlogL are suppressed.

StdE_method

a length-one character vector with the routine for Hessian matrix computation. The This is needed for standard error estimation. The options available are "optim" and "numDeriv". For further information, visit optim or hessian.

...

Further arguments to be supplied to the optimization routine.

Value

A list with class maxlogL containing the following lists:

fit

A list with output information about estimation and method used.

inputs

A list with all input arguments.

outputs

A list with additional information. The most important outputs are:

npar: number of parameters.
n: sample size
Stde_method: standard error computation method.
b_lenght: a list with the number of regression parameters.
design_matrix: a list with the $\mathbf{X}$ matrix for each parameter, the response values (called y) and the censorship matrix (called status). See the Details section for further information.

Details

maxlogLreg computes programmatically the log-likelihood (log L) function corresponding for the following model:

$$ y_i \stackrel{iid.}{\sim} \mathcal{D}(\theta_{i1},\theta_{i2},\dots, \theta_{ij}, \dots, \theta_{ik}) $$ $$ g(\boldsymbol{\theta}_{j}) = \boldsymbol{\eta}_{j} = \mathbf{X}_j^\top \boldsymbol{\beta}_j, $$

where,

$g_k(\cdot)$ is the $k$-th link function.
$\boldsymbol{\eta}_{j}$ is the value of the linear predictor for the $j^th$ for all the observations.
$\boldsymbol{\beta}_j = (\beta_{0j}, \beta_{1j},\dots, \beta_{(p_j-1)j})^\top$ are the fixed effects vector, where $p_j$ is the number of parameters in linear predictor $j$ and $\mathbf{X}_j$ is a known design matrix of order $n\times p_j$. These terms are specified in formulas argument.
$\mathcal{D}$ is the distribution specified in argument y_dist.

Then, maxlogLreg maximizes the log L through optim, nlminb or DEoptim. maxlogLreg generates an S3 object of class maxlogL.

Estimation with censorship can be handled with Surv objects (see example 2). The output object stores the corresponding censorship matrix, defined as $r_{il} = 1$ if sample unit $i$ has status $l$, or $r_{il} = 0$ in other case. $i=1,2,\dots,n$ and $l=1,2,3$ ($l=1$: observation status, $l=2$: right censorship status, $l=3$: left censorship status).

Note

The following generic functions can be used with a maxlogL object: summary, print, logLik, AIC.
Noncentrality parameters must be named as ncp in the distribution.

References

Nelder JA, Mead R (1965). “A Simplex Method for Function Minimization.” The Computer Journal, 7(4), 308--313. ISSN 0010-4620, doi:10.1093/comjnl/7.4.308 , https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/7.4.308.

Fox PA, Hall AP, Schryer NL (1978). “The PORT Mathematical Subroutine Library.” ACM Transactions on Mathematical Software, 4(2), 104--126. ISSN 00983500, doi:10.1145/355780.355783 , https://dl.acm.org/doi/10.1145/355780.355783.

Nash JC (1979). Compact Numerical Methods for Computers. Linear Algebra and Function Minimisation, 2nd Edition edition. Adam Hilger, Bristol.

Dennis JE, Gay DM, Walsh RE (1981). “An Adaptive Nonlinear Least-Squares Algorithm.” ACM Transactions on Mathematical Software, 7(3), 348--368. ISSN 00983500, doi:10.1145/355958.355965 , https://dl.acm.org/doi/10.1145/355958.355965.

Author

Jaime Mosquera Gutiérrez, jmosquerag@unal.edu.co

Examples

library(EstimationTools)

#--------------------------------------------------------------------------------
# Example 1: Estimation in simulated normal distribution
n <- 1000
x <- runif(n = n, -5, 6)
y <- rnorm(n = n, mean = -2 + 3 * x, sd = exp(1 + 0.3* x))
norm_data <- data.frame(y = y, x = x)

# It does not matter the order of distribution parameters
formulas <- list(sd.fo = ~ x, mean.fo = ~ x)
support <- list(interval = c(-Inf, Inf), type = 'continuous')

norm_mod <- maxlogLreg(formulas, y_dist = y ~ dnorm, support = support,
                       data = norm_data,
                       link = list(over = "sd", fun = "log_link"))
summary(norm_mod)
#> _______________________________________________________________
#> Optimization routine: nlminb 
#> Standard Error calculation: Hessian from optim 
#> _______________________________________________________________
#>        AIC      BIC
#>   5042.371 5062.002
#> _______________________________________________________________
#> Fixed effects for mean
#> ---------------------------------------------------------------
#>              Estimate Std. Error Z value  Pr(>|z|)    
#> (Intercept) -2.032566   0.108962 -18.654 < 2.2e-16 ***
#> x            2.985870   0.029282 101.968 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Fixed effects for log(sd) 
#> ---------------------------------------------------------------
#>              Estimate Std. Error Z value  Pr(>|z|)    
#> (Intercept) 0.9588786  0.0226100  42.410 < 2.2e-16 ***
#> x           0.2927067  0.0070322  41.624 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators 
#> ---


#--------------------------------------------------------------------------------
# Example 2: Fitting with censorship
# (data from https://www.itl.nist.gov/div898/handbook/apr/section4/apr413.htm)

failures <- c(55, 187, 216, 240, 244, 335, 361, 373, 375, 386)
fails <- c(failures, rep(500, 10))
status <- c(rep(1, length(failures)), rep(0, 10))
Wei_data <- data.frame(fails = fails, status = status)

# Formulas with linear predictors
formulas <- list(scale.fo=~1, shape.fo=~1)
support <- list(interval = c(0, Inf), type = 'continuous')

# Bounds for optimization. Upper bound set with default values (Inf)
start <- list(
  scale = list(Intercept = 100),
  shape = list(Intercept = 10)
)
lower <- list(
  scale = list(Intercept = 0),
  shape = list(Intercept = 0)
)

mod_weibull <- maxlogLreg(formulas, y_dist = Surv(fails, status) ~ dweibull,
                          support = c(0, Inf), start = start,
                          lower = lower, data = Wei_data)
summary(mod_weibull)
#> _______________________________________________________________
#> Optimization routine: nlminb 
#> Standard Error calculation: Hessian from optim 
#> _______________________________________________________________
#>        AIC      BIC
#>   154.2437 156.2352
#> _______________________________________________________________
#> Fixed effects for shape
#> ---------------------------------------------------------------
#>             Estimate Std. Error Z value  Pr(>|z|)    
#> (Intercept)  1.72563    0.50341  3.4279 0.0006083 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Fixed effects for scale
#> ---------------------------------------------------------------
#>             Estimate Std. Error Z value  Pr(>|z|)    
#> (Intercept)   606.00     124.43  4.8701 1.115e-06 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators 
#> ---


#--------------------------------------------------------------------------------