Maximum Likelihood Estimation for parametric linear regression models
Source:R/maxlogLreg.R
maxlogLreg.Rd
Function to compute maximum likelihood estimators (MLE) of regression parameters
of any distribution implemented in R
with covariates (linear predictors).
Usage
maxlogLreg(
formulas,
y_dist,
support = NULL,
data = NULL,
subset = NULL,
fixed = NULL,
link = NULL,
optimizer = "nlminb",
start = NULL,
lower = NULL,
upper = NULL,
inequalities = NULL,
control = NULL,
silent = FALSE,
StdE_method = c("optim", "numDeriv"),
...
)
Arguments
- formulas
a list of formula objects. Each element must have an
~
, with the terms on the right separated by+
operators. The response variable on the left side is optional. Linear predictor of each parameter must be specified with the name of the parameter followed by the suffix'.fo'
. See the examples below for further illustration.- y_dist
a formula object that specifies the distribution of the response variable. On the left side of
~
must be the response, and in the right side must be the name o de probability density/mass function. See the section Details and the examples below for further illustration.- support
a list with the following entries:
interval
: a two dimensional atomic vector indicating the set of possible values of a random variable having the distribution specified iny_dist
.type
: character indicating if distribution has adiscrete
or acontinous
random variable.
- data
an optional data frame containing the variables in the model. If data is not specified, the variables are taken from the environment from which
maxlogLreg
is called.- subset
an optional vector specifying a subset of observations to be used in the fitting process.
- fixed
a list with fixed/known parameters of distribution of interest. Fixed parameters must be passed with its name and its value (known).
- link
a list with names of parameters to be linked, and names of the link function object. For names of parameters, please visit documentation of density/mass function. There are three link functions available:
log_link
,logit_link
andNegInv_link
. Take into account: the order used in argumentover
corresponds to the order in argumentlink
.- optimizer
a length-one character vector with the name of optimization routine.
nlminb
,optim
andDEoptim
are available;nlminb
is the default routine.- start
a numeric vector with initial values for the parameters to be estimated. Zero is the default value.
- lower
a numeric vector with lower bounds, with the same lenght of argument
start
(for box-constrained optimization).-Inf
is the default value.- upper
a numeric vector with upper bounds, with the same lenght of argument
start
(for box-constrained optimization).Inf
is the default value.- inequalities
a character vector with the inequality constrains for the distribution parameters.
- control
control parameters of the optimization routine. Please, visit documentation of selected optimizer for further information.
- silent
logical. If TRUE, warnings of
maxlogL
are suppressed.- StdE_method
a length-one character vector with the routine for Hessian matrix computation. The This is needed for standard error estimation. The options available are
"optim"
and"numDeriv"
. For further information, visitoptim
orhessian
.- ...
Further arguments to be supplied to the optimization routine.
Value
A list with class maxlogL
containing the following
lists:
- fit
A list with output information about estimation and method used.
- inputs
A list with all input arguments.
- outputs
A list with additional information. The most important outputs are:
npar
: number of parameters.n
: sample sizeStde_method
: standard error computation method.b_lenght
: a list with the number of regression parameters.design_matrix
: a list with the \(\mathbf{X}\) matrix for each parameter, the response values (calledy
) and the censorship matrix (calledstatus
). See the Details section for further information.
Details
maxlogLreg
computes programmatically the log-likelihood
(log L) function corresponding for the following model:
$$ y_i \stackrel{iid.}{\sim} \mathcal{D}(\theta_{i1},\theta_{i2},\dots, \theta_{ij}, \dots, \theta_{ik}) $$ $$ g(\boldsymbol{\theta}_{j}) = \boldsymbol{\eta}_{j} = \mathbf{X}_j^\top \boldsymbol{\beta}_j, $$
where,
\(g_k(\cdot)\) is the \(k\)-th link function.
\(\boldsymbol{\eta}_{j}\) is the value of the linear predictor for the $j^th$ for all the observations.
\(\boldsymbol{\beta}_j = (\beta_{0j}, \beta_{1j},\dots, \beta_{(p_j-1)j})^\top\) are the fixed effects vector, where \(p_j\) is the number of parameters in linear predictor \(j\) and \(\mathbf{X}_j\) is a known design matrix of order \(n\times p_j\). These terms are specified in
formulas
argument.\(\mathcal{D}\) is the distribution specified in argument
y_dist
.
Then, maxlogLreg
maximizes the log L through optim
,
nlminb
or DEoptim
. maxlogLreg
generates
an S3 object of class maxlogL
.
Estimation with censorship can be handled with Surv
objects
(see example 2). The output object stores the corresponding censorship matrix,
defined as \(r_{il} = 1\) if sample unit \(i\) has status \(l\), or
\(r_{il} = 0\) in other case. \(i=1,2,\dots,n\) and \(l=1,2,3\)
(\(l=1\): observation status, \(l=2\): right censorship status,
\(l=3\): left censorship status).
Note
The following generic functions can be used with a
maxlogL
object:summary, print, logLik, AIC
.Noncentrality parameters must be named as
ncp
in the distribution.
References
Nelder JA, Mead R (1965). “A Simplex Method for Function Minimization.” The Computer Journal, 7(4), 308--313. ISSN 0010-4620, doi:10.1093/comjnl/7.4.308 , https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/7.4.308.
Fox PA, Hall AP, Schryer NL (1978). “The PORT Mathematical Subroutine Library.” ACM Transactions on Mathematical Software, 4(2), 104--126. ISSN 00983500, doi:10.1145/355780.355783 , https://dl.acm.org/doi/10.1145/355780.355783.
Nash JC (1979). Compact Numerical Methods for Computers. Linear Algebra and Function Minimisation, 2nd Edition edition. Adam Hilger, Bristol.
Dennis JE, Gay DM, Walsh RE (1981). “An Adaptive Nonlinear Least-Squares Algorithm.” ACM Transactions on Mathematical Software, 7(3), 348--368. ISSN 00983500, doi:10.1145/355958.355965 , https://dl.acm.org/doi/10.1145/355958.355965.
See also
summary.maxlogL
, optim
, nlminb
,
DEoptim
, DEoptim.control
,
maxlogL
, bootstrap_maxlogL
Other maxlogL:
cum_hazard.maxlogL()
,
expected_value.maxlogL()
,
maxlogL()
Author
Jaime Mosquera Gutiérrez, jmosquerag@unal.edu.co
Examples
library(EstimationTools)
#--------------------------------------------------------------------------------
# Example 1: Estimation in simulated normal distribution
n <- 1000
x <- runif(n = n, -5, 6)
y <- rnorm(n = n, mean = -2 + 3 * x, sd = exp(1 + 0.3* x))
norm_data <- data.frame(y = y, x = x)
# It does not matter the order of distribution parameters
formulas <- list(sd.fo = ~ x, mean.fo = ~ x)
support <- list(interval = c(-Inf, Inf), type = 'continuous')
norm_mod <- maxlogLreg(formulas, y_dist = y ~ dnorm, support = support,
data = norm_data,
link = list(over = "sd", fun = "log_link"))
summary(norm_mod)
#> _______________________________________________________________
#> Optimization routine: nlminb
#> Standard Error calculation: Hessian from optim
#> _______________________________________________________________
#> AIC BIC
#> 5042.371 5062.002
#> _______________________________________________________________
#> Fixed effects for mean
#> ---------------------------------------------------------------
#> Estimate Std. Error Z value Pr(>|z|)
#> (Intercept) -2.032566 0.108962 -18.654 < 2.2e-16 ***
#> x 2.985870 0.029282 101.968 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Fixed effects for log(sd)
#> ---------------------------------------------------------------
#> Estimate Std. Error Z value Pr(>|z|)
#> (Intercept) 0.9588786 0.0226100 42.410 < 2.2e-16 ***
#> x 0.2927067 0.0070322 41.624 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators
#> ---
#--------------------------------------------------------------------------------
# Example 2: Fitting with censorship
# (data from https://www.itl.nist.gov/div898/handbook/apr/section4/apr413.htm)
failures <- c(55, 187, 216, 240, 244, 335, 361, 373, 375, 386)
fails <- c(failures, rep(500, 10))
status <- c(rep(1, length(failures)), rep(0, 10))
Wei_data <- data.frame(fails = fails, status = status)
# Formulas with linear predictors
formulas <- list(scale.fo=~1, shape.fo=~1)
support <- list(interval = c(0, Inf), type = 'continuous')
# Bounds for optimization. Upper bound set with default values (Inf)
start <- list(
scale = list(Intercept = 100),
shape = list(Intercept = 10)
)
lower <- list(
scale = list(Intercept = 0),
shape = list(Intercept = 0)
)
mod_weibull <- maxlogLreg(formulas, y_dist = Surv(fails, status) ~ dweibull,
support = c(0, Inf), start = start,
lower = lower, data = Wei_data)
summary(mod_weibull)
#> _______________________________________________________________
#> Optimization routine: nlminb
#> Standard Error calculation: Hessian from optim
#> _______________________________________________________________
#> AIC BIC
#> 154.2437 156.2352
#> _______________________________________________________________
#> Fixed effects for shape
#> ---------------------------------------------------------------
#> Estimate Std. Error Z value Pr(>|z|)
#> (Intercept) 1.72563 0.50341 3.4279 0.0006083 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Fixed effects for scale
#> ---------------------------------------------------------------
#> Estimate Std. Error Z value Pr(>|z|)
#> (Intercept) 606.00 124.43 4.8701 1.115e-06 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> _______________________________________________________________
#> Note: p-values valid under asymptotic normality of estimators
#> ---
#--------------------------------------------------------------------------------