Performs stepwise model selection (forward, backward, or both directions) for a chosen regression approach. Returns a summary table of evaluation metrics (AIC, BIC, log-likelihood, deviance) together with the best model.
Arguments
- data
A data frame containing the outcome and predictor variables.
- outcome
A character string indicating the outcome variable.
- exposures
A character vector of predictor variables to consider in the model.
- approach
Regression method. One of: "logit", "log-binomial", "poisson", "robpoisson", "negbin", or "linear".
- direction
Stepwise selection direction. One of: "forward" (default), "backward", or "both".
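To make the three direction values concrete, here is a minimal sketch using base R's stats::step() on the built-in mtcars data. It only illustrates what forward, backward, and both mean in AIC-based stepwise selection; it is not taken from select_models() and may differ from how that function searches. The dataset and the candidate predictors (wt, hp, qsec) are assumptions chosen for the demo.

```r
# Illustrative only: base R's step(), not the internals of select_models().
# mtcars and the candidate predictors below are assumptions for this demo.
candidates <- ~ wt + hp + qsec

null_fit <- lm(mpg ~ 1, data = mtcars)               # intercept-only model
full_fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # all candidates included

# Forward: start from the null model and add terms while AIC improves.
fwd <- step(null_fit, scope = candidates, direction = "forward", trace = 0)

# Backward: start from the full model and drop terms while AIC improves.
bwd <- step(full_fit, direction = "backward", trace = 0)

# Both: start from the null model; terms may be added or dropped at each step.
both <- step(null_fit, scope = candidates, direction = "both", trace = 0)

formula(fwd)
formula(bwd)
formula(both)
```

The three searches can end at different models on other data, which is one reason to compare them.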
Value
A list with the following components:
- results_table: A tibble summarising each tested model's metrics (AIC, BIC, deviance, log-likelihood, and adjusted R² if applicable).
- best_model: The best-fitting model object, chosen as the one with the lowest AIC.
- all_models: A named list of all fitted models.
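As a rough sketch of how such a return value could be assembled, the code below fits a few linear models on the built-in mtcars data, tabulates the same kinds of metrics with base R, and picks the model with the lowest AIC. The object names (models, metrics, best) and the dataset are hypothetical and chosen for illustration; this is not the code behind select_models().

```r
# Illustrative only: names and data are assumptions, not the package's code.
models <- list(
  null  = lm(mpg ~ 1, data = mtcars),
  wt    = lm(mpg ~ wt, data = mtcars),
  wt_hp = lm(mpg ~ wt + hp, data = mtcars)
)

# One row per model, with metrics computed by standard stats functions.
metrics <- data.frame(
  model    = names(models),
  formula  = vapply(models, function(m) deparse(formula(m)), character(1)),
  AIC      = vapply(models, AIC, numeric(1)),
  BIC      = vapply(models, BIC, numeric(1)),
  logLik   = vapply(models, function(m) as.numeric(logLik(m)), numeric(1)),
  deviance = vapply(models, deviance, numeric(1)),
  adj_r2   = vapply(models, function(m) summary(m)$adj.r.squared, numeric(1)),
  row.names = NULL
)

# The "best" model here is simply the one with the smallest AIC.
best <- models[[which.min(metrics$AIC)]]

metrics
formula(best)
```

Lower AIC and BIC indicate a better trade-off between fit and complexity. Adjusted R² applies only to linear models, which is why the table reports it only if applicable.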
Examples
data <- data_PimaIndiansDiabetes
stepwise <- select_models(
  data = data,
  outcome = "glucose",
  exposures = c("age", "pregnant", "mass"),
  approach = "linear",
  direction = "forward"
)
summary(stepwise)
#>               Length Class  Mode
#> results_table  8     tbl_df list
#> best_model    13     lm     list
#> all_models     3     -none- list
stepwise$results_table
#> # A tibble: 3 × 8
#>   model_id formula              n_predictors   AIC   BIC logLik deviance adj_r2
#>      <dbl> <chr>                       <int> <dbl> <dbl>  <dbl>    <dbl>  <dbl>
#> 1        1 glucose ~ 1                     1 7386. 7395. -3691.  710508. 0
#> 2        2 glucose ~ mass                  2 7242. 7256. -3618.  665157. 0.0529
#> 3        3 glucose ~ mass + age            2 7190. 7209. -3591.  618925. 0.118
stepwise$best_model
#>
#> Call:
#> lm(formula = glucose ~ mass + age, data = data)
#>
#> Coefficients:
#> (Intercept)         mass          age
#>     67.0644       1.0029       0.6702
#>