Descriptive Summary Table for Study Characteristics (User-Friendly)
Source:R/descriptive_table.R
descriptive_table.Rd
Creates a clean, publication-ready summary table using `gtsummary::tbl_summary()`. Designed for beginner analysts, this function applies sensible defaults and flexible options to display categorical and continuous variables with or without stratification. It supports one-line summaries of dichotomous variables, handles missing data gracefully, and includes an optional "Overall" column for comparison.
Arguments
- data
A data frame containing your study dataset.
- exposures
A character vector specifying the variable names (columns) in `data` that should be included in the summary table. These can be categorical or continuous.
- by
Optional. A single character string giving the name of a grouping variable (e.g., outcome). If supplied, the table will show stratified summaries by this variable.
- percent
Character. Either `"column"` (default) or `"row"`. - `"column"` calculates percentages within each group defined by `by` (i.e., denominator = column total). - `"row"` calculates percentages across `by` groups (i.e., denominator = row total). If `by` is not specified, `"column"` is used and `"row"` is ignored.
- digits
Integer. Controls how many decimal places are shown for percentages and means. Defaults to 1.
- show_missing
Character. One of `"ifany"` (default) or `"no"`. - `"ifany"` shows missing value counts only when missing values exist. - `"no"` hides missing counts entirely.
- show_dichotomous
Character. One of `"all_levels"` (default) or `"single_row"`. - `"all_levels"` displays all levels of binary (dichotomous) variables. - `"single_row"` shows only one row (typically "Yes", "Present", or a user-defined level), making the table more compact.
- show_overall
Character. One of `"no"` (default), `"first"`, or `"last"`. If `by` is supplied: - `"first"` includes a column for overall summaries before the stratified columns. - `"last"` includes the overall column at the end. - `"no"` disables the overall column.
- statistic
Optional named vector of summary types for specific variables. For example, use `statistic = c(age = "mean", bmi = "median")` to override default summaries. Accepted values: `"mean"`, `"median"`, `"mode"`, `"count"`.
- value
Optional. A list of formulas specifying which level of a binary variable to show when `show_dichotomous = "single_row"`. For example, `value = list(sex ~ "Female")` will report only the "Female" row.
Value
A `gtsummary::tbl_summary` object with additional class `"descriptive_table"`. Can be printed, customized, merged, or exported.
Examples
# \donttest{
if (requireNamespace("mlbench", quietly = TRUE)) {
data("PimaIndiansDiabetes2", package = "mlbench")
library(dplyr)
pima <- PimaIndiansDiabetes2 |>
mutate(
diabetes = ifelse(diabetes == "pos", 1, 0),
bmi = cut(
mass,
breaks = c(-Inf, 18.5, 24.9, 29.9, Inf),
labels = c("Underweight", "Normal", "Overweight", "Obese")
)
)
descriptive_table(pima, exposures = c("age", "bmi"),
by = "diabetes")
}
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
Characteristic
0
N = 5001
1
N = 2681
1 Mean (SD); n (%)
# }