This function takes delay data and formats it as the data list used by the primarycensored Stan model.

Usage

pcd_as_stan_data(
  data,
  delay = "delay",
  delay_upper = "delay_upper",
  n = "n",
  pwindow = "pwindow",
  relative_obs_time = "relative_obs_time",
  dist_id,
  primary_id,
  param_bounds,
  primary_param_bounds,
  priors,
  primary_priors,
  compute_log_lik = FALSE,
  use_reduce_sum = FALSE,
  truncation_check_multiplier = 2
)

Arguments

data

A data frame containing the delay data.

delay

Column name for observed delays (default: "delay")

delay_upper

Column name for upper bound of delays (default: "delay_upper")

n

Column name for count of observations (default: "n")

pwindow

Column name for primary window (default: "pwindow")

relative_obs_time

Column name for relative observation time (default: "relative_obs_time")

dist_id

Integer identifying the delay distribution: 1 = Lognormal, 2 = Gamma, 3 = Weibull, 4 = Exponential, 5 = Generalized Gamma, 6 = Negative Binomial, 7 = Poisson, 8 = Bernoulli, 9 = Beta, 10 = Binomial, 11 = Categorical, 12 = Cauchy, 13 = Chi-square, 14 = Dirichlet, 15 = Gumbel, 16 = Inverse Gamma, 17 = Logistic

primary_id

Integer identifying the primary distribution: 1 = Uniform, 2 = Exponential growth

param_bounds

A list with elements lower and upper, each a numeric vector specifying bounds for the delay distribution parameters.

primary_param_bounds

A list with elements lower and upper, each a numeric vector specifying bounds for the primary distribution parameters.

priors

A list with elements location and scale, each a numeric vector specifying priors for the delay distribution parameters.

primary_priors

A list with elements location and scale, each a numeric vector specifying priors for the primary distribution parameters.

compute_log_lik

Logical; compute log likelihood? (default: FALSE)

use_reduce_sum

Logical; use reduce_sum for performance? (default: FALSE)

truncation_check_multiplier

Numeric multiplier to use for checking if the truncation time D is appropriate relative to the maximum delay for each unique D value. Set to NULL to skip the check. Default is 2.
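
The check described for truncation_check_multiplier can be pictured with a few lines of base R. This is only a sketch inferred from the argument description and the message shown in the example output, not the package's actual implementation. The helper name check_truncation_sketch is hypothetical, and the sketch assumes that relative_obs_time plays the role of D.

```r
# Hypothetical helper (not part of primarycensored): flags each unique
# truncation time D that exceeds multiplier * (largest delay observed
# among the rows sharing that D).
check_truncation_sketch <- function(delay, D, multiplier = 2) {
  if (is.null(multiplier)) {
    return(invisible(NULL)) # NULL multiplier: skip the check
  }
  flagged <- numeric(0)
  for (d in unique(D[is.finite(D)])) {
    max_delay <- max(delay[D == d])
    if (d > multiplier * max_delay) {
      message(
        "D (", d, ") is larger than ", multiplier,
        " times the maximum observed delay (", max_delay, ")."
      )
      flagged <- c(flagged, d)
    }
  }
  invisible(flagged)
}

# Same delays and observation times as the example data on this page.
flagged <- check_truncation_sketch(
  delay = c(1, 2, 3),
  D = c(10, 10, 10)
)
```

With these inputs the single D value is compared against 2 times the largest delay, so it is flagged. Passing multiplier = NULL returns without checking anything.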

Value

A list containing the data formatted for use with pcd_cmdstan_model()

See also

Modelling wrappers for external fitting packages: fitdistdoublecens(), pcd_cmdstan_model()

Examples

data <- data.frame(
  delay = c(1, 2, 3),
  delay_upper = c(2, 3, 4),
  n = c(10, 20, 15),
  pwindow = c(1, 1, 2),
  relative_obs_time = c(10, 10, 10)
)
stan_data <- pcd_as_stan_data(
  data,
  dist_id = 1,
  primary_id = 1,
  param_bounds = list(lower = c(0, 0), upper = c(10, 10)),
  primary_param_bounds = list(lower = numeric(0), upper = numeric(0)),
  priors = list(location = c(1, 1), scale = c(1, 1)),
  primary_priors = list(location = numeric(0), scale = numeric(0))
)
#> The truncation time D (10) is larger than 2 times the maximum observed delay (3). Consider setting D to Inf for better efficiency with minimal accuracy cost for this case.
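
The n column means each row of data stands for a count of observations that share the same delay, window, and observation time. If records start out with one row per individual, they could be collapsed into that shape before calling pcd_as_stan_data(). A minimal base R sketch, assuming hypothetical individual-level records that already use the default column names:

```r
# Hypothetical individual-level records, one row per observation.
records <- data.frame(
  delay = c(1, 1, 2, 2, 2, 3),
  delay_upper = c(2, 2, 3, 3, 3, 4),
  pwindow = c(1, 1, 1, 1, 1, 1),
  relative_obs_time = c(10, 10, 10, 10, 10, 10)
)

# Give every record a count of 1, then sum the counts over identical
# combinations of the other columns; the sum becomes the n column.
records$n <- 1
counts <- aggregate(
  n ~ delay + delay_upper + pwindow + relative_obs_time,
  data = records,
  FUN = sum
)
counts
```

The resulting counts data frame has one row per distinct combination and could then be passed as the data argument in place of the hand-written data frame above.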