Replicate records in a reference sample based on its fitted weights

This function replicates each entry in a reference sample according to its fitted weight. This is useful for comparing the results of multiple replication algorithms, or for generating a full synthetic population from the result of an ml_fit object. Note that the individual and group ids of the synthetic population differ from those in the original reference sample, and the total number of replicated groups is always equal or very close to the sum of the fitted group weights.

ml_replicate(
  ml_fit,
  algorithm = c("pp", "trs", "round"),
  verbose = FALSE,
  .keep_original_ids = FALSE
)

Arguments

ml_fit: An ml_fit object created by the ml_fit() family of functions.
algorithm: Replication algorithm to use. "trs" is the 'Truncate, replicate, sample' integerisation algorithm proposed by Lovelace and Ballas (2013), "pp" is weighted sampling with replacement, and "round" is simple rounding of the fitted weights.
verbose: If TRUE, print diagnostic output.
.keep_original_ids: If TRUE, the original individual and group ids of the reference sample are kept with the suffix '_old'.
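
The three algorithm options can be illustrated on a plain weight vector. The following is a minimal base R sketch under stated assumptions, not the package's implementation: the helper names int_round, int_pp and int_trs are hypothetical, and the weights are made up for illustration.

```r
# Hypothetical helpers illustrating the three integerisation strategies.
# They are NOT the functions used internally by ml_replicate().

# "round": round each weight to the nearest integer.
int_round <- function(w) round(w)

# "pp": draw round(sum(w)) records with replacement, with probability
# proportional to the weight, and count how often each record is drawn.
int_pp <- function(w) {
  n <- round(sum(w))
  draws <- sample.int(length(w), size = n, replace = TRUE, prob = w)
  tabulate(draws, nbins = length(w))
}

# "trs": truncate each weight, then top up by sampling records without
# replacement, with probability proportional to the fractional remainder.
int_trs <- function(w) {
  counts <- floor(w)              # truncate
  frac <- w - counts              # fractional remainders
  n_extra <- round(sum(frac))     # units still to allocate
  if (n_extra > 0) {
    top_up <- sample.int(length(w), size = n_extra, replace = FALSE, prob = frac)
    counts[top_up] <- counts[top_up] + 1
  }
  counts
}

# Made-up fitted weights that sum to 5.
w <- c(1.4, 0.3, 2.7, 0.6)

set.seed(1)
int_round(w)
int_pp(w)
int_trs(w)
```

In this sketch, "trs" never moves a count more than one unit away from its truncated weight, whereas "pp" can deviate further because every copy is drawn at random.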

Value

A replicated sample as a data.frame, in the same format as the reference sample of the input ml_fit object.
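
To make the shape of the result concrete, here is a rough base R sketch of how grouped records might be expanded once integer counts per group are known. The column names (hh_id, p_id, age), the counts, and the id-renumbering scheme are assumptions for illustration only; the '_old' suffix merely mirrors the documented .keep_original_ids behaviour.

```r
# Toy reference sample with two households (hypothetical columns).
ref <- data.frame(
  hh_id = c(1L, 1L, 2L),
  p_id  = c(1L, 2L, 3L),
  age   = c(34L, 36L, 71L)
)

# Assumed integer replication counts per household, keyed by household id.
counts <- c(`1` = 2L, `2` = 3L)

# One list element per original household.
groups <- split(ref, ref$hh_id)

# One entry per replicated household, holding the original household id.
copies <- rep(names(counts), times = counts)

# Copy each household as often as requested and assign a fresh group id.
syn <- do.call(rbind, lapply(seq_along(copies), function(i) {
  g <- groups[[copies[i]]]
  g$hh_id_old <- g$hh_id   # keep the original group id
  g$hh_id <- i             # new, unique group id
  g
}))

# Assign fresh individual ids, keeping the originals alongside.
syn$p_id_old <- syn$p_id
syn$p_id <- seq_len(nrow(syn))
rownames(syn) <- NULL

syn
```

The number of distinct new group ids equals the sum of the counts, which corresponds to the statement that the number of replicated groups matches the sum of the fitted group weights.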

References

Lovelace, R., & Ballas, D. (2013). ‘Truncate, replicate, sample’: A method for creating integer weights for spatial microsimulation. Computers, Environment and Urban Systems, 41, 1-11.

Examples

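# Locate the bundled "Tiny" toy problem and fit weights to it
path <- toy_example("Tiny")
fit <- ml_fit(ml_problem = readRDS(path), algorithm = "entropy_o")
# Replicate the reference sample using the 'Truncate, replicate, sample' algorithm
syn_pop <- ml_replicate(fit, algorithm = "trs")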
syn_pop
#> # A tibble: 261 × 5
#>     HHNR   PNR  APER CAR   WKSTAT
#>    <int> <int> <int> <fct> <fct> 
#>  1     1     1     3 0     1     
#>  2     1     2     3 0     2     
#>  3     1     3     3 0     3     
#>  4     2     4     3 0     1     
#>  5     2     5     3 0     2     
#>  6     2     6     3 0     3     
#>  7     3     7     3 0     1     
#>  8     3     8     3 0     2     
#>  9     3     9     3 0     3     
#> 10     4    10     3 0     1     
#> # ℹ 251 more rows