Estimate weights for a fitting problem

These functions reweight a reference sample to match constraints given by aggregate controls.

ml_fit() accepts an algorithm as an argument and calls the corresponding function. This is useful when the results of multiple algorithms are to be compared with each other.
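
As a rough sketch of such a comparison, the loop below runs every algorithm named in the usage on the "Tiny" toy problem and tabulates two of the documented result components. The names problem, fits and comparison are illustrative only, and the requireNamespace() guard is added so the snippet does nothing when the mlfit package is not installed.

```r
# Sketch only: compare all algorithms accepted by ml_fit() on one problem.
if (requireNamespace("mlfit", quietly = TRUE)) {
  library(mlfit)

  # Load the toy problem the same way the examples on this page do.
  problem <- readRDS(toy_example("Tiny"))

  # Algorithm names as listed in the ml_fit() usage.
  algorithms <- c("entropy_o", "dss", "ipu", "hipf")
  fits <- lapply(algorithms, function(a) ml_fit(problem, algorithm = a))
  names(fits) <- algorithms

  # Summarise each fit using the documented components.
  comparison <- data.frame(
    algorithm = algorithms,
    success = vapply(fits, function(f) f$success, logical(1)),
    max_abs_rel_residual = vapply(
      fits, function(f) max(abs(f$rel_residuals)), numeric(1)
    )
  )
  print(comparison)
}
```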

ml_fit(
  ml_problem,
  algorithm = c("entropy_o", "dss", "ipu", "hipf"),
  verbose = FALSE,
  ...,
  tol = 1e-06
)

is_ml_fit(x)

# S3 method for class 'ml_fit'
format(x, ...)

# S3 method for class 'ml_fit'
print(x, ...)

ml_fit_dss(
  ml_problem,
  method = c("raking", "linear", "logit"),
  ginv = gginv(),
  tol = 1e-06,
  verbose = FALSE
)

ml_fit_entropy_o(
  ml_problem,
  verbose = FALSE,
  tol = 1e-06,
  dfsane_args = list()
)

ml_fit_hipf(
  ml_problem,
  diff_tol = 16 * .Machine$double.eps,
  tol = 1e-06,
  maxiter = 2000,
  verbose = FALSE
)

ml_fit_ipu(
  ml_problem,
  diff_tol = 16 * .Machine$double.eps,
  tol = 1e-06,
  maxiter = 2000,
  verbose = FALSE
)

Arguments

ml_problem: A fitting problem created by ml_problem() or returned by flatten_ml_fit_problem().
algorithm: Algorithm to use, one of "entropy_o" (default), "dss", "ipu", or "hipf"
verbose: If TRUE, print diagnostic output.
...: Further parameters passed to the algorithm
tol: Tolerance; the algorithm is considered successful when all target values are reached within this tolerance.
x: An object
method: Calibration method, one of "raking" (default), "linear", or "logit"
ginv: Function that computes the Moore-Penrose pseudoinverse.
dfsane_args: Additional arguments (as a named list) passed to the BB::dfsane() function used internally for the optimization.
diff_tol: Tolerance; the algorithm stops when the relative difference of control values between iterations drops below this value.
maxiter: Maximum number of iterations.
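
To illustrate how tol, diff_tol and maxiter can interact in an iterative algorithm, here is a minimal base R sketch of single-level raking. It is not the package's implementation of ml_fit_ipu() or ml_fit_hipf(): the function rake(), the toy data and the exact form of the stopping tests are assumptions made for illustration.

```r
# Hypothetical toy data: 6 sampled units with two categorical attributes.
sex <- c("f", "f", "m", "m", "f", "m")
age <- c("young", "old", "young", "old", "old", "young")

# Indicator matrix with one column per control category.
X <- cbind(
  f = sex == "f", m = sex == "m",
  young = age == "young", old = age == "old"
) * 1
targets <- c(f = 60, m = 40, young = 55, old = 45)

# Illustrative raking loop (an assumption, not the mlfit code).
rake <- function(X, targets, tol = 1e-6,
                 diff_tol = 16 * .Machine$double.eps, maxiter = 2000) {
  w <- rep(1, nrow(X))
  fitted_old <- colSums(X * w)
  for (iter in seq_len(maxiter)) {
    # Scale the weights of the units in each category to hit its target.
    for (j in seq_len(ncol(X))) {
      in_cat <- X[, j] == 1
      w[in_cat] <- w[in_cat] * targets[[j]] / sum(w[in_cat])
    }
    fitted <- colSums(X * w)
    rel_residuals <- (fitted - targets) / targets
    # tol: every target is reached, so the fit counts as a success.
    if (all(abs(rel_residuals) < tol)) break
    # diff_tol: fitted control values no longer change, so stop early.
    if (max(abs(fitted - fitted_old) / fitted_old) < diff_tol) break
    fitted_old <- fitted
  }
  list(
    weights = w,
    iterations = iter,
    rel_residuals = rel_residuals,
    success = all(abs(rel_residuals) < tol)
  )
}

res <- rake(X, targets)
res$success
res$iterations
```

In this sketch tol decides whether the result counts as a success, while diff_tol and maxiter only bound the amount of work, so a run can stop without succeeding.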

Value

All functions return an object of class ml_fit, which is a named list under the hood. The class also reflects the function called; e.g., the return value of ml_fit_ipu() is additionally of class ml_fit_ipu.

All returned objects contain at least the following components, which can be accessed with $ or [[:

weights: Resulting weights, compatible with the original reference sample
tol: The input tolerance
iterations: The actual number of iterations required to obtain the result
flat: The flattened fitting problem, see flatten_ml_fit_problem()
flat_weights: Weights in terms of the flattened fitting problem
residuals: Absolute residuals
rel_residuals: Relative residuals
success: Are the residuals within the tolerance?

is_ml_fit() returns a logical.
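
The relationship between residuals, rel_residuals, tol and success can be sketched with the numbers printed in the Examples for the entropy_o fit. Two assumptions are made here that this page does not state explicitly: that a relative residual is the absolute residual divided by its target value, and that success is tested against the relative residuals. The variable names are illustrative.

```r
# Values copied from the entropy_o output shown in the Examples.
tol <- 1e-06
control_names <- c(
  "(Intercept)_g", "(Intercept)_i", "CAR_g_1", "WKSTAT_i_2", "WKSTAT_i_3"
)
residuals <- setNames(
  c(2.494232e-09, 3.887351e-09, -1.452455e-08, 3.906564e-10, -2.833261e-09),
  control_names
)
rel_residuals <- setNames(
  c(2.494227e-11, 1.495137e-11, -2.234546e-10, 6.010081e-12, -2.724287e-11),
  control_names
)

# Assumption: rel_residuals = residuals / target, so the targets can be
# recovered (up to printing precision) by dividing one by the other.
implied_targets <- residuals / rel_residuals
round(implied_targets)

# Assumption: success means all relative residuals lie within tol.
success <- all(abs(rel_residuals) < tol)
success
```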

References

Deville, J.-C. and Särndal, C.-E. (1992) Calibration estimators in survey sampling. Journal of the American Statistical Association, 87 (418), 376–382.

Deville, J.-C., Särndal, C.-E. and Sautory, O. (1993) Generalized raking procedures in survey sampling. Journal of the American Statistical Association, 88 (423), 1013–1020.

Bar-Gera, H., Konduri, K. C., Sana, B., Ye, X. and Pendyala, R. M. (2009) Estimating survey weights with multiple constraints using entropy optimization methods, paper presented at the 88th Annual Meeting of the Transportation Research Board, Washington, D.C., January 2009.

Müller, K. and Axhausen, K. W. (2011) Hierarchical IPF: Generating a synthetic population for Switzerland, paper presented at the 51st Congress of the European Regional Science Association, University of Barcelona, Barcelona.

Ye, X., Konduri, K., Pendyala, R. M., Sana, B. and Waddell, P. A. (2009) A methodology to match distributions of both household and person attributes in the generation of synthetic populations, paper presented at the 88th Annual Meeting of the Transportation Research Board, Washington, D.C., January 2009.

Examples

path <- toy_example("Tiny")
fit <- ml_fit(ml_problem = readRDS(path), algorithm = "entropy_o")
fit
#> An object of class ml_fit
#>   Algorithm: entropy_o
#>   Success: TRUE
#>   Residuals (absolute): min = -1.452455e-08, max = 3.887351e-09
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: combined
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
fit$weights
#>  [1]  8.937470  8.937470  8.937470 23.448579 23.448579  2.613950  2.613950
#>  [8]  2.613950 25.899223 25.899223 25.899223 14.347802 14.347802 14.347802
#> [15] 11.009562 11.009562  2.733852  2.733852  2.733852  2.733852  2.733852
#> [22] 11.009562 11.009562
fit$tol
#> [1] 1e-06
fit$iterations
#> [1] 189
fit$flat
#> An object of class flat_ml_fit_problem
#>   Dimensions: 5 groups, 8 target values
#>   Model matrix type: combined
#>   Original fitting problem:
#>   An object of class ml_problem
#>     Reference sample: 23 observations
#>     Control totals: 1 at individual, and 1 at group level
#>     Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
fit$flat_weights
#> [1]  8.937470 23.448579  2.613950 25.899223 14.347802 11.009562  2.733852
#> [8] 11.009562
fit$residuals
#> (Intercept)_g (Intercept)_i       CAR_g_1    WKSTAT_i_2    WKSTAT_i_3 
#>  2.494232e-09  3.887351e-09 -1.452455e-08  3.906564e-10 -2.833261e-09 
fit$rel_residuals
#> (Intercept)_g (Intercept)_i       CAR_g_1    WKSTAT_i_2    WKSTAT_i_3 
#>  2.494227e-11  1.495137e-11 -2.234546e-10  6.010081e-12 -2.724287e-11 
fit$success
#> [1] TRUE
ml_fit_dss(ml_problem = readRDS(path))
#> An object of class ml_fit
#>   Algorithm: dss
#>   Success: TRUE
#>   Residuals (absolute): min = 2.338015e-08, max = 1.613564e-07
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: combined
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
ml_fit_dss(ml_problem = readRDS(path), ginv = solve)
#> An object of class ml_fit
#>   Algorithm: dss
#>   Success: TRUE
#>   Residuals (absolute): min = 2.338014e-08, max = 1.613563e-07
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: combined
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
ml_fit_entropy_o(ml_problem = readRDS(path))
#> An object of class ml_fit
#>   Algorithm: entropy_o
#>   Success: TRUE
#>   Residuals (absolute): min = -1.452455e-08, max = 3.887351e-09
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: combined
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
ml_fit_hipf(ml_problem = readRDS(path))
#> An object of class ml_fit
#>   Algorithm: hipf
#>   Success: TRUE
#>   Residuals (absolute): min = -0.000103996, max = 0
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: combined
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
ml_fit_ipu(ml_problem = readRDS(path))
#> An object of class ml_fit
#>   Algorithm: ipu
#>   Success: TRUE
#>   Residuals (absolute): min = -6.41906e-05, max = 0
#>   Flat problem:
#>   An object of class flat_ml_fit_problem
#>     Dimensions: 5 groups, 8 target values
#>     Model matrix type: separate
#>     Original fitting problem:
#>     An object of class ml_problem
#>       Reference sample: 23 observations
#>       Control totals: 1 at individual, and 1 at group level
#>       Results for algorithms: entropy_o(1,0), entropy_o(0,1), entropy_o(1,1), entropy, ml_ipf, ipu
