Calibrate sample weights — dss • mlfit

Calibrate sample weights according to known marginal population totals. Based on initial sample weights, the so-called g-weights are computed by generalized raking procedures. The function returns only the g-weights; to obtain the final sample weights, multiply the g-weights by the initial sample weights.
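To make the role of the g-weights concrete, the linear method has a closed-form solution (Deville and Särndal, 1992) that can be sketched in base R without calling dss(). The helper name linear_g_sketch and the toy data are hypothetical, and this is an illustration under those assumptions rather than a description of how dss() computes its result internally.

```r
# Hypothetical helper (not part of mlfit): closed-form g-weights for the
# linear calibration method, g_k = 1 + q_k * x_k' lambda.
linear_g_sketch <- function(X, d, totals, q = rep(1, nrow(X))) {
  A <- crossprod(X * (d * q), X)               # sum_k d_k q_k x_k x_k'
  lambda <- solve(A, totals - colSums(X * d))  # solves the calibration equations
  1 + q * drop(X %*% lambda)
}

# Toy data (assumed): 6 sampled units, an intercept column and one auxiliary variable.
X <- cbind(1, c(1, 2, 3, 4, 5, 6))
d <- rep(10, 6)            # initial (design) weights
totals <- c(66, 240)       # assumed known population totals

g <- linear_g_sketch(X, d, totals)
w <- g * d                 # final weights = g-weights * initial weights
colSums(X * w)             # matches totals up to floating-point error
```

The last line checks the defining property of calibrated weights: the weighted column sums of X reproduce the population totals.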

dss(
  X,
  d,
  totals,
  q = NULL,
  method = c("raking", "linear", "logit"),
  bounds = NULL,
  maxit = 500,
  ginv = gginv(),
  tol = 1e-06,
  attributes = FALSE
)

Arguments

X: a matrix of calibration variables.
d: a numeric vector giving the initial sample (or design) weights.
totals: a numeric vector of population totals corresponding to the calibration variables in X.
q: a numeric vector of positive values accounting for heteroscedasticity. Small values reduce the variation of the g-weights.
method: a character string specifying the calibration method to be used. Possible values are "linear" for the linear method, "raking" for the multiplicative method known as raking and "logit" for the logit method.
bounds: a numeric vector of length two giving bounds for the g-weights to be used in the logit method. The first value gives the lower bound (which must be smaller than or equal to 1) and the second value gives the upper bound (which must be larger than or equal to 1). If NULL, the bounds are set to c(0, 10).
maxit: a numeric value giving the maximum number of iterations.
ginv: a function that computes the Moore-Penrose generalized inverse (default: an optimized version of MASS::ginv()). In some cases it is possible to speed up the process by using a function that computes a "regular" matrix inverse such as solve.default().
tol: relative tolerance; convergence is achieved if every residual, taken relative to its corresponding total, is smaller than this tolerance.
attributes: should additional attributes (currently success, iterations, method and bounds) be added to the result? If FALSE (default), a warning is given if convergence within the given relative tolerance could not be achieved.
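The maxit and tol arguments control an iterative solver. As a rough sketch of how they interact, the following base R code runs a Newton-type iteration for the raking method, assuming g-weights of the form exp(q * X %*% lambda) as in Deville and Särndal (1992). The helper rake_sketch is hypothetical, it uses solve() where dss() would use ginv, and it is not mlfit's actual implementation.

```r
# Hypothetical helper (not part of mlfit): Newton-type iteration for raking.
rake_sketch <- function(X, d, totals, q = rep(1, nrow(X)),
                        maxit = 500, tol = 1e-06) {
  lambda <- rep(0, ncol(X))
  g <- rep(1, nrow(X))                     # start from the initial weights
  resid <- totals - colSums(X * d * g)     # distance from the target totals
  for (i in seq_len(maxit)) {
    # Jacobian of the calibrated totals with respect to lambda
    H <- crossprod(X * (d * q * g), X)
    lambda <- lambda + solve(H, resid)
    g <- exp(q * drop(X %*% lambda))       # raking: multiplicative g-weights
    resid <- totals - colSums(X * d * g)
    # Converged once every residual is small relative to its total
    if (all(abs(resid / totals) < tol)) {
      return(list(g = g, success = TRUE, iterations = i))
    }
  }
  list(g = g, success = FALSE, iterations = maxit)
}

# Toy data (assumed): two disjoint groups of 6 and 4 units, weight 10 each.
X <- cbind(rep(c(1, 0), c(6, 4)), rep(c(0, 1), c(6, 4)))
d <- rep(10, 10)
totals <- c(80, 30)
res <- rake_sketch(X, d, totals)
res$success
```

With disjoint groups the solution is simply the ratio of target to current total in each group, so the g-weights approach 80/60 for the first group and 30/40 for the second.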

Value

A numeric vector containing the g-weights.

Note

This is a faster implementation of parts of sampling::calib() from package sampling. Note that the default calibration method is raking and that the truncated linear method is not yet implemented.

References

Deville, J.-C. and Särndal, C.-E. (1992) Calibration estimators in survey sampling. Journal of the American Statistical Association, 87(418), 376–382.

Deville, J.-C., Särndal, C.-E. and Sautory, O. (1993) Generalized raking procedures in survey sampling. Journal of the American Statistical Association, 88(423), 1013–1020.

Author

Andreas Alfons, with improvements by Kirill Müller

Examples

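# Simulated data: 1000 observations and 100 calibration variables
obs <- 1000
vars <- 100
Xs <- matrix(runif(obs * vars), nrow = obs)
d <- runif(obs) / obs       # initial sample weights
totals <- rep(1, vars)      # target total for each calibration variable
# Linear method, using a regular matrix inverse instead of the default ginv
g <- dss(Xs, d, totals, method = "linear", ginv = solve)
# Raking method (the default)
g2 <- dss(Xs, d, totals, method = "raking")
# Final sample weights: g-weights multiplied by the initial weights
w <- g * d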