The ml_problem() function is the first step for fitting a reference sample to known control totals with mlfit.
All algorithms (see ml_fit()) expect an object created by this function (or optionally processed with flatten_ml_fit_problem()).
The special_field_names() function is useful for constructing the field_names argument to ml_problem().
ml_problem(
  ref_sample,
  controls = list(individual = individual_controls, group = group_controls),
  field_names,
  individual_controls = NULL,
  group_controls = NULL,
  prior_weights = NULL,
  geo_hierarchy = NULL
)

is_ml_problem(x)

# S3 method for class 'ml_problem'
format(x, ...)

# S3 method for class 'ml_problem'
print(x, ...)

special_field_names(
  groupId,
  individualId,
  individualsPerGroup = NULL,
  count = NULL,
  zone = NULL,
  region = NULL,
  prior_weight = NULL
)
ref_sample
The reference sample.
controls
Control totals, by default initialized from the individual_controls and group_controls arguments.
field_names
Names of special fields, constructed using special_field_names().
individual_controls, group_controls
Control totals at individual and group level, given as a list of data frames where each data frame defines a control.
prior_weights
(Deprecated) Use special_field_names(prior_weight = '<column-name>') to specify the prior weight column in ref_sample instead.
geo_hierarchy
A table that maps each unit of a larger zoning level to the many zones of a smaller zoning level. The column name of the larger level should be specified in field_names as 'region' and that of the smaller level as 'zone'.
x
An object.
...
Ignored.
groupId, individualId
Name of the column that defines the ID of the group or the individual.
individualsPerGroup
Obsolete.
count
Name of the control total column in the control tables (by default, the first numeric column in each control is used).
zone, region
Name of the column that defines the region of the reference sample or the zone of the controls. Note that a region is a larger area that contains more than one zone.
prior_weight
Name of the column that defines the prior weight of the reference sample. Prior (or design) weights are given at group level; by default a vector of ones is used, which corresponds to random sampling of groups.
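As a rough illustration of the shape a geo_hierarchy table might take, the sketch below builds a small mapping in base R. The column names "region" and "zone" and the area labels are hypothetical, chosen only for this example; whichever names are used would then be declared through special_field_names(region = ..., zone = ...).

```r
# Hypothetical mapping: two regions, each containing two zones.
# Column names and labels are illustrative, not prescribed by mlfit.
geo_hierarchy <- data.frame(
  region = c("A", "A", "B", "B"),
  zone   = c("A1", "A2", "B1", "B2")
)

# Each zone should appear once, i.e. belong to exactly one region.
stopifnot(anyDuplicated(geo_hierarchy$zone) == 0)

# Number of zones per region
zones_per_region <- table(geo_hierarchy$region)
zones_per_region
```

A long table with one row per zone keeps the one-to-many relation explicit: the larger level repeats, the smaller level is unique.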
An object of class ml_problem, or a list of them if geo_hierarchy was given. An ml_problem is essentially a named list with the following components:
refSample
The reference sample, a data.frame.
controls
A named list with two components, individual and group. Each contains a list of controls as data.frames.
fieldNames
A named list with the names of special fields.
is_ml_problem() returns a logical.
# Create example from Ye et al., 2009
# Provide reference sample
ye <- tibble::tribble(
~HHNR, ~PNR, ~APER, ~HH_VAR, ~P_VAR,
1, 1, 3, 1, 1,
1, 2, 3, 1, 2,
1, 3, 3, 1, 3,
2, 4, 2, 1, 1,
2, 5, 2, 1, 3,
3, 6, 3, 1, 1,
3, 7, 3, 1, 1,
3, 8, 3, 1, 2,
4, 9, 3, 2, 1,
4, 10, 3, 2, 3,
4, 11, 3, 2, 3,
5, 12, 3, 2, 2,
5, 13, 3, 2, 2,
5, 14, 3, 2, 3,
6, 15, 2, 2, 1,
6, 16, 2, 2, 2,
7, 17, 5, 2, 1,
7, 18, 5, 2, 1,
7, 19, 5, 2, 2,
7, 20, 5, 2, 3,
7, 21, 5, 2, 3,
8, 22, 2, 2, 1,
8, 23, 2, 2, 2
)
ye
#> # A tibble: 23 × 5
#> HHNR PNR APER HH_VAR P_VAR
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 3 1 1
#> 2 1 2 3 1 2
#> 3 1 3 3 1 3
#> 4 2 4 2 1 1
#> 5 2 5 2 1 3
#> 6 3 6 3 1 1
#> 7 3 7 3 1 1
#> 8 3 8 3 1 2
#> 9 4 9 3 2 1
#> 10 4 10 3 2 3
#> # ℹ 13 more rows
# Specify control at household level
ye_hh <- tibble::tribble(
~HH_VAR, ~N,
1, 35,
2, 65
)
ye_hh
#> # A tibble: 2 × 2
#> HH_VAR N
#> <dbl> <dbl>
#> 1 1 35
#> 2 2 65
# Specify control at person level
ye_ind <- tibble::tribble(
~P_VAR, ~N,
1, 91,
2, 65,
3, 104
)
ye_ind
#> # A tibble: 3 × 2
#> P_VAR N
#> <dbl> <dbl>
#> 1 1 91
#> 2 2 65
#> 3 3 104
ye_problem <- ml_problem(
  ref_sample = ye,
  field_names = special_field_names(
    groupId = "HHNR", individualId = "PNR", count = "N"
  ),
  group_controls = list(ye_hh),
  individual_controls = list(ye_ind)
)
ye_problem
#> An object of class ml_problem
#> Reference sample: 23 observations
#> Control totals: 1 at individual, and 1 at group level
fit <- ml_fit_dss(ye_problem)
fit$weights
#> [1] 8.937470 8.937470 8.937470 23.448579 23.448579 2.613950 2.613950
#> [8] 2.613950 25.899223 25.899223 25.899223 14.347802 14.347802 14.347802
#> [15] 11.009562 11.009562 2.733852 2.733852 2.733852 2.733852 2.733852
#> [22] 11.009562 11.009562
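
One way to sanity-check such a fit is to confirm that the weights reproduce the control totals. The sketch below does this in base R only, with the reference-sample columns and the weights copied from the output above rather than recomputed. It assumes that the 23 weights correspond row by row to the reference sample and that every person in a household carries that household's weight.

```r
# Values copied from the example above (no mlfit call is made here).
hh_size <- c(3, 2, 3, 3, 3, 2, 5, 2)        # persons per household
hh_id   <- rep(1:8, times = hh_size)         # HHNR
hh_var  <- rep(c(1, 2), times = c(8, 15))    # HH_VAR
p_var   <- c(1, 2, 3, 1, 3, 1, 1, 2, 1, 3, 3, 2,
             2, 3, 1, 2, 1, 1, 2, 3, 3, 1, 2)  # P_VAR
w <- rep(
  c(8.937470, 23.448579, 2.613950, 25.899223,
    14.347802, 11.009562, 2.733852, 11.009562),
  times = hh_size
)

# Person-level totals: sum the weights over all persons by P_VAR.
person_totals <- tapply(w, p_var, sum)

# Household-level totals: count each household once, then sum by HH_VAR.
first_row <- !duplicated(hh_id)
group_totals <- tapply(w[first_row], hh_var[first_row], sum)

# Compare with the controls ye_ind (91, 65, 104) and ye_hh (35, 65),
# allowing for the rounding of the printed weights.
stopifnot(all(abs(person_totals - c(91, 65, 104)) < 1e-4))
stopifnot(all(abs(group_totals - c(35, 65)) < 1e-4))
```

Counting each household only once for the group-level check matters: summing over all 23 rows would weight large households more heavily and overstate the household totals.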