Calculates the segregation between group and unit within each category defined by within.

mutual_within(data, group, unit, within, weight = NULL, se = FALSE,
  n_bootstrap = 10, base = exp(1), wide = FALSE)

Arguments

data

A data frame.

group

A categorical variable or a vector of variables contained in data. Defines the first dimension over which segregation is computed.

unit

A categorical variable or a vector of variables contained in data. Defines the second dimension over which segregation is computed.

within

A categorical variable or a vector of variables contained in data that defines the within-segregation categories.

weight

Numeric. Only frequency weights are allowed. (Default NULL)

se

If TRUE, standard errors are estimated via bootstrap. (Default FALSE)

n_bootstrap

Number of bootstrap iterations. (Default 10)

base

Base of the logarithm that is used in the calculation. Defaults to the natural logarithm.

wide

If TRUE, returns a wide data frame instead of a long one. (Default FALSE)

Value

Returns a data.table with four rows for each category defined by within. The column est contains four statistics that are provided for each within category: M is the within-category M, and p is the proportion of the category. Multiplying M and p gives the contribution of each within-category towards the total M. H is the within-category H, and h_weight provides the weight. Multiplying H and h_weight gives the contribution of each within-category towards the total H. h_weight is defined as p * EW/E, where EW is the within-category entropy, and E is the overall entropy.

If se is set to TRUE, an additional column se contains the associated bootstrapped standard errors, and the column est contains bootstrapped estimates.

If wide is set to TRUE, a wide data frame is returned instead, with one row for each within category and the associated statistics in separate columns.
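In symbols, the decomposition described above can be sketched as follows. The index w (one within category) and the subscripted names are notation introduced here for illustration; the text says only "overall entropy", so reading E and E_w as entropies of the group distribution is an assumption.

```latex
\begin{aligned}
M_{\text{within}} &= \sum_{w} p_w \, M_w, \\
H_{\text{within}} &= \sum_{w} h_w \, H_w, \qquad h_w = p_w \, \frac{E_w}{E}.
\end{aligned}
```

If, in addition, H_w = M_w / E_w (an assumption that is consistent with the weights above), the second line reduces to H_within = M_within / E.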

References

Henri Theil. 1971. Principles of Econometrics. New York: Wiley.

Ricardo Mora and Javier Ruiz-Castillo. 2011. "Entropy-based Segregation Indices". Sociological Methodology 41(1): 159–194.

Examples

(within <- mutual_within(schools00, "race", "school", within = "state", weight = "n", wide = TRUE))
#>    state         M         p         H  h_weight
#> 1:     A 0.4085965 0.2768819 0.4969216 0.2240667
#> 2:     B 0.2549959 0.4035425 0.2680884 0.3777638
#> 3:     C 0.3450221 0.3195756 0.3611257 0.3004955
# the M for state "A" is .409
# manual calculation
schools_A <- schools00[schools00$state == "A", ]
mutual_total(schools_A, "race", "school", weight = "n") # M => .409
#>    stat       est
#> 1:    M 0.4085965
#> 2:    H 0.4969216
# to recover the within M and H from the output, multiply
# p * M and h_weight * H, respectively
sum(within$p * within$M) # => .326
#> [1] 0.3262953
sum(within$H * within$h_weight) # => .321
#> [1] 0.3211343
# compare with:
mutual_total(schools00, "race", "school", within = "state", weight = "n")
#>    stat       est
#> 1:    M 0.3262953
#> 2:    H 0.3211343
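The definition h_weight = p * EW/E can also be checked from scratch in base R. The sketch below uses made-up counts (not schools00) and hypothetical helper names (entropy, m_index). It assumes that E and EW are entropies of the group distribution and that H is M divided by that entropy. It illustrates the decomposition and is not the package's implementation.

```r
# Hypothetical toy counts: within-category w, unit u, group g, frequency n.
d <- data.frame(
  w = rep(c("A", "B"), each = 4),
  u = rep(c("u1", "u2", "u3", "u4"), each = 2),
  g = rep(c("x", "y"), times = 4),
  n = c(10, 30, 40, 20, 25, 25, 5, 45)
)

# Entropy of a probability vector (natural log; zero cells are dropped).
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# M index (mutual information between unit and group) for one subset.
m_index <- function(s) {
  joint <- xtabs(n ~ u + g, data = s) / sum(s$n)
  indep <- outer(rowSums(joint), colSums(joint))
  pos <- joint > 0
  sum(joint[pos] * log(joint[pos] / indep[pos]))
}

# Overall group entropy (assumed meaning of E).
E <- entropy(tapply(d$n, d$g, sum) / sum(d$n))

# One row of statistics per within category.
stats <- do.call(rbind, lapply(split(d, d$w), function(s) {
  EW <- entropy(tapply(s$n, s$g, sum) / sum(s$n))  # within-category entropy
  M <- m_index(s)
  p <- sum(s$n) / sum(d$n)
  data.frame(M = M, p = p, H = M / EW, h_weight = p * EW / E)
}))

M_within <- sum(stats$p * stats$M)
H_within <- sum(stats$H * stats$h_weight)

# Under the stated assumptions, the weighted H values sum to M_within / E.
all.equal(H_within, M_within / E)
```

The weights are built so that EW cancels in each term of H * h_weight, which is why the sum recovers the within component of M scaled by the overall entropy.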