Returns the total segregation between group and unit. If within is given, segregation is calculated separately within each category of within, and the weighted average of these values is returned. See also mutual_within for detailed within-category calculations.

mutual_total(data, group, unit, within = NULL, weight = NULL,
  se = FALSE, n_bootstrap = 10, base = exp(1))

Arguments

data

A data frame.

group

A categorical variable or a vector of variables contained in data. Defines the first dimension over which segregation is computed.

unit

A categorical variable or a vector of variables contained in data. Defines the second dimension over which segregation is computed.

within

A categorical variable or a vector of variables contained in data. The variable(s) should be a superset of either the unit or the group for the calculation to be meaningful. If provided, segregation is computed within the groups defined by the variable, and then averaged. (Default NULL)

weight

Numeric. Only frequency weights are allowed. (Default NULL)

se

If TRUE, standard errors are estimated via bootstrap. (Default FALSE)

n_bootstrap

Number of bootstrap iterations. (Default 10)

base

Base of the logarithm that is used in the calculation. Defaults to the natural logarithm.
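
The se and n_bootstrap arguments request bootstrapped standard errors. The following is a minimal base R sketch of the general idea, not the package's implementation: it assumes multinomial resampling of the cell counts, and the helpers entropy() and mutual_info() as well as the toy counts are hypothetical names and values introduced only for this illustration.

```r
set.seed(1)  # only so that this illustration is reproducible

# Made-up 2 x 2 table of counts (rows: groups, columns: units).
counts <- matrix(c(40, 10,
                   10, 40), nrow = 2, byrow = TRUE)

# Hypothetical helpers, not part of the package.
entropy <- function(p) {
  p <- p[p > 0]          # treat 0 * log(0) as 0
  -sum(p * log(p))
}
mutual_info <- function(counts) {
  p <- counts / sum(counts)
  entropy(rowSums(p)) + entropy(colSums(p)) - entropy(p)
}

n_bootstrap <- 10
n_total <- sum(counts)

# Assumption: each replicate redraws all observations from the observed
# cell proportions, then recomputes M on the resampled table.
boot_M <- replicate(n_bootstrap, {
  resampled <- rmultinom(1, size = n_total,
                         prob = as.vector(counts) / n_total)
  mutual_info(matrix(resampled, nrow = nrow(counts)))
})

est_M <- mean(boot_M)  # bootstrapped estimate
se_M  <- sd(boot_M)    # bootstrapped standard error
```

With only 10 iterations the standard error is itself noisy; a larger n_bootstrap gives a more stable value at the cost of run time.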

Value

Returns a data frame with two rows. The column est contains the Mutual Information Index, M, and Theil's Entropy Index, H. H is M divided by the group entropy. If within was given, M and H are weighted averages of the within-category segregation scores. If se is set to TRUE, an additional column se contains the associated bootstrapped standard errors, and the column est contains bootstrapped estimates.
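
To make the relationship between M and H concrete, here is a small base R sketch that computes both from a toy contingency table. It assumes the standard entropy formulation of mutual information (group entropy plus unit entropy minus joint entropy); entropy() and mutual_info() are hypothetical helpers written for this illustration, and the counts are made up.

```r
# Made-up counts: rows are groups, columns are units.
counts <- matrix(c(40, 10,
                   10, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("a", "b"),
                                 unit  = c("u1", "u2")))

# Hypothetical helpers, not part of the package.
entropy <- function(p, base = exp(1)) {
  p <- p[p > 0]                      # treat 0 * log(0) as 0
  -sum(p * log(p, base = base))
}
mutual_info <- function(counts, base = exp(1)) {
  p <- counts / sum(counts)          # joint proportions
  # M = E(group) + E(unit) - E(group, unit)
  entropy(rowSums(p), base) + entropy(colSums(p), base) - entropy(p, base)
}

p_group <- rowSums(counts) / sum(counts)

M <- mutual_info(counts)             # natural logarithm
H <- M / entropy(p_group)            # M divided by the group entropy

# Changing the base rescales M but leaves H unchanged,
# because numerator and denominator are rescaled by the same factor.
M_bits <- mutual_info(counts, base = 2)
H_bits <- M_bits / entropy(p_group, base = 2)
```

Because H is normalized by the group entropy only, swapping the roles of group and unit leaves M unchanged but generally changes H, which is the pattern visible in the examples.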

References

Henri Theil. 1971. Principles of Econometrics. New York: Wiley.

Ricardo Mora and Javier Ruiz-Castillo. 2011. "Entropy-based Segregation Indices". Sociological Methodology 41(1): 159–194.

Examples

# calculate school racial segregation
mutual_total(schools00, "school", "race", weight="n") # M => .425
#>   stat        est
#> M    M 0.42553898
#> H    H 0.05642991

# note that the definition of groups and units is arbitrary
mutual_total(schools00, "race", "school", weight="n") # M => .425
#>   stat       est
#> M    M 0.4255390
#> H    H 0.4188083

# if groups or units are defined by a combination of variables,
# vectors of variable names can be provided -
# here there is no difference, because schools
# are nested within districts
mutual_total(schools00, "race", c("district", "school"), weight="n") # M => .425
#>   stat       est
#> M    M 0.4255390
#> H    H 0.4188083

# estimate standard errors for M and H
mutual_total(schools00, "race", "school", weight="n", se=TRUE)
#>   stat       est           se
#> M    M 0.4292977 0.0008489586
#> H    H 0.4225437 0.0007268017

# estimate segregation within school districts
mutual_total(schools00, "race", "school", within="district", weight="n") # M => .087
#>   stat        est
#> M    M 0.08758648
#> H    H 0.08620114

# estimate between-district racial segregation
mutual_total(schools00, "race", "district", weight="n") # M => .338
#>   stat       est
#> M    M 0.3379525
#> H    H 0.3326072

# note that the sum of within-district and between-district
# segregation equals total school-race segregation;
# here, most segregation is between school districts
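
The additive decomposition noted in the last example can be checked by hand on a small data set. The sketch below uses base R only and made-up counts; toy, entropy() and mutual_info() are hypothetical names introduced for this illustration, and the within term is assumed to be weighted by each district's share of the total population.

```r
# Made-up counts: two districts, two schools each, two racial groups.
toy <- data.frame(
  district = c("d1", "d1", "d1", "d1", "d2", "d2", "d2", "d2"),
  school   = c("s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"),
  race     = rep(c("x", "y"), 4),
  n        = c(30, 10, 20, 20, 5, 35, 10, 30)
)

# Hypothetical helpers, not part of the package.
entropy <- function(p) {
  p <- p[p > 0]          # treat 0 * log(0) as 0
  -sum(p * log(p))
}
mutual_info <- function(counts) {
  p <- counts / sum(counts)
  entropy(rowSums(p)) + entropy(colSums(p)) - entropy(p)
}

# Total school-race segregation and between-district segregation.
total   <- mutual_info(xtabs(n ~ race + school, data = toy))
between <- mutual_info(xtabs(n ~ race + district, data = toy))

# Within term: M inside each district, weighted by the district's share.
within <- sum(sapply(split(toy, toy$district), function(d) {
  (sum(d$n) / sum(toy$n)) * mutual_info(xtabs(n ~ race + school, data = d))
}))

all.equal(total, between + within)
```

The identity holds exactly here because every school belongs to one district only; without that nesting, the between and within terms would not add up to the total in this way.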