Paper Summary

This paper sits in the realm of public opinion research: the authors test how well survey respondents understand abstract political ideas and how that understanding shapes estimates of public opinion. They contend that because many people, particularly those with low political interest or sophistication, do not link political outcomes with concrete implementation, survey-based estimates of support for such political programs may be overstated. Put another way: people tend to support reductions in income inequality, yet their support may wane once they are shown concrete methods for producing that reduction. The authors run two studies on income redistribution to test the abstract-concrete linkage among members of the public.

In Study 1, the authors field a survey to assess people’s knowledge of the subject matter. Using open-ended and fixed-choice questions, they find that most people do not associate the mainline policies for redistribution — progressive taxation and social transfers — with the abstract concept of inequality reduction.

In Study 2, the authors’ experimental treatment conditions involve varying the degree to which a question’s wording is abstract or concrete. The idea is that more abstract questions provide a weaker signal of respondents’ true preferences. This ambiguity masks variation: support could fall or rise when respondents are presented with an actual method of income redistribution. The replication here focuses on Study 2, for two reasons: (1) the theme of this week is causal inference with randomization, and (2) Study 2 more directly addresses the effects of policy wording on support — Study 1 is primarily exploratory.


Setup

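# Packages used throughout the replication
packages <- c("tidyverse", "dplyr", "srvyr", "ggplot2", "haven",
              "survey", "vtable", "car", "modelr", "forcats",
              "broom", "scales", "ggeffects", "modelsummary",
              "lmtest", "sandwich", "MASS", "marginaleffects", "ri2",
              "knitr", "kableExtra", "patchwork")

# Install anything that is missing, then attach every package.
# Note: MASS is attached after dplyr, so MASS::select() masks dplyr::select();
# call dplyr::select() explicitly if it is needed.
installed <- packages %in% rownames(installed.packages())
if (any(!installed)) install.packages(packages[!installed])
invisible(lapply(packages, library, character.only = TRUE))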

Study 1 Replication

Study 1 is a descriptive survey designed to assess whether respondents connect the abstract goal of reducing inequality with standard redistributive policies. The key finding is that only 3 percent of open-ended respondents spontaneously mentioned progressive taxation and social transfers together, and even when these options were presented in a closed-ended format, fewer than 30 percent selected them. Around 40 percent of respondents explained their support in terms of a general principle rather than any specific policy.

This establishes the motivating puzzle for Study 2: if most people support redistribution as an abstraction but cannot identify its instruments, survey estimates of redistributive preference may be systematically overstated.

suppressMessages(suppressWarnings(source("Study 1 Script.R", local = FALSE)))
print(Fig_1)
Figures 1, 2, and SI4 from Margalit & Raviv (2024)

print(Fig_2)
Figures 1, 2, and SI4 from Margalit & Raviv (2024)

print(Fig_SI4)
Figures 1, 2, and SI4 from Margalit & Raviv (2024)


Study 2 Replication

In Study 2, the authors present experimental evidence for the claim that question concreteness affects support for redistribution. To understand exactly how concreteness is related to support, they partition their analysis into two parts. First, they examine the raw aggregate support across the four treatment groups. Then, they examine how this support differs across income groups.

The four experimental conditions, shown in Table 1, vary only in the specificity of the question wording — from the most abstract (asking about income differences with no mention of government) to the most concrete (explicitly naming progressive taxation and social assistance as the instruments of redistribution).

Table 1. Experimental treatments
Treatment                      Wording
Income differences             On the whole, do you think income differences should or should not be reduced between the rich and the poor?
Government’s responsibility    On the whole, do you think it should or should not be the government’s responsibility to reduce income differences between the rich and poor?
Market intervention            On the whole, do you think the government should or should not intervene in the market to reduce income differences between the rich and poor?
Redistributive measures        On the whole, do you think it should or should not be the government’s responsibility to reduce income differences between the rich and poor by raising the taxes on higher earners and providing income assistance to people with lower incomes?

Respondents answered their assigned version of the question on a four-point scale: (1) Definitely should be, (2) Probably should be, (3) Probably should not be, (4) Definitely should not be. To simplify the analysis, the authors recode this as a binary variable, combining the two “should be” categories into support and the two “should not be” categories into no support.
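
A minimal sketch of the binary recoding described above, assuming the raw item is stored as the integers 1 to 4 in the order listed. The vector `raw` is made-up illustration data, not survey responses, and the object names are hypothetical.

```r
# Hypothetical responses on the four-point scale
# (1 = "Definitely should be", 4 = "Definitely should not be");
# NA marks a skipped item.
raw <- c(1, 2, 3, 4, 2, NA)

# Codes 1 and 2 become support (1); codes 3 and 4 become no support (0).
# NA stays NA because the comparison itself returns NA.
support <- as.integer(raw <= 2)

support
```

Keeping missing responses as NA rather than coding them as 0 matters here, since the regressions later drop incomplete cases instead of counting them as opposition.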

Figure 3 Replication: Aggregate Treatment Effects

Their first figure presents a simple difference in means across the four treatment conditions. In general, we see a rough increasing trend as question wordings become more concrete — respondents are more likely to express support when the policy instruments are named explicitly.

suppressMessages(suppressWarnings(source("Study 2 Script.R", local = FALSE)))
## # A tibble: 4 × 6
##   exp_treatment               mean     n    sd     se  prop
##   <fct>                      <dbl> <int> <dbl>  <dbl> <dbl>
## 1 Income Differences         0.579   380 0.494 0.0254  57.9
## 2 Governments Responsibility 0.628   393 0.484 0.0244  62.8
## 3 Market intervention        0.705   370 0.456 0.0237  70.5
## 4 Redistributive Measures    0.714   402 0.452 0.0226  71.4
print(Fig_3)
Figure 3 replication: average support by treatment condition
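
As a rough check on the size of the aggregate effect, the gap between the most abstract and most concrete wording can be computed directly from the summary statistics printed above. This is a sketch using the rounded means and standard errors from that tibble and a normal approximation; it is not the authors' own test.

```r
# Summary statistics copied from the tibble printed above.
m  <- c(income_diff = 0.579, redistributive = 0.714)
se <- c(income_diff = 0.0254, redistributive = 0.0226)

# Difference in means between the most concrete and most abstract wording.
diff_means <- unname(m["redistributive"] - m["income_diff"])

# Standard error of a difference between two independent group means.
se_diff <- sqrt(sum(se^2))

# Normal-approximation 95% interval and z statistic.
ci <- diff_means + c(-1, 1) * qnorm(0.975) * se_diff
z  <- diff_means / se_diff

round(c(diff = diff_means, se = se_diff, lower = ci[1], upper = ci[2], z = z), 3)
```

The difference is about 13.5 percentage points with a standard error of roughly 3.4 points, so the interval excludes zero under these assumptions.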


There could be a few explanations for this pattern. Two stand out:

  1. Self-interest: Concrete wording activates economic self-interest. When respondents can see what redistribution actually costs or benefits them personally, their material position shapes their response more directly.

  2. Ideological clarification: Concrete wording provides more information, allowing respondents to align with or against the policy on ideological grounds. Rather than reacting to an abstract principle, they are reacting to a specific instrument they can evaluate.

These explanations are observationally distinct: if self-interest is the primary driver, we would expect the income gradient to sharpen as wording becomes more concrete, regardless of ideology. If ideology is the driver, we would expect the treatment effect to vary more across ideological groups than across income groups.

Figure 4 Replication: Income Heterogeneity

The main model from Study 2 is a linear probability model (LPM) regressing the binary support indicator on treatment condition, income group, and their interaction. The reference category is the most abstract condition (Income Differences) among low-income respondents. Each treatment coefficient represents the change in probability of supporting redistribution relative to that baseline. The interaction terms capture whether the effect of more concrete wording differs between high- and low-income respondents — the paper’s central theoretical claim.

lm1_income <- lm(MainQ_support ~ exp_treatment_inc * highincome,
                 data = Survey2_experiment)
summary(lm1_income)
## 
## Call:
## lm(formula = MainQ_support ~ exp_treatment_inc * highincome, 
##     data = Survey2_experiment)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7599 -0.5913  0.2593  0.3623  0.4601 
## 
## Coefficients:
##                                                         Estimate Std. Error
## (Intercept)                                              0.53992    0.02892
## exp_treatment_incGovernments Responsibility              0.09781    0.04081
## exp_treatment_incMarket intervention                     0.20082    0.04172
## exp_treatment_incRedistributive Measures                 0.21993    0.04030
## highincome1                                              0.14476    0.05308
## exp_treatment_incGovernments Responsibility:highincome1 -0.15955    0.07382
## exp_treatment_incMarket intervention:highincome1        -0.24143    0.07474
## exp_treatment_incRedistributive Measures:highincome1    -0.31331    0.07428
##                                                         t value Pr(>|t|)    
## (Intercept)                                              18.673  < 2e-16 ***
## exp_treatment_incGovernments Responsibility               2.396  0.01667 *  
## exp_treatment_incMarket intervention                      4.813 1.64e-06 ***
## exp_treatment_incRedistributive Measures                  5.457 5.65e-08 ***
## highincome1                                               2.727  0.00646 ** 
## exp_treatment_incGovernments Responsibility:highincome1  -2.161  0.03083 *  
## exp_treatment_incMarket intervention:highincome1         -3.231  0.00126 ** 
## exp_treatment_incRedistributive Measures:highincome1     -4.218 2.61e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4689 on 1508 degrees of freedom
##   (91 observations deleted due to missingness)
## Multiple R-squared:  0.0267, Adjusted R-squared:  0.02218 
## F-statistic: 5.909 on 7 and 1508 DF,  p-value: 8.556e-07

The replication is exact: treatment effect magnitudes, standard errors, and significance levels match the reported values in Table SI-6 of the original paper to three decimal places.
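
Because the model is fully interacted, each of the eight treatment-by-income cells has its own predicted probability, obtained by summing the coefficients that apply to that cell. The sketch below does that arithmetic with the estimates copied from the output above; the object names are illustrative only.

```r
# Coefficients copied from the summary(lm1_income) output above.
b0      <- 0.53992   # intercept: Income Differences, low income
b_treat <- c("Income Differences"         = 0,
             "Governments Responsibility" = 0.09781,
             "Market intervention"        = 0.20082,
             "Redistributive Measures"    = 0.21993)
b_high  <- 0.14476   # high-income shift in the baseline condition
b_inter <- c("Income Differences"         = 0,
             "Governments Responsibility" = -0.15955,
             "Market intervention"        = -0.24143,
             "Redistributive Measures"    = -0.31331)

# Each cell's prediction is the sum of the coefficients switched on for it.
cells <- cbind(
  "Low income"  = b0 + b_treat,
  "High income" = b0 + b_treat + b_high + b_inter
)

round(cells, 3)
```

Read this way, predicted support among low-income respondents rises from about 0.54 to about 0.76 across the four wordings, while among high-income respondents it falls from about 0.68 to about 0.59.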

print(fig_4)
Figure 4 and SI-6 replication: predicted support by treatment and income


print(Fig_SI6)
Figure 4 and SI-6 replication: predicted support by treatment and income


What is most striking is that for high-income earners, support declines as the wording becomes more concrete. This supports the notion that as personal economic stakes become clearer, those with higher incomes become less supportive of redistribution, owing to the fact that they stand to lose money under progressive taxation. However, this interpretation warrants some caution. As the ideology-stratified plot shows, the income reversal pattern is present among both liberals and conservatives, but the baseline levels of support differ dramatically across ideological groups. Conservative high-income earners show a relatively flat and already-low support profile across all conditions, while liberal high-income earners remain highly supportive even under concrete framing. This suggests ideology may be doing at least as much work as income in shaping responses to concrete redistribution, complicating the authors’ self-interest interpretation.

Appendix SI-5: LPM with Controls

In Chapter 18.6 (page 352) of ROS, Gelman, Hill, and Vehtari outline strategies for increasing the precision (efficiency) of experimental estimates. Here, efficiency simply refers to the size of the standard errors around our point estimates. Strategies that increase efficiency do not change the quantity being estimated, although the point estimate can shift slightly in any given sample; rather, they shrink the standard errors and therefore the confidence intervals, allowing us to draw stronger inferences from the sample.

Since treatment was randomly assigned, confounding should not be an issue and our point estimates should be unbiased. That said, there are still factors predictive of the response which, although they are not biasing the results (to do so, the treatment and response would both need to be predicted by the same variable, which randomization prevents), can reduce precision by contributing to residual variance. ROS recommends adding pre-treatment covariates that are predictive of the outcome to soak up this residual variance and tighten standard errors. Note that ideology and income serve a dual purpose here: they appear as efficiency-improving additive controls in the appendix specifications, and as theoretically motivated moderators in the interaction model above.
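
The precision argument can be illustrated with a small simulation. This is a sketch on simulated data, not the survey data: the effect size, the covariate `x`, and all object names are assumptions chosen for illustration.

```r
set.seed(42)

# Simulated experiment: binary treatment assigned at random, plus a
# pre-treatment covariate x that strongly predicts the outcome.
n     <- 1000
treat <- rbinom(n, 1, 0.5)
x     <- rnorm(n)
y     <- 0.2 * treat + 1.0 * x + rnorm(n)

fit_unadj <- lm(y ~ treat)       # simple difference in means
fit_adj   <- lm(y ~ treat + x)   # adds the prognostic covariate

# Standard error of the treatment coefficient under each specification.
se_unadj <- summary(fit_unadj)$coefficients["treat", "Std. Error"]
se_adj   <- summary(fit_adj)$coefficients["treat", "Std. Error"]

c(unadjusted = se_unadj, adjusted = se_adj)
```

Both specifications target the same treatment effect; the adjusted model simply leaves less unexplained variance in the outcome, so its standard error is smaller.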

The authors’ replication code for this specification did not run successfully, so these estimates are not reproduced here; the paper nonetheless reports that the findings are robust to the addition of controls.


Extensions & Robustness

The authors’ main model is straightforward: Figure 3 is a raw difference of means across treatment conditions, and Figure 4 comes from a single LPM with a treatment x income interaction:

MainQ_support ~ exp_treatment_inc * highincome

All appendix tables (SI-5 through SI-10) are covariate robustness checks on this one model. Below I conduct five independent robustness exercises that the authors do not.


1. Heteroskedasticity Check

LPM is guaranteed to be heteroskedastic by construction — the error variance is p(1-p), which depends on the fitted value. The real question is whether it is severe enough to distort inference. The residuals vs. fitted plot below diagnoses this visually.

plot(lm1_income, which = 1,
     main = "Residuals vs. Fitted — Main LPM",
     col = "#136266", pch = 16, cex = 0.7)
Residuals vs. Fitted: main LPM


The two-band pattern is expected with a binary outcome and does not indicate a problem. There is no fan shape: residual spread is roughly constant across fitted values (0.54-0.76), meaning heteroskedasticity is not severe here.
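
To see why the spread is nearly constant, the implied error variance p(1-p) can be evaluated at the two ends of the fitted range quoted above. This is a small arithmetic sketch; the object names are illustrative.

```r
# Fitted probabilities at the two ends of the range quoted above.
p <- c(lowest = 0.54, highest = 0.76)

# For a binary outcome the conditional error variance is p * (1 - p).
err_var <- p * (1 - p)
err_sd  <- sqrt(err_var)

round(rbind(variance = err_var, sd = err_sd), 3)
```

The implied error standard deviation only moves between about 0.43 and 0.50 over this range, which is why the classical and robust standard errors should be close.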

As a formal correction, HC3 robust standard errors are reported below. If significance levels are unchanged, the LPM results stand.

lpm_robust <- coeftest(lm1_income, vcov = vcovHC(lm1_income, type = "HC3"))
lpm_robust
## 
## t test of coefficients:
## 
##                                                          Estimate Std. Error
## (Intercept)                                              0.539924   0.030850
## exp_treatment_incGovernments Responsibility              0.097812   0.042780
## exp_treatment_incMarket intervention                     0.200817   0.041816
## exp_treatment_incRedistributive Measures                 0.219933   0.040131
## highincome1                                              0.144761   0.054150
## exp_treatment_incGovernments Responsibility:highincome1 -0.159546   0.075947
## exp_treatment_incMarket intervention:highincome1        -0.241434   0.075533
## exp_treatment_incRedistributive Measures:highincome1    -0.313313   0.075693
##                                                         t value  Pr(>|t|)    
## (Intercept)                                             17.5015 < 2.2e-16 ***
## exp_treatment_incGovernments Responsibility              2.2864  0.022370 *  
## exp_treatment_incMarket intervention                     4.8024 1.724e-06 ***
## exp_treatment_incRedistributive Measures                 5.4804 4.967e-08 ***
## highincome1                                              2.6733  0.007591 ** 
## exp_treatment_incGovernments Responsibility:highincome1 -2.1008  0.035827 *  
## exp_treatment_incMarket intervention:highincome1        -3.1964  0.001420 ** 
## exp_treatment_incRedistributive Measures:highincome1    -4.1393 3.677e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Original vs. HC3 Robust Standard Errors
Term                       Original SE  Robust SE (HC3)  Significance Change?
Govts Responsibility       0.041        0.043            No
Market intervention        0.042        0.042            No
Redistributive Measures    0.040        0.040            No
High Income                0.053        0.054            No
Govts Resp x High Income   0.074        0.076            No
Market int x High Income   0.075        0.076            No
Redist x High Income       0.074        0.076            No

All seven non-intercept coefficients (treatment, income, and interaction terms) retain identical significance levels under HC3 robust standard errors. Standard errors increase by roughly 0.002 at most. Heteroskedasticity is not materially affecting inference in this model.
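
For readers who want to see what the HC3 correction does, the sketch below computes it by hand in base R on simulated data and compares it with the classical standard errors. The data and object names are assumptions for illustration; the analysis above relies on `vcovHC()` from the sandwich package instead.

```r
set.seed(1)

# Simulated binary outcome with a binary treatment (illustration only).
n <- 500
d <- data.frame(treat = rbinom(n, 1, 0.5))
d$y <- rbinom(n, 1, 0.5 + 0.2 * d$treat)

fit <- lm(y ~ treat, data = d)

X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)   # leverage of each observation

# HC3 sandwich: (X'X)^-1 X' diag(e^2 / (1 - h)^2) X (X'X)^-1
bread <- solve(crossprod(X))
meat  <- crossprod(X * (e / (1 - h)))   # equals X' diag(e^2 / (1 - h)^2) X
vcov_hc3 <- bread %*% meat %*% bread

se_classic <- sqrt(diag(vcov(fit)))
se_hc3     <- sqrt(diag(vcov_hc3))

round(cbind(classic = se_classic, HC3 = se_hc3), 4)
```

Each squared residual is inflated by its leverage before entering the variance estimate, so observations that pull the fit toward themselves count for more.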


2. Balance Check

Are the groups properly balanced? Randomization should produce balance on pre-treatment covariates in expectation, but ROS Chapter 18 recommends verifying this empirically. If treatment groups differ substantially on observed characteristics, it raises questions about the randomization procedure. Here we check whether age, gender, income, education, and ideology are roughly equally distributed across the four conditions.

str(Survey2_experiment[, c("age", "female", "highincome", "education", "ideology","Race")])
## tibble [1,607 × 6] (S3: tbl_df/tbl/data.frame)
##  $ age       : Factor w/ 6 levels "18-24","25-34",..: 2 6 6 4 2 6 6 6 5 4 ...
##  $ female    : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 2 2 2 1 ...
##  $ highincome: Factor w/ 2 levels "0","1": 2 1 2 1 2 2 1 1 1 1 ...
##  $ education : Factor w/ 6 levels "Less than high school",..: 5 2 6 2 6 6 5 5 3 6 ...
##  $ ideology  : num [1:1607] 10 10 9 7 4 3 6 9 5 6 ...
##  $ Race      : Factor w/ 5 levels "White","Asian",..: 1 1 1 1 2 1 1 1 1 1 ...
# Create binary white indicator
Survey2_experiment <- Survey2_experiment %>%
  mutate(white = as.factor(as.integer(Race == "White")))

# Balance table: each share is a proportion between 0 and 1
Survey2_experiment %>%
  filter(!is.na(exp_treatment_inc)) %>%
  group_by(exp_treatment_inc) %>%
  summarise(
    `Age 18-34 (prop.)`   = mean(age %in% c("18-24", "25-34"), na.rm = TRUE),
    `Female (prop.)`      = mean(female == "1", na.rm = TRUE),
    `High Income (prop.)` = mean(highincome == "1", na.rm = TRUE),
    `College+ (prop.)`    = mean(education %in% c("Bachelor's degree",
                                                   "Graduate degree"), na.rm = TRUE),
    `White (prop.)`       = mean(Race == "White", na.rm = TRUE),
    `Ideology (mean)`     = mean(ideology, na.rm = TRUE)
  ) %>%
  knitr::kable(digits = 3, caption = "Covariate Balance Across Treatment Conditions")
Covariate Balance Across Treatment Conditions
exp_treatment_inc           Age 18-34 (prop.)  Female (prop.)  High Income (prop.)  College+ (prop.)  White (prop.)  Ideology (mean)
Income Differences          0.313              0.568           0.297                0.134             0.571          5.495
Governments Responsibility  0.331              0.527           0.315                0.122             0.656          5.691
Market intervention         0.362              0.514           0.327                0.097             0.573          5.881
Redistributive Measures     0.299              0.483           0.292                0.119             0.627          5.724
# F-tests: does treatment assignment predict each covariate?
covs_factor <- list(
  "Female"      = "female",
  "High Income" = "highincome",
  "Age"         = "age",
  "Education"   = "education",
  "White"       = "white"
)

lapply(names(covs_factor), function(label) {
  # Factors are converted to their integer codes so aov() treats them as numeric
  f <- aov(as.formula(paste("as.numeric(", covs_factor[[label]],
                             ") ~ exp_treatment_inc")),
           data = Survey2_experiment)
  result <- broom::tidy(f) %>% filter(term == "exp_treatment_inc")
  tibble(
    Covariate     = label,
    `F-statistic` = round(result$statistic, 3),
    `p-value`     = round(result$p.value, 3)
  )
}) %>%
  bind_rows() %>%
  mutate(
    # Bonferroni: multiply by the number of tests and cap at 1,
    # since a p-value cannot exceed 1
    `Bonferroni p` = round(pmin(`p-value` * length(covs_factor), 1), 3),
    `Balanced?`    = ifelse(`Bonferroni p` > 0.05, "Yes", "No")
  ) %>%
  knitr::kable(caption = "F-tests: Does Treatment Predict Covariates? (Bonferroni Corrected)")
F-tests: Does Treatment Predict Covariates? (Bonferroni Corrected)
Covariate    F-statistic  p-value  Bonferroni p  Balanced?
Female       1.978        0.115    0.575         Yes
High Income  0.465        0.707    1.000         Yes
Age          0.332        0.802    1.000         Yes
Education    0.879        0.451    1.000         Yes
White        2.861        0.036    0.180         Yes

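Large p-values and small F-statistics indicate that randomization produced balanced groups. I apply a Bonferroni correction to account for multiple comparisons, which lowers the chance of a false positive (Type I error) across the five tests. After correction, no covariate differs significantly across conditions. Only the White indicator is significant before correction (F = 2.861, p = 0.036), which is unsurprising when five covariates are tested. Any imbalance that remained significant after correction would warrant including that covariate as a control in the main model.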
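
The same correction can be applied with base R's `p.adjust()`, which multiplies by the number of tests and caps the result at 1. The sketch below uses the unadjusted p-values reported in the table above and adds the Holm adjustment for comparison; the Holm step is my addition, not part of the original analysis.

```r
# Unadjusted p-values copied from the F-test table above.
p_raw <- c(Female = 0.115, "High Income" = 0.707, Age = 0.802,
           Education = 0.451, White = 0.036)

# Bonferroni multiplies each p-value by the number of tests (capped at 1);
# Holm is a step-down alternative that is never more conservative.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_holm <- p.adjust(p_raw, method = "holm")

round(cbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm), 3)
```

Under either adjustment the smallest p-value (White) rises to 0.18, so the conclusion of adequate balance does not depend on the choice between them.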


3. Causal Heterogeneity by Ideology

The authors check the income x treatment interaction but do not formally examine heterogeneity by ideology — which their own discussion identifies as a dominant predictor. As ROS Chapter 18 notes, treatment effect heterogeneity (variation in the causal effect across subgroups) is both theoretically important and directly estimable by interacting treatment with subgroup indicators.

This check speaks to the paper’s core interpretive ambiguity: if the income reversal pattern holds within ideological groups, it supports the self-interest story. If the pattern instead varies sharply by ideology independent of income, it suggests ideological updating rather than material self-interest.

# Three-way interaction: treatment x income x ideology
lm_threeway <- lm(MainQ_support ~ exp_treatment_inc * highincome * ideology,
                  data = Survey2_experiment)

summary(lm_threeway)
## 
## Call:
## lm(formula = MainQ_support ~ exp_treatment_inc * highincome * 
##     ideology, data = Survey2_experiment)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.9250 -0.5265  0.2216  0.3342  0.6338 
## 
## Coefficients:
##                                                                   Estimate
## (Intercept)                                                       0.766993
## exp_treatment_incGovernments Responsibility                       0.164094
## exp_treatment_incMarket intervention                              0.196155
## exp_treatment_incRedistributive Measures                          0.187274
## highincome1                                                       0.053382
## ideology                                                         -0.040080
## exp_treatment_incGovernments Responsibility:highincome1          -0.069736
## exp_treatment_incMarket intervention:highincome1                 -0.100867
## exp_treatment_incRedistributive Measures:highincome1             -0.176845
## exp_treatment_incGovernments Responsibility:ideology             -0.012983
## exp_treatment_incMarket intervention:ideology                     0.001940
## exp_treatment_incRedistributive Measures:ideology                 0.004898
## highincome1:ideology                                              0.013422
## exp_treatment_incGovernments Responsibility:highincome1:ideology -0.008923
## exp_treatment_incMarket intervention:highincome1:ideology        -0.020870
## exp_treatment_incRedistributive Measures:highincome1:ideology    -0.017643
##                                                                  Std. Error
## (Intercept)                                                        0.067086
## exp_treatment_incGovernments Responsibility                        0.095989
## exp_treatment_incMarket intervention                               0.098410
## exp_treatment_incRedistributive Measures                           0.092338
## highincome1                                                        0.108419
## ideology                                                           0.010746
## exp_treatment_incGovernments Responsibility:highincome1            0.158972
## exp_treatment_incMarket intervention:highincome1                   0.167022
## exp_treatment_incRedistributive Measures:highincome1               0.164325
## exp_treatment_incGovernments Responsibility:ideology               0.015618
## exp_treatment_incMarket intervention:ideology                      0.015577
## exp_treatment_incRedistributive Measures:ideology                  0.014892
## highincome1:ideology                                               0.017968
## exp_treatment_incGovernments Responsibility:highincome1:ideology   0.025451
## exp_treatment_incMarket intervention:highincome1:ideology          0.026625
## exp_treatment_incRedistributive Measures:highincome1:ideology      0.026148
##                                                                  t value
## (Intercept)                                                       11.433
## exp_treatment_incGovernments Responsibility                        1.710
## exp_treatment_incMarket intervention                               1.993
## exp_treatment_incRedistributive Measures                           2.028
## highincome1                                                        0.492
## ideology                                                          -3.730
## exp_treatment_incGovernments Responsibility:highincome1           -0.439
## exp_treatment_incMarket intervention:highincome1                  -0.604
## exp_treatment_incRedistributive Measures:highincome1              -1.076
## exp_treatment_incGovernments Responsibility:ideology              -0.831
## exp_treatment_incMarket intervention:ideology                      0.125
## exp_treatment_incRedistributive Measures:ideology                  0.329
## highincome1:ideology                                               0.747
## exp_treatment_incGovernments Responsibility:highincome1:ideology  -0.351
## exp_treatment_incMarket intervention:highincome1:ideology         -0.784
## exp_treatment_incRedistributive Measures:highincome1:ideology     -0.675
##                                                                  Pr(>|t|)    
## (Intercept)                                                       < 2e-16 ***
## exp_treatment_incGovernments Responsibility                      0.087565 .  
## exp_treatment_incMarket intervention                             0.046416 *  
## exp_treatment_incRedistributive Measures                         0.042722 *  
## highincome1                                                      0.622534    
## ideology                                                         0.000199 ***
## exp_treatment_incGovernments Responsibility:highincome1          0.660966    
## exp_treatment_incMarket intervention:highincome1                 0.545993    
## exp_treatment_incRedistributive Measures:highincome1             0.282014    
## exp_treatment_incGovernments Responsibility:ideology             0.405920    
## exp_treatment_incMarket intervention:ideology                    0.900930    
## exp_treatment_incRedistributive Measures:ideology                0.742292    
## highincome1:ideology                                             0.455173    
## exp_treatment_incGovernments Responsibility:highincome1:ideology 0.725948    
## exp_treatment_incMarket intervention:highincome1:ideology        0.433259    
## exp_treatment_incRedistributive Measures:highincome1:ideology    0.499963    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.457 on 1499 degrees of freedom
##   (92 observations deleted due to missingness)
## Multiple R-squared:  0.08079,    Adjusted R-squared:  0.07159 
## F-statistic: 8.783 on 15 and 1499 DF,  p-value: < 2.2e-16
# Predicted values at liberal (-2 from mean), moderate (mean), conservative (+2 from mean)
ideology_mean <- mean(Survey2_experiment$ideology, na.rm = TRUE)

predict_threeway <- ggpredict(lm_threeway,
  terms = c("exp_treatment_inc",
            "highincome [0, 1]",
            paste0("ideology [", ideology_mean - 2, ", ",
                                 ideology_mean,     ", ",
                                 ideology_mean + 2, "]")))

plot(predict_threeway) +
  labs(title    = "Treatment x Income x Ideology: Causal Heterogeneity",
       subtitle = "Predicted support at liberal, moderate, and conservative ideology values",
       x        = NULL,
       y        = "Predicted Support") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom",
        axis.text.x = element_text(angle = 25, hjust = 1))

The income reversal shows up across all three ideological groups, not just liberals or conservatives. In the liberal panel, high-income respondents start above low-income respondents in the abstract condition but fall below them by the Redistributive Measures condition, and the same downward pattern appears in the moderate and conservative panels at lower baseline levels. The shape of the reversal looks roughly the same across all three panels, consistent with the three-way interaction terms in the regression, none of which is statistically significant (all p > 0.43). A non-significant interaction is not proof that moderation is absent, but the data give no indication that ideology changes the income effect. This cuts against the knowledge-gap explanation: if the reversal were about political sophistication rather than money, it should disappear among strong ideologues, and it does not.
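
The summary above tests each three-way coefficient separately. A joint test of all three at once is a more direct check on whether ideology moderates the income pattern. The sketch below shows the mechanics with a nested-model F-test on simulated data; the data-generating values and variable names are assumptions for illustration, not estimates from the survey.

```r
set.seed(7)

# Simulated data: four-arm treatment, binary income indicator, and an
# 11-point ideology score. No three-way effect is built in.
n <- 1500
d <- data.frame(
  treat    = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)),
  high     = factor(rbinom(n, 1, 0.3)),
  ideology = sample(0:10, n, replace = TRUE)
)
p <- 0.6 + 0.05 * (d$treat != "A") -
  0.10 * (d$treat != "A") * (d$high == "1") -
  0.02 * (d$ideology - 5)
d$y <- rbinom(n, 1, p)

# Restricted model: all two-way interactions, no three-way terms.
fit_two   <- lm(y ~ (treat + high + ideology)^2, data = d)
# Full model: adds the three treatment x income x ideology terms.
fit_three <- lm(y ~ treat * high * ideology, data = d)

# Joint F-test that all three-way coefficients are zero.
joint <- anova(fit_two, fit_three)
joint
```

Applied to the survey data, the same comparison would use `MainQ_support`, `exp_treatment_inc`, `highincome`, and `ideology` in place of the simulated variables.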


4. Ordered Logit — Does Response Intensity Matter?

The authors collapse a 4-point ordinal scale into a binary outcome. This risks discarding meaningful variation in the intensity of support. An ordered logit uses the full scale and asks whether treatment shifts respondents toward stronger or weaker support, not just across the binary threshold.

Survey2_experiment$MainQ_ord <- factor(
  Survey2_experiment$exp_MainQ,
  levels = c(4, 3, 2, 1),
  labels = c(
    "Definitely should not be",
    "Probably should not be",
    "Probably should be",
    "Definitely should be"
  ),
  ordered = TRUE
)

ologit1_income <- polr(
  MainQ_ord ~ exp_treatment_inc * highincome,
  data = Survey2_experiment,
  Hess = TRUE
)

modelsummary(
  ologit1_income,
  coef_map = c(
    "exp_treatment_incGovernments Responsibility"             = "Governments Responsibility",
    "exp_treatment_incMarket intervention"                    = "Market intervention",
    "exp_treatment_incRedistributive Measures"                = "Redistributive Measures",
    "highincome1"                                             = "High Income",
    "exp_treatment_incGovernments Responsibility:highincome1" = "Govts Responsibility x High Income",
    "exp_treatment_incMarket intervention:highincome1"        = "Market intervention x High Income",
    "exp_treatment_incRedistributive Measures:highincome1"    = "Redistributive Measures x High Income"
  ),
  stars = c("+" = 0.1, "*" = 0.05, "**" = 0.01),
  gof_map = c("nobs", "logLik"),
  title = "Ordered Logit: Support for Redistribution (Full 4-Point Scale)",
  output = "markdown"
)
Ordered Logit: Support for Redistribution (Full 4-Point Scale)
Term                                     Estimate   (Std. Error)
Governments Responsibility               0.504**    (0.159)
Market intervention                      0.667**    (0.159)
Redistributive Measures                  0.947**    (0.158)
High Income                              0.583**    (0.209)
Govts Responsibility x High Income       -0.783**   (0.289)
Market intervention x High Income        -0.831**   (0.289)
Redistributive Measures x High Income    -1.206**   (0.295)
Num.Obs.                                 1516
+ p < 0.1, * p < 0.05, ** p < 0.01
predict_ologit_income <- ggpredict(ologit1_income,
                                   terms = c("exp_treatment_inc", "highincome"))

predict_ologit_income$group <- factor(predict_ologit_income$group,
                                      labels = c("Low Income", "High Income"))
predict_ologit_income <- tibble(predict_ologit_income)
predict_ologit_income$response.level <- factor(
  predict_ologit_income$response.level,
  levels = c("Definitely should not be", "Probably should not be",
             "Probably should be", "Definitely should be")
)

predict_lpm <- ggpredict(lm1_income, terms = c("exp_treatment_inc", "highincome"))
predict_lpm <- tibble(predict_lpm)
predict_lpm$group <- factor(predict_lpm$group, labels = c("Low Income", "High Income"))

predict_lpm_long <- predict_lpm %>%
  mutate(Support    = predicted,
         `No Support` = 1 - predicted) %>%
  pivot_longer(cols = c("Support", "No Support"),
               names_to = "response.level",
               values_to = "prob") %>%
  mutate(response.level = factor(response.level,
                                 levels = c("No Support", "Support")))

lpm_plot <- ggplot(predict_lpm_long,
                   aes(x = x, y = prob, fill = response.level)) +
  geom_col(position = "stack", width = 0.65) +
  facet_wrap(~group) +
  scale_fill_manual(values = c("#c0392b", "#136266"), name = "Response") +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  labs(title = "Linear Probability Model",
       subtitle = "Binary outcome",
       x = "Treatment Condition", y = "Predicted Probability") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom",
        axis.text.x = element_text(angle = 25, hjust = 1),
        strip.text = element_text(face = "bold", size = 13),
        plot.title = element_text(face = "bold"))

ologit_plot <- ggplot(predict_ologit_income,
                      aes(x = x, y = predicted, fill = response.level)) +
  geom_col(position = "stack", width = 0.65) +
  facet_wrap(~group) +
  scale_fill_manual(
    values = c("#c0392b", "#e67e22", "#27ae60", "#136266"),
    name = "Response"
  ) +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  labs(title = "Ordered Logit",
       subtitle = "Full 4-point scale",
       x = "Treatment Condition", y = "Predicted Probability") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom",
        axis.text.x = element_text(angle = 25, hjust = 1),
        strip.text = element_text(face = "bold", size = 13),
        plot.title = element_text(face = "bold"))

patchwork::wrap_plots(lpm_plot, ologit_plot, ncol = 1) +
  plot_annotation(
    title = "LPM vs. Ordered Logit: Predicted Response Distributions",
    subtitle = "Top: binary recoding used by authors. Bottom: full ordinal scale.",
    theme = theme(plot.title = element_text(face = "bold", size = 14),
                  plot.subtitle = element_text(size = 11))
  )
LPM vs. Ordered Logit: predicted response distributions

The LPM (top) shows only whether respondents cross the binary support threshold. The ordered logit (bottom) reveals that concreteness also shifts intensity: low-income respondents move into “Definitely should be” as framing becomes more concrete, while high-income respondents’ “Definitely should be” segment shrinks without a corresponding spike in strong opposition — they become ambivalent rather than strongly opposed. This nuance is invisible in the binary coding.
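
To make the link between the coefficients in the table and the stacked bars concrete, the sketch below converts a linear predictor into the four category probabilities of a proportional-odds model. The Redistributive Measures coefficient (0.947) comes from the table above; the cutpoints `zeta` are hypothetical placeholders, since the fitted thresholds are not reported here, and `category_probs()` is an illustrative helper rather than part of the original analysis. The resulting probabilities are therefore illustrative only.

```r
# Proportional-odds (ordered logit) model: P(Y <= k) = plogis(zeta_k - eta)
# NOTE: the cutpoints are hypothetical; only the 0.947 coefficient is from the table.
zeta <- c(-1.5, -0.5, 1.0)  # hypothetical thresholds between the 4 categories
levels_4pt <- c("Definitely should not be", "Probably should not be",
                "Probably should be", "Definitely should be")

category_probs <- function(eta, zeta) {
  cum <- c(plogis(zeta - eta), 1)  # cumulative probabilities; the last is 1
  probs <- diff(c(0, cum))         # successive differences give category probabilities
  setNames(probs, levels_4pt)
}

# Low-income respondents (highincome = 0): reference condition vs. Redistributive Measures
p_reference <- category_probs(eta = 0,     zeta = zeta)
p_redist    <- category_probs(eta = 0.947, zeta = zeta)

round(rbind(Reference = p_reference, `Redistributive Measures` = p_redist), 3)
```

A positive coefficient shifts the whole distribution toward the higher categories, which is why a single coefficient can change the size of every segment in the stacked bars and not only the share above the binary support threshold.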


5. Permutation / Randomization Inference

This is the robustness check the authors do not conduct. Rather than relying on asymptotic normal approximations for p-values, randomization inference asks: if treatment had no effect, how often would we observe an effect as large as the one we found, just by chance reassignment of treatment labels?

The key idea from Regression and Other Stories (ROS; Gelman, Hill, & Vehtari, 2021), Chapter 18, is that in a randomized experiment, treatment assignment is arbitrary. Because respondents were randomly assigned to question conditions, any reshuffling of those assignments would have been equally likely. Under the sharp null that treatment has no effect on any respondent, each person's outcome is the same regardless of the label they receive, so we can reshuffle the labels and recompute the estimate to see what chance alone produces. The replicate() call does exactly this: it shuffles the treatment labels 1,000 times and recomputes the treatment effect each time, building up the distribution of effects under pure chance. The ri_pvalue is then the share of shuffled effects that are at least as large in absolute value as the one we actually observed. If that share is very small, as it is here, that is strong evidence the observed effect is not just a lucky draw from the randomization.

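set.seed(123)  # make the permutations reproducible

# Binary indicator for the Redistributive Measures arm. Note that the comparison
# group pools every other condition, including the other treatment arms, so the
# effect below is Redistributive Measures vs. all other arms combined.
ri_data <- Survey2_experiment %>%
  filter(!is.na(MainQ_support), !is.na(exp_treatment_inc)) %>%
  mutate(treat_redist = as.integer(exp_treatment_inc == "Redistributive Measures"))

# Observed effect: with a single binary regressor, the OLS slope equals the
# difference in mean support between the two groups
obs_effect <- lm(MainQ_support ~ treat_redist, data = ri_data) %>%
  coef() %>%
  .["treat_redist"]

# Shuffle the treatment labels and re-estimate the effect under the sharp null
n_perms <- 1000
perm_effects <- replicate(n_perms, {
  ri_data$treat_perm <- sample(ri_data$treat_redist)
  coef(lm(MainQ_support ~ treat_perm, data = ri_data))["treat_perm"]
})

# Two-sided p-value: share of permuted effects at least as large in absolute value
ri_pvalue <- mean(abs(perm_effects) >= abs(obs_effect))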

cat("Observed treatment effect (Redistributive Measures):", round(obs_effect, 4), "\n")
## Observed treatment effect (Redistributive Measures): 0.077
cat("Randomization inference p-value:", round(ri_pvalue, 4), "\n")
## Randomization inference p-value: 0.003
perm_df <- tibble(effect = perm_effects)

ggplot(perm_df, aes(x = effect)) +
  geom_histogram(bins = 60, fill = "#1E7875", alpha = 0.75, color = "white") +
  geom_vline(xintercept = obs_effect,
             color = "#c0392b", linewidth = 1.2, linetype = "dashed") +
  geom_vline(xintercept = -obs_effect,
             color = "#c0392b", linewidth = 1.2, linetype = "dashed", alpha = 0.5) +
  annotate("text", x = obs_effect + 0.005, y = 60,
           label = paste0("Observed\neffect = ", round(obs_effect, 3)),
           color = "#c0392b", hjust = 0, size = 4, fontface = "bold") +
  annotate("text", x = 0, y = 75,
           label = paste0("RI p-value = ", round(ri_pvalue, 3)),
           color = "#1a1a2e", size = 4.5, fontface = "bold") +
  labs(
    title = "Randomization Inference: Redistributive Measures Treatment",
    subtitle = paste0(n_perms, " random permutations under sharp null of no effect"),
    x = "Permuted Treatment Effect",
    y = "Count"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"))
Permutation distribution under sharp null of no effect

The permutation distribution shows that an effect as large as the observed one would rarely occur under the sharp null of no effect. Across 1,000 random reassignments of treatment labels, only about 3 (a share of 0.003) produced an effect at least as large in absolute value as the one observed, which is the randomization inference p-value of 0.003. This corroborates the conventional OLS p-value and indicates the result is not an artifact of asymptotic approximations.
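
The permutation logic can also be checked on simulated data where the truth is known. The sketch below is not the authors' analysis and does not use `Survey2_experiment`: the data frame `sim`, its arm labels, the assumed support rates, and the helper `diff_in_means()` are all hypothetical. It differs from the chunk above in two ways that are offered as options rather than corrections: it restricts the comparison to one treated arm versus the control arm instead of pooling all other arms, and it uses the (count + 1) / (permutations + 1) form of the p-value, which can never be exactly zero.

```r
set.seed(42)

# Hypothetical four-arm experiment; the support rates are invented for illustration.
n_per_arm <- 300
arms  <- c("Control", "Arm A", "Arm B", "Arm C")
rates <- c(0.60, 0.62, 0.64, 0.70)
sim <- data.frame(
  arm     = rep(arms, each = n_per_arm),
  support = rbinom(4 * n_per_arm, size = 1, prob = rep(rates, each = n_per_arm))
)

# Keep only the two arms being compared, so the contrast is treated vs. control
two_arm <- sim[sim$arm %in% c("Control", "Arm C"), ]
treat   <- as.integer(two_arm$arm == "Arm C")

# Difference in means; identical to the OLS slope on a binary treatment indicator
diff_in_means <- function(y, z) mean(y[z == 1]) - mean(y[z == 0])

obs <- diff_in_means(two_arm$support, treat)

# Permutation distribution under the sharp null of no effect for anyone
n_perms <- 2000
perm <- replicate(n_perms, diff_in_means(two_arm$support, sample(treat)))

# Two-sided p-value with the +1 correction
p_ri <- (sum(abs(perm) >= abs(obs)) + 1) / (n_perms + 1)

cat("Observed difference in means:", round(obs, 3), "\n")
cat("RI p-value:", signif(p_ri, 3), "\n")
```

Because shuffling keeps the number of treated units fixed, each permutation mimics a complete randomization with the same arm sizes as the realized one; a design with blocking or clustering would need the shuffle to respect that structure.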


Summary of Robustness Findings

Across four robustness exercises, the core findings of Margalit & Raviv (2024) hold. First, heteroskedasticity-robust standard errors (HC3) do not change inference on any coefficient, so the heteroskedasticity built into the LPM is not materially affecting the results here. Second, a balance check on pre-treatment covariates confirms the randomization is well-executed: treatment conditions do not systematically differ on age, gender, income, education, or ideology. Third, splitting the sample by ideology reveals that the income reversal pattern is present within both liberal and conservative subgroups, but baseline support levels differ dramatically, suggesting that ideology conditions the magnitude of the self-interest effect. Fourth, randomization inference under the sharp null confirms the Redistributive Measures treatment effect is not an artifact of distributional assumptions. Taken together, these checks support the authors’ main conclusion while adding nuance to their self-interest interpretation.

References

Gelman, A., Hill, J., & Vehtari, A. (2021). Regression and other stories. Cambridge University Press.

Margalit, Y., & Raviv, S. (2024). Does support for redistribution mean what we think it means? Political Science Research and Methods, 12(4), 870–878. https://doi.org/10.1017/psrm.2023.57