Description
Problem Summary
I'm trying to estimate a random parameters logit (RPL) model in R using the logitr package, replicating results from Ahn & Lusk’s paper on sugar-sweetened beverage policies (https://doi.org/10.1111/ajae.12134; the data is in the supplementary material).
Each respondent (identified by the column total) answers 16 choice tasks (column question) and, in each task, chooses one of 6 alternatives (column Option). My dataset is in long format, where each row is one alternative in a choice task.
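For anyone who wants to see the layout without downloading the supplementary data, here is a minimal sketch of the same long format on a toy scale. The sizes (2 respondents, 3 tasks) and the randomly assigned Choice values are made up for illustration; only the column names total, question, Option and Choice come from my data.

```r
# Toy long-format layout: one row per alternative per choice task.
# Sizes are hypothetical (2 respondents, 3 tasks); the real data has
# 16 tasks per respondent and 6 alternatives per task.
set.seed(1)
toy <- expand.grid(Option = 1:6, question = 1:3, total = 1:2)
toy <- toy[, c("total", "question", "Option")]

# Mark exactly one chosen alternative per (respondent, task)
task <- interaction(toy$total, toy$question, drop = TRUE)
toy$Choice <- ave(
  rep(0, nrow(toy)), task,
  FUN = function(x) as.numeric(seq_along(x) == sample(length(x), 1))
)

head(toy, 6)
nrow(toy)  # 2 respondents * 3 tasks * 6 alternatives
```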
However, I keep encountering the following error when running logitr:
Error in checkRepeatedIDs("panelID", panelID, reps) :
The 'panelID' variable provided has repeated ID values.
Even after ensuring that each panelID value appears exactly 6 times (once per alternative in a choice task) and that obsID is unique for every alternative, the error persists.
What I’ve Tried So Far
1️⃣ Data Structure
Panel ID (panel_id) should track each respondent across choice tasks. I’ve defined it as:
data$panel_id <- paste0(data$total, "_", data$question)
Observation ID (obs_id) should uniquely identify each alternative in a choice task:
data$obs_id <- paste0(data$total, "_", data$question, "_", data$Option)
Checked that each choice task has exactly 6 alternatives:
table(table(data$panel_id)) # Should show a single entry, 6 (rows per panel_id)
any(duplicated(data$obs_id)) # Should return FALSE
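A slightly more explicit version of these checks, sketched on toy data with the same column names (the sizes are made up). It counts rows per choice task and choice tasks per respondent, which are the two levels the IDs are supposed to describe.

```r
# Toy data with the same column names; sizes are hypothetical.
toy <- expand.grid(Option = 1:6, question = 1:3, total = 1:2)
toy$panel_id <- paste0(toy$total, "_", toy$question)
toy$obs_id   <- paste0(toy$total, "_", toy$question, "_", toy$Option)

# Rows per choice task (expected: 6 for every task)
rows_per_task <- table(toy$panel_id)
stopifnot(all(rows_per_task == 6))

# Choice tasks per respondent (3 in the toy data, 16 in the real data)
tasks_per_resp <- tapply(toy$question, toy$total,
                         function(q) length(unique(q)))
stopifnot(all(tasks_per_resp == 3))

# How many distinct values each ID takes, relative to the number of rows
c(rows      = nrow(toy),
  panel_ids = length(unique(toy$panel_id)),
  obs_ids   = length(unique(toy$obs_id)))
```

With my definitions, panel_id takes one value per choice task and obs_id takes one value per row, which is what these counts confirm.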
2️⃣ Running the Model
library(logitr)
model <- logitr(
  data = data,
  outcome = "Choice",
  obsID = "obs_id",
  panelID = "panel_id",
  pars = c("Price", "Soda", "Diet", "Water", "Spark", "FlSp",
    "af_Pri", "af_Soda", "af_Diet", "af_Water", "af_Spark", "af_FlSp",
    "pri_in", "sd_in", "di_in", "wa_in", "sp_in", "fl_in",
    "pri_af_in", "sd_af_in", "di_af_in", "wa_af_in", "sp_af_in", "fl_af_in"),
  randPars = c("Soda" = "n", "Diet" = "n", "Water" = "n", "Spark" = "n", "FlSp" = "n",
    "af_Soda" = "n", "af_Diet" = "n", "af_Water" = "n", "af_Spark" = "n", "af_FlSp" = "n",
    "sd_in" = "n", "di_in" = "n", "wa_in" = "n", "sp_in" = "n", "fl_in" = "n",
    "sd_af_in" = "n", "di_af_in" = "n", "wa_af_in" = "n", "sp_af_in" = "n", "fl_af_in" = "n"),
  drawType = "halton",
  numDraws = 500
)
Ensured no missing values in panel_id:
sum(is.na(data$panel_id)) # Should return 0
Checked for unintended duplicates:
duplicated_ids <- data$panel_id[duplicated(data$panel_id)]
table(duplicated_ids) # Each panel_id repeats by design (6 rows per task), so every count here should be exactly 5
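Since duplicated() cannot tell expected repeats from unexpected ones, another check I can think of is whether each ID occupies one contiguous block of rows. This is only my guess at what the error might be testing (I have not read the logitr source), and has_contiguous_ids is a helper name I made up. A sketch using base R's rle():

```r
# Hypothetical helper: TRUE if every ID value sits in one contiguous
# block of rows, i.e. no ID reappears after a different ID.
has_contiguous_ids <- function(id) {
  runs <- rle(as.character(id))
  length(runs$values) == length(unique(runs$values))
}

sorted_ids   <- c("1_1", "1_1", "1_2", "1_2", "2_1", "2_1")
shuffled_ids <- c("1_1", "1_2", "1_1", "1_2", "2_1", "2_1")

has_contiguous_ids(sorted_ids)    # TRUE
has_contiguous_ids(shuffled_ids)  # FALSE
```

On the real data this would be has_contiguous_ids(data$panel_id). If it returned FALSE, sorting with data <- data[order(data$total, data$question, data$Option), ] before building the IDs would be the next thing to try.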
Key Issue
Despite these checks, logitr still flags panelID as having repeated ID values.
I suspect that logitr expects panelID to be defined differently, but I’m not sure how to fix it while keeping the structure correct.
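One variant I have not been able to confirm: my reading of the logitr documentation is that obsID identifies each choice observation (so it repeats across the alternatives in a task) and panelID identifies the individual. If that reading is right, both IDs would sit one level up from where I have them. The sketch below shows that alternative on toy data; the sizes are made up, and the column names choice_id and resp_id are hypothetical.

```r
# Toy data with the same column names; sizes are hypothetical.
toy <- expand.grid(Option = 1:6, question = 1:3, total = 1:2)
toy <- toy[order(toy$total, toy$question, toy$Option), ]

# Assumption: one obsID value per choice task (shared by its 6 rows) ...
task_key <- paste0(toy$total, "_", toy$question)
toy$choice_id <- as.integer(factor(task_key, levels = unique(task_key)))

# ... and one panelID value per respondent (shared by all of their tasks).
toy$resp_id <- toy$total

table(table(toy$choice_id))  # rows per choice task
table(table(toy$resp_id))    # rows per respondent

# The call would then pass these columns instead (not run here), with
# the same pars and randPars as in my model:
# logitr(data = toy, outcome = "Choice",
#        obsID = "choice_id", panelID = "resp_id", ...)
```

I would still like confirmation that this is the structure logitr expects before relying on it.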
What I Need Help With:
- Is my definition of panelID correct for a panel RPL model?
- Are there additional logitr requirements for panel data that I’m missing?
- Is there a better way to structure obsID and panelID to avoid this issue?
- If anyone has successfully run an RPL model in logitr, could you share how you structured your data?
**Any insights or guidance would be greatly appreciated!
Thanks in advance for your help! 🙌**