Bug in regression_EM.py?

I'm a little rusty at this stuff but I believe there might be a bug in the implementation of the position bias EM algorithm here for p_r1 = reshaped_train_labels + (1 - reshaped_train_labels) * p_e0_r1_c0. In particular, I think p_r1 for unclicked documents, (the 1-reshaped_train_labels case), is incorrect.

Using notation from the paper, page 614, we want to compute $P(R=1|C=0, q, k, d)$ by marginalizing $P(R=1, E|C=0, q, k, d)$ over $E$:

$$ \begin{aligned} P(R=1|C=0, q, k, d) =& \underbrace{P(R=1, E=1|C=0, q, k, d)}_{=0} P(E=1|C=0, q, k, d) \\ & + P(R=1, E=0|C=0, q, k, d) P(E=0|C=0, q, k, d) \\ =& P(R=1, E=0|C=0, q, k, d) (1- P(E=1|C=0, q, k, d)) \end{aligned} $$

The implementation just has $P(R=1|C=0, q, k, d) = P(R=1, E=0|C=0, q, k, d)$.
That is, it's missing the $(1- P(E=1|C=0, q, k, d))$ term.

Am I missing something or is this a bug? Unfortunately, the paper just says "From this, we can compute the marginals P(E = 1|c, q, d, k) and P(R = 1|c, q, d, k)" without additional details and I can't find any other implementations of this paper to see how they're working. Like I said, I'm rusty at this so I may be missing something obvious.

Note, I think the same issue comes up here with $P(E=1|C=0, q,k, d)$, which is not explicitly named in the code like p_r1, but is computed as reshaped_train_labels + (1 - reshaped_train_labels) * p_e1_r0_c0 if I'm reading things correctly.

I think it should instead be

$$ P(E=1|C=0, q, k, d) = P(R=0, E=1|C=0, q, k, d) (1- P(R=1|C=0, q, k, d)) $$

Of course, these depend on each other, but with these two equations we can solve for $P(E=1|C=0, q, k, d)$ and $P(R=1|C=0, q, k, d)$ in terms of $P(R, E|C=0, q, k, d)$ which we can compute from gamma and self.propensity.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions