Implementations of importance-sampling, direct, and hybrid estimators for off-policy evaluation (OPE).
Estimators:
- Importance sampling (IS)
- Weighted importance sampling (WIS)
- Per-decision importance sampling (PDIS)
- Weighted per-decision importance sampling (WPDIS)
- Fitted Q evaluation (FQE)
- Doubly robust (DR)
- Weighted doubly robust (WDR)
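
As a rough illustration of how the four importance-sampling estimators in this list differ, the following is a minimal sketch in plain Python. It is not taken from this implementation. The trajectory layout (a list of `(pi_e_prob, pi_b_prob, reward)` tuples per step), the function names, and the `gamma` parameter are all assumptions made for the example. Trajectories are assumed non-empty, and the WPDIS variant additionally assumes they share the same length.

```python
# Hypothetical data layout: each trajectory is a list of steps
# (pi_e_prob, pi_b_prob, reward), where pi_e_prob and pi_b_prob are the
# evaluation- and behavior-policy probabilities of the logged action.

def cumulative_ratios(traj):
    """Return the cumulative importance ratio rho_{0:t} for every step t."""
    ratios, rho = [], 1.0
    for pi_e, pi_b, _ in traj:
        rho *= pi_e / pi_b
        ratios.append(rho)
    return ratios

def discounted_return(traj, gamma):
    """Discounted sum of the logged rewards of one trajectory."""
    return sum(gamma ** t * r for t, (_, _, r) in enumerate(traj))

def is_estimate(trajs, gamma=1.0):
    """IS: mean of (full-trajectory ratio * return)."""
    vals = [cumulative_ratios(tr)[-1] * discounted_return(tr, gamma) for tr in trajs]
    return sum(vals) / len(vals)

def wis_estimate(trajs, gamma=1.0):
    """WIS: like IS, but normalized by the sum of the ratios."""
    weights = [cumulative_ratios(tr)[-1] for tr in trajs]
    returns = [discounted_return(tr, gamma) for tr in trajs]
    return sum(w * g for w, g in zip(weights, returns)) / sum(weights)

def pdis_estimate(trajs, gamma=1.0):
    """PDIS: each reward is weighted only by the ratio up to its own step."""
    total = 0.0
    for tr in trajs:
        rhos = cumulative_ratios(tr)
        total += sum(gamma ** t * rhos[t] * tr[t][2] for t in range(len(tr)))
    return total / len(trajs)

def wpdis_estimate(trajs, gamma=1.0):
    """WPDIS: PDIS with per-step normalization across trajectories."""
    horizon = len(trajs[0])  # assumes equal-length trajectories
    all_rhos = [cumulative_ratios(tr) for tr in trajs]
    total = 0.0
    for t in range(horizon):
        norm = sum(rhos[t] for rhos in all_rhos)
        weighted = sum(rhos[t] * tr[t][2] for rhos, tr in zip(all_rhos, trajs))
        total += gamma ** t * weighted / norm
    return total
```

The weighted variants divide by the sum of the importance ratios rather than by the number of trajectories, which typically lowers variance at the cost of some bias. Both weighted functions will raise `ZeroDivisionError` if every ratio in a normalizer is zero, so a real implementation would likely need to guard that case.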