Description
See notebook. Summary:
- read dataset using `utils.read_data`
- examine some values in the `LRE Ages 3-5 - Full Incl #` column
- plot the frequencies of the unique values in the subsample
- instantiate Ptype and fit schema to subsample
- plot posterior distribution for column type and row type
- list the missing values for the column
- replace those values in the column by a new missing data encoding
- run Ptype again to verify that the new encoding is correctly identified as missing
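
The replacement step can be sketched with pandas alone. This is only a minimal sketch under stated assumptions: the sample values, the missing-data encodings (`-` and `x`), and the new encoding `NA` are hypothetical stand-ins. In the notebook the list of missing values comes from Ptype, whose calls are not reproduced here.

```python
import pandas as pd

COLUMN = "LRE Ages 3-5 - Full Incl #"

# Hypothetical subsample; every value is kept as a string so the raw
# encodings stay visible.
df = pd.DataFrame({COLUMN: ["12", "-", "7", "x", "12", "-"]}, dtype=str)

# Assumed result of the "list the missing values" step.
missing_values = ["-", "x"]
NEW_ENCODING = "NA"  # hypothetical replacement encoding

# Replace every listed missing value with the single new encoding.
df[COLUMN] = df[COLUMN].replace(missing_values, NEW_ENCODING)

# Sanity check: none of the old encodings remain in the column.
remaining = df[COLUMN].isin(missing_values).sum()
```

After this, fitting the schema again on `df` is what confirms whether the new encoding is picked up as missing.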
To do:
- read dataset directly rather than via `utils.read_data`
- nothing gained by plots – remove
- use Ptype to browse unique values?
- subsume missing probabilities plot with `col.get_missing_values()`?
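
One possible way to read the dataset directly and browse unique values without a plot is plain pandas. A minimal sketch, assuming a CSV source: the inline sample and the `State` column are hypothetical and stand in for the real file, whose path is not given here.

```python
import io

import pandas as pd

# Stand-in for the real file (hypothetical contents).
sample_csv = io.StringIO(
    "State,LRE Ages 3-5 - Full Incl #\n"
    "A,12\n"
    "B,-\n"
    "C,12\n"
)

# dtype=str keeps the raw encodings; keep_default_na=False stops pandas
# from turning strings such as "NA" into NaN before type inference runs.
df = pd.read_csv(sample_csv, dtype=str, keep_default_na=False)

# Unique values with their frequencies, most common first.
unique_counts = df["LRE Ages 3-5 - Full Incl #"].value_counts()
```

Reading everything as strings is a deliberate choice here, since the point of the exercise is to let Ptype, not pandas, decide which values count as missing.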