Merge missing data encodings · Issue #86 · alan-turing-institute/ptype · GitHub
Merge missing data encodings #86
Open
@rolyp

Description

See notebook. Summary:

  • read dataset using utils.read_data
  • examine some values in the LRE Ages 3-5 - Full Incl # column
  • plot the frequencies of the unique values in the subsample
  • instantiate Ptype and fit schema to subsample
  • plot posterior distribution for column type and row type
  • list the missing values for the column
  • replace those values in the column with a new missing data encoding
  • run Ptype again to verify that the new encoding is correctly identified as missing
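
The replacement step in the summary can be sketched with pandas alone. This is a minimal illustration, not the notebook's code: the helper name `merge_missing_encodings`, the encoding `"NA"`, and the toy values are all made up here, and in the notebook the list of missing values would come from Ptype rather than being typed by hand.

```python
import pandas as pd


def merge_missing_encodings(df, column, missing_values, new_encoding="NA"):
    """Return a copy of df in which every value of `column` listed in
    `missing_values` is replaced by the single `new_encoding`.
    (Hypothetical helper for illustration; not part of ptype.)"""
    out = df.copy()
    out[column] = out[column].replace(list(missing_values), new_encoding)
    return out


# Toy data (made up): "-" and "x" stand in for two different missing encodings.
col = "LRE Ages 3-5 - Full Incl #"
df = pd.DataFrame({col: ["12", "-", "x", "7", "-", "30"]})

merged = merge_missing_encodings(df, col, missing_values=["-", "x"])
print(merged[col].tolist())
```

Working on a copy keeps the original column available, so the before and after encodings can be compared when Ptype is run again.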

To do:

  • read dataset directly rather than via utils.read_data
  • remove the plots, which add nothing
  • use Ptype to browse unique values?
  • subsume missing probabilities plot with col.get_missing_values()?
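
One possible shape for the first two to-do items is sketched below, using only pandas. It assumes that the goal of reading the dataset directly is to keep every cell as a raw string, so that unusual missing data encodings reach Ptype unchanged; what `utils.read_data` actually does is not stated in this issue. The helper name `read_raw`, the column names, and the inline CSV are made up for illustration.

```python
import io

import pandas as pd


def read_raw(source):
    """Read a CSV with every cell kept as a string and no automatic NaN
    conversion. (Hypothetical helper; assumed intent, not ptype's API.)"""
    return pd.read_csv(source, dtype=str, keep_default_na=False)


# Toy CSV (made up): "-" and the empty string are both missing encodings.
csv_text = "district,value\nA,12\nB,-\nC,\nD,-\n"
df = read_raw(io.StringIO(csv_text))

# A frequency table of the unique values, as a plain-text stand-in for the plot.
counts = df["value"].value_counts()
print(counts)
```

A table of counts conveys the same information as the frequency plot and is easier to compare against the list of missing values that Ptype reports.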
