Make method to convert a GIG table into a Pandas dataframe · Issue #34 · hyanwong/giglib · GitHub

Make method to convert a GIG table into a Pandas dataframe #34

duncanMR opened this issue Oct 17, 2023 · 4 comments · Fixed by #36

@duncanMR
Collaborator

While we have a GIG data model, we still lack basic methods to do things like subsetting (e.g. intervals[intervals.parent == 2]) and sorting by a particular column. If we convert a GIG table to a Pandas DataFrame, doing such basic data manipulation is trivial, so I think having such a conversion method available will be useful for exploratory work. It will be hard to do subsetting and sorting more efficiently than Pandas without resorting to the tricks used in tskit, so I'm inclined to kick that down the road for now.
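As a rough illustration of the kind of manipulation meant here, the sketch below subsets and sorts a small Pandas DataFrame. The data is made up, and every column name other than `parent` (which appears in the example above) is an assumption rather than the actual GIG table layout.

```python
import pandas as pd

# Illustrative data only: "child", "left" and "right" are assumed column
# names, not necessarily those used by the GIG interval table.
intervals = pd.DataFrame(
    {
        "parent": [2, 0, 2, 1],
        "child": [5, 3, 4, 3],
        "left": [0, 0, 10, 20],
        "right": [10, 20, 30, 40],
    }
)

# Subsetting: keep only the rows whose parent is 2.
parent_2 = intervals[intervals.parent == 2]

# Sorting: order the rows by a single column, largest value first.
by_left = intervals.sort_values("left", ascending=False)

print(parent_2)
print(by_left)
```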

@duncanMR duncanMR self-assigned this Oct 17, 2023
@hyanwong
Owner

Yes, we could reasonably use pandas for the moment, as long as we don't end up relying on pandas functions in the scripts we write.

@hyanwong
Owner

A conversion to a pandas dataframe is quite reasonable anyway. FWIW in tsdate, to avoid a pandas dependency we just have an asdict method (or something like that), so you can do df = pd.DataFrame(obj.asdict()), but maybe that's too long a thing to type here, and we just want an obj.df() method?
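A minimal sketch of the `asdict` pattern described above, assuming a table that stores its rows as tuples. The `IntervalTable` class, its `add_row` method, and the column names are hypothetical stand-ins, not the real tsdate or giglib code. The point is that the class itself never imports pandas; only the caller does.

```python
import pandas as pd  # needed by the caller only, not by the class below


class IntervalTable:
    """Hypothetical stand-in for a GIG table (not the real giglib class)."""

    # Assumed column names, for illustration only.
    COLUMNS = ("parent", "child", "left", "right")

    def __init__(self):
        self._rows = []

    def add_row(self, parent, child, left, right):
        self._rows.append((parent, child, left, right))

    def asdict(self):
        # Return a plain dict mapping each column name to a list of values.
        # No pandas is required to build this.
        return {
            name: [row[i] for row in self._rows]
            for i, name in enumerate(self.COLUMNS)
        }


obj = IntervalTable()
obj.add_row(parent=2, child=5, left=0, right=10)
obj.add_row(parent=0, child=3, left=0, right=20)

# The caller decides whether to bring pandas in.
df = pd.DataFrame(obj.asdict())
print(df)
```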

@duncanMR
Collaborator Author

> A conversion to a pandas dataframe is quite reasonable anyway. FWIW in tsdate, to avoid a pandas dependency we just have an asdict method (or something like that), so you can do df = pd.DataFrame(obj.asdict()), but maybe that's too long a thing to type here, and we just want an obj.df() method?

I'm inclined to go for the latter for now, since I don't have any use for the dict object outside of turning it into a dataframe. I'm only going to use Pandas in exploratory work, so it won't be any trouble to drop the Pandas dependency later, I think?

@hyanwong
Owner

Sure. Maybe call it obj._df() (with the underscore), to emphasise that it's not part of the public API (not that we actually have one, but nevertheless, a good habit to get into)
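One way such an underscore-prefixed method could look is sketched below. This is not the method that was eventually merged: the `IntervalTable` class and its column names are hypothetical. Importing pandas inside the method is one possible design choice (an assumption here, not something agreed in the thread) that would keep pandas optional for code that never calls `_df()`.

```python
class IntervalTable:
    """Hypothetical stand-in for a GIG table (not the real giglib class)."""

    # Assumed column names, for illustration only.
    COLUMNS = ("parent", "child", "left", "right")

    def __init__(self, rows=()):
        self._rows = list(rows)

    def _df(self):
        # The leading underscore marks this as a private helper for
        # exploratory work. The import is deferred so that pandas is only
        # needed when the method is actually called.
        import pandas as pd

        return pd.DataFrame(self._rows, columns=list(self.COLUMNS))


table = IntervalTable([(2, 5, 0, 10), (0, 3, 0, 20)])
df = table._df()
print(df[df.parent == 2])
```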
