Account for chromosome IDs when tracing intervals · Issue #103 · hyanwong/giglib · GitHub

Account for chromosome IDs when tracing intervals #103


Closed
hyanwong opened this issue Mar 13, 2024 · 4 comments

Comments

@hyanwong
Owner
hyanwong commented Mar 13, 2024

A relatively large task is to update the sample-resolving and MRCA-finding algorithms to account for different chromosomes (see #11 for the approach). At the moment we assume that an interval below a node, e.g. 0..100, can be intersected with an interval above the node, e.g. 50..200. This is only valid if both intervals are on the same chromosome. We therefore need to keep a collection of per-chromosome intervals (Portion objects) on the intervals stack, rather than just a single Portion object per node. We should probably store this as a dictionary (keyed by numerical chromosome ID) rather than a list, because there is no guarantee that the chromosomes for a given node will be numbered from 0..N.
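As a rough illustration of the dictionary idea, here is a minimal sketch, not the giglib implementation. To stay dependency-free it uses plain half-open `(left, right)` tuples in place of Portion objects, and the function names `intersect` and `intersect_by_chromosome` are hypothetical.

```python
def intersect(a, b):
    """Intersect two half-open intervals (left, right); return None if empty."""
    left, right = max(a[0], b[0]), min(a[1], b[1])
    return (left, right) if left < right else None


def intersect_by_chromosome(below, above):
    """Intersect two {chromosome_id: (left, right)} dicts, one chromosome at a time.

    Only chromosome IDs present in both dicts can contribute, so intervals
    on different chromosomes are never intersected with each other.
    """
    result = {}
    for chrom in below.keys() & above.keys():
        overlap = intersect(below[chrom], above[chrom])
        if overlap is not None:
            result[chrom] = overlap
    return result


# Chromosome IDs are arbitrary integers, not necessarily 0..N
below = {0: (0, 100), 7: (10, 20)}
above = {0: (50, 200), 3: (0, 500)}
shared = intersect_by_chromosome(below, above)
print(shared)  # {0: (50, 100)}
```

Keying by chromosome ID means a node that only carries, say, chromosomes 3 and 7 needs no placeholder entries for the IDs it lacks.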

However, for the time being, we can raise an error if we have any chromosome numbers other than the default.
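A stop-gap check of this kind might look like the sketch below. The function name `require_default_chromosome` and the choice of `NotImplementedError` are assumptions for illustration, not part of giglib; the default ID is a parameter, set here to 0 as settled on later in this thread.

```python
def require_default_chromosome(chromosome_ids, default=0):
    """Raise if any chromosome ID differs from the default (hypothetical helper)."""
    unsupported = sorted(set(chromosome_ids) - {default})
    if unsupported:
        raise NotImplementedError(
            f"Interval tracing only supports chromosome {default} for now; "
            f"found chromosome IDs {unsupported}"
        )


require_default_chromosome([0, 0, 0])  # passes silently
try:
    require_default_chromosome([0, 1, 5])
except NotImplementedError as err:
    print(err)
```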

Additionally, I think it would be neater to default to a chromosome of 0 (not -1). After all, even if we don't specify a chromosome, we are assuming there is one. I suppose -1 (or some other negative number) could be reserved for circular chromosomes.

@hyanwong
Owner Author

Default now set to 0 (I guess most species number from 1 anyway, so 0 is fine for indicating it's not been explicitly set).

We should check that specifying chromosome=1 (without defining a chromosome 0) works: no reason it shouldn't.

However, the sample-resolving and related algorithms will currently break for different chromosomes.

@hyanwong
Owner Author

It should be possible to set the chromosome numbers to arbitrary values for different individuals, and the recombination-breakpoint-finding routines should "just work". This would be a good unit-test.

Chromosome details should, I think, be stored in the node metadata, once we implement it. There is no point in having a chromosome table, as the identity of chromosomes can differ between individuals.

@hyanwong changed the title from "Account for chromosome IDs, and set default to 0" to "Account for chromosome IDs when tracing intervals" on Mar 15, 2024
@hyanwong
Owner Author

We will need to change the format of the MRCAdict to allow different intervals to exist on different chromosomes.

@hyanwong
Owner Author

Done in #112
