8000 Complexity of examples · Issue #52 · ashander/ftprime · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Complexity of examples #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to o 8000 ur terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ashander opened this issue Nov 2, 2017 · 4 comments
Open

Complexity of examples #52

ashander opened this issue Nov 2, 2017 · 4 comments

Comments

@ashander
Copy link
Owner
ashander commented Nov 2, 2017

Ideally we'll have a few stand-alone example scripts (eg as contributed by @hyanwong in #48 ) but these will:

  1. not be too complex involving lots of library-ish code and new objects or functions to interpret
  2. not depend on ad hoc statistical code outside of msprime
@ashander
Copy link
Owner Author
ashander commented Nov 2, 2017

To answer question in #48 (comment) @hyanwong it'd be nice to have examples that are relatively simple to read. I realize that your example grew out of an ongoing project though.

The current state is great (thanks for finding lots of bugs/issues!) but adding a lot more code complexity to the PR /branch would make things hard to understand as an ftprime example.

@hyanwong
Copy link
hyanwong commented Nov 2, 2017

Yes, I agree about making them easy to read / simpler & not relying on external code. I feel a bit guilty about bloating the code. However, I do think it is worth having examples that can be called either as a python routine in their own right or as a main() routine using argparse. This needn't add much to the complexity if it is done right, I suspect. FWIW, I think I've done it in too hacky a manner in selective_sweep.py. Also, as an aside, I think it's also worth using import logging and logging.basicConfig(...) rather than your own code for log files (easier to turn this on and off)

@ashander
Copy link
Owner Author
ashander commented Nov 2, 2017

Cool. Those are good comments and I agree separating code from main() is nice style. It's all a balancing act.

@hyanwong
Copy link
hyanwong commented Nov 2, 2017

Code to allow SFS-like testing of example files is at hyanwong@40d01ae

Since it updates the SFS incrementally, you will need to pass it a function to calculate various statistics iteratively. I use a hacked together script which could probably do with some tidying, something like (untested)

from SFS import incrementalSFS

def H(n):
    """Returns an approximate value of n-th harmonic number.

       http://en.wikipedia.org/wiki/Harmonic_number
    """
    # Euler-Mascheroni constant
    gamma = 0.57721566490153286060651209008240243104215933593992
    return gamma + math.log(n) + 0.5/n - 1./(12*n**2) + 1./(120*n**4)

def Theta_W(sfs, n):
    """
    Calculate the (mutationless) Watterson estimator
    """
    return sum(sfs.values()) / H(n)

def Theta_pi(sfs, n):
    """
    Calculate the (mutationless) nucleotide diversity
    2 sum(i(n−i)*ξi) / n / (n-1)
    """
    return 2 * sum([i*(n-i)*sfs[i] for i in sfs.keys()]) / n / (n-1)

def Theta_H(sfs, n):
    """
    Calculate the theta estimate used in Fay & Wu's H
    2 * sum(i^2 ξi) / n / (n-1)
    """
    return 2 * sum([(i**2) * sfs[i] for i in sfs.keys()]) / n / (n-1)
    
def save_estimates(start, end, sfs, n):
    watt = Theta_W(sfs, n)
    ndiv = Theta_pi(sfs, n)
    tH = Theta_H(sfs, n)
    T_W.append((watt, end-start))
    T_pi.append((ndiv, end-start))
    T_H.append((tH, end-start))
    D.append(((end+start)/2, ndiv - watt))
    FayWuH.append(((end+start)/2, ndiv - tH))

ts = msprime.load("sweepfileXXX.hdf5")
#globals (yuck)
T_W, T_pi, T_H, D, FayWuH = [], [], [], [], []
incrementalSFS(ts, save_estimates)

#plot D here using something like 
#import matplotlib
#plt.plot(*zip(*D))
#plt.set_xlim(0,ts.get_sequence_length())

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
0