Description
Dear @kpmainali,
I recently found your package and would like to use alpha_mle to search for co-occurring core taxa in some microbiome WGS datasets (based on presence-absence data). However, I have a few questions that I hope you can clarify:
- Warning-related doubts. When I run the example, I see the following warnings:
  - "Warning in AlphInts(x, marg, lev = lev, scal = scal, pvalType = pvalType) : MLE = -Infty is capped, along with lower confidence limit", and a variant of this with "Infty" instead of "-Infty". Do these warnings matter for the alpha_mle results, or can they safely be ignored? If I understand "Cap on MLE" correctly, they only indicate that the estimate reached one of the limits of the interval (-10, 10). Is that correct?
  - "Warning in ML.Alpha(x = X, marg = c(sum(mA), sum(mB), nrow(sub))) : If the mA or mB value is equal to 0 or N, then the corresponding co-occurrence distribution is degenerate at min(mA,mB). This means that the co-occurrence count X will always be min(mA,mB) regardless of alpha. In this case alpha is undefined, and no computations are done." This corresponds to the errornotes entry "Degenerate co-occurrence distribution!" in the myout object. I understand that the affinity measure cannot be calculated in this case, but is there a biological interpretation or justification for it? For instance, what does the degenerate co-occurrence distribution indicate for Certhidea olivacea in the example?
- Significance, p-values and affinity function:
  - The p-value reported in myout$all is described as "p_value: the commonly reported P-value of the observed co-occurrences; computed by AlphInts()$pval". If I interpret it correctly, this p-value indicates whether the co-occurrence between a pair of entities (species) is significant. Could these p-values then be used to cross out non-significant co-occurrences for the different indices (alpha_mle, jaccard, sorensen, simpson)? If so, this might be a useful addition to the plotgg function, similar to what the corrplot package offers.
  - Should the p-values be corrected for multiple testing, e.g. with a Bonferroni or Benjamini-Hochberg correction?
- Range for alpha_mle and interpretation: Based on "Cap on MLE", I understand that the expected values of alpha_mle lie in the interval (-10, 10). Positive alpha_mle values denote positive associations (co-occurrence) and negative values denote negative associations (mutual exclusion). But over what ranges of alpha_mle would you categorise an association as weak or strong? For instance, in the example's alpha_mle_sig most values have an absolute value of around 6, but some values with an absolute value of around 3 are also significant. The values near |6| are the largest in the table, but do they represent a weak or a strong association, given that the theoretical maximum could be |10|?
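
To make the multiple-testing question above concrete, here is a minimal sketch of the kind of correction I have in mind. It only uses `p.adjust()` from base R; the p-values are made-up numbers standing in for `myout$all$p_value`, and the 0.05 threshold and the column names `keep_bonf` / `keep_bh` are my own assumptions, not anything provided by the package.

```r
# Hypothetical p-values standing in for myout$all$p_value
# (made-up numbers, not output from the package).
p_raw <- c(0.001, 0.012, 0.030, 0.047, 0.200, 0.650)

# p.adjust() ships with base R (stats package).
p_bonf <- p.adjust(p_raw, method = "bonferroni")  # family-wise error control
p_bh   <- p.adjust(p_raw, method = "BH")          # false discovery rate control

# Flag the pairs that stay below an assumed 0.05 threshold after correction.
res <- data.frame(
  p_raw, p_bonf, p_bh,
  keep_bonf = p_bonf < 0.05,
  keep_bh   = p_bh < 0.05
)
print(res)
```

Pairs with `keep_bh == FALSE` (or `keep_bonf == FALSE`) would be the ones I would cross out in the plot, if that is a valid use of these p-values.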
Thank you very much in advance, and congratulations on this interesting tool!
Sam