chr: common homologous regions
This package contains functions for fast search of homologous regions shared by multiple closely related genomes. The interface of chr consists of the Intersect function and the Parameters struct:
chr.Intersect(chr.Parameters)
The function returns a slice of pointers to fasta sequences, []*fasta.Sequence. The type Sequence is defined in the fasta package.
type Parameters struct {
	Reference       []*fasta.Sequence
	ShiftRefRight   bool
	TargetDir       string
	Threshold       float64
	ShustrPval      float64
	CleanSubject    bool
	CleanQuery      bool
	PrintSegSitePos bool
	PrintN          bool
	PrintOneBased   bool
}
Fields of this data structure contain parameters used to call Intersect(). The parameters include:
- a reference slice of fasta sequences
- a switch to shift the output coordinates to the right by a given number
- a path to the directory of target genomes minus the reference
- a threshold, the minimum fraction of intersecting genomes
- a p-value of the shustring length (needed for sus.Quantile)
- a switch to clean the subject's sequence. To clean a sequence is to remove non-ATGC nucleotides.
- a switch to clean query sequences
- a switch to print positions of segregation sites in the output's headers
- a switch to print N at the positions of mismatches
- a switch to print one-based coordinates.
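
The cleaning step described in the list can be illustrated with a minimal sketch. The function cleanSequence below is a hypothetical helper written for this example and is not part of chr; it also assumes that lowercase a, t, g, c count as valid nucleotides, which the package itself may handle differently.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanSequence is a hypothetical helper (not part of chr).
// It returns seq with every character other than A, T, G, C removed.
// Assumption: lowercase a, t, g, c are kept as well.
func cleanSequence(seq string) string {
	var b strings.Builder
	b.Grow(len(seq))
	for i := 0; i < len(seq); i++ {
		switch seq[i] {
		case 'A', 'T', 'G', 'C', 'a', 't', 'g', 'c':
			b.WriteByte(seq[i])
		}
	}
	return b.String()
}

func main() {
	// N, R, Y and the gap character are dropped.
	fmt.Println(cleanSequence("ACGTNNRYacgt-ACGT")) // prints ACGTacgtACGT
}
```

Note that removing characters shortens the sequence, so positions in a cleaned sequence no longer correspond directly to positions in the original input.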