RegionBased
FAN-C builds extensively on the genomic_regions package, which provides a unified
interface for most types of region-based genomic data. We highly recommend
reading the documentation
of that package before going into the details of FAN-C, as many of the concepts discussed
therein are central to the handling of data in FAN-C.
You can check whether a FAN-C object supports the RegionBased interface with:
import genomic_regions as gr
isinstance(o, gr.RegionBased) # True for objects supporting the regions interface
The current list of FAN-C objects supporting the RegionBased interface is:
InsulationScore, DirectionalityIndex, Boundaries, InsulationScores,
DirectionalityIndexes, FoldChangeScores, DifferenceScores, DifferenceRegions,
FoldChangeRegions, CoolerHic, JuicerHic, Hic, ABCompartmentMatrix,
DifferenceMatrix, FoldChangeMatrix, PeakInfo, and RaoPeakInfo.
Any object built on that foundation supports, for example, region iterators:
for region in hic.regions:
    print(region)
    print(region.chromosome, region.start, region.end, region.strand)
    print(region.is_forward())
    print(region.center)
    # ...
Range queries:
for region in hic.regions('chr1:3mb-12mb'):
    print(region.chromosome)  # chr1
    # ...
and many more convenient features. All of these queries return GenomicRegion
objects, which provide methods and properties for inspecting and manipulating
regions:
len(region)  # returns the size of the region in base pairs
region.center  # returns the base (or fraction of base) at the center of the region
region.five_prime  # returns the starting base at the 5' end of the region
region.three_prime  # returns the starting base at the 3' end of the region
region.is_forward()  # True if strand is '+' or '+1'
region.is_reverse()  # True if strand is '-' or '-1'
region.attributes  # returns all attribute names in this region object
region.copy()  # returns a shallow copy of this region
region.to_string()  # returns a region identifier string describing the region
region = gr.as_region('chr12:12.5Mb-18Mb')
region.overlaps('chr12:11Mb-13Mb') # True
region.overlaps('chr12:11Mb-11.5Mb') # False
region.overlaps('chr1:11Mb-13Mb') # False
Refer to the genomic_regions documentation for all the details.
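To make the semantics of the properties, overlap checks, and range queries above more concrete, here is a minimal pure-Python sketch. It is not the genomic_regions implementation: the names SketchRegion, from_string, and query are hypothetical, and the sketch assumes inclusive coordinates, a strand encoded as 1 or -1, and range queries that return every region overlapping the requested interval.

```python
import re
from dataclasses import dataclass

# Multipliers for the unit suffixes seen in the examples ("Mb", "mb", ...).
_UNITS = {"": 1, "b": 1, "kb": 1_000, "mb": 1_000_000, "gb": 1_000_000_000}
_REGION_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>[\d.]+)(?P<su>[a-z]*)"
    r"-(?P<end>[\d.]+)(?P<eu>[a-z]*)$",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class SketchRegion:
    """Hypothetical stand-in for a genomic region; not the library class."""
    chromosome: str
    start: int
    end: int
    strand: int = 1  # assumed encoding: 1 for forward, -1 for reverse

    @classmethod
    def from_string(cls, spec):
        """Parse strings such as 'chr12:12.5Mb-18Mb' into a region."""
        match = _REGION_RE.match(spec.replace(",", ""))
        if match is None:
            raise ValueError(f"cannot parse region string: {spec!r}")
        start = float(match["start"]) * _UNITS[match["su"].lower()]
        end = float(match["end"]) * _UNITS[match["eu"].lower()]
        return cls(match["chrom"], int(start), int(end))

    def __len__(self):
        # Assumes inclusive coordinates: a region from 1 to 10 spans 10 bases.
        return self.end - self.start + 1

    @property
    def center(self):
        # May be fractional when the region spans an even number of bases.
        return self.start + (self.end - self.start) / 2

    @property
    def five_prime(self):
        return self.end if self.strand == -1 else self.start

    @property
    def three_prime(self):
        return self.start if self.strand == -1 else self.end

    def overlaps(self, other):
        """True if both regions share a chromosome and at least one base."""
        if isinstance(other, str):
            other = SketchRegion.from_string(other)
        return (self.chromosome == other.chromosome
                and self.start <= other.end
                and other.start <= self.end)


def query(regions, spec):
    """Return the regions that overlap the range described by `spec`."""
    target = SketchRegion.from_string(spec)
    return [r for r in regions if r.overlaps(target)]


region = SketchRegion.from_string("chr12:12.5Mb-18Mb")
print(region.overlaps("chr12:11Mb-13Mb"))    # True
print(region.overlaps("chr12:11Mb-11.5Mb"))  # False
print(region.overlaps("chr1:11Mb-13Mb"))     # False

# Twenty 1 Mb bins on chr1, queried the same way as hic.regions('chr1:3mb-12mb').
bins = [SketchRegion("chr1", s, s + 999_999)
        for s in range(1, 20_000_001, 1_000_000)]
hits = query(bins, "chr1:3mb-12mb")
print(len(hits), hits[0].start, hits[-1].end)
```

The linear scan in query is only meant to illustrate the expected result of a range query; it says nothing about how FAN-C, Cooler, or Juicer files index their regions internally.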
Similar to the regions interface for handling collections of genomic regions,
FAN-C implements interfaces for working with pairs of genomic regions (edges)
and for matrix operations (matrix). These work in exactly the same way for FAN-C,
Cooler, and Juicer files, so all of these file types are directly compatible with
FAN-C architectural functions such as the insulation score or AB compartment analyses, …
These interfaces will be introduced in the following sections, starting with RegionPairsContainer.