Cooler module¶

class fanc.compatibility.cooler.CoolerHic(*args, **kwargs)¶

Bases: fanc.matrix.RegionMatrixContainer, cooler.api.Cooler

add_contact(contact, *args, **kwargs)¶

Alias for add_edge()

Parameters

contact – Edge
args – Positional arguments passed to _add_edge()
kwargs – Keyword arguments passed to _add_edge()

add_contacts(contacts, *args, **kwargs)¶: Alias for add_edges()

add_edge(edge, check_nodes_exist=True, *args, **kwargs)¶

Add an edge / contact between two regions to this object.

Parameters

edge – Edge, dict with at least the attributes source and sink, optionally weight, or a list of length 2 (source, sink) or 3 (source, sink, weight).
check_nodes_exist – Make sure that there are nodes that match source and sink indexes
args – Positional arguments passed to _add_edge()
kwargs – Keyword arguments passed to _add_edge()
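The accepted input types can be pictured as a small dispatch step. The sketch below is not FAN-C's implementation: normalise_edge is a hypothetical helper that reduces each supported input to a (source, sink, weight) tuple, and leaving the weight as None when it is omitted is an assumption.

```python
from types import SimpleNamespace


def normalise_edge(edge):
    """Reduce dict, list/tuple, or Edge-like input to (source, sink, weight)."""
    if isinstance(edge, dict):
        # dicts need at least "source" and "sink"; "weight" is optional
        return edge["source"], edge["sink"], edge.get("weight")
    if isinstance(edge, (list, tuple)):
        if len(edge) == 2:
            return edge[0], edge[1], None
        if len(edge) == 3:
            return edge[0], edge[1], edge[2]
        raise ValueError("edge list must have length 2 or 3")
    # anything else is treated as an Edge-like object with attributes
    return edge.source, edge.sink, getattr(edge, "weight", None)


# SimpleNamespace stands in for an Edge object here (illustration only)
edge_like = SimpleNamespace(source=5, sink=6, weight=2.0)
normalised = [
    normalise_edge({"source": 1, "sink": 2}),
    normalise_edge([3, 4, 0.5]),
    normalise_edge(edge_like),
]
```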

add_edge_from_dict(edge, *args, **kwargs)¶

Direct method to add an edge from dict input.

Parameters: edge – dict with at least the keys “source” and “sink”. Additional keys will be loaded as edge attributes

add_edge_from_edge(edge, *args, **kwargs)¶

Direct method to add an edge from Edge input.

Parameters: edge – Edge

add_edge_from_list(edge, *args, **kwargs)¶

Direct method to add an edge from list or tuple input.

Parameters: edge – List or tuple. Should be of length 2 (source, sink) or 3 (source, sink, weight)

add_edge_simple(source, sink, weight=None, *args, **kwargs)¶

Direct method to add an edge from source and sink indexes and an optional weight.

Parameters

source – Source region index
sink – Sink region index
weight – Weight of the edge

add_edges(edges, *args, **kwargs)¶

Bulk-add edges from a list.

List items can be any of the supported edge types: list, tuple, dict, or Edge. Repeatedly calls add_edge(), so may be inefficient for large amounts of data.

Parameters: edges – List (or iterator) of edges. See add_edge() for details

add_region(region, *args, **kwargs)¶

Add a genomic region to this object.

This method offers some flexibility in the types of objects that can be loaded. See parameters for details.

Parameters: region – Can be a GenomicRegion, a str in the form ‘<chromosome>:<start>-<end>[:<strand>]’, a dict with at least the fields ‘chromosome’, ‘start’, and ‘end’, optionally ‘ix’, or a list of length 3 (chromosome, start, end) or 4 (ix, chromosome, start, end).

static bin_intervals(intervals, bins, interval_range=None, smoothing_window=None, nan_replacement=None, zero_to_nan=False)¶

Bin a given set of intervals into a fixed number of bins.

Parameters

intervals – iterator of tuples (start, end, score)
bins – Number of bins to divide the region into
interval_range – Optional. Tuple (start, end) in base pairs of the range to be binned. Useful if the intervals argument does not cover the exact genomic range to be binned.
smoothing_window – Size of window (in bins) to smooth scores over
nan_replacement – NaN values in the scores will be replaced with this value
zero_to_nan – If True, will convert bins with score 0 to NaN

Returns

iterator of tuples: (start, end, score)

static bin_intervals_equidistant(intervals, bin_size, interval_range=None, smoothing_window=None, nan_replacement=None, zero_to_nan=False)¶

Bin a given set of intervals into bins with a fixed size.

Parameters

intervals – iterator of tuples (start, end, score)
bin_size – Size of each bin in base pairs
interval_range – Optional. Tuple (start, end) in base pairs of the range to be binned. Useful if the intervals argument does not cover the exact genomic range to be binned.
smoothing_window – Size of window (in bins) to smooth scores over
nan_replacement – NaN values in the scores will be replaced with this value
zero_to_nan – If True, will convert bins with score 0 to NaN

Returns

iterator of tuples: (start, end, score)
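To make the fixed-size binning concrete, here is a minimal sketch. It is not the FAN-C implementation: it assumes inclusive coordinates, non-overlapping input intervals, and an overlap-weighted mean as the per-bin score, and bin_equidistant is a hypothetical name. Smoothing and NaN replacement are left out.

```python
import math


def bin_equidistant(intervals, bin_size, interval_range):
    """Overlap-weighted mean score per fixed-size bin (inclusive coordinates)."""
    range_start, range_end = interval_range
    intervals = list(intervals)
    binned = []
    bin_start = range_start
    while bin_start <= range_end:
        # the last bin is truncated at the end of the range
        bin_end = min(bin_start + bin_size - 1, range_end)
        weighted_sum, covered = 0.0, 0
        for start, end, score in intervals:
            overlap = min(end, bin_end) - max(start, bin_start) + 1
            if overlap > 0:
                weighted_sum += score * overlap
                covered += overlap
        # bins without any overlapping interval get NaN
        score = weighted_sum / covered if covered else math.nan
        binned.append((bin_start, bin_end, score))
        bin_start = bin_end + 1
    return binned


intervals = [(1, 100, 1.0), (101, 200, 3.0)]
binned = bin_equidistant(intervals, bin_size=150, interval_range=(1, 300))
```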

property bin_size¶

Return the length of the first region in the dataset.

Assumes all bins have equal size.

Returns: int

binned_regions(region=None, bins=None, bin_size=None, smoothing_window=None, nan_replacement=None, zero_to_nan=False, *args, **kwargs)¶

Same as region_intervals, but returns GenomicRegion objects instead of tuples.

Parameters

region – String or GenomicRegion object denoting the region to be binned
bins – Number of bins to divide the region into
bin_size – Size of each bin (alternative to bins argument)
smoothing_window – Size of window (in bins) to smooth scores over
nan_replacement – NaN values in the scores will be replaced with this value
zero_to_nan – If True, will convert bins with score 0 to NaN
args – Arguments passed to _region_intervals
kwargs – Keyword arguments passed to _region_intervals

Returns

iterator of GenomicRegion objects

bins(**kwargs)¶

Bin table selector

Returns
Return type: Table selector

bins_to_distance(bins)¶

Convert fraction of bins to base pairs

Parameters: bins – float, fraction of bins
Returns: int, base pairs

property binsize¶: Resolution in base pairs if uniform else None

property chromnames¶: List of reference sequence names

property chromosome_bins¶

Returns a dictionary of chromosomes and the start and end index of the bins they cover.

The returned index pairs are range-compatible, i.e. chromosome bins [0, 5] cover bins 0, 1, 2, 3, and 4, but not 5.
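A range-compatible pair can be passed straight to range(). The sketch below illustrates the convention on a toy region list; chromosome_bins_sketch is a hypothetical helper that assumes regions are grouped by chromosome, and it is not the property's actual code.

```python
def chromosome_bins_sketch(regions):
    """Map chromosome name -> [first bin index, last bin index + 1]."""
    bins = {}
    for ix, (chromosome, _start, _end) in enumerate(regions):
        if chromosome not in bins:
            bins[chromosome] = [ix, ix + 1]
        else:
            # extend the exclusive end index
            bins[chromosome][1] = ix + 1
    return bins


regions = [("chr1", 1, 1000), ("chr1", 1001, 2000), ("chr2", 1, 1000)]
cb = chromosome_bins_sketch(regions)
chr1_indexes = list(range(*cb["chr1"]))
```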

property chromosome_lengths¶: Returns a dictionary of chromosomes and their length in bp.

chromosomes()¶: Get a list of chromosome names.

chroms(**kwargs)¶

Chromosome table selector

Returns
Return type: Table selector

property chromsizes¶: Ordered mapping of reference sequences to their lengths in bp

close(remove_tmp=True)¶

Close this Cooler file and run exit operations.

If file was opened with tmpdir in read-only mode: close file and delete temporary copy.

Parameters: remove_tmp – If False, does not delete temporary copy of file.

distance_to_bins(distance)¶

Convert base pairs to fraction of bins.

Parameters: distance – distance in base pairs
Returns: float, distance as fraction of bin size
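The two conversions are inverse operations based on the bin size. A minimal sketch, assuming a uniform bin size passed in explicitly and truncation to int when converting back to base pairs (both function bodies are illustrative, not FAN-C's code):

```python
def distance_to_bins(distance, bin_size):
    """Base pairs -> (possibly fractional) number of bins."""
    return distance / bin_size


def bins_to_distance(bins, bin_size):
    """(Possibly fractional) number of bins -> base pairs as int."""
    return int(bins * bin_size)


n_bins = distance_to_bins(1_500_000, 1_000_000)
n_bp = bins_to_distance(2.5, 1_000_000)
```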

edge_data(attribute, *args, **kwargs)¶

Iterate over specific edge attribute.

Parameters

attribute – Name of the attribute, e.g. “weight”
args – Positional arguments passed to edges()
kwargs – Keyword arguments passed to edges()

Returns

iterator over edge attribute

edge_subset(key=None, *args, **kwargs)¶

Get a subset of edges.

This is an alias for edges().

Returns: generator (Edge)

property edges¶

Iterate over contacts / edges.

edges() is the central function of RegionPairsContainer. Here, we will use the Hic implementation for demonstration purposes, but the usage is exactly the same for all compatible objects implementing RegionPairsContainer, including JuicerHic and CoolerHic.

import fanc

# file from FAN-C examples
hic = fanc.load("output/hic/binned/fanc_example_1mb.hic")

We can easily find the number of edges in the sample Hic object:

len(hic.edges)  # 8695

When used in an iterator context, edges() iterates over all edges in the RegionPairsContainer:

for edge in hic.edges:
    # do something with edge
    print(edge)
    # 42--42; bias: 5.797788472650082e-05; sink_node: chr18:42000001-43000000; source_node: chr18:42000001-43000000; weight: 0.12291311562018173
    # 24--28; bias: 6.496381719803623e-05; sink_node: chr18:28000001-29000000; source_node: chr18:24000001-25000000; weight: 0.025205961072838057
    # 5--76; bias: 0.00010230955745211447; sink_node: chr18:76000001-77000000; source_node: chr18:5000001-6000000; weight: 0.00961709840049876
    # 66--68; bias: 8.248432587969082e-05; sink_node: chr18:68000001-69000000; source_node: chr18:66000001-67000000; weight: 0.03876763316345468
    # ...

Calling edges() as a method has the same effect:

# note the '()'
for edge in hic.edges():
    # do something with edge
    print(edge)
    # 42--42; bias: 5.797788472650082e-05; sink_node: chr18:42000001-43000000; source_node: chr18:42000001-43000000; weight: 0.12291311562018173
    # 24--28; bias: 6.496381719803623e-05; sink_node: chr18:28000001-29000000; source_node: chr18:24000001-25000000; weight: 0.025205961072838057
    # 5--76; bias: 0.00010230955745211447; sink_node: chr18:76000001-77000000; source_node: chr18:5000001-6000000; weight: 0.00961709840049876
    # 66--68; bias: 8.248432587969082e-05; sink_node: chr18:68000001-69000000; source_node: chr18:66000001-67000000; weight: 0.03876763316345468
    # ...

Rather than iterate over all edges in the object, we can select only a subset. If the key is a string or a GenomicRegion, all non-zero edges connecting the region described by the key to any other region are returned. If the key is a tuple of strings or GenomicRegion, only edges between the two regions are returned.

# select all edges between chromosome 19
# and any other region:
for edge in hic.edges("chr19"):
    print(edge)
    # 49--106; bias: 0.00026372303696871666; sink_node: chr19:27000001-28000000; source_node: chr18:49000001-50000000; weight: 0.003692122517562033
    # 6--82; bias: 0.00021923129703834945; sink_node: chr19:3000001-4000000; source_node: chr18:6000001-7000000; weight: 0.0008769251881533978
    # 47--107; bias: 0.00012820949175399097; sink_node: chr19:28000001-29000000; source_node: chr18:47000001-48000000; weight: 0.0015385139010478917
    # 38--112; bias: 0.0001493344481069762; sink_node: chr19:33000001-34000000; source_node: chr18:38000001-39000000; weight: 0.0005973377924279048
    # ...

# select all edges that are only on
# chromosome 19
for edge in hic.edges(('chr19', 'chr19')):
    print(edge)
    # 90--116; bias: 0.00021173151730025176; sink_node: chr19:37000001-38000000; source_node: chr19:11000001-12000000; weight: 0.009104455243910825
    # 135--135; bias: 0.00018003890596887822; sink_node: chr19:56000001-57000000; source_node: chr19:56000001-57000000; weight: 0.10028167062466517
    # 123--123; bias: 0.00011063368998965993; sink_node: chr19:44000001-45000000; source_node: chr19:44000001-45000000; weight: 0.1386240135570439
    # 92--93; bias: 0.00040851066434864896; sink_node: chr19:14000001-15000000; source_node: chr19:13000001-14000000; weight: 0.10090213409411629
    # ...

# select inter-chromosomal edges
# between chromosomes 18 and 19
for edge in hic.edges(('chr18', 'chr19')):
    print(edge)
    # 49--106; bias: 0.00026372303696871666; sink_node: chr19:27000001-28000000; source_node: chr18:49000001-50000000; weight: 0.003692122517562033
    # 6--82; bias: 0.00021923129703834945; sink_node: chr19:3000001-4000000; source_node: chr18:6000001-7000000; weight: 0.0008769251881533978
    # 47--107; bias: 0.00012820949175399097; sink_node: chr19:28000001-29000000; source_node: chr18:47000001-48000000; weight: 0.0015385139010478917
    # 38--112; bias: 0.0001493344481069762; sink_node: chr19:33000001-34000000; source_node: chr18:38000001-39000000; weight: 0.0005973377924279048
    # ...

By default, edges() will retrieve all edge attributes, which can be slow when iterating over a lot of edges. This is why all file-based FAN-C RegionPairsContainer objects support lazy loading, where attributes are only read on demand.

for edge in hic.edges('chr18', lazy=True):
    print(edge.source, edge.sink, edge.weight, edge)
    # 42 42 0.12291311562018173 <fanc.matrix.LazyEdge for row /edges/chrpair_0_0.row (Row), pointing to row #0>
    # 24 28 0.025205961072838057 <fanc.matrix.LazyEdge for row /edges/chrpair_0_0.row (Row), pointing to row #1>
    # 5 76 0.00961709840049876 <fanc.matrix.LazyEdge for row /edges/chrpair_0_0.row (Row), pointing to row #2>
    # 66 68 0.03876763316345468 <fanc.matrix.LazyEdge for row /edges/chrpair_0_0.row (Row), pointing to row #3>
    # ...

Warning

The lazy iterator reuses the same LazyEdge object in every iteration and overwrites its attributes. Do not use lazy iterators if you need to store edge objects for later access. For example, list(hic.edges()) works as expected, with all Edge objects stored in the list, whereas list(hic.edges(lazy=True)) results in a list of identical LazyEdge objects. Always do all edge processing inside the loop when working with lazy iterators!
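The pitfall can be reproduced without FAN-C. In the sketch below, ReusedEdge and lazy_edges are hypothetical stand-ins that mimic an iterator yielding one mutable object over and over; they are not the library's classes.

```python
class ReusedEdge:
    """Hypothetical stand-in for a lazily loaded edge."""

    def __init__(self):
        self.source = None
        self.sink = None


def lazy_edges(pairs):
    edge = ReusedEdge()  # one object, reused in every iteration
    for source, sink in pairs:
        edge.source, edge.sink = source, sink
        yield edge


pairs = [(42, 42), (24, 28), (5, 76)]

# storing the yielded objects keeps three references to the same object
stored = list(lazy_edges(pairs))
stored_values = [(e.source, e.sink) for e in stored]

# extracting the values inside the loop works as intended
in_loop_values = [(e.source, e.sink) for e in lazy_edges(pairs)]
```

Here stored_values contains the last pair three times, while in_loop_values matches the input.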

When working with normalised contact frequencies, such as those obtained through matrix balancing, edges() automatically returns normalised edge weights. In addition, the bias attribute will (typically) have a value different from 1.

When you are interested in the raw contact frequency, use the norm=False parameter:

for edge in hic.edges('chr18', lazy=True, norm=False):
    print(edge.source, edge.sink, edge.weight)
    # 42 42 2120.0
    # 24 28 388.0
    # 5 76 94.0
    # 66 68 470.0
    # ...
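For the four edges printed above, the normalised weight appears to equal the raw weight multiplied by the edge's bias attribute. The check below uses only numbers copied from the example output; that this relation holds in general is an assumption.

```python
import math

# (raw weight, bias, normalised weight) copied from the example output above
examples = [
    (2120.0, 5.797788472650082e-05, 0.12291311562018173),
    (388.0, 6.496381719803623e-05, 0.025205961072838057),
    (94.0, 0.00010230955745211447, 0.00961709840049876),
    (470.0, 8.248432587969082e-05, 0.03876763316345468),
]

# compare raw * bias with the printed normalised weight
consistent = all(
    math.isclose(raw * bias, normalised, rel_tol=1e-9)
    for raw, bias, normalised in examples
)
```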

You can also choose to omit all intra- or inter-chromosomal edges using intra_chromosomal=False or inter_chromosomal=False, respectively.

Returns: Iterator over Edge or equivalent.

edges_dict(*args, **kwargs)¶

Edges iterator with access by bracket notation.

This iterator always returns unnormalised edges.

Returns: dict or dict-like iterator

expected_values(selected_chromosome=None, norm=True, *args, **kwargs)¶

Calculate the expected values for genomic contacts at all distances.

This calculates the expected values between genomic regions separated by a specific distance. Expected values are calculated as the average weight of edges between region pairs with the same genomic separation, taking into account unmappable regions.

It will return a tuple with three values: a list of genome-wide intra-chromosomal expected values (list index corresponds to number of separating bins), a dict with chromosome names as keys and intra-chromosomal expected values specific to each chromosome, and a float for inter-chromosomal expected value.

Parameters

selected_chromosome – (optional) Chromosome name. If provided, will only return expected values for this chromosome.
norm – If False, will calculate the expected values on the unnormalised matrix.
args – Not used in this context
kwargs – Not used in this context

Returns

list of intra-chromosomal expected values, dict of intra-chromosomal expected values by chromosome, inter-chromosomal expected value
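The intra-chromosomal part of this calculation can be sketched for a single chromosome. This is a simplified illustration, not FAN-C's code: expected_by_distance is a hypothetical helper, edges are (source, sink, weight) tuples, and unmappable bins are handled by excluding them from the count of possible pairs.

```python
def expected_by_distance(edges, n_bins, mappable=None):
    """Average weight per bin separation on a single chromosome."""
    if mappable is None:
        mappable = [True] * n_bins
    sums = [0.0] * n_bins
    possible = [0] * n_bins
    # count region pairs at each separation where both bins are mappable
    for i in range(n_bins):
        for j in range(i, n_bins):
            if mappable[i] and mappable[j]:
                possible[j - i] += 1
    # sum observed weights per separation
    for source, sink, weight in edges:
        sums[abs(sink - source)] += weight
    # list index corresponds to the number of separating bins
    return [s / p if p > 0 else 0.0 for s, p in zip(sums, possible)]


edges = [(0, 0, 4.0), (1, 1, 2.0), (0, 1, 1.0), (1, 2, 3.0)]
expected = expected_by_distance(edges, n_bins=3)
```

Pairs without a stored edge count as zero, which is why the sums are divided by the number of possible pairs rather than the number of observed edges.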

expected_values_and_marginals(selected_chromosome=None, norm=True, *args, **kwargs)¶

Calculate the expected values for genomic contacts at all distances and the whole matrix marginals.

This calculates the expected values between genomic regions separated by a specific distance. Expected values are calculated as the average weight of edges between region pairs with the same genomic separation, taking into account unmappable regions.

It will return a tuple with three values: a list of genome-wide intra-chromosomal expected values (list index corresponds to number of separating bins), a dict with chromosome names as keys and intra-chromosomal expected values specific to each chromosome, and a float for inter-chromosomal expected value.

Parameters

selected_chromosome – (optional) Chromosome name. If provided, will only return expected values for this chromosome.
norm – If False, will calculate the expected values on the unnormalised matrix.
args – Not used in this context
kwargs – Not used in this context

Returns

list of intra-chromosomal expected values, dict of intra-chromosomal expected values by chromosome, inter-chromosomal expected value

extent(region)¶

Bin IDs containing the left and right ends of a genomic region

Parameters: region (str or tuple) – Genomic range
Returns
Return type: 2-tuple of ints

Examples

>>> c.extent('chr3')  
(1311, 2131)
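For a uniformly binned file, whole-chromosome extents follow from the chromosome sizes and the bin size. A minimal sketch, assuming bins never span chromosome boundaries, the last bin of each chromosome may be shorter, and the second value of each pair is an exclusive end; chrom_extents is a hypothetical helper, not part of cooler, and the chromosome sizes are made up.

```python
import math


def chrom_extents(chromsizes, binsize):
    """Map chromosome name -> (first bin ID, exclusive end bin ID)."""
    extents, total = {}, 0
    for name, length in chromsizes.items():
        n_bins = math.ceil(length / binsize)
        extents[name] = (total, total + n_bins)
        total += n_bins
    return extents


# toy chromosome sizes, for illustration only
chromsizes = {"chr1": 2_500_000, "chr2": 1_000_000, "chr3": 1_200_000}
extents = chrom_extents(chromsizes, binsize=1_000_000)
chr3_offset = extents["chr3"][0]
```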

find_region(query_regions, _regions_dict=None, _region_ends=None, _chromosomes=None)¶

Find the region that is at the center of a region.

Parameters: query_regions – Region selector string, GenomicRegion, or list of the former
Returns: index (or list of indexes) of the region at the center of the query region
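One way to locate the bin at the center of a query is a binary search over the sorted bin end coordinates. The sketch below assumes a single chromosome, inclusive coordinates, and contiguous bins; find_center_bin is a hypothetical helper, not the method's actual code.

```python
from bisect import bisect_left


def find_center_bin(region_ends, query_start, query_end):
    """Index of the bin containing the midpoint of the query."""
    center = (query_start + query_end) // 2
    # first bin whose end coordinate is >= center
    return bisect_left(region_ends, center)


# end coordinates of three 1 Mb bins
region_ends = [1_000_000, 2_000_000, 3_000_000]
center_ix = find_center_bin(region_ends, 1_200_001, 2_400_000)
```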

property info¶

File information and metadata

Returns
Return type: dict

intervals(*args, **kwargs)¶: Alias for region_intervals.

mappable(region=None)¶: Get the mappability vector of this matrix.

marginals(masked=True, *args, **kwargs)¶

Get the marginals vector of this Hic matrix.

Sums up all contacts for each bin of the Hi-C matrix. Unmappable regions will be masked in the returned vector unless the masked parameter is set to False.

By default, corrected matrix entries are summed up. To get uncorrected matrix marginals use norm=False. Generally, all parameters accepted by edges() are supported.

Parameters

masked – Use a numpy masked array to mask entries corresponding to unmappable regions
kwargs – Keyword arguments passed to edges()
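Summing contacts per bin can be sketched from an edge list. The example assumes a symmetric matrix stored as upper-triangle edges, so each off-diagonal edge contributes to both of its bins; bin_marginals is a hypothetical helper and masking is omitted.

```python
def bin_marginals(edges, n_bins):
    """Sum of edge weights per bin for a symmetric matrix."""
    totals = [0.0] * n_bins
    for source, sink, weight in edges:
        totals[source] += weight
        if source != sink:
            # off-diagonal entries appear in both row and column
            totals[sink] += weight
    return totals


edges = [(0, 0, 2.0), (0, 1, 1.0), (1, 2, 3.0)]
totals = bin_marginals(edges, n_bins=3)
```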

matrix(key=None, log=False, default_value=None, mask=True, log_base=2, *args, **kwargs)¶

Assemble a RegionMatrix from region pairs.

Parameters

key – Matrix selector. See edges() for all supported key types
log – If True, log-transform the matrix entries. Also see log_base
log_base – Base of the log transformation. Default: 2; only used when log=True
default_value – (optional) set the default value of matrix entries that have no associated edge/contact
mask – If False, do not mask unmappable regions
args – Positional arguments passed to regions_and_matrix_entries()
kwargs – Keyword arguments passed to regions_and_matrix_entries()

Returns

RegionMatrix

classmethod merge(pairs, *args, **kwargs)¶

Merge two or more RegionPairsContainer objects.

Parameters

pairs – list of RegionPairsContainer
args – Positional arguments passed to constructor of this class
kwargs – Keyword arguments passed to constructor of this class

offset(region)¶

Bin ID containing the left end of a genomic region

Parameters: region (str or tuple) – Genomic range
Returns
Return type: int

Examples

>>> c.offset('chr3')  
1311

open(mode='r', **kwargs)¶

Open the HDF5 group containing the Cooler with h5py

Functions as a context manager. Any open_kws passed during construction are ignored.

Parameters

mode (str, optional [default: 'r']) –

'r' (readonly)
'r+' or 'a' (read/write)

Notes

For other parameters, see h5py.File.

pixels(join=False, **kwargs)¶

Pixel table selector

Parameters: join (bool, optional) – Whether to expand bin ID columns into chrom, start, and end columns. Default is False.
Returns
Return type: Table selector

possible_contacts()¶

Calculate the possible number of contacts in the genome.

This calculates the number of potential region pairs in a genome for any possible separation distance, taking into account the existence of unmappable regions.

It will calculate one number for inter-chromosomal pairs, return a list with the number of possible pairs where the list index corresponds to the number of bins separating two regions, and a dictionary of lists for each chromosome.

Returns: possible intra-chromosomal pairs, possible intra-chromosomal pairs by chromosome, possible inter-chromosomal pairs
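The counting described above can be sketched with per-chromosome mappability flags. This is an illustration under stated assumptions, not the library's implementation: possible_contacts_sketch is a hypothetical helper, pairs are unordered, and a pair counts only if both bins are mappable.

```python
def possible_contacts_sketch(bins_per_chromosome):
    """bins_per_chromosome: dict of chromosome -> list of mappable flags."""
    max_len = max(len(flags) for flags in bins_per_chromosome.values())
    intra_total = [0] * max_len
    intra_by_chromosome = {}
    for chromosome, flags in bins_per_chromosome.items():
        counts = [0] * len(flags)
        for i in range(len(flags)):
            for j in range(i, len(flags)):
                if flags[i] and flags[j]:
                    counts[j - i] += 1  # index = number of separating bins
        intra_by_chromosome[chromosome] = counts
        for distance, count in enumerate(counts):
            intra_total[distance] += count
    mappable_counts = [sum(flags) for flags in bins_per_chromosome.values()]
    total = sum(mappable_counts)
    # unordered pairs of mappable bins on different chromosomes
    inter = (total * total - sum(n * n for n in mappable_counts)) // 2
    return intra_total, intra_by_chromosome, inter


intra, by_chromosome, inter = possible_contacts_sketch(
    {"chr1": [True, True, False], "chr2": [True, True]}
)
```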

region_bins(*args, **kwargs)¶

Return slice of start and end indices spanned by a region.

Parameters: args – provide a GenomicRegion here to get the slice of start and end bins of only this region. To get the slice over all regions, leave this blank.
Returns

region_intervals(region, bins=None, bin_size=None, smoothing_window=None, nan_replacement=None, zero_to_nan=False, score_field='score', *args, **kwargs)¶

Return equally-sized genomic intervals and associated scores.

Use either bins or bin_size argument to control binning.

Parameters

region – String or GenomicRegion object denoting the region to be binned
bins – Number of bins to divide the region into
bin_size – Size of each bin (alternative to bins argument)
smoothing_window – Size of window (in bins) to smooth scores over
nan_replacement – NaN values in the scores will be replaced with this value
zero_to_nan – If True, will convert bins with score 0 to NaN
args – Arguments passed to _region_intervals
kwargs – Keyword arguments passed to _region_intervals

Returns

iterator of tuples: (start, end, score)

region_subset(region, *args, **kwargs)¶

Takes a GenomicRegion and returns all regions that overlap with the supplied region.

Parameters: region – String or GenomicRegion object for which covered bins will be returned.

property regions¶

Iterate over genomic regions in this object.

Will return a GenomicRegion object in every iteration. Can also be used to get the number of regions by calling len() on the object returned by this method.

Returns: RegionIter

regions_and_edges(key, *args, **kwargs)¶

Convenient access to regions and edges selected by key.

Parameters

key – Edge selector, see edges()
args – Positional arguments passed to edges()
kwargs – Keyword arguments passed to edges()

Returns

list of row regions, list of col regions, iterator over edges

regions_and_matrix_entries(key=None, score_field=None, *args, **kwargs)¶

Convenient access to non-zero matrix entries and associated regions.

Parameters

key – Edge key, see edges()
oe – If True, will divide observed values by their expected value at the given distance. False by default
oe_per_chromosome – If True (default), will do a per-chromosome O/E calculation rather than using the whole matrix to obtain expected values
score_field – (optional) any edge attribute that returns a number can be specified here for filling the matrix. Usually this is defined by the _default_score_field attribute of the matrix class.
args – Positional arguments passed to edges()
kwargs – Keyword arguments passed to edges()

Returns

list of row regions, list of col regions, iterator over (i, j, weight) tuples

property regions_dict¶

Return a dictionary with region index as keys and regions as values.

Returns: dict {region.ix: region, …}

static regions_identical(pairs)¶

Check if the regions in all objects in the list are identical.

Parameters: pairs – list of RegionBased objects
Returns: True if chromosome, start, and end are identical between all regions in the same list positions.

scaling_factor(matrix, weight_column=None)¶

Compute the scaling factor to another matrix.

Calculates the ratio between the number of contacts in this Hic object to the number of contacts in another Hic object.

Parameters

matrix – A Hic object
weight_column – Name of the column to calculate the scaling factor on

Returns

float

property storage_mode¶: Indicates whether ordinary sparse matrix encoding is used ("square") or whether a symmetric matrix is encoded by storing only the upper triangular elements ("symmetric-upper").

to_bed(file_name, subset=None, **kwargs)¶

Export regions as BED file

Parameters

file_name – Path of file to write regions to
subset – optional GenomicRegion or str to write only regions overlapping this region
kwargs – Passed to write_bed()

to_bigwig(file_name, subset=None, **kwargs)¶

Export regions as BigWig file.

Parameters

file_name – Path of file to write regions to
subset – optional GenomicRegion or str to write only regions overlapping this region
kwargs – Passed to write_bigwig()

to_gff(file_name, subset=None, **kwargs)¶

Export regions as GFF file

Parameters

file_name – Path of file to write regions to
subset – optional GenomicRegion or str to write only regions overlapping this region
kwargs – Passed to write_gff()

class fanc.compatibility.cooler.LazyCoolerRegion(series, ix=None)¶

Bases: genomic_regions.regions.GenomicRegion

as_bed_line(score_field='score', name_field='name')¶

Return a representation of this object as line in a BED file.

Parameters

score_field – name of the attribute to be used in the ‘score’ field of the BED line
name_field – name of the attribute to be used in the ‘name’ field of the BED line

Returns

str

as_gff_line(source_field='source', feature_field='feature', score_field='score', frame_field='frame', float_format='.2e')¶

Return a representation of this object as line in a GFF file.

Parameters

source_field – name of the attribute to be used in the ‘source’ field of the GFF line
feature_field – name of the attribute to be used in the ‘feature’ field of the GFF line
score_field – name of the attribute to be used in the ‘score’ field of the GFF line
frame_field – name of the attribute to be used in the ‘frame’ field of the GFF line
float_format – Formatting string for the float fields

Returns

str

property attributes¶

Return all visible attributes of this GenomicRegion.

Returns all attribute names that do not start with an underscore.

Returns: list of attribute names

property center¶

Return the center coordinate of the GenomicRegion.

Returns: float

contains(region)¶

Check if the specified region is completely contained in this region.

Parameters: region – GenomicRegion object or string

copy()¶

Return a (shallow) copy of this GenomicRegion

Returns: GenomicRegion

expand(absolute=None, relative=None, absolute_left=0, absolute_right=0, relative_left=0.0, relative_right=0.0, copy=True, from_center=False)¶

Expand this region by a relative or an absolute amount.

Parameters

absolute – Absolute amount in base pairs by which to expand the region represented by this GenomicRegion object on both sides. New region start will be <old start - absolute>, new region end will be <old end + absolute>
relative – Relative amount as fraction of region by which to expand the region represented by this GenomicRegion object on both sides. New region start will be <old start - relative*len(self)>, new region end will be <old end + relative*len(self)>
absolute_left – Absolute amount in base pairs by which to expand the region represented by this GenomicRegion object on the left side
absolute_right – Absolute amount in base pairs by which to expand the region represented by this GenomicRegion object on the right side
relative_left – Relative amount as fraction of region by which to expand the region represented by this GenomicRegion object on the left side
relative_right – Relative amount as fraction of region by which to expand the region represented by this GenomicRegion object on the right side
copy – If True, return a copy of the original region, if False will modify the existing region in place
from_center – If True measures distance from center rather than start and end of the old region

Returns

GenomicRegion
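The start and end formulas given for absolute and relative can be written out directly. A minimal sketch on plain integers, assuming inclusive coordinates (so the region length is end - start + 1) and rounding the relative shift to whole base pairs; expand_sketch is a hypothetical function and ignores the per-side and from_center options.

```python
def expand_sketch(start, end, absolute=0, relative=0.0):
    """Expand a region on both sides; returns the new (start, end)."""
    length = end - start + 1  # inclusive coordinates (assumption)
    shift = absolute + int(round(relative * length))
    return start - shift, end + shift


by_absolute = expand_sketch(1001, 2000, absolute=100)
by_relative = expand_sketch(1001, 2000, relative=0.5)
```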

property five_prime¶

Return the position of the 5’ end of this GenomicRegion on the reference.

Returns: int

fix_chromosome(copy=False)¶

Change chromosome representation from chr<NN> to <NN> or vice versa.

Parameters: copy – If True, make copy of region, otherwise will modify existing region in place.
Returns: GenomicRegion

classmethod from_string(region_string)¶

Convert a string into a GenomicRegion.

This is a very useful convenience function to quickly define a GenomicRegion object from a descriptor string. Numbers can be abbreviated as ‘12k’, ‘1.5M’, etc.

Parameters: region_string – A string of the form <chromosome>[:<start>-<end>[:<strand>]] (with square brackets indicating optional parts of the string). If any optional part of the string is omitted, intuitive defaults will be chosen.
Returns: GenomicRegion
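The descriptor format can be parsed with a few string operations. The sketch below is not the library's parser: parse_region and parse_number are hypothetical helpers, omitted parts are returned as None rather than library defaults, and accepting the 'k' and 'M' suffixes case-insensitively is an assumption.

```python
_UNITS = {"k": 1_000, "m": 1_000_000}


def parse_number(text):
    """Parse '12k', '1.5M', or a plain integer string into base pairs."""
    unit = text[-1].lower()
    if unit in _UNITS:
        return int(round(float(text[:-1]) * _UNITS[unit]))
    return int(text)


def parse_region(region_string):
    """Split '<chromosome>[:<start>-<end>[:<strand>]]' into its parts."""
    fields = region_string.split(":")
    chromosome = fields[0]
    start = end = strand = None
    if len(fields) > 1:
        start_text, end_text = fields[1].split("-")
        start, end = parse_number(start_text), parse_number(end_text)
    if len(fields) > 2:
        strand = fields[2]
    return chromosome, start, end, strand


full = parse_region("chr1:12k-1.5M:+")
chromosome_only = parse_region("chr19")
```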

is_forward()¶

Return True if this region is on the forward strand of the reference genome.

Returns: True if on ‘+’ strand, False otherwise.

is_reverse()¶

Return True if this region is on the reverse strand of the reference genome.

Returns: True if on ‘-’ strand, False otherwise.

overlap(region)¶

Return the overlap in base pairs between this region and another region.

Parameters: region – GenomicRegion to find overlap for
Returns: overlap as int in base pairs

overlaps(region)¶

Check if this region overlaps with the specified region.

Parameters: region – GenomicRegion object or string

set_attribute(attribute, value)¶

Safely set an attribute on the GenomicRegion object.

This automatically decodes bytes objects into UTF-8 strings. If you do not care about this, you can also use region.<attribute> = <value> directly.

Parameters

attribute – Name of the attribute to be set
value – Value of the attribute to be set

property strand_string¶

Return the ‘strand’ attribute as string.

Returns: strand as str (‘+’, ‘-’, or ‘.’)

property three_prime¶

Return the position of the 3’ end of this GenomicRegion on the reference.

Returns: int

to_string()¶

Convert this GenomicRegion to its string representation.

Returns: str

fanc.compatibility.cooler.to_cooler(hic, path, balance=True, multires=True, resolutions=None, n_zooms=10, threads=1, chunksize=100000, max_resolution=5000000, natural_order=True, chromosomes=None, **kwargs)¶

Export Hi-C data as Cooler file.

Only contacts that have not been filtered are exported. For details on the Cooler format, see https://github.com/mirnylab/cooler/

Single resolution files: If the input Hi-C matrix is uncorrected, the uncorrected matrix is stored. If it is corrected, the uncorrected matrix is stored along with the bias vector. Cooler always calculates the corrected matrix on the fly from the uncorrected matrix and the bias vector.

Multi-resolution files (default):

Parameters

hic – Hi-C file in any compatible (RegionMatrixContainer) format
path – Output path for cooler file
balance – Include bias vector in cooler output (single res) or perform iterative correction (multi res)
multires – Generate a multi-resolution cooler file
resolutions – Resolutions in bp (int) for multi-resolution cooler output
chunksize – Number of pixels processed at a time in cooler
kwargs – Additional arguments passed to cooler.iterative_correction