Compartments and TADs detection

Here, we present the analysis to detect the compartments in Mouse B and iPS cells. In this example, we will use the GC-content (guanine-cytosine content) to identify which bins belong to the A or B compartments. The percentage of bases that are either guanine or cytosine on a DNA strand correlates directly with gene density and is a good measure to identify open and close chromatine.

Note: Compartments are detected on the full genome matrix.

from pytadbit.parsers.hic_parser import load_hic_data_from_bam
from pytadbit.parsers.genome_parser import get_gc_content, parse_fasta
from pickle import load
reso = 200000
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'
rich_in_A = get_gc_content(parse_fasta('genome/Mus_musculus-GRCm38.p6/Mus_musculus-GRCm38.p6.fa'),
                           by_chrom=True ,resolution=reso, n_cpus=8)
Loading cached genome

Compartments

Mouse B cells

cell   = 'mouse_B'
hic_data = load_hic_data_from_bam(base_path.format(cell),
                                  resolution=reso,
                                  biases=bias_path.format(cell, reso // 1000),
                                  ncpus=8)
(Matrix size 13641x13641)                                                    [2020-02-06 12:04:07]

- Parsing BAM (122 chunks)                                                   [2020-02-06 12:04:08]
   .......... .......... .......... .......... ..........     50/122
   .......... .......... .......... .......... ..........    100/122
   .......... .......... ..                                  122/122

- Getting matrices                                                           [2020-02-06 12:06:28]
   .......... .......... .......... .......... ..........     50/122
   .......... .......... .......... .......... ..........    100/122
   .......... .......... ..                                  122/122
! mkdir -p results/fragment/$cell\_both/05_segmenting
chrname = 'chr3'
corr = hic_data.find_compartments(show_compartment_labels=True,
        show=True, crms=[chrname], vmin='auto', vmax='auto', rich_in_A=rich_in_A,
        savedata='results/fragment/{0}_both/05_segmenting/compartments_{1}_{2}.tsv'.format(cell, chrname, reso),
        savedir='results/fragment/{0}_both/05_segmenting/eigenvectors_{1}_{2}'.format(cell, chrname, reso))
../_images/tutorial_8-Compartments_and_TADs_detection_10_0.png

Mouse iPS cells

cell   = 'mouse_PSC'
hic_data = load_hic_data_from_bam(base_path.format(cell),
                                  resolution=reso,
                                  biases=bias_path.format(cell, reso // 1000),
                                  ncpus=8)
(Matrix size 13641x13641)                                                    [2020-02-06 12:20:31]

- Parsing BAM (122 chunks)                                                   [2020-02-06 12:20:32]
   .......... .......... .......... .......... ..........     50/122
   .......... .......... .......... .......... ..........    100/122
   .......... .......... ..                                  122/122

- Getting matrices                                                           [2020-02-06 12:22:08]
   .......... .......... .......... .......... ..........     50/122
   .......... .......... .......... .......... ..........    100/122
   .......... .......... ..                                  122/122
! mkdir -p results/fragment/$cell\_both/05_segmenting
chrname = 'chr3'
corr = hic_data.find_compartments(show_compartment_labels=True,
        show=True, crms=[chrname], vmin='auto', vmax='auto', rich_in_A=rich_in_A,
        savedata='results/fragment/{0}_both/05_segmenting/compartments_{1}_{2}.tsv'.format(cell, chrname, reso),
        savedir='results/fragment/{0}_both/05_segmenting/eigenvectors_{1}_{2}'.format(cell, chrname, reso))
../_images/tutorial_8-Compartments_and_TADs_detection_15_0.png

Compare

The assignments of the compartments for the two cell types are stored in two different files:

! head -n 20 results/fragment/mouse_B_both/05_segmenting/compartments_chr3_200000.tsv
## CHR chr3 Eigenvector: 1
#   start   end     rich in A       type
chr3        16      44      0.40    B
chr3        45      52      0.50    A
chr3        53      72      0.40    B
chr3        73      76      0.53    A
chr3        77      96      0.39    B
chr3        97      98      0.82    A
chr3        99      100     0.82    B
chr3        101     101     nan     A
chr3        102     108     0.46    B
chr3        109     111     0.62    A
chr3        112     135     0.40    B
chr3        136     139     0.59    A
chr3        140     153     0.44    B
chr3        154     156     0.64    A
chr3        157     162     0.51    B
chr3        163     164     0.87    A
chr3        165     178     0.46    B
chr3        179     181     0.61    A
! head -n 20 results/fragment/mouse_PSC_both/05_segmenting/compartments_chr3_200000.tsv
## CHR chr3 Eigenvector: 1
#   start   end     rich in A       type
chr3        16      42      0.40    B
chr3        43      52      0.49    A
chr3        53      75      0.40    B
chr3        76      76      nan     A
chr3        77      96      0.39    B
chr3        97      97      nan     A
chr3        98      109     0.43    B
chr3        110     111     0.84    A
chr3        112     135     0.40    B
chr3        136     145     0.48    A
chr3        146     152     0.47    B
chr3        153     156     0.57    A
chr3        157     160     0.56    B
chr3        161     165     0.54    A
chr3        166     169     0.54    B
chr3        170     183     0.46    A
chr3        184     184     nan     B
chr3        188     189     0.84    A

In another folder, we also saved the coordinates of each computed eigenvector:

! ls -lh results/fragment/mouse_B_both/05_segmenting/eigenvectors_chr3_200000
total 52K
-rw-r--r-- 1 dcastillo dcastillo 49K Feb  6 12:08 chr3_EigVect1.tsv

Note: In this file the first line shows the eigenvector index with it’s corresponding eigenvalue (the first column should always be the one you selected, even if it is not the first eigenvector).

Then there are the coordinates of the eigenvectors. In the first column, the coordinates correspond to the assignment of the A and B compartments, positive values for A compartments and negative values for B compartments.

! head -n 20 results/fragment/mouse_B_both/05_segmenting/eigenvectors_chr3_200000/chr3_EigVect1.tsv
# EV_1 (199.3674)   EV_2 (38.9365)  EV_3 (22.7067)
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
nan nan     nan
-0.04705877433867864        0.0008416946845418325   -0.0418514300736033
-0.04570124930041132        0.004910131406689665    -0.03832275686221951
-0.04591426453743422        0.002039448345766754    -0.035967872719506695
-0.044857688854588844       0.003962351416820969    -0.04230275892744377

Load eigenvectors coordinates from files:

from builtins   import next

fh = open('results/fragment/mouse_B_both/05_segmenting/eigenvectors_chr3_200000/chr3_EigVect1.tsv')

header = next(fh)

ev1_B = []
for line in fh:
    evc1, evc2, evc3 = line.split()
    ev1_B.append(float(evc1))

fh = open('results/fragment/mouse_PSC_both/05_segmenting/eigenvectors_chr3_200000/chr3_EigVect1.tsv')

header = next(fh)

ev1_PSC = []
for line in fh:
    evc1, evc2, evc3 = line.split()
    ev1_PSC.append(float(evc1))

diff = []
for i in range(len(ev1_B)):
    diff.append(ev1_B[i] - ev1_PSC[i])

Spot changes in activity

Plot the difference between each eigenvector along chromosome

from matplotlib import pyplot as plt
plt.figure(figsize=(12, 2))
plt.text(10, 0.07, 'More active in B cell')
plt.fill_between(range(len(diff)), diff, 0, where=[i>0 for i in diff])
plt.text(10, -0.09, 'More active in PSC cell')
plt.fill_between(range(len(diff)), diff, 0, where=[i<0 for i in diff])
plt.xlim(0, len(diff))
plt.ylim(-0.1, 0.1)
plt.ylabel('difference of EV')
_ = plt.xlabel('Genomic coordinate (%s kb)' % (reso / 1000))
../_images/tutorial_8-Compartments_and_TADs_detection_29_0.png

Correlate eigenvectors

plt.figure(figsize=(6, 6))
for i in range(len(ev1_B)):
    if ev1_B[i] > 0 and ev1_PSC[i] > 0:
        plt.plot(ev1_B[i], ev1_PSC[i], 'ro', alpha=0.5)
    elif ev1_B[i] < 0 and ev1_PSC[i] < 0:
        plt.plot(ev1_B[i], ev1_PSC[i], 'bo', alpha=0.5)
    else:
        plt.plot(ev1_B[i], ev1_PSC[i], 'o', color='grey', alpha=0.5)
plt.axhline(0, color='k', alpha=0.2)
plt.axvline(0, color='k', alpha=0.2)
plt.xlabel('B cell EV1')
_ = plt.ylabel('PSC cell EV1')
../_images/tutorial_8-Compartments_and_TADs_detection_31_0.png

TADs

Now, we move to the TADs detection. In this notebook we will detect TAD borders at 100kbp resolution.

from pytadbit import Chromosome
from pytadbit.parsers.hic_parser import load_hic_data_from_bam
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'

reso = 100000

cel1 = 'mouse_B'
cel2 = 'mouse_PSC'
hic_data1 = load_hic_data_from_bam(base_path.format(cel1),
                                   resolution=reso,
                                   region='chr3',
                                   biases=bias_path.format(cel1, reso // 1000),
                                   ncpus=8)
hic_data2 = load_hic_data_from_bam(base_path.format(cel2),
                                   resolution=reso,
                                   region='chr3',
                                   biases=bias_path.format(cel2, reso // 1000),
                                   ncpus=8)
(Matrix size 1601x1601)                                                      [2020-02-06 12:36:22]

- Parsing BAM (101 chunks)                                                   [2020-02-06 12:36:24]
   .......... .......... .......... .......... ..........     50/101
   .......... .......... .......... .......... ..........    100/101
   .                                                         101/101

- Getting matrices                                                           [2020-02-06 12:36:28]
   .......... .......... .......... .......... ..........     50/101
   .......... .......... .......... .......... ..........    100/101
   .                                                         101/101


(Matrix size 1601x1601)                                                      [2020-02-06 12:36:54]

- Parsing BAM (101 chunks)                                                   [2020-02-06 12:36:55]
   .......... .......... .......... .......... ..........     50/101
   .......... .......... .......... .......... ..........    100/101
   .                                                         101/101

- Getting matrices                                                           [2020-02-06 12:36:59]
   .......... .......... .......... .......... ..........     50/101
   .......... .......... .......... .......... ..........    100/101
   .                                                         101/101
chrname = 'chr3'
crm = Chromosome(chrname)
crm.add_experiment('mouse_B',
                   hic_data=[hic_data1.get_matrix(focus='chr3')],
                   norm_data=[hic_data1.get_matrix(focus='chr3',normalized=True)],
                   resolution=reso)
crm.add_experiment('mouse_PSC',
                   hic_data=[hic_data2.get_matrix(focus='chr3')],
                   norm_data=[hic_data2.get_matrix(focus='chr3',normalized=True)],
                   resolution=reso)
crm.visualize([('mouse_B', 'mouse_PSC')])
../_images/tutorial_8-Compartments_and_TADs_detection_39_0.png

TAD caller algorithms

TADbit

TADbit is the original TAD caller algorithm TADbit is a breakpoint detection algorithm that returns the optimal segmentation of the chromosome under BIC-penalized likelihood. The model assumes that counts have a Poisson distribution and that the expected value of the counts decreases like a power-law with the linear distance on the chromosome.

crm.find_tad(['mouse_B', 'mouse_PSC'], n_cpus=8)
crm.visualize([('mouse_B', 'mouse_PSC')], normalized=True, paint_tads=True)
../_images/tutorial_8-Compartments_and_TADs_detection_44_0.png
crm.visualize([('mouse_B', 'mouse_PSC')], normalized=True, paint_tads=True, focus=(300, 360))
../_images/tutorial_8-Compartments_and_TADs_detection_45_0.png
B = crm.experiments['mouse_B']
PSC = crm.experiments['mouse_PSC']
crm.experiments
[Experiment mouse_B (resolution: 100 kb, TADs: 96, Hi-C rows: 1601, normalized: visibility),
 Experiment mouse_PSC (resolution: 100 kb, TADs: 118, Hi-C rows: 1601, normalized: visibility)]

TopDom

TopDom identifies TAD borders based on the assumption that contact frequencies between regions upstream and downstream of a border are lower than those between two regions within a TAD. The algorithm only depends on a single parameter corresponding to the window size. The algorithm provides a measure (from 0 to 10) of confidence on the accuracy of the border detection (https://www.ncbi.nlm.nih.gov/pubmed/26704975).

crm.find_tad(['mouse_B', 'mouse_PSC'], n_cpus=8, use_topdom=True)
crm.visualize([('mouse_B', 'mouse_PSC')], normalized=True, paint_tads=True)
../_images/tutorial_8-Compartments_and_TADs_detection_51_0.png
crm.visualize([('mouse_B', 'mouse_PSC')], normalized=True, paint_tads=True, focus=(300, 360))
../_images/tutorial_8-Compartments_and_TADs_detection_52_0.png

Insulation score

Insulation score (Crane et al. 2015 https://doi.org/10.1038/nature14450) can be used to build an insulation profile of the genome and, with a simple transformation, to identify TAD borders.

from pytadbit.tadbit import insulation_score, insulation_to_borders

First we need to normalize the matrices by visibility and by decay:

hic_data1.normalize_hic()
hic_data1.normalize_expected()

hic_data2.normalize_hic()
hic_data2.normalize_expected()
iterative correction
  - copying matrix
  - computing biases
rescaling to factor 1
  - getting the sum of the matrix
    => 1553.816
  - rescaling biases
iterative correction
  - copying matrix
  - computing biases
rescaling to factor 1
  - getting the sum of the matrix
    => 1570.990
  - rescaling biases

The two important parameter to define are the window size, the distance from the diagonal and the delta. - the square size should be 500 kb as close as possible from the diagonal - the delta is to look for increases in insulation around a given bin. Should be around 100 kb (in out case we define it as 200 kb as we are working at 100 kb resolution, and working with only one bin is a bit to little)

wsize = (1, 4)
insc1, delta1 = insulation_score(hic_data1, [wsize], resolution=100000, normalize=True, delta=2)
insc2, delta2 = insulation_score(hic_data2, [wsize], resolution=100000, normalize=True, delta=2)
- computing insulation in band 1-4
- computing insulation in band 1-4
/home/dcastillo/miniconda2/envs/py3_tadbit/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)

Once defined the insulation score and the values of delta can be used to search for borders.

borders1 = insulation_to_borders(insc1[wsize], delta1[wsize], min_strength=0.1)
borders2 = insulation_to_borders(insc2[wsize], delta2[wsize], min_strength=0.1)

Currently the representation is not available in TADbit as for the other methods, but we can easily plot it:

plt.figure(figsize=(10, 4))

plt.subplot(2, 1, 1)
plt.title('B cell')
l1 = plt.plot([insc1[(wsize)].get(i, float('nan')) for i in range(max(insc1[(wsize)]))], label='Insulation score')
l2 = plt.plot([delta1[(wsize)].get(i, float('nan')) for i in range(max(insc1[(wsize)]))],
         alpha=0.3, label='Delta value')
for b, c in borders1:
    l3 = plt.plot([b] * 2, [-2, -2.3], color='darkgreen', alpha=c, lw=4, label='Border strength')

plt.grid()
plt.axhline(0, color='k')
plt.ylim(-2.5, 2)
plt.xlim(300, 360)

plt.legend(l1 + l2 + l3, [l.get_label() for l in l1 + l2 + l3], frameon=False, bbox_to_anchor=(1.3, 0.6))

plt.subplot(2, 1, 2)
plt.title('PSC')
plt.plot([insc2[(wsize)].get(i, float('nan')) for i in range(max(insc2[(wsize)]))])
plt.plot([delta2[(wsize)].get(i, float('nan')) for i in range(max(insc2[(wsize)]))], alpha=0.3)
for b, c in borders2:
    plt.plot([b] * 2, [-2, -2.3], color='darkgreen', alpha=c, lw=4)
plt.grid()
plt.axhline(0, color='k')
plt.ylim(-2.5, 2)
_ = plt.xlim(300, 360)
plt.tight_layout()
../_images/tutorial_8-Compartments_and_TADs_detection_65_0.png

Comparison of TAD borders

The TAD borders can be aligned, using a simple reciprocal best hit strategy:

ali = crm.align_experiments(['mouse_B', 'mouse_PSC'], max_dist=reso)

In the plots below, each arc represents a TAD. Between two consecutive arcs the triangle mark the border. This triangle is colored depending on the confidence of the TAD border call.

ali.draw(ymax=3)
../_images/tutorial_8-Compartments_and_TADs_detection_70_0.png
ali.draw(focus=(1, 350), ymax=3)
../_images/tutorial_8-Compartments_and_TADs_detection_71_0.png

Statistical significance of the TAD borders alignments

In order to asses how well two experiments align, or how conserved are the TAD borders between two experiments, we can compare the overlap between our experiments with the overlap of simulated random distributions of TAD borders.

ali, stats = crm.align_experiments(['mouse_B', 'mouse_PSC'], randomize=True)
print(ali)
Alignment shown in 100 Kb (2 experiments) (scores: 0 1 2 3 4 5 6 7 8 9 10)
  mouse_B:|    31| ---- | ---- |    73|    85|   104|   138| ---- |   155|   158|   160|   178|   191|   206|   224| ---- |   270| ---- |   281|   290|   297|   305|   312|   317|   323|   328|   338| ---- |   356| ---- | ---- | ---- |   382|   389| ---- |   407|   410|   419|   454| ---- |   508|   510|   512| ---- |   522|   530|   535|   543| ---- |   552|   561| ---- |   574| ---- |   591|   600|   608|   612|   625|   635| ---- | ---- | ---- |   660| ---- |   673| ---- | ---- | ---- |   692|   700|   731|   751|   761|   789|   798|   801| ---- |   819|   830|   837|   849| ---- |   861| ---- |   870|   877|   880|   883|   889| ---- |   895|   900| ---- |   908| ---- | ---- | ---- | ---- | ---- |   949|   959|   963| ---- |   967|   976| ---- |   986| ---- |  1020|  1027| ---- |  1047|  1051| ---- | ---- | ---- |  1075| ---- |  1084|  1093|  1097| ---- |  1134|  1136|  1154| ---- | ---- |  1169|  1175| ---- |  1211| ---- |  1221|  1228|  1235| ---- |  1257|  1264|  1270|  1275|  1280|  1291|  1298|  1301|  1307|  1314|  1316| ---- |  1326|  1333| ---- | ---- |  1352|  1359|  1369|  1378|  1384|  1393|  1418| ---- |  1428| ---- |  1444|  1448|  1452|  1459| ---- |  1470|  1490| ---- | ---- |  1521|  1528|  1540|  1546| ---- |  1575|  1581| ---- |  1602
mouse_PSC:|    31|    63|    71|    73|    85|   104|   139|   145| ---- |   158|   160|   179|   192|   206|   223|   256|   270|   273|   281|   290|   297|   305|   311|   317|   323|   329|   338|   344|   356|   362|   370|   376|   382|   389|   405|   407|   410|   418|   454|   489|   508|   510|   512|   514|   522|   530|   535| ---- |   547|   552|   560|   572| ---- |   585|   591| ---- |   607|   612|   625|   635|   640|   648|   653|   660|   663|   674|   681|   687|   690| ---- |   700|   731|   752|   760|   789|   799|   801|   808|   820|   831|   838|   849|   856|   862|   868|   870| ---- |   880|   883|   888|   891| ---- | ---- |   905|   907|   914|   921|   931|   936|   943|   949|   959|   962|   965|   967|   976|   979| ---- |   988|  1019| ---- |  1030|  1047|  1051|  1057|  1060|  1069|  1076|  1082| ---- |  1093| ---- |  1099|  1134|  1136| ---- |  1156|  1163|  1169| ---- |  1177|  1211|  1218| ---- | ---- |  1235|  1243|  1258| ---- |  1271| ---- |  1280|  1292|  1298|  1301|  1307|  1314|  1316|  1319|  1327|  1333|  1339|  1346|  1352|  1358| ---- |  1377|  1385|  1392|  1418|  1425| ---- |  1430|  1444| ---- |  1453| ---- |  1462|  1470| ---- |  1496|  1516|  1521|  1528|  1539|  1546|  1566|  1575| ---- |  1583|  1602

This analysis returns an alignment score between 0 and 1 (0: no match, 1: all borders aligned), a p-value (usually equal zero), the proportion of borders conserved in the second experiment, and the proportion of borders conserved in the first experiment:

stats
The history saving thread hit an unexpected error (OperationalError('database is locked')).History will not be written to the database.
(0.4696132596685082, 0.0, 0.872, 0.8980891719745223)
print('Alignment score: %.3f, p-value: %.4f\n  proportion of borders of T0 found in T60: %.3f, of T60 in T0 %.3f' % stats)
Alignment score: 0.470, p-value: 0.0000
  proportion of borders of T0 found in T60: 0.872, of T60 in T0 0.898

Save Chromosome object (with TAD definition)

crm.save_chromosome('results/fragment/chr3.tdb')

Extra: choosing the best resolution to call TAD borders

TopDom and the methodology based on insulation score are relatively fast computationally and allow to call TAD borders at very high resolution. However being able to call TAD borders at high resolution does not mean that we should do it. As the resolution increases, so does the noise.

In order to assess which is the best resolution in order to call TAD borders, a good strategy is to test the consistency of several resolutions.

In the example below, we use the methodology based on the insulation score as it is the fastest (tadbit strategy is almost not usable bellow 20 kb). We compare the number of TAD borders that are shared (plus-minus one bin) between the two replicates (iPS and B cells).

from pytadbit.parsers.hic_parser import load_hic_data_from_bam
from pytadbit.tadbit import insulation_score, insulation_to_borders
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
cel1 = 'mouse_B'
cel2 = 'mouse_PSC'
borders = {}
resos = [5000, 10000, 20000, 30000, 40000, 50000, 60000, 80000, 100000, 150000, 200000]
for c in ['chr%d' % cs for cs in range(1, 20)] + ['chrX']:
    borders[c] = {}
    for reso in resos:
        hic_data1 = load_hic_data_from_bam(base_path.format(cel1),
                                           resolution=reso,
                                           region=c,
                                           ncpus=8, verbose=False)
        hic_data2 = load_hic_data_from_bam(base_path.format(cel2),
                                           resolution=reso,
                                           region=c,
                                           ncpus=8, verbose=False)
        hic_data1.normalize_hic(silent=True)
        hic_data1.normalize_expected()
        hic_data2.normalize_hic(silent=True)
        hic_data2.normalize_expected()
        wsize = (1, max(2, 500000 // reso))
        delta = max(1, 100000 // reso)
        insc1, delta1 = insulation_score(hic_data1, [wsize], resolution=100000, normalize=True, delta=delta)
        insc2, delta2 = insulation_score(hic_data2, [wsize], resolution=100000, normalize=True, delta=delta)
        borders1 = insulation_to_borders(insc1[wsize], delta1[wsize], min_strength=0.1)
        borders2 = insulation_to_borders(insc2[wsize], delta2[wsize], min_strength=0.1)
        borders[c][reso] = borders1, borders2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2
- computing insulation in band 1-100
- computing insulation in band 1-100
- computing insulation in band 1-50
- computing insulation in band 1-50
- computing insulation in band 1-25
- computing insulation in band 1-25
- computing insulation in band 1-16
- computing insulation in band 1-16
- computing insulation in band 1-12
- computing insulation in band 1-12
- computing insulation in band 1-10
- computing insulation in band 1-10
- computing insulation in band 1-8
- computing insulation in band 1-8
- computing insulation in band 1-6
- computing insulation in band 1-6
- computing insulation in band 1-5
- computing insulation in band 1-5
- computing insulation in band 1-3
- computing insulation in band 1-3
- computing insulation in band 1-2
- computing insulation in band 1-2

We align TAD borders and keep the proportion of borders aligned between cell types

from pytadbit.alignment import align
scores = {}
props1 = {}
props2 = {}
for c in borders:
    scores[c] = []
    props1[c] = []
    props2[c] = []
    for reso in resos:
        ali = align([[t[0] for t in b] for b in borders[c][reso]], max_dist=1)
        props1[c].append(ali[0][2])
        props2[c].append(ali[0][3])
        scores[c].append(ali[0][1])

We plot the results, as boxplots to group the values of each chromosomes:

from functools import reduce

plt.figure(figsize=(12, 5))
bp = plt.boxplot([reduce(lambda x, y: x+ y,
                 [(props1[c][i], props2[c][i]) for c in props1])
                  for i in range(len(resos))], positions=resos, widths=2000)

plt.plot([sum(m.get_xdata()) / 2 for m in bp['medians']],
         [0m.get_ydata()[0] for m in bp['medians']], color='grey', alpha=0.5)

plt.xlim(0, 205000)
plt.xticks(resos, [str(reso / 1000) + 'kb'  for reso in resos], rotation=90)
plt.ylabel('Proportion of aligned borders\nPSC vs B cell (all chromosomes)')
plt.xlabel('Resolution at which TAD borders are called')
plt.grid()
plt.tight_layout()
plt.show()
../_images/tutorial_8-Compartments_and_TADs_detection_90_0.png

In the case of these experiments it seems that looking for TAD borders bellow 50 kb is dangerous, as very few are reproducible. Calling TAD borders at 100 kb resolution might be here a good idea.