Utilities¶
- pytadbit.get_dependencies_version(dico=False)[source]¶
Check versions of TADbit and all dependencies, as well and retrieves system info. May be used to ensure reproducibility.
- Returns
string with description of versions installed
- pytadbit.utils.tadmaths.calinski_harabasz(scores, clusters)[source]¶
Implementation of the CH score [CalinskiHarabasz1974], that has shown to be one the most accurate way to compare clustering methods [MilliganCooper1985] [Tibshirani2001].
The CH score is:
\[CH(k) = \frac{B(k) / (k-1)}{W(k)/(n - k)}\]Where \(B(k)\) and \(W(k)\) are the between and within cluster sums of squares, with \(k\) clusters, and \(n\) the total number of elements (models in this case).
- Parameters
scores – a dict with, as keys, a tuple with a pair of models; and, as value, the distance between these models.
clusters – a dict with, as key, the cluster number, and as value a list of models
nmodels – total number of models
- Returns
the CH score
- pytadbit.utils.extraviews.plot_3d_optimization_result(result, axes=('scale', 'maxdist', 'upfreq', 'lowfreq'))[source]¶
Displays a three dimensional scatter plot representing the result of the optimization.
- Parameters
result – 3D numpy array contating correlation values
axes ('scale','maxdist','upfreq','lowfreq') – tuple of axes to represent. The order will define which parameter will be placed on the w, z, y or x axe.
- pytadbit.utils.extraviews.plot_2d_optimization_result(result, axes=('scale', 'kbending', 'maxdist', 'lowfreq', 'upfreq'), dcutoff=None, show_best=0, skip=None, savefig=None, clim=None, cmap='inferno')[source]¶
A grid of heatmaps representing the result of the optimization. In the optimization up to 5 parameters can be optimized: ‘scale’, ‘kbending’, ‘maxdist’, ‘lowfreq’, and ‘upfreq’. The maps will be divided in different pages depending on the ‘scale’ and ‘kbending’ values. In each page there will be different maps depending the ‘maxdist’ values. Each map has ‘upfreq’ values along the x-axes, and ‘lowfreq’ values along the y-axes.
- Parameters
result – 3D numpy array contating the computed correlation values
axes ('scale','kbending','maxdist','lowfreq','upfreq') – tuple of axes to represent. The order is important here. It will define which parameter will be placed respectively on the v, w, z, y, or x axes.
show_best (0) – number of best correlation value to highlight in the heatmaps. The best correlation is highlithed by default
skip (None) – a dict can be passed here in order to fix a given parameter value, e.g.: {‘scale’: 0.001, ‘kbending’: 30, ‘maxdist’: 500} will represent all the correlation values at fixed ‘scale’, ‘kbending’, and ‘maxdist’ values, respectively equal to 0.001, 30, and 500.
dcutoff (None) – The distance cutoff (dcutoff) used to compute the contact matrix in the models.
savefig (None) – path to a file where to save the generated image. If None, the image will be displayed using matplotlib GUI. NOTE: the extension of the file name will automatically determine the desired format.
clim (None) – color scale. If None, the max and min values of the input are used.
cmap (inferno) – matplotlib colormap
- pytadbit.utils.extraviews.compare_models(sm1, sm2, cutoff=150, models1=None, cluster1=None, models2=None, cluster2=None)[source]¶
Plots the difference of contact maps of two group of structural models.
- Parameters
sm1 – a StructuralModel
sm2 – a StructuralModel
dcutoff (150) – distance threshold (nm) to determine if two particles are in contact
models (None) – if None (default) the contact map will be computed using all the models. A list of numbers corresponding to a given set of models can be passed
cluster (None) – compute the contact map only for the models in the cluster number ‘cluster’
- pytadbit.utils.extraviews.color_residues(x, **kwargs)[source]¶
Function to color residues from blue to red.
- Parameters
model – a given
pytadbit.imp.impmodel.IMPmodel
- Returns
a list of rgb tuples (red, green, blue), each between 0 and 1.
- pytadbit.utils.extraviews.tad_coloring(x, mstart=None, mend=None, tads=None, **kwargs)[source]¶
Colors TADs from blue to red (first to last TAD). TAD borders are displayed in scale of grey, from light to dark grey (again first to last border)
- Parameters
model – a given
pytadbit.imp.impmodel.IMPmodel
tads – a dictionary of TADs, Experiments.tads can be directly passed
- Returns
a list of rgb tuples (red, green, blue), each between 0 and 1.
- pytadbit.utils.extraviews.tad_border_coloring(x, mstart=None, mend=None, tads=None, **kwargs)[source]¶
Colors TAD borders from blue to red (bad to good score). TAD are displayed in scale of grey, from light to dark grey (first to last particle in the TAD)
- Parameters
model – a given
pytadbit.imp.impmodel.IMPmodel
tads – a dictionary of TADs, Experiments.tads can be directly passed
- Returns
a list of rgb tuples (red, green, blue), each between 0 and 1.
- pytadbit.utils.extraviews.plot_3d_model(x, y, z, label=False, axe=None, thin=False, savefig=None, show_axe=False, azimuth=- 90, elevation=0.0, color='index', smooth=0.001, particle_size=50, alpha_part=0.5, lw_main=3, **kwargs)[source]¶
Given a 3 lists of coordinates (x, y, z) plots a three-dimentional model using matplotlib
- Parameters
model – a
pytadbit.imp.impmodel.IMPmodel
label (False) – show labels
axe (None) – a matplotlib.axes.Axes object to define the plot appearance
thin (False) – draw a thin black line instead of representing particles and edges
smooth (0.001) – connction between particles is smoothed according to the input condition
savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).
azimuth (-90) – angle to rotate camera along the y axis
elevation (0) – angle to rotate camera along the x axis
color ('index') –
can be:
- a string as:
’index’ to color particles according to their position in the model (
pytadbit.utils.extraviews.color_residues()
)’tad’ to color particles according to the TAD they belong to (
pytadbit.utils.extraviews.tad_coloring()
)’border’ to color particles marking borders. Color according to their score (
pytadbit.utils.extraviews.tad_border_coloring()
) coloring function like.
a function, that takes as argument a model and any other parameter passed through the kwargs.
a list of (r, g, b) tuples (as long as the number of particles). Each r, g, b between 0 and 1.
- pytadbit.utils.extraviews.chimera_view(cmm_files, chimera_bin='chimera', chimera_cmd=None, savefig=None, center_of_mass=False, gyradius=False, align=True, grid=False, highlight='all', **kwargs)[source]¶
Open a list of .cmm files with Chimera (http://www.cgl.ucsf.edu/chimera) to view models.
- Parameters
cmm_files – list of .cmm files
chimera_bin ('chimera') – path to chimera binary
chimera_cmd (None) – list of command lines for chimera
savefig (None) – path to a file where to save generated image
center_of_mass (False) – if True, draws the center of mass
gyradius (False) – increment the radius of the center of mass by the value of the radius of girations given.
align (False) – align models
grid (False) – tile models
higlight ('all') – higlights a given model, or group of models. Can be either ‘all’, or a model number
- pytadbit.utils.hic_filtering.hic_filtering_for_modelling(matrx, silent=False, perc_zero=90, auto=True, min_count=None, draw_hist=False, savefig=None, diagonal=True)[source]¶
Call filtering function, to remove artifactual columns in a given Hi-C matrix. This function will detect columns with very low interaction counts; and columns with NaN values (in this case NaN will be replaced by zero in the original Hi-C data matrix). Filtered out columns will be stored in the dictionary Experiment._zeros.
- Parameters
matrx – Hi-C matrix of a given experiment
silent (False) – does not warn for removed columns
savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).
diagonal (True) – remove row/columns with zero in the diagonal
perc_zero (90) – maximum percentage of cells with no interactions allowed.
min_count (None) – minimum number of reads mapped to a bin (recommended value could be 2500). If set this option overrides the perc_zero filtering… This option is slightly slower.
auto (True) – if False, only filters based on the given percentage zeros
- Returns
the indexes of the columns not to be considered for the calculation of the z-score
- pytadbit.utils.file_handling.magic_open(filename, verbose=False, cpus=None)[source]¶
To read uncompressed zip gzip bzip2 or tar.xx files
- Parameters
filename – either a path to a file, or a file handler
- Returns
opened file ready to be iterated
- pytadbit.utils.file_handling.get_free_space_mb(folder, div=2)[source]¶
Return folder/drive free space (in bytes)
Based on stackoverflow answer: http://stackoverflow.com/questions/51658/cross-platform-space-remaining-on-volume-using-python
- Parameters
folder – folder/drive to measure
div (2) – divide the size by (div*1024) to get the size in Mb (3 is Gb…)
- pytadbit.utils.three_dim_stats.calc_eqv_rmsd(models, beg, end, zeros, dcutoff=200, one=False, what='score', normed=True)[source]¶
Calculates the RMSD, dRMSD, the number of equivalent positions and a score combining these three measures. The measure are done between a group of models in a one against all manner.
- Parameters
beg – start particle number of the region to compare
end – end particle number of the region to compare
zeros – list of True/False representing particles to skip
dcutoff (200) – distance in nanometer from which it is considered that two particles are separated.
fact (0.75) – Factor for equivalent positions
one (False) – if True assumes that only two models are passed, and returns the rmsd of their comparison
what ('score') – values to return. Can be one of ‘score’, ‘rmsd’, ‘drmsd’ or ‘eqv’
normed (True) – normalize result by maximum value (only applies to rmsd and drmsd)
- Returns
a score of each pairwise comparison according to:
\[score_i = eqvs_i \times \frac{dRMSD_i / max(dRMSD)} {RMSD_i / max(RMSD)}\]where \(eqvs_i\) is the number of equivalent position for the ith pairwise model comparison.
- pytadbit.utils.three_dim_stats.square_distance(part1, part2)[source]¶
Calculates the square distance between two particles.
- Parameters
part1 – coordinate (dict format with x, y, z keys)
part2 – coordinate (dict format with x, y, z keys)
- Returns
square distance between two points in space
- pytadbit.utils.three_dim_stats.distance(part1, part2)[source]¶
Calculates the distance between two particles.
- Parameters
part1 – coordinate in list format (x, y, z)
part2 – coordinate in list format (x, y, z)
- Returns
distance between two points in space
- pytadbit.utils.three_dim_stats.angle_between_3_points(point1, point2, point3)[source]¶
Calculates the angle between 3 particles
Given three particles A, B and C, the angle g (angle ACB, shown below):
A /| /i| c/ | / | / | B )g |b \ | \ | a\ | \h| \| C
is given by the theorem of Al-Kashi:
\[b^2 = a^2 + c^2 - 2ac\cos(g)\]- Parameters
point1 – list of 3 coordinate for x, y and z
point2 – list of 3 coordinate for x, y and z
point3 – list of 3 coordinate for x, y and z
- Returns
angle in radians
- pytadbit.utils.three_dim_stats.dihedral(a, b, c, d, e)[source]¶
Calculates dihedral angle between 4 points in 3D (array with x,y,z)
- pytadbit.utils.three_dim_stats.generate_circle_points(x, y, z, u, v, w, n)[source]¶
Returns list of 3d coordinates of points on a circle using the Rodrigues rotation formula.
see Murray, G. (2013). Rotation About an Arbitrary Axis in 3 Dimensions for details
- Parameters
x – x coordinate of a point somewhere on the circle
y – y coordinate of a point somewhere on the circle
z – z coordinate of a point somewhere on the circle
a – x coordinate of the center
b – y coordinate of the center
c – z coordinate of the center
u – 1st element of a vector in the same plane as the circle
v – 2nd element of a vector in the same plane as the circle
w – 3rd element of a vector in the same plane as the circle
n – number of points in the circle
- TODO: try simplification for a=b=c=0 (and do the translation in the main
function)