StructuralModels class

pytadbit.modelling.structuralmodels.load_structuralmodels(input_)[source]

Loads pytadbit.modelling.structuralmodels.StructuralModels from a file (generated with pytadbit.modelling.structuralmodels.StructuralModels.save_models).

Parameters

input – path to the pickled StructuralModels object or dictionary containing all the information.

Returns

a pytadbit.modelling.imp_model.StructuralModels.

class pytadbit.modelling.structuralmodels.StructuralModels(nloci, models, bad_models, resolution, original_data=None, zscores=None, clusters=None, config=None, experiment=None, zeros=None, restraints=None, description=None)[source]

This class contains three-dimensional models generated from a single Hi-C data. They can be reached either by their index (integer representing their rank according to objective function value), or by their IMP random intial number (as string).

Parameters
  • nloci – number of particles in the selected region

  • models – a dictionary containing the generated pytadbit.modelling.impmodel.IMPmodel to be used as ‘best models’

  • bad_models – a dictionary of pytadbit.modelling.impmodel.IMPmodel, these model will not be used, just stored in case the set of ‘best models’ needs to be extended later-on ( pytadbit.modelling.structuralmodels.StructuralModels.define_best_models() ).

  • resolution – number of nucleotides per Hi-C bin. This will be the number of nucleotides in each model’s particle.

  • original_data (None) – a list of list (equivalent to a square matrix) of the normalized Hi-C data, used to build this list of models.

  • clusters (None) – a dictionary of type pytadbit.modelling.structuralmodels.ClusterOfModels

  • config (None) – a dictionary containing the parameter to be used for the generation of three dimensional models.

accessibility(radius, models=None, cluster=None, nump=100, superradius=200, savefig=None, savedata=None, axe=None, plot=True, error=True, steps=(1,))[source]

Calculates a mesh surface around the model (distance equal to input radius) and checks if each point of this mesh could be replaced by an object (i.e. a protein) of a given radius

Outer part of the model can be excluded from the estimation of accessible surface, as the occupancy outside the model is unkown (see superradius option).

Parameters
  • radius – radius of the object we want to fit in the model.

  • nump (100) – number of points to draw around a given particle. This number also sets the number of points drawn around edges, as each point occupies a given surface (see maths below). Note that this number is considerably lowered by the occupancy of edges, depending of the angle formed by the edges surrounding a given particle, only 10% to 50% of the ``nump`` will be drawn in fact.

  • savefig (None) – path where to save chimera image

  • steps ((1, )) – how many particles to group for the estimation. By default 1 curve is drawn

  • superradius (200) – radius of an object used to exclude outer surface of the model. Superradius must be higher than radius.

This function will first define a mesh around the chromatin, representing all possible position of the center of the object we want to fit. This mesh will be at a distance of radius from the chromatin strand. All dots in the mesh represents an equal area (a), the whole surface of the chromatin strand being: \(A=n \times a\) (n being the total number of dots in the mesh).

The mesh consists of spheres around particles of the model, and cylinders around edges joining particles (no overlap is allowed between sphere and cylinders or cylinder and cylinder when they are consecutive).

If we want that all dots of the mesh representing the surface of the chromatin, corresponds to an equal area (\(a\))

\[a = \frac{4\pi r^2}{s} = \frac{2\pi r N_{(d)}}{c}\]

with:

  • \(r\) radius of the object to fit (as the input parameter radius)

  • \(s\) number of points in sphere

  • \(c\) number of points in circle (as the input parameter nump)

  • \(N_{(d)}\) number of circles in an edge of length \(d\)

According to this, when the distance between two particles is equal to \(2r\) (\(N=2r\)), we would have \(s=c\).

As :

\[2\pi r = \sqrt{4\pi r^2} \times \sqrt{\pi}\]

It is fair to state the number of dots represented along a circle as:

\[c = \sqrt{s} \times \sqrt{\pi}\]

Thus the number of circles in an edge of length \(d\) must be:

\[N_{(d)}=\frac{s}{\sqrt{s}\sqrt{\pi}}\times\frac{d}{2r}\]
Returns

a list of 1- the number of dots in the mesh that could be occupied by an object of the given radius 2- the total number of dots in the mesh 3- the estimated area of the mesh (in square micrometers) 4- the area of the mesh of a virtually straight strand of chromatin defined as \(contour\times 2\pi r + 4\pi r^2\) (also in micrometers) 5- a list of number of (accessibles, inaccessible) for each particle (percentage burried can be infered afterwards by accessible/(accessible+inaccessible) )

if ploting, returns values to be plotted (list of lists, one per window), and the same for upper and lower error lines

align_models(models=None, cluster=None, in_place=False, reference_model=None, by_cluster=False, **kwargs)[source]

Three-dimensional aligner for structural models.

Parameters
  • models (None) – if None (default) the average model will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the average model only for the models in the cluster number ‘cluster’

  • in_place (False) – if True, the result of the alignment will REPLACE the coordinates of each model. Default is too yield new coordinates of each model

  • reference_model (None) – align given model to reference model

  • by_cluster (False) – if True align each cluster individually in case we align all the models (models=None)

angle_between_3_particles(parta, partb, partc, models=None, cluster=None, radian=False, all_angles=False)[source]

Calculates the angle between 3 particles.

Given three particles A, B and C, the angle g (angle ACB, shown below):

      A
     /|
    /i|
  c/  |
  /   |
 /    |
B )g  |b
 \    |
  \   |
  a\  |
    \h|
     \|
      C

is given by the theorem of Al-Kashi:

\[b^2 = a^2 + c^2 - 2ac\cos(g)\]
Parameters
  • parta – A particle number

  • partb – A particle number

  • partc – A particle number

  • models (None) – if None (default) the angle will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the angle only for the models in the cluster number ‘cluster’

  • radian (False) – if True, return value in radians (in degrees otherwise)

Returns

an angle, either in degrees or radians. If all_angles is true returns a list of the angle g, h, i (see picture above)

average_model(models=None, cluster=None, verbose=False)[source]

Builds and returns an average model representing a given group of models

Parameters
  • models (None) – if None (default) the average model will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the average model only for the models in the cluster number ‘cluster’

  • verbose (False) – prints the distance of each model to average model (in stderr)

Returns

the average model of a given group of models (a new and ARTIFICIAL model)

centroid_model(models=None, cluster=None, verbose=False)[source]

Estimates and returns the centroid model of a given group of models.

Parameters
  • models (None) – if None (default) the centroid model will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the centroid model only for the models in the cluster number ‘cluster’

  • verbose (False) – prints the distance of each model to average model (in stderr)

Returns

the centroid model of a given group of models (the most model representative)

cluster_analysis_dendrogram(n_best_clusters=None, color=False, axe=None, savefig=None, **kwargs)[source]

Representation of the clustering results. The length of the leaves if proportional to the final objective function value of each model. The branch widths are proportional to the number of models in a given cluster (or group of clusters, if it is an internal branch).

Parameters
  • n_best_clusters (None) – number of clusters to represent (by default all clusters will be shown)

  • color (False) – color the dendrogram based on the significance of the clustering (basically it depends of the internal branch lengths)

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • width_factor (10.0) – multiplicator for the width of the line representing the number of models in a given cluster.

  • fontsize (8) – size of the smallest font represented in the plot

  • figsize ((8,8)) – a tuple of width and height, to set the size of the plot.

cluster_models(fact=0.75, dcutoff=None, method='mcl', mcl_bin='mcl', tmp_file=None, verbose=True, n_cpus=1, mclargs=None, external=False, what='score', region=None)[source]

This function performs a clustering analysis of the generated models based on structural comparison. The result will be stored in StructuralModels.clusters

Clustering is done according to a score of pairwise comparison calculated as:

\[score_i = eqvs_i \times \frac{dRMSD_i / max(dRMSD)} {RMSD_i / max(RMSD)}\]

where \(eqvs_i\) is the number of equivalent position for the ith pairwise model comparison.

param 0.75 fact

factor to define the percentage of equivalent positions to be considered in the clustering

param None dcutoff

distance threshold (nm) to determine if two particles are in contact, default is 1.5 times resolution times scale

param ‘mcl’ method

clustering method to use, which can be either ‘mcl’ or ‘ward’. MCL method is recommended. WARD method uses a scipy implementation of this hierarchical clustering, and selects the best number of clusters using the pytadbit.utils.tadmaths.calinski_harabasz() function.

param ‘mcl’ mcl_bin

path to the mcl executable file, in case of the ‘mcl is not in the PATH’ warning message

param None tmp_file

path to a temporary file created during the clustering computation. Default will be created in /tmp/ folder

param True verbose

same as print StructuralModels.clusters

param 1 n_cpus

number of cpus to use in MCL clustering

param mclargs

list with any other command line argument to be passed to mcl (i.e,: mclargs=[‘-pi’, ‘10’, ‘-I’, ‘2.0’])

param False external

if True returns the cluster found instead of storing it as StructuralModels.clusters

param ‘score’ what

Statistic used for clustering. Can be one of ‘score’, ‘rmsd’, ‘drmsd’ or ‘eqv’.

param None region

coordinates of the region to base the clustering. Can be either one chromosome name, or the coordinate in the form: “chr3:110000000-120000000”

contact_map(models=None, cluster=None, cutoff=None, axe=None, savefig=None, savedata=None)[source]

Plots a contact map representing the frequency of interaction (defined by a distance cutoff) between two particles.

Parameters
  • models (None) – if None (default) the contact map will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the contact map only for the models in the cluster number ‘cluster’

  • cutoff (None) – distance cutoff (nm) to define whether two particles are in contact or not, default is 2 times resolution, times scale.

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • savedata (None) – path to a file where to save the contact map data generated, in three columns format (particle1, particle2, percentage of models where these two particles are in contact)

correlate_with_real_data(models=None, cluster=None, cutoff=None, off_diag=1, plot=False, axe=None, savefig=None, corr='spearman', midplot='hexbin', log_corr=True, contact_matrix=None, show_bad_columns=True)[source]

Plots the result of a correlation between a given group of models and original Hi-C data.

Parameters
  • models (None) – if None (default) the correlation will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the correlation only for the models in the cluster number ‘cluster’

  • cutoff (None) – distance cutoff (nm) to define whether two particles are in contact or not, default is 2 times resolution, times scale.

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • plot (False) – to display the plot

  • log_corr (True) – log plot for correlation

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • contact_matrix (None) – input a contact matrix instead of computing it from the models

  • show_bad_columns (True) – Wether to hide or not bad columns in the contact map

Returns

correlation coefficient rho, between the two matrices. A rho value greater than 0.7 indicates a very good correlation

deconvolve(fact=0.75, dcutoff=None, method='mcl', mcl_bin='mcl', tmp_file=None, verbose=True, n_cpus=1, mclargs=None, what='dRMSD', n_best_clusters=10, savefig=None, represent_models=False, figsize=(11, 11), clusters=None, **kwargs)[source]

This function performs a deconvolution analysis of a given froup of models. It first clusters models based on structural comparison (dRMSD), and then, performs a differential contact map between each possible pair of cluster.

Note

Clusters defined here are different from the one defined when using pytadbit.modelling.structuralmodels.StructuralModels.cluster_models(). They are also not stored into StructuralModels.clusters

Parameters
  • fact (0.75) – factor to define the percentage of equivalent positions to be considered in the clustering

  • dcutoff (None) – distance threshold (nm) to determine if two particles are in contact, default is 1.5 times resolution times scale

  • method ('mcl') – clustering method to use, which can be either ‘mcl’ or ‘ward’. MCL method is recommended. WARD method uses a scipy implementation of this hierarchical clustering, and selects the best number of clusters using the pytadbit.utils.tadmaths.calinski_harabasz() function.

  • mcl_bin ('mcl') – path to the mcl executable file, in case of the ‘mcl is not in the PATH’ warning message

  • tmp_file (None) – path to a temporary file created during the clustering computation. Default will be created in /tmp/ folder

  • verbose (True) – same as print StructuralModels.clusters

  • n_cpus (1) – number of cpus to use in MCL clustering

  • mclargs – list with any other command line argument to be passed to mcl (i.e,: mclargs=[‘-pi’, ‘10’, ‘-I’, ‘2.0’])

  • n_best_clusters (10) – number of clusters to represent

  • clusters (None) – provide clusters as a dictionary with keys=cluster number, or name, and values list of model numbers.

  • represent_models (False) – To generate an interactive visualization of a representative model for each cluster. Representative model depends on the value passed to this option, it can be either ‘centroid’ or ‘best’ (this last standing for the model with lowest IMP objective function value).

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • figsize ((11,11)) – dimension of the plot

define_best_models(nbest)[source]

Defines the number of top models (based on the objective function) to keep. If keep_all is set to True in pytadbit.modelling.imp_model.generate_3d_models() or in pytadbit.experiment.Experiment.model_region(), then the full set of models (n_models parameter) will be used, otherwise only the n_keep models will be available.

Parameters

nbest – number of top models to keep (usually 20% of the generated models).

density_plot(models=None, cluster=None, steps=(1, 2, 3, 4, 5), interval=1, use_mass_center=False, error=False, axe=None, savefig=None, savedata=None, plot=True)[source]

Plots the number of nucleotides per nm of chromatin vs the modeled region bins.

Parameters
  • models (None) – if None (default) the density plot will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the density plot only for the models in the cluster number ‘cluster’

  • steps ((1, 2, 3, 4, 5)) – how many particles to group for the estimation. By default 5 curves are drawn

  • error (False) – represent the error of the estimates

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • interval (1) – distance are measure with this given interval between two bins.

  • use_mass_center (False) – if interval is higher than one, calculates the distance between the center of mass of the particles n to n+interval and the center of mass of the particles n+interval and n+2interval

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • savedata (None) – path to a file where to save the density data generated (1 column per step + 1 for particle number).

  • plot (True) – e.g. if False, only saves data. No plotting done

Returns

values to be plotted (list of lists, one per window), and the same for upper and lower error lines

dihedral_angle(pa, pb, pc, pd, pe, models)[source]
Calculates the dihedral angle between 2 planes formed by 5 particles

(one common to both planes).

Parameters
  • models (None) – if None (default) the angle will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the angle only for the models in the cluster number ‘cluster’

fetch_model_by_rand_init(rand_init, all_models=False)[source]

Models are stored according to their objective function value (first best), but in order to reproduce a model, we need its initial random number. This method helps to fetch the model corresponding to a given initial random number stored under StructuralModels.models[N][‘rand_init’].

Parameters
  • rand_init – the wanted rand_init number.

  • all_models (False) – whether or not to use ‘bad’ models

Returns

index of 3d model

get_contact_matrix(models=None, cluster=None, cutoff=None, distance=False, show_bad_columns=True)[source]

Returns a matrix with the number of interactions observed below a given cutoff distance.

Parameters
  • models (None) – if None (default) the contact matrix will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the contact matrix only for the models in the cluster number ‘cluster’

  • cutoff (None) – distance cutoff (nm) to define whether two particles are in contact or not, default is 2 times resolution, times scale. Cutoff can also be a list of values, in wich case the returned object will be a dictionnary of matrices (keys being square cutoffs)

  • distance (False) – returns the distance matrix of all_angles against all_angles particles instead of a contact_map matrix using the cutoff

  • show_bad_columns (True) – show bad columns in contact map

Returns

matrix frequency of interaction

get_persistence_length(begin=0, end=None, axe=None, savefig=None, savedata=None, plot=True)[source]

Calculates the persistence length (Lp) of given section of the model. Persistence length is calculated according to [Bystricky2004] :

\[<R^2> = 2 \times Lp^2 \times (\frac{Lc}{Lp} - 1 + e^{\frac{-Lc}{Lp}})\]

with the contour length as \(Lc = \frac{d}{c}\) where \(d\) is the genomic dstance in bp and \(c\) the linear mass density of the chromatin (in bp/nm).

Parameters
  • begin (0) – starting particle of the region to consider

  • end (None) – ending particle of the region to consider

Returns

(float) persistence length

infer_unrestrained_particle_coords(xcoords, ycoords, zcoords)[source]
if a given particle (and direct neighbors) have no restraints. Infer

the coordinates by linear interpolation using closest particles with restraints.

Parameters
  • xcoords – list of x coordinates

  • ycoords – list of y coordinates

  • zcoords – list of z coordinates

Transforms

three input lists corresponding to the x, y and z coordinates of the FULL model

interactions(models=None, cluster=None, cutoff=None, steps=(1, 2, 3, 4, 5), axe=None, error=False, savefig=None, savedata=None, average=True, plot=True)[source]

Plots, for each particle, the number of interactions (particles closer than the given cut-off). The value given is the average for all models.

Parameters
  • models (None) – if None (default) the contact map will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the contact map only for the models in the cluster number ‘cluster’

  • cutoff (None) – distance cutoff (nm) to define whether two particles are in contact or not, default is 2 times resolution, times scale.

  • steps ((1, 2, 3, 4, 5)) – how many particles to group for the estimation. By default 5 curves are drawn

  • error (False) – represent the error of the estimates

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • savedata (None) – path to a file where to save the contact map data generated, in three columns format (particle1, particle2, percentage of models where these two particles are in contact)

  • average (True) – calculate average interactions along models, otherwise, the median.

  • plot (True) – e.g. only saves data. No plotting done

Returns

values to be plotted (list of lists, one per window), and the same for upper and lower error lines

median_3d_dist(part1, part2, models=None, cluster=None, plot=True, median=True, axe=None, savefig=None)[source]

Computes the median distance between two particles over a set of models

Parameters
  • part1 – number corresponding to the first particle

  • part2 – number corresponding to the second particle

  • models (None) – if None (default) the distance will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the distance only for the models in the cluster number ‘cluster’

  • plot (True) – if True, display a histogram and a box-plot of the distribution of the calculated distances. If False, return either the full list of the calculated distances or their median value

  • median (True) – return either the full list of the calculated distances (False) or their median value (True), when ‘plot’ is set to False

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

Returns

if ‘plot’ is False, return either the full list of the calculated distances or their median value distances, either the list of distances.

model_consistency(cutoffs=None, models=None, cluster=None, axe=None, savefig=None, savedata=None, plot=True)[source]

Plots the particle consistency, over a given set of models, vs the modeled region bins. The consistency is a measure of the variability (or stability) of the modeled region (the higher the consistency value, the higher stability).

Parameters
  • cutoffs (None) – list of distance cutoffs (nm) used to compute the consistency. Two particle are considered consistent if their distance is less than the given cutoff, default is a tuple of 0.5, 1, 1.5 and 2 times resolution, times scale.

  • models (None) – if None (default) the consistency will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the consistency only for the models in the cluster number ‘cluster’

  • tmp_path ('/tmp/tmp_cons') – location of the input files for TM-score program

  • tmsc ('') – path to the TMscore_consistency script (assumed to be installed by default)

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • savedata (None) – path to a file where to save the consistency data generated (1 column per cutoff + 1 for particle number).

  • plot (True) – e.g. only saves data. No plotting done

objective_function_model(model, log=False, smooth=True, axe=None, savefig=None)[source]

This function plots the objective function value per each Monte-Carlo step

Parameters
  • model – the number of the model to plot

  • log (False) – log plot

  • smooth (True) – curve smoothing

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

particle_coordinates(part, models=None, cluster=None)[source]

Returns the mean coordinate of a given particle in a group of models.

Parameters
  • part – the index number of a particle

  • models (None) – if None (default) the angle will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the angle only for the models in the cluster number ‘cluster’

save_models(outfile, minimal=())[source]

Saves all the models in pickle format (python object written to disk).

Parameters
  • path_f – path where to save the pickle file

  • minimal (()) – list of items to exclude from save. Options: - ‘restraints’: used for modeling common to all models - ‘zscores’: used generate restraints common to all models - ‘original_data’: used generate Z-scores common to all models - ‘log_objfun’: generated during modeling model specific

view_centroid(**kwargs)[source]

shortcut for view_models(tool=’plot’, show=’highlighted’, highlight=’centroid’)

Parameters

kwargs – any parameters to be passed to view_models (i.e.: view_centroid(azimuth=30, elevation=10, show_axe=True, label=True))

view_models(models=None, cluster=None, tool='chimera', show='all', highlight='centroid', savefig=None, cmd=None, color='index', align=True, smooth=False, particle_size=50, lw_main=3, alpha_part=0.5, **kwargs)[source]

Visualize a selected model in the three dimensions (either with Chimera or through matplotlib).

Parameters
  • models (None) – if None (default) the visualization will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the visualization only for the models in the cluster number ‘cluster’

  • tool ('chimera') – path to the external tool used to visualize the model. Can also be ‘plot’, to use matplotlib.

  • savefig (None) – path to a file where to save the image OR movie generated (depending on the extension; accepted formats are png, mov and webm). if set to None, the image or movie will be shown using the default GUI.

  • alpha (False) – only for matplotlib (‘plot’ option), transparency of particles.

  • smooth (False) – only for matplotlib (‘plot’ option), spline smoothing (by default smoothing is 0.001).

  • particle_size (50) – only for matplotlib (‘plot’ option), redefine size of particles. If None, resolution times scale is used.

  • color ('index') –

    can be:

  • highlight ('centroid') – higlights a given model, or group of models. Can be either ‘all’, ‘centroid’ or ‘best’ (‘best’ being the model with the lowest IMP objective function value

  • show ('all') – models to be displayed. Can be either ‘all’, ‘grid’ or ‘highlighted’.

  • cmd (None) –

    list of commands to be passed to the viewer. The chimera list is:

    focus
    set bg_color white
    windowsize 800 600
    bonddisplay never #0
    represent wire
    shape tube #0 radius 5 bandLength 100 segmentSubdivisions 1 followBonds on
    clip yon -500
    ~label
    set subdivision 1
    set depth_cue
    set dc_color black
    set dc_start 0.5
    set dc_end 1
    scale 0.8
    

    Followed by the movie command to record movies:

    movie record supersample 1
    turn y 3 120
    wait 120
    movie stop
    movie encode output SAVEFIG
    

    Or the copy command for images:

    copy file SAVEFIG png
    

    Passing as the following list as ‘cmd’ parameter:

    cmd = ['focus', 'set bg_color white', 'windowsize 800 600', 'bonddisplay never #0', 'shape tube #0 radius 10 bandLength 200 segmentSubdivisions 100 followBonds on', 'clip yon -500', '~label', 'set subdivision 1', 'set depth_cue', 'set dc_color black', 'set dc_start 0.5', 'set dc_end 1', 'scale 0.8']
    

    will return the default image (other commands can be passed to modified the final image/movie).

  • align (True) – show aligned models

  • radius (15) – radius for the chimera particles

  • kwargs – see pytadbit.utils.extraviews.plot_3d_model() or pytadbit.utils.extraviews.chimera_view() for other arguments to pass to this function. See also coloring function

walking_angle(models=None, cluster=None, steps=(1, 3), signed=True, savefig=None, savedata=None, axe=None, plot=True, error=False)[source]

Plots the angle between successive loci in a given model or set of models. In order to limit the noise of the measure angle is calculated between 3 loci, between each are two other loci. E.g. in the scheme bellow, angle are calculated between loci A, D and G.

Parameters
  • models (None) – if None (default) all models will be used for computation. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the angle only for the models in the cluster number ‘cluster’

  • steps ((1, 3)) – how many particles to group for the estimation. By default 2 curves are drawn

  • signed (True) – whether to compute the sign of the angle according to a normal plane, or not.

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • error (False) – represent the error of the estimates

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • plot (True) – e.g. if False, only saves data. No plotting done

  • savedata (None) – path to a file where to save the angle data generated (1 column per step + 1 for particle number).

Returns

values to be plotted (list of lists, one per window), and the same for upper and lower error lines

                      C..........D
                   ...            ...
                ...                 ...
             ...                       ...
  A..........B                            .E
 ..                                        .
.                                          .
                                           .                 .
                                           .                .
                                           F...............G
walking_dihedral(models=None, cluster=None, steps=(1, 3), span=(- 2, 1, 0, 1, 3), error=False, plot=True, savefig=None, axe=None, savedata=None)[source]

Plots the dihedral angle between successive planes. A plane is formed by 3 successive loci.

Parameters
  • models (None) – if None (default) the dihedral angle will be computed using all the models. A list of numbers corresponding to a given set of models can be passed

  • cluster (None) – compute the dihedral angle only for the models in the cluster number ‘cluster’

  • steps ((1, 3)) – how many particles to group for the estimation. By default 2 curves are drawn

  • signed (True) – whether to compute the sign of the angle according to a normal plane, or not.

  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • span ((-3, -1, 0, 1, 2)) – by default computes angle between a plane comprising the current residue (0), the residue after (1) and the residue 3 positions before (-3), and the plane with the current residue, the residue directly after (1) and the residue 3 positions after (3). e.g. In the scheme bellow it’s plane ACD vs plane CDF.

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • error (False) – represent the error of the estimates

  • plot (True) – e.g. if False, only saves data. No plotting done

  • savedata (None) – path to a file where to save the angle data generated (1 column per step + 1 for particle number).

Returns

values to be plotted (list of lists, one per window), and the same for upper and lower error lines

                      C..........D
                   ...            ...
                ...                 ...
             ...                       ...
  A..........B                            .E
 ..                                        .
.                                          .
                                           .                 .
                                           .                .
                                           F...............G
write_cmm(directory, model_num=None, models=None, cluster=None, color='index', rndname=True, infer_unrestrained=False, **kwargs)[source]

Save a model in the cmm format, read by Chimera (http://www.cgl.ucsf.edu/chimera).

Note: If none of model_num, models or cluster parameter are set, ALL the models will be written.

Parameters
  • directory – location where the file will be written (note: the name of the file will be model_1.cmm if model number is 1)

  • model_num (None) – the number of the model to save

  • models (None) – a list of numbers corresponding to a given set of models to save

  • cluster (None) – save the models in the cluster number ‘cluster’

  • rndname (True) – If True, file names will be formatted as: model.RND.cmm, where RND is the random number feed used by IMP to generate the corresponding model. If False, the format will be: model_NUM_RND.cmm where NUM is the rank of the model in terms of objective function value

  • color ('index') –

    can be:

  • kwargs – any extra argument will be passed to the coloring function

write_json(filename, color='index', models=None, cluster=None, title=None, infer_unrestrained=False, **kwargs)[source]

Save a model in the json format, read by TADkit.

Note: If none of model_num, models or cluster parameter are set, ALL the models will be written.

Parameters
  • directory – location where the file will be written (note: the name of the file will be model_1.cmm if model number is 1)

  • model_num (None) – the number of the model to save

  • rndname (True) – If True, file names will be formatted as: model.RND.cmm, where RND is the random number feed used by IMP to generate the corresponding model. If False, the format will be: model_NUM_RND.cmm where NUM is the rank of the model in terms of objective function value

  • filename (None) – overide the default file name writing

  • kwargs – any extra argument will be passed to the coloring function

  • infer_unrestrained (False) – infer unrestrained particle coordinates by linear interpolation using closest particles with restraints

write_xyz(directory, model_num=None, models=None, cluster=None, get_path=False, rndname=True)[source]

Writes a xyz file containing the 3D coordinates of each particle in the model.

Note

If none of model_num, models or cluster parameter are set, ALL the models will be written.

Parameters
  • directory – location where the file will be written (note: the file name will be model.1.xyz, if the model number is 1)

  • model_num (None) – the number of the model to save

  • models (None) – a list of numbers corresponding to a given set of models to be written

  • cluster (None) – save the models in the cluster number ‘cluster’

  • rndname (True) – If True, file names will be formatted as: model.RND.xyz, where RND is the random number feed used by IMP to generate the corresponding model. If False, the format will be: model_NUM_RND.xyz where NUM is the rank of the model in terms of objective function value

  • get_path (False) – whether to return, or not, the full path where the file has been written

write_xyz_babel(directory, model_num=None, models=None, cluster=None, get_path=False, rndname=True)[source]

Writes a xyz file containing the 3D coordinates of each particle in the model using a file format compatible with babel (http://openbabel.org/wiki/XYZ_%28format%29). .. note:

If none of model_num, models or cluster parameter are set,
ALL the models will be written.
Parameters
  • directory – location where the file will be written (note: the file name will be model.1.xyz, if the model number is 1)

  • model_num (None) – the number of the model to save

  • models (None) – a list of numbers corresponding to a given set of models to be written

  • cluster (None) – save the models in the cluster number ‘cluster’

  • rndname (True) – If True, file names will be formatted as: model.RND.xyz, where RND is the random number feed used by IMP to generate the corresponding model. If False, the format will be: model_NUM_RND.xyz where NUM is the rank of the model in terms of objective function value

  • get_path (False) – whether to return, or not, the full path where the file has been written

zscore_plot(axe=None, savefig=None, do_normaltest=False)[source]

Generate 3 plots. Two heatmaps of the Z-scores used for modeling, one of which is binary showing in red Z-scores higher than upper cut-off; and in blue Z-scores lower than lower cut-off. Last plot is an histogram of the distribution of Z-scores, showing selected regions. Histogram also shows the fit to normal distribution.

Parameters
  • axe (None) – a matplotlib.axes.Axes object to define the plot appearance

  • savefig (None) – path to a file where to save the image generated; if None, the image will be shown using matplotlib GUI (the extension of the file name will determine the desired format).

  • do_normaltest (False) – to display the result of a test of normality (D’Agostino and Pearson’s test, that combines skew and kurtosis).