I am new to the study of chromatin, and I am trying to understand what people mean when they write about chromatin states, chromatin interactions, and chromatin structures.
What are the differences among these? Is there someone available to explain them to me clearly?
Are loops interactions or structures?
Chromatin state refers to the marks (i.e. methylated DNA, histone modifications, euchromatin v. heterochromatin) found at specific loci and is often referred to as open (readily bound by DNA binding proteins, like transcription factors) or closed (inaccessible to most factors).
Chromatin interactions can include the binding of transcription factors and other DNA binding proteins, but the term is usually used to describe local and long-distance functional contacts between two or more regions of chromatin. These interactions are generated actively through protein-dependent chromatin organization and passively via polymer physics and the nuclear environment.
Chromatin structures can refer to the actual molecular shape and composition of the DNA or to the molecular modifications found at specific regions of chromatin (e.g. heterochromatin, DNA methylation). It can also refer to the ordered compaction of mitotic/meiotic chromosomes.
Chromatin loops are structures that may act in promoting regulatory interactions. Conversely, chromatin interactions may induce looping.
Cavalli and Misteli, Functional Implications of Genome Topology, Nature Structural and Molecular Biology 2013
Chromatin is a highly organized complex of DNA and proteins and is a principal component of the cell nucleus. Histone proteins help organize DNA into structural units called nucleosomes, which are then assembled into a compact structure (chromatin) and eventually into very large, high-order structures (chromosomes). The localized accessibility of chromatin is largely regulated by posttranslational modifications of both histone proteins and DNA, which have dramatic effects on the regulation of chromatin structure, binding of chromatin-modifying complexes, and transcriptional regulators. This article examines previously established and recently discovered roles of chromatin as a structural component of the nucleus and as a regulator of activation and repression of gene transcriptional activity.
Biochemical investigation of different states of chromatin and gene activity in cells
Sensitivity of chromatin to nucleases
A seminal observation in the correlation of gene activity with more accessible chromatin was the demonstration that transcriptionally active genes are found in chromatin that is more sensitive to DNases. Weintraub and Groudine showed in 1976 that the overall sensitivity of a gene to DNase I is increased about 3 to 10 fold over that of DNA in bulk chromatin, but only in tissues expressing the gene (Figure 4.6.8). Subsequent studies have shown this correlation for many genes in many tissues, but it is not seen in every case. Some genes are in accessible chromatin whether they are expressed or not. The reasons for these differences are being studied.
Figure 4.6.8. DNase I digestion of nuclei reduces the concentration of actively transcribed DNA. Adapted from Stalder et al. (1980) Cell 20:451-460.
The basic experimental approach was to measure the sensitivity of particular sequences to nuclease digestion in nuclei from expressing and nonexpressing tissues (Figure 4.6.8). For example, nuclei from chicken erythroid cells (avian red blood cells retain their nuclei, in contrast to mammals) and liver cells were digested separately with DNase I. Sufficient nuclease was added so that sensitive regions would be cut but the bulk of the DNA in chromatin was only lightly digested. Chromosomal proteins were then removed (proteinase K followed by phenol extraction) leaving purified DNA. The partially digested nuclear DNA was denatured and annealed to labeled gene-specific hybridization probes, and the appearance of the labeled probe in duplex with the nuclear DNA was monitored as a function of Cot (concentration of DNA × time - recall this from Part One of the course). DNA from partially digested liver nuclei annealed with the globin gene probe at a much lower Cot than did DNA from partially digested erythroid nuclei. This shows that the amount of globin gene DNA in erythroid nuclei is substantially reduced by the DNase I treatment, i.e. the globin gene is sensitive to DNase I in a cell that is expressing it.
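The shift in Cot can be illustrated with a small sketch. It assumes ideal second-order reassociation kinetics, where the fraction of probe still single-stranded is 1 / (1 + k × Cot), and it assumes that destroying part of the target sequence simply lowers its effective concentration. The rate constant and the surviving fractions below are made-up illustrative numbers, not values from the original experiments.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: fraction unannealed = 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)


def apparent_cot_half(k, surviving_fraction):
    """Cot (in units of total DNA) at which half the probe has annealed,
    when only `surviving_fraction` of the target sequence is still intact."""
    if not 0.0 < surviving_fraction <= 1.0:
        raise ValueError("surviving_fraction must be in (0, 1]")
    return 1.0 / (k * surviving_fraction)


K = 0.001  # hypothetical rate constant, arbitrary units

# Hypothetical scenario: the globin gene is intact in liver nuclei,
# but only 25% of it survives DNase I digestion in erythroid nuclei.
liver_cot_half = apparent_cot_half(K, 1.0)
erythroid_cot_half = apparent_cot_half(K, 0.25)

print(f"liver Cot(1/2):     {liver_cot_half:.0f}")
print(f"erythroid Cot(1/2): {erythroid_cot_half:.0f}")
```

Under these assumptions, the DNA from the tissue in which the gene was digested needs a proportionally higher Cot to reach the same extent of annealing, which is the direction of the shift described above.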
An important negative control is the annealing to a labeled ovalbumin gene probe, a gene that is not expressed in either liver or red cells (only oviduct). In this case, the DNA from partially digested nuclei from both tissues annealed with the same kinetics to the ovalbumin probe. Thus there is no gross over-digestion of the erythroid nuclei, and it is clear the globin gene is much less sensitive to nucleases in nonexpressing tissues.
Mapping the extent of the region around the gene that is accessible
The basic strategy is similar to that used above, but the nuclear DNA is monitored as a function of [DNase I], hybridization probes from outside the gene are used, and a blot-hybridization assay is employed (Figure 4.6.9). After obtaining the DNA from nuclei digested to increasing extents with DNase I, the DNA is digested to completion with restriction endonucleases, separated by size on an agarose gel, blotted to a membrane like nylon and hybridized with a radioactive probe from within the gene or from regions flanking the gene. Probes from within and immediately flanking the gene show a progressive loss of signal as the [DNase I] is increased in the initial digestion, hence the name "fade-out" experiments for these assays. Further away from the gene, once one is outside the open domain, the signal from the restriction fragments does not decline any faster than the negative control. The boundaries of the open domain lie outside the fragments that show sensitivity but inside the fragments that show insensitivity.
Figure 4.6.9. DNase I digestion of nuclei preferentially cuts restriction endonuclease fragments containing actively transcribed DNA. Adapted from Stalder et al. (1980) Cell 20:451-460, Figure 2.
In the case of the human β-like globin gene cluster (see below), the region of insensitivity begins over 60 kb 5' to the β-globin gene and over 100 kb 3' to it. In other cases, e.g. the chicken lysozyme gene, the entire domain is about 20 kb in size and has a single gene.
The structural basis for the increased sensitivity to digestion by DNase I in cells is not firmly established. It is often interpreted as being the result of unfolding in higher order structure. One possibility is that DNA that is sensitive over a broad region is in the 10 nm fiber (a linear string of nucleosomes), whereas insensitive regions may be in a 30 nm fiber, which is thought to be a solenoid of nucleosomes. However, some genes in the 30 nm fiber may be active, and inactivation may correspond to a higher order compaction, or assembly of a silencing structure.
The extended regions of general DNase sensitivity are thought to define a functional domain in chromatin. It may correspond to a large loop of chromatin (e.g. 100 kb or more) (Figure 4.6.10).
Figure 4.6.10. Regions of general DNase sensitivity may correspond to "lampbrush" chromosome-like loops or domains. Adapted from Stalder et al., 1980, Cell 20:451
DNase hypersensitive sites
Specific, short regions (usually about 100 to 200 bp) are about 100 times more sensitive than bulk DNA in nuclei. Because DNase I cuts frequently in this short region, it generates a double-stranded break at this hypersensitive site (abbreviated HS). This produces a new band on a genomic blot-hybridization assay (Figure 4.6.11).
The technique employed, called "indirect end labeling", is a modification of the "fade-out" experiment described in Figure 4.6.9 above, and it is used to detect HSs. As in the previous assays, nuclei are digested with increasing amounts of DNase I, DNA is purified and cleaved with a restriction endonuclease and the region of interest analyzed by genomic blot-hybridization (Southern blot). By using a radioactive probe from one end of the restriction fragment that is being detected on the genomic blot-hybridization assay (instead of the larger probes used in the previous assays), one can resolve the new fragments generated by cleavage by DNase I at a HS. The size of the new fragment tells you the position of the HS. For example, a new 5 kb fragment would mean that a HS is located 5 kb away from the restriction endonuclease cleavage site that is closest to the probe used in the assay.
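The arithmetic behind indirect end labeling can be written out as a minimal sketch. The function name, the coordinate convention (positions in kb, with the restriction site next to the probe as the anchor) and the example band sizes are all hypothetical and chosen only for illustration.

```python
def map_hs_sites(anchor_kb, sub_band_sizes_kb, parent_fragment_kb, direction=1):
    """Convert sub-band sizes from an indirect end-labeling blot into HS positions.

    anchor_kb          -- coordinate of the restriction site closest to the probe
    sub_band_sizes_kb  -- sizes of the new bands produced by DNase I cleavage
    parent_fragment_kb -- size of the intact restriction fragment
    direction          -- +1 if the fragment extends to higher coordinates
                          from the anchor, -1 if it extends to lower ones
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    positions = []
    for size in sorted(sub_band_sizes_kb):
        # A sub-band must be shorter than the parent fragment it derives from.
        if not 0 < size < parent_fragment_kb:
            raise ValueError(f"sub-band of {size} kb does not fit the parent fragment")
        positions.append(anchor_kb + direction * size)
    return positions


# Hypothetical blot: a 12 kb parent fragment and three DNase-dependent sub-bands.
sites = map_hs_sites(anchor_kb=0.0, sub_band_sizes_kb=[5.0, 2.0, 8.5],
                     parent_fragment_kb=12.0)
print(sites)
```

Each sub-band size is read directly as a distance from the anchoring restriction site, so a 5 kb band places a HS 5 kb from that site, as in the example in the text.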
Figure 4.6.11. Indirect end-labeling assay maps DNase hypersensitive sites. This example uses indirect end-labeling to detect DNase HSs in gamma globin genes. Adapted from Groudine et al. (1983) PNAS 80:7551-7555.
This approach can reveal multiple hypersensitive sites (Figure 4.6.12) as well as a single site.
Figure 4.6.12. Example of results from an indirect end labeling assay. This experiment maps three DNase HSs in the human beta-globin locus control region (see Section E of this chapter). Data from H. Petrykowska.
General properties of DNase HSs in chromatin
(1) HSs are free of nucleosomes, or the nucleosomes are highly disrupted. E.g. the SV40 control region is a HS, and visualization in the EM shows that SV40 minichromosomes do not have nucleosomes in this region.
(2) DNA sequences that are in HSs in chromatin are frequently involved in gene regulation. Examples are promoters, enhancers, silencers and LCRs. Matrix and scaffold attachment regions (MARs and SARs) are also hypersensitive to DNase I.
(3) Investigation of the HSs shows that they have multiple sites for binding transcription factors (as expected for promoters, enhancers, silencers, etc.) or other regulatory or structural proteins (e.g. MARs binding topoisomerase II).
(4) The basic idea is that the DNA can be occupied by specific binding factors (when the gene is being transcribed) or it can be wrapped into nucleosomes. In most (but not all) cases these are mutually exclusive options. The DNA is not hypersensitive to DNase I cleavage when it is in nucleosomes. The coverage of the DNA by the transcription factors is not complete and still allows cleavage by DNase I between the bound factors.
(5) The DNase HSs are landmarks for gene regulatory sequences.
Chromatin structure changes during various processes from a DNA sequence view
The DNA sequence dependence of chromatin 3D structure formation.
Enhanced chromatin domain segregation from birth to senescence and establishment of cell identity through cross-domain contacts.
A unified model for chromatin structure changes during development, differentiation, senescence, and tumorigenesis.
Chromatin mainly consists of protein and DNA, and the sequence information of DNA contributes to controlling the spatial structure of chromatin. Genome-wide contact patterns of chromosomes at high precision uncover fine structural properties, conducive to exploring the mechanisms that underlie how chromatin structure is established and how its function is realized. In this short review, we describe changes of chromatin structure during various biological processes from a DNA sequence view, with an increase of the overall domain segregation from birth to senescence and the establishment of cross-domain contacts related to cell identity. Segregation patterns vary with cell stage and genomic distance. Possible effects of the cell cycle, temperature, nuclear lamina and nucleolus on chromatin structure are also discussed. Finally, the important roles of transcription factors and other proteins in proper chromatin organization are discussed.
Predictive modeling of chromatin organization
We introduce a predictive model to study cell-type specific 3D chromatin folding. This model takes a sequence of chromatin states derived from genome-wide histone modification profiles and a list of CTCF binding sites as input. We selected these genomic features due to their known roles in organizing the chromatin at various length scales (Fig 1A). At the core of this model is an energy function—a force field—that is sequence specific and ranks the stability of different chromatin conformations. Starting from the input for a given chromatin segment, we use molecular dynamics simulations to explore chromatin conformations dictated by the energy function and to predict an ensemble of high-resolution structures. These structures can be compared directly with super-resolution imaging experiments or converted into contact probability maps for validation against genome-wide chromosome conformation capture (Hi-C) experiments.
(A) Illustration of genome organization at various length scales that includes the formation of CTCF mediated chromatin loops, TADs, and compartments. (B) A schematic representation of the computational model that highlights the assignment of chromatin states and CTCF binding sites. Chromatin states for each bead—a 5kb long genomic segment—are derived from the combinatorial patterns of histone marks. They are shown in part (C) as a heat map with darker colors indicating higher probabilities of observing various marks.
As shown in Fig 1B, a continuous genomic segment is represented as beads on a string in this model. Each bead accounts for five kilobases in sequence length and is assigned a chromatin state derived from the underlying combinatorial patterns of 12 key histone marks. Chromatin states are known to be highly correlated with Hi-C compartment types [39,54,66] and, therefore, will help model large-scale chromosome compartmentalization. In the meantime, chromatin states can go beyond traditional A/B compartments or subcompartments to provide polymer models with the specificity needed for studying interactions between regulatory elements. We define a total of 15 chromatin states, identified using a hidden Markov model, to distinguish promoters, enhancers, heterochromatin, quiescent chromatin, etc. (see Methods). Detailed histone modification patterns for these chromatin states are shown in Fig 1C. We note that 15 is large enough to capture the diversity of epigenetic modifications while still being small enough to ensure a sufficient population of each state for a robust inference of interaction parameters between them (Figure A1 in S1 Supporting Information). We further studied a hidden Markov model with 20 states, and found that further increasing the number of states does not lead to a discovery of additional epigenetic classes with significant populations (Figure A2 in S1 Supporting Information). A polymer bead is further labeled as a CTCF site to mark chromatin loop boundaries if both CTCF and cohesin molecules are found to be present in the corresponding genomic region. We define the orientation of these CTCF sites by analyzing the underlying CTCF motif and the relative position of CTCF molecules with respect to cohesin. Details for the definition of CTCF binding sites are provided in Methods.
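The bead-on-a-string input described above could be represented roughly as follows. This is a minimal sketch, not the authors' code: the `Bead` class, the `build_beads` function and the input formats are hypothetical, the orientation is assumed to be supplied with each CTCF site rather than derived from motif analysis, and a bead holding several CTCF sites simply takes the orientation of the first one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional

BEAD_SIZE_BP = 5_000  # each bead covers 5 kb, as stated in the text


@dataclass
class Bead:
    index: int
    start: int                               # inclusive, bp
    end: int                                 # exclusive, bp
    state: str                               # chromatin state label
    ctcf_orientation: Optional[str] = None   # '+' or '-', None if not a CTCF bead


def build_beads(region_start: int,
                state_calls: List[str],
                ctcf_sites: Dict[int, str],
                cohesin_positions: Iterable[int]) -> List[Bead]:
    """Assign one chromatin state per 5 kb bead and flag CTCF beads.

    A bead is labeled as a CTCF site only if both a CTCF site and a
    cohesin position fall inside it.
    """
    cohesin = list(cohesin_positions)
    beads = []
    for i, state in enumerate(state_calls):
        start = region_start + i * BEAD_SIZE_BP
        end = start + BEAD_SIZE_BP
        has_cohesin = any(start <= p < end for p in cohesin)
        orientations = [o for p, o in sorted(ctcf_sites.items()) if start <= p < end]
        orientation = orientations[0] if (orientations and has_cohesin) else None
        beads.append(Bead(i, start, end, state, orientation))
    return beads


# Toy input: three beads, two CTCF sites, cohesin present only near the first.
beads = build_beads(
    region_start=0,
    state_calls=["PromD1", "EnhW1", "EnhW1"],
    ctcf_sites={1_200: "+", 11_000: "-"},
    cohesin_positions=[1_500],
)
for bead in beads:
    print(bead)
```

In this toy input the third bead contains a CTCF site but no cohesin, so it is not labeled as a loop boundary.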
The potential energy for a given chromatin configuration r is a sum of three components: UChrom(r) = U(r) + UCS(r) + UCTCF(r). U(r) is a generic polymer potential that is included to ensure the continuity of the chromatin, and to enforce excluded-volume effects among genomic loci. UCS(r) is a key innovation of the chromatin model, and is crucial to capture the formation of TADs and compartments. It quantifies the chromatin state specific interaction energies between pairs of loci. As detailed in Section: Physical principles of chromatin organization and Methods, we used a general form for UCS(r) to capture its dependence on genomic separation. UCTCF(r) is inspired by the loop extrusion model [29–31], and facilitates the formation of loop domains enclosed by pairs of CTCF binding sites in convergent orientation (Fig 1A). Both UCS(r) and UCTCF(r) contain adjustable parameters that can be derived from Hi-C data following the optimization procedure developed by one of the authors [64,65]. Segments of chromosomes 1, 10, 19 and 21 from GM12878 cells were used for parameterization to ensure a sufficient coverage of all chromatin states (see Figure A1 in S1 Supporting Information). Detailed expressions for the potential energy, and the parameterization procedure are provided in Methods and in the S1 Supporting Information.
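To make the three-term decomposition concrete, here is a toy sketch of an energy function with the same structure. The functional forms are assumptions chosen for simplicity (harmonic bonds, a soft repulsion, and a tanh contact switch) and are not the expressions given in the paper's Methods. All parameter values are arbitrary, and the dependence of UCS(r) on genomic separation mentioned in the text is omitted.

```python
import math


def contact_fn(d, rc=1.5, sharpness=4.0):
    """Smooth switch: close to 1 when two beads are nearer than rc, close to 0 otherwise."""
    return 0.5 * (1.0 + math.tanh(sharpness * (rc - d)))


def u_polymer(coords, k_bond=10.0, b0=1.0, k_rep=5.0, sigma=0.8):
    """Generic polymer term: harmonic bonds plus a soft excluded-volume repulsion."""
    energy = 0.0
    n = len(coords)
    for i in range(n - 1):
        energy += 0.5 * k_bond * (math.dist(coords[i], coords[i + 1]) - b0) ** 2
    for i in range(n):
        for j in range(i + 2, n):
            d = math.dist(coords[i], coords[j])
            if d < sigma:
                energy += k_rep * (sigma - d) ** 2
    return energy


def u_chromatin_state(coords, states, eps):
    """State-specific contact energies; eps maps a sorted pair of state labels to an energy."""
    energy = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):
            pair = tuple(sorted((states[i], states[j])))
            energy += eps.get(pair, 0.0) * contact_fn(math.dist(coords[i], coords[j]))
    return energy


def u_ctcf(coords, loop_pairs, eps_loop=-2.0):
    """Attraction between bead pairs assumed to be convergent CTCF loop anchors."""
    return sum(eps_loop * contact_fn(math.dist(coords[i], coords[j]))
               for i, j in loop_pairs)


def u_chrom(coords, states, eps, loop_pairs):
    """Total energy as the sum of the three components."""
    return (u_polymer(coords)
            + u_chromatin_state(coords, states, eps)
            + u_ctcf(coords, loop_pairs))


# Toy comparison: a straight 4-bead chain versus a hairpin that closes a loop.
line = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
hairpin = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
states = ["PromD1", "EnhW1", "EnhW1", "PromD1"]
eps = {("EnhW1", "PromD1"): -0.5}   # hypothetical interaction energy
loops = [(0, 3)]                    # hypothetical convergent CTCF pair

print("line   :", u_chrom(line, states, eps, loops))
print("hairpin:", u_chrom(hairpin, states, eps, loops))
```

In a real model the parameters in `eps` and `eps_loop` would be fitted to Hi-C data, and an energy like this would be sampled with molecular dynamics rather than evaluated on hand-built coordinates.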
Using the parameterized energy function, we simulated the ensemble of chromatin structures and determined the corresponding contact probability map for a 20 Mb region of chromosome 1 from GM12878 cells. As shown in Fig 2A, the simulated contact map is in good agreement with the one measured by Hi-C experiments from Ref.  and reproduces the overall block-wise checkerboard pattern that corresponds to the compartmentalization of chromatin domains. A zoomed-in view along the diagonal of the contact map provided in Fig 2B and 2C further suggests that chromatin TADs and loops are also well reproduced. Similar comparisons for other chromosomes used in parameterizing the model are provided in Figure B in S1 Supporting Information. We note that the length 20 Mb was chosen for computational efficiency, but the model can be easily generalized to longer chromatin segments (see Figure C in S1 Supporting Information).
(A) Results from simulation and the Hi-C experiment performed in Ref.  are shown in the upper and lower triangle respectively on a log scale. Also shown on the left and top panels are the sequence of chromatin states and the genomic positions of CTCF binding sites. (B) A zoomed-in view of the contact maps along the diagonal region to highlight the formation of TADs. TAD boundaries detected using the software TADbit are plotted on the top of the contact map, with the simulation shown in cyan and experiment in grey. (C) Zoomed-in view of several representative regions along the diagonal to highlight the formation of chromatin loops. (D) A representative chromatin structure predicted by the computational model is drawn in a tube representation and colored by chromatin states. (E) The average contact probability as a function of the genomic separation is shown below on a log-log scale for the simulated (blue) and experimental (red) contact maps respectively.
To go beyond the visual inspection and quantify the correlation between simulated (GM-Sim) and experimental (GM-Exp) contact maps, we calculated the Pearson correlation coefficient (PCC) between the two for chromosome 1 and found that it exceeds 0.96. Importantly, this number is higher than the PCC (0.94) between GM-Sim and Hi-C data from IMR90 cells (IMR-Exp). Breaking down the PCC at different genomic separations also supports that GM-Sim is more correlated with GM-Exp at all ranges than with IMR-Exp (Figure D in S1 Supporting Information). In addition, we also determined the stratum-adjusted correlation coefficient (SCC) that takes into account the distance-dependence effect of contact maps by stratifying them according to the genomic distance, and obtained 0.7 for GM-Sim/GM-Exp, and 0.66 for GM-Sim/IMR-Exp. Therefore, SCC analysis also validates our model’s ability in reproducing Hi-C contact maps and in capturing the distinction between cell types. We note that the magnitude of SCC can be sensitive to the smoothing parameter used in its calculation and should be interpreted with caution (Figure E in S1 Supporting Information).
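The two simpler comparisons in this paragraph, a global PCC and a PCC broken down by genomic separation, can be sketched in plain Python. The function names and the toy matrices are hypothetical. The sketch does not compute the SCC itself, which additionally smooths the maps and combines the per-stratum correlations with weights.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(cmap, min_sep=1):
    """Flatten the upper triangle of a square contact map, skipping near-diagonal bins."""
    n = len(cmap)
    return [cmap[i][j] for i in range(n) for j in range(i + min_sep, n)]


def pcc_by_separation(map_a, map_b, max_sep):
    """PCC between two maps computed separately on each diagonal (genomic separation)."""
    n = len(map_a)
    result = {}
    for s in range(1, max_sep + 1):
        diag_a = [map_a[i][i + s] for i in range(n - s)]
        diag_b = [map_b[i][i + s] for i in range(n - s)]
        result[s] = pearson(diag_a, diag_b)
    return result


# Toy 4x4 contact maps; the second is the first scaled by a constant.
sim = [[1.0, 0.5, 0.2, 0.1],
       [0.5, 1.0, 0.6, 0.3],
       [0.2, 0.6, 1.0, 0.4],
       [0.1, 0.3, 0.4, 1.0]]
exp = [[2.0 * v for v in row] for row in sim]

global_pcc = pearson(upper_triangle(sim), upper_triangle(exp))
strata_pcc = pcc_by_separation(sim, exp, max_sep=2)
print(global_pcc, strata_pcc)
```

Because contact probability decays strongly with genomic distance, a global PCC is dominated by that decay, which is why the per-separation breakdown is a useful complement.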
We further examined the agreement between simulated and experimental contact maps using multiple feature-specific metrics. First, we define the contact enhancement for a pair of genomic loci as the ratio of their contact probability to the mean contact probability averaged over a locally selected background region (see Figure F1 in S1 Supporting Information). The contact enhancement for chromatin loops from chromosome 1 is always larger than one, indicating a strong enhancement of spatial colocalization between loop anchors. Furthermore, over 74% of the loop pairs exhibit a contact enhancement that is larger than the 90th percentile of the distribution for random genomic pairs. These random pairs are selected regardless of CTCF occupancy but with sequence separations comparable to those found in chromatin loops. Therefore, if we use the 90th percentile of the random distribution as a threshold (1.16) and predict every convergent CTCF pair as a loop, the prediction will have a false negative rate of 26%, and a false positive rate less than 10%. The false positive value is an upper bound since most of the random pairs are not flanked with convergent CTCF. The sensitivity of chromatin loop predictions to the threshold is shown in Figure F2 in S1 Supporting Information. It is worth pointing out that the contact enhancement for chromatin loops calculated using Hi-C data is in general larger than simulated values and separates better from that for random pairs (Figure F3 in S1 Supporting Information). The overlap between the two distributions in our simulation arises because the random pairs include a significant fraction of convergent CTCF pairs whose contacts are enhanced as a result of the potential UCTCF(r). Many of these pairs, however, are not recognized as loops in Hi-C, and more advanced algorithms than simple binding site orientations are probably needed to identify loop forming CTCF pairs.
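A rough sketch of the contact-enhancement metric and the percentile threshold follows. The shape of the background region is an assumption (a square window around the pair, excluding the pair itself), since the actual definition is given only in the supporting figure, and the numbers in the example are invented.

```python
def contact_enhancement(cmap, i, j, half_width=2):
    """Contact probability of (i, j) divided by the mean over a surrounding window."""
    n = len(cmap)
    background = [cmap[a][b]
                  for a in range(max(0, i - half_width), min(n, i + half_width + 1))
                  for b in range(max(0, j - half_width), min(n, j + half_width + 1))
                  if (a, b) != (i, j)]
    mean_bg = sum(background) / len(background)
    return cmap[i][j] / mean_bg if mean_bg > 0 else float("nan")


def percentile(values, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def false_negative_rate(loop_enh, random_enh, q=90.0):
    """Fraction of known loops whose enhancement does not exceed the random-pair threshold."""
    threshold = percentile(random_enh, q)
    missed = sum(1 for e in loop_enh if e <= threshold)
    return missed / len(loop_enh), threshold


# Toy contact map: uniform background with one enriched pixel at (2, 5).
toy_map = [[1.0] * 7 for _ in range(7)]
toy_map[2][5] = 3.0
enh = contact_enhancement(toy_map, 2, 5)

# Invented enhancement values for loops and for random pairs.
fnr, thr = false_negative_rate([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0, 2.5])
print(enh, fnr, thr)
```

By construction, about 10% of the random pairs exceed a threshold set at their own 90th percentile, which is where the upper bound on the false positive rate quoted in the text comes from.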
To go beyond CTCF mediated contacts and evaluate our model’s ability in reproducing strong interactions between genomic loci, we selected statistically significant contact pairs from simulated and experimental contact maps for chromosome 1 using the software Fit-Hi-C (Figure G in S1 Supporting Information). As a quantitative metric, we define the matching score as the percent of experimental pairs that can be found in the list extracted from simulation. The reverse matching score can be similarly defined as the percent of simulated pairs found in the experimental list. For the top 1000 chromatin contacts, the matching score is 46% and the reverse matching score is 52%. To examine specific interactions between regulatory elements, we performed a similar analysis by selecting the top 100 enhancer (state: EnhW1)-promoter (state: PromD1) pairs with highest contact probabilities based on simulated and experimental contact maps. We find that over 70% of experimental pairs are captured in our simulation for chromosome 1. These results suggest that our model based on chromatin states and CTCF-mediated interactions is able to reproduce a large fraction of significant contacts detected in Hi-C experiments. Further improving the model’s ability in predicting functionally important pairs would potentially require considering the effect of other proteins, such as YY1, that are known to mediate chromatin interactions, and will be an interesting future direction.
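The matching score and its reverse reduce to set membership, as in the sketch below. The `tolerance` argument (how many bins two anchors may differ and still count as the same pair) is an added assumption; with the default of zero, pairs must match exactly. The pair lists are invented.

```python
def _normalize(pair):
    """Order a pair of bin indices so that (i, j) and (j, i) compare equal."""
    i, j = pair
    return (i, j) if i <= j else (j, i)


def matching_score(reference_pairs, query_pairs, tolerance=0):
    """Percent of reference pairs that are also present in query_pairs.

    Calling it with (experimental, simulated) gives the matching score;
    swapping the arguments gives the reverse matching score.
    """
    reference = {_normalize(p) for p in reference_pairs}
    query = {_normalize(p) for p in query_pairs}
    if not reference:
        raise ValueError("reference_pairs must not be empty")

    def has_match(p):
        return any(abs(p[0] - q[0]) <= tolerance and abs(p[1] - q[1]) <= tolerance
                   for q in query)

    hits = sum(1 for p in reference if has_match(p))
    return 100.0 * hits / len(reference)


# Invented lists of significant contacts, given as pairs of bin indices.
experimental = [(1, 10), (5, 20), (7, 30), (40, 8)]
simulated = [(10, 1), (5, 20), (9, 50)]

forward = matching_score(experimental, simulated)
reverse = matching_score(simulated, experimental)
print(f"matching score: {forward:.1f}%, reverse: {reverse:.1f}%")
```

The two scores differ whenever the lists have different lengths or different unmatched entries, which is why the text reports both.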
We next determined the correlation coefficients between the top five eigenvectors for simulated and experimental contact matrices. As shown in Figure H in S1 Supporting Information, the contact maps reconstructed using only these eigenvectors recapitulate the formation of TADs and compartments observed in the original maps. The high correlation between simulated and experimental eigenvectors (with PCC at approximately 0.8) supports that the corresponding features are well captured by the computational model, and confirms the qualitative observations from Fig 2 and Figure B in S1 Supporting Information.
To more closely examine the quality of simulated TADs, we calculated the insulation profile by sliding a uniform 500kb × 500kb square along the diagonal of the contact matrix and averaging over all contacts within the square. The minima of this profile can be used to identify TAD boundaries as inter-TAD contacts are sparser compared to intra-TAD contacts, resulting in a drop in the insulation score profile as the sliding window crosses TAD boundaries. The PCC between experimental and simulated insulation profiles for chromosome 1 is 0.7. We find that the matching score for TAD boundaries is 80%, and the reverse matching score is 100%. As another independent validation, we determined TAD boundaries using the software TADbit, and found that the simulated results again match well with experimental ones (see Figure I in S1 Supporting Information).
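The sliding-square calculation can be sketched as follows. The placement of the square is an assumption: for each bin it covers contacts between the `w` bins upstream and the `w` bins downstream, as in common insulation-score implementations. With 5 kb bins, a 500 kb square would correspond to `w = 100`; the toy example uses a much smaller map and window.

```python
def insulation_profile(cmap, w):
    """Mean contact between the w bins upstream and the w bins downstream of each bin.

    Bins too close to either end to fit the full square get None.
    """
    n = len(cmap)
    profile = [None] * n
    for i in range(w, n - w):
        total = sum(cmap[a][b]
                    for a in range(i - w, i)
                    for b in range(i + 1, i + w + 1))
        profile[i] = total / (w * w)
    return profile


def local_minima(profile):
    """Indices where the profile dips below its left neighbor and not above its right."""
    minima = []
    for i in range(1, len(profile) - 1):
        left, center, right = profile[i - 1], profile[i], profile[i + 1]
        if None in (left, center, right):
            continue
        if center < left and center <= right:
            minima.append(i)
    return minima


def block_map(tad_sizes, intra=1.0, inter=0.1):
    """Toy contact map: high contact within each TAD, low contact between TADs."""
    labels = [t for t, size in enumerate(tad_sizes) for _ in range(size)]
    return [[intra if la == lb else inter for lb in labels] for la in labels]


toy = block_map([6, 6])               # two TADs of six bins each
profile = insulation_profile(toy, w=2)
boundaries = local_minima(profile)
print(profile)
print(boundaries)
```

On this toy map the profile dips where the square straddles the junction of the two blocks, and the reported minimum lands at the last bin of the first TAD.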
To demonstrate the transferability of the computational model across chromosomes and cell types, we performed additional simulations for chromosomes from GM12878, K562, and HeLa cells, whose Hi-C data were not included during the parameterization procedure. As shown in Fig 3 and Figure J in S1 Supporting Information, these de novo predictions are in good agreement with experimental results as measured by PCC (Fig 3B) and SCC (Fig 3C) between experimental and simulated contact maps, matching score between TAD boundaries detected from the insulation profile (Fig 3D) and from TADbit (Figure K1A in S1 Supporting Information), PCC between experimental and simulated insulation profiles (Figure K1D in S1 Supporting Information), matching score between significant contacts detected using Fit-Hi-C (Fig 3E), matching score between interacting enhancer-promoter pairs (Figure K2C in S1 Supporting Information), correlation coefficients of the top five eigenvectors (Fig 3F and Figure H in S1 Supporting Information), and false negative rate of loop predictions (Fig 3G). Furthermore, the model succeeds in revealing the cell-type specificity of Hi-C contact maps, and the simulated contact maps are always more correlated with the corresponding experimental data from the same cell type than with those from IMR90 cells (light colors in Fig 3B and 3C). The matching scores between experimental and simulation results are also significantly higher than those calculated between experimental and control data (light colors in Fig 3D and 3E), which were obtained by randomly shuffling the size of loops/enhancer-promoter pairs/TADs along the chromosome while keeping their total number unchanged. The success of these de novo predictions supports that the chromatin-state-based model introduced here provides a consistent description of the 3D genome organization across cell types.
(A) Comparison between simulated (Top right) and experimental (Bottom left) contact maps for chromosome 2 from GM12878 (Left), K562 (Middle), and HeLa cells (Right). (B-E) Quality of computational predictions for all chromosomes from the three cell types measured by Pearson (PCC) and stratum-adjusted correlation coefficients (SCC) between simulated and experimental contact maps (B,C), matching score for TAD boundaries detected from insulation profiles (D), and matching score for the top 1000 significant contacts (E). Each data point represents one chromosome. Data shown as light colors in (B,C) correspond to PCC/SCC between simulated and IMR90 experimental contact maps, while those in (D,E) correspond to matching scores between experimental and control data. The boxes represent the 25% and 75% quantiles of the matching score distribution, and the thick line inside each box corresponds to the median value. Whiskers indicate the last values that fall within 1.5 times the interquartile range. (F) Average correlation coefficients between the top five eigenvectors for the logarithm of contact matrices for all the three cell types. Error bars correspond to standard deviations of the results for all chromosomes. (G) False negative rates for predicting chromatin loops identified in Hi-C data with convergent CTCF binding sites in different cell types.
Structural characterization of chromatin organization
We next analyze the simulated 3D structural ensembles to gain additional insights on chromatin organization. Consistent with previous experimental and theoretical studies [37,72,73], our model reproduces the clustering of active chromatin states and their preferred location at the exterior of chromosomes (Figure L in S1 Supporting Information).
Super-resolution imaging experiments probe chromatin organization in 3D space to quantify spatial distances between genomic segments. These 3D measurements can be compared directly with simulated chromatin structures, and thus provide a crucial validation of the computational model parameterized from Hi-C experiments with independent datasets. To understand the overall compactness of various chromatin types, we selected a set of active, repressive and inactive chromatin domains and determined their radii of gyration from the ensemble of simulated structures. These different chromatin types are identified using two key histone marks, H3K4me2 and H3K27me3 (Fig 4A). The complete list of chromatin domains with their genomic locations is provided in the Extended Data Sheet. As shown in Fig 4B, the radius of gyration increases at larger genomic separation following a power law behavior in all cases with exponents of 0.34, 0.31 and 0.23 for the three chromatin types respectively. These scaling exponents are in quantitative agreement with imaging measurements performed for Drosophila chromosomes and support the notion that active chromatin adopts less condensed conformations to promote gene activity. Consistent with the imaging study performed on chromosome 21 from IMR90 cells [13,20], we also observe a strong correlation between Hi-C contact probabilities and spatial distances for pairs of genomic loci (Fig 4C).
(A) Characteristic histone modification profiles for repressive, active and inactive chromatin. (B) The sizes of repressive (blue), active (orange) and inactive (green) chromatin domains, as measured by their radii of gyration, are plotted as a function of the genomic separation on a log scale. The straight lines correspond to numerical fits of the data with a power-law expression R = R_0 L^α, with the values of α shown in the legend. Representative structures of 500kb in length for the three chromatin types are shown in the inset. Error bars correspond to standard deviations of structures from the entire simulated ensemble. (C) Scatter plot of the contact probabilities between pairs of genomic loci versus their spatial distances shown on a log-log scale. The black line is the best fit to the data using the expression P = P_0 r^β, with β = −4.18.
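A power-law fit of this kind is commonly done as a straight-line least-squares fit in log-log space. The sketch below assumes that approach, since the fitting procedure is not specified here. The data are synthetic: they are generated from an exact power law using an exponent quoted in the text (0.34) and an arbitrary prefactor, purely to show that the fit recovers them.

```python
import math


def fit_power_law(x, y):
    """Fit y = prefactor * x**exponent by linear least squares on log(x), log(y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two (x, y) points of equal length")
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Synthetic data: genomic lengths (arbitrary units) and radii from an exact power law.
lengths = [10.0, 20.0, 50.0, 100.0, 200.0]
radii = [0.5 * length ** 0.34 for length in lengths]   # 0.5 is an arbitrary prefactor

r0, alpha = fit_power_law(lengths, radii)
print(f"R_0 = {r0:.3f}, alpha = {alpha:.3f}")
```

The same function applies to the contact-probability fit in panel (C) by passing spatial distances as `x` and contact probabilities as `y`, in which case the fitted exponent is negative.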
One of the most striking features revealed by high-resolution Hi-C experiments is the formation of chromatin loops anchored at pairs of convergent CTCF sites [7,10,74,75]. Microscopy studies that directly visualize 3D distances using fluorescence in situ hybridization (FISH) methods further find that these loops are dynamic, and despite their high contact frequencies, loop anchors are not in close contact in every cell [16,41,76]. Consistent with their dynamic nature, chromatin loops in our simulation adopt flexible conformations as well. As shown in Fig 5A, for the loop formed between chr1:39.56–39.73 Mb, we observe a large variance in the probability distribution of its end-to-end distances. Additional results for other loop pairs are provided in Figure M in S1 Supporting Information. Two example configurations of the loop domain with distance at 0.08 and 0.24 μm are shown in the inset. A systematic characterization of all the loops identified in Ref.  for the simulated chromatin segment shows that the conformational flexibility is indeed general, though there is a trend of decreasing variance for loops with larger contact probabilities (Fig 5B). We also emphasize that though higher contact probabilities, in general, correspond to smaller end-to-end distances, their relationship is not strictly monotonic. The opposite correlation can be seen in numerous cases in Fig 5B. Such seemingly paradoxical observations have indeed been found in previous experimental studies that compare 3C with FISH experiments [16,77], and can naturally arise as a result of dynamical looping or loop extrusion.
(A) Probability distribution of the end-to-end distance for the chromatin loop formed between chr1:39.56 Mb and chr1:39.73 Mb from GM12878 cells (blue) and for a random genomic pair (yellow). Two example configurations that correspond to open and closed chromatin loop structures are shown in the inset. (B) End-to-end distances of chromatin loops versus their corresponding contact probabilities. The shaded areas represent the variances in distances estimated from the simulated structural ensemble.
Compared to chromatin loops, TADs are longer and are stabilized by a complex set of interactions . The analysis of their structural ensemble is less straightforward, and the end-to-end distance may not be sufficient for a faithful description of their conformational fluctuation . It is desirable to analyze TAD structures using reaction coordinates that not only help to distinguish different clusters of chromatin conformations, but can also provide insight into the mechanism of TAD folding and formation. Borrowing ideas from protein folding studies, we approximate these reaction coordinates using collective variables with slowest relaxation timescales as determined following the diffusion map analysis [81,82]. Progression along these variables approximates well the most likely transition between two sets of structures and can, therefore, shed light on the pathway for conformational rearrangements. Diffusion map analysis has been successfully applied to a variety of systems to provide mechanistic insights on the conformational dynamics involved in protein folding, ligand diffusion, etc. [83,84].
We applied the diffusion map technique to the predicted structural ensemble of the genomic region chr1:34–38 Mb from GM12878 cells that consists of three visible TADs. As shown in Fig 6, several basins are observed in the probability distribution of chromatin conformations projected onto the first two reaction coordinates, suggesting the presence of multiple stable TAD structures, rather than a unique one. Conformational heterogeneity in TADs has indeed been observed in a recent super-resolution imaging study that characterizes single cell chromatin structures . To gain physical intuition on the reaction coordinates and insight on the transition between TAD structures, we calculated the corresponding contact maps at various values of these coordinates. As shown in the top panel, reaction coordinate one captures the formation of contacts between TAD1 and TAD3 while the structures for all three TADs remain relatively intact. On the other hand, progression along reaction coordinate two (left panel) leads to significant overlaps between TAD1 and TAD2. Interaction between TAD2 and TAD3 can also be observed along a third coordinate as shown in Figure N in S1 Supporting Information. Example structures for the three TADs in various regions are also provided on the right panel. These results are consistent with the notion that TADs are stable structural units for genome organization , but also suggest the presence of significant cross-talk among neighboring TADs .
(Center) Free energy profile of TAD conformations projected onto two coordinates that describe the slowest collective motions. The (Left) and (Top) panels illustrate the change in contact maps along the two coordinates. (Right) Representative structures for the chromatin segment at various positions indicated in the central and bottom panels. The three contact maps for reaction coordinate 1 were calculated using chromatin structures that fall into the regions [−2.5, −0.5), [−0.5, 0.5) and [0.5, 1.5). The three regions used to determine the contact maps for reaction coordinate 2 are [−2.5, −1.0), [−1.0, 1.5) and [1.5, 3.5).
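The idea behind a diffusion-map coordinate can be sketched on a toy data set. The example below is not the authors' pipeline: it assumes a Gaussian kernel with a hypothetical bandwidth `epsilon`, uses one-dimensional points in place of chromatin conformations, and finds only the first nontrivial coordinate by power iteration. The function name is likewise an assumption.

```python
import math

def first_diffusion_coordinate(points, epsilon=1.0, n_iter=500):
    """Return the first nontrivial diffusion coordinate for 1D points."""
    n = len(points)
    # Gaussian kernel on pairwise distances (the kernel choice is an assumption).
    K = [[math.exp(-((points[i] - points[j]) ** 2) / epsilon) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in K]
    # Symmetric normalization S = D^(-1/2) K D^(-1/2); it shares its
    # eigenvalues with the row-stochastic Markov matrix D^(-1) K.
    S = [[K[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The trivial top eigenvector of S (eigenvalue 1) is proportional to sqrt(deg).
    norm0 = math.sqrt(sum(deg))
    v0 = [math.sqrt(d) / norm0 for d in deg]
    # Power iteration, projecting out v0 at every step to reach the next mode.
    v = [float(i) for i in range(n)]
    for _ in range(n_iter):
        v = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        dot = sum(a * b for a, b in zip(v, v0))
        v = [a - dot * b for a, b in zip(v, v0)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Convert back to a right eigenvector of the Markov matrix.
    return [v[i] / math.sqrt(deg[i]) for i in range(n)]

# Two well-separated toy clusters standing in for two conformational basins.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
coords = first_diffusion_coordinate(points)
```

The slowest mode takes one sign on each cluster, so moving along it corresponds to the transition between the two basins. This is the sense in which such coordinates approximate reaction coordinates.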
Physical principles of chromatin organization
Though the exact molecular mechanism and driving force for chromatin folding remain elusive, it is becoming increasingly clear that different molecular players are involved in organizing the chromatin at various length scales [49,60,86,87]. For example, transcription factors and architectural proteins are critical in stabilizing the formation of chromatin loops and TADs [4,33,79]. On the other hand, nuclear compartments, such as the nucleolus and the nuclear envelope, contribute to chromatin compartmentalization and mediate contacts among chromatin domains separated by tens of Mb in sequence [50,88]. We expect that these different molecular mechanisms will give rise to distinct interaction energies at various genomic length scales. For example, for the same pair of chromatin states, as the genomic separation between them is varied, the interaction energy that stabilizes their contact should vary. In the following, we examine the dependence of inferred contact energies on genomic separation to reveal the principles of genome organization.
Fig 7A presents the derived contact energies among chromatin states UCS(r) at various genomic separations (500kb, 1.5 Mb, 4 Mb and 10 Mb from left to right), with blue and red for attractive and repulsive interactions respectively. A notable feature for all four length scales is the clear partition of chromatin states into at least two groups that correspond to well-known active and repressive chromatins respectively. For example, attractive interactions are observed among the top half chromatin states that include promoters (PromD1, PromU), enhancers (TxEnh5, Enhw1) and gene body (Tx), and for the bottom half that includes inactive chromatin (Quies), polycomb repressed domain (ReprPC) and heterochromatin (Het). The unfavorable interactions among active and repressive chromatins will drive their phase separation shown in Fig 2D and Figure L in S1 Supporting Information. Partitioning of chromatin states into active and inactive groups is also evident from the dendrogram shown in Fig 7B, and the eigenvectors for the largest in magnitude eigenvalue of the interaction matrices shown in Fig 7C.
(A) Heat maps for the interaction matrices at various genomic separations, with blue and red corresponding to attractive and repulsive interactions respectively. We subtracted out the mean of the interaction energies in order to shift different plots to the same scale. (B) Dendrogram calculated using the interaction energy matrix at 1.5 Mb to highlight the hierarchical clustering of chromatin states. The coloring scheme is the same as in part (A). (C) The eigenvectors corresponding to the largest eigenvalues of the four interaction matrices, with grey and red indicating positive and negative values respectively. (D) Pearson correlation coefficients between interaction matrices at different scales. (E) The complexity measure for different interaction matrices as a function of the index for top eigenvalues. See text for the definition of the complexity measure.
Despite their overall similarities, the interaction energies at various genomic separations differ from each other. To quantify their differences, we determined the pairwise Pearson correlation coefficients between the interaction matrices. As shown in Fig 7D, the interactions that are responsible for TAD formation (~1 Mb) indeed differ significantly from those that lead to chromatin compartmentalization (~10 Mb), as evidenced by the low correlation between them. Strikingly, the correlation coefficient between interaction matrices at 4 Mb and 10 Mb exceeds 0.9, indicating the convergence of chromatin interactions at large genomic separation.
We further compared the complexity of the interaction matrices by calculating the ratio of the sum of the first n eigenvalues to the sum of all eigenvalues. Fig 7E plots this complexity measure as a function of n; absolute values of the eigenvalues were used to calculate the measure. For all three matrices with genomic separation larger than 1 Mb, we find that the first six eigenvectors explain a large fraction of the complexity (over 80%). This observation is consistent with the success of our previous effort in modeling chromatin organization with six compartment types. However, more eigenvectors are needed to capture the full matrix complexity, especially for interactions that are short range in sequence. Together, these results highlight the presence of distinct mechanisms that fold the chromatin at various genomic separations, and argue for the importance of using sequence-length-dependent contact energies.
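Both comparisons described above can be sketched in a few lines. The example assumes that the interaction matrices are flattened element by element before correlation and that eigenvalues are ranked by magnitude; the function names, the 2×2 toy matrices and the toy eigenvalue list are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally sized flat lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def complexity_curve(eigenvalues):
    """Cumulative fraction of the total |eigenvalue| captured by the top n values.

    Ranking by magnitude is an assumption about how 'first n' is defined.
    """
    mags = sorted((abs(e) for e in eigenvalues), reverse=True)
    total = sum(mags)
    curve, running = [], 0.0
    for m in mags:
        running += m
        curve.append(running / total)
    return curve

# Toy 2x2 "interaction matrices" at two separations, flattened row by row.
m_short = [-1.0, 0.5, 0.5, -1.0]
m_long = [-2.0, 1.0, 1.0, -2.0]
r = pearson(m_short, m_long)

# Toy eigenvalue spectrum; the sign is discarded inside complexity_curve.
curve = complexity_curve([4.0, -3.0, 2.0, 1.0])
```

Here the two toy matrices differ only by a scale factor, so their correlation is 1. The curve reaches 0.9 after three of four eigenvalues, which mirrors the way the text reads off how many eigenvectors are needed to explain most of the matrix.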
Combinations of post-translational modifications of histone proteins (PTMs or ‘marks’) act as a synergistic signaling platform regulating chromatin function. A large number of histone modifications are currently known, such as lysine acetylation (Kac), mono-, di- and trimethylation (Kme1/2/3), symmetric and asymmetric arginine methylation, arginine deimination, serine and threonine phosphorylation and glycosylation, ADP ribosylation, lysine ubiquitylation (Kub) and SUMOylation, which can directly alter the physico-chemical properties of chromatin and function as recognition sites for chromatin effector proteins. A different set of chemical modifications appears on the DNA template itself, foremost 5-methylcytosine in CpG dinucleotides, as well as cytosine hydroxymethylation and higher oxidized forms. The colocalization and spatial arrangement of combinations of histone modifications, together with specific non-histone chromatin proteins (effectors or regulators), form a chromatin state and are coupled to biological function. Fundamental chromatin states include heterochromatin, which is highly compacted, is characterized by H3 K9 methylation, histone deacetylation and the presence of heterochromatin protein 1 (HP1), and in which gene expression is silenced. Similarly, polycomb repressed regions are characterized by H3 K27 methylation and the presence of polycomb repressive complexes 1 and 2 (PRC1, 2). Conversely, active chromatin regions, such as transcriptionally active genes, promoters and enhancers, exhibit various degrees of histone acetylation, methylation at H3 K4 and K36, and the presence of a large number of different effector proteins, including RNA polymerase II and general transcription factors. DNA methylation patterns have important functions in long-term gene regulation and epigenetic inheritance, whereas our understanding of the role of DNA hydroxymethylation is still limited.
Recognizing the exceedingly high complexity of possibly co-existing histone PTMs, the hypothesis of a ‘histone code’ was put forward to establish a causal link between PTM patterns and genome function. Histone modifications can alter DNA accessibility and chromatin structure. Furthermore, they can serve as binding sites for chromatin effector proteins such as histone modifying enzymes. Many effectors contain one or several protein domains that specifically interact with histone modifications (‘reader’ domains). Examples include chromo-, tudor-, WD40 and malignant brain tumor (MBT) domains (which bind Kme1/2/3), bromodomains (BD, which recognize Kac), 14-3-3 domains (which bind phosphorylated serines) and plant homeodomain (PHD) fingers (which recognize unmodified or methylated lysines). Individual interactions between chromatin proteins and histones are fairly weak, exhibiting dissociation constants (Kd) in the micromolar range. Therefore, a single PTM-reader domain interaction may not be sufficient to transduce a particular effector-mediated consequence. Indeed, a significant number of chromatin effectors contain multiple reader domains, or exist in higher order complexes containing multiple chromatin interaction motifs. The combinatorial action of several low-affinity reader domains (including other protein-protein or protein-DNA interaction domains) then allows simultaneous, multivalent recognition of multiple histone PTMs coexisting on a nucleosome, with greatly increased affinity. This presents an interesting mechanistic hypothesis for how a chromatin state, characterized by combinations of PTMs, is specifically read out by effectors and translated into a biological output. Since the development of this model, several examples of such interactions have been observed and studied.
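How two weak interactions can combine into a much tighter one can be illustrated with a simplified textbook model of bivalent binding. The model is an assumption brought in for illustration and is not taken from the text: once the first reader domain is bound, the second one sees its target at an effective local concentration `c_eff`, so that the overall dissociation constant is approximately Kd1 × Kd2 / c_eff. The function name and all numerical values are hypothetical.

```python
def bivalent_kd(kd1_molar, kd2_molar, c_eff_molar):
    """Approximate overall Kd for two linked binding sites.

    Assumed model: Kd_eff = Kd1 * Kd2 / c_eff, where c_eff is the effective
    local concentration of the second site after the first has bound.
    """
    return kd1_molar * kd2_molar / c_eff_molar

# Hypothetical inputs: two micromolar-range reader domains, c_eff = 1 mM.
kd_single = 10e-6  # 10 uM for each individual reader-PTM interaction
kd_eff = bivalent_kd(kd_single, kd_single, 1e-3)
fold_gain = kd_single / kd_eff
print(f"effective Kd = {kd_eff:.1e} M, {fold_gain:.0f}-fold tighter than one domain")
```

Under these assumed numbers, two 10 μM interactions give an effective Kd near 0.1 μM. The gain depends entirely on `c_eff`, which in turn depends on linker geometry and is not specified in the source.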
In addition to thermodynamic considerations, the average dwell time at a particular chromatin locus is a critical parameter for effector action. Large-scale chromatin compartments, such as heterochromatin, are highly stable, while the individual factors are in rapid exchange with soluble proteins. This allows a fast response to stimuli, e.g. for chromatin remodeling in DNA damage repair. Such malleability can be understood as a result of many weak effector-chromatin interactions associated with fast dissociation rate constants. In this context, multivalent interactions are an attractive way to establish selectivity while retaining kinetically dynamic interactions, as they mainly result in an increase of the local concentration of factors at their target chromatin region, thereby accelerating the binding kinetics. In turn, local competition for binding sites can still take place and, through processes such as facilitated dissociation, rapid chromatin factor exchange can occur. A crucial factor for a deeper understanding of such interactions is thus the local concentration of both factors and PTMs in the nucleus, and the resulting interaction thermodynamics and kinetics. While the degree of modification at given chromatin loci is mostly unknown, recent findings demonstrated that nucleosomes are often not homogeneously modified, and that many histone PTMs exist in an asymmetric fashion: one copy of a particular histone might carry one modification whereas the other copy is unmodified or modified differently. This might have important consequences for the downstream readout by multivalent effectors.
Methylated DNA, on the other hand, recruits its own set of associated proteins, for example methyl CpG binding protein 2 (MeCP2) and the SET- and RING-associated (SRA) domain in the ubiquitin-like, containing PHD and RING finger domains 1 (UHRF1) complex. Contrariwise, other DNA binding proteins are expelled from their target sequences by methylation resulting in a cell specific biological output . In summary, patterns of coexisting chromatin modifications and associated effectors define distinct chromatin states, which correlate with the expression levels of underlying genes, splicing activity, and replication and repair processes , . Chromatin states can persist over cell generations and are involved in the regulation of cell differentiation and lineage commitment. Therefore, the combination of the chemical modifications in DNA and histones, the associated effector proteins and the chromatin structural states can be considered to contribute to epigenetic inheritance .
Knowledge of the location and amounts of histone modifications in cell populations, as well as of their combinatorial complexity and dynamics, is currently expanding rapidly through genome-wide investigations employing ChIP-Seq methodologies and mass spectrometry (MS) based investigations. Moreover, low cell number and time-dependent ChIP methodologies are emerging, enabling ChIP analysis on small sample sizes, minimizing information loss through ensemble averaging and allowing kinetic investigations. However, the molecular mechanism of PTM function is poorly understood, information on the single-cell dynamics of modifications is often lacking, and the biological downstream effects of the epigenetic PTM landscape remain elusive.
Availability of data and materials
C++ source code for chromatin polymer folding as well as comprehensive tutorials demonstrating how to configure polymer folding simulations are publicly available via git repository . Similarly, C++ source code for our Bayesian Hi-C deconvolution Gibbs sampler and loci modeling scripts are also available via git [73, 74]. The source code versions used specifically in this work are available via Zenodo open-access repository . All source code is licensed under the GNU General Public License v3.0.
PSC can Oligomerize Nucleosomal Arrays
To analyze the interaction between PSC and chromatin by electron microscopy (EM) and STEM, we incubated PSC with 4 nucleosome (4N) arrays at a ratio of 0.4:1. This concentration refers to the active concentration for DNA binding; PSC preparations are typically 20% active, assuming the protein binds DNA as a monomer. The resulting complexes were cross-linked with glutaraldehyde and fractionated by sucrose gradient sedimentation (Fig. 1a). Arrays that were not incubated with PSC sediment mainly in fractions 2 and 3. Arrays incubated with PSC were spread between fractions 3 and 7 (which includes the pellet). Analysis of these fractions on native agarose gels indicates that they contain a series of protein-DNA complexes with progressively reduced mobility. The complexes formed are discrete, suggesting they could contain different numbers of nucleosomal arrays. It is unlikely that the differences in mobility could be generated solely by binding of more PSC molecules to each array, because of the low ratios of PSC to nucleosomes used and because the mass added by each PSC (169 kDa) is small relative to the mass of each array (0.9 MDa). We inspected the material in these fractions by electron microscopy. EM shows that nucleosomal arrays alone adopt an extended conformation, although 4 nucleosome circles and some more compacted forms were also observed (Fig. 1b). PSC-bound arrays form quite uniform single particles. Occasional multilobed structures were observed (bottom right image in Fig. 1b), but each lobe is of a size likely to represent an individual array. We measured the maximal diameter of arrays from three fractions (Fig. 1c). PSC-chromatin complexes have significantly smaller diameters than arrays alone (Fig. 1d). Diameters increase progressively towards the bottom of the gradient, consistent with the particles having distinct compositions.
Together, these results suggest the larger, single particle structures observed in lower fractions are oligomerized structures that include more than one nucleosomal array. The average diameters are much smaller than the additive diameter of two or more compacted arrays, suggesting the oligomerized structures are highly compact.
(a) Representative glycerol gradient purification of nucleosomal templates with and without PSC for EM and STEM. Boxes indicate fractions selected for analysis. (b) Representative EM images from indicated fractions. Photographs were taken in dark field and are inverted to enhance contrast. (c) Distribution of maximum diameters of particles determined from micrographs like those in (b). Note that less than 10% of the 4N alone arrays have diameters larger than 95 nm and are not shown on the graph. (d) Summary of diameter measurements. Note that all fractions of 4N arrays incubated with PSC were significantly smaller than 4N arrays alone, and fractions 4 and 5 are different from 3. Table shows p-values for student’s t-test (unpaired, assuming equal variance in samples).
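The unpaired, equal-variance Student's t-test named in the caption can be sketched as follows. Only the t statistic and the degrees of freedom are computed; turning them into a p-value requires the t distribution, which is left out here. The function name and the diameter values are hypothetical and are not measurements from the study.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Two-sample Student's t statistic assuming equal variances (unpaired)."""
    na, nb = len(sample_a), len(sample_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, na + nb - 2

# Hypothetical maximal diameters in nm (arrays alone versus PSC-bound arrays).
arrays_alone = [60.0, 62.0, 64.0]
arrays_with_psc = [50.0, 52.0, 54.0]
t_stat, dof = pooled_t_statistic(arrays_alone, arrays_with_psc)
```

A large positive t statistic with these toy values would indicate that the first group has the larger mean diameter, which is the direction of the comparison reported in panel (d).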
To ask whether PSC can oligomerize nucleosomal arrays using a different method, we employed a well-established centrifugation assay. PSC was incubated with 12-nucleosome arrays at increasing concentrations; reactions were pelleted in a microfuge, and the fraction of the array in the pellet versus the supernatant was determined. Arrays alone do not pellet under our reaction conditions, as expected. PSC also does not pellet under these conditions; we centrifuged PSC at full speed prior to adding it to reactions to remove any large aggregates that might be present. As PSC is titrated into reactions, the fraction of arrays that pellets increases (Fig. 2). About 50% of the template pellets at a ratio of 0.25 PSCs per nucleosome, and the reaction saturates at two PSCs per nucleosome. Together, the results from different assays and with different sized nucleosomal arrays are consistent with PSC being able to oligomerize nucleosomal arrays.
(a) Nucleosomal arrays were mixed with PSC at the indicated ratios and reactions pelleted by centrifugation in a microfuge. Proteinase K digested samples were separated by agarose gel electrophoresis, stained with SYBR gold and quantified. Two different examples of the assay are shown; considerable variability was observed in this assay, although the trend was consistent. (b) Summary of PSC-induced oligomerization. Error bars are standard deviation in this and all other figures. n = 5.
Stoichiometry of PSC and Nucleosomes in PSC-Chromatin Interaction
We previously determined that full inhibition of chromatin remodeling by PSC occurs at ratios of about one PSC per nucleosome, based on measurements of active concentrations. However, as discussed above, we do not know what the active DNA binding form is, so it is possible that multiple PSCs bind to single DNAs in binding assays. We therefore sought to directly measure the stoichiometry of PSC to nucleosomes in compacted chromatin that is refractory to chromatin remodeling. We used STEM to determine the ratio of PSC to nucleosomes. STEM can accurately measure particle masses using the linear relationship between electron scattering by the sample and its molecular mass. We first analyzed glutaraldehyde cross-linked PSC alone by EM and by STEM. By negative staining followed by EM, particles of different sizes were observed (Fig. 3a, b). Some had complex structures but many are simple oblongs of the approximate size expected for a globular protein of 169 kDa (8 nm). We note that more than half of the sequence in PSC is predicted to be intrinsically disordered, so it is interesting that the protein has a compact rather than extended conformation.
(a) Representative EM images of negatively stained PSC. (b) Distribution of diameters of negatively stained PSC (n =). (c) Mass distributions of STEM analysis of PSC alone (n =). (d) Summary of measured masses of PSC. (e) Mass distributions of STEM analyses. Measured masses for 4N and 4N + PSC are 0.97±0.01 MDa (n =) and 1.67±0.03 MDa (n =) respectively.
PSC was analyzed by STEM and we observed six average measured masses in multiples of 0.17 MDa (PSC monomer predicted at 0.17 MDa) ( Fig. 3c, d ). These data are consistent with the protein existing in monomeric and several multimeric forms. More than 70% of the observed particles have masses consistent with multimers, consistent with previous demonstrations that PSC can self-associate , and with the EM data. Most of these multimers are dimers or trimers. This, along with previous data indicating that PSC forms multimers at low concentrations , suggests that the active DNA binding form of PSC may be a multimer.
We analyzed 4N arrays alone and with PSC (using gradient fractions similar to the boxed ones in Fig. 1a). Only fractions containing the most rapidly migrating PSC-bound species, expected to contain primarily complexes of single arrays, were analyzed (Fig. 3e). The average measured mass of 4N chromatin templates alone was 0.97±0.01 MDa, which is consistent with its predicted mass of 0.92 MDa. A single peak distribution of masses was observed for PSC + 4N arrays, and the average mass of PSC-bound chromatin templates was 1.67±0.03 MDa. Thus, compacted 4N templates contain an average of 4.1 PSC molecules (expected mass = 0.92 + 4 × 0.17 = 1.6 MDa). The right shoulder on the graph of PSC + 4N arrays in Fig. 3e indicates that the distribution of the 4N + PSC sample tends toward higher masses. These structures could reflect more than one PSC binding to each nucleosome or binding of some nucleosomes by a PSC multimer. Nevertheless, our STEM results suggest that a minimum ratio of 1 PSC to 1 nucleosome produces a compacted species. This stoichiometry is in agreement with previous estimates of the ratio of PSC to nucleosomes required to completely inhibit remodeling of a nucleosomal array. It supports a model for nucleosome bridging in which each nucleosome is bound by PSC and these nucleosome-bound PSCs interact with each other (Fig. 3e). The finding that additional PSCs can bind to arrays may explain array oligomerization, since these unoccupied PSCs may function as sticky ends to capture additional nucleosomes in trans.
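The stoichiometry arithmetic in this paragraph can be reproduced directly from the reported masses. The sketch below uses only the values given in the text (0.97 and 1.67 MDa measured, 0.92 MDa predicted for the array, 0.17 MDa per PSC monomer); the function name is a hypothetical helper.

```python
def psc_per_array(mass_bound_mda, mass_array_mda, mass_psc_mda=0.17):
    """Estimate bound PSC molecules from the mass added to a 4N array."""
    return (mass_bound_mda - mass_array_mda) / mass_psc_mda

# Measured STEM masses from the text, in MDa.
n_psc = psc_per_array(1.67, 0.97)

# Expected mass of a 4N array (predicted 0.92 MDa) carrying four PSC monomers.
expected_mass = 0.92 + 4 * 0.17

print(f"PSC per array: {n_psc:.1f}; expected mass with 4 PSC: {expected_mass:.1f} MDa")
```

The added mass of 0.70 MDa corresponds to about 4.1 PSC monomers per four-nucleosome array, that is, roughly one PSC per nucleosome.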
The Acidic Patch of Histone H2A is Not Necessary for PSC-Chromatin Interactions
We next investigated the mechanism PSC uses to interact with nucleosomes. Interactions among nucleosomes, particularly between the basic tails of histone H4 on one nucleosome and a cluster of acidic residues (the acidic patch) of histone H2A on another nucleosome, are believed to play important roles in chromatin folding and array oligomerization. Although previous electron microscopy data showed that PSC can compact chromatin with trypsinized, tail-less histones, it is possible that other activities such as nucleosome bridging have different requirements than chromatin compaction. Furthermore, several nucleosome binding proteins (HMGN2, RCC1, LANA, Sir3) interact with the acidic patch of histone H2A. It is possible that PSC can interact with the H2A acidic patch to compact chromatin. PSC is a basic protein (pI = 9.2), suggesting that it could replace histone tails in nucleosome-nucleosome interactions. This mechanism would be consistent with the stoichiometry of PSC to nucleosomes, and unaffected by removal of the histone tails. To examine whether nucleosome bridging and inhibition of chromatin remodeling by PSC require the acidic patch of histone H2A, we prepared recombinant histone H2A with three key amino acids in the acidic patch (DEE) mutated to uncharged, polar residues (STT) (Fig. 4a). Xenopus laevis STT mutant histone H2A (H2A-STT) was prepared using standard protocols, and assembled into histone octamers with H2B, H3, and H4.
(a) Amino acid sequences for the wild-type H2A acidic patch (WT) and uncharged mutant (STT). Acidic residues are highlighted, and mutated residues are underlined. (b) MgCl2-dependent oligomerization of wild-type and H2A-STT containing chromatin. Chromatin was incubated with the indicated concentrations of MgCl2 and centrifuged in a microfuge. Supernatants were electrophoresed on agarose gels and stained with SYBR gold; the % of the template remaining in the supernatant was determined by comparison with the 0 mM MgCl2 control. (c) Summary of chromatin oligomerization assays. (d) Restriction enzyme accessibility (REA) assays on chromatin templates with 12 nucleosomes. The chromatin template contains a unique restriction site (HhaI) that is normally occluded by nucleosomes but is exposed upon Swi/Snf-mediated chromatin remodeling. The first two lanes are negative and positive controls (with or without Swi/Snf, no PSC) demonstrating that the HhaI site becomes more accessible in the presence of Swi/Snf. (e) Summary of REA assay on chromatin templates assembled with rec-WT and H2A-STT histones. Percent inhibition is calculated as .
Wild-type recombinant histone octamers or those assembled with H2A-STT were assembled into nucleosomal arrays on DNA containing two sets of five 5S nucleosome positioning sequences flanking a unique region (G5E4). To verify that the STT mutation disrupts chromatin folding in our hands as reported, we used the precipitation assay described above, but induced array oligomerization with MgCl2 instead of PSC. Previous work demonstrated that nucleosomal arrays undergo reversible oligomerization in the presence of MgCl2 concentrations above 1 mM, which depends on the H2A-H4 interaction. We titrated MgCl2 into nucleosomal arrays assembled with wild-type recombinant histones (rec-WT) or H2A-STT octamers, and pelleted the arrays in an Eppendorf centrifuge. Supernatants were analyzed on agarose gels, and the amount of DNA was quantified after staining with SYBR gold; each sample was compared with the control, 0 mM MgCl2 supernatant (Fig. 4b). We find that at least 80% of arrays assembled with rec-WT octamers are pelleted in 1 mM MgCl2, while only about 40% of arrays assembled with H2A-STT octamers are pelleted even at 4 or 8 mM MgCl2 (Fig. 4c). Thus, the chromatin assembled with H2A-STT octamers has an oligomerization defect, as reported previously.
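The quantification described here, in which each supernatant is compared with the 0 mM MgCl2 control, can be sketched as below. The sketch assumes that any template no longer detected in the supernatant has pelleted; the function name and the band intensities are hypothetical.

```python
def percent_pelleted(supernatant_signal, control_supernatant_signal):
    """Percent of template pelleted, relative to the 0 mM MgCl2 control.

    Assumes template missing from the supernatant is in the pellet.
    """
    percent_remaining = 100.0 * supernatant_signal / control_supernatant_signal
    return 100.0 - percent_remaining

# Hypothetical SYBR gold band intensities in arbitrary units.
control = 1000.0       # supernatant at 0 mM MgCl2
with_mgcl2 = 200.0     # supernatant after adding MgCl2
pelleted = percent_pelleted(with_mgcl2, control)
print(f"{pelleted:.0f}% of the template pelleted")
```

With these assumed intensities, 20% of the signal remains in the supernatant and 80% is scored as pelleted, which is how a figure such as "at least 80% of arrays are pelleted" would be obtained from gel quantification.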
We first tested whether H2A-STT interferes with the ability of PSC to inhibit nucleosome remodeling. Chromatin remodeling was monitored by restriction enzyme access (REA) to nucleosomal DNA. DNA assembled into nucleosomes is generally inaccessible but can be exposed by chromatin remodeling; the extent of restriction enzyme digestion is a quantitative measure of chromatin remodeling. PSC inhibits chromatin remodeling by the ATP-dependent remodeling factor hSwi/Snf, so that restriction enzyme digestion induced by hSwi/Snf is reduced. PSC does not substantially inhibit restriction enzyme accessibility in the absence of hSwi/Snf, indicating that inhibition of chromatin remodeling is specific. PSC activity on chromatin templates assembled with recombinant histones has not previously been reported, so we first compared inhibition of chromatin remodeling on templates assembled with histone octamers purified from HeLa cells or rec-WT octamers. We find that PSC inhibits remodeling of both nucleosomal arrays equally well (Fig. 4d, e), indicating that any histone modifications present on HeLa-purified octamers are not required for PSC-mediated inhibition of chromatin remodeling. We then tested whether PSC can inhibit remodeling of arrays assembled with H2A-STT octamers and find that it inhibits their remodeling as efficiently as that of arrays assembled with wild-type octamers (Fig. 4d, e). Thus, neither the acidic patch of H2A nor covalent modifications present on cellular histones are required for inhibition of nucleosome remodeling by PSC.
We tested whether nucleosome bridging, which involves the clustering of nucleosomes, requires the acidic patch of histone H2A (Fig. 5). Nucleosome bridging is assessed on chromatinized plasmids which are incubated with buffer (control) or PSC (Fig. 5a). After PSC has bound to the chromatin, the linker DNA connecting the nucleosomes is removed by digestion with micrococcal nuclease (MNase). MNase digested reactions (or control mock-digested reactions) are sedimented through sucrose gradients, which can separate bridged from free mononucleosomes, or PSC-bound from unbound intact plasmids in control reactions. The nucleosome bridging assay was performed on plasmid chromatin templates with rec-WT or H2A-STT histones. Analysis of gradient fractions from reactions that were mock digested indicates that incubation of either template with PSC causes chromatin to sediment rapidly near the bottom of the gradient. Mononucleosomes generated by MNase digestion of both PSC-bound templates also sediment near the bottom of the gradient. Thus, PSC is able to bridge nucleosomes from both H2A-STT and rec-WT templates with similar efficiencies (Fig. 5b–d). We conclude that neither the acidic patch nor histone modifications are required for PSC to bridge nucleosomes.
(a) Schematic diagram of nucleosome bridging assay. (b) Representative control reactions for nucleosome bridging (mock MNase digested) showing that PSC binds both rec-WT and H2A-STT chromatin. Arrows point to the main plasmid forms (nicked and supercoiled); the array of minor isoforms observed here is atypical but the isoforms behave similarly. (c) Representative MNase digested nucleosome bridging reactions demonstrating that PSC bridges both rec-WT and H2A-STT nucleosomes. (d) Summary of nucleosome bridging assays on chromatin templates assembled with recombinant wild-type (rec-WT) and H2A acidic patch mutant histones (H2A-STT). Values from fractions 5–7 (bottom fractions) of sucrose gradients were summed and plotted.
PSC can Bridge Bare DNA Segments
PSC binds tightly to both nucleosomes and bare DNA, suggesting its interaction with chromatin is mediated at least in part through linker DNA binding. Since our data indicate that standard chromatin folding mechanisms are not required for bridging by PSC (Fig. 5b–d), we wondered whether nucleosomes were required for bridging. If bridging reflects PSC-DNA binding and PSC-PSC interactions, it may not depend on nucleosomes. To ask whether PSC can bring segments of bare DNA together, PSC was incubated with either unbiotinylated DNA or a mixture of biotinylated and unbiotinylated DNAs of different sizes (Fig. 6a). Bound DNAs were isolated by sucrose gradient sedimentation (Fig. 6b). Gradient fractions were incubated with streptavidin coated beads to capture the biotinylated DNA (Fig. 6c, d). When both the biotinylated and unbiotinylated DNAs were included with PSC, both were efficiently captured by the streptavidin beads. In the absence of the biotinylated DNA, little of the unbiotinylated DNA is captured by the streptavidin coated beads even though it is still bound to PSC. Thus, PSC brings the biotinylated and unbiotinylated DNA fragments together, indicating that PSC can bridge segments of DNA as well as nucleosomes, consistent with DNA binding playing an important role in how PSC interacts with chromatin.
(a) Schematic diagram of DNA bridging assay. (b) Representative analysis of bridging of naked DNA by PSC. Top panels show sucrose gradient fractions that were pooled for streptavidin pull-down. Bottom panels show streptavidin pull-down results. The per cent bound refers to how much of the unbiotinylated fragment is present in the pellet as a fraction of the total (pellet + supernatant). Asterisks indicate the position of the biotinylated fragment; note that in pellet fractions, the biotinylated fragment is incompletely recovered by Proteinase K treatment of streptavidin-coated beads, and migrates slowly, likely due to bound streptavidin. (c) Summary of streptavidin pull-down experiments. Graphs show the average per cent of the unbiotinylated fragment associated with streptavidin beads.
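The "per cent bound" definition in this legend can be written out as a short calculation. The sketch below is only an illustration: the function name and the band intensities are hypothetical and are not taken from the paper, and it assumes the pellet and supernatant signals are quantified in the same arbitrary units.

```python
def percent_bound(pellet: float, supernatant: float) -> float:
    """Per cent of the unbiotinylated fragment found in the streptavidin pellet.

    Implements pellet / (pellet + supernatant) * 100, as defined in the legend.
    """
    total = pellet + supernatant
    if total <= 0:
        raise ValueError("total signal (pellet + supernatant) must be positive")
    return 100.0 * pellet / total


# Hypothetical band intensities in arbitrary units (not data from the paper).
example = percent_bound(pellet=30.0, supernatant=10.0)
print(example)  # 75.0
```

A value near zero in the absence of biotinylated DNA, and a high value when both DNAs are present, is the pattern the bridging interpretation relies on.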
PSC Clusters Nucleosomes but also Binds DNA on Sparsely Assembled Plasmids
The observation that PSC can bridge naked DNA could indicate that it interacts with chromatin solely through DNA binding. To understand how PSC interacts with nucleosomes and DNA when both are present, we prepared 3 kb plasmid templates with a small number (1 or 2) of assembled nucleosomes and asked how PSC interacts with them by EM (Fig. S1a). Templates contain two 601 nucleosome positioning sequences, which are preferentially occupied at low ratios of histones to DNA (similar to ). PSC was bound at low ratios to plasmids. Templates were fixed with glutaraldehyde, rotary shadowed with platinum and visualized by EM. Titrations of PSC binding to chromatin by electrophoretic mobility shift assay (EMSA) indicate that sparsely assembled arrays readily form slowly migrating multi-template aggregates (Fig. S1b). EM was carried out with ratios of PSC that cause only a slight shift in plasmid mobility by EMSA (Fig. S1b).
PSC formed a diverse array of structures with sparsely assembled templates ( Fig. 7b, c ). We classified these structures into two groups, Class 1 and Class 2, based on visual inspection of how much free DNA is visible and how large the particles on the template are. Class 1 structures likely represent templates with fewer PSC molecules bound, while Class 2 includes fully compacted structures. The simplest Class 1 structures contain one particle that is larger than the size of a nucleosome and could represent one or more PSC bound nucleosomes, and a large loop of DNA (such as molecule 14 in Fig. 7b ). In some cases, small particles are observed to cluster together, which might represent simple bridging events (such as molecule 3 in Fig. 7b ). Importantly, counting of the number of individual particles per template for Class 1 molecules indicates that many of these templates contain 3 particles. Templates alone contain 0, 1 or 2 particles (nucleosomes). Thus, at least some of the particles observed must represent PSC bound to DNA without a nucleosome. In contrast, most Class 2 molecules contain one or two particles, and the majority of these particles are larger than nucleosomes, and larger than the particles on Class 1 molecules ( Fig. 7e , Table 1 ). Thus, the large particles observed in Class 2 molecules likely contain PSC bound to both nucleosomes and DNA in a compacted complex. Notably, the most compacted structures (such as molecules 4, 5, 7 in Fig. 7b ) have very little free DNA. Together, these data indicate that PSC likely binds preferentially to nucleosomes, as observed in Class 1 molecules. PSC can also compact large regions of DNA even on templates with only one or two nucleosomes, as observed in Class 2 molecules. Because the particles on Class 2 templates are larger than those on Class 1 molecules ( Fig. 7e , Table 1 ), the classes may represent a progression driven by binding of increasing numbers of PSC molecules.
| Measurement | −PSC (set 1) | +PSC class 1 (set 1) | +PSC class 2 (set 1) | −PSC (set 2) | +PSC class 1 (set 2) | +PSC class 2 (set 2) |
| Diameter (nm, +/− SD) | 285+/ | 233+/ | 164+/ | 282+/ | 251+/ | 164+/ |
| Particle diameter (nm, +/− SD) | 20+/−4 | 45+/ | 58+/ | 19+/−4 | 36+/ | 64+/ |
| # particles (+/− SD) | 1.2+/−0.7 | 1.8+/−0.9 | 1.6+/−0.9 | 1.3+/−0.8 | 1.8+/−0.9 | 1.5+/−0.7 |
Table indicates mean +/− standard deviation for three measurements from two experiments. Each template uses a plasmid containing two 601 nucleosome positioning sequences that is assembled at low ratios of histones to DNA (0.2 histone:DNA by mass). The first template set has bp between the 601 nucleosomes and the second has 385 bp. Diameter is the maximum diameter of each template (the diameter of the smallest circle that would completely encompass the template). The maximal diameter of each particle on each template was measured, giving rise to the “particle diameter” measurement. For −PSC samples, particles should be nucleosomes, while in +PSC samples, they could be nucleosomes, PSC bound to naked DNA, or PSC-bound nucleosomes. Note that the largest diameter of the disk-shaped nucleosome is 11 nm and samples were rotary shadowed to a thickness of 3.75 nm; thus, the diameter measured is consistent with expectation (11 + 2×3.75 = 18.5 nm predicted size). The number of particles indicates each separate particle on a template, irrespective of size.
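The size check in the table note is simple arithmetic: the metal coat adds its thickness on both sides of the particle. A minimal sketch of that calculation follows; the function name is hypothetical, and the only inputs are the 11 nm nucleosome diameter and 3.75 nm shadow thickness stated in the note.

```python
NUCLEOSOME_DIAMETER_NM = 11.0   # largest diameter of the disk-shaped nucleosome
SHADOW_THICKNESS_NM = 3.75      # rotary shadowing thickness stated in the note


def shadowed_diameter(true_diameter_nm: float, coat_nm: float) -> float:
    """Apparent diameter after shadowing: the coat is added on both sides."""
    return true_diameter_nm + 2.0 * coat_nm


predicted = shadowed_diameter(NUCLEOSOME_DIAMETER_NM, SHADOW_THICKNESS_NM)
print(predicted)  # 18.5
```

The predicted 18.5 nm is close to the particle diameters of about 20 and 19 nm reported for the −PSC samples, which is why those particles are interpreted as nucleosomes.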
(a) Representative EM images of plasmids with two 601 nucleosome positioning sequences assembled at low ratio of histones to DNA. Note that fully assembled plasmids would contain 17 nucleosomes and that the 601 sequences are separated by 385 base pairs. Plasmids were assembled in the presence of E. coli Topoisomerase I so that plasmids are relaxed. Arrows point to nucleosomes. Note that template 2 was the only observed example of more than 2 nucleosomes on the plasmid (3), out of the 103 molecules that were analyzed (Table 1). (b) Sparsely assembled plasmids with PSC. Class 1 molecules (see text) are more extended, and likely have fewer copies of PSC bound than Class 2 molecules (c), which are highly compacted. Arrows point to particles (likely to be PSC-bound nucleosomes) that have come together and may represent the bridged configuration. Note that in some cases, more than one template may be clustered (such as molecule 3). Asterisks indicate particles that may be unbound nucleosomes (based on their size), although they could also be bound PSC. (c) Sparsely assembled plasmids with PSC with Class 2 configurations. Note that the molecules represent a series between the most extended Class 1 molecules and the most highly compacted Class 2 molecules. (d) Summary of the number of particles per template. The finding that Class 1 molecules frequently have more than 2 particles indicates that PSC must bind to naked DNA (as well as nucleosomes) on some templates. See Table 1 for a summary of measurements from this and a similar experiment.
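The counting argument in panel (d) can be made explicit: if a template carries at most two nucleosomes, every particle beyond the second must be something other than a nucleosome, such as PSC bound to naked DNA. The sketch below illustrates that logic only. The function name and the particle counts are hypothetical, and the limit of two nucleosomes is an assumption based on the templates-alone controls (where a third nucleosome was seen once in 103 molecules).

```python
MAX_NUCLEOSOMES_PER_TEMPLATE = 2  # assumption from the templates-alone controls


def min_non_nucleosome_particles(n_particles: int,
                                 max_nucleosomes: int = MAX_NUCLEOSOMES_PER_TEMPLATE) -> int:
    """Lower bound on particles that cannot be nucleosomes on one template."""
    return max(0, n_particles - max_nucleosomes)


# Hypothetical particle counts for five Class 1 templates (not data from the paper).
counts = [1, 2, 3, 3, 4]
with_extra = [c for c in counts if min_non_nucleosome_particles(c) > 0]
fraction_with_dna_bound_psc = len(with_extra) / len(counts)
print(fraction_with_dna_bound_psc)  # 0.6
```

This gives only a lower bound: a template with two or fewer particles may still carry PSC on naked DNA, since a particle count alone cannot distinguish a nucleosome from a similarly sized PSC particle.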
PSC can Bind Free Histones
The EM data suggest PSC has a preference for binding nucleosomes over bare DNA, and thus that PSC recognizes feature(s) of nucleosomes other than DNA. To test whether PSC might interact with the histone proteins in the nucleosome as well as the DNA, we carried out PSC binding assays with histone octamers or histone subcomplexes. Pull-down assays show that histone octamers bind PSC (Fig. 8a). We aimed to test whether PSC binds both H2A/H2B dimers and H3/H4 tetramers; however, these assays were hindered by high non-specific binding of the histone subcomplexes to beads. By immobilizing Flag-PSC on beads, adding dimers or tetramers, washing, and then eluting Flag-PSC and bound histones (protocol from ), we were able to observe a low but reproducible level of specific binding to both H2A/H2B and H3/H4 (Fig. 8b). PSC binding was also observed with glutaraldehyde-mediated cross-linking (not shown). We conclude that PSC can bind to free histones, although additional methods will be needed to quantitatively assess this interaction. The ability of PSC to interact with histones may contribute to its nucleosome binding activity.
(a) Representative assay of PSC binding to histone octamers. PSC was mixed with histone octamers that contain one biotinylated and one fluorescent copy of either H3 (H3-labeled octamers) or H2B (H2B-labeled octamers) (see Methods for detailed description). Mixtures were incubated with streptavidin-coated beads and the amount of captured PSC was determined by Western blotting. Fluorescence (Cy5) was used to monitor octamer capture. Similar results were observed in two additional assays. (b) Representative assay of PSC binding to H2A/H2B dimers or H3/H4 tetramers. Dimers and tetramers were fluorophore labeled on the indicated (asterisk) subunit. High levels of background binding to beads were observed for both dimers and tetramers, as shown, but in each of three assays, more H2A/H2B and H3/H4 were eluted from Flag beads with immobilized PSC than from control beads with no immobilized protein. PSC in the elution is detected by Western blotting.
Chromatin Fiber vs Chromosome
Chromosomes are thread-like structures in which DNA molecules are packaged. They are the repositories of an organism's genetic information. The number and shape of chromosomes differ among living organisms. A human somatic cell contains 46 chromosomes, arranged in 23 pairs. Prokaryotes possess fewer chromosomes, which are not enclosed by a nuclear membrane. A replicated chromosome consists of two sister chromatids joined at a centromere. Chromosomes are not visible in non-dividing cells; they become visible under the microscope during cell division. Chromosomal DNA exists as chromatin fibers, which are complexes of DNA and histone proteins. The basic unit of chromatin is the nucleosome, a segment of DNA wrapped around eight histone proteins. Nucleosomes coil into loops and form tightly compacted chromatin fibers. Chromatin fibers coil tightly to form chromatids, and chromatids form chromosomes. This is how DNA is packaged into the small space of the nucleus. In short, the chromatin fiber is the DNA-histone complex, and the chromosome is the higher-order structure built from it.