Hi-C is a ligation-based proximity assay utilising the power of massively parallel sequencing to identify three-dimensional genomic interactions. The protocol (Figure 1a) involves fixing chromatin to preserve genomic organisation, followed by restriction enzyme digestion of the DNA. Overhanging single-stranded DNA at the ends of restriction fragments is then filled in with the concomitant incorporation of biotin. Fragments in close spatial proximity are ligated together, generating a novel "modified restriction site" sequence (see Figure 1b). Following sonication, the sheared, ligated DNA fragments are enriched by streptavidin pull-down of the biotin residues and are then ligated between sequencing adapters. The resulting molecule, termed a di-tag, should comprise two different DNA fragments separated by a modified restriction site. Since these two fragments were positioned close to each other during fixation, analysing the composition of a population of di-tags generated by a Hi-C experiment makes it possible to infer genomic three-dimensional organisation.

Figure 1. a) Diagram summarising the Hi-C experimental protocol. The red and blue rectangles represent cross-linked restriction fragments while the yellow marker shows the position of biotin incorporation. b) Generation of the Hi-C ligation junction sequence by successive digestion (with HindIII in this example), fill-in and blunt-ended ligation steps. The modified restriction site sequence is not found in the original genomic sequence.

A recent variation of the protocol involves enriching Hi-C libraries for di-tags in which one or both reads align to pre-selected regions of a genome. This Capture Hi-C protocol (CHi-C) is advantageous since Hi-C libraries are extremely complex and, even with current high-throughput technologies, often only a small proportion of a Hi-C library is sequenced. Selecting for di-tags in this way makes it possible to gain a more complete and higher resolution contact profile for loci of interest.

Reaching valid conclusions regarding genomic interactions requires Hi-C data to be mapped unconventionally (described below) as compared with most paired-end experiments. Following this, artefacts inherent to the Hi-C protocol should be removed. To meet these demands we developed HiCUP, an easy-to-use Hi-C bioinformatics pipeline which has few dependencies and is coded in Perl. HiCUP produces a detailed quality control (QC) report in an interactive HTML format, enabling the user to easily assess the quality of a library and how the experimental protocol may be improved in the future. HiCUP was designed for mapping Hi-C data and removing artefacts. It does not perform the normalisation and statistical tests needed to interpret Hi-C experiments; rather, it is intended as the starting point of processing Hi-C datasets and should be used in conjunction with other Hi-C pipelines. There are numerous packages available to perform different steps in the analysis of Hi-C data.
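
The way digestion, fill-in and blunt-ended ligation produce the "modified restriction site" of Figure 1b can be illustrated with a short sketch. This is not code from HiCUP: the function name `ligation_junction` and its `cut_offset` parameter are hypothetical, and the HindIII recognition site (A^AGCTT, cut after the first base) is standard enzyme knowledge assumed here, not stated in the text. The sketch only covers palindromic sites that leave a 5' overhang, since only those are filled in.

```python
def ligation_junction(site: str, cut_offset: int) -> str:
    """Return the junction created by fill-in and blunt-end ligation.

    site       -- palindromic recognition sequence, top strand 5'->3'
    cut_offset -- number of bases before the cut on the top strand
                  (illustrative parameter; 1 for HindIII, A^AGCTT)
    """
    site = site.upper()
    complement = str.maketrans("ACGT", "TGCA")
    if site.translate(complement)[::-1] != site:
        raise ValueError("recognition site must be palindromic")
    if cut_offset < 0 or 2 * cut_offset >= len(site):
        raise ValueError("cut must leave a 5' overhang to be filled in")

    # After fill-in, the upstream fragment ends with the site minus the
    # bases beyond the bottom-strand cut, and the downstream fragment
    # begins at the top-strand cut.
    left_end = site[: len(site) - cut_offset]
    right_start = site[cut_offset:]
    return left_end + right_start


if __name__ == "__main__":
    junction = ligation_junction("AAGCTT", 1)
    print(junction)  # AAGCTAGCTT
```

The filled-in overhang is duplicated at the junction, which is why the resulting sequence differs from the original restriction site and can serve as a marker of a ligation event within a read.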