ChIP-Seq

Table Of Contents

  1. Measuring Enrichment Prior to Library Prep
  2. Control
  3. Read Redundancy

1.  Measuring Enrichment Prior to Library Prep

Test your antibody and sample prep before making a library. We recommend an enrichment ratio of at least 10, that is, \( \frac{C^+/I^+}{C^-/I^-} \ge 10 \), where \( C^+ \) is ChIP at the positive control, \( I^+ \) is input at the positive control, \( C^- \) is ChIP at the negative control, and \( I^- \) is input at the negative control.

To do this, you need at least two primer pairs, one for a positive control region (+) and one for a negative control region (-), each measured on both the ChIP and input samples using quantitative real-time PCR (qPCR). See the figure to the right.
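
As an illustration of the arithmetic, the ratio can be computed from four qPCR measurements. This is a minimal sketch, not a prescribed protocol: the function names and the Ct values are hypothetical, and the Ct conversion assumes perfect doubling per cycle (efficiency 2.0), which real primer pairs only approximate.

```python
def relative_quantity(ct, efficiency=2.0):
    """Convert a qPCR Ct value to a relative quantity.

    Assumes the product multiplies by `efficiency` each cycle
    (2.0 = perfect doubling), so a lower Ct means more template.
    """
    return efficiency ** (-ct)


def enrichment_ratio(chip_pos, input_pos, chip_neg, input_neg):
    """Return (C+/I+) / (C-/I-) from four relative quantities."""
    return (chip_pos / input_pos) / (chip_neg / input_neg)


# Hypothetical Ct values, for illustration only.
ct = {"chip_pos": 24.0, "input_pos": 26.0, "chip_neg": 29.0, "input_neg": 26.0}
q = {name: relative_quantity(value) for name, value in ct.items()}

ratio = enrichment_ratio(q["chip_pos"], q["input_pos"], q["chip_neg"], q["input_neg"])
verdict = "meets" if ratio >= 10 else "is below"
print(f"enrichment ratio = {ratio:.1f}, which {verdict} the recommended 10")
```

Because the ratio divides one ChIP/input quotient by another, any constant scaling applied to both ChIP measurements or to both input measurements (for example, a dilution factor) cancels out.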

2.  Control

We strongly recommend that you sequence an \textcolor{red}{input} library for each condition. Note that in the figure, the input track has a strong peak at the same place as the ChIP peak. The strength of input peaks, or bias, depends on the state of the cells as well as on the chromatin preparation conditions.

3.  Read Redundancy

Like most Illumina library preps, ChIP-Seq libraries usually require a number of rounds of PCR. PCR creates copies of DNA fragments that are statistically dependent on each other. During data analysis it is important to reduce dependent fragments to a single fragment when possible. This can reduce the read count dramatically for some libraries. However, duplicate fragments can also be produced in ways that are statistically independent, and these can be left in during the analysis if the analysis is able to identify which is which.
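
One simple way to reduce duplicates to a single fragment is to keep one read per alignment position. The sketch below assumes, as a simplification, that reads sharing chromosome, 5' position, and strand are duplicates; the read records and the `collapse_duplicates` name are hypothetical. Note that this rule cannot tell PCR copies from independent fragments that happen to coincide.

```python
from collections import Counter


def collapse_duplicates(reads):
    """Keep the first read seen for each (chrom, pos, strand) key.

    Returns the kept reads and a Counter of how many reads shared each key.
    """
    counts = Counter((r["chrom"], r["pos"], r["strand"]) for r in reads)
    seen = set()
    kept = []
    for r in reads:
        key = (r["chrom"], r["pos"], r["strand"])
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept, counts


# Hypothetical aligned reads, for illustration only.
reads = [
    {"name": "r1", "chrom": "chr1", "pos": 1000, "strand": "+"},
    {"name": "r2", "chrom": "chr1", "pos": 1000, "strand": "+"},
    {"name": "r3", "chrom": "chr1", "pos": 1000, "strand": "-"},
    {"name": "r4", "chrom": "chr1", "pos": 1042, "strand": "+"},
    {"name": "r5", "chrom": "chr1", "pos": 1000, "strand": "+"},
]

kept, counts = collapse_duplicates(reads)
print([r["name"] for r in kept])  # ['r1', 'r3', 'r4']
print(f"{len(reads) - len(kept)} of {len(reads)} reads removed as duplicates")
```

Here r2 and r5 share a position and strand with r1 and are dropped, while r3 survives because it lies on the opposite strand.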

For ChIP-Seq libraries, DNA is usually fragmented by sonication, which produces largely random breakage. In regions with low coverage, duplicated reads are therefore more likely to come from PCR copies. As the coverage gets deeper, the chance that a duplicated read is from an independent fragment increases, though only if the library has sufficient complexity. In enriched areas, i.e., peaks, it is quite possible to see duplicate reads that are independent.
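
The effect of depth can be made concrete with a small calculation. The sketch below idealizes "largely random breakage" as fragments landing uniformly and independently on a fixed number of possible start positions, which is an assumption rather than a measured property; the function name and the window size are hypothetical.

```python
def expected_chance_duplicates(n_fragments, n_positions):
    """Expected number of independent fragments that coincide with another.

    Model: n_fragments fall uniformly at random on n_positions possible
    (position, strand) starts. The expected number of distinct starts used
    is m * (1 - (1 - 1/m)**n); every fragment beyond that is a duplicate
    that arose without any PCR.
    """
    m = float(n_positions)
    expected_distinct = m * (1.0 - (1.0 - 1.0 / m) ** n_fragments)
    return n_fragments - expected_distinct


# Hypothetical window: 100 bp on two strands = 200 possible starts.
window_starts = 200
for depth in (10, 100, 1000):
    dup = expected_chance_duplicates(depth, window_starts)
    print(f"{depth:>5} fragments: {dup / depth:.1%} expected to be chance duplicates")
```

Under this model the share of chance duplicates is small at low depth and grows steadily with it. Once the number of fragments exceeds the number of possible starts, duplicates are unavoidable.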

Since removing duplicated reads can dramatically reduce the effective depth of sequencing, it is important to assess the level of duplication and to take steps to reduce it if necessary and if possible. However, the amount of material available, the size of the target's footprint in the genome, and the effectiveness of the antibody may limit what is possible. The table below covers most of the common scenarios.

| Redundancy in Peaks | Redundancy in Background | Summary | Suggestions |
| no peaks | high | IP is not working, and little background material is available. | IP conditions may be too stringent, or the antibody does not work at all. |
| no peaks | low | IP is not working, but plenty of background material is available. | IP conditions may not be stringent enough, or the antibody does not work at all. |
| low | high | Not possible. | |
| low | low | Plenty of material available prior to PCR. | You can work with the data as is, or consider adjusting IP conditions to be more stringent if enrichment could be stronger. |
| high | high | Little material prior to PCR. | If effective read counts are workable, the data can be left as is. Otherwise consider using more cells or pooling multiple IPs before library prep. |
| high | low | Enrichment should be very strong, so redundancy in peaks is due to crowding (the pigeonhole principle). | Data should be usable as is, but if you are measuring differential binding, you may want to take advantage of redundant reads. This also suggests that you may be able to use less chromatin in the IP. |
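
To place a library in the table, redundancy can be measured separately inside and outside called peaks. The sketch below is one possible way to do so, under stated assumptions: redundancy is taken as the fraction of reads that repeat an earlier (chromosome, position, strand) key, peaks are given as half-open intervals, and all names and data are hypothetical. The table's "high" and "low" labels are qualitative, so no threshold is applied here.

```python
def redundancy(keys):
    """Fraction of reads that duplicate another read: 1 - unique / total."""
    if not keys:
        return 0.0
    return 1.0 - len(set(keys)) / len(keys)


def redundancy_by_region(read_keys, peaks):
    """Return (peak redundancy, background redundancy).

    read_keys: list of (chrom, pos, strand) tuples, one per read.
    peaks: dict mapping chrom to a list of half-open (start, end) intervals.
    """
    in_peak, background = [], []
    for chrom, pos, strand in read_keys:
        hit = any(start <= pos < end for start, end in peaks.get(chrom, []))
        (in_peak if hit else background).append((chrom, pos, strand))
    return redundancy(in_peak), redundancy(background)


# Hypothetical peak calls and read positions, for illustration only.
peaks = {"chr1": [(100, 200)]}
read_keys = [
    ("chr1", 120, "+"), ("chr1", 120, "+"), ("chr1", 120, "+"), ("chr1", 150, "-"),
    ("chr1", 500, "+"), ("chr1", 900, "-"), ("chr2", 120, "+"), ("chr1", 500, "+"),
]

peak_red, bg_red = redundancy_by_region(read_keys, peaks)
print(f"redundancy in peaks: {peak_red:.2f}, in background: {bg_red:.2f}")
```

The linear scan over intervals keeps the example short; a real data set would call for an indexed interval lookup.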