Tide

Upload Data:

Guide sequence: Submit the 20nt guide sequence upstream of the PAM (5'-3')


Parameters

All parameters have default settings but can be adjusted by checking the 'advanced settings' box.

Alignment window (bp): The sequence segment used to align the control, reference and test sample.
left boundary: default is 100
right boundary: automatically set at break site - 10bp

Decomposition window (bp): The sequence segment used for decomposition.
Default is 20bp before the break site to 80bp after the break site.

Distance upstream of break site (bp): Maximum number of base pairs before the break site allowed in the decomposition window.

Indel size range
Insertions: Maximum size of insertions is fixed at 5.
Deletions: Maximum size of deletions modeled in the decomposition.


Created by Bas van Steensel lab
Hosted by NKI
TIDER version 0.0.0
TIDER supports Firefox6, Chrome4, Safari6, IE10 or higher

TIDER: Easy quantification of template-directed CRISPR/Cas9 editing


  • What it does: Estimates the frequency of designed (templated) small mutations in a pool of cells transfected with [Cas9 + sgRNA + template oligonucleotide]. It also determines the frequency of non-templated indels.
  • When to use: Quantification of oligonucleotide-templated point mutations and small indels. For non-templated CRISPR/Cas9, use the original TIDE web tool.
  • Drawbacks: Requires some additional wet-lab work to generate a template reference sequence (see Protocol tab).
  • Reference (please cite!): Brinkman et al, ####.

Instructions

1. Upload Data:

  • Enter a 20nt (5'-3') DNA character string representing the sgRNA guide sequence immediately upstream of the PAM sequence (PAM not included). Numbers and other invalid (non-IUPAC) DNA characters will be automatically removed. TIDER assumes that a dsDNA break is induced between nucleotides 17 and 18 in this sequence.
  • Upload the chromatogram sequence files of respectively the control sample (e.g. transfected without Cas9 or without the sgRNA) and the test sample (e.g. cells treated with both Cas9 and the sgRNA).
    We advise to sequence a stretch of DNA ~700bp enclosing the designed editing site. The projected break site should be located preferably ~200bp downstream from the sequencing start site. This region upstream of the break site is used to align the sequencing data of the test sample with that of the control sample.
  • Upload the chromatogram file of the reference. This sample should have the same nucleotide sequence as introduced by the HDR template (i.e. the desired editing outcome). The reference DNA can be generated with a 2-step PCR protocol (see Protocol tab), or synthesized de novo.

Currently, ABIF (.ab1) and SCF (.scf) files are supported. SCF is an open standard and several tools exist to convert other formats to SCF files.
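The guide handling described above can be sketched in a few lines of Python. This is only an illustration, not the TIDER source code: the function names `clean_guide` and `break_site` are hypothetical, the search is limited to the forward strand, and ambiguity codes are matched literally.

```python
import re

# IUPAC DNA codes (assumption: the full ambiguity alphabet is accepted).
IUPAC_DNA = "ACGTRYSWKMBDHVN"


def clean_guide(raw: str) -> str:
    """Uppercase the input and drop every character that is not an IUPAC DNA code."""
    return re.sub(f"[^{IUPAC_DNA}]", "", raw.upper())


def break_site(control_seq: str, guide: str) -> int:
    """Return the 0-based index in control_seq of the first base after the break.

    The break is assumed to lie between guide nucleotides 17 and 18,
    so the offset from the start of the guide is 17.
    """
    if len(guide) != 20:
        raise ValueError("guide must be 20 nt after cleaning")
    start = control_seq.upper().find(guide)
    if start < 0:
        raise ValueError("guide not found in control sequence")
    return start + 17


# Made-up example sequence: digits and spaces are stripped from the guide.
guide = clean_guide("1 gacc ccct ccac cccg cctc 21")
control = "TTTT" + guide + "CGGAAAA"
print(guide, break_site(control, guide))
```

In this made-up example the guide starts at index 4 of the control sequence, so the expected break lies just before index 21.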


2. Enter Parameters for Analysis:

The following parameters have default settings but can be adjusted if necessary by checking the 'advanced settings' box.

Alignment window:

The window used to align control and test sequences to determine any offset between the two reads. Default settings are recommended, except when long repetitive sequences are present.

left boundary: Default is 100, because the beginning of a Sanger sequence trace is often of poor quality.
right boundary: This is automatically set to break site minus 10bp

Decomposition window:

The sequence segment used for decomposition. Default is a 100bp window that starts 20bp upstream of the break site. TIDER performance may improve with smaller window sizes if the designed mutations are subtle (e.g. one or two single-nucleotide substitutions). The window may also be adjusted if part of the sequence read is of low quality or contains repetitive sequences. Where possible, invalid values are corrected automatically.

left boundary: 20bp upstream of the break site.
right boundary: left boundary + 100bp
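As a rough illustration of how these defaults relate to the break site, the sketch below computes both windows for a given break position and read length. The function name `default_windows` and the two sanity checks are assumptions made for this example; the checks that TIDER itself performs are not described here. The decomposition window is taken to start 20bp upstream of the break site, as stated in the description of the default window.

```python
def default_windows(break_site: int, read_length: int) -> dict:
    """Compute default alignment and decomposition windows (positions in bp)."""
    # Alignment window: left boundary 100, right boundary break site - 10.
    alignment = (100, break_site - 10)
    # Decomposition window: starts 20 bp upstream of the break, spans 100 bp.
    decomp_left = break_site - 20
    decomposition = (decomp_left, decomp_left + 100)

    problems = []  # hypothetical checks, for illustration only
    if alignment[0] >= alignment[1]:
        problems.append("break site too close to read start for the default alignment window")
    if decomposition[1] > read_length:
        problems.append("decomposition window runs past the end of the read")
    return {"alignment": alignment, "decomposition": decomposition, "problems": problems}


print(default_windows(200, 700))  # break site ~200 bp into a ~700 bp read
print(default_windows(90, 300))   # short read: alignment window must be adjusted
```

With the advised layout (break site ~200bp into a ~700bp read) both default windows fit; with a break site at position 90 the default alignment window is empty, which is the situation in which the left boundary has to be lowered.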

Distance upstream of break site

Maximum number of base pairs upstream of the break site to be considered in the decomposition window. Default is 20.

Indel size range

Maximum size of non-templated deletions to be modeled. Default is 10.
The maximum size of non-templated insertions is fixed to 5.

3. Results:

Once the data are uploaded and parameters are set, submit the data by clicking on the "update view" button. After 10-30 seconds the plots will appear in the "Decomposition" tab. If the settings are incorrect or too stringent, warnings or error messages will be displayed.

Quality measures: Results depend on the quality of the sequence reads. As a rule of thumb, we recommend aiming for an average aberrant sequence signal strength before the break site of < 10% (in both control and test sample), and R2 > 0.9 for the decomposition result.
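The rule of thumb above can be written down as a simple check. This is a sketch only: the function name `quality_warnings` and the message texts are made up for illustration and do not reproduce the warnings shown by the web tool.

```python
def quality_warnings(aberrant_control_pct: float,
                     aberrant_test_pct: float,
                     r_squared: float) -> list:
    """Return a list of warnings based on the rule-of-thumb thresholds."""
    warnings = []
    # Average aberrant signal before the break site should be below 10%.
    if aberrant_control_pct >= 10:
        warnings.append("control: aberrant signal before break site is 10% or higher")
    if aberrant_test_pct >= 10:
        warnings.append("test sample: aberrant signal before break site is 10% or higher")
    # Goodness of fit of the decomposition should exceed 0.9.
    if r_squared <= 0.9:
        warnings.append("R2 of the decomposition is 0.9 or lower")
    return warnings


print(quality_warnings(3.2, 4.8, 0.96))   # no warnings expected
print(quality_warnings(3.2, 14.0, 0.82))  # two warnings expected
```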

License and privacy

This webtool and the associated R code are open source software under GNU General Public License version 3. Your uploaded data are only used for the duration of the analysis session and are not stored or used for any other purpose.

Code

R code of the TIDER algorithm will be made available once this work is published in a scientific journal (manuscript submitted).

Contact

This web tool was developed by Eva Brinkman, Christ Leemans and Bas van Steensel. William Peters has assisted with setting up the online web tool. For more information and to report bugs, please contact Bas van Steensel

Acknowledgements

R

R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. www.R-project.org . R version 3.1.1.

Biostrings

H. Pages, P. Aboyoun, R. Gentleman and S. DebRoy. Biostrings: String objects representing biological sequences, and matching algorithms. R package version 2.32.1.

sangerseqR

J.T. Hill and B. Demarest (2014). sangerseqR: Tools for Sanger Sequencing Data in R. R package version 1.3.1. http://www.bioconductor.org/packages/devel/bioc/html/sangerseqR.html

nnls

K. M. Mullen and I. H. M. van Stokkum. The Lawson-Hanson algorithm for non-negative least squares (NNLS). R package version 1.4.

msa

E. Bonatesta, C. Horejs-Kainrath, and U. Bodenhofer. Multiple Sequence Alignment. R package version 1.6.0.

plyr

H. Wickham. Tools for Splitting, Applying and Combining Data. R package version 1.8.4.

shiny

RStudio and Inc. (2013). shiny: Web Application Framework for R. R package version 1.0.0. http://shiny.rstudio.com











Alignment between control, reference and test sample before the break site


                

Alignment between guide, control and reference around the break site


                

Indel Spectrum

Quality control - Aberrant sequence signal

Quality control - Designed sequence signal

Checklist quality control

Check the following criteria to determine the quality of your data (use the quality plots)

Alignments

  • Is there an agreement between control, reference and test sample before the break site?
  • Does the reference show the expected (point) mutations around the break site?

Aberrant sequence signal (quality plot1)

  • Is there a considerable divergent signal between control and test sample after the breaksite?
  • Sequences of good quality show:
    • in the control sample (black) a low and equally distributed aberrant sequence signal
    • in the test sample (green) a low signal before the breaksite and a higher signal downstream of the breaksite
  • Is the break site at the expected location?
    The aberrant sequence signal should increase around the expected cut site (blue dotted line)
  • Does the decomposition window cover a representative sequence?
    For optimal decomposition, adjust the window boundaries when the sequence trace is locally of poor quality

Designed sequence signal - reference (quality plot2)

  • Is there a divergent signal between control and reference at the site of the changed nucleotides?
  • Sequences of good quality show:
    • in the reference (red) a higher signal at the nucleotides that are introduced with the donor template
  • Does the decomposition window cover a representative sequence?
    For optimal decomposition, adjust the window boundaries when the sequence trace is locally of poor quality.
    Aim for a clear difference between control and reference, keeping the following rules in mind:
    • In case of few differences between wild type and donor template (e.g. 1-10 bp substitutions): small window (~100bp) enclosing the designed changed nucleotides
    • In case of big differences between wild type and donor template (e.g. >10 bp substitution): maximal window

Designed sequence signal - test sample (quality plot3)

  • Is there a divergent signal between control and test sample at the site of the nucleotides that are changed in the reference sample?
  • Sequences of good quality show:
    • in the control sample (black) a low and equally distributed aberrant sequence signal
    • in the test sample (green) a higher signal at the nucleotides that are introduced with donor template
If the plot does not meet these criteria, visit the troubleshooting page to see how you can improve your results.

Quantification Indel Frequencies



Protocol

Overview

For TIDER, 3 PCR amplicons (all from the same primer set) are needed:

  1. Control (DNA from wild-type cells, e.g. control cells transfected without Cas9 or sgRNA)
  2. Reference (DNA carrying the designed mutations as in the donor oligo template)
  3. Test sample (DNA from a pool of cells treated with Cas9, sgRNA and donor template)



Generate control (1) & test sample (3) DNA

  • Amplify a fragment enclosing the designed editing site by standard PCR. We advise amplifying a stretch of ~700bp. The projected break site should preferably be located ~200bp downstream from the start site.

  • PCR mix
    Amount     Component
    21-x µL    H2O
    2 µL       primer a (10 µM stock)
    2 µL       primer b (10 µM stock)
    x µL       genomic DNA (~50ng)
    25 µL      2x pre-mix of buffer, Taq polymerase and dNTPs (e.g. BioLine MyTaq, BIO-25044)

    PCR program:
    Step                   Temperature   Time (min:sec)   Number of cycles
    Initial denaturation   95 °C         1:00             1
    Denaturation           95 °C         0:15             25
    Annealing              58 °C         0:15             25
    Extension              72 °C         0:10             25
    Hold                   4 °C          hold
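The volumes in the PCR mix depend on the concentration of the genomic DNA. The sketch below works out x and the amount of water for a 50 µL reaction, assuming ~50ng of genomic DNA as in the table; the function name `pcr_mix` and the example concentration are illustrative only.

```python
def pcr_mix(dna_conc_ng_per_ul: float, dna_ng: float = 50.0) -> dict:
    """Return the volumes (in µL) for one 50 µL PCR reaction."""
    x = dna_ng / dna_conc_ng_per_ul  # volume of genomic DNA
    water = 21.0 - x                 # water fills the reaction up to 50 µL
    if water < 0:
        raise ValueError("DNA too dilute: more than 21 µL would be needed")
    return {
        "H2O": water,
        "primer a (10 µM)": 2.0,
        "primer b (10 µM)": 2.0,
        "genomic DNA": x,
        "2x pre-mix": 25.0,
    }


mix = pcr_mix(25.0)  # example: genomic DNA at 25 ng/µL
for component, volume in mix.items():
    print(f"{volume:5.1f} µL  {component}")
print("total:", sum(mix.values()), "µL")
```

For genomic DNA at 25 ng/µL this gives 2 µL of DNA and 19 µL of water, for a total of 50 µL.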


Generate reference (2) DNA

  • Design two overlapping primers (primer c & d) that include the desired mutation(s) (see figure)
  • Run two PCR reactions that produce two halves of the reference

  • PCR mix1
    Amount     Component
    21-x µL    H2O
    2 µL       primer a (10 µM stock)
    2 µL       primer c (10 µM stock)
    x µL       genomic DNA (~50ng)
    25 µL      2x pre-mix of buffer, Taq polymerase and dNTPs (e.g. BioLine MyTaq, BIO-25044)

    PCR mix2
    Amount     Component
    21-x µL    H2O
    2 µL       primer d (10 µM stock)
    2 µL       primer b (10 µM stock)
    x µL       genomic DNA (~50ng)
    25 µL      2x pre-mix of buffer, Taq polymerase and dNTPs (e.g. BioLine MyTaq, BIO-25044)

    PCR program:
    Step                   Temperature   Time (min:sec)   Number of cycles
    Initial denaturation   95 °C         1:00             1
    Denaturation           95 °C         0:15             25
    Annealing              58 °C         0:15             25
    Extension              72 °C         0:10             25
    Hold                   4 °C          hold

  • Check samples on 1% agarose gel
  • Purify PCR products
  • Anneal the two PCR products

  • Annealing reaction
    Amount     Component
    48 µL      annealing buffer (10 mM Tris, 50 mM NaCl, 1 mM EDTA)
    1 µL       PCR mix1
    1 µL       PCR mix2

  • Incubate for 1 minute at 95 °C, then cool down to 20 °C (0.1 °C/sec)
  • Extend the annealed products and amplify the joined product

  • PCR mix
    Amount     Component
    18 µL      H2O
    2 µL       primer a (10 µM stock)
    2 µL       primer b (10 µM stock)
    3 µL       annealed oligo mix
    25 µL      2x pre-mix of buffer, Taq polymerase and dNTPs (e.g. BioLine MyTaq, BIO-25044)

    PCR program:
    Step                   Temperature   Time (min:sec)   Number of cycles
    Extension              72 °C         0:15             1
    Denaturation           95 °C         0:15             25
    Annealing              58 °C         0:15             25
    Extension              72 °C         0:10             25
    Hold                   4 °C          hold

  • Check on 1% agarose gel
  • Purify PCR product

Sanger sequencing

We strongly recommend that all three PCR products (control, reference and experimental sample(s)) are sequenced in parallel. Either primer a or b may be used. Sequence trace files must be saved in .ab1 or .scf format.

FAQ


What is the minimally required sequence length?

The sequence length requirements are flexible. The region upstream of the break site is used to align the sequencing traces. The region from -20 to +80 relative to the break site (the decomposition window) is used for the actual calculations, but can be shortened or extended a bit using the Advanced settings. We advise sequencing a stretch of DNA of ~700bp enclosing the designed editing site. The projected break site should preferably be located ~200bp downstream from the sequencing start site. The designed mutations should be within 20 bp of the break site. Note that with sequences shorter than 700 bp, the break site is often too close to the start of the sequence read for the default settings (see figure). The alignment window can be changed under Advanced settings.
Figure: Short sequence

What does R2 mean?

R2 is a measure of the reliability of the estimated values. For example, an R2 value of 0.95 means that 95% of the variance can be explained by the model; the remaining 5% consists of random noise, very large indels, non-templated point mutations, and possibly more complex mutations. An R2 < 0.9 usually means that the quality of the sequence reads is low, which compromises the accuracy of the TIDER estimates.
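To make the "fraction of variance explained" concrete, the sketch below computes a coefficient of determination in its textbook form, 1 - SS_res / SS_tot, from observed and fitted values. How TIDER computes R2 internally is not described here, so this is an assumption, and the numbers are invented for illustration.

```python
def r_squared(observed, fitted):
    """Textbook coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    # Total sum of squares: variance of the observed signal around its mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]  # invented signal values
fitted = [1.1, 1.9, 3.2, 3.8]    # invented model predictions
print(round(r_squared(observed, fitted), 3))
```

Here the residual sum of squares is 0.10 and the total sum of squares is 5.0, so 98% of the variance is explained by the fit.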

How is the overall efficiency calculated?

The overall efficiency refers to the estimated total fraction of DNA with mutations (templated mutations plus non-templated indels) around the break site. It is calculated as R2 (expressed as a percentage) minus the percentage of wild-type sequence.
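As a worked example of this calculation, assuming R2 is given as a fraction between 0 and 1 and the wild-type estimate as a percentage (the function name `overall_efficiency` and the numbers are illustrative):

```python
def overall_efficiency(r_squared: float, wildtype_pct: float) -> float:
    """Overall efficiency (%) = R2 expressed as a percentage - % wild type."""
    return r_squared * 100.0 - wildtype_pct


# Invented example: R2 = 0.95 and 60% of the signal assigned to wild type.
print(overall_efficiency(0.95, 60.0))
```

With R2 = 0.95 and 60% wild type, the estimated fraction of mutated DNA is about 35%; the 5% of the signal that the model cannot explain is not counted as edited.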

What do the indel and HDR values indicate when a cell pool is sequenced?

The different bars represent the different insertions, deletions and templated mutations in the population. For example, if the estimated HDR fraction is 20%, then 20% of the DNA molecules in the cell pool are predicted to carry the designed mutation. You cannot tell which specific mutation each allele of an individual cell carries. To obtain allele-specific information you have to isolate a cell clone and perform TIDER analysis on it.

What do the indel and HDR values indicate when a cell clone is sequenced?

The different bars represent the different insertions, deletions or homology-directed repair (HDR) mutations in the alleles of a cell clone. With a diploid cell you should get a percentage of ~50% per mutation.

Can we get the precise sequence of the indels?

TIDER can determine how large the non-templated indels are, but not accurately which nucleotides are inserted or deleted. To know the precise sequence of the non-templated mutations you can use next generation sequencing or Sanger sequencing of individual cloned DNA molecules.

Can TIDER discriminate between templated and non-templated indels?

In general, TIDER is able to discriminate naturally occurring deletions and insertions from templated ("designed") indels. Only in the presence of a small designed deletion (-1, -2) near the expected break site may the designed mutation be underestimated somewhat. Also, in case the designed mutation consists of an insertion larger than +1, TIDER does not consider natural insertions of the same size, because we found the decomposition to become less robust, and because we and others have rarely observed natural insertions larger than +1.

Can/should I sequence both strands?

We recommend that results are verified by sequencing the opposite strand. Note that designed mutations located >20 bp away from the break site may confound TIDER estimates when such distal mutations are combined with mutations close to the break site. It has been reported that the incorporation of donor template is less efficient when the designed point mutations are further away from the break site. By comparing different settings for the decomposition window and by visual inspection of the TIDER plots it is possible to infer such biases.

Can TIDER be used for other nucleases (e.g. TALEN, ZFN, other RNA CRISPR nucleases with different PAM)?

TIDER is currently only designed for regular Cas9, but it can be tricked into analyzing data from another nuclease, provided that nuclease creates a blunt cut at a precisely predictable location. TIDER assumes that the dsDNA break is induced between nucleotides 17 and 18 in the sgRNA sequence. If you do not know the exact cutting position, then TIDER results are not reliable. In the future we hope to include functionality for multiple nuclease types.

Troubleshooting

TIDER web page turns grey

If the page turns grey, there may be one of two problems. (1) You are using an incompatible web browser. For a list of compatible browsers, please check at the bottom left of the TIDER page. (2) The firewall of your institute does not allow WebSocket connections. This is essential for TIDER. Before you contact us, please first try to access TIDER from a different location outside your institute, and talk to your systems administrator. If you continue to experience problems please contact us and let us know the nature of the problem and the exact date and time when you tried to access TIDER. Your feedback is helpful for us to improve this webtool.

TIDER web tool does not respond when uploading .ab1 files

Unfortunately, some .ab1 sequence files do not adhere to the official format specifications, and may therefore not be compatible with the TIDER webtool. Various software programs that process the raw sequence data can cause this problem. We recommend that you export the data as a .scf file and then upload it to TIDER. The .ab1 format can also be converted to .scf using 4Peaks (Mac) or FinchTV (Windows & Mac).

Wrongly annotated nucleotides

Sometimes the quality of the peaks in the chromatogram looks fine, but the file contains nucleotides that were wrongly left unannotated or annotated nucleotides that were wrongly inserted. These will interfere with the mutation spectra (see figure wrongly unannotated nucleotide). TIDER gives a warning when the spacing between the nucleotides in the chromatogram of the sequence trace is not consistent, which is often an indication of such annotation errors. In that case the sequence file cannot be used for a reliable TIDER analysis. If possible, try to set the right boundary of the decomposition window lower. If the warning persists, carefully inspect your chromatogram.
Figure: Wrongly unannotated nucleotide(s)
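The idea of checking the spacing between called nucleotides can be illustrated as follows. The criterion used by TIDER is not described here; this sketch assumes a simple rule (a gap that deviates from the median gap by more than a chosen fraction), and the function name `irregular_spacing`, the tolerance and the peak positions are all hypothetical.

```python
from statistics import median


def irregular_spacing(peak_positions, tolerance=0.5):
    """Return indices i where the gap between peak i and peak i+1 deviates
    from the median gap by more than tolerance * median gap."""
    gaps = [b - a for a, b in zip(peak_positions, peak_positions[1:])]
    typical = median(gaps)
    return [i for i, g in enumerate(gaps) if abs(g - typical) > tolerance * typical]


# Invented peak positions (in trace data points); one base call is
# missing between the 4th and 5th peak, doubling the gap there.
peaks = [10, 22, 34, 46, 70, 82, 94]
print(irregular_spacing(peaks))
```

A gap roughly twice the typical spacing suggests a nucleotide that was left unannotated, while a much smaller gap suggests a wrongly inserted one.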

Low R2 value

A low R2 can be caused by sub-optimal settings or by poor sequence quality.

Settings
By default, the size of the decomposition window is 100bp and the indel size range is set to 10. The settings can be adjusted in advanced settings. Possible issues and solutions:

  • Large indels are present in the sample. By default the decomposition is calculated with a maximum indel size of 10. When larger indels are present, they are not modeled, which will result in a low R2. Try to increase the indel size range to test if this improves the fit (see figure Indel size range).
  • Poor local quality of the sequence trace. Often the end of the sequence is of low quality. This can be observed in the quality plot that shows a high aberrant sequence signal at the end of the sequence trace (see figure Poor quality sequence end). Adjust the boundaries of the decomposition window in such a way that it does not overlap with the region of low quality.
  • Repetitive regions in the sequence trace. These regions can be observed in the quality plot as a sudden stretch without aberrant nucleotides (see figure Repetitive region). This region might interfere with the decomposition of the sequence trace. Adjust the boundaries of the decomposition window in order to exclude the repetitive sequence.

Poor sequence quality
Poor sequence quality can be observed in the chromatogram (see figure Poor sequence quality). This results in a lower R2. Check the purity of the PCR products and the quality of your sequencing reagents. Repeat the sequencing, possibly with a different primer.

Figure: Indel size range
Figure: Poor quality sequence end
Figure: Repetitive region
Figure: Poor sequence quality

HDR events are detected in absence of a donor template in the editing experiment

When the peak heights of the control and reference chromatogram are very different, it might happen that background signals are estimated as HDR events. Make sure that all three sequencing traces are generated in parallel.

No sgRNA match because there is a mismatch in the control sequence

Sometimes a mismatch occurs in the control sequence at the location of the sgRNA. This will stop the TIDER analysis. In this case, edit the nucleotides in the chromatogram file so that they are identical (in IUPAC notation) to the expected control sequence.

Error: boundaries of decomposition window are not acceptable

This error message can occur when the settings are not optimal or when the break site is too close to the sequence start or end. If possible, try to set the decomposition window boundaries further apart, use smaller indel size limits, or lower the alignment window. If that does not help, you might have to re-sequence to perform the TIDER analysis. We advise sequencing a stretch of DNA of ~700bp enclosing the designed editing site. The projected break site should preferably be located ~200bp downstream from the sequencing start site.

Poor alignment

When the beginning of the sequence is of poor quality, the alignment function can make a mistake. This can be observed in the quality plot, which then shows a high aberrant sequence signal over the whole length of the sequence trace (see figure). The aberrant sequence signal should only increase around the expected cut site (blue dotted line). In case of poor alignment, try to shift the start of the alignment window (Advanced settings).
Figure: Poor alignment