Added documentation for #320.

Markus Riester authored on 23/09/2023 22:20:04
Showing 1 changed file

@@ -207,7 +207,10 @@ Important recommendations:
   to presence in germline databases. _Mutect 1.1.7_ automatically calls SNPs,
   but _Mutect 2_ does not. Make sure to run _Mutect 2_ with
   `--genotype-germline-sites true --genotype-pon-sites true`. You will not get
-  usuable output without those flags.
+  usuable output without those flags. Since _Mutect 2_ from _GATK 4.2.0+_,
+  average base quality scores can be very low and variants will be too
+  aggressively removed by _PureCN_.  You will need to set `--min-base-quality
+  20` in _PureCN.R_ to keep them.
 
 - Run the variant caller with a 50-75 base pair interval padding to increase
   the number of heterozygous SNPs (for example `--interval_padding` and
@@ -524,7 +527,9 @@ Important recommendations:
       with 50-100bp interval padding or no interval file at all. Also check
       that the interval file was generated using the baits coordinates, not the
       targets (the baits BED file should have a more even size distribution,
-      e.g. 120bp and multiples of it).
+      e.g. 120bp and multiples of it). If many variants are removed by the
+      default 25 base quality feature, you might be using _Mutect 2_ and need
+      to re-run _PureCN.R_ with `--min-base-quality 20`.
     
     - "Initial testing for significant sample cross-contamination" in the log
       file should not have many false positives, i.e. should be "unlikely" for