Nanopore - practical
- de.NBI nanopore training course
- a few example workflows
- in general, always check --help for each program
0. Base calling
- using Albacore
- Comparison of Basecaller here
# part of my script so variables are assigned to the options
read_fast5_basecaller.py -i $currDir/$i -f $flowcell -t $CPU -q 100000 -o fastq -k $kittype -r -s $currDir/FASTQ/
1. QC of reads
- Using the assembly-stats to get a summary of the reads:
assembly-stats ERR1147227.fastq
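As a cross-check of the assembly-stats summary, the main numbers (read count, total bases, mean length, N50) can be recomputed with standard tools. This is only a minimal sketch: fastq_summary is a hypothetical helper name, and it assumes an uncompressed FASTQ with exactly four lines per read.

```shell
# Sketch: read count, total bases, mean length and N50 of a FASTQ file.
# fastq_summary is a hypothetical helper, not part of assembly-stats.
# Assumption: uncompressed FASTQ with exactly four lines per read.
fastq_summary() {
    # Line 2 of every 4-line record holds the sequence; print its length.
    awk 'NR % 4 == 2 { print length($0) }' "$1" |
        sort -rn |
        awk '{ len[NR] = $1; total += $1 }
             END {
                 if (NR == 0) { print "reads=0 total=0 mean=0 N50=0"; exit }
                 # N50: walk from the longest read down until at least
                 # half of all bases are covered.
                 for (i = 1; i <= NR; i++) {
                     running += len[i]
                     if (running * 2 >= total) { n50 = len[i]; break }
                 }
                 printf "reads=%d total=%d mean=%.1f N50=%d\n", NR, total, total / NR, n50
             }'
}

# Usage on the example file from above (skipped if the file is absent).
if [ -f ERR1147227.fastq ]; then
    fastq_summary ERR1147227.fastq
fi
```

The numbers should agree with the corresponding fields of the assembly-stats output; a mismatch usually points to a multi-line or truncated FASTQ.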
2. Adapter removal & demultiplexing
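- with porechop
- porechop needs separate outputs depending on whether barcodes were present or not: -b <output_folder> writes one file per barcode, -o <output>.fastq writes a single trimmed file
porechop -i <input>.fastq -b <output_folder>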
- You can redo assembly-stats to validate
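For a demultiplexed run, a quick look at how many reads ended up in each output file is a useful complement to assembly-stats. The following is a rough sketch: count_reads_per_file is a hypothetical helper, and it assumes the folder given to -b contains uncompressed .fastq files with four lines per read (the actual file names depend on the barcodes porechop found).

```shell
# Sketch: count reads in every FASTQ file of a porechop output folder.
# count_reads_per_file is a hypothetical helper name.
# Assumption: uncompressed FASTQ, exactly four lines per read.
count_reads_per_file() {
    for f in "$1"/*.fastq; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        # Number of lines divided by 4 gives the number of reads.
        printf '%s\t%d\n' "$(basename "$f")" "$(( $(wc -l < "$f") / 4 ))"
    done
}

# Usage: pass the folder that was given to porechop -b, for example
#   count_reads_per_file <output_folder>
```

A file with very few reads compared with the others is often a wrongly assigned barcode rather than a real sample.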
3. Assembly
- with minimap2 and miniasm
- use canu for de novo assemblies
- minimap2 creates a file that miniasm needs for assembly
- minimap2 creates a first draft map. However, it is not a good MinION-only assembler; we use it here to combine it with Illumina data
minimap2 -x ava-ont ERR1147227_trimmed.fastq ERR1147227_trimmed.fastq | gzip -1 > ERR1147227.paf.gz
miniasm -f ERR1147227_trimmed.fastq ERR1147227.paf.gz > ERR1147227.gfa
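miniasm writes its result as a GFA graph, while the polishing step further down expects ERR1147227.fasta. One common way to bridge this is to pull the sequence lines out of the GFA with awk. This is a sketch, not necessarily the conversion used in the course; gfa_to_fasta is a hypothetical helper name.

```shell
# Sketch: extract the assembled sequences from a GFA file as FASTA.
# gfa_to_fasta is a hypothetical helper name.
gfa_to_fasta() {
    # Segment lines are tab-separated: S <name> <sequence> [tags...]
    awk -F'\t' '$1 == "S" { print ">" $2; print $3 }' "$1"
}

# Usage on the miniasm output from above (skipped if the file is absent).
if [ -f ERR1147227.gfa ]; then
    gfa_to_fasta ERR1147227.gfa > ERR1147227.fasta
fi
```

Only the S (segment) lines carry sequence; link and read-layout lines in the GFA are ignored.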
canu example
canu -d run_e.coli -p e.coli genomeSize=6m -nanopore-raw 7718_E.coli_sum.fastq
4. Polishing
- We do: index build, mapping reads to assembly, piping to samtools for bam conversion, sorting, indexing
- -p4 for 4 threads (CPU), as bowtie2 takes time
bowtie2-build ERR1147227.fasta ERR1147227
bowtie2 -p4 -x ERR1147227 -1 ecoli_hiseq_R1.fastq.gz -2 ecoli_hiseq_R2.fastq.gz | samtools view -bS -o ERR1147227.bam -
samtools sort ERR1147227.bam -o ERR1147227.sorted.bam
samtools index ERR1147227.sorted.bam
- Now we can do the actual polishing with pilon
pilon --genome ERR1147227.fasta --frags ERR1147227.sorted.bam --output ERR1147227_improved