Figure: CRISPRs
- make plots/table to represent the CRISPR spacer identification
P.S. While the files are ready, I still have to create the snakefiles. Will do later today and let you know!
Activity
- Susheel Busi changed milestone to %Manuscript - initial version
- Susheel Busi added Figure label
- Susheel Busi mentioned in issue #24 (closed)
- Author Maintainer
- output files after this commit (e98663ed) will be under results/analysis/crispr/{folder}
  - folder == ["casc", "minced"]
- Maintainer
Notes
Files to parse
- casc: <tool>.results.txt
- minced: <tool>.txt
Information
- contig ID
- array start/end
Plot
- intersection plot: UpSetR, contig IDs only
- overlap plot: array overlap per contig
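As a rough illustration of the two planned comparisons, the sketch below computes the contig-ID intersection between the two tools and the overlapping array regions per contig. It assumes the tool outputs have already been parsed into (contig ID, array start, array end) tuples; the example records, the function names and the inclusive-coordinate convention are all assumptions, not taken from the actual scripts.

```python
from collections import defaultdict

# Hypothetical, already-parsed arrays: (contig ID, array start, array end).
# The values are made up for illustration only.
casc_arrays = [("contig_1", 100, 400), ("contig_2", 50, 300)]
minced_arrays = [("contig_1", 350, 700), ("contig_3", 10, 200)]


def contig_sets(arrays_by_tool):
    """Map each tool to the set of contig IDs on which it reported an array."""
    return {tool: {contig for contig, _, _ in arrays}
            for tool, arrays in arrays_by_tool.items()}


def overlaps_per_contig(arrays_a, arrays_b):
    """Return, per contig, the overlapping parts of arrays from two tools.

    Coordinates are treated as inclusive (an assumption).
    """
    by_contig = defaultdict(list)
    for contig, start, end in arrays_b:
        by_contig[contig].append((start, end))

    overlaps = defaultdict(list)
    for contig, start, end in arrays_a:
        for other_start, other_end in by_contig.get(contig, []):
            lo, hi = max(start, other_start), min(end, other_end)
            if lo <= hi:  # the two arrays share at least one position
                overlaps[contig].append((lo, hi))
    return dict(overlaps)


sets = contig_sets({"casc": casc_arrays, "minced": minced_arrays})
shared = sets["casc"] & sets["minced"]   # contigs reported by both tools
overlap = overlaps_per_contig(casc_arrays, minced_arrays)
print(sorted(shared))
print(overlap)
```

The per-tool contig sets are the kind of input an UpSetR-style intersection plot would need, while the overlap dictionary could feed a per-contig overlap plot.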
- Valentina Galata mentioned in commit 629dc3fc
- Valentina Galata mentioned in commit d75c0381
- Maintainer
First attempt: fig_crispr.pdf
I am not happy with these plots...
Problems:
- An intersection plot over the contig IDs does not tell much
  - Also, saving multiple UpSetR plots is problematic for some reason
- Could not find a good way to plot CRISPR array coordinates
  - Too many contigs to plot them individually
  - Plotting contigs together requires a large coordinate range s.t. the arrays cannot be seen properly
Any suggestions are welcome. :)
- Author Maintainer
- maybe do a simple facet? the overlap is a great idea but not critical.
- Maintainer
What do you mean by "a simple facet"? To plot what?
- Author Maintainer
- facet the tools.. keep them separate..
- within each tool, compare the different assembly methods ?!
- Maintainer
I mean what exactly should I compare?
Currently, I only look at the overlap between minced and crisp based on contig IDs. I cannot compare between assemblies as the contig IDs are different.
- Author Maintainer
My bad.. I should have explained better.
- we simply report the number of spacers recovered, i.e. something like this:
  less minced_CRISPR_output.txt
  flye 44
  megahit 40
  metaspades_hybrid 169
  metaspades 192
- but we normalise that to total number of contigs/total length of sequences?!
- could be a barplot, or a mere table even.
- And we do this separately for CASC and minCED
- Since the overlap is a pain anyways to represent graphically
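A minimal sketch of the normalisation suggested in this comment. The spacer counts are the ones from the example listing; the assembly statistics (number of contigs, total length) are placeholder values and would have to come from the real assemblies. The scaling factors (per 1,000 contigs, per Mbp) and all names are assumptions chosen for readability.

```python
# Spacer counts per assembly, taken from the example listing in this comment.
spacer_counts = {
    "flye": 44,
    "megahit": 40,
    "metaspades_hybrid": 169,
    "metaspades": 192,
}

# Placeholder assembly statistics (NOT real values):
# assembly -> (number of contigs, total sequence length in bp).
assembly_stats = {
    "flye": (2_000, 40_000_000),
    "megahit": (50_000, 80_000_000),
    "metaspades_hybrid": (60_000, 100_000_000),
    "metaspades": (80_000, 120_000_000),
}


def normalise(counts, stats):
    """Scale spacer counts by the number of contigs and by total length."""
    rows = {}
    for asm, n_spacers in counts.items():
        n_contigs, total_bp = stats[asm]
        rows[asm] = {
            "spacers": n_spacers,
            "per_1k_contigs": 1_000 * n_spacers / n_contigs,
            "per_mbp": 1_000_000 * n_spacers / total_bp,
        }
    return rows


table = normalise(spacer_counts, assembly_stats)
print("assembly\tspacers\tper_1k_contigs\tper_mbp")
for asm, row in table.items():
    print(f"{asm}\t{row['spacers']}\t{row['per_1k_contigs']:.2f}\t{row['per_mbp']:.2f}")
```

Running the same function once per CRISPR tool would give one table (or one barplot panel) each for CASC and minCED.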
Edited by Susheel Busi
- Valentina Galata changed title from figure-request-5: CRISPRs to Figure: CRISPRs
- Valentina Galata mentioned in commit 1990f776
- Valentina Galata mentioned in commit 25b450c3
- Maintainer
Plot update: fig_crispr.pdf
Notes about how the data is collected:
- CasC: reading in {asm_tool}.results.txt
- MinCED: parsing {asm_tool}.txt
  - for each sequence and each found CRISPR array: extract the spacer sequences
  - number of spacers: number of unique spacer sequences
The resulting summary contains, for each CRISPR tool and assembly, the CRISPR arrays found, with sequence (contig) ID, array start/end and the number of spacers.
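The summary described here could be assembled roughly as follows. This is only a sketch: it assumes the tool outputs have already been parsed into one record per CRISPR array, it counts unique spacer sequences per array, and the record type, field names and example values are hypothetical.

```python
from typing import List, NamedTuple


class ArrayRecord(NamedTuple):
    """One CRISPR array as reported by one tool on one assembly (hypothetical layout)."""
    crispr_tool: str
    assembly: str
    contig_id: str
    start: int
    end: int
    spacers: List[str]


def summarise(records):
    """Build one summary row per array; spacers are counted as unique sequences."""
    rows = []
    for rec in records:
        rows.append({
            "crispr_tool": rec.crispr_tool,
            "assembly": rec.assembly,
            "contig_id": rec.contig_id,
            "start": rec.start,
            "end": rec.end,
            "n_spacers": len(set(rec.spacers)),  # duplicates collapse here
        })
    return rows


# Made-up example records, for illustration only.
records = [
    ArrayRecord("minced", "flye", "contig_1", 100, 400, ["ACGT", "TTGA", "ACGT"]),
    ArrayRecord("minced", "megahit", "k141_7", 20, 250, ["GGCA"]),
]

summary = summarise(records)
for row in summary:
    print(row)
```

Summing `n_spacers` per tool and assembly would then give the per-assembly totals used for the barplot or table.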
- Valentina Galata closed
- Maintainer
Filter CasC table by bonafide.
- Valentina Galata reopened
- Valentina Galata mentioned in commit a12b69cc
- Valentina Galata closed