Obtaining GO IDs
My earlier analysis used DESeq2 to produce a table of significantly differentially expressed genes (DEGs) for each of my four comparisons.
I wrote a script in R that took in each DESeq2 output file, matched it against this annotated transcriptome, and produced a newline-separated file of UniProt accessions. I then ran each of those files through this Bash script from Sam, which retrieved the Gene Ontology (GO) terms for each accession.
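The matching step can be sketched roughly as follows. This is a Python illustration, not the R script itself, and it assumes the annotation is a two-column, tab-separated table of transcript ID and UniProt accession. The function names and the toy IDs are hypothetical.

```python
import csv
import io


def load_annotation(handle):
    """Map transcript ID -> UniProt accession.

    Assumes a two-column, tab-separated table; rows without an
    accession are skipped.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2 and row[1]:
            mapping[row[0]] = row[1]
    return mapping


def deg_accessions(deg_ids, annotation):
    """Return accessions for the annotated DEG transcript IDs, in input order."""
    return [annotation[t] for t in deg_ids if t in annotation]


# Toy data standing in for the real files (all IDs are made up).
annotation_tsv = "tx1\tP12345\ntx2\tQ67890\ntx3\t\n"
annotation = load_annotation(io.StringIO(annotation_tsv))

# tx4 is not in the annotation and tx3 has no accession, so both drop out.
accessions = deg_accessions(["tx2", "tx4", "tx1", "tx3"], annotation)

# One accession per line, the format described above.
output_text = "\n".join(accessions) + "\n"
print(output_text, end="")
```

Transcripts with no annotation are dropped silently here, which is one reason there can be fewer accessions than transcript IDs.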
This is progressing fairly well! Next steps are as follows:
- Get the GOslim terms using the GSEABase package
- Perform gene enrichment analysis with topGO or GO-MWU
- When time allows, go back and annotate the transcriptome myself to understand it more fully
Producing Venn Diagrams
To determine the overlap between DEGs for my different comparisons, I produced a few Venn diagrams. Since a quick analysis found practically no overlap between the time comparison (Day 0 vs. Day 17) and any of my temperature comparisons, the Venn diagram includes only our three temperature comparisons (Day 0/2: Elevated vs. Ambient, Ambient vs. Low, Elevated vs. Low).
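The region counts behind a three-way Venn diagram come down to set arithmetic. Here is a minimal Python sketch, assuming each comparison is a set of DEG transcript IDs. The function name `venn3_counts` and the toy IDs are hypothetical and not taken from the actual analysis.

```python
def venn3_counts(a, b, c):
    """Count the IDs in each of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b_only": len((a & b) - c),
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "a_b_c": len(a & b & c),
    }


# Toy DEG sets for the three temperature comparisons (IDs are made up).
elevated_vs_ambient = {"tx1", "tx2", "tx3"}
ambient_vs_low = {"tx2", "tx3", "tx4"}
elevated_vs_low = {"tx3", "tx5"}

counts = venn3_counts(elevated_vs_ambient, ambient_vs_low, elevated_vs_low)
for region, n in counts.items():
    print(f"{region}: {n}")
```

The seven region counts add up to the number of distinct IDs across all three sets, which is a handy sanity check on any plotted diagram.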
Some stats for our DEGs:
- 2166 unique transcript IDs
- 2919 total transcript IDs
- 74.2% unique transcript IDs
Some stats for the accession IDs of our DEGs:
- 633 unique accession IDs
- 1061 total accession IDs
- 59.7% unique accession IDs
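Summary figures like these can be computed as in the Python sketch below. It assumes that "total" counts an ID once for every comparison it appears in and that "unique" counts each distinct ID once. That reading, the function name `id_stats`, and the toy IDs are all assumptions.

```python
def id_stats(id_lists):
    """Return (unique, total, percent_unique) across several lists of IDs.

    Assumes each list holds a given ID at most once, so "total" is the
    summed length of the lists and "unique" is the size of their union.
    """
    total = sum(len(ids) for ids in id_lists)
    unique = len(set().union(*id_lists))
    percent = round(100 * unique / total, 1) if total else 0.0
    return unique, total, percent


# Toy per-comparison ID lists (made up, not the real DEG tables).
comparisons = [
    ["tx1", "tx2", "tx3"],
    ["tx2", "tx3", "tx4"],
    ["tx3", "tx5"],
]

unique, total, percent_unique = id_stats(comparisons)
print(f"{unique} unique IDs, {total} total IDs, {percent_unique}% unique")
```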