In the parameters.tsv file you are asked to provide a name for your analysis. PEMA uses this name to build the directory where all of its output will be found.
PEMA always returns 7 subdirectories. If a phyloseq analysis has been requested, PEMA builds an extra directory for it.
Here is a short description of the output files PEMA returns.
The folders 1.quality_control, 2.trimmomatic_output, 3.correct_by_BayesHammer, 4.merged_by_SPAdes, 5.dereplicate_by_obiuniq and 6.linearized_files hold the output of each tool used in the pre-processing steps.
The first subdirectory holds the sequence quality control results.
The second holds your trimmed sequences, and the third the corrected ones.
In the fourth subdirectory you will find only one file per sample, as your sequences have been merged.
The fifth holds the dereplicated sequences, and the sixth contains only the sequences that remained after the quality control and pre-processing steps.
These last .fasta files are used to form a single .fasta file, called final_all_samples.fasta, that will be used from this point onwards for the clustering and taxonomy assignment steps.
All these files can be considered intermediate files and are always the same, whatever your marker gene is.
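As a quick way to see how many sequences survived pre-processing, the per-sample files can be counted with a few lines of Python. This is a minimal sketch and not part of PEMA: the function names are hypothetical, and it assumes that the files in 6.linearized_files end in .fasta and use standard FASTA headers starting with `>`.

```python
from pathlib import Path


def count_fasta_records(fasta_path):
    """Count sequences in a FASTA file by counting its header lines ('>')."""
    with open(fasta_path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def records_per_sample(linearized_dir):
    """Map each .fasta file name in a directory to its number of sequences.

    Assumption: every file of interest sits directly in `linearized_dir`
    and has the .fasta extension.
    """
    return {
        path.name: count_fasta_records(path)
        for path in sorted(Path(linearized_dir).glob("*.fasta"))
    }
```

Calling `records_per_sample("<your_analysis_name>/6.linearized_files")` would then return one count per file found there.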
7.gene_dependent subdirectory
In this subdirectory, all output from the clustering and taxonomy assignment steps is placed.
The content of this subdirectory differs according to the user's parameters.
It contains a number of files related to OTU clustering or ASV inference.
The most important file, though, is final_table.tsv, where you can find the taxonomy assignment of your OTUs/ASVs.
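To inspect final_table.tsv outside PEMA, the file can be located and loaded as plain tab-separated text. The sketch below is illustrative only: the function names are hypothetical, the search is recursive because the exact location of the file inside 7.gene_dependent may depend on your parameters, and no assumption is made about column names.

```python
import csv
from pathlib import Path


def find_final_tables(gene_dependent_dir):
    """Return every final_table.tsv found at any depth below the directory."""
    return sorted(Path(gene_dependent_dir).rglob("final_table.tsv"))


def read_tsv(tsv_path):
    """Read a tab-separated file into a list of rows (lists of strings)."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

For example, `read_tsv(find_final_tables("<your_analysis_name>/7.gene_dependent")[0])` would return the rows of the first table found, provided at least one exists.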
phyloseq_output subdirectory
If the user has requested a phyloseq analysis, an extra subdirectory with this name is created, containing all the relevant files.
Finally, a copy of the parameters.tsv file you used for your analysis is placed in the main output directory.
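The layout described above can be checked after a run with a short script. This is a hedged sketch rather than a PEMA feature: the function names are hypothetical, and the subdirectory names are taken verbatim from this description, so they should be adjusted if your PEMA version names them differently.

```python
from pathlib import Path

# Subdirectory names as listed in this description of the PEMA output.
EXPECTED_SUBDIRS = [
    "1.quality_control",
    "2.trimmomatic_output",
    "3.correct_by_BayesHammer",
    "4.merged_by_SPAdes",
    "5.dereplicate_by_obiuniq",
    "6.linearized_files",
    "7.gene_dependent",
]


def missing_subdirs(output_dir):
    """Return the expected subdirectories that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def has_phyloseq_output(output_dir):
    """Report whether the optional phyloseq_output subdirectory exists."""
    return (Path(output_dir) / "phyloseq_output").is_dir()
```

An empty list from `missing_subdirs` suggests that all 7 subdirectories were created; `has_phyloseq_output` only makes sense to check when a phyloseq analysis was requested.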