## Input files

Each sample should have one input file:

- `*.fna`: FASTA file containing nucleotide sequences of the contigs
  - no whitespace in FASTA headers
  - used for prediction of mobile genetic elements and as input for Prodigal
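The header requirement above can be checked before a run. The following is a minimal sketch, not part of PathoFact; the function name `headers_with_whitespace` is hypothetical, and it assumes a plain-text (uncompressed) FASTA file.

```python
def headers_with_whitespace(lines):
    """Return FASTA headers (without the leading '>') that contain whitespace.

    `lines` is any iterable of text lines, e.g. an open file handle.
    This helper is illustrative only and is not shipped with PathoFact.
    """
    offending = []
    for line in lines:
        if line.startswith(">"):
            # Drop the '>' marker and the line terminator before checking.
            header = line[1:].rstrip("\r\n")
            if any(ch.isspace() for ch in header):
                offending.append(header)
    return offending
```

To check a file, pass an open handle, for example `headers_with_whitespace(open("SAMPLE_A.fna"))`; an empty list means no header contains whitespace.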

The following files are generated by PathoFact itself:

- `*.faa`: FASTA file containing translated gene sequences, i.e. amino acid sequences
  - no whitespace in FASTA headers
  - for prediction of toxins, virulence factors and antimicrobial resistance genes
- `*.contig`: TAB-delimited file containing a mapping from contig ID (1st column) to gene ID (2nd column)
  - no header, one gene ID per line
  - contig and gene IDs should be the same as in the FASTA files
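To illustrate the `*.contig` layout described above, here is a minimal sketch that reads such a mapping into a dictionary. The function name `read_contig_map` is hypothetical and not part of PathoFact; the sketch assumes exactly two TAB-separated columns per line and no header.

```python
def read_contig_map(lines):
    """Parse a headerless, TAB-delimited contig-to-gene mapping.

    Each line is expected to hold a contig ID (1st column) and a gene ID
    (2nd column). Returns a dict mapping gene ID -> contig ID.
    Illustrative helper only; not shipped with PathoFact.
    """
    gene_to_contig = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # tolerate empty lines, e.g. at the end of the file
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {number}: expected 2 TAB-separated columns, got {len(fields)}"
            )
        contig_id, gene_id = fields
        gene_to_contig[gene_id] = contig_id
    return gene_to_contig
```

Keying the dictionary by gene ID reflects the "one gene ID per line" rule: a contig may appear on many lines, while each gene appears once.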

The input files of all samples should be located in the same directory.
For each sample, the corresponding input files should have the same basename, e.g. `SAMPLE_A.fna` for sample `SAMPLE_A`.
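The naming convention can be verified with a short script before starting a run. This is a sketch under stated assumptions: the helper names `expected_input` and `missing_inputs`, the directory name, and the sample names are all hypothetical and only illustrate the `<directory>/<sample>.fna` layout described above.

```python
from pathlib import Path


def expected_input(data_dir, sample):
    """Return the path where the `.fna` file for `sample` is expected."""
    return Path(data_dir) / f"{sample}.fna"


def missing_inputs(data_dir, samples):
    """Return the sample names whose `.fna` file is not found in `data_dir`."""
    return [s for s in samples if not expected_input(data_dir, s).is_file()]
```

For example, `missing_inputs("data", ["SAMPLE_A", "SAMPLE_B"])` lists the samples for which `data/SAMPLE_A.fna` or `data/SAMPLE_B.fna` does not exist.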

**NOTE**: For preprocessing and assembly of metagenomic reads, we suggest using IMP (https://imp.pages.uni.lu/web/).