ALPAR¶
ALPAR, short for Automated Learning Pipeline for Antimicrobial Resistance, is a tool for building machine learning models that predict antimicrobial resistance. It is open source under the MIT license, available on GitHub, and installable through conda.
Install¶
ALPAR can be installed from conda. Currently, only Linux is supported.
Note
It is recommended to use mamba for the installation because conda might not be able to resolve the dependencies of ALPAR successfully.
Command
Flowchart¶
Quick Start¶
Example files¶
Example files can be downloaded from: Example files
Automatic pipeline¶
Starting from genomic files, the automatic pipeline creates binary mutation and phenotype tables, applies thresholds, builds a phylogenetic tree, conducts GWAS analysis, calculates the PRPS score, and trains machine learning models for all the given antibiotics. Model training includes feature importance analysis and uses DataSAIL to split the data so that information leakage is avoided.
Input,
-i: Path to a folder with the structure: input_folder -> antibiotic -> [Resistant, Susceptible]
input_folder
├── antibiotic1
│   ├── Resistant
│   │   ├── fasta1.fna
│   │   ├── fasta2.fna
│   │   └── ...
│   └── Susceptible
│       ├── fasta3.fna
│       ├── fasta4.fna
│       └── ...
├── antibiotic2
│   ├── Resistant
│   │   ├── fasta2.fna
│   │   ├── fasta5.fna
│   │   └── ...
│   └── Susceptible
│       ├── fasta2.fna
│       ├── fasta3.fna
│       └── ...
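Before starting a long run, it can help to confirm that the input folder follows the layout above. The following is a minimal sketch, not part of ALPAR: the function name `summarize_input_folder` is hypothetical, and it only assumes that every antibiotic folder must contain a `Resistant` and a `Susceptible` subfolder, as described above.

```python
from pathlib import Path

# Class subfolders expected under each antibiotic folder (from the layout above).
REQUIRED_CLASSES = ("Resistant", "Susceptible")


def summarize_input_folder(input_folder):
    """Return {antibiotic: {class_name: file_count}}.

    Raises FileNotFoundError if an antibiotic folder lacks a class subfolder.
    Hypothetical helper for illustration; ALPAR does not provide it.
    """
    summary = {}
    for antibiotic_dir in sorted(Path(input_folder).iterdir()):
        if not antibiotic_dir.is_dir():
            continue  # ignore stray files at the top level
        counts = {}
        for class_name in REQUIRED_CLASSES:
            class_dir = antibiotic_dir / class_name
            if not class_dir.is_dir():
                raise FileNotFoundError(f"missing folder: {class_dir}")
            # Count regular files only; file extensions are not checked here.
            counts[class_name] = sum(1 for p in class_dir.iterdir() if p.is_file())
        summary[antibiotic_dir.name] = counts
    return summary
```

Calling `summarize_input_folder("example/example_files/")` would return one entry per antibiotic with the number of files in each class, which makes empty or misnamed folders easy to spot.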
Output,
-o: Output folder path, where the output will be stored. If the path exists, the --overwrite option can be used to overwrite existing output.
Reference,
--reference: Reference file path. Accepted file formats are: .gbk .gbff
Custom database (Optional) [Recommended],
--custom_database: FASTA file path for protein database creation; the file can be downloaded from UniProt. Accepted file format: .fasta
Basic usage:¶
alpar automatix -i example/example_files/ -o example/example_output/ --reference example/reference.gbff
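When the pipeline is launched from a script, the same command can be assembled programmatically. The sketch below is illustrative only: `build_automatix_command` is a hypothetical helper, it uses only the options described in this section, and it assumes that `--overwrite` is a bare flag that takes no value.

```python
import shlex


def build_automatix_command(input_folder, output_folder, reference,
                            custom_database=None, overwrite=False):
    """Build the argument list for `alpar automatix` (hypothetical helper)."""
    cmd = [
        "alpar", "automatix",
        "-i", str(input_folder),
        "-o", str(output_folder),
        "--reference", str(reference),
    ]
    if custom_database is not None:
        # Optional but recommended protein database in FASTA format.
        cmd += ["--custom_database", str(custom_database)]
    if overwrite:
        # Assumption: --overwrite is a flag without an argument.
        cmd.append("--overwrite")
    return cmd


cmd = build_automatix_command(
    "example/example_files/",
    "example/example_output/",
    "example/reference.gbff",
)
print(shlex.join(cmd))
```

With ALPAR installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Building the command as a list avoids quoting problems with paths that contain spaces.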
For more information about the automatic pipeline parameters:¶
alpar automatix -h
Citation¶
For citation details, see Citing ALPAR.