# Background

The ampir (short for antimicrobial peptide prediction in R) package was designed as a fast and user-friendly way to predict antimicrobial peptides (AMPs) from protein datasets of any size. ampir uses a supervised statistical machine learning approach: it incorporates a support vector machine classification model trained on publicly available antimicrobial peptide data.

## Usage

Standard input to ampir is a `data.frame` with sequence names in the first column and protein sequences in the second column.
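To illustrate that two-column layout, here is a minimal sketch that builds such an input by hand in base R. The column names `seq_name` and `seq_aa` match those produced by `read_faa()`; the sequence names, the sequences, and the variable name `toy_df` are made up for illustration.

```r
# Hypothetical toy input: names in column 1, amino acid sequences in column 2
toy_df <- data.frame(
  seq_name = c("example_1", "example_2"),
  seq_aa   = c("GLFDIVKKVVGALGSL", "MKTLLLTLVVVTIVCLDLGYT"),
  stringsAsFactors = FALSE  # keep sequences as character strings, not factors
)

# Inspect the structure: two character columns, one row per protein
str(toy_df)
```

A data.frame shaped like this should be usable wherever the examples here use `my_protein_df`.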

    library(ampir)

Read a FASTA-formatted file into a `data.frame` with `read_faa()`:

    my_protein_df <- read_faa(system.file("extdata/bat_protein.fasta", package = "ampir"))
    # The resulting data.frame has the columns seq_name and seq_aa

Calculate the probability that each protein is an antimicrobial peptide with `predict_amps()`.

Note that sequences shorter than five amino acids and/or containing anything other than the 20 standard amino acids are not evaluated; their `prob_AMP` value is `NA`.
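The rule in the note above can be approximated in base R to see in advance which sequences would be skipped. This is only a sketch of the stated rule, not ampir's internal code: it assumes upper-case one-letter amino acid codes, and the data and the `evaluated` column are hypothetical.

```r
# Hypothetical input in the two-column layout used by ampir
check_df <- data.frame(
  seq_name = c("ok_seq", "too_short", "nonstandard"),
  seq_aa   = c("MKVLAAGIVGLLLAQ", "MKV", "MKXLAAGBVGLL"),
  stringsAsFactors = FALSE
)

# TRUE when a sequence consists only of the 20 standard amino acids
standard_only <- grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", check_df$seq_aa)

# TRUE when a sequence is at least five amino acids long
long_enough <- nchar(check_df$seq_aa) >= 5

# Sequences failing either test would be expected to get NA as prob_AMP
check_df$evaluated <- standard_only & long_enough
print(check_df[, c("seq_name", "evaluated")])
```

Here only `ok_seq` passes both tests; `too_short` fails on length and `nonstandard` fails because it contains `X` and `B`.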

    my_prediction <- predict_amps(my_protein_df)
    # my_prediction has the columns seq_name, seq_aa and prob_AMP

    # Keep proteins with a predicted probability of at least 0.8;
    # which() drops sequences whose prob_AMP is NA
    my_predicted_amps <- my_protein_df[which(my_prediction$prob_AMP >= 0.8), ]

Write the `data.frame`, with sequence names in the first column and protein sequences in the second column, to a FASTA-formatted file with `df_to_faa()`:

    df_to_faa(my_predicted_amps, tempfile("my_predicted_amps", tempdir()))
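One detail worth keeping in mind when thresholding on `prob_AMP`: sequences that were not evaluated carry `NA`, and in R a logical `NA` used as a row index returns a row filled with `NA` rather than dropping it. The sketch below shows the difference on a mock prediction table; the data and the variable names are hypothetical and no ampir functions are called.

```r
# Hypothetical prediction output with one unevaluated (NA) sequence
mock_prediction <- data.frame(
  seq_name = c("likely_amp", "too_short", "unlikely_amp"),
  seq_aa   = c("GLFDIVKKVVGALGSL", "MKV", "MSTNPKPQRKTKRNTNRRPQDVK"),
  prob_AMP = c(0.95, NA, 0.10),
  stringsAsFactors = FALSE
)

# Plain logical indexing keeps an all-NA row for the unevaluated sequence
naive_filter <- mock_prediction[mock_prediction$prob_AMP >= 0.8, ]

# which() returns only the positions that are TRUE, so NA entries are dropped
safe_filter <- mock_prediction[which(mock_prediction$prob_AMP >= 0.8), ]

nrow(naive_filter)  # 2: the real hit plus one row of NAs
nrow(safe_filter)   # 1: only the real hit
```

Filtering with `which()` (or adding `!is.na(prob_AMP)` to the condition) keeps such placeholder rows out of a table that is later written to a FASTA file.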