This tool performs the prediction using a Naive Bayes classifier model generated in the
Waikato Environment for Knowledge Analysis (WEKA, version 3.6.2). The model was trained with a positive
dataset of 100 experimentally known effectors from 28 species and a negative dataset of 100 non-effector
protein sequences from 10 species. Training was restricted to the N-terminal 100 amino acids (aa)
of these sequences and was based on three physico-chemical properties, namely hydrophobicity, polarity and beta-turns,
to discriminate highly diverse effectors from non-effectors. Testing of the Naive Bayes model with experimentally
validated effectors (positive dataset of 68 sequences from 19 species) and non-effector protein sequences
(negative dataset of 68 sequences from 7 species) that were not part of the training data returned an area under the ROC curve (Aroc) of
93%, demonstrating the utility of the model for prediction of effectors. The model has been integrated into
T3SEdb as a prediction tool to allow users to scan their sequences of interest against the model for the presence of functional signals indicative of a T3SE.
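
As an illustration of how a query sequence might be turned into model input, the sketch below trims a sequence to its N-terminal 100 aa and averages a per-residue property scale over that region. This is a minimal sketch under stated assumptions, not the tool's actual implementation: the text does not say how the three properties were encoded, so the averaging step, the function names and the choice of the Kyte-Doolittle hydropathy scale are all assumptions. Scales for polarity and beta-turns would be supplied by the caller in the same dictionary form.

```python
# Kyte-Doolittle hydropathy index, used here only as an example scale.
# ASSUMPTION: the source does not state which hydrophobicity scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

N_TERMINAL_LENGTH = 100  # region length stated in the text


def n_terminal_region(sequence, length=N_TERMINAL_LENGTH):
    """Return the first `length` residues, upper-cased, with whitespace removed."""
    cleaned = "".join(sequence.split()).upper()
    return cleaned[:length]


def mean_property(region, scale):
    """Average a per-residue scale over the region, skipping residues not in the scale."""
    values = [scale[aa] for aa in region if aa in scale]
    if not values:
        raise ValueError("no residues in the region are covered by the scale")
    return sum(values) / len(values)


def feature_vector(sequence, scales):
    """Hypothetical helper: one averaged value per scale for the N-terminal region."""
    region = n_terminal_region(sequence)
    return [mean_property(region, scale) for scale in scales]
```

Calling `feature_vector(query, [KYTE_DOOLITTLE])` returns a one-element list; passing three scales would give the three-value vector that a Naive Bayes classifier could consume.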
The model will be updated periodically as more experimentally verified effector sequences are deposited in T3SEdb, and may include additional features to discriminate
effectors from non-effectors as the database records are enriched with more annotations.
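
Each time the model is updated, the Aroc (area under the ROC curve) can be recomputed on held-out sequences. The following is a minimal sketch of one standard way to do this, namely counting the fraction of (effector, non-effector) pairs in which the effector receives the higher score. The function name and the input layout are assumptions for illustration; WEKA reports this statistic itself, and the sketch is not taken from the tool.

```python
def area_under_roc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half.

    Each score is the classifier's predicted probability of being an effector.
    ASSUMPTION: scores are supplied as two plain lists, one per class.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

With a test set of 68 positive and 68 negative sequences, as described above, this compares 4624 pairs, so the quadratic loop is not a practical concern.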