Project description


Full map screenshot

We started our analysis by constructing a "gold-standard" protein interaction dataset from several related bacterial organisms: C. jejuni, E. coli, and H. pylori. The resulting "gold-standard" positive dataset contains 3,629 interactions. Random interactions were added to serve as a "gold-standard" negative dataset. We selected a set of eight genomic features that have predictive power for protein interactions. We then trained and tested machine learning methods on these data, and used the learned classifier to predict whether any pair of PAO1 proteins can interact.
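The negative-set construction described above can be illustrated with a minimal sketch. It assumes that "random interactions" means protein pairs drawn uniformly at random and excluded if they appear in the positive set; the function name, parameters, and protein identifiers are hypothetical and are not taken from the project.

```python
import random


def sample_negative_pairs(proteins, positive_pairs, n_negatives, seed=0):
    """Draw random protein pairs that are absent from the positive set.

    Assumption: every positive pair is made of two distinct proteins
    that both appear in `proteins`.
    """
    rng = random.Random(seed)
    # Store pairs as frozensets so that (A, B) and (B, A) compare equal.
    positives = {frozenset(pair) for pair in positive_pairs}
    proteins = sorted(set(proteins))

    # Guard against asking for more negatives than can exist.
    n = len(proteins)
    max_negatives = n * (n - 1) // 2 - len(positives)
    if n_negatives > max_negatives:
        raise ValueError("not enough non-interacting pairs to sample from")

    negatives = set()
    while len(negatives) < n_negatives:
        a, b = rng.sample(proteins, 2)  # two distinct proteins
        pair = frozenset((a, b))
        if pair not in positives:
            negatives.add(pair)

    # Return sorted tuples so the output order is reproducible.
    return sorted(tuple(sorted(pair)) for pair in negatives)


if __name__ == "__main__":
    # Placeholder identifiers, not real proteins.
    proteins = ["A", "B", "C", "D"]
    positives = [("A", "B"), ("C", "D")]
    print(sample_negative_pairs(proteins, positives, n_negatives=3))
```

Random pairs are only presumed non-interacting, so a negative set built this way may contain a small number of true interactions.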


A total of 54,107 interactions covering 4,181 PAO1 proteins were predicted with a probability higher than 0.5. A more stringent cutoff would yield higher-confidence interactions. We assembled a high-confidence network from the predicted high-confidence interactions, the interactions in the "gold-standard" positive dataset, and a few verified interactions. The whole high-confidence network contains 3,343 PAO1 proteins and 19,416 interactions, and serves as a good starting point for further network analysis.
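As a rough sketch of how such a network could be assembled, the snippet below keeps predicted pairs above a probability cutoff and merges them with the gold-standard positive and verified pairs. The cutoff used for the high-confidence network is not stated here, so `cutoff` is left as a parameter; the function names, identifiers, and scores are made up for illustration.

```python
def normalize(pair):
    """Order a pair so that (A, B) and (B, A) map to the same edge."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def assemble_network(predictions, gold_positive, verified, cutoff):
    """Union of predicted pairs above `cutoff`, gold-standard and verified pairs.

    `predictions` maps a protein pair to its predicted probability.
    Returns the set of edges and the set of proteins they cover.
    """
    edges = {normalize(pair) for pair, prob in predictions.items() if prob > cutoff}
    edges |= {normalize(pair) for pair in gold_positive}
    edges |= {normalize(pair) for pair in verified}
    proteins = {protein for edge in edges for protein in edge}
    return edges, proteins


if __name__ == "__main__":
    # Placeholder identifiers and scores, not project data.
    predictions = {("P2", "P1"): 0.95, ("P2", "P3"): 0.60, ("P3", "P4"): 0.40}
    gold_positive = [("P1", "P2")]
    verified = [("P4", "P5")]
    for cutoff in (0.5, 0.9):
        edges, proteins = assemble_network(predictions, gold_positive, verified, cutoff)
        print(cutoff, len(edges), len(proteins))
```

Raising the cutoff shrinks the predicted part of the network, while the gold-standard and verified pairs are kept regardless of their predicted score.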


Based on the whole interactome and the high-confidence network, we predicted drug targets by performing network analysis and identifying important proteins in the network. After applying a set of filters, a list of 28 proteins was identified as putative drug targets for PAO1 infection. We performed network clustering to identify community structures in the network, and found a few of them to be enriched with essential proteins. We further performed case studies on the important PAO1 proteins MucA, MucB, and RhlR and their interacting partners, and revealed their potential to serve as drug targets. Finally, we overlaid human-Pseudomonas (host-pathogen) interactions onto the human interactome and the predicted PAO1 interactome, and analyzed the functions of the related interactions.
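One simple way to flag important proteins in an interaction network is to rank them by degree, that is, by their number of distinct interaction partners. The sketch below shows only that idea. The measures and filters actually used in this project are not specified here, so degree is an assumed stand-in, and the function name and identifiers are hypothetical.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank proteins by number of distinct interaction partners.

    Returns (protein, degree) tuples, highest degree first; ties are
    broken alphabetically so the order is reproducible.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return sorted(
        ((protein, len(partners)) for protein, partners in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    # Placeholder edges, not project data.
    edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
    for protein, degree in degree_ranking(edges):
        print(protein, degree)
```

Highly connected proteins found this way are only candidates; further filters would still be needed before treating any of them as a putative drug target.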