In addition, two novel gene functions were predicted and experimentally confirmed to affect the efficiency of non-homologous end-joining, providing further support for the usefulness of the identified PPIs in biological investigations.
Proteins are key biomolecules that often realize their functions by interacting with one another. Protein-protein interactions (PPIs) mediate various aspects of the structural and functional organization of a cell, including multi-faceted responses to internal and external stimuli. Protein interaction networks have also been shown to possess topological and dynamic properties that may be essential for certain biological events (1, 2).
Thus, elucidating the complete network of PPIs is expected to yield a greater understanding of the biology of the cell. The sequencing of the budding yeast Saccharomyces cerevisiae over a decade ago (3), along with the simple genetics that had made this yeast a model eukaryotic organism, led to its emergence as the organism of choice for large-scale functional genomics experiments, including expression profiling (4) and identification of PPI networks (interactomes).
Additionally, these techniques may not be applied to all proteins without discrimination.
In TAP tagging, the double tag fusion to the target protein may interfere with the formation of some complexes or cause a mutant phenotype (6, 7). In Y2H, not all proteins can be safely over-expressed and not all proteins can find their way into the nucleus, which is required for successful detection via Y2H (8). Such limitations resulted in small overlaps between the PPI data collected using different approaches, and even in little reproducibility when the same method was used in different experiments (5, 9).
This lack of overlap suggests the presence of more undiscovered PPIs. Consequently, there is a growing need for the development of new and improved experimental and computational approaches to better uncover the yeast interactome. Very recently, we (10, 11) as well as others (12) reported that PPIs could be successfully detected from short polypeptide sequences within proteins. Our approach, which we termed Protein-protein Interaction Prediction Engine (PIPE), was based on re-occurring short polypeptide sequences observed in a database of known interacting protein pairs.
The goal of this investigation is to complement previous genome-wide experimental analyses of PPIs, leading to a more complete PPI map for yeast. The PIPE method (10) estimates the likelihood of an interaction between a pair of target proteins by measuring the reoccurrence of short polypeptide sequences (referred to henceforth as windows) from protein pairs that are known to interact. To determine whether two given query proteins A and B interact, the proteins are scanned for similarity to a library of known interacting protein pairs (X, Y).
For each known interacting pair (X, Y), we compare protein A against X and protein B against Y by using sliding windows of a fixed size. These matches are counted and added up in a 2D matrix, which reports for each pair of windows in A and B the number of matches found among known interacting proteins. This 2D matrix is then plotted as a 3D landscape.
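The window-matching step described above can be sketched as follows. This is a minimal illustration, not the PIPE implementation: the function names, the identity-based similarity test (`min_identical`), and the choice to check only the (A against X, B against Y) orientation are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def windows(seq: str, w: int) -> List[str]:
    """All contiguous windows of length w in seq (empty if seq is shorter than w)."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


def similar(u: str, v: str, min_identical: int) -> bool:
    """Assumed similarity test: at least min_identical identical positions."""
    return sum(a == b for a, b in zip(u, v)) >= min_identical


def matching_window_indices(query: str, target: str, w: int, min_identical: int) -> List[int]:
    """Indices of windows in query that are similar to at least one window of target."""
    target_windows = windows(target, w)
    return [
        i for i, u in enumerate(windows(query, w))
        if any(similar(u, v, min_identical) for v in target_windows)
    ]


def match_matrix(a: str, b: str, known_pairs: Sequence[Tuple[str, str]],
                 w: int, min_identical: int) -> List[List[int]]:
    """matrix[i][j] counts the known pairs (X, Y) in which window i of A
    matches X and window j of B matches Y."""
    rows = max(len(a) - w + 1, 0)
    cols = max(len(b) - w + 1, 0)
    matrix = [[0] * cols for _ in range(rows)]
    for x, y in known_pairs:
        hits_a = matching_window_indices(a, x, w, min_identical)
        hits_b = matching_window_indices(b, y, w, min_identical)
        for i in hits_a:
            for j in hits_b:
                matrix[i][j] += 1
    return matrix
```

Peaks in the resulting matrix correspond to window pairs that co-occur in many known interacting pairs, which is what the landscape plot visualizes.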
Examples of landscape diagrams produced by PIPE. (B) A positive interaction indicated by a peak.
The PIPE2 method presented in this article provides a significant improvement in computational speed and specificity over the original PIPE method (10), making possible the first global investigation of protein-protein interactions in yeast. Details about the improvements of the computational method are found in the Supplementary Data 1. In brief, PIPE2 incorporates two window comparison optimizations and one structural change over the initial algorithm. The first window comparison optimization converts the character-based amino acid representation to a binary representation (digital alphabet), speeding up lookups in the similarity matrix.
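As a rough illustration of why a digital alphabet speeds up similarity lookups, the sketch below maps each amino acid letter to a small integer once, so that scoring a window becomes plain array indexing. The letter ordering, the function names, and the identity matrix used as a stand-in for the real similarity matrix are all assumptions for the example.

```python
from typing import List

# Assumed ordering of the 20 standard amino acids; any fixed order works.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CODE = {aa: k for k, aa in enumerate(AMINO_ACIDS)}


def encode(seq: str) -> List[int]:
    """Convert a character sequence to integer codes, done once per protein."""
    return [CODE[aa] for aa in seq]


def placeholder_similarity(match: int = 1, mismatch: int = 0) -> List[List[int]]:
    """Stand-in similarity matrix (identity); a real substitution matrix
    would be loaded here instead."""
    n = len(AMINO_ACIDS)
    return [[match if r == c else mismatch for c in range(n)] for r in range(n)]


def window_score(u: List[int], v: List[int], sim: List[List[int]]) -> int:
    """Sum of pairwise similarities; each lookup is a direct index, no hashing of characters."""
    return sum(sim[x][y] for x, y in zip(u, v))
```

With encoded sequences, `window_score(encode("ACD"), encode("ACE"), placeholder_similarity())` scores two of the three positions as matches.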
The second window comparison optimization takes advantage of the fact that we are using sliding windows for our comparisons: updating only the characters that are removed or added during one move of a window yields another significant improvement. The structural change addresses the cost of repeating the same window comparisons for every query: we pre-compute all of these window comparisons in advance and store them on local disk. This one-time pre-computation allows PIPE2 to look up the result of a comparison instead of computing it.
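The sliding-window update can be illustrated with a small sketch. It assumes two sequences compared position by position along one alignment, with a simple identity score per position; both the function names and the scoring are assumptions, and the actual PIPE2 code may organize the comparison differently.

```python
from typing import List


def pair_score(x: str, y: str) -> int:
    """Assumed per-position score: 1 for identical residues, 0 otherwise."""
    return 1 if x == y else 0


def window_scores_naive(a: str, b: str, w: int) -> List[int]:
    """Recompute every window from scratch: O(n * w) position comparisons."""
    n = min(len(a), len(b)) - w + 1
    return [sum(pair_score(a[i + k], b[i + k]) for k in range(w)) for i in range(n)]


def window_scores_incremental(a: str, b: str, w: int) -> List[int]:
    """Slide the window one step at a time, touching only the position that
    leaves and the position that enters: O(n) position comparisons."""
    n = min(len(a), len(b)) - w + 1
    if n <= 0:
        return []
    current = sum(pair_score(a[k], b[k]) for k in range(w))
    scores = [current]
    for i in range(1, n):
        current -= pair_score(a[i - 1], b[i - 1])          # position leaving the window
        current += pair_score(a[i + w - 1], b[i + w - 1])  # position entering the window
        scores.append(current)
    return scores
```

Both functions return the same scores; the incremental version simply avoids re-reading the positions shared by consecutive windows.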
Table 1 shows each change along with the average single-processor runtime per PPI prediction and the overall 16-fold performance improvement over the original PIPE implementation. These runtime numbers were obtained after running the program on the same set of randomly chosen protein pairs. If one includes the one-time pre-computation, then the performance improvement provided by our new approach is 14-fold. We improved our threshold function and tuned our parameters by using a true positive set and a true negative set of pairs each.
In order to better evaluate the specificity of PIPE2, a larger set of true negatives was needed. Therefore, we constructed a negative set of randomly chosen pairs, as explained in (13), that are not reported in either our database or in BioGRID. In contrast to our previous approach of applying a moving average filter, we apply a median filter, which effectively eliminates thin line regions (assumed false positives due to regions of low complexity) and maintains hill regions (assumed true positives).
The median filter considers, for each cell c of the matrix, the n² values in its surrounding neighbourhood. Those n² values are then sorted and the cell c is replaced by the median value (Figure S1A). The results of this experiment are illustrated in Figure S1B. As indicated in Supplementary Data, the combination of the median filter and application of a cutoff value on the average is important to achieve reasonable sensitivity rates.
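A median filter of this kind can be sketched in a few lines. The border handling (neighbourhoods clipped at the matrix edge), the function names, and the way the cutoff is applied to the mean of the filtered matrix are assumptions for illustration; the exact threshold function is given only in the Supplementary Data.

```python
from statistics import mean, median
from typing import List, Sequence


def median_filter(matrix: Sequence[Sequence[float]], n: int) -> List[List[float]]:
    """Replace each cell by the median of its n x n neighbourhood (n odd).
    Neighbourhoods are clipped at the borders, an assumption of this sketch."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    r = n // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = [
                matrix[p][q]
                for p in range(max(0, i - r), min(rows, i + r + 1))
                for q in range(max(0, j - r), min(cols, j + r + 1))
            ]
            out[i][j] = median(values)
    return out


def passes_cutoff(matrix: Sequence[Sequence[float]], n: int, cutoff: float) -> bool:
    """Hypothetical decision rule: mean of the filtered matrix against a cutoff."""
    filtered = median_filter(matrix, n)
    cells = [v for row in filtered for v in row]
    return bool(cells) and mean(cells) >= cutoff
```

A one-cell-wide line is outvoted by its neighbours and disappears, whereas the interior of a broad hill survives, which is the behaviour the text attributes to the filter.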
The portal works on the most common operating systems and web browsers and has been tested on Windows Vista SP1 (Internet Explorer 7).
The yeast gene deletion strains are described in ref. Plasmid repair analysis was performed as before using a modified p plasmid. Each experiment was repeated at least four times.
We ran all 19 possible pairs of S. This resulted in 29 pairs detected as positive interactions, listed in Supplementary Data.
Of these, a slight majority 15 Interestingly, since then, of our 14 novel protein interactions 2. We then investigated the total number of interactions, the average and maximum degree of nodes (interactions for each protein), and the number of unique proteins participating in interactions according to PIPE2, Gavin et al. It is also important to note the significantly increased number of unique proteins found in the PIPE2 dataset compared to Gavin et al. Similarly, relative to Y2H studies, the PIPE2 dataset contains almost twice the number of unique proteins found by Ito et al.
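The network statistics named here (total interactions, average and maximum node degree, number of unique proteins) can be computed from a list of interacting pairs as in the sketch below. The function name and the protein identifiers in the usage note are hypothetical; pairs are treated as unordered, so (A, B) and (B, A) count once.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def network_stats(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Summary statistics for an undirected interaction network."""
    edges = {frozenset(pair) for pair in pairs}  # removes duplicates and ordering
    degree: Counter = Counter()
    for edge in edges:
        for protein in edge:
            degree[protein] += 1
    return {
        "interactions": len(edges),
        "unique_proteins": len(degree),
        "average_degree": sum(degree.values()) / len(degree) if degree else 0.0,
        "max_degree": max(degree.values()) if degree else 0,
    }
```

For example, `network_stats([("P1", "P2"), ("P2", "P3"), ("P2", "P1")])` reports two interactions among three proteins, with a maximum degree of 2 for P2.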
It has been previously reported that the overlap between various interaction maps obtained using different methods is very small (22, 24). A comparison study carried out by Aloy and Russell showed a low level of overlap among two-hybrid, affinity purification, mass spectrometry, and bioinformatics methods. Figure 2 shows the overlap between PIPE2 data and those of other genome-wide experimental studies.
PIPE2 identifies PIPE2 covers Gavin et al. For example Gavin et al. The overlaps represent the number of interactions that are common between different databases. There appears to be a significant overlap between PIPE data and those of others. This overlap is even more notable for the data gathered using Y2H, which, like PIPE2, is designed to study an interaction between two target proteins.
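Counting the interactions shared by two datasets reduces to a set intersection once each pair is made order-independent. The sketch below shows one way to do this; the function names and the identifiers in the usage note are hypothetical, and real datasets would first need their protein names mapped to a common naming scheme.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def normalise(pairs: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Represent each interaction as an unordered pair so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}


def overlap(dataset_a: Iterable[Tuple[str, str]],
            dataset_b: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Interactions reported by both datasets."""
    return normalise(dataset_a) & normalise(dataset_b)
```

For instance, `len(overlap([("YFG1", "YFG2"), ("YFG3", "YFG4")], [("YFG2", "YFG1")]))` is 1, because the reversed pair is recognized as the same interaction.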
Comparing PIPE2 data to those obtained by other large-scale computational experiments. InSite and Betel et al. InSite bases its predictions on a set of affinity parameters between pairs of motifs or domains for the query proteins. The published InSite database contains 78 protein interactions between proteins. However, the lack of a clear specificity for InSite makes the interpretation of this database very difficult. As discussed earlier, large-scale PPI scans without a very high specificity can have a very large number of false positives.
The Betel et al. Their database contains 18 interactions between proteins. This may not be a surprising observation, as the method behind InSite, which uses affinities between different motifs, more closely resembles that of PIPE2, which uses the re-occurrence of short polypeptide sequences. For the most part Betel et al. This may explain the small overlap between Betel et al.
It should be noted that InSite also has very little overlap with Betel et al. Figure 3 shows the percentage of identified PPIs that were co-localized in the nucleus. Figure 3 also shows a comparison of these numbers with Gavin et al.
Co-localization percentage of predicted interactors for PIPE2 and high-throughput experiments.
The overall pattern for co-localized protein pairs is very similar for all datasets.
Figure 4 compares the absolute numbers of co-localized pairs in PIPE2 predictions with those of large-scale experimental approaches. The total number of co-localized pairs for each dataset is as follows: for PIPE2, for Gavin et al. Supplementary Data. Furthermore, Figure 4 shows for each location the number of novel co-localized PIPE2 interactions in comparison with the number of previously known co-localized interactions (union of other datasets) that are now reported by PIPE2.
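A per-compartment co-localization count of the kind summarized in Figures 3 and 4 could be computed as sketched below. The annotation format (a mapping from protein to a set of compartments), the function name, and the example identifiers are assumptions; a pair is counted for every compartment both partners share.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def colocalised_by_compartment(pairs: Iterable[Tuple[str, str]],
                               localisation: Dict[str, Set[str]]) -> Counter:
    """Count, per compartment, the interacting pairs whose two proteins are
    both annotated to that compartment. Unannotated proteins share nothing."""
    counts: Counter = Counter()
    for a, b in pairs:
        shared = localisation.get(a, set()) & localisation.get(b, set())
        for compartment in shared:
            counts[compartment] += 1
    return counts
```

Dividing each count by the number of pairs in the dataset gives the co-localization percentage per compartment.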
A large number of novel co-localized interactions are found for the nucleus and cytoplasm. PIPE2 generates more co-localized interactions than the experimental methods in seven out of eight categories.
Funding for open access charge: Wellcome Trust, UK.