Case studies

  1. Filtering procedure. Construction of high confidence integrated network
  2. Retrieving of Cell Cycle Network.
  3. Retrieving of Complex Interaction Network.
  4. Intermediate Results and Query execution Statistics.
  5. Sample Queries and Results.
  6. High-confidence MAPK network.

Filtering procedure. Construction of high confidence integrated network.

To obtain a high-confidence integrated network, we applied the following filtering steps.

 

  • Protein-protein interactions from MIPS were filtered to remove high-throughput (HTP) interactions contributed by yeast two-hybrid (y2h) and co-immunoprecipitation (co-IP) studies to construct MIPS HC (1207 nodes, 1785 edges).
  • To get high confidence interactions (HTP HC all) from the high throughput protein-protein interactions, we took the union of two y2h data sets (Uetz et al. (2000) and Ito et al. (2001)) and intersected it with the union of two co-IP data sets (Gavin et al. (2002) and Ho et al. (2002)), using the matrix interpretation for co-IP data.
  • A high confidence DNA-protein network (MIT HC, 2420 nodes, 4365 interactions) was constructed from Lee et al. (2002) data filtered for a p-value threshold of 0.001.
  • Genetic interactions from MIPS and Tong et al. (2001, 2004) were added to the high confidence DNA-protein data, and all the interactions from this data set that were supported by at least one high throughput protein-protein interaction were used to construct genetic HC (289 nodes, 490 interactions).
  • A high confidence, integrated interaction network (All HC) was derived by taking the union of MIPS HC, HTP HC all and genetic HC (1469 nodes, 2997 interactions, connected component of 1037 nodes).
    Figure 1. Venn diagram summarizing data filtering procedures.
  • HTP_HC_ALL: high confidence physical interaction network supported by high throughput (HTP) experiment. The interaction must be supported by both Y2H and CO-IP/complex data. Different from HTP_HC, the network is not filtered with co-localization data. Graph file: htp_hc_all.sif
  • all_hc: union of htp_hc_all, mips_hc and genetic_hc. Graph file: all_hc.sif
  • genetic_HC: take HTP protein-protein interaction (Y2H, CO-IP or complex), pick those interactions that are supported by either MIPs genetic or HTP genetic data (Tong or MIT (cutoff < 0.001)). Graph file: genetic_hc.sif
  • MIPs_HC: high confidence physical interaction network from MIPs. The interaction must be supported by biochemical data. If the interaction is supported only by high throughput experiment, it is not included. Graph file: mips_hc.sif
  • ItoUetz_GENETIC_INTERSECT: Intersection of (Ito Union Uetz) with (2 genetic Tong’s data Union genetic_mips). Graph File: itouetz_genetic_intersect.sif
  • FYI: Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Vidal et al. 2004 Nature Vol430:88-93. supplement table 1. Graph File: FYI.sif
  • HMI_network.sif: union of MIPs_HC, htp_hc_all and itouetz_genetic_intersect. Graph File: HMI_network.sif
  • FYI_HMI: Intersection of FYI network and HMI_network. Interactions found only in FYI or only in HMI are labeled as FYI and HMI respectively, and the ones shared by both are represented as a single edge labeled as ‘both’. Result: FYI_HMI.sif
    Figure 2a. Intersection of FYI network and HMI_network incorporating MIPS complexes and computational predictions (Han et al, 2004) colored due to GeneOntology annotation.
    Figure 2b. Functionally related gene groups in FYI_HMI interaction network.
    Figure 2c. Functionally related gene groups in FYI_HMI interaction network.
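
The filtering steps above amount to set operations on edge lists. The following is a minimal Python sketch, assuming undirected interactions and using made-up protein names (P1, P2, ...) in place of real data. The helper names edge and matrix_model are illustrative and are not part of PathSys. It shows the matrix interpretation of co-IP complexes, the HTP HC all intersection and the final All HC union.

```python
from itertools import combinations

def edge(a, b):
    """Undirected edge stored as an order-independent key."""
    return frozenset((a, b))

def matrix_model(complexes):
    """Matrix interpretation of co-IP data: every pair of proteins
    found in the same purified complex counts as an interaction."""
    edges = set()
    for members in complexes:
        for a, b in combinations(sorted(set(members)), 2):
            edges.add(edge(a, b))
    return edges

# Toy inputs (hypothetical names, not taken from any data set).
uetz = {edge("P1", "P2"), edge("P2", "P3")}
ito = {edge("P1", "P2"), edge("P4", "P5")}
gavin_complexes = [["P1", "P2", "P6"]]
ho_complexes = [["P4", "P5"], ["P7", "P8"]]
mips_hc = {edge("P2", "P3"), edge("P9", "P10")}
genetic_hc = {edge("P4", "P5")}

y2h = uetz | ito                          # union of the two y2h sets
co_ip = matrix_model(gavin_complexes) | matrix_model(ho_complexes)
htp_hc_all = y2h & co_ip                  # supported by both y2h and co-IP
all_hc = mips_hc | htp_hc_all | genetic_hc

for e in sorted(tuple(sorted(e)) for e in all_hc):
    print(e)
```

Storing each edge as a frozenset makes (A, B) and (B, A) the same key, which matters when the source data sets list interaction partners in different orders.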

    Retrieving of Cell Cycle Network.

    This query demonstrates the use of PathSys as a pre-screening tool for defining literature searches by quickly summarizing and reviewing the molecular interactions as well as the transcriptional regulatory information for genes involved in a particular cellular process. GO biological process annotation was applied to a large network derived from the union of all_hc and MIT_HC (MHM_HC), which was then filtered for genes involved in ‘cell cycle’. The resulting network shows well-known functional modules involved in the cell cycle such as DNA replication, DNA packaging, degradation of cyclins, chromosome segregation, DNA repair, etc. In addition, it reveals cell cycle related transcription factors, such as MCM1 and ABF1, regulating their target genes. Even though the DNA-protein interactions are derived from high-throughput datasets filtered to reduce false positives, caution should be used in interpreting the results. The reference source for each interaction can be obtained from the edge attribute information.

  • MHM_HC: Union of MIT_HC, HTP_HC_ALL and MIPS_HC. (MHM stands for MIT_HTP_MIPS). Graph file:mhm_hc.sif.
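
The annotation filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and annotation sets are invented, annotations are treated as flat sets of term names (a real GO filter would also need to include child terms of ‘cell cycle’), and filter_by_annotation is a hypothetical helper, not a PathSys function.

```python
def filter_by_annotation(edges, annotations, term):
    """Keep edges whose two endpoints are both annotated with `term`."""
    keep = {gene for gene, terms in annotations.items() if term in terms}
    return [(a, kind, b) for a, kind, b in edges if a in keep and b in keep]

# Toy network as SIF-like triples: (node, interaction type, node).
edges = [
    ("TF_X", "dna_protein", "GENE_A"),
    ("GENE_A", "physical", "GENE_B"),
    ("GENE_B", "physical", "GENE_C"),
]
# Toy annotation table (invented for this example).
annotations = {
    "TF_X": {"cell cycle", "transcription"},
    "GENE_A": {"cell cycle"},
    "GENE_B": {"cell cycle", "DNA repair"},
    "GENE_C": {"transport"},
}

cell_cycle_net = filter_by_annotation(edges, annotations, "cell cycle")
for a, kind, b in cell_cycle_net:
    print(a, kind, b)
```

The edge to GENE_C is dropped because only one of its endpoints carries the term, so the result is the subnetwork induced by the annotated genes.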

    Retrieving of Complex Interaction Network.

    Complex_ID          BC value       Complex annotation
    440.30.10           0.25024954     mRNA splicing
    480.1               0.232740746    SPB components
    270.20.10           0.162348718    ctf19 protein complex
    510.40.20           0.158207356    SRB mediator complex
    440.30.10.20        0.131013265    prp 9/11/21 complex
    270.20.40           0.12936989     ndc80 protein complex
    260.50.10           0.0766123      tSNARES
    140.20.20           0.066129342    actin associated proteins
    480.2               0.061237633    SPB associated proteins
    445.1               0.060252701    SCF_CDC4 complex
    230.20.20           0.054251563    SAGA
    510.190.10.20.10    0.054251563    SAGA
    510.40.10           0.050920464    RNA polII
    140.30.20           0.049608979    tubulin associated proteins
    410.3               0.049004938    pre-replication

    TABLE 1. The fifteen complexes with the highest betweenness centrality (BC), with their BC values and functional annotations.
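
For readers unfamiliar with betweenness centrality, the sketch below computes it for a small undirected, unweighted graph in Python. The source does not say how the BC values in Table 1 were computed; this example uses Brandes' algorithm, and the normalization by (n-1)(n-2) is an assumption chosen so that values fall between 0 and 1. The node names C1 to C4 are placeholders, not MIPS complex IDs.

```python
from collections import deque

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph.
    `adj` maps every node to the set of its neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each unordered pair is visited twice, so dividing by (n-1)(n-2)
    # gives the fraction of node pairs whose shortest paths pass through v.
    norm = (n - 1) * (n - 2) if n > 2 else 1
    return {v: bc[v] / norm for v in bc}

# Toy star-shaped network: C2 connects three otherwise unlinked nodes.
edges = [("C1", "C2"), ("C2", "C3"), ("C2", "C4")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

bc = betweenness_centrality(adj)
for node in sorted(bc, key=bc.get, reverse=True):
    print(node, bc[node])
```

In this toy graph every shortest path between two outer nodes runs through C2, so C2 receives the maximum score and the outer nodes score zero.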

     

    Figure 3. In this network with 164 nodes and 482 interactions, each node represents a protein complex identified by a complex_ID label from MIPS and edges are inter-complex protein-protein interactions from high-confidence HMI network.

     

    Figure 4. Interaction details on highest BC complex node

    Intermediate results and Query execution Statistics.

        In the integrated graph, there are 25597 nodes and 1956530 edges.
    • Number of gene pairs that are co-localized and physically interacting, verified by 2-hybrid: 1596
    • Number of complex/gene pairs that are co-localized and physically interacting, verified by co-ip/complex data: 790
    • Number of gene pairs that are co-localized and physically interacting, verified by 2-hybrid and co-ip/complex: 350
    • Number of gene pairs in MIT gene-regulator data (P_VALUE >=0): 708510
    • Number of gene pairs in MIT gene-regulator data (P_VALUE >= 0.8): 166434
    • Number of proteins at each cellular location:
      ER			295
      ER to Golgi		6
      Golgi			40
      NA			2062 -- excluded from all queries
      actin			32
      ambiguous		233
      bud			69
      bud neck		93
      cell periphery		158
      cytoplasm		1745
      early Golgi		52
      endosome		49
      late Golgi		46
      lipid particle		23
      microtubule		17
      mitochondrion		517
      nuclear periphery	52
      nucleolus		143
      nucleus			1336
      peroxisome		21
      punctate composite	139
      spindle pole		62
      vacuolar membrane	60
      vacuole			163
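
The co-localization counts above depend on how proteins without a usable location are handled. The sketch below shows one way to split an edge list accordingly in Python. It assumes a single location label per protein and treats only "NA" as unknown, following the note that NA is excluded from all queries; whether "ambiguous" should be handled the same way is not stated, so it is left as a parameter. The function name and the toy data are hypothetical.

```python
def split_by_colocalization(edges, location, unknown_labels=("NA",)):
    """Split edges into co-localized, differently localized and
    unknown (at least one endpoint has no usable location)."""
    colocalized, separated, unknown = [], [], []
    for a, b in edges:
        loc_a = location.get(a, "NA")   # missing proteins count as NA
        loc_b = location.get(b, "NA")
        if loc_a in unknown_labels or loc_b in unknown_labels:
            unknown.append((a, b))
        elif loc_a == loc_b:
            colocalized.append((a, b))
        else:
            separated.append((a, b))
    return colocalized, separated, unknown

# Toy data (invented protein names and locations).
location = {"P1": "nucleus", "P2": "nucleus", "P3": "cytoplasm"}
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]

colocalized, separated, unknown = split_by_colocalization(edges, location)
print("co-localized:", colocalized)
print("different locations:", separated)
print("location unknown:", unknown)
```

Keeping the unknown group separate makes it possible to see how many interactions fail a co-localization filter only because the localization data are incomplete.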
      

    All intermediate networks are described below:

    • MIPs_HC: high confidence physical interaction network from MIPs. The interaction must be supported by biochemical data. If the interaction is supported only by high throughput experiment, it is not included. Graph file: mips_hc.sif
      Fraction of edges returned by query: 0.09%
      Execution Time: 12 sec.
    • HTP_HC: high confidence physical interaction network supported by high throughput (HTP) experiment. The interaction must be supported by both Y2H and CO-IP/complex data. The pair of proteins involved in the interaction must also be co-localized. The list of HTP data sets is shown here. The protein localization data set is from UCSF. Graph file: htp_hc.sif.
      Fraction of edges returned by query: 0.02%
      Execution Time: 37 sec.

      BioNetSQL Query:

      WITH htp_pi AS ( 
      SELECT graph(e) 
      FROM yeastGraphDB G(N, E) 
      WHERE e:E and e.label = 'physical' and e.reference in ('htp_ref1', 'htp_ref2', ...) ) 
      SELECT graph(e2) 
      FROM htp_pi G2(N2, E2) 
      WHERE e2:E2 and n2a:N2 and n2b:N2 and n2a=e2.source and n2b=e2.target and n2a.location=n2b.location; 
      
    • HTP_HC_ALL: high confidence physical interaction network supported by high throughput (HTP) experiment. The interaction must be supported by both Y2H and CO-IP/complex data. Different from HTP_HC, the network is not filtered with co-localization data. Graph file: htp_hc_all.sif
      Fraction of edges returned by query: 0.04%
      Execution Time: 32 sec.
    • HTP_HC_NL: high confidence physical interaction network supported by high throughput (HTP) experiment. The interaction must be supported by both Y2H and CO-IP/complex data. For each interaction, one or both members’ location is unknown in UCSF localization data. We would like to know how many interactions in HTP_HC_ALL fail to pass co-localization filter due to incomplete data. Graph file: htp_hc_nl.sif
      Fraction of edges returned by query: 0.01%
      Execution Time: 21 sec.
    • query1d_HC1: union of MIPs_HC and HTP_HC. high confidence physical interaction network. Graph file: query1d_hc1.sif
      Fraction of edges returned by query: 0.11%
      Execution Time: 9 sec.
    • query1d_hc_intersect: intersection of htp_hc and mips_hc. Graph file: query1d_hc_intersect.sif
      Fraction of edges returned by query: 0.005%
      Execution Time: 9 sec.
    • genetic_HC: take HTP protein-protein interaction (Y2H, CO-IP or complex), pick those interactions that are supported by either MIPs genetic or HTP genetic data (Tong or MIT (cutoff < 0.001)). Graph file:genetic_hc.sif
      Fraction of edges returned by query: 0.03%
      Execution Time: 18 sec.
    • query1d_all: union of genetic_hc, mips_hc and htp_hc. Graph file: query1d_all.sif
      Fraction of edges returned by query: 0.14%
      Execution Time: 5 sec.
    • ppi_hc: union of htp_hc_all and mips_hc. Graph file: ppi_hc.sif
      Fraction of edges returned by query: 0.13%
      Execution Time: 3 sec.
    • all_hc: union of htp_hc_all, mips_hc and genetic_hc. Graph file: all_hc.sif
      Fraction of edges returned by query: 0.15%
      Execution Time: 7 sec.
    • DP_genetic: Get all interactions in MIT (p<0.001) or CSH (appropriate cut-off) which are supported by genetic interactions either from MIPs or Tong’s. Graph file: dp_genetic.sif.
      Fraction of edges returned by query: 0.0005%
      Execution Time: 18 sec.
    • all_hcdp: union of all_hc and DP_genetic. Graph file: all_hcdp.sif.
      Fraction of edges returned by query: 0.15%
      Execution Time: 4 sec.
    • htp_ppi_2: get all the protein-protein interactions from HTPs that are supported by either both Y2H datasets or both CO-IP datasets (using matrix model, complex data from MIPS). Graph file: htp_ppi_2.sif.
      Fraction of edges returned by query: 0.24%
      Execution Time: 23 sec.
    • query1_final: union of all_hcdp and htp_ppi_2. Graph file: query1_final.sif.
      Fraction of edges returned by query: 0.39%
      Execution Time: 4 sec.
    • MIT_HC: MIT interactions at a cutoff of P<0.001. Each row in the result file is an interaction. The format is: DNA dna_protein:MIT Factor. Graph file: mit_hc.sif.
      Fraction of edges returned by query: 0.22%
      Execution Time: 6 sec.
    • MHM_HC: Union of MIT_HC, HTP_HC_ALL and MIPS_HC. (MHM stands for MIT_HTP_MIPS). Graph file: mhm_hc.sif.
      Fraction of edges returned by query: 0.35%
      Execution Time: 5 sec.
    • MP2M_HC: Union of MIT_HC, HTP_PPI_2 and MIPS_HC. (MP2M means MIT-HTP_PPI_2-MIPS). Graph file: mp2m_hc.sif.
      Fraction of edges returned by query: 0.56%
      Execution Time: 5 sec.
    • UETZ_Y2H: Uetz’s yeast two hybrid data set (PubMed ID = 10688190). Data file: uetz_y2h.sif
      Fraction of edges returned by query: 0.05%
      Execution Time: 5 sec.
    • ITO_Y2H: Ito’s yeast two hybrid data set (PubMed ID = 11283351). Data file: ito_y2h.sif
      Fraction of edges returned by query: 0.23%
      Execution Time: 5 sec.
    • UETZ_ITO_INTERSECT: The intersection (common edges) of Uetz’s and Ito’s data. Graph file:uetz_ito_intersect.sif
      Fraction of edges returned by query: 0.008%
      Execution Time: 5 sec.
    • UETZ_ITO_UNION: The union of Uetz’s and Ito’s data. Graph file: uetz_ito_union.sif
      Fraction of edges returned by query: 0.28%
      Execution Time: 5 sec.
    • Gavin’s complex data: PubMed ID = 11805826; gavin_complex.txt: Original complex clusters; gavin_matrix.sif: Matrix model
      Fraction of edges returned by query: 1.6%
      Execution Time: 5 sec.
    • Ho’s complex data: PubMed ID = 11805837; ho_complex.txt: Original complex clusters; ho_matrix.sif: Matrix model
      Fraction of edges returned by query: 1.6%
      Execution Time: 5 sec.
    • GAVIN_HO_INTERSECT: The common edges of Gavin’s and Ho’s Co-IP matrix data. Graph File:gavin_ho_intersect.sif
      Fraction of edges returned by query: 0.11%
      Execution Time: 5 sec.
    • ItoUetz_GENETIC_INTERSECT: Intersection of (Ito Union Uetz) with (2 genetic Tong’s data Union genetic_mips). Graph File: itouetz_genetic_intersect.sif
      Fraction of edges returned by query: 0.002%
      Execution Time: 5 sec.
    • FYI: Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Vidal et al. 2004 Nature Vol430:88-93. supplement table 1. Graph File: FYI.sif
      Fraction of edges returned by query: 0.13%
      Execution Time: 5 sec.
    • genetic_lethal: union Tong’s genetic data and MIPs genetic interactions, then keep those interactions that are labeled with “synthetic lethal”. Graph File: genetic_lethal.sif
      Fraction of edges returned by query: 0.13%
      Execution Time: 18 sec.
    • FYI_genetic_intersect: First, union Tong’s genetic data and MIPs genetic interactions, then keep those interactions that are labeled with “synthetic lethal”, and intersect them with FYI data set. Graph File:FYI_genetic_intersect.sif
      Fraction of edges returned by query: 0.004%
      Execution Time: 5 sec.
    • HMI_network.sif: union of MIPs_HC, htp_hc_all and itouetz_genetic_intersect. Graph File:HMI_network.sif
      Fraction of edges returned by query: 0.03%
      Execution Time: 4 sec.
    • HMI_complex_network: 1. Get MIPS complexes, which are considered a gold standard and high-confidence (represented as single nodes); 2. For each of the component proteins, expand the network by adding interactions from our high-confidence HMI network only. Only add interactions that do not involve proteins in the same complex (inter-complex interactions only, not intra-complex). Results: A. a network that excludes inter-complex edges that also appear as intra-complex edges in different complexes; B. a network that includes such edges; MIPS_complex only; network A except mips_complex edges; network B except mips_complex edges.
      Fraction of edges returned by query: 0.12%
      Execution Time: 15 sec.
    • Essential ORFs: In SGD, select those ORFs that have “inviable” phenotype in the systematic deletion study. Result: essential_orf.txt
      Fraction of edges returned by query: 0.24%
      Execution Time: 4 sec.
    • MIPs_biochem_genetic_lethal: Synthetic lethal genetic interactions in MIPs supported by biochemical data. Graph File: MIPs_biochem_genetic_lethal.sif
      Fraction of edges returned by query: 0.03%
      Execution Time: 16 sec.
    • HTP_genetic_lethal: Synthetic lethal genetic interactions in MIPs supported by HTP data. Graph File:htp_genetic_lethal.sif
      Fraction of edges returned by query: 0.12%
      Execution Time: 5 sec.
    • MIT_CSH_HC: Find CSH interactions that have mapped edges where gene and factor both are known (not null). Union them with MIT_HC. Graph File: MIT_CSH_HC.sif
      Fraction of edges returned by query: 0.24%
      Execution Time: 21 sec.
    • MIT_CSH_HC_INTERSECT: Find CSH interactions that have mapped edges where gene and factor both are known (not null). Intersect them with MIT_HC. Graph File: mit_csh_hc_intersect.sif
      Fraction of edges returned by query: 0.002%
      Execution Time: 5 sec.
    • ORF_NO_PI: Take All ORFs from SGD. Find those ORFs that have no protein-protein interaction in MIPs, Gavin, Ho, Ito, Uetz’s data sets, and no dna-protein interactions in MIT_HC (cutoff<0.001) and TRANSFAC. Result: orf_no_pi.txt
      Fraction of edges returned by query: 0.02%
      Execution Time: 5 sec.
    • NO_PI_GENETIC: For each ORF in ORFs_NO_PI, find the genes that interact genetically with the ORF in TONG + MIPs_genetic. Result: no_pi_genetic.sif
      Fraction of edges returned by query: 0.01%
      Execution Time: 5 sec.
    • NO_PI_PREBIND: Find the interactions in PreBIND involving the ORFs in ORFs_NO_PI. Result:no_pi_prebind.sif
      Fraction of edges returned by query: 0.005%
      Execution Time: 5 sec.
    • DEGREE_FYI_SL: neighbors of degree outliers in FYI_union_SL network (SL: genetically lethal interactions). Result:degree_fyi_sl.sif
      Fraction of edges returned by query: 0.005%
      Execution Time: 5 sec.
    • CC_FYI_SL: neighbors of clustering coefficient outliers in FYI_union_SL network. Result: cc_fyi_sl.sif
      Fraction of edges returned by query: 0.003%
      Execution Time: 5 sec.
    • BC_FYI_SL: neighbors of betweenness centrality outliers in FYI_union_SL network. Result: bc_fyi_sl.sif
      Fraction of edges returned by query: 0.003%
      Execution Time: 5 sec.
    • DEGREE_FYI_DP: neighbors of degree outliers in FYI_union_DP network (DP = MIT_HC, i.e. MIT with cutoff < 0.001). Result: degree_fyi_dp.sif
      Fraction of edges returned by query: 0.04%
      Execution Time: 5 sec.
    • CC_FYI_DP: neighbors of clustering coefficient outliers in FYI_union_DP network. Result: cc_fyi_dp.sif
      Fraction of edges returned by query: 0.001%
      Execution Time: 5 sec.
    • DEGREE_SL_DP: neighbors of degree outliers in SL_union_DP network. Result: degree_sl_dp.sif
      Fraction of edges returned by query: 0.003%
      Execution Time: 5 sec.
    • OUTLIERS_NN_FDS: neighbors of the outliers in FYI_union_DP_union_SL network. Result:outliers_nn_fds.sif
      Fraction of edges returned by query: 0.05%
      Execution Time: 5 sec.
    • OUTLIERS_FDS_NETWORK: interactions between the outliers in FYI_union_DP_union_SL network. Result: outliers_fds_network.sif
      Fraction of edges returned by query: 0.001%
      Execution Time: 5 sec.
    • FYI_HMI: Intersection of FYI network and HMI_network. Interactions found only in FYI or only in HMI are labeled as FYI and HMI respectively, and the ones shared by both are represented as a single edge labeled as ‘both’. Result: FYI_HMI.sif
      Fraction of edges returned by query: 0.16%
      Execution Time: 23 sec.
    • MIN_NETWORK: Given a list of meiosis-related genes, find the shortest paths between each pair (n*(n-1)/2 pairs). Union all shortest paths into one graph. Note: use cutoff<0.001 for MIT data. Result: min_network.sif. With cutoff < 0.005, the result is min_network_005.sif.
      Fraction of edges returned by query: 0.005%
      Execution Time: very long (~10 minutes)
    • MIN_NETWORK_ALL: In min_network, extract all transcription factors (those nodes which are descendants of “transcription regulator” (GO:0030528)). Find each transcription factor’s direct neighbors in CSH, MIT (cutoff<0.001) and TRANSFAC. Union the neighbors with min_network. Result: min_network_all.sif. Similarly, find the neighborhoods of transcription factors from min_network_005 and union them together to form min_network_all_005.sif
      Fraction of edges returned by query: 0.08%
      Execution Time: 7 sec.
    • GO Distance Matrix for yeast genes in FYI: yeast_gene_list.txt contains all genes in the FYI network. For each pair of genes in the list, find their GO distance in the Biological_Process, Cellular_Component and Molecular_Function subgraphs. If node C is the least common ancestor (LCA) of genes A and B, then the GO distance between A and B is Distance(A, C) + Distance(B, C). For example, if A and B have a common parent, then the GO distance between A and B is 1+1 = 2. If A and B do not have a common ancestor, their distance is -1. Results: GO distance matrix in Biological Process, Molecular Function, Cellular Component.
      Fraction of edges returned by query: not available
      Execution Time: 42 sec.
    • GO Distance Matrix for fly genes in BIND-FLY interaction network. Similar to how we compute the yeast GO distance matrix, fly_gene_name_list.txt contains all genes in BIND-FLY interaction network. Results: GO distance matrix in Biological Process, Molecular Function, and Cellular Component.
      Fraction of edges returned by query: not available
      Execution Time: 37 sec.
    • Interactions among peroxisome-related genes: 1. peroxisome-related genes; 2. interactions among the genes (screenshot in Cytoscape); 3. union of shortest paths between all pairs of peroxisome-related genes (screenshot in BiologicalNetworks);
      Fraction of edges returned by query: 0.07%
      Execution Time: very, very long (about 1 day)

      BioNetSQL Query:

      WITH peroxisome_genes AS ( 
      SELECT n1 
      FROM yeastGraphDB G(N1, E1) 
      WHERE n1:N1 and n1.description like '%peroxisom%' ) 
      SELECT union_of_shortest_paths(G2, peroxisome_genes) 
      FROM yeastGraphDB G2, peroxisome_genes;
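
MIN_NETWORK and the peroxisome query above both take the union of shortest paths between all pairs of seed genes. The following Python sketch shows one way to compute such a union on an unweighted, undirected graph. It is not the BioNetSQL implementation of union_of_shortest_paths: keeping all shortest paths per pair (and not a single one) is an assumption, and the node names are placeholders.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def union_of_shortest_paths(edges, seeds):
    """Edges lying on at least one shortest path between any seed pair."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {s: bfs_distances(adj, s) for s in seeds if s in adj}
    result = set()
    for s, t in combinations(sorted(dist), 2):   # n*(n-1)/2 pairs
        if t not in dist[s]:
            continue                             # s and t are disconnected
        d = dist[s][t]
        for u, v in edges:
            # An edge is on a shortest s-t path if going s -> a, a -> b,
            # b -> t adds up to exactly the s-t distance.
            for a, b in ((u, v), (v, u)):
                if (a in dist[s] and b in dist[t]
                        and dist[s][a] + 1 + dist[t][b] == d):
                    result.add(frozenset((u, v)))
    return result

# Toy graph: A-B-D is the shortest route between seeds A and D.
edges = [("A", "B"), ("B", "D"), ("A", "E"), ("E", "F"), ("F", "D"), ("D", "G")]
network = union_of_shortest_paths(edges, ["A", "D"])
print(sorted(tuple(sorted(e)) for e in network))
```

One breadth-first search per seed is enough here, because an edge can be tested against the precomputed distances from both endpoints of each pair.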
      
      

      Sample Queries and Results

      • Query 1d:

        Find physically interacting proteins. The interaction is verified by yeast two-hybrid. The protein pairs are either co-immunoprecipitated or co-existing in some complex. They are also co-localized. (Results)

      • Query 1e:

        Find proteins that are co-localized but do not appear in any complex, 2-hybrid or co-ip data. (Results)

      • Query 2c:

        Find gene pairs satisfying the following conditions:
        (1) co-localized and physically interacting, verified by 2-hybrid and co-ip/complex data (See Query 1D Results).
        (2) genetically interacting, or one gene is regulated by the protein of the other gene (DNA-protein interaction)
        (Results)
        NOTE: If we use P_VALUE 0.8 as cutoff when we choose gene-regulator data from MIT database, the results are here

      • Query 2c-a:

        Do the same query as Query 2c except that condition (1) is changed to: co-localized and physically interacting, verified by 2-hybrid data. (Results)

      • Query 2c-b:

        Do the same query as Query 2c except that condition (1) is changed to: co-localized and physically interacting, verified by co-ip/complex data. (Results)

      • Query 2d:

        * result 2ab: gene pairs that are genetically interacting, or one gene is regulated by the protein of the other gene (DNA-protein interaction, P_VALUE >= 0.8)
        * result 2d: result 1e (see query 1e) intersect result 2ab.
        (Results) NOTE: The graph may be too big to display; a text version is here. Download query2d_new.sif to display results in Cytoscape. The format is:

              FACTOR pd DNA

      All query results are exported into an MS Excel file (query_results.xls). GO function and process annotation for each node is included.

      • Find neighbors of Query 1d results:

        For each subgraph that has more than 4 nodes, find the proteins/genes that interact with these nodes (MIT source is excluded).

      • Query 3:

        For genes in cluster 4 of Esposito’s sporulation microarray data, show the genes that interact with each other. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query3.sif)
        NOTE: For DNA-protein edges, the first ORF is DNA and the second is the factor. For example:

        YDR374C dna_protein YOL067C

        YDR374C is the DNA bound by factor YOL067C.

      • Query 3a:

        For genes in cluster 4 of Esposito’s sporulation microarray data, show the genes that interact with each other AND are co-localized. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query3a.sif)

      • Query 3b:

        For genes in cluster 4 of Esposito’s sporulation microarray data, show the genes that interact with each other but are NOT co-localized. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query3b.sif)

      • Query 3c:

        For genes in clusters 1, 2, 3 and 4 of Esposito’s sporulation microarray data, show the genes that interact with each other. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query3c.sif)

      • Query 3d:

        For genes in clusters 1, 2, 3 and 4 of Esposito’s sporulation microarray data, show the genes that interact with each other AND are co-localized. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query 3d.sif)

      • Query 3e:

        For genes in clusters 1, 2, 3 and 4 of Esposito’s sporulation microarray data, show the genes that interact with each other but are NOT co-localized. The interaction type could be protein-protein, DNA-protein or genetic interaction. For DNA-protein interaction, use P_VALUE > 0.8 as cutoff. (Result file: query 3e.sif)

      • Query 4a:

        Find all neighbors of SPO11. The interaction type could be protein-protein, DNA-protein or genetic interaction. For all these neighbors, find protein-protein and genetic interactions. (P_VALUE > 0.8. Result file: query 4a.sif)

      • Query 4b:

        Find 2-nearest neighbors of SPO11. The interaction type is either protein-protein or genetic interaction. (Result file: query 4b.sif)

      • Query 5a:

        Select those genes that show anything but “normal” in Enyenihi and Saunder’s sporulation defects data, find any networks within them, with any possible interactions.
        * query5a_1.sif: Use P_VALUE > 0.8 as cutoff for MIT’s DNA-protein interactions.
        * query5a_2.sif: Use P_VALUE > 0.95 as cutoff for MIT’s DNA-protein interactions.

      • Query 5b:

        In Esposito’s clusters 1, 2, 3 and 4, there are 202 genes in total. In Enyenihi’s data, there are 479 genes displaying anything but ‘normal’ sporulation defects. Among these two groups of genes,
        * 23 genes appear in both groups.
        * 179 genes appear in only Esposito’s cluster 1-4.
        * 456 genes appear in only Enyenihi’s subset.

        Union these two groups of genes, and find any networks within them, with any possible interactions.
        * query5b_4.sif: Use P_VALUE > 0.8 as cutoff for MIT’s DNA-protein interactions.
        * query5b_5.sif: Use P_VALUE > 0.95 as cutoff for MIT’s DNA-protein interactions
        NOTE : The network uses ONLY those genes in the group. If there is a path through some gene that is not in the group, the path is not picked up.

      • Query 6:

        From Enyenihi’s paper, take uncharacterized ORFs that showed an effect on IME1 induction (listed in Table1.doc; select for IME1 induction less than ++++) and find the neighborhood of these genes. The goal is to see if they start relating to any nutritional/environmental signal transduction pathways. Hopefully, we can find connections between these genes and IME1.
        NOTE: P_VALUE > 0.95 is used as cutoff for MIT’s DNA-protein interactions.
        * take those genes with effect less than ++++, find their nearest neighbors. ( query6a.sif)
        * take those genes with effect less than +++, find their nearest neighbors. ( query6b.sif)
        * take those genes with effect less than ++, in other words, 0 or +, find their nearest neighbors. (query6c.sif)

      • Query 7:

        7-1: protein-protein interaction graph (excluding BIND and PREBIND) of those proteins whose genes are expressed in sporulation/meiosis during the first 6-8 hours. (Results 7-1)
        7-2: The whole genome transcription factor-DNA interaction graph from MIT data, using P_VALUE cutoff 0.999 (Results 7-2)
        7-3a: The union of query 7-1 and 7-2 (Results 7-3a)
        7-3b: The intersection of query 7-1 and 7-2: empty results
        7-4: The genetic interaction graph, not including Tong’s data. (Results 7-4)
        7-5a: The union of query 7-1 and 7-4 (Results 7-5a)
        7-5b: The intersection of query 7-1 and 7-4 (Results 7-5b)
        7-6a: The union of query 7-2 and 7-4 (Results 7-6a)
        7-6b: The intersection of query 7-2 and 7-4 : empty results
        7-7a: The union of query 7-1, 7-2 and 7-4 (Results 7-7a)
        7-7b: The intersection of query 7-1, 7-2 and 7-4 : empty results

      • Query IME:

        List genes bound by IME4, order by P_VALUE. (Results)
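
The gene counts reported in Query 5b above can be checked with plain set arithmetic. The Python sketch below uses synthetic identifiers that reproduce only the reported group sizes (23 shared, 179 and 456 exclusive); it does not contain the actual gene lists.

```python
# Synthetic identifiers standing in for the real gene names.
shared = {f"shared_{i}" for i in range(23)}
esposito = shared | {f"esposito_{i}" for i in range(179)}
enyenihi = shared | {f"enyenihi_{i}" for i in range(456)}

both = esposito & enyenihi           # genes in both groups
only_esposito = esposito - enyenihi  # only in Esposito's clusters 1-4
only_enyenihi = enyenihi - esposito  # only in Enyenihi's subset
combined = esposito | enyenihi       # input gene set for Query 5b

print(len(esposito), len(enyenihi))
print(len(both), len(only_esposito), len(only_enyenihi))
print(len(combined))
```

The sizes are consistent with the text: 23 + 179 = 202 and 23 + 456 = 479, so the union used as input for the Query 5b networks contains 202 + 479 - 23 = 658 genes.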

      Some data sets are here

      High-confidence MAPK network

      The sub-network derived from MIPS (MAPK MIPS) shows 37 genes and 74 interactions and the sub-network from ALL HC (MAPK allhc) shows 39 genes and 106 interactions.

      Figure 5. The result of the query on the high-confidence MAPK subgraph. Each edge between two nodes comes from a different source. Nodes are colored according to GO annotations for biological process at level 5; purple: protein metabolism and modification, dark green: polar budding, light green: cell cycle regulation, yellow: signal transduction, light yellow: cell surface receptor linked signal transduction, aqua: perception of external stimulus, teal: cellular organization and biogenesis, dark pink: DNA replication and chromosome cycle, light pink: cytoplasm organization and biogenesis, magenta: growth pattern, orange: M phase, light orange: cell surface organization and biogenesis, dark blue: nucleic acid metabolism, light blue: sporulation, grey: vesicle mediated transport, lavender: Not annotated. Edges are colored according to interaction type; red: MIPS physical, blue: yeast two-hybrid, purple: co-immunoprecipitation, green: MIPS genetic interaction, pink: DNA-protein interaction (MIT).

 

Biological Articles: