BiologicalNetworks


1. Introduction

BiologicalNetworks is a research environment for inferring cellular and molecular mechanisms and for elucidating the factors that have interrelated effects at different levels of an organism, including genes, biomolecules, cells, and cell systems.

BiologicalNetworks integrates over 100 public databases covering thousands of eukaryotic, prokaryotic and viral genomes. It provides a “one stop shop” for users seeking the information needed to decipher gene regulatory networks: sequence and experimental data, functional annotation, orthology relations, and transcriptional regulatory region analysis.

The querying capabilities of BiologicalNetworks allow users to formulate queries with practically any combination of properties (name/synonym, function, sequence, expression, etc.) and conditions on any combination of bio-entities (gene/protein, promoter, COG, pathway, etc.) and/or bio-relations (interactions, co-expression, co-citations, etc.). Queries can be combined with the build-pathways infrastructure to discover molecular interactions, relationships, and modules from high-throughput experiments.

The integrated Genome Viewer allows the user to search and analyze gene regulatory regions, transcription factor binding sites, and their conservation in multiple species, in conjunction with molecular pathways/networks, experimental data, and functional annotations.

BiologicalNetworks uses a specialized graph visualization engine to represent biological pathways, gene regulation networks and protein-protein interaction maps for intuitive exploration and prediction. The software can handle a variety of tasks, including graph drawing and layout optimization, data filtering and pathway expansion, and classification and prioritization of proteins.

BiologicalNetworks uses a proprietary file format (BNX, or BiologicalNetworks model XML format) that stores information pertaining to the model and the corresponding simulation environment in BiologicalNetworks. BiologicalNetworks supports import and export of models in the Systems Biology Markup Language (SBML), SIF and GML file formats.
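
Of the exchange formats named above, SIF is the simplest to inspect by hand. The following Python sketch is an illustration only and is not part of BiologicalNetworks. It assumes the common SIF layout of one line per interaction (a source node, an interaction type, then one or more target nodes, separated by tabs or spaces). The function name parse_sif and the sample lines are hypothetical.

```python
def parse_sif(text):
    """Parse SIF-style text into (source, interaction, target) edges and a node set.

    Assumed layout per line: source, interaction type, then one or more targets,
    separated by tabs or spaces. A line with a single token is an isolated node.
    """
    edges = []
    nodes = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Prefer tabs when present so that node names may contain spaces.
        tokens = line.split("\t") if "\t" in line else line.split()
        source = tokens[0]
        nodes.add(source)
        if len(tokens) < 3:
            continue  # isolated node, or a line without any target
        interaction = tokens[1]
        for target in tokens[2:]:
            nodes.add(target)
            edges.append((source, interaction, target))
    return edges, nodes


# Illustrative sample only; these lines are not taken from the software's data.
sample = "LAT pp ZAP70 GRB2\nGRB2 pp SOS1\nCD3E\n"
edges, nodes = parse_sif(sample)
```

Each line with several targets expands into one edge per target, so the first sample line yields two edges that share the source LAT.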

The following is a summary of the functionality of BiologicalNetworks:

  • Distribution
    BiologicalNetworks is a freely distributed Java web-based application.

  • Multi-platform
    BiologicalNetworks is multi-platform software developed in Java. It has been tested on Windows and Macintosh platforms.


  • Biologist-friendly User Interface (UI)
    The user interface has been designed to be as friendly as possible for the biologist. The UI provides adequate tools for cell model building while hiding certain underlying mechanisms from the user.


  • Database Integration
    The BiologicalNetworks platform is integrated with the IntegromeDB database system, which is compiled from over 100 databases and ontological data sources.


  • Database Querying
    The BiologicalNetworks platform provides extended querying functionalities to specify and retrieve biologically meaningful networks.


  • Network Analysis
    BiologicalNetworks provides network statistics for the models that are loaded. Its other main analysis function is finding conserved pathways among networks.


  • Genomic Sequences Analysis
    BiologicalNetworks provides an integrated environment for working with genomic sequences and regulatory regions in the context of biological pathways and gene regulatory networks. It provides extended functionality for searching regulatory regions and other sequences and for comparative genomics analysis.


  • 3D protein structure Analysis
    BiologicalNetworks provides an integrated environment to work with 3D protein structure data in context of biological pathways and gene regulatory networks.


  • SQL-like querying language
    The system is equipped with a novel query engine and a built-in SQL-like query language that supports operations on paths, trees, and graphs.


This tutorial guides users through BiologicalNetworks by introducing its features and functions in greater detail.

Chapter 2 describes how to download and install BiologicalNetworks. Existing BiologicalNetworks users may skip directly to Chapter 3.

Chapter 4 covers the GUI in detail and introduces users to the functionality of BiologicalNetworks.

Chapter 5 presents biologically relevant examples that provide a step-by-step guide to constructing models using BiologicalNetworks.

Lastly, Chapter 6 explains data management in BiologicalNetworks.

2. Getting Started

2.1 System Requirements

In order to run BiologicalNetworks successfully, your computer must meet the following minimum system requirements.

Windows
2 GHz Intel Pentium CPU
Microsoft Windows 2000, XP, or Vista
At least 1 GB RAM (2 GB recommended)
10 GB free disk space

Macintosh
1 GHz Intel processor
Mac OS X 10.1 or later
At least 1 GB RAM (2 GB recommended)
10 GB free disk space

2.2 Download Additional Data Files

  1. To try the microarray, 3D protein structure, and functional data analyses, you can download example data files.
    Download the ExampleData.ZIP file, which contains Stanford (tab-delimited), Affymetrix, TIGR, and GenePix microarray data, Gene Ontology data files, and PDB 3D protein structures, from the BiologicalNetworks website (http://BiologicalNetworks.net/ExampleData.zip). Before you download a file, note its size in bytes, which is given on the download page. Once the download has completed, check that you have the full, uncorrupted data file.
  2. The ZIP file also contains GO annotation files for Gene Ontology annotation analysis.
  3. Unzip the file anywhere on your hard drive. You can now load these data files into the BiologicalNetworks analysis environment.
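
The check for a full, uncorrupted download can be scripted. The following Python sketch is one possible way to do it, not a tool shipped with BiologicalNetworks. It compares the file size with the byte size read from the download page and then asks the standard zipfile module to verify the checksum of every archive member. The function name check_download is hypothetical, and the demonstration uses a small stand-in archive rather than the real ExampleData.zip.

```python
import os
import tempfile
import zipfile


def check_download(path, expected_size):
    """Return a list of problems found with a downloaded ZIP archive (empty if none)."""
    problems = []
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"size mismatch: expected {expected_size} bytes, got {actual_size}"
        )
    if not zipfile.is_zipfile(path):
        problems.append("not a readable ZIP archive")
        return problems
    with zipfile.ZipFile(path) as archive:
        bad_member = archive.testzip()  # name of first corrupted member, or None
        if bad_member is not None:
            problems.append(f"corrupted member: {bad_member}")
    return problems


# Demonstration on a stand-in archive created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "ExampleData.zip")
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("readme.txt", "stand-in for the example data")
    true_size = os.path.getsize(demo_path)
    good = check_download(demo_path, true_size)
    bad = check_download(demo_path, true_size + 1)
```

In real use, expected_size would be the byte size shown on the download page, and an empty result list would mean that both checks passed.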

2.3 Troubleshooting

If you have problems running BiologicalNetworks, try disabling your firewall program and launching BiologicalNetworks again. If that helps, change your firewall settings to allow Sun Java to connect to the Internet.

For example, to enable Internet access for Sun Java in Norton Internet Security (NIS), follow these steps:

  1. Open NIS.
  2. Click on Status & Setting menu item on the left.
  3. Click on Personal Firewall.
  4. Press Configure button.
  5. Select Programs tab.
  6. Find java item in list.
  7. Select Permit All option in Internet Access column.
  8. Click OK.

For other firewall programs, please refer to the program's manual for details.

If that does not help, try restarting your computer.

3. Quick Tour of BiologicalNetworks

BiologicalNetworks is an application developed for navigating and analyzing molecular networks. The software is also a powerful drawing tool that allows researchers to create their own pathways, summarize experimental results, and produce publication-quality pathways.

3.0 Standard Features

The software can handle a variety of tasks, including graphics and layout optimization, data filtering and pathway expansion, and classification and prioritization of proteins. BiologicalNetworks supports many features available in public pathway browsers and graphical toolkits.

With BiologicalNetworks you can:

  • Create new objects – pathways, nodes, and controls.
  • Edit and delete existing objects.
  • Import and export objects.
  • Create BiologicalNetworks projects to share with other users.
  • Add new fields to objects.
  • Drag-and-drop and move around individual nodes, selected subgraphs, and colored groups.
  • Design new pathways by copying and pasting or manual editing.

3.1 Graphical User Interface

The layout of BiologicalNetworks is shown in Figure 3-1. The interface has three major areas: the Project Pane on the left, the Search Result/Microarray Pane at the bottom, and the central Pathway Navigation Pane.

3.1 General Layout of Interface

Figure 3-1 illustrates the BiologicalNetworks Graphical User Interface (GUI).
  • The Pathway Navigation Pane can contain several pathway views, one window per pathway. The Pathway map is completely configurable: it can show the whole pathway or any set of nodes. For the selected group of proteins, more detailed information can be viewed in the List Pane.


  • The Project Pane organizes data in a Workspace. Proteins, small molecules, and cell processes can be stored in folders. Folders are hierarchical and can include both subfolders and single objects. A separate set of folders is reserved to represent a functional protein ontology and/or classification hierarchies.


  • The Search Result Pane is organized as several tables with a set of columns, where each column represents one characteristic of a node or control found by your search. Useful characteristics include protein symbol, gene name, links to external databases (GenBank, LocusLink, Golden Path, HUGO), experimental conditions, etc. These characteristics can be used for filtering or sorting objects in the table. A selection in the table can be highlighted on the map and vice versa. Behind the Project Pane there are three other panels: the Palette Pane, the Curated Pathways Pane, and the Database Tables Pane.


  • The Palette Pane displays the whole list of node and control types and their graphical representation.


Figure 3-1: BiologicalNetworks Graphic User Interface



3.2 Creating Pathway/Model

There are several ways to create a pathway/model in BiologicalNetworks. You can load ready-made pathways/models from pathway databases or SBML files (covered in later chapters), or create a new pathway/model manually from scratch. This section describes manual creation.

Creating a project - Use the New Project button to open a new project workspace. Click the button to create a new project. A new modelling canvas is generated automatically.

Creating a species - Use the Species buttons in the Palette Pane to create new species. Click a species button once, then click on the modelling canvas where you want to place the new species.

Creating a reaction node - Use the Reaction button to create a new reaction in the modelling canvas. Click the button once, then click in the modelling canvas where you want the new reaction created.

Creating a reaction - Use the Linking Tool (the ReactionFlow/Binding button) to link the species to the reaction node. Click the tool once, click the species you want to link with the reaction node, hold the left mouse button, and drag the arrow to the reaction node.

Creating a compartment - Use the Compartment button to create a new compartment in the modelling canvas. Click on the workspace and drag the mouse over the desired region, which should cover all the reactions taking place in the same compartment.

You can easily explore all the species in the model tree viewer by category and edit any of them (Figure 3-2).

Figure 3-2: BiologicalNetworks Project Browser

3.3 Defining Pathways/Models


3.3.1 Defining a Node

Single-clicking a node in the modelling canvas highlights the fields of the Node Properties Panel (Figure 3-3), which are editable. This panel lets you view and modify the bioentity properties: Name, Aliases, different IDs, etc. In addition, you can include biological information, e.g. full GO annotations, organism and pathway information, and kinetic information: Quantity, Concentration, Unit, Molecular Weight, etc. Various links lead to online databases with more detailed information. The Full Description window at the bottom of the panel gives a full description of the bioentity.

3.3.2 Defining a Process/Reaction

Double-click the Process node in the modelling canvas and a new window, the Process Properties panel (Figure 3-4), appears. Modify the process properties (process stoichiometry, whether the process is reversible, and other properties) in this panel. You may also click the “kinetics” tab to access and modify the reaction/process kinetics.

Figure 3-3: Node Properties Editor

Figure 3-4: Process/Reaction Properties Editor

3.4 Exploring Curated Pathways

Click the Curated Pathways tab in the left panel of the application window to bring up the Curated Pathways tab pane, which contains a database tree panel. ‘Databases’ is the root node of the tree and has a single child, the database server ‘KEGG’.

Click the KEGG node to show the organisms available in the KEGG database. After the data is retrieved, 213 organism nodes appear.

Fig.1.7.1 Curated Pathways Pane.

(The number of organism nodes may differ from the value given above.) Double-click one of the organism nodes to show its pathways. Double-click one of the pathways to show its list of reaction nodes.

Fig.1.7.2 Curated Pathways: Organisms, Pathways, Reactions, Metabolites.

To load pathways into the Pathway Panel, drag a node and drop it onto the Pathway Panel window. The loaded pathways, containing reactions, compounds, and their interactions, are drawn in the model workspace window. Their initial positions are random; use the layout tools to arrange them.

Fig.1.7.3 Drag and drop curated pathways into the Main Pathway Window.

4. BiologicalNetworks Graphical User Interface

4.1 BiologicalNetworks Panes

The main BiologicalNetworks window is divided into three panes and a Menu ToolBar. The panes are:

  • Pathway Pane.
  • Project Pane.
  • List Pane.
Panes, as part of the software interface, help you manage your data. This section describes the panes and presents guidelines for their appearance and general use. Figure 4-1 shows a screenshot of the entire application.

Figure 4-1: Screenshot of BiologicalNetworks

4.2 BiologicalNetworks Panes

4.2.1 Project Pane

The Project Pane contains:
  • Project properties pane.
  • Choose organism drop-down menu.
  • Project Folders.
The Project properties pane lets you define your personal project settings so that the database security system recognizes you and allows you to add, edit, and remove data in the database.

The Choose organism drop-down menu lets you choose the database to study.

Figure 4-2-1: Choose organism drop-down menu


4.2.2 Project Folders

The Project folders in the Folder Pane help you to organize your personal data and can include both subfolders and single objects. With the Project Folders you can create your personal working area. You can add your own subfolder in the Projects Folders.

The Project Pane has the following folder types:

  • The Index folder shows the database content organized by object and process types.
  • The BiologicalNetworks Project folders are designed to help you organize your personal data and can include both subfolders and single objects.
  • The Microarray folder contains the microarray heatmap of the open microarray analysis project.
  • The Groups/Clusters folder contains functional groups, microarray clusters, protein lists, and other groups of objects.
  • The Analysis folder contains subfolders for the different types of microarray analysis (clustering, sorting, filtering, Gene Ontology analysis).

4.2.3 Pathways Pane

The Pathway Navigation Pane or Graph Display is capable of displaying several pathway views, one window per pathway. The pathway map is completely configurable. The map shows individual pathways or any selection of nodes. For a selected group of proteins, more detailed information can be viewed in the List Pane. For example, Figures 4-2-1, 4-2-2 show how you can find a description of selected proteins.

Figure 4-2-2: Metabolic Pathways Pane



4.2.4 List Pane

The List Pane functionality allows you to:
  • Show query results and group contents.
  • Copy and paste contents of the table into the pathway.
  • Copy and paste contents of the table into MS Excel.
  • Select nodes from the table on the pathway diagram.
  • Retrieve the selection from the pathway diagram.
  • Find pathways and groups that contain nodes of interest.
  • Link protein names stored in the List Pane tables to HUGO, LocusLink, and GenBank. Hyperlinks are shown in blue.


When the List Pane shows the contents of groups or search results, it lets you examine useful characteristics of nodes and controls. The characteristics displayed include symbol, gene name, functional group, links to external databases (GenBank, LocusLink, Golden Path, HUGO), experimental conditions, etc. These characteristics can serve as the basis for filtering or sorting objects in the table. Table selections can be highlighted on a pathway map and vice versa.

To show nodes of interest in the List Pane:

  • Create a query. For example, to find nodes whose names contain the string LAT, use the menu Edit > Find Nodes by String or Edit > Find Nodes by Attributes;
  • Run the query;
  • The results appear in the List Pane.
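
The two kinds of query named in the steps above can be pictured with a small sketch. The following Python code is an illustration only. It does not show how BiologicalNetworks implements its search, and it assumes that a string search is a case-insensitive substring match on the node name and that an attribute search is an exact match on every given attribute. The function names and the sample records are hypothetical.

```python
def find_nodes_by_string(nodes, text):
    """Return nodes whose name contains text (assumed case-insensitive)."""
    needle = text.lower()
    return [node for node in nodes if needle in node["name"].lower()]


def find_nodes_by_attributes(nodes, **criteria):
    """Return nodes whose attributes equal every given criterion."""
    return [
        node
        for node in nodes
        if all(node.get(key) == value for key, value in criteria.items())
    ]


# Illustrative records only; a real List Pane table has many more columns.
nodes = [
    {"name": "LAT", "type": "Protein"},
    {"name": "LAT2", "type": "Protein"},
    {"name": "ZAP70", "type": "Protein"},
    {"name": "PLAT", "type": "Gene"},
]
hits = find_nodes_by_string(nodes, "LAT")
protein_hits = find_nodes_by_attributes(hits, type="Protein")
```

Chaining the two functions narrows a string search by attribute, much as sorting and filtering narrow a result table in the List Pane.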


To copy a node of interest from the List Pane to the pathway diagram:

  • Select the node of interest from the List Pane;
  • Add the selected node to the pathway diagram using copy and paste from the List Pane, or simply drag and drop the selection into the Pathway Pane.


To select a node of interest on the graph:

  • In the List Pane, move the mouse over the record for the node of interest;
  • Open the right-click menu and choose Select on Graph;
  • If the node of interest is included in the current pathway diagram, it will be selected on the graph.


To retrieve selection from the graph:

  • Open the List Pane;
  • Open the pathway of interest. Select the node of interest on the graph;
  • Open the context menu on the List Pane and choose Get Selection from Graph;
  • If the node of interest is included in the current list of nodes, it will be selected in the List Pane.


The List Pane functionality allows you to find groups and pathways that the node of interest is a member of. Find groups and pathways containing the selected object(s) by using the following steps:

  • Select the node of interest from the list of nodes;
  • Open the List Pane context menu and choose Find Pathways;
  • The List of Groups and Pathways dialog box appears;
  • Select a pathway or a group and press the Open button;
  • The pathway of interest appears in the Graph Window; or the group of interest opens. See Figure 4-2-4.


Figure 4-2-4: List Pane

4.2.5 Palette Pane

The Palette Pane contains a full list of node and control types and their graphical representation. Property values can be displayed by the shape and colour of the node in the Pathway Pane. Every kind of node or control in the software has a unique name and graphical representation. By default, BiologicalNetworks has built-in styles for all kinds of nodes and controls stored in the List Pane. See Figure 4-2-5.

Figure 4-2-5: Palette Pane

4.3 Menu, Toolbar and Model Kit

The menu, toolbar, and model kit provide quick access to various components of BiologicalNetworks. The menu options are listed in Table 4-1.

Main Menu      Option                      Description

File           New                         Create a new model workspace.
               Load                        Open an existing model saved in the .bnm format, or import an SBML model file. An imported model may be laid out automatically on the model workspace using one of two layout algorithms: force-directed or hierarchical.
               Save                        Save the model in .bnm format, including the diagrammatic layout, or save it as another .bnm file or in another format.
               Print                       Print the model diagram.
               Export as                   Export a model to SBML or another file format.
               Exit                        Exit BiologicalNetworks.

Edit           Squiggle                    Mark up your network.
               Create View                 Create a view for network data.
               Destroy View                Destroy the graphical representation of the network.
               Destroy Network             Destroy network data (not yet visualized).

Data           Display attribute browser   Display the attribute browser, which lets you view attributes assigned to both nodes and edges.

Select         Select                      Operations for selecting nodes and edges, and for using the current selection to create a new network and an associated view.

Layout         Layout Manager              Arrange the network using layout algorithms (force-directed, hierarchical embedding, etc.).

Visualization  Visualization Menu          Options for changing the mapping from biological data to a visual representation: colours of nodes, thickness of edges, etc. These features are explored in depth in Section 9, Visual Styles. This menu also provides an Overview (bird's-eye view) of your entire network, which is helpful for navigating very large networks.

Analysis       Find cycles                 Find independent cycles in the whole network.
               Statistics                  Show network statistics.
               Network BLAST               Show conserved pathways among networks.

Tools          Tools Menu                  Provides access to the Directed Acyclic Graph Engine and the Querying Language GUI.

Table 4-1: Menu Options in BiologicalNetworks

The toolbar contains shortcuts for some commonly used menu options, such as open, save, print, and help, which are activated by clicking the corresponding icons.

4.3.1 Node Editor

In any model, a Node may be one of the following bioentities: Gene, Protein, Cell Object, Enzyme, mRNA, Pathway, Complex, Small Molecule, etc. For easy identification, each category has a different icon associated with it, as shown in Figure 4-2-5. The Node Editor is a pane located at the right of the main window. Figure 4-4 shows a snapshot of the Node Editor. The following information can be entered through the Node Editor:

Figure 4-4: Node Editor

Field              Description

Label              Name displayed on the model workspace
Full Name          Full name
Concentration      Concentration specified in moles/litre
Volume             Volume of the compartment to which the bioentity belongs. Volume is entered through the project properties tab (4.1).
Total Quantity     Number of moles of the bioentity
No. of Molecules   Number of molecules of the bioentity
Value Fixed        Check this box to keep the concentration fixed during simulation
Notes              Annotation of the bioentity

Only one of Concentration and No. of Molecules needs to be entered. The remaining values are calculated using the following formulas:

Total Quantity = Concentration × Volume    (Equation 4-1)
No. of Molecules = Total Quantity × Avogadro’s Number    (Equation 4-2)
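
As a worked illustration of Equations 4-1 and 4-2, the following Python sketch converts in both directions. It assumes that concentration is given in moles/litre and volume in litres (the manual does not state the volume unit, so this is an assumption). The function names are hypothetical and are not part of BiologicalNetworks.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def quantities_from_concentration(concentration, volume):
    """Apply Equation 4-1, then Equation 4-2.

    concentration: moles/litre; volume: litres (assumed unit).
    Returns (total quantity in moles, number of molecules).
    """
    total_quantity = concentration * volume   # Equation 4-1
    molecules = total_quantity * AVOGADRO     # Equation 4-2
    return total_quantity, molecules


def concentration_from_molecules(molecules, volume):
    """Invert the two equations when the number of molecules is entered instead."""
    total_quantity = molecules / AVOGADRO
    return total_quantity / volume


# A concentration of 1 mole/litre in a compartment of volume 1.0E-14.
moles, molecules = quantities_from_concentration(1.0, 1.0e-14)
recovered = concentration_from_molecules(molecules, 1.0e-14)
```

With these inputs the total quantity is 1.0E-14 moles, which corresponds to roughly 6.02E9 molecules.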

4.3.2 Process Properties

An in-silico model of a biological system is abstracted as a network of bioprocesses or chemical reactions. Section 3.3 describes steps to be followed for constructing a model on the model workspace.

Process Properties is an interface (Figure 4-5) for viewing and changing the properties of a bioprocess or a reaction such as stoichiometry, rate laws etc.

The Process editor consists of two tabs:

  • Process: The process/reaction name, reversibility, stoichiometry, and other properties are entered through this tab. Stoichiometry is entered through the drop-down list of positive integers prefixed to each bioentity involved in the process/reaction. A quick view of the properties of the bioentities that form the reaction can be obtained by pressing the View Bioentity Details button, as shown in Figure 4-5.
  • Kinetics: The kinetic law associated with the process/reaction and the associated rate constants are entered through this tab. In this version, only the mass action rate law is supported.
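
To make the mass action rate law concrete, the following Python sketch computes a reaction rate as the rate constant multiplied by each reactant concentration raised to its stoichiometric coefficient. This is the textbook form of the law and is an assumption here; the manual does not spell out the exact expression BiologicalNetworks evaluates. The function names and numbers are hypothetical.

```python
def mass_action_rate(rate_constant, stoichiometry, concentrations):
    """Rate = k * product of concentration ** coefficient over the given species.

    stoichiometry: dict mapping species name to its positive integer coefficient.
    concentrations: dict mapping species name to its current concentration.
    """
    rate = rate_constant
    for name, coefficient in stoichiometry.items():
        rate *= concentrations[name] ** coefficient
    return rate


def net_rate(forward_constant, reverse_constant, reactants, products, concentrations):
    """Net rate of a reversible reaction: forward rate minus reverse rate."""
    forward = mass_action_rate(forward_constant, reactants, concentrations)
    reverse = mass_action_rate(reverse_constant, products, concentrations)
    return forward - reverse


# Reaction 2 A + B -> C with illustrative concentrations and rate constants.
concentrations = {"A": 2.0, "B": 0.5, "C": 0.0}
forward = mass_action_rate(0.1, {"A": 2, "B": 1}, concentrations)
net = net_rate(0.1, 0.05, {"A": 2, "B": 1}, {"C": 1}, concentrations)
```

Because the product C starts at concentration 0, the reverse term vanishes and the net rate equals the forward rate in this example.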

Figure 4-5-1: Process/Reaction Editor - Process/Reaction Tab

Figure 4-5-2: Process/Reaction Editor - Process/Reaction Tab


4.3.3 Compartment Editor

Chemical reactions may take place in different compartments. The Compartment Editor is an interface (Figure 4-6) for changing the properties of a compartment.

Figure 4-6: Compartment Editor

4.4 Overview of Context Menus

Right-click menus, or context menus, are extensive hidden menus that exist throughout BiologicalNetworks. They give you access to commands for the selected objects. In other words, a right-click menu contains the list of commands that can currently be used.

5. BiologicalNetworks Modelling Environment

5.1 Building A Model

This section describes the steps for building a model. Every model requires a project space. To define a new model, first open a new project using the icon in the toolbar or the menu option (File->New). This creates a model workspace on which the model network is built. The following sections describe how to define Nodes and Processes and edit their properties. Free-text details or references for the model can be entered in the description window (4.2). Other specific information, such as User Name, Date, and Species Name, can be entered in the project properties tab mentioned in 4.2.

5.1.1 Adding Biological Components

After creating the model workspace, the next step is to define the different species in the network. The steps for adding a new species to the model are as follows:

  1. Click the icon of the species category (gene, enzyme, etc.) in the Palette Panel. This changes the mouse pointer inside the model workspace to the icon of the selected category.
  2. Place the new species on the model workspace.
  3. Change the properties of the species by invoking the Species editor (Figure 4-4). By default, a name is assigned to the species, and its concentration is set to “1” if it is a reactant and to “0” if it is a product. When the mouse pointer is moved over a species icon, a tool tip is displayed with its name, volume, and concentration.

5.1.2 Adding Processes/Reactions

The next step after creating the bioentities in the model is to define the processes/reactions and the associated kinetics. In BiologicalNetworks, a reaction is graphically represented by a bioprocess/reaction icon (Figure 4-3) and linking arrows. The steps for defining a bioprocess/reaction are as follows:

  1. Select the bioprocess/reaction icon from the Palette Panel.
  2. Place the bioprocess/reaction link icon on the workspace by clicking on the workspace.
  3. Choose the bioprocess/reaction arrow from the Palette Panel and link the bioprocess/reaction link with a bioentity in the bioprocess/reaction by clicking and dragging the mouse. If the bioentity is a source/reactant, the arrow should originate from it, with the arrowhead at the bioprocess/reaction link. To indicate a destination/product, the arrowhead should point to the bioentity and the tail should be on the bioprocess/reaction link.
  4. Open the Bioprocess/Reaction Editor window (Figure 3-4) by double-clicking the bioprocess/reaction icon to view or change the properties and to enter the stoichiometry, rate law, and other information as explained earlier. When the mouse pointer is moved over a bioprocess/reaction icon, a tool tip is displayed with its name, bioprocess/reaction equation, rate law, rate law parameters, etc.

5.1.3 Adding Compartments

In BiologicalNetworks, a compartment is graphically represented by a compartment icon (Figure 4-3).
The steps for defining a compartment are as follows:

  1. Select the compartment icon from the Palette Panel.
  2. Click on the workspace and drag the mouse over the desired region, which should cover all the reactions taking place in the same compartment.
  3. Change the properties of the compartment by invoking the Compartment Editor (Figure 4-6). By default, a name is assigned to the compartment and its volume is set to “1.0E-14”. When the mouse pointer is moved over a compartment, a tool tip is displayed with its name and volume.

5.1.4 Editing Model

The model network definition is complete once all the species and reactions have been created on the workspace and their properties entered through the Species and Reaction editors. To organize the components on the workspace and improve their appearance, you can do the following:

  1. Move the components on the workspace. First switch to the selection mode, then choose the component on the workspace. The component can be moved around and resized. The network topology is preserved throughout these operations.
  2. Change the colour of the components. A component may be given any colour from the colour palette in the Palette Panel.
  3. Add text to the model workspace. To annotate any section of the model, free text may be added by selecting the annotation icon from the model kit and placing a text box on the workspace. The font and colour of the text may be changed using the colour and font icons in the Palette Panel.
  4. Cut, Copy, Paste and Delete components. These standard operations can be performed on any component on the model workspace (species, reaction arrow, reaction link, text box) through the Edit menu or standard keystrokes.
  5. Undo and Redo operations. Up to 10 operations can be undone and redone using the icons in the toolbar.
  6. Zoom in/out/fit and zoom in on a selected region. This feature becomes useful as the model grows. The zoom in, zoom out, zoom to fit, and zoom in selected region icons in the toolbar implement this functionality.
  7. Resize components. The species and reaction icons can be resized by selecting them and dragging one of the control points.
  8. Lay out components. BiologicalNetworks incorporates two layout algorithms for automatically creating an aesthetically pleasing network layout of the model on the drawing workspace. These are:
    • Force-directed algorithm: This algorithm is more suitable for laying out networks with many loops.
    • Hierarchical algorithm: This algorithm gives better results for networks with fewer loops, e.g. linear cascades.
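
To illustrate why a hierarchical layout suits linear cascades, the following Python sketch performs longest-path layering, a common first step of hierarchical graph drawing. It is a generic textbook technique shown under that assumption; the manual does not describe the layout algorithm BiologicalNetworks actually uses. The function name and the sample cascade are hypothetical.

```python
from collections import deque


def assign_layers(graph):
    """Assign each node of an acyclic directed graph to a layer.

    graph: dict mapping a node to the list of its successors.
    A node's layer is the length of the longest path leading to it, so every
    edge points from a lower layer to a higher one.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    indegree = {node: 0 for node in nodes}
    for successors in graph.values():
        for successor in successors:
            indegree[successor] += 1

    layer = {node: 0 for node in nodes}
    queue = deque(sorted(node for node in nodes if indegree[node] == 0))
    processed = 0
    while queue:
        node = queue.popleft()
        processed += 1
        for successor in graph.get(node, ()):
            layer[successor] = max(layer[successor], layer[node] + 1)
            indegree[successor] -= 1
            if indegree[successor] == 0:
                queue.append(successor)

    if processed != len(nodes):
        # Loops leave some nodes unprocessed; layering needs an acyclic graph.
        raise ValueError("graph contains a cycle")
    return layer


# A small signalling cascade with two inputs (illustrative names).
cascade = {
    "Receptor": ["Kinase1"],
    "Adaptor": ["Kinase1"],
    "Kinase1": ["Kinase2"],
    "Kinase2": ["TF"],
}
layers = assign_layers(cascade)
```

The sketch rejects graphs with loops outright, which mirrors the remark above that loop-rich networks are better served by a force-directed layout.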

5.2 Managing Models and SBML Support

Building models of biological processes is an intensive and tedious exercise. As a result, re-usability and ease of exchange of models are assets for any software tool aimed at studying biological networks and processes.

BiologicalNetworks addresses this requirement by complying with the SBML standard (www.sbml.org). SBML is an XML-based modelling language for describing biochemical network models. It is an ongoing international collaborative effort and is fast becoming a standard for biochemical model specification and exchange. BiologicalNetworks supports the current release versions of SBML (Level 1 version 1, Level 2 version 1, Level 2 version 2).
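
Because SBML is plain XML, the content of a model file can be inspected with any XML parser. The following Python sketch lists the species in a small SBML Level 2 document using only the standard library. It is an illustration and not the importer used by BiologicalNetworks. The sample model is made up, the function names are hypothetical, and the sketch ignores Level 1 files, which name their elements and attributes differently.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace from a tag such as '{namespace}species'."""
    return tag.rsplit("}", 1)[-1]


def list_species(sbml_text):
    """Return id, compartment, and initial concentration of each species element."""
    root = ET.fromstring(sbml_text)
    species = []
    for element in root.iter():
        if local_name(element.tag) == "species":
            species.append(
                {
                    "id": element.get("id"),
                    "compartment": element.get("compartment"),
                    # Attribute values are returned as strings, or None if absent.
                    "initial_concentration": element.get("initialConcentration"),
                }
            )
    return species


# A made-up Level 2 model with one compartment and two species.
SAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol" size="1e-14"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cytosol" initialConcentration="1"/>
      <species id="B" compartment="cytosol" initialConcentration="0"/>
    </listOfSpecies>
  </model>
</sbml>
"""
species = list_species(SAMPLE_SBML)
```

Matching on the local tag name rather than a fixed namespace lets the same sketch read Level 2 version 1 and version 2 files, which declare different namespaces.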

The following sections describe how to save, import and export models in BiologicalNetworks.

5.2.1 Saving Models

Models constructed in BiologicalNetworks can be saved by using the save option in the menu or through the save icon in the toolbar. The models are saved in the proprietary .bnm format, which is essentially a binary format of the model Java class.
The .bnm file captures all the network information entered through species editor, reaction editor, simulation setup window and the layout of the network on the model workspace.

5.2.2 Importing Models

BiologicalNetworks can import models created in the .bnm (BiologicalNetworks) format or a model specified in the SBML format.

BiologicalNetworks models can be imported by choosing the open option from the File menu, or by clicking the open icon in the toolbar, and then browsing for the .bnm file. The model is created along with its network layout.

SBML models can be imported by using the Import SBML option from the File menu and then browsing for the SBML file. SBML does not specify the layout of the network on the model workspace, so BiologicalNetworks uses a layout algorithm to create the network layout automatically from the SBML information.

5.2.3 Exporting Models

BiologicalNetworks models can be exported to an SBML file. Use the Export SBML option from the File menu, then specify the file name and the version of SBML to which the model should be exported. However, certain information, for example the simulation setup and the layout information, is not saved in SBML files.
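
Because SBML is plain XML, an exported model can be inspected outside BiologicalNetworks with any XML parser. The following is a minimal sketch, not part of BiologicalNetworks: the SBML-like fragment, the model name and the species and reaction identifiers are invented for illustration, and the parser ignores the XML namespace so that it does not depend on a particular SBML level or version.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified SBML-like fragment (hypothetical model; a real
# export would contain more elements, such as compartments and kinetic laws).
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2/version2" level="2" version="2">
  <model id="toy_model">
    <listOfSpecies>
      <species id="A" compartment="cell"/>
      <species id="B" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def summarize_sbml(text):
    """Return (species ids, {reaction id: (reactant ids, product ids)})."""
    root = ET.fromstring(text)
    species, reactions = [], {}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "species":
            species.append(elem.get("id"))
        elif name == "reaction":
            reactants, products = [], []
            for child in elem:
                refs = [ref.get("species") for ref in child
                        if local_name(ref.tag) == "speciesReference"]
                if local_name(child.tag) == "listOfReactants":
                    reactants = refs
                elif local_name(child.tag) == "listOfProducts":
                    products = refs
            reactions[elem.get("id")] = (reactants, products)
    return species, reactions


species_ids, reaction_map = summarize_sbml(SBML_TEXT)
print(species_ids, reaction_map)
```

Note that nothing in such a file describes node positions, which is why the layout has to be regenerated on import.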

5.3 Network Statistical Analysis

BiologicalNetworks provides various tools to extract topological information from the model. These tools can be accessed under the Analysis>Network menu.

  1. Network statistics: Shows network statistics.

    Fig.5.3.1 Network statistics information.

  2. Find cycles: Finds independent cycles in the network. Cycles in the network are useful for identifying feedback loops. Tarjan's algorithm is used to find the cycles, and the computation takes linear time.

  3. Network Blast: Finds conserved pathways among networks.

    Fig.5.3.2 Pathway BLAST Panel.
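
The cycle search can be illustrated with Tarjan's strongly connected components algorithm, which runs in time linear in the number of nodes and edges. The sketch below is an assumption about how such a search might look, not the BiologicalNetworks implementation: it reports every group of nodes that contains at least one directed cycle (a candidate feedback loop), and the toy network is invented.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of successor nodes."""
    index_of, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index_of:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index_of[succ])
        # node is the root of a component: pop the whole component off the stack
        if lowlink[node] == index_of[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index_of:
            visit(node)
    return components


def feedback_components(graph):
    """Components with at least one directed cycle (size > 1 or a self-loop)."""
    return [sorted(c) for c in strongly_connected_components(graph)
            if len(c) > 1 or c[0] in graph.get(c[0], ())]


# Hypothetical network: A -> B -> C -> A forms a loop; D is only downstream of C.
toy_network = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
print(feedback_components(toy_network))
```

The recursive form is kept for readability; very deep networks would need an iterative variant to stay within Python's recursion limit.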



6. Data Management

The software database provides storage and organization functions for proteins, small molecules, cell processes, events of regulation, chemical reactions, and other objects used in studies of molecular networks and pathway analysis.

6.1 Pathway Representation

The pathway concept is central to the design of the system. The program stores each pathway as a diagram with groups, annotations, and user settings.

The pathway stores not only the list of all proteins, small molecules, and cellular processes, but also the links and relationships between them. Graphically, each pathway is shown in a separate window and serves to organize the data and save the results of searches in the underlying database. Pathways can be saved to the database, and the list of existing pathways is displayed in the Folder Pane.

Pathways can be exported as separate XML files, exchanged between researchers for collaboration purposes and reimported back into the database.

To create a new pathway:

  • Choose File>New>Project from the Main Menu;
  • A pathway named New Pathway appears in the Pathways folder;
  • Rename the new pathway by typing the new name in the folder box.


To remove a pathway from the Workspace:

  • Select the pathway of interest;
  • Choose Delete from the context menu;
  • Press Yes.

6.2 Workspace

The Workspace is a collection of the pathways, as well as custom annotations and other private data. Custom annotations may include folders and text fields functionally classifying a gene (for example, as a receptor) or linking it to a specific disease. Users may add both new nodes and links, which will be saved to the Workspace.

The Workspace is implemented in the form of a local file. The default Workspace is PathSys's Yeast database; it is preloaded with the list of proteins, complexes and cell processes. The Integrated Database workspace also contains approximately 100,000 functional links extracted from PubMed and full text articles.

6.3 Database Search

This section introduces the database search procedures. You can perform a database search to locate any type of object stored in the database. In general, the software supports two types of searches:

  • Context search
  • Search by attributes


6.3.1 Context Search

The context index retrieves the sets of objects whose attributes contain the specified query phrase (string), either partially or completely. Text fields are broken into words in the conventional manner. The quick context search is optimized for speed of retrieval.
Fig.6.3.1 Search database by string values.
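
The idea behind a context index can be sketched as a word-level inverted index. This is only an illustration under stated assumptions: the object records are invented, the query is a single word rather than a phrase, and "partial" is taken to mean that the query is a substring of an indexed word.

```python
import re
from collections import defaultdict


def build_context_index(objects):
    """Map each lower-cased word to the ids of objects whose attributes contain it."""
    index = defaultdict(set)
    for obj_id, attributes in objects.items():
        for value in attributes.values():
            for word in re.findall(r"\w+", value.lower()):
                index[word].add(obj_id)
    return index


def context_search(index, query, partial=True):
    """Return ids of objects with a word equal to the query (or containing it)."""
    query = query.lower()
    hits = set()
    for word, ids in index.items():
        if word == query or (partial and query in word):
            hits |= ids
    return sorted(hits)


# Invented records standing in for database objects.
toy_objects = {
    "P1": {"name": "Hexokinase", "description": "glucose phosphorylation"},
    "P2": {"name": "Glucokinase", "description": "liver hexose kinase"},
    "P3": {"name": "Actin", "description": "cytoskeleton protein"},
}
toy_index = build_context_index(toy_objects)
print(context_search(toy_index, "kinase"))                 # partial match
print(context_search(toy_index, "kinase", partial=False))  # complete match only
```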


The Context Search icon is located at the top of the Project Pane. See Figure 6.3.1.
To run a context search:

  • Click the Context Search icon;
  • The Find dialog box appears;
  • Choose a type of node for selection; it may be CellObject, CellProcess, Protein and others;
  • Enter the text and click OK;
  • The search result appears in the Search Result List Pane.

Fig.6.3.2 Search database by string values.


The search results are displayed in a separate Search Result List Pane and can be added to a pathway (or a group) by copy and paste or by drag and drop.

6.3.2 Search by Attributes

The search by attributes allows you to search database objects using many types of data as search conditions. These include, for example, node type, effect (positive, negative, unknown), mechanism (transcription, phosphorylation), tissue type, and the text of description or user-defined attributes.

Fig.6.3.3 Search database by property values.


The Property Search icon is located at the top of the Project Pane. See Figure 6.3.3.
To run a property search:

  • Click the Property Search icon;
  • The Find dialog box appears;
  • Choose the interface you want to use for the search: the Table Query Interface or the Advanced Tree Interface;
  • Choose where you want to search: in the entire Workspace or in the active pathway only;
  • Choose a type of node for selection; it may be CellObject, CellProcess, Protein and others;
  • Select an attribute type in the Attribute field or in the Attribute Tree;
  • Select an operation type in the Operation field; it may be equals, not equal, includes, or starts with;
  • Specify a value in the Value field. Some attributes take dictionary values; for such attributes the software displays the list of dictionary terms. For all other attributes, you must type in the value. For example, if you choose CellObject in the Attribute field, the Value field shows the allowed names of cell objects, such as chloroplast, liposome, DNA, and others;
  • Select how to combine the search parameters in the Logic field;
  • The query creation is complete;
  • To add the query to the Query list, press the Add button. On the Tree interface, conditions are added to your query list automatically;
  • Click OK to run the query.

Fig.6.3.4 Search database by property values through Table Query Interface.


Fig.6.3.5 Search by Attribute and Node types through Advanced Tree Interface.


The search results are displayed in a separate Search Result List Pane and can be added to a pathway or a group by copy and paste or by drag and drop.
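
The structure of an attribute query (node type, attribute, operation, value, logic) can be sketched as follows. This is a hypothetical illustration rather than the BiologicalNetworks query engine: the records are invented, and the Logic field is assumed to offer AND and OR.

```python
# The four operations named in the Operation field.
OPERATIONS = {
    "equals": lambda actual, expected: actual == expected,
    "not equal": lambda actual, expected: actual != expected,
    "includes": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
}


def matches(obj, conditions, logic="AND"):
    """Evaluate (attribute, operation, value) conditions against one object."""
    results = [OPERATIONS[op](str(obj.get(attr, "")), value)
               for attr, op, value in conditions]
    return all(results) if logic == "AND" else any(results)


def attribute_search(objects, node_type, conditions, logic="AND"):
    """Return the names of objects of the given node type that satisfy the query."""
    return [obj["name"] for obj in objects
            if obj.get("type") == node_type and matches(obj, conditions, logic)]


# Invented records; attribute names follow the examples given in the text.
toy_records = [
    {"name": "geneA", "type": "Protein", "effect": "positive", "mechanism": "transcription"},
    {"name": "geneB", "type": "Protein", "effect": "positive", "mechanism": "phosphorylation"},
    {"name": "geneC", "type": "Protein", "effect": "negative", "mechanism": "unknown"},
    {"name": "processX", "type": "CellProcess", "effect": "unknown", "mechanism": "unknown"},
]
toy_query = [("effect", "equals", "positive"), ("mechanism", "starts with", "phospho")]
print(attribute_search(toy_records, "Protein", toy_query, logic="AND"))
print(attribute_search(toy_records, "Protein", toy_query, logic="OR"))
```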

6.3.3 Quick Search

You can perform a quick search for any object type stored in the Workspace. In the Quick Search window, enter the search text and press the Start Quick Search button to start the Quick Search algorithm. All objects found are placed in a separate list in the Search Result List Pane. You can save the search results as an individual group in the Groups folder, or as a search in the Searches folder (the default).


The Quick Search box is located at the top of the Project Pane. See Figure 6.3.6.

Fig. 6.3.6 Quick search field.

Type a gene name or keywords to search for all the information about it that is available in the database.

Your search results will be listed in the Search Result List Pane. See Figure 6.3.7.

Fig.6.3.7 Search Result List Pane: quick search results.



6.3.4 Querying database to explore gene relationships

You can drag and drop rows from the Search Result List Pane onto your active pathway, or build a pathway from a selected group of bioentities by right-clicking the group and choosing Build Pathway. See Figure 6.3.8.
Fig.6.3.8 Search Result List Pane context menu.

Choosing Build Pathway opens the PathwayWizard Panel. See Figure 6.3.9.

Check the starting entities for your pathway and choose an algorithm for pathway building. See Figure 6.3.10.

Fig.6.3.9 Build Pathway Wizard Window.

Pathway Wizard

PathwayWizard allows you to create pathways using molecular interaction data from the database. It starts by suggesting several ways to determine starting entities, and then opens the Build Pathway dialog box. See Figure 6.3.10.

Fig.6.3.10 Build Pathway Wizard Window.

Check the Filter option and set up the filter, if needed (3). See Figure 6.3.11.
Refer to the User Manual to learn more about the BuildPathway Wizard.

Fig.6.3.11 Build Pathway Filters Wizard.

Use the advanced attribute search to specify any logical combination of properties of the bioentities and bioprocesses being searched. See Figure 6.3.12.

Fig.6.3.12 Search by Attribute and Node type window.

Press the Start button to start data mining, and then click OK when done (4). The new pathway diagram appears in the Pathway Pane.

7. Working with Microarray Data

7.1 Loading Microarray Data

Before using the Microarray data environment of BiologicalNetworks, make sure you have microarray files to open.

Example data files can be downloaded here: http://brak.sdsc.edu/pub/BiologicalNetworks/MicroarrayData.zip

Expression data are imported from the Microarray submenu of the BiologicalNetworks main window. See Figure 7.1.1.

Fig.7.1.1 Load Microarray Data.

Choosing one of the available file types opens the corresponding Import Expression Data Wizard. See Figure 7.1.2.


The Import Expression Wizards allow you to import expression data into BiologicalNetworks.
The data files can be in TXT or MS Excel format (for Stanford tab delimited and Affymetrix data), .mev and .ann (for TIGR expression data), or .gpr (for GenePix expression data). To import your data, call the Microarray>Load Microarray Data menu and choose the format. Then specify the location of the source data file and follow the steps provided by the Wizard. The imported data will be opened in the Import Expression Wizard.



Importing Stanford (tab delimited) Data

To import data in Stanford (tab delimited) TXT or MS Excel formats:

  • Call Microarray>Load Microarray Data and choose the Stanford Format option.
  • The Import Expression Wizard appears.
  • In the Expression Wizard window, specify the content of the first line of the data file and the columns of the data file that contain the Gene IDs by clicking the upper leftmost expression value.
  • Press "Load" to load the data.

Importing Affymetrix Microarray Data

To import microarray data in Affymetrix format:

  • Call Microarray>Load Microarray Data and choose the Affymetrix Format option.
  • The Import Expression Wizard appears.
  • In the Choose Affymetrix Expression Files dialog box, select the file and press OK.

Importing Microarray Data in GPR Format

To import microarray data in GPR format:

  • Call Microarray>Load Microarray Data and choose the Genepix Format (GPR) option.
  • The Import Expression Wizard appears.
  • Follow the steps provided by the wizard.

Fig.7.1.2 Load Stanford (tab delimited) data type file Wizard.



    7.2 Color schemes and Visual styles settings

    When an expression experiment is opened as a heat map, a colored box represents the expression level of each gene (protein). There are two default color schemes in the Expression Experiment Viewer that correspond to the data formats supported by the software (Signal and Ratio). While importing expression data, you should choose the color scheme in the Apply Custom Color Scheme submenu of the Microarray menu. Later, in the opened experiment, you can change the color scheme in the Expression Viewer Toolbar.

    Ratio data: the color intensity is proportional to the log ratio of the current sample to the base sample and is represented as a double gradient color map. Negative values are acceptable in this format. On the heat map, green represents negative log ratios and red represents positive log ratios. Greens of increasing intensity correspond to increasingly negative log ratios, and reds of increasing intensity correspond to increasingly positive log ratios. In the Color Settings dialog window, you can set up the color range for the min and max values of ratio, the cut off values, and the color range for missing data.

    Signal data: the color intensity is proportional to the signal value and is represented as a single gradient color map. There are no negative values in this data representation. By default, the software uses green for low expression values, red for high expression values, and yellow for missing values.
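
The two color schemes can be sketched as simple mappings from a value to an RGB triple. This is an illustrative approximation only: the saturation limit `max_abs`, the gray used for missing ratio values, and the linear green-to-red interpolation for signal data are assumptions, not the actual BiologicalNetworks settings.

```python
YELLOW = (255, 255, 0)   # missing signal values (default stated in the text)
GRAY = (128, 128, 128)   # missing ratio values (assumed; configurable in the software)


def ratio_color(log_ratio, max_abs=2.0):
    """Double gradient: green for negative log ratios, red for positive ones."""
    if log_ratio is None:
        return GRAY
    # Clamp to the cut off so that extreme values saturate instead of overflowing.
    level = min(abs(log_ratio), max_abs) / max_abs
    intensity = round(255 * level)
    return (intensity, 0, 0) if log_ratio > 0 else (0, intensity, 0)


def signal_color(value, low, high):
    """Single gradient: green at the low end of the range, red at the high end."""
    if value is None:
        return YELLOW
    t = (min(max(value, low), high) - low) / (high - low)
    return (round(255 * t), round(255 * (1 - t)), 0)


print(ratio_color(-2.0), ratio_color(0.0), ratio_color(5.0))
print(signal_color(0, 0, 100), signal_color(100, 0, 100), signal_color(None, 0, 100))
```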


    Visual Styles for Gene Expression

    Visual styles are used to make your gene expression map more intuitive and clear.

    -First, you can change the scale of the expression map by zooming in and out and by setting the element size.
    -Second, you can use the Brightness option to adjust the color intensity of the heat map for better viewing.
    -You can also change the color range for a particular gene expression map. Call the Microarray>Color Settings dialog or use the Microarray toolbar icon to adjust the Expression Viewer options. In this dialog box, you can also enter the cut off values and set up the color range for the min and max values of ratio and the color range for missing data.
    -In the heat map, a colored box represents the level of expression for each gene. The software supports two default color schemes (Ratio and Signal) for expression data. To change the color scheme for an opened experiment, open the Microarray>Apply Custom Color Schema dialog box (or use the Microarray toolbar), and select the radio button corresponding to the color scheme you want.


    7.3 Expression Experiment Viewer

    Loaded microarray data appear in the Expression Experiment Viewer Panel at the bottom of the BiologicalNetworks Main window.

    The Expression Experiment Viewer is designed to display a graphical representation of gene expression and proteomics experiment data, usually generated by microarray experiments. It provides the algorithms and workspace for examining the data from expression or proteomics experiments and for superimposing these data onto open pathways and gene regulatory networks.

    See Figure 7.3.1.

    Fig.7.3.1 Microarray data file loaded.


    The functions available from the Microarray submenu and the Microarray Experiment Manager Menu bar allow the user to:

    - Create new pathways as well as new groups from an expression experiment.

    - Select a number of genes and create a group or a pathway from them.

    - Display expression data on an existing pathway diagram, using different shades of green/red depending on the fold change of expression.

    Numerous clustering, filtering, normalization, and search methods are available in BiologicalNetworks.

    7.4 Expression viewer toolbar

    The Expression Viewer toolbar provides a wide range of functions:

  • Import expression experiment: opens a gene expression file. The file must have a TXT, XLS, .mev, .ann, or .gpr extension.
  • Save expression experiment.
  • Zoom in, Zoom out and Choose element size in the microarray matrix view.
  • Color entities by expression: colors the entities of the pathway of interest by their expression values. Refer to Section 8.
  • Extract pathways, predict network from expression: runs a correlation algorithm (for example, Pearson) to build a network from your raw data. Refer to Section 8.
  • Build pathways from selection: creates a pathway from the selected genes. Refer to Section 8.
  • Filter and sort expression data.
  • Search over expression data: searches the data for genes or samples that match a search term under the given search criteria.
  • Create group from selection: creates a group from the selected genes. Refer to Section 8.
  • Find pathways: applies to the opened Expression Experiment and returns a list of pathways that share at least one protein with the selection. Refer to Section 8.
  • Find groups: works like Find pathways, but returns the list of groups that share at least one protein with the selection. Refer to Section 8.
  • Group selected genes together: lets you select genes of interest from the expression map or table and put them together. Refer to Section 8.
  • Visual styles and color settings for the gene expression map.

    Fig.7.4.1 Expression viewer toolbar.

    7.5 Filtering, Normalization and Data Transformation

    Adjust Data

    Different types of adjustments may be applied on top of one another in any sequence, and the same type of adjustment may be applied repeatedly to the matrix. Adjustments may not necessarily affect the main display or the values displayed when elements are clicked on the matrix displays, but will influence the calculation of the expression matrix, the foundation of all analyses. Adjustments will also be reflected when the entire matrix or individual clusters are saved as text files, although the original data files are not overwritten. Furthermore, with the exception of three options: “Set Lower Cutoffs”, “Set Percentage Cutoffs” and “Adjust Intensities of Zero”, all the changes made to an expression matrix are irreversible for the current session.

    Because of the above features, a good way to use these options might be to apply any required adjustments to the data set, save the entire adjusted matrix as a tab delimited formatted text file (using the “Save Microarray Matrix” option under the “Microarray” Menu), and then load this new file in a new session, during which no further data adjustments will be made. This will ensure consistency throughout the session.

    Adjustment options are described below:

    Data Transformations

  • Log2 Transformation: takes the log2 transform of every element in the matrix. Note that this adjustment should not usually be necessary. The program will automatically compute the log2 ratio of the two intensities and use them in the expression matrix. TDMS files also often contain pre-calculated log2 ratios.

  • Log10 to Log2: assumes that the current data are log10 transformed and converts them to log base 2; that is, it assumes that the input data are in the form log10(x), and it outputs log2(x).
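
Both transformations are one-line formulas; the base conversion relies on the identity log2(x) = log10(x) / log10(2). The sketch below shows them on a small matrix. Treating missing or non-positive entries as `None` is an assumption made here for illustration, not documented behavior of the software.

```python
import math


def log2_transform(matrix):
    """Log2 of every element; missing or non-positive entries become None."""
    return [[math.log2(v) if v is not None and v > 0 else None for v in row]
            for row in matrix]


def log10_to_log2(matrix):
    """Convert log10(x) values to log2(x) via log2(x) = log10(x) / log10(2)."""
    factor = 1.0 / math.log10(2.0)
    return [[v * factor if v is not None else None for v in row]
            for row in matrix]


raw = [[1.0, 8.0], [0.5, None]]
print(log2_transform(raw))
print(log10_to_log2([[math.log10(8.0)]]))
```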

    Data Filters (Data Quality and Variance Based Filters)

  • Lower Cutoffs: Select Use Lower Cutoffs to exclude from analysis any genes for which the expression values (in either the corresponding Cy3 or Cy5 columns) are lower than specified values. Select Set Lower Cutoffs to set these values. To enable this option, check the “Use Lower Cutoffs” checkbox just below the “Set Lower Cutoffs” menu option; uncheck it to disable the option. All subsequent analyses will include only those genes for which all Cy3 and Cy5 values are above the specified thresholds. This option is disabled by default.

  • Percentage Cutoffs: Select Use Percentage Cutoffs to ignore the genes for which there are not enough valid (non-zero) expression values across all samples. This will not delete any data, but will only exclude the genes from analysis. This option is sometimes useful for speeding up module calculations, since many zeros will often slow them down.
    To determine which genes will be excluded, select Set Percentage Cutoffs and enter a percentage value. To enable this option, check the “Use Percentage Cutoffs” checkbox just below the “Set Percentage Cutoffs” menu option; uncheck it to disable the option. Genes with less than the specified percentage of non-zero values will be ignored. A value of 0.0% indicates that all genes will be used in the analysis. To require that every one of a gene’s expression values be valid for the gene to be included, set the value to 100. This option is disabled by default.

  • Variance Filter: The variance filter removes genes with low variation of expression over the loaded samples. It is used mainly to remove ‘flat genes’ that do not vary much in expression over the conditions of the experiment. The variance filter has three possible criteria for specifying which genes to keep. The Enable Variance Filter check box turns the filter on and off. Be sure to check the History Node log to see the number of genes retained after using the filter. Note that the variance filter is applied after other filters, such as the Percentage Cutoff filter. This convention ensures that the genes that are checked for variance also contain some minimum level of ‘good’ (not missing) data.

    The Percentage of Highest SD Genes option ranks the genes by standard deviation and keeps the specified percentage of this ranked list. For example, if there are 1000 genes and the percentage is set to 20%, the result is a final list of the 200 most variable genes.

    The Number of Desired High SD Genes option also ranks the genes by SD and then selects the specified number of genes from this SD-ordered list, starting with the highest SD. The SD Cutoff Value option uses an actual SD value, so that all genes having an SD greater than this value are selected.
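
The percentage cutoff and the Percentage of Highest SD Genes criterion can be sketched as below, applied in the order described above (percentage cutoff first, variance filter second). This is a simplified illustration, not the software's code: the toy matrix is invented, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import pstdev  # population SD (assumed; the estimator is not specified)


def percentage_cutoff(matrix, min_percent):
    """Keep genes whose share of valid (non-zero) values is at least min_percent."""
    kept = {}
    for gene, values in matrix.items():
        valid = sum(1 for v in values if v != 0)
        if 100.0 * valid / len(values) >= min_percent:
            kept[gene] = values
    return kept


def top_sd_percentage(matrix, percent):
    """Rank genes by standard deviation and keep the top `percent` of the list."""
    ranked = sorted(matrix, key=lambda gene: pstdev(matrix[gene]), reverse=True)
    n_keep = int(len(ranked) * percent / 100.0)
    return ranked[:n_keep]


# Invented expression matrix: gene -> values across four samples.
toy_matrix = {
    "g1": [1, 5, 9, 13],   # strongly varying
    "g2": [4, 4, 4, 4],    # flat
    "g3": [0, 0, 0, 7],    # only 25% valid values
    "g4": [2, 3, 2, 3],    # weakly varying
    "g5": [0, 6, 0, 8],    # 50% valid values
}
after_cutoff = percentage_cutoff(toy_matrix, 50)
print(sorted(after_cutoff))
print(top_sd_percentage(after_cutoff, 50))
```

With 1000 genes and a percentage of 20, `top_sd_percentage` returns 200 genes, matching the example in the text.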

    7.6 Sorting and searching over expression data

    The Sort feature permits the user to sort the data:

  • By expression value
  • By chromosomal order
  • By Gene ID



    Search Utility

    -The Search feature permits the user to search the data for genes or samples that match a search term under the given search criteria.
    -The Search initialization dialog offers the option of finding genes or samples. The search criteria include a search term, a selection to make the search case sensitive, and a selection that either requires the search term to be an exact match or lets it match a contiguous portion of a larger annotation term.
    -Search results are returned in a new window. The upper section is a table of the genes or samples that match the search criteria, and the lower section provides shortcut links to cluster viewers that contain the identified samples or genes.
    -Navigation shortcuts provide a means to open cluster viewers that contain the elements found in the search.
    -Elements in the table can be deselected using the checkboxes. Clicking on the Update Shortcuts button will produce a new search result window with just the previously selected entries and the associated viewer shortcuts. This allows one to prune unwanted elements out of the search result.
    -The Store Cluster button stores the selected items as a cluster and assigns a user-selected color.

    7.7 Clustering of Experimental Data

    Each of the clustering algorithms available in BiologicalNetworks can be launched from the Microarray> Cluster Analysis> Clustering Algorithms menu of the Main Window. All clustering algorithms can be used to cluster genes or samples. Clustering analysis results appear in the Analysis subtree of the Project Properties navigation tree. The tabs within this subtree contain the results of the method's calculations. Each algorithm run presents a dialog or form for entering the parameters specific to that algorithm.

    A full description of the algorithms is given in Section 9.

    7.8 Clustering analysis viewers

    Viewers are the graphical displays used to present the results of the microarray analysis. The viewers will appear as a subtree under the method’s Analysis Tree within the Project Properties navigation tree.


    Fig.7.8.1 Clustering analysis expression viewer.


    Expression Images

    This viewer is used in the main window of the Expression Viewer as well as in the clustering analysis viewers. Every colored rectangle represents the expression of one gene in one experiment. Each column represents all the genes from a single experiment, and each row represents the expression of a gene across all experiments. The default color scheme used to represent expression level is red/green (red for overexpression, green for underexpression) and can be adjusted in the Color Scheme dialog in the Microarray> Apply Custom Color Schema and/or Color Settings menu. See Section 7.2.

    Double clicking on any of the rectangles in this view will open a window containing a graph of this gene’s expression level across all samples.


    Expression Graphs

    The Expression Graphs Viewer displays graphs of the expression levels of each gene across the experimental conditions.
    The mean expression levels of genes in the cluster are shown as a centroid graph overlaid on top of the individual expression graphs.


    Fig.7.8.2 Clustering analysis expression graphs.

    Gene cluster Table Views

    The Table View of clustering results shows annotations for the genes in the cluster.
    -You can drag columns horizontally across the table to change their relative ordering.
    -You can sort the rows in ascending or descending order of the entries in a column by clicking successively on the header of that column.
    -You can sort the “Stored Color” column, bringing together elements that have been stored with the same cluster color.
    -You can restore the original order of elements by CTRL-clicking on any column header.

    Right-clicking on the table view opens a Context Menu. The options available from the Context Menu are:

  • Store a subset of rows in the table as a cluster in the Groups/Clusters manager.
  • Store the entire table as a cluster in the Groups/Clusters manager.
  • Search over the table.
  • Save the currently viewed cluster to a file.
  • Delete all rows in the table or a subset of them.
  • Delete a cluster stored from this viewer.

    Fig.7.8.3 Clustering analysis table viewer.



    7.9 GeneOntology terms overrepresentation analysis

    BiologicalNetworks provides an implementation of the GeneOntology Fisher's overrepresentation test, a method that gives the researcher an initial biological interpretation of gene clusters based on the indices provided in the input data set and on information linking those indices to biological “themes”. These themes are generally GO terms, KEGG pathways, or any other descriptive term related to biological role or biochemical pathway information. The result of the analysis is a group of biological themes that are represented in the cluster. A statistic reports the probability that the prevalence of a particular theme within the cluster is due to chance alone, given the prevalence of that theme in the population of genes under study (all “genes” loaded into BiologicalNetworks).


    Fisher Exact Probability

    The Fisher's Exact Probability reports the probability that a biological theme is over-represented in the cluster of interest relative to the representation of that theme in the total gene population.
    For example, suppose that one has a gene list of 50 genes from a population of 10,000 genes. Now suppose that 10 of the 50 genes were related to pathway "A" but only 13 genes in the total population were associated with pathway "A". This scenario would yield a low probability that the observed number of hits (occurrences of pathway "A") within the small sample could be due to chance alone. This statistic is based on the hypergeometric distribution and has benefits over chi-square in that it is appropriate for finite populations.
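
The worked example above can be reproduced with the hypergeometric distribution. The sketch below computes the one-sided over-representation probability, i.e. the chance of seeing at least the observed number of hits in the gene list; it illustrates the statistic only and is not taken from the BiologicalNetworks source. The function and argument names are invented.

```python
from math import comb


def enrichment_p_value(population, theme_in_population, cluster, theme_in_cluster):
    """P(X >= theme_in_cluster) when `cluster` genes are drawn without
    replacement from `population` genes, `theme_in_population` of which
    carry the theme (hypergeometric upper tail)."""
    max_hits = min(cluster, theme_in_population)
    numerator = sum(
        comb(theme_in_population, k)
        * comb(population - theme_in_population, cluster - k)
        for k in range(theme_in_cluster, max_hits + 1)
    )
    # Exact integer arithmetic up to the final division avoids underflow.
    return numerator / comb(population, cluster)


# The example from the text: 10,000 genes in total, 13 linked to pathway "A",
# a list of 50 genes of which 10 are linked to pathway "A".
p_value = enrichment_p_value(10_000, 13, 50, 10)
print(p_value)
```

The probability is extremely small, in line with the statement that ten hits are very unlikely to arise by chance when only 13 of 10,000 genes carry the theme.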

    Fig.7.9.1 Run GeneOntology annotation analysis.


    Annotation parameters Panel

    Population and Cluster Selection Option
    This option specifies a gene population or a gene cluster list. The default selection is to use a population file which is simply all of the genes loaded into BiologicalNetworks.
    The Annotation parameters Panel also displays the gene clusters currently stored in the BiologicalNetworks cluster repository. If no clusters have been saved, a blank browser page is displayed on this panel and the Cluster Analysis mode option is disabled. Selecting a row (or a group of rows, using the 'Alt' button) in the cluster table displays the cluster in the expression graph area of the browser. Cluster analysis will be executed on the selected clusters.


    Annotation key

    This area contains a drop down list of the available annotation types that can be used to identify genes. Generally it is best to use an index or accession that uniquely identifies the spotted material.

    Annotation Conversion File

    This optional file provides the mapping from your annotation key (above) to the index used to map to biological themes (GO terms, KEGG pathways, etc.). If your annotation key type is the one used in the linking file (below), this conversion (mapping) is not needed. These files, if needed, are typically stored in the Convert directory.


    Gene Annotation / Gene Ontology Linking Files

    This section allows one to specify one or more annotation files. These files contain gene indices paired with biological themes such as GO terms. These files typically reside in the Class directory.

    Fig.7.9.2 Annotation analysis panel.


    Results of GeneOntology Analysis

    The primary result is reported in a table in which entries are ordered based on the reported statistic. The table can be sorted on any column. A right click in the table will launch a menu allowing several operations:

    Store Selection as Cluster: Stores the genes associated with a biological theme as a cluster that will be stored in the cluster manager.

    Open Viewer: Opens one of three possible viewers containing the genes within the biological theme. These viewers are also accessible from a node in the result tree which follows the table node in result navigation tree.

    Save Table: Stores the result table to a tab delimited file.



    Fig.7.9.3 Annotation analysis results.

    8. Explore gene relationships with expression data

    The theory that the expression of interacting entities is correlated, for evolutionary or physical reasons, makes it possible to predict interaction networks from expression values. This is a good starting point, especially if you have no initial hypothesis about your gene expression data.

    8.1 Color network by expression values.

    To overlay Expression Experiment results onto an existing pathway diagram:

  • Open an expression experiment;
  • Open a pathway of interest;
  • Press the Coloring by Expression Values Toolbar button to color the active pathway by expression values;
  • From the drop down menu, choose the sample time point you would like to visualize on the active Pathway Pane.

    Fig.8.1 Color network by expression values.



    8.2 Extract pathways from expression data.

    Correlation algorithms group genes according to similarities in patterns of expression variation over all the samples. A correlation network is a group of genes whose expression profiles are highly predictive of one another. Each pair of genes related by a correlation coefficient larger than a minimum threshold and smaller than a maximum threshold (assigned in the initialization dialog box) is connected by an edge. Groups of genes connected to one another are referred to as networks.

    Parameters:

    Sample Selection
    The sample selection option indicates whether to cluster genes or samples.

    Use Permutation Test
    This check box indicates that the minimum threshold R² value should be selected from a distribution of element-to-element R² values computed after permuting the expression vectors.

    Min Threshold
    This value, ranging from 0 to 1.0, is the smallest R² between two elements that permits a link between them in a subnet. It is the minimum correlation you want to include in your network.

    Max Threshold
    This value, ranging from 0 to 1.0, is the greatest R² between two elements that permits a link between them in a subnet. It is the maximum correlation you want to include in your network.

    Use Filter
    This option allows the user to filter out elements with little dynamic change thus removing flat or uninteresting elements. A measure of entropy is used to rank the elements. The percentage value entered (1 to 100) indicates what percentage of the elements to retain for the construction of the network. A value of 25 will retain the 25% of elements having the greatest entropy.

    Distance Metric: Pearson squared
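
    The Use Filter option can be illustrated with a short sketch. The manual does not specify how the entropy is computed, so the version below assumes a Shannon entropy over a histogram whose bins span the value range of the whole data set. The bin count, the function names and the example profiles are hypothetical.

```python
from math import log2


def profile_entropy(values, low, high, bins=10):
    """Shannon entropy (in bits) of a profile's values binned over [low, high]."""
    if high == low:
        return 0.0
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / (high - low) * bins), bins - 1)
        counts[index] += 1
    total = len(values)
    return -sum(c / total * log2(c / total) for c in counts if c)


def entropy_filter(profiles, percent, bins=10):
    """Keep the given percentage of elements having the greatest entropy."""
    all_values = [v for values in profiles.values() for v in values]
    low, high = min(all_values), max(all_values)
    ranked = sorted(
        profiles,
        key=lambda name: profile_entropy(profiles[name], low, high, bins),
        reverse=True,
    )
    keep = max(1, round(len(ranked) * percent / 100))
    return ranked[:keep]


# Hypothetical expression profiles: three nearly flat, one dynamic.
profiles = {
    "flat1": [5.0, 5.0, 5.0, 5.0],
    "flat2": [5.1, 5.0, 5.1, 5.0],
    "flat3": [2.0, 2.0, 2.1, 2.0],
    "dynamic": [0.0, 3.0, 7.0, 10.0],
}
print(entropy_filter(profiles, percent=25))
```

    With a value of 25, only the single most variable of the four profiles is retained. The flat profiles fall into one bin each and therefore have zero entropy.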



    Fig.8.2 Extract pathways from expression data.
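
    The Use Permutation Test option selects the minimum threshold from squared correlations computed on permuted expression vectors. The manual does not say how the value is taken from that distribution. The sketch below assumes a fixed quantile of the permuted values is used; the quantile, the number of rounds, the function names and the example profiles are all hypothetical.

```python
import random
from itertools import combinations


def pearson_squared(x, y):
    """Squared Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return (cov * cov) / (var_x * var_y)


def permutation_threshold(profiles, quantile=0.95, rounds=100, seed=0):
    """Estimate a minimum threshold from permuted expression vectors.

    Each round shuffles every profile independently, which breaks any real
    association between elements, and records all pairwise squared
    correlations. The returned value is the chosen quantile (an assumption)
    of that null distribution.
    """
    rng = random.Random(seed)
    null_values = []
    for _ in range(rounds):
        shuffled = [rng.sample(values, len(values)) for values in profiles.values()]
        null_values.extend(pearson_squared(x, y) for x, y in combinations(shuffled, 2))
    null_values.sort()
    index = min(int(quantile * len(null_values)), len(null_values) - 1)
    return null_values[index]


# Hypothetical expression profiles over five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.1, 5.9, 8.2, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(permutation_threshold(profiles))
```

    A pair of elements whose observed squared correlation exceeds this value is unlikely to be correlated by chance alone, which is the rationale for using it as the Min Threshold.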



    The algorithm calculates the correlation coefficient between genes by comparing the expression pattern of each gene to that of every other gene. The ability of each gene to predict the expression of each other gene is measured as a correlation coefficient. Genes are represented as nodes in a network and edges are drawn between them if their correlation coefficient falls between the minimum and maximum thresholds specified in the initialization dialog. The experiment subtree created in the Project Properties Panel contains information regarding the networks predicted. Under the Network tab is a graph of all of the subnets generated. A subnet is a group of genes in which each gene is connected to at least one other gene. The Correlation Subnets tab contains network diagrams for each of the individual subnets, and the Expression Images folder contains expression views for the genes in each of them.
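
    The procedure described above can be sketched in a few lines. This is an illustration of the technique, not the BiologicalNetworks implementation. It assumes the Pearson squared metric named in the parameter list and treats the thresholds as inclusive, so that a perfect correlation is kept when Max Threshold is 1.0. The function names and the example profiles are hypothetical.

```python
from itertools import combinations


def pearson_squared(x, y):
    """Squared Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return (cov * cov) / (var_x * var_y)


def correlation_network(profiles, min_r2, max_r2):
    """Return edges (gene_a, gene_b, r2) whose r2 lies between the thresholds."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r2 = pearson_squared(profiles[a], profiles[b])
        if min_r2 <= r2 <= max_r2:  # inclusive bounds are an assumption
            edges.append((a, b, r2))
    return edges


def subnets(edges):
    """Group genes into subnets, i.e. connected components of the network."""
    neighbours = {}
    for a, b, _ in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical expression profiles over four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 7.8],    # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # falls as geneA rises
    "geneD": [1.0, -1.0, -1.0, 1.0],  # unrelated to the others
}
edges = correlation_network(profiles, min_r2=0.9, max_r2=1.0)
print(subnets(edges))
```

    Because the metric is squared, geneC joins the same subnet as geneA and geneB even though it is anti-correlated with them. geneD has no edge and therefore belongs to no subnet.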

    8.3 Build Pathways for selected expression values

      To create a new pathway from an expression experiment:

    • Open the Expression Experiment and select the genes by clicking on them while holding the SHIFT key;
    • Press the Build Pathway from Selected Genes button on the Microarray Toolbar;
    • The Build Pathway dialog box appears;
    • In the Build Pathway dialog box, choose the method for creating a pathway, for example the Find All Entities Connected to Selected Entities or the Expand Pathway algorithm. See Chapter 6.3.4 to learn more about creating a pathway from a list of genes.
    • Set up the Filter options.
    • Press the Start button to start building a pathway.
    • The New Pathway appears in the Project Properties Pane and in the Graph View.

      To create a new group from an expression experiment, follow these steps:

    • Open the expression experiment and select genes;
    • Press the Create New Group Toolbar button;
    • Enter the group name in the dialog box;
    • In the dialog box, press the Create Group button, and then press Close;
    • A new group appears in the Groups/Clusters subtree of the Project Properties tree.



    Fig.8.3 Build Pathways for selected expression values.
