Complex system: A group of heterogeneous entities whose interaction leads to collective behaviours and emergent properties that the individual parts do not display.
Emergent properties: Properties that a complex system displays, but that none of its constituents have. They are usually the result of simple rules of interaction between system components.
Fallacy of division: Failing to realise that a property is emergent can lead to the fallacy of division, that is, wrongly assuming that what is true of the whole is also true of its parts. The taste of saltiness, for example, is a property of salt, but neither sodium nor chlorine is salty.
Flocking is a very common example of an emergent property. It represents the emergence of self-organisation in systems composed of many autonomous agents, such as birds or insects.
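Being an emergent property, flocking behaviour arises from the interaction of individual agents adhering to a set of simple rules. As a result, flocking can be easily simulated on a computer. In 1986, Craig Reynolds developed an artificial life program in which boids (bird-oid objects) observe three rules: separation (avoid crowding nearby flockmates), alignment (steer towards the average heading of nearby flockmates) and cohesion (move towards the average position of nearby flockmates).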
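A minimal sketch of one update step for such a flock is given below. It assumes the three classic boid rules (separation, alignment and cohesion); the function name `step`, the neighbourhood radius, the rule weights and the speed limit are illustrative assumptions, not the settings of Reynolds' original program.

```python
import math
import random

def step(boids, radius=10.0, min_dist=2.0,
         w_sep=0.05, w_ali=0.05, w_coh=0.01, max_speed=2.0):
    """Advance every boid by one time step.

    Each boid is a tuple (x, y, vx, vy). All default parameter values
    are illustrative assumptions.
    """
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        # Only flockmates closer than `radius` influence this boid
        near = [b for j, b in enumerate(boids)
                if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        ax = ay = 0.0
        if near:
            n = len(near)
            # Cohesion: steer towards the average position of the neighbours
            ax += w_coh * (sum(b[0] for b in near) / n - x)
            ay += w_coh * (sum(b[1] for b in near) / n - y)
            # Alignment: steer towards the average velocity of the neighbours
            ax += w_ali * (sum(b[2] for b in near) / n - vx)
            ay += w_ali * (sum(b[3] for b in near) / n - vy)
            # Separation: move away from neighbours that are too close
            for bx, by, _, _ in near:
                if math.hypot(bx - x, by - y) < min_dist:
                    ax += w_sep * (x - bx)
                    ay += w_sep * (y - by)
        vx, vy = vx + ax, vy + ay
        # Cap the speed so that the flock does not accelerate without bound
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append((x + vx, y + vy, vx, vy))
    return updated

# Usage: 50 boids with random positions and velocities, 100 time steps
rng = random.Random(1)
flock = [(rng.uniform(0, 20), rng.uniform(0, 20),
          rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(50)]
for _ in range(100):
    flock = step(flock)
```

No boid knows anything about the flock as a whole: each one only reacts to its nearby flockmates, and any collective motion is a consequence of these local reactions.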
Check out Conway's Game of Life for another example of complexity emerging from simple rules.
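As a hedged illustration, one generation of the Game of Life can be sketched in a few lines. The sketch uses the standard rules (a dead cell with exactly three live neighbours becomes alive, a live cell with two or three live neighbours survives); the function name `life_step` and the set-of-coordinates representation are choices made for this example.

```python
from collections import Counter

def life_step(alive):
    """Return the next generation, given a set of live (row, col) cells."""
    # Count, for every cell, how many of its eight neighbours are alive
    counts = Counter((r + dr, c + dc)
                     for r, c in alive
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    # Birth with exactly 3 live neighbours, survival with 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

# A "blinker": three cells in a row that oscillate with period 2
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(life_step(blinker)))          # [(0, 1), (1, 1), (2, 1)]
print(life_step(life_step(blinker)) == blinker)  # True
```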
To facilitate their analysis, complex systems can be represented as networks (graphs), in which system components are abstracted as nodes (vertices) and their interactions as edges (links or ties) connecting them. Measuring node and edge features allows for the topological characterisation of the network.
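The sketch below shows how a few common node and network features can be measured on a small undirected network: the degree of a node, its clustering coefficient, the shortest path length between two nodes and the connected components. The toy edge list and all function names are assumptions made for illustration.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

def shortest_path_length(adj, source, target):
    """Number of edges on a shortest path (breadth-first search), or None."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # target is not reachable from source

def connected_components(adj):
    """Return a list of node sets, one per connected component."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(component)
    return components

# Toy network: a triangle A-B-C with a tail C-D, plus a separate pair E-F
toy = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")])
print(len(toy["C"]))                                  # degree of C: 3
print(clustering(toy, "C"))                           # 1 of 3 neighbour pairs linked
print(shortest_path_length(toy, "A", "D"))            # 2
print(max(len(c) for c in connected_components(toy)))  # largest component: 4
```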
In order to figure out the origin of the topological characteristics of complex systems, we can model the formation of their network representations by imposing connectivity rules on nodes. The following simulation allows you to explore two network formation scenarios:
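The scenarios themselves are not listed in this text, so the sketch below assumes two that are commonly compared: every new node links to existing nodes chosen uniformly at random, or it links preferentially to nodes that already have many links. The function names and parameters (`grow_network`, `m`, `preferential`) are hypothetical.

```python
import random

def grow_network(n, m=2, preferential=True, seed=0):
    """Grow a network to n nodes; each new node arrives with m links.

    preferential=True : targets are chosen proportionally to their degree
    preferential=False: targets are chosen uniformly at random
    """
    rng = random.Random(seed)
    # Start from a small fully connected core of m + 1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from this list picks a node with probability proportional to its degree
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            if preferential:
                targets.add(rng.choice(stubs))
            else:
                targets.add(rng.randrange(new))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degrees(edges):
    """Return {node: degree} for an edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

# Compare the best-connected node under both connectivity rules
pref = degrees(grow_network(2000, preferential=True))
unif = degrees(grow_network(2000, preferential=False))
print("max degree, preferential attachment:", max(pref.values()))
print("max degree, uniform attachment:     ", max(unif.values()))
```

Both networks have the same number of nodes and links, so any difference in their degree distributions comes from the connectivity rule alone. Under preferential attachment, early nodes tend to accumulate links and become hubs.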
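The network representations of many complex systems, like the Internet or social networks, share three important topological properties: a scale-free degree distribution (a few highly connected hubs coexist with many nodes of low degree), the small-world property (most pairs of nodes are connected by short paths) and strong clustering (the neighbours of a node tend to be connected to each other).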
Most functions within the cell emerge thanks to a complex network of protein interactions. Failure of the control mechanisms behind these delicate relationships can lead to complex human disorders.
Being a complex network, the human protein interactome presents the scale-free, small-world and strong clustering properties. These topological features have a strong impact on the function and dynamics of the network. For example, hubs have been shown to be highly conserved proteins with essential functions; strong clustering indicates the presence of groups of proteins and complexes involved in similar processes; and the small-world property has an important effect on the way that signals (hormones, ligands, energy, infections, etc.) spread throughout the system.
Although the emergence of strong clustering in complex networks is still a very active subject of research, the most probable reason for the presence of a scale-free degree distribution in the human protein interactome is the duplication of genes and their subsequent functional divergence due to mutations.
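A duplication and divergence process of this kind can be sketched as a simple growth model. In the version below, which is only one of several possible variants, a randomly chosen protein is duplicated, the copy inherits its interactions, and each inherited interaction is then lost with probability `p_loss`. The function name, the value of `p_loss` and the decision to discard copies that lose all their interactions are assumptions of this sketch.

```python
import random

def duplication_divergence(n, p_loss=0.6, seed=0):
    """Grow an interaction network to n proteins by duplication and divergence.

    Returns an adjacency map {protein: set of interaction partners}.
    """
    if not 0 <= p_loss < 1:
        raise ValueError("p_loss must be in [0, 1)")
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed network: a single interaction
    while len(adj) < n:
        # Duplication: pick an existing protein (labels are 0 .. len(adj) - 1)
        parent = rng.randrange(len(adj))
        # Divergence: every inherited interaction is lost with probability p_loss
        kept = [nb for nb in sorted(adj[parent]) if rng.random() >= p_loss]
        if not kept:
            continue  # assumption: a copy without interactions is discarded
        new = len(adj)
        adj[new] = set(kept)
        for nb in kept:
            adj[nb].add(new)
    return adj

network = duplication_divergence(1000)
degree_list = sorted((len(partners) for partners in network.values()), reverse=True)
print("five largest degrees:", degree_list[:5])
print("median degree:", degree_list[len(degree_list) // 2])
```

Partners of a highly connected protein are more likely to be the ones duplicated, simply because there are more of them, so well-connected proteins tend to gain further interactions. Comparing the largest degrees with the median gives a quick impression of how uneven the resulting degree distribution is.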
Of the many methodologies that can measure protein-protein interactions, two are currently in wide use for large-scale mapping: the yeast two-hybrid (Y2H) system and affinity purification or immunopurification followed by some form of mass spectrometry (AP/MS).
When research groups screen for protein interactions for a certain project, they usually accompany the associated publications with a list of such interactions or they deposit them in specialised repositories where expert curators analyse the data and make it available in standardised formats.
There are also resources that integrate data from these repositories and facilitate the construction and analysis of high-quality protein networks. These databases report each interaction together with a confidence score that reflects how reliable it is. Some representative examples are STRING, GeneMANIA and HIPPIE.
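Confidence scores are typically used to keep only the most reliable interactions before building a network. The sketch below illustrates the idea on invented data: the protein names, the scores, the threshold of 0.7 and the function name `high_confidence` are all hypothetical and do not correspond to the format or recommended cut-off of any particular database.

```python
def high_confidence(interactions, threshold=0.7):
    """Keep the (protein_a, protein_b) pairs whose score reaches the threshold."""
    return [(a, b) for a, b, score in interactions if score >= threshold]

# Invented example records: (protein_a, protein_b, confidence score in [0, 1])
scored = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.40),
    ("P2", "P4", 0.72),
    ("P3", "P4", 0.10),
]

print(high_confidence(scored))        # [('P1', 'P2'), ('P2', 'P4')]
print(high_confidence(scored, 0.9))   # [('P1', 'P2')]
```

Raising the threshold trades coverage for reliability: fewer interactions remain, but those that do are better supported.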
Consider the following hypothetical scenario:
The University Medical Centre at JGU-Mainz is carrying out a study on children with Progeria, an abnormal congenital condition characterised by premature ageing. Progeria's most evident manifestations are premature greying, hair and hearing loss, cataracts, arthritis, wrinkles and loose skin, the latter resulting from muscle and skin cell senescence.
Based on different screenings involving sick and healthy children, researchers at UniMedizin have identified 44 genes associated with Progeria and have asked you to analyse them in order to better understand the molecular basis of this disease and pinpoint potential drug targets.
Armed with the Network Biology tools that you've just learnt, you are set to help them out.
1. Use HIPPIE's Network Query to visualise high-confidence interactions within the set of Progeria genes from UniMedizin. Verify that there is indeed a significant association between these genes and Progeria.
2. Identify the top-3 hubs in the Progeria subnetwork. Study their function in UniProt. Are these proteins functionally related? What could their role be in Progeria?
3. Identify the node with the highest clustering coefficient in the Progeria subnetwork. Study its function and the function of its direct partners in UniProt. Can you deduce the function of this protein complex? (hint: you don't really have to calculate clusterings here, instead look for closed-triangle motifs in the network)
4. The UniMedizin researchers tell you that they're particularly interested in the Zyxin protein (ZYX), because its function is not well-known. Can you infer it by analysing the function of its partners in the Progeria subnetwork?
5. What's the size of the largest connected component (LCC) of the Progeria subnetwork? Taking random sets of 44 proteins from the human protein interactome results in LCCs with an average size of 3. What does this difference tell you about the inter-connectedness of the Progeria subnetwork?
6. The average shortest path length of the human protein interactome is approximately 5. The average shortest path length of the Progeria subnetwork is slightly larger than 3. What can you deduce about the importance of this difference?
7. Previous research has pointed at protein LMNA as one of the most important factors behind Progeria. This protein is crucial in nuclear assembly, nuclear membrane formation and telomere dynamics. Study the function of the direct partners of LMNA in UniProt. Do these interactions support the importance of LMNA in this disease? Why?
8. The role of LMNA in telomere dynamics intrigues you. You know that at every cellular division telomeres get shorter, up to the point where they can't shrink anymore and cells enter a state called cellular senescence. You suspect that this might be connected to the molecular basis of Progeria. Are there any differences in telomere length between Progeria patients and age-matched children? (hint: do a Google search for progeria and telomere length).
9. You now have three important pieces of information for your final report for UniMedizin: the function of topologically important nodes in the Progeria subnetwork, your analysis of the inter-connectedness of the subnetwork and the involvement of telomere length in ageing. Can you put these three pieces together and speculate about the biological processes that are affected by mutations in the 44 genes identified by your colleagues at UniMedizin?
10. What kind of molecular functions could be the target of a new therapy against Progeria, based on your network analysis?
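Outside a web interface, some of the measurements in the exercises above can be scripted. The sketch below assumes that the interactome is available as an adjacency map `{protein: set of partners}` and that the gene set is a list of protein names. It ranks the hubs of the induced subnetwork, computes the size of its largest connected component (LCC) and compares that size with LCCs of random protein sets of equal size. All function names are hypothetical, and the toy network in the usage example is invented.

```python
import random

def top_hubs(adj, nodes, n=3):
    """Return the n nodes with the most partners inside the node set."""
    nodes = set(nodes)
    deg = {u: len(adj.get(u, set()) & nodes) for u in nodes}
    return sorted(deg, key=lambda u: (-deg[u], u))[:n]

def lcc_size(adj, nodes):
    """Size of the largest connected component of the induced subnetwork."""
    nodes = set(nodes)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj.get(u, ()):
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def random_lcc_sizes(adj, k, trials=1000, seed=0):
    """LCC sizes of `trials` random sets of k proteins drawn from the network."""
    rng = random.Random(seed)
    population = sorted(adj)
    return [lcc_size(adj, rng.sample(population, k)) for _ in range(trials)]

def empirical_p_value(observed, random_sizes):
    """Fraction of random sets whose LCC is at least as large as observed."""
    return sum(1 for s in random_sizes if s >= observed) / len(random_sizes)

# Invented toy interactome: a chain of 30 proteins named p0 .. p29
toy_net = {f"p{i}": set() for i in range(30)}
for i in range(29):
    toy_net[f"p{i}"].add(f"p{i + 1}")
    toy_net[f"p{i + 1}"].add(f"p{i}")

gene_set = ["p3", "p4", "p5", "p6", "p20"]
observed = lcc_size(toy_net, gene_set)
background = random_lcc_sizes(toy_net, len(gene_set), trials=500)
print("observed LCC size:", observed)
print("mean random LCC size:", sum(background) / len(background))
print("empirical p-value:", empirical_p_value(observed, background))
```

An observed LCC that is much larger than those of random sets suggests that the genes are closer to each other in the interactome than expected by chance, which is the reasoning behind the comparison in exercise 5.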
A wide range of network-based approaches have been and are being developed to address problems with relevance to biology and human health. Some of the scenarios where network analysis is playing an important role are:
1. Give a definition for complex system and for emergent properties.
2. Give an example of the fallacy of division.
3. What's the degree of the white node in the following network? What's the length of its shortest path to the black node? What's the clustering coefficient of the striped node? What's the size of the network's largest connected component?
4. Consider the following two networks and their degree distributions. Which one is more likely to represent a complex system and why?
5. List the topological properties that are common to most network representations of complex systems, like the human protein interactome.
6. What are two of the ingredients that might be responsible for the scale-free degree distribution of protein interaction networks?
7. Mention two high-throughput experimental techniques to measure protein-protein interactions.
8. Mention two protein-protein interaction databases from which you can construct high-quality protein networks.
9. Mention three problems with relevance to biology and human health, where network analysis is playing an important role.