Your name:
Name of your pair:
In this exercise we will eventually create a random Erdős–Rényi (ER) graph and plot its degree distribution. However, we will start by introducing some basic Python concepts and the IPython notebook environment to get you started.
Your solution:
from random import random
print(random())
A \(\rightarrow\) B
A \(\rightarrow\) C
B \(\rightarrow\) E
C \(\rightarrow\) B
C \(\rightarrow\) D
D \(\rightarrow\) A
graph = {'A' : ['B', 'C'],
         'B' : ['E'],
         'C' : ['B', 'D'],
         'D' : ['A'],
         'E' : []}
print(graph)
Create a graph with 1000 vertices labeled {1,...,1000} where each directed pair of vertices is connected by an edge with probability 0.05. Use the random function from above and the graph data structure described above.
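One possible approach is sketched below on a much smaller graph. This is an illustration, not the reference solution: the function name random_digraph and the choice to skip self-loops are assumptions, and the fixed seed is only there to make the sketch reproducible.

```python
from random import random, seed

def random_digraph(n, p):
    """Return a directed random graph on vertices 1..n as a dict of adjacency lists."""
    graph = {v: [] for v in range(1, n + 1)}
    for u in range(1, n + 1):
        for v in range(1, n + 1):
            # Assumption: no self-loops. Every other ordered pair (u, v)
            # receives an edge independently with probability p.
            if u != v and random() < p:
                graph[u].append(v)
    return graph

seed(0)  # fixed seed only so that the sketch is reproducible
small = random_digraph(50, 0.05)
n_edges = sum(len(targets) for targets in small.values())
print(n_edges)  # on average about 50 * 49 * 0.05 edges
```

Calling random_digraph(1000, 0.05) builds the graph asked for in the exercise; the double loop then visits about one million ordered pairs.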
Your solution:
from random import randint
import matplotlib.pyplot as plt
# Make plots appear inline
%matplotlib inline
# Generate the random numbers
numbers = [randint(0,9) for i in range(0,100)]
# Plot the histogram
plt.hist(numbers, bins=10)
Plot the degree histogram of the random graph you created in the previous step.
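Before plotting, the degrees have to be extracted from the adjacency lists. The sketch below shows one way to do this for the small example graph from above; the helper names out_degrees and in_degrees are hypothetical, and whether to plot in-degree, out-degree, or their sum is left open by the exercise.

```python
from collections import Counter

def out_degrees(graph):
    """Out-degree of each vertex, in the iteration order of the dict."""
    return [len(targets) for targets in graph.values()]

def in_degrees(graph):
    """In-degree of each vertex, in the iteration order of the dict."""
    counts = Counter()
    for targets in graph.values():
        counts.update(targets)
    # Counter returns 0 for vertices that never appear as a target
    return [counts[v] for v in graph]

example = {'A': ['B', 'C'], 'B': ['E'], 'C': ['B', 'D'], 'D': ['A'], 'E': []}
out_deg = out_degrees(example)
in_deg = in_degrees(example)
# Total degree per vertex, and how many vertices have each total degree
total_deg = [o + i for o, i in zip(out_deg, in_deg)]
degree_counts = Counter(total_deg)
print(out_deg, in_deg, dict(degree_counts))
```

A list such as total_deg can be passed to plt.hist in the same way as the list of random numbers in the histogram example.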
Your solution:
In this exercise we will read the E. coli regulatory network and plot its degree distribution.
http://regulondb.ccg.unam.mx/menu/download/datasets/files/network_tf_gene.txt
The header of the file contains general information on several comment lines starting with #. It is followed by the description of the network, where each line describes one edge: the first column contains the transcription factor (TF) that regulates the gene in the second column, giving an edge pointing from the TF to the gene. Note that TF names start with an upper-case letter and gene names with a lower-case letter. For example, accB is the gene coding for the TF AccB. Thus, when constructing the regulatory network, treat accB and AccB as the same vertex. For this exercise you can ignore the remaining columns.
Write code that reads the E. coli regulatory network into a graph data structure similar to the one used in the first exercise. You can use the gene names as vertex labels. You might find the Python string method split helpful in parsing the file.
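The parsing step can be sketched as follows. The sketch works on a list of already decoded lines rather than on the downloaded file, and it makes several assumptions that should be checked against the real data: that the columns are separated by tabs, that lower-casing the first letter is enough to merge a TF with its gene, and that repeated edges should be stored only once. The sample lines and the names normalize and parse_network are made up for illustration.

```python
def normalize(name):
    """Lower-case the first letter so that 'AccB' and 'accB' become one vertex."""
    return name[:1].lower() + name[1:]

def parse_network(lines):
    """Build a dict of adjacency lists from text lines of the form TF<TAB>gene<TAB>..."""
    graph = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and header comments
        columns = line.split('\t')  # assumption: tab-separated columns
        if len(columns) < 2:
            continue  # skip malformed lines
        tf, gene = normalize(columns[0]), normalize(columns[1])
        graph.setdefault(tf, [])
        graph.setdefault(gene, [])  # genes without targets still become vertices
        if gene not in graph[tf]:
            graph[tf].append(gene)
    return graph

# Hypothetical sample lines, for illustration only
sample = [
    "# hypothetical header line",
    "AccB\taccB\t-\t[evidence]",
    "AccB\taccC\t-\t[evidence]",
    "Ada\tada\t+\t[evidence]",
]
network = parse_network(sample)
print(network)
```

Note that a TF regulating its own gene produces a self-loop in this representation; whether to keep such loops is a modelling choice.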
Your solution:
import urllib.request # library handling URLs
ecoli_network = {}
# E. coli network file
networkfile="http://regulondb.ccg.unam.mx/menu/download/datasets/files/network_tf_gene.txt"
# Open the file. This will work exactly like a regular file.
data = urllib.request.urlopen(networkfile)
for line in data:
    line = line.decode('utf-8') # decode the byte array into a string
    # Add code to load the data here
Does the degree distribution of the E. coli regulatory network follow a power law? Why or why not? Try to fit a power-law distribution to the data by plotting a curve \(y = a x^{-k}\) with suitable values of \(a\) and \(k\).
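Since \(\log y = \log a - k \log x\), a power law appears as a straight line on log-log axes, so one rough way to obtain starting values for \(a\) and \(k\) is a least-squares line through the points \((\log x, \log y)\). The sketch below does this with the standard library only. The function name fit_power_law is hypothetical, the data are synthetic, and this kind of fit is only a crude estimate that is sensitive to noisy counts at high degrees.

```python
from math import exp, log

def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (a, k) for y = a * x**(-k)."""
    # Points with x <= 0 or y <= 0 cannot be shown on log axes and are dropped
    points = [(log(x), log(y)) for x, y in zip(xs, ys) if x > 0 and y > 0]
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return exp(intercept), -slope

# Synthetic data that follows y = 100 * x**(-2) exactly
degrees = [1, 2, 4, 8, 16]
counts = [100 * d ** -2.0 for d in degrees]
a, k = fit_power_law(degrees, counts)
print(a, k)
```

The fitted curve can then be drawn on top of the degree data by evaluating a * x**(-k) over the range of observed degrees and plotting it on the same log-log axes.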
from random import randint
import matplotlib.pyplot as plt
# Make plots appear inline
%matplotlib inline
# Generate random data
x = [2**i for i in range(0,10)]
y = [2**randint(0,9) for i in range(0,10)]
# Get the current axis
axis = plt.gca()
# Set log scale on x axis
axis.set_xscale('log')
# Set log scale on y axis
axis.set_yscale('log')
# Plot the data ('o' makes it a scatter plot)
axis.plot(x,y,'o')
Your solution:
Write your answer to the questions here.
In this exercise we will investigate the clustering coefficient of the E. coli regulatory network and compare it to the clustering coefficient of ER graphs with the same number of vertices and with a similar probability for an edge as in the E. coli regulatory network.
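The exercise does not fix a particular definition of the clustering coefficient. The sketch below uses one common choice as an assumption: edge directions and self-loops are ignored, the local coefficient of a vertex is the fraction of pairs of its neighbours that are themselves connected, vertices with fewer than two neighbours count as 0, and the result is the average over all vertices. The function names are hypothetical.

```python
from itertools import combinations

def undirected_neighbors(graph):
    """Neighbour sets of the graph with edge directions and self-loops ignored."""
    neighbors = {v: set() for v in graph}
    for u, targets in graph.items():
        for v in targets:
            if u != v:
                neighbors.setdefault(u, set()).add(v)
                neighbors.setdefault(v, set()).add(u)
    return neighbors

def average_clustering(graph):
    """Average of the local clustering coefficients over all vertices."""
    neighbors = undirected_neighbors(graph)
    if not neighbors:
        return 0.0
    total = 0.0
    for v, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue  # local coefficient taken as 0 (assumption)
        # Count the pairs of neighbours of v that are connected to each other
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors)

example = {'A': ['B', 'C'], 'B': ['E'], 'C': ['B', 'D'], 'D': ['A'], 'E': []}
triangle = {'A': ['B'], 'B': ['C'], 'C': ['A']}
print(average_clustering(example), average_clustering(triangle))
```

For the example graph from the first exercise the local coefficients are 2/3, 1/3, 2/3, 1 and 0 for A to E, giving an average of 8/15.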
Your solution:
def clustering_coefficient(graph):
    # Compute the clustering coefficient
    # Return the clustering coefficient
    return 0.0
print(clustering_coefficient(ecoli_network))
Your solution:
def ergraph(n, p):
    # Create an ER graph with n vertices and probability p for an edge to exist
    graph = {}
    return graph
# Create an ensemble of ER graphs and compute their clustering coefficients
# Plot the histogram of the clustering coefficients
Write your answer to the questions here.
ER graphs have a very different degree distribution from that of the E. coli regulatory network. Therefore, the result of the previous exercise might just be a consequence of the particular degree distribution. In this exercise we will investigate whether the clustering coefficient of the E. coli regulatory network is exceptional among graphs with the same degree distribution.
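A standard way to randomise a graph while keeping every in-degree and out-degree fixed is repeated edge switching: pick two edges a → b and c → d at random and replace them with a → d and c → b. The sketch below illustrates the idea under some assumptions: every vertex is a key of the dictionary, the input has no duplicate edges, and switches that would create a new self-loop or a duplicate edge are rejected. The function name switch_edges and the parameter m (number of attempted switches) are hypothetical.

```python
from random import randrange, seed

def switch_edges(graph, m):
    """Return a copy of graph after m attempted degree-preserving edge switches."""
    edges = [(u, v) for u, targets in graph.items() for v in targets]
    edge_set = set(edges)
    if len(edges) >= 2:
        for _ in range(m):
            i, j = randrange(len(edges)), randrange(len(edges))
            (a, b), (c, d) = edges[i], edges[j]
            # Reject switches that would create a self-loop or a duplicate edge
            if i == j or a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
                continue
            edge_set -= {(a, b), (c, d)}
            edge_set |= {(a, d), (c, b)}
            edges[i], edges[j] = (a, d), (c, b)
    # Rebuild the adjacency lists from the switched edge list
    switched = {v: [] for v in graph}
    for u, v in edges:
        switched[u].append(v)
    return switched

seed(1)  # fixed seed only so that the sketch is reproducible
example = {'A': ['B', 'C'], 'B': ['E'], 'C': ['B', 'D'], 'D': ['A'], 'E': []}
shuffled = switch_edges(example, 100)
print(shuffled)
```

Because each switch keeps the source of both edges and only exchanges their targets, the out-degree of every source and the in-degree of every target are unchanged. How many switches are needed for good mixing is not settled by this sketch; a multiple of the number of edges is a common rule of thumb.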
Your solution:
def switchgraph(graph):
    # Create a copy of the original graph
    switchedgraph = {}
    # Perform m random switches
    # Return the graph
    return switchedgraph
# Create a random graph and plot its degree distribution
Your solution:
Write your answer to the questions here.