Heat Diffusion Workflow

A variant of the Network Perturbation Amplitude algorithm

pybel_tools.analysis.cmpa.RESULT_LABELS = ['avg', 'stddev', 'normality', 'median', 'neighbors', 'subgraph_size']

The columns in the score tuples
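Because each score tuple follows the column order of RESULT_LABELS, a row can be paired with its labels for readable access. The following is a minimal sketch; the numeric values in `row` are made up for illustration and do not come from a real run.

```python
RESULT_LABELS = ['avg', 'stddev', 'normality', 'median', 'neighbors', 'subgraph_size']

# Hypothetical score tuple, in the same column order as RESULT_LABELS
row = (0.42, 0.10, 0.87, 0.40, 5, 12)

# Pair each label with its value so columns can be looked up by name
labeled = dict(zip(RESULT_LABELS, row))
print(labeled['median'])
```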

class pybel_tools.analysis.cmpa.Runner(graph, target_node, key, tag=None, default_score=None)

The Runner class houses the data related to a single run of the CMPA analysis

Initializes the CMPA runner class

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • target_node (tuple) – The BEL node that is the focus of this analysis
  • key (str) – The key for the nodes’ data dictionaries that points to their original experimental measurements
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.

iter_leaves()

Returns an iterable over all nodes that are leaves. A node is a leaf if either:

  • it doesn’t have any predecessors, OR
  • all of its predecessors have a CMPA score in their data dictionaries
Returns: An iterable over all leaf nodes
Return type: iter
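The leaf rule above can be sketched without a BEL graph. This is not the library's implementation: the `predecessors` and `scores` dictionaries are a hypothetical stand-in for the graph and its node data, and skipping already-scored nodes is an assumption.

```python
def iter_leaves(predecessors, scores):
    """Yield unscored nodes whose predecessors are all scored (or absent).

    predecessors: dict mapping node -> set of predecessor nodes (hypothetical layout)
    scores: dict mapping node -> score, for nodes scored so far
    """
    for node, preds in predecessors.items():
        if node in scores:
            continue  # assumption: already-scored nodes are not reported again
        if all(p in scores for p in preds):
            yield node


predecessors = {'a': set(), 'b': {'a'}, 'c': {'a', 'b'}}
print(sorted(iter_leaves(predecessors, {})))          # ['a']
print(sorted(iter_leaves(predecessors, {'a': 1.0})))  # ['b']
```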

has_leaves()

Returns whether the current graph has any leaves.

The current implementation is naive and does a full sweep.

Returns: Does the current graph have any leaves?
Return type: bool

in_out_ratio(node)

Calculates the ratio of in-degree / out-degree of a node

Parameters: node (tuple) – A BEL node
Returns: The in-degree / out-degree ratio for the given node
Return type: float
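A minimal sketch of the ratio, using plain dictionaries in place of the BEL graph. The dictionary layout is hypothetical, and returning infinity for a node with no out-edges is an assumption; the documentation does not say how the library handles that case.

```python
def in_out_ratio(predecessors, successors, node):
    """Return in-degree / out-degree for a node."""
    in_degree = len(predecessors.get(node, ()))
    out_degree = len(successors.get(node, ()))
    if out_degree == 0:
        # Assumption: treat a node with no out-edges as having an infinite ratio
        return float('inf')
    return in_degree / out_degree


predecessors = {'b': {'a'}, 'c': {'a', 'b'}}
successors = {'a': {'b', 'c'}, 'b': {'c'}}
print(in_out_ratio(predecessors, successors, 'b'))  # 1.0
print(in_out_ratio(predecessors, successors, 'a'))  # 0.0
```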

unscored_nodes_iter()

Iterates over all nodes without a CMPA score

get_random_edge()

This function should be run when there are no leaves but there are still unscored nodes. It introduces a probabilistic element to the algorithm: some edges are disregarded at random so that the network can eventually be scored. As a consequence, the CMPA score can be averaged over many runs for a given graph. A better data structure that annotates which edges have been disregarded, instead of destroying the graph, still has to be developed.

  1. Get all unscored nodes
  2. Rank them by in-degree
  3. Assign a weighted probability over all in-edges, where lower in-degree means higher probability
  4. Randomly pick an edge
Returns: A random in-edge to the lowest in/out degree ratio node. This is a 3-tuple of (node, node, key)
Return type: tuple
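The selection described in the return value can be sketched as follows. This is an illustration under stated assumptions, not the library's code: edges are a plain list of (source, target, key) tuples, only unscored nodes with at least one in-edge are considered, and the in-edge is chosen uniformly rather than with the weighted probability mentioned in step 3.

```python
import random


def get_random_edge(edges, unscored, rng=random):
    """Pick a random in-edge of the unscored node with the lowest in/out ratio.

    edges: list of (source, target, key) tuples (hypothetical layout)
    unscored: iterable of nodes that have no score yet
    """
    def ratio(node):
        in_degree = sum(1 for _, v, _ in edges if v == node)
        out_degree = sum(1 for u, _, _ in edges if u == node)
        # Assumption: nodes without out-edges rank last
        return in_degree / out_degree if out_degree else float('inf')

    # Only nodes that actually have an in-edge can lose one
    candidates = [n for n in unscored if any(v == n for _, v, _ in edges)]
    target = min(candidates, key=ratio)  # raises ValueError if there are none
    in_edges = [e for e in edges if e[1] == target]
    return rng.choice(in_edges)


# A small cycle (a <-> b) with one outgoing branch (b -> c)
edges = [('a', 'b', 0), ('b', 'a', 0), ('b', 'c', 0)]
print(get_random_edge(edges, ['a', 'b', 'c']))  # ('a', 'b', 0)
```

Here node b has the lowest ratio (one in-edge, two out-edges) and only a single in-edge, so the result is the same on every call.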

remove_random_edge()

Removes a random in-edge from the node with the lowest in/out degree ratio

remove_random_edge_until_has_leaves()

Removes random edges until there is at least one leaf node

score_leaves()

Calculates the CMPA score for all leaves

Returns: The set of leaf nodes that were scored
Return type: set

run()

Calculates CMPA scores for all leaves until none remain, then removes random edges until new leaves appear, and repeats until all nodes have been scored
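The overall loop can be sketched on a toy graph. Everything here is an assumption made for illustration: the graph is a list of (source, target) pairs, the scoring rule (own measurement plus the sum of predecessor scores) is a placeholder for calculate_score(), and the edge to drop is chosen uniformly among edges between unscored nodes.

```python
import random


def run(nodes, edges, measurements, seed=0):
    """Score every node, dropping random edges whenever no leaf is available."""
    rng = random.Random(seed)
    edges = set(edges)  # working copy, so the caller's edge list is untouched
    scores = {}

    def preds(node):
        return [u for u, v in edges if v == node]

    while len(scores) < len(nodes):
        leaves = [n for n in nodes
                  if n not in scores and all(p in scores for p in preds(n))]
        if leaves:
            for n in leaves:
                # Placeholder scoring rule, not the library's calculate_score()
                scores[n] = measurements.get(n, 0.0) + sum(scores[p] for p in preds(n))
        else:
            # Stuck on a cycle: drop one random edge between unscored nodes
            stuck = sorted((u, v) for u, v in edges
                           if u not in scores and v not in scores)
            edges.remove(rng.choice(stuck))
    return scores


chain = run(['a', 'b', 'c'], [('a', 'b'), ('b', 'c')], {'a': 1.0, 'b': 2.0, 'c': 3.0})
print(chain)  # {'a': 1.0, 'b': 3.0, 'c': 6.0}

cyclic = run(['a', 'b', 'c'], [('a', 'b'), ('b', 'a'), ('b', 'c')], {'a': 1.0})
print(sorted(cyclic))  # ['a', 'b', 'c']
```

Each pass either scores at least one node or removes one edge, which is why the loop terminates even when the graph contains cycles.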

run_with_graph_transformation()

Calculates CMPA scores for all leaves until none remain, then removes random edges until new leaves appear, and repeats until all nodes have been scored. It also yields the current graph at every step, which makes it possible to animate how the graph changes over the course of the algorithm

Returns: An iterable of BEL graphs
Return type: iter

done_chomping()

Determines if the algorithm is complete by checking if the target node of this analysis has been scored yet. Because the algorithm removes edges whenever it gets stuck, until it is no longer stuck, it is guaranteed to finish.

Returns: Is the algorithm done running?
Return type: bool

get_final_score()

Returns the final score for the target node

Returns: The final score for the target node
Return type: float

calculate_score(node)

Calculates the score of the given node

Parameters: node (tuple) – A node in the BEL graph
Returns: The new score of the node
Return type: float

get_remaining_graph()

Allows for introspection on the algorithm at a given point by returning the subgraph induced by all unscored nodes

Returns: The remaining unscored BEL graph
Return type: pybel.BELGraph

pybel_tools.analysis.cmpa.multirun(graph, node, key, tag=None, default_score=None, runs=None)

Runs CMPA multiple times and yields the Runner object after each run has been completed

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • node (tuple) – The BEL node that is the focus of this analysis
  • key (str) – The key for the nodes’ data dictionaries that points to their original experimental measurements
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.
  • runs (int) – The number of times to run the CMPA algorithm. Defaults to 1000.
Returns: An iterable over the runners after each iteration
Return type: iter[Runner]

pybel_tools.analysis.cmpa.workflow(graph, node, key, tag=None, default_score=None, runs=None)

Generates a candidate mechanism and runs CMPA.

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • node (tuple) – The BEL node that is the focus of this analysis
  • key (str) – The key in the node data dictionary representing the experimental data
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.
  • runs (int) – The number of times to run the CMPA algorithm. Defaults to 1000.
Returns: A list of runners
Return type: list[Runner]

pybel_tools.analysis.cmpa.workflow_average(graph, node, key, tag=None, default_score=None, runs=None)

Gets the average CMPA score over multiple runs.

This function is very simple, and can be copied to do more interesting statistics over the Runner instances. To iterate over the runners themselves, see workflow().

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • node (tuple) – The BEL node that is the focus of this analysis
  • key (str) – The key for the nodes’ data dictionaries that points to their original experimental measurements
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.
  • runs (int) – The number of times to run the CMPA algorithm. Defaults to 1000.
Returns: The average score for the target node
Return type: float
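As an example of statistics beyond the plain average, the final scores collected from several runs can be summarized with the standard library. This is a sketch under assumptions: `final_scores` is a made-up list standing in for the values returned by get_final_score() on each runner, and the sample standard deviation is used because the documentation does not say which variant the library computes.

```python
import statistics


def summarize(final_scores):
    """Summarize the target node's final score over several runs."""
    return {
        'avg': statistics.mean(final_scores),
        'stddev': statistics.stdev(final_scores),  # sample standard deviation (assumption)
        'median': statistics.median(final_scores),
    }


# Hypothetical final scores from four runs
final_scores = [1.0, 2.0, 3.0, 4.0]
summary = summarize(final_scores)
print(summary['avg'], summary['median'])  # 2.5 2.5
```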

pybel_tools.analysis.cmpa.workflow_all(graph, key, tag=None, default_score=None, runs=None)

Runs CMPA and gets runners for every possible candidate mechanism

  1. Get all biological processes
  2. Get the candidate mechanism induced two levels back from each biological process
  3. Run CMPA on each candidate mechanism for multiple runs
  4. Return all runner results
Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • key (str) – The key in the node data dictionary representing the experimental data
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.
  • runs (int) – The number of times to run the CMPA algorithm. Defaults to 1000.
Returns: A dictionary of {node: list of runners}
Return type: dict

pybel_tools.analysis.cmpa.workflow_all_average(graph, key, tag=None, default_score=None, runs=None)

Runs CMPA to get the average score for every possible candidate mechanism

  1. Get all biological processes
  2. Get the candidate mechanism induced two levels back from each biological process
  3. Run CMPA on each candidate mechanism for multiple runs
  4. Report average CMPA scores for each candidate mechanism
Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • key (str) – The key in the node data dictionary representing the experimental data
  • tag (str) – The key for the nodes’ data dictionaries where the CMPA scores will be put. Defaults to ‘score’
  • default_score (float) – The initial CMPA score for all nodes. This number can go up or down.
  • runs (int) – The number of times to run the CMPA algorithm. Defaults to 1000.
Returns: A dictionary of {node: average CMPA score}
Return type: dict