Selection

This module contains functions to help select data from networks.

pybel_tools.selection.group_nodes_by_annotation(graph, annotation='Subgraph')[source]

Groups the nodes occurring in edges by the given annotation

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • annotation (str) – An annotation to use to group edges
Returns:

dict of sets of BELGraph nodes

Return type:

dict
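
The grouping described above can be sketched in plain Python. This is an illustration only, not the pybel_tools implementation: the graph is replaced by a hypothetical list of (source, target, edge data) triples, and the name group_nodes_by_annotation_sketch and the "annotations" key layout are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical stand-in for a BEL graph: (source, target, edge_data) triples.
EDGES = [
    ("A", "B", {"annotations": {"Subgraph": "Apoptosis"}}),
    ("B", "C", {"annotations": {"Subgraph": "Apoptosis"}}),
    ("C", "D", {"annotations": {"Subgraph": "Inflammation"}}),
    ("D", "E", {}),  # an edge without annotations contributes to no group
]


def group_nodes_by_annotation_sketch(edges, annotation="Subgraph"):
    """Map each annotation value to the set of nodes on edges carrying it."""
    groups = defaultdict(set)
    for source, target, data in edges:
        value = data.get("annotations", {}).get(annotation)
        if value is None:
            continue  # skip edges that lack the requested annotation
        groups[value].update((source, target))
    return dict(groups)


groups = group_nodes_by_annotation_sketch(EDGES)
```

Note that a node can land in more than one group when its edges carry different values, as node "C" does here.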

pybel_tools.selection.average_node_annotation(graph, key, annotation='Subgraph', aggregator=None)[source]

Groups the graph into subgraphs and assigns each subgraph a score based on the average of all nodes' values for the given node key

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • key (str) – The key in the node data dictionary representing the experimental data
  • annotation (str) – A BEL annotation to use to group nodes
  • aggregator (lambda) – A function from list of values -> aggregate value. Defaults to taking the average of a list of floats.
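
A rough sketch of the scoring step, assuming the nodes have already been grouped by annotation value. The names average_node_annotation_sketch, GROUPS and NODE_DATA are hypothetical, and skipping nodes that lack the key is an assumption of this example rather than documented behavior.

```python
from statistics import mean

# Hypothetical inputs: groups of nodes and a per-node data dictionary.
GROUPS = {"Apoptosis": {"A", "B", "C"}, "Inflammation": {"C", "D"}}
NODE_DATA = {"A": {"weight": 1.0}, "B": {"weight": 3.0}, "C": {}, "D": {"weight": 5.0}}


def average_node_annotation_sketch(groups, node_data, key, aggregator=None):
    """Score each group by aggregating the values stored under `key`."""
    if aggregator is None:
        aggregator = mean  # default: average of a list of floats
    scores = {}
    for value, nodes in groups.items():
        # Assumption: nodes without the key are simply ignored.
        values = [node_data[n][key] for n in nodes if key in node_data.get(n, {})]
        if values:
            scores[value] = aggregator(values)
    return scores


scores = average_node_annotation_sketch(GROUPS, NODE_DATA, "weight")
max_scores = average_node_annotation_sketch(GROUPS, NODE_DATA, "weight", aggregator=max)
```

Passing a different aggregator, such as max, swaps the summary statistic without changing the grouping.
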
pybel_tools.selection.group_nodes_by_annotation_filtered(graph, node_filters=None, annotation='Subgraph')[source]

Groups the nodes occurring in edges by the given annotation, with a node filter applied

Parameters:
Returns:

A dictionary of {annotation value: set of nodes}

Return type:

dict[str,set[tuple]]

pybel_tools.selection.get_subgraph_by_induction(graph, nodes)[source]

Induces a graph over the given nodes

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nodes (iter[tuple]) – An iterable of BEL nodes
Returns:

A subgraph induced over the given nodes

Return type:

pybel.BELGraph
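
As an illustration of what inducing over nodes means, here is a minimal sketch on a plain node set and edge list. It does not use PyBEL; induce_sketch, NODES and EDGES are hypothetical names for this example.

```python
# Hypothetical toy graph: a node set plus directed edges.
NODES = {"A", "B", "C", "D"}
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]


def induce_sketch(all_nodes, edges, nodes):
    """Keep the requested nodes and every edge whose two endpoints are both kept."""
    kept_nodes = set(nodes) & set(all_nodes)  # ignore nodes absent from the graph
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


kept_nodes, kept_edges = induce_sketch(NODES, EDGES, {"A", "B", "C", "Z"})
```

The edge ("C", "D") is dropped because only one of its endpoints was requested.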

pybel_tools.selection.get_subgraph_by_edge_filter(graph, edge_filters)[source]

Induces a subgraph on all edges that pass the given filters

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • edge_filters (list or tuple or lambda) – A predicate or list of predicates (graph, node, node, key, data) -> bool
Returns:

A BEL subgraph induced over the edges passing the given filters

Return type:

pybel.BELGraph
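
The predicate signature (graph, node, node, key, data) -> bool can be illustrated with a small sketch. The graph here is a hypothetical list of (source, target, key, data) tuples, and the predicates and filter_edges_sketch are made up for this example; they are not part of pybel_tools.

```python
# Hypothetical multigraph stand-in: (source, target, key, data) tuples.
GRAPH = [
    ("A", "B", 0, {"relation": "increases", "citation": "123"}),
    ("A", "B", 1, {"relation": "decreases", "citation": "123"}),
    ("B", "C", 0, {"relation": "increases"}),
]


def keep_increases(graph, u, v, key, data):
    """Edge predicate: pass only 'increases' relations."""
    return data.get("relation") == "increases"


def has_citation(graph, u, v, key, data):
    """Edge predicate: pass only edges that carry a citation."""
    return "citation" in data


def filter_edges_sketch(graph, edge_filters):
    """Keep edges that pass every predicate; accepts one predicate or several."""
    if callable(edge_filters):
        edge_filters = [edge_filters]
    return [
        (u, v, k, d)
        for u, v, k, d in graph
        if all(f(graph, u, v, k, d) for f in edge_filters)
    ]


increasing = filter_edges_sketch(GRAPH, keep_increases)
cited_increasing = filter_edges_sketch(GRAPH, [keep_increases, has_citation])
```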

pybel_tools.selection.get_subgraph_by_node_filter(graph, node_filters)[source]

Induces a graph on the nodes that pass all filters

Parameters:
Returns:

A subgraph induced over the nodes passing the given filters

Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_neighborhood(graph, nodes)[source]

Gets a BEL graph around the neighborhoods of the given nodes

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nodes (iter[tuple]) – An iterable of BEL nodes
Returns:

A BEL graph induced around the neighborhoods of the given nodes

Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_second_neighbors(graph, nodes, filter_pathologies=False)[source]

Gets a BEL graph around the neighborhoods of the given nodes, and expands to the neighborhood of those nodes

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nodes (iter[tuple]) – An iterable of BEL nodes
  • filter_pathologies (bool) – Should expansion take place around pathologies?
Returns:

A BEL graph induced around the neighborhoods of the given nodes

Return type:

pybel.BELGraph
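
The difference between a neighborhood and a second-neighbor expansion can be sketched on a plain edge list. This is an assumption-laden illustration: neighborhood_edges and second_neighbors_edges are hypothetical helpers, and the filter_pathologies option is not modeled.

```python
# Hypothetical directed chain: A -> B -> C -> D -> E
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]


def neighborhood_edges(edges, nodes):
    """Edges with at least one endpoint among the given nodes."""
    nodes = set(nodes)
    return [(u, v) for u, v in edges if u in nodes or v in nodes]


def second_neighbors_edges(edges, nodes):
    """Take the neighborhood, then the neighborhood of every node it reached."""
    first = neighborhood_edges(edges, nodes)
    reached = {node for edge in first for node in edge}
    return neighborhood_edges(edges, reached)


first = neighborhood_edges(EDGES, {"A"})
second = second_neighbors_edges(EDGES, {"A"})
```

Starting from "A", the first pass reaches "B" and the second pass additionally pulls in the edge from "B" to "C".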

pybel_tools.selection.get_subgraph_by_all_shortest_paths(graph, nodes, cutoff=None, weight=None)[source]

Induces a subgraph over the nodes in the pairwise shortest paths between all of the nodes in the given list

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nodes (set[tuple]) – A set of nodes over which to calculate shortest paths
  • cutoff (int) – Depth to stop the shortest path search. Only paths of length <= cutoff are returned.
  • weight (str) – Edge data key corresponding to the edge weight. If None, performs unweighted search
Returns:

A BEL graph induced over the nodes appearing in the shortest paths between the given nodes

Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_annotation_value(graph, annotation, value)[source]

Builds a new subgraph induced over all edges whose annotations match the given key and value

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • annotation (str) – The annotation to group by
  • value (str) – The value for the annotation
Returns:

A subgraph of the original BEL graph

Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_annotations(graph, annotations, or_=None)[source]

Returns the subgraph given an annotations filter.

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • annotations (dict) – Annotation filters (match all with pybel.utils.subdict_matches())
  • or_ (bool) – If True, at least one of the given annotations must be present in the edge; if False, all of them must be present
Returns:

A subgraph of the original BEL graph

Return type:

pybel.BELGraph
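
The effect of the any/all switch can be shown with a small sketch. It uses plain equality on a hypothetical list of (source, target, annotations) triples; the real function delegates matching to pybel.utils.subdict_matches(), whose semantics may differ, and the helper names below are invented for this example.

```python
# Hypothetical edges: (source, target, annotations) triples.
EDGES = [
    ("A", "B", {"Species": "9606", "Cell": "neuron"}),
    ("B", "C", {"Species": "9606", "Cell": "astrocyte"}),
    ("C", "D", {"Species": "10090", "Cell": "astrocyte"}),
]


def edge_matches(edge_annotations, query, or_=False):
    """With or_ set, one matching annotation suffices; otherwise all must match."""
    checks = [edge_annotations.get(key) == value for key, value in query.items()]
    return any(checks) if or_ else all(checks)


def subgraph_by_annotations_sketch(edges, annotations, or_=False):
    """Keep the edges whose annotations satisfy the query."""
    return [(u, v) for u, v, ann in edges if edge_matches(ann, annotations, or_)]


QUERY = {"Species": "9606", "Cell": "astrocyte"}
match_all = subgraph_by_annotations_sketch(EDGES, QUERY)
match_any = subgraph_by_annotations_sketch(EDGES, QUERY, or_=True)
```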

pybel_tools.selection.get_subgraph_by_data(graph, annotations)[source]

Returns the subgraph filtering for Citation, Evidence or Annotation in the edges.

Parameters:
Returns:

A subgraph of the original BEL graph

Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_pubmed(graph, pmids)[source]

Induces a subgraph over the edges retrieved from the given PubMed identifier(s)

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • pmids (str or list[str]) – A PubMed identifier or list of PubMed identifiers
Return type:

pybel.BELGraph

pybel_tools.selection.get_subgraph_by_authors(graph, authors)[source]

Induces a subgraph over the edges retrieved from publications by the given author(s)

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • authors (str or list[str]) – An author or list of authors
Return type:

pybel.BELGraph

Gets a subgraph induced over all nodes matching the query string

Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • query (str or iter[str]) – A query string or iterable of query strings for node names
Returns:

A subgraph induced over the original BEL graph

Return type:

pybel.BELGraph

Thinly wraps search_node_names() and get_subgraph_by_induction().

pybel_tools.selection.get_causal_subgraph(graph)[source]

Builds a new subgraph induced over all edges that are causal

Parameters: graph (pybel.BELGraph) – A BEL graph
Returns: A subgraph of the original BEL graph
Return type: pybel.BELGraph
pybel_tools.selection.get_subgraph(graph, seed_method=None, seed_data=None, expand_nodes=None, remove_nodes=None)[source]

Runs pipeline query on graph with multiple subgraph filters and expanders.

Order of Operations:

  1. Seeding by given function name and data
  2. Add nodes
  3. Remove nodes
Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • seed_method (str) – The name of the get_subgraph_by_* function to use
  • seed_data – The argument to pass to the get_subgraph function
  • expand_nodes (list[tuple]) – Add the neighborhoods around all of these nodes
  • remove_nodes (list[tuple]) – Remove these nodes and all of their in/out edges
Returns:

A BEL Graph

Return type:

pybel.BELGraph
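
The order of operations (seed, add, remove) can be sketched as a small pipeline over an edge list. Everything here is hypothetical: the SEED_METHODS registry, the two seeding helpers and get_subgraph_sketch only mimic the documented steps and are not the library's code.

```python
# Hypothetical directed chain: A -> B -> C -> D -> E
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]


def seed_by_induction(edges, nodes):
    """Edges with both endpoints among the given nodes."""
    nodes = set(nodes)
    return [(u, v) for u, v in edges if u in nodes and v in nodes]


def seed_by_neighborhood(edges, nodes):
    """Edges with at least one endpoint among the given nodes."""
    nodes = set(nodes)
    return [(u, v) for u, v in edges if u in nodes or v in nodes]


# Assumed lookup from a seed method name to a seeding function.
SEED_METHODS = {"induction": seed_by_induction, "neighborhood": seed_by_neighborhood}


def get_subgraph_sketch(edges, seed_method=None, seed_data=None,
                        expand_nodes=None, remove_nodes=None):
    # 1. Seeding by the given function name and data
    if seed_method is None:
        result = list(edges)
    else:
        result = SEED_METHODS[seed_method](edges, seed_data)
    # 2. Add the neighborhoods around the expansion nodes
    if expand_nodes:
        for edge in seed_by_neighborhood(edges, expand_nodes):
            if edge not in result:
                result.append(edge)
    # 3. Remove nodes together with all of their in/out edges
    if remove_nodes:
        removed = set(remove_nodes)
        result = [(u, v) for u, v in result if u not in removed and v not in removed]
    return result


result = get_subgraph_sketch(
    EDGES, seed_method="induction", seed_data={"A", "B"},
    expand_nodes=["D"], remove_nodes=["E"],
)
```

Because removal runs last, the edge from "D" to "E" that the expansion step added is taken out again.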

pybel_tools.selection.get_subgraphs_by_annotation(graph, annotation)[source]

Stratifies the given graph into subgraphs based on the values for edges’ annotations

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • annotation (str) – The annotation to group by
Returns:

A dictionary of {str value: BELGraph subgraph}

Return type:

dict[str, pybel.BELGraph]

pybel_tools.selection.get_multi_causal_upstream(graph, nbunch)[source]

Gets the union of all the 2-level deep causal upstream subgraphs from the nbunch

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nbunch (tuple or list[tuple]) – A BEL node or list of BEL nodes
Returns:

A subgraph of the original BEL graph

Return type:

pybel.BELGraph

pybel_tools.selection.get_multi_causal_downstream(graph, nbunch)[source]

Gets the union of all of the 2-level deep causal downstream subgraphs from the nbunch

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nbunch (tuple or list[tuple]) – A BEL node or list of BEL nodes
Returns:

A subgraph of the original BEL graph

Return type:

pybel.BELGraph

pybel_tools.selection.get_random_subgraph(graph, number_edges=250, number_seed_nodes=5)[source]

Randomly picks a node from the graph, and performs a weighted random walk to sample the given number of edges around it

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • number_edges (int) – Maximum number of edges
  • number_seed_nodes (int) – Number of nodes to start with (which likely results in different components in large graphs)
Return type:

pybel.BELGraph
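
A simplified sketch of sampling edges by random walking from several seed nodes. Note the swap: this example uses a uniform (unweighted) walk over a hypothetical adjacency dictionary, not the weighted walk the function describes, and random_subgraph_sketch with its seed and step-limit parameters is invented for illustration.

```python
import random

# Hypothetical adjacency mapping: node -> list of successor nodes.
ADJ = {"A": ["B", "C"], "B": ["C"], "C": ["A", "D"], "D": []}


def random_subgraph_sketch(adj, number_edges=250, number_seed_nodes=5, seed=None):
    """Sample up to `number_edges` edges by stepping from randomly chosen visited nodes."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    visited = rng.sample(nodes, min(number_seed_nodes, len(nodes)))
    sampled = set()
    if not visited:
        return sampled
    steps, max_steps = 0, 10 * number_edges  # guard against dead ends
    while len(sampled) < number_edges and steps < max_steps:
        steps += 1
        current = rng.choice(visited)
        neighbors = adj.get(current, [])
        if not neighbors:
            continue  # dead end: pick another visited node next round
        target = rng.choice(neighbors)
        sampled.add((current, target))
        visited.append(target)
    return sampled


sample = random_subgraph_sketch(ADJ, number_edges=3, number_seed_nodes=2, seed=0)
```

Starting from several seeds means the sampled edges need not form a single connected component, which matches the remark on number_seed_nodes above.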

pybel_tools.selection.get_upstream_leaves(graph)[source]

Gets all leaves of the graph (with no incoming edges and only one outgoing edge)

See also

upstream_leaf_predicate()

Parameters: graph (pybel.BELGraph) – A BEL graph
Returns: An iterator over nodes that are upstream leaves
Return type: iter
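
The upstream-leaf condition (no incoming edges, exactly one outgoing edge) can be checked with degree counts. A minimal sketch on a hypothetical edge list follows; upstream_leaves_sketch is not the library function.

```python
from collections import Counter

# Hypothetical directed edges.
EDGES = [("A", "C"), ("B", "C"), ("B", "D"), ("C", "E")]


def upstream_leaves_sketch(edges):
    """Yield nodes with in-degree 0 and out-degree 1."""
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(source for source, _ in edges)
    for node in set(in_degree) | set(out_degree):
        # Counter returns 0 for nodes it has never seen.
        if in_degree[node] == 0 and out_degree[node] == 1:
            yield node


leaves = sorted(upstream_leaves_sketch(EDGES))
```

Node "B" also has no incoming edges, but it is excluded because it has two outgoing edges.
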
pybel_tools.selection.get_unweighted_upstream_leaves(graph, key)[source]

Gets all leaves of the graph with no incoming edges, one outgoing edge, and without the given key in its data dictionary

See also

data_does_not_contain_key_builder()

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • key (str) – The key in the node data dictionary representing the experimental data
Returns:

An iterable over leaves (nodes with an in-degree of 0) that don’t have the given annotation

Return type:

iter

pybel_tools.selection.get_gene_leaves(graph)[source]

Finds all genes that have only one connection, which is a transcription to their RNA

Parameters: graph (pybel.BELGraph) – A BEL graph
pybel_tools.selection.get_rna_leaves(graph)[source]

Finds all RNAs that have only one connection, which is a translation to their protein

Parameters: graph (pybel.BELGraph) – A BEL graph
pybel_tools.selection.get_leaves_by_type(graph, function=None, prune_threshold=1)[source]
Returns an iterable over all nodes in graph (in-place) with only a connection to one node. Useful for gene and RNA. Allows for optional filter by function type.
Parameters:
Returns:

An iterable over nodes with only a connection to one node

Return type:

iter

pybel_tools.selection.get_nodes_in_all_shortest_paths(graph, nodes, weight=None)[source]

Gets all shortest paths from all nodes to all other nodes in the given list and returns the set of all nodes contained in those paths using networkx.all_shortest_paths().

Parameters:
  • graph (pybel.BELGraph) – A BEL graph
  • nodes (iter[tuple]) – The list of nodes to use to find all shortest paths
  • cutoff (int) – Depth to stop the search. Only paths of length <= cutoff are returned.
  • weight (str) – Edge data key corresponding to the edge weight. If None, uses unweighted search.
Returns:

A set of nodes appearing in the shortest paths between nodes in the BEL graph

Return type:

set

Note

This can be trivially parallelized using networkx.single_source_shortest_path()
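
To make the idea concrete without depending on networkx, the following sketch collects the nodes on all pairwise shortest paths using a plain breadth-first search. It covers only the unweighted, directed case; ADJ, all_shortest_paths_sketch and nodes_in_all_shortest_paths_sketch are hypothetical names, and applying the cutoff as a filter on path length is an assumption of this example.

```python
from collections import deque
from itertools import permutations

# Hypothetical adjacency mapping: node -> list of successor nodes.
ADJ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "X": ["A"]}


def all_shortest_paths_sketch(adj, source, target):
    """Return every shortest directed path from source to target (unweighted)."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)  # another equally short way to reach v
    if target not in dist:
        return []
    # Walk the predecessor links back from the target to the source.
    paths, stack = [], [[target]]
    while stack:
        path = stack.pop()
        if path[-1] == source:
            paths.append(path[::-1])
            continue
        for pred in preds[path[-1]]:
            stack.append(path + [pred])
    return paths


def nodes_in_all_shortest_paths_sketch(adj, nodes, cutoff=None):
    """Union of nodes on the shortest paths between every ordered pair."""
    result = set()
    for source, target in permutations(nodes, 2):
        for path in all_shortest_paths_sketch(adj, source, target):
            if cutoff is None or len(path) - 1 <= cutoff:
                result.update(path)
    return result


found = nodes_in_all_shortest_paths_sketch(ADJ, {"A", "D"})
```

Both "B" and "C" are included because they lie on two different, equally short routes from "A" to "D".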

pybel_tools.selection.get_shortest_directed_path_between_subgraphs(graph, a, b)[source]

Calculate the shortest path that occurs between two disconnected subgraphs A and B going through nodes in the source graph

Parameters:
Returns:

A list of the shortest paths between the two subgraphs

Return type:

list

pybel_tools.selection.get_shortest_undirected_path_between_subgraphs(graph, a, b)[source]

Get the shortest path between two disconnected subgraphs A and B, disregarding directionality of edges in graph

Parameters:
Returns:

A list of the shortest paths between the two subgraphs

Return type:

list

pybel_tools.selection.search_node_names(graph, query)[source]

Searches for nodes containing a given string

Parameters:
Returns:

An iterator over nodes whose names match the search query

Return type:

iter

pybel_tools.selection.search_node_cnames(graph, query)[source]

Searches for nodes whose canonical names contain a given string(s)

Parameters:
Returns:

An iterator over nodes whose canonical names match the search query

Return type:

iter

pybel_tools.selection.convert_path_to_metapath(graph, nodes)[source]

Converts a list of nodes to their corresponding functions

Parameters: nodes (list[tuple]) – A list of BEL node tuples
Return type: list[str]
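
A metapath is simply the sequence of node functions along a path. The sketch below assumes a hypothetical dictionary that maps each node tuple to its function name; how the real function looks up a node's function in the graph is not shown in this section.

```python
# Hypothetical lookup: BEL node tuple -> function name.
FUNCTIONS = {
    ("Gene", "HGNC", "AKT1"): "Gene",
    ("RNA", "HGNC", "AKT1"): "RNA",
    ("Protein", "HGNC", "AKT1"): "Protein",
}


def convert_path_to_metapath_sketch(functions, nodes):
    """Replace each node in the path with its function."""
    return [functions[node] for node in nodes]


PATH = [("Gene", "HGNC", "AKT1"), ("RNA", "HGNC", "AKT1"), ("Protein", "HGNC", "AKT1")]
metapath = convert_path_to_metapath_sketch(FUNCTIONS, PATH)
```
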
pybel_tools.selection.get_walks_exhaustive[source]

Gets all walks under a given length starting at a given node

Parameters:
  • graph (networkx.Graph) – A graph
  • node – Starting node
  • length (int) – The length of walks to get
Returns:

A list of paths

Return type:

list[tuple]
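
One way to enumerate walks recursively is sketched below. It interprets length as the exact number of edges in each walk and allows nodes to repeat; both choices are assumptions of this example, as are the names ADJ and get_walks_sketch.

```python
# Hypothetical adjacency mapping: node -> list of successor nodes.
ADJ = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def get_walks_sketch(adj, node, length):
    """Return all walks of exactly `length` edges starting at `node`."""
    if length == 0:
        return [(node,)]
    return [
        (node,) + walk
        for neighbor in adj.get(node, ())
        for walk in get_walks_sketch(adj, neighbor, length - 1)
    ]


walks = get_walks_sketch(ADJ, "A", 2)
```

Walks that reach a dead end before the requested length are dropped, since no continuation exists for them.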

pybel_tools.selection.match_simple_metapath(graph, node, simple_metapath)[source]

Matches a simple metapath starting at the given node

Parameters:
Returns:

An iterable over paths from the node matching the metapath

Return type:

iter[tuple]
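
A possible reading of metapath matching, sketched under assumptions: the metapath is taken to be a list of function names that successive neighbors must have, and paths may not revisit a node. ADJ, FUNCTIONS and match_simple_metapath_sketch are hypothetical and do not reflect the library's internals.

```python
# Hypothetical adjacency mapping and node -> function lookup.
ADJ = {"g1": ["r1"], "r1": ["p1", "g1"], "p1": ["p2"]}
FUNCTIONS = {"g1": "Gene", "r1": "RNA", "p1": "Protein", "p2": "Protein"}


def match_simple_metapath_sketch(adj, functions, node, simple_metapath):
    """Yield paths from `node` whose following nodes match the function sequence."""
    if not simple_metapath:
        yield (node,)
        return
    for neighbor in adj.get(node, ()):
        if functions[neighbor] != simple_metapath[0]:
            continue  # neighbor has the wrong function for this step
        for path in match_simple_metapath_sketch(adj, functions, neighbor, simple_metapath[1:]):
            if node not in path:  # assumption: no node is visited twice
                yield (node,) + path


matches = list(match_simple_metapath_sketch(ADJ, FUNCTIONS, "g1", ["RNA", "Protein"]))
no_matches = list(match_simple_metapath_sketch(ADJ, FUNCTIONS, "g1", ["Protein"]))
```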