Utilities

This module contains functions that are useful throughout PyBEL Tools.

pybel_tools.utils.pairwise(iterable)[source]

Iterate over pairs in list s -> (s0,s1), (s1,s2), (s2, s3), ...
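The pattern described above matches the well-known `pairwise` recipe from the `itertools` documentation. The following is a minimal sketch of that recipe, not necessarily the code used in PyBEL Tools; the name `pairwise_sketch` is hypothetical.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """Yield overlapping pairs (s0, s1), (s1, s2), ... from an iterable."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)


pairs = list(pairwise_sketch(['a', 'b', 'c', 'd']))
print(pairs)
```

An input with fewer than two elements yields no pairs.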

pybel_tools.utils.graph_edge_data_iter(graph)[source]

Iterates over the edge data dictionaries

Parameters:graph (pybel.BELGraph) – A BEL graph
Returns:An iterator over the edge dictionaries in the graph
Return type:iter
pybel_tools.utils.count_defaultdict(dict_of_lists)[source]

Takes a dictionary and applies a counter to each list

Parameters:dict_of_lists (dict or collections.defaultdict) – A dictionary of lists
Returns:A dictionary of {key: Counter(values)}
Return type:dict
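As an illustration of the documented result shape, {key: Counter(values)}, here is a minimal sketch. The function name `count_lists_sketch` and the sample data are hypothetical, and the library's own implementation may differ.

```python
from collections import Counter, defaultdict


def count_lists_sketch(dict_of_lists):
    """Apply a Counter to each list in the dictionary."""
    return {key: Counter(values) for key, values in dict_of_lists.items()}


# Made-up data: citation types grouped by relation
citations = defaultdict(list)
citations['increases'].extend(['PubMed', 'PubMed', 'DOI'])
citations['decreases'].append('PubMed')

counted = count_lists_sketch(citations)
print(counted['increases'])
```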
pybel_tools.utils.get_value_sets(dict_of_iterables)[source]

Takes a dictionary of lists/iterables/counters and gets the sets of the values

Parameters:dict_of_iterables (dict or collections.defaultdict) – A dictionary of lists
Returns:A dictionary of {key: set of values}
Return type:dict
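A minimal sketch of the {key: set of values} result described above, assuming that a Counter contributes its keys. The name `value_sets_sketch` and the sample data are hypothetical.

```python
from collections import Counter


def value_sets_sketch(dict_of_iterables):
    """Collapse each iterable (list, Counter, ...) to the set of its values."""
    return {key: set(values) for key, values in dict_of_iterables.items()}


# Made-up data mixing a list and a Counter
namespaces = {
    'HGNC': ['AKT1', 'AKT1', 'TP53'],
    'GO': Counter({'apoptosis': 3}),
}

value_sets = value_sets_sketch(namespaces)
print(value_sets)
```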
pybel_tools.utils.count_dict_values(dict_of_counters)[source]

Counts the number of elements in each value (can be list, Counter, etc)

Parameters:dict_of_counters (dict or collections.defaultdict) – A dictionary of things whose lengths can be measured (lists, Counters, dicts)
Returns:A Counter with the same keys as the input but the count of the length of the values list/tuple/set/Counter
Return type:collections.Counter
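The description suggests that each key is mapped to the length of its value. A minimal sketch under that assumption follows; `count_lengths_sketch` and the sample data are hypothetical names chosen for illustration.

```python
from collections import Counter


def count_lengths_sketch(dict_of_counters):
    """Map each key to the length of its value (list, set, Counter, dict)."""
    return Counter({key: len(values) for key, values in dict_of_counters.items()})


# Made-up data with values of different container types
data = {
    'HGNC': ['AKT1', 'TP53', 'EGFR'],
    'GO': {'apoptosis'},
    'MESH': Counter({'Neurons': 4, 'Astrocytes': 1}),
}

lengths = count_lengths_sketch(data)
print(lengths.most_common())
```

Note that the length of a Counter is the number of distinct keys, not the sum of its counts.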
pybel_tools.utils.check_has_annotation(data, key)[source]

Checks that ANNOTATION is included in the data dictionary and that the key is also present

Parameters:
  • data (dict) – The data dictionary from a BELGraph’s edge
  • key (str) – An annotation key
Returns:

Whether the annotation key is present in the given data dictionary

Return type:

bool

For example, it might be useful to print all edges that are annotated with ‘Subgraph’:

>>> from pybel import BELGraph
>>> from pybel_tools.utils import check_has_annotation
>>> graph = BELGraph()
>>> ...
>>> for u, v, data in graph.edges_iter(data=True):
...     if not check_has_annotation(data, 'Subgraph'):
...         continue
...     print(u, v, data)
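The check itself can be illustrated on plain dictionaries, without a graph. This is only a sketch: the dictionary key `'annotations'` and the name `has_annotation_sketch` are assumptions made for illustration, and the actual constant used by PyBEL may differ.

```python
ANNOTATIONS_KEY = 'annotations'  # assumed key name, not confirmed by this page


def has_annotation_sketch(data, key, annotations_key=ANNOTATIONS_KEY):
    """Return True if the edge data has an annotations entry containing key."""
    annotations = data.get(annotations_key)
    return annotations is not None and key in annotations


# Made-up edge data dictionaries
annotated = {'relation': 'increases', 'annotations': {'Subgraph': 'Apoptosis'}}
unannotated = {'relation': 'increases'}

print(has_annotation_sketch(annotated, 'Subgraph'))
print(has_annotation_sketch(unannotated, 'Subgraph'))
```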
pybel_tools.utils.set_percentage(x, y)[source]

What percentage of x is contained within y?

Parameters:
  • x (set) – A set
  • y (set) – Another set
Returns:

The percentage of x contained within y

Return type:

float
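One plausible reading of this description is the size of the intersection divided by the size of x. The sketch below assumes that reading, returns a fraction between 0 and 1 (the text does not say whether the library scales the result to 100), and returns 0.0 for an empty x. The name `set_fraction_sketch` is hypothetical.

```python
def set_fraction_sketch(x, y):
    """Return the fraction of the elements of x that also appear in y."""
    x, y = set(x), set(y)
    if not x:
        return 0.0  # assumption: an empty x has no overlap
    return len(x & y) / len(x)


fraction = set_fraction_sketch({1, 2, 3, 4}, {3, 4, 5})
print(fraction)
```

The measure is not symmetric: swapping the arguments divides by the size of the other set.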

pybel_tools.utils.tanimoto_set_similarity(x, y)[source]

Calculates the tanimoto set similarity

Parameters:
  • x (set) – A set
  • y (set) – Another set
Returns:

The similarity between x and y

Return type:

float
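For sets, the Tanimoto (Jaccard) similarity is conventionally the size of the intersection divided by the size of the union. A minimal sketch of that standard definition follows; the handling of two empty sets (returning 0.0) is an assumption, and `tanimoto_sketch` is a hypothetical name.

```python
def tanimoto_sketch(x, y):
    """Return |x & y| / |x | y|, the standard set-based Tanimoto similarity."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(x & y) / len(union)


similarity = tanimoto_sketch({1, 2, 3}, {2, 3, 4})
print(similarity)
```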

pybel_tools.utils.min_tanimoto_set_similarity(x, y)[source]

Calculates the tanimoto set similarity using the minimum size

Parameters:
  • x (set) – A set
  • y (set) – Another set
Returns:

The similarity between x and y

Return type:

float
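The formula is not given here. One plausible interpretation of "using the minimum size" is to divide the intersection size by the size of the smaller set, which is sometimes called the overlap coefficient. The sketch below assumes that interpretation; `min_tanimoto_sketch` is a hypothetical name.

```python
def min_tanimoto_sketch(x, y):
    """Return |x & y| / min(|x|, |y|) (assumed reading of 'minimum size')."""
    x, y = set(x), set(y)
    if not x or not y:
        return 0.0  # assumption: avoid dividing by zero for empty sets
    return len(x & y) / min(len(x), len(y))


score = min_tanimoto_sketch({1, 2}, {1, 2, 3, 4})
print(score)
```

Under this reading, a set that is fully contained in a larger one scores 1.0, which the union-based similarity would not give.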

pybel_tools.utils.calculate_single_tanimoto_set_distances(target, dict_of_sets)[source]

Returns a dictionary of distances keyed by the keys in the given dict. Distances are calculated from the Tanimoto similarity between the target set and each set in the dict.

Parameters:
  • target (set) – A set
  • dict_of_sets (dict) – A dict of {x: set of y}
Returns:

A similarity dictionary based on the set overlap (tanimoto) score between the target set and the sets in dict_of_sets

Return type:

dict
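A minimal sketch of the documented return value: one score per key, comparing the target with each set. It assumes the union-based Tanimoto similarity; the names `tanimoto` and `single_scores_sketch`, and the sample data, are hypothetical.

```python
def tanimoto(x, y):
    """Union-based Tanimoto similarity; 0.0 when both sets are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def single_scores_sketch(target, dict_of_sets):
    """Score the target set against every set in the dictionary."""
    target = set(target)
    return {key: tanimoto(target, set(s)) for key, s in dict_of_sets.items()}


# Made-up data: pathway name -> member identifiers
pathways = {'b': {2, 3, 4}, 'c': {1, 2, 3}, 'd': {7}}
scores = single_scores_sketch({1, 2, 3}, pathways)
print(scores)
```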

pybel_tools.utils.calculate_tanimoto_set_distances(dict_of_sets)[source]

Returns a distance matrix keyed by the keys in the given dict. Distances are calculated from the pairwise Tanimoto similarity of the sets it contains.

Parameters:dict_of_sets (dict) – A dict of {x: set of y}
Returns:A similarity matrix based on the set overlap (tanimoto) score between each x as a dict of dicts
Return type:dict
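The "dict of dicts" shape can be sketched as a symmetric nested dictionary. The example assumes the union-based Tanimoto similarity and sets each diagonal entry to 1.0; both choices, and the names used, are assumptions rather than the library's confirmed behavior.

```python
from itertools import combinations


def tanimoto(x, y):
    """Union-based Tanimoto similarity; 0.0 when both sets are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def tanimoto_matrix_sketch(dict_of_sets):
    """Build a symmetric {x: {y: score}} matrix over all pairs of keys."""
    # assumption: every set is fully similar to itself
    matrix = {key: {key: 1.0} for key in dict_of_sets}
    for a, b in combinations(dict_of_sets, 2):
        score = tanimoto(set(dict_of_sets[a]), set(dict_of_sets[b]))
        matrix[a][b] = matrix[b][a] = score
    return matrix


# Made-up data
matrix = tanimoto_matrix_sketch({'a': {1, 2, 3}, 'b': {2, 3, 4}, 'c': {9}})
print(matrix['a'])
```

Using `combinations` computes each pair once and mirrors the result, which halves the number of set operations compared with a full double loop.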
pybel_tools.utils.calculate_global_tanimoto_set_distances(dict_of_sets)[source]

Calculates an alternative distance matrix based on the following equation:

\[\mathrm{distance}(A, B) = 1 - \frac{\lvert A \cup B \rvert}{\lvert \bigcup_{s \in S} s \rvert}\]
Parameters:dict_of_sets (dict) – A dict of {x: set of y}
Returns:A similarity matrix based on the alternative tanimoto distance as a dict of dicts
Return type:dict
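The equation can be worked through on a small example. The sketch below applies it to every pair of distinct keys, taking S to be the collection of all sets in the dictionary; how the library treats the diagonal is not stated here, so diagonal entries are omitted. The name `global_distances_sketch` and the data are hypothetical.

```python
from itertools import combinations


def global_distances_sketch(dict_of_sets):
    """Compute 1 - |A | B| / |union of all sets| for each pair of keys."""
    universe = set().union(*dict_of_sets.values())
    result = {key: {} for key in dict_of_sets}
    for a, b in combinations(dict_of_sets, 2):
        pair_union = set(dict_of_sets[a]) | set(dict_of_sets[b])
        distance = 1.0 - len(pair_union) / len(universe)
        result[a][b] = result[b][a] = distance
    return result


# Made-up data: the universe is {1, 2, 3, 4}
distances = global_distances_sketch({'a': {1, 2}, 'b': {2}, 'c': {3, 4}})
print(distances)
```

Here 'a' and 'c' together cover the whole universe, so their distance is 0.0, while 'a' and 'b' cover half of it and get 0.5.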
pybel_tools.utils.all_edges_iter(graph, u, v)[source]

Lists all edges between the given nodes

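Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • u (tuple) – The source BEL node
  • v (tuple) – The target BEL node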
Returns:

A list of (node, node, key)

Return type:

list[tuple]

pybel_tools.utils.barh(d, plt, title=None)[source]

A convenience function for plotting a horizontal bar plot from a Counter

pybel_tools.utils.barv(d, plt, title=None, rotation='vertical')[source]

A convenience function for plotting a vertical bar plot from a Counter

pybel_tools.utils.citation_to_tuple(citation)[source]

Converts a citation dictionary to a tuple. Can be useful for sorting and serialization purposes

Parameters:citation (dict) – A citation dictionary
Returns:A citation tuple
Return type:tuple
pybel_tools.utils.is_edge_consistent(graph, u, v)[source]

Checks if all edges between two nodes have the same relation

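Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • u (tuple) – The source BEL node
  • v (tuple) – The target BEL node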
Returns:

If all edges from the source to target node have the same relation

Return type:

bool

pybel_tools.utils.safe_add_edge(graph, u, v, key, attr_dict, **attr)[source]

Adds an edge while preserving negative keys, and paying no respect to positive ones

Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • u (tuple) – The source BEL node
  • v (tuple) – The target BEL node
  • key (int) – The edge key. If less than zero, corresponds to an unqualified edge, else is disregarded
  • attr_dict (dict) – The edge data dictionary
  • attr (dict) – Edge data to assign via keyword arguments
pybel_tools.utils.safe_add_edges(graph, edges)[source]

Adds an iterable of edges to the graph

Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • edges (iter[tuple]) – An iterable of 4-tuples of (source, target, key, data)
pybel_tools.utils.load_differential_gene_expression(data_path, gene_symbol_column='Gene.symbol', logfc_column='logFC')[source]

Quick and dirty loader for differential gene expression data

Parameters:
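  • data_path (str) – The path to the differential gene expression data file
  • gene_symbol_column (str) – The name of the column containing gene symbols
  • logfc_column (str) – The name of the column containing log fold changes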
Returns:

A dictionary of {gene symbol: log fold change}

Return type:

dict

pybel_tools.utils.prepare_c3(data, y_axis_label='y', x_axis_label='x')[source]

Prepares C3 JSON for making a bar chart from a Counter

Parameters:
  • data (Counter or dict or collections.defaultdict) – A dictionary of {str: int} to display as bar chart
  • y_axis_label (str) – The Y axis label
  • x_axis_label (str) – The X axis internal label. Should be left as the default, 'x'
Returns:

A JSON dictionary for making a C3 bar chart

Return type:

dict

pybel_tools.utils.prepare_c3_time_series(data, y_axis_label='y', x_axis_label='x')[source]

Prepares C3 JSON for making a time series

Parameters:
  • data (list) – A list of tuples [(year, count)]
  • y_axis_label (str) – The Y axis label
  • x_axis_label (str) – The X axis internal label. Should be left as the default, 'x'
Returns:

A JSON dictionary for making a C3 time series chart

Return type:

dict

pybel_tools.utils.get_version()[source]

Gets the current PyBEL Tools version

Returns:The current PyBEL Tools version
Return type:str
pybel_tools.utils.build_template_environment(here)[source]

Builds a custom templating environment so Flask apps can get data from lots of different places

Parameters:here (str) – Give this the result of os.path.dirname(os.path.abspath(__file__))
Return type:jinja2.Environment
pybel_tools.utils.build_template_renderer(file)[source]

Builds a template renderer for the calling module. Call this function from your own file and pass it the path of that file.

Parameters:file (str) – The location of the current file. Pass it __file__ like in the example below.
>>> render_template = build_template_renderer(__file__)
pybel_tools.utils.calculate_betweenness_centality(graph, k=200)[source]

Calculates the betweenness centrality over nodes in the graph. Tries to do it with a certain number of samples, but then tries a complete approach if it fails.

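Parameters:
  • graph (pybel.BELGraph) – A BEL Graph
  • k (int) – The number of samples to use for the first attempt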
Return type:

collections.Counter[tuple,float]

pybel_tools.utils.grouper(n, iterable, fillvalue=None)[source]

grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
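This signature and example correspond to the `grouper` recipe from older versions of the `itertools` documentation. A minimal sketch of that recipe is shown below; `grouper_sketch` is a hypothetical name and the library's code may differ.

```python
from itertools import zip_longest


def grouper_sketch(n, iterable, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    # The same iterator repeated n times, so zip_longest pulls n items per chunk
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


groups = [''.join(group) for group in grouper_sketch(3, 'ABCDEFG', 'x')]
print(groups)
```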

pybel_tools.utils.hash_str_to_int(hash_str, length=16)[source]

Hashes a tuple to the given number of digits

Parameters:
  • hash_str (str) – Basically anything that can be pickled deterministically
  • length (int) – The length of the hash to keep
Return type:

int
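The hashing algorithm is not specified here. As an illustration only, the sketch below assumes a SHA-512 digest reduced modulo 10 ** length, which yields an integer of at most `length` decimal digits. The function name and the sample string are hypothetical.

```python
import hashlib


def hash_to_int_sketch(text, length=16):
    """Hash a string to a non-negative integer below 10 ** length."""
    # assumption: SHA-512 is used; any stable digest would work the same way
    digest = hashlib.sha512(text.encode('utf-8')).hexdigest()
    return int(digest, 16) % (10 ** length)


value = hash_to_int_sketch('p(HGNC:AKT1)')
print(value)
```

Unlike the built-in `hash`, a `hashlib` digest is stable across interpreter runs, which matters when the result is stored or compared later.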

pybel_tools.utils.get_circulations(t)[source]

Iterate over all possible circulations of an ordered collection (tuple or list)

Parameters:t (tuple or list) – An ordered collection
Return type:iter
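Reading "circulations" as cyclic rotations, each one moves a leading slice of the collection to the end. The sketch below assumes that reading and a particular iteration order, neither of which is confirmed by this page; `circulations_sketch` is a hypothetical name.

```python
def circulations_sketch(t):
    """Yield every cyclic rotation of an ordered collection."""
    for i in range(len(t)):
        # Slicing keeps the input type: tuples stay tuples, lists stay lists
        yield t[i:] + t[:i]


rotations = list(circulations_sketch((1, 2, 3)))
print(rotations)
```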
pybel_tools.utils.canonical_circulation(t, key=None)[source]

Gets a canonical representation of the ordered collection by finding its minimum circulation with the given sort key

Parameters:
  • t (tuple or list) – An ordered collection
  • key – A function used as the sort key
Returns:

The