Document Utilities
Creating Definition Documents
pybel_tools.definition_utils.get_merged_namespace_names(locations, check_keywords=True)
Loads many namespaces and combines their names.
Returns: A dictionary of {names: labels}
Example Usage
>>> from pybel.resources.definitions import write_namespace
>>> from pybel_tools.definition_utils import export_namespace, get_merged_namespace_names
>>> graph = ...
>>> original_ns_url = ...
>>> export_namespace(graph, 'MBS')  # Outputs in current directory to MBS.belns
>>> value_dict = get_merged_namespace_names([original_ns_url, 'MBS.belns'])
>>> with open('merged_namespace.belns', 'w') as f:
...     write_namespace('MyBrokenNamespace', 'MBS', 'Other', 'Charles Hoyt', 'PyBEL Citation', value_dict, file=f)
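The combining step can be pictured with plain dictionaries. The following is a minimal sketch, not the pybel_tools implementation: `merge_name_dicts` and its `keywords` argument are hypothetical, the namespaces are assumed to be already loaded as {name: label} dictionaries, and the labels are placeholder strings.

```python
def merge_name_dicts(name_dicts, keywords=None, check_keywords=True):
    """Combine several {name: label} dictionaries into one.

    Hypothetical helper: it only mimics the idea of merging namespaces
    that were already loaded, and refuses to merge if their keywords differ.
    """
    if check_keywords and keywords is not None and len(set(keywords)) > 1:
        raise ValueError('namespace keywords differ: {}'.format(sorted(set(keywords))))

    merged = {}
    for name_dict in name_dicts:
        # Entries from later dictionaries overwrite earlier ones with the same name.
        merged.update(name_dict)
    return merged


# Placeholder data standing in for two loaded namespaces.
original = {'AKT1': 'GRP', 'EGFR': 'GRP'}
exported = {'EGFR': 'GRP', 'MYC': 'GRP'}

merged = merge_name_dicts([original, exported], keywords=['MBS', 'MBS'])
print(sorted(merged))
```

Names present in both inputs appear once in the result; with mismatched keywords and `check_keywords=True`, this sketch raises a `ValueError` instead of merging.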
pybel_tools.definition_utils.merge_namespaces(input_locations, output_path, namespace_name, namespace_keyword, namespace_domain, author_name, citation_name, namespace_description=None, namespace_species=None, namespace_version=None, namespace_query_url=None, namespace_created=None, author_contact=None, author_copyright=None, citation_description=None, citation_url=None, citation_version=None, citation_date=None, case_sensitive=True, delimiter='|', cacheable=True, functions=None, value_prefix='', sort_key=None, check_keywords=True)
Merges namespaces from multiple locations into one.
Parameters:
- input_locations (iter) – An iterable of URLs or file paths pointing to BEL namespaces
- output_path (str) – The path to the file to write the merged namespace
- namespace_name (str) – The namespace name
- namespace_keyword (str) – Preferred BEL Keyword, maximum length of 8
- namespace_domain (str) – One of: pybel.constants.NAMESPACE_DOMAIN_BIOPROCESS, pybel.constants.NAMESPACE_DOMAIN_CHEMICAL, pybel.constants.NAMESPACE_DOMAIN_GENE, or pybel.constants.NAMESPACE_DOMAIN_OTHER
- author_name (str) – The namespace’s authors
- citation_name (str) – The name of the citation
- namespace_query_url (str) – HTTP URL to query for details on namespace values (must be valid URL)
- namespace_description (str) – Namespace description
- namespace_species (str) – Comma-separated list of species taxonomy IDs
- namespace_version (str) – Namespace version
- namespace_created (str) – Namespace public timestamp, ISO 8601 datetime
- author_contact (str) – Namespace author’s contact info/email address
- author_copyright (str) – Namespace’s copyright/license information
- citation_description (str) – Citation description
- citation_url (str) – URL to more citation information
- citation_version (str) – Citation version
- citation_date (str) – Citation publication timestamp, ISO 8601 date
- case_sensitive (bool) – Should this config file be interpreted as case-sensitive?
- delimiter (str) – The delimiter between names and labels in this config file
- cacheable (bool) – Should this config file be cached?
- functions (iterable of characters) – The encoding for the elements in this namespace
- value_prefix (str) – A prefix for each name
- sort_key – A function used as the key when sorting the values with sorted()
- check_keywords (bool) – Should all the keywords be the same? Defaults to True
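How `delimiter`, `functions`, `value_prefix`, and `sort_key` might interact when the merged values are written out can be sketched as follows. This is an illustration under assumptions, not the library's code: `format_values` is a hypothetical helper, and the `name<delimiter>encoding` line layout is assumed from the parameter descriptions above.

```python
def format_values(names, functions, delimiter='|', value_prefix='', sort_key=None):
    """Build one 'prefix + name + delimiter + encoding' line per name.

    Hypothetical helper: `functions` is an iterable of single characters
    that is joined into the encoding string placed after the delimiter.
    """
    encoding = ''.join(functions)
    return [
        '{}{}{}{}'.format(value_prefix, name, delimiter, encoding)
        # sorted() accepts key=None, which falls back to the default ordering.
        for name in sorted(names, key=sort_key)
    ]


names = ['beta', 'Alpha', 'Gamma']

# Case-insensitive ordering via a sort key.
lines = format_values(names, functions='GRP', value_prefix='x_', sort_key=str.lower)
print(lines)  # ['x_Alpha|GRP', 'x_beta|GRP', 'x_Gamma|GRP']

# Default ordering places uppercase names before lowercase ones.
default_lines = format_values(names, functions=['G', 'R', 'P'])
print(default_lines)  # ['Alpha|GRP', 'Gamma|GRP', 'beta|GRP']
```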
pybel_tools.definition_utils.export_namespace(graph, namespace, directory=None, cacheable=False)
Exports all names and missing names from the given namespace to its own BEL namespace file in the given directory.
Could be useful during quick and dirty curation, where planned namespace building is not a priority.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- namespace (str) – The namespace to process
- directory (str) – The path to the directory in which to output the namespace. Defaults to the current working directory returned by os.getcwd()
- cacheable (bool) – Should the namespace be cacheable? Defaults to False because, in general, this operation will probably be used for evil, and users won’t want to reload their entire cache after each iteration of curation.
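The `directory` default described above can be sketched as a small path computation. `namespace_output_path` is a hypothetical helper, and the `<namespace>.belns` file name is an assumption taken from the usage example earlier on this page (`MBS` is written to `MBS.belns`).

```python
import os


def namespace_output_path(namespace, directory=None):
    """Return the path a namespace file would be written to.

    Hypothetical helper: falls back to the current working directory
    when no directory is given.
    """
    if directory is None:
        directory = os.getcwd()
    return os.path.join(directory, '{}.belns'.format(namespace))


# No directory given: the file lands in the current working directory.
path = namespace_output_path('MBS')
print(path)

# An explicit directory is used as-is.
print(namespace_output_path('MBS', directory='curation'))
```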
pybel_tools.definition_utils.export_namespaces(graph, namespaces, directory=None, cacheable=False)
Thinly wraps export_namespace() for an iterable of namespaces.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- namespaces (iter[str]) – An iterable of strings for the namespaces to process
- directory (str) – The path to the directory in which to output the namespaces. Defaults to the current working directory returned by os.getcwd()
- cacheable (bool) – Should the namespaces be cacheable? Defaults to False because, in general, this operation will probably be used for evil, and users won’t want to reload their entire cache after each iteration of curation.
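A thin wrapper of this kind usually amounts to a loop that forwards the shared options. The sketch below uses stand-in functions (`export_one` and `export_many` are hypothetical names, and `export_one` only records its arguments instead of writing a file), so it shows the calling pattern rather than the library's behavior.

```python
def export_one(graph, namespace, directory=None, cacheable=False):
    """Stand-in for a single-namespace export; it only records the request."""
    return (namespace, directory, cacheable)


def export_many(graph, namespaces, directory=None, cacheable=False):
    """Call export_one() once per namespace, forwarding the shared options."""
    return [
        export_one(graph, namespace, directory=directory, cacheable=cacheable)
        for namespace in namespaces
    ]


# The graph is irrelevant to this stand-in, so None is passed.
calls = export_many(None, ['MBS', 'HGNC'], directory='curation')
print(calls)  # [('MBS', 'curation', False), ('HGNC', 'curation', False)]
```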
Creating Knowledge Documents
pybel_tools.document_utils.write_boilerplate(name, version=None, description=None, authors=None, contact=None, copyright=None, licenses=None, disclaimer=None, namespace_url=None, namespace_patterns=None, annotation_url=None, annotation_patterns=None, annotation_list=None, pmids=None, entrez_ids=None, file=None)
Writes a boilerplate BEL document with standard document metadata and definitions.
Parameters:
- name (str) – The unique name for this BEL document
- contact (str) – The email address of the maintainer
- description (str) – A description of the contents of this document
- authors (str) – The authors of this document
- version (str) – The version. Defaults to the current date in the format YYYYMMDD.
- copyright (str) – Copyright information about this document
- licenses (str) – The license applied to this document
- disclaimer (str) – The disclaimer for this document
- namespace_url (dict[str,str]) – An optional dictionary of {str name: str URL} of namespaces
- namespace_patterns (dict[str,str]) – An optional dictionary of {str name: str regex} of regex namespaces
- annotation_url (dict[str,str]) – An optional dictionary of {str name: str URL} of annotations
- annotation_patterns (dict[str,str]) – An optional dictionary of {str name: str regex} of regex annotations
- annotation_list (dict[str,set[str]]) – An optional dictionary of {str name: set of names} of list annotations
- pmids (iter[str] or iter[int]) – A list of PubMed identifiers to auto-populate with citation and abstract
- entrez_ids (iter[str] or iter[int]) – A list of Entrez identifiers to auto-populate the gene summary as evidence
- file (file) – A writable file or file-like object. If None, defaults to sys.stdout
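The two defaults documented above (a YYYYMMDD version and output to sys.stdout) can be illustrated with a small stand-alone writer. `sketch_boilerplate` is a hypothetical function covering only a few of the parameters, the URL is a placeholder, and the exact lines that write_boilerplate emits may differ; the `SET DOCUMENT` and `DEFINE ... AS URL / AS LIST` statements follow standard BEL script syntax.

```python
import io
import sys
import time


def sketch_boilerplate(name, version=None, namespace_url=None, annotation_list=None, file=None):
    """Write a minimal BEL document header (hypothetical, simplified)."""
    file = sys.stdout if file is None else file
    # Default version: the current date formatted as YYYYMMDD.
    version = time.strftime('%Y%m%d') if version is None else version

    print('SET DOCUMENT Name = "{}"'.format(name), file=file)
    print('SET DOCUMENT Version = "{}"'.format(version), file=file)

    for keyword, url in sorted((namespace_url or {}).items()):
        print('DEFINE NAMESPACE {} AS URL "{}"'.format(keyword, url), file=file)

    for keyword, values in sorted((annotation_list or {}).items()):
        quoted = ', '.join('"{}"'.format(value) for value in sorted(values))
        print('DEFINE ANNOTATION {} AS LIST {{{}}}'.format(keyword, quoted), file=file)


# Write into an in-memory buffer instead of sys.stdout.
buffer = io.StringIO()
sketch_boilerplate(
    'Example Document',
    version='1.0.0',
    namespace_url={'MBS': 'https://example.org/MBS.belns'},  # placeholder URL
    annotation_list={'Confidence': {'High', 'Low'}},
    file=buffer,
)
output = buffer.getvalue()
print(output)
```

Passing `file=None` sends the same lines to sys.stdout, and omitting `version` stamps the document with the date on which it was generated.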