vfb_connect package

Subpackages

Submodules

vfb_connect.cross_server_tools module

class vfb_connect.cross_server_tools.VfbConnect(neo_endpoint='http://pdb.v4.virtualflybrain.org', neo_credentials=('neo4j', 'vfb'), owlery_endpoint='http://owl.virtualflybrain.org/kbs/vfb/', solr_endpoint='http://solr.virtualflybrain.org/solr/ontology/', vfb_launch=False)[source]

Bases: object

API wrapper class for Virtual Fly Brain (VFB) connectivity.

This class wraps connections to the basal API endpoints (OWL, Neo4j) and provides higher-level methods to combine semantic queries that range across VFB content with Neo4j queries. It returns detailed metadata about anatomical classes and individuals that fulfill these queries.

Parameters:
  • neo_endpoint – Specify a Neo4j REST endpoint.

  • neo_credentials – Specify credentials for the Neo4j REST endpoint.

  • owlery_endpoint – Specify OWLery server REST endpoint.

  • lookup_prefixes – A list of ID prefixes to use for rolling name:ID lookups.

Variables:
  • nc – Provides direct access to Neo4j via the Neo4jConnect instance.

  • neo_query_wrapper – Provides enriched query capabilities using the QueryWrapper instance.

  • oc – Provides direct access to OWL queries via the OWLeryConnect instance.

  • lookup – A lookup table for resolving names to IDs.

  • vfb_base – Base URL for Virtual Fly Brain links.
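A minimal sketch of constructing the client with the defaults from the signature above. The helper names `client_kwargs` and `make_client` and the `DEFAULTS` dict are illustrative and not part of the library; the import is deferred so the snippet can be loaded without vfb_connect installed or a network connection.

```python
# Defaults copied from the documented constructor signature.
DEFAULTS = {
    "neo_endpoint": "http://pdb.v4.virtualflybrain.org",
    "neo_credentials": ("neo4j", "vfb"),
    "owlery_endpoint": "http://owl.virtualflybrain.org/kbs/vfb/",
    "solr_endpoint": "http://solr.virtualflybrain.org/solr/ontology/",
    "vfb_launch": False,
}


def client_kwargs(**overrides):
    """Merge caller overrides into the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected keyword(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def make_client(**overrides):
    """Build a VfbConnect instance; needs vfb_connect installed and network access."""
    from vfb_connect.cross_server_tools import VfbConnect  # deferred import

    return VfbConnect(**client_kwargs(**overrides))
```

Calling `make_client()` with no arguments should be equivalent to `VfbConnect()`; passing, say, `neo_endpoint=...` overrides only that one setting.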

cypher_query(query, return_dataframe=True, verbose=False)[source]

Run a Cypher query.

Parameters:
  • query – The Cypher query to run.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of results.

Return type:

pandas.DataFrame or list of dicts
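Many methods in this class return either a pandas DataFrame or a list of dicts depending on return_dataframe. The sketch below shows one way to normalise both forms into a list of dicts. The helper name `as_records` is hypothetical; the only library behaviour it relies on is `pandas.DataFrame.to_dict(orient="records")`, and the `False` check covers the methods documented as returning False when no data is found.

```python
def as_records(result):
    """Return query results as a list of dicts, whichever form was returned."""
    if result is None or result is False:
        return []
    # pandas.DataFrame exposes to_dict(orient="records"); duck typing means
    # this helper does not need to import pandas itself.
    if hasattr(result, "to_dict"):
        return result.to_dict(orient="records")
    return list(result)
```

With a live client this could be used as `as_records(vfb.cypher_query(my_query))`, where `vfb` and `my_query` are supplied by the caller.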

generate_lab_colors(num_colors, min_distance=100, verbose=False)[source]

Generate a list of Lab colors and convert them to RGB tuples.

Parameters:
  • num_colors – The number of colors to generate.

  • min_distance – Minimum perceptual distance between colors.

Returns:

A list of RGB tuples.
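The idea of picking colours that are at least a minimum distance apart in Lab space and then converting them to RGB can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it uses rejection sampling, plain Euclidean distance in Lab (CIE76), a D65 white point, and arbitrary sampling ranges. The function names are hypothetical.

```python
import math
import random


def lab_to_rgb(L, a, b):
    """Convert a CIE Lab colour (D65 white point) to an 8-bit sRGB tuple."""
    def f_inv(t):
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    fy = (L + 16) / 116
    x = 0.95047 * f_inv(fy + a / 500)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fy - b / 200)

    # XYZ to linear sRGB.
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values
        c = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
        return round(c * 255)

    return tuple(encode(c) for c in linear)


def sample_lab_colors(num_colors, min_distance, seed=0, max_attempts=100_000):
    """Rejection-sample Lab colours that are at least min_distance apart."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(max_attempts):
        if len(chosen) == num_colors:
            break
        candidate = (rng.uniform(20, 90), rng.uniform(-80, 80), rng.uniform(-80, 80))
        if all(math.dist(candidate, other) >= min_distance for other in chosen):
            chosen.append(candidate)
    if len(chosen) < num_colors:
        raise ValueError("could not place all colours; lower min_distance")
    return chosen


rgb_colors = [lab_to_rgb(*lab) for lab in sample_lab_colors(5, min_distance=40)]
```

Larger min_distance values give more distinct colours but limit how many can be placed before the sampler gives up.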

get_TermInfo(short_forms, summary=True, cache=True, return_dataframe=True, query_by_label=True, limit=None, verbose=False)[source]

Generate a JSON report or summary for terms specified by a list of VFB IDs.

This method retrieves term information for a list of specified VFB IDs (short_forms). It can return either full metadata or a summary of the terms. The results can be returned as a pandas DataFrame if return_dataframe is set to True.

Parameters:
  • short_forms (iter) – An iterable (e.g., a list) of VFB IDs (short_forms).

  • summary – Optional. If True, returns a summary report instead of full metadata. Default is True.

  • cache – Optional. If True, attempts to retrieve cached results before querying. Default is True.

  • return_dataframe – Optional. If True, returns the results as a pandas DataFrame. Default is True.

  • query_by_label – Optional. If True, it allows labels, symbols or synonyms as well as short_forms. Default is True.

Returns:

A list of term metadata as VFB_json or summary_report_json, or a pandas DataFrame if return_dataframe is True.

Return type:

list of dicts or pandas.DataFrame

get_cache_file_path()[source]

Determine a safe place to save the pickle file in the same directory as the module.

get_connected_neurons_by_type(upstream_type=None, downstream_type=None, weight=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all synaptic connections between individual neurons of upstream_type and downstream_type where synapse count >= weight.

Parameters:
  • upstream_type – The upstream neuron type (e.g., ‘GABAergic neuron’).

  • downstream_type – The downstream neuron type (e.g., ‘Descending neuron’).

  • weight – Optional. Limit returned connections to those with a synapse count >= weight.

  • query_by_label – Optional. Specify neuron type by label if True (default) or by short_form ID if False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of synaptic connections between specified neuron types.

Return type:

pandas.DataFrame or list of dicts

get_datasets(summary=True, return_dataframe=True)[source]

Get all datasets in the database.

Returns:

List of datasets in the database.

Return type:

list

get_dbs(include_symbols=True)[source]

Get all external databases in the database.

Returns:

List of external databases in the database.

Return type:

list

get_gene_function_filters()[source]

Get a list of all gene function labels.

Returns:

List of unique gene function labels in alphabetical order.

Return type:

list

get_images(short_forms, template=None, image_folder=None, image_type='swc', stomp=False)[source]

Get images for a list of individuals.

Parameters:
  • short_forms (iter) – List of short_form IDs for individuals.

  • template – Optional. Template name.

  • image_folder – Optional. Folder to save image files & manifest to.

  • image_type – Optional. Image type (file extension).

  • stomp – Optional. Overwrite image_folder if already exists.

Returns:

A manifest of the downloaded images as a pandas DataFrame.

get_images_by_filename(filenames, dataset=None, summary=True, return_dataframe=True)[source]

Get images by filename.

Parameters:
  • filenames (iter) – List of filenames.

  • dataset – Optional. Dataset name.

Returns:

List of images.

Return type:

list

get_images_by_type(class_expression, template, image_folder, image_type='swc', query_by_label=True, direct=False, stomp=False)[source]

Download all images of individuals specified by a class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name or symbol of a type of neuron (MBON01).

  • template – The template name.

  • image_folder – The folder to save image files and manifest to.

  • image_type – The image file extension (e.g., ‘swc’).

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct instances if True. Default False.

  • stomp – Optional. Overwrite the image folder if it already exists. Default False.

Returns:

A manifest of downloaded images as a pandas DataFrame.

Return type:

pandas.DataFrame

get_instances(class_expression, query_by_label=True, summary=True, return_dataframe=True, limit=None, return_id_only=False, verbose=False)[source]

Generate JSON report of all instances of a given class expression.

Instances are specific examples of a type/class, e.g., a neuron of type DA1 adPN from the FAFB_catmaid database.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_instances_by_dataset(dataset, query_by_label=True, summary=True, return_dataframe=True, return_id_only=False)[source]

Get JSON report of all individuals in a specified dataset.

Parameters:
  • dataset – The dataset ID.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_neurons_downstream_of(neuron, weight, classification=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all neurons downstream of a specified neuron.

Parameters:
  • neuron – The name or ID of a particular neuron (dependent on query_by_label setting).

  • weight – Limit returned neurons to those connected by >= weight synapses.

  • classification – Optional. Restrict downstream neurons by classification.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of neurons downstream of the specified neuron.

Return type:

pandas.DataFrame or list of dicts

get_neurons_upstream_of(neuron, weight, classification=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all neurons upstream of a specified neuron.

Parameters:
  • neuron – The name or ID of a particular neuron (dependent on query_by_label setting).

  • weight – Limit returned neurons to those connected by >= weight synapses.

  • classification – Optional. Restrict upstream neurons by classification.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of neurons upstream of the specified neuron.

Return type:

pandas.DataFrame or list of dicts
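The upstream and downstream queries take the same arguments, so they can be combined in one helper. The sketch below only demonstrates the calling convention from the signatures above. `connected_partners` and `_StubClient` are hypothetical, and the stub stands in for a real VfbConnect instance so the snippet runs offline; the neuron label is a placeholder.

```python
def connected_partners(vfb, neuron, weight=10, classification=None):
    """Collect upstream and downstream partners of one neuron as lists of dicts."""
    common = dict(
        weight=weight,
        classification=classification,
        query_by_label=True,
        return_dataframe=False,  # ask for list-of-dicts output
    )
    return {
        "upstream": vfb.get_neurons_upstream_of(neuron, **common),
        "downstream": vfb.get_neurons_downstream_of(neuron, **common),
    }


class _StubClient:
    """Hypothetical offline stand-in for VfbConnect; it only records calls."""

    def __init__(self):
        self.calls = []

    def get_neurons_upstream_of(self, neuron, weight, **kwargs):
        self.calls.append(("upstream", neuron, weight, kwargs))
        return []

    def get_neurons_downstream_of(self, neuron, weight, **kwargs):
        self.calls.append(("downstream", neuron, weight, kwargs))
        return []


stub = _StubClient()
partners = connected_partners(stub, "example neuron label", weight=20)
```

With a real client, `partners["upstream"]` and `partners["downstream"]` would hold the records returned by the two queries.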

get_nt_predictions(term, verbose=False)[source]

Find predicted neurotransmitter(s) for a single neuron or all neurons of a given type. If nothing is found, an empty DataFrame is returned.

Parameters:

term – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt), or the ID or name of an individual neuron in VFB.

Returns:

A DataFrame.

Return type:

pandas.DataFrame

get_nt_receptors_in_downstream_neurons(upstream_type, downstream_type='neuron', weight=0, use_predictions=True, return_dataframe=True, verbose=False)[source]

Get neurotransmitter receptors in downstream neurons of a given neuron type.

Returns a DataFrame of neurotransmitter receptors in downstream neurons of a specified neuron type. If no data is found, returns False. If use_predictions, an extra column (‘nt_only_predicted’) will indicate whether each receptor is for a neurotransmitter that is only predicted to be released by the upstream type.

Parameters:
  • upstream_type – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • downstream_type – Optional. The type of downstream neurons to search for. Default is ‘neuron’.

  • weight – Optional. Limit returned neurons to those connected by >= weight synapses. Default is 0.

  • use_predictions – Optional. Use predicted neurotransmitters (from instances) in addition to known neurotransmitters. Default is True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with neurotransmitter receptors in downstream neurons of the specified neuron type.

Return type:

pandas.DataFrame or list of dicts

get_potential_drivers(neuron, similarity_score='NBLAST_score', query_by_label=True, return_dataframe=True, verbose=False)[source]

Get JSON report of driver expression likely to contain the input neuron.

Parameters:
  • neuron – The neuron to find similar drivers for.

  • similarity_score – Optional. Specify the similarity score to use (e.g., ‘NBLAST_score’, ‘neuronbridge_score’). Default ‘NBLAST_score’.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of potential drivers (id, label, tags) + similarity score.

Return type:

pandas.DataFrame or list of dicts

get_scRNAseq_expression(id, query_by_label=True, return_id_only=False, return_dataframe=True, verbose=False)[source]

Get scRNAseq expression data for a given anatomy term.

Returns a DataFrame of scRNAseq clusters of cells that are shown to express the current anatomy term. If no data is found, returns False.

Parameters:
  • id – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the cluster IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with scRNAseq expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts

get_scRNAseq_gene_expression(cluster, query_by_label=True, return_id_only=False, return_dataframe=True, verbose=False)[source]

Get gene expression data for a given scRNAseq cluster.

Returns a DataFrame of gene expression data for a cluster of cells annotated as the specified cluster. If no data is found, returns False.

Parameters:
  • cluster – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the gene IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with gene expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts

get_similar_neurons(neuron, similarity_score='NBLAST_score', query_by_label=True, return_dataframe=True, verbose=False)[source]

Get JSON report of individual neurons similar to the input neuron.

Parameters:
  • neuron – The neuron to find similar neurons to.

  • similarity_score – Optional. Specify the similarity score to use (e.g., ‘NBLAST_score’). Default ‘NBLAST_score’.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of similar neurons (id, label, tags, source (db) id, accession_in_source) + similarity score.

Return type:

pandas.DataFrame or list of dicts

get_subclasses(class_expression, query_by_label=True, direct=False, summary=True, return_dataframe=True, verbose=False)[source]

Generate JSON report of all subclasses of a given class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct subclasses if True. Default False.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_superclasses(class_expression, query_by_label=True, direct=False, summary=True, return_dataframe=True)[source]

Generate JSON report of all superclasses of a given class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct superclasses if True. Default False.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_templates(summary=True, return_dataframe=True)[source]

Get all templates in the database.

Returns:

List of templates in the database.

Return type:

list

get_terms_by_region(region, cells_only=False, verbose=False, query_by_label=True, summary=True, return_dataframe=True)[source]

Generate TermInfo reports for all terms relevant to annotating a specific region, optionally limited to cells.

Parameters:
  • region – The name (rdfs:label) of the brain region (or CURIE style ID if query_by_label is False).

  • cells_only – Optional. Limits query to cell types if True. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

  • query_by_label – Optional. Query using region labels if True, or IDs if False. Default True.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_terms_by_xref(xrefs, db='', summary=True, return_dataframe=True)[source]

Retrieve terms by cross-reference (xref) identifiers.

This method takes a list of external cross-reference identifiers and returns the corresponding terms from the database. The terms can be returned either as full metadata or as summaries. Additionally, the results can be returned as a pandas DataFrame if return_dataframe is set to True.

Parameters:
  • xrefs (iter) – An iterable (e.g., a list) of cross-reference identifiers (xrefs).

  • db – Optional. The name of the external database to filter the results by. Default is an empty string, which means no filtering.

  • summary – Optional. If True, returns summary reports instead of full metadata. Default is True.

  • return_dataframe – Optional. If True and summary is also True, returns the results as a pandas DataFrame. Default is True.

Returns:

A list of term metadata as nested Python data structures (VFB_json or summary_report_json), or a pandas DataFrame if return_dataframe is True and summary is True.

Return type:

list of dicts or pandas.DataFrame

get_transcriptomic_profile(cell_type, gene_type=False, no_subtypes=False, query_by_label=True, return_dataframe=True)[source]

Get gene expression data for a given cell type.

Returns a DataFrame of gene expression data for clusters of cells annotated as the specified cell type (or subtypes). Optionally restricts to a gene type, which can be retrieved using get_gene_function_filters. If no data is found, returns False.

Parameters:
  • cell_type – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • gene_type – Optional. A gene function label retrieved using get_gene_function_filters.

  • no_subtypes – Optional. If True, only clusters for the specified cell_type will be returned and not subtypes. Default False.

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with gene expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts

Raises:

KeyError – If the cell_type or gene_type is invalid.

Generate a link to Virtual Fly Brain (VFB) that loads all available images of neurons on the specified template.

Parameters:
  • short_forms (iter) – A list (or other iterable) of VFB short_form IDs for individuals with images.

  • template – The name (label) of a template.

Returns:

A URL for viewing images and metadata for specified individuals on VFB.

Return type:

str

Raises:

ValueError – If the template name is not recognized.

lookup_id(key, return_curie=False, allow_subsitutions=True, subsitution_stages=['adult', 'larval', 'pupal'], verbose=False)[source]

Lookup the ID for a given key (label or symbol) using the internal lookup table.

Parameters:
  • key – The label, symbol, synonym, or potential ID to look up.

  • allow_subsitutions – Optional. If True, allow for case-insensitive and character-insensitive lookups. Default True.

  • subsitution_stages – Optional. A list of prefixes to try for substitutions. Default [‘adult’, ‘larval’, ‘pupal’].

  • return_curie – Optional. If True, return the ID in CURIE (Compact URI) format. Default False.

Returns:

The ID associated with the key, or the key itself if it is already a valid ID. None is returned if the key is not found.

Return type:

str
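The fallback behaviour described above (exact match, then a looser match, then retrying with stage prefixes) can be illustrated with a small stand-alone function. This is only a sketch of the idea: the matching order, the normalisation rule, and the names `lookup_id_sketch` and `_normalise` are assumptions, and the table entries are made-up placeholders rather than real VFB labels or IDs.

```python
import re


def _normalise(text):
    """Lower-case and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def lookup_id_sketch(key, table, stages=("adult", "larval", "pupal")):
    """Resolve key against a {label: id} table with progressively looser matching."""
    if key in table:                      # 1. exact label match
        return table[key]
    if key in table.values():             # 2. key is already an ID
        return key
    normalised = {_normalise(label): id_ for label, id_ in table.items()}
    if _normalise(key) in normalised:     # 3. case/punctuation-insensitive match
        return normalised[_normalise(key)]
    for stage in stages:                  # 4. retry with a stage prefix
        candidate = _normalise(f"{stage} {key}")
        if candidate in normalised:
            return normalised[candidate]
    return None


# Placeholder table: the labels and IDs are made up for illustration.
table = {"adult example region": "EX_0000001", "Example-Neuron 1": "EX_0000002"}
```

Here `lookup_id_sketch("Example Region", table)` resolves through the 'adult' prefix, while an unknown key yields None, mirroring the documented return value.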

lookup_name(ids)[source]

Lookup the name for a given ID using the internal lookup table.

Parameters:

ids – A single ID or list of IDs to look up.

Returns:

The name associated with the ID.

Return type:

str

owl_instances(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get instances of a given term.

Returns a VFBTerms of instances of the specified term. If no data is found, returns False.

Parameters:
  • query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the instance IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns VFBTerms. Default False.

  • limit – Optional. Limit the number of instances returned. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

A VFBTerms or DataFrame with instances of the specified query.

Return type:

Depending on the options: a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

owl_subclasses(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get subclasses of a given term.

Returns a VFBTerms of subclasses of the specified term. If no data is found, returns False.

Parameters:
  • query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the subclass IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns VFBTerms. Default False.

  • limit – Optional. Limit the number of subclasses returned. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

A VFBTerms or DataFrame with subclasses of the specified term.

Return type:

Depending on the options: a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

owl_superclasses(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get superclasses of a given term.

Returns a VFBTerms of superclasses of the specified term. If no data is found, returns False.

Parameters:

query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

Return type:

Depending on the options: a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

reload_lookup_cache(verbose=False)[source]

Clear the lookup cache file.

search(query, return_dataframe=True, verbose=False, filter_by_has_tag=None, filter_by_not_tag=['Deprecated'])[source]

Search for terms in the database using a complex Solr query configuration.

Parameters:
  • query – The search query.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

  • verbose – Optional. If True, prints the query for debugging purposes.

  • filter_by_has_tag – Optional. List of tags to boost if present. These will be upvoted in the query.

  • filter_by_not_tag – Optional. List of tags to downvote if present. These will be downvoted in the query.

Returns:

A DataFrame or list of results.

Return type:

pandas.DataFrame or list of dicts

setNeoEndpoint(endpoint, usr, pwd)[source]

Set the Neo4j endpoint and credentials.

setOwleryEndpoint(endpoint)[source]

Set the OWLery endpoint.

term(term, verbose=False)[source]

Get a VFBTerm object for a given term id, name, symbol or synonym.

Parameters:

term – The term to look up.

Returns:

a VFBTerm object

Return type:

dict

terms(terms, verbose=False)[source]

Get a list of VFBTerm objects for a given list of term IDs, names, symbols or synonyms.

Parameters:

terms – A list of terms to look up.

Returns:

a VFBTerms list of VFBTerm objects

Return type:

VFBTerms

vfb_id_2_xrefs(vfb_id, db='', id_type='', reverse_return=False)[source]

Map a list of short_form IDs in VFB to external DB IDs.

Parameters:
  • vfb_id (iter) – An iterable (e.g. a list) of VFB short_form IDs.

  • db – Optional. Specify the VFB ID (short_form) of an external DB to map to (use get_dbs to find options).

  • id_type – Optional. Specify an external id_type.

  • reverse_return – Optional. Boolean that selects the shape of the returned dict (see Returns).

Returns:

If reverse_return is False:

dict { VFB_id : [{ db : <db>, acc : <acc> }] }

If reverse_return is True:

dict { acc : [{ db : <db>, vfb_id : <VFB_id> }] }
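The two return shapes carry the same information keyed in opposite directions. The sketch below converts the reverse_return=False shape into the reverse_return=True shape, assuming the keys are literally 'db', 'acc' and 'vfb_id' as written in the Returns description. `invert_xrefs` is a hypothetical helper and the identifiers are placeholders, not real VFB records.

```python
from collections import defaultdict


def invert_xrefs(by_vfb_id):
    """Turn {vfb_id: [{'db', 'acc'}]} into {acc: [{'db', 'vfb_id'}]}."""
    by_acc = defaultdict(list)
    for vfb_id, xrefs in by_vfb_id.items():
        for xref in xrefs:
            by_acc[xref["acc"]].append({"db": xref["db"], "vfb_id": vfb_id})
    return dict(by_acc)


# Placeholder identifiers, not real VFB records.
forward = {
    "VFB_example1": [{"db": "db_a", "acc": "111"}, {"db": "db_b", "acc": "x9"}],
    "VFB_example2": [{"db": "db_a", "acc": "111"}],
}
reverse = invert_xrefs(forward)
```

Each value is a list because one accession can map to several VFB IDs, and one VFB ID can carry cross-references in several databases.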

xref_2_vfb_id(acc=None, db='', id_type='', reverse_return=False, return_just_ids=True, verbose=False)[source]

Map a list of external DB IDs to VFB IDs.

Parameters:
  • acc – An iterable (e.g. a list) of external IDs (e.g. neuprint bodyIDs). Can be in the form of ‘db:acc’ or just ‘acc’.

  • db – Optional. Specify the VFB ID (short_form) of an external DB to map to (use get_dbs to find options).

  • id_type – Optional. Specify an external id_type.

  • reverse_return – Optional. Boolean that selects the shape of the returned dict (see Returns).

  • return_just_ids – Optional. Boolean; if True, return only the VFB IDs as a list (see Returns).

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

If reverse_return is False:

dict { acc : [{ db : <db>, vfb_id : <VFB_id> }] }

If reverse_return is True:

dict { VFB_id : [{ db : <db>, acc : <acc> }] }

If return_just_ids is True, only the VFB IDs are returned, as a list.
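The acc parameter accepts identifiers either as 'db:acc' or as a bare 'acc'. A small sketch of telling the two forms apart is given below; `split_xref` is a hypothetical helper, not a library function, and it assumes the database name itself contains no colon.

```python
def split_xref(xref, default_db=""):
    """Split 'db:acc' into (db, acc); a bare 'acc' keeps default_db."""
    db, sep, acc = str(xref).partition(":")
    if not sep:
        # No colon present, so the whole string is the accession.
        return default_db, db
    return db, acc
```

Because `str.partition` splits on the first colon only, any further colons stay inside the accession.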

vfb_connect.cross_server_tools.dequote(string)[source]

Remove single quotes from around a string.

Parameters:

string – A string that may have single quotes around it.

Returns:

The string without surrounding single quotes.

Return type:

str
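A possible stand-alone equivalent of the behaviour described above is sketched here. It assumes that quotes are removed only when both a leading and a trailing single quote are present; the source does not say how the library treats unbalanced quotes. The name `dequote_sketch` is hypothetical.

```python
def dequote_sketch(string):
    """Strip one pair of surrounding single quotes, if both are present."""
    if len(string) >= 2 and string[0] == "'" and string[-1] == "'":
        return string[1:-1]
    return string
```

Strings without surrounding quotes, including ones with an apostrophe inside, are returned unchanged.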

vfb_connect.cross_server_tools.gen_short_form(iri)[source]

Generate short_form (string) from an IRI string.

Parameters:

iri – A full IRI (Internationalized Resource Identifier) string.

Returns:

The short form of the IRI (typically the last part after ‘/’ or ‘#’).

Return type:

str
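The rule stated above, taking the last part after '/' or '#', can be written in one line. This is an illustrative re-implementation under that description, not the library's code; `short_form_sketch` is a hypothetical name.

```python
import re


def short_form_sketch(iri):
    """Return the final path or fragment segment of an IRI."""
    return re.split(r"[/#]", iri)[-1]
```

For an OBO-style IRI such as `http://purl.obolibrary.org/obo/FBbt_00003624`, this gives the trailing `FBbt_00003624` segment.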

vfb_connect.default_servers module

vfb_connect.default_servers.get_default_servers()[source]

Module contents

class vfb_connect.VfbConnect(neo_endpoint='http://pdb.v4.virtualflybrain.org', neo_credentials=('neo4j', 'vfb'), owlery_endpoint='http://owl.virtualflybrain.org/kbs/vfb/', solr_endpoint='http://solr.virtualflybrain.org/solr/ontology/', vfb_launch=False)[source]

Bases: object

API wrapper class for Virtual Fly Brain (VFB) connectivity.

This class wraps connections to the basal API endpoints (OWL, Neo4j) and provides higher-level methods to combine semantic queries that range across VFB content with Neo4j queries. It returns detailed metadata about anatomical classes and individuals that fulfill these queries.

Parameters:
  • neo_endpoint – Specify a Neo4j REST endpoint.

  • neo_credentials – Specify credentials for the Neo4j REST endpoint.

  • owlery_endpoint – Specify OWLery server REST endpoint.

  • lookup_prefixes – A list of ID prefixes to use for rolling name:ID lookups.

Variables:
  • nc – Provides direct access to Neo4j via the Neo4jConnect instance.

  • neo_query_wrapper – Provides enriched query capabilities using the QueryWrapper instance.

  • oc – Provides direct access to OWL queries via the OWLeryConnect instance.

  • lookup – A lookup table for resolving names to IDs.

  • vfb_base – Base URL for Virtual Fly Brain links.

cypher_query(query, return_dataframe=True, verbose=False)[source]

Run a Cypher query.

Parameters:
  • query – The Cypher query to run.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of results.

Return type:

pandas.DataFrame or list of dicts

generate_lab_colors(num_colors, min_distance=100, verbose=False)[source]

Generate a list of Lab colors and convert them to RGB tuples.

Parameters:
  • num_colors – The number of colors to generate.

  • min_distance – Minimum perceptual distance between colors.

Returns:

A list of RGB tuples.

get_TermInfo(short_forms, summary=True, cache=True, return_dataframe=True, query_by_label=True, limit=None, verbose=False)[source]

Generate a JSON report or summary for terms specified by a list of VFB IDs.

This method retrieves term information for a list of specified VFB IDs (short_forms). It can return either full metadata or a summary of the terms. The results can be returned as a pandas DataFrame if return_dataframe is set to True.

Parameters:
  • short_forms (iter) – An iterable (e.g., a list) of VFB IDs (short_forms).

  • summary – Optional. If True, returns a summary report instead of full metadata. Default is True.

  • cache – Optional. If True, attempts to retrieve cached results before querying. Default is True.

  • return_dataframe – Optional. If True, returns the results as a pandas DataFrame. Default is True.

  • query_by_label – Optional. If True, it allows labels, symbols or synonyms as well as short_forms. Default is True.

Returns:

A list of term metadata as VFB_json or summary_report_json, or a pandas DataFrame if return_dataframe is True.

Return type:

list of dicts or pandas.DataFrame

get_cache_file_path()[source]

Determine a safe place to save the pickle file in the same directory as the module.

get_connected_neurons_by_type(upstream_type=None, downstream_type=None, weight=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all synaptic connections between individual neurons of upstream_type and downstream_type where synapse count >= weight.

Parameters:
  • upstream_type – The upstream neuron type (e.g., ‘GABAergic neuron’).

  • downstream_type – The downstream neuron type (e.g., ‘Descending neuron’).

  • weight – Optional. Limit returned connections to those with a synapse count >= weight.

  • query_by_label – Optional. Specify neuron type by label if True (default) or by short_form ID if False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of synaptic connections between specified neuron types.

Return type:

pandas.DataFrame or list of dicts

get_datasets(summary=True, return_dataframe=True)[source]

Get all datasets in the database.

Returns:

List of datasets in the database.

Return type:

list

get_dbs(include_symbols=True)[source]

Get all external databases in the database.

Returns:

List of external databases in the database.

Return type:

list

get_gene_function_filters()[source]

Get a list of all gene function labels.

Returns:

List of unique gene function labels in alphabetical order.

Return type:

list

get_images(short_forms, template=None, image_folder=None, image_type='swc', stomp=False)[source]

Get images for a list of individuals.

Parameters:
  • short_forms (iter) – List of short_form IDs for individuals.

  • template – Optional. Template name.

  • image_folder – Optional. Folder to save image files & manifest to.

  • image_type – Optional. Image type (file extension).

  • stomp – Optional. Overwrite image_folder if already exists.

Returns:

A manifest of the downloaded images as a pandas DataFrame.

get_images_by_filename(filenames, dataset=None, summary=True, return_dataframe=True)[source]

Get images by filename.

Parameters:
  • filenames (iter) – List of filenames.

  • dataset – Optional. Dataset name.

Returns:

List of images.

Return type:

list

get_images_by_type(class_expression, template, image_folder, image_type='swc', query_by_label=True, direct=False, stomp=False)[source]

Download all images of individuals specified by a class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name or symbol of a type of neuron (MBON01).

  • template – The template name.

  • image_folder – The folder to save image files and manifest to.

  • image_type – The image file extension (e.g., ‘swc’).

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct instances if True. Default False.

  • stomp – Optional. Overwrite the image folder if it already exists. Default False.

Returns:

A manifest of downloaded images as a pandas DataFrame.

Return type:

pandas.DataFrame

get_instances(class_expression, query_by_label=True, summary=True, return_dataframe=True, limit=None, return_id_only=False, verbose=False)[source]

Generate JSON report of all instances of a given class expression.

Instances are specific examples of a type/class, e.g., a neuron of type DA1 adPN from the FAFB_catmaid database.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_instances_by_dataset(dataset, query_by_label=True, summary=True, return_dataframe=True, return_id_only=False)[source]

Get JSON report of all individuals in a specified dataset.

Parameters:
  • dataset – The dataset ID.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_neurons_downstream_of(neuron, weight, classification=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all neurons downstream of a specified neuron.

Parameters:
  • neuron – The name or ID of a particular neuron (dependent on query_by_label setting).

  • weight – Limit returned neurons to those connected by >= weight synapses.

  • classification – Optional. Restrict downstream neurons by classification.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of neurons downstream of the specified neuron.

Return type:

pandas.DataFrame or list of dicts

get_neurons_upstream_of(neuron, weight, classification=None, query_by_label=True, return_dataframe=True, verbose=False)[source]

Get all neurons upstream of a specified neuron.

Parameters:
  • neuron – The name or ID of a particular neuron (dependent on query_by_label setting).

  • weight – Limit returned neurons to those connected by >= weight synapses.

  • classification – Optional. Restrict upstream neurons by classification.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of neurons upstream of the specified neuron.

Return type:

pandas.DataFrame or list of dicts
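Because get_neurons_upstream_of and get_neurons_downstream_of share the same signature, the two calls pair naturally. The helper below is a sketch, not part of the library: `partners_of` is a hypothetical name, and `vc` is assumed to be a VfbConnect instance (or any object exposing these two documented methods).

```python
def partners_of(vc, neuron, weight):
    """Return (upstream, downstream) partner records for one neuron.

    Only the two documented methods are called, with return_dataframe=False
    so that plain lists of dicts come back instead of DataFrames.
    """
    upstream = vc.get_neurons_upstream_of(neuron, weight, return_dataframe=False)
    downstream = vc.get_neurons_downstream_of(neuron, weight, return_dataframe=False)
    return upstream, downstream
```

Passing the same `weight` to both calls keeps the two result sets comparable.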

get_nt_predictions(term, verbose=False)[source]

Find predicted neurotransmitter(s) for a single neuron or all neurons of a given type. If nothing is found, an empty DataFrame is returned.

Parameters:

term – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt) or the ID or name of an individual neuron in VFB.

Returns:

A DataFrame.

Return type:

pandas.DataFrame

get_nt_receptors_in_downstream_neurons(upstream_type, downstream_type='neuron', weight=0, use_predictions=True, return_dataframe=True, verbose=False)[source]

Get neurotransmitter receptors in downstream neurons of a given neuron type.

Returns a DataFrame of neurotransmitter receptors in downstream neurons of a specified neuron type. If no data is found, returns False. If use_predictions is True, an extra column (‘nt_only_predicted’) indicates whether each receptor is for a neurotransmitter that is only predicted to be released by the upstream type.

Parameters:
  • upstream_type – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • downstream_type – Optional. The type of downstream neurons to search for. Default is ‘neuron’.

  • weight – Optional. Limit returned neurons to those connected by >= weight synapses. Default is 0.

  • use_predictions – Optional. Use predicted neurotransmitters (from instances) in addition to known neurotransmitters. Default is True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with neurotransmitter receptors in downstream neurons of the specified neuron type.

Return type:

pandas.DataFrame or list of dicts

get_potential_drivers(neuron, similarity_score='NBLAST_score', query_by_label=True, return_dataframe=True, verbose=False)[source]

Get JSON report of driver expression likely to contain the input neuron.

Parameters:
  • neuron – The neuron to find similar drivers for.

  • similarity_score – Optional. Specify the similarity score to use (e.g., ‘NBLAST_score’, ‘neuronbridge_score’). Default ‘NBLAST_score’.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of potential drivers (id, label, tags) + similarity score.

Return type:

pandas.DataFrame or list of dicts

get_scRNAseq_expression(id, query_by_label=True, return_id_only=False, return_dataframe=True, verbose=False)[source]

Get scRNAseq expression data for a given anatomy term.

Returns a DataFrame of scRNAseq clusters of cells that are shown to express the current anatomy term. If no data is found, returns False.

Parameters:
  • id – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the cluster IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with scRNAseq expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts
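Several methods in this class return False when no data is found. A small guard, sketched here under the hypothetical name `has_data`, checks for that case by identity, because the truth value of a pandas DataFrame is ambiguous and `if not df:` raises ValueError.

```python
def has_data(result):
    """True unless the call signalled 'no data' by returning False."""
    return result is not False


print(has_data(False))          # False
print(has_data([{"id": "x"}]))  # True
print(has_data([]))             # True: only the literal False means 'no data'
```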

get_scRNAseq_gene_expression(cluster, query_by_label=True, return_id_only=False, return_dataframe=True, verbose=False)[source]

Get gene expression data for a given scRNAseq cluster.

Returns a DataFrame of gene expression data for a cluster of cells annotated as the specified cluster. If no data is found, returns False.

Parameters:
  • cluster – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the gene IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with gene expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts

get_similar_neurons(neuron, similarity_score='NBLAST_score', query_by_label=True, return_dataframe=True, verbose=False)[source]

Get JSON report of individual neurons similar to the input neuron.

Parameters:
  • neuron – The neuron to find similar neurons to.

  • similarity_score – Optional. Specify the similarity score to use (e.g., ‘NBLAST_score’). Default ‘NBLAST_score’.

  • query_by_label – Optional. Query neuron by label if True, or by ID if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame or list of similar neurons (id, label, tags, source (db) id, accession_in_source) + similarity score.

Return type:

pandas.DataFrame or list of dicts

get_subclasses(class_expression, query_by_label=True, direct=False, summary=True, return_dataframe=True, verbose=False)[source]

Generate JSON report of all subclasses of a given class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct subclasses if True. Default False.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_superclasses(class_expression, query_by_label=True, direct=False, summary=True, return_dataframe=True)[source]

Generate JSON report of all superclasses of a given class expression.

Parameters:
  • class_expression – A valid OWL class expression, e.g., the name of a class.

  • query_by_label – Optional. Query using class labels if True, or IDs if False. Default True.

  • direct – Optional. Return only direct superclasses if True. Default False.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_templates(summary=True, return_dataframe=True)[source]

Get all templates in the database.

Returns:

List of templates in the database.

Return type:

list

get_terms_by_region(region, cells_only=False, verbose=False, query_by_label=True, summary=True, return_dataframe=True)[source]

Generate TermInfo reports for all terms relevant to annotating a specific region, optionally limited to cells.

Parameters:
  • region – The name (rdfs:label) of the brain region (or CURIE style ID if query_by_label is False).

  • cells_only – Optional. Limits query to cell types if True. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

  • query_by_label – Optional. Query using region labels if True, or IDs if False. Default True.

  • summary – Optional. Returns summary reports if True. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns a list of dicts. Default True.

Returns:

A DataFrame or list of terms as nested Python data structures following VFB_json or summary_report_json.

Return type:

pandas.DataFrame or list of dicts

get_terms_by_xref(xrefs, db='', summary=True, return_dataframe=True)[source]

Retrieve terms by cross-reference (xref) identifiers.

This method takes a list of external cross-reference identifiers and returns the corresponding terms from the database. The terms can be returned either as full metadata or as summaries. Additionally, the results can be returned as a pandas DataFrame if return_dataframe is set to True.

Parameters:
  • xrefs (iter) – An iterable (e.g., a list) of cross-reference identifiers (xrefs).

  • db – Optional. The name of the external database to filter the results by. Default is an empty string, which means no filtering.

  • summary – Optional. If True, returns summary reports instead of full metadata. Default is True.

  • return_dataframe – Optional. If True and summary is also True, returns the results as a pandas DataFrame. Default is True.

Returns:

A list of term metadata as nested Python data structures (VFB_json or summary_report_json), or a pandas DataFrame if return_dataframe is True and summary is True.

Return type:

list of dicts or pandas.DataFrame
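The return type of get_terms_by_xref depends on two flags together: a DataFrame comes back only when both summary and return_dataframe are True. The function below simply restates that documented rule; `xref_result_kind` is a hypothetical name used for illustration.

```python
def xref_result_kind(summary=True, return_dataframe=True):
    """Name the documented return type for a given flag combination."""
    if return_dataframe and summary:
        return "pandas.DataFrame"
    return "list of dicts"


for summary in (True, False):
    for return_dataframe in (True, False):
        print(summary, return_dataframe, xref_result_kind(summary, return_dataframe))
```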

get_transcriptomic_profile(cell_type, gene_type=False, no_subtypes=False, query_by_label=True, return_dataframe=True)[source]

Get gene expression data for a given cell type.

Returns a DataFrame of gene expression data for clusters of cells annotated as the specified cell type (or subtypes). Optionally restricts to a gene type, which can be retrieved using get_gene_function_filters. If no data is found, returns False.

Parameters:
  • cell_type – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • gene_type – Optional. A gene function label retrieved using get_gene_function_filters.

  • no_subtypes – Optional. If True, only clusters for the specified cell_type will be returned and not subtypes. Default False.

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

Returns:

A DataFrame with gene expression data for clusters of cells annotated as the specified cell type.

Return type:

pandas.DataFrame or list of dicts

Raises:

KeyError – If the cell_type or gene_type is invalid.
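The gene_type argument is meant to be one of the labels returned by get_gene_function_filters, and an invalid value raises KeyError. The sketch below validates the label up front and folds both failure modes into None. The name `profile_for` and the choice to return None are assumptions of this example, and `vc` is assumed to be a VfbConnect instance.

```python
def profile_for(vc, cell_type, gene_type=False):
    """Fetch a transcriptomic profile, validating gene_type up front.

    Returns None when gene_type is not a known filter label, when the call
    raises KeyError, or when no data is found (the call returns False).
    """
    if gene_type and gene_type not in vc.get_gene_function_filters():
        return None
    try:
        result = vc.get_transcriptomic_profile(cell_type, gene_type=gene_type)
    except KeyError:
        # Documented failure mode for an invalid cell_type or gene_type.
        return None
    return None if result is False else result
```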

Generate a link to Virtual Fly Brain (VFB) that loads all available images of neurons on the specified template.

Parameters:
  • short_forms (iter) – A list (or other iterable) of VFB short_form IDs for individuals with images.

  • template – The name (label) of a template.

Returns:

A URL for viewing images and metadata for specified individuals on VFB.

Return type:

str

Raises:

ValueError – If the template name is not recognized.

lookup_id(key, return_curie=False, allow_subsitutions=True, subsitution_stages=['adult', 'larval', 'pupal'], verbose=False)[source]

Lookup the ID for a given key (label or symbol) using the internal lookup table.

Parameters:
  • key – The label, symbol, synonym, or potential ID to look up.

  • allow_subsitutions – Optional. If True, allow for case-insensitive and character-insensitive lookups. Default True.

  • subsitution_stages – Optional. A list of prefixes to try for substitutions. Default [‘adult’, ‘larval’, ‘pupal’].

  • return_curie – Optional. If True, return the ID in CURIE (Compact URI) format. Default False.

Returns:

The ID associated with the key, or the key itself if it is already a valid ID. None is returned if the key is not found.

Return type:

str
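The substitution behaviour described above can be pictured with a small stand-alone sketch. This is an assumption about how such a lookup might work, not the library's implementation: the function name, the normalisation rule (lower-case, letters and digits only), and the "stage + space + key" retry are all choices made for this example, and the table contents are placeholders.

```python
import re


def _normalise(text):
    """Lower-case and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def lookup_id_sketch(key, table, allow_substitutions=True,
                     substitution_stages=("adult", "larval", "pupal")):
    """Resolve `key` against `table` (a name -> ID dict); None if not found."""
    if key in table:
        return table[key]
    if key in table.values():
        return key  # already a valid ID
    if not allow_substitutions:
        return None
    normalised = {_normalise(name): id_ for name, id_ in table.items()}
    candidates = [key] + [f"{stage} {key}" for stage in substitution_stages]
    for candidate in candidates:
        match = normalised.get(_normalise(candidate))
        if match is not None:
            return match
    return None


# Hypothetical table; labels and IDs are placeholders, not real VFB content.
table = {"adult brain": "ID_0001", "larval CNS": "ID_0002"}

print(lookup_id_sketch("Adult-Brain", table))  # ID_0001
print(lookup_id_sketch("brain", table))        # ID_0001
print(lookup_id_sketch("ID_0002", table))      # ID_0002
print(lookup_id_sketch("wing", table))         # None
```

Exact matches are tried first so that substitutions can never shadow a label that is already present in the table.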

lookup_name(ids)[source]

Lookup the name for a given ID using the internal lookup table.

Parameters:

ids – A single ID or list of IDs to look up.

Returns:

The name associated with the ID.

Return type:

str

owl_instances(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get instances of a given term.

Returns a VFBTerms of instances of the specified term. If no data is found, returns False.

Parameters:
  • query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the instance IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns VFBTerms. Default False.

  • limit – Optional. Limit the number of instances returned. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

A VFBTerms or DataFrame with instances of the specified query.

Return type:

Depending on the options, a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

owl_subclasses(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get subclasses of a given term.

Returns a VFBTerms of subclasses of the specified term. If no data is found, returns False.

Parameters:
  • query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

  • query_by_label – Optional. Query using cell type labels if True, or IDs if False. Default True.

  • return_id_only – Optional. Return only the subclass IDs if True. Default False.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns VFBTerms. Default False.

  • limit – Optional. Limit the number of subclasses returned. Default False.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

A VFBTerms or DataFrame with subclasses of the specified term.

Return type:

Depending on the options, a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

owl_superclasses(query, query_by_label=True, return_id_only=False, return_dataframe=False, limit=False, verbose=False)[source]

Get superclasses of a given term.

Returns a VFBTerms of superclasses of the specified term. If no data is found, returns False.

Parameters:

query – The ID, name, or symbol of a class in the Drosophila Anatomy Ontology (FBbt).

Return type:

Depending on the options, a pandas.DataFrame, a list of IDs, or VFBTerms. Default is VFBTerms.

reload_lookup_cache(verbose=False)[source]

Clear the lookup cache file.

search(query, return_dataframe=True, verbose=False, filter_by_has_tag=None, filter_by_not_tag=['Deprecated'])[source]

Search for terms in the database using a complex Solr query configuration.

Parameters:
  • query – The search query.

  • return_dataframe – Optional. Returns pandas DataFrame if True, otherwise returns list of dicts. Default True.

  • verbose – Optional. If True, prints the query for debugging purposes.

  • filter_by_has_tag – Optional. List of tags to boost if present. These will be upvoted in the query.

  • filter_by_not_tag – Optional. List of tags to downvote if present. These will be downvoted in the query.

Returns:

A DataFrame or list of results.

Return type:

pandas.DataFrame or list of dicts

setNeoEndpoint(endpoint, usr, pwd)[source]

Set the Neo4j endpoint and credentials.

setOwleryEndpoint(endpoint)[source]

Set the OWLery endpoint.

term(term, verbose=False)[source]

Get a VFBTerm object for a given term id, name, symbol or synonym.

Parameters:

term – The term to look up.

Returns:

a VFBTerm object

Return type:

dict

terms(terms, verbose=False)[source]

Get a list of VFBTerm objects for a given list of term IDs, names, symbols, or synonyms.

Parameters:

terms – A list of terms to look up.

Returns:

a VFBTerms list of VFBTerm objects

Return type:

VFBTerms

vfb_id_2_xrefs(vfb_id, db='', id_type='', reverse_return=False)[source]

Map a list of short_form IDs in VFB to external DB IDs.

Parameters:
  • vfb_id (iter) – An iterable (e.g. a list) of VFB short_form IDs.

  • db – Optional. The VFB ID (short_form) of an external DB to map to. Use get_dbs to find options.

  • id_type – Optional. An external id_type.

  • reverse_return – Optional. Boolean; see Returns.

Returns:

If reverse_return is False:

dict { VFB_id : [{ db : <db>, acc : <acc> }] }

If reverse_return is True:

dict { acc : [{ db : <db>, vfb_id : <VFB_id> }] }
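The two return shapes are mirror images: one is keyed by VFB ID with `db`/`acc` entries, the other by accession with `db`/`vfb_id` entries. The sketch below converts the first into the second on plain dicts. `reverse_xrefs` is a hypothetical helper and the IDs are placeholders; the library's own reversal logic is not shown in this reference.

```python
def reverse_xrefs(vfb_to_xrefs):
    """Turn {VFB_id: [{'db', 'acc'}]} into {acc: [{'db', 'vfb_id'}]}."""
    reversed_map = {}
    for vfb_id, xrefs in vfb_to_xrefs.items():
        for xref in xrefs:
            entry = {"db": xref["db"], "vfb_id": vfb_id}
            reversed_map.setdefault(xref["acc"], []).append(entry)
    return reversed_map


# Placeholder IDs, for illustration only.
forward = {
    "VFB_A": [{"db": "db1", "acc": "101"}, {"db": "db2", "acc": "x9"}],
    "VFB_B": [{"db": "db1", "acc": "101"}],
}
print(reverse_xrefs(forward))
```

Each value stays a list because one accession can map to more than one VFB ID, as "101" does here.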

xref_2_vfb_id(acc=None, db='', id_type='', reverse_return=False, return_just_ids=True, verbose=False)[source]

Map a list of external DB IDs to VFB IDs.

Parameters:
  • acc – An iterable (e.g. a list) of external IDs (e.g. neuprint bodyIDs). Can be in the form of ‘db:acc’ or just ‘acc’.

  • db – Optional. The VFB ID (short_form) of an external DB to map to. Use get_dbs to find options.

  • id_type – Optional. An external id_type.

  • reverse_return – Optional. Boolean; see Returns.

  • return_just_ids – Optional. Boolean; see Returns.

  • verbose – Optional. If True, prints the running query and found terms. Default False.

Returns:

If reverse_return is False:

dict { acc : [{ db : <db>, vfb_id : <VFB_id> }] }

If reverse_return is True:

dict { VFB_id : [{ db : <db>, acc : <acc> }] }

If return_just_ids is True, only the VFB IDs are returned, as a list.
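The acc argument accepts identifiers either as ‘db:acc’ or as a bare ‘acc’. One way to separate the two forms is sketched below. Splitting on the first colon is an assumption of this example, `split_xref` is a hypothetical helper, and the database name is a placeholder.

```python
def split_xref(xref, default_db=""):
    """Split 'db:acc' into (db, acc); a bare 'acc' gets `default_db`."""
    db, sep, acc = str(xref).partition(":")
    if not sep:
        # No colon present, so the whole string is the accession.
        return default_db, db
    return db, acc


print(split_xref("somedb:12345"))              # ('somedb', '12345')
print(split_xref("12345"))                     # ('', '12345')
print(split_xref(12345, default_db="somedb"))  # ('somedb', '12345')
```

Converting the input with `str` lets numeric identifiers be passed without special handling.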