`module` `dataset`

`function` `validate_counts`

validate_counts(counter, threshold, label)

Validates the counts in a counter dictionary against a threshold.

Args:

counter (collections.Counter): The counter dictionary containing the counts.
threshold (int): The minimum count threshold.
label (str): The label to be used in the assertion error message.

Raises:

AssertionError: If any count in the counter dictionary is less than the threshold.
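The documented behaviour can be sketched as follows. This is a minimal illustration, not the module's actual source: only the signature and the `AssertionError` contract come from the documentation above, while the wording of the error message and the `clone_counts` sample data are assumptions.

```python
from collections import Counter


def validate_counts(counter, threshold, label):
    """Assert that every count in `counter` is at least `threshold`."""
    # Collect the offending entries so the error message can name them.
    # (The message format is an assumption made for this sketch.)
    too_small = {key: n for key, n in counter.items() if n < threshold}
    assert not too_small, (
        f"{label}: counts below threshold {threshold}: {too_small}"
    )


# Hypothetical sample data: three cells in each of two clones.
clone_counts = Counter(["A", "A", "A", "B", "B", "B"])

# Every count is >= 3, so this call returns without raising.
validate_counts(clone_counts, threshold=3, label="clone")

# With a threshold of 4, both clones are too small and the assertion fires.
try:
    validate_counts(clone_counts, threshold=4, label="clone")
except AssertionError as err:
    print(err)
```

Because the check relies on `assert`, it is skipped when Python runs with the `-O` flag.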

`function` `filter_and_encode`

filter_and_encode(df, node_encoder, all_nodes, use_index=False)

Filters and encodes the given DataFrame based on the provided node encoder and all nodes.

Args:

df (pandas.DataFrame): The DataFrame to be filtered and encoded.
node_encoder (dict): A dictionary mapping node IDs to encoded values.
all_nodes (list): A list of all node IDs.
use_index (bool, optional): Whether to filter based on DataFrame index. Defaults to False.

Returns:

pandas.DataFrame: The filtered and encoded DataFrame.

`function` `drop_small`

drop_small(edges, numb)

Drops clones and cell types with fewer than `numb` cells from the edges DataFrame.

Parameters:

edges (DataFrame): The DataFrame containing the edge information.
numb (int): The minimum number of cells required for a clone or cell type to be kept.

Returns:

DataFrame: The edges DataFrame with small clones and cell types removed.

`function` `preprocess_data`

preprocess_data(edges, overcl, spatial_edges, grid_edges)

Preprocesses the given data by filtering and filling missing values.

Args:

edges (DataFrame): The edges data.
overcl (DataFrame): The annotation data with clone and cell type labels.
spatial_edges (str): The type of spatial edges.
grid_edges (str): The type of grid edges.

Returns:

Tuple[DataFrame, DataFrame]: The preprocessed edges and overcl data.

`function` `read_and_merge_embeddings`

read_and_merge_embeddings(paths, edges, drop_less=10)

Read and merge the embeddings from spatial and RNA datasets.

Parameters:

paths (dict): A dictionary containing the file paths for the spatial and RNA datasets.
edges (pd.DataFrame): A DataFrame containing the edges of the graph.
drop_less (int, optional): The minimum number of occurrences required for an edge to be kept. Defaults to 10.

Returns:

emb_vis (pd.DataFrame): The merged embeddings from the spatial dataset.
emb_rna (pd.DataFrame): The merged embeddings from the RNA dataset.
edges (pd.DataFrame): The filtered edges of the graph.
node_encoder (dict): A dictionary mapping node IDs to encoded node IDs.

`function` `create_data_object`

create_data_object(
    edges,
    emb_vis,
    emb_rna,
    node_encoder,
    sim=None,
    with_diploid=True
)

Create a data object for graph neural network training.

Args:

edges (pandas.DataFrame): DataFrame containing the edges of the graph.
emb_vis (pandas.DataFrame): DataFrame containing the spatial embeddings.
emb_rna (pandas.DataFrame): DataFrame containing the RNA embeddings.
node_encoder (dict): Dictionary mapping node IDs to their corresponding encodings.
sim (pandas.DataFrame, optional): Similarity matrix between clone values. Defaults to None.
with_diploid (bool, optional): Flag indicating whether to include diploid values in the encoding. Defaults to True.

Returns:

tuple: A tuple containing the data object and dictionaries for node, clone, and cell type encodings. If sim is provided, an additional similarity matrix is returned.

Raises:

AssertionError: If the data object is not valid or the shapes of the data arrays are not consistent.

`function` `create_encoding_dict`

create_encoding_dict(df, column, extras=[])

Create a dictionary that maps unique values in a column of a DataFrame to their corresponding indices.

Parameters:

df (pandas.DataFrame): The DataFrame containing the column.
column (str): The name of the column.
extras (list, optional): Additional values to exclude from the dictionary. Defaults to an empty list.

Returns:

dict: A dictionary mapping unique values to their corresponding indices.
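A value-to-index mapping of this kind can be sketched as below. This is an illustration under stated assumptions, not the module's implementation: it assumes indices are assigned in first-seen order starting at 0, and that values listed in `extras` are simply left out of the mapping. The sketch uses an immutable `()` default in place of the documented `extras=[]`, and a plain dict of lists stands in for a DataFrame, since `df[column]` iterates over values in both cases.

```python
def create_encoding_dict(df, column, extras=()):
    """Map each unique value in df[column] to a consecutive integer index."""
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique_values = dict.fromkeys(df[column])
    # Assumption: values listed in `extras` are omitted from the mapping.
    kept = [value for value in unique_values if value not in extras]
    return {value: index for index, value in enumerate(kept)}


# Hypothetical sample data; a dict of lists stands in for a DataFrame.
table = {"clone": ["c1", "c2", "c1", "diploid", "c3"]}

encoding = create_encoding_dict(table, "clone", extras=["diploid"])
print(encoding)  # {'c1': 0, 'c2': 1, 'c3': 2}
```

First-seen ordering keeps the encoding reproducible for a fixed input order; sorting the unique values first would make it independent of row order instead.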

This file was automatically generated via lazydocs.
