SpatialDE.test

SpatialDE.test(adata, layer=None, omnibus=False, spatial_key='spatial', kernel_space=None, sizefactors=None, stack_kernels=None, use_cache=True)

Test for spatially variable genes.

Perform a score test to detect spatially variable genes in a spatial transcriptomics dataset. Multiple kernels can be tested to detect genes with different spatial patterns and lengthscales. The test uses a count-based likelihood and thus operates on raw count data. Two ways of handling multiple kernels are implemented: omnibus and Cauchy combination. The Cauchy combination tests each kernel separately and combines the p-values afterwards, while the omnibus test tests all kernels simultaneously. With multiple kernels the omnibus test is faster, but may have slightly less statistical power than the Cauchy combination.
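To make the Cauchy combination concrete, the following is a minimal sketch of the general Cauchy combination rule for p-values. It is not taken from the SpatialDE source; the function name `cauchy_combine` and the equal default weights are assumptions made for illustration.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Hypothetical helper for illustration; weights are assumed to be
    non-negative and to sum to 1 (equal weights by default).
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    # Under the null, tan((0.5 - p) * pi) follows a standard Cauchy distribution.
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # The combined p-value is the upper tail of the standard Cauchy at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# One p-value per kernel for a single gene (made-up numbers).
combined = cauchy_combine([0.01, 0.8])
```

A single small p-value dominates the combined result, which is why this approach can detect a gene that matches only one of the tested kernels.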

Parameters:
  • adata (AnnData) – The annotated data matrix.

  • layer (Optional[str]) – Name of the AnnData object layer to use. By default adata.X is used.

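  • omnibus (bool) – Whether to do an omnibus test. If False, each kernel is tested separately and the p-values are combined with the Cauchy combination.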

  • spatial_key (str) – Key in adata.obsm where the spatial coordinates are stored.

  • kernel_space (Optional[Dict[str, Union[float, List[float]]]]) –

    Kernels to test against. Dictionary with the name of the kernel function as key and list of lengthscales (if applicable) as values. Currently, three kernel functions are known:

    • SE, the squared exponential kernel \(k(\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}; l) = \exp\left(-\frac{\lVert \boldsymbol{x}^{(1)} - \boldsymbol{x}^{(2)} \rVert^2}{l^2}\right)\)

    • PER, the periodic kernel \(k(\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}; l) = \cos\left(2 \pi \frac{\sum_i (x^{(1)}_i - x^{(2)}_i)}{l}\right)\)

    • linear, the linear kernel \(k(\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}) = (\boldsymbol{x}^{(1)})^\top \boldsymbol{x}^{(2)}\)

    By default, 5 squared exponential and 5 periodic kernels with lengthscales spanning the range of the data will be used.

  • sizefactors (Optional[ndarray]) – Scaling factors for the observations. Defaults to total read counts.

  • stack_kernels (Optional[bool]) – When using the Cauchy combination, all tests can be performed in one operation by stacking the kernel matrices. This increases memory consumption, but drastically improves runtime on GPUs for smaller datasets. Defaults to True for datasets with fewer than 2000 observations and False otherwise.

  • use_cache (bool) – Whether to use a pre-computed distance matrix for all kernels instead of computing the distance matrix anew for each kernel. Increases memory consumption, but is somewhat faster.
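As an illustration of the kernel_space and sizefactors arguments, the sketch below builds both by hand from toy data. The helpers `lengthscale_grid` and `total_count_sizefactors` are hypothetical, and the log-spaced grid between the smallest and largest pairwise distance is only one assumed reading of "spanning the range of the data"; the library's own default may be computed differently.

```python
import math


def lengthscale_grid(coords, n=5):
    """Hypothetical helper: n log-spaced lengthscales between the smallest
    nonzero and the largest pairwise distance of the coordinates."""
    dists = [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]
    dists = [d for d in dists if d > 0]
    lo, hi = min(dists), max(dists)
    if n == 1 or lo == hi:
        return [lo]
    ratio = (hi / lo) ** (1 / (n - 1))
    return [lo * ratio**k for k in range(n)]


def total_count_sizefactors(counts):
    """Hypothetical helper: one size factor per observation (row),
    equal to its total read count."""
    return [sum(row) for row in counts]


# Toy data: 4 observations with 2D coordinates and raw counts for 3 genes.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (4.0, 4.0)]
counts = [[3, 0, 1], [0, 2, 2], [5, 1, 0], [1, 1, 1]]

scales = lengthscale_grid(coords)
# Kernel names as keys, lists of lengthscales as values.
kernel_space = {"SE": scales, "PER": scales}
sizefactors = total_count_sizefactors(counts)
```

With a real dataset, these values would be passed as `SpatialDE.test(adata, kernel_space=kernel_space, sizefactors=...)`. Since sizefactors is documented as an ndarray, the list would first be converted to a NumPy array.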

Return type:

Tuple[DataFrame, Optional[DataFrame]]

Returns:

If omnibus==True, a tuple with a Pandas DataFrame as the first element and None as the second. The DataFrame contains the results of the test for each gene, in particular p-values and Benjamini-Hochberg (BH) adjusted p-values. Otherwise, a tuple of two DataFrames. The first contains the combined results, while the second contains the results of the individual per-kernel tests.
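For readers unfamiliar with the BH adjustment mentioned above, here is a minimal sketch of the standard Benjamini-Hochberg procedure. It is an illustration of the general method, not the library's own code, and the function name `bh_adjust` is an assumption.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Hypothetical helper for illustration.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# One p-value per gene (made-up numbers).
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.5])
```

Genes whose adjusted p-value falls below a chosen false discovery rate threshold would then be reported as spatially variable.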