Method comparison
Load assignment results
If some of the methods were run manually on distinct gRNA subsets, the function combine_assignments merges their outputs into a single CSV file. After running all guide assignment methods of interest, we can use the load_assignments function to create a data frame that contains the gRNA-cell assignments of every method, and then filter for cells that were assigned exactly one gRNA.
- crispat.combine_assignments(data_dir)
Combines assignment files for methods that have been run on gRNA subsets
- Parameters:
data_dir (str) – directory containing the assignment output
- Returns:
None
- crispat.load_assignments(gc_dict, data_dir)
Loads and combines assignment files of specified methods
- Parameters:
gc_dict (dict) – dictionary with method directory and threshold for each assignment
data_dir (str) – directory containing all assignment output folders
- Returns:
A pd DataFrame containing the gRNA-cell assignments of each method
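The filtering step mentioned above (keeping only cells with exactly one assigned gRNA) can be sketched with plain pandas. This is a minimal illustration, not crispat's own implementation: it assumes the data frame has the columns method, cell and gRNA, and the method names, cell names and the helper keep_single_assignments are made up for the example.

```python
import pandas as pd

# Toy assignments (hypothetical values); one row per gRNA-cell assignment.
perturbations = pd.DataFrame({
    "method": ["method_A", "method_A", "method_A", "method_B", "method_B"],
    "cell":   ["c1", "c1", "c2", "c1", "c3"],
    "gRNA":   ["g1", "g2", "g1", "g1", "g2"],
})

def keep_single_assignments(df):
    """Keep rows of cells that got exactly one gRNA within a given method."""
    # Number of distinct gRNAs per (method, cell), broadcast back to each row.
    n_grnas = df.groupby(["method", "cell"])["gRNA"].transform("nunique")
    return df[n_grnas == 1].reset_index(drop=True)

single = keep_single_assignments(perturbations)
print(single)
```

In this toy data, cell c1 is dropped for method_A because it received two gRNAs, while the same cell is kept for method_B.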
Number of assigned cells and intersection
To plot the total number of assigned cells and the number of uniquely assigned cells per method, crispat provides the function plot_n_assigned_cells. To compare how similar the assignments of different methods are, crispat also includes the function plot_intersection_heatmap, which creates a heatmap of the pairwise Jaccard index of uniquely assigned cells between methods.
- crispat.plot_n_assigned_cells(perturbations, colors=None)
Plots a barplot with the number of assigned cells and uniquely assigned cells per method
- Parameters:
perturbations (pd DataFrame) – df with the assigned perturbations (needed columns: method, cell, gRNA)
colors (dictionary, optional) – specifies the colors to use for each method in the barplot
- Returns:
A matplotlib plot
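The two quantities shown in this barplot can be approximated with a short pandas sketch. It is only an illustration under the assumption that "assigned" means a cell with at least one gRNA and "uniquely assigned" means a cell with exactly one gRNA; the toy data and the helper count_assigned_cells are hypothetical and not part of crispat.

```python
import pandas as pd

# Toy assignments (hypothetical values) with the documented columns.
perturbations = pd.DataFrame({
    "method": ["method_A", "method_A", "method_A", "method_B", "method_B"],
    "cell":   ["c1", "c1", "c2", "c1", "c3"],
    "gRNA":   ["g1", "g2", "g1", "g1", "g2"],
})

def count_assigned_cells(df):
    """Count assigned and uniquely assigned cells per method."""
    # One row per (method, cell) with the number of distinct gRNAs.
    per_cell = (
        df.groupby(["method", "cell"])["gRNA"]
        .nunique()
        .reset_index(name="n_grnas")
    )
    return per_cell.groupby("method").agg(
        n_assigned=("cell", "size"),
        n_unique=("n_grnas", lambda s: int((s == 1).sum())),
    )

summary = count_assigned_cells(perturbations)
print(summary)
```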
- crispat.plot_intersection_heatmap(perturbations, method_order=None)
Plots a heatmap of the Jaccard index showing the intersecting assignments
- Parameters:
perturbations (pd DataFrame) – df with the assigned perturbations (needed columns: method, cell, gRNA)
method_order (list, optional) – list defining the order of the rows and columns (default: alphabetical order)
- Returns:
A matplotlib plot
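For reference, the Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below computes a pairwise Jaccard matrix with pandas. It assumes that two methods agree on a cell only if they assign the same gRNA to it, so the sets consist of (cell, gRNA) pairs of uniquely assigned cells; crispat's implementation may define the sets differently. The toy data and the helper jaccard_matrix are hypothetical.

```python
from itertools import product

import pandas as pd

# Toy assignments (hypothetical values) with the documented columns.
perturbations = pd.DataFrame({
    "method": ["method_A"] * 3 + ["method_B"] * 3,
    "cell":   ["c1", "c2", "c3", "c1", "c2", "c4"],
    "gRNA":   ["g1", "g1", "g2", "g1", "g2", "g1"],
})

def jaccard_matrix(df):
    """Pairwise Jaccard index of uniquely assigned (cell, gRNA) pairs."""
    # Restrict to cells with exactly one gRNA within each method.
    n_grnas = df.groupby(["method", "cell"])["gRNA"].transform("nunique")
    single = df[n_grnas == 1]
    # One set of (cell, gRNA) pairs per method.
    sets = {
        method: set(zip(group["cell"], group["gRNA"]))
        for method, group in single.groupby("method")
    }
    methods = sorted(sets)  # alphabetical order
    mat = pd.DataFrame(0.0, index=methods, columns=methods)
    for a, b in product(methods, repeat=2):
        union = sets[a] | sets[b]
        mat.loc[a, b] = len(sets[a] & sets[b]) / len(union) if union else 0.0
    return mat

jaccard = jaccard_matrix(perturbations)
print(jaccard)
```

Note that a method without any uniquely assigned cell does not appear in the resulting matrix in this sketch.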
Effects on downstream analysis
To investigate the consequences of the assignment differences for discovery analysis, the assignments from crispat can serve as input to different differential expression tests. For the analysis shown in our paper, we use the crispat output as input for the R package SCEPTRE, which is tailored to single-cell CRISPR screen analysis and combines multiple analyses and control checks in a statistically rigorous fashion. First, SCEPTRE calculates the log2 fold changes and p-values for the target genes (power check). Next, it calculates the number of false discoveries, that is, genes that are significantly differentially expressed when comparing the cells of one non-targeting gRNA against all other non-targeting control cells (calibration). Finally, it calculates the differentially expressed genes for each perturbation (discovery analysis). The assignments obtained by crispat can also be passed to other tools for differential expression testing, such as scanpy or Seurat. We therefore provide tutorials on downstream analysis with SCEPTRE, scanpy and Seurat in our GitHub repository.