
Primary Analysis

A. Prepare SampleInformation.tab file

  1. The SampleInformation.tab file is a tab-delimited text file. It can be edited in a plain text editor or a spreadsheet program. Update this file whenever a new sample is added to the project.
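
Because the file is edited by hand or in a spreadsheet, a quick consistency check can catch stray or missing tabs. The sketch below assumes that the first non-empty line is a header and that every row should have the same number of tab-separated fields. "check_tab_file" is a hypothetical helper, not part of the pipeline.

```shell
# Sketch: report rows whose field count differs from the header's.
# check_tab_file is a hypothetical helper, not part of the pipeline.
check_tab_file() {
    awk -F'\t' '
        $0 == "" { next }                          # ignore empty lines
        !seen    { expected = NF; seen = 1; next } # first non-empty line is the header
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit bad }                           # non-zero exit if any row was flagged
    ' "$1"
}
```

For example, "check_tab_file SampleInformation.tab && echo OK" prints OK only when every row matches the header.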

B. Prepare the input data files, output folders and the swarm job file

  1. Make sure the Results/cellranger folder exists; it stores your results.
  2. The Tools/prep.sh file is a bash script that should be run first. It prepares the files and folders and generates the swarm job file.
  3. Open the prep.sh file in a plain text editor and edit the "samples" variable so that it lists the new sample names exactly as they are entered in the SampleInformation.tab file.
  4. Open a shell/terminal and change to the Tools directory using the command below
    cd /data/../TotalSeq/Tools/
    
  5. Run the prep.sh script using the below command
    bash prep.sh
    
  6. Check the Results folder to make sure there is a directory for each new sample and that each directory contains the corresponding fastq files and library file
  7. Check the Tools folder to make sure you have the "totalseqall.swarm" file
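
The check in step 6 can be scripted. This is only a sketch: "check_sample_inputs" is a hypothetical helper, the "*.fastq.gz" file pattern is an assumption, and the library file is not checked because its name is not given here. The first argument is the folder that holds the per-sample directories.

```shell
# Sketch: confirm a sample directory exists and holds FASTQ files.
# check_sample_inputs is a hypothetical helper; *.fastq.gz is an assumed pattern.
check_sample_inputs() {
    local results_dir="$1" sample="$2"
    local dir="$results_dir/$sample"
    if [ ! -d "$dir" ]; then
        echo "$sample: missing directory $dir"
        return 1
    fi
    # Count FASTQ files anywhere under the sample directory
    local n
    n=$(find "$dir" -name '*.fastq.gz' | wc -l | tr -d '[:space:]')
    if [ "$n" -eq 0 ]; then
        echo "$sample: no FASTQ files in $dir"
        return 1
    fi
    echo "$sample: $n FASTQ file(s)"
}
```

A loop such as "for s in NS3R189BTS; do check_sample_inputs ../Results/cellranger "$s"; done" would cover every new sample; the relative path is an assumption based on the example output location shown later on this page.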

C. Primary analysis - barcode/umi/cell counting using cellranger

  1. Change to the Tools directory using the command below:
    cd /data/../TotalSeq/Tools/
    
  2. Run the swarm job using the command below:
    swarm -f totalseqall.swarm -g 64 -t 28 --gres=lscratch:200 --partition=norm --module cellranger
    

Note

Note down your job ID; the commands below need it.

  1. Check your job status using the "jobload" command or the "squeue" command, replacing the example job ID with your own:
    squeue --job 14354599
    
  2. If the job needs more time, extend its walltime using the command below, replacing the example job ID with your own:
    newwall --jobid 13558558 --time 10:00:00
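
Rather than re-running squeue by hand, the status check can be wrapped in a small polling helper. This is a sketch only: "job_is_active" is a hypothetical function, and it assumes squeue prints nothing (once the header is suppressed) after the job has left the queue.

```shell
# Sketch: succeed while the job still appears in the queue.
# job_is_active is a hypothetical helper, not a Biowulf or Slurm command.
job_is_active() {
    local jobid="$1"
    # --noheader drops the column titles, so any output means the job is listed
    [ -n "$(squeue --noheader --jobs "$jobid" 2>/dev/null)" ]
}

# Example use (replace the example job ID with your own):
#   while job_is_active 14354599; do sleep 300; done
#   echo "job finished"
```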
    

D. Organize the outputs and cleanup

  1. Once the swarm job is complete (this should take about three hours), change to the Tools directory using the command below:
    cd /data/../TotalSeq/Tools/
    
  2. Open the "organizeout.sh" script in a text editor and update the "samples" variable with your new sample names
  3. Run the "organizeout.sh" script using the command below:
    bash organizeout.sh
    
  4. Check out the Results folder to make sure you have the expected output files.

Example CellRanger Output location for Sample NS3R189BTS

/data/../TotalSeq/Results/cellranger/NS3R189BTS/cellranger_output/
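
The check in step 4 can be sketched for any sample using the layout of the example path above. "check_cellranger_output" is a hypothetical helper; it only confirms that the "cellranger_output" directory exists and is not empty, because the exact files left by organizeout.sh are not listed here.

```shell
# Sketch: confirm that a sample has a non-empty cellranger_output directory.
# check_cellranger_output is a hypothetical helper, not part of the pipeline.
check_cellranger_output() {
    local results_dir="$1" sample="$2"
    local out_dir="$results_dir/cellranger/$sample/cellranger_output"
    # A directory that exists and has at least one entry counts as present
    if [ -d "$out_dir" ] && [ -n "$(ls -A "$out_dir")" ]; then
        echo "$sample: output found in $out_dir"
    else
        echo "$sample: no output in $out_dir"
        return 1
    fi
}
```

Pass the Results folder as the first argument and the sample name as the second, for example "check_cellranger_output ../Results NS3R189BTS" when run from the Tools directory.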

E. Aggregate cellranger summary outputs

  1. Log in to Biowulf and request an interactive node with 64 CPUs, 160 GB of RAM and 300 GB of local scratch storage using the 'sinteractive' command below:
    sinteractive --cpus-per-task=64 --mem=160g --gres=lscratch:300
    
  2. Once the interactive node is assigned, change to the Tools directory using the command below:
    cd /data/../TotalSeq/Tools/
    
  3. Load the R module using the command below:
    module load R
    
  4. Once the R module is loaded, run the 'cellrangerReport.R' script using the command below:
    Rscript cellrangerReport.R
    
  5. The cellranger report is now available in the "Reports" folder. A link to this report is also given here

Software

  1. cellranger 6.0.0
  2. R 4.0.5
  3. mkdocs 1.1.2

References

  1. Feature reference file generation
  2. TotalSeq
  3. CellRanger Biowulf
  4. Swarm Biowulf