FAQ
Q What are the requirements for genome annotation files in GTF/GFF format?
A

1. File format:

GFF or GTF files are accepted, with gtf/gtf.gz, gff/gff.gz, or gff3/gff3.gz as the file suffix.

2. GTF file format:

Comment lines begin with #

The main body has 9 columns, separated by 'tab': seqname source feature start end score strand frame attributes

feature: the annotation types must include gene, transcript, and exon

start/end: must be less than 2^31 (2147483648)

strand: forward and reverse strands are represented as + and -, respectively

attributes: the 9th column, in the format tag "value", with different attributes separated by spaces. The following four are required:

gene_name value

gene_id value: a unique ID for the genomic locus of the transcript. 'gene_id' and the value are separated by a space. An empty value means that there is no corresponding gene.

transcript_name value

transcript_id value: a unique ID that identifies a transcript. An empty value means that there is no transcript.

At present, the number of valid genes must be less than 2^20, that is, 1048576

Do not disrupt the order: the transcripts and exons of the same gene must be listed in order
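
The per-line GTF rules above can be checked before running an analysis. The following is a minimal Python sketch, not the validation that SAW itself performs. The function name `check_gtf_lines` and its messages are hypothetical, and it assumes that gene rows need only the two gene tags while all other rows need all four tags. It does not check the gene count limit or the row order.

```python
import re

MAX_COORD = 2 ** 31  # start/end must be less than this value
GENE_TAGS = ("gene_id", "gene_name")
TRANSCRIPT_TAGS = ("transcript_id", "transcript_name")
REQUIRED_FEATURES = {"gene", "transcript", "exon"}
ATTRIBUTE_RE = re.compile(r'(\w+)\s+"([^"]*)"')  # matches: tag "value"


def check_gtf_lines(lines):
    """Return (line_number, message) pairs for each rule violation found."""
    problems = []
    features_seen = set()
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        columns = line.split("\t")
        if len(columns) != 9:
            problems.append((number, "expected 9 tab-separated columns"))
            continue
        feature, start, end, strand = columns[2], columns[3], columns[4], columns[6]
        features_seen.add(feature)
        if not (start.isdecimal() and end.isdecimal()):
            problems.append((number, "start/end must be integers"))
        elif max(int(start), int(end)) >= MAX_COORD:
            problems.append((number, "start/end must be less than 2^31"))
        if strand not in ("+", "-"):
            problems.append((number, "strand must be + or -"))
        tags = dict(ATTRIBUTE_RE.findall(columns[8]))
        # Assumption: gene rows need only the gene tags; other rows need all four.
        needed = GENE_TAGS if feature == "gene" else GENE_TAGS + TRANSCRIPT_TAGS
        for tag in needed:
            if tag not in tags:
                problems.append((number, "missing attribute " + tag))
    # File-level check, reported with line number 0.
    for feature in sorted(REQUIRED_FEATURES - features_seen):
        problems.append((0, "no rows of feature type " + feature))
    return problems
```

For a compressed file, the lines can be read with `gzip.open(path, "rt")` from the Python standard library and passed to the function unchanged.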

3. GFF file format:

Comment lines begin with #

The main body has 9 columns, separated by 'tab': seqid source type start end score strand phase attributes

type: the annotation types must include gene, mRNA, and exon

start/end: the larger of the two must be less than 2^31 (2147483648)

strand: "+" stands for forward strands, "-" stands for reverse strands, "." indicates there is no need to specify positive or negative strands, "?" means unknown

attributes: the 9th column, in the format tag=value, with different attributes separated by semicolons

ID, Name, and Parent must be provided (Parent is not required for gene rows)

For the naming rules of the 3rd column, carefully check that the rows form a "dendrachy" (tree-shaped hierarchy): do not list 'child' rows without their 'parent' rows.

[Image: example of the parent/child hierarchy in a GFF file]

At present, the number of valid genes must be less than 2^20, that is, 1048576

Although strict ordering is not required, a 'gene' row must still appear before its corresponding mRNA rows, and an mRNA row must appear before its corresponding exon rows.
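
The parent-before-child rule can be checked with a single pass over the file. The sketch below is illustrative only and is not part of SAW. The helper names `parse_gff_attributes` and `find_orphan_rows` are hypothetical. It relies on the GFF3 convention that a row may list several comma-separated Parent IDs.

```python
def parse_gff_attributes(column9):
    """Split a GFF attribute column (tag=value;tag=value) into a dict."""
    pairs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            tag, value = field.split("=", 1)
            pairs[tag.strip()] = value.strip()
    return pairs


def find_orphan_rows(lines):
    """Return (line_number, type, ID) for rows whose Parent has not appeared yet."""
    seen_ids = set()
    orphans = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment line
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # column-count errors are outside the scope of this sketch
        attrs = parse_gff_attributes(columns[8])
        parents = [p for p in attrs.get("Parent", "").split(",") if p]
        if any(parent not in seen_ids for parent in parents):
            orphans.append((number, columns[2], attrs.get("ID", "")))
        if "ID" in attrs:
            seen_ids.add(attrs["ID"])
    return orphans
```

An empty result means that every child row follows its parent; any returned row is either out of order or refers to a parent that is missing from the file.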

4. Others to note:

gene/gene_name should not contain any special symbols (space, all types of brackets, quotation marks, <>, %, etc.) other than common symbols such as "_" and "."

gene/gene_name must be shorter than 64 characters

Although the mainly used GFF files are version 3 (GFF3), please name them as .gff ; likewise, please name GTF files as .gtf
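
The naming rules can be screened with a short check. This is a minimal sketch under stated assumptions: the function name `name_problems` is hypothetical, and only the symbols explicitly listed above (space, brackets, quotation marks, <>, %) are flagged, because the "etc." in the rule is not spelled out.

```python
MAX_NAME_LENGTH = 64  # names must be shorter than this
# Assumption: only the symbols named explicitly in the rule are treated as forbidden.
FORBIDDEN = set(" ()[]{}<>%\"'")


def name_problems(name):
    """Return a list of human-readable problems for one gene name."""
    problems = []
    if len(name) >= MAX_NAME_LENGTH:
        problems.append("name must be shorter than 64 characters")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + "".join(bad))
    return problems
```

A name that passes this check may still contain a symbol covered by the unspecified "etc.", so an empty result is a necessary condition rather than a guarantee.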


Q What is the version compatibility between ImageQC/ImageStudio, SAW and StereoMap?
A


Compatibility between ImageQC and SAW (each group below is one matching set):

ImageQC <= 1.0.8
File format: .json + .tar.gz
Features: ssDNA image QC
SAW <= 4.1.0
Support ssDNA image registration and tissue segmentation

ImageQC >= 1.1.0
File format: .ipr + .tar.gz
Features: ssDNA image QC
SAW >= 5.1.3
Support cell segmentation on ssDNA image; enable analysis of FASTQ data in Q4 format

Compatibility between ImageStudio, SAW and StereoMap (each group below is one matching set):

ImageStudio 1.0.0
File format: .ipr + .tar.gz
Features: ssDNA image QC and manual processing
SAW >= 5.5.0
Support cell segmentation on ssDNA image; enable analysis of FASTQ data in Q4 format
StereoMap 1.0.0
Support displaying spatial expression heatmap, co-visualization of gene distribution, and ssDNA image. Manual registration enabled

ImageStudio 2.0.0
File format: .ipr + .tar.gz
Features: Image QC for ssDNA, DAPI, mIF stains and manual processing
SAW >= 6.0.0
Support mIF image registration; allow for rRNA filtering
StereoMap 2.0.0
Display of individual mIF images and the ones stacked with different image layers

ImageStudio 2.1
File format: .ipr + .tar.gz
Features: Image QC for ssDNA, DAPI, mIF stains and their manual image processing; fully manual procedure for QC-failed images
SAW >= 6.1, < 7.0
Support analysis of the manually processed image outputs from ImageStudio and StereoMap
StereoMap 2.1, < 3.0
Support reading multiple gef files at a time, which will be displayed by individual tabs

ImageStudio 2.2
File format: .ipr + .tar.gz
Features: Image QC for ssDNA, DAPI, mIF stains and their manual image processing; fully manual procedure for QC-failed images
SAW >= 6.1, < 7.0
Support analysis with the results of the fully manual procedure done by ImageStudio
StereoMap 2.1, < 3.0
Support reading multiple gef files at a time, which will be displayed by individual tabs

ImageStudio 3.0
File format: .ipr + .tar.gz
Features: Image QC for ssDNA, DAPI, H&E, mIF stains and their manual image processing; fully manual procedure for QC-failed images
SAW 7.0
● Reconstructed 'count' goes online
● 'register' reconstructed with new tissue segmentation algorithm and new 'V03' cell segmentation algorithm
● Support H&E whole process
● Support cell correction using EDM algorithm based on mask file of cell segmentation result
StereoMap 3.0
● Support reading h5ad files with different binsize/resolution
● /codedCellBlock information is written into cgef file after the SAW cellChunk module
● Render cellbin heatmap while loading cgef files

ImageStudio: integrated into StereoMap
File format: .tar.gz (includes .ipr)
Features: Image QC for ssDNA, DAPI, H&E, mIF stains and manual processing; fully manual processing for QC-failed images
SAW 8.0
● The standard spatial transcriptomic analysis workflow is now integrated into one command line
● Support one-stop computational workflow for FFPE samples (including microorganism analysis)
● Output zipped report file
● Output zipped package for visualization
StereoMap 4.0
● Visualization: Support reading with .stereo manifest file; compatible with reading data of old versions
● Manual processing: Process image data in a step-by-step manner

SAW 8.1
● Support Stereo-seq T FF V1.3 and Stereo-CITE T FF data analysis
StereoMap 4.1
● Visualization: Support the display of gene expression heatmaps for cellbin analysis; support linked display for the protein & marker genes
● Manual processing: New registration method available (Feature point registration)
● The output file supports user-defined directories


Q How should an abnormal valid CID ratio be handled?
A

There are three directions to investigate.

1. Sequencing quality. Low sequencing quality can affect alignment results. In addition to Q30, the presence of unknown base calls needs to be considered as well, which can be examined by reviewing base distribution in the sequencing report. If the proportion of N bases is high, it needs to be considered that sequencing problems have affected the valid CID ratio. It is recommended to prioritize such inspection.

2. The chip mask h5 file does not correspond to the FASTQ datasets. Because the CID recorded in the mask does not match the CID obtained by sequencing the sample, the valid CID ratio is low. If this situation occurs alone, the proportion is usually extremely low. If the next situation is also involved, the variation would be of significance, requiring a case by case analysis.

3. (Cross-)contamination. This occurs when other samples are mixed in during the experiment, library preparation, or sequencing, which lowers the valid CID ratio. In this case, two chips may both map to the sequencing data of the same library. If there is a lot of mixing, a distinct tissue pattern should be visible. If the proportion is extremely small, in some cases there will only be some local bright spots.
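
For the first direction, the proportion of N bases can also be estimated directly from the reads. The following is a minimal sketch, assuming a standard four-line FASTQ layout in which the second line of each record is the sequence. The function name `n_base_fraction` is hypothetical, and no threshold is implied, since none is stated above.

```python
def n_base_fraction(fastq_lines):
    """Return the fraction of bases called as N across all reads."""
    total_bases = 0
    n_bases = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # second line of each four-line record is the sequence
            sequence = line.strip().upper()
            total_bases += len(sequence)
            n_bases += sequence.count("N")
    return n_bases / total_bases if total_bases else 0.0
```

Any iterable of text lines works as input, for example a file handle opened with `gzip.open(path, "rt")` for a compressed FASTQ file.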


Q How to choose appropriate bin sizes when analyzing the data?
A

Known information, such as the cell sizes of the specific tissue type, can guide the choice. It is recommended to try several bin levels (bin20, bin50, bin100, and bin200) and adjust them based on the results of downstream analyses. Bin20 is about the size of a regular mammalian cell, while bin50 and bin100 are both frequently adopted in analysis. Bin200 is generally used for immediate visualization of SAW outputs.
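
Trying several bin levels amounts to summing spot-level counts over square blocks of different sizes. The sketch below illustrates the idea only. The record layout `(gene, x, y, count)` and the function name `bin_counts` are hypothetical and do not describe a SAW file format or API.

```python
from collections import Counter


def bin_counts(records, bin_size):
    """Sum counts per gene over square bins of bin_size x bin_size spots."""
    binned = Counter()
    for gene, x, y, count in records:
        # Each bin is labelled by the coordinate of its lower corner.
        key = (gene, (x // bin_size) * bin_size, (y // bin_size) * bin_size)
        binned[key] += count
    return binned
```

Calling the function with 20, 50, 100, and 200 on the same records gives the four bin levels to compare downstream.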

Q What is the relationship between the sizes of a single biological cell and a square bin spot?
A

Given that the diameter of a typical mammalian cell is approximately 10 μm, a cell is comparable to a bin20 spot, which is 10 μm x 10 μm in area, or to a bin14 spot, which has a diagonal of about 10 μm.
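
The arithmetic behind these figures can be written out as follows. The spot pitch of 0.5 μm is not stated directly above; it is inferred here from bin20 being 10 μm wide. The function names are hypothetical.

```python
import math

# Inferred from the text: a bin20 spot is 10 um wide, so one spot spans 0.5 um.
SPOT_PITCH_UM = 10 / 20


def bin_side_um(bin_size):
    """Side length of a square bin in micrometers."""
    return bin_size * SPOT_PITCH_UM


def bin_diagonal_um(bin_size):
    """Diagonal of a square bin in micrometers."""
    return bin_side_um(bin_size) * math.sqrt(2)
```

Under this assumption, bin14 has a side of 7 μm and a diagonal of roughly 9.9 μm, which matches the approximate 10 μm diagonal quoted above.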

Q Can the library of STOmics Stereo-seq Transcriptomics Set be pooled with other libraries for sequencing?
A

No. The Stereo-seq Transcriptomics Set library requires different sequencing protocols and sequencing reagents from other libraries.

Q Which sequencing platform is applicable to Stereo-seq Transcriptomics Set library?
A

The Stereo-seq Transcriptomics Set library can be sequenced on the DNBSEQ-G400RS, MGISEQ-2000RS and DNBSEQ-T7RS platforms.

Q Can samples of different tissue types be analyzed on the same Stereo-seq Transcriptomics Chip T?
A

No. Samples of different tissue types need to be tested for different permeabilization conditions. In addition, different samples should not be processed on the same chip, to prevent cross-contamination.

Q Can I use used chips to practice tissue section placement?
A

Yes, used chips can be used for practice.

Q Are the spatial barcodes of each capture area on the Stereo-seq Chip T different?
A

Yes, they are different.
