Our research focus

Develop a novel method for quantifying scRNA-seq data

Single-cell RNA sequencing (scRNA-seq) has transformed biomedical research. The 10x Genomics scRNA-seq technology can measure the expression of thousands of genes in hundreds of thousands of individual cells simultaneously. Quantifying the UMI (Unique Molecular Identifier) data generated by this technology is challenging because of the large data volume and the complexity of quantification. We are developing a new algorithm for accurate and efficient quantification of these data. The successful development of this new bioinformatics tool will significantly reduce data analysis time and improve the accuracy of gene expression quantification.
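To illustrate what UMI quantification involves at its simplest, the sketch below counts distinct UMIs for each cell barcode and gene, treating reads that share a barcode, UMI and gene as copies of one molecule. It is a toy example, not the algorithm under development: the function name `count_umis`, the record layout and the barcode and gene labels are all assumptions made for illustration, and real pipelines must also handle sequencing errors in UMIs and reads that map to several genes.

```python
from collections import defaultdict


def count_umis(records):
    """Count distinct UMIs per (cell barcode, gene) pair.

    `records` is an iterable of (cell_barcode, umi, gene) tuples, one per
    mapped read. Reads sharing all three values are treated as duplicates
    of a single molecule and are counted once.
    """
    umis = defaultdict(set)
    for cell, umi, gene in records:
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Hypothetical reads: barcodes, UMIs and gene labels are made up.
reads = [
    ("CELL_A", "AACG", "GENE_X"),
    ("CELL_A", "AACG", "GENE_X"),  # duplicate of the read above
    ("CELL_A", "TTGC", "GENE_X"),
    ("CELL_B", "AACG", "GENE_X"),
    ("CELL_B", "GGTA", "GENE_Y"),
]
counts = count_umis(reads)
```

Here `counts` maps each (cell, gene) pair to a molecule count, so the duplicated read in `CELL_A` contributes only once.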

Develop a new method for mapping long sequencing reads

Long-read sequencing technologies, such as Nanopore and PacBio, have the potential to sequence whole gene transcripts and discover long-range genomic mutations, among other applications. A significant challenge in analysing long-read data is read mapping, which aligns each read to a reference genome. This is a critical step for identifying full gene transcripts and detecting the breakpoints of long-range mutations. We will extend the ‘seed-and-vote’ read mapping paradigm, which we successfully developed for mapping short reads, to create a new method for long-read mapping. The successful development of this new tool is likely to result in the discovery of new gene transcripts and mutations in diseases such as cancer.
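The general idea behind seed-and-vote mapping can be sketched in a few lines: short seeds taken from a read are looked up in an index of the reference, and each hit votes for the reference position where the read would start. The position with the most votes is reported. This is a minimal sketch on a toy sequence, assuming exact seed matches and a single best location; the function names, seed length and spacing are illustrative choices, not the lab's software or its planned long-read method.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, step):
    """Return (best_start, votes) for a read, or (None, 0) if no seed hits.

    Seeds of length k are taken every `step` bases. Each exact hit votes
    for the implied start of the read (hit position minus seed offset).
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            votes[pos - offset] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]


# Toy reference and a read copied from position 10 with one mismatch.
reference = "ACGTTGCAAGGCTTACCGATAGCTAGGATCC"
read = reference[10:15] + "T" + reference[16:26]

index = build_index(reference, k=4)
location, votes = seed_and_vote(read, index, k=4, step=4)
```

In this example the seed that overlaps the mismatch finds no hit, while the other three seeds all vote for position 10, so the read is still placed correctly. Tolerating errors in this way is one reason voting schemes are attractive for noisy reads.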

Reconstruct a gene regulatory network to elucidate the differentiation of CD8+ T cells

Understanding the molecular mechanisms underlying the differentiation of CD8+ T cells will not only generate new knowledge in the field of immunity but is also important for developing new strategies to improve immunotherapy. We will use omics data generated from mice with chronic infection to reconstruct a gene regulatory network (GRN) containing interactions between key transcription factors and their target genes, to elucidate how the differentiation of CD8+ T cells is regulated. We will then investigate how this GRN is perturbed in metastatic breast cancer, using a mouse model of this disease and cancer patient sequencing data available in The Cancer Genome Atlas (TCGA) database. An outcome of this study will be a gene signature that can be used to predict which metastatic breast cancer patients will respond to immunotherapy.

Provide bioinformatics support to biology labs

Modern biomedical research makes use of powerful sequencing technologies such as single-cell RNA sequencing. Bioinformatics support for fast and accurate analysis of sequencing data is important for the success of such research. Our lab collaborates with almost all the labs at ONJCRI to provide strong support for their bioinformatics needs. We specialise in analysing data generated from a range of sequencing technologies, including bulk RNA-seq, single-cell RNA-seq, single-cell TCR-seq, ChIP-seq and ATAC-seq. We are also experienced in analysing public datasets generated by large consortia such as TCGA. We have contributed to many discoveries made in projects related to a wide range of cancers, such as GI cancer, breast cancer and brain cancer.

Fast facts

Bioinformatics is an interdisciplinary field that involves computer science, mathematics, genomics and biology. Computer scientists develop algorithms and software tools to analyse genomics data, which are usually large in scale and include genome-wide molecular data such as gene expression data, mutation data, transcription factor binding data and chromatin accessibility data.

Cancer genomics is the study of the molecular changes that occur in a cancer genome. It provides a powerful and time-efficient approach for detecting new genes and mutations associated with cancer.

A sequencing technology that enables the digital discovery of genes that may be turned on or off in diseases such as cancer.

A gene regulatory network is a graph in which nodes represent genes and edges represent interactions. An interaction can be direct or indirect. A direct interaction is a physical interaction, for example DNA binding or phosphorylation. An indirect interaction is usually co-expression of two genes. In a gene regulatory network, a gene may interact with two or more genes.
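The graph described above can be represented with a simple data structure. The following is a minimal sketch, assuming edges are stored as directed from a source gene to a target gene and labelled as either direct or indirect; the class name `GeneNetwork`, its methods and the gene labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class GeneNetwork:
    """Toy gene regulatory network: genes are nodes, interactions are edges."""

    KINDS = ("direct", "indirect")

    def __init__(self):
        # source gene -> {target gene: interaction kind}
        self._edges = defaultdict(dict)

    def add_interaction(self, source, target, kind):
        """Record an interaction; `kind` must be 'direct' or 'indirect'."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown interaction kind: {kind!r}")
        self._edges[source][target] = kind
        self._edges.setdefault(target, {})  # make sure the target is a node

    def genes(self):
        """Return all genes in the network, sorted by name."""
        return sorted(self._edges)

    def targets(self, gene, kind=None):
        """Return the genes that `gene` interacts with, optionally by kind."""
        return sorted(
            target
            for target, edge_kind in self._edges.get(gene, {}).items()
            if kind is None or edge_kind == kind
        )


# Placeholder labels, not real regulatory relationships.
net = GeneNetwork()
net.add_interaction("TF_A", "GENE_X", "direct")    # e.g. DNA binding
net.add_interaction("TF_A", "GENE_Y", "indirect")  # e.g. co-expression
net.add_interaction("TF_B", "GENE_X", "direct")
```

Calling `net.targets("TF_A")` lists both genes linked to `TF_A`, while `net.targets("TF_A", kind="direct")` keeps only the physical interaction.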

Recent publications

Nucleic Acids Research

Dividing out quantification uncertainty allows efficient assessment of differential transcript expression with edgeR

DOI: 10.1093/nar/gkad1167

7 December 2023

Life Science Alliance

Mechanisms of cellular crosstalk in the gastric tumor microenvironment are mediated by YAP1 and STAT3

DOI: 10.26508/lsa.202302411

13 November 2023

Nature Communications

A tuft cell - ILC2 signaling circuit provides therapeutic targets to inhibit gastric metaplasia and tumor development

DOI: 10.1038/s41467-023-42215-4

28 October 2023


Our team

Meet our researchers

  • Prof Wei Shi - Head, Bioinformatics and Cancer Genomics Laboratory
  • Yang Liao - Postdoctoral Research Fellow
  • David Chisanga - Honorary