Our research focus

Develop a novel method for quantifying scRNA-seq data

Single-cell RNA sequencing (scRNA-seq) has transformed biomedical research. The 10x Genomics scRNA-seq technology can measure the expression of thousands of genes in hundreds of thousands of individual cells simultaneously. Quantifying the UMI (Unique Molecular Identifier) data generated by this technology is challenging because of the large volume of data and the complexity of quantification. We are developing a new algorithm for accurate and efficient quantification of these data. The successful development of this new bioinformatics tool will significantly reduce data analysis time and improve the accuracy of gene expression quantification.
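For readers new to UMI data, the basic idea behind UMI quantification can be sketched in a few lines of Python. This is a generic toy illustration, not the algorithm under development in the lab: the record layout, barcodes, gene names and the function name `count_umis` are all assumptions made for the example. Reads that share the same cell barcode, gene and UMI are treated as copies of one original molecule and counted once.

```python
from collections import defaultdict


def count_umis(records):
    """Count distinct UMIs for each (cell barcode, gene) pair.

    `records` is an iterable of (cell_barcode, umi, gene) tuples,
    a hypothetical simplified layout used only for this sketch.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in records:
        # A set keeps each UMI once, so duplicate reads collapse.
        umis_seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy input: the first two reads share a barcode, gene and UMI,
# so they count as a single molecule.
records = [
    ("CELL_A", "AACG", "GeneX"),
    ("CELL_A", "AACG", "GeneX"),
    ("CELL_A", "TTGC", "GeneX"),
    ("CELL_B", "AACG", "GeneY"),
]
counts = count_umis(records)
```

Real pipelines must also handle sequencing errors in barcodes and UMIs and reads that map to several genes, which is part of what makes quantification complex at scale.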

Develop a new method for mapping long sequencing reads

Long-read sequencing technologies, such as Nanopore and PacBio, have the potential to sequence whole gene transcripts and discover long-range genomic mutations, among other applications. A significant challenge in analysing long-read data is read mapping, which aligns each read to a reference genome. This is a critical step for successfully identifying full gene transcripts and detecting the breakpoints of long-range mutations. We will extend the ‘seed-and-vote’ read mapping paradigm, which we successfully developed for mapping short reads, to develop a new method for long-read mapping. The successful development of this new tool is likely to result in the discovery of new gene transcripts and mutations in diseases such as cancer.
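The general idea of seed-and-vote mapping can be illustrated with a toy Python sketch. This is a simplified illustration under stated assumptions, not the lab's implementation: the function names, the k-mer index, the seed length and the step size are all hypothetical choices for the example. Short seeds are taken from the read, each seed is looked up in an index of the reference, and every hit votes for the reference position where the read would start. The position with the most votes is reported.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, step):
    """Return (best_start, votes) for the read, or None if no seed hits."""
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        seed = read[offset:offset + k]
        for pos in index.get(seed, []):
            # A seed at read offset `offset` that hits reference position
            # `pos` implies the read starts at `pos - offset`.
            votes[pos - offset] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


reference = "ACGTTGCATGCCGATAGGCTTAACGGATCC"
index = build_index(reference, k=4)

# The read copies reference[8:20] with one mismatch (A changed to C).
read = "TGCCGCTAGGCT"
best = seed_and_vote(read, index, k=4, step=2)
```

In this toy case the seeds that overlap the mismatch find no hit, while the remaining seeds all vote for the same start position, so the read is still placed correctly without aligning every base.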

Reconstruct a gene regulatory network to elucidate the differentiation of CD8+ T cells

Understanding the molecular mechanisms underlying the differentiation of CD8+ T cells will not only generate new knowledge in the field of immunity but is also important for developing new strategies for improved immunotherapy. We will use omics data generated from mice with chronic infection to reconstruct a gene regulatory network (GRN) containing the interactions between key transcription factors and their target genes, to elucidate how the differentiation of CD8+ T cells is regulated. We will then investigate how this GRN is perturbed in metastatic breast cancer, using a mouse model of the disease and cancer patient sequencing data available in The Cancer Genome Atlas (TCGA) database. An outcome of this study will be a gene signature that can be used to predict which metastatic breast cancer patients will respond to immunotherapy.

Provide bioinformatics support to biology labs

Modern biomedical research makes use of powerful sequencing technologies such as single-cell RNA sequencing. Bioinformatics support for fast and accurate analysis of sequencing data is important for the success of such research. Our lab collaborates with almost all the labs at ONJCRI to provide strong support for their bioinformatics needs. We specialise in analysing data generated from a range of sequencing technologies, including bulk RNA-seq, single-cell RNA-seq, single-cell TCR-seq, ChIP-seq and ATAC-seq. We are also experienced in analysing public datasets generated by large consortia such as TCGA. We have contributed to many discoveries made in projects related to a wide range of cancers, such as GI cancer, breast cancer and brain cancer.

Fast facts

Bioinformatics is an inter-disciplinary field that involves computer science, mathematics, genomics and biology. Computational scientists develop algorithms and software tools to analyse genomics data, which are usually large in scale, including genome-wide molecular data such as gene expression data, mutation data, transcription factor binding data and chromatin accessibility data.

Cancer genomics is the study of the molecular changes that occur in a cancer genome. It provides a powerful and time-efficient approach for detecting new cancer-associated genes and mutations.

A sequencing technology that enables the digital discovery of genes that may be turned on or off in diseases such as cancer.

A gene regulatory network is a graph in which nodes represent genes and edges represent interactions. An interaction can be direct or indirect. A direct interaction is a physical interaction such as DNA binding or phosphorylation. An indirect interaction is usually the co-expression of two genes. In a gene regulatory network, a gene may interact with two or more genes.
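The graph described above can be represented with a very small data structure. The Python sketch below is illustrative only: the class and method names are assumptions, and the gene names (`TF1`, `GeneA`, `GeneB`) are placeholders rather than real regulatory relationships. Each edge records two genes and whether their interaction is direct or indirect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    kind: str  # "direct" (physical) or "indirect" (e.g. co-expression)


class GeneRegulatoryNetwork:
    """Genes are nodes; interactions are edges between them."""

    def __init__(self):
        self.edges = []

    def add_interaction(self, source, target, kind):
        if kind not in ("direct", "indirect"):
            raise ValueError("kind must be 'direct' or 'indirect'")
        self.edges.append(Interaction(source, target, kind))

    def partners(self, gene):
        """Return the sorted list of genes that interact with `gene`."""
        found = set()
        for edge in self.edges:
            if edge.source == gene:
                found.add(edge.target)
            elif edge.target == gene:
                found.add(edge.source)
        return sorted(found)


# Placeholder network: one hypothetical transcription factor, two genes.
network = GeneRegulatoryNetwork()
network.add_interaction("TF1", "GeneA", "direct")
network.add_interaction("TF1", "GeneB", "direct")
network.add_interaction("GeneA", "GeneB", "indirect")
tf1_partners = network.partners("TF1")
```

Here `TF1` interacts with two genes, matching the point that a single gene may have several interaction partners in the network.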

Recent publications

Cell Reports

Inhibition of HCK in myeloid cells restricts pancreatic tumor growth and metastasis

DOI: 10.1016/j.celrep.2022.111479

11 October 2022

MYB orchestrates T cell exhaustion and response to checkpoint inhibition

DOI: 10.1038/s41586-022-05105-1

17 August 2022

Immunology & Cell Biology

CD137L and CD4 T cells limit BCL6-expressing pre-germinal center B cell expansion and BCL6-driven B cell malignancy

DOI: 10.1111/imcb.12578

2 August 2022

Our team

Meet our researchers

  • Prof Wei Shi - Head, Bioinformatics and Cancer Genomics Laboratory
  • Yang Liao - Postdoctoral Research Fellow
  • David Chisanga - Honorary
  • Jennifer Snowball - PhD Student