
In the genomics era, the promise of precision medicine and tailored diagnostics is only as good as the datasets, which makes it imperative that those sets reflect the diversity of the human population. Populations from the African continent, the most genomically diverse region in the world, are underrepresented in current genomic data sets. Nowhere is closing this data gap more urgent than with triple-negative breast cancer (TNBC), which has a disproportionately high incidence in women of African descent and limited therapeutic options.
The results of a recent study that profiled 30 TNBC tumor samples from Angola and Cape Verde highlight why comprehensive population data is so important (1). Whole-exome sequencing (WES), enriched with untranslated regions (UTRs), showed that 86% of somatic variants in these samples had never been reported before. WES can be especially useful when working with limited or degraded samples, such as the FFPE samples used in this study, because it yields insights from samples that are impractical for whole-genome sequencing (WGS). This study’s results emphasize the value of expanding omics cancer research so that it includes all populations and areas of the genome.
Triple-Negative Breast Cancer in Sub-Saharan Populations
TNBC is more common, more aggressive and often diagnosed at a younger age in African ancestry populations. Unfortunately, our genomic references—from The Cancer Genome Atlas (TCGA) or other major studies—are largely dominated by European ancestry cohorts.
This new study examined TNBC tumors from patients in Angola and Cape Verde, expanding the genomic information for TNBC to include samples from a population that is rich in genetic diversity, heterogeneous and in many cases novel. Key findings of this study include:
- Lower-than-expected frequency, but higher functional impact, of mutations in the tumor suppressor gene TP53. The sub-Saharan samples showed a significantly lower occurrence of TP53 mutations (23%) than African American (58%) and European American (69%) cohorts. However, analysis of the sub-Saharan TP53 mutations using the Combined Annotation Dependent Depletion (CADD) algorithm found that nearly all the mutations present in this cohort were predicted to be highly deleterious, meaning that TP53 mutations in this population might be more likely to disrupt protein function. Despite their lower frequency, these mutations could therefore carry greater clinical significance.
- New candidate driver genes that could play a role in TNBC progression in African populations. An important result from this study was the identification of potential new driver genes: TTN, CEACAM7, DEFB132, COPZ2 and GAS1. These less-characterized or context-specific genes were often mutated in the African cohort and could be functionally significant. Unlike the well-studied TP53 gene, these genes are under-represented in Western-centric datasets. As a result, less information is available about them, although CEACAM7, DEFB132 and COPZ2 have been associated with some cancers. Further analysis of these genes suggested they could have functional implications. For example, DEFB132 is involved in immune cell recruitment, and COPZ2 could have a tumor-suppressor role via its associated miR-152 microRNA.
- A high prevalence of regulatory mutations within UTRs and other cis-regulatory elements supports expanding cancer genomics studies beyond protein-coding regions. This study extended sequencing analysis beyond standard exons, using a WES panel that was enriched with UTRs. The results showed that a substantial portion of potentially impactful somatic mutations in the African cohort lay outside of exons. These mutations overlapped with candidate cis-regulatory elements (cCREs). Many of the mutations occur in 5’ or 3’ UTRs, which can influence gene expression, mRNA stability and translation efficiency. This means they could affect genes tied to biological processes such as cell cycle regulation, immune response and signal transduction.
These findings reinforce the importance of broadening both the who and the what of cancer genomics research. The TP53 results suggest that mutations in this gene may function differently across populations. And by expanding beyond coding regions into untranslated regions and regulatory elements, the study found frequently mutated genes that are under-represented in standard datasets but could play a significant role in TNBC in people of African descent.
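To give a rough sense of how much uncertainty a 30-sample cohort carries, the sketch below converts the reported 23% TP53 mutation frequency into an approximate sample count and a 95% Wilson score interval. The cohort size (30) and the frequency (23%) come from the text above. The rounding to a whole number of samples, the choice of the Wilson interval and the function name `wilson_interval` are assumptions made for illustration; this is not the analysis performed in the study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Values taken from the article text.
COHORT_SIZE = 30
REPORTED_FREQ = 0.23

# Assumption: the reported percentage is a rounded count out of 30 samples.
approx_mutated = round(COHORT_SIZE * REPORTED_FREQ)

low, high = wilson_interval(approx_mutated, COHORT_SIZE)
print(f"Approximate TP53-mutated samples: {approx_mutated} of {COHORT_SIZE}")
print(f"95% Wilson interval for the frequency: {low:.1%} to {high:.1%}")
```

With so few samples the interval is wide, which is one practical reason that larger and more representative cohorts matter for comparisons between populations.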
When Samples are Precious, Robust and Reliable DNA Extraction is Needed
One of the first steps in any genomics study is nucleic acid extraction. Because samples often come from formalin-fixed, paraffin-embedded (FFPE) blocks, which are limited and sometimes irreplaceable, the DNA extraction method needs to work well and work the first time.
With precious samples such as those used in this study, the reliability of the DNA extraction method is just as critical as the downstream assay. The authors used the Maxwell® RSC Instrument and the Maxwell® RSC FFPE DNA kit for DNA extraction. The benchtop automated system and cartridge-based extraction kits offer consistent yields of high-quality DNA from challenging sample types while minimizing error and saving hands-on time with preprogrammed, load-and-go methods.
Final Thoughts: Beyond Inclusion, Toward Innovation
The results of this study challenge researchers to rethink what should be considered the “standard” in cancer genomics. Recognizing that most TP53 mutations, although less frequent in this African cohort, are predicted to be highly deleterious enriches our overall understanding of this tumor suppressor gene. In addition, new insights into the role of candidate driver genes such as TTN, CEACAM7, DEFB132, COPZ2 and GAS1 could lead to new therapeutic directions, especially for TNBC cases that remain refractory to conventional care.
The results of the study also serve as a reminder that we should always critically evaluate our assumptions—not just in hypothesis testing, but also in how we build the foundations of our studies. Developing reference datasets that reflect the diversity of the human population is truly a prerequisite for robust, reproducible and clinically relevant science.
Reference
- Pinto, R.J. et al. (2025) Coding and regulatory somatic profiling of triple-negative breast cancer in sub-Saharan African patients. Sci. Rep. 15, 10325.
Maxwell® RSC Instruments are For Research Use Only. Not for Use in Diagnostic Procedures.

Kelly Grooms
