Shakespeare and cancer diagnoses: how bard can it be?

July 24, 2013

Shakespeare's plays and cancer: two seemingly unrelated topics with an underlying common thread.

The techniques that computer scientists use to analyse the Bard's works are also used in cancer diagnostic procedures – and it all comes down to quantifying subtle variations in attributes present in large amounts of data.

In a collaboration published last month in the journal PLoS ONE, we applied a simple and novel ranking method to a dataset of plays of undisputed authorship from the Shakespearean era.

We ranked the frequency of words by playwrights John Fletcher, Ben Jonson, Thomas Middleton and William Shakespeare, testing all 55,055 unique words used in 168 plays.

The results of using this new method were very encouraging. For some authors, such as Shakespeare, the slight under-use of particular words provided better markers of individuation than over-used words. We found Shakespeare's four lowest ranked words to be:

  • all
  • to (infinitive)
  • now
  • ye

The last one was also among the 20 lowest-ranked words for Jonson and Middleton but, interestingly, was the highest-ranked word for Fletcher. His preference for "ye", compared with its average use across the plays that are not his, is now very clear.
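To make the idea of ranking over-used and under-used words concrete, here is a minimal Python sketch. It is not the scoring method from the PLoS ONE paper, which this article does not spell out. It assumes a simple standardised difference between an author's mean relative word frequency and the mean across everyone else's plays. The function names and the toy "plays" are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

def relative_freqs(tokens):
    """Relative frequency of each word within one play."""
    counts = Counter(tokens)
    return {word: n / len(tokens) for word, n in counts.items()}

def rank_words(plays, author):
    """Rank words from most under-used to most over-used by `author`.

    `plays` is a list of (author_name, tokens) pairs. The score is an
    assumed stand-in for the paper's method: the difference in mean
    relative frequency, divided by the spread across all plays.
    """
    freqs = [(name, relative_freqs(tokens)) for name, tokens in plays]
    vocab = sorted(set().union(*(f.keys() for _, f in freqs)))
    scores = {}
    for word in vocab:
        own = [f.get(word, 0.0) for name, f in freqs if name == author]
        rest = [f.get(word, 0.0) for name, f in freqs if name != author]
        spread = pstdev(own + rest) or 1.0  # avoid dividing by zero
        scores[word] = (mean(own) - mean(rest)) / spread
    return sorted(scores, key=scores.get)

# Invented toy data, only to show the call pattern.
toy_plays = [
    ("Fletcher", "ye shall see ye come".split()),
    ("Fletcher", "ye know ye well".split()),
    ("Shakespeare", "you shall see all now".split()),
    ("Jonson", "you know all well now".split()),
]

ranking = rank_words(toy_plays, "Fletcher")
print("most under-used:", ranking[:3])
print("most over-used:", ranking[-1])
```

In this toy example "ye" comes out as the most over-used word for the invented Fletcher texts, mirroring the pattern described above. A real analysis would run over full play texts and many thousands of words.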

These are quantifiable markers that can objectively measure an author's creative mind at work.

The idea that variations in the use of words over time can give clues about psychological problems, or even reveal markers of depression in the work of suicidal poets, has already been discussed.

But this simple idea for a new scoring method may also pay dividends in other areas, such as diagnostics and medical algorithms.

Information-based medicine

In a study from 2009, Shakespeare and other English Renaissance authors were studied using methods based on information theory (the scientific field that deals with the quantification of information).

The authors of that study observed that Shakespeare's work seemed remarkable for its homogeneity in the probability of use of common words and for its closeness to the overall average use of words at the time. This naturally triggers a central question:

Would it be possible to find some distinctive signatures of individual authors by looking at the fluctuations of the observed frequencies of words used?

So, you may be asking yourself:

Why would this be a question of interest for the analysis of biomedical data?

The identification of biological markers is critical for information-based medicine. Such biomarkers are quantitative indicators that can be objectively measured and indicate normal biological processes, the existence of pathogenic processes, or altered pharmacologic responses to a therapeutic intervention.

Biomarkers are needed for cancer diagnostics and early screening (for example, levels of the enzyme Kallikrein-3, also known as PSA or prostate-specific antigen, are often elevated in men with prostate cancer or other prostate disorders).

Biomarkers are central to the core aims of personalised medicine and the quest to individualise risk, identification, therapies and the post-treatment monitoring of possible recurrence.

But the use of a single biomarker is controversial (even for established biomarkers such as PSA for prostate cancer), so current medical research advocates finding panels of biomarkers.

Statistical scores are usually employed to rank and identify the best biomarkers when individually tested. But to identify panels it is important to find the best combination of biomarkers. Other mathematical methods are needed.
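The difference between ranking biomarkers one at a time and searching for the best combination can be sketched in a few lines of Python. This is an illustration only, not our team's method: it assumes an exhaustive search over small panels, scored by a nearest-centroid rule on invented data. All names are hypothetical.

```python
from itertools import combinations

def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def panel_accuracy(samples, labels, panel):
    """Fraction of samples classified correctly by a nearest-centroid
    rule that only looks at the marker columns listed in `panel`."""
    projected = [[s[i] for i in panel] for s in samples]
    centroids = {
        c: centroid([p for p, y in zip(projected, labels) if y == c])
        for c in sorted(set(labels))
    }

    def predict(point):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )

    hits = sum(predict(p) == y for p, y in zip(projected, labels))
    return hits / len(labels)

def best_panel(samples, labels, size):
    """Try every panel of `size` markers and keep the most accurate."""
    n_markers = len(samples[0])
    return max(
        combinations(range(n_markers), size),
        key=lambda panel: panel_accuracy(samples, labels, panel),
    )

# Invented data: three markers per sample, label 1 = disease, 0 = healthy.
toy_samples = [
    [3, 1, 5], [1, 3, 1], [2, 2, 3],   # disease
    [2, 0, 3], [0, 2, 1], [1, 1, 5],   # healthy
]
toy_labels = [1, 1, 1, 0, 0, 0]

print(best_panel(toy_samples, toy_labels, 1))
print(best_panel(toy_samples, toy_labels, 2))
```

In the toy data no single marker separates the two groups, but the first two markers together do. Exhaustive search like this only works for a handful of markers, and the sketch scores panels on the same data it learns from. Real datasets need the more scalable optimisation methods and proper validation.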

Our team uses combinatorial optimisation approaches to do so (combinatorial optimisation is the branch of computer science and discrete applied mathematics that deals with these optimal selection problems), not only in cancer and the selection of therapeutic combinations but also in multiple sclerosis and in Alzheimer's disease.

Using panels of biomarkers, it is possible to improve the classification accuracy of the tests, boosting sensitivities and specificities to approximately 90%, as we have recently shown in studies in Alzheimer's disease.
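For readers unfamiliar with the terms, sensitivity is the share of people with the disease that a test correctly flags, and specificity is the share of healthy people it correctly clears. A minimal Python sketch follows, using invented labels rather than data from any study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels,
    where 1 means disease and 0 means healthy."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Invented example: 10 patients and 10 healthy controls,
# with one mistake in each group.
toy_truth = [1] * 10 + [0] * 10
toy_predicted = [1] * 9 + [0] + [0] * 9 + [1]

sens, spec = sensitivity_specificity(toy_truth, toy_predicted)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```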

Finding the best fit

This is not the first time that combinatorial optimisation has been used at the University of Newcastle's Centre for Bioinformatics, Biomarker Discovery and Information-based Medicine (CIBM) both in cancer and in literature and linguistic studies.

In a different paper published in 2006, combinatorial optimisation methods were used to produce a consensus phylogenetic tree of 84 Indo-European languages. In that same study, we showed how to generate a classification of several different cancer cell lines.

Again, our approach was heavily based on combinatorial optimisation.

The application of these more sophisticated methods is necessary for personalised medicine as they can be used to subtype different types of cancers at the molecular level by analysing patterns of variations across different samples.

While our team's work concentrates on developing molecular signatures of disease states based on a combination of biomarkers (as opposed to single scores like the novel one used in our study), we also recognise the usefulness of this new score, presented in the analysis of Shakespeare's works, for a rapid preliminary analysis of large biomarker datasets.

Our team now routinely analyses large biomedical datasets with this new method. As in the Shakespeare study mentioned above, it has served to identify potentially mislabelled samples, outliers of a major class of interest of a disease, and other potential pitfalls identifiable and avoidable during early processing of the data.

For our institution, our new contribution counts as one of those success stories of collaboration across faculties and disciplines. It is a rare curiosity-driven basic research endeavour, of the kind that generally does not get the nod from national funding agencies that only look to support translational medical research with simplistic definitions.

These "unthinkable quests" are vital to spin off breakthrough translational research.

They need to be protected, supported and developed. Computer science provides the core expertise that may lead to new scalable ways to address the tidal wave of data coming from the life sciences, and that may ultimately result in a blessing for your health.

More information: www.plosone.org/article/info%3 … journal.pone.0066813
