When I last wrote about AI on this blog three years ago, I spoke of it being a tool with the potential to transform scientific discovery, but the application I described was primarily theoretical. For AI to be a meaningful tool in R&D, I argued, we needed better sources of “truth” – better data sets that AI tools could query and learn from over time – and technology capable of integrating multiple steps into a semi-automated system. My message was that AI-enabled drug discovery was coming…someday.
Fast forward to 2025, and that someday is now.
We’ve seen an explosion in the availability and capability of AI tools. Just 10 months after I wrote about the theoretical possibilities of AI in biopharma, OpenAI debuted ChatGPT. Shortly after that, we saw the rollout of Microsoft Copilot and Meta AI. We now have immense computational power at our fingertips, with programs specifically designed to query biological problems. Combined with the ingenuity of skilled scientists, who can define the research problem and generate curated datasets that will enable solutions, AI has become an important and practical tool that is helping researchers accelerate discovery (link to Google DeepMind podcast on this topic here).…
At the beginning of the summer, I had the opportunity to join the team at Royalty Pharma for a great event at MIT (link here). It was an interesting time for me as I was thinking about the new role I was about to take at Bristol Myers Squibb. While I had certainly been “a leader” for several years now, I was pushing myself to think through and question my perspectives on R&D now that I was taking on increasing responsibilities as “the leader” of the research organization.
And so the presentation opportunity with Royalty really pushed me to articulate my views in a way that would hopefully resonate with and inspire others. I titled my presentation “Bullseye.Aim.Fire” and then renamed it “Increasing R&D Productivity to Deliver Transformational Medicines” so the topic would be more obvious. What I’m really sharing in the presentation is my fundamental belief about R&D, linking together several factors that I see as mission critical.
To me, it really all comes down to causal human biology. In order to be successful, we must understand the cause-and-effect relationship between perturbing a particular biological target with a medicine and the outcome that will then impact human physiology.…
For me, the most enjoyable aspect of discovery research is exploring the unknown. It is about having a big idea; believing in that big idea based on a scientific belief framework; coming to a crossroads in the validity of the big idea, which is usually marked by deep uncertainty and skepticism; making a data-driven scientific decision to proceed (or not) to the next inflection point of testing the big idea; and ultimately arriving at a conclusion of whether the big idea is true.
Unfortunately, most of these scientific adventure stories are lost in the way we communicate about science. We tell a story to communicate the final message – we have a new medicine that is effective in treating patients – as that is the cleanest way to communicate to an audience not familiar with the gory details of the discovery. Such retrospective narratives are also the simplest way to communicate the validity of the big idea, not the tortuous and often complicated path to arrive at truth.
But such retrospective narratives don’t capture the immensely personal nature of our research discoveries. Moreover, such retrospective narratives often make the big idea seem preordained or obvious, when the big idea was anything but.…
[ I am an employee of Bristol Myers Squibb. The views expressed here are my own, assuming I am real and not a humanoid. ]
In the original Blade Runner (1982), Harrison Ford’s character, Deckard, implements a fictitious Voight-Kampff test to measure bodily functions such as heart rate and pupillary dilation in response to emotionally provocative questions. The purpose: to establish “truth”, i.e., determine whether an individual is a human or a bioengineered humanoid known as a replicant.
While the Voight-Kampff test was used to establish truth for humans vs replicants, the concept of “truth” is central to neural networks used in machine learning and artificial intelligence (AI). And for AI to be effective in drug discovery and development, it is critical to ask a fundamental question: what is “truth” in drug discovery and development?
INTRODUCTION
I recently read the book Genius Makers by Cade Metz and was reminded of the long history of machine learning, neural networks, and artificial intelligence (AI). This is a field more than 60 years in the making, with slow growth for the first 50 years – AI was founded as an academic discipline in 1956 – and exponential growth in the last 10. The original mathematical framework of neural networks was created in the 50’s (perceptron), 60’s and 70’s (backpropagation), but went largely unappreciated outside of academia, as the practical applications were few and far between.…
[Disclaimer: I am an employee of Bristol Myers Squibb. The views expressed here are my own.]
One of my favorite questions to ask is: “What captures your imagination?” At a recent family dinner, responses were varied but encouraging for the next generation: black swan events, comparative anatomy & human physiology, space exploration & intelligent life beyond our planet, and more. My response was programmable therapeutics, a topic which I have blogged about in the past.
In this blog I define programmable therapeutics and provide a few recent examples (severe combined immune deficiency and mRNA vaccines). As you will see, programmable therapeutics is more than pure imagination – we are seeing this new concept evolve before our very eyes.
What is the concept of programmable therapeutics?
While there are different definitions of the concept of programmable therapeutics (see a16z talk; programmable cells; synthetic biology; CRISPR base editing), my definition of programmable therapeutics relates to a platform with modular components that can shorten the time from new target to drug candidate and ultimately regulatory trials that can lead to an approved medicine.
For most drug development programs, the identification of a drug target represents the start of a long journey that is highly artisanal.…
[I am an employee of Celgene. The views expressed here are my own.]
In the Wizard of Oz, Dorothy clicks her heels and hopes for re-entry from her dream world by repeating, “There’s no place like home…there’s no place like home…” I often feel that many in the genetics community look at their human genetics data with the same youthful optimism as Dorothy – clicking their genetic heels and wishing “my genetic discovery will become a drug…my genetic discovery will become a drug…” But without rigor and discipline, such heel-clicking won’t overcome many of the challenges that face drug hunters along the tortuous journey from a genetic idea to a new medicine.
In this blog, I discuss a recent study on the genetics of multiple sclerosis (MS) published in Science (see here). This is a beautiful study that substantially advances the genetic landscape of patients with a devastating disease. However, the study falls short in terms of the application of human genetics to drug discovery. To chart a course for the future, I introduce the concept of mechanism, magnitude and markers (oh my!), which I refer to as the three M’s. …
[I am an employee of Celgene. All views expressed here are my own.]
At the 2018 Annual Atlas Ventures Retreat (AVR), I participated in a panel on Digital Health (along with David Schenkhein, John Reed, Scott Brun). The panel discussion was led by Michael Ringel, who also provided an excellent introduction to Digital Health (his slides here). While there are many aspects to digital health, we focused on the application to drug discovery and development. In this blog, the main point I want to emphasize is that the digital health tipping point will occur when products that benefit patients (e.g., therapeutics) facilitate the integration of digital health initiatives that currently reside in silos.
What is digital health in relation to drug discovery & development? There are many different definitions with many different components, and this, in essence, is part of the challenge (see Figure below). In early discovery biology, digital health represents various data types (e.g., human genetics, ‘omics data, cell models) and analytical methods (e.g., simple regression, machine learning, artificial intelligence). In late discovery biology, digital health includes sophisticated analytical methods for in silico drug design and organoid models to recapitulate the human system for pre-clinical testing.…
[I am an employee of Celgene. All opinions expressed here are my own.]
A meeting was recently convened to discuss a roadmap for understanding the genetics of common diseases (search Twitter for #cdcoxf18). I presented my vision of a genetics dose-response portal (slides here; link to related 2018 ASHG talk here). The organizers (@RachelGLiao, @markmccarthyoxf, @ceclindgren, Rory Collins [Oxford], Judy Cho [New York], @NancyGenetics, @dalygene, @eric_lander) asked participants to share their vision. I thought I would blog about mine.
You’ll notice my vision is ambitious. Nonetheless, I believe these objectives are feasible to accomplish within a 3-year (Phase 1) and 7-year (Phase 2) time frame. Phase 1 would start immediately and would guide projects for Phase 2. In reality, many aspects of Phase 1 are already underway today (e.g., GWAS catalogue at EBI; Global Alliance for Genomics and Health [GA4GH] data sharing methods). Phase 2 consists of two parts: federation of global biobanks and experimental validation of variants, genes and pathways. Some components of Phase 2 could start today (e.g., exome sequencing in >100,000 cases selected from existing case-control cohorts and biobanks; human knockout project). As with Phase 1, many components of Phase 2 are already underway (federation of existing biobanks [e.g.,…
[Disclaimer: I am an employee of Celgene. The views reported here are my own.]
I presented at the PharmacoGenomics Research Network (PGRN) portion of the 2018 ASHG meeting (link to my slides here). A major theme from my talk was that precision medicine holds promise for advancing novel therapies, but that implementation of pharmacogenomics (PGx) will happen by design not by accident. Here is what I mean – and why our health care systems need to build for this future state today.
PGx by design – PGx by design starts at the very beginning of the drug discovery journey, when the choice is made to develop a therapeutic molecule against a target or a pathway. A precision medicine hypothesis is carried forward into the design of a therapeutic molecule (“matching modality with mechanism”), pre-clinical biomarkers to measure pharmacodynamic responses, and early proof-of-concept clinical studies in defined patient subsets. Late-stage clinical development is performed in these patient subsets, and regulatory approval is obtained with a label that defines this patient subset. Health care systems will essentially be required to incorporate precision medicine into patient care.
There are emerging examples of PGx by design. Indeed, there are an increasing number of FDA approvals that fit with the PGx by design model (see figure below).…
[Disclaimer: I am an employee of Celgene. The views reported here are my own.]
I recently participated in a Harvard Medical School Executive Education course on human genetics and drug discovery (link here, slides here and here). My presentation concluded with a short discussion on emerging resources such as Phenome-Wide Association Studies (PheWAS) to predict adverse drug events and guide indication selection, and protein quantitative trait loci (pQTLs) for Mendelian randomization. In this blog, I briefly highlight our recent Nature publication on pQTLs, “Genomic atlas of the human plasma proteome” (here), which represents a new public resource for drug discovery.
Human genetic targets are endowed with favorable properties, one of which is the ability to use genetic tools for nature’s randomized controlled trial. Central to this concept is Mendelian randomization, a method that uses human genetic variants as an instrument to examine the causal effect of a modifiable exposure (e.g., protein biomarker) on disease in observational studies (reviewed here and recent Nature Reviews Genetics here).
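To make the idea concrete, here is a minimal sketch of the simplest Mendelian randomization estimator, the Wald ratio, for a single genetic instrument. It is an illustration only, not the analysis used in the publication. The function name, argument names and example numbers are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the variant-exposure effect.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization via the Wald ratio.

    beta_exposure: effect of the variant on the exposure (e.g., a pQTL effect
                   on plasma protein level), per allele.
    beta_outcome:  effect of the same variant on disease (e.g., log odds ratio).
    se_outcome:    standard error of beta_outcome.

    Returns (causal_estimate, standard_error, two_sided_p).
    Assumes the variant is a valid instrument: it affects the outcome only
    through the exposure and is not confounded.
    """
    if beta_exposure == 0:
        raise ValueError("The instrument must have a non-zero effect on the exposure.")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    se = abs(se_outcome / beta_exposure)
    z = estimate / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p


# Hypothetical numbers: each allele raises the protein by 0.5 SD and
# raises disease log-odds by 0.1 (SE 0.02).
estimate, se, p = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(f"Effect of 1 SD higher protein on log-odds of disease: {estimate:.2f} (SE {se:.2f})")
```

In practice, multiple independent variants are usually combined and the instrument assumptions are tested rather than taken for granted; the sketch only shows the core ratio.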
Proteins provide an ideal paradigm for Mendelian randomization analysis for drug discovery, as proteins are under proximal genetic control and represent the targets of most approved drugs.…
[Disclaimer: I am an employee of Celgene. The views reported here are my own.]
On a recent family vacation to Cumberland Island, a 9,800-acre barrier island off the coast of Georgia, I was mesmerized by the dense forest of live oak trees covered with Spanish moss. Upon first glance, the branches of these magnificent trees extend chaotically in all directions, and it is difficult to discern where the trees begin and end. But upon closer inspection, the root structure can be identified, moss disentangled, and the overall complexity unraveled.
These craggy oak trees serve as a metaphor for our complex human biological ecosystem: a dense forest of molecules with gnarled branches of pathways meandering in all directions, without an obvious root structure of human disease. Extending the metaphor further, the oak trees make the point that I see as one of the most difficult aspects of drug discovery and development: understanding the root cause of disease, and matching therapeutic modality and biological mechanism to prevent or cure devastating illness.
In this blog, I highlight two recent publications that underscore the importance of matching modality and mechanism. The first article, published in the New England Journal of Medicine, reported clinical data on 22 patients with beta-thalassemia treated with ex vivo gene therapy (here).…
Over the holidays my family participated in an Escape Room, a live puzzle adventure game. We worked as a team to solve riddles, find clues and, over the course of 60 minutes, complete an old town bank heist. Many of the successful clues came from unexpected places – coordinates on maps, numbers inscribed in hidden places, and physical features of the room itself. Other clues seemed promising, but ultimately led to dead ends. In the end, everything came together and we escaped with only seconds to spare.
And so it goes with the invention of new medicines. The approval of a new medicine is an Escape Room of sorts, but over the course of decades not minutes. And like an Escape Room, clues can come from unexpected places, with some leading to new insights and others leading to dead ends.
I was in an Escape Room state-of-mind as I read a Science Translational Medicine article that developed a system to differentiate blood cells into microglia-like cells to study gene variants implicated in neurodegenerative disorders (here). In this blog, I provide a brief summary of the study, and then describe the potentially interesting phenomenon of genetically driven tissue-specific pathogenicity.…
A new genetics initiative was announced today: the creation of FinnGen (press release here). FinnGen’s goal is to generate sequence and GWAS data on up to 500,000 individuals with linked clinical data and consented for recall. There are many applications for such a resource, including drug discovery and development. In this blog, I want to first describe the application of PheWAS for drug discovery and development, and then introduce FinnGen as a new PheWAS resource (see FinnGen slide deck here).
[Disclaimer: I am an employee of Celgene. The views expressed here are my own.]
PheWAS
PheWAS turns GWAS on its head. While GWAS tests millions of genetic variants for association to a single trait, PheWAS does the opposite: tests hundreds (if not thousands) of traits for association with a single genetic variant. This approach is primarily relevant for those genetic variants with an unambiguous functional consequence – for example, a variant associated with disease risk or a variant that completely abrogates gene function. There are useful online resources (see here), as well as several nice recent reviews by Josh Denny and colleagues, which provide additional background on PheWAS (see here, here).
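The contrast with GWAS can be sketched in a few lines of code: fix one variant, loop over many traits, and correct for the number of traits tested. This is a toy illustration under stated assumptions, not the method from any of the cited reviews. The trait names and counts are invented, each trait is reduced to a 2x2 carrier-by-case table, and a simple Wald test on the log odds ratio stands in for the regression models used in real analyses.

```python
import math


def log_odds_ratio_test(carrier_cases, carrier_controls,
                        noncarrier_cases, noncarrier_controls):
    """Return (odds_ratio, two_sided_p) for one 2x2 table.

    Uses a Wald test on the log odds ratio. Adding 0.5 to every cell
    (Haldane-Anscombe correction) avoids division by zero in sparse tables.
    """
    a, b, c, d = (x + 0.5 for x in (carrier_cases, carrier_controls,
                                    noncarrier_cases, noncarrier_controls))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(log_or), p


def phewas(tables_by_trait, alpha=0.05):
    """Test ONE variant against MANY traits.

    tables_by_trait maps a trait name to its 2x2 counts for this variant.
    Significance uses a Bonferroni threshold (alpha / number of traits).
    Returns (trait, odds_ratio, p, significant) tuples sorted by p-value.
    """
    threshold = alpha / len(tables_by_trait)
    results = []
    for trait, table in tables_by_trait.items():
        odds_ratio, p = log_odds_ratio_test(*table)
        results.append((trait, odds_ratio, p, p < threshold))
    return sorted(results, key=lambda r: r[2])


# Hypothetical counts for a single variant: 1,000 carriers, 9,000 non-carriers.
# Order: carrier cases, carrier controls, non-carrier cases, non-carrier controls.
toy_tables = {
    "hypothetical_trait_A": (60, 940, 300, 8700),
    "hypothetical_trait_B": (33, 967, 300, 8700),
    "hypothetical_trait_C": (10, 990, 180, 8820),
}

for trait, odds_ratio, p, significant in phewas(toy_tables):
    print(f"{trait}: OR={odds_ratio:.2f}, p={p:.2g}, significant={significant}")
```

For drug discovery, the interesting output is the full pattern across traits: associations in the direction of benefit suggest indications, while associations in the direction of harm flag potential on-target adverse events.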
Work that originated from my academic lab represents the first example of PheWAS for drug discovery – in particular, how to use PheWAS to predict on-target adverse drug events (ADEs) and to select indications for clinical trials (see 2015 PLoS One publication here).…
Last week Alnylam reported positive news on Phase 3 outcomes for their RNA interference (RNAi) therapy to treat patients with a rare genetic cause of amyloidosis with polyneuropathy (see here). I tweeted the following:
The 20-year journey from scientific discovery to positive Phase 3 clinical trial data got me thinking about other novel therapeutic modalities. Was twenty years a long time or typical for an innovative therapeutic modality? Where are other promising modalities on their journey to regulatory approval? Is the biopharmaceutical industry on the cusp of a series of innovative modalities that could change the therapeutic landscape for patients? How will these new modalities improve our ability to test therapeutic hypotheses?
[Disclaimer: I am an employee of Celgene. The views expressed here are my own.]
To explore these questions, I decided to review different novel therapeutic modalities, which I am defining as those other than small molecules, protein therapeutics (e.g., insulin) and traditional vaccines. This decision was practical, as the amount of literature in these modalities is expansive.
For each new modality, I asked whether a drug has been approved by either a European or US regulatory agency (EMA and FDA, respectively). If a drug has been approved, I reviewed the time from seminal scientific discovery (which sometimes is clear, sometimes is not) to the approved therapy.…
A new manuscript by Jonathan Pritchard and colleagues published in Cell (see here) has garnered a lot of attention from the genetics community (see here, here, here, here, here). In this blog, I add to the ongoing commentary. I first summarize the main conclusions of the manuscript, and then I discuss the implications for drug discovery and development. For the latter, the three main points are: (1) “core genes” represent good drug targets, especially if they harbor a series of alleles that link function to phenotype; (2) regulatory networks identified by “peripheral genes” point to specific cell types and mechanisms that can be used for phenotypic screens; and (3) new approaches are needed to drug cellular networks – what I will refer to as “circuit pharmacology” – as the bulk of drug discovery today is an attempt to reduce complex mechanisms to individual drug targets.
Here is a brief summary of the main conclusions of the manuscript.
There is a small number of “core genes” that “provide mechanistic insights into disease biology and may suggest druggable targets.” How these core genes are defined, however, remains to be determined. The manuscript suggests a few approaches, including: genes with large effect size variants from GWAS and genes with an allelic series, especially those with lower-frequency variants of larger effects.
A recent study in the New England Journal of Medicine provides genetic support for a pharmacologically validated target, BAFF, in the treatment of systemic lupus erythematosus. But can human genetics also be used to estimate the target dose and a therapeutic window?
As readers of plengegen.com know, I am constantly on the lookout for published studies that provide insight into the utility of human genetics for drug discovery and development. This past week there was a great post from Francis Collins on the role of the NIH in the discovery (in part via human genetics) and development of tofacitinib (see here), anakinra and potentially novel targets (e.g., STING) for inflammatory diseases (here). Nature Reviews Drug Discovery published a News & Analysis on PCSK9 as a “fertile testing ground for new drug modalities including long-acting RNA interference drugs, vaccines against self-antigens, CRISPR therapeutics and small molecules that control ribosomal activity” (here). New York City released information about a new public health initiative, The NYC Macroscope, which will use electronic health records (EHRs) to track conditions managed by primary care practices that are important to public health…and one day may be linked to genetic data for discovery research (that is me just speculating).…
In response to an original research article published in Nature by Sekar Kathiresan and colleagues (see here), I penned a News & Views piece for Nature (here), a blog for the Timmerman Report (here, here), and a podcast for BBC Inside Science (here). An important theme for drug discovery & development is that human knockouts can rule-in and rule-out drug targets. For human knock-out data, the key concept is to understand the effect of maximum genetic perturbation on human physiology.
Rule-in drug targets: As has been described by Matt Nelson and colleagues from GlaxoSmithKline (see 2015 Nature Genetics), and David Cook and colleagues from AstraZeneca (see 2014 Nature Reviews Drug Discovery), therapeutic molecules developed against targets with human genetic data are more likely to lead to regulatory approval than those without. PCSK9 represents the poster child for human genetic knockouts in drug discovery & development (see my plengegen.com blog here). But there are many other examples, too.
Rule-out drug targets: But human genetics can also rule-out drug targets or mechanisms that are nominated through animal models, human epidemiology or other approaches. A prominent example is related to raising HDL cholesterol, the so-called “good cholesterol”.
As readers of my blog know, I am a strong supporter of a disciplined R&D model that focuses on: picking targets based on causal human biology (e.g., genetics); developing molecules that therapeutically recapitulate causal human biology; deploying pharmacodynamic biomarkers that also recapitulate causal human biology; and conducting small clinical proof-of-concept studies to quickly test therapeutic hypotheses (see Figure below). As such, I am constantly on the look-out for literature or news reports to support / refute this model. Each week, I cryptically tweet these reports, and occasionally – like this week – I have the time and energy to write-up the reports in a coherent framework.
Of course, this model is not so easy to follow in the real world, as has been pointed out nicely by Derek Lowe and others (see here). A nice blog this week by Keith Robison (Warp Drive Bio) highlights why drug R&D is so hard.
Here are the studies or news reports from this week that support this model.
(1) Picking targets based on causal human biology: I am a proponent of an “allelic series” model for target identification. Here are a couple of published reports that fit with this model.…
Like many, I waited with bated breath for results of the anti-PCSK9 (evolocumab) FOURIER cardiovascular outcome study last week. There have been many interesting commentaries written on the findings. A few of my favorites are listed here (Matthew Herper), here (David Grainger), here (Derek Lowe), and here (Larry Husten), amongst others, with summaries provided at the end of this blog. Most of these articles focused on clinical risk reduction vs. what was predicted for cardiovascular outcome, as well as whether payers will cover the cost of the drugs. These are incredibly important topics, and I won’t comment on them further here, other than to say that the debate is now about who should get the drug and how much it should cost.
In this blog, I want to emphasize key points that pertain to human genetics and drug discovery. And make no mistake: the anti-PCSK9 story and FOURIER clinical trial outcome is a triumph for genetics and drug discovery. This message seems to be getting muddled, however, given the current cost of evolocumab and the observation that cardiovascular risk reduction was less than expected, based on predictions from a 2005 study published by Cholesterol Treatment Trialists (CTT) (see Lancet study here).…
There were so many good articles and news reports this week on genetics/genomics and drug discovery & development. A few examples include: article in Nature Communications on gene therapy via CRISPR/Cas9 for retinitis pigmentosa (here); a partnership between Editas and Allergan (Matthew Herper story here); Nature Reviews Genetics article by Khera and Kathiresan on genetics of coronary heart disease (here); Genome Magazine article on the importance of pharmacogenetics across ethnic groups to prevent severe adverse events (here); and a victory for pre-prints in challenging the statistical robustness of a publication in Nature Genetics (here).
I decided to focus on a study that provides a mechanistic link between a genetic mutation and a therapeutic hypothesis in Parkinson’s disease. The reason I chose this article is that it highlights the challenges of going from a robust genetic association to a biology hypothesis, and ultimately how to gain confidence in a therapeutic hypothesis with pre-clinical models. As you will see at the end, a clinical trial is now underway to test the therapeutic hypothesis in humans.