How do we measure whether new therapies are making a difference? What's a surrogate endpoint?
In part one of this series, we talked about the challenges associated with understanding a rare disease and developing new therapies, and we touched briefly on a meeting recently held at FDA on clinical research in Primary Sclerosing Cholangitis (PSC). In part two we discussed the importance of doing more clinical trials to understand what types of therapies might make a difference in PSC, and how SJI is working to support those trials with a nationwide Clinical Research Network. But when we're testing a new therapy, how do we measure whether that therapy is actually working and making a difference for patients with PSC? That's a more complicated question than it appears at face value. In today's post, part three of the series, we'll discuss surrogate endpoints in the context of therapeutic development for PSC.
(Disclaimer: This post is meant to be conceptually educational regarding how endpoints are used in clinical trials of investigational new drugs for rare diseases like PSC. It is not meant to be comprehensive in scope. Clinical research is a complicated and nuanced subject, and most of the scenarios described here are context-specific. For specific, detailed inquiries, we recommend consulting FDA guidance documents directly.)
Just what is a Clinical Trial? In order to understand how we measure success in clinical trials, let's walk through the general ideas of clinical trials and make sure we have our vocabulary straight. At the most basic level, clinical trials are experiments conducted in human patients. They start with a question, usually a pretty specific question (does this antihistamine reduce formation of hives in human patients who have been exposed to this allergen?), and are either interventional or observational (non-interventional).
Interventional Trials: In an interventional study, the investigators (the people running the trial) are intervening in some way - they are doing something to the patient, be it operating, giving an infusion, having them take a new pill, modifying their behavior, and so on, AND they are doing it with the intention that it is meant to change the course or experience of the patient's disease or change their clinical outcome. (Don't forget that term "clinical outcome" - we'll come back to it.) The National Institutes of Health defines an interventional study as "a clinical study in which participants are assigned to receive one or more interventions (or no intervention) so that researchers can evaluate the effects of the interventions on biomedical or health-related outcomes. The assignments are determined by the study protocol. Participants may receive diagnostic, therapeutic, or other types of interventions," and defines an intervention as "a process or action that is the focus of a clinical study. Interventions include drugs, medical devices, procedures, vaccines, and other products that are either investigational or already available. Interventions can also include noninvasive approaches, such as surveys, education, and interviews." So an intervention can be noninvasive (survey), but in an interventional trial, the investigators are trying to evaluate the effect(s) of the intervention on outcomes of the patients in the trial (maybe taking the survey in itself changes the experience the patient has of their disease). Classic example of an interventional trial: we give patients with a deadly cancer an experimental chemotherapy and measure whether a greater number of the patients receiving that therapy are alive one year later than in the absence of that therapy.
Observational Trials: In contrast to interventional studies, observational studies do not focus on an intervention. Patients may receive medical interventions during the course of an observational study, but the point is not to evaluate the effect(s) of a particular intervention on outcomes. The point is to learn about patients, their disease, and their outcomes in the absence of trying to change those things with a specific intervention. The National Institutes of Health defines an observational study as "a clinical study in which participants identified as belonging to study groups are assessed for biomedical or health outcomes. Participants may receive diagnostic, therapeutic, or other types of interventions, but the investigator does not assign participants to specific interventions (as in an interventional study)." Classic example of an observational study: patients with and without ulcerative colitis give a stool sample for microbiome sequencing and fill out a questionnaire about their dietary preferences.
Outcome Measures: So what makes a study interventional is whether we assign patients to get a particular intervention and test for whether that intervention had an effect on the patient's outcome(s). For the remainder of this post, we will be talking only in the context of interventional clinical trials, or trials where we are working to understand how a specific intervention changed a clinical outcome. Let's go back to this term "clinical outcome" and understand what it encompasses. Simply put, a clinical outcome is what happened to the patient's health. Did they get better? Was their life prolonged? Did they die? The National Quality Measures Clearinghouse (NQMC), maintained by the U.S. Agency for Healthcare Research and Quality (AHRQ), defines a clinical outcome as "a health state of a patient resulting from health care." Outcome measures (which are exactly what they sound like, measures of outcomes) are data that are reflective of the patient's health state. From NQMC: "Outcomes can include a vast range of health states; mortality, physiologic measures such as blood pressure, laboratory test results such as serum cholesterol, patient-reported health states such as functional status and symptoms may all be used as outcome measures." Bottom line: when we're doing an interventional trial, we have to have a way to measure the effect of the intervention on the patient's health. Many clinical trials will have a primary outcome measure and a number of secondary outcome measures; these are the data that will be recorded over the course of the study to understand how the intervention is affecting patient health.
Clinical Endpoints: Now when we're looking at a clinical outcome to detect the effect of an intervention in a clinical trial, we call it a "clinical endpoint". You can think of a clinical endpoint as the place patients get to by the end of the trial; this can be mortality, improved quality of life, tumor shrinkage, anything we can find a way to objectively measure during that trial to determine whether the intervention is benefiting the patients getting it. In some cases, that can be a direct measurement. Think back to our example of a deadly cancer; looking for a difference in one-year survival rates with a new therapy would be a direct measurement of a clinical outcome or a "true" or "clinically meaningful" endpoint.
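To make the idea of a direct measurement concrete, here is a minimal sketch in Python. The patient counts are made up for illustration and do not come from any real trial, and the function name is hypothetical. It simply compares the fraction of patients alive at one year in a treated group and in a control group.

```python
def one_year_survival_rate(alive_at_one_year: int, enrolled: int) -> float:
    """Fraction of enrolled patients still alive one year into the trial."""
    if enrolled <= 0:
        raise ValueError("enrolled must be a positive number of patients")
    if not 0 <= alive_at_one_year <= enrolled:
        raise ValueError("alive_at_one_year must be between 0 and enrolled")
    return alive_at_one_year / enrolled


# Hypothetical counts, chosen only to illustrate the arithmetic.
treated_rate = one_year_survival_rate(alive_at_one_year=60, enrolled=100)
control_rate = one_year_survival_rate(alive_at_one_year=45, enrolled=100)

# The clinical endpoint here is survival itself, measured directly.
absolute_difference = treated_rate - control_rate

print(f"Treated group one-year survival: {treated_rate:.0%}")
print(f"Control group one-year survival: {control_rate:.0%}")
print(f"Absolute difference: {absolute_difference:.0%}")
```

A real trial would go further than this arithmetic and use statistical methods to judge whether a difference of this size could have arisen by chance.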
Biomarkers: In many cases, however, it would be difficult to directly measure clinical outcomes. What if we're studying a disease where a difference in survival wouldn't be detected for decades? It is not only expensive to run a clinical trial for decades, but we'd like to understand in a time frame more useful to patients whether a new therapy is working. Many clinical outcomes are not as cut and dried as mortality and are difficult to isolate or quantify. For example, how do we demonstrate improved cardiovascular health? Answer: biomarkers. FDA defines a biological marker (or biomarker) as "a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention." So biomarker is a pretty broad term, but biomarkers are objectively measured outcome measures (imaging, laboratory tests, quantitative assessments, and so on). (For a deeper perspective on biomarkers, check out this article written by investigators at the National Institutes of Health.) FDA has a "qualification" process so that biomarkers can be used in many clinical trials in a specific context without having to reinvent the wheel every time. I mention this because we're going to set aside that term "qualified" and focus on the term "validated", and specifically we are interested in clinical validation of biomarkers, which involves showing that a biomarker accurately and reproducibly predicts the clinical outcome of interest. This is important beyond the scope of drug development; you're not going to bother with exercising and eating right to lower your total cholesterol if you're not convinced that lowering your total cholesterol really matters.
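As a rough illustration of what "predicts the clinical outcome" can mean in numbers, the sketch below uses invented patient records to ask two simple questions. Of the patients who went on to have a bad outcome, how many had an elevated biomarker (sensitivity)? Of those who did not, how many had a normal biomarker (specificity)? The records and the function name are hypothetical, and real clinical validation involves far more than these two figures.

```python
from typing import List, Tuple


def sensitivity_specificity(records: List[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Each record is (biomarker_elevated, had_bad_outcome) for one patient."""
    true_pos = sum(1 for marker, outcome in records if marker and outcome)
    false_neg = sum(1 for marker, outcome in records if not marker and outcome)
    true_neg = sum(1 for marker, outcome in records if not marker and not outcome)
    false_pos = sum(1 for marker, outcome in records if marker and not outcome)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one patient with and one without the outcome")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented records for 20 hypothetical patients (not real data).
records = (
    [(True, True)] * 8      # elevated biomarker, bad outcome
    + [(False, True)] * 2   # normal biomarker, bad outcome
    + [(True, False)] * 3   # elevated biomarker, no bad outcome
    + [(False, False)] * 7  # normal biomarker, no bad outcome
)

sens, spec = sensitivity_specificity(records)
print(f"Sensitivity: {sens:.0%}, specificity: {spec:.0%}")
```

The closer both numbers are to 100%, the more closely the biomarker tracks the outcome in this toy data set. Showing the same thing reproducibly, in many patients over a long period of time, is the hard part.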
FDA Approval of an Investigational New Drug: Now everything that we've discussed up to now is for clinical trials broadly. How does measurement of clinical outcomes apply to U.S. Food and Drug Administration (FDA) approval of a new/investigational therapy? In the simplest terms, if a new intervention is being tested to determine whether it should be approved as a therapy for a particular indication (disease/condition), we have to have a way to convincingly demonstrate that it improved clinical outcomes. This is an important point: the purpose of doing interventional trials in contemporary medicine is to find ways to help patients; we only trial interventions that we think might improve healthcare. Of course there are different phases of trials an investigational therapeutic has to go through, but over the course of the clinical trials process, this is the task at hand: to show the intervention under investigation is beneficial. Otherwise, why would a patient want to take the drug? Why would physicians feel confident prescribing it to their patients? Let's walk through this issue of how we measure benefit so a drug can get FDA approval as a treatment for a particular disease/indication.
Surrogate Endpoints: Simply put, a surrogate endpoint is a validated biomarker. So what does that mean? FDA defines a surrogate endpoint as "a biomarker intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit (or harm, or lack of benefit) based on epidemiologic, therapeutic, pathophysiologic or other scientific evidence," or in simpler terms "a laboratory finding or physical sign that may not be a direct measurement of how a patient feels, functions, or survives, but is still considered likely to predict therapeutic benefit for the patient." Let's go back to our example of cardiovascular health. We can measure how much and what kinds of cholesterol are in a patient's blood. Do we care about our cholesterol levels in and of themselves? No. I can be walking down the street with high cholesterol and not know the difference or care at all. What I care about is that those high cholesterol levels predict that I will have bad cardiovascular outcomes and increased mortality as a result. Cholesterol is a surrogate measure of a clinically meaningful endpoint (dying from cardiovascular disease). Similarly, high blood sugar in type 2 diabetes is a surrogate endpoint because it reliably predicts the clinically meaningful outcomes associated with long-standing metabolic disease (kidney failure, diabetic retinopathy, cardiovascular disease, etc.). These biomarkers can be used as surrogate endpoints because they have been validated by multiple independent investigators as accurate predictors of the clinically meaningful endpoints we would really like to measure; they are "known" valid biomarkers.
Surrogate Endpoints in Rare Disease: We have known valid biomarkers for many different diseases that have been demonstrated, in studies of a large number of patients, conducted over a long period of time, to accurately predict the clinically meaningful endpoints for which they are serving as surrogate measures, such as the two just discussed. Often we don't have these kinds of known valid biomarkers in rare diseases, particularly in life-threatening rare diseases, because validating them would require studying a large number of patients over a long period of time. That's hard to do when a disease is rare (small number of patients) and when patients need options quickly because their disease is life-threatening. What has happened to fill this gap in some rare (and other) diseases is that, rather than the standard of validation used for more common conditions, biomarkers of disease activity/progression can be used as surrogate endpoints for drug approval (measures to predict whether an intervention will produce a beneficial effect on clinical outcomes) if the case can be made that they are biologically plausible predictors of clinically meaningful outcomes; these are considered "probable" valid biomarkers. This means there is some evidence that a biomarker is linked to disease activity/progression, and perhaps it can be demonstrated that at the molecular level this makes sense based on our understanding of the relevant biologic pathways. The standard is more flexible for rare disease because FDA regulators recognize that biomarker validation in rare disease is a huge hurdle. And it is worth noting that biomarker validation is always context-specific. What works for one test in one disease does not necessarily apply to a different test in the same disease, or the same test in a different disease, or the same test in the same disease with a different type of intervention. You get the idea.
All of that said, it pays to have good biomarkers that serve as predictive surrogate measures of the clinically meaningful outcomes we actually care about. Let's say we pick a bad biomarker, something that's easy and cheap to measure, but that doesn't fully capture or predict disease activity the way a properly validated surrogate endpoint would. We could approve all kinds of drugs on the basis of improving this bad biomarker, but in the long run, those drugs might not help patients in ways that matter. We might correct a lab value and get a drug approved, but in the end that drug might not improve survival or delay time to transplant. So the idea is not just to measure any old thing and show that we can change that measurement with an intervention. The idea is for that measurement to be very closely tied to whether patients get better. For this reason, choosing and understanding appropriate biomarkers that can serve as surrogate endpoints in trials of investigational new drugs is of critical importance.
FDA recognizes the importance of this challenging issue for PSC, and as a result held a meeting in early March to discuss surrogate endpoints and clinical trial design in PSC. The summary report of that meeting will be forthcoming, and we will share it with you as soon as it is released. The video content of that meeting will also be made available on the web at a future date. The long and short of the discussion at the meeting is that we don't have enough data about the molecular causes of PSC or the natural history of PSC in a large enough number of patients over a long enough period of time to know what might make a good surrogate endpoint(s) for PSC trials. Further complicating things, PSC is a heterogeneous condition that may actually represent multiple diseases at the molecular level. It may be that different subgroups of patients should be monitored with different kinds of measurements. Because there are investigational new drugs being trialed in PSC that work via different approaches to benefiting patients (microbiome, bile acids, immunity, liver fibrosis, others), it is likely that we will need different types of biomarkers to serve as surrogate endpoints in those different types of trials (immune markers for immune modulator trials, measures of liver or bile duct fibrosis for trials of antifibrotics, and so forth). It could be that a composite measurement based on a combination of multiple kinds of biomarkers will be more useful than looking at a single biomarker (such as alkaline phosphatase) or type of biomarker (such as a panel of laboratory values).
The good news, which was clearly highlighted in the wide variety of talks at the FDA meeting, is that we have a lot to choose from: imaging tests, laboratory tests, functional tests, clinical outcome assessments (which include patient-reported outcome measures, clinician-reported outcome measures, observer-reported outcome measures, and performance outcome measures) - there is a lot under investigation that might prove relevant and useful in PSC and a lot of new technology that could be applied to PSC trials. The bad news is that in the meantime, it will be challenging to get drugs approved in short order that we are truly confident in, because it is not exactly clear to anyone - patients, physicians, drug developers, or regulators - how we should measure whether a given drug is really going to help PSC patients.
We'll expand on the possibilities for addressing this issue in a future post detailing how we can layer biomarker development into clinical trials in PSC and how our nationwide Clinical Research Network can help facilitate and accelerate this process.
Thanks for reading this three-part series on how we make forward progress in clinical research for PSC. Read more about the specific challenges of curing PSC in The Road Forward in PSC: Part 1 of 3 and how we can accelerate clinical trials for PSC in The Road Forward in PSC: Part 2 of 3.
Read more from the FDA at their site on How Drugs are Developed and Approved.