Clinical Trial @ MindSay


 

   
II: Is the TAILORx Trial a good fit with the available information?

A very large clinical trial (~10,000 women) is underway to assess the usefulness of the Oncotype Dx gene profile for selecting treatment for early-stage, estrogen receptor-positive breast cancer: the TAILORx Trial. The trial sponsors will ask women with breast cancer to undergo Oncotype Dx testing (for which they will be billed $3650) and then assign them to three risk groups based on their recurrence score. (The groups are defined using different cut-offs for this trial than those used for the study done on the NSABP B-20 tumors.) Women in the low-risk score group will get hormonal treatment (tamoxifen or one of the newer “aromatase inhibitor” class of drugs). Women in the high-risk group will get chemotherapy and hormone therapy. The women in the intermediate group will be randomly assigned to receive either hormone treatment alone or a combination of hormone therapy and chemotherapy.

 

At least four concerns make this trial seem premature. First, the chemotherapeutic "validation" of Oncotype Dx is based on one small, old chemotherapy trial. Second, TAILORx does not incorporate many other old and new prognostic indicators that might trump Oncotype Dx in some circumstances. Third, not enough information exists about how the Oncotype Dx score might interact in a confounding way with specific hormone treatments, chemotherapy drugs, and regimens. Finally, far too many non-random, non-blinded choices are afforded to patients and physicians in TAILORx to make for decent "science".

 

TAILORx is based on a retrospective analysis of a small, non-random sample of assays performed on preserved tissue. The ability to use this old material to obtain reproducible gene profiles is an amazing technological feat, to be sure. But only 651 of about 2300 patient results were available. The chemotherapy used in NSABP B-20 dates to the 1970s. Substantial evidence supports a conclusion that these regimens improve prognosis for many patients. But two of the agents (methotrexate and 5-FU) are not often employed today. The third agent, Cytoxan, has lost popularity over concerns that it may lead to late complications like second malignancies. Meta-analysis of the many thousands of patients treated with chemotherapy indicates that the B-20 drugs are not as effective as newer drugs like the taxanes and Adriamycin.

 

So in spite of the fact that the paper describing the results for the 651 patients uses the word "prospective" five times, it is actually a retrospective survey, not a clinical trial. This might matter for a woman who is enrolled in TAILORx because she cannot be sure that the Oncotype Dx group studied really is representative of the group she is in. And remember that much of the power of the argument for Oncotype Dx rests on one subgroup that contained only 47 patients.

 

A second concern is that many other older and newer prognostic variables are not considered when deciding whether a patient falls into "Group 2" (the intermediate risk group) as defined here:

 

Group 2 (Primary study group; ODRS 11-25): Patients are stratified according to tumor size (≤ 2.0 cm vs ≥ 2.1 cm), menopausal status (postmenopausal vs premenopausal vs perimenopausal), planned chemotherapy (taxane-containing [i.e., paclitaxel, docetaxel] vs nontaxane-containing), and planned radiotherapy (whole breast with no boost planned vs whole breast with boost planned vs partial breast irradiation planned vs no planned radiation therapy [for patients who have had a mastectomy]). Patients are then randomized to receive either hormonal therapy alone or combination chemotherapy and hormonal therapy.

 

For example, tumor size, tumor DNA ploidy, quantitative level of estrogen receptor, tumor lymphovascular invasion, percent of dividing cells ("S phase"), Ki-67 expression and many other features and measurements have been used to stratify patients for risk of treatment failure. (Some of these features can make a woman with a very small tumor (<1 cm) eligible for inclusion in the trial, an acknowledgement of their possible importance.) And of course, clinical features on presentation may predict risk of failure. Mammographically discovered cancers may have a different prognosis compared to those discovered by the patient feeling a mass, for example.

 

Under TAILORx rules, a 48-year-old woman with a palpable 4.9 cm cancer that is high grade with lymphatic invasion, weakly estrogen receptor positive, with Ki-67 expression, an S phase of 20 and an Oncotype Dx score of 23 (intermediate risk) could be randomly assigned to tamoxifen alone.

 

Some will argue that the hypothetical patient described above would be unlikely to have an Oncotype Dx score as low as 23. (The B-20 Oncotype Dx analysis used 18 to 31 to define the intermediate risk group. The range was changed to 11-25 for TAILORx, possibly because HER2-positive patients are excluded; more on this below.) This is probably true, since the Oncotype Dx score does correlate to some extent with many "traditional" indicators. However, if the above hypothetical patient exists, few cancer physicians would advise tamoxifen alone. This would result in one of two outcomes for our hypothetical patient.

 

First, the woman would not likely be offered the TAILORx trial. Or, if she were offered the trial and the randomization resulted in tamoxifen alone, she would be advised to withdraw from the trial and receive chemotherapy. Either way, the results from TAILORx will be compromised. If patients on the "edges" of the "groups" are manipulated to get the "right" treatment, the trial results will be of little use, because the "borderline" cases are the very ones that present the most difficult decisions, and the Will Rogers phenomenon will be in play to distort the results.
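
The Will Rogers phenomenon is easier to see with numbers. Here is a minimal sketch in Python using invented per-patient survival probabilities (purely hypothetical, not TAILORx or B-20 data). Moving the "borderline" patients out of the hormone-only group and into the chemotherapy group raises the average of both groups, even though no individual patient's outlook has changed.

```python
from statistics import mean

# Hypothetical 10-year survival probabilities, one per patient (invented numbers).
hormone_only = [0.95, 0.93, 0.90, 0.80, 0.78]   # the last two are "borderline"
chemo_plus_hormone = [0.75, 0.70, 0.65, 0.60]

def reassign_borderline(low_group, high_group, cutoff):
    """Move every patient whose probability is below `cutoff` into high_group."""
    stay = [p for p in low_group if p >= cutoff]
    moved = [p for p in low_group if p < cutoff]
    return stay, high_group + moved

before = (mean(hormone_only), mean(chemo_plus_hormone))
new_low, new_high = reassign_borderline(hormone_only, chemo_plus_hormone, 0.85)
after = (mean(new_low), mean(new_high))

print("group averages before:", before)
print("group averages after: ", after)
print("overall average unchanged:", mean(new_low + new_high))
```

Both group averages go up after the reassignment while the overall average stays where it was, which is the distortion to worry about if borderline patients are steered toward the "right" treatment.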

 

A third concern that may make TAILORx premature is the paucity of information about how the predictive power of Oncotype Dx might be affected by the choice of hormone and chemotherapy agents. The TAILORx protocol allows each physician to choose a chemotherapy regimen and/or hormone agent. Some might still choose the B-20 CMF and tamoxifen regimen for patients perceived to be at lower risk (within the intermediate group), while others might choose aggressive "dose dense", Taxol-containing treatment for a patient like the hypothetical one described above. There are many reasons why this may lead to errors.

 

GSTM1, a gene for glutathione S-transferase (GST), is one of the 16 predictor genes in Oncotype Dx. The presence of this gene tends to improve prognosis. The problem is that this gene also affects the metabolism of some cancer drugs. Anticancer drugs that have been shown to be substrates for GSTs include chlorambucil, melphalan, Cytoxan metabolites, and steroids. Indirect evidence for a role of GSTs in modulating drug effects through deactivation of drug-generated hydroperoxides or other reactive oxygen species exists for Adriamycin, mitomycin C, carboplatin, and cisplatin, but not Taxol. Some have postulated for other malignancies, like acute leukemia in children, that GSTM1 confers a favorable prognosis because it changes chemotherapy metabolism. Oncotype Dx does not distinguish patients by how many copies of GSTM1 they carry (the gene has a common deletion, or "null", polymorphism). Remember that B-20 chemotherapy often included Cytoxan but never Taxol.

 

Another gene in Oncotype Dx is HER2. Patients with tumors that express HER2 are excluded from TAILORx. This gene is a prototype marker for chemotherapy selection since the drug Herceptin works very well when it is expressed and not at all if it isn't. B-20 contained a number of HER2-positive patients and Herceptin was not available in that era. So the Oncotype Dx used in TAILORx is a different one than the one tenuously "validated" in B-20. If the HER2 gene is a totally independent predictor, the change might not matter. But HER2 does interact with other cancer genes, for example the Src family. Src is a family of proto-oncogenic tyrosine kinases originally discovered by J. Michael Bishop and Harold E. Varmus, for which they won the Nobel Prize.

 

BAG1 is another gene in Oncotype Dx, and it interacts with the estrogen receptor (alpha) mechanisms which can control cancer growth. Does it interact with tamoxifen (studied in B-20) the same way as with the aromatase inhibitors included in TAILORx (but not in B-20)? BAG1 seems to predict response to tamoxifen, but how it predicts benefits from other hormones or chemotherapy is less clear.

 

TAILORx is constructed on a tenuous foundation based on a very small number of observations (remember the 47-patient subgroup) and on many extrapolations based on assumptions. Some of these may seem niggling, such as the fact that B-20 used age 70 and tumor size 4 cm as cut-offs and TAILORx uses 75 and 5 cm. Yet B-20 found a suggestion that age and chemotherapy effectiveness interact and that tumor size affects prognosis.

 

And, one more thing: how can "partial breast radiation" be allowed in a non-random way? This would imply that partial breast radiation has become an acceptable "standard of care". Where are the appropriately powered randomized trials that support this implication? Don't be fooled into thinking that an analysis of this non-randomly assigned "stratification" can answer any useful question.

 

Stratification (i.e., prospective randomization within smaller, rigidly predefined clinical subgroups) is often employed in clinical trials (though purists might argue that this is unnecessary in trials with a large number of outcome events). But TAILORx is not stratified by rigidly predefined criteria. Rather, it will permit thousands of patients and physicians to choose among a rich buffet of treatment options related to local therapy (type of node sampling (any type among many permitted), type of mastectomy (any type among many), lumpectomy (any type among many), radiation (type of radiation, partial or whole breast)), chemotherapy (literally thousands of combinations and permutations of drugs and doses) and hormones (SERMs and aromatase inhibitors).

 

Stratification might make sense if it were based on close to totally objective (not really ever possible in the real world) and independent classifications. Stratification makes no sense if it is based on thousands of differing views and biases related to the risks and benefits of a multitude of differing therapies. Those decisions will be largely based on the very risk assessment considerations that the TAILORx trial is supposed to answer.

 

"Although technological advances will further improve our understanding of breast cancer and will contribute to tailoring treatment to the individual patient, our experience with adjuvant CMF over 30 years confirms that the effects of such a regimen are long lasting and may benefit patients with favourable and unfavourable prognostic indicators, at the cost of minimal long term sequelae." This is how Dr. Bonadonna himself described in 2005 (emphasis added) results from the chemotherapy regimen that is still often called "Bonadonna CMF". "Tailoring treatment" is the holy grail, but TAILORx is designed too clumsily to trump 30 years of better designed clinical trials.

 

Monks from the Order of the Brothers of the Statistic will study the scripture that flows from TAILORx and will be able to divine all the potential biases and confounders by utilizing probabilistic testing based on dubious underpinnings that may well result in some very "significant" and small "p" numbers. Cultists who worship at that particular altar of "evidence based" medicine will travel about with their PowerPoint slides of life table graphs and p values carried to the fourth decimal place, Genomic Health will turn a profit for the first time and its shares will skyrocket, and another level of the temple known to heretics as the "House of Cards" will have been constructed.

 

Or maybe the results will look so powerful that even a skeptic like me will be convinced (or fooled).

 

TAILORx may be another example of the technological imperative in action. TAILORx reflects the fervent need of patients and doctors for a simple "black box" method for making difficult choices. The admonition of H.L. Mencken bears remembering: "For every complex problem, there is a solution that is simple, neat, and wrong."

 

 

 
 
   
 

I. “Oncotype Dx”: Breast Cancer Technological Breakthrough or Breakdown?

A serious question for women with breast cancer and their physicians is whether chemotherapy should be employed after the initial breast surgery. This decision is particularly vexing for situations where the prognosis is relatively good, but not good enough. Patients whose cancers have estrogen receptors and who do not have any spread to the lymph nodes comprise such a group. And the group is large, perhaps half the women with breast cancer.

 

A decade or so ago the results from the National Surgical Adjuvant Breast Project chemotherapy trial B-20 were reported. This trial suggested chemotherapy was of benefit before the menopause, with a step down in usefulness at menopause and then a continuing decline with age. Thus tamoxifen plus chemotherapy seemed wise up until roughly the age of 60 (the trial did not include women over 70). The chemotherapy regimens employed in B-20 date to the 1970s. Many experts believe that newer regimens are more effective.

 

B-20 revealed that the degree of estrogen positivity was possibly important, with women with lower levels benefiting more from the chemotherapy and tamoxifen combination. The advent of gene profiling, like the proprietary “Oncotype Dx”, seems to have resolved the chemotherapy issue for many patients and physicians. Is this rational or simply another example of the technological imperative?

 

“The RS [Oncotype Recurrence Score] assay not only quantifies the likelihood of breast cancer recurrence in women with node-negative, estrogen receptor-positive breast cancer, but also predicts the magnitude of chemotherapy benefit” is the conclusion of a paper in the Journal of Clinical Oncology in 2006. Based largely on this study, Oncotype Dx appears with favorable mention in the American Society of Clinical Oncology and National Comprehensive Cancer Network guidelines. Genomic Health, the company that sells the Oncotype Dx test, uses these guidelines and the JCO paper in its marketing.

 

The 12 page JCO report is chock full of sophisticated analysis, such as “linear fit of the likelihood of distant recurrence as a continuous function of recurrence score” analyses and various multivariate models. But what is the basis for the statement that Oncotype “predicts the magnitude of chemotherapy benefit”?

 

A glance at “Fig. A2” on page 11 tells the story. Figure A2 gives 12-year “overall survival” for four comparisons of tamoxifen versus combined tamoxifen/chemo: one for all patients lumped together, and one each for the good, intermediate, and poor Oncotype Dx score groups. That makes eight groups. Seven of the eight have 12-year survival ranging from about 92% for the low-score tamoxifen-alone group to about 82% for both intermediate groups and the tamoxifen/chemotherapy high-risk group.

 

Only the high Oncotype risk score tamoxifen-alone group jumps off the page. This group has a 12-year survival of only 60%. But the tamoxifen-alone group with high-risk Oncotype scores consists of only 47 patients. Where did these 47 patients come from? They came from a study (B-20) done by the NSABP and reported in 1997, which included 2363 patients with breast cancer, negative lymph nodes and positive estrogen receptors. The 47 unlucky patients were about 2% of the total enrolled patients in NSABP B-20.

 

Are the 47 patients representative of all the Oncotype high-risk patients in B-20? It is hard to say. Samples of the original breast tumors were available for only 670 patients and testing was successful in 651. So, only about ¼ of the B-20 patients are included in the Oncotype study. If this sample were random, probabilistic analysis might be intact. But the absence of material to test was not random. Some of the tumor samples were “used up” in other studies, and others were not saved. Presumably these other studies were focused on something specific and did not draw their samples at random.
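
For readers who want to check the fractions, here is a small Python sketch that uses only the counts quoted above (2363 enrolled, 670 samples available, 651 successful assays, 47 high-risk tamoxifen-alone patients). The variable names are my own.

```python
# Counts quoted in the text above.
b20_enrolled = 2363          # patients enrolled in NSABP B-20
samples_available = 670      # tumor samples still available
assays_successful = 651      # samples with a usable Oncotype result
high_risk_tam_alone = 47     # high-score patients given tamoxifen alone

frac_tested = assays_successful / b20_enrolled             # share of B-20 in the Oncotype study
frac_47_of_trial = high_risk_tam_alone / b20_enrolled      # the 47 as a share of all of B-20
frac_47_of_tested = high_risk_tam_alone / assays_successful  # the 47 as a share of those tested

print(f"tested: {frac_tested:.1%} of B-20")
print(f"the 47: {frac_47_of_trial:.1%} of B-20, {frac_47_of_tested:.1%} of those tested")
```

The first figure comes out a little over a quarter and the second at about 2%, matching the rough numbers in the text.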

 

And what about the “overall survival” of 60%? Is that real? Again, it is hard to say. “Deaths before distant recurrence [was] considered [a] censoring event”. This means that a patient who was killed in an auto crash would be counted as alive but lost to follow-up rather than counted as a death. But what if the crash was caused by a blood clot caused by the tamoxifen? And, since both chemotherapy and tamoxifen are thought to increase clots, what if several more patients in the combined group than in the tamoxifen-alone group died of strokes or heart attacks?
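
To show why the censoring rule matters, here is a minimal sketch of the standard Kaplan-Meier (product-limit) estimate applied to an invented cohort of ten patients. The follow-up times and causes of death are hypothetical and have nothing to do with the actual B-20 data; the point is only that the same ten patients give different "survival" figures depending on whether other-cause deaths count as deaths or as censored.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at the end of follow-up.

    times  -- follow-up time for each patient (years)
    events -- 1 if the patient had the event at that time, 0 if censored
    """
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Only events reduce the survival estimate.
            survival *= 1 - deaths / at_risk
        # Events and censored patients both leave the risk set.
        at_risk -= leaving
    return survival

# Hypothetical cohort: deaths at years 2, 3, 4, 5, 6; five patients alive at year 12.
times = [2, 3, 4, 5, 6, 12, 12, 12, 12, 12]

# (a) Every death counts as an event.
all_deaths = kaplan_meier(times, [1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

# (b) The deaths at years 3 and 5 (say, strokes) are treated as censored.
other_causes_censored = kaplan_meier(times, [1, 0, 1, 0, 1, 0, 0, 0, 0, 0])

print(all_deaths, other_causes_censored)
```

In this made-up cohort, half the patients are dead at 12 years, yet the estimate with other-cause deaths censored comes out near 66%.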

 

Oncotype is being used by patients and physicians all over the country to decide upon chemotherapy based on the 47 patients. Perhaps by coincidence, the 12-year survival for both the Oncotype intermediate and high-risk combined tamoxifen/chemotherapy groups was nearly identical at 82%. There were 212 such patients.

 

However, the group of 47 high-risk patients given tamoxifen alone had survival at least 22 percentage points lower than any other group. What are the possible explanations for the remarkably bad outcome for the 47 patients? Perhaps it is as it seems: tamoxifen alone is inadequate for high-risk patients. Still, it seems odd that the worst prognosis group had almost exactly the same survival experience as the intermediate group when the treatment was both chemotherapy and tamoxifen. If the understanding is that tamoxifen is good for estrogen receptor-positive patients and chemotherapy adds something for some patients, how is it that the combination gives the same results for both intermediate and high-risk patients?

 

Maybe the 47 were just very unlucky and the 117 high-risk patients who got the combination therapy were astonishingly lucky to get the same results as the intermediate group. To account for this possibility, tests of statistical significance are performed. Using one of these (based on the Cox proportional hazards model), there is less than one chance in a thousand that the 22-point difference in survival is “random”, according to the analysis done by the authors. The tests for statistical significance assume random allocation of treatment. The original B-20 trial was randomized, but the Oncotype study was retrospective and was not based on a random sample from that trial.

 

In addition, the “significance testing” was not said to have been corrected for the fact that many comparisons were made. The degree of confidence one takes away from a retrospective study full of potentially confounding variables and assumptions that violate basic probabilistic underpinnings is not as high as the statistical significance level might otherwise imply.
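
The multiple-comparisons problem can be put in numbers. The sketch below uses the textbook formula for the chance of at least one false positive among independent tests, along with the simple Bonferroni correction. The test counts in the loop are my own illustrative choices, since the paper does not say how many comparisons were made, and real comparisons on the same patients are not independent.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall error rate at or below alpha."""
    return alpha / n_tests

alpha = 0.05
for n_tests in (1, 4, 8, 20):   # illustrative counts, not taken from the paper
    fwer = familywise_error_rate(alpha, n_tests)
    per_test = bonferroni_threshold(alpha, n_tests)
    print(f"{n_tests:2d} tests: chance of a false positive {fwer:.1%}, "
          f"Bonferroni per-test level {per_test:.4f}")
```

With eight independent looks at the data, the chance that at least one comes up "significant" by luck alone is about one in three, not one in twenty.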

 

The authors of the JCO paper claim that their test “predicts the magnitude of chemotherapy benefit.” This seems not quite right. The magnitude of benefit from tamoxifen/chemo was identical in the intermediate and high risk score groups. What the test may have predicted in those 47 patients with high scores was a poor outcome with tamoxifen alone. One would think that the suggestion that Oncotype should serve as the basis for treatment selection for 100,000 women should not be based on the experience of an undefined 47 patient “chunk sample”.

 

The test costs about $3650. About 100,000 women may have ER-positive, node-negative cancer diagnosed this year. That’s $365 million for just one test in a complicated setting where many other images and tests will be required. Oncotype Dx should be verified by a prospective randomized trial that is appropriately stratified. Such a trial is underway.
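
The cost figure is simple multiplication, sketched here in Python with the two rough numbers quoted above. Both are estimates, and the variable names are mine.

```python
cost_per_test = 3650                # dollars, the approximate price quoted above
eligible_women_per_year = 100_000   # rough estimate of new ER-positive, node-negative cases

total_cost = cost_per_test * eligible_women_per_year
print(f"${total_cost:,}")           # prints $365,000,000
```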

 

More on this later.

 
 
 

 


 
© 2005-2007 MindSay Interactive LLC