Big Data: Not Really the Same as Level 1 Data

January 15, 2015


I have the greatest respect for the seminal contributions of Drs. Crawford and Moul to the domain of prostate cancer therapy. As a result, I read their review on the side effects of androgen-deprivation therapy (ADT)[1] carefully and with interest, notwithstanding that the authors received medical writing assistance funded by Ferring Pharmaceuticals, the manufacturer of a commercially available luteinizing hormone–releasing hormone (LHRH) antagonist. (Of importance, the authors note that there was no corporate interference with the editorial development of their article.) The introduction of LHRH antagonists, as compared with agonists, has been interesting and complex but is beyond the scope of my brief note.

I was able to glean from this review that Crawford and Moul find the literature on the side effects of ADT to be somewhat inconsistent and occasionally confusing, and I think that to be fair comment.

That said, when conducting a review of published information, it is often useful to look for potential conflicts of interest among the authors who provide data suggesting that the cardiovascular side effects of ADT do not exist. Of course, such conflicts may not be present, and the repeated publication of papers suggesting that ADT has no real cardiovascular side effects may actually just represent the misuse of large data bases or, conversely, the misinterpretation of very small data bases with short follow-up.

The problem with large sets of data is the risk of the “GIGO” principle (viz., garbage in, garbage out), and it requires a very careful and thoughtful investigator to rule out the many errors of large-scale data capture before implementing any large (or small) data base. The big-data advocates believe that errors will be lost in purported real-world data sets that reflect the community as a whole; however, it seems to me that repeating an error of data recording 10,000 times, or introducing inadvertent selection bias or reporting bias 10,000 times, does not really improve the quality of the associated report.

In the case of their current article, Crawford and Moul seem to believe it is not clear whether ADT causes cardiovascular side effects. Based on prospective published studies reported by others and noted by these authors, and on my own experience, I feel that there is an increased risk of cardiovascular disease in patients who are castrated, particularly via medical therapy. While I usually believe level 1 evidence, huge historically based studies from large data bases do not meet my definition of level 1 evidence. With respect to the many analyses and meta-analyses of sets of randomized patients that have led to questioning of the impact of ADT with regard to late effects, one can reasonably question several methodologic issues that are beyond the scope of a brief editorial comment. Having watched many patients develop cardiovascular complications or metabolic syndrome after the initiation of ADT, with no prior history or relevant risk factors, I am convinced that ADT does create metabolic syndrome in some patients. That said, it remains a very useful treatment for men with prostate cancer who have the appropriate evidence-based indications.

What is much less compelling to me is the discussion by this fine duo of the Chung paper on pneumonia.[2] They seem to accept the observation as fact, whereas I am not convinced that the source paper should have been published!! The study reflects another large, national data base of uncertain quality, with prostate cancer patients who were quite different in the ADT and non-castrate groups (significantly older age, potentially different Charlson comorbidity scores, etc.), and the study lacked both a definition of what constituted “pneumonia” and clarity about other treatments that were being administered to two quite different prostate cancer populations. There are many buckets in this study, and the absence of significant P values for the two patient populations, once allocated into these categorical buckets, may just represent the small sample size of the purported pneumonia victims. Thus, it is not at all clear that “pneumonia” really is increased in the castrate or ADT-treated population.

In sum, the review by Crawford and Moul is interesting, but I am not sure that I agree with all of its conclusions. As the year 2015 begins, it is probably more interesting to think about the potential late effects of some of the newer agents that are in widespread use, rather than running around the same old track of recycled castration data, irrespective of whether the data reflect surgical castration or agonist/antagonist treatment. The authors’ hint that antagonists have less toxicity than agonists really should not be dignified by any specific comment.

Financial Disclosure: The author has no significant financial interest or other relationship with the manufacturers of any products or providers of any service mentioned in this article.


1. Crawford ED, Moul JW. ADT risks and side effects in advanced prostate cancer: cardiovascular and acute renal injury. Oncology (Williston Park). 2015;29:55-66.

2. Chung SD, Liu SP, Lin HC, Wang LH. Increased risk of pneumonia in patients receiving gonadotropin-releasing hormone agonists for prostate cancer. PLoS One. 2014;9:e101254.