吸菸與肺癌的關係真的有那麼大嗎

It is said that there is a correlation between the number of storks’ nests found on Danish houses and the number of children born in those houses. Could the old story about babies being delivered by storks really be true? No. Correlation is not causation. Storks do not deliver children but larger houses have more room both for children and for storks.
丹麥流傳着一種說法,一戶人家屋檐上的鸛巢數量與這家人所生孩子的數量存在着相關性。嬰兒是鸛鳥送來的古老傳說是真的嗎?當然不是。相關性跟因果關係不是一回事。鸛不會送來孩子,但大房子有更大的空間爲孩子和鸛所用。
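The stork argument can be sketched as a small simulation. This is only an illustration under invented assumptions: house size is drawn uniformly from 1 to 5, and each unit of size independently gets a nest with probability 0.3 and a child with probability 0.4. None of these numbers come from the article or from Danish data. Nests and children never influence each other in the code, yet they end up correlated because house size drives both.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_houses(n=10000, seed=0):
    """Return (size, nests, children) tuples; all parameters are hypothetical."""
    rng = random.Random(seed)
    houses = []
    for _ in range(n):
        size = rng.randint(1, 5)  # the confounder: bigger house, more room
        nests = sum(rng.random() < 0.3 for _ in range(size))
        children = sum(rng.random() < 0.4 for _ in range(size))
        houses.append((size, nests, children))
    return houses


houses = simulate_houses()

# Correlation across all houses, ignoring size.
overall = pearson([h[1] for h in houses], [h[2] for h in houses])

# Correlation among houses of the same size (the confounder held fixed).
within = {}
for size in range(1, 6):
    group = [h for h in houses if h[0] == size]
    within[size] = pearson([h[1] for h in group], [h[2] for h in group])

print(f"overall correlation: {overall:.2f}")
for size, r in within.items():
    print(f"size {size}: correlation {r:.2f}")
```

Comparing houses of the same size is the simplest form of controlling for a confounder: once size is held fixed, the apparent link between storks and babies should shrink towards zero.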

This much-loved statistical anecdote seems less amusing when you consider how it was used in a US Senate committee hearing in 1965. The expert witness giving testimony was arguing that while smoking may be correlated with lung cancer, a causal relationship was unproven and implausible. Pressed on the statistical parallels between storks and cigarettes, he replied that they “seem to me the same”.
這是一則人們喜聞樂見的統計趣聞,但如果你知道1965年在美國參議院一場聽證會上它是如何被用到的,你就不會覺得那麼有趣了。那位做聽證發言的專家證人辯稱,儘管吸菸或許跟肺癌相關,但兩者之間不存在已證明的、令人信服的因果關係。當被問及爲何把鸛和孩子的關係與香菸和肺癌的關係進行類比,他回答說,兩者“在我看來是一樣的”。

The witness’s name was Darrell Huff, a freelance journalist beloved by generations of geeks for his wonderful and hugely successful 1954 book How to Lie with Statistics. His reputation today might be rather different had the proposed sequel made it to print. How to Lie with Smoking Statistics used a variety of stork-style arguments to throw doubt on the connection between smoking and cancer, and it was supported by a grant from the Tobacco Institute. It was never published, for reasons that remain unclear. (The story of Huff’s career as a tobacco consultant was brought to the attention of statisticians in articles by Andrew Gelman in Chance in 2012 and by Alex Reinhart in Significance in 2014.)
這位證人的名字叫達萊爾•哈夫(Darrell Huff),是一名自由記者,因其1954年出版的那本精彩、大爲暢銷的《統計數字會撒謊》(How to Lie with Statistics)而深受數代極客的愛戴。如果該書續集付印的話,他今天的名聲或許會完全不同。《吸菸統計數字會撒謊》(How to Lie with Smoking Statistics)使用了各種鸛式論點來對吸菸與癌症的相關性提出質疑。該書得到了美國的菸草研究所(Tobacco Institute)資助,但不知出於什麼原因一直沒有出版。(2012年安德魯•格爾曼(Andrew Gelman)在《Chance》雜誌上發表的文章,以及2014年亞歷克斯•萊因哈特(Alex Reinhart)在《Significance》雜誌上發表的文章,使哈夫擔任菸草業顧問的經歷引起統計學家們的注意。)

Indisputably, smoking causes lung cancer and various other deadly conditions. But the problematic relationship between correlation and causation in general remains an active area of debate and confusion. The “spurious correlations” compiled by Harvard law student Tyler Vigen and displayed on his website should be a warning. Did you realise that consumption of margarine is strongly correlated with the divorce rate in Maine?
毋庸置疑,吸菸會導致肺癌和其他多種致命疾病。但廣泛意義上的相關性與因果之間的尚存疑問的關係,仍是當前一個極易引起爭議和混淆的領域。哈佛大學(Harvard)法學院學生泰勒•維根(Tyler Vigen)編撰並發布在其網站上的“僞相關”應算是一種警告。你知道緬因州人造奶油的消費量與離婚率之間存在很強的相關性嗎?

We cannot rely on correlation alone, then. But insisting on absolute proof of causation is too exacting a standard (arguably, an impossible one). Between those two extremes, where does the right balance lie between trusting correlations and looking for evidence of causation?
所以,我們不能僅僅依賴相關性。但是,堅持爲因果關係提供絕對證據就過於苛刻了(甚至是一種不可能達到的標準)。在這兩個極端之間,如何在相信相關性與尋找因果證據之間達到合理的平衡呢?

Scientists, economists and statisticians have tended to demand causal explanations for the patterns they see. It’s not enough to know that college graduates earn more money — we want to know whether the college education boosted their earnings, or if they were smart people who would have done well anyway. Merely looking for correlations was not the stuff of rigorous science.
科學家、經濟學家和統計學家傾向於要求爲他們看到的現象提出因果解釋。知道大學畢業生能賺更多錢還不夠,我們想知道,大學教育是否提高了他們的收入,或者他們本來就是聰明人、不管接受大學教育與否都能賺更多錢。僅僅尋找相關性並非嚴格科學的做法。

But with the advent of “big data” this argument has started to shift. Large data sets can throw up intriguing correlations that may be good enough for some purposes. (Who cares why price cuts are most effective on a Tuesday? If it’s Tuesday, cut the price.) Andy Haldane, chief economist of the Bank of England, recently argued that economists might want to take mere correlations more seriously. He is not the first big-data enthusiast to say so.
但隨着“大數據”的到來,這場爭論開始發生變化。海量數據集可以產生一些有趣的相關性,在某些用途上它們就足夠好用了(誰關心爲何週二降價效果最好呢?如果確是這樣,那就選這一天降價。)英國央行(BoE)首席經濟學家安德魯•霍爾丹(Andy Haldane)不久前表示,經濟學家們或許想更認真地看待純粹相關性(mere correlation)。他不是第一個這麼說的大數據熱衷者。

This brings us back to smoking and cancer. When the British epidemiologist Richard Doll first began to suspect the link in the late 1940s, his analysis was based on a mere correlation. The causal mechanism was unclear, as most of the carcinogens in tobacco had not been identified; Doll himself suspected that lung cancer was caused by fumes from tarmac roads, or possibly cars themselves.
我們回頭來講抽菸與癌症之間的關係。20世紀40年代末,英國流行病學家理查德•多爾(Richard Doll)最早開始懷疑二者之間的聯繫。當時他的分析基於純粹相關性,他不清楚因果機制,因爲當時還沒確定菸草中的大多數致癌物。多爾本人懷疑肺癌的致病原因是柏油公路的煙氣,或者可能就是汽車本身。

Doll’s early work on smoking and cancer with Austin Bradford Hill, published in 1950, was duly criticised in its day as nothing more than a correlation. The great statistician Ronald Fisher repeatedly weighed into the argument in the 1950s, pointing out that it was quite possible that cancer caused smoking — after all, precancerous growths irritated the lung. People might smoke to soothe that irritation. Fisher also observed that some genetic predisposition might cause both lung cancer and a tendency to smoke. (Another statistician, Joseph Berkson, observed that people who were tough enough to resist adverts and peer pressure were also tough enough to resist lung cancer.)
多爾與奧斯汀•布拉德福德•希爾(Austin Bradford Hill)在1950年發表了他們關於吸菸與癌症關係的早期研究結果,由於倆人的研究基於純粹相關性,在當時果不其然遭到了批評。偉大的統計學家羅納德•費雪(Ronald Fisher)在20世紀50年代多次加入論戰,指出很可能是癌症引起吸菸,畢竟癌前期病變會對肺部造成刺激,人們可能會通過吸菸來緩解這一刺激。費雪還認爲有些遺傳特徵可能既會引發肺癌,還會引起吸菸傾向。(另一位統計學家約瑟夫•伯克森(Joseph Berkson)提出,假如一個人強悍到足以抵制廣告的誘惑和同齡人的壓力,那麼他也強悍到足以抵抗肺癌。)

Hill and Doll showed us that correlation should not be dismissed too easily. But they also showed that we shouldn’t give up on the search for causal explanations. The pair painstakingly continued their research, and evidence of a causal association soon mounted.
希爾和多爾的例子告訴我們,不要輕易否定相關性,但他們也以行動證明,不應放棄尋找因果解釋。倆人繼續勤懇研究,很快就發現了更多表明因果關係的證據。

Hill and Doll took a pragmatic approach in the search for causation. For example, is there a dose-response relationship? Yes: heavy smokers are more likely to suffer from lung cancer. Does the timing make sense? Again, yes: smokers develop cancer long after they begin to smoke. This contradicts Fisher’s alternative hypothesis that people self-medicate with cigarettes in the early stages of lung cancer. Do multiple sources of evidence add up to a coherent picture? Yes: when doctors heard about what Hill and Doll were finding, many of them quit smoking, and it became possible to see that the quitters were at lower risk of lung cancer. We should respect correlation but it is a clue to a deeper truth, not the end of our investigations.
希爾和多爾在尋找因果關係時採取了一種務實的方法。比如,是否存在一種劑量效應?是的,煙癮大的人更可能患肺癌。時間順序說得通嗎?同樣說得通,吸菸者是在開始吸菸很久之後才患上癌症的。這與費雪提出的另一種假設相矛盾,即人們在肺癌早期階段用香菸進行自我醫療。多個證據來源湊到一起能否得到一個邏輯連貫的描述?答案是:能夠得到。當醫生們聽聞希爾和多爾的發現時,許多醫生開始戒菸,現實情況也表明戒菸者患肺癌的風險要更低。我們應該尊重相關性,但相關性只是通向更深層真理的一個線索,而不是研究的終點。
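The dose-response question can be sketched in code as a check that risk rises with exposure. The figures below are invented for illustration and are not Hill and Doll's data: the dose bands, the assumed risks, and the names `ASSUMED_RISK`, `simulate_cohort` and `is_monotone_increasing` are all hypothetical.

```python
import random

# Hypothetical lifetime risk per dose band (cigarettes per day); not real data.
ASSUMED_RISK = [("0", 0.01), ("1-14", 0.05), ("15-24", 0.10), ("25+", 0.20)]


def simulate_cohort(people_per_group=5000, seed=1):
    """Simulate case counts per dose band and return observed rates."""
    rng = random.Random(seed)
    rates = []
    for label, risk in ASSUMED_RISK:
        cases = sum(rng.random() < risk for _ in range(people_per_group))
        rates.append((label, cases / people_per_group))
    return rates


def is_monotone_increasing(rates):
    """True if each dose band has a strictly higher rate than the one before."""
    values = [rate for _, rate in rates]
    return all(a < b for a, b in zip(values, values[1:]))


rates = simulate_cohort()
for label, rate in rates:
    print(f"{label:>6} cigarettes/day: observed rate {rate:.3f}")
print("dose-response pattern:", is_monotone_increasing(rates))
```

A rising pattern like this is hard to explain with a stork-style coincidence, which is why it counts as evidence towards causation, though it is still a clue and not proof on its own.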

It’s not clear why Huff and Fisher were so fixated on the idea that the growing evidence on smoking was a mere correlation. Both of them were paid as consultants by the tobacco industry and some will believe that the consulting fees caused their scepticism. It seems just as likely that their scepticism caused the consulting fees. We may never know.
目前尚不清楚爲什麼面對越來越多的吸菸致癌的證據,哈夫和費雪卻執着地認爲這僅是相關性。他們二人都是菸草行業的顧問,因而有些人會認爲他們的懷疑動機來源於顧問費。但也很可能正是他們的懷疑帶來了顧問費。到底哪個爲因,哪個爲果,後人可能永遠不得而知。