  • In 2011, a group of researchers conducted a scientific study to find an impossible result: that listening to certain songs can make you younger.

    2011 年,一組研究人員進行了一項科學研究,發現了一個不可能的結果:聽某些歌曲可以讓你更年輕。

  • Their study involved real people, truthfully reported data, and commonplace statistical analyses.

    他們的研究涉及真實的人、如實的數據報告和常見的統計分析。

  • So, how did they do it?

    那麼他們是如何做到的呢?

  • The answer lies in a statistical method scientists often use to try to figure out whether their results mean something or if they're random noise.

    答案在於科學家們經常使用的一種統計方法,用來弄清楚他們的結果是否具有意義,或者只是隨機的變數。

  • In fact, the whole point of the music study was to point out ways this method can be misused.

    事實上,音樂研究的重點是指出這種方法可能被濫用的方式。

  • A famous thought experiment explains the method.

    一個著名的思想實驗解釋了這個方法。

  • There are eight cups of tea, four with the milk added first and four with the tea added first.

    有八杯茶,四杯先加牛奶,四杯先加茶。

  • A participant must determine which are which according to taste.

    試驗者必須根據味道確定哪些是哪些。

  • There are 70 different ways the cups can be sorted into two groups of four and only one is correct.

    有 70 種不同的方法可以將杯子分成兩組,每組四個,只有一種是正確的。

  • So, can she taste the difference?

    那麼,她能嚐出其中的不同嗎?

  • That's our research question.

    這就是我們的研究問題。

  • To analyze her choices, we define what's called a null hypothesis, that she can't distinguish the teas.

    為了分析她的選擇,我們定義了所謂的零假設,就是她無法區分出茶。

  • If she can't distinguish the teas, she'll still get the right answer 1 in 70 times by chance.

    如果她無法做出區分,她仍然有 70 分之 1 的機率碰巧答對。

  • 1 in 70 is roughly .014. That single number is called a p-value.

    70 分之 1 大約是 0.014,而這個單一數字則稱為 p 值。
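
The arithmetic behind the tea experiment can be checked in a few lines. This is only an illustrative sketch: the variable names are made up here, and it assumes every way of picking four "milk first" cups out of eight is equally likely when she is guessing.

```python
from math import comb

# Number of ways to choose which 4 of the 8 cups are "milk first".
ways = comb(8, 4)

# Under the null hypothesis (pure guessing), exactly one of those
# choices is fully correct, so this is the chance of a perfect sort.
p_value = 1 / ways

print(ways)               # number of possible sortings
print(round(p_value, 3))  # chance of a perfect sort by luck
```

The result falls below the usual .05 threshold, which is why a perfect sort counts as statistically significant in this setup.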

  • In many fields, a p-value of .05 or below is considered statistically significant, meaning there's enough evidence to reject the null hypothesis.

    在許多領域中,0.05 或以下的 p 值被認為具有統計顯著性,這意味著有足夠的證據來反駁零假設。

  • Based on a p-value of .014, they'd rule out the null hypothesis that she can't distinguish the teas.

    基於 0.014 的 p 值,他們排除了她無法區分茶的零假設。

  • Though p-values are commonly used by both researchers and journals to evaluate scientific results, they're really confusing, even for many scientists.

    儘管 p 值通常被研究人員和期刊用於評估科學結果,但它們確實令人困惑,即使對許多科學家來說也是如此。

  • That's partly because all a p-value actually tells us is the probability of getting a certain result, assuming the null hypothesis is true.

    有一部分是因為 p 值實際上告訴我們的是得到某個結果的概率,假設零假設為真。

  • So if she correctly sorts the teas, the p-value is the probability of her doing so assuming she can't tell the difference.

    因此,如果她正確地對茶進行了分類,則 p 值是假設她無法分辨差異的情況下她這樣做的概率。

  • But the reverse isn't true: the p-value doesn't tell us the probability that she can taste the difference, which is what we're trying to find out.

    但反之則不然:p 值並沒有告訴我們她能嘗出差異的概率,而這正是我們試圖要找出的數據。

  • So if a p-value doesn't answer the research question, why does the scientific community use it?

    因此,如果 p 值不能給出研究問題的答案,為什麼科學界要使用它呢?

  • Well, because even though a p-value doesn't directly state the probability that the results are due to random chance, it usually gives a pretty reliable indication.

    其實,因為即使 p 值不直接說明結果是由隨機的機會引起的概率,它通常也提供了非常可靠的指示。

  • At least, it does when used correctly. And that's where many researchers, and even whole fields, have run into trouble.

    至少,它在正確使用時確實如此,而這就是許多研究人員,甚至整個領域都會遇到麻煩的地方。

  • Most real studies are more complex than the tea experiment. Scientists can test their research question in multiple ways, and some of these tests might produce a statistically significant result, while others don't.

    大多數真正的研究比區分茶的實驗更複雜。科學家可以通過多種方式測試他們的研究問題,其中一些測試可能會產生具有統計意義的結果,而另一些則不會。

  • It might seem like a good idea to test every possibility. But it's not, because with each additional test, the chance of a false positive increases.

    測試每一種可能性似乎是個好主意,但事實並非如此,因為每進行一次額外的測試,誤報的機率就會增加。
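
A rough way to see why extra tests raise the false-positive risk: if each test has a 5% chance of a false positive and the tests are independent (a simplifying assumption that real analyses rarely satisfy exactly), the chance that at least one of them comes out "significant" by luck grows quickly. The function name below is hypothetical.

```python
def familywise_error(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 10, 20):
    print(k, round(familywise_error(k), 3))
```

Under these assumptions, a single test stays at 5%, while twenty tests give roughly a 64% chance of at least one spurious "significant" result.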

  • Searching for a low p-value, and then presenting only that analysis, is often called p-hacking.

    調查低的 p 值,然後僅呈現該分析,通常稱為 p 值駭客。

  • It's like throwing darts until you hit a bullseye and then saying you only threw the dart that hit the bullseye. This is exactly what the music researchers did.

    這就像扔飛鏢直到擊中靶心,然後說你只扔出了擊中靶心的飛鏢:這正是音樂研究人員所做的事情。

  • They played three groups of participants each a different song and collected lots of information about them.

    他們為三組試驗者演奏了不同的歌曲,並收集了關於他們的大量信息。

  • The analysis they published included only two out of the three groups.

    他們發表的分析只包括三組中的兩組。

  • Of all the information they collected, their analysis only used participants' fathers' age to "control for variation in baseline age across participants".

    在他們收集的所有信息中,他們的分析只使用了試驗者父親的年齡,來「控制參與者之間基準年齡的變化」。

  • They also paused their experiment after every ten participants, and continued if the p-value was above .05, but stopped when it dipped below .05.

    他們還在每 10 個試驗者進行之後暫停實驗,如果 p 值高於 0.05 則繼續,但低於 0.05 時就會停止。
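
The effect of stopping as soon as the p-value dips below .05 can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of the music study: both groups are drawn from the same distribution (so any "effect" is a false positive), the check happens after every 10 participants per group, recruitment is capped at 100 per group, and a simple z-test with known variance stands in for whatever test the researchers used. All names are hypothetical.

```python
import random
from statistics import NormalDist

def z_test_p(a, b):
    """Two-sided p-value for a difference in means (equal sizes, unit variance)."""
    n = len(a)
    z = (sum(a) / n - sum(b) / n) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate(n_sims=2000, batch=10, max_n=100, alpha=0.05, seed=0):
    """Compare false-positive rates: peeking after each batch vs. one final test."""
    rng = random.Random(seed)
    peeking_hits = 0  # experiments where any look showed p < alpha
    fixed_hits = 0    # experiments where only the final look is tested
    for _ in range(n_sims):
        a, b = [], []
        hit_on_any_look = False
        while len(a) < max_n:
            # Both groups come from the same distribution: no real effect.
            a += [rng.gauss(0, 1) for _ in range(batch)]
            b += [rng.gauss(0, 1) for _ in range(batch)]
            if z_test_p(a, b) < alpha:
                hit_on_any_look = True
        peeking_hits += hit_on_any_look
        fixed_hits += z_test_p(a, b) < alpha
    return peeking_hits / n_sims, fixed_hits / n_sims

peek_rate, fixed_rate = simulate()
print("stop when p < .05 at any look:", peek_rate)
print("test once at the end:", fixed_rate)
```

With one test at the end, the false-positive rate stays near the nominal 5%; allowing a stop at any of the ten looks pushes it well above that, even though each individual test is computed honestly.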

  • They found that participants who heard one song were 1.5 years younger than those who heard the other song, with a p-value of .04.

    他們發現聽到一首歌的試驗者比聽到另一首歌的試驗者年輕 1.5 歲,p 值為 0.04。

  • Usually it's much tougher to spot p-hacking, because we don't know the results are impossible: the whole point of doing experiments is to learn something new.

    通常發現 p 值駭客很困難,因為我們並不知道結果是不可能的:整個做實驗的意義就在於學習新的東西。

  • Fortunately, there's a simple way to make p-values more reliable: pre-registering a detailed plan for the experiment and analysis beforehand that others can check, so researchers can't keep trying different analyses until they find a significant result.

    幸運的是,有一種簡單的方法可以讓 p 值更可靠:為實驗和分析預先紀錄一個詳細的計劃,以便其他人可以檢查,這樣研究人員就不能繼續嘗試不同的分析,直到他們找到重要的結果。

  • And, in the true spirit of scientific inquiry, there's even a new field that's basically science doing science on itself: studying scientific practices in order to improve them.

    而且,本著科學探究的真正精神,甚至還出現一個新的領域,基本上是對科學本身做科學:研究科學實驗以改善它們。

  • This new field has emerged in response to a crisis in science, and p-hacking is just one part of that crisis. So, what's going on? And can we fix it? Learn more with this video.

    這個新領域的出現是為了應對科學危機,而 p 值駭客只是這危機的一部分,那麼,到底發生什麼事了?我們可以修復它嗎? 請觀看這支影片了解更多內容。

The method that can "prove" almost anything - James A. Smith (TED-Ed)

  • Minjane posted on 2021/09/18