Among the many mysteries of human biology is why complex diseases like diabetes, high blood pressure and psychiatric disorders are so difficult to predict and, often, to treat. An equally perplexing puzzle is why one individual gets a disease like cancer or depression, while an identical twin remains perfectly healthy.
Now scientists have discovered a vital clue to unraveling these riddles. The human genome is packed with at least four million gene switches that reside in bits of DNA that once were dismissed as “junk” but that turn out to play critical roles in controlling how cells, organs and other tissues behave. The discovery, considered a major medical and scientific breakthrough, has enormous implications for human health because many complex diseases appear to be caused by tiny changes in hundreds of gene switches.
The findings, which are the fruit of an immense federal project involving 440 scientists from 32 laboratories around the world, will have immediate applications for understanding how alterations in the non-gene parts of DNA contribute to human diseases, which may in turn lead to new drugs. They can also help explain how the environment can affect disease risk. In the case of identical twins, small changes in environmental exposure can slightly alter gene switches, with the result that one twin gets a disease and the other does not.
As scientists delved into the “junk” — parts of the DNA that are not actual genes containing instructions for proteins — they discovered it is not junk at all. At least 80 percent of it is active and needed. The result is an annotated road map of much of this DNA, noting what it is doing and how. It includes the system of switches that, acting like dimmer switches for lights, control which genes are used in a cell and when they are used, and determine, for instance, whether a cell becomes a liver cell or a neuron.
“It’s Google Maps,” said Eric Lander, president of the Broad Institute, a joint research endeavor of Harvard and the Massachusetts Institute of Technology. In contrast, the project’s predecessor, the Human Genome Project, which determined the entire sequence of human DNA, “was like getting a picture of Earth from space,” he said. “It doesn’t tell you where the roads are, it doesn’t tell you what traffic is like at what time of the day, it doesn’t tell you where the good restaurants are, or the hospitals or the cities or the rivers.”
The new result “is a stunning resource,” said Dr. Lander, who was not involved in the research that produced it but was a leader in the Human Genome Project. “My head explodes at the amount of data.”
The discoveries were published on Wednesday in six papers in the journal Nature and in 24 papers in Genome Research and Genome Biology. In addition, The Journal of Biological Chemistry is publishing six review articles, and Science is publishing yet another article.
Human DNA is “a lot more active than we expected, and there are a lot more things happening than we expected,” said Ewan Birney of the European Molecular Biology Laboratory-European Bioinformatics Institute, a lead researcher on the project.
In one of the Nature papers, researchers link the gene switches to a range of human diseases — multiple sclerosis, lupus, rheumatoid arthritis, Crohn’s disease, celiac disease — and even to traits like height. In large studies over the past decade, scientists found that minor changes in human DNA sequences increase the risk that a person will get those diseases. But those changes were in the junk, now often referred to as the dark matter — they were not changes in genes — and their significance was not clear. The new analysis reveals that a great many of those changes alter gene switches and are highly significant.
“Most of the changes that affect disease don’t lie in the genes themselves; they lie in the switches,” said Michael Snyder, a Stanford University researcher for the project, called Encode, for Encyclopedia of DNA Elements.
And that, said Dr. Bradley Bernstein, an Encode researcher at Massachusetts General Hospital, “is a really big deal.” He added, “I don’t think anyone predicted that would be the case.”
The discoveries also can reveal which genetic changes are important in cancer, and why. As they began determining the DNA sequences of cancer cells, researchers realized that most of the thousands of DNA changes in cancer cells were not in genes; they were in the dark matter. The challenge is to figure out which of those changes are driving the cancer’s growth.
In prostate cancer, for example, mutations have been found in important genes that are not readily attacked by drugs. But Encode, by showing which regions of the dark matter control those genes, gives another way to attack them: target those controlling switches.
Dr. Bernstein said, “This is a resource, like the human genome, that will drive science forward.”
The system, though, is stunningly complex, with many redundancies. Just the idea of so many switches was almost incomprehensible, Dr. Bernstein said.
There also is a sort of DNA wiring system that is almost inconceivably intricate.
“It is like opening a wiring closet and seeing a hairball of wires,” said Mark Gerstein, an Encode researcher from Yale. “We tried to unravel this hairball and make it interpretable.”
The project began in 2003, as researchers began to appreciate how little they knew about human DNA. In recent years, some began to find switches in the 99 percent of human DNA that is not genes, but they could not fully characterize or explain what a vast majority of it was doing.
The thought before the start of the project, said Thomas Gingeras, an Encode researcher from Cold Spring Harbor Laboratory, was that only 5 to 10 percent of the DNA in a human being was actually being used.
The big surprise was not only that almost all of the DNA is used but also that a large proportion of it is gene switches.
By the time the National Human Genome Research Institute, part of the National Institutes of Health, embarked on Encode, major advances in DNA sequencing and computational biology had made it conceivable to try to understand the dark matter of human DNA. Even so, the data analysis was daunting — the researchers generated 15 trillion bytes of raw data. Analyzing the data required the equivalent of more than 300 years of computer time.
Just organizing the researchers and coordinating the work was an enormous undertaking. Dr. Gerstein, who was one of the project’s leaders, has produced a diagram of the authors with their connections to one another. It looks nearly as complicated as the wiring diagram for the human DNA switches.