Human Proteome Folding, Phase 2: calculating the structure of proteins in the human body

Rilian	Jun 11 2008, 15:33 Post #1
interstellar Group: Team member Posts: 17 062 Joined: 22-February 06 From: Toronto User No.: 184 Gender: Not telling Free-DC_CPID Machines: a laptop and a piece of a server

Human Proteome Folding Project Phase 2
Official project results
Active experiments
Human Microbiome Project - official site
http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html
To learn how to join, read the main World Community Grid topic.

Proteins are essential to living beings. Just about everything in the human body involves or is made out of proteins.

What are proteins?
Proteins are large molecules that are made of long chains of smaller molecules called amino acids. While there are only 20 different kinds of amino acids that make up all proteins, sometimes hundreds of them make up a single protein. Adding to the complexity, proteins typically do not stay as long chains. As soon as the chain of amino acids is built, the chain folds and tangles up into a more compact and particular shape that lets it conduct specific and necessary functions within the human body.

Proteins fold because the different amino acids like to stick to each other following certain rules. Imagine that amino acids are pop-beads of 20 different colors. The pop-beads are sticky, but sticky in such a way that only certain combinations of colors can stick together. This makes the amino acid chains fold in a particular way that creates proteins that are useful to the human body. Human cells have mechanisms to help the proteins fold properly and, equally important, mechanisms to get rid of improperly folded proteins.

How do proteins relate to human genes?
The collection of all of the human genes is known as "the human genome." Depending on how the genes are counted, there are over 30,000 genes in the human genome. Each gene, which is a section of a long chain known as DNA, dictates how to build the chain of amino acids for one of the 30,000 proteins.

In recent years, scientists were able to map the sequence for each human gene. This means that we now know the sequence of amino acids in all of the human proteins. Thus, the human genome is directly related to the "human proteome," the collection of all human proteins.

The protein mystery
While researchers have learned a great deal about the human proteome, the functions of most of the proteins remain a mystery. The genes do not reveal exactly how the proteins will fold into their final shape, which is critical because that determines what a protein can do and what other proteins it can connect to or interact with.

Proteins are like puzzle pieces. For example, muscle proteins connect to each other to form a muscle fiber. They join together in a specific manner because of their shape, as well as other factors relating to the shape. Everything that goes on in cells and in the body is very specifically controlled by the shape of the proteins that do or do not let proteins interlock with other proteins. For example, the proteins of a virus or bacterium may have particular shapes that enable it to break through the cell membrane, allowing it to infect the cell.

The Human Proteome Folding Project
Knowing protein structures will let scientists understand how proteins perform their biological functions, and how diseases block proteins from carrying out the functions needed to keep cells healthy.

The Human Proteome Folding Project will combine the power of millions of computers in a grid to help scientists understand how human proteins fold. The work to be done in this monumental task is shared across this grid, so that results can be achieved far sooner than would be possible with conventional supercomputers. With a greater understanding of protein structure, scientists can learn how diseases work and ultimately find cures for them.

When your grid agent is running, it is folding an amino acid chain in various ways and evaluating how well each folding follows the specific rules of how specific amino acids stick together or not. As computers try millions of ways to fold the chains, they attempt to fold the protein in the same way that it actually folds in the human body. The best shapes identified for each protein are returned to the scientists for further study.

-----
It turns out Rosetta is used here too.

This post was edited by Rilian: Feb 4 2011, 00:23
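The "try many folds, score each one, keep the best" loop described above can be illustrated with a toy model. The sketch below is NOT Rosetta and not the project's code: it assumes the textbook 2D HP lattice model, where each residue is either hydrophobic ("H") or polar ("P") and a fold scores one point for every pair of H residues that touch on the grid without being chain neighbours. The function names and the scoring rule are illustrative assumptions.

```python
import random

# Unit moves on a 2D square lattice.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_fold(n, rng):
    """Return a random self-avoiding walk of n lattice points, or None if it gets stuck."""
    path = [(0, 0)]
    occupied = {(0, 0)}
    for _ in range(n - 1):
        x, y = path[-1]
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in occupied]
        if not free:
            return None  # dead end: every neighbouring cell is already used
        nxt = rng.choice(free)
        path.append(nxt)
        occupied.add(nxt)
    return path


def score(sequence, path):
    """Count H-H pairs that are lattice neighbours but not adjacent in the chain."""
    index_at = {pos: i for i, pos in enumerate(path)}
    contacts = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips chain neighbours.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return contacts


def best_fold(sequence, tries=2000, seed=0):
    """Try many random folds and keep the one with the most H-H contacts."""
    rng = random.Random(seed)
    best_path, best_score = None, -1
    for _ in range(tries):
        path = random_fold(len(sequence), rng)
        if path is None:
            continue
        s = score(sequence, path)
        if s > best_score:
            best_path, best_score = path, s
    return best_path, best_score
```

Calling `best_fold("HPHPPHHPHH")` returns a list of lattice coordinates together with its contact count. A real work unit searches a vastly larger 3D conformation space with a physically motivated energy function; the random search here only shows the overall shape of the loop.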

Replies (15 - 29)

cosmo_vk	Apr 2 2009, 06:49 Post #16
cruncher Group: Trusted Members Posts: 86 Joined: 19-September 08 From: Kovrov User No.: 828 Gender: Not telling Machines: a little of everything	Nah, for me it computes in around 3-4 hours. If it takes longer, it is a glitch, as with this task. I restarted BOINC and the task finished in a couple of hours. By the way, this was also discussed on the WCG forum: http://www.worldcommunitygrid.org/forums/w...ad?thread=24981

Rilian	Apr 2 2009, 11:31 Post #17

Ah... well, yes, this happens very rarely in HPF2. Maybe they will fix it someday.

--------------------

overall statistics: BOINCstats * FreeDC team: BOINC team Ukraine

Rilian	Apr 8 2009, 20:54 Post #18

Patrick Winters keeps delighting us with project status updates. Since the task database has no nice front end or web interface with bells and whistles, Patrick has promised to periodically update, on his home page, the list of experiments that are currently being computed:

http://homepages.nyu.edu/~rb133/wcg/experiments.html

HPF2 Experiments - Updated April 2009

Code	Organism	Range
mc	Trypanosoma cruzi strain CL Brener	238-999
md	Trypanosoma cruzi strain CL Brener	000-999
me	Trypanosoma cruzi strain CL Brener	000-999
mf	Trypanosoma cruzi strain CL Brener	000-999
mg	Trypanosoma cruzi strain CL Brener	000-999
mh	Trypanosoma cruzi strain CL Brener	000-999
mi	Trypanosoma cruzi strain CL Brener	000-822
mi	Plasmodium knowlesi	823-999
mj	Plasmodium knowlesi	000-999
mk	Plasmodium knowlesi	000-998
ml	Plasmodium knowlesi	000-999
mm	Plasmodium knowlesi	000-999
mn	Plasmodium knowlesi	000-325
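To get a feel for how the work in the table splits between the two organisms, the ranges can be summed. This is a rough sketch that assumes each range is an inclusive span of batch numbers within its two-letter code (the post does not spell this out); the rows are copied from the table above, and the function name is illustrative.

```python
from collections import defaultdict

# (code, organism, first, last) copied from the April 2009 table.
ROWS = [
    ("mc", "Trypanosoma cruzi strain CL Brener", 238, 999),
    ("md", "Trypanosoma cruzi strain CL Brener", 0, 999),
    ("me", "Trypanosoma cruzi strain CL Brener", 0, 999),
    ("mf", "Trypanosoma cruzi strain CL Brener", 0, 999),
    ("mg", "Trypanosoma cruzi strain CL Brener", 0, 999),
    ("mh", "Trypanosoma cruzi strain CL Brener", 0, 999),
    ("mi", "Trypanosoma cruzi strain CL Brener", 0, 822),
    ("mi", "Plasmodium knowlesi", 823, 999),
    ("mj", "Plasmodium knowlesi", 0, 999),
    ("mk", "Plasmodium knowlesi", 0, 998),
    ("ml", "Plasmodium knowlesi", 0, 999),
    ("mm", "Plasmodium knowlesi", 0, 999),
    ("mn", "Plasmodium knowlesi", 0, 325),
]


def batches_per_organism(rows):
    """Sum the inclusive range sizes (last - first + 1) for each organism."""
    totals = defaultdict(int)
    for _code, organism, first, last in rows:
        totals[organism] += last - first + 1
    return dict(totals)


totals = batches_per_organism(ROWS)
for organism, count in totals.items():
    print(f"{organism}: {count} batches")
```

Under that assumption the listed ranges cover 6585 batches for Trypanosoma cruzi and 4502 for Plasmodium knowlesi.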

vitalidze1	May 28 2009, 16:09 Post #19
USER Group: Trusted Members Posts: 1 367 Joined: 17-May 09 From: Vinnytsia User No.: 1 029 Gender: Male Free-DC_CPID Machines: ~15-20 computers.	cosmo_vk, I sometimes get glitches like that on my machines too, but only on the 2 machines at work, and only when the queue holds nothing but rice tasks. With a mix, everything is fine.

cosmo_vk	May 29 2009, 16:41 Post #20
Nah, rice runs fine for me. The glitch with this project seems to have gone away after I reset all of WCG.

Rilian	May 29 2009, 16:51 Post #21

It is not because of the project reset or anything like that. There is a bug in the HPF2 compute core that shows up occasionally. It has not been fixed yet.

cosmo_vk	May 29 2009, 16:57 Post #22
It is not showing up for now, and that is what matters. Admittedly, I have eased off on WCG and moved a fairly large share of my capacity to POEM.

Rilian	May 29 2009, 17:01 Post #23

For me it pops up roughly once per 1000 WUs.

Rilian	Oct 28 2009, 12:32 Post #24

Got the emerald medal for 1 year of CPU time.

Project status as of 1 Nov 2009!

http://homepages.nyu.edu/~rb133/wcg/thread_2009_11_01.html

Rilian	Oct 31 2009, 21:26 Post #25

Press release of October 28, 2009

HPF2 Update - November 2009

Greetings WCG Volunteers,

As the first World Community Grid project, we'd like to celebrate the WCG's anniversary with a recap of all the contributions to protein science that your work has made. Over the past few years, WCG volunteers have provided over 50,000 CPU years (as calculated by the WCG) and folded tens of thousands of protein sequences. Often there is very little known about the sequences we've folded, and WCG protein structure predictions provide the only available annotations for scientists studying these proteins. Biologists from different disciplines have used our structure predictions to make informed decisions about experiments and infer protein functions and molecular processes.

In the early stages of our project, an effort was made to make focused predictions for proteins of interest. The yeast proteome was originally targeted for the vast amount of other experimental data available.

Publication: Malmström L, Riffle M, Strauss CEM, Chivian D, Davis TN, Bonneau R, Baker D. Superfamily Assignments for the Yeast Proteome through Integration of Structure Prediction with the Gene Ontology. PLoS Biol. (2007) Apr;5(4):e76.
We predicted protein structures to further annotate this genome and complement the array of protein interaction and molecular function information on this heavily studied model organism. Our results confirmed the feasibility of extending our approach to other less studied, larger proteomes.

A cross section of organisms (including Human, Mouse, Fly, E. coli, Worm, and other unique organisms) has been processed completely, and protein sequences of unknown structure have been folded by the WCG. Our database has grown to include over a million protein sequences, and WCG predictions are complemented by known structures and a host of other structure and sequence metrics. We regularly receive special requests for predictions for proteins of varying kinds (including but not limited to those related to HIV infection, the development of Malaria, and particular bacterial enzymatic processes).

A few high profile uses of our database include:

Publication: Bonneau R, Facciotti MT, Reiss DJ, Madar A, Baliga NS, et al. A predictive model for transcriptional control of physiology in a free living cell. (2007) Cell. Dec 131:1354-1365.
Here we used our structure predictions to find transcription factors, the proteins that turn on and off genes. These predicted transcription factors proved critical (and accurate) in building the genome wide circuit for this organism. The general application here is environmental bioengineering and systems biology.

Publication: Mike Boxem, Zoltan Maliga, Niels J. Klitgord, Na Li, Irma Lemmens, Miyeko Mana, Lorenzo De Lichtervelde, Joram Mul, Diederik van de Peut, Maxime Devos, Nicolas Simonis, Anne-Lore Schlaitz, Murat Cokol, Muhammed A. Yildirim, Tong Hao, Changyu Fan, Chenwei Lin, Mike Tipsword, Kevin Drew, Matilde Galli, Kahn Rhrissorrakrai, David Drechsel, David E. Hill, Richard Bonneau, Kristin C. Gunsalus, Frederick P. Roth, Fabio Piano, Jan Tavernier, Sander van den Heuvel, Anthony A. Hyman, Marc Vidal. A Protein Domain-Based Interactome Network for C. elegans Early Embryogenesis. (2008) Cell, 134(3) pp. 534 - 545.
Here our predictions were used to map the boundaries between functional parts of proteins. This allows for a whole new way of looking at how proteins interact and co-function to form a working system that the cell relies on. The general application here is broad, as this describes a dataset all types of biologists will use.

Publication: Andersen-Nissen E, Smith KD, Bonneau R, Strong RK, Aderem A. A conserved surface on Toll-like receptor 5 recognizes bacterial flagellin. (2007) J Exp Med. Feb 19;204(2):393-403.
Here we predicted the structure of key immune proteins, resulting in a prediction that allowed us to re-engineer a key immune receptor, allowing for a better animal model of innate immune responses (key to figuring out several aspects of our response to bacterial infection). This publication has direct application to immunology and fighting infectious disease.

Recently, we've been working towards a paper that will describe our new methods, highlight our successes, and publicize the already open access to our database. This year we've received an average of 6,300 unique visitors a month. That's over 200 users a day (including weekends)! With the publication of our new methods we expect a significant increase in exposure and are preparing to provide multiple means of user-friendly access for the sometimes complex data. This will include using BioNetBuilder.

Publication: Iliana Avila-Campillo, Kevin Drew, John Lin, David J. Reiss, Richard Bonneau. BioNetBuilder, an automatic network interface. (2007) Bioinformatics. Feb 1;23(3):392-3.

Future work will undoubtedly involve the refinement of our protein structure annotations. We're investigating methods for incorporating evolutionary information into our predictions, and overhauling parts of the pipeline that are outdated. There is significant room for improvement in our methods for selecting native-state conformations from structure predictions and assigning family annotations. With the WCG we've been able to cast a wide net, and now we're interested in the improvement of our algorithms and classifiers. WCG predictions will continue to provide data for our ever improving experiments and value to the scientific community.

Here at the Bonneau Lab, we thank you for your dedication to science and ask that you keep crunching!
--
Patrick Winters
Bonneau Lab

As you can see, with the help of Human Proteome Folding, Phase 2, a lot of research was done in half a year, along with 5 publications.

Rilian	Mar 17 2010, 00:28 Post #26

We'll be uploading more work units to IBM soon; there's no concern. We also have a batch mp 200-999 that was initially skipped. So we've bought a few more weeks, and are working towards the next experiment. Some of the analysis we've performed for the paper-in-progress has inspired new ideas.

To answer rilian, we have already run about 100 different species through the HPF pipeline. The unifying factor is that all of these organisms provoke particular interest in the scientific community. Many are parasites and disease causing species that affect humans, some are important for studying human food sources, etc. It's safe to say that rice is one of the most important food sources for humans, if not the most heavily consumed food source. Improving annotation of the rice proteome, as well as the other organisms we've folded, greatly increases the available scientific resources for researchers studying any of these species.

As for folding all of the human proteome... There are somewhere around 30k protein coding regions, of which we identified near 70k uniquely folding protein domains. Many of these can be matched to known protein structures, above 50%, using sequence based similarity methods. We've folded thousands on the WCG, but not everything. Rosetta's effectiveness diminishes due to a number of factors including, but not limited to, protein length, disorder, and trans-membrane regions. We've already run everything that passed our filters.

https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,28336_lastpage,yes#271805

Rilian	Mar 25 2010, 02:43 Post #27

Project status update for March 2010

In short, as I understand it: by studying particular stable amino acid sequences in the proteins predicted by HPF2, from one generation of a protein to the next (the protein data come from different populations of the same animal or plant species), the project scientists look at which evolutionary factors cause which changes in protein structure and, where possible, which new functions the proteins acquire.

Next, using a probabilistic method (a search over a large database of results, which the project plans to carry out with the help of WCG computing power), the scientists will be able to find:

1) which evolutionary factors have acted on proteins whose functions are still unknown.
2) which functions proteins may acquire under particular evolutionary factors.

The latter seems to me especially relevant for practical use in creating new genetically modified organisms.

So, the article:

http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html

HPF2 Update - March 2010

Greetings WCG Volunteers,

We've been working diligently to develop a pipeline for a cooperative analysis of phylogenetic and structural data. We will integrate our structure predictions with knowledge of how proteins (and functional sites on folded proteins) evolve, by estimating the phylogenies of all protein domain families in our database and identifying positively-selected amino acid sites in these families using codon-based molecular evolution models that can be mapped onto the predicted structures. The first stages of this analysis are coming to fruition, and we've begun investigating preliminary results.

Using phylogenetic models, we intend to identify sites of proteins exhibiting evolutionary pressure. This may improve our understanding of how proteins evolve new functions and structures, and will ultimately lead to an increase in genome annotation for proteins whose purpose we know next to nothing about. The great scale and wealth of information in our database may allow us to improve upon our existing and future de novo structure and function predictions. Identifying structurally or functionally important residues in protein domains should inform our comparative modeling techniques. We use probabilistic methods to produce models of evolution using observed rates of mutation in protein families. Lots of different evolutionary pressures affect the mutation and expression of proteins, but we hope to garner insight with this analysis about how evolution adapts protein function.

Using our automated methods, we produced evolutionary models for a handful of identified protein domain families in major plant genomes. One such protein family matches PDB 1TQE "Myocyte Enhancer Factor-2". While this analysis is very preliminary (and I stress preliminary), positive selection analysis identifies a few residues that may be involved in DNA binding and the integrity of the dimer near the substrate. This is the kind of science we'll be investigating in the future using WCG predicted structures.

--
Patrick Winters
Bonneau Lab

PDB 1TQE: colored blue, with probability of positive selection highlighted yellow-red.

PDB 1TQE: the two chains colored blue and green, with probability of positive selection highlighted yellow-red.

Screenshot from embedded Jalview of the family's alignment.

Screenshot from embedded PhyloWidget of the family's phylogenetic tree.

Rilian	Mar 25 2010, 23:08 Post #28

A short summary of the previous text:

Organisms from the various branches of the tree of life share a lot in common. Wildly different organisms share much of the same molecular machinery, and as you group them into smaller categories their proteomes begin to look very similar. Using protein sequence similarity we can identify proteins from multiple organisms that clearly shared a common ancestor, and biologists have developed algorithms for determining their evolutionary relationships. From these relationships we can infer how evolutionary pressures change proteins... encourage or discourage mutations at certain places in the protein. You can imagine that some portions of a protein might be very important for carrying out its task and don't show many mutations, some portions just mutate randomly (neutral drift), and others seem to mutate wildly (perhaps as the protein develops a new function).

We've begun to perform this kind of analysis on some major plant protein families. It remains to be seen what kind of evolutionary trends we'll discover, but on a per protein basis this information can be very important to researchers. In my example I show how our analysis suggests that the DNA binding portions of a particular protein family are undergoing some sort of adaptive change.

Now we can perform this kind of analysis irrespective of structure predictions since it is based on protein sequence, but integrating it with WCG structure predictions is a primary goal of the project. The best part is that it doesn't require re-running any results from the WCG, and the trends we discover can be used to better select models from WCG runs and better identify functional properties.
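The distinction between positions that barely change and positions that mutate freely can be made concrete with a very simple measure. The sketch below is not the codon-based positive-selection analysis the Bonneau Lab describes; it assumes a much cruder stand-in, the Shannon entropy of each column of an already aligned set of protein sequences, and the sequences used are made up for illustration. An entropy of 0 bits means every sequence has the same residue at that position, while higher values mean more variation.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues found in one alignment column."""
    total = len(column)
    probs = [count / total for count in Counter(column).values()]
    return sum(-p * math.log2(p) for p in probs)


def site_entropies(alignment):
    """Return one entropy value per column of an equal-length alignment."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical four-sequence alignment, invented for this example.
alignment = ["MKVA", "MKLA", "MRIA", "MKTA"]
entropies = site_entropies(alignment)
for position, value in enumerate(entropies, start=1):
    print(f"site {position}: {value:.3f} bits")
```

In this toy alignment the first and last sites are fully conserved, the second varies a little, and the third differs in every sequence. Entropy alone cannot tell neutral drift from adaptive change; separating those cases is what the codon-based models mentioned in the update are for.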

Rilian	Apr 12 2010, 23:59 Post #29

We will be running a new Windows build for HPF2 through beta soon. This new build is to address the "ERROR:: Exit at: .\dock_structure.cc line:401" error. We have seen good results in our internal testing environment. Any members who have machines that experience this error may want to try and get some of the beta workunits. We appreciate the members' ongoing patience and help in the forums while we are working on this issue.

Thanks,
armstrdj

And, well, yes, there ARE beta WUs!
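For anyone who wants to check whether their own machines hit this error, a small helper can scan saved task output for the message quoted above. This is only a sketch: where the output text comes from (a copied result log, a saved stderr dump) is up to you, the function name is hypothetical, and the pattern assumes the message always has the exact "ERROR:: Exit at: <file> line:<number>" form shown in the announcement.

```python
import re

# Pattern built from the error text quoted in the announcement.
ERROR_PATTERN = re.compile(r"ERROR:: Exit at: (?P<file>\S+) line:(?P<line>\d+)")


def find_exit_errors(log_text):
    """Return a (source file, line number) tuple for every exit error in the text."""
    return [(m.group("file"), int(m.group("line")))
            for m in ERROR_PATTERN.finditer(log_text)]
```

Passing the text of a failed result to `find_exit_errors` gives an empty list when the error is absent, so it can be run over many saved logs to see how often the problem occurs.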

Rilian	Apr 26 2010, 01:50 Post #30

Got the sapphire medal for 2 CPU years in this project.
