Human Proteome Folding Project
Phase 2
http://homepages.nyu.edu/~rb133/wcg/rbonneau_posts.html
http://homepages.nyu.edu/~rb133/wcg/experiments.html
http://commonfund.nih.gov/hmp/
http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html
http://distributed.org.ua/forum/index.php?showtopic=890
Proteins are essential to living beings. Just about everything in the human body involves or is made out of proteins.
What are proteins?
Proteins are large molecules that are made of long chains of smaller molecules called amino acids. While there are only 20 different kinds of amino acids that make up all proteins, sometimes hundreds of them make up a single protein.
Adding to the complexity, proteins typically do not stay as long chains. As soon as the chain of amino acids is built, the chain folds and tangles up into a more compact and particular shape that lets it conduct specific and necessary functions within the human body.
Proteins fold because the different amino acids like to stick to each other following certain rules. Imagine that amino acids are pop-beads of 20 different colors. The pop-beads are sticky, but sticky in such a way that only certain combinations of colors can stick together. This makes the amino acid chains fold in a particular way that creates proteins that are useful to the human body. Human cells have mechanisms to help the proteins fold properly and, equally important, mechanisms to get rid of improperly folded proteins.
How do proteins relate to human genes?
The collection of all of the human genes is known as "the human genome." Depending on how the genes are counted, there are over 30,000 genes in the human genome. Each gene, which is a section of a long chain known as DNA, dictates how to build the chain of amino acids for one of the 30,000 proteins. In recent years, scientists were able to map the sequence for each human gene. This means that we now know the sequence of amino acids in all of the human proteins. Thus, the human genome is directly related to the "human proteome," the collection of all human proteins.
The protein mystery
While researchers have learned a great deal about the human proteome, the functions of most of the proteins remain a mystery. The genes do not reveal exactly how the proteins will fold into their final shape, which is critical because that determines what a protein can do and what other proteins it can connect to or interact with.
Proteins are like puzzle pieces. For example, muscle proteins connect to each other to form a muscle fiber. They join together in a specific manner because of their shape, as well as other factors relating to the shape.
Everything that goes on in cells and in the body is very specifically controlled by the shape of the proteins that do or do not let proteins interlock with other proteins. For example, the proteins of a virus or bacteria may have particular shapes that enable it to break through the cell membrane, allowing it to infect the cell.
The Human Proteome Folding Project
Knowing protein structures will let scientists understand how proteins perform their biological functions, and how diseases block proteins from carrying out the functions needed to keep cells healthy.
The Human Proteome Folding Project will combine the power of millions of computers in a grid to help scientists understand how human proteins fold. The work to be done in this monumental task is shared across this grid, so that results can be achieved far sooner than would be possible with conventional supercomputers. With a greater understanding of protein structure, scientists can learn how diseases work and ultimately find cures for them.
When your grid agent is running, it is folding an amino acid chain in various ways and evaluating how well each folding follows the specific rules of how specific amino acids stick together or not. As computers try millions of ways to fold the chains, they attempt to fold the protein in the same way that it actually folds in the human body. The best shapes identified for each protein are returned to the scientists for further study.
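The "try many folds, score each one, keep the best" loop described above can be illustrated with a toy model. The sketch below is not Rosetta and not the HPF2 application: it assumes a simplified 2D lattice "HP" model in which each residue is either sticky (H) or not (P), and the score is just the number of sticky contacts. The sequence, the function names and the trial count are all made up for illustration.

```python
import random

# Toy 2D lattice "HP" model: H = sticky residue, P = non-sticky residue.
# This is NOT Rosetta's algorithm or scoring function; it only illustrates
# the "sample folds, score them, keep the best" idea.

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def place(directions):
    """Turn a list of step directions into lattice coordinates.

    Returns None if the chain would run into itself.
    """
    pos = (0, 0)
    coords = [pos]
    seen = {pos}
    for d in directions:
        dx, dy = MOVES[d]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in seen:
            return None
        seen.add(pos)
        coords.append(pos)
    return coords


def score(sequence, coords):
    """Count H-H pairs that touch on the lattice but are not chain neighbours."""
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look right and up only, so every touching pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


def search(sequence, trials=20000, seed=0):
    """Try random folds and return (best score, best list of directions)."""
    rng = random.Random(seed)
    best_score, best_fold = -1, None
    for _ in range(trials):
        directions = [rng.choice("UDLR") for _ in range(len(sequence) - 1)]
        coords = place(directions)
        if coords is None:
            continue  # self-intersecting chain, discard
        s = score(sequence, coords)
        if s > best_score:
            best_score, best_fold = s, directions
    return best_score, best_fold


if __name__ == "__main__":
    best_score, best_fold = search("HPHPPHHPHH")  # hypothetical sequence
    print(best_score, "".join(best_fold))
```

Real structure prediction works in 3D with a far richer energy function and smarter sampling than blind random trials, which is why it needs grid-scale computing.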
-----
Turns out Rosetta is used here too
Rosetta is watching you!
P.S. When are you going to fix your table?
We should translate the header post...
In short: using the algorithms of the Rosetta project, this project computes more accurate structures (than those currently in scientists' databases) of the proteins in the human body and of its pathogens. A rather important thing
I've racked up 100 CPU days in the project
That came to 372 WUs and earned 40,000 BOINC points, plus the Gold Medal for Human Proteome Folding 2
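For anyone curious what those totals mean per work unit, here is a quick back-of-the-envelope calculation. It only uses the three figures from the post above (100 CPU days, 372 WUs, 40,000 points) and treats them as exact, although they are probably rounded; the function name is made up.

```python
def per_work_unit(cpu_days, work_units, points):
    """Average CPU hours and BOINC points per work unit."""
    hours_per_wu = cpu_days * 24 / work_units
    points_per_wu = points / work_units
    return hours_per_wu, points_per_wu


# Figures taken from the post above; treated as exact for illustration.
hours, points = per_work_unit(cpu_days=100, work_units=372, points=40000)
print(f"{hours:.1f} CPU hours per WU, {points:.0f} points per WU")
```

That works out to roughly 6.5 CPU hours and about 108 points per work unit on that machine.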
Richard Bonneau, head scientist of the Human Proteome Folding Project, was active in the original development of Rosetta at David Baker's laboratory while obtaining his PhD. More information on the relationship between the HPF1, HPF2 and Rosetta@home can be found on Richard Bonneau's website http://homepages.nyu.edu/~rb133/wcg/rbonneau_posts.html
Computed 500 proteins in 142 CPU days
Announcing a mini-competition: who can crunch 1000 WUs of the project fastest. The winner gets the audience choice award
Participants in the crunch race:
rilian
Vzhik
http://www.yeastrc.org/pdr/pages/search/advancedSearchForm.jsp
http://www.boinc-af.org/content/view/543/287/
http://www.boinc-af.org/content/view/668/219/
http://www.dp.by/wiki/Projects/Humanproteomefoldingproject
By the way, has anyone had problems crunching the tasks?
I've had this several times now: judging by the CPU load the task seems to be running, but the progress is stuck. The only cure is restarting BOINC.
No, but on Vista I have problems specifically with HPFP2...
The computations do stall after all; now the same glitch has shown up at home too.
The second task from the top has a run time of 19:33:22. In that time I usually get through 5-6 tasks, and here a single one is taking that long.
On my 2 GHz Xeons it sometimes takes 20 hours...
Nope, mine take around 3-4 hours. If it's longer, it's a glitch, and that's the case with this task too. I restarted BOINC and the task finished within a couple of hours.
By the way, this has also been discussed on the WCG forum:
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=24981
Ah... well yes, it happens very rarely in HPFP2. Maybe they'll fix it someday
Patrick Winters keeps delighting us with project status updates. Since the task database has no pretty front end or web interface with all the bells and whistles, Patrick has promised on his home page to periodically update the list of experiments currently being crunched
http://homepages.nyu.edu/~rb133/wcg/experiments.html
HPF2 Experiments - Updated April 2009
Code | Organism | Range |
---|---|---|
mc | Trypanosoma cruzi strain CL Brener | 238-999 |
md | Trypanosoma cruzi strain CL Brener | 000-999 |
me | Trypanosoma cruzi strain CL Brener | 000-999 |
mf | Trypanosoma cruzi strain CL Brener | 000-999 |
mg | Trypanosoma cruzi strain CL Brener | 000-999 |
mh | Trypanosoma cruzi strain CL Brener | 000-999 |
mi | Trypanosoma cruzi strain CL Brener | 000-822 |
mi | Plasmodium knowlesi | 823-999 |
mj | Plasmodium knowlesi | 000-999 |
mk | Plasmodium knowlesi | 000-998 |
ml | Plasmodium knowlesi | 000-999 |
mm | Plasmodium knowlesi | 000-999 |
mn | Plasmodium knowlesi | 000-325 |
cosmo_vk,
I sometimes get glitches like that popping up too, but only on the 2 machines at work, and only when the queue contains nothing but rice tasks. If it's a mix, everything is OK
Nope, rice runs fine for me.
The glitch with this project seems to have gone away after resetting all of WCG.
It's not because of the project reset, etc. There is a bug in the HPF2 computational core, and it shows up occasionally. It hasn't been fixed yet
For now it isn't showing up, and that's what matters. Admittedly, I've throttled back on WCG and moved a fairly significant share of my capacity over to POEM.
For me it crops up about once per 1000 WUs
Got the emerald medal for 1 year of CPU time
Project status as of 1 Nov 2009!
http://homepages.nyu.edu/~rb133/wcg/thread_2009_11_01.html
Press release of 28 October 2009
HPF2 Update - November 2009
Greetings WCG Volunteers,
As the first World Community Grid project, we'd like to celebrate the WCG's anniversary with a recap of all the contributions to protein science that your work has made. Over the past few years, WCG volunteers have provided over 50,000 CPU years (as calculated by the WCG) and folded tens of thousands of protein sequences. Often there is very little known about the sequences we've folded, and WCG protein structure predictions provide the only available annotations for scientists studying these proteins. Biologists from different disciplines have used our structure predictions to make informed decisions about experiments and infer protein functions and molecular processes.
In the early stages of our project, an effort was made to make focused predictions for proteins of interest. The yeast proteome was originally targeted for the vast amount of other experimental data available.
Publication http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0050076
We predicted protein structures to further annotate this genome and complement the array of protein interaction and molecular function information on this heavily studied model organism. Our results confirmed the feasibility of extending our approach to other less studied, larger proteomes.
A cross section of organisms (including Human, Mouse, Fly, E. coli, Worm, and other unique organisms) have been processed completely, and protein sequences of unknown structure have been folded by the WCG. Our database has grown to include over a million protein sequences, and WCG predictions are complemented by known structures and a host of other structure and sequence metrics. We regularly receive special requests for predictions for proteins of varying kind (including but not limited to those related to HIV infection, the development of Malaria, and particular bacterial enzymatic processes).
A few high profile uses of our database include:
Publication http://dx.doi.org/10.1016/j.cell.2007.10.053
Here we used our structure predictions to find transcription factors, the proteins that turn on and off genes. These predicted transcription factors proved critical (and accurate) in building the genome wide circuit for this organism. The general application here is environmental bioengineering and systems biology.
Publication http://dx.doi.org/10.1016/j.cell.2008.07.009
Here our predictions were used to map the boundaries between functional parts of proteins. This allows for a whole new way of looking at how proteins interact and co-function to form a working system that the cell relies on. The general application here is broad, as this describes a dataset all types of biologists will use.
Publication http://dx.doi.org/10.1084/jem.20061400
Here we predicted the structure of key immune proteins, resulting in a prediction that allowed us to re-engineer a key immune receptor, allowing for a better animal model of innate immune responses (key to figuring out several aspects of our response to bacterial infection). This publication has direct application to immunology and fighting infectious disease.
Recently, we've been working towards a paper that will describe our new methods, highlight our successes, and publicize the already open access to our database. This year we've received an average of 6,300 unique visitors a month. That's over 200 users a day (including weekends)! With the publication of our new methods we expect a significant increase in exposure and are preparing to provide multiple means of user-friendly access for the sometimes complex data. This will include using BioNetBuilder.
Publication http://bioinformatics.oxfordjournals.org/cgi/content/abstract/23/3/392
Future work will undoubtedly involve the refinement of our protein structure annotations. We're investigating methods for incorporating evolutionary information into our predictions, and overhauling parts of the pipeline that are outdated. There is significant room for improvement in our methods for selecting native-state conformations from structure predictions and assigning family annotations. With the WCG we've been able to cast a wide net, and now we're interested in the improvement of our algorithms and classifiers. WCG predictions will continue to provide data for our ever improving experiments and value to the scientific community.
Here at the Bonneau Lab, we thank you for your dedication to science and ask that you keep crunching!
--
Patrick Winters
Bonneau Lab
As you can see, with the help of the Human Proteome Folding, Phase 2 project, a lot of research was done and 5 papers were published in half a year
We'll be uploading more work units to IBM soon; there's no concern. We also have a batch mp 200-999 that was initially skipped. So we've bought a few more weeks, and are working towards the next experiment. Some of the analysis we've performed for the paper-in-progress has inspired new ideas.
To answer rilian, we have already run about 100 different species through the HPF pipeline. The unifying factor is that all of these organisms provoke particular interest in the scientific community. Many are parasites and disease causing species that affect humans, some are important for studying human food sources, etc. It's safe to say that rice is one of the most important food sources for humans, if not the most heavily consumed food source. Improving annotation of the rice proteome, as well as the other organisms we've folded, greatly increases the available scientific resources for researchers studying any of these species.
As for folding all of the human proteome... There are somewhere around 30k protein coding regions, of which we identified near 70k uniquely folding protein domains. Many of these can be matched to known protein structures, above 50%, using sequence based similarity methods. We've folded thousands on the WCG, but not everything. Rosetta's effectiveness diminishes due to a number of factors including, but not limited to, protein length, disorder, and trans-membrane regions. We've already run everything that passed our filters.
Project status update for March 2010
In short, as I understood it: by studying particular stable amino acid sequences in the proteins predicted by HPF2, from one generation of a protein to the next (the protein data are taken from different populations of a single animal or plant species), the project's scientists look at which evolutionary factors cause which changes in protein structure and, where possible, which new functions the proteins acquire.
Then, using probabilistic methods (a search across a large database of results, which the project plans to run with WCG computing power), the scientists will be able to find out
1) which evolutionary factors have acted on proteins whose functions are still unknown.
2) which functions proteins may acquire under particular evolutionary factors
The latter seems especially relevant to me for practical use in creating new genetically modified organisms.
So, the article
http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html
HPF2 Update - March 2010
Greetings WCG Volunteers,
We've been working diligently to develop a pipeline for a cooperative analysis of phylogenetic and structural data. We will integrate our structure predictions with knowledge of how proteins (and functional sites on folded proteins) evolve, by estimating the phylogenies of all protein domain families in our database and identifying positively-selected amino acid sites in these families using codon-based molecular evolution models that can be mapped onto the predicted structures. The first stages of this analysis are coming to fruition, and we've begun investigating preliminary results.
Using phylogenetic models, we intend to identify sites of proteins exhibiting evolutionary pressure. This may improve our understanding of how proteins evolve new functions and structures, and will ultimately lead to an increase in genome annotation for proteins whose purpose we know next to nothing about. The great scale of and wealth of information in our database may allow us to improve upon our existing and future de novo structure and function predictions. Identifying structurally or functionally important residues in protein domains should inform our comparative modeling techniques. We use probabilistic methods to produce models of evolution using observed rates of mutation in protein families. Lots of different evolutionary pressures affect the mutation and expression of proteins, but we hope to garner insight with this analysis about how evolution adapts protein function.
Using our automated methods, we produced evolutionary models for a handful of identified protein domain families in major plant genomes. One such protein family matches http://www.pdb.org/pdb/explore/explore.do?structureId=1TQE "Myocyte Enhancer Factor-2". While this analysis is very preliminary (and I stress preliminary), positive selection analysis identifies a few residues that may be involved in DNA binding and the integrity of the dimer near the substrate. This is the kind of science we'll be investigating in the future using WCG predicted structures.
--
Patrick Winters
Bonneau Lab
A summary of the preceding text
Organisms from the various branches of the tree of life share a lot in common. Wildly different organisms share much of the same molecular machinery, and as you group them into smaller categories their proteomes begin to look very similar. Using protein sequence similarity we can identify proteins from multiple organisms that clearly shared a common ancestor, and biologists have developed algorithms for determining their evolutionary relationships. From these relationships we can infer how evolutionary pressures change proteins... encourage or discourage mutations at certain places in the protein. You can imagine that some portions of a protein might be very important for carrying out its task and don't show many mutations, some portions just mutate randomly (neutral drift), and others seem to mutate wildly (perhaps as the protein develops a new function).
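The idea that some positions in a protein family barely change while others vary a lot can be made concrete with a small sketch. The code below is not the codon-based positive selection analysis the Bonneau Lab describes; it is a much simpler stand-in that computes the Shannon entropy of each column in a protein alignment. The alignment is a hypothetical toy example invented here, and plain variability like this cannot tell neutral drift apart from positive selection.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def site_variability(alignment):
    """Entropy for every column of a list of equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy([seq[i] for seq in alignment]) for i in range(length)]


# Hypothetical toy alignment, invented for illustration only.
toy_alignment = [
    "MKVLA",
    "MKILG",
    "MKVLS",
    "MKLLT",
]

if __name__ == "__main__":
    for position, h in enumerate(site_variability(toy_alignment), start=1):
        print(position, round(h, 2))
```

Columns with entropy near zero are fully conserved in the toy data, while higher values mark positions where the sequences disagree. The real analysis works on codons and phylogenetic trees rather than raw residue counts.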
We've begun to perform this kind of analysis on some major plant protein families. It remains to be seen what kind of evolutionary trends we'll discover, but on a per protein basis this information can be very important to researchers. In my example I show how our analysis suggests that the DNA binding portions of a particular protein family are undergoing some sort of adaptive change.
Now we can perform this kind of analysis irrespective of structure predictions since it is based on protein sequence, but integrating it with WCG structure predictions is a primary goal of the project. The best part is that it doesn't require re-running any results from the WCG, and the trends we discover can be used to better select models from WCG runs and better identify functional properties.
We will be running a new Windows build for HPF2 through beta soon. This new build is to address the "ERROR:: Exit at: .\dock_structure.cc line:401" error. We have seen good results in our internal testing environment. Any members who have machines that experience this error may want to try and get some of the beta workunits. We appreciate the members' ongoing patience and help in the forums while we are working on this issue.
Thanks,
armstrdj
And, like, yes, THERE ARE BETA WUs!
Got the sapphire medal for 2 CPU years in this project
An update from the project coordinators!
Hello,
We've been stalling starting any large experiments in anticipation of our next big initiative. I've been developing more work units as necessary, folding Candidatus Desulforudis audaxviator (an interesting bacterium).
From Wikipedia:
The beta test went well, and version 6.17 is already in service. It fixes various errors when running on win32 and especially on 64-bit server versions of Windows®
Meanwhile, the project's scientists are preparing a paper about the current research and experiments,
Hi,
I just wanted to add that I've verified a few proteins run with the new executable and they look good. Also, I'll update everyone with a status update soon, but I just wanted to add that I've sent IBM more work units. We're going to fold Chlamydomonas reinhardtii in the interim before we get to the microbiome project. It's a nice little single celled alga whose metabolism is being exploited to create clean sources of hydrogen.
Here's two pics of the beta results NG590 and NG592.
--
Patrick
Human Proteome Folding Scientist
I'll be happy to update our experiments table and post a status update when I get a chance. It's been a little hectic here for me. While I'm not technically leaving the lab, I'm leaving NY in a couple weeks. I have quite a backlog of re-processing to do for our publication before I leave.
A short update would be that we have plenty more work queued up. I added "Chlamydomonas reinhardtii" into the pipeline (wikipedia it for a description). It's a few months worth of work and it will give us time to set up the Human Microbiome proteins (mentioned in previous posts) which will run for a very long time.
I understand you guys want to know what's up. I'll tell you more when I have time. Suffice it to say, there hasn't been any slowdown or delay with the project and we are still producing great structure predictions. Now we just need to finish this manuscript and get the whole thing publicized! We know the WCG predictions are going to be a hugely popular resource.
Patrick Winters
Human Proteome Folding Scientist
It's been quiet in this thread for a while. No news to be heard
Damn, two of my machines are giving the following
<message>
CreateProcess() failed - (0x5)
</message>
]]>
KIS seems to be configured fine. Maybe someone knows what needs to be done?
You can reset the project; there is a button for that in BOINC. Just make sure the other WCG tasks have finished first
It will reset all WCG projects. At most, the number of points this host has earned in this project may be reset; the overall statistics won't be lost
Damn, what is this: tasks run for less than a minute and return "Computation error"
Looks like we should add a point about antivirus software to our standard header post on joining BOINC projects
Yes, actually the WCG executables on Windows don't have the .exe extension. That should go into the header post too..
So in Proteome each task gets computed 15 times? Was it like that before too?
corsar83, it has always been like that. The tasks are actually slightly different. More details, in English, on the official forum..
Press release for October 2010
http://homepages.nyu.edu/~rb133/wcg/thread_2010_10_10.html
The long-awaited update!
http://homepages.nyu.edu/~rb133/wcg/thread_2011_01_31.html
HPF2 Update - January 2011
Greetings WCG Community,
Happy new year! I have taken it upon myself to write a quick status update regarding the Human Proteome Folding project, in part to introduce myself and also to summarize a few of the tasks that lie ahead for us.
As of early January, the Bonneau lab team working on the HPF project has undergone a slight change of staff. Patrick Winters, from whom you have heard in previous updates, has moved on to other pastures, and I have taken his place. My name is Duncan Penfold-Brown, and I am coming out of a previous research position in the application of high-powered computing (both Grid and Cloud) to high-energy physics and astronomy. I have a degree in computer science, and have in the past focused on distributed and self-organizing systems. I also have experience (and a great deal of interest) in bioinformatics, which I look forward to increasing in my work on this project.
In short, I will be responsible for support and development of HPF projects and research. I am interested in pursuing a greater knowledge of proteomics - which seems to me like biological puzzle solving - and also the applications (medical, investigative) of the research we are completing together.
The immediate tasks we are approaching are to work on improving our pre- and post-analysis tools, in order to get more data to the Grid, and to better interpret what comes out. Goals for the immediate future include continuing the incorporation of phylogenetic data and evolutionary analysis of protein domains into our analysis process in order to enhance our annotation of select unknown proteins. This incorporation will provide us with a better understanding of the function of unknown proteins, as we can more accurately identify evolutionarily similar structures and cross-examine their structure-function relationships. With additions to our analysis process and continued work, we will ultimately be improving the end data of our research.
Speaking of data, here is a quick update of what is currently being turned over on the WCG:
Currently, all protein data being folded is from the Human Microbiome Project (HMP - see their site at http://commonfund.nih.gov/hmp/), with a focus on microbes found in the human gut. As of late January, we have completed pre-analysis on the last of the data that we will be working on from the HMP (for now), and have dropped it off to be picked up by the grid (see codes 'oh' - 'ok' in the following table).
We are now looking into new organisms - such as the malaria parasite Plasmodium yoelii yoelii (a model rodent malaria important for understanding the function of human malaria) - to analyze and send to the grid.
The following table describes the data that has recently been or is being processed by the WCG:
Crunching the Americans' microbiome for them. Too much of an honor. I want my own home-grown kefir.
Summary
The Human Proteome Folding project researchers have published a paper in the journal Genome Research, which announces the availability of their database of predicted protein structures, their validation methods and how this augments other information about these proteins, thus helping to solve a critical problem for biologists.
World Community Grid Lecture Series - Human Proteome Folding project
Dear ******,
You are invited to participate in a live webcast on October 21, 2011 to hear an overview and update on World Community Grid's Human Proteome Folding project. The event will be hosted by Dr. Richard Bonneau from New York University.
Since 2006, World Community Grid has had the privilege of supporting the innovative research underway at New York University to use computers to predict the structure of proteins, the "molecular machines" of the human body. Knowing protein structure is a critical step in advancing the understanding of how proteins affect human health, providing scientists with the information they need to develop new cures for human diseases.
This is the "Human Proteome Folding - Phase 2" project that many of you run every day on your laptops and PCs for World Community Grid, helping us make progress towards aiding researchers in understanding how proteins perform their intended functions and also how diseases prevent proteins from maintaining healthy cells.
The webcast will take place on October 21, 2011, starting promptly at 11:00AM Eastern Daylight Time (USA), which is 15:00 Coordinated Universal Time. Please join a few minutes early so that you're sure not to miss anything.
Participants can listen to Dr. Bonneau while viewing an on-screen presentation. Time permitting, you will be able to ask Dr. Bonneau questions via a text chat interface.
Access to the webcast is via this link: https://apps.lotuslive.com/meetings/join?id=0327108
You can check if your computer is ready for the webcast at this link: https://www.conferenceservers.com/browser?brand=LLENGAGE_EN-US
And whether or not you can join the webcast, make sure your laptop, PC or Mac is running World Community Grid, and let your friends know this easy way to participate in helping humanity!
Also, please note that World Community Grid has added three new download servers to help support our additional growth. Download servers are used to send work to your computer. As a result of this change, your computer may prompt you to communicate with the IP addresses of these new servers. If you have experienced this, please click on this link for further information: http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,31492
Thank you,
The World Community Grid Team
P.S. After the webcast we will post the video of the webcast on YouTube, in the World Community Grid News & Update section, and we'll send you a link to the video.
The Human Proteome Folding project research scientists have posted an informative status update on their web site. They highlight their recently published paper in Genome Research and an upcoming paper about the evolution of proteins. Future work is also discussed, including some work which should help the scientific community working on malaria.
You may review their update https://files.nyu.edu/rb133/public/wcg/thread_2011_11_11.html.
World Community Grid Post - HPF2 Update, November 2011
Greetings to everyone,
It's been a stretch since the last update, but excitingly (!), we've been quite busy wrapping up ongoing projects with publications, and also getting our teeth into new projects and data. So, without further ado, I'd like to first mention our accepted and pending publications, and then go over the new data we're crunching and where it is leading us.
The lab has been very excited to recently have two gargantuan efforts come to fruition with the acceptance of one paper and the completion and submission of a second. The first, Kevin Drew (et al.)'s, is an enormous work covering nearly everything we do in terms of protein structure and function prediction, and was made conceivable in the first place and achievable in the second by support of World Community Grid computing cycles.
The paper will be available in the journal Genome Research this month (November 2011). The abstract is as follows, and the lab spent extra to ensure an open license so that the paper could be viewed in full - take a look!
The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition and grid-computing enabled de novo structure prediction. We predict protein domain boundaries and 3D structures for protein domains from 94 genomes (including Human, Arabidopsis, Rice, Mouse, Fly, Yeast, E. coli and Worm). De novo structure predictions were distributed on a grid of over 1.5 million CPUs worldwide (World Community Grid). We generate significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.
The paper can be viewed here: http://genome.cshlp.org/content/early/2011/09/16/gr.121475.111.abstract
Also, take a quick look at this seminal image from the paper - predicting domain boundaries, and using the grid to do de novo structure prediction for unknown domains:
My pending validation for this project has been dragging on for about 5 days now; the results are taking a long time to get validated...
Resizing of HPF2 work units
For future work units, we have decreased the average run time from 9 hours to 6 hours for this project. This will allow users with slower computers or computers which are available less time to have a better chance of completing work units for this project. It will take about 20 days for the existing longer work units to be sent out, so the new shorter work units won't be seen until after this time.
Seippel
Apr 13, 2012
From now on, all tasks will take 6 hours on average
http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,32972
The paper "The Plant Proteome Folding Project: Structure and Positive Selection in Plant Protein Families" has been published in the journal Genome Biology and Evolution
Researchers have published a paper in the journal Genome Biology and Evolution, which documents their findings studying a number of plant genomes, their proteomes, evolution and protein structure.
Lay Person Abstract:
Melissa Pentony et al. have presented work considering components of proteins exhibiting faster-than-average evolution in the proteomes of five major plant species, including rice (Oryza sativa) and Arabidopsis thaliana (an important model organism for plant study). They describe new information on the relationship between evolution and protein structure in plants.
The World Community Grid has contributed to this study by providing a much more structurally complete view of unknown and understudied proteins from five plant families than was previously available. The results from the Human Proteome Folding project produced 29,202 protein structures contributing to this project, of which 4,764 were very high-confidence. This should eventually assist agricultural scientists to better understand important plant and food crops, how to breed them for disease resistance, better nutrition and to better handle environmental stress.
Technical Abstract:
Despite its importance, relatively little is known about the relationship between the structure, function, and evolution of proteins, particularly in land plant species. We have developed a database with predicted protein domains for five plant proteomes (http://pfp.bio.nyu.edu/) and used both protein structural fold recognition and de novo Rosetta-based protein structure prediction to predict protein structure for Arabidopsis and rice proteins. Based on sequence similarity, we have identified ~15,000 orthologous/paralogous protein family clusters among these species and used codon-based models to predict positive selection in protein evolution within 175 of these sequence clusters. Our results show that codons that display positive selection appear to be less frequent in helical and strand regions and are overrepresented in amino acid residues that are associated with a change in protein secondary structure. Like in other organisms, disordered protein regions also appear to have more selected sites. Structural information provides new functional insights into specific plant proteins and allows us to map positively selected amino acid sites onto protein structures and view these sites in a structural and functional context.
Access to Paper:
To view the paper, please visit http://gbe.oxfordjournals.org/content/4/3/360.full
Project status update!
http://bonneaulab.bio.nyu.edu/wcg/thread_2012_07_01.html
Paper published in the journal Molecular Cell using Human Proteome Folding project results
http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=204
Summary
A paper was published in the journal Molecular Cell, which used results from the Human Proteome Folding project in identifying proteins which regulate processes in human cells.
Paper Title:
“The mRNA-Bound Proteome and its Global Occupancy Profile on Protein-Coding Transcripts”
Lay Person Abstract:
The Bonneau lab at NYU collaborated with Markus Landthaler and colleagues from the Max Delbrück Center for Molecular Medicine, Berlin, contributing to an effort to discover and study novel RNA-binding proteins in the human proteome. These proteins play an important role in regulating activity in the cell. Some of the proteins have been implicated in diseases such as Alzheimer’s, muscular diseases, cancers and others. This information should help scientists further understand disease processes, possibly leading to better treatments.
The Landthaler group at the MDC put together a landmark experiment for discovering RNA-binding proteins - a type of protein extremely important to human genetic systems. They then contacted the Bonneau lab for computational analysis. World Community Grid has provided predicted structures for a more complete structural landscape, contributing greatly to the analysis of human protein structure and function. This analysis allowed the Bonneau lab to verify experiment results from the Landthaler lab, lending confidence to their methods and providing data on RNA-binding proteins found via experimental methods. Furthermore, cutting-edge function prediction methods were developed and proved in this experiment, which will feature World Community Grid data in future publications.
Technical Abstract:
Protein-RNA interactions are fundamental to core biological processes, such as mRNA splicing, localization, degradation, and translation. We developed a photoreactive nucleotide-enhanced UV crosslinking and oligo(dT) purification approach to identify the mRNA-bound proteome using quantitative proteomics and to display the protein occupancy on mRNA transcripts by next-generation sequencing. Application to a human embryonic kidney cell line identified close to 800 proteins. To our knowledge, nearly one-third were not previously annotated as RNA binding, and about 15% were not predictable by computational methods to interact with RNA. Protein occupancy profiling provides a transcriptome-wide catalog of potential cis-regulatory regions on mammalian mRNAs and showed that large stretches in 3′ UTRs can be contacted by the mRNA-bound proteome, with numerous putative binding sites in regions harboring disease-associated nucleotide polymorphisms. Our observations indicate the presence of a large number of mRNA binders with diverse molecular functions participating in combinatorial posttranscriptional gene-expression networks.
Access to Paper:
To view the paper, see http://www.cell.com/molecular-cell/abstract/S1097-2765%2812%2900437-6
The average run time of all work units has increased by 15%!
Project news
World Community Grid Post - HPF2 Update, June/July 2012
http://bonneaulab.bio.nyu.edu/wcg/thread_2012_07_01.html
World Community Grid Post - HPF2 Update, Fall/Winter 2012
http://bonneaulab.bio.nyu.edu/wcg/thread_2012_12_04.html
Damn...
Now I don't even know what to crunch, everything is already at sapphire...
The project has suddenly come to an end - http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=299. Judging by the data in the table at http://i137.photobucket.com/albums/q210/Sekerob/WCGYearsPi1Project.png, 24 days of computation remain. I recommend that everyone who wants to earn another badge push hard on this project once the pentathlon is over. Myself included, for example )
Summary
The first project to run on World Community Grid, the Human Proteome Folding project, is coming to a close. They have added greatly to the knowledge of protein structures, providing their results to other scientists via their database resources.
The first project to run on World Community Grid, the Human Proteome Folding project, is coming to a close.
They have greatly added to the knowledge of protein structures, providing their results to other scientists via their database resources (http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=291). In addition, the project has published many high-quality papers (http://www.worldcommunitygrid.org/research/hpf2/news.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1). These publications and the database resources have helped many other scientists with their own work to understand disease processes and to accelerate their search for cures.
We are a little sad to see the project ending in a few weeks, but we are also very proud of this project's accomplishments. Please read their post (http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html) for more details.
We thank you, our member volunteers, for contributing to this project and we hope you will support our other active projects (http://www.worldcommunitygrid.org/research/viewAllProjects.do), as well as the many new ones we expect to launch before too long.
Wow, cool!
They will probably release a third phase with the new Rosetta@home algorithms.
Taken from here: http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html
There are some exciting research possibilities he and others are considering such as investigating how mutations alter protein structure. Perhaps one of these ideas may grow into a new World Community Grid project at some time in the future.
21 days left until the project is complete!
I suggest that everyone who has not enabled it yet do so and help speed things up!
Work units will be handed out for 1 more week! Stock up your cache if you are hunting for a badge.
Only a couple of days' worth of new work units remain!
After that, only work units for recomputation will be issued.
New work units are no longer being issued.
The ones already issued will keep being processed for another couple of weeks, and then the project will be complete!
Human Proteome Folding Project - Phase 2: Grid phase complete
The project is complete!
Thanks to the project's data, papers have been published in scientific journals (see the link): http://www.worldcommunitygrid.org/about_us/displayNews.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1
The grid-computing phase of the Human Proteome Folding - Phase 2 project is now complete. It was a massive project, launched in June 2006, and was the second-longest-running World Community Grid project to date. Volunteer members contributed over 123,000 CPU-years of computing power to run simulations and help determine the structure of proteins. The researchers at the Bonneau lab are very thankful to our members for this support, without which the project would have been impossible.
The researchers have made the protein structure data calculated during this project available to scientists around the world through a public database. This data has led to the publication of papers (http://www.worldcommunitygrid.org/about_us/displayNews.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1) in academic journals, and these results have helped in better understanding the proteins involved in many diseases and have led to further research into how they might be treated. Work will continue for some time as scientists continue to analyze the protein structures.
For more details, please see the latest post from the Bonneau lab team http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html, as well as their more detailed recent post http://bonneaulab.bio.nyu.edu/wcg/thread_2013_06_17.html.
http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=308