Distributed Computing in Ukraine _ Completed WCG Projects _ Human Proteome Folding, Phase 2

Author: Rilian Jun 11 2008, 15:33

Human Proteome Folding Project
Phase 2


http://homepages.nyu.edu/~rb133/wcg/rbonneau_posts.html
http://homepages.nyu.edu/~rb133/wcg/experiments.html
http://commonfund.nih.gov/hmp/

http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html

http://distributed.org.ua/forum/index.php?showtopic=890

Proteins are essential to living beings. Just about everything in the human body involves or is made out of proteins.

What are proteins?
Proteins are large molecules that are made of long chains of smaller molecules called amino acids. While there are only 20 different kinds of amino acids that make up all proteins, sometimes hundreds of them make up a single protein.

Adding to the complexity, proteins typically do not stay as long chains. As soon as the chain of amino acids is built, the chain folds and tangles up into a more compact and particular shape that lets it conduct specific and necessary functions within the human body.

Proteins fold because the different amino acids like to stick to each other following certain rules. Imagine that amino acids are pop-beads of 20 different colors. The pop-beads are sticky, but sticky in such a way that only certain combinations of colors can stick together. This makes the amino acid chains fold in a particular way that creates proteins that are useful to the human body. Human cells have mechanisms to help the proteins fold properly and, equally important, mechanisms to get rid of improperly folded proteins.

How do proteins relate to human genes?
The collection of all of the human genes is known as "the human genome." Depending on how the genes are counted, there are over 30,000 genes in the human genome. Each gene, which is a section of a long chain known as DNA, dictates how to build the chain of amino acids for one of the 30,000 proteins. In recent years, scientists were able to map the sequence for each human gene. This means that we now know the sequence of amino acids in all of the human proteins. Thus, the human genome is directly related to the "human proteome," the collection of all human proteins.

The protein mystery
While researchers have learned a great deal about the human proteome, the functions of most of the proteins remain a mystery. The genes do not reveal exactly how the proteins will fold into their final shape, which is critical because that determines what a protein can do and what other proteins it can connect to or interact with.

Proteins are like puzzle pieces. For example, muscle proteins connect to each other to form a muscle fiber. They join together in a specific manner because of their shape, as well as other factors relating to the shape.

Everything that goes on in cells and in the body is controlled very specifically by the shapes of proteins, which either do or do not let them interlock with other proteins. For example, the proteins of a virus or bacterium may have particular shapes that enable it to break through the cell membrane, allowing it to infect the cell.

The Human Proteome Folding Project
Knowledge of protein structures will let scientists understand how proteins perform their biological functions, and how diseases prevent proteins from carrying out the functions needed to keep cells healthy.

The Human Proteome Folding Project will combine the power of millions of computers in a grid to help scientists understand how human proteins fold. The work to be done in this monumental task is shared across this grid, so that results can be achieved far sooner than would be possible with conventional supercomputers. With a greater understanding of protein structure, scientists can learn how diseases work and ultimately find cures for them.

When your grid agent is running, it is folding an amino acid chain in various ways and evaluating how well each folding follows the specific rules of how specific amino acids stick together or not. As computers try millions of ways to fold the chains, they attempt to fold the protein in the same way that it actually folds in the human body. The best shapes identified for each protein are returned to the scientists for further study.
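The "try many folds, score each one, keep the best" idea described above can be illustrated with a toy sketch. This is not Rosetta and not the HPF2 science application: it assumes a simplified two-colour bead chain on a 2D grid, where "H" beads stick to each other and "P" beads do not, and all function names are hypothetical.

```python
import random

# The four grid directions a chain can grow in.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_fold(length, rng):
    """Grow a random self-avoiding chain on a 2D grid.

    Returns a list of (x, y) positions, or None if the walk traps itself.
    """
    fold = [(0, 0)]
    used = {(0, 0)}
    for _ in range(length - 1):
        x, y = fold[-1]
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in used]
        if not free:
            return None
        step = rng.choice(free)
        fold.append(step)
        used.add(step)
    return fold


def contacts(chain, fold):
    """Count pairs of 'H' beads that touch on the grid without being chain neighbours."""
    index_at = {pos: i for i, pos in enumerate(fold)}
    count = 0
    for i, (x, y) in enumerate(fold):
        if chain[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 skips chain neighbours and counts each pair once.
            if j is not None and j > i + 1 and chain[j] == "H":
                count += 1
    return count


def best_fold(chain, tries, seed=0):
    """Try `tries` random folds and keep the one with the most sticky contacts."""
    rng = random.Random(seed)
    best, best_score = None, -1
    for _ in range(tries):
        fold = random_fold(len(chain), rng)
        if fold is None:
            continue
        score = contacts(chain, fold)
        if score > best_score:
            best, best_score = fold, score
    return best, best_score
```

Calling `best_fold("HPHPPHHPHH", tries=5000)` returns the best-scoring shape found together with its contact count. The real project evaluates far richer rules for how amino acids stick together; the sketch only mirrors the search loop.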

-----

Turns out Rosetta is used here too.

Project chart
IPB Image

Author: nikelong Jun 11 2008, 16:37

Rosetta is watching you!

P.S. When are you going to fix your table?

Author: (_KoDAk_) Jun 11 2008, 23:33

yes.gif

Author: Rilian Jan 2 2009, 16:30

The opening post really ought to be translated...

In short: using the Rosetta project's algorithms, this project computes more accurate structures (than those currently in scientists' databases) of proteins in the human body and of its pathogens. A pretty important thing.

Author: Rilian Jan 4 2009, 19:27

I've racked up 100 CPU days in the project.

That came to 372 WUs and earned 40,000 BOINC points, plus the Human Proteome Folding 2 gold medal.
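For a rough sense of scale, the figures in this post can be turned into per-WU averages. A minimal sketch, assuming the three numbers are exact totals for the same period; the function name is made up for illustration.

```python
def averages(cpu_days, work_units, points):
    """Derive simple averages from total CPU days, work units and BOINC points."""
    return {
        "hours_per_wu": cpu_days * 24 / work_units,
        "points_per_wu": points / work_units,
        "points_per_cpu_day": points / cpu_days,
    }


# Totals quoted in the post above: 100 CPU days, 372 WUs, 40,000 points.
stats = averages(cpu_days=100, work_units=372, points=40_000)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these totals the average works out to roughly six and a half CPU hours per WU. Individual tasks vary a lot, so this is only an average.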

Author: Rilian Jan 6 2009, 04:44

Richard Bonneau, head scientist of the Human Proteome Folding Project, was active in the original development of Rosetta at David Baker's laboratory while obtaining his PhD. More information on the relationship between HPF1, HPF2 and Rosetta@home can be found on Richard Bonneau's website http://homepages.nyu.edu/~rb133/wcg/rbonneau_posts.html

Author: Rilian Jan 11 2009, 04:16

Computed 500 proteins in 142 CPU days.

Author: Rilian Jan 13 2009, 21:16

Announcing a mini-competition: who will be the first to crunch 1,000 WUs of the project? The winner gets the audience choice award.

Race participants:
rilian
Vzhik

Author: nikelong Feb 13 2009, 15:59

http://www.yeastrc.org/pdr/pages/search/advancedSearchForm.jsp

http://www.boinc-af.org/content/view/543/287/

http://www.boinc-af.org/content/view/668/219/

http://www.dp.by/wiki/Projects/Humanproteomefoldingproject

Author: Rilian Mar 6 2009, 22:07

QUOTE(Rilian @ Jan 13 2009, 21:16) *

Announcing a mini-competition: who will be the first to crunch 1,000 WUs of the project? The winner gets the audience choice award.

Race participants:
rilian
Vzhik

Vzhik wins the audience choice award for being the first to crunch 1,000 WUs!

Author: cosmo_vk Mar 7 2009, 08:04

By the way, has anyone had problems with tasks computing?
It has happened to me several times now: judging by CPU load the task seems to be running, but the progress is stuck. Only restarting BOINC fixes it.

Author: Rilian Mar 7 2009, 15:23

No, but on Vista I have problems specifically with HPFP2...

Author: Rilian Mar 25 2009, 02:10

QUOTE
Hello,

It's been some time since we last updated you on what we're doing. I hope you'll all forgive me. With so much data and analysis to attend to, I've been pretty busy. A couple of months ago, when I started in the Bonneau lab, a lot of time was spent learning the basics of the project. Things are running smoothly now and there will be more time for updates and interesting research!

http://homepages.nyu.edu/~rb133/wcg/thread_2009_03_24.html

I hope you all enjoy the Cytoscape network. Thanks again.
--
Patrick Winters
Bonneau Lab


Author: cosmo_vk Apr 1 2009, 18:20

The computations do stall after all; now the same glitch has shown up on my home machine too.

The second task from the top shows a run time of 19:33:22. In that much time I usually get through 5-6 tasks, and here a single one is taking that long.

Author: Rilian Apr 1 2009, 20:50

On my 2 GHz Xeons a task sometimes takes as long as 20 hours...

Author: cosmo_vk Apr 2 2009, 06:49

Nah, mine take around 3-4 hours. If it takes longer, it's a glitch, and that was the case with this task. I restarted BOINC and the task finished in a couple of hours.

By the way, this was also discussed on the WCG forum:
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=24981

Author: Rilian Apr 2 2009, 11:31

Ah... well, yes, this happens very rarely in HPFP2. Maybe they'll fix it someday.

Author: Rilian Apr 8 2009, 20:54

Patrick Winters keeps treating us to project status updates. Since the task database has no pretty front end or web interface with bells and whistles, Patrick has promised to periodically update, on his home page, the list of experiments currently being computed.

http://homepages.nyu.edu/~rb133/wcg/experiments.html

HPF2 Experiments - Updated April 2009

Code  Organism                            Range
mc    Trypanosoma cruzi strain CL Brener  238-999
md    Trypanosoma cruzi strain CL Brener  000-999
me    Trypanosoma cruzi strain CL Brener  000-999
mf    Trypanosoma cruzi strain CL Brener  000-999
mg    Trypanosoma cruzi strain CL Brener  000-999
mh    Trypanosoma cruzi strain CL Brener  000-999
mi    Trypanosoma cruzi strain CL Brener  000-822
mi    Plasmodium knowlesi                 823-999
mj    Plasmodium knowlesi                 000-999
mk    Plasmodium knowlesi                 000-998
ml    Plasmodium knowlesi                 000-999
mm    Plasmodium knowlesi                 000-999
mn    Plasmodium knowlesi                 000-325

Author: vitalidze1 May 28 2009, 16:09

cosmo_vk,
I sometimes get glitches like that on my computers too, but only on the 2 machines at work, and only when the queue contains nothing but rice tasks. If it's a mix, everything is OK.

Author: cosmo_vk May 29 2009, 16:41

Nah, rice runs fine for me.
The glitch with this project seems to have gone away after I reset the whole of WCG.

Author: Rilian May 29 2009, 16:51

That's not down to the project reset or anything like it. There is a bug in the HPF2 science application, and it shows up from time to time. It hasn't been fixed yet.

Author: cosmo_vk May 29 2009, 16:57

For now it isn't showing up, and that's the main thing. Admittedly, I've throttled back on WCG and moved a fairly large share of my capacity over to POEM.

Author: Rilian May 29 2009, 17:01

For me it crops up about once per 1,000 WUs.

Author: Rilian Oct 28 2009, 12:32

Got the emerald medal for 1 year of CPU time.

Project status as of 1 Nov 2009!

http://homepages.nyu.edu/~rb133/wcg/thread_2009_11_01.html


Author: Rilian Oct 31 2009, 21:26

Press release of October 28, 2009

HPF2 Update - November 2009

Greetings WCG Volunteers,

As the first World Community Grid project, we'd like to celebrate the WCG's anniversary with a recap of all the contributions to protein science that your work has made. Over the past few years, WCG volunteers have provided over 50,000 CPU years (as calculated by the WCG) and folded tens of thousands of protein sequences. Often there is very little known about the sequences we've folded, and WCG protein structure predictions provide the only available annotations for scientists studying these proteins. Biologists from different disciplines have used our structure predictions to make informed decisions about experiments and infer protein functions and molecular processes.

In the early stages of our project, an effort was made to make focused predictions for proteins of interest. The yeast proteome was originally targeted for the vast amount of other experimental data available.

Publication http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0050076
We predicted protein structures to further annotate this genome and complement the array of protein interaction and molecular function information on this heavily studied model organism. Our results confirmed the feasibility of extending our approach to other less studied, larger proteomes.

A cross section of organisms (including Human, Mouse, Fly, E. coli, Worm, and other unique organisms) have been processed completely, and protein sequences of unknown structure have been folded by the WCG. Our database has grown to include over a million protein sequences, and WCG predictions are complemented by known structures and a host of other structure and sequence metrics. We regularly receive special requests for predictions for proteins of varying kinds (including but not limited to those related to HIV infection, the development of Malaria, and particular bacterial enzymatic processes).

A few high profile uses of our database include:

Publication http://dx.doi.org/10.1016/j.cell.2007.10.053
Here we used our structure predictions to find transcription factors, the proteins that turn on and off genes. These predicted transcription factors proved critical (and accurate) in building the genome wide circuit for this organism. The general application here is environmental bioengineering and systems biology.

Publication http://dx.doi.org/10.1016/j.cell.2008.07.009
Here our predictions were used to map the boundaries between functional parts of proteins. This allows for a whole new way of looking at how proteins interact and co-function to form a working system that the cell relies on. The general application here is broad, as this describes a dataset all types of biologists will use.

Publication http://dx.doi.org/10.1084/jem.20061400
Here we predicted the structure of key immune proteins, resulting in a prediction that allowed us to re-engineer a key immune receptor, allowing for a better animal model of innate immune responses (key to figuring out several aspects of our response to bacterial infection). This publication has direct application to immunology and fighting infectious disease.

Recently, we've been working towards a paper that will describe our new methods, highlight our successes, and publicize the already open access to our database. This year we've received an average of 6,300 unique visitors a month. That's over 200 users a day (including weekends)! With the publication of our new methods we expect a significant increase in exposure and are preparing to provide multiple means of user-friendly access for the sometimes complex data. This will include using BioNetBuilder.

Publication http://bioinformatics.oxfordjournals.org/cgi/content/abstract/23/3/392

Future work will undoubtedly involve the refinement of our protein structure annotations. We're investigating methods for incorporating evolutionary information into our predictions, and overhauling parts of the pipeline that are outdated. There is significant room for improvement in our methods for selecting native-state conformations from structure predictions and assigning family annotations. With the WCG we've been able to cast a wide net, and now we're interested in the improvement of our algorithms and classifiers. WCG predictions will continue to provide data for our ever improving experiments and value to the scientific community.

Here at the Bonneau Lab, we thank you for your dedication to science and ask that you keep crunching!
--
Patrick Winters
Bonneau Lab

As you can see, with the help of the Human Proteome Folding, Phase 2 project, a lot of research has been done over six months, along with 5 publications.

Author: Rilian Mar 17 2010, 00:28

We'll be uploading more work units to IBM soon; there's no concern. We also have a batch mp 200-999 that was initially skipped. So we've bought a few more weeks, and are working towards the next experiment. Some of the analysis we've performed for the paper-in-progress has inspired new ideas.

To answer rilian, we have already run about 100 different species through the HPF pipeline. The unifying factor is that all of these organisms provoke particular interest in the scientific community. Many are parasites and disease causing species that affect humans, some are important for studying human food sources, etc. It's safe to say that rice is one of the most important food sources for humans, if not the most heavily consumed food source. Improving annotation of the rice proteome, as well as the other organisms we've folded, greatly increases the available scientific resources for researchers studying any of these species.

As for folding all of the human proteome... There are somewhere around 30k protein coding regions, of which we identified near 70k uniquely folding protein domains. Many of these can be matched to known protein structures, above 50%, using sequence based similarity methods. We've folded thousands on the WCG, but not everything. Rosetta's effectiveness diminishes due to a number of factors including, but not limited to, protein length, disorder, and trans-membrane regions. We've already run everything that passed our filters.

CODE
https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,28336_lastpage,yes#271805

Author: Rilian Mar 25 2010, 02:43

Project status update for March 2010

In short, as I understand it: by studying particular conserved amino acid sequences in the HPF2-predicted proteins from one generation of a protein to the next (the protein data come from different populations of a single animal or plant species), the project's scientists look at which evolutionary factors cause which changes in protein structure and, where possible, which new functions the proteins acquire.

Then, using probabilistic methods (a search over a large database of results, which the project plans to carry out using WCG's computing power), the scientists will be able to find out:

1) which evolutionary factors proteins with as yet unknown functions have been subjected to;
2) which functions proteins may acquire under particular evolutionary factors.

The latter seems especially relevant to me for practical use in creating new genetically modified organisms.

So, the article:

http://homepages.nyu.edu/~rb133/wcg/thread_2010_03_10.html

HPF2 Update - March 2010

Greetings WCG Volunteers,

We've been working diligently to develop a pipeline for a cooperative analysis of phylogenetic and structural data. We will integrate our structure predictions with knowledge of how proteins (and functional sites on folded proteins) evolve, by estimating the phylogenies of all protein domain families in our database and identifying positively-selected amino acid sites in these families using codon-based molecular evolution models that can be mapped onto the predicted structures. The first stages of this analysis are coming to fruition, and we've begun investigating preliminary results.

Using phylogenetic models, we intend to identify sites of proteins exhibiting evolutionary pressure. This may improve our understanding of how proteins evolve new functions and structures, and will ultimately lead to an increase in genome annotation for proteins whose purpose we know next to nothing about. The great scale of and wealth of information in our database may allow us to improve upon our existing and future de novo structure and function predictions. Identifying structurally or functionally important residues in protein domains should inform our comparative modeling techniques. We use probabilistic methods to produce models of evolution using observed rates of mutation in protein families. Lots of different evolutionary pressures affect the mutation and expression of proteins, but we hope to garner insight with this analysis about how evolution adapts protein function.

Using our automated methods, we produced evolutionary models for a handful of identified protein domain families in major plant genomes. One such protein family matches http://www.pdb.org/pdb/explore/explore.do?structureId=1TQE "Myocyte Enhancer Factor-2". While this analysis is very preliminary (and I stress preliminary), positive selection analysis identifies a few residues that may be involved in DNA binding and the integrity of the dimer near the substrate. This is the kind of science we'll be investigating in the future using WCG predicted structures.

--
Patrick Winters
Bonneau Lab

Image 1: 1TQE (http://www.pdb.org/pdb/explore/explore.do?structureId=1TQE), colored blue, with probability of positive selection highlighted yellow-red.

Image 2: 1TQE (http://www.pdb.org/pdb/explore/explore.do?structureId=1TQE), the two chains colored blue and green, with probability of positive selection highlighted yellow-red.

Image 3: screenshot from embedded Jalview (http://www.jalview.org/) of the family's alignment.

Image 4: screenshot from embedded PhyloWidget (http://www.phylowidget.org/) of the family's phylogenetic tree.

Author: Rilian Mar 25 2010, 23:08

A summary of the preceding text:

Organisms from the various branches of the tree of life share a lot in common. Wildly different organisms share much of the same molecular machinery, and as you group them into smaller categories their proteomes begin to look very similar. Using protein sequence similarity we can identify proteins from multiple organisms that clearly shared a common ancestor, and biologists have developed algorithms for determining their evolutionary relationships. From these relationships we can infer how evolutionary pressures change proteins... encourage or discourage mutations at certain places in the protein. You can imagine that some portions of a protein might be very important for carrying out its task and don't show many mutations, some portions just mutate randomly (neutral drift), and others seem to mutate wildly (perhaps as the protein develops a new function).

We've begun to perform this kind of analysis on some major plant protein families. It remains to be seen what kind of evolutionary trends we'll discover, but on a per protein basis this information can be very important to researchers. In my example I show how our analysis suggests that the DNA binding portions of a particular protein family are undergoing some sort of adaptive change.

Now we can perform this kind of analysis irrespective of structure predictions since it is based on protein sequence, but integrating it with WCG structure predictions is a primary goal of the project. The best part is that it doesn't require re-running any results from the WCG, and the trends we discover can be used to better select models from WCG runs and better identify functional properties.

Author: Rilian Apr 12 2010, 23:59

We will be running a new Windows build for HPF2 through beta soon. This new build is to address the "ERROR:: Exit at: .\dock_structure.cc line:401" error. We have seen good results in our internal testing environment. Any members who have machines that experience this error may want to try and get some of the beta workunits. We appreciate the members on-going patience and help in the forums while we are working on this issue.

Thanks,
armstrdj

And, like, yes, THERE ARE BETA WUs!

Author: Rilian Apr 26 2010, 01:50

Got the sapphire medal for 2 CPU years in this project.

Author: Rilian Apr 26 2010, 20:54

An update from the project coordinators!

Hello,
We've been stalling starting any large experiments in anticipation of our next big initiative. I've been developing more work units as necessary, folding Candidatus Desulforudis audaxviator (an interesting bacterium).

From Wikipedia:

QUOTE
it has survived for millions of years on chemical food sources that derive from the radioactive decay of minerals in the surrounding rock, making it one of the few organisms known that does not depend on sunlight for nourishment and the only species known to be alone in its ecosystem


The current plan, to keep you all informed, is to annotate proteins from the Human Microbiome Project. Compiling the protein sets for this data is proving to be extremely complicated. With such a huge number of sequence reads, we want to make sure we target proteins that are truly novel for the WCG. This is going to be a major undertaking and may result in some new publicity from IBM. With that in mind I don't want to give too much away, but I can assure you we will roll it out with plenty of information.

We have a huge backlog of work here at the lab preparing our manuscript, but things are looking very good. We may continue to produce work units on an "as needed" basis until we can finalize things for the microbiome project. I'll try to keep you all informed.

--
Patrick Winters
Human Proteome Folding Scientist

In short: the project has finished distributing WUs for folding rice proteins. WUs from the Human Microbiome Project subproject are being distributed now. They can't say more about it yet, but apparently a new, separate WCG project in this area is in the works.

As for the rice proteins: a great deal of data has been computed, and they are now preparing a "manuscript" (a paper).

At the moment, proteins from http://ru.wikipedia.org/wiki/Desulforudis_audaxviator, an "interesting bacterium", are being computed.

QUOTE
Desulforudis audaxviator was discovered in 2002 in water samples from the Mponeng gold mine in South Africa, near Johannesburg, at a depth of 2.8 km. Desulforudis audaxviator is approximately four micrometers long. This species does not need sunlight and obtains energy from a reduction reaction involving sulfate (SO4^2-) and hydrogen, which is produced by the decay of radioactive isotopes of uranium, thorium and potassium contained in the rock. Desulforudis audaxviator cannot use oxygen, or even protect itself from its toxic effects.

The bacterium has been isolated from the Earth's surface for several million years, having adapted to survive in extreme conditions: temperatures above 60 °C and pH 9.3. Desulforudis audaxviator is thus both a thermophilic and an alkaliphilic microorganism.

Desulforudis audaxviator is currently the only known species that constitutes a self-sufficient ecosystem, capable of reproducing itself without any contact with the rest of the Earth's biosphere. Because the environment at such depths resembles the early Earth, this gives grounds for hypotheses about which organisms existed before the oxygen atmosphere arose.

It is thought that Desulforudis audaxviator obtained a significant share of its genes from archaea (a separate domain of life) by horizontal transfer.

According to available estimates, bacteria living in such conditions must grow and reproduce incredibly slowly because of the acute shortage of resources. Scientists do not rule out that hundreds or even thousands of years may pass between two cell divisions in such microbes.

Further study of Desulforudis audaxviator will probably resolve some questions concerning the origin of life on Earth.


Keep on crunching!

Author: Rilian May 5 2010, 21:55

The beta test went well, and version 6.17 is already in production. It fixes various errors when running on win32 and especially on 64-bit server editions of Windows.

While the project's scientists prepare a paper on the current research and experiments,

Hi,
I just wanted to add that I've verified a few proteins run with the new executable and they look good. Also, I'll update everyone with a status update soon, but I just wanted to add that I've sent IBM more work units. We're going to fold Chlamydomonas reinhardtii in the interim before we get to the microbiome project. It's a nice little single celled alga whose metabolism is being exploited to create clean sources of hydrogen.

Here's two pics of the beta results NG590 and NG592.

IPB Image

IPB Image

--
Patrick
Human Proteome Folding Scientist

Author: Rilian May 13 2010, 22:58

I'll be happy to update our experiments table and post a status update when I get a chance. It's been a little hectic here for me. While I'm not technically leaving the lab, I'm leaving NY in a couple weeks. I have quite a backlog of re-processing to do for our publication before I leave.

A short update would be that we have plenty more work queued up. I added "Chlamydomonas reinhardtii" into the pipeline (wikipedia it for a description). It's a few months worth of work and it will give us time to set up the Human Microbiome proteins (mentioned in previous posts) which will run for a very long time.

I understand you guys want to know what's up. I'll tell you more when I have time. Suffice it to say, there hasn't been any slowdown or delay with the project and we are still producing great structure predictions. Now we just need to finish this manuscript and get the whole thing publicized! We know the WCG predictions are going to be a hugely popular resource.

Patrick Winters
Human Proteome Folding Scientist

CODE
http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,28940_lastpage,yes#279602

Author: corsar83 Jun 22 2010, 14:00

It's been quiet in this thread for a while. No news to be heard.

Author: Rilian Jun 22 2010, 14:42

QUOTE(corsar83 @ Jun 22 2010, 15:00) *

It's been quiet in this thread for a while. No news to be heard.

We're crunching what's described above ... ^

Author: corsar83 Aug 28 2010, 11:21

Damn, on two machines it gives the following:

<message>
CreateProcess() failed - (0x5)
</message>
]]>

KIS (Kaspersky Internet Security) seems to be configured correctly. Does anyone know what needs to be done?

Author: Rilian Aug 29 2010, 09:24

QUOTE(corsar83 @ Aug 28 2010, 12:21) *

Damn, on two machines it gives the following:

<message>
CreateProcess() failed - (0x5)
</message>
]]>

KIS (Kaspersky Internet Security) seems to be configured correctly. Does anyone know what needs to be done?

http://boincfaq.mundayweb.com/index.php?language=1&view=448

This is caused by something blocking BOINC from starting up the science application. Always check that you allowed BOINC through your firewall and exclude both the BOINC and BOINC Data directories from actively being scanned by your anti virus and anti spyware product(s).

Put BOINC (boinc.exe and boincmgr.exe) in the trusted zone of your firewall and only scan the directory or directories by hand with your anti-virus and other anti-malware software, after you closed down or suspended BOINC.

Author: corsar83 Aug 29 2010, 10:30

That all seems to be done already, but it still won't run. For some reason the proteome executables aren't in the KIS list at all, neither under trusted nor under restricted. The other projects' executables are there. Is there some way to make all the files needed for the proteome project download again from the site?

Author: EKONOMIST Aug 29 2010, 12:17

You can reset the project; there's a button for that in BOINC. Just make sure the rest of your WCG tasks have finished first.

Author: corsar83 Aug 29 2010, 12:33

QUOTE(EKONOMIST @ Aug 29 2010, 13:17) *

You can reset the project; there's a button for that in BOINC. Just make sure the rest of your WCG tasks have finished first.

Will that reset all the projects at once? And won't the statistics be wiped?

Author: EKONOMIST Aug 29 2010, 12:37

It will reset all WCG projects. At most, the number of points this host has earned in this project may be reset; your overall statistics won't be lost.

Author: Rilian Aug 29 2010, 12:46

QUOTE(corsar83 @ Aug 29 2010, 11:30) *

That all seems to be done already, but it still won't run. For some reason the proteome executables aren't in the KIS list at all, neither under trusted nor under restricted. The other projects' executables are there. Is there some way to make all the files needed for the proteome project download again from the site?

Go to the WCG project folder under data and add all the exe files to the "trusted" list.

Resetting the project is useless in this case.

Author: corsar83 Aug 29 2010, 14:21

(Rilian @ Aug 29 2010, 13:46) *

(corsar83 @ Aug 29 2010, 11:30) *

That all seems to be done already, but it still won't run. For some reason the KIS application list has no executables for the proteome project at all, neither under trusted nor under restricted. The other projects are listed. Is there a way to make it re-download all the files the proteome project needs from the site?


Go to the wcg project folder in data and add all the exe files to "trusted".

"Resetting" the project is useless in this case.



Damn, I have looked everywhere and cannot for the life of me find that folder or the files. Wherever it downloads them to, it isn't in the BOINC folder. I have searched all of XP and still haven't found it.

Author: tiss Aug 29 2010, 21:15

(corsar83 @ Aug 29 2010, 15:21) *

Damn, I have looked everywhere and cannot for the life of me find that folder or the files. Wherever it downloads them to, it isn't in the BOINC folder. I have searched all of XP and still haven't found it.


Which OS? On XP it is somewhere under Documents and Settings; on Vista or Windows 7 it is under ProgramData.

Author: Gelo Sep 27 2010, 07:41

Damn, what is going on? Tasks run for less than a minute and then report "Computation error"

27.09.2010 8:35:27 | World Community Grid | Starting nu124_00006_17

27.09.2010 8:35:27 | World Community Grid | Starting task nu124_00006_17 using hpf2 version 617

27.09.2010 8:35:27 | World Community Grid | Starting nu124_00005_6

27.09.2010 8:35:27 | World Community Grid | Starting task nu124_00005_6 using hpf2 version 617

27.09.2010 8:35:49 | World Community Grid | Computation for task nu124_00005_6 finished

27.09.2010 8:35:49 | World Community Grid | Output file nu124_00005_6_0 for task nu124_00005_6 absent

27.09.2010 8:35:49 | World Community Grid | Starting nu122_00087_12

27.09.2010 8:35:49 | World Community Grid | Starting task nu122_00087_12 using hpf2 version 617

27.09.2010 8:35:51 | World Community Grid | Computation for task nu124_00006_17 finished

27.09.2010 8:35:51 | World Community Grid | Output file nu124_00006_17_0 for task nu124_00006_17 absent

27.09.2010 8:35:51 | World Community Grid | Starting nu122_00026_8

27.09.2010 8:35:51 | World Community Grid | Starting task nu122_00026_8 using hpf2 version 617

27.09.2010 8:36:33 | World Community Grid | Computation for task nu122_00087_12 finished

27.09.2010 8:36:33 | World Community Grid | Output file nu122_00087_12_0 for task nu122_00087_12 absent

27.09.2010 8:36:39 | World Community Grid | Computation for task nu122_00026_8 finished

27.09.2010 8:36:39 | World Community Grid | Output file nu122_00026_8_0 for task nu122_00026_8 absent


The same happens with Help Cure Muscular Dystrophy - Phase 2.
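For anyone who wants to spot this pattern without reading the log by eye, here is a minimal Python sketch. It assumes the message format shown in the log above (date, project name and text separated by "|"); the function name and the sample constant are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
27.09.2010 8:35:27 | World Community Grid | Starting task nu124_00005_6 using hpf2 version 617
27.09.2010 8:35:49 | World Community Grid | Computation for task nu124_00005_6 finished
27.09.2010 8:35:49 | World Community Grid | Output file nu124_00005_6_0 for task nu124_00005_6 absent
27.09.2010 8:35:49 | World Community Grid | Starting task nu122_00087_12 using hpf2 version 617
27.09.2010 8:36:33 | World Community Grid | Computation for task nu122_00087_12 finished
27.09.2010 8:36:33 | World Community Grid | Output file nu122_00087_12_0 for task nu122_00087_12 absent
"""

START = re.compile(r"Starting task (\S+) using")
DONE = re.compile(r"Computation for task (\S+) finished")
ABSENT = re.compile(r"Output file \S+ for task (\S+) absent")


def summarize(log_text):
    """Return {task: {"seconds": run time, "output_missing": bool}}."""
    started, report = {}, {}
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # not a "time | project | message" line
        try:
            when = datetime.strptime(parts[0], "%d.%m.%Y %H:%M:%S")
        except ValueError:
            continue  # timestamp in some other format
        message = parts[2]
        if (m := START.search(message)):
            started[m.group(1)] = when
        elif (m := DONE.search(message)) and m.group(1) in started:
            elapsed = (when - started[m.group(1)]).total_seconds()
            report[m.group(1)] = {"seconds": elapsed, "output_missing": False}
        elif (m := ABSENT.search(message)) and m.group(1) in report:
            report[m.group(1)]["output_missing"] = True
    return report


for task, info in summarize(SAMPLE_LOG).items():
    print(task, info["seconds"], "s, output missing:", info["output_missing"])
```

A task that finishes within seconds and has no output file is the symptom described here; the script only reports it and does not say what is blocking the application.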

Author: corsar83 Sep 27 2010, 09:20

(Gelo @ Sep 27 2010, 08:41) *

Damn, what is going on? Tasks run for less than a minute and then report "Computation error"


The same happens with Help Cure Muscular Dystrophy - Phase 2.


Configure Kaspersky (or do you have a different antivirus?). Add them to the trusted list. I was struggling with the proteome project yesterday too.

Author: Rilian Sep 27 2010, 09:28

QUOTE(corsar83 @ Aug 29 2010, 15:21) *

QUOTE(Rilian @ Aug 29 2010, 13:46) *

QUOTE(corsar83 @ Aug 29 2010, 11:30) *

That all seems to be done already, but it still won't run. For some reason the KIS application list has no executables for the proteome project at all, neither under trusted nor under restricted. The other projects are listed. Is there a way to make it re-download all the files the proteome project needs from the site?


Go to the wcg project folder in data and add all the exe files to "trusted".

"Resetting" the project is useless in this case.



Damn, I have looked everywhere and cannot for the life of me find that folder or the files. Wherever it downloads them to, it isn't in the BOINC folder. I have searched all of XP and still haven't found it.

The path is shown in the first lines of Messages when BOINC starts.

Author: tiss Sep 27 2010, 09:33

(Rilian @ Sep 27 2010, 10:28) *

The path is shown in the first lines of Messages when BOINC starts.


Something like this:

09/27/10 10:06:53 Starting BOINC client version 6.10.45 for windows_x86_64
09/27/10 10:06:53 Config: report completed tasks immediately
09/27/10 10:06:53 log flags: file_xfer, sched_ops, task
09/27/10 10:06:53 Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
09/27/10 10:06:53 Running as a daemon
09/27/10 10:06:53 Data directory: C:\ProgramData\BOINC
09/27/10 10:06:53 Running under account boinc_master
09/27/10 10:06:54 Processor: 4 GenuineIntel Intel® Core™ i5 CPU 650 @ 3.20GHz [Family 6 Model 37 Stepping 2]
09/27/10 10:06:54 Processor: 256.00 KB cache
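
If you have the startup messages saved as text, the path can also be pulled out automatically. A small Python sketch, assuming the line looks exactly like the "Data directory:" line above; the function name is made up for illustration.

```python
import re

# Startup lines copied from the messages above.
STARTUP_MESSAGES = [
    r"09/27/10 10:06:53 Starting BOINC client version 6.10.45 for windows_x86_64",
    r"09/27/10 10:06:53 Running as a daemon",
    r"09/27/10 10:06:53 Data directory: C:\ProgramData\BOINC",
    r"09/27/10 10:06:53 Running under account boinc_master",
]


def find_data_directory(lines):
    """Return the path from the first 'Data directory:' line, or None."""
    for line in lines:
        match = re.search(r"Data directory:\s*(.+)$", line)
        if match:
            return match.group(1).strip()
    return None


print(find_data_directory(STARTUP_MESSAGES))
```

The project files are then usually found in a subfolder of that directory; the exact subfolder name is not shown in this log, so check it on your own machine.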

Author: Rilian Sep 27 2010, 10:10

Looks like a note about antivirus software should be added to our standard header post on joining BOINC projects.

Author: corsar83 Sep 27 2010, 10:42

(Rilian @ Sep 27 2010, 10:28) *

(corsar83 @ Aug 29 2010, 15:21) *

(Rilian @ Aug 29 2010, 13:46) *

(corsar83 @ Aug 29 2010, 11:30) *

That all seems to be done already, but it still won't run. For some reason the KIS application list has no executables for the proteome project at all, neither under trusted nor under restricted. The other projects are listed. Is there a way to make it re-download all the files the proteome project needs from the site?


Go to the wcg project folder in data and add all the exe files to "trusted".

"Resetting" the project is useless in this case.



Damn, I have looked everywhere and cannot for the life of me find that folder or the files. Wherever it downloads them to, it isn't in the BOINC folder. I have searched all of XP and still haven't found it.

The path is shown in the first lines of Messages when BOINC starts.


Yes, I already figured it out yesterday. It turns out the proteome executable has a tricky name, and at first it didn't occur to me that this was it. It is working now.

Author: Rilian Sep 27 2010, 10:45

Yes, actually WCG executables on Windows do not have the .exe extension. That should go into the header post too.
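
Since the extension is missing, one way to find the files to add to the antivirus trusted list is to look at the file contents instead of the name. A rough Python sketch: Windows executables begin with the two bytes "MZ", so the function below lists every file in a folder that starts with that signature. The function name is made up for illustration, and the folder you pass in is whatever your own BOINC project folder is.

```python
from pathlib import Path


def find_windows_executables(folder):
    """List names of files in `folder` that start with the 'MZ' signature."""
    root = Path(folder)
    if not root.is_dir():
        return []  # folder missing or not a directory
    hits = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            if handle.read(2) == b"MZ":  # signature of Windows executables
                hits.append(path.name)
    return hits


# Usage (the path is only an example, substitute your own project folder):
# print(find_windows_executables(r"C:\ProgramData\BOINC\projects"))
```

DLL files start with the same signature, so they will be listed as well; for a trusted list that is probably what you want anyway.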

Author: corsar83 Sep 27 2010, 10:54

(Rilian @ Sep 27 2010, 11:45) *

Yes, actually WCG executables on Windows do not have the .exe extension. That should go into the header post too.


True. But in the other projects the file names at least match the project, only abbreviated. For the proteome project, Kaspersky listed a file (I don't remember exactly) called something like Client....(can't recall here)... with a mention of an institute in Paris (or New York) and something else (a long name); I honestly thought it was some kind of ad. It turned out to be exactly the one.

Author: corsar83 Sep 27 2010, 19:21

So in the proteome project each task gets computed 15 times? Was it like that before too?

Author: Rilian Sep 27 2010, 20:18

corsar83, it has always been like that. The tasks are actually slightly different from each other. More details, in English, on the official forum.

Author: Gelo Sep 28 2010, 13:55

QUOTE(corsar83 @ Sep 27 2010, 10:20) *

QUOTE(Gelo @ Sep 27 2010, 08:41) *

Damn, what is going on? Tasks run for less than a minute and then report "Computation error"


The same happens with Help Cure Muscular Dystrophy - Phase 2.


Configure Kaspersky (or do you have a different antivirus?). Add them to the trusted list. I was struggling with the proteome project yesterday too.


I use Avast; I added them to trusted and it made no difference. Detaching and re-attaching the project is what helped.
P.S. The trouble with this project started after I reinstalled Windows; before that it computed fine.

Author: Tamagoch Sep 28 2010, 15:55

(Rilian @ Sep 27 2010, 11:45) *
Yes, actually WCG executables on Windows do not have the .exe extension.

Which is quite handy if the office blocks downloads of executable files and you are too lazy (or, more properly, unwilling for security reasons) to turn the block off.

Author: Rilian Oct 21 2010, 16:51

Press release for October 2010

http://homepages.nyu.edu/~rb133/wcg/thread_2010_10_10.html

Author: Rilian Feb 3 2011, 16:16

The long-awaited update!

http://homepages.nyu.edu/~rb133/wcg/thread_2011_01_31.html

HPF2 Update - January 2011

Greetings WCG Community,

Happy new year! I have taken it upon myself to write a quick status update regarding the Human Proteome Folding project, in part to introduce myself and also to summarize a few of the tasks that lie ahead for us.

As of early January, the Bonneau lab team working on the HPF project has undergone a slight change of staff. Patrick Winters, from whom you have heard in previous updates, has moved on to other pastures, and I have taken his place. My name is Duncan Penfold-Brown, and I am coming out of a previous research position in the application of high-powered computing (both Grid and Cloud) to high-energy physics and astronomy. I have a degree in computer science, and have in the past focused on distributed and self-organizing systems. I also have experience (and a great deal of interest) in bioinformatics, which I look forward to increasing in my work on this project.

In short, I will be responsible for support and development of HPF projects and research. I am interested in pursuing a greater knowledge of proteomics - which seems to me like biological puzzle solving - and also the applications (medical, investigative) of the research we are completing together.

The immediate tasks we are approaching are to work on improving our pre- and post-analysis tools, in order to get more data to the Grid, and to better interpret what comes out. Goals for the immediate future include continuing the incorporation of phylogenetic data and evolutionary analysis of protein domains into our analysis process in order to enhance our annotation of select unknown proteins. This incorporation will provide us with a better understanding of the function of unknown proteins, as we can more accurately identify evolutionarily similar structures and cross-examine their structure-function relationships. With additions to our analysis process and continued work, we will ultimately be improving the end data of our research.

Speaking of data, here is a quick update of what is currently being turned over on the WCG:

Currently, all protein data being folded is from the Human Microbiome Project (HMP - see their site at http://commonfund.nih.gov/hmp/), with a focus on microbes found in the human gut. As of late January, we have completed pre-analysis on the last of the data that we will be working on from the HMP (for now), and have dropped it off to be picked up by the grid (see codes 'oh' - 'ok' in the following table).

We are now looking into new organisms - such as the malaria parasite Plasmodium yoelii yoelii (a model rodent malaria important for understanding the function of human malaria) - to analyze and send to the grid.

The following table describes the data that has recently been or is being processed by the WCG:

CODE
Code  Experiment  Project/Organism  Description                                                         Status

oa    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Finished
ob    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Finished
oc    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Working...
od    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Working...
...   1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Working...
og    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Working...
oh    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Submitted
oi    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Submitted
oj    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Submitted
ok    1169        Microbiome        Novel Gastro-Intestinal proteins from the Human Microbiome Project  Submitted

Thanks!

--

Duncan Penfold-Brown, Bonneau Lab

Author: Salmonella Feb 3 2011, 17:15

Crunching the Americans' microbiome for them. Too much of an honor. I want our own home-grown kefir.

Author: Rilian Sep 27 2011, 11:01


Summary
The Human Proteome Folding project researchers have published a paper in the journal Genome Research, which announces the availability of their database of predicted protein structures, their validation methods and how this augments other information about these proteins, thus helping to solve a critical problem for biologists.

QUOTE
The HPF project has published a paper in Genome Research. They announce the availability of their database of predicted protein structures and their validation methods, and describe how the new findings extend what is already known about these proteins, helping to solve critical problems for biologists.


Lay Person Abstract:

Lack of information about the structure of proteins is a critical problem for biologists and severely limits their ability to do further research and conduct experiments to understand the roles of proteins in disease processes. The researchers for the Human Proteome Projects have published a paper in Genome Research entitled "The proteome folding project: proteome-scale prediction of structure and function." The paper describes how they were able to use the computation results from World Community Grid to predict protein structure and protein function. Protein structure determines the function of proteins in life processes. Knowing the structure of these proteins helps scientists studying biological and medical processes and can, for example, hasten the process of discovering treatments for diseases. The human genome as well as 93 other genomes of importance to humans were processed. The paper describes the methods used to validate the accuracy of their predictions, which are now publicly available in a database for all scientists to use.

Technical Abstract:

The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.

Access to Paper:

To view the paper, please see http://genome.cshlp.org/content/early/2011/09/16/gr.121475.111.full.pdf
http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=180

Author: Kirilkaper Oct 19 2011, 10:20

World Community Grid Lecture Series - Human Proteome Folding project

Dear ******,

You are invited to participate in a live webcast on October 21, 2011 to hear an overview and update on World Community Grid's Human Proteome Folding project. The event will be hosted by Dr. Richard Bonneau from New York University.

Since 2006, World Community Grid has had the privilege of supporting the innovative research underway at New York University to use computers to predict the structure of proteins, the "molecular machines" of the human body. Knowing protein structure is a critical step in advancing the understanding of how proteins affect human health, providing scientists with the information they need to develop new cures for human diseases.

This is the "Human Proteome Folding - Phase 2" project that many of you run every day on your laptops and PCs for World Community Grid, helping us make progress towards aiding researchers in understanding how proteins perform their intended functions and also how diseases prevent proteins from maintaining healthy cells.

The webcast will take place on October 21, 2011, starting promptly at 11:00AM Eastern Daylight Time (USA), which is 15:00 Coordinated Universal Time. Please join a few minutes early so that you're sure not to miss anything.
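
To double-check the start time in another time zone, the conversion can be sketched in a few lines of Python. The EDT offset (UTC-4) follows from the two times given above; the Kyiv offset (UTC+3, summer time on that date) is an assumption added here for local readers and is not part of the announcement.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the sketch self-contained (no time zone database needed).
EDT = timezone(timedelta(hours=-4), "EDT")
EEST = timezone(timedelta(hours=3), "EEST")  # assumption: Kyiv summer time

# Start time as announced: October 21, 2011, 11:00 AM Eastern Daylight Time.
webcast_start = datetime(2011, 10, 21, 11, 0, tzinfo=EDT)

start_utc = webcast_start.astimezone(timezone.utc)
start_kyiv = webcast_start.astimezone(EEST)

print("UTC: ", start_utc.strftime("%Y-%m-%d %H:%M"))
print("Kyiv:", start_kyiv.strftime("%Y-%m-%d %H:%M"))
```

The UTC value should agree with the 15:00 stated in the announcement; if your own zone differs, swap in its offset.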

Participants can listen to Dr. Bonneau while viewing an on-screen presentation. Time permitting, you will be able to ask Dr. Bonneau questions via a text chat interface.

Access to the webcast is via this link: https://apps.lotuslive.com/meetings/join?id=0327108

You can check if your computer is ready for the webcast at this link: https://www.conferenceservers.com/browser?brand=LLENGAGE_EN-US

And whether or not you can join the webcast, make sure your laptop, PC or Mac is running World Community Grid, and let your friends know this easy way to participate in helping humanity!

Also, please note that World Community Grid has added three new download servers to help support our additional growth. Download servers are used to send work to your computer. As a result of this change, your computer may prompt you to communicate with the IP addresses of these new servers. If you have experienced this, please click on this link for further information: http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,31492


Thank you,


The World Community Grid Team

P.S. After the webcast we will post the video of the webcast on YouTube, in the World Community Grid News & Update section, and we'll send you a link to the video.

Author: Rilian Dec 28 2011, 23:18

The Human Proteome Folding project research scientists have posted an informative status update on their web site. They highlight their recently published paper in Genome Research and an upcoming paper about the evolution of proteins. Future work is also discussed, including some work which should help the scientific community working on malaria.

You may review their update at https://files.nyu.edu/rb133/public/wcg/thread_2011_11_11.html

World Community Grid Post - HPF2 Update, November 2011

Greetings to everyone,

It's been a stretch since the last update, but excitingly (!), we've been quite busy wrapping up ongoing projects with publications, and also getting our teeth into new projects and data. So, without further ado, I'd like to first mention our accepted and pending publications, and then go over the new data we're crunching and where it is leading us.

The lab has been very excited to recently have two gargantuan efforts come to fruition with the acceptance of one paper and the completion and submission of a second. The first, Kevin Drew (et al.)'s, is an enormous work covering nearly everything we do in terms of protein structure and function prediction, and was made conceivable in the first place and achievable in the second by support of World Community Grid computing cycles.

The paper will be available in the journal Genome Research this month (November 2011). The abstract is as follows, and the lab spent extra to ensure an open license so that the paper could be viewed in full - take a look!

The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition and grid-computing enabled de novo structure prediction. We predict protein domain boundaries and 3D structures for protein domains from 94 genomes (including Human, Arabidopsis, Rice, Mouse, Fly, Yeast, E. coli and Worm). De novo structure predictions were distributed on a grid of over 1.5 million CPUs worldwide (World Community Grid). We generate significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.

The paper can be viewed here: http://genome.cshlp.org/content/early/2011/09/16/gr.121475.111.abstract

Also, take a quick look at this seminal image from the paper - predicting domain boundaries, and using the grid to do de Novo structure prediction for unknown domains:

IPB Image


The second piece of good news is that another paper involving protein structure folding has recently been submitted for publication.

Melissa Pentony et al. have presented work considering sites of positive selection (areas of faster-than-average evolution) in the proteomes of five major plant species in order to study plant protein evolution, and have extended this analysis in a novel way by mapping sites of positive selection in proteins onto 3D predicted protein structures. This is exciting as, seen in the image below, it allows scientists to visualize where sites of increased evolution occur structurally on a protein.

IPB Image
[Image: a DNA-binding protein interacting with DNA, with positively selected residues of protein highlighted by blue spheres. Notice, then, that the parts of the protein interacting with DNA are under selected evolution!]



This work is currently being revised, and will be available for preview shortly - Another example of the grid producing data (predicted protein structure!) that can be used in diverse biological studies to extend analyses and relate biological phenomena to the fundamental molecular machines of the human body (proteins!).

Now on to what's been grinding on your CPUs...

IPB Image


Processing for the Human Microbiome Project (described in the last update) was finished with batch 'ok', and from there we moved on to Plasmodium yoelii yoelii, which made up batches 'ol' through 'op'.

I mentioned the parasite Plasmodium yoelii yoelii in the previous status update and very briefly in my last forum post. Pyy is a rodent malaria used as a model organism for studying malaria in general, and specifically human malaria (the concept of using very similar model organisms is common in the field, and is extremely helpful for increasing data set size and inferring properties of an organism from known properties in a model). For this reason, having accurate structural knowledge of Pyy is important for the malaria research community.

Knowing this, we looked up our collaborator Jane Carlton, recently moved to the NYU Department of Biology, and asked for the most up-to-date data. We were pointed to a resource called PlasmoDB ( http://plasmodb.org/plasmo/ ), and from the data we found there put together five batches of novel protein domains to be sent for de Novo structure prediction.

After malaria...

After malaria, while we updated our post-processing analyses to make better use of grid results, we moved on to Archaea, which make up the third domain of life (the other two being bacteria and eukaryotes). Archaea are incredibly interesting and important organisms - they're now getting a lot of press due to their role in the function of the human colonic system, and interestingly, some species are known to thrive in incredibly harsh environments, such as salt lakes and hot springs.

For more information on Archaea, check this Berkeley resource or, of course, wikipedia - Archaea. The archaea Haloferax and Haloarcula comprise batches oq through ow.

Pausing the Archaeas

At the moment, we have a large list of archaea to analyze, but have switched priorities due to some extremely exciting new ideas regarding protein function prediction based on machine learning techniques (which sounds AI-cool, but is more statistics-cool) which we have developed in house, and on revised proteome data for Mouse and Human.

We have decided to re-run this new mouse and human data through our domain prediction pipeline and send results to the grid in order to get the best possible protein structure data. With improvements and updates to our pre- and post-processing methods and increased sampling on the grid (we're now folding 100,000 structures per domain, up from 30,000!), we will be able to approach the problem of protein structure prediction in a novel and potentially game-changing way with the best data available.

In terms of work batches, ox through ql (we skipped the letter p in batch naming) are made up of Mouse protein data, with ox through ql running on the grid now. After ql, new Human data will take over.

The first culmination of this mouse and human redo, along with our new protein function prediction ideas, will be our presence at a nation-wide protein structure/function jamboree hosted by the University of California, San Diego in early December, where we will present the work of the grid and its incorporation into our new methods to hopefully astounding effect!

Cross your fingers for us…

Author: vitalidze1 Mar 26 2012, 23:42

My results for this project have been stuck in pending validation for about 5 days now; they are taking a long time to get validated...

Author: Rilian Apr 16 2012, 21:03

Resizing of HPF2 work units

For future work units, we have decreased the average run time from 9 hours to 6 hours for this project. This will allow users with slower computers or computers which are available less time to have a better chance of completing work units for this project. It will take about 20 days for the existing longer work units to be sent out, so the new shorter work units won't be seen until after this time.

Seippel
Apr 13, 2012

From now on, all tasks will take about 6 hours on average to compute.

http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,32972

Author: Rilian Jun 21 2012, 17:31

The paper "The Plant Proteome Folding Project: Structure and Positive Selection in Plant Protein Families" has been published in the journal Genome Biology and Evolution.

Researchers have published a paper in the journal Genome Biology and Evolution, which documents their findings studying a number of plant genomes, their proteomes, evolution and protein structure.

Lay Person Abstract:

Melissa Pentony et al. have presented work considering components of proteins exhibiting faster-than-average evolution in the proteomes of five major plant species, including rice (Oryza sativa) and Arabidopsis thaliana (an important model organism for plant study). They describe new information on the relationship between evolution and protein structure in plants.

The World Community Grid has contributed to this study by providing a much more structurally complete view of unknown and understudied proteins from five plant families than was previously available. The results from the Human Proteome Folding project produced 29,202 protein structures contributing to this project, of which 4,764 were very high-confidence. This should eventually assist agricultural scientists to better understand important plant and food crops, how to breed them for disease resistance, better nutrition and to better handle environmental stress.

Technical Abstract:

Despite its importance, relatively little is known about the relationship between the structure, function, and evolution of proteins, particularly in land plant species. We have developed a database with predicted protein domains for five plant proteomes (http://pfp.bio.nyu.edu/) and used both protein structural fold recognition and de novo Rosetta-based protein structure prediction to predict protein structure for Arabidopsis and rice proteins. Based on sequence similarity, we have identified ~15,000 orthologous/paralogous protein family clusters among these species and used codon-based models to predict positive selection in protein evolution within 175 of these sequence clusters. Our results show that codons that display positive selection appear to be less frequent in helical and strand regions and are overrepresented in amino acid residues that are associated with a change in protein secondary structure. Like in other organisms, disordered protein regions also appear to have more selected sites. Structural information provides new functional insights into specific plant proteins and allows us to map positively selected amino acid sites onto protein structures and view these sites in a structural and functional context.

Access to Paper:

To view the paper, please see http://gbe.oxfordjournals.org/content/4/3/360.full

Author: Rilian Jul 12 2012, 11:02

Project status update!

http://bonneaulab.bio.nyu.edu/wcg/thread_2012_07_01.html

Author: Rilian Jul 12 2012, 17:34

Paper published in the journal Molecular Cell using Human Proteome Folding project results

http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=204

Summary
A paper was published in the journal Molecular Cell, which used results from the Human Proteome Folding project in identifying proteins which regulate processes in human cells.

Paper Title:

“The mRNA-Bound Proteome and its Global Occupancy Profile on Protein-Coding Transcripts”

Lay Person Abstract:

The Bonneau lab at NYU collaborated with Markus Landthaler and colleagues from the Max Delbrück Center for Molecular Medicine, Berlin, contributing to an effort to discover and study novel RNA-binding proteins in the human proteome. These proteins play an important role in regulating activity in the cell. Some of the proteins have been implicated in diseases such as Alzheimer’s, muscular diseases, cancers and others. This information should help scientists further understand disease processes, possibly leading to better treatments.

The Landthaler group at the MDC put together a landmark experiment for discovering RNA-binding proteins - a type of protein extremely important to human genetic systems. They then contacted the Bonneau lab for computational analysis. World Community Grid has provided predicted structures for a more complete structural landscape, contributing greatly to the analysis of human protein structure and function. This analysis allowed the Bonneau lab to verify experiment results from the Landthaler lab, lending confidence to their methods and providing data on RNA-binding proteins found via experimental methods. Furthermore, cutting-edge function prediction methods were developed and proved in this experiment, which will feature World Community Grid data in future publications.

Technical Abstract:

Protein-RNA interactions are fundamental to core biological processes, such as mRNA splicing, localization, degradation, and translation. We developed a photoreactive nucleotide-enhanced UV crosslinking and oligo(dT) purification approach to identify the mRNA-bound proteome using quantitative proteomics and to display the protein occupancy on mRNA transcripts by next-generation sequencing. Application to a human embryonic kidney cell line identified close to 800 proteins. To our knowledge, nearly one-third were not previously annotated as RNA binding, and about 15% were not predictable by computational methods to interact with RNA. Protein occupancy profiling provides a transcriptome-wide catalog of potential cis-regulatory regions on mammalian mRNAs and showed that large stretches in 3′ UTRs can be contacted by the mRNA-bound proteome, with numerous putative binding sites in regions harboring disease-associated nucleotide polymorphisms. Our observations indicate the presence of a large number of mRNA binders with diverse molecular functions participating in combinatorial posttranscriptional gene-expression networks.

Access to Paper:

To view the paper, visit http://www.cell.com/molecular-cell/abstract/S1097-2765%2812%2900437-6


Author: Bel Aug 25 2012, 09:58

The average run time of all tasks has increased by 15%!

Author: Rilian Dec 16 2012, 19:06

Project news

World Community Grid Post - HPF2 Update, June/July 2012
http://bonneaulab.bio.nyu.edu/wcg/thread_2012_07_01.html

World Community Grid Post - HPF2 Update, Fall/Winter 2012
http://bonneaulab.bio.nyu.edu/wcg/thread_2012_12_04.html


Author: Sonechko Feb 12 2013, 00:11

Damn...

Now I don't even know what to crunch, everything is already at sapphire...

Author: KING100N May 18 2013, 19:00

The project has suddenly come to an end: http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=299. Judging by the data in the table at http://i137.photobucket.com/albums/q210/Sekerob/WCGYearsPi1Project.png, 24 days of crunching remain. I recommend that everyone who wants to earn another badge lean on the project once the pentathlon is over. Myself, for example )

Summary
The first project to run on World Community Grid, the Human Proteome Folding project, is coming to a close. They have added greatly to the knowledge of protein structures, providing their results to other scientists via their data base resources.



The first project to run on World Community Grid, the Human Proteome Folding project, is coming to a close.

They have greatly added to the knowledge of protein structures, providing their results to other scientists via their database resources (http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=291). In addition, the project has published many high-quality papers (http://www.worldcommunitygrid.org/research/hpf2/news.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1). These publications and the database resources have helped many other scientists with their own work to understand disease processes and to accelerate their search for cures.

We are a little sad to see the project ending in a few weeks, but we are also very proud of this project's accomplishments. Please read their post at http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html for more details.

We thank you, our member volunteers, for contributing to this project and we hope you will support our other projects (http://www.worldcommunitygrid.org/research/viewAllProjects.do), as well as the many new ones we expect to launch before too long.

Author: Rilian May 18 2013, 19:45

Wow, cool!

They will probably launch a third phase with the new Rosetta@home algorithms

Author: Rilian May 18 2013, 19:59

Taken from http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html

There are some exciting research possibilities he and others are considering such as investigating how mutations alter protein structure. Perhaps one of these ideas may grow into a new World Community Grid project at some time in the future.


Author: Rilian May 21 2013, 10:39

21 days left until the project ends!

I suggest that everyone who has not enabled it yet do so, and speed this event up!

Author: Rilian Jun 4 2013, 10:55

Tasks will be issued for one more week! Stock up your cache if you are hunting for the badge

Author: Rilian Jun 8 2013, 10:21

Only a couple of days' worth of new tasks remain!

After that, only tasks for recalculation will be issued

Author: Rilian Jun 11 2013, 12:40

New tasks are no longer being issued

The ones already issued will be finished over the next couple of weeks, and then the project will be complete!

Author: Rilian Jul 8 2013, 11:57

Human Proteome Folding Project - Phase 2: Grid phase complete

The project is complete!

Thanks to the project's data, papers have been published (http://www.worldcommunitygrid.org/about_us/displayNews.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1) in scientific journals (see the link)

The grid-computing phase of the Human Proteome Folding - Phase 2 project is now complete. It was a massive project, launched in June 2006, and was the second-longest-running World Community Grid project to date. Volunteer members contributed over 123,000 CPU-years of computing power to run simulations and help determine the structure of proteins. The researchers at the Bonneau lab are very thankful to our members for this support, without which the project would have been impossible.

The researchers have made the protein structure data calculated during this project available to scientists around the world through a public database. This data has led to the publication of papers (http://www.worldcommunitygrid.org/about_us/displayNews.do?filterCategory=3_10&filterTags=14&sortBy=&pageNum=1) in academic journals, and these results have helped in better understanding the proteins involved in many diseases and have led to further research into how they might be treated. Work will continue for some time as scientists continue to analyze the protein structures.

For more details, please see the latest post from the Bonneau lab team http://bonneaulab.bio.nyu.edu/wcg/thread_2013_05_16.html, as well as their more detailed recent post http://bonneaulab.bio.nyu.edu/wcg/thread_2013_06_17.html.

http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=308
