> Linux Problems
mickydl*
Apr 27 2011, 21:59
Post #1

Too shy to write anything
Group: New Members
Posts: 4
Joined: 26-April 11
User No.: 1 744
Gender: bot

Hello everybody,

I have been trying to get some work done on my Linux machines, but so far ALL my WUs have ended in an error.

One of the problems seems to be related to ZIP (program not found); this is one of the results.
After I added a symbolic link to gzip, the errors changed to Signal 11 errors.
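
For context, signal 11 on Linux is SIGSEGV, a segmentation fault caused by an invalid memory access. A minimal Python sketch for decoding the signal number that appears in such an error message; the helper name describe_signal is made up for illustration and is not part of BOINC or the project application:

```python
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name of a signal number, or a fallback string."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known to this platform.
        return f"unknown signal {number}"


print(describe_signal(11))
```

The mapping is platform-specific, so the script should be run on the machine that produced the error.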

Has anyone successfully completed any work on Linux?

BTW: I'm running 64-bit Linux only.

Regards,
Michael
Replies (1 - 8)
SLinCA-Yuri
Apr 28 2011, 08:02
Post #2

SLinCA@Home Admin
Group: Trusted Members
Posts: 175
Joined: 26-January 11
User No.: 1 595
Gender: Male


Dear Michael,

It seems that you are the only person with this flavor of Linux (kernel 2.6.34.1).
Many other Linux users (for example, here you can see valid results for the 64-bit machines of the EDGES cluster)
have not reported such problems, at least so far; only many Windows users have.

From your error log (thanks!) I see that the system cannot find zip:
"sh: zip: command not found".
Could you please check whether you can create ANY ZIP archives from your command line?
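
One way to run this check is to ask the system where zip resolves on the search path. A minimal sketch in Python, assuming Python 3 is available on the machine; find_tool is a hypothetical helper name, not part of BOINC or the project application:

```python
import shutil
from typing import Optional


def find_tool(name: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable, or None if it is not found.

    When path is None, the PATH of the current process is searched.
    """
    return shutil.which(name, path=path)


location = find_tool("zip")
print(location if location else "zip was not found on the search path")
```

The result depends on the PATH of the user who runs the script, so it is most informative when run as the same user as the BOINC client.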

Thank you in advance!
Cheers,
Yuri
Yuriy V
Apr 28 2011, 16:59
Post #3

Dreaming of a farm...
Group: Trusted Members
Posts: 181
Joined: 21-November 10
User No.: 1 545
Gender: Male
Machine fleet: Whatever comes along


Hi, mickydl*
As SLinCA-Yuri said, try installing the appropriate package (e.g. unzip) from the repository of your Linux distribution.


mickydl*
Apr 28 2011, 21:40
Post #4



Hi Yuriy V and SLinCA-Yuri,

Thanks for your responses.

I'm not sure that the ZIP problem is the real problem.

After I noticed the computation errors and saw the error message in the WU I linked to in my first post, I checked for zip on my machines.
I don't currently have zip installed, but I do have gzip. So, as a first attempt to cure the problem, I created a symbolic link named zip pointing to gzip in the search path. The error messages about zip not being found are gone now. However, I now get Signal 11 errors (here is an example). I don't yet know whether this happens at the end of a WU (and might still be related to the zip problem) or somewhere in the middle of a WU.

There might be different calling conventions for zip and gzip. What exactly do your applications need: zip and unzip? What command-line options do they use? Maybe there is a way to solve this without having to install new software.
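
Beyond the command-line options, the two tools also write different file formats: gzip compresses a single data stream, while zip produces an archive of named members. A small Python sketch of the difference using only the standard library; the payload and the member name result.txt are invented for illustration and say nothing about what the project application actually packs:

```python
import gzip
import io
import zipfile

payload = b"example result data"

# gzip: one compressed stream (mtime fixed so the output is reproducible).
gz_bytes = gzip.compress(payload, mtime=0)

# zip: an archive holding named members.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("result.txt", payload)
zip_bytes = zip_buffer.getvalue()

print("gzip starts with:", gz_bytes[:2])
print("zip starts with: ", zip_bytes[:4])
print("gzip output readable as zip:", zipfile.is_zipfile(io.BytesIO(gz_bytes)))
```

So even where a link named zip makes the command resolve, anything that later expects a real ZIP archive would not be able to read gzip output.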

Simply installing the right package for my distribution is difficult because I'm not using a distribution. My Linux is compiled entirely from scratch (CLFS). So if there's no simple cure for the problem, I'd have to compile the right programs from source. Although compiling is not really a problem, setting everything up correctly on my machines (a small diskless cluster booting over the network) is quite some work, so I'm somewhat reluctant to do it.

Regards,
Michael
SLinCA-Yuri
Apr 29 2011, 08:56
Post #5



Dear Michael and Yuriy V,

Thanks for your care!

Michael, you are right.
It seems that you (and we) are facing the well-known problem
described on the official BOINC forum in the threads
"Process got signal 11"
and
"Process got signal 22",
which is related to the following: "32-bit binaries don't just work on every 64-bit Linux. If for example you install a fresh Ubuntu 6.10 or 7.04, 32-bit binaries won't work".
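
Whether a given binary is 32-bit or 64-bit can be read from its ELF header: the file starts with the four magic bytes 0x7F 'E' 'L' 'F', and the fifth byte is 1 for a 32-bit binary or 2 for a 64-bit one. A minimal Python sketch under that assumption; the function names are invented for illustration, and the path of the project application is not known here, so the demo inspects the Python interpreter itself:

```python
import sys

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header: bytes) -> str:
    """Classify a binary from the first five bytes of its file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not an ELF file"
    return ELF_CLASSES.get(header[4], "unknown ELF class")


def elf_class_of_file(path: str) -> str:
    """Read the start of a file and classify it."""
    with open(path, "rb") as handle:
        return elf_class(handle.read(5))


if __name__ == "__main__" and sys.executable:
    print(sys.executable, "->", elf_class_of_file(sys.executable))
```

Pointing elf_class_of_file at the science application in the BOINC project directory would show whether it is a 32-bit binary running on a 64-bit system.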

Thank you for your experiments!
Over the next few days (or a week) we will try to port the application to 64-bit Linux and fix this problem.
In any case, we will try to compensate your efforts with quadruple credits for
discovering this problem and its root cause.
In addition, we will try to compensate with double credits all other users of the NEWEST 64-bit Linux OSs
who could not obtain credits for their work because of this problem.

Best regards,
SLinCA-Yuri

mickydl*
Apr 29 2011, 20:45
Post #6



Hi SLinCA-Yuri,

Thanks for the reply (and the credits).

I'm glad I could help. I'll keep an eye on the project and run some WUs from time to time to see if anything changes. Always glad to support another physics project.

Michael
mickydl*
May 6 2011, 20:27
Post #7



Update:

I've been doing some more testing.

I removed the link to gzip, then compiled zip from source and installed it into /usr/local/bin.
BOINC and the science applications run as the user boinc, and the zip command was in the search path of that user. Even so, the two test units that I ran (WU 1, WU 2) ended in a computation error with a "zip not found" error message.

Second test:
I added a symbolic link /usr/bin/zip pointing to the newly installed /usr/local/bin/zip. Since then I have run more than ten WUs, and all have completed without error and validated.

The only explanations I can think of are:

1) Although the top command shows that the applications are executed as the user boinc, they actually run as the user root.
The difference between the two users (root and boinc) is the configuration of the search path. The root user has only /usr/bin in its search path, not /usr/local/bin. The boinc user has both directories. So, if the zip command is installed into /usr/local/bin, the user boinc should find it and the user root should not. After the symbolic link is added to /usr/bin, the root user can find the zip command as well. However, I don't believe this is a likely explanation.


2) Your application has a hard-coded path to /usr/bin/zip in its call to the zip command, which requires zip to be present in /usr/bin. Maybe you could check that?
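
The search-path effect in explanation 1 can be reproduced in a sandbox. The sketch below is only an illustration in Python: the two temporary directories stand in for /usr/local/bin and /usr/bin, the dummy script stands in for the real zip, and the name lookup_demo is invented. It does not show how the project application actually locates zip.

```python
import os
import shutil
import tempfile


def lookup_demo():
    """Look up 'zip' under two search paths, before and after adding a symlink."""
    with tempfile.TemporaryDirectory() as sandbox:
        # Stand-ins for /usr/local/bin and /usr/bin (hypothetical sandbox layout).
        local_bin = os.path.join(sandbox, "usr_local_bin")
        system_bin = os.path.join(sandbox, "usr_bin")
        os.makedirs(local_bin)
        os.makedirs(system_bin)

        # A dummy executable standing in for the zip built from source.
        fake_zip = os.path.join(local_bin, "zip")
        with open(fake_zip, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(fake_zip, 0o755)

        boinc_path = os.pathsep.join([local_bin, system_bin])  # both directories
        root_path = system_bin                                 # system directory only

        found_by_boinc = shutil.which("zip", path=boinc_path) is not None
        found_by_root = shutil.which("zip", path=root_path) is not None

        # The workaround from the second test: a symlink in the system directory.
        os.symlink(fake_zip, os.path.join(system_bin, "zip"))
        found_by_root_after_link = shutil.which("zip", path=root_path) is not None

    return found_by_boinc, found_by_root, found_by_root_after_link


print(lookup_demo())
```

On a POSIX system the lookup should succeed for the two-directory path, fail for the single-directory path, and succeed again once the link exists. A hard-coded absolute path, as in explanation 2, would bypass the search path entirely.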

Anyway, it seems to work now.

Regards,
Michael

Edit:
It seems that I was too optimistic.
I have a computation error again. However, this time it is the Signal 11 error instead of the zip error (example).
SLinCA-Yuri
May 18 2011, 08:11
Post #8



Dear mickydl, thank you very much for your persistence!
You are right, and your analysis is VERY helpful for us.
Currently we are very busy and cannot devote enough attention to this problem.
Moreover, all tasks work well on our Linux PCs and on many distributed Linux clusters
with quite standard configurations similar to ours,
so we could not catch this problem before your clever observation.
That is why we thank you again!
We will try to fix this after finishing the current stage of work.
Our schedule is very tight and, sadly, we cannot redesign the application and its configuration right now.
But we will do this in 1-2 weeks.


By the way, about once a week we try to recalculate DOUBLE credits for finished and invalid tasks.
In addition, we will try to award special (pleasant) surprises to those
who manage to solve the most crucial problems in SLinCA operation and teach others.
x3mEn and YOU are already on this list.

Good luck!
AMDave
Oct 6 2012, 01:54
Post #9

Too shy to write anything
Group: New Members
Posts: 12
Joined: 7-August 11
User No.: 1 916
Gender: bot

Since the client app update yesterday from 32.34 to 32.38,
I am getting signal 11 on all Linux WUs (x86_64) after many hours of computing.
Please fix.

The same zip issue as before is recurring on Win_x86_64 with client 32.38:
http://dg.imp.kiev.ua/slinca/result.php?resultid=651457
again after many hours of computing.
Please fix.

The same client errors again as a few months ago? Very frustrating.

