Linux Problems |
mickydl* |
Apr 27 2011, 21:59
Post
#1
|
Too shy to write anything Group: New Members Posts: 4 Joined: 26-April 11 User No.: 1 744 Gender: bot |
Hello everybody,
I have been trying to get some work done on my Linux machines, but so far ALL my WUs have ended in an error. One of the problems seems to be related to ZIP (program not found) (this is one of the results). After adding a symbolic link to gzip, the errors changed to Signal 11 errors. Has anyone successfully completed any work on Linux? BTW: I'm running 64-bit Linux only. Regards, Michael |
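For readers unfamiliar with the term: on Linux, signal 11 is SIGSEGV, a segmentation fault. Below is a minimal Python sketch, unrelated to the project's own code, that decodes an exit status the way Python's `subprocess` module reports it. The helper name `describe_exit` is made up for illustration.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Turn a subprocess-style return code into a readable message.

    On POSIX, subprocess reports a child killed by signal N as return code -N.
    """
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe_exit(-11))  # the "Signal 11" case from the post
    print(describe_exit(0))
```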
SLinCA-Yuri |
Apr 28 2011, 08:02
Post
#2
|
SLinCA@Home Admin Group: Trusted Members Posts: 175 Joined: 26-January 11 User No.: 1 595 Gender: Male Free-DC_CPID |
Dear Michael,

it seems that you are the only person with this flavor of Linux (kernel 2.6.34.1). Many other Linux users have not reported such problems, at least so far (for example, here you can see valid results for the 64-bit machines of the EDGES cluster); only many Windows users have. From your error log (thanks!) I see that the system cannot find zip: "sh: zip: command not found". Could you please check whether you can create ANY ZIP archive from your command line?

Thank you in advance!
Cheers, Yuri |
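The check Yuri asks for can also be scripted. The following is a rough sketch, assuming the tool on the search path is Info-ZIP's `zip` (whose `-q` and `-j` options mean "quiet" and "store file names without directories"). The function name `can_create_zip` and the probe file names are hypothetical, and nothing here reflects how the SLinCA application itself calls zip.

```python
import os
import shutil
import subprocess
import tempfile
import zipfile


def can_create_zip(tool: str = "zip") -> bool:
    """Return True if `tool` is on PATH and produces a real ZIP archive."""
    exe = shutil.which(tool)
    if exe is None:
        # Same situation as "sh: zip: command not found".
        return False
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.txt")
        with open(src, "w") as fh:
            fh.write("probe\n")
        archive = os.path.join(tmp, "probe.zip")
        try:
            # Assumed Info-ZIP usage: zip [options] <archive> <file>
            result = subprocess.run(
                [exe, "-q", "-j", archive, src],
                stdin=subprocess.DEVNULL,
                capture_output=True,
                timeout=30,
            )
        except (OSError, subprocess.TimeoutExpired):
            return False
        # Also verify the output format, so that a different program
        # answering to the name "zip" is not mistaken for the real one.
        return result.returncode == 0 and zipfile.is_zipfile(archive)


if __name__ == "__main__":
    print("zip usable:", can_create_zip())
```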
Yuriy V |
Apr 28 2011, 16:59
Post
#3
|
dreaming of a farm... Group: Trusted Members Posts: 179 Joined: 21-November 10 User No.: 1 545 Gender: Male Free-DC_CPID Machine fleet: Whatever comes along |
Hi, mickydl*. As SLinCA-Yuri said, try installing the appropriate package (e.g. unzip) from the repository of your Linux distribution. |
mickydl* |
Apr 28 2011, 21:40
Post
#4
|
Too shy to write anything Group: New Members Posts: 4 Joined: 26-April 11 User No.: 1 744 Gender: bot |
Hi Yuriy V and SLinCA-Yuri,

Thanks for your responses. I'm not sure that the ZIP problem is the real problem. After I noticed the computation errors and saw the error message in the WU I linked to in my first post, I checked for zip on my machines. I don't currently have zip installed, but I do have gzip. So, as a first attempt to cure the problem, I created a symbolic link named zip pointing to gzip in the search path. The error messages about zip not being found are gone now. However, I now get Signal 11 errors (here is an example). I don't yet know whether this happens at the end of a WU (and might still be related to the zip problem) or somewhere in the middle of a WU.

There might be different calling conventions for zip and gzip. What exactly do your applications need: zip and unzip? Which command-line options do they need? Maybe there is a way to solve this without having to install new software.

Simply installing the right package for my distribution is difficult because I'm not using any distribution. The Linux system is compiled completely from scratch (CLFS). So if there's no simple cure for the problem, I'd have to compile the right programs from source. Although compiling is not really a problem, setting everything up correctly on my machines is quite some work (a small diskless cluster booting over the network), so I'm somewhat reluctant to do it.

Regards, Michael |
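On the zip versus gzip question: apart from their command-line options, the two tools write different file formats, so a link named zip that points to gzip cannot yield a ZIP archive. A small sketch using only Python's standard library makes the difference visible; it says nothing about which options the SLinCA application actually passes, and the member name `result.txt` is invented.

```python
import gzip
import io
import zipfile

payload = b"result data\n"

# gzip: one compressed stream, with no member names and no directory.
gz_bytes = gzip.compress(payload, mtime=0)

# zip: an archive of named members plus a central directory at the end.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("result.txt", payload)
zip_bytes = buf.getvalue()

print("gzip magic:", gz_bytes[:2].hex())   # 1f8b
print("zip magic: ", zip_bytes[:4])        # PK\x03\x04 local file header
print("gzip output readable as zip?", zipfile.is_zipfile(io.BytesIO(gz_bytes)))
print("zip output readable as zip? ", zipfile.is_zipfile(io.BytesIO(zip_bytes)))
```

Any program that later opens the file expecting a ZIP archive would reject gzip output, whatever the file is called.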
SLinCA-Yuri |
Apr 29 2011, 08:56
Post
#5
|
SLinCA@Home Admin Group: Trusted Members Posts: 175 Joined: 26-January 11 User No.: 1 595 Gender: Male Free-DC_CPID |
Dear Michael and Yuriy V,

thanks for your care! Michael, you are right: it seems that you (and we) are facing the well-known problem described on the official BOINC forum in the threads "Process got signal 11" and "Process got signal 22", which is related to the following: "32-bit binaries don't just work on every 64-bit Linux. If for example you install a fresh Ubuntu 6.10 or 7.04, 32-bit binaries won't work". Thank you for your experiments!

Over the next few days (or a week) we will try to port the application to 64-bit Linux and fix this problem. In any case, we will try to compensate your efforts with quadruple credits for discovering this problem and its roots. In addition, we will try to compensate with double credits all other users of the NEWEST 64-bit Linux OSs who could not obtain credits for their work because of this problem.

Best regards, Yuri |
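Whether a given executable is a 32-bit or a 64-bit build can be read from its ELF header: the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. Here is a minimal sketch with made-up helper names. It only classifies the file; whether a 32-bit binary then actually runs on a 64-bit system also depends on the 32-bit loader and libraries being installed, which this does not check.

```python
import sys

ELF_MAGIC = b"\x7fELF"
# EI_CLASS values from the ELF specification: ELFCLASS32 = 1, ELFCLASS64 = 2
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header: bytes) -> str:
    """Classify a binary from the first five bytes of the file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not an ELF file"
    return ELF_CLASSES.get(header[4], "unknown ELF class")


def elf_class_of_file(path: str) -> str:
    """Read just enough of `path` to classify it."""
    with open(path, "rb") as fh:
        return elf_class(fh.read(5))


if __name__ == "__main__":
    # Usage: python elf_class.py <path-to-binary> [...]
    for path in sys.argv[1:]:
        print(path, "->", elf_class_of_file(path))
```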
mickydl* |
Apr 29 2011, 20:45
Post
#6
|
Too shy to write anything Group: New Members Posts: 4 Joined: 26-April 11 User No.: 1 744 Gender: bot |
Hi SLinCA-Yuri,
Thanks for the reply (and the credits). I'm glad I could help. I'll keep an eye on the project and run some WUs from time to time to see if anything is changing. Always glad to support another physics project. Michael |
mickydl* |
May 6 2011, 20:27
Post
#7
|
Too shy to write anything Group: New Members Posts: 4 Joined: 26-April 11 User No.: 1 744 Gender: bot |
Update:
I've been doing some more testing. I removed the link to gzip, then compiled zip from source and installed it into /usr/local/bin. BOINC and the science applications run as the user boinc, and the zip command was in the search path of the boinc user. The two test units that I ran (WU 1, WU 2) ended in a computation error with a "zip not found" error message.

Second test: I added a symbolic link /usr/bin/zip pointing to the newly installed /usr/local/bin/zip. Since then I have run more than ten WUs, and all have completed without error and validated.

The only explanations I can think of are:

1) Although the top command shows that the applications are executed as the user boinc, they are actually run as the user root. The difference between the two users is the configuration of the search path: root has only /usr/bin in its search path, not /usr/local/bin, while boinc has both. So if zip is installed into /usr/local/bin, boinc should find it and root should not. After adding the symbolic link in /usr/bin, root can find zip as well. However, I don't believe this is a likely explanation.

2) Your application has a hard-coded path to /usr/bin/zip in its call to the zip command, which requires zip to be present in /usr/bin. Maybe you could check that?

Anyway, it seems to work now.

Regards, Michael

Edit: It seems that I was too optimistic. I have a computation error again. However, it's the Signal 11 error instead of the zip error. (example) |
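The search-path effect described in this post can be reproduced in a throwaway directory tree. The sketch below is only an illustration under assumptions: the two search paths mimic the root and boinc settings Michael describes, the "zip" file is an empty stand-in script, and the function names are invented. It does not establish which explanation is correct; note also that a process takes its PATH from the environment of whatever launched it, which need not match the login-shell PATH of the user it runs as.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory: str, name: str) -> str:
    """Create a tiny executable script standing in for a real binary."""
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def path_lookup_demo() -> dict:
    """Show how the same command resolves under two different search paths."""
    with tempfile.TemporaryDirectory() as root:
        usr_bin = os.path.join(root, "usr", "bin")
        usr_local_bin = os.path.join(root, "usr", "local", "bin")
        os.makedirs(usr_bin)
        os.makedirs(usr_local_bin)
        real_zip = make_fake_tool(usr_local_bin, "zip")

        narrow_path = usr_bin                                  # only /usr/bin
        wide_path = os.pathsep.join([usr_local_bin, usr_bin])  # both directories

        narrow_before = shutil.which("zip", path=narrow_path) is not None
        wide_before = shutil.which("zip", path=wide_path) is not None

        # The workaround from the post: link /usr/bin/zip to the real binary.
        os.symlink(real_zip, os.path.join(usr_bin, "zip"))
        narrow_after = shutil.which("zip", path=narrow_path) is not None

        return {
            "narrow_before": narrow_before,
            "wide_before": wide_before,
            "narrow_after": narrow_after,
        }


if __name__ == "__main__":
    for key, found in path_lookup_demo().items():
        print(f"{key}: {'found' if found else 'not found'}")
```

To see the search path a running process really has, one can inspect its environment (for example via /proc/<pid>/environ on Linux) rather than the user's shell configuration.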
SLinCA-Yuri |
May 18 2011, 08:11
Post
#8
|
SLinCA@Home Admin Group: Trusted Members Posts: 175 Joined: 26-January 11 User No.: 1 595 Gender: Male Free-DC_CPID |
Dear mickydl, thank you very much for your persistence! You are right, and your analysis is VERY helpful for us. Currently we are very busy and cannot devote enough attention to this problem. Moreover, all tasks work well on our Linux PCs and on many distributed Linux clusters with quite standard configurations similar to ours, so we could not catch this problem before your clever observation. So thank you again!!!

We will try to fix this after finishing the current stage of work. Our schedule is very tight and, sadly, we cannot redesign the application and its configuration right now, but we will do so in 1-2 weeks.

By the way, about once a week we try to recalculate DOUBLE credits for finished and invalid tasks, and we will try to award special (pleasant) surprises to those who manage to solve the most crucial problems in SLinCA operation and teach others. x3mEn and YOU are already on this list. Good luck! |
AMDave |
Oct 6 2012, 01:54
Post
#9
|
Too shy to write anything Group: New Members Posts: 12 Joined: 7-August 11 User No.: 1 916 Gender: bot Free-DC_CPID |
Since the client app update yesterday from 32.34 to 32.38, I am getting signal 11 on all Linux WUs (x86_64) after many hours of computing. Please fix.

The same zip issue is repeating on Win_x86_64 with client 32.38, as happened before: http://dg.imp.kiev.ua/slinca/result.php?resultid=651457, again after many hours of computing. Please fix.

The same client errors again as a few months ago? Very frustrating. |