Linux Problems |
mickydl* |
Apr 27 2011, 21:59
Post
#1
|
Too shy to post Group: New Members Posts: 4 Joined: 26-April 11 Member No.: 1 744 Gender: bot |
Hello everybody,
I have been trying to get some work done on my Linux machines, but so far all of my WUs have ended in an error. One of the problems seems to be related to zip ("Program not found"; this is one of the results). After I added a symbolic link to gzip, the errors changed to Signal 11 errors.

Has anyone successfully completed any work on Linux? BTW: I'm running 64-bit Linux only.

Regards, Michael |
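Editor's note: gzip and zip write different formats, so a symbolic link named zip that points at gzip is unlikely to behave like a real zip. The minimal Python sketch below only illustrates that format difference, using the standard `gzip` and `zipfile` modules. The payload and the member name `result.txt` are hypothetical, and nothing here establishes what the SLinCA application does with its archives or why Signal 11 occurred.

```python
import gzip
import io
import zipfile


def gzip_bytes(payload: bytes) -> bytes:
    """Compress a single byte stream the way gzip does."""
    return gzip.compress(payload)


def zip_bytes(member_name: str, payload: bytes) -> bytes:
    """Build an in-memory zip archive holding one named member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, payload)
    return buffer.getvalue()


# Hypothetical data standing in for a work-unit result file.
payload = b"hypothetical work-unit output\n"
gz = gzip_bytes(payload)
zp = zip_bytes("result.txt", payload)

# The two outputs start with different signatures, and only one of them
# is recognised as a zip archive.
print("gzip starts with:", gz[:2].hex())
print("zip starts with: ", zp[:4])
print("gzip output is a zip archive:", zipfile.is_zipfile(io.BytesIO(gz)))
print("zip output is a zip archive: ", zipfile.is_zipfile(io.BytesIO(zp)))
```

A program that expects a zip archive would therefore not find one in gzip output, whatever the link is called.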
mickydl* |
May 6 2011, 20:27
Post
#2
|
Too shy to post Group: New Members Posts: 4 Joined: 26-April 11 Member No.: 1 744 Gender: bot |
Update:
I've been doing some more testing.

First test: I removed the link to gzip, then compiled and installed zip from source (into /usr/local/bin). BOINC and the science applications run as the user boinc, and the zip command was in the search path of the boinc user. The two test units that I ran (WU 1, WU 2) ended in a computation error with a "zip not found" error message.

Second test: I added a symbolic link /usr/bin/zip pointing to the newly installed /usr/local/bin/zip. Since then I have run more than ten WUs, and all have completed without error and validated.

The only explanations I can think of are:

1) Although the top command shows that the applications are executed as the user boinc, they are actually run as the user root. The difference between the two users (root and boinc) is the configuration of the search path. The root user has only /usr/bin in its search path, not /usr/local/bin; the boinc user has both directories. So, if the zip command is installed into /usr/local/bin, the user boinc should find it and the user root should not. After adding the symbolic link in /usr/bin, the root user can find the zip command as well. However, I don't believe this is a likely explanation.

2) Your application has a hard-coded path to /usr/bin/zip in its call to the zip command, which requires zip to be present in /usr/bin. Maybe you could check that?

Anyway, it seems to work now.

Regards, Michael

Edit: It seems that I was too optimistic. I have a computation error again. However, it's the Signal 11 error instead of the zip error. (example) |
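Editor's note: the first explanation above rests on how a command name is resolved against a search path. The sketch below reproduces that lookup with Python's `shutil.which` inside a throwaway directory tree, so nothing on the real system is touched. The directory layout, the stand-in zip script, and the two search-path lists are assumptions that mirror the description in the post; they are not taken from an actual BOINC installation.

```python
import os
import shutil
import tempfile


def resolve(command, search_dirs):
    """Return the path a search-path lookup would pick, or None if not found."""
    return shutil.which(command, path=os.pathsep.join(search_dirs))


def demo():
    """Look up 'zip' before and after adding the symbolic link."""
    with tempfile.TemporaryDirectory() as root:
        # Stand-ins for /usr/bin and /usr/local/bin inside a temp directory.
        usr_bin = os.path.join(root, "usr", "bin")
        usr_local_bin = os.path.join(root, "usr", "local", "bin")
        os.makedirs(usr_bin)
        os.makedirs(usr_local_bin)

        # Stand-in for the zip binary compiled into /usr/local/bin.
        real_zip = os.path.join(usr_local_bin, "zip")
        with open(real_zip, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(real_zip, 0o755)

        # Assumed search paths, following the post: boinc has both
        # directories, root has only /usr/bin.
        boinc_dirs = [usr_local_bin, usr_bin]
        root_dirs = [usr_bin]

        before = (resolve("zip", boinc_dirs), resolve("zip", root_dirs))
        # The workaround from the second test: link /usr/bin/zip to the real zip.
        os.symlink(real_zip, os.path.join(usr_bin, "zip"))
        after = (resolve("zip", boinc_dirs), resolve("zip", root_dirs))
        return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before link (boinc, root):", before)
    print("after link  (boinc, root):", after)
```

Before the link exists, only the lookup that includes /usr/local/bin finds zip; afterwards both do. A hard-coded /usr/bin/zip in the application would show the same before and after behaviour, so this lookup alone cannot tell the two explanations apart.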
SLinCA-Yuri |
May 18 2011, 08:11
Post
#3
|
SLinCA@Home Admin Group: Trusted Members Posts: 175 Joined: 26-January 11 Member No.: 1 595 Gender: Male Free-DC_CPID |
Dear mickydl, thank you very much for your persistence! You are right, and your analysis is VERY helpful for us. Currently we are very busy and cannot devote enough attention to this problem. Moreover, all tasks work well on our Linux PCs and on many distributed Linux clusters with quite standard configurations similar to ours, so we could not catch this problem before your clever observation. So thank you again!

We will try to fix this after finishing the current stage of work. Our schedule is very tight and, sadly, we cannot redesign the application and its configuration right now, but we will do so in 1-2 weeks.

By the way, about once a week we try to recalculate DOUBLE credits for finished and invalid tasks, and we will try to award special (pleasant) surprises to those who manage to solve the most crucial problems in SLinCA operation and teach others. x3mEn and you are already on this list. Good luck! |
AMDave |
Oct 6 2012, 01:54
Post
#4
|
Too shy to post Group: New Members Posts: 12 Joined: 7-August 11 Member No.: 1 916 Gender: bot Free-DC_CPID |
Since the client app update yesterday from 32.34 to 32.38, I am getting Signal 11 on all Linux WUs (x86_64) after many hours of computing. Please fix.

The same zip issue is repeating on Win_x86_64 with client 32.38, as happened before: http://dg.imp.kiev.ua/slinca/result.php?resultid=651457. Again, this is after many hours of computing. Please fix.

The same client errors again as a few months ago? Very frustrating. |
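Editor's note: on Linux, signal number 11 is SIGSEGV (a segmentation fault), which means the process touched memory it was not allowed to access. The minimal Python sketch below shows how such a termination appears to a parent process, using only the standard `signal` and `subprocess` modules. The helper names `describe_exit` and `run_child_that_segfaults` are hypothetical, the child simply sends SIGSEGV to itself, and none of this shows how the BOINC client or the SLinCA application detects or reports the failure.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {returncode}"


def run_child_that_segfaults():
    """Start a child process that sends SIGSEGV to itself; return its code."""
    child_code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    finished = subprocess.run(
        [sys.executable, "-c", child_code],
        stderr=subprocess.DEVNULL,
    )
    return finished.returncode


if __name__ == "__main__":
    print(describe_exit(run_child_that_segfaults()))
```

The signal number alone does not say which part of a program crashed; a core dump or debugger backtrace from the failing application would be needed for that.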