Message boards : Number crunching : Error while computing (Ubuntu 18.04 LTS, Boinc version 7.9.3)
Author | Message |
---|---|
sergioclr Joined: 16 Jan 13 Posts: 3 Credit: 12,757 RAC: 0 |
"Error while computing" on 4 tasks. Please help. Thanks in advance. Computer Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz [Family 6 Model 23 Stepping 10] AMD AMD TURKS (DRM 2.50.0 / 4.15.0-33-generic, LLVM 6.0.0) (2048MB) OpenCL: 1.1 Ubuntu 18.04.1 LTS [4.15.0-33-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)] BOINC version 7.9.3 Tasks that failed (all of them) PF16401.4_nojmps_aivan_SAVE_ALL_OUT_03_09_686649_6570_1 Stderr output <core_client_version>7.9.3</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63)</message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.07_x86_64-pc-linux-gnu @PF16401.4.nojmps.flags -in:file:boinc_wu_zip PF16401.4.nojmps.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1171180 rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed. SIGABRT: abort called Stack trace (17 frames): [0x5efead0] [0x5ffe380] [0x607e517] [0x60083a8] [0x6002794] [0x60027ee] [0x6000f73] [0x6001996] [0x60007df] [0x600020e] [0x5f1d10e] [0x5f1d73e] [0x5f1707a] [0x5f17202] [0x412631] [0x5fff8cc] [0x610b97] Exiting... 
</stderr_txt> ]]> PF14092.5_jmps_aivan_SAVE_ALL_OUT_03_09_686650_38888_0 Stderr output <core_client_version>7.9.3</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63)</message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.07_x86_64-pc-linux-gnu @PF14092.5.jmps.flags -in:file:boinc_wu_zip PF14092.5.jmps.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3961113 rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed. SIGABRT: abort called Stack trace (17 frames): [0x5efead0] [0x5ffe380] [0x607e517] [0x60083a8] [0x6002794] [0x60027ee] [0x6000f73] [0x6001996] [0x60007df] [0x600020e] [0x5f1d10e] [0x5f1d73e] [0x5f1707a] [0x5f17202] [0x412631] [0x5fff8cc] [0x610b97] Exiting... </stderr_txt> ]]> PF11937.7_jmps_aivan_SAVE_ALL_OUT_03_09_686692_2863_0 Stderr output <core_client_version>7.9.3</core_client_version> <![CDATA[ <message> couldn't start app: Input file minirosetta_database_1a38360_n_methyl.zip missing or invalid: RSA key check failed for file</message> ]]> PF06834.10_jmps_aivan_SAVE_ALL_OUT_03_09_686470_11068_0 Stderr output <core_client_version>7.9.3</core_client_version> <![CDATA[ <message> app_version download error: couldn't get input files: <file_xfer_error> <file_name>minirosetta_database_1a38360_n_methyl.zip</file_name> <error_code>-120 (RSA key check failed for file)</error_code> <error_message>signature verification failed</error_message>Tasks that failed (all of them) </file_xfer_error> </message> ]]> |
rjs5 Joined: 22 Nov 10 Posts: 273 Credit: 23,054,272 RAC: 8,196 |
The key information is your Ubuntu version AND the error message: "rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed." Search for "glibc" on the message boards and change the locale settings for the BOINC user. from: Trotador Rosetta 4.07 was linked statically. Ubuntu 18.04 migrated to the next version of GLIBC, 2.27. They made some changes around the "locale" handling that introduce an error. The solution is to set the "locale" setting properly for the BOINC user. You can search the forum for "glibc" and find a number of discussions. I would REALLY like the Rosetta developers to publish the preferred instructions. Instead of changing language options globally, I would suggest limiting changes to only what is needed, in this case the BOINC client. For those using the repository BOINC package on a systemd distro, you can edit the boinc-client.service file or add an override to the service. The override would contain two lines: "[Service]" followed by "Environment=LC_ALL=C". LC_ALL overrides all the other language settings. Put the override file in /etc/systemd/system/boinc-client.service.d/locale.override.conf and restart the client with "sudo systemctl restart boinc-client". If you change the distro-supplied service file instead, find boinc-client.service and add the Environment line in the Service section. Changes to that file will be overwritten any time the package is updated. For those not using the distro package or not using systemd, make a similar change to whatever startup script you use for the client. I installed BOINC with a .sh file so it is completely within the BOINC folder, what would I have to modify to circumvent the locale error? =============================== This was posted by henfredemars in the thread: 
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242 2) Message boards : Number crunching : Rosetta 4.0+ (Message 89466) Posted 11 days ago by henfredemars Post: It took me hours to find a fix for this! I am so glad that others have found this problem and found a solution. Setting the locale to C using systemd's service file worked perfectly. Please don't statically link to glibc. That's just a bad idea. Hint: Ubuntu users, you can use "systemctl show boinc-client.service | grep Path" to find the service file. |
sergioclr Joined: 16 Jan 13 Posts: 3 Credit: 12,757 RAC: 0 |
Thank you for your prompt answer. I edited /lib/systemd/system/boinc-client.service to include the Environment parameter: [Unit] Description=Berkeley Open Infrastructure Network Computing Client Documentation=man:boinc(1) After=network-online.target [Service] Environment=LC_ALL=C ProtectHome=true Type=simple Nice=10 User=boinc WorkingDirectory=/var/lib/boinc ExecStart=/usr/bin/boinc ExecStop=/usr/bin/boinccmd --quit ExecReload=/usr/bin/boinccmd --read_cc_config ExecStopPost=/bin/rm -f lockfile IOSchedulingClass=idle . . I stopped the BOINC client, rebooted the computer, and requested 2 new tasks to check whether the problem (Error while computing) still occurs once the tasks finish. Status: waiting for the tasks to finish execution. |
sergioclr Joined: 16 Jan 13 Posts: 3 Credit: 12,757 RAC: 0 |
So far, 2 tasks have been successfully processed and validated after I added the Environment parameter described in my last update. Tasks: fadh_9_11_db_gre_1xlattparam_1_rpce1_atpacstincr-run_0_0_X_h17_l3_h16_l3-start_0006__9.9_10_fixedLoop_0003_design_m8_fragments_abinitio_SAVE_ALL_OUT_687026_180_0 and foldit_2005364_0001_fold_and_dock_SAVE_ALL_OUT_686932_487_1. Question: did someone with the same problem (Error while computing) apply the same solution? I appreciate any answers or comments. Tks, Sergio. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Question: did someone, with the same problem (Error while computing), apply the same solution?" Sure. I (and everybody else) have done it ever since rjs5/Juha came up with that solution. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242&postid=88951#88951 It is strange that we have to do it, but apparently curing it now would introduce other problems. Rosetta is in a bit of a hole. |
rjs5 Joined: 22 Nov 10 Posts: 273 Credit: 23,054,272 RAC: 8,196 |
"Question: did someone, with the same problem (Error while computing), apply the same solution?" What will make your head explode is that .... ALL the Rosetta developers have to do to make their code work is to add ONE library call at the beginning of execution that sets the LC_ALL parameter. If they did that, every OLD and NEW GLIBC distribution would work automatically. They are paying the cost of NOT fixing it through the system cost of sending these jobs and managing the failure results. Crunchers are paying for the network bandwidth, disk space .... The one-line code change to eliminate the problem? putenv("LC_ALL=C"); The putenv call will likely work and make the Rosetta execution more robust and work in both environments. Of course, there are other ways, but this one call should change operation only for that one WU, so nothing else on the system is changed. Others will correct me if I am wrong, BUT they will have a better solution. 8-) |
Paul Joined: 29 Oct 05 Posts: 193 Credit: 66,366,511 RAC: 8,447 |
I hope someone can fix this soon. I tried to implement the fix listed above but my systems don’t have that directory structure. I have /etc/system/systemd but I don’t have the boinc-client.system.d folder. If the developers can fix this with 1 line of code I hope they will do it soon. I have several big systems that refuse to process 4.07 jobs because they have the 2.27 version of glibc. Thx! Paul |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 42 |
I've had three work units fail today, exit code 1, Windows 8.1 BOINC 7.12.1, all after just a few seconds. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
rjs5 Joined: 22 Nov 10 Posts: 273 Credit: 23,054,272 RAC: 8,196 |
"I hope someone can fix this soon. I tried to implement the fix listed above but my systems don’t have that directory structure. I have /etc/system/systemd but I don’t have the boinc-client.system.d folder." The one-line fix I gave them not only appears to fix the problem that glibc 2.27 users see today, BUT should also prevent the reverse when Rosetta starts building on 18.04. The Rosetta systems are all on Ubuntu 16.04 today. The file on my VirtualBox image of Ubuntu 18.04 and on my Fedora 27 hardware is located one level deeper, in the "multi-user.target.wants" directory. It seems the systemctl command automatically leaves that one "multi-user" directory out of the path. Running "find /etc | grep boinc-client.service" prints /etc/systemd/system/multi-user.target.wants/boinc-client.service. If you get a clean, simple set of instructions that work for you, it would be nice to get them documented in a clean, new thread. You can message me if you think I can help. 1. I isolated the problem to a static link with glibc 2.26 when running on a system with glibc 2.27. 2. My first sledge-hammer solution was to set LC_ALL=C system-wide, which causes problems with some applications like "terminal". 3. The solution was refined to remove my system-wide sledge-hammer patch and to locate the specific boinc user file and modify it instead. |
Paul Joined: 29 Oct 05 Posts: 193 Credit: 66,366,511 RAC: 8,447 |
That fixed it! I have several successful 4.07 jobs completed & validated. Thx Thx! Paul |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1994 Credit: 9,623,704 RAC: 9,591 |
Something new (from Ralph): Rosetta beta 64 bit linux version 4.08 released for testing. I hope they also use rjs5's other suggestion to improve the code... Well done, rjs5!! |
©2024 University of Washington
https://www.bakerlab.org