Workunits failure after upgrading Debian to 11 (bullseye)
Author | Message |
---|---|
Joined: 19 Jun 20 Posts: 10 Credit: 2,792,400 RAC: 0 |
I am a Debian user with three computers (two PCs and a laptop), and all three used to run tasks successfully. I recently upgraded Debian from 10 (buster) to 11 (bullseye). Since the upgrade, the BOINC client and my other projects (Rosetta and Einstein) work as expected, but QuChemPedIA does not. I tried detaching and reattaching the project, but the issue persists. Now, the details. This is my laptop. All the tasks come back as "invalid", and none of them uses more than 1 s of CPU time. I will mention some examples. I could not see an easy way to fix it in the stderr logs of the tasks: |
Joined: 5 Sep 20 Posts: 103 Credit: 2,142,600 RAC: 0 |
Your Linux kernel has gone from 4.19 to 5.10. Evidently the Linux executable has not been updated. Tullio |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
I have been running Ubuntu 20.04.3 LTS [5.4.0-81-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.2)] for some time with no problems. https://quchempedia.univ-angers.fr/athome/results.php?hostid=8557&offset=0&show_names=0&state=4&appid= |
Joined: 19 Jun 20 Posts: 10 Credit: 2,792,400 RAC: 0 |
Interesting, Jim1348. I do not know the reason, Tullio, and the Debian upgrade brought other relevant changes too. Damotbe said that the project's generator is EvoMol, which was developed on Ubuntu 18.04+ and is written in Python. Several packages built with Python 2 were removed in Debian 11 (bullseye): «Python 2 is already beyond its End Of Life, and will receive no security updates. It is not supported for running applications, and packages relying on it have either been switched to Python 3 or removed. However, Debian bullseye does still include a version of Python 2.7, as well as a small number of Python 2 build tools such as python-setuptools. These are present only because they are required for a few application build processes that have not yet been converted to Python 3.» I am sure this incident will be solved soon, because 15 of the 20 top computers by average credit are running Debian 10 (buster). |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
«All the tasks have "invalid" as output, and all of them last no more than 1s of CPU time.» Now I am seeing the same thing. I just added a new machine, a Ryzen 3700X on Ubuntu 20.04.3 that I just updated to all the latest stuff. https://quchempedia.univ-angers.fr/athome/results.php?hostid=8719&offset=0&show_names=0&state=5&appid= And the last seven refuse to run at all. I will reboot, but I will try another project until this one is fixed. It may be something to do with the libraries, but that is beyond me. |
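One way to test the library hypothesis is to ask the dynamic loader which shared libraries a binary cannot resolve. The following is only a minimal sketch: `missing_libs` is a hypothetical helper name, and the location of the project executable is left as a placeholder because it depends on where your BOINC client keeps its data directory.

```shell
#!/bin/sh
# missing_libs BINARY: print every shared library that ldd cannot resolve.
# (Hypothetical helper, not part of BOINC or the project.)
missing_libs() {
    # ldd prints "libfoo.so => not found" for each unresolved dependency.
    # "|| true" keeps the function from failing when nothing is missing.
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Placeholder path: adjust to your own BOINC data directory and executable.
# missing_libs /path/to/boinc/projects/<project-dir>/<executable>
```

An empty result only means that every library was found. A library that is present but has an incompatible version would not show up here.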
Joined: 19 Jun 20 Posts: 10 Credit: 2,792,400 RAC: 0 |
Jim1348, your new machine has the same Ubuntu version but a newer Linux kernel: 5.11.0-27. It could be the kernel version, as Tullio said... And, very importantly, the issue affects several Linux distributions, not only Debian. |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
«It could be the kernel version, as Tullio said...» Yes, certainly. Then the change would lie somewhere between 5.4 and 5.10. Maybe someone can narrow it further. That would help the project find it. |
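For anyone who wants to report where their host falls relative to that range, here is a small sketch. It assumes GNU `sort -V` is available and that the kernel really is the deciding factor, which this thread has not established. The function names and the three labels are made up for illustration.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# classify_kernel MAJOR.MINOR: place a kernel relative to the 5.4 to 5.10 range.
classify_kernel() {
    if version_ge 5.4 "$1"; then
        echo "at-or-below-5.4"      # reported working in this thread
    elif version_ge "$1" 5.10; then
        echo "at-or-above-5.10"     # reported failing in this thread
    else
        echo "in-between"           # a result from this host would narrow the range
    fi
}

# Keep only major.minor, e.g. "5.11.0-27-generic" becomes "5.11".
classify_kernel "$(uname -r | cut -d. -f1,2)"
```

A host that lands on "in-between" is the most useful one to report, since it would shrink the range either way.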
Joined: 5 Sep 20 Posts: 103 Credit: 2,142,600 RAC: 0 |
As a side note, I am running SuSE Tumbleweed with a 5.13.13 kernel on Einstein@home and the tasks all complete. For QuChem I am using Windows 10, since I saw that it is faster with VirtualBox than most Linux wingmen. Tullio |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
But it isn't just a question of the OS version. I completed a work unit normally on an i9-10900F running Ubuntu 20.04.3 with the 5.11.0 kernel. https://quchempedia.univ-angers.fr/athome/result.php?resultid=7650634 It ran the full time and came back inconclusive, but all the others who have tried it so far have produced validate errors with short run times. (They were all Intel CPUs too.) So the CPU type may be another factor that matters for success now, or maybe it is something else. |
Joined: 19 Jun 20 Posts: 10 Credit: 2,792,400 RAC: 0 |
It would be nice to know whether your computer with the i9-10900F CPU can complete a task successfully. I saw the following message in the LHC forum: (...) After upgrading this machine to Linux kernel 5.4.0-58 (from 5.4.0-52) I started getting failures, so I had to abort them. |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
There is something strange about that upgrade. It is causing problems for me on QuChemPedIA also on two machines, a Ryzen 3900X and the i7-9700. I have run several other projects on it without problem. At present, it is on TN-Grid, where it has been running the same on the new kernel as before. As for LHC, that was my post. It turned out not to be the Linux kernel, but the BOINC version. I posted on it here. https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=114&postid=1389#1389 That problem was due to using BOINC from the Locutus of Borg repository, and isn't the problem here, since I no longer use that one. But there could be some other incompatibility of BOINC with something new in the libraries. That is possible. |
Joined: 5 Sep 20 Posts: 103 Credit: 2,142,600 RAC: 0 |
On my SuSE Tumbleweed Linux I have BOINC 7.18.0, with the warning that it is a development version and may not work. But it works. Tullio |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
On my Ryzen 3700X, I tried upgrading BOINC from 7.16.11 (from Ubuntu Software) to 7.16.17 (from Locutus-of-Borg). https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc Surprisingly, it worked. I am now running normally again. https://quchempedia.univ-angers.fr/athome/results.php?hostid=8719&offset=0&show_names=0&state=4&appid= YMMV. |
Joined: 23 Jul 19 Posts: 289 Credit: 464,119,561 RAC: 0 |
I'm completely swamped, and I don't have an intern or an engineer to work on this. The person who compiled NWChem for the project has left, and I'm not able to update the executable. The straightforward workaround is to run an older Linux (in a VM) to compute for the project; otherwise it may still work if you are lucky.... :( :( :( |
Joined: 3 Oct 19 Posts: 153 Credit: 32,412,973 RAC: 0 |
It worked for me with Ubuntu 20.04.3 and BOINC 7.16.11 on one machine. On another machine it did not work on that OS until I upgraded BOINC to 7.16.17. Some combinations work on some machines and others don't. |
Joined: 5 Sep 20 Posts: 103 Credit: 2,142,600 RAC: 0 |
On my Windows 10 machine it uses a Linux virtual machine labelled "other Linux - 64 bit". I don't know which Linux this is. Tullio. |
Joined: 23 Jul 19 Posts: 289 Credit: 464,119,561 RAC: 0 |
You can find the Linux kernel version with the command `uname -a` and the OS with `cat /etc/os-release`. |
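Since the reports in this thread differ by kernel, distribution, libc and BOINC version, it may help to gather all four in one go. This is a rough sketch rather than an official tool: `report_env` is a made-up name, `getconf GNU_LIBC_VERSION` only answers on glibc-based systems, and the script assumes the client binary is called `boinc` and is on your PATH.

```shell
#!/bin/sh
# report_env: print the details most often asked for in this thread, one per line.
report_env() {
    echo "kernel: $(uname -r)"
    # PRETTY_NAME comes from os-release; the file may be absent on old systems.
    if [ -r /etc/os-release ]; then
        echo "os: $(. /etc/os-release && echo "$PRETTY_NAME")"
    fi
    # Only glibc-based systems answer this query; fall back to "unknown".
    echo "libc: $(getconf GNU_LIBC_VERSION 2>/dev/null || echo unknown)"
    # Assumption: the BOINC client binary is named "boinc" and is on PATH.
    if command -v boinc >/dev/null 2>&1; then
        echo "boinc: $(boinc --version)"
    else
        echo "boinc: not found on PATH"
    fi
}

report_env
```

Pasting this output next to a link to a failed result makes it easier to compare hosts that work with hosts that do not.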
Joined: 19 Jun 20 Posts: 10 Credit: 2,792,400 RAC: 0 |
I finally created a virtual machine in VirtualBox running Debian 9 (Linux 4.9.0-16-amd64, BOINC 7.6.33), and I successfully ran a test task (200 credit). I am sure Debian 10 will work too, since that is what I was using before. Virtualization is a valid workaround. When using virtualization, though, I prefer to run projects that were designed with it in mind from the start, such as Cosmology. I will keep an eye on further developments of QuChemPedIA, because it is a good idea. I love to support basic research, and the quantum field is uncharted territory. I hope the project will find the resources to keep it alive. Thank you all for your contributions to this thread, and thanks to the QuChemPedIA team for this amazing project. I wish you the best. |
Joined: 2 Jan 22 Posts: 10 Credit: 665,400 RAC: 0 |
So, the whole "It's because Debian has a newer kernel" thing doesn't quite pan out, I think, because my Arch Linux boxes run work units just fine, but my Debian (Proxmox) box doesn't. I dunno. I don't know anything about the build environment. Doesn't work: `uname -a` gives `Linux serverofpie 5.13.19-1-pve #1 SMP PVE 5.13.19-3 (Tue, 23 Nov 2021 13:31:19 +0100) x86_64 GNU/Linux`. Works: `uname -a` gives `Linux HolyPie 5.15.10-arch1-1 #1 SMP PREEMPT Fri, 17 Dec 2021 11:17:37 +0000 x86_64 GNU/Linux`. Both have python2 and python3 installed. |
Joined: 2 Jan 22 Posts: 10 Credit: 665,400 RAC: 0 |
Never mind, neither works; they're just failing in different ways. |
©2024 Benoit DA MOTA - LERIA, University of Angers, France