Posts by damotbe

41) Message boards : Number crunching : Host ID 1388 corrupted (Message 1299)
Posted 29 Dec 2020 by damotbe
Post:
I think the problem is solved, but I can't check it since my machine is not allowed now.
It was not the Linux kernel version, but BOINC 7.16.14, which I obtained from PPA costamagnagianfranco.

I fixed it with the expert advice from Gunde on the LHC forum, where I could not run CMS (a VirtualBox project).
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5560
Basically, if you have that PPA version, go to "Software & Updates", uncheck the line for that PPA, then run:
sudo apt remove boinc-client boinc-manager
sudo apt install boinc-client boinc-manager

That will install BOINC from the apt repository.
In the case of Ubuntu 20.04.1, it will be BOINC 7.16.6.
In the case of Ubuntu 18.04.5, it will be BOINC 7.9.3.
Either should work with QuChemPedIA, I expect.


This solution seems to work!
If you have a blacklisted host and want to test the fix, ask me to whitelist it.
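
For anyone who wants to check a host before asking for a whitelist, here is a minimal Python sketch that classifies a BOINC client version string against what was reported in this thread. It is not part of the project's tooling: the names `parse_version` and `classify` are hypothetical, and the only facts used are that 7.16.14 (from the PPA) was reported as the culprit, while 7.16.6 and 7.9.3 (from the Ubuntu repositories) are expected, not confirmed, to work. Any other version is reported as unknown rather than guessed.

```python
# Hypothetical helper, based only on the versions mentioned in this thread.
KNOWN_BAD = {(7, 16, 14)}               # PPA build reported to cause validate errors
EXPECTED_OK = {(7, 16, 6), (7, 9, 3)}   # Ubuntu repository builds, expected to work


def parse_version(text):
    """Turn a string such as '7.16.14' into the tuple (7, 16, 14)."""
    return tuple(int(part) for part in text.strip().split("."))


def classify(text):
    """Return 'suspect', 'expected ok', or 'unknown' for a version string."""
    version = parse_version(text)
    if version in KNOWN_BAD:
        return "suspect"
    if version in EXPECTED_OK:
        return "expected ok"
    return "unknown"


if __name__ == "__main__":
    for candidate in ("7.16.14", "7.16.6", "7.9.3", "7.16.11"):
        print(candidate, "->", classify(candidate))
```

The sets are kept explicit instead of using a "newer than" comparison, because the thread only identifies individual builds and says nothing about versions in between.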
42) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1298)
Posted 29 Dec 2020 by damotbe
Post:
I think the problem is solved, but I can't check it since my machine is not allowed now.
It was not the Linux kernel version, but BOINC 7.16.14, which I obtained from PPA costamagnagianfranco.

I fixed it with the expert advice from Gunde on the LHC forum, where I could not run CMS (a VirtualBox project).
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5560
Basically, if you have that PPA version, go to "Software & Updates", uncheck the line for that PPA, then run:
sudo apt remove boinc-client boinc-manager
sudo apt install boinc-client boinc-manager

That will install BOINC from the apt repository.
In the case of Ubuntu 20.04.1, it will be BOINC 7.16.6.
In the case of Ubuntu 18.04.5, it will be BOINC 7.9.3.
Either should work with QuChemPedIA, I expect.


This solution seems to work!
If you have a blacklisted host and want to test the fix, ask me to whitelist it.
43) Message boards : Number crunching : Host ID 1388 corrupted (Message 1297)
Posted 29 Dec 2020 by damotbe
Post:
Done...
Thank you!
44) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1284)
Posted 23 Dec 2020 by damotbe
Post:
Host 5132 is whitelisted! Be careful not to produce too many errors.
45) Message boards : Number crunching : Host ID 1388 corrupted (Message 1283)
Posted 23 Dec 2020 by damotbe
Post:
No, fortunately there is no scientific problem. But some volunteers have lost a lot of computing time, and I will have to intervene manually to get the lost work recalculated.
46) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1280)
Posted 22 Dec 2020 by damotbe
Post:
I think the problem is solved, but I can't check it since my machine is not allowed now.
It was not the Linux kernel version, but BOINC 7.16.14, which I obtained from PPA costamagnagianfranco.

I fixed it with the expert advice from Gunde on the LHC forum, where I could not run CMS (a VirtualBox project).
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5560
Basically, if you have that PPA version, go to "Software & Updates", uncheck the line for that PPA, then run:
sudo apt remove boinc-client boinc-manager
sudo apt install boinc-client boinc-manager

That will install BOINC from the apt repository.
In the case of Ubuntu 20.04.1, it will be BOINC 7.16.6.
In the case of Ubuntu 18.04.5, it will be BOINC 7.9.3.
Either should work with QuChemPedIA, I expect.


I can whitelist your host if you want to test. Just give me the host ID.
47) Message boards : Number crunching : Validate error. (Message 1276)
Posted 20 Dec 2020 by damotbe
Post:
Sorry, there is a big issue with the Linux app, and a user just "destroyed" 300k workunits... It goes very fast because tasks end after only 3-10 seconds! It makes me lose a lot of time, and many calculations are lost.

For your information, all errors on all OS are treated identically.

Thank you for your help.
Best regards,
Benoit
48) Message boards : Number crunching : Host ID 1388 corrupted (Message 1274)
Posted 20 Dec 2020 by damotbe
Post:
You’re upset? I’m livid. I’m user 111 and I wasted a bunch of electricity on your project. I had no idea your project is broken on newer Linux kernels. Don’t worry, though, I’ll delete your project from all my machines and won’t crunch it again. There are plenty of other projects that work just fine on newer kernels.



I'm not upset with you.

I'm upset about this new bug. I no longer have an intern or an engineer at the moment. Plenty of molecules were lost because your machines emptied half of the stock. It's exhausting!
49) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1269)
Posted 19 Dec 2020 by damotbe
Post:
I am going to update the linux kernel on this machine and see what happens.

Not surprisingly, after updating to the 5.4.0-58 Linux kernel, all the remaining twelve work units ended in "Validate error" after running only a few seconds.
https://quchempedia.univ-angers.fr/athome/results.php?hostid=3052&offset=0&show_names=0&state=5&appid=

So there is the smoking gun. It is something that changed between 5.4.0-52 and 5.4.0-58.
(Or else between BOINC 7.16.11 and 7.16.14).
Maybe that will help.


It helps a lot!
At least we have something.
I have no idea how to solve this at the moment.
50) Message boards : Number crunching : Host ID 1388 corrupted (Message 1268)
Posted 19 Dec 2020 by damotbe
Post:
All hosts from user 111 blacklisted.

So many wasted calculations! I'm upset!
51) Message boards : Number crunching : Host ID 1388 corrupted (Message 1240)
Posted 17 Dec 2020 by damotbe
Post:
Here are all the blacklisted Linux hosts. If you have an idea, you are welcome to share it!

MariaDB [boinc]> select id, os_name, os_version  from host where max_results_day=-1 and os_name regexp "Linux.*";
+------+------------------+----------------------------------------------------------------------------------+
| id   | os_name          | os_version                                                                       |
+------+------------------+----------------------------------------------------------------------------------+
|  365 | Linux Debian     | Debian GNU/Linux bullseye/sid [5.9.0-3-amd64|libc 2.31 (Debian GLIBC 2.31-4)]    |
| 1003 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
| 1006 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
| 1007 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
| 1008 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
| 1388 | Linux Ubuntu     | Ubuntu 18.04.5 LTS [4.15.0-112-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1.2)] |
| 1912 | Linux Ubuntu     | Ubuntu 18.04.5 LTS [4.15.0-122-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1.2)] |
| 1967 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
| 3664 | Linux Ubuntu     | Ubuntu 20.04 LTS [5.3.0-59-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]       |
| 3685 | Linux Ubuntu     | Ubuntu 20.04 LTS [5.4.0-47-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]       |
| 3712 | Linux Ubuntu     | Ubuntu 20.04 LTS [4.19.107-Unraid|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]        |
| 3713 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.74-1-lts|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]       |
| 3733 | Linux Ubuntu     | Ubuntu 20.04 LTS [5.4.0-47-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]       |
| 4129 | Linux Arch Linux | Arch Linux [5.8.14-arch1-1|libc 2.32 (GNU libc)]                                 |
| 4443 | Linux Ubuntu     | Ubuntu 20.04 LTS [5.4.68-1-lts|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]           |
| 4505 | Linux Ubuntu     | Ubuntu 18.04.4 LTS [5.4.0-48-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]     |
| 4790 | Linux Ubuntu     | Ubuntu 20.04 LTS [5.4.0-52-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]       |
| 4796 | Linux Ubuntu     | Ubuntu 18.04.4 LTS [5.4.0-52-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]     |
| 4832 | Linux Ubuntu     | Ubuntu 20.04 LTS [4.9.0-12-amd64|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]         |
| 5132 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-47-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]     |
| 5157 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [4.4.59+|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]            |
| 5306 | Linux Ubuntu     | Ubuntu 20.04.1 LTS [5.4.0-56-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]   |
+------+------------------+----------------------------------------------------------------------------------+
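
To look for a common factor, the `os_version` strings above can be split into distribution, kernel, and libc version and then counted. The following is a minimal Python sketch under stated assumptions: the regular expression is inferred only from the rows shown in this table, the names `parse_os_version` and `count_by` are hypothetical, and `SAMPLE` holds four rows copied from the table rather than the full result set.

```python
import re
from collections import Counter

# Pattern inferred from the rows above: "<distro> [<kernel>|libc <version> (...)]"
OS_VERSION = re.compile(
    r"^(?P<distro>.*?)\s*\[(?P<kernel>[^|\]]+)\|libc (?P<libc>\S+)"
)


def parse_os_version(text):
    """Return (distro, kernel, libc) or None if the string does not match."""
    match = OS_VERSION.match(text.strip())
    if match is None:
        return None
    return match.group("distro"), match.group("kernel"), match.group("libc")


def count_by(rows, index):
    """Count hosts by one parsed field: 0 = distro, 1 = kernel, 2 = libc."""
    counts = Counter()
    for row in rows:
        parsed = parse_os_version(row)
        if parsed is not None:
            counts[parsed[index]] += 1
    return counts


# A few rows copied from the table above (not the full list).
SAMPLE = [
    "Debian GNU/Linux bullseye/sid [5.9.0-3-amd64|libc 2.31 (Debian GLIBC 2.31-4)]",
    "Ubuntu 20.04.1 LTS [5.4.0-54-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]",
    "Ubuntu 18.04.5 LTS [4.15.0-112-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1.2)]",
    "Arch Linux [5.8.14-arch1-1|libc 2.32 (GNU libc)]",
]

if __name__ == "__main__":
    print("by libc:  ", dict(count_by(SAMPLE, 2)))
    print("by kernel:", dict(count_by(SAMPLE, 1)))
```

In the table above, the blacklisted hosts span kernels from 4.4 to 5.9 and libc 2.27 to 2.32, so a count like this mostly helps rule out a single kernel or libc release as the only trigger.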

52) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1239)
Posted 17 Dec 2020 by damotbe
Post:
We have a clue! If somebody can help, please do... What is the difference between the kernels?
53) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1235)
Posted 15 Dec 2020 by damotbe
Post:
This problem is discouraging... I have no idea what occurred...
54) Message boards : Number crunching : Tasks incorrectly marked as invalid: Please check validation rules (Message 1234)
Posted 15 Dec 2020 by damotbe
Post:
As soon as we have an engineer...
55) Message boards : Number crunching : Host ID 1388 corrupted (Message 1233)
Posted 15 Dec 2020 by damotbe
Post:
Thank you.

Unfortunately, "segmentation fault" gives the immediate cause, not the root of the problem :(
At the moment, I suspect major changes in glibc or the kernel.
56) Message boards : Number crunching : Host ID 1388 corrupted (Message 1225)
Posted 3 Dec 2020 by damotbe
Post:
Not an Ubuntu 20.04! Thanks.
57) Message boards : Number crunching : Host ID 1388 corrupted (Message 1222)
Posted 29 Nov 2020 by damotbe
Post:
Done. Thank you
58) Message boards : Number crunching : Host ID 1388 corrupted (Message 1217)
Posted 27 Nov 2020 by damotbe
Post:
After verification, a lot of Ubuntu 20.04 hosts work perfectly. Something is wrong, but I can't find what!
59) Message boards : Number crunching : Host ID 1388 corrupted (Message 1216)
Posted 27 Nov 2020 by damotbe
Post:
At the moment, I have no idea... There is no informative message to work with. I'll try to reproduce the bug.
60) Message boards : Number crunching : Validate error. (Message 1209)
Posted 25 Nov 2020 by damotbe
Post:
These events are pure system messages. They are not related to molecules. "Trickle-Up Events" are messages sent to the server.



©2024 Benoit DA MOTA - LERIA, University of Angers, France