Posts by Jim1348

21) Message boards : Number crunching : High failure rate (Message 1680)
Posted 24 Feb 2022 by Jim1348
Post:
You don't define "failure" and your computers are hidden.
22) Questions and Answers : Unix/Linux : BOINC 7.18.1 causes invalids (Message 1668)
Posted 22 Feb 2022 by Jim1348
Post:
Why would one choose to run this version instead of the one in the Linux repository?

It is the latest one in the development release (LocutusOfBorg) repository.
https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc
Sometimes they fix problems. Sometimes they cause them.
23) Message boards : Number crunching : BOINC 7.18.1 causes invalids on Linux (Message 1664)
Posted 21 Feb 2022 by Jim1348
Post:
I still see a lot of people who run Linux producing invalids.
Maybe they did not see my post in the Linux section.
https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=166

(Even more likely, they never check their results. It happens all the time.)
24) Message boards : Number crunching : High failure rate (Message 1661)
Posted 21 Feb 2022 by Jim1348
Post:
I was worried about that too in my long-term statistics.
But I just reattached a machine that I had not used for a while, and it seems OK.
https://quchempedia.univ-angers.fr/athome/results.php?hostid=10585

So either it was "fixed", or else it was just the data that was hard to crunch.
25) Message boards : Number crunching : ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time (Message 1653)
Posted 12 Feb 2022 by Jim1348
Post:
I haven't tried this myself yet.

I have. BOINC does a "Signature verification" and won't accept the new wrapper.
But maybe you should try it and see if you can make it work.
26) Questions and Answers : Unix/Linux : BOINC 7.18.1 causes invalids (Message 1644)
Posted 30 Jan 2022 by Jim1348
Post:
Upgrading to BOINC 7.18.1 (from LocutusOfBorg, ppa:costamagnagianfranco/boinc) caused rapid invalids on my Ryzen 3600 under Ubuntu 20.04.3.
But BOINC 7.16.6 (from Ubuntu Software) and 7.16.17 (the previous version from LOB) work OK.

(But there is a new build of 7.18.1 on LOB. I have not tried it.)
27) Message boards : Number crunching : Stuck tasks (Message 1641)
Posted 23 Jan 2022 by Jim1348
Post:
Does anybody know what exactly is going on here? Should I just abort these tasks?

You must be on Windows, though your computers are hidden.

They get stuck. That is what is going on.
Yes, abort them. (Linux doesn't have the problem for some reason.)
28) Questions and Answers : Unix/Linux : Any alternative to the current taskset clobbering? (Message 1639)
Posted 15 Jan 2022 by Jim1348
Post:
I think you are having a problem with definitions. It is a common condition from what I can see.
BOINC reports as virtual cores (as does the OS) what the software people refer to as "threads".

I think you need to step out of your narrow thinking.
29) Questions and Answers : Unix/Linux : Any alternative to the current taskset clobbering? (Message 1636)
Posted 13 Jan 2022 by Jim1348
Post:
Wow. That is not how any of that works. At all.

Setting the usage limits in the BOINC client doesn't actually restrict how many cores get used; it restricts the number of threads that work units can use. As such, if you set that number to 75% on a 12-core, 24-thread system, you wind up with running work units that use eighteen threads in total. If the operating system scheduler doesn't keep those threads contained to specific cores, it results in latency issues for the F@H work units, as context switching takes time.

As for "To run only one task per core", that's not the issue. The issue is the project's run script using taskset to work around how mpirun handles unit execution.

Yes, it is how it works.
BOINC reports virtual cores, since that is how the operating system sees them.
You need to brush up on virtualization.

Your idea that the OS scheduler produces the latency is imaginative though.
30) Message boards : Number crunching : Inconclusive validation (Message 1626)
Posted 11 Jan 2022 by Jim1348
Post:
So burned a bunch of CPU time for a dead task. Nice. [heavy sarcasm]

Yes, I know. That is a very common situation.

I asked some time ago whether they could reduce the number of results required before a work unit is declared "invalid".
But apparently they need that many results to ensure that it is in fact invalid.

I still expect that they could cut it down to maybe five (I think the "aborted" does not count).
After that, I have never seen one that turns out "valid". But I don't see the overall statistics either. The project admin does.
31) Questions and Answers : Unix/Linux : Any alternative to the current taskset clobbering? (Message 1618)
Posted 6 Jan 2022 by Jim1348
Post:
Your computers are hidden so I am not sure that you will get much advice. And you are doing things the hard way.

To reserve a core for the GPU (I do Folding on all my GPUs), just set the BOINC Preference to use less than 100% of the CPUs.
For example, on a Ryzen 3600 (6 cores, 12 virtual cores), I set it to use 95%, which leaves one virtual core free.

To run only one task per core, just set QuChemPedIA preferences "Max # CPUs 1".
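
The percentage arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes the client multiplies the number of virtual cores by the percentage, truncates to a whole number, and never goes below one. That rounding rule is an assumption here, and `usable_cpus` is a made-up helper name, not part of BOINC.

```python
def usable_cpus(logical_cpus: int, percent: int) -> int:
    """Virtual cores BOINC may use at a given "use at most N% of the CPUs" setting.

    Assumption: the result is truncated and never drops below 1.
    """
    return max(1, (logical_cpus * percent) // 100)


# The two cases mentioned in this thread: 12 virtual cores at 95%,
# and 24 virtual cores at 75%.
for cpus, pct in [(12, 95), (24, 75)]:
    used = usable_cpus(cpus, pct)
    print(f"{cpus} virtual cores at {pct}% -> {used} used, {cpus - used} free")
```

Under that assumption, 95% of 12 virtual cores gives 11 in use and one free, and 75% of 24 gives 18 in use.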
32) Questions and Answers : Unix/Linux : Linux vs Windows (Message 1615)
Posted 4 Jan 2022 by Jim1348
Post:
Windows hosts can run as fast as Linux hosts, but they are less reliable because of VirtualBox. (It needs to be handled cautiously, and it does not suit big hosts running multiple VMs simultaneously.)
Each cruncher has to find a good compromise between speed and reliability in order not to waste resources.
Sometimes it is better to run a bit slower but reliably, especially with the long NewChem application (several days long).
On Windows, you have to check the WUs often to see whether they are behaving correctly. With Linux that is not necessary; the probability of failure is lower.

I was about to post the same thing. I have run QuChemPedIA on Ryzen 3600 machines under Win10 (VBox 5.2.44) and Ubuntu 20.04.3.
The run times seem to be the same, as nearly as I can determine. But you get hung work units under Windows, and not under Ubuntu.

I now have it on a Ryzen 3900X (Ubuntu 20.04.3) and will let it go for a while.
33) Message boards : Number crunching : Inconclusive validation (Message 1600)
Posted 20 Dec 2021 by Jim1348
Post:
Invalid tasks are definitely wrong or useless.

Not quite correct either. It tells the researchers that the initial starting points do not produce valid results.
That is useful information, and part of what they are trying to find.
34) Message boards : Number crunching : Inconclusive validation (Message 1597)
Posted 18 Dec 2021 by Jim1348
Post:
I have a lot of "Inconclusive validation" on my Win11.

It is quite normal, though I don't remember the distinction between "inconclusive" and "invalid".
https://quchempedia.univ-angers.fr/athome/results.php?hostid=9340

But it is due to how the science operates. You can't determine upfront which results will come out valid.
35) Message boards : Number crunching : Fast wu,s invalid (Message 1594)
Posted 18 Dec 2021 by Jim1348
Post:
Give us feedback to inform us of the success or failure of this process.

At the moment, I am using BOINC 7.16.17 from ppa:costamagnagianfranco/boinc
https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc

It is running fine on Ubuntu 20.04.3.
On some versions there is an incompatibility with the libraries, and going back to the Ubuntu Software version usually fixes it.
36) Message boards : Number crunching : Virtual enviorment unmanageable (Message 1572)
Posted 21 Nov 2021 by Jim1348
Post:
If you don't listen to what I say, you are on your own.
37) Message boards : Number crunching : Virtual enviorment unmanageable (Message 1568)
Posted 21 Nov 2021 by Jim1348
Post:
You can usually solve it by going back to VirtualBox 5.2.44, at least on most projects.
https://www.virtualbox.org/wiki/Download_Old_Builds_5_2

You will still get a lot of invalids, but that is just the nature of the scientific work and not a VBox problem.
38) Message boards : Number crunching : Runtimes (Message 1525)
Posted 24 Oct 2021 by Jim1348
Post:
How did you set "Max # CPUs"?
It would be possible for the CPU time to be greater than the run time if it was set to greater than "1".
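
To illustrate why: CPU time is summed over all the threads a task uses, while run time is wall-clock time. A rough sketch, where `cpu_time_seconds` and `busy_fraction` are names invented for this example and the numbers are not taken from any real task:

```python
def cpu_time_seconds(run_time_s: float, threads: int, busy_fraction: float = 1.0) -> float:
    """Approximate CPU time for a task that keeps `threads` threads busy.

    Each thread accumulates CPU time separately, so the total is roughly
    run time x threads x the fraction of time each thread is actually busy.
    """
    return run_time_s * threads * busy_fraction


run_time = 3600.0  # one hour of wall-clock time (illustrative)

single = cpu_time_seconds(run_time, threads=1, busy_fraction=0.98)
multi = cpu_time_seconds(run_time, threads=4, busy_fraction=0.90)

print("1 thread : CPU time exceeds run time?", single > run_time)
print("4 threads: CPU time exceeds run time?", multi > run_time)
```

With one thread the CPU time cannot exceed the run time; with four threads it easily can.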
39) Message boards : Number crunching : Host ID 1388 corrupted (Message 1518)
Posted 14 Oct 2021 by Jim1348
Post:
I have had an issue with QuChem tasks on my two Ubuntu 21.04 hosts since the update to Ubuntu 21.04.
One computer finished many results before the update.

I don't see the machines. Can you provide a link to them?
40) Message boards : Number crunching : Host ID 1388 corrupted (Message 1517)
Posted 14 Oct 2021 by Jim1348
Post:
All invalids. They don't have hardware virtualization, though I did not know that it was necessary on Linux.

https://quchempedia.univ-angers.fr/athome/results.php?hostid=8903
Virtualization: Virtualbox (6.1.26_Ubuntur145957) installed, CPU does not have hardware virtualization support
Operating System: Linux Ubuntu
Ubuntu 21.04 [5.11.0-37-generic|libc 2.33 (Ubuntu GLIBC 2.33-0ubuntu5)]

https://quchempedia.univ-angers.fr/athome/results.php?hostid=8613
Virtualization: None
Operating System: Linux Ubuntu
Ubuntu 20.04.1 LTS [5.4.0-84-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)]
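
For anyone who wants to check a Linux host directly: on x86 machines the kernel lists the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo` when the CPU offers hardware virtualization. The sketch below only looks for those flags. It does not show whether virtualization is enabled in the firmware, and it may not match how BOINC itself decides what to report. `has_hw_virtualization` is a name made up for this example.

```python
def has_hw_virtualization(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo text lists vmx or svm."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and {"vmx", "svm"} & set(value.split()):
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("Hardware virtualization flag present:", has_hw_virtualization(f.read()))
    except OSError:
        # Not a Linux host, or /proc is not mounted.
        print("/proc/cpuinfo is not available on this system")
```

The same information is available from the shell by searching `/proc/cpuinfo` for those two flag names.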



©2024 Benoit DA MOTA - LERIA, University of Angers, France