Posts by Dayle Diamond

1) Message boards : Number crunching : Short Tasks run for 14 hours and counting. (Message 733)
Posted 8 Apr 2020 by Dayle Diamond
Post:
You "have to wonder" because the project administrator is staying silent.

My machine has been reliable with quite a few different BOINC tasks over the years.
The only constant I can tell is that in every project forum there's always somebody who appoints themselves defender of the status quo.
2) Message boards : Number crunching : Short Tasks run for 14 hours and counting. (Message 731)
Posted 8 Apr 2020 by Dayle Diamond
Post:
Well, the long work units started to run longer than average too.

I'm disconnecting from the project and aborting my units.
Days and days of work with 0 results is too much to ask.

I hope QuChemPedIA works out, but if glitched files with infinite runtimes are normal, the person sending them should be responding to errors rather than letting fellow volunteers guess at the problem.
3) Message boards : Number crunching : Short Tasks run for 14 hours and counting. (Message 727)
Posted 7 Apr 2020 by Dayle Diamond
Post:
My CPU is a steady 64C.
To limit operational costs, I have not had a GPU project attached on this machine in a while.

I say "short" because, to me, an hour-long task is short, and to contrast them with the longer tasks.
The worst of the 'short' tasks is now at zero seconds remaining, and "waiting to run" after 19 hours, 44 minutes. It is listed as 99.998 percent complete.
Computer is doing some WCG tasks to reflect my project priorities.

Edit: If I could make it finish ASAP and get a result file, I would.
I'm temporarily suspending network activity to drain the queue so this task is re-prioritized ASAP.

If you have any other ideas in the meantime, I'm open to hearing them.
4) Message boards : Number crunching : Short Tasks run for 14 hours and counting. (Message 725)
Posted 7 Apr 2020 by Dayle Diamond
Post:
Hello,

I recently joined QuChemPedIA@Home.
I'm running a 32-thread Threadripper 1950X with 32 GB of RAM, and I've set my priority to 1%, just to work out the bugs before committing more cores.

So naturally BOINC downloaded 28 tasks and eventually tried to run them all at once.
Everything looked okay when I went to take a nap yesterday, with several scheduled to wrap up within fifteen minutes of my departure.

I awoke to a mess. Nothing was finished.
Some tasks were paused, waiting for more RAM, even as they only used 30 MB apiece.
Most had been running for many, many hours. The two that were nearest completion were still nearest completion, with a "minute" left.
The seconds count down more slowly as they reach the finish line, as if algorithmically getting "nearly there" but never actually reaching 100%.

I've written this whole post while od9_athome_b3lyp-321gd,batch82,000822664,nwchem,1582561504_0 has eleven seconds left.
I'd just restart it, but when I check when it last saved, it says:

CPU time since checkpoint 00:00:00
Elapsed time 14:36:51
Estimated time remaining 00:00:11


Which is nonsense. Certainly they're not idle, as my CPU usage is hovering around 95%.
I'd just scrap the "broken" unit, but this appears to be happening to all of them? Or at least to the "short" tasks.

I've got 3 days, 12 hours and 54 minutes of CPU time racked up on these suckers, and if they're VALID but unintentionally large files, I'd hate to spend all this time and energy only to have the next person do it too.

And if they're errors, from what I gather, it's better they fail than that the data I'm holding be discarded.

Please advise as soon as you can!
5) Message boards : News : Public opening (Message 721)
Posted 6 Apr 2020 by Dayle Diamond
Post:
I don't see this project on BOINC's list of projects to add, and only discovered QuChemPedIA from a poster on a Rosetta forum.

How does a project get into the default BOINC project list?

People are downloading BOINC for COVID-19 and I'd imagine there's going to be a whole bunch of people looking at the list each day and subscribing to multiple projects.
If this project isn't on there, that's processing power going to other projects.
A fraction of that would be enough to wrap up your current unsent work units.




©2024 Benoit DA MOTA - LERIA, University of Angers, France