Posts by swiftmallard

1) Message boards : Number crunching : Host ID 1388 corrupted (Message 1328)
Posted 28 days ago by swiftmallard
Post:
4811
6072
2) Message boards : Number crunching : Never ending tasks (Message 1293)
Posted 27 Dec 2020 by swiftmallard
Post:
Yeah, that task should have been aborted long ago.
3) Message boards : Number crunching : Never ending tasks (Message 1291)
Posted 26 Dec 2020 by swiftmallard
Post:
Check the Properties of the task in question. If the difference between the CPU time and the Elapsed time is more than a few minutes, that WU has stopped processing. It will proceed no further and should be aborted.
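The check described above can be sketched in a few lines of Python. This is only an illustration and not part of BOINC: the function name `looks_stalled`, the 5-minute threshold (a stand-in for "a few minutes"), and the sample times are all assumptions. The two values would be copied by hand from the task's Properties.

```python
from datetime import timedelta

# Assumed stand-in for "more than a few minutes"; tune for your own system.
STALL_THRESHOLD = timedelta(minutes=5)

def looks_stalled(cpu_time, elapsed_time, threshold=STALL_THRESHOLD):
    """Return True when elapsed time runs ahead of CPU time by more than threshold."""
    return (elapsed_time - cpu_time) > threshold

# Hypothetical values copied by hand from a task's Properties.
healthy = looks_stalled(timedelta(hours=3, minutes=58), timedelta(hours=4))
stuck = looks_stalled(timedelta(hours=1), timedelta(hours=9))
print(healthy, stuck)
```

The first sample has a gap of 2 minutes and the second a gap of 8 hours, so only the second would be flagged for aborting.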
4) Message boards : Number crunching : Ubuntu 20.04.1 does not validate (Message 1288)
Posted 24 Dec 2020 by swiftmallard
Post:
I know nothing about Linux but it sure is nice to see a resolution to this sort of issue.
5) Questions and Answers : Windows : how to solve problems related to virtualization (Message 1267)
Posted 19 Dec 2020 by swiftmallard
Post:
Unhide your computer

There is a good tutorial about setting up your system here: https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=44&postid=662
6) Questions and Answers : Windows : no tasks available? (Message 1069)
Posted 22 Sep 2020 by swiftmallard
Post:
I am running a Windows system and have the test applications box checked.
Long WUs show as available: NWChem long 2154 8325 69 (7.75 - 181.27)
9/22/2020 6:47:15 AM | QuChemPedIA@home | Requesting new tasks for CPU
9/22/2020 6:47:17 AM | QuChemPedIA@home | Scheduler request completed: got 0 new tasks
9/22/2020 6:47:17 AM | QuChemPedIA@home | No tasks sent
9/22/2020 6:47:17 AM | QuChemPedIA@home | No tasks are available for NWChem long
7) Message boards : Number crunching : Validate error. (Message 976)
Posted 27 Jul 2020 by swiftmallard
Post:
>>> Workunit 1379411

This work unit has been set "validate error" by all that have crunched it so far, after several days of CPU time each. Not happy about that.

This happens to all of us, and nobody likes it. The molecule your task was working on was probably unstable and the validate error confirms that.
8) Message boards : Number crunching : Long work units. (Message 975)
Posted 27 Jul 2020 by swiftmallard
Post:
The task manager shows the CPU pretty much maxed out on all cores/threads.

If you are satisfied that all cores are crunching properly, then let the WU run. It would not be the first time someone has had a very long running task.
9) Message boards : Number crunching : Long work units. (Message 973)
Posted 27 Jul 2020 by swiftmallard
Post:
>>> Are you certain that your CPU is actually working on this?

Yes. I am really curious about what it is actually doing, I've asked, no reply. It is still running this morning but the remaining field is now down to 1 second.

Press Ctrl-Alt-Del and select Task Manager, then choose the Performance tab. There you will see a graph of your CPU activity. If it does not correspond accurately to what the Boinc Manager says is happening, then one or more of your cores is not actually crunching. Pause all your WUs in the Tasks tab of Boinc and then resume them slowly, one by one. Watch the Windows Task Manager to see which one does not cause a rise in CPU activity. (It's most likely to be the one you have been posting about.) That is your problem WU and you should abort it.
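The one-by-one procedure above amounts to looking for the task whose resume does not move the CPU graph. A rough Python sketch of that bookkeeping follows. The function name `tasks_without_cpu_rise`, the 5-point minimum rise, the task names, and the readings are all hypothetical; the percentages are assumed to be noted by hand from the Task Manager after each resume.

```python
def tasks_without_cpu_rise(baseline, readings, min_rise=5.0):
    """Find tasks whose resume barely changed total CPU usage.

    baseline: total CPU % with every task paused.
    readings: (task_name, total CPU % after resuming that task), in resume order.
    min_rise: assumed smallest increase, in percentage points, that counts as crunching.
    """
    suspects = []
    previous = baseline
    for name, cpu_percent in readings:
        if cpu_percent - previous < min_rise:
            suspects.append(name)
        previous = cpu_percent
    return suspects

# Hypothetical readings written down while resuming four tasks in turn.
readings = [("task_a", 19.0), ("task_b", 36.0), ("task_c", 37.0), ("task_d", 54.0)]
print(tasks_without_cpu_rise(2.0, readings))
```

With these made-up numbers only `task_c` fails to raise CPU usage, so that would be the WU to abort.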
10) Message boards : Number crunching : Long work units. (Message 969)
Posted 26 Jul 2020 by swiftmallard
Post:
12 Hours later and still crunching away, but now down to 00:00:03 remaining, hours minutes and seconds are relative with these jobs. Would be really good to know what it is they are doing, they are using serious amounts of CPU time.

Are you certain that your CPU is actually working on this? The time/progress indicator will show that it is even when it is not. A better indicator is the Windows Task Manager - Performance tab. The number of processors crunching should equal the number you have set in your Boinc preferences. If the total comes up short (and it does occasionally happen), then one of your cores is not crunching.
11) Message boards : Number crunching : Validation Inconclusive (Message 962)
Posted 24 Jul 2020 by swiftmallard
Post:
An occasional invalid result is normal in this project.
12) Questions and Answers : Macintosh : Computing Errors (Message 949)
Posted 21 Jul 2020 by swiftmallard
Post:
Unhide your computer
13) Questions and Answers : Windows : Workunits error out after 7 seconds (Message 935)
Posted 15 Jul 2020 by swiftmallard
Post:
Unhide your computer
14) Message boards : Number crunching : missing computers (Message 855)
Posted 7 Jun 2020 by swiftmallard
Post:
Most of my returned WUs lately have gone into pending. Out of idle curiosity I looked to see who my wing crunchers were and I see some confusing info.
The other crunchers have computer numbers such as 2329, 2333, 2328, 2236, 2334. All are anonymously owned and all have aborted the WUs I successfully completed. These WUs otherwise remain unsent to a third cruncher, effectively leaving my completed units in limbo.
The server status page shows there are only 1592 computers even attached to the project.
My question is: Do these other computers actually exist?
My computer is not hidden, please take a look and see if I am missing something obvious.
15) Message boards : Number crunching : 2,5 days long and counting... (Message 835)
Posted 7 May 2020 by swiftmallard
Post:
When a WU shows as running for a long time but using no CPU, it should be aborted.
16) Questions and Answers : Windows : "Postponed: vm job unmanageable restarting later" status in quchempedia (Message 812)
Posted 25 Apr 2020 by swiftmallard
Post:
Are you trying to crunch with all available cores? It sure sounds like it.
Momentary spikes in memory usage are the main suspect in triggering the messages you are seeing.
I found through trial and error that my system was at its most stable running on only 50% of the processors.
17) Questions and Answers : Windows : "Postponed: vm job unmanageable restarting later" status in quchempedia (Message 799)
Posted 21 Apr 2020 by swiftmallard
Post:
When that happens to me I just shut down Boinc, open VirtualBox (VB) and make certain all tasks have stopped, then start Boinc again. If that doesn't work, you can always use brute force and restart your system.
If you do nothing at all, I believe the unmanageable tasks will resume processing in 24 hours.
18) Message boards : Number crunching : 2,5 days long and counting... (Message 755)
Posted 12 Apr 2020 by swiftmallard
Post:
I am going to let my "long" long tasks run; they are completing and validating.
19) Message boards : Number crunching : 2,5 days long and counting... (Message 748)
Posted 10 Apr 2020 by swiftmallard
Post:
I have the same questions as I am seeing something similar on my Windows system. My resource monitor says I am using 83% of my CPU, consistent with crunching on 5 of 6 cores.
A couple have been aborted but I hate to keep doing that.
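The 83% figure above is just 5 divided by 6. A small Python sketch of that arithmetic, assuming every crunching core is fully loaded and nothing else is using the CPU (both function names are made up for illustration):

```python
def expected_cpu_percent(busy_cores, total_cores):
    """Total CPU % expected when busy_cores of total_cores are fully loaded."""
    return 100.0 * busy_cores / total_cores

def estimated_busy_cores(cpu_percent, total_cores):
    """Rough number of fully loaded cores implied by a total CPU % reading."""
    return round(cpu_percent / 100.0 * total_cores)

print(round(expected_cpu_percent(5, 6), 1))  # 83.3
print(estimated_busy_cores(83, 6))           # 5
```

If the estimate comes out lower than the number of tasks Boinc shows as running, at least one of them is probably not crunching.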
20) Message boards : Number crunching : Short Tasks run for 14 hours and counting. (Message 728)
Posted 7 Apr 2020 by swiftmallard
Post:
Regular tasks that run up to 99.999% and then sit for hours should be aborted. Look through the other threads in the number crunching board and you'll find many discussions about why.
If you look at how others have fared with these WUs, they've already been aborted once or will not validate.
Personally, on my system, I never let an od9 work unit run longer than 6 hours and I haven't been burned yet.



©2021 Benoit DA MOTA - LERIA, University of Angers, France