Everything is bad

Message boards : Number crunching : Everything is bad

den777

Joined: 11 Oct 19
Posts: 1
Credit: 61,173
RAC: 0
Message 245 - Posted: 3 Nov 2019, 17:07:08 UTC

I have been trying to run this project for several days, but I still have not managed to finish a single WU.
I run it on my Linux box, as a native app:
* For some reason all WUs run on one core, leaving the others idle (srsly, how did you manage to make apps work this way?)
* When I restart boinc-client, the WUs start again from zero progress
* One of the WUs still shows no progress and an ETA of over 102 days; it shows 17+ hours of run time but only 2 hours of CPU time (which brings me back to my question above)
ID: 245 · Rating: 0
[AF>Le_Pommier] Jerome_C2005

Joined: 26 Aug 19
Posts: 15
Credit: 1,265,326
RAC: 0
Message 246 - Posted: 3 Nov 2019, 20:50:08 UTC
Last modified: 3 Nov 2019, 20:51:33 UTC

For the CPU core affinity issue on Linux (tasks get stuck on a single core instead of spreading across all of them), here is a script provided by damotbe. It worked well for me (on Debian). I am translating his instructions here:

while [[ 1 -eq 1 ]]; do for pid in $(pgrep nwchem); do taskset -p 0xffffffff $pid; done; sleep 60; done

Every minute, it resets the CPU affinity of every running nwchem process. The mask 0xffffffff allows each process to run on CPUs 0 to 31.
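
The mask passed to taskset is a bit field: bit N set means CPU N is allowed. As a minimal sketch (not part of damotbe's script), a mask covering exactly the first N CPUs could be built like this. `cpu_mask` is a hypothetical helper name, and the sketch assumes a shell with 64-bit arithmetic, so it is only meant for N up to 62.

```shell
#!/bin/bash
# cpu_mask N: print a hex affinity mask with the lowest N bits set.
# Hypothetical helper for illustration only; intended for 1 <= N <= 62.
cpu_mask() {
    local n=$1
    # (1 << n) - 1 sets bits 0 .. n-1, i.e. allows CPUs 0 .. n-1
    printf '0x%x\n' $(( (1 << n) - 1 ))
}

# Possible usage, assuming nproc is installed and $pid holds a process ID:
# taskset -p "$(cpu_mask "$(nproc)")" "$pid"
```

With 32 as the argument, this gives the same 0xffffffff mask as the one-liner above.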

Depending on your configuration, it may have to be run as root (or with sudo), ideally inside a "screen" session (install the screen utility and use it).

I ran it directly (not in screen) and it worked well for the tasks that were already running. After I stopped it, its effect lasted until those tasks finished.
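
Since the affinity change sticks to tasks that are already running, a single pass may be enough in some cases. Here is a rough sketch of a one-shot variant, under the assumption that pgrep and taskset are installed. `reset_affinity` is a hypothetical name, and this is not damotbe's script. Tasks started later would not be covered, which is presumably why the original loops every minute.

```shell
#!/bin/bash
# One-shot variant (hypothetical, for illustration): apply an affinity mask
# once to every process whose name matches, then return.
reset_affinity() {
    local name=$1 mask=$2 pid
    # pgrep prints one matching PID per line; the loop body is skipped if none match
    for pid in $(pgrep "$name"); do
        taskset -p "$mask" "$pid"
    done
}

# Possible usage (the script may have to be run as root, depending on configuration):
# reset_affinity nwchem 0xffffffff
```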


Alternative to screen (I did not test this): create an affinity.sh file and give it execute permission

touch affinity.sh
chmod +x affinity.sh

then put the following inside it

#!/bin/bash
# Every 60 seconds, allow each running nwchem process to use CPUs 0 to 31.
while true; do
    for pid in $(pgrep nwchem); do
        taskset -p 0xffffffff "$pid"
    done
    sleep 60
done

then run it in the background with nohup (it has to be started again each time the machine boots)

nohup sudo ./affinity.sh 2>/dev/null >/dev/null &


*** I did not write these scripts and provide no kind of support for them ***
ID: 246 · Rating: 0


©2024 Benoit DA MOTA - LERIA, University of Angers, France