Posts by xii5ku

21) Message boards : Number crunching : "Multithreading" in prefs (Message 900)
Posted 21 Jun 2020 by xii5ku
Post:
Sorry for reviving an old thread, but the issue at hand is still applicable, so here we go:

dannyridel wrote: "I've set the multithread settings to a max of 4 CPUS. Somehow I keep on getting work that runs on 1 CPU core only."

Check the apps.php page, i.e. "Computing -> Applications" at the top of the web page. At this time, it indicates that the "NWChem" application is single-threaded; that the "NWChem long" application is single-threaded on Windows; and that the "NWChem long" application is single-threaded on Linux unless "Run test applications?" is switched to Yes in the project preferences, at least until the 2t/4t/8t versions are promoted out of beta status.

At least that's how I understand it.
22) Message boards : Number crunching : Suspicious near-instant results with NWChem long t4 (Message 899)
Posted 21 Jun 2020 by xii5ku
Post:
Luigi R. wrote:
    xii5ku wrote: "Besides a full /tmp, or lacking access permissions to /tmp, another potential problem source could be [...]"
    "What do you mean for full /tmp? 0byte? This morning I had 600MB free space. I deleted some log files and now it is 3.2GB."

On my host, each nwchem_long task takes 8.2 MBytes in /tmp. (BTW, I have completed three tasks by now, and one of the three did not remove its "pid." subdirectory in /tmp/ompi./.)

8.2 MBytes is obviously not much. If there is no longer room in /tmp even for this small amount, the host may exhibit other serious problems outside of BOINC as well.
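
The two numbers discussed above (free space in /tmp and the space taken by the per-task directories) can be checked with a few lines of shell. This is only a sketch: the function names are made up for illustration, and the glob `ompi.*/pid.*` is an assumption based on the example paths quoted in this thread, not a documented layout.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BOINC or Open MPI.

# Print the free space, in KiB, of the filesystem that holds DIR (default /tmp).
tmp_free_kb() {
    # -P forces one line per filesystem, -k reports KiB; column 4 is "Available".
    df -Pk "${1:-/tmp}" | awk 'NR==2 {print $4}'
}

# Print the size, in KiB, of every ompi.*/pid.* directory below BASE (default /tmp).
# The directory pattern is an assumption taken from the paths mentioned in the thread.
ompi_dir_sizes() {
    base="${1:-/tmp}"
    for d in "$base"/ompi.*/pid.*; do
        [ -d "$d" ] || continue   # skip the literal pattern when nothing matches
        du -sk "$d"
    done
}
```

After sourcing the file, `tmp_free_kb /tmp` prints a single number and `ompi_dir_sizes /tmp` prints one line per directory, which makes it easy to see whether leftovers from finished tasks are piling up.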
23) Message boards : Number crunching : Suspicious near-instant results with NWChem long t4 (Message 896)
Posted 21 Jun 2020 by xii5ku
Post:
Besides a full /tmp, or lacking access permissions to /tmp, another potential source of problems could be the TCP port which MPI (Open MPI?) uses. I have one nwchem_long task running so far, and it occupies port 38253, for example. This may show you which ports are (or were) in use:

cat /tmp/ompi./pid./contact.txt

So, maybe those who had failures after a few seconds of run time had some conflict which prevented the use of the TCP port?

Luigi R. wrote: "P.S. please, don't care about errors. They are caused by bash crashes and I solved it with os restart. ;)"

But maybe those bash crashes were caused by nwchem_long not cleaning up properly.
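
A minimal sketch of the port check suggested above, for hosts with more than one session directory. The function names are hypothetical, the glob `ompi.*/pid.*` is assumed from the example paths quoted in this thread, and the format of contact.txt is not interpreted here; the file is only printed. The second helper relies on `ss` from iproute2 being installed.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BOINC or Open MPI.

# Print every readable contact.txt below BASE (default /tmp), preceded by its path.
show_ompi_contacts() {
    base="${1:-/tmp}"
    for f in "$base"/ompi.*/pid.*/contact.txt; do
        [ -r "$f" ] || continue   # skip the literal pattern when nothing matches
        printf '== %s\n' "$f"
        cat "$f"
    done
}

# Succeed if some socket is listening on TCP port PORT.
# ss -l: listening sockets, -t: TCP only, -n: numeric ports.
tcp_port_listening() {
    ss -ltn | awk -v p=":$1" '
        NR > 1 { for (i = 1; i <= NF; i++) if ($i ~ p"$") found = 1 }
        END    { exit !found }'
}
```

For example, `show_ompi_contacts /tmp` lists the contact files of all current and leftover sessions, and `tcp_port_listening 38253 && echo in use` reports whether the port mentioned above is currently being listened on.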
24) Message boards : Number crunching : Suspicious near-instant results with NWChem long t4 (Message 894)
Posted 21 Jun 2020 by xii5ku
Post:
Alien Seeker wrote: "I've had the problem again, this time on the other computer and with only 1 core per task. I suspect the reason this time was a full /tmp; although I didn't check the size, the problem vanished when I removed the many leftover /tmp/ompi.hostname.123/pid.1234 directories from previous computations. I think tasks should clean up after themselves when they end; even if each directory is rather small, they pile up after a while and the /tmp partition isn't meant to be very big."

crashtech wrote: "Has there been a resolution to this issue? One of my computers only runs WUs for a few seconds, then marks them as complete https://quchempedia.univ-angers.fr/athome/results.php?hostid=1227"

@crashtech, maybe this host has a full /tmp (as Alien Seeker suspected on their own host). Check with "df -h /tmp", for example.

Or the boinc-client service on this host may be set up in a way which does not permit it to create files outside of its data directory, or at least not in /tmp. What does /lib/systemd/system/boinc-client.service contain on this host?
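
To narrow down the second possibility, the unit file can be searched for the systemd directives that commonly restrict where a service may write. The sketch below is an assumption about what to look for, not a statement about what any particular distribution ships; the function name is made up for illustration, and the default path is the one asked about above.

```shell
#!/bin/sh
# Hypothetical helper, not part of BOINC or systemd.

# Print the sandboxing-related directives found in a systemd unit file.
# PrivateTmp, ProtectSystem, ProtectHome, ReadWritePaths, ReadOnlyPaths and
# InaccessiblePaths are standard systemd.exec settings.
unit_sandbox_settings() {
    unit="${1:-/lib/systemd/system/boinc-client.service}"
    grep -E '^(PrivateTmp|ProtectSystem|ProtectHome|ReadWritePaths|ReadOnlyPaths|InaccessiblePaths)=' "$unit"
}
```

If this prints something like PrivateTmp=true, the service gets its own private /tmp, so the files will not show up in the system-wide /tmp. A setting such as ProtectSystem=strict without a matching ReadWritePaths entry would be a candidate explanation for failed writes. Drop-in overrides are not covered by reading a single file; `systemctl cat boinc-client.service` shows the unit together with any drop-ins.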
