1)
Message boards :
Number crunching :
Out of Work
(Message 1792)
Posted 2 Sep 2022 by crashtech Post: I took the above news as notice that the project was coming to an end: https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=183&postid=1783 |
2)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 919)
Posted 2 Jul 2020 by crashtech Post: @xii5ku , I'm out of ideas on this one. Thanks for your help, though. |
3)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 910)
Posted 26 Jun 2020 by crashtech Post: @crashtech: I'm pretty sure those are two client instances on the same host. |
4)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 908)
Posted 23 Jun 2020 by crashtech Post: @crashtech, in addition to ProtectSystem=full, you could try: PrivateTmp=false

Done, still nothing! One of the other things I tried was comparing boinc-client.service on a working host with the one on the non-working host, and commenting out all of the extra lines that are found in the non-working one. That also did not work. The temptation for me is to move my BOINC data directories to temporary storage, then "nuke and pave" the installation and start fresh. I realize that is more something out of the Windows noob playbook and is possibly offensive to a Linux pro. |
5)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 904)
Posted 22 Jun 2020 by crashtech Post: @crashtech, "df" reports "file system disk space usage", i.e. the used space and available space in the filesystem in which the optionally given file or directory resides. My main intention was to verify how much free space is left in your /tmp. We now know that there is plenty of space left in it. (There are 180 GBytes available in /tmp.)

Thank you xii5ku! First I appended -/tmp to the ReadWritePaths line and rebooted, but QuChemPedIA would not run. Then I changed "strict" to "full" and rebooted, but it still won't run! It's a real puzzle. |
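The point in the quoted reply, that df reports on the filesystem containing the given path rather than on the directory itself, can be checked with a few lines of Python. This is only a rough sketch: the helper names `same_filesystem` and `free_gib` are made up for illustration, and it assumes a Linux host where the temporary directory is the usual /tmp.

```python
import os
import shutil
import tempfile


def same_filesystem(path_a, path_b):
    """Return True when both paths sit on the same mounted filesystem.

    Two paths share a filesystem when their device IDs (st_dev) match.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_gib(path):
    """Free space, in GiB, of the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    tmp_dir = tempfile.gettempdir()  # normally /tmp on Linux
    for p in ("/", tmp_dir):
        print(f"{p}: {free_gib(p):.1f} GiB free")
    # If this prints True, /tmp is just a directory on the root
    # filesystem, so "df -h /tmp" and "df -h /" show the same row.
    print("same filesystem:", same_filesystem("/", tmp_dir))
```

If the last line prints True, identical df output for /tmp, /home and / is expected behaviour rather than a fault.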
6)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 902)
Posted 21 Jun 2020 by crashtech Post:
Taking your suggestions one at a time, it looks as if "df -h /tmp" is not doing what it is intended to do in this case, which is to give the size of /tmp. What the command does do, after further experimentation, is give the total usage of /dev/sda5, at least when executed on this particular host. It does this no matter which directory is input as a target:

    ga7pxsl@GAX570UD_test:~$ df -h /tmp
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda5       228G   37G  180G  17% /
    ga7pxsl@GAX570UD_test:~$ df -h /home/ga7pxsl
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda5       228G   37G  180G  17% /
    ga7pxsl@GAX570UD_test:~$ df -h /
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda5       228G   37G  180G  17% /

It does do something different if no target directory is given, which might provide a clue to someone who knows something:

    ga7pxsl@GAX570UD_test:~$ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    udev             16G     0   16G   0% /dev
    tmpfs           3.2G  2.0M  3.2G   1% /run
    /dev/sda5       228G   37G  180G  17% /
    tmpfs            16G  208K   16G   1% /dev/shm
    tmpfs           5.0M  4.0K  5.0M   1% /run/lock
    tmpfs            16G     0   16G   0% /sys/fs/cgroup
    /dev/sda1       511M  6.1M  505M   2% /boot/efi
    tmpfs           3.2G   32K  3.2G   1% /run/user/1000

But, looking at /tmp in the graphical file manager (the thing I sort of know how to use) as root, the Properties tab tells me there is less than 100KB in /tmp.

Or the boinc-client service on this host is set up in a way which does not permit it to create files outside of its data directory, or at least not in /tmp. What does /lib/systemd/system/boinc-client.service contain on this host?

    [Unit]
    Description=Berkeley Open Infrastructure Network Computing Client
    Documentation=man:boinc(1)
    After=network-online.target

    [Service]
    Type=simple
    ProtectHome=true
    PrivateTmp=true
    ProtectSystem=strict
    ProtectControlGroups=true
    ReadWritePaths=-/var/lib/boinc -/etc/boinc-client
    Nice=10
    User=boinc
    WorkingDirectory=/var/lib/boinc
    ExecStart=/usr/bin/boinc
    ExecStop=/usr/bin/boinccmd --quit
    ExecReload=/usr/bin/boinccmd --read_cc_config
    ExecStopPost=/bin/rm -f lockfile
    IOSchedulingClass=idle
    # The following options prevent setuid root as they imply NoNewPrivileges=true
    # Since Atlas requires setuid root, they break Atlas
    # In order to improve security, if you're not using Atlas,
    # Add these options to the [Service] section of an override file using
    # sudo systemctl edit boinc-client.service
    #NoNewPrivileges=true
    #ProtectKernelModules=true
    #ProtectKernelTunables=true
    #RestrictRealtime=true
    #RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
    #RestrictNamespaces=true
    #PrivateUsers=true
    #CapabilityBoundingSet=
    #MemoryDenyWriteExecute=true

    [Install]
    WantedBy=multi-user.target
|
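Comparing service files between a working and a non-working host by eye is error-prone, so a small script that pulls out only the sandboxing lines may help. The following is a minimal sketch, not a full systemd unit parser: the function name `sandbox_settings`, the `SANDBOX_KEYS` tuple and the `EXAMPLE` text (copied from the [Service] section above) are illustrative assumptions.

```python
# Directives from the unit file above that restrict what the service may write.
SANDBOX_KEYS = ("ProtectSystem", "ProtectHome", "PrivateTmp", "ReadWritePaths")


def sandbox_settings(unit_text):
    """Collect the sandboxing directives found in the [Service] section.

    Simplification: if a key appears twice, the last value wins. Real
    systemd accumulates list settings such as ReadWritePaths instead.
    """
    settings = {}
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (systemd accepts '#' and ';').
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Service" and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() in SANDBOX_KEYS:
                settings[key.strip()] = value.strip()
    return settings


EXAMPLE = """\
[Service]
Type=simple
ProtectHome=true
PrivateTmp=true
ProtectSystem=strict
ReadWritePaths=-/var/lib/boinc -/etc/boinc-client
#NoNewPrivileges=true
"""

if __name__ == "__main__":
    for key, value in sandbox_settings(EXAMPLE).items():
        print(f"{key} = {value}")
```

Running the same function over the unit text from both hosts and comparing the two dictionaries would show at a glance whether PrivateTmp, ProtectSystem or ReadWritePaths differ. With PrivateTmp=true the service gets its own private /tmp, so files it writes there will not appear in the /tmp seen from a normal login.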
7)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 890)
Posted 15 Jun 2020 by crashtech Post: Possibly there is a problem with the BOINC installation itself.

Thanks, I have done so, and verified it in the Event Log:

    Mon 15 Jun 2020 08:42:37 AM MDT | | Starting BOINC client version 7.17.0 for x86_64-pc-linux-gnu

Alas, the tasks still error out immediately. There don't seem to be any clues in the stderr output of the failed tasks, either. I wonder if there aren't some installed libraries that this project relies on that I might check and/or re-install. |
8)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 888)
Posted 14 Jun 2020 by crashtech Post: Very strange. I don't see anything wrong with your machines.

Thanks, I have done so more than once, checking the second time to be sure that the project directory was actually removed. There seems to be something about that particular host's configuration that causes QuChemPedIA to fail. |
9)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 886)
Posted 14 Jun 2020 by crashtech Post: I run all of my work units as t1, by setting "Max # CPUs 1" in the preferences.

Hi, based on your post, I set up a location in the preferences page here to only allow one CPU, but all the WUs still end prematurely. For now I can't run the project on it, but would like to figure out why the work is failing. It runs other projects without issues, so I don't think it's hardware related. |
10)
Message boards :
Number crunching :
Suspicious near-instant results with NWChem long t4
(Message 884)
Posted 13 Jun 2020 by crashtech Post: Has there been a resolution to this issue? One of my computers only runs WUs for a few seconds, then marks them as complete https://quchempedia.univ-angers.fr/athome/results.php?hostid=1227 |
11)
Message boards :
Number crunching :
Application 0.15 (t1) (beta test) Result Not Completing Successfully
(Message 523)
Posted 7 Feb 2020 by crashtech Post: Hi, how do I tell between good and bad units? On a few machines that were experiencing low CPU utilization or unusually long run times, I aborted them all, but I'd rather not make a habit of that. |
©2024 Benoit DA MOTA - LERIA, University of Angers, France