Posts by Bryan

1) Message boards : Number crunching : Pending Validation (Message 605)
Posted 24 Feb 2020 by Bryan
Post:
With a 14-day deadline, your wingman might have timed out, and now you are waiting for a 2nd wingman to complete them. Or you might be the wingman of the #9 computer in the top computers list (ID 1459). That is probably a dual-CPU machine with HT turned off. It has 3456 WU sitting in its cache, which will take it around 12 days to complete. Oh joy.
2) Message boards : News : Credits and Gridcoin (Message 470)
Posted 30 Jan 2020 by Bryan
Post:
Thank you for the affinity fix ... that will help :)
3) Message boards : Number crunching : Credit (Message 333)
Posted 6 Dec 2019 by Bryan
Post:
Please don't change to Credit Screw. That is a random credit generator and pays horribly. I DO NOT support projects running CreditNew unless I have no alternative. This is my personal opinion, and others may view it differently.

This is NOT meant as a threat BTW, I'm just stating facts. :)
4) Message boards : Number crunching : Very little CPU usage (Message 96)
Posted 10 Oct 2019 by Bryan
Post:
Excellent! That will greatly help the project and the users :)
5) Message boards : Number crunching : Native Linux WU refuses to suspend (Message 90)
Posted 9 Oct 2019 by Bryan
Post:
I noticed on one machine that after all WU had finished and been turned in, I still had 8 nwchem processes running. I had to kill them using their PIDs. They apparently were orphaned.
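
A minimal sketch of how leftover processes like these could be found and killed by name, assuming a Linux `/proc` filesystem and Python 3. The process name `nwchem` comes from the post; the helper names `find_pids_by_name` and `kill_pids` are made up for illustration.

```python
import os
import signal


def find_pids_by_name(name, proc_root="/proc"):
    """Return PIDs whose command name (from <proc_root>/<pid>/comm) contains `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                comm = fh.read().strip()
        except OSError:
            # The process exited between listdir() and open(), or is unreadable.
            continue
        if name in comm:
            pids.append(int(entry))
    return sorted(pids)


def kill_pids(pids, sig=signal.SIGTERM):
    """Send `sig` to each PID and return the PIDs that were actually signalled."""
    signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
            signalled.append(pid)
        except (ProcessLookupError, PermissionError):
            continue  # already gone, or owned by another user
    return signalled
```

Calling `find_pids_by_name("nwchem")` first and checking the list before passing it to `kill_pids` avoids killing a task that BOINC is still running.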
6) Message boards : Number crunching : Very little CPU usage (Message 89)
Posted 9 Oct 2019 by Bryan
Post:
I don't know if the VBox version actually assigns CPU affinity since I haven't run it. On the native Linux app, which runs well, you should warn people that running more than 1 WU per CPU only increases the time to completion. On my dual CPU Intel machines, running 64 WU means each WU would take 16X longer to complete.
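
As a rough illustration of that 16X figure, the sketch below assumes the running tasks share the usable threads evenly, so the slowdown is simply tasks divided by usable threads. The thread counts (4 on a dual CPU machine, 2 on a single CPU machine) are the ones reported in this discussion; the function name is made up.

```python
def slowdown_factor(running_tasks, usable_threads):
    """Rough slowdown when tasks time-share a limited set of threads.

    Assumes perfectly even time-slicing and ignores scheduler overhead.
    """
    if usable_threads <= 0:
        raise ValueError("usable_threads must be positive")
    return max(1.0, running_tasks / usable_threads)


# 64 WU confined to the 4 threads a dual CPU machine ends up using:
print(slowdown_factor(64, 4))  # 16.0
# The same 64 WU on a single CPU machine where only 2 threads are used:
print(slowdown_factor(64, 2))  # 32.0
```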

I hope you find a solution because the native Linux app runs very well.
7) Message boards : Number crunching : Very little CPU usage (Message 76)
Posted 8 Oct 2019 by Bryan
Post:
damotbe, the problem is that you are wasting resources. When 64 WU are launched on a 64t machine, only 2 threads are used on a single CPU machine and 4 threads on a dual CPU machine. That means that the threads are overcommitted by 32:1 or 16:1, respectively. It is a total waste of machine capability. Without your program setting affinity, a single machine could produce a lot more work for the project.

The program continuously launches "child" processes and kills the old ones, so an affinity script must run continuously. The best solution is to remove the affinity control from the program's executable. Let Linux decide which threads to use.
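
The kind of affinity script described here could look roughly like the sketch below, assuming Linux and Python 3. It rescans `/proc` on a timer because the child processes keep being replaced. The `nwchem` process name is taken from another post in this list; the function names and the 5 second interval are arbitrary choices for illustration, not anything the project ships.

```python
import os
import time


def thread_ids(name, proc_root="/proc"):
    """Return thread IDs of every process whose command name contains `name`."""
    tids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if name not in fh.read():
                    continue
            # Affinity is per thread on Linux, so collect every task ID.
            task_dir = os.path.join(proc_root, entry, "task")
            tids.extend(int(t) for t in os.listdir(task_dir))
        except OSError:
            continue  # the process exited while we were looking at it
    return sorted(tids)


def widen_affinity(tid, cpus):
    """Allow thread `tid` to run on every CPU in `cpus`; return True on success."""
    try:
        os.sched_setaffinity(tid, cpus)
        return True
    except OSError:
        return False  # thread vanished, or we lack permission


def run_forever(name="nwchem", interval=5.0):
    """Keep re-applying a wide affinity mask to matching threads until interrupted."""
    all_cpus = set(range(os.cpu_count() or 1))
    while True:
        for tid in thread_ids(name):
            widen_affinity(tid, all_cpus)
        time.sleep(interval)
```

Any child launched between two scans still runs with the narrow mask until the next pass, which is one reason removing the affinity setting from the executable is the cleaner fix.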
8) Message boards : Number crunching : Very little CPU usage (Message 68)
Posted 8 Oct 2019 by Bryan
Post:
Some hypervisors will run a VM inside of a VM and others do not support it.
9) Message boards : Number crunching : Very little CPU usage (Message 60)
Posted 7 Oct 2019 by Bryan
Post:
There is absolutely nothing wrong with the native app other than the assigning of core/thread affinity. Take that out of the executable so machines can use all threads and you have a winner. Like Michael, I've been running it with an affinity script and it works quite well.

I don't run VBox projects unless I have no alternative.
10) Message boards : Number crunching : Very little CPU usage (Message 34)
Posted 4 Oct 2019 by Bryan
Post:
On a 2 CPU 64t machine that is running 2 BOINC instances I'm showing 7% CPU utilization. Each instance is assigned 32 threads.

Looking at HTOP it is showing only 4 threads being utilized - each of the 4 is at 100% loading. 2 threads are active on each processor.
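
These two observations are roughly consistent: with only 4 of 64 threads busy, overall utilization works out to about 6%, close to the 7% reported. A quick check, assuming the reported figure is an average across all 64 threads (the function name is made up):

```python
def overall_utilization(busy_threads, total_threads, load_per_busy=1.0):
    """Average utilization across all threads, as a percentage."""
    if total_threads <= 0:
        raise ValueError("total_threads must be positive")
    return 100.0 * busy_threads * load_per_busy / total_threads


# 4 threads at 100% load on a 64-thread machine:
print(overall_utilization(4, 64))  # 6.25
```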
11) Message boards : Number crunching : Very little CPU usage (Message 23)
Posted 4 Oct 2019 by Bryan
Post:
I am running the Linux native t1/t2 tasks on multiple machines.

1. On a 64t machine I'm only seeing 7% CPU usage even though the WU are using all threads. Is this the level that will be used until the WU finishes, or is there a point at which the WU will increase the CPU loading? I'm wondering if I can crunch other WU alongside so the threads don't sit idle 93% of the time.

2. The WU are reserving 1.2 GB of RAM, yet they are only using 120 MB. Why reserve that much memory if it isn't needed?

I'm just trying to figure out the best way to crunch this project ... I'm not complaining.
12) Message boards : Number crunching : No work? (Message 21)
Posted 4 Oct 2019 by Bryan
Post:
Not really in my case! In fact, I am mostly discovering how to properly configure a server. And believe me, it's not pretty from the inside. :-/

:) :)
13) Message boards : Number crunching : WU failures (Message 7)
Posted 3 Oct 2019 by Bryan
Post:
I attached another instance and got the same almost instantaneous failure on the 12 intel_mt WU. I opened up the permissions on the project folder to rw for everyone. It failed another 13 WU. The only executable I see in the folder is the wrapper. There are quite a few tarballs.

If it will help, HERE are my hosts.

I did have a t1 WU complete and validate. I have a few other t1 and t2 WU that have been running for several hours.
14) Message boards : Number crunching : WU failures (Message 5)
Posted 3 Oct 2019 by Bryan
Post:
I'm running Linux Mint 19 and I'm seeing immediate failures on the Intel_mt WU. The t1 and t2 WU appear to be running although completion time estimates vary between 5 minutes and 20 hours.

The failure on the Intel_mt WU says:
execv() failed: : Permission denied

I'm trying 2 different machines: Intel E5-2684 V4 and AMD 2990WX and both fail on the Intel_mt WU.
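
One thing worth ruling out with this error is a missing execute bit, since `execv()` fails with "Permission denied" when the file is not executable by the calling user (a filesystem mounted `noexec` is another possible cause). This is only a guess at the cause, not a confirmed diagnosis. A minimal sketch, assuming Python 3, that lists the regular files in a folder with no execute bit set; the function name is made up, and the folder to check is whatever project directory the failing task uses:

```python
import os
import stat


def non_executable_files(folder):
    """Return regular files directly under `folder` with no execute bit set for anyone."""
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    missing = []
    for name in sorted(os.listdir(folder)):
        try:
            mode = os.stat(os.path.join(folder, name)).st_mode
        except OSError:
            continue  # e.g. a broken symlink
        if stat.S_ISREG(mode) and not mode & any_exec:
            missing.append(name)
    return missing
```

Data files and archives will show up in the result too; only the files that are meant to be executed need the bit.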




©2024 Benoit DA MOTA - LERIA, University of Angers, France