Posts by Aurum

21) Message boards : Number crunching : Compile for AVX-512 VNNI (Message 1146)
Posted 16 Oct 2020 by Aurum
Post:
I have neither the Intel compiler nor the time to start such a project.
But I'm pretty sure that if a volunteer compiles a new version of nwchem for himself and modifies the local configuration files in the quchempedia directory, he can manage to test such things.

Sometimes that happens, e.g. http://asteroidsathome.net/boinc/ had someone compile it for CUDA 10.2.

If that was something I had or knew how to do I'd gladly give it a try. Looks like those compilers are expensive.

So would your WUs run faster if they were compiled for high end CPUs?
22) Message boards : Number crunching : Got any Betas??? (Message 1143)
Posted 15 Oct 2020 by Aurum
Post:
I've been watching beta t1, t2 & t4s running and I can't tell the difference. Their Properties say they have the same 500 TFLOPs. They run at the same speed. Why doesn't t2 run twice as fast as t1 and t4 four times as fast as t1???
23) Message boards : Number crunching : Compile for AVX-512 VNNI (Message 1142)
Posted 15 Oct 2020 by Aurum
Post:
As an aside, do QuChem WUs use more integer operations or floating point ???
24) Message boards : Number crunching : Compile for AVX-512 VNNI (Message 1141)
Posted 15 Oct 2020 by Aurum
Post:
I came across this article:
https://www.nas.nasa.gov/hecc/support/kb/cascade-lake-processors_579.html#:~:text=Cascade%20Lake%20also%20introduces%20in,floating%2Dpoint%20operations%20per%20cycle
"In addition to the instruction sets SSE, SSE2, SSE3, Supplemental SSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, and AVX512[F,CD,BW,DQ,VL], which are available in its Skylake predecessor, Cascade Lake also includes the new AVX-512 Vector Neural Network Instructions (VNNI), which provide significant, more efficient deep-learning inference acceleration. Cascade Lake also introduces in-hardware mitigations for the Spectre and Meltdown security flaws.

With 512-bit floating-point vector registers and two floating-point functional units, each capable of Fused Multiply-Add (FMA), a Cascade Lake core can deliver 32 double-precision floating-point operations per cycle.

Use the Intel compiler flag -xCORE-AVX512 for Skylake and Cascade Lake-SP specific optimizations. The optimization flag -qopt-zmm-usage=high -xCORE-AVX512 may benefit floating-point heavy applications running on Skylake and Cascade Lake.

Tip: If you want a single executable that will run on any of the Aitken, Electra, Pleiades and Merope processor types, with suitable optimization to be determined at run time, you can compile your application using the option -O3 -ipo -axCORE-AVX512,CORE-AVX2,AVX -xSSE4.2."

Note that 32 DP FLOPs per cycle is double what we've had.
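As a rough check of that figure, here is a minimal sketch of the arithmetic using only the numbers quoted above (512-bit registers, two FMA units per core). The function name is made up for this example, and the 256-bit case with two FMA units is my assumption about the older AVX2 Xeons, not something the article states.

```shell
#!/usr/bin/env bash
# Peak double-precision FLOPs per cycle for one core.
# Arguments: vector register width in bits, number of FMA units.
peak_dp_flops_per_cycle() {
    local vector_bits=$1 fma_units=$2
    local lanes=$(( vector_bits / 64 ))   # 64-bit doubles per register
    # A fused multiply-add counts as two operations (one multiply, one add).
    echo $(( lanes * fma_units * 2 ))
}

peak_dp_flops_per_cycle 512 2   # AVX-512 core as described in the article
peak_dp_flops_per_cycle 256 2   # assumed AVX2 core with two FMA units
```

This prints 32 and then 16, which is where the "double" comes from.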
If you want to compile for AVX-512 VNNI I'll be glad to test it.
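Before testing such a build it would help to confirm that the host actually reports the instruction set. A minimal sketch, assuming Linux, where /proc/cpuinfo lists CPU features on a "flags" line and VNNI shows up as avx512_vnni. The helper name has_cpu_flag is made up for this example.

```shell
#!/usr/bin/env bash
# Succeeds if the given CPU feature flag appears on the first "flags" line.
# The second argument (file to read) is optional and defaults to /proc/cpuinfo.
has_cpu_flag() {
    local flag=$1 cpuinfo=${2:-/proc/cpuinfo}
    # Split the flags line into one token per line, then match the whole token
    # so that e.g. "avx512f" does not count as "avx512_vnni".
    grep -m1 '^flags' "$cpuinfo" | tr ' ' '\n' | grep -qx -- "$flag"
}

if has_cpu_flag avx512_vnni; then
    echo "AVX-512 VNNI reported"
else
    echo "AVX-512 VNNI not reported"
fi
```

On a Skylake or older host this should take the "not reported" branch, so a VNNI build would not be worth sending there.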
25) Message boards : Number crunching : The aborted and resend WU (Message 1097)
Posted 25 Sep 2020 by Aurum
Post:
Everything I know about BOINC servers is thanks to Doctor Google, i.e. I know nothing as Sergeant Schultz would say.
26) Message boards : Number crunching : Validation Inconclusive (Message 1087)
Posted 25 Sep 2020 by Aurum
Post:
Tullio, that might be me if they're from the last 2 days. I have a 44t computer dying, probably the motherboard. I saw it was throwing off 100% Computation Errors within seconds. It's offline now.
27) Message boards : Number crunching : The aborted and resend WU (Message 1086)
Posted 25 Sep 2020 by Aurum
Post:
I have a bunch of stale unvalidated WUs as well. I wonder if they're from a bad batch that got cancelled before everything finished?
28) Message boards : Number crunching : Got any Betas??? (Message 1085)
Posted 25 Sep 2020 by Aurum
Post:
Server Status says there are 2038 NWChem Long WUs unsent but my log says, "No tasks are available for NWChem long."

Since this morning it's gone from 1 user to 5 so maybe they'll come when they come.

Does the client request betas or are they just sent by the server as available and so nothing would be in my log unless it actually arrived?
29) Message boards : Number crunching : Got any Betas??? (Message 1084)
Posted 25 Sep 2020 by Aurum
Post:
That is such a good idea! I'm going to get out my waffle maker and treat my kids to Belgian waffles with peanut butter and hot maple syrup :-)
30) Message boards : Number crunching : Got any Betas??? (Message 1078)
Posted 24 Sep 2020 by Aurum
Post:
I'm not getting any beta or NWChemLong WUs. Any available?
What's the trick?
Run test applications? Yes
Run only the selected applications:
NWChem: no
NWChem long: yes
If no work for selected applications is available, accept work from other applications? no
Max # jobs: No limit
Max # CPUs: No limit
31) Message boards : Number crunching : "Multithreading" in prefs (Message 954)
Posted 21 Jul 2020 by Aurum
Post:
Nope, it does not work. I can't find the <plan_class> definition in the client_state file. It works fine for LHC ATLAS.
32) Message boards : Number crunching : "Multithreading" in prefs (Message 953)
Posted 21 Jul 2020 by Aurum
Post:
1. Allow to fix the threads per task to one value only, instead of a range.

Currently, Linux users can choose between exactly 1 thread/task, or randomly 1...2 threads/task, or randomly 1/2/4 threads per task, or randomly 1/2/4/8 threads per task. Frankly, the random options are completely bogus.

Whenever I run projects with multithreaded applications on my hosts, I always configure the client to run tasks with one uniform thread count per task. Not doing this will soon confuse the work queue management, and leave me with an under-utilized host, which I detest to no end.
Are you saying this does not work???
<app_config>
<app>
    <name>nwchem_long</name>
    <!-- Xeon E5-2699 v4  22c44t  L3 Cache = 55 MB  -->
</app>
<app_version>
    <app_name>nwchem_long</app_name>
    <plan_class>t1</plan_class>
    <avg_ncpus>18</avg_ncpus>
    <cmdline>--nthreads 18</cmdline>
</app_version>
</app_config>
33) Message boards : News : Credits and Gridcoin (Message 602)
Posted 23 Feb 2020 by Aurum
Post:
So when will QCP be whitelisted for GRC???
QCP is performing far better than many BOINC projects, it's time.
34) Message boards : News : NWChem long (Message 599)
Posted 22 Feb 2020 by Aurum
Post:
The native Linux app has been using 0.8-0.9 GB of RAM. If RAM may be a limiting factor, I put a warning in my app_config, e.g. for Rosetta:
<app_config>
    <app>
        <name>rosetta</name>
        <!-- needs 5 MB L3 cache per WU -->
        <!-- needs 1.5 GB RAM per WU -->
        <!-- Xeon E5-2686 v4,  18c36t,  32 GB,  45 MB L3 Cache  -->
        <max_concurrent>9</max_concurrent>
    </app>
</app_config>
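One way a limit like that could be worked out from the comments, as a minimal sketch: take the smallest of what RAM, L3 cache and thread count each allow. The function name and the choice to take the minimum are assumptions for illustration, not a BOINC rule.

```shell
#!/usr/bin/env bash
# Largest number of concurrent WUs that fits in RAM, L3 cache and CPU threads.
# Arguments: RAM (MB), RAM per WU (MB), L3 (MB), L3 per WU (MB), CPU threads.
max_concurrent() {
    local ram_mb=$1 ram_per_wu_mb=$2 l3_mb=$3 l3_per_wu_mb=$4 threads=$5
    local by_ram=$(( ram_mb / ram_per_wu_mb ))
    local by_l3=$(( l3_mb / l3_per_wu_mb ))
    local limit=$threads
    if (( by_ram < limit )); then limit=$by_ram; fi
    if (( by_l3 < limit )); then limit=$by_l3; fi
    echo "$limit"
}

# Xeon E5-2686 v4: 32 GB RAM, 1.5 GB per WU, 45 MB L3, 5 MB per WU, 36 threads
max_concurrent 32768 1536 45 5 36
```

With these numbers RAM would allow 21 WUs and the cache 9, so the cache is the tighter limit and the result is 9.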

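I also set my swap file to 16 GB:
sudo swapoff -a                                       # turn off existing swap first
sudo dd if=/dev/zero of=/swapfile bs=1M count=16384   # 16384 x 1 MiB = 16 GiB
sudo chmod 600 /swapfile                              # swap file should be readable by root only
sudo mkswap /swapfile
sudo swapon /swapfile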
35) Message boards : Number crunching : Application 0.15 (t1) (beta test) Result Not Completing Successfully (Message 569)
Posted 15 Feb 2020 by Aurum
Post:
Ok, but there's no reason that two different runs should perform the same number of operations. I don't know exactly what they're simulating but usually it's two molecules placed in close proximity. Different orientations are tested seeking the configuration with lowest energy. No two simulations will get to the lowest energy configuration via the same path, even if run on the same computer.

BTW, I see thousands of Longs on the Server Status but only 2 users running them. Little help...
36) Message boards : Number crunching : Application 0.15 (t1) (beta test) Result Not Completing Successfully (Message 567)
Posted 14 Feb 2020 by Aurum
Post:
Xeon E5-2686 v4. Not sure how CPU time is so much higher than run time. Logic dictates that Run Time > or = CPU Time.

https://quchempedia.univ-angers.fr/athome/result.php?resultid=1719940

https://quchempedia.univ-angers.fr/athome/workunit.php?wuid=1101030

My wingman ran for 93 hours on an AMD Ryzen 7 1700 Eight-Core Processor. I wonder if this t8 WU was the only thing running on this 8c CPU?
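One possible explanation, offered as an assumption rather than something confirmed for this WU: for a multithreaded task the reported CPU time is usually summed over all threads, so it can exceed run time by up to the thread count. A minimal sketch of that bound, with a made-up function name and hypothetical numbers:

```shell
#!/usr/bin/env bash
# Upper bound on summed CPU time for a multithreaded task.
# Arguments: run (wall-clock) time in hours, number of threads.
cpu_time_upper_bound_hours() {
    local run_hours=$1 threads=$2
    # Each thread can accumulate at most one CPU-hour per wall-clock hour.
    echo $(( run_hours * threads ))
}

# Hypothetical t8 task with a 10-hour run time
cpu_time_upper_bound_hours 10 8
```

This prints 80, so under that assumption a t8 task showing anything between 10 and 80 CPU hours for a 10-hour run would be unremarkable.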
37) Message boards : Number crunching : Application 0.15 (t1) (beta test) Result Not Completing Successfully (Message 565)
Posted 14 Feb 2020 by Aurum
Post:
Just got my first t8 Long to validate with credit. It ran for 73 hours.
38) Message boards : News : Molecules are coming! (Message 558)
Posted 12 Feb 2020 by Aurum
Post:
Why??? I want 32 threads to test. It's set to No Limit.
39) Message boards : News : Molecules are coming! (Message 555)
Posted 12 Feb 2020 by Aurum
Post:
An individual thread in the multi-threaded ATLAS application can take minutes to complete, so beyond a certain number of threads, the time spent waiting for the last thread to finish while all the other threads sit idle leads to inefficiency.
Yes, that's why I only run ATLAS single-threaded. Will NWChem behave the same way? I just got a t2 and a t8 so I'm watching for now.
40) Message boards : News : Molecules are coming! (Message 552)
Posted 12 Feb 2020 by Aurum
Post:
You never know until you try it. LHC ATLAS will use all CPU threads unless tamed like this:
<app_config>
<app>
    <name>nwchem_long</name>
    <!-- Xeon E5-2699 v4  22c44t  L3 Cache = 55 MB  -->
</app>
<app_version>
    <app_name>nwchem_long</app_name>
    <plan_class>t1</plan_class>
    <avg_ncpus>18</avg_ncpus>
    <cmdline>--nthreads 18</cmdline>
</app_version>
</app_config>
Then we can decide what works best for a given CPU model.



©2024 Benoit DA MOTA - LERIA, University of Angers, France