use hyper threading?

Drago75

Joined: 26 Mar 21
Posts: 6
Credit: 2,026,800
RAC: 0
Message 1723 - Posted: 12 Apr 2022, 21:47:03 UTC
Last modified: 12 Apr 2022, 21:48:48 UTC

I ran a full set of 16 WUs at once on my 8-core Linux machine, but they seem to finish a lot slower than when I let only 8 tasks run at once. It is a bit difficult to compare, since the completion times are usually all over the place, but on average I would say they finish quicker if only one task per physical CPU core is assigned. Does anybody have any data about this? Does hyper-threading make sense with this project? 16 GB of memory is sufficient in both cases.
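One way to make this comparison less fuzzy is to look at throughput (tasks finished per hour) instead of per-task time. Below is a minimal Python sketch, assuming the elapsed times are copied by hand from the task list. The function name and all sample numbers are made up for illustration and are not measurements from this project.

```python
from statistics import mean

def tasks_per_hour(concurrent_tasks, runtimes_seconds):
    """Throughput when `concurrent_tasks` run side by side and each one
    takes, on average, mean(runtimes_seconds) of wall-clock time."""
    return concurrent_tasks * 3600.0 / mean(runtimes_seconds)

# Hypothetical elapsed times in seconds (NOT real project data).
runtimes_8 = [3500, 3700, 3600]    # sampled while 8 tasks ran at once
runtimes_16 = [7900, 8300, 8100]   # sampled while 16 tasks ran at once

rate_8 = tasks_per_hour(8, runtimes_8)
rate_16 = tasks_per_hour(16, runtimes_16)
print(f"8 at once:  {rate_8:.2f} tasks/hour")
print(f"16 at once: {rate_16:.2f} tasks/hour")
```

Running 16 at once only pays off if the average time per task stays below twice the 8-task time; with noisy run times, a larger sample gives a more trustworthy average.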
computezrmle

Joined: 26 Jan 22
Posts: 4
Credit: 510,400
RAC: 0
Message 1724 - Posted: 13 Apr 2022, 15:05:40 UTC - in response to Message 1723.  

To understand what hyper-threading means and how it affects computer performance, this might be a good entry point:
https://en.wikipedia.org/wiki/Hyper-threading

It may also give you an impression of why nobody will be able to give a 100% correct answer as to whether it's better to run the project with or without HT on your computer.
Erich56

Joined: 23 Feb 22
Posts: 23
Credit: 4,423,400
RAC: 0
Message 1725 - Posted: 14 Apr 2022, 19:07:07 UTC - in response to Message 1723.  

You are indeed running 16 WUs concurrently with 16 GB of RAM?
So the RAM requirements seem to differ substantially between Linux and Windows. My Windows machines with 8 GB of RAM (and 6 "real" cores) don't allow more than 3 WUs concurrently. For each WU, 1,900 MB of RAM is allocated, although the peak working set size does not exceed around 58 MB. This is really too bad, as I can only use half of the CPU capacity on 3 such machines :-(
Aurum
Joined: 14 Dec 19
Posts: 68
Credit: 45,744,261
RAC: 0
Message 1726 - Posted: 17 Apr 2022, 8:59:54 UTC
Last modified: 17 Apr 2022, 9:08:35 UTC

It's been a long time since I tried running with HT disabled. I had come to the conclusion that the problem with some BOINC projects is the amount of CPU cache they fill. Some projects, like WCG's MIP, had that problem and recommended a rule of thumb: run fewer WUs than your L3 cache size divided by 5 MB/WU. I don't know if there's such a rule for QC, but there probably should be. I set the maximum number of WUs to 8 in preferences with a single CPU and they ran much better. I leave HT enabled since it seems to help normal projects more than it hurts. For CPUs with fewer threads I specify a different venue and use no more than half the threads for a project like QC. Run another project like Universe alongside, and HT will fill in all the gaps with it and maximize your overall throughput.
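The cache rule of thumb is simple arithmetic, sketched here in Python. The 5 MB/WU figure is the one quoted for WCG's MIP; whether it applies to QC is an assumption, and the function name and the two example cache sizes are only illustrative.

```python
def max_wus_for_cache(l3_cache_mb, mb_per_wu=5.0):
    """Upper bound on concurrent WUs: L3 cache size divided by the
    assumed cache footprint per WU, rounded down (at least 1)."""
    if l3_cache_mb <= 0 or mb_per_wu <= 0:
        raise ValueError("cache size and per-WU footprint must be positive")
    return max(1, int(l3_cache_mb // mb_per_wu))

# Example L3 sizes in MB; the 5 MB/WU default is the MIP figure, not a QC one.
for l3_mb in (16, 64):
    print(f"{l3_mb} MB L3 -> at most {max_wus_for_cache(l3_mb)} WUs")
```

Treat the result as a starting point for experiments and not as a hard limit, since the real per-WU footprint for QC is unknown.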

I think the effect also depends on the CPU cache design, e.g. compare an E5-2699 v4 to an i9-9980XE, which have different styles of cache. You didn't say what model CPU you're using, just that it's 8c/16t. I suggest leaving HT enabled and setting your preferences for QC to Max # jobs = 4 and Max # CPUs = 1. If that works well, raise Max # jobs until you feel it slowing down, then back off by one and you're done.
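The step-up-then-back-off procedure can be sketched as a small loop in Python. `measure_throughput` is a hypothetical callback that returns tasks per hour for a given Max # jobs setting; in practice that means changing the preference, letting a batch finish, and noting the rate by hand. The rates in the example are invented.

```python
def tune_max_jobs(measure_throughput, start=4, limit=16):
    """Raise the job count from `start` while throughput keeps improving
    and return the last count before it stopped improving."""
    best_jobs = start
    best_rate = measure_throughput(start)
    for jobs in range(start + 1, limit + 1):
        rate = measure_throughput(jobs)
        if rate <= best_rate:
            break  # slowing down: keep the previous setting
        best_jobs, best_rate = jobs, rate
    return best_jobs

# Invented tasks/hour figures per job count (NOT real measurements).
fake_rates = {4: 4.0, 5: 4.9, 6: 5.7, 7: 6.1, 8: 6.0, 9: 5.5}
best = tune_max_jobs(lambda jobs: fake_rates[jobs], start=4, limit=9)
print("Suggested Max # jobs:", best)
```

Because run times vary a lot from task to task, each measurement should average over several completed tasks before moving to the next setting.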
Drago75

Joined: 26 Mar 21
Posts: 6
Credit: 2,026,800
RAC: 0
Message 1727 - Posted: 24 Apr 2022, 14:31:43 UTC - in response to Message 1726.  

Hey Aurum. Thanks for your input. That may really explain it. I am noticing a major difference in completion times on my Ryzen 7 5800H, which has only 16 MB of L3 cache, and not so much on my Ryzen 9s, which have 64 MB, when I let them run a full stack. If HT is not so much the problem, I will experiment a little and try to find an optimum setting for the R7, since the question may not be just 8 or 16 WUs at once.

Yesterday I read that AMD is starting to introduce a new processor design with 3D-stacked L3 cache, which increases its size significantly. That may be a useful feature for crunchers.
Drago75

Joined: 26 Mar 21
Posts: 6
Credit: 2,026,800
RAC: 0
Message 1728 - Posted: 24 Apr 2022, 14:41:23 UTC - in response to Message 1725.  

Hey Erich56. The problem with the Windows version of QuChem is that it needs virtualization, which requires a lot of resources. You can run QuChem natively under Linux, and yes, 16 WUs at once do fit into 16 GB of RAM. :-)

If you want to try out Linux, you can set up a fast USB stick with a Mint or Ubuntu distro and boot your PC from it if you don't want two OSes on the same hard drive. You can also run BOINC from that USB stick, as it doesn't require fast read and write access. It works perfectly. I do that on my two Ryzen 9s when I support projects that run faster under Linux, like QuChem or Universe.


©2024 Benoit DA MOTA - LERIA, University of Angers, France