Message boards : Number crunching : use hyper threading?
Joined: 26 Mar 21 · Posts: 6 · Credit: 2,026,800 · RAC: 0
I ran a full set of 16 WUs at once on my 8-core Linux machine, but they seem to finish a lot slower than when I let only 8 tasks run at once. It is a bit difficult to compare, since the completion times are usually all over the place, but on average I would say they finish quicker if only one task per physical CPU core is assigned. Does anybody have any data on this? Does hyper-threading make sense with this project? 16 GB of memory is sufficient in both cases.
Joined: 26 Jan 22 · Posts: 4 · Credit: 510,400 · RAC: 0
To understand what hyper-threading means and how it affects computer performance, this might be a good entry point: https://en.wikipedia.org/wiki/Hyper-threading You may also get an impression of why nobody will be able to give a 100% correct answer as to whether it's better to run the project with or without HT on your computer.
Joined: 23 Feb 22 · Posts: 23 · Credit: 4,423,400 · RAC: 0
"Memory of 16 GB is sufficient in both cases." You are indeed running 16 WUs concurrently with 16 GB RAM? So the RAM requirements seem to differ substantially between Linux and Windows. My Windows machines with 8 GB RAM (and 6 "real" cores) don't allow more than 3 WUs concurrently: about 1,900 MB of RAM is allocated for each WU, although the peak working set size does not exceed roughly 58 MB. This is really too bad, as I can only use half of the CPU capacity on 3 such machines :-(
Joined: 14 Dec 19 · Posts: 68 · Credit: 45,744,261 · RAC: 0
It's been a long time since I tried running with HT disabled. I came to the conclusion that the problem with some BOINC projects is the amount of CPU cache they fill. Some projects, like WCG's MIP, had that problem, and the recommended rule of thumb was to run no more WUs than your L3 cache divided by 5 MB per WU. I don't know if there's such a rule for QC, but there probably should be. I set the maximum number of jobs to 8 in the preferences, with a single CPU per WU, and they ran much better.

I leave HT enabled since it seems to help normal projects more than it hurts. For CPUs with fewer threads I specify a different venue and use no more than half the threads for a project like QC. Run another project like Universe alongside it and HT will fill in all the gaps, so your overall throughput is maximized. I think you can also see a difference between CPU cache designs, e.g. compare an E5-2699 v4 to an i9-9980XE with their different cache layouts.

You didn't say which CPU model you're using, just that it's 8c/16t. I suggest leaving HT enabled and setting your QC preferences to Max # jobs = 4 and Max # CPUs = 1. If that works well, raise Max # jobs until you feel it slowing down, then back off by one and you're done.
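To make the rule of thumb above concrete, here is a minimal sketch of the arithmetic, assuming the figures quoted in this thread: roughly 5 MB of L3 cache per WU (the MIP-style rule) and about 1,900 MB of RAM allocated per WU as reported for Windows under virtualization (native Linux appears to need much less). The function name, the OS reserve, and the default values are illustrative assumptions, not project-verified numbers.

# Rough estimate of how many QuChem WUs to run at once, limited by L3 cache,
# RAM, and thread count. All parameters are assumptions to be tuned.
def max_concurrent_wus(l3_cache_mb, ram_mb, threads,
                       cache_per_wu_mb=5,    # MIP-style rule of thumb quoted above
                       ram_per_wu_mb=1900,   # per-WU allocation reported on Windows
                       os_reserve_mb=2048):  # assumed headroom for the OS itself
    by_cache = l3_cache_mb // cache_per_wu_mb
    by_ram = (ram_mb - os_reserve_mb) // ram_per_wu_mb
    return max(1, min(by_cache, by_ram, threads))

# CPUs mentioned in this thread:
print(max_concurrent_wus(16, 16384, 16))  # Ryzen 7 5800H (16 MB L3): cache-bound at 3
print(max_concurrent_wus(64, 16384, 16))  # Ryzen 9 (64 MB L3): RAM-bound at 7 with 1.9 GB/WU

On a native Linux host, where this thread reports 16 WUs fitting in 16 GB (roughly 1 GB per WU), lowering ram_per_wu_mb accordingly would make the cache term the binding constraint on most CPUs.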
Joined: 26 Mar 21 · Posts: 6 · Credit: 2,026,800 · RAC: 0
Hey Aurum, thanks for your input. That may really explain it. I am noticing a major difference in completion times on my Ryzen 7 5800H, which has only 16 MB of L3 cache, and not so much on my Ryzen 9s when I let them run a full stack; they are equipped with 64 MB. If the problem is not HT so much, then I will experiment a little and try to find an optimal setting for the R7, since the question isn't just 8 or 16 WUs at once. Yesterday I read that AMD is starting to introduce a new processor design with 3D-stacked L3 cache, which increases its amount significantly. That may be a useful feature for crunchers.
Joined: 26 Mar 21 · Posts: 6 · Credit: 2,026,800 · RAC: 0
"You are indeed running 16 WUs concurrently with 16 GB RAM?" Hey Erich56. The problem with the Windows version of QuChem is that it needs virtualization, which requires a lot of resources. You can run QuChem natively under Linux, and yes, 16 WUs at once do fit into 16 GB of RAM. :-) If you want to try out Linux, you can set up a fast USB stick with a Mint or Ubuntu distro and boot your PC from it, so you don't need two OSs on the same hard drive. You can also run BOINC from that USB stick, since it doesn't require fast read and write access; it works perfectly. I do that on my two Ryzen 9s when I support projects that run faster under Linux, like QuChem or Universe.