Hyperthreading issue

THE POST BELOW IS MORE THAN 5 YEARS OLD. RELATED SUPPORT INFORMATION MIGHT BE OUTDATED OR DEPRECATED

On 22/10/2009 at 13:56, xxxxxxxx wrote:

User Information:
Cinema 4D Version: 10-11.5
Platform: Windows; Mac; Mac OSX
Language(s): C++

---------
Hi,

What is the best way to take advantage of hyperthreading?

I use MPThreads for multi-CPU support (as documented), but some users report that the CPU threads are not fully used. It works on multicore CPUs, but hyperthreading doesn't seem to be taken into account by MPThreads, does it?

I tested this on my i7 system with 4 cores and 8 available threads. Task Manager tells me that the MP code uses only 25% of the available CPU power! (In more detail: 4 threads working at 50%.)
On my AMD dual core it has no problem using both cores at 100%, and on a Mac Pro with 8 cores it also uses all cores at 100%. Other configurations work correctly as well, but whenever hyperthreading is involved there's this problem (at least I think it's related to hyperthreading). I have a beta tester with the same i7 system (4 cores, 8 threads), and he sees the same problem.

So my question is: how do I support this type of threading? The MPThread code doesn't seem to fully work for this, so what would I have to add to support all threads, whether MP or HT? Should the user enter the number of "CPUs" (instead of the plugin automatically using GeGetCPUCount()), with the MPThreads then created accordingly? Or is there a completely different reason for this?

HELP! *Get me out of here... I'm a CPU* :)

Thanks in advance
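
The question of automatic detection versus a user-supplied count can be sketched in standard C++. This is only an illustration under stated assumptions: `chooseWorkerCount` is a hypothetical helper, and `std::thread::hardware_concurrency()` stands in for the SDK's GeGetCPUCount(); like the value of 8 reported later in this thread, it counts logical CPUs, so hyperthreads are included.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper (not part of the Cinema 4D SDK): decide how many
// worker threads to start.
//   userOverride <= 0 : detect automatically
//   detected          : logical CPU count reported by the system
//                       (hyperthreads included); 0 means "unknown"
int chooseWorkerCount(int userOverride, unsigned detected)
{
    const int maxWorkers = detected > 0 ? static_cast<int>(detected) : 1;
    if (userOverride <= 0)
        return maxWorkers;                      // automatic: one worker per logical CPU
    return std::min(userOverride, maxWorkers);  // manual: never exceed the detected count
}

// Convenience overload using the standard library's logical CPU count.
int chooseWorkerCount(int userOverride)
{
    return chooseWorkerCount(userOverride, std::thread::hardware_concurrency());
}
```

With a setting like this, a user on a 4-core, 8-thread machine could enter 4 to try one worker per physical core, although the operating system still decides where each thread actually runs.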

On 22/10/2009 at 15:11, xxxxxxxx wrote:

Does your renderer utilize 4 or 8 threads? It may be that the new hyperthreading technology is not yet supported in Cinema 4D, or it may be an SDK limitation. It will be interesting to find out. :)

On 23/10/2009 at 01:58, xxxxxxxx wrote:

What is the return value of GeGetCPUCount() on your i7 system?

Best regards,

Wilfried Behne

On 23/10/2009 at 05:09, xxxxxxxx wrote:

Hi,

Rendering gives me 8 threads as expected, and GeGetCPUCount() also returns 8 (which I use for creating the MPThreads)! I have just set up Visual Studio on my i7 system, so I will run some debug sessions now. But I wonder why not all processors are used when it returns the correct number of CPU threads (considering that it works on other systems).

I'll report back what my debugging shows. I am open to more information or possible causes, though, as I am not sure what to look for (first I'll check whether the threads are started and executed correctly).

Thanks!

On 23/10/2009 at 06:09, xxxxxxxx wrote:

OK, it seems the threads are created and started correctly. So maybe the threads simply finish so quickly that the CPU usage never reaches 100%? (This is a series of threads starting and ending, not a one-time creation and start.) But the strange thing is that the same code takes several times longer to run on a faster system. That doesn't seem right to me.

(These figures are of course not per frame, but for 300 frames.)
i7 920 (4 cores, 8 threads): CPU load 20-30%, one thread peaks fully while the others show hardly any activity, 434 seconds
Q6600 (4 cores): CPU load 33-42%, evenly spread over all cores, 96 seconds

What could be the reason for this? Any ideas? All systems run Vista 64-bit except mine, which already runs Windows 7 64-bit. Could this be a driver issue? Or could it be related to Vista? Or to the system configuration? Or maybe creating and starting the threads takes longer on that one system (but would that have such a high impact on performance)? It's all rather strange, but apparently not a problem with the MPThreads, as they are executed as they should be. Darn :-(
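
One way to check whether repeatedly creating and starting threads is expensive on a given machine is to time it with an almost empty job. The following is a minimal sketch, assuming standard C++ threads in place of MPThreads; `SpawnCost` and `measureSpawnCost` are hypothetical names, and the result only indicates the cost of operating-system threads, not of the SDK's own thread handling.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

struct SpawnCost
{
    double seconds;  // wall-clock time for all rounds
    int    jobsRun;  // number of (nearly empty) jobs that actually executed
};

// Start and join `workers` threads, `rounds` times in a row, each thread
// doing nothing but incrementing a shared counter once.
SpawnCost measureSpawnCost(int workers, int rounds)
{
    assert(workers > 0 && rounds > 0);
    std::atomic<int> jobsRun(0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        std::vector<std::thread> pool;
        for (int w = 0; w < workers; ++w)
            pool.emplace_back([&jobsRun] { ++jobsRun; });
        for (std::thread& t : pool)
            t.join();  // wait for this round before starting the next one
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return SpawnCost{elapsed.count(), jobsRun.load()};
}
```

If `measureSpawnCost(8, 300).seconds` comes out as a tiny fraction of the total run time, thread start-up is unlikely to explain a difference like the one measured above.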

On 23/10/2009 at 06:12, xxxxxxxx wrote:

Quote: Originally posted by 3D Designer on 23 October 2009
>
> * * *
>
> Hi,
>
> Rendering gives me 8 threads as expected, and GeGetCPUCount() also returns 8 (which I use for creating the MPThreads)! I have just set up Visual Studio on my i7 system, so I will run some debug sessions now. But I wonder why not all processors are used when it returns the correct number of CPU threads (considering that it works on other systems).
>
> * * *

Cinema doesn't care whether these are "real threads on real cores" or hyperthreads. If GeGetCPUCount() returns 8, you should get 8 threads running.

My first suspicion would be that you have a resource sharing problem (e.g. threads sharing the same cache line, or threads waiting for memory access), something that can ruin HT performance pretty badly. I was already wondering why your overall CPU utilization is so low...

Best regards,

Wilfried Behne
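
The "same cache line" case mentioned above is often called false sharing. A minimal sketch of the usual countermeasure, giving each thread's data its own cache line, could look like this. It assumes a 64-byte cache line (common on x86 but not guaranteed) and uses standard C++ threads instead of MPThreads; `PaddedCounter` and `countInParallel` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine  = 64;
constexpr std::size_t kMaxWorkers = 64;

// Aligning the struct to the cache-line size also pads it to that size,
// so neighbouring counters in an array never share a line.
struct alignas(kCacheLine) PaddedCounter
{
    long value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per cache line");

// Every worker increments only its own counter; the counters are summed
// after all workers have finished.
long countInParallel(std::size_t workers, long incrementsPerWorker)
{
    assert(workers > 0 && workers <= kMaxWorkers);
    PaddedCounter counters[kMaxWorkers];  // each element starts at 0

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w)
        pool.emplace_back([&counters, w, incrementsPerWorker] {
            for (long i = 0; i < incrementsPerWorker; ++i)
                ++counters[w].value;
        });
    for (std::thread& t : pool)
        t.join();

    long total = 0;
    for (std::size_t w = 0; w < workers; ++w)
        total += counters[w].value;
    return total;
}
```

Padding only helps when threads write to neighbouring small items. It does nothing for code that is limited by memory bandwidth or cache capacity.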

On 23/10/2009 at 06:19, xxxxxxxx wrote:

Quote: Originally posted by wbeh on 23 October 2009
>
> * * *
>
> Cinema doesn't care whether these are "real threads on real cores" or hyperthreads. If GeGetCPUCount() returns 8, you should get 8 threads running.
>
> * * *

Yep, that works.

> Quote: My first suspicion would be that you have a resource sharing problem (e.g. threads sharing the same cache line, or threads waiting for memory access), something that can ruin HT performance pretty badly. I was already wondering why your overall CPU utilization is so low...
>
> * * *

Well, all threads access the same array, but each one works on a different part of memory (i.e. the first thread processes entries 0-100, the second one 101-200, etc.). The work is evenly spread, so no thread should have to wait because of access to the same memory. Or could this still be a problem?
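
The block partitioning described here can be sketched as a small helper. This is an illustration only: `chunkRange` and `Range` are hypothetical names, and the ranges are half-open (the end index is not included).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Half-open range [first, second) of array entries handled by one worker.
using Range = std::pair<std::size_t, std::size_t>;

// Hypothetical helper: split `total` entries into `workers` contiguous
// blocks whose sizes differ by at most one entry.
Range chunkRange(std::size_t total, std::size_t workers, std::size_t index)
{
    assert(workers > 0 && index < workers);
    const std::size_t base  = total / workers;  // entries every worker gets
    const std::size_t extra = total % workers;  // first `extra` workers get one more
    const std::size_t begin = index * base + (index < extra ? index : extra);
    const std::size_t size  = base + (index < extra ? 1 : 0);
    return Range(begin, begin + size);
}
```

For example, 10 entries split over 4 workers give the ranges [0,3), [3,6), [6,8) and [8,10). With contiguous blocks, two threads can only touch the same cache line at a block boundary, but all threads still compete for the same caches and memory bandwidth.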

On 23/10/2009 at 07:16, xxxxxxxx wrote:

Quote: Originally posted by 3D Designer on 23 October 2009
>
> * * *
>
> Well, all threads access the same array, but each one works on a different part of memory (i.e. the first thread processes entries 0-100, the second one 101-200, etc.). The work is evenly spread, so no thread should have to wait because of access to the same memory. Or could this still be a problem?
>
> * * *

Have a look at this: http://www.xbitlabs.com/articles/cpu/display/nehalem-microarchitecture_4.html

Except for the processor registers and the return stack buffer, all other resources are shared when using hyperthreading. If your code is limited by memory bandwidth or puts too much pressure on the reorder buffer (ROB) or the cache (i.e. the working set doesn't fit into it), enabling HT might actually decrease performance.

The easiest test would be to disable HT in the BIOS and check again.

Best regards,

Wilfried Behne
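
As a rough complement to the BIOS test, the same workload can be timed with different thread counts, e.g. 8 versus 4 on the i7. The sketch below assumes standard C++ threads rather than MPThreads, and `RunResult` and `timedParallelSum` are hypothetical names. Running with fewer threads is not equivalent to disabling HT, because the operating system may still place two threads on the same physical core.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

struct RunResult
{
    double    seconds;  // wall-clock time of the parallel section
    long long sum;      // checksum, so the result of the work is actually used
};

// Sum `data` with `workers` threads, each handling one contiguous block,
// and report how long the parallel section took.
RunResult timedParallelSum(const std::vector<int>& data, std::size_t workers)
{
    assert(workers > 0);
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = data.size() * w / workers;
        const std::size_t end   = data.size() * (w + 1) / workers;
        pool.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (std::thread& t : pool)
        t.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return RunResult{elapsed.count(),
                     std::accumulate(partial.begin(), partial.end(), 0LL)};
}
```

Comparing `timedParallelSum(data, 4).seconds` with `timedParallelSum(data, 8).seconds` on a large array gives a first hint of whether the extra threads help or hurt. A summation like this is typically memory-bound, which is the kind of workload described above as a poor fit for HT.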

On 23/10/2009 at 07:35, xxxxxxxx wrote:

Thank you, Wilfried, for the link and the info. I will study it and test without HT.

I'll report back.

On 24/10/2009 at 05:02, xxxxxxxx wrote:

Yep, it was Hyperthreading! :) Thanks again for your help.