OpenSource For You

A performance graph of threads vs chunk size

As you can see in Figure 3, a single-thread process results in a different execution time for each chunk size. One thread on a single core has to compete for CPU time, so each time slot it gets is spent on N loop iterations. A larger chunk size gives better performance on a single thread, because the dynamic schedule has fewer chunks to allocate and so adds less overhead. The single-thread run of the OpenMP code is slower than plain serial code, due to the overheads involved.
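To make the chunk size discussion concrete, here is a minimal sketch of a loop that uses the dynamic schedule. The function name `sum_squares_dynamic`, its parameters and the sum-of-squares workload are assumptions made for illustration; they are not the code that was benchmarked for Figure 3.

```c
#include <assert.h>

/* Hypothetical helper (not from the article): adds up i*i for i = 0..n-1.
 * With OpenMP enabled, iterations are handed out in blocks of `chunk`
 * to whichever thread asks for work next. Without OpenMP support the
 * pragma is ignored and the loop simply runs serially. */
long long sum_squares_dynamic(long long n, int threads, int chunk)
{
    long long total = 0;
    long long i;

    /* Both values must be positive for the clauses below to be valid. */
    assert(threads >= 1 && chunk >= 1);

    #pragma omp parallel for num_threads(threads) \
            schedule(dynamic, chunk) reduction(+:total)
    for (i = 0; i < n; i++) {
        total += i * i;
    }

    return total;
}
```

Built with a compiler flag such as `gcc -fopenmp`, calling this function with `threads` set to 1 and different values of `chunk` should show the kind of effect described above: the result is the same each time, but a smaller chunk means more requests to the scheduler for the same amount of work.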

For two-thread operations, a chunk size of 2 takes more time to complete than all other chunk sizes. Chunk size 1 provides maximum performance. With four threads, each thread has approximately 188 chunks to process. If we keep increasing the number of threads up to the number of cores present in the system, i.e., 128, the performance difference between chunk sizes almost vanishes. The slight difference that remains with 128 cores comes from the overhead of the dynamic schedule option, in which threads request chunks to process. In dynamic mode, threads pick up work gradually, as each one requests chunks of data to process. Since a job may be completed before all the specified threads have received any work, it is necessary to choose the chunk size and the number of threads wisely if the system is used for multiple parallel programs.
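The point about a job finishing before every thread gets work can be checked with simple arithmetic. The sketch below assumes that a dynamic schedule splits N iterations into ceil(N / chunk) chunks, so no more than that many threads can ever be busy. The helper names `chunk_count` and `busy_threads` are hypothetical, and the numbers in the usage note are illustrative rather than taken from the benchmark.

```c
#include <assert.h>

/* Hypothetical helper: number of chunks a loop of n iterations is
 * split into for a given chunk size, i.e. ceil(n / chunk). */
long long chunk_count(long long n, long long chunk)
{
    assert(n >= 0 && chunk >= 1);
    return (n + chunk - 1) / chunk;
}

/* Hypothetical helper: upper bound on the number of threads that can
 * receive at least one chunk. Threads beyond this number stay idle. */
long long busy_threads(long long n, long long chunk, long long threads)
{
    long long chunks = chunk_count(n, chunk);
    return chunks < threads ? chunks : threads;
}
```

For example, a loop of 1000 iterations with a chunk size of 8 gives 125 chunks, so at most 125 of 128 threads can do any work, while a chunk size of 1 leaves enough chunks to keep all 128 threads supplied.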
