Multi core programming using Task Parallel Library with .NET 4.0

varun_manipal, 9 Apr 2012 CPOL

The task parallel library allows you to write code which is human readable, less error prone, and adjusts itself with the number of Cores available.

Introduction

Nowadays, all personal computers and workstations come with multiple cores. Most .NET applications fail to harness the full potential of this computing power. Even when developers attempt to do so, it is generally by means of low-level manipulation of threads and locks. This often leads to code that is either unreadable or full of potential threats. These threats often go undetected when running on a single-core machine.

The Task Parallel Library allows you to write code which is human readable, less error prone, and adjusts itself to the number of cores available. So you can be sure that your software will scale with the hardware it runs on, without code changes.

What kind of Performance Boost are we talking about?

What is the first thing you try when you see parts of your code not performing well? Lazy loading, LINQ queries, optimizing for loops, and so on. We often overlook parallelizing the time-consuming, independent units of work.

Most often the CPU will show you the following story during your performance intensive routines.

Shouldn’t your CPU be utilized more like this?

Task Parallel Library

The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in the .NET Framework 4.0. The TPL scales the degree of concurrency dynamically to efficiently use all the cores that are available. By using TPL, you can maximize the performance of your code while focusing on the work that your program is designed to accomplish.

The Task Parallel Library introduces the concept of “Task”. Task parallelism is the process of running these tasks in parallel. A Task is an independent unit of work, which runs within a program. Benefits of identifying tasks within your system are:

More efficient and more scalable use of system resources.
More programmatic control than is possible with a thread or work item.

The Task Parallel Library uses threads under the hood to execute these tasks in parallel. The runtime decides dynamically whether to use additional threads and how many.

Why Tasks? Why not threads?

Creating a thread is expensive. Creating a large number of threads within your application also brings the overhead of context switching. In a single-core environment it can even hurt performance, since one core has to serve all the threads.

A task, on the other hand, does not map one-to-one to a thread: the scheduler decides dynamically whether separate threads of execution are needed. It uses the ThreadPool under the hood to distribute the work, avoiding the overhead of thread creation and unnecessary context switching.

Fig 1. The time difference between a traditional Thread based approach, and a task based approach.

The following code snippet shows the creation of parallel tasks using Threads and Task.
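A minimal sketch of what such a comparison might look like. The SumOfSquares method and the class name are hypothetical stand-ins for an independent, CPU-bound unit of work, and Task.Factory.StartNew is used because it is available in .NET 4.0.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadsVersusTasks
{
    // Hypothetical stand-in for an independent, CPU-bound unit of work.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += (long)i * i;
        return total;
    }

    // Thread-based version: one dedicated thread per unit of work.
    public static long[] RunWithThreads(int units, int n)
    {
        var results = new long[units];
        var threads = new Thread[units];
        for (int u = 0; u < units; u++)
        {
            int index = u; // copy, so each lambda sees its own index
            threads[u] = new Thread(() => results[index] = SumOfSquares(n));
            threads[u].Start();
        }
        foreach (Thread t in threads) t.Join(); // wait for every thread
        return results;
    }

    // Task-based version: the scheduler maps the tasks onto pool threads.
    public static long[] RunWithTasks(int units, int n)
    {
        var tasks = new Task<long>[units];
        for (int u = 0; u < units; u++)
            tasks[u] = Task.Factory.StartNew(() => SumOfSquares(n));
        Task.WaitAll(tasks);
        var results = new long[units];
        for (int u = 0; u < units; u++) results[u] = tasks[u].Result;
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(RunWithThreads(4, 10)[0]); // prints 385
        Console.WriteLine(RunWithTasks(4, 10)[0]);   // prints 385
    }
}
```

Both versions produce the same results; the difference is that the thread version always creates exactly as many threads as there are units of work, while the task version leaves that decision to the runtime.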

You can download the sample used above.

So how is this different from creating a thread? One of the first advantages of using tasks over threads is that it becomes easier to maximize the performance of your application on any given system. For example, if I fire off multiple threads that all do heavy CPU-bound work on a single-core machine, the work is likely to take significantly longer. Threading has overhead, and if you try to execute more CPU-bound threads than you have cores available to run them, you can run into problems. Each time the CPU switches from one thread to another there is a bit of overhead, and if many threads are running at once this switching can happen quite often, causing the work to take longer than if it had simply been executed synchronously. This diagram might help spell that out a bit better:

As you can see, if we aren't switching between pieces of work, we don't pay for context switches between threads. With switching, the total cumulative time to process is much longer, even though the same amount of work is done. If two cores were available, the two sets of work could simply execute simultaneously, one on each core, providing the highest possible efficiency.

Why Tasks? Why not ThreadPools?

Now that we have a rough idea of tasks and what they can do, let us look at them in a little more detail and see how they differ from the ThreadPool.

Let us see how you can start a new execution on a ThreadPool
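A minimal sketch, assuming a console application; the class and method names are hypothetical. ThreadPool.QueueUserWorkItem hands a callback to the pool and returns immediately.

```csharp
using System;
using System.Threading;

public static class ThreadPoolStart
{
    // Queues one work item; returns true if the item was queued successfully.
    public static bool Start(string name)
    {
        return ThreadPool.QueueUserWorkItem(
            state => Console.WriteLine("Working on a pool thread: {0}", state),
            name);
    }

    public static void Main()
    {
        Start("job 1");

        // Pool threads are background threads, so without this crude pause
        // the process could exit before the work item has run.
        Thread.Sleep(500);
    }
}
```

Note that nothing is returned that represents the running work: there is no handle to wait on, no result, and no exception to observe.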

Let us see what you have to do if you wish to wait for the thread to finish.
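One common way to do this, sketched under the assumption that a single work item needs to be awaited, is to signal completion by hand with a ManualResetEvent. The ComputeOnPool method is a hypothetical example.

```csharp
using System;
using System.Threading;

public static class ThreadPoolWait
{
    // Runs a small computation on the pool and blocks until it is done.
    public static int ComputeOnPool(int input)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { result = input * 2; }  // the actual work
                finally { done.Set(); }      // signal completion by hand
            });
            done.WaitOne();                  // block the caller until signalled
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(ComputeOnPool(21)); // prints 42
    }
}
```

The event, the captured result variable, and the try/finally are all plumbing that the caller has to write and get right for every work item.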

Messy, isn't it?

What if you have to wait for 15 threads to finish?

How do you capture the return values from multiple threads?

How do you return the control back to GUI thread?

There are answers: delegates and raising events. But this becomes error prone once we drill into a chain of multi-threaded actions.

Let us see how Tasks handle this situation elegantly:

Creation of a new Task
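A minimal sketch using Task.Factory.StartNew, which is available in .NET 4.0; the StartSquare helper is hypothetical. A task that returns a value is a Task<TResult>, which also answers the question of how to capture return values.

```csharp
using System;
using System.Threading.Tasks;

public static class CreateTask
{
    // Starts a task that computes a value and hands back the running task.
    public static Task<int> StartSquare(int x)
    {
        return Task.Factory.StartNew(() => x * x);
    }

    public static void Main()
    {
        // A task without a result.
        Task log = Task.Factory.StartNew(() => Console.WriteLine("Hello from a task"));

        // A task with a result; reading Result blocks until the value is ready.
        Task<int> square = StartSquare(7);
        Console.WriteLine(square.Result); // prints 49

        log.Wait();
    }
}
```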

Waiting on Tasks:
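A minimal sketch of waiting for many tasks at once, such as the 15 units of work mentioned earlier. The RunAll helper and the computed values are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitOnTasks
{
    // Starts 'count' tasks, waits for all of them, and collects the results.
    public static int[] RunAll(int count)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
        {
            int id = i; // copy, so each lambda sees its own id
            tasks[i] = Task.Factory.StartNew(() => id * 10);
        }

        Task.WaitAll(tasks); // blocks until every task has completed

        var results = new int[count];
        for (int i = 0; i < count; i++) results[i] = tasks[i].Result;
        return results;
    }

    public static void Main()
    {
        int[] results = RunAll(15);
        Console.WriteLine(results[14]); // prints 140
    }
}
```

A single task can be awaited with its Wait method, and Task.WaitAny returns the index of the first task in an array to finish.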

Execute another Async task when the current task is done:
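A minimal sketch of chaining with ContinueWith; the LoadThenFormat helper and its two steps are hypothetical. The continuation receives the finished task and can read its result.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinueTask
{
    // Runs a first step, then a second step that consumes the first result.
    public static Task<string> LoadThenFormat(int value)
    {
        Task<int> load = Task.Factory.StartNew(() => value + 1);

        // The continuation is scheduled only once 'load' has completed.
        return load.ContinueWith(previous => "Result: " + previous.Result);
    }

    public static void Main()
    {
        Console.WriteLine(LoadThenFormat(41).Result); // prints "Result: 42"
    }
}
```

In a GUI application, passing TaskScheduler.FromCurrentSynchronizationContext() to ContinueWith from the UI thread runs the continuation back on that thread, which addresses the question of returning control to the GUI.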

In real world scenarios, we often have multiple operations which we want to perform asynchronously. Look at the following code snippet and see how you can model it alternatively.

Parallel Extensions:

Parallel extensions were introduced along with the Task Parallel Library to achieve data parallelism. Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on the elements of a source collection or array. .NET provides the Parallel.For and Parallel.ForEach constructs for this purpose.

Let us see how we can use these:
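A minimal sketch of both constructs; the SquareAll and TotalLength helpers are hypothetical. Each iteration of the Parallel.For loop writes to its own array slot, so no locking is needed, while the Parallel.ForEach loop updates a shared total and therefore uses Interlocked.Add.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelLoops
{
    // Parallel.For: iterations are independent, each writes its own element.
    public static int[] SquareAll(int[] source)
    {
        var squares = new int[source.Length];
        Parallel.For(0, source.Length, i =>
        {
            squares[i] = source[i] * source[i];
        });
        return squares;
    }

    // Parallel.ForEach: shared state must be updated atomically.
    public static int TotalLength(IEnumerable<string> words)
    {
        int total = 0;
        Parallel.ForEach(words, word =>
        {
            Interlocked.Add(ref total, word.Length);
        });
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SquareAll(new[] { 1, 2, 3 })[2]);                        // prints 9
        Console.WriteLine(TotalLength(new[] { "task", "parallel", "library" }));  // prints 19
    }
}
```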

The above mentioned Parallel.ForEach construct utilizes the multiple cores and thus enhances the performance in the same fashion.

The following graphs show how parallel extensions improve the performance of the system:

Fig 1. Matrix multiplication running on a Dual Core machine. The parallel extensions consume less time.

Fig 2. Matrix multiplication running on a Quad Core machine. The same code consumes far less time without any modifications.

Fig 3. Matrix multiplication running on a single core machine. The execution time remains identical.
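A minimal sketch of how a matrix multiplication like the one measured above might be parallelized. This is an assumption about the shape of such a benchmark, not the downloadable sample itself. Each output row depends only on the inputs, so the outer loop is a natural candidate for Parallel.For.

```csharp
using System;
using System.Threading.Tasks;

public static class MatrixMultiply
{
    // Multiplies a (rows x inner) by b (inner x cols), one output row per iteration.
    public static double[,] MultiplyParallel(double[,] a, double[,] b)
    {
        int rows = a.GetLength(0);
        int inner = a.GetLength(1);
        int cols = b.GetLength(1);
        if (b.GetLength(0) != inner)
            throw new ArgumentException("Inner dimensions do not match.");

        var c = new double[rows, cols];
        Parallel.For(0, rows, i =>
        {
            // Row i of c is written by exactly one iteration: no locking needed.
            for (int j = 0; j < cols; j++)
            {
                double sum = 0;
                for (int k = 0; k < inner; k++) sum += a[i, k] * b[k, j];
                c[i, j] = sum;
            }
        });
        return c;
    }

    public static void Main()
    {
        var a = new double[,] { { 1, 2 }, { 3, 4 } };
        var b = new double[,] { { 5, 6 }, { 7, 8 } };
        double[,] c = MultiplyParallel(a, b);
        Console.WriteLine("{0} {1} / {2} {3}", c[0, 0], c[0, 1], c[1, 0], c[1, 1]);
        // prints 19 22 / 43 50
    }
}
```

Only the outer loop is parallel; parallelizing the inner loops as well would add scheduling overhead without exposing more useful work on a machine with a handful of cores.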

You can download the code from the following link [Download]

Conclusion

The parallel extensions and the Task Parallel Library help developers leverage the full potential of the available hardware. The same code adjusts itself to give you the benefits across various hardware. It also improves the readability of the code and thus reduces the risk of introducing the nasty bugs that drive developers crazy.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

varun_manipal

Software Developer (Senior)
India

No Biography provided

Comments and Discussions


With C# Parallel.For, many threads run in parallel on my local 64-bit Windows 7 machine, but after deploying to Windows Server 2008 R2 at most 16 threads run in parallel. Does anyone know why? I have not set any limit on the maximum number of threads.

According to MSDN's Threading.Tasks documentation, only Windows 8 supports multi-core tasks. Did you test with Windows 7? I only have Windows 8 at hand :S

This was tested with Windows 7, but the results should be more or less the same on Windows 8, I believe. Let me know if you find otherwise.

Excellent article, and I liked the demo: simple and effective.

Thanks Chait301

Wow! You did it very well, man! Nice description and nice examples.

5 out of 5.

Very helpful! Thanks!

Really clear and understandable article. High five!

Parallel programming in .NET Framework 4.0 http://www.enterra-inc.com/techzone/parallel-programming-in-net-framework-4-0/

You are spamming the forum. Please delete all your recent posts, or else you will be reported. Jibesh V P

Nice explanations with examples

My vote of 5 is a "thank you" for showing it well: MS has reinvented the wheel again. There is absolutely no difference between tasks and threads if you create as many threads as you have CPUs. And, well, some syntactic sugar for waiting on multiple threads, that's it! MS _WASTES_ our time while we are waiting for really new features.

great one (I missed out on this until today!)

very instructive, thanks

Short, clear, and backed by numbers. 5 stars 4 sho!

After a lot of testing I found that parallel processing always wins. So my vote is 5. Thanks for sharing.

This is a great, inspiring article. I am very pleased with your good work. You have provided really helpful information. Keep it up.

5

Very nice, especially for a first article. Please keep writing. The only thing I would suggest is to cover things in even more depth. Just because the code works, it doesn't mean that it is good code.

The performance improvement of Parallel.ForEach seems quite convincing. Is there a reason not to use this or Parallel.For every time a loop is needed? Thanks!

In general, multi-threaded code is far more difficult to write correctly and to debug. It can even be dangerous if used recklessly. Also, in some simple loops there might be no performance gain at all. Iterating a few times might not overcome the cost of maintaining multiple threads.

Hi Paul89, multi-threading does add some complexity to the code. However, TPL is supposed to provide a controlled execution which is easier to read as well. I guess the key is to identify whether the code is an ideal candidate for parallel execution or not.

yeah, you are right.

Hi Ninj4n,

The trick to using parallel programming is to identify the segments of code which are ideal candidates for parallel processing.

You would want to add parallelism to independent units of code. However, if the executing unit isn't independent, parallelism is a "No".

Ex: Imagine bubble sorting an array within a for loop, where every (i+1)th iteration depends on the ith. Adding parallelism will lead to incorrect results there.

However, matrix multiplication (the one in the download sample) is a perfect example of independent units of work and is thus suited to parallelism.

Hope this helps.

Okay, I get that. It was just that, compared to other parallelism methods, this seemed a very easy way, and it might have been one of those things you could just do "blindly". But clearly that's not the case. Thanks for the answer!

On my i7 machine, if I set each byte in a byte array of 10 million to zero within a normal "for" loop, it is far faster than doing the same using a "Parallel.For" loop.

Thanks for the feedback RugbyLeague.

It is always important to pick the right context to run within parallel loops. Any operation which depends on an earlier iteration might not be the right candidate.

Having said that, it would be interesting to have a look at your code snippet and see how we can leverage parallelism here.

BTW, nice machine you have there ;)

Code snippet here:

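        using System;
        using System.Threading.Tasks;

        class Program
        {
            static void Main(string[] args)
            {
                const int SIZE = 1000000000;

                Console.WriteLine("Assign");
                byte[] buffer = new byte[SIZE];

                // Parallel version: one delegate call per element.
                Console.WriteLine("Parallel");
                Parallel.For(0, SIZE, i => buffer[i] = 255);

                // Traditional version: a plain sequential loop.
                Console.WriteLine("Traditional");
                for (int i = 0; i < SIZE; ++i)
                    buffer[i] = 255;

                Console.WriteLine("Finished");
            }
        }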

Interestingly, if I use the .NET Stopwatch the parallel version is faster, but when I use a profiler the traditional version is faster. I suppose the profiler must be adding a lot of overhead or not timing it effectively.
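When the loop body is as small as a single byte assignment, the per-element delegate call in Parallel.For can dominate the cost. One possible way around this, sketched here as an assumption and not something from the thread, is to hand each worker a contiguous range using Partitioner.Create from System.Collections.Concurrent, and run a plain loop inside each range. The Fill helper is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ChunkedFill
{
    // Fills the buffer in parallel, one contiguous index range per delegate call.
    public static void Fill(byte[] buffer, byte value)
    {
        // Partitioner.Create rejects an empty range, so handle it up front.
        if (buffer.Length == 0) return;

        Parallel.ForEach(Partitioner.Create(0, buffer.Length), range =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive.
            for (int i = range.Item1; i < range.Item2; i++)
                buffer[i] = value;
        });
    }

    public static void Main()
    {
        var buffer = new byte[10000000];
        Fill(buffer, 255);
        Console.WriteLine(buffer[buffer.Length - 1]); // prints 255
    }
}
```

Whether this beats the sequential loop still depends on the machine, since filling memory is often limited by memory bandwidth and not by the number of cores.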

This is much better than the amorphous "hello from task 1 or 2" example everyone else writes. Now we can actually see how it works and what it's good for!

Thanks Dave

I did a TPL series a while back, which you may like to look at for interest

Task Parallel Library: 1 of n
Task Parallel Library: 2 of n
Task Parallel Library: 3 of n
Task Parallel Library: 4 of n
Task Parallel Library: 5 of n
Task Parallel Library: 6 of n

Still this is quite a nice article you have done

Sacha Barber

Microsoft Visual C# MVP 2008-2012
Codeproject MVP 2008-2012

Open Source Projects
Cinch SL/WPF MVVM

Your best friend is you.
I'm my best friend too. We share the same views, and hardly ever argue

My Blog : sachabarber.net

Thanks for the links. It looks very detailed and a good read. I am pretty impressed by this feature. You might also want to have a look at the System.Reactive extensions.

Yeah, I also looked at that a while back (god, that is like 1 1/2 years ago now, yikes, where does the time go): Fun with Rx

It's OK, but it is harder to use and can get quite messy. Clever, though.

Sacha Barber

Brilliant read. I explored it with a PDF-based tutorial available on the Microsoft website. It's a good read. Yeah, it can get messy, but then again, the normal approach of background threads would drive one crazy in those situations. With the Windows 8 "Fast and Fluid" Metro apps, there is an ever-increasing awareness of multi-threaded applications. Good to see non-responsive UIs dying.

Hello Sacha,

since you are a real guru here: Did you ever notice that an installation of .NET 4.5 (coming along with VS11, for example) breaks some TPL behavior inside .NET 4.0 applications?

I noticed this in 2 places:
* VS2010 Build output window: The finishing line "X completed / Y failed / Z skipped" has been missing since the installation of VS11.
* Some of our own 4.0 projects (compiled using VS2010) that use TPL are now deadlocking here and there.

I'm very pissed off by Microsoft's versioning strategy, which changes the CLR every few months. This happened with 2.0, where 3.0 and 3.5 changed the 2.0 CLR, and now it happens again -- it's not only "just adding bits" like they promised.

To be honest, I have not downloaded VS11, so I don't know what's OK and what's broken now.

Sacha Barber

As far as I know, .NET 4.5 is a preview, so if something is broken it will probably be fixed before Windows gets released.

Also, are you sure that your code does not depend on undefined behavior?

I haven't noticed the problem you are talking about with VS 2010 after the installation of VS 11, but it seems to me that it takes a bit longer to load, and there might be some other minor problems. I usually use VS 11, except when building release versions, since setup projects are not supported.

Philippe Mori

At least the VS-last-build-line-missing problem was a one-day mystery only. But the installation of the 4.5 beta breaking our 4.0 apps is reproducible on any random machine.

Of the explanations I've seen, I really liked this one! Have a 5!!!

Thanks Dewey

Good intro! Attempting to load signature... A NullSignatureException was unhandled. Message: "No signature exists"

You are welcome, Zac.
