Parallel.ForEach with HttpClient in C#


Use case: we want to call a remote server asynchronously to fetch data for an array of inputs. To improve performance, we decided to use Parallel.ForEach.

Problem statement: when we use an async HttpClient call inside Parallel.ForEach, some of the examples below produce empty results, because control returns before the async calls complete. Let's find out which approach gives the expected result.

Let's assume we have a list of inputs that need to be passed to HttpClient.

List<int> sampleList = Enumerable.Range(1, 10).ToList();
// Create one HttpClient and reuse it; a new instance per call can exhaust sockets under load.
var httpClient = new HttpClient();
async Task<string> callRemote()
{
    var result = await httpClient.GetStringAsync("https://www.boredapi.com/api/activity");
    return result;
}

Consider the following examples.

Example 1:

List<string> outputList1 = new List<string>();
Parallel.ForEach(sampleList, async i =>
    {
        outputList1.Add(await callRemote());
        Console.WriteLine("Example 1 Executed at {0} ", i);
    });
Console.WriteLine(outputList1.Count); //Outputs: 0

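This prints 0 because Console.WriteLine runs before the remote calls complete. Parallel.ForEach accepts an Action, so the async lambda is compiled as an async void method: each iteration returns to the loop as soon as it reaches the first incomplete await, and the loop finishes while callRemote() is still in flight.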
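The early return described for Example 1 can be reproduced without a network call. The sketch below is only an illustration: FakeRemoteAsync is a hypothetical stand-in for callRemote(), and the 500 ms and 2 s delays are arbitrary values chosen so the effect is easy to see. The first count is expected to be 0, because the loop returns while the simulated calls are still pending; the second count is taken after waiting longer than the simulated call.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int completed = 0;

// Hypothetical stand-in for callRemote(): waits, then returns a string.
async Task<string> FakeRemoteAsync()
{
    await Task.Delay(500);
    return "done";
}

Parallel.ForEach(Enumerable.Range(1, 10), async i =>
{
    // Compiled as async void: control goes back to Parallel.ForEach
    // at this await, before the simulated call has finished.
    await FakeRemoteAsync();
    Interlocked.Increment(ref completed);
});

// Parallel.ForEach has returned, but the continuations have not run yet.
Console.WriteLine($"Right after the loop: {Volatile.Read(ref completed)}");

// Wait longer than the simulated call so the continuations can finish.
await Task.Delay(2000);
Console.WriteLine($"Two seconds later: {Volatile.Read(ref completed)}");
```

Nothing in this sketch observes the async void continuations; the final delay merely outlasts them, which is not something production code should rely on.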

Example 2:

var outputList2 = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.ForEach(sampleList, async i =>
    {
        outputList2.Add(await callRemote());
        Console.WriteLine("Example 2 Executed at {0} ", i);
    });
Console.WriteLine(outputList2.Count); //Outputs: 0

This also prints 0, for the same reason. Switching to the thread-safe ConcurrentBag does not help, because the problem is not concurrent access to the collection: the loop still returns before the awaited calls complete.

Example 3:

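// Iterations run concurrently, so collect results in a thread-safe collection.
var outputList3 = new System.Collections.Concurrent.ConcurrentBag<string>();
// The cancellationToken parameter is supplied by Parallel.ForEachAsync itself.
await Parallel.ForEachAsync(sampleList, async (i, cancellationToken) =>
    {
        outputList3.Add(await callRemote());
        Console.WriteLine("Example 3 Executed at {0} ", i);
    });
Console.WriteLine(outputList3.Count); 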
/* Outputs:
Example 3 Executed at 5 
Example 3 Executed at 1 
Example 3 Executed at 8
Example 3 Executed at 9 
Example 3 Executed at 10 
Example 3 Executed at 6 
Example 3 Executed at 7 
Example 3 Executed at 4 
Example 3 Executed at 3
Example 3 Executed at 2 
*/

Here we used Parallel.ForEachAsync with an async lambda. It awaits every iteration, so the loop completes only after all remote calls have finished, and we get the expected result.
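As a further sketch of the same approach, Parallel.ForEachAsync also accepts a ParallelOptions argument, which can cap concurrency and carry a cancellation token into each iteration. The names and values below are assumptions for illustration only: FakeRemoteAsync is a hypothetical stand-in for callRemote(), and the limit of 4 and the 30-second timeout are arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var inputs = Enumerable.Range(1, 10).ToList();
var results = new ConcurrentBag<string>(); // thread-safe result collection

// Hypothetical stand-in for callRemote() that honors cancellation.
async Task<string> FakeRemoteAsync(int input, CancellationToken ct)
{
    await Task.Delay(100, ct);
    return $"result-{input}";
}

// Cancel the whole loop if it runs longer than 30 seconds (assumed value).
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var options = new ParallelOptions
{
    MaxDegreeOfParallelism = 4, // assumed cap; tune for the remote server
    CancellationToken = cts.Token
};

await Parallel.ForEachAsync(inputs, options, async (i, ct) =>
{
    // ct is the token the loop passes to each iteration.
    results.Add(await FakeRemoteAsync(i, ct));
});

Console.WriteLine(results.Count); // 10
```

Limiting the degree of parallelism is a design choice rather than a requirement; it keeps a large input list from opening too many simultaneous requests against the remote server.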

Example 4:

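// Iterations run on multiple threads, so collect results in a thread-safe collection.
var outputList4 = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.ForEach(sampleList, i =>
    {
        // .Result blocks the current thread until the remote call completes.
        outputList4.Add(callRemote().Result);
        Console.WriteLine("Example 4 Executed at {0} ", i);
    });
Console.WriteLine(outputList4.Count); 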
/* Outputs:
Example 4 Executed at 3 
Example 4 Executed at 9 
Example 4 Executed at 2
Example 4 Executed at 10
Example 4 Executed at 4
Example 4 Executed at 1 
Example 4 Executed at 6 
Example 4 Executed at 7 
Example 4 Executed at 5 
Example 4 Executed at 8 
*/

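Here we used Parallel.ForEach with a synchronous (blocking) call to callRemote(). Because .Result blocks the worker thread until the request completes, the loop finishes only after every call has returned. This gives the expected result, but each in-flight request ties up a thread for its whole duration.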

HttpClient with          async/await                         synchronous call
Parallel.ForEach         Failed (even with ConcurrentBag)    Passed
Parallel.ForEachAsync    Passed                              NA

Parallel.ForEach distributes the iterations across multiple threads and expects each iteration's delegate to run synchronously. When the delegate is an async lambda, it returns at the first incomplete await, so Parallel.ForEach treats the iteration as finished and returns before the results have been collected. That is why we get empty results.

Parallel.ForEach has no way to await the async calls made inside it. If the loop body needs async/await, use Parallel.ForEachAsync, which awaits every iteration before it completes.
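Parallel.ForEachAsync is available from .NET 6 onward. On older runtimes, one commonly used alternative is to start all the calls and await them together with Task.WhenAll. The following is a minimal sketch under that assumption, not a drop-in replacement: FakeRemoteAsync is a hypothetical stand-in for callRemote(), and unlike Parallel.ForEachAsync this form does not limit how many calls run at once.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for callRemote().
async Task<string> FakeRemoteAsync(int input)
{
    await Task.Delay(100);
    return $"result-{input}";
}

var inputs = Enumerable.Range(1, 10).ToList();

// Start every call first, then await them all together.
Task<string>[] pending = inputs.Select(i => FakeRemoteAsync(i)).ToArray();
string[] results = await Task.WhenAll(pending);

// Task.WhenAll returns results in the same order as the input tasks.
Console.WriteLine(results.Length); // 10
Console.WriteLine(results[0]);     // result-1
```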
