Parallel.ForEach with HttpClient in C#
Use case:
We want to call a remote server asynchronously to fetch data for an array of inputs. To improve performance, we decided to use Parallel.ForEach.
Problem statement:
When we make async HttpClient calls inside Parallel.ForEach, some of the examples below produce empty results because control returns before the async calls complete. Let's find out which approach gives the expected result.
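Let's assume we have a list of inputs, each of which triggers one HttpClient call.
// Needs System.Linq, System.Net.Http and System.Threading.Tasks (implicit usings cover these in .NET 6+).
List<int> sampleList = Enumerable.Range(1, 10).ToList();

// Reuse a single HttpClient; creating one per call can exhaust sockets under load.
var httpClient = new HttpClient();

async Task<string> callRemote()
{
    // Completes with the response body once the request has finished.
    return await httpClient.GetStringAsync("https://www.boredapi.com/api/activity");
}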
Consider the below examples.
Example 1:
List<string> outputList1 = new List<string>();
Parallel.ForEach(sampleList, async i =>
{
    // The async lambda is an async void delegate here, so the loop does not wait for it.
    outputList1.Add(await callRemote());
    Console.WriteLine("Example 1 Executed at {0} ", i);
});
Console.WriteLine(outputList1.Count); // Outputs: 0
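This prints 0 because Parallel.ForEach only accepts synchronous delegates. The async lambda is converted to an Action<int> (an async void method), so each iteration returns to the loop at its first await, and Console.WriteLine runs before any call to callRemote() has completed. In addition, List<string> is not thread-safe, so adding to it from several threads would be a bug even if the loop did wait.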
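The early return can be reproduced without a network. The sketch below is a minimal illustration, not the article's code: FakeRemoteAsync is a hypothetical stand-in for callRemote() that waits 200 ms, and the 2-second pause at the end exists only to show that the results do arrive later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for callRemote(): waits 200 ms instead of calling a server.
async Task<string> FakeRemoteAsync(int i)
{
    await Task.Delay(200);
    return $"result {i}";
}

var results = new ConcurrentBag<string>();

// The async lambda is converted to Action<int>, i.e. an async void method.
// Each invocation returns to Parallel.ForEach at its first incomplete await.
Parallel.ForEach(Enumerable.Range(1, 10), async i =>
{
    results.Add(await FakeRemoteAsync(i));
});

// The loop has already returned; the delays are typically still pending.
int countRightAfterLoop = results.Count;
Console.WriteLine($"Right after the loop: {countRightAfterLoop}");

// Demonstration only: give the pending continuations time to finish.
await Task.Delay(2000);
int countAfterWaiting = results.Count;
Console.WriteLine($"After waiting: {countAfterWaiting}");
```

The first count is typically 0 and the second 10, which shows that the work was started but never awaited by the loop.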
Example 2:
var outputList2 = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.ForEach(sampleList, async i =>
{
    outputList2.Add(await callRemote());
    Console.WriteLine("Example 2 Executed at {0} ", i);
});
Console.WriteLine(outputList2.Count); // Outputs: 0
This also prints 0, because Console.WriteLine again runs before the async calls complete. The thread-safe ConcurrentBag removes the unsafe concurrent writes, but it does not make Parallel.ForEach wait for the async lambdas.
Example 3:
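// ConcurrentBag, because loop bodies may add results from several threads at once.
var outputList3 = new System.Collections.Concurrent.ConcurrentBag<string>();
// Parallel.ForEachAsync supplies the CancellationToken to the lambda itself.
await Parallel.ForEachAsync(sampleList, async (i, cancellationToken) =>
{
    outputList3.Add(await callRemote());
    Console.WriteLine("Example 3 Executed at {0} ", i);
});
Console.WriteLine(outputList3.Count);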
/* Outputs:
Example 3 Executed at 5
Example 3 Executed at 1
Example 3 Executed at 8
Example 3 Executed at 9
Example 3 Executed at 10
Example 3 Executed at 6
Example 3 Executed at 7
Example 3 Executed at 4
Example 3 Executed at 3
Example 3 Executed at 2
*/
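Here Parallel.ForEachAsync (available from .NET 6) accepts an async lambda and returns a Task that completes only after every iteration has finished, so awaiting it gives the expected result. Because iterations run concurrently, the shared results collection should be thread-safe.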
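Parallel.ForEachAsync also has an overload that takes a ParallelOptions argument, which can cap concurrency and carry a cancellation token into each iteration. The following is a sketch under stated assumptions: FakeRemoteAsync is a hypothetical stand-in for callRemote(), and the limit of 4 and the 30-second timeout are arbitrary illustrative values.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for callRemote() that honours cancellation.
async Task<string> FakeRemoteAsync(int i, CancellationToken ct)
{
    await Task.Delay(100, ct);
    return $"result {i}";
}

var inputs = Enumerable.Range(1, 10).ToList();
var outputs = new ConcurrentBag<string>();

// Assumed values: cancel everything after 30 s, run at most 4 bodies at once.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var options = new ParallelOptions
{
    MaxDegreeOfParallelism = 4,
    CancellationToken = cts.Token
};

// The token passed to the lambda is forwarded to the simulated remote call.
await Parallel.ForEachAsync(inputs, options, async (i, ct) =>
{
    outputs.Add(await FakeRemoteAsync(i, ct));
});

int total = outputs.Count;
Console.WriteLine(total);
```

Capping the degree of parallelism is often worth considering for HTTP workloads, since an unbounded burst of requests can overwhelm the remote server.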
Example 4:
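// ConcurrentBag, because Parallel.ForEach runs the body on several threads.
var outputList4 = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.ForEach(sampleList, i =>
{
    // .Result blocks the worker thread until the request completes.
    outputList4.Add(callRemote().Result);
    Console.WriteLine("Example 4 Executed at {0} ", i);
});
Console.WriteLine(outputList4.Count);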
/* Outputs:
Example 4 Executed at 3
Example 4 Executed at 9
Example 4 Executed at 2
Example 4 Executed at 10
Example 4 Executed at 4
Example 4 Executed at 1
Example 4 Executed at 6
Example 4 Executed at 7
Example 4 Executed at 5
Example 4 Executed at 8
*/
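We used Parallel.ForEach with a synchronous (blocking) call to the remote method: callRemote().Result blocks each worker thread until its request finishes, so the loop completes only after all results are in. The cost is that threads sit idle while waiting on the network.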
HttpClient call made from | async/await lambda | blocking call (.Result) |
Parallel.ForEach | Failed (even with ConcurrentBag) | Passed |
Parallel.ForEachAsync | Passed | N/A |
Parallel.ForEach runs the loop body on multiple threads and waits for every body to return. An async lambda, however, returns at its first await, long before the HTTP call finishes, and Parallel.ForEach has no Task to wait on. The loop therefore completes before the results arrive, and we get empty results. Parallel.ForEach would have to await the async calls inside it, which it cannot do. So, if the loop body uses async/await, use Parallel.ForEachAsync.
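Where Parallel.ForEachAsync is not available (it was introduced in .NET 6), the same "await every call" idea can be sketched with Task.WhenAll. This is an alternative technique, not one of the article's examples: FakeRemoteAsync and ThrottledAsync are hypothetical helpers, and the SemaphoreSlim limit of 4 is an arbitrary value chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for callRemote(): waits 100 ms instead of calling a server.
async Task<string> FakeRemoteAsync(int i)
{
    await Task.Delay(100);
    return $"result {i}";
}

var inputs = Enumerable.Range(1, 10).ToList();

// Assumed limit: at most 4 simulated requests in flight at the same time.
using var throttle = new SemaphoreSlim(4);

async Task<string> ThrottledAsync(int i)
{
    await throttle.WaitAsync();
    try
    {
        return await FakeRemoteAsync(i);
    }
    finally
    {
        throttle.Release();
    }
}

// Task.WhenAll completes when every task has completed;
// the result array follows the order of the input tasks.
string[] all = await Task.WhenAll(inputs.Select(i => ThrottledAsync(i)));
Console.WriteLine(all.Length);
```

Unlike the ConcurrentBag used above, the array returned by Task.WhenAll keeps the results in input order, which may matter when each output has to be matched to its input.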