Sunday, April 18, 2010

Parallelism on a thread-safe data structure

Introduction
In this article we have generated some smalls tests on Parallelism using Dot Net Framework 4.0 with some further extensions from the previous article.  In this article we will be adding items on a list.  The list is to have an unknown initial size and without indexing.  In the previous article, data was added to a two-dimensional array with known initial size and allows for indexing.

The construct used for parallelism to produce the below tests, is the same as that used in the previous article.  The new construct that will be used will allow for different threads to add data to a shared list.  Thus this construct must ensure for race conditions, in order to avoid having two items written simultaneously in the same location and thus overwriting the first write.


Construct Details
ConcurrentBag Class


Bags are useful for storing objects when ordering doesn't matter, and unlike sets, bags support duplicates. ConcurrentBag is a thread-safe bag implementation, optimized for scenarios where the same thread will be both producing and consuming data stored in the bag.


Methodology 
The test performed was the factorial of the current iteration number and taking timing benchmarks  while adding it to the list.  The tests where performed using:




  • standard sequential method, and
  • parallelism.

The method used to compute the factorial in both scenarios is:

public int Factorial(int number)
{
    if (number == 0)
    {
        return 1;
    }
    else
    {
        return number * Factorial(number - 1);
    }
}



The machine used has the following specifications: 

Intel(R) Core(TM)2 CPU 6400 @ 2.16, 2.14 GHz, 1.00GB of RAM




Results
These are the results achieved (time shown is in seconds):



                       10       100     1000     10000 12000
Sequential       0      0       0.01        1.38           1.99
Parallel 0.03   0.02    0.03        0.79           1.31



Conclusions
As per the results produced, it can be concluded that the best performance was achieved using parallelism.  Unfortunately, I could not perform more tests with larger iterations, since a StackOverflowException was being given.

Code
Below is the code for all the mentioned above methods using C#:

public List GenerateNumbersListSequentially(int size)
{
    List numbers = new List();

    for (int x=0; x
    {
        numbers.Add(Factorial(x));
    }

    return numbers;
}


public List GenerateNumbersListParallel(int size)
{
    ConcurrentBag numbers = new ConcurrentBag();

    Parallel.For(0, size, x =>
    {
        numbers.Add(Factorial(x));
    });

    return numbers.GetEnumerator() as List;
}