.NET

Working with FTP for the first time? Quick setup & quick C# code.

March 21, 2012 .NET, C#, QuickTip No comments

Recently I had some FTP work to do. Nothing special, but in case you need a quick guide on setting up FTP and writing access code in .NET, you might find this interesting. Also, you will know where to find it in case you need it later.

I will define a simple task and we will solve it!

Task:

Imagine we have an external FTP server where some vendor puts many files. Of course, they provided us with credentials. We want to connect to the server and then parse some files from the whole list of files. Also, for testing purposes, we are going to mock the external service with our own local one.

Setup FTP:

1) Enable FTP in Windows features.

image

2) Go to IIS –> Sites –> “Add FTP Site…”. You will need to specify some folder.

3) Since for our task we want to mock some system, the following setup might be good:

  • Binding with all assigned host names and port 21
  • No SSL
  • Allow for Anonymous and Basic Authentication
  • Add Read permissions for All Users and Anonymous

You should see something like this:

image

image

You will be able to access FTP locally without any issues and need to provide credentials.

4) Go to User Accounts –> Advanced –> Advanced –> New User… Create the user you would like to use when connecting to FTP.

image

5) Go to IIS –> your FTP site –> Basic Settings –> Connect as… –> Specific User. Then enter the same user again.

image

We added this user because we need to imitate a situation in which our code and the FTP server have different credentials.

Access code:

To get the list of files on the server (using WebRequest):

public List<string> FetchFilesList()
{
    var request = WebRequest.Create(FtpServerUri);
    request.Method = WebRequestMethods.Ftp.ListDirectory;

    request.Credentials = new NetworkCredential(UserName, UserPassword);

    using (var response = (FtpWebResponse)request.GetResponse())
    {
        var responseStream = response.GetResponseStream();

        using (var reader = new StreamReader(responseStream))
        {
            var fileNamesString = reader.ReadToEnd();
            var fileNames = fileNamesString.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

            return fileNames.ToList();
        }
    }
}

To fetch a file's contents as an XDocument (using WebClient):

public XDocument FetchFile(string fileName)
{
    // WebClient is IDisposable, so release it when we are done
    using (var client = new WebClient())
    {
        client.Credentials = new NetworkCredential(UserName, UserPassword);

        var fileUri = new Uri(FtpServerUri, fileName);
        var downloadedXml = client.DownloadString(fileUri);
        return XDocument.Parse(downloadedXml);
    }
}

I don’t think those two chunks of code need a lot of explanation. As you can see, with WebClient there is less code, but this way you cannot specify the FTP request method.
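If you do need to pick the FTP method yourself when downloading, something like the following might work. It is only a sketch: FtpRequests and CreateDownloadRequest are names I made up for illustration, and the host, user and password are placeholders. It builds the request without sending it. Note that on newer .NET versions WebRequest is marked obsolete, so expect a compiler warning there.

```csharp
using System;
using System.Net;

public static class FtpRequests
{
    // Hypothetical helper: builds (but does not send) a request that downloads one file.
    // Server URI, user name and password are placeholders supplied by the caller.
    public static FtpWebRequest CreateDownloadRequest(
        Uri serverUri, string fileName, string userName, string password)
    {
        var fileUri = new Uri(serverUri, fileName);
        var request = (FtpWebRequest)WebRequest.Create(fileUri);

        // Unlike WebClient, here we choose the FTP method explicitly (RETR in this case)
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.Credentials = new NetworkCredential(userName, password);
        return request;
    }
}
```

To actually read the file you would call GetResponse() on the returned request and pass the response stream to XDocument.Load, the same way the listing code above reads its stream.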

Hope this overview is quick and not too noisy.

NOTE: I’m not a professional administrator, so my FTP setup may be somewhat wrong, but it satisfied the needs of the task described at the beginning of this blog post.



Custom configuration: collection without “add” plus CDATA inside

March 20, 2012 .NET, C#, QuickTip 2 comments

This blog post might look like any other standard blog post answering a question that can be googled and found on stackoverflow. But it isn’t. You see… it combines a couple of interesting things you might need for your custom configuration. Also, it is not congested with explanations. I’m adding this as a quick reference for myself, so I don’t spend my time googling a lot to find answers. Also, if you are just starting with custom configuration and don’t want to read MSDN pages, please refer to my earlier blog post on the basics here.

Let’s get back to topic:

We want a section in our app/web.config with a collection that can contain elements without the ugly “add” tag and that also has CDATA inside. See the configuration:

    <Feeds defaultPollingInterval="00:10:00">
      <Feed>
        <![CDATA[http://www.andriybuday.com/getXmlFeed.aspx?someParam=A&somethingElse=B]]>
      </Feed>
      <Feed pollingInterval="00:05:00">
        <![CDATA[http://www.andriybuday.com/getXmlFeed.aspx?someParam=C&somethingElse=D]]>
      </Feed>
    </Feeds>

So as you can see, the elements in the collection have the custom name “Feed”, which is awesome. Also notice that the URL contains weird characters (not for us, but for XML), so we wrap the URL in CDATA. Those feeds are fake, of course.

To make all this happen we need a few things:

  1. Override the CollectionType property for our collection, and set the type to BasicMap
  2. Override the ElementName property for our collection, and return the preferred name
  3. Override the DeserializeElement method for the element inside the collection. Here you need to manually fetch your attributes, like I do for pollingInterval, and read the contents of the CDATA. Please refer to the source code below to see how this is done, as it is a bit tricky. For example, because of the nature of the XmlReader, you need to read the attributes first and then proceed to the contents.
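To see the “attributes first, then contents” order in isolation, here is a small sketch that uses a plain XmlReader outside of the configuration system. FeedXmlDemo and ReadFeed are hypothetical names just for this illustration, and the XML is expected to be a single Feed element like the ones in the config above.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FeedXmlDemo
{
    // Hypothetical helper: reads one <Feed> element the same way DeserializeElement has to.
    public static (string Url, string PollingInterval) ReadFeed(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position the reader on the <Feed> start tag

            // Attributes first: once the content is read, the reader has moved past the element
            string interval = reader.GetAttribute("pollingInterval") ?? "00:00:00";

            // The whitespace between the tags and the CDATA section is part of the content,
            // so it comes back together with the URL and has to be trimmed
            string url = reader.ReadElementContentAsString().Trim();

            return (url, interval);
        }
    }
}
```

This also hints at where the extra spaces around the URL come from: the line breaks and indentation around the CDATA section are part of the element content.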

Source code below (the interesting pieces are marked with comments):

[ConfigurationCollection(typeof(FeedConfigElement))]
public class FeedsConfigElementCollection : ConfigurationElementCollection
{
    [ConfigurationProperty("defaultPollingInterval", DefaultValue = "00:10:00")]
    public string DefaultPollingInterval
    {
        get
        {
            return (string)base["defaultPollingInterval"];
        }
    }
    protected override ConfigurationElement CreateNewElement()
    {
        return new FeedConfigElement();
    }
    protected override object GetElementKey(ConfigurationElement element)
    {
        return ((FeedConfigElement)(element)).Url;
    }

    // In order to avoid standard keyword "add"
    // we override ElementName and set CollectionType to BasicMap
    protected override string ElementName
    {
        get { return "Feed"; }
    }

    public override ConfigurationElementCollectionType CollectionType
    {
        get { return ConfigurationElementCollectionType.BasicMap; }
    }
    public FeedConfigElement this[int index]
    {
        get
        {
            return (FeedConfigElement)BaseGet(index);
        }
    }
}

public class FeedConfigElement : ConfigurationElement
{
    public string Url { get; private set; }

    public string PollingInterval { get; private set; }

    // To get the value from the CDATA we need to override this method
    protected override void DeserializeElement(XmlReader reader, bool serializeCollectionKey)
    {
        // Read attributes first: once the content is read, the reader has moved past the element
        PollingInterval = reader.GetAttribute("pollingInterval") ?? "00:00:00";

        // ReadElementContentAsString also returns the whitespace that sits between
        // the tags and the CDATA section in the config file, so we Trim it
        Url = reader.ReadElementContentAsString().Trim();
    }
}

Hope this gives quick answers to some of you. It took me a good portion of time to find all these things, because for some odd reason they weren’t so easy to find.



log4net versions deployment issue

October 6, 2010 .NET, Deployment 13 comments

So a few days ago I faced an issue with 3rd party references.

My original question on stackoverflow:

What is the best approach to use 3rd party that uses another version of other 3rd party (log4net) already used in the system?

  • Currently we use log4net of version 1.2.10.0 and we should start using some 3rd party components developed by other team.
  • Mentioned component references log4net of version 1.2.9.0.
  • All binaries are deployed into one folder.

I’m sure that we cannot rebuild our sources with 1.2.9.0 version, because there are too many other dependencies and will require lot of efforts. Are there any other approaches to solve this issue? I’m NOT looking for too sophisticated that have something to do with CLR assemblies loading, but would hear them with great pleasure. I’m looking for the simplest approaches. I guess someone has encountered the same issue.

I got (as of now) two answers and I would like to try them out.

So I created 3 projects: one references log4net version 1.2.10.0 and another references 1.2.9.0. Both of them are referenced in a client console application, which also references one of the log4net assemblies. In the client application I’m trying to execute code that requires log4net in both of the assemblies.

Below is projects layout:

When I execute my code I’m getting an error:

Unhandled Exception: System.IO.FileLoadException: Could not load file or assembly 'log4net, Version=1.2.9.0, Culture=neutral, PublicKeyToken=b32731d11ce58905' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
File name: 'log4net, Version=1.2.9.0, Culture=neutral, PublicKeyToken=b32731d11ce58905'
   at ProjReferencesLog4Net._1._2._9._0.ClassA..ctor()
   at ConsoleAppReferencesLog4Net1._2._10._0_andBothAssemblies.Program.Main(String[] args) in
  ….

In order to resolve this I tried the suggestions one by one…

Suggestion number 1

Redirecting Assembly Versions

According to MSDN, it is possible to redirect code execution to an assembly with a higher version, just by using the following configuration:

<configuration>
   <runtime>
      <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
       <dependentAssembly>
         <assemblyIdentity name="log4net"
                           publicKeyToken="b32731d11ce58905"
                           culture="neutral" />
         <bindingRedirect oldVersion="1.2.9.0"
                          newVersion="1.2.10.0"/>
       </dependentAssembly>
      </assemblyBinding>
   </runtime>
</configuration>

I’ve tried it and it did not work. The reason is that we cannot redirect between assemblies with different PublicKeyTokens. log4net 1.2.9.0 has “b32731d11ce58905” and log4net 1.2.10.0 has “1b44e1d426115821”. Also see this stackoverflow question.
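To check quickly whether two assembly identities were signed with the same key (and so whether a binding redirect has a chance at all), you could compare the tokens from their display names. This is just a sketch, and AssemblyIdentity, TokenOf and SameSigningKey are hypothetical helper names. It only parses the names and does not load any assembly.

```csharp
using System;
using System.Reflection;

public static class AssemblyIdentity
{
    // Hypothetical helper: extracts the public key token from an assembly display name
    public static string TokenOf(string displayName)
    {
        byte[] token = new AssemblyName(displayName).GetPublicKeyToken() ?? new byte[0];
        return BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();
    }

    // A bindingRedirect only changes the version; the name and token must already match
    public static bool SameSigningKey(string displayNameA, string displayNameB)
    {
        return TokenOf(displayNameA) == TokenOf(displayNameB);
    }
}
```

For the two log4net identities mentioned above, SameSigningKey returns false, which matches what I saw: the redirect is simply ignored because the runtime treats them as different assemblies.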

Suggestion number 2

Use GAC. So when I install those two assemblies into GAC:

In this case the code works, but the suggestion doesn’t work for us, since we do not want to put the assemblies we use into the GAC.

Other suggestions

So I’ve been thinking about another approach.
Approaches that require rebuilding our code with a different version of log4net are not suitable for us. At least for now.
Another thing I’ve been thinking about is loading those assemblies into a different application domain, or hosting the 3rd party component that uses 1.2.9.0 under a different WinService. Both of these are cumbersome solutions and I’d like to avoid them.

YOUR SUGGESTION

If you have some ideas, please let me know!

[EDITED 7 Oct, 2010 11PM]

FUNNY-HAPPY-END OF THIS STORY

Do you know what the most interesting part of all this is? It is how it finished. We contacted the guys who developed the component we now have to use. They let us know that they had encountered issues with updating the configuration file on the fly with log4net 1.2.10.0. In their words, the new version of log4net is not capable of doing this. So they sent us a simple application that demonstrates this, and indeed, after updating the config while the app was running, 1.2.10.0 did not pick up the new configuration, but 1.2.9.0 was working just fine. This surprised me very much, so I went to this download page and downloaded the latest binaries. When I tried them, it worked!!! Actually, I guess that they simply used a version of log4net built against .NET Framework 1.1, and we should use the one built against .NET 2.0. (Yeah! Actually, if you download it you will see.)

After all of this, they created a new sub-release of their sources especially for us, and they were also able to fix some minor bug. Great news! Unexpected end of story! :)




.NET Remoting Quickly

August 11, 2010 .NET, C#, HowTo No comments

As you may know, I recently got a junior to mentor. In order to understand his capabilities and knowledge, I asked him to do a couple of things: explain one Design Pattern he knows, explain SCRUM, and write the simplest .NET Remoting application. That was yesterday, and today I verified that he failed with .NET Remoting, but it doesn’t mean that he is bad. He just needs to learn the art of googling better. I asked him to have it for the next day, and gave him a stored procedure to write. Hope he will be smart enough to finish it by the time I come back tomorrow from my English classes.

.NET Remoting

To ensure that I’m not an asshole who asks people to do what I cannot do, I decided to write it on my own and see how long it would take me. It took me 23 minutes. Hm… too much, but I should complain to VS about the “Add Reference” dialog.

So here we have three projects in Visual Studio: one for the Server, one for the Client and, of course, one for the Proxy class shared between client and server.

Shared proxy class ChatSender in ChatProxy assembly:

    public class ChatSender : MarshalByRefObject
    {
        public void SendMessage(string sender, string message)
        {
            Console.WriteLine(string.Format("{0}: {1}", sender, message));
        }
    }

Server (ChatServer):

    class Program
    {
        static void Main(string[] args)
        {
            var channel = new TcpServerChannel(7777);
            ChannelServices.RegisterChannel(channel, true);
            RemotingConfiguration.RegisterWellKnownServiceType(typeof(ChatSender),
                "ChatSender", WellKnownObjectMode.Singleton);
            Console.WriteLine("Server is started… Press ENTER to exit");
            Console.ReadLine();
        }
    }

Client (ChatClient assembly):

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Client is started…");
            ChannelServices.RegisterChannel(new TcpClientChannel(), true);
            var chatSender =
                (ChatSender)Activator.GetObject(typeof(ChatSender), "tcp://localhost:7777/ChatSender");
            string message;
            while ((message = Console.ReadLine()) != string.Empty)
            {
                chatSender.SendMessage("Andriy", message);
            }
        }
    }

My results




Threading.Timer vs. Timers.Timer

June 9, 2010 .NET, C#, Concurrency, Performance 6 comments

I agree that the title doesn’t promise a lot of interesting stuff at first glance, especially for experienced .NET developers. But until you encounter some issue caused by incorrect usage of timers, you will never think that the root cause is in the timers.

System.Threading.Timer vs. System.Windows.Forms.Timer

In a few words, here are the differences between the Threading and Forms timers, just to start with something.
System.Threading.Timer executes some method on a periodic basis. What is interesting is that the method is executed on a separate thread taken from the ThreadPool. In other words, it calls QueueUserWorkItem somewhere internally for your method at the specified intervals.
System.Windows.Forms.Timer ensures that our method is executed on the same thread where we created the timer.

What if operation takes longer than period?

Let’s now think about what will happen if the operation we scheduled takes longer than the interval.

When I have the following code:

    internal class LearningThreadingTimer
    {
        private System.Threading.Timer timer;
        public void Run()
        {
            timer = new Timer(SomeOperation, null, 0, 1000);
        }
        private void SomeOperation(object state)
        {
            Thread.Sleep(500);
            Console.WriteLine("a");
        }
    }

my application behaves well: it prints “a” once a second (the timer period is 1000 ms). I took a look at the number of threads in Task Manager and it stays constant (7 threads).
Let’s now change the following line: Thread.Sleep(500) to Thread.Sleep(8000). What will happen now? Just think before you continue reading.
I’m almost completely sure that you predicted “a” being printed every second after 8 seconds have passed. As you already guessed, each of the “a” printings is scheduled on a separate thread allocated from the ThreadPool. So… the number of threads is constantly increasing… (Every 1.125 seconds :) )

Issue I’ve been investigating

Some mister X also figured out that Console.WriteLine(“a”) is critical and should run in one thread at a time, at least because he is not sure how long the Thread.Sleep(500) part takes to execute. To ensure it runs in one thread at a time, he decided to add a lock, like in the code below:

    internal class LearningThreadingTimer
    {
        private System.Threading.Timer timer;
        private object locker = new object();
        public void Run()
        {
            timer = new Timer(SomeOperation, null, 0, 1000);
        }
        private void SomeOperation(object state)
        {
            lock (locker)
            {
                Thread.Sleep(8000);
                Console.WriteLine("a");
            }
        }
    }

Yes, this code ensures that the section under the lock is executed by one thread at a time. And you know, this code works well until your application has been running for a few hours and you are out of threads and out of memory. :) So that is the issue I’ve been investigating.

My first idea was System.Windows.Forms.Timer

My first idea was to change this timer to System.Windows.Forms.Timer, and it worked well in the application, but that application is able to run in both GUI and WinService modes. And there are so many complaints over the internet telling you not to use Forms.Timer for non-UI stuff. Also, if you put Forms.Timer into your console application, it will simply not work.

Why is System.Timers.Timer a good toy?

System.Timers.Timer is just a wrapper over System.Threading.Timer, but what is very interesting is that it provides us with more developer-friendly abilities like enabling and disabling it.

My final decision, which fixes the issue, is to disable the timer when we dive into our operation and enable it again on exit. In my app the timer fires every 30 seconds, so this is not a problem. The fix looks like:

    internal class LearningTimersTimer
    {
        private System.Timers.Timer timer;
        private object locker = new object();
        public void Run()
        {
            timer = new System.Timers.Timer();
            timer.Interval = 1000;
            timer.Elapsed += SomeOperation;
            timer.Start();
        }
        public void SomeOperation(object sender, EventArgs e)
        {
            timer.Enabled = false;
            lock (locker)
            {
                Thread.Sleep(8000);
                Console.WriteLine("a");
            }
            timer.Enabled = true;
        }
    }

And it looks like we don’t need the lock there, but I left it just to be safe in case SomeOperation gets called from a dozen other threads.
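A variation of the same idea, in case you prefer not to toggle Enabled by hand: let the timer fire only once (AutoReset = false) and re-arm it after the work has finished. This is a sketch rather than the code from my app. NonOverlappingTimer is a name made up for this example, and the stopped flag is my own addition so that a late callback does not restart a disposed timer.

```csharp
using System;
using System.Threading;

public sealed class NonOverlappingTimer : IDisposable
{
    private readonly System.Timers.Timer timer;
    private readonly Action work;
    private readonly object gate = new object();
    private bool stopped;

    public NonOverlappingTimer(double intervalMs, Action work)
    {
        this.work = work;
        // AutoReset = false: Elapsed is raised once per Start() instead of periodically
        timer = new System.Timers.Timer(intervalMs) { AutoReset = false };
        timer.Elapsed += OnElapsed;
    }

    public void Start()
    {
        lock (gate) { if (!stopped) timer.Start(); }
    }

    private void OnElapsed(object sender, System.Timers.ElapsedEventArgs e)
    {
        try
        {
            work();
        }
        finally
        {
            // Re-arm only after the work has finished, so two runs never overlap
            lock (gate) { if (!stopped) timer.Start(); }
        }
    }

    public void Dispose()
    {
        lock (gate) { stopped = true; timer.Dispose(); }
    }
}
```

With this shape the interval becomes the pause between the end of one run and the start of the next, which may or may not be what you want; if you need a fixed beat, the disable/enable approach above is closer to that.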

MAKE YOUR DECISION ON A TIMER BASED ON THIS TABLE (from an MSDN article)

                                        System.Windows.Forms  System.Timers        System.Threading
Timer event runs on what thread?        UI thread             UI or worker thread  Worker thread
Instances are thread safe?              No                    Yes                  No
Familiar/intuitive object model?        Yes                   Yes                  No
Requires Windows Forms?                 Yes                   No                   No
Metronome-quality beat?                 No                    Yes*                 Yes*
Timer event supports state object?      No                    No                   Yes
Initial timer event can be scheduled?   No                    No                   Yes
Class supports inheritance?             Yes                   Yes                  No

* Depending on the availability of system resources (for example, worker threads)

I hope my story is useful, and when you search for something like “C# Timer Threads issues” or “Allocation of threads when using timer” you will find my article and it will help you.




AppDomain.UnhandledException and Application.ThreadException events

May 19, 2010 .NET, Concurrency, Fun No comments

Today I was playing with exception handling and threading in .NET. It was really fun.

Do you know guys who like to have everything in a global try-catch block? I have two pieces of news for them.

Bad one

An exception will not be caught in a try-catch block if it was thrown in another thread. Those guys might also think that Application.ThreadException could help them catch those.

        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            Application.ThreadException += Application_ThreadException;

            Application.Run(new Form1());
        }

        static void Application_ThreadException(object sender, System.Threading.ThreadExceptionEventArgs e)
        {
            MessageBox.Show("This is something that I would not recommend you to have in your application.");
        }

But in fact this event fires only if the exception was thrown as a result of a Windows message or any other code that runs on the same thread where your WinForms application lives. I tried to use two timers to verify that.
System.Windows.Forms.Timer: ensures that the code you have in your tick method runs on the same thread as your application. In this case I got the message box.
System.Threading.Timer: runs on a separate thread, so my app just crashed.

But if those guys are writing all their code in the Form1.cs file… then maybe it is worth it for them to handle the Application.ThreadException event :)
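What actually works for the “bad news” is catching the exception on the thread that throws it and handing it over to whoever started that thread. Below is a minimal sketch of that idea; WorkerExceptions and RunAndCapture are hypothetical names, not something from WinForms or from my test app.

```csharp
using System;
using System.Threading;

public static class WorkerExceptions
{
    // Hypothetical helper: runs the action on a new thread and returns
    // the exception it threw, or null if it finished normally.
    public static Exception RunAndCapture(Action action)
    {
        Exception captured = null;
        var worker = new Thread(() =>
        {
            // A try-catch around worker.Start() would never see this exception,
            // so it has to be caught here, on the thread that throws it
            try { action(); }
            catch (Exception ex) { captured = ex; }
        });
        worker.Start();
        worker.Join(); // wait for the worker, then read what it captured
        return captured;
    }
}
```

The caller can then decide what to do with the returned exception on its own thread, for example log it or show it in the UI.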

“Good” one

There is an event which is fired when an exception is thrown from any code/thread within your application domain. It is AppDomain.UnhandledException and it occurs when an exception is not caught.

Let’s take a look at a code snippet that shows this:

        [STAThread]
        static void Main()
        {
            AppDomain currentDomain = AppDomain.CurrentDomain;
            currentDomain.UnhandledException += currentDomain_UnhandledException;

            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);

            Application.Run(new Form1());
        }

        static void currentDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
        {
            MessageBox.Show("This is shown when ANY thread is thrown in ANY point of your Domain.");
        }

To test this I created a Threading.Timer that fires every 2 seconds and throws an exception on each tick, and I put a breakpoint into the event handler. I got everything as expected, and after that the application failed.

But one of our smart guys could think of putting Thread.Sleep(1000000); into the handler code like below:

        static void currentDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
        {
            Thread.Sleep(1000000); // Try to guess what will happen
        }

Everyone is happy. An exception is thrown every 2 seconds or less and the UI continues to respond. I hope you already see what is wrong with all of this.

This screenshot speaks for itself:

 (My app already ate 705 threads and still eats…)

And as far as I know, a process could have up to 1024 threads. My process is no exception, and it also crashed after it reached that number.

OK, looks like those guys should change their minds.

Hope it was a bit fun.




Few threading performance tips I’ve learnt in recent time

May 16, 2010 .NET, Concurrency, MasterDiploma, Performance 4 comments

Recently I have been working with multithreading in .NET.
One of the interesting aspects of it is performance. Most books say that we should not play too much with performance, because we could introduce an ugly, super hard-to-catch bug. But since I’m using multithreading for educational purposes, I allow myself to play with this.

Here is a list of performance tips that I’ve used:

1. UnsafeQueueUserWorkItem is faster than QueueUserWorkItem

The difference is in the verification of security privileges. The unsafe version does not propagate the calling stack to the worker thread, so it doesn’t care about the privileges of the calling code and runs everything in its own privileges scope.

2. Ensure that you don’t have redundant logic for scheduling your threads.

In my algorithm I have a dozen iterations, and on each of them I perform calculations on a long list. In order to parallelize this, I was dividing the list like [a|b|c|…]. My problem was recalculating the bounds on each iteration, but since the list is always of the same size I could calculate the bounds once. So just ensure that you don’t have such crap in your code.

3. Do not pass huge objects into your workers.

If you are using the ParameterizedThreadStart delegate and pass a lot of information with your param object, it could decrease your performance. Slightly, but it could. To avoid this, you could put such information into fields of the object that contains the method run by the thread.

4. Ensure that your main thread is also a busy guy!

I had this piece of code:

    for (int i = 0; i < GridGroups; i++)
    {
        ThreadPool.UnsafeQueueUserWorkItem(AsynchronousFindBestMatchingNeuron, i);
    }
    for (int i = 0; i < GridGroups; i++) DoneEvents[i].WaitOne();

Do you see where I have a performance gap? The answer is in utilizing my main thread. Currently it only schedules some number of work items (GridGroups) and then waits for them to finish. If we divide the work into approximately equal partitions, we could give some work to our main thread, and in this way part of the waiting time will be eliminated.
The following code gives us a performance increase:

    for (int i = 1; i < GridGroups; i++)
    {
        ThreadPool.UnsafeQueueUserWorkItem(AsynchronousFindBestMatchingNeuron, i);
    }
    AsynchronousFindBestMatchingNeuron(0);
    for (int i = 1; i < GridGroups; i++) DoneEvents[i].WaitOne();
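To put tips 2 and 4 together in something you can run on its own, here is a toy sketch that sums an array in partitions. It is not my neuron search code: PartitionedSum is a made-up example, and I use a CountdownEvent instead of the DoneEvents array and the regular QueueUserWorkItem instead of the unsafe one, just to keep the sketch short.

```csharp
using System;
using System.Threading;

public static class PartitionedSum
{
    // groups is assumed to be at least 1
    public static long Sum(int[] data, int groups)
    {
        // Tip 2: the bounds are calculated once, not on every iteration
        var bounds = new int[groups + 1];
        for (int g = 0; g <= groups; g++)
            bounds[g] = (int)((long)data.Length * g / groups);

        var partial = new long[groups];
        using (var done = new CountdownEvent(groups - 1))
        {
            for (int g = 1; g < groups; g++)
            {
                int group = g; // copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    partial[group] = SumRange(data, bounds[group], bounds[group + 1]);
                    done.Signal();
                });
            }

            // Tip 4: the main thread takes group 0 instead of just waiting
            partial[0] = SumRange(data, bounds[0], bounds[1]);
            done.Wait();
        }

        long total = 0;
        foreach (long p in partial) total += p;
        return total;
    }

    private static long SumRange(int[] data, int start, int end)
    {
        long sum = 0;
        for (int i = start; i < end; i++) sum += data[i];
        return sum;
    }
}
```

Each worker writes only to its own slot of the partial array, so no lock is needed for the results; the main thread reads them only after done.Wait() returns.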

5. ThreadPool and .NET Framework 4.0

The guys from Microsoft improved the performance of the ThreadPool significantly! I just changed the target framework of my project to .NET 4.0 and for the worst cases in my app got a 1.5x time improvement.

What’s next?

Looking forward to creating more sophisticated synchronization with Monitor.Pulse() and Monitor.Wait().

Good Threading Reference

Btw: I read this quick book, Threading in C#. It is a very good reference if you would like to refresh your knowledge of threading in C# and find some good tips on sync approaches.

P.S. If someone is interested in whether I woke up at 8am (see my previous post): I have to say that I failed that attempt. I woke up at 12pm.




Get started with F#

April 4, 2010 .NET, C#, F# No comments

Renaissance of functional languages

We are in a renaissance of functional languages. When I read blog posts I often see guys talking about functional programming and stuff related to it. The community wants more of the features that the functional style provides. In response to that, creators of languages and technologies are now introducing a lot of amazingly cool features to make our lives happier. They also create new languages, and so on.

What is going on with C# nowadays?

Let’s start with what we have in C# nowadays. It has moved slightly to the functional side. Introducing LINQ was a big step in that direction. We are moving away from imperative programming to functional; for example, this simple loop represents the imperative way to work with list items.

      foreach (var element in list)
      {
        Console.WriteLine(element);
      }

Using List<T>.ForEach (a method of List<T> itself rather than of LINQ, but in the same spirit) we can have a much more elegant functional syntax:

      list.ForEach(Console.WriteLine);

So we pass a function into a function. Simply speaking, that is why we call this functional programming. If the example above isn’t bright enough, take a look at the next one:

      Func<int, int> doubleThat = delegate(int x) { return x * 2; };
      var from2To20 = from1To10.Select(doubleThat).ToList();

Another example of C# getting closer to those languages is the introduction of the var keyword. It doesn’t mean that we can change the type later in the code like in dynamic languages, but it really helps us write code without being too concerned about types, while the language itself is still strongly typed. If you would mention C# dynamics here, hm.. honestly I’m not a fan of that thing in C#, but maybe it gives us some advantages.

Immutability

A few days ago I listened to a podcast where a guy from the Microsoft .NET languages team spoke about different features that the community had requested for C#. One of them was immutability, and it looks like they are going to introduce it in future releases (after 4.0, of course). So what is immutability? It is something that functional languages have by default and imperative ones don’t :).

For example we have

   int number = 1;
   number = number + 3;

We can change number as many times as we want. But the same in F# will produce compilation error:

(“<-” is assignment operator in F#). In order to make it work you will need another element result:

If you have a variable which you 100% want to change, you should declare it using the mutable keyword:

Functions

So let’s move to something much more interesting: declaring functions.

ten will be immutable variable which has value 10.

Are you familiar with parameter binding in C++ functors? It doesn’t matter, but currying of functions in F# reminds me of that. Take a look at how we get a new function by combining the function multiply with one frozen parameter.
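For those who think in C#, the same trick can be sketched with nested lambdas. This is only a C# analogue of the idea, not the F# code itself, and the names Currying, Multiply and DoubleIt are made up for the example.

```csharp
using System;

public static class Currying
{
    // Curried form: takes x and returns a function that still waits for y
    public static readonly Func<int, Func<int, int>> Multiply = x => y => x * y;

    // "Freezing" the first parameter gives a new one-argument function
    public static readonly Func<int, int> DoubleIt = Multiply(2);
}
```

Calling Currying.DoubleIt(5) multiplies 5 by the frozen 2. In F# functions are curried by default, so you get this without writing the nested lambdas yourself.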

Using F# library in other .Net languages 

I did all this fun stuff by creating a new F# library in Visual Studio. I viewed the results of code that interested me by selecting it and pressing ‘Send To Interactive’ in the context menu. The next thing that interested me was how I can use that dll in my usual C# program. So I created a console application and added a reference to my F# lib. Now I can use it like below:

See how things differ out there. A function is seen as a method, immutable variables as properties without a setter, and mutable ones as properties with a setter. I forgot to mention that FirstFSharpProgram is defined with the module keyword at the top of the *.fs file.

Why?

When could F# be useful? Anywhere you like. But as its creators say, it has a lot of multithreading capabilities, and on top of that you write immutable code (mutable state being the main root of stupid bugs in imperative programming). Plus, you can easily use it in combination with other .NET languages.

Don’t miss the chance to learn this language if you are interested in it.

Take a look at this elegant Fibonacci solution: let rec fib n = if n < 2 then 1 else fib (n-2) + fib(n-1)




Refactoring your code to be multithreaded

April 3, 2010 .NET, Concurrency No comments

In this post I will start with a very simple but time-consuming algorithm, and will then decrease its execution time by taking advantage of my two processors.

Time consuming algorithm

For example, you have some complex calculation algorithm which runs on an array with many elements. For simplicity, let it be a list of doubles like below:

      var inputVector = new List<double>();
      for (int i = 0; i < 10000; i++)
        inputVector.Add(random.NextDouble());

All the complex calculations that you do are located in the class OneThreadAlgorithm. It goes through the whole array and gets the index of the element that is closest to the value d that you pass to the method FindBestMatchingIndex.

  internal class OneThreadAlgorithm
  {
    public readonly List<double> InputVector;

    public OneThreadAlgorithm(List<double> inputVector)
    {
      InputVector = inputVector;
    }
    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = 0; i < InputVector.Count; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      return bestInd;
    }
  }

If you are interested in DistanceProvider, it just returns the distance between two double values. Take a look at it.

  public static class DistanceProvider
  {
    public static double GetDistance(double val1, double val2)
    {
      // busy loop that imitates hard calculations
      for (int i = 0; i < 10000; i++) i = i + 1 - 1;
      return Math.Abs(val1 - val2);
    }
  }

As you see, I have a loop before returning the value. It is needed to imitate hard calculations. I’m not using Thread.Sleep(), because there are some nuances that will appear when we move to multithreading, and I would like to avoid them.

First change: dividing calculations to sections

As we see, the algorithm just loops through a collection of values and finds some element. To make it run in several threads, we first need to adapt it. The intent of our adaptation is to divide the calculations into sections, so we introduce the following method:

    public void FindBestMatchingIndexInRange(int start, int end, double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = start; i < end; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      BestMatchingElements.Add(bestInd);
    }

It does the same thing as before. The only difference is that we now search for the best matching index in the range that starts at index start and ends just before index end, and we put the result into the BestMatchingElements list. After we have gone through all sections, we only need to look for the best matching element in that list. See below:

    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      BestMatchingElements = new List<int>();
      int sectionLength = InputVector.Count / ParallelingNumber;
      int start = 0;
      for (int i = 0; i < ParallelingNumber; i++)
      {
        // The last section also takes the remainder of the division,
        // so no trailing elements are skipped.
        int end = (i == ParallelingNumber - 1) ? InputVector.Count : start + sectionLength;
        FindBestMatchingIndexInRange(start, end, d);
        start = end;
      }
      foreach (var elementIndex in BestMatchingElements)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[elementIndex], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = elementIndex;
        }
      }
      return bestInd;
    }

So now it works just as before. In theory it is a bit slower, and in practice it is too, especially when you divide the array into a lot of segments (the number of segments is defined by the ParallelingNumber property).
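To make the section boundaries concrete, here is a minimal sketch of the splitting logic in isolation. SectionSplitter and Split are hypothetical names used only for illustration. The last range takes the remainder, so no element is left out when the count is not divisible by the number of sections.

```csharp
using System;
using System.Collections.Generic;

public static class SectionSplitter
{
    // Splits [0, count) into `sections` consecutive half-open ranges [start, end).
    // Hypothetical helper, not part of the original classes.
    public static List<Tuple<int, int>> Split(int count, int sections)
    {
        if (sections <= 0) throw new ArgumentOutOfRangeException("sections");

        var ranges = new List<Tuple<int, int>>();
        int sectionLength = count / sections;
        int start = 0;
        for (int i = 0; i < sections; i++)
        {
            // The last range absorbs whatever the integer division left over.
            int end = (i == sections - 1) ? count : start + sectionLength;
            ranges.Add(Tuple.Create(start, end));
            start = end;
        }
        return ranges;
    }
}
```

For example, SectionSplitter.Split(10, 3) gives the ranges [0, 3), [3, 6) and [6, 10).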

Second modification: storing task information into one object

When we use multithreading, we schedule a method that matches the delegate public delegate void WaitCallback(Object state). To accomplish this, we create a new class FindBestInRangeRequest and pass an instance of it as the object argument of the changed method:

    public void FindBestMatchingIndexInRange(object bestInRangeRequest)
    {
      var request = (FindBestInRangeRequest)bestInRangeRequest;
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = request.Start; i < request.End; i++)
      {
        // ...

The new class FindBestInRangeRequest encapsulates Start, End, D and the other values needed to schedule a threading task. Here is that class:

    internal class FindBestInRangeRequest
    {
      public int Start { get; private set; }
      public int End { get; private set; }
      public double D { get; private set; }
      public ManualResetEvent Done { get; private set; }

      public FindBestInRangeRequest(int start, int end, double d, ManualResetEvent done)
      {
        Start = start;
        End = end;
        D = d;
        Done = done;
      }
    }

As you see, we also pass a ManualResetEvent object. It has a Set() method, which we call to signal that the task has finished executing.
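As a small illustration of that signalling, the sketch below queues a single work item and blocks until the item calls Set(). DoneSignalDemo and RunAndWait are hypothetical names, and the multiplication is only a stand-in for real work.

```csharp
using System;
using System.Threading;

public static class DoneSignalDemo
{
    // Queues one work item on the thread pool and waits for it to finish.
    // Returns the value computed by the work item.
    public static int RunAndWait()
    {
        int result = 0;
        using (var done = new ManualResetEvent(false)) // starts non-signaled
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                result = 6 * 7; // stand-in for real work
                done.Set();     // signal that the task has finished
            });
            done.WaitOne();     // blocks until Set() has been called
        }
        return result;
    }
}
```

The calling thread cannot read result before the work item has written it, because WaitOne() returns only after Set().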

Third modification: Using ThreadPool

We can allocate threads manually, but creating a thread is an expensive operation, so it is strongly recommended to use the ThreadPool.
Here is how we use the ThreadPool to call FindBestMatchingIndexInRange:

        var bestInRangeRequest = new FindBestInRangeRequest(start, end, d, DoneEvents[i]);
        ThreadPool.QueueUserWorkItem(FindBestMatchingIndexInRange, bestInRangeRequest);

After we have scheduled all ranges, we must wait until every work item has finished. We can do this with the WaitHandle.WaitAll method:

      WaitHandle.WaitAll(DoneEvents);

Another interesting thing is that adding to BestMatchingElements from several threads is not safe, so we use Monitor to ensure safe adding:

      Monitor.Enter(BestMatchingElements);
      BestMatchingElements.Add(bestInd);
      Monitor.Exit(BestMatchingElements);

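which is essentially what the lock keyword does. The lock statement additionally wraps the body in try/finally, so the monitor is released even if an exception is thrown.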
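To compare the two forms side by side, here is a minimal sketch. SafeAdd and its two methods are hypothetical names; the first method spells out the Monitor calls with try/finally, and the second uses the lock statement, which the compiler expands to roughly the same code.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class SafeAdd
{
    // Explicit Monitor calls: the finally block releases the lock even if Add throws.
    public static void AddWithMonitor(List<int> list, int value)
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(list, ref lockTaken);
            list.Add(value);
        }
        finally
        {
            if (lockTaken) Monitor.Exit(list);
        }
    }

    // The lock statement is the short form of the method above.
    public static void AddWithLock(List<int> list, int value)
    {
        lock (list)
        {
            list.Add(value);
        }
    }
}
```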

Full code base of algorithm

  internal class MultiThreadAlgorithm
  {
    public int ParallelingNumber { get; private set; }
    private List<int> BestMatchingElements { get; set; }
    private ManualResetEvent[] DoneEvents { get; set; }
    public readonly List<double> InputVector;

    public MultiThreadAlgorithm(List<double> inputVector, int parallelingNumber)
    {
      InputVector = inputVector;
      ParallelingNumber = parallelingNumber;
      DoneEvents = new ManualResetEvent[ParallelingNumber];
    }

    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      BestMatchingElements = new List<int>();
      for (int i = 0; i < ParallelingNumber; i++)
      {
        DoneEvents[i] = new ManualResetEvent(false);
      }
      int sectionLength = InputVector.Count / ParallelingNumber;
      int start = 0;
      for (int i = 0; i < ParallelingNumber; i++)
      {
        // The last section also takes the remainder of the division,
        // so no trailing elements are skipped.
        int end = (i == ParallelingNumber - 1) ? InputVector.Count : start + sectionLength;
        var bestInRangeRequest = new FindBestInRangeRequest(start, end, d, DoneEvents[i]);
        ThreadPool.QueueUserWorkItem(FindBestMatchingIndexInRange, bestInRangeRequest);
        start = end;
      }
      WaitHandle.WaitAll(DoneEvents);
      foreach (var elementIndex in BestMatchingElements)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[elementIndex], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = elementIndex;
        }
      }
      return bestInd;
    }

    public void FindBestMatchingIndexInRange(object bestInRangeRequest)
    {
      var request = (FindBestInRangeRequest)bestInRangeRequest;
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = request.Start; i < request.End; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], request.D);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      Monitor.Enter(BestMatchingElements);
      BestMatchingElements.Add(bestInd);
      Monitor.Exit(BestMatchingElements);
      request.Done.Set();
    }

    internal class FindBestInRangeRequest
    {
      public int Start { get; private set; }
      public int End { get; private set; }
      public double D { get; private set; }
      public ManualResetEvent Done { get; private set; }

      public FindBestInRangeRequest(int start, int end, double d, ManualResetEvent done)
      {
        Start = start;
        End = end;
        D = d;
        Done = done;
      }
    }
  }

See how cumbersome our class has become? That is why multithreading is considered complex and not easy to implement.
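For comparison only, and not what this post uses: on .NET 4 and later the Task Parallel Library can take over the partitioning and the waiting. The sketch below is an assumption-laden alternative. ParallelForAlgorithm is a hypothetical name, and it uses Math.Abs directly instead of DistanceProvider to stay self-contained. Each thread keeps its own best candidate, and the candidates are merged under a lock at the end.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelForAlgorithm
{
    // Returns the index of the element closest to d, or -1 for an empty list.
    // On equal distances the lowest index wins, like in the sequential loop.
    public static int FindBestMatchingIndex(IList<double> input, double d)
    {
        object sync = new object();
        int bestInd = -1;
        double minDist = double.MaxValue;

        Parallel.For(0, input.Count,
            // Per-thread starting state: no candidate yet (index, distance).
            () => new KeyValuePair<int, double>(-1, double.MaxValue),
            // Body: keep the best candidate seen so far by this thread.
            (i, loopState, local) =>
            {
                double dist = Math.Abs(input[i] - d);
                bool better = dist < local.Value || (dist == local.Value && i < local.Key);
                return better ? new KeyValuePair<int, double>(i, dist) : local;
            },
            // Merge: fold each thread's candidate into the shared result.
            local =>
            {
                lock (sync)
                {
                    if (local.Key < 0) return;
                    if (local.Value < minDist || (local.Value == minDist && local.Key < bestInd))
                    {
                        minDist = local.Value;
                        bestInd = local.Key;
                    }
                }
            });

        return bestInd;
    }
}
```

The design is the same as the hand-written version: independent partial searches followed by one small synchronized merge step.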

Does this pay off?
Of course it does. I wrote this example because I have been working on another, slightly more complex thing, where I did essentially the same.

Here is the output from running the three variants of the algorithm: simple, divided into sections, and multithreaded.

One thread result = 9841. TIME: 0H:0M:2S:583ms
One thread with sections result = 9841. TIME: 0H:0M:2S:917ms
Multi threads result = 9841. TIME: 0H:0M:1S:939ms
Press any key to continue . . .

As can be seen, the multithreaded variant is noticeably faster (about 1.9 s versus 2.6 s). It could be even faster if I had more than two processors on my machine.
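The post does not show how the timings were taken. One possible way, sketched here under that assumption, is a Stopwatch around each call. Timing, Format and Measure are hypothetical names; the format string only mimics the shape of the output above.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Formats an elapsed time in the same shape as the output above.
    public static string Format(TimeSpan t)
    {
        return string.Format("{0}H:{1}M:{2}S:{3}ms", t.Hours, t.Minutes, t.Seconds, t.Milliseconds);
    }

    // Runs the given function once and reports its result with the elapsed time.
    public static string Measure(string label, Func<int> algorithm)
    {
        var watch = Stopwatch.StartNew();
        int result = algorithm();
        watch.Stop();
        return string.Format("{0} result = {1}. TIME: {2}", label, result, Format(watch.Elapsed));
    }
}
```

A call could look like Console.WriteLine(Timing.Measure("One thread", () => algorithm.FindBestMatchingIndex(0.5))), where algorithm is one of the classes above and 0.5 is an arbitrary search value.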




Simple example to see multithreading data sharing issues

March 23, 2010 .NET, Concurrency No comments

I have heard a lot about the corruption that multiple threads can cause when they share the same data. Understanding why it happens is not so hard, but I wanted a bright example that quickly shows what is going on.

So here is my example:

    using System;
    using System.Threading;

    public class Counter
    {
        public int Count = 0;

        public void UpdateCount()
        {
            for (int i = 0; i < 10000; i++)
            {
                Count = Count + 1;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var counter = new Counter();
            var threads = new Thread[500];
            for (int i = 0; i < threads.Length; ++i)
            {
                threads[i] = new Thread(counter.UpdateCount);
                threads[i].Start();
            }
            foreach (var thread in threads)
                thread.Join();
            Console.WriteLine(string.Format(
                "If you are running this code on multiple processors machine you " +
                "probably will not get expected ThreadCount*Iterations={0}*{1}={2}, but " +
                "less number, which is currently equal to {3}",
                threads.Length, 10000, threads.Length * 10000, counter.Count));
        }
    }

As you see, I am using 500 threads. Why? First, it makes it likely that many of them will be scheduled on another processor. Second, UpdateCount runs "Count = Count + 1", which is very trivial and consists of about three separate operations (read, increment, write), so I increased the number of threads to raise the chance that these operations interleave across concurrent threads.

Below is the output:

If you are running this code on multiple processors machine you probably will not
get expected ThreadCount*Iterations=500*10000=5000000, but less number, which
is currently equal to 4994961
Press any key to continue . . .

As you see, the actual value of the Count field is less than expected. We lose one increment each time the following happens at once:
Thread 1: Reads Count (1000)
Thread 2: Reads Count (1000)
Thread 1: Increases Count (1001)
Thread 2: Increases Count (1001)
Thread 1: Writes Count (1001)
Thread 2: Writes Count (1001), so it wrote back the same number, not 1002
You probably already know how to solve this with lock, the Interlocked class, or other means.
For example, just replace the line Count = Count + 1; with Interlocked.Increment(ref Count); or with lock (this) Count = Count + 1; But I had fun writing this.
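
As a minimal sketch of the Interlocked variant: SafeCounter and Run are hypothetical names, and the class is the same counter with only the increment line changed. Run starts the given number of threads and returns the final count.

```csharp
using System.Threading;

public class SafeCounter
{
    public int Count = 0;

    // Same loop as UpdateCount above, but each increment is atomic.
    public void UpdateCount()
    {
        for (int i = 0; i < 10000; i++)
        {
            Interlocked.Increment(ref Count);
        }
    }

    // Starts threadCount threads that all update one counter,
    // waits for them, and returns the final count.
    public static int Run(int threadCount)
    {
        var counter = new SafeCounter();
        var threads = new Thread[threadCount];
        for (int i = 0; i < threads.Length; ++i)
        {
            threads[i] = new Thread(counter.UpdateCount);
            threads[i].Start();
        }
        foreach (var thread in threads)
            thread.Join();
        return counter.Count;
    }
}
```

With the atomic increment no update is lost, so SafeCounter.Run(500) returns 5000000.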

