MCTS 70-536 Application Development Foundation book review

April 26, 2010 Book Reviews, Certification 2 comments

What can I say about this book? It is written specifically for passing one Microsoft exam. If you are going to take the 70-536 exam, you definitely need it. If you just want to learn the core of developing applications with .NET, the book is not the best choice. It leads you through a big list of class names, methods and enumerations, all of which can easily be found, up to date, in the MSDN documentation. I would even say it looks like the authors took everything from MSDN that is related to the exam. The book also has mistakes, which I think should not appear in a book like this. On top of that it is dry to read, but you can read it fast if you know how.

How to read the book:

  • Read carefully but quickly if the topic is familiar to you. Skip “Real World” and pay attention to “Important” and “Exam Tip”; try to remember those!
  • If a boring explanation is followed by code, switch to the code first and check whether you understand everything there; if you do, don’t go back to the text.
  • Do not do the Labs unless you did not understand what you have read.
  • If there are difficult chapters, try to read them twice. I did this with Application Security.

I would recommend this book if you are going to take the exam. If not, just skim it, since there may be some interesting things you should know, but do not read the whole book end to end.




It is time to get Microsoft Certification

April 26, 2010 Certification, Success 4 comments

When is it time to be Microsoft Certified?
My company is developing a new certification model, let's call it abc (I’m not sure if I can use the original name), which includes several levels such as abc Certified Junior, abc Certified Intermediate, abc Certified Senior, and so on. To become a Senior developer after Intermediate, you first need to get the abc Intermediate certification.

Each of the levels has requirements such as industry experience, a certain number of company-wide presentations, a passed performance appraisal with marks in strength areas, and a passed knowledge evaluation. What is important here is that each next level also requires some external certifications. For example, to be abc certified at the intermediate level in .NET desktop you need two certificates: 70-505 and 70-536.

Hey, I have already been promoted to intermediate, but I do not have the certificates… That is at least one of the reasons why I cannot become Senior (another is that I need about half a year more experience and have to improve my knowledge). And this makes me a bit sad.
Actually, I tried to take the 70-505 exam right after I got promoted (summer 2009). The first attempt was free, and I failed it. Shit, and it was so easy; I got 670 out of 1000 with 700 required.

Why did I fail it? There are a few reasons:

* Exam 70-505 includes features from the .NET Framework 3.5, which I had never used. For example, it includes an easy question about LINQ, and I did not know LINQ at that moment. It also included a question about installation and deployment strategies, which I had never tried. Stop. Am I that bad? I knew design patterns; I had experience developing desktop applications. But I had only dealt with the actual development of simple things; I was not familiar with the whole lifecycle of our product. To be honest, our lifecycle is so complicated that nobody knows it well.
* I was too self-confident about the exam.
* I did not spend much time preparing, maybe half a day.

I know that there are other ways to pass an exam, I think you know what I mean, but I did not find the actual tests for my exam on the internet, and besides, that is not a good way to pass it. I saw some people pass other exams with a score of 1000. Why? They found the exact test on the internet.

Let’s suppose that I was not ready and did not know how to pass such exams.

Ok. Do you know the words usually attributed to Churchill: “Never, never, never, never… give up!”? And I did not give up. I tried the same exam again a month later. And you know what? I failed it with a score below 500. I was quite sure that the questions would be similar, and they were. I was too confident that I would pass, but I did not. So I blew $50. Holy shit.

What is needed to pass a Microsoft exam? My tips are the following:

•    Get a resource to learn from. The best choice is the Microsoft Training Kits.
•    Be aware of what your exam includes. If some of the technologies are unknown to you, ask yourself whether this is the right exam for you.
•    Make sure you have worked with each technology mentioned there in the real world. If you have not, just write simple applications while preparing for the exam.
•    Do not schedule the exam if you are not confident in your knowledge.
•    Do not be overconfident in your knowledge either. Most exams are dry tests that include some “monkey” questions (pure memorization) whose answers are easily forgotten.
•    Train, train, train… and never give up.

Getting MCTS is a very important step in my career. (See picture below).
I’m going to achieve this by passing 536 and 505 in the near future.




My presentation on the Service Oriented Architecture

April 23, 2010 Opinion, SOA, Success 4 comments

Yesterday I gave my presentation on Service Oriented Architecture. It was awesome. I got a good portion of emotions.

People

I was really surprised by how many people came to listen to me. I had assumed that this local Architecture Group (AG) was for a couple of people who want to spend their free time talking about SOA, but to my great surprise almost the whole team came, I mean the developers from the whole enterprise project where I work. Guys, is there something I don’t know about the intent of this group?

Rehearsal

Before the presentation I did my rehearsal, as usual. First I presented it to my friend, who is also a developer. We call this the ‘Turbulent Developers Community’: we sit in the evening with beer and talk about different interesting IT things. He came up with a few interesting questions, which I was then ready to answer. Also, I had my *.ppt on my phone, so on my way to work I tried to replay the presentation in my mind. And finally, right before the presentation, I gave a shortened version of it to my girlfriend; it took about 20 minutes. Thank you very much for that!

Presentation

So, I started with the plan that I have in this blog post. People listened attentively, which I really appreciate. What was also enjoyable is that they had a lot of different comments on what I was talking about. They even spoke to 3 of my slides instead of me… OMG.

How was it?

… I asked, and got an answer: “It’s a great success! … can we now get money out of that?.. :)”. Those are the words of a colleague with whom I work on the same piece of software. I really hope that others also enjoyed what I did.

Conclusion

For myself, I have discovered that I learnt more about how to give a good presentation. I do not say that this was my best presentation. I think the best one I have given is the presentation on DDD, but anyway… having more experience in this area is a very good achievement.

Keep in mind – you will die someday

What is the intent of my life? I spend much of my life at work, so that is a very big part of it. If that is true, why not do your best at work and in your growth? For me it is mandatory. If you don’t like your work, just change it. Just do it!

Everyone has to die some day. Think about what you do and what you want to achieve. Have super large dreams. Often, when I’m getting to work on public transport, many people are talking, and as I understand it, the intent of their life is to find a good job and to have a car, a home and a family. I don’t say that family is bad, BUT why put such a trivial task in front of you? You have one life and then death!




Service Oriented Architecture – Introduction

April 22, 2010 SOA No comments

Currently I’m part of a local SOA group. Its purpose is to accumulate information about SOA and its implementation, and to discuss the different approaches that can be applied with this paradigm of developing software. I volunteered to prepare the first, introductory presentation, so other, more experienced guys will prepare deeper presentations. Right, folks? ;)

What is SOA?

On the internet you can find a lot of different definitions of SOA. One of the simplest is:

Service Oriented Architecture is an architectural style of building software applications that promotes loose coupling between components for their reuse.

The term itself can be a bit confusing, because it combines two different things. The first two words describe a software methodology: how to build software. The third word, architecture, represents a picture of the company’s assets, which all together should form a good building. So, in other words, we could say that SOA is a strategy that promotes building the software assets of a business using a service-oriented methodology.

Wikipedia gives a more complex definition:
A paradigm for organizing and utilizing distributed capabilities that may be under the control of different ownership domains. It provides a uniform means to offer, discover, interact with and use capabilities to produce desired effects consistent with measurable preconditions and expectations.

SOA is not simply an IT solution; it is a whole paradigm shift. It is another step towards the IT industry’s dream, in which we will not need to write low-level code but will just combine different ready-made components to get a product that is good for immediate sale. It is about describing, building, using and managing your IT environment with a focus on the services and capabilities it provides rather than on the concrete technology used for that.



Advantages both for Business and for Development

Before we look at what stands behind SOA, let’s first take a look at the advantages it brings to the Business and to the Development process.

Advantages of service-oriented development:

Software reuse is one of the most commonly used arguments to justify SOA. If services are developed with a good size and the correct scope, they can be reused. Suppose your company has four sales departments. Each of them works with orders, so all of them have code that checks the customer’s credit card, books the order and so on. But checking a credit card is almost 100% the same task in all departments, so developers could gather that code and share it among all systems; in this case software reuse is accomplished. All we need in our systems is a client that is able to consume the service. The systems are no longer concerned with the internal verification process. If the company creates a new department in the future, it will just reuse the existing service, which saves development time and money for the business.
But we cannot say that reuse will be accomplished every time. First of all, a service should be of the right size, not too big, and with the correct scope. This means that we will not be able to use ‘verify credit card and book that car’ in a department that sells phones. I also found that only 10 to 40 percent of services are reused. This puts an additional requirement on implementers to be careful when choosing the scope and size of a service.
Productivity increase. That is easy: once we have achieved reuse, we no longer spend our time implementing and testing the same things. We only need to integrate services and do integration testing, so development becomes cheaper.
Increased agility. Even if reuse is not achieved, we still gain agility from having services. For example, our sales department has divided its whole process into a few services, like ‘check user’, ‘check credit card’, ‘get order’, ‘ship order’… If we need a change in the process, we can make it in the isolated island of one concrete service. That is how we get more flexibility.

Advantages of the SOA strategy are:

Better alignment with business. Business people can now imagine how their business is constructed in terms of technology. Once they better understand how an application is built, they can communicate their requirements to development teams more efficiently. The development team, in turn, now better understands what the business really does. It is now much easier to explain why having credit card handling in one service is a good decision.
A better way to sell architecture to the business. The gap in understanding how everything is really implemented also used to get in the way of selling an IT product to the business.

What are services?

So we talk about services. But what exactly are those services?
A service is a complete, self-contained component that provides defined tasks. The consumer can obtain the list of tasks through agreed contracts. The consumer should not know about the internal implementation of the service. Communication between consumer and service should go through platform- and language-independent contracts.

Here is a list of the just-mentioned properties of a good service:

  • Self-contained module that performs a predetermined task
  • Software components that have published contracts/interfaces
  • Black-box to the consumers
  • Platform-Independent
  • Language-Independent
  • Operating-System-Independent
  • Interoperable
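
The black-box property can be illustrated with a small C# sketch. It is only an in-process approximation under stated assumptions: a real service would sit behind a network boundary and a platform-neutral contract, and `ICreditCardService`, `SimpleCreditCardService`, `PhoneDepartment` and `CarDepartment` are hypothetical names invented for this example. It shows two consumers that depend only on the published contract, never on the implementation.

```csharp
using System;

// Published contract: the only thing consumers are allowed to know about.
public interface ICreditCardService
{
    bool IsValid(string cardNumber);
}

// Hypothetical implementation hidden behind the contract (the "black box").
// The rule used here (exactly 16 digits) is a placeholder, not a real check.
public class SimpleCreditCardService : ICreditCardService
{
    public bool IsValid(string cardNumber)
    {
        if (cardNumber == null || cardNumber.Length != 16) return false;
        foreach (char c in cardNumber)
        {
            if (c < '0' || c > '9') return false;
        }
        return true;
    }
}

// Two unrelated consumers reuse the same service through the contract.
public class PhoneDepartment
{
    private readonly ICreditCardService cards;
    public PhoneDepartment(ICreditCardService cards) { this.cards = cards; }

    public string SellPhone(string cardNumber)
    {
        return cards.IsValid(cardNumber) ? "phone sold" : "payment rejected";
    }
}

public class CarDepartment
{
    private readonly ICreditCardService cards;
    public CarDepartment(ICreditCardService cards) { this.cards = cards; }

    public string BookCar(string cardNumber)
    {
        return cards.IsValid(cardNumber) ? "car booked" : "payment rejected";
    }
}
```

Swapping `SimpleCreditCardService` for another implementation would not require touching either department, which is the point of keeping the service a black box.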

Main Architecture Concepts

The picture below shows how the three main parts of SOA interact with each other.

Loose coupling – Services maintain a relationship that minimizes dependencies and only requires that they maintain an awareness of each other.

Service contract – Services adhere to a communications agreement, as defined collectively by one or more service description documents.

Service abstraction – Beyond what is described in the service contract, services hide logic from the outside world.

Service reusability – Logic is divided into services with the intention of promoting reuse.

Service composability – Collections of services can be coordinated and assembled to form composite services.

Service autonomy – Services have control over the logic they encapsulate.

Service optimization – All else equal, high-quality services are generally considered preferable to low-quality ones.

Service discoverability – Services are designed to be outwardly descriptive so that they can be found and assessed via available discovery mechanisms.

How could it be implemented?

Since SOA is technology independent, we currently have a lot of different possibilities for implementing it. It could be a simple Web service that sends data packed into XML with the help of SOAP, with contracts established through WSDL. Available services could be discovered through UDDI.

We could also realize SOA with REST, DCOM, CORBA, DDS and other technologies.

Microsoft also has its own solution for SOA, which is WCF. It is a great tool for developing distributed applications. I’m glad that I already have at least some experience using it.

What are SOA Patterns?

Here is a very good site on design patterns applicable to SOA.

Mistakes when implementing SOA

There are a lot of mistakes that can appear when you are trying to introduce SOA in your project. They are listed in the Twelve Common SOA Mistakes pdf.

I hope that I will be able to give a good presentation, since I don’t feel like a professional in this area and don’t have enough of the required background, apart from a few months of using WCF.




Refactor: Sequential Coupling => Template Method

April 14, 2010 Design Patterns, Refactoring 2 comments

Another colleague brought me a present today: this blog post. Thank you. You were right!

We will do some refactoring that will lead us from an Anti-Pattern to a Pattern: from Sequential Coupling to Template Method. As I see it, this could be a very common way to refactor bad code that exhibits the mentioned anti-pattern.

So, let’s start with the following question. What do you see to be wrong with the following code?

  public class Manager
  {
    public void DoTheProject()
    {
      IWorker worker = GetMeWorker();
      worker.PreWork();
      worker.Work();
      worker.AfterWork();
    }
  }

The Manager is a very angry guy and he needs to get the project done. For that he would like to get some worker, which implements the IWorker interface, and delegate some work to him. But the manager also knows that the worker could be new to the project, so the worker will have to do some preparatory work before, and maybe something else after the work is done…

What is wrong? The answer is that the Manager cares too much about how the worker should organize his work. Why should the manager know that the worker needs to prepare for work? What we have here is the Sequential Coupling anti-pattern.

(My definition) 

Sequential Coupling Anti-Pattern: appears when consuming code is forced to call the methods of a used object in a particular sequence to make it work correctly.
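
To make the definition concrete, here is a minimal sketch of how such coupling bites. `FragileWorker` and `SequentialCouplingDemo` are hypothetical classes invented for this illustration (they are not part of the code in this post); the `Work` method only succeeds if the caller remembered to call `PreWork` first.

```csharp
using System;

// Hypothetical worker whose methods must be called in a fixed order.
public class FragileWorker
{
    private bool prepared;

    public void PreWork()
    {
        prepared = true;
    }

    public void Work()
    {
        // The object relies on the caller to respect the sequence.
        if (!prepared)
            throw new InvalidOperationException("PreWork must be called before Work.");
    }
}

public static class SequentialCouplingDemo
{
    // Returns true when calling Work without PreWork fails.
    public static bool WrongOrderFails()
    {
        var worker = new FragileWorker();
        try
        {
            worker.Work();   // the caller forgot PreWork
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Every consumer of such a class has to know and repeat the correct sequence, which is exactly the knowledge we would rather keep inside the worker.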

If we call Work and then PreWork, the code could break, and we want to prevent this. We could, for example, move all that stuff into one method, Work, but sometimes pre- and/or after-work is needed and sometimes not. That is why we had the following design, which allowed us to do this. Notice that StandardWorker doesn’t need to do anything after he has finished. This was achieved with virtual and abstract methods.

  public interface IWorker
  {
    void PreWork();
    void Work();
    void AfterWork();
  }
  public abstract class WorkerBase : IWorker
  {
    public virtual void PreWork(){}
    public abstract void Work();
    public virtual void AfterWork(){}
  }
  public class StandardWorker : WorkerBase
  {
    public override void PreWork()
    {
      Console.WriteLine("… I need to prepare to work …");
    }
    public override void Work()
    {
      Console.WriteLine("… hard work is in process …");
    }
  }

So, what we need to do is hide the order in which we call the methods, make sure that the order remains the same, and still keep the possibility to override each method. What I just described is Template Method.

In our example we can leave one method in the interface and implement it in the base class, calling everything else we need in the correct order. Each of the things we call can be overridden; we made them protected so as not to show them to the world.

  public interface IWorker
  {
    void Work();
  }
  public abstract class WorkerBase : IWorker
  {
    //this is template method
    public void Work()
    {
      PreWorkActivity();
      WorkActivity();
      AfterWorkActivity();
    }
    protected virtual void PreWorkActivity() { }
    protected abstract void WorkActivity();
    protected virtual void AfterWorkActivity() { }
  }
  public class StandardWorker : WorkerBase
  {
    protected override void PreWorkActivity()
    {
      Console.WriteLine("… I need to prepare to work …");
    }
    protected override void WorkActivity()
    {
      Console.WriteLine("… hard work is in process …");
    }
  }
  public class Manager
  {
    public void DoTheProject()
    {
      IWorker worker = GetMeWorker();
      worker.Work();
    }
    private IWorker GetMeWorker()
    {
      return new StandardWorker();
    }
  }
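
As a usage sketch, a second worker can plug into the same template without the Manager changing at all. The interface and base class follow the refactored design from this post, repeated in compact form so the block compiles on its own; `RecordingWorker` and its `Log` list are hypothetical additions made only to show that the base class, not the caller, fixes the call order.

```csharp
using System;
using System.Collections.Generic;

public interface IWorker
{
    void Work();
}

public abstract class WorkerBase : IWorker
{
    // Template method: the order of the steps is fixed here.
    public void Work()
    {
        PreWorkActivity();
        WorkActivity();
        AfterWorkActivity();
    }

    protected virtual void PreWorkActivity() { }
    protected abstract void WorkActivity();
    protected virtual void AfterWorkActivity() { }
}

// Hypothetical worker that records which steps ran, and in what order.
public class RecordingWorker : WorkerBase
{
    public readonly List<string> Log = new List<string>();

    // PreWorkActivity is not overridden, so the empty default is used.
    protected override void WorkActivity() { Log.Add("work"); }
    protected override void AfterWorkActivity() { Log.Add("after"); }
}
```

The consumer still calls only `Work()`; which optional steps exist is purely the worker's business.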

It really looks to me like refactoring from Sequential Coupling to Template Method could be a common thing to do. Hope you liked this post.




Null is not equal to null. How could that happen?

April 13, 2010 C# 2 comments

Today one of my colleagues brought me a present: a new blog post!

Please take a look at the following picture and think about how execution could step into the if statement. Notice that oldPerson is null!

She asked everyone whether anyone knew how this could happen. Do you know?

Honestly, my first thought was that it had something to do with Nullable<T>, or maybe that the Find method isn’t what we think it is. I thought it could be a LINQ method that is somehow wrong, whatever. But none of that is true. :(

She did not send the whole picture, so we did not know the type of the oldPerson object. So: we don’t know the type, and we know that “!=” behaves somehow differently. That’s it! The operator is wrong, and I won a beer. (Of course not :) )

So let’s take a look at the implementation of the operator:

    1 using System.Collections.Generic;
    2 
    3 namespace ConsoleApplication1
    4 {
    5     internal class Person
    6     {
    7         public string Name { get; set; }
    8 
    9         public static bool operator ==(Person left, Person right)
   10         {
   11             if (ReferenceEquals(null, right)) return false;
   12             if (ReferenceEquals(left, right)) return true;
   13             return Equals(left.Name, right.Name);
   14         }
   15 
   16         public static bool operator !=(Person left, Person right)
   17         {
   18             return !(left == right);
   19         }
   20 
   21         public bool Equals(Person obj)
   22         {
   23             return (this == obj);
   24         }
   25 
   26         public override bool Equals(object obj)
   27         {
   28             if (ReferenceEquals(this, obj)) return true;
   29             if (obj.GetType() != typeof(Person)) return false;
   30             return (this == ((Person)obj));
   31         }
   32 
   33     }
   34 
   35     class Program
   36     {
   37         static void Main(string[] args)
   38         {
   39             var persons = new List<Person>()
   40                 {
   41                     new Person(){Name = "Ivan"},
   42                     new Person(){Name = "Andriy"}
   43                 };
   44 
   45             foreach (var item in persons)
   46             {
   47                 var oldPerson = persons
   48                     .Find(x => x.Name == item.Name+"Not Andriy");
   49 
   50                 if (oldPerson != null)
   51                 {
   52                     item.Name = oldPerson.Name;
   53                 }
   54             }
   55         }
   56     }
   57 }

Do you already see where the problem is? It is definitely in line 11: if (ReferenceEquals(null, right)) return false;
If right is null, the operator returns false before it ever looks at left, so it says that null is not equal to null. That is why oldPerson != null evaluates to true even though oldPerson is null.

The reason she needs such a complicated check is that she accesses properties of the objects in the return statement later:
return Equals(left.Name, right.Name);
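
One possible fix, sketched below, is to compare references first and only then treat a single null as unequal. This is a suggestion rather than the fix my colleague actually applied; `SafePerson` is a hypothetical name used to keep it apart from the `Person` class in the listing, and the `GetHashCode` override is added because equality members should be overridden together.

```csharp
using System;

public class SafePerson
{
    public string Name { get; set; }

    public static bool operator ==(SafePerson left, SafePerson right)
    {
        // Same instance, or both null: equal.
        if (ReferenceEquals(left, right)) return true;
        // Exactly one side is null: not equal.
        if (ReferenceEquals(left, null) || ReferenceEquals(right, null)) return false;
        return string.Equals(left.Name, right.Name);
    }

    public static bool operator !=(SafePerson left, SafePerson right)
    {
        return !(left == right);
    }

    public override bool Equals(object obj)
    {
        // "as" yields null for null or for an unrelated type; == handles null.
        return this == (obj as SafePerson);
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}
```

With this ordering of checks, a null reference compared with null is reported as equal, so the if statement from the picture would be skipped.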




Get started with F#

April 4, 2010 .NET, C#, F# No comments

Renaissance of functional languages

We are in a renaissance of functional languages. When I read blog posts I often see guys talking about functional programming and related stuff. The community wants more of the features that the functional style provides. In response, the creators of languages and technologies are now introducing a lot of amazingly cool features to make our life happier. They also create new languages, and so on.

What is going on with C# nowadays?

Let’s start with what we have in C# nowadays. It has moved slightly to the functional side. The introduction of LINQ is a big step in that direction. We are moving away from imperative programming towards functional; for example, this simple loop represents the imperative way to work with list items.

      foreach (var element in list)
      {
        Console.WriteLine(element);
      }

With the ForEach method of List<T> (strictly speaking a List<T> method rather than part of LINQ) we get a much more elegant functional syntax:

      list.ForEach(Console.WriteLine);

So we pass a function into a function. Simply put, that is why we call this functional programming. If the example above isn’t vivid enough, take a look at the next one:

      Func<int, int> doubleThat = delegate(int x) { return x * 2; };
      var from2To20 = from1To10.Select(doubleThat).ToList();
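
For completeness, here is a self-contained sketch of the same idea. It assumes that `from1To10` holds the integers 1 through 10 (the post does not show how it is built) and uses a lambda in place of the anonymous delegate; `FunctionalDemo` is a hypothetical wrapper class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FunctionalDemo
{
    public static List<int> DoubleAll(IEnumerable<int> source)
    {
        // A lambda is shorthand for the anonymous delegate form.
        Func<int, int> doubleThat = x => x * 2;
        // Select applies the function to every element (a "map").
        return source.Select(doubleThat).ToList();
    }

    public static List<int> From2To20()
    {
        var from1To10 = Enumerable.Range(1, 10); // assumed input: 1..10
        return DoubleAll(from1To10);
    }
}
```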

Another sign that C# is getting closer to those languages is the introduction of the var keyword. It does not mean that we can change the type later in the code, as in dynamic languages; the compiler infers the type, so it helps us write code without being too concerned about types while the language itself remains strongly typed. If you would mention C# dynamic here, hm… honestly, I’m not a fan of that thing in C#, but maybe it gives us some advantages.

Immutability

A few days ago I listened to a podcast in which a guy from the Microsoft .NET languages team spoke about different features that the community has requested for C#. One of them was immutability, and it looks like they are going to introduce it in later releases (after 4.0, of course). So what is immutability? It is something that functional languages have by default and imperative ones don’t :).

For example, in C# we have

   int number = 1;
   number = number + 3;

We can change number as many times as we want. But the same thing in F# will produce a compilation error:

(“<-” is the assignment operator in F#.) In order to make it work you need another value, result:

If you want a variable that you really do want to change, declare it with the mutable keyword:

Functions

So let’s move on to something more interesting, declaring functions:

ten will be an immutable value equal to 10.

Are you familiar with parameter binding in C++ functors? It doesn’t matter if not, but currying of functions in F# reminds me of that. Take a look at how we get a new function by taking the function multiply and freezing one of its parameters.
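
The idea of freezing one parameter can also be approximated in C#. The sketch below is a C# analogy rather than the original F# code; `CurryingDemo`, `Multiply` and `MultiplyBy` are hypothetical names.

```csharp
using System;

public static class CurryingDemo
{
    // Curried form: takes x and returns a function that still expects y.
    public static readonly Func<int, Func<int, int>> Multiply =
        x => y => x * y;

    // Partial application: freeze the first parameter.
    public static Func<int, int> MultiplyBy(int factor)
    {
        return Multiply(factor);
    }
}
```

For example, `CurryingDemo.MultiplyBy(10)` gives a function of one argument, and calling it with 4 evaluates to 40.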

Using F# library in other .Net languages 

I did all this fun stuff by creating a new F# library in Visual Studio. I viewed the results of the code that interested me by selecting it and choosing ‘Send To Interactive’ from the context menu. The next thing that interested me was how I could use that dll in my usual C# program. So I created a console application and added a reference to my F# lib. Now I can use it like below:

See how things differ out there. A function is seen as a method, immutable values as properties without a setter, and mutable ones as properties with a setter. I forgot to mention that FirstFSharpProgram is defined with the module keyword at the top of the *.fs file.

Why?

When could F# be useful? Anywhere you like. But as its creators say, it has a lot of multi-threading capabilities, and on top of that you write immutable code, which avoids the mutable state that was the main root of stupid bugs in imperative programming. You can also easily use it in combination with other .NET languages.

Don’t miss the chance to learn this language if you are interested in it.

Take a look at this elegant Fibonacci solution: let rec fib n = if n < 2 then 1 else fib (n-2) + fib(n-1)




Refactoring your code to be multithreaded

April 3, 2010 .NET, Concurrency No comments

In this post I will start with a time-consuming algorithm, which is very simple, and then reduce its execution time by taking advantage of my two processors.

Time consuming algorithm

For example, you have some complex calculation algorithm that runs on an array of thousands of elements. For simplicity let it be a list of doubles like below:

      var random = new Random();
      var inputVector = new List<double>();
      for (int i = 0; i < 10000; i++)
        inputVector.Add(random.NextDouble());

All the complex calculations are located in the class OneThreadAlgorithm. Its method FindBestMatchingIndex goes through the whole array and returns the index of the element that is closest to the value d that you pass in.

  internal class OneThreadAlgorithm
  {
    public readonly List<double> InputVector;
    public OneThreadAlgorithm(List<double> inputVector)
    {
      InputVector = inputVector;
    }
    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = 0; i < InputVector.Count; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      return bestInd;
    }
  }

If you are interested in DistanceProvider, it just returns the distance between two double values. Take a look at it.

  public static class DistanceProvider
  {
    public static double GetDistance(double val1, double val2)
    {
      // busy loop that imitates heavy calculations
      for (int i = 0; i < 10000; i++) i = i + 1 - 1;
      return Math.Abs(val1 - val2);
    }
  }

As you see, I have a loop before returning the value. It is needed to imitate heavy calculations. I’m not using Thread.Sleep(), because it brings some nuances that would show up once we move to multithreading, and I would like to leave them aside.

First change: dividing calculations to sections

As we see, the algorithm just loops through a collection of values and finds some element. To make it run in several threads we first need to adapt it. The intent of our adaptation is to divide the calculations into sections, so we introduce the following method:

    public void FindBestMatchingIndexInRange(int start, int end, double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = start; i < end; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      BestMatchingElements.Add(bestInd);
    }

So it does the same as before. The only difference is that now we search for the best matching index in the range starting at index start and finishing before end, and we put the result into the BestMatchingElements list. After we have gone through all sections, we only need to look for the best matching element within that list. See that below:

    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      BestMatchingElements = new List<int>();
      int sectionLength = InputVector.Count / ParallelingNumber;
      int start = 0;
      int end = sectionLength;
      for (int i = 0; i < ParallelingNumber; i++)
      {
        FindBestMatchingIndexInRange(start, end, d);
        start = end;
        end += sectionLength;
        // the next section is the last one: let it absorb the division remainder
        if (i == ParallelingNumber - 2) end = InputVector.Count;
      }
      foreach (var elementIndex in BestMatchingElements)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[elementIndex], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = elementIndex;
        }
      }
      return bestInd;
    }

So now it works as well as before. In theory it is a bit slower, and in practice it is, especially when you divide the array into a lot of segments (the number of segments is defined by the ParallelingNumber property).
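
The boundary arithmetic is easy to get wrong, so here is a small stand-alone sketch of the same idea. `SectionSplitter` and `Split` are hypothetical helpers, not part of the algorithm class; they return the start and end (exclusive) index of every section and give the remainder of the division to the last one.

```csharp
using System;
using System.Collections.Generic;

public static class SectionSplitter
{
    // Returns {start, end} pairs; end is exclusive.
    public static List<int[]> Split(int count, int sections)
    {
        if (sections <= 0) throw new ArgumentOutOfRangeException("sections");
        var result = new List<int[]>();
        int sectionLength = count / sections;
        int start = 0;
        for (int i = 0; i < sections; i++)
        {
            // The last section ends at count, so no element is skipped.
            int end = (i == sections - 1) ? count : start + sectionLength;
            result.Add(new[] { start, end });
            start = end;
        }
        return result;
    }
}
```

For example, `Split(10, 3)` yields the ranges [0, 3), [3, 6) and [6, 10), so the tenth element still ends up in a section even though 10 is not divisible by 3.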

Second modification: storing task information into one object

When we use the thread pool, we schedule a method that matches the delegate public delegate void WaitCallback(Object state). To accomplish this we create a new class, FindBestInRangeRequest, and pass an instance of it as the single object argument of the changed method:

    public void FindBestMatchingIndexInRange(object bestInRangeRequest)
    {
      var request = (FindBestInRangeRequest)bestInRangeRequest;
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = request.Start; i < request.End; i++)
      {
        // ... same loop body as before, with request.D in place of d
      }
      // ... the rest of the method follows
    }

The new class FindBestInRangeRequest encapsulates Start, End, D and the other values needed to schedule a threading task. See that class:

    internal class FindBestInRangeRequest
    {
      public int Start { get; private set; }
      public int End { get; private set; }
      public double D { get; private set; }
      public ManualResetEvent Done { get; private set; }
      public FindBestInRangeRequest(int start, int end, double d, ManualResetEvent done)
      {
        Start = start;
        End = end;
        D = d;
        Done = done;
      }
    }

As you see, we are also passing a ManualResetEvent object. It has a Set() method, which the task calls to signal that its execution has finished.
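
The signalling mechanism can be tried in isolation. The sketch below is a simplified illustration, not the algorithm from this post: `SignalDemo` and `RunAndWait` are hypothetical names, and the "work" is just storing a number.

```csharp
using System;
using System.Threading;

public static class SignalDemo
{
    public static int RunAndWait()
    {
        int result = 0;
        using (var done = new ManualResetEvent(false)) // starts unsignalled
        {
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                result = 42;   // the "work"
                done.Set();    // signal that the task has finished
            });

            // Blocks the calling thread until Set() is called.
            done.WaitOne();
        }
        return result;
    }
}
```

The waiting thread only reads `result` after `WaitOne()` returns, so it sees the value written by the pooled thread.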

Third modification: Using ThreadPool

We could allocate threads manually, but that operation is expensive, so using the ThreadPool is strongly recommended.
Here is how we use the ThreadPool to call FindBestMatchingIndexInRange.

        var bestInRangeRequest = new FindBestInRangeRequest(start, end, d, DoneEvents[i]);
        ThreadPool.QueueUserWorkItem(FindBestMatchingIndexInRange, bestInRangeRequest);

After we have scheduled all ranges, we must make sure that all of them have finished. We can do this using the

      WaitHandle.WaitAll(DoneEvents);

method.
Another interesting thing is that adding to BestMatchingElements from several threads is not safe, so we use Monitor to ensure safe adding:

      Monitor.Enter(BestMatchingElements);
      BestMatchingElements.Add(bestInd);
      Monitor.Exit(BestMatchingElements);

which is essentially the same as locking with the lock keyword.
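
For comparison, here is a small sketch of both forms side by side. SafeAdd and its two methods are hypothetical names for illustration. The Monitor version is written with try/finally, which is what the lock statement does behind the scenes, so the monitor is released even if the body throws.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper that shows two equivalent ways of locking.
public static class SafeAdd
{
    // Explicit Monitor calls, with try/finally so the lock is always released.
    public static void AddWithMonitor(List<int> list, int value)
    {
        Monitor.Enter(list);
        try
        {
            list.Add(value);
        }
        finally
        {
            Monitor.Exit(list);
        }
    }

    // The same thing written with the lock keyword.
    public static void AddWithLock(List<int> list, int value)
    {
        lock (list)
        {
            list.Add(value);
        }
    }
}
```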

Full code base of algorithm

  internal class MultiThreadAlgorithm
  {
    public int ParallelingNumber { get; private set; }
    private List<int> BestMatchingElements { get; set; }
    private ManualResetEvent[] DoneEvents { get; set; }
    public readonly List<double> InputVector;

    public MultiThreadAlgorithm(List<double> inputVector, int parallelingNumber)
    {
      InputVector = inputVector;
      ParallelingNumber = parallelingNumber;
      DoneEvents = new ManualResetEvent[ParallelingNumber];
    }

    public int FindBestMatchingIndex(double d)
    {
      double minDist = double.MaxValue;
      int bestInd = -1;
      BestMatchingElements = new List<int>();
      for (int i = 0; i < ParallelingNumber; i++)
      {
        DoneEvents[i] = new ManualResetEvent(false);
      }
      int sectionLength = InputVector.Count / ParallelingNumber;
      int start = 0;
      int end = sectionLength;
      for (int i = 0; i < ParallelingNumber; i++)
      {
        // The last section also takes the remainder of the division.
        // This has to be set before the range is scheduled.
        if (i == ParallelingNumber - 1) end = InputVector.Count;
        var bestInRangeRequest = new FindBestInRangeRequest(start, end, d, DoneEvents[i]);
        ThreadPool.QueueUserWorkItem(FindBestMatchingIndexInRange, bestInRangeRequest);
        start = end;
        end += sectionLength;
      }
      // Wait until every section has signaled that it is done.
      WaitHandle.WaitAll(DoneEvents);
      // Pick the best element among the per-section winners.
      foreach (var elementIndex in BestMatchingElements)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[elementIndex], d);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = elementIndex;
        }
      }
      return bestInd;
    }

    public void FindBestMatchingIndexInRange(object bestInRangeRequest)
    {
      var request = (FindBestInRangeRequest)bestInRangeRequest;
      double minDist = double.MaxValue;
      int bestInd = -1;
      for (int i = request.Start; i < request.End; i++)
      {
        var currentDistance = DistanceProvider.GetDistance(InputVector[i], request.D);
        if (currentDistance < minDist)
        {
          minDist = currentDistance;
          bestInd = i;
        }
      }
      if (bestInd >= 0) // an empty range finds nothing, so there is nothing to add
      {
        Monitor.Enter(BestMatchingElements);
        BestMatchingElements.Add(bestInd);
        Monitor.Exit(BestMatchingElements);
      }
      request.Done.Set();
    }

    internal class FindBestInRangeRequest
    {
      public int Start { get; private set; }
      public int End { get; private set; }
      public double D { get; private set; }
      public ManualResetEvent Done { get; private set; }

      public FindBestInRangeRequest(int start, int end, double d, ManualResetEvent done)
      {
        Start = start;
        End = end;
        D = d;
        Done = done;
      }
    }
  }

Do you see how cumbersome our class has become? That is why multithreading is said to be complex and not easy to implement.

Does this pay off?
Of course it does. I wrote this example because I have been working on another, somewhat more complex thing, where I did essentially the same.

Here is the output of running the three variants of the algorithm: simple, divided into sections, and multithreaded.

One thread result = 9841. TIME: 0H:0M:2S:583ms
One thread with sections result = 9841. TIME: 0H:0M:2S:917ms
Multi threads result = 9841. TIME: 0H:0M:1S:939ms
Press any key to continue . . .

As can be seen, the multithreaded variant is noticeably faster: about 1.9 seconds against 2.6 seconds for the simple one. It could be even faster if I had more than two processors on my machine.
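
The measuring code itself is not shown here, so the following is only a sketch of how such timings could be taken with System.Diagnostics.Stopwatch. TimingHelper, Measure and Format are hypothetical names; the format string just imitates the output above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class TimingHelper
{
    // Runs the action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        var watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    // Formats a TimeSpan in the same style as the output above.
    public static string Format(TimeSpan t)
    {
        return string.Format("{0}H:{1}M:{2}S:{3}ms",
            t.Hours, t.Minutes, t.Seconds, t.Milliseconds);
    }
}
```

A call could look like TimingHelper.Format(TimingHelper.Measure(() => algorithm.FindBestMatchingIndex(d))), where algorithm and d are whatever instance and value you are testing.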




Quick & cheap way to rename column in table – sp_rename

March 31, 2010 QuickTip, SQL No comments

A quick and cheap way to rename a column in a table:

EXEC sp_rename
    @objname = 'MY_TABLE.COULMN_NAME',
    @newname = 'COLUMN_NAME',
    @objtype = 'COLUMN'

Have fun!




ID3-impl

March 26, 2010 Education, Pedagogic Practice No comments

Below is my implementation of the ID3 algorithm based on my story about it.

It builds a decision tree for the following training data:

 AGE | COMPETITION | TYPE | PROFIT
 ====+=============+======+=========================
 old | yes         | swr  | down (False in my impl)
 ----+-------------+------+-------------------------
 old | no          | swr  | down
 ----+-------------+------+-------------------------
 old | no          | hwr  | down
 ----+-------------+------+-------------------------
 mid | yes         | swr  | down
 ----+-------------+------+-------------------------
 mid | yes         | hwr  | down
 ----+-------------+------+-------------------------
 mid | no          | hwr  | up (True in my impl)
 ----+-------------+------+-------------------------
 mid | no          | swr  | up
 ----+-------------+------+-------------------------
 new | yes         | swr  | up
 ----+-------------+------+-------------------------
 new | no          | hwr  | up
 ----+-------------+------+-------------------------
 new | no          | swr  | up
 ----+-------------+------+-------------------------

And the built tree looks like this:

           Age
         / | \
        /  |  \
    new/   |mid \old
      /    |     \
    True Competition False
         /      \
        /        \
     no/          \yes
      /            \
    True             False
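
Why does Age end up at the root? ID3 picks the attribute with the largest information gain. The sketch below is separate from the implementation in this post: GainCheck is a hypothetical helper that recomputes the gain of each attribute directly from the (up, down) counts read off the training table.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class GainCheck
{
    // Entropy of an up/down split, in bits. 0 * log2(0) is treated as 0.
    public static double Entropy(int up, int down)
    {
        double n = up + down;
        double sum = 0;
        foreach (var count in new[] { up, down })
        {
            if (count == 0) continue;
            double p = count / n;
            sum -= p * Math.Log(p, 2);
        }
        return sum;
    }

    // Gain = entropy of the whole set minus the weighted entropy of the groups.
    // Each group is { up count, down count } for one value of the attribute.
    public static double Gain(int[][] groups)
    {
        int up = groups.Sum(g => g[0]);
        int down = groups.Sum(g => g[1]);
        double total = up + down;
        double remainder = groups.Sum(g => (g[0] + g[1]) / total * Entropy(g[0], g[1]));
        return Entropy(up, down) - remainder;
    }

    // { up, down } counts taken from the training table.
    public static readonly int[][] Age = { new[] { 0, 3 }, new[] { 2, 2 }, new[] { 3, 0 } };   // old, mid, new
    public static readonly int[][] Competition = { new[] { 1, 3 }, new[] { 4, 2 } };           // yes, no
    public static readonly int[][] Type = { new[] { 3, 3 }, new[] { 2, 2 } };                  // swr, hwr
}
```

With these counts the gain is 0.6 bits for Age, roughly 0.12 for Competition and 0 for Type, so Age is chosen first.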



The implementation of the ID3 algorithm

using System;
using System.Collections.Generic;
using System.Linq;

namespace ID3
{
    public static class Program
    {
        static void Main(string[] args)
        {
            // Non-categorical attributes and their possible values.
            var R = new Dictionary<string, List<string>>();
            R.Add("Age", new List<string>() { "old", "mid", "new" });
            R.Add("Competition", new List<string>() { "yes", "no" });
            R.Add("Type", new List<string>() { "hwr", "swr" });
            // Categorical attribute.
            var C = "Profit";
            var TrainingSet = GetTrainingData();
            var algorithm = new Id3Algorithm();
            Tree decisionTree = algorithm.ID3(R, C, "root", TrainingSet);
        }
        private static List<TrainingRecord> GetTrainingData()
        {
            var trainingRecords = new List<TrainingRecord>();
            Dictionary<string, string> attributes;

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "old");
            attributes.Add("Competition", "yes");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, false));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "old");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, false));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "old");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "hwr");
            trainingRecords.Add(new TrainingRecord(attributes, false));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "mid");
            attributes.Add("Competition", "yes");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, false));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "mid");
            attributes.Add("Competition", "yes");
            attributes.Add("Type", "hwr");
            trainingRecords.Add(new TrainingRecord(attributes, false));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "mid");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "hwr");
            trainingRecords.Add(new TrainingRecord(attributes, true));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "mid");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, true));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "new");
            attributes.Add("Competition", "yes");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, true));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "new");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "hwr");
            trainingRecords.Add(new TrainingRecord(attributes, true));

            attributes = new Dictionary<string, string>();
            attributes.Add("Age", "new");
            attributes.Add("Competition", "no");
            attributes.Add("Type", "swr");
            trainingRecords.Add(new TrainingRecord(attributes, true));

            return trainingRecords;
        }
    }
    internal class Tree
    {
        public string Name { get; private set; }
        public string ArcName { get; private set; }
        public bool IsLeaf { get; private set; }
        public Dictionary<string, Tree> Trees { get; private set; }

        public Tree(string name, string arcName, Dictionary<string, Tree> trees)
        {
            Name = name;
            ArcName = arcName;
            Trees = trees;
            // A node without child trees is a leaf.
            IsLeaf = Trees == null || Trees.Count <= 0;
        }
    }

    internal class TrainingRecord
    {
        public Dictionary<string, string> Attributes { get; private set; }
        public bool Success { get; private set; }

        public TrainingRecord(Dictionary<string, string> attributes, bool success)
        {
            Attributes = attributes;
            Success = success;
        }
    }
    /*    function ID3 (R: a set of non-categorical attributes,
           C: the categorical attribute,
           S: a training set) returns a decision tree;
           begin
         If S is empty, return a single node with the value Failure;
         If S consists of records that all have the same value for
            the categorical attribute,
            return a single node with that value;
         If R is empty, then return a single node with the value that is
            the most frequent among the values of the categorical attribute
            found in the records of S;
         Let D be the attribute with the largest gain Gain(D,S)
            among the attributes in R;
         Let {dj| j=1,2, .., m} be the values of attribute D;
         Let {Sj| j=1,2, .., m} be the subsets of S consisting of
            the records with value dj for attribute D;
         Return a tree with the root labeled D and arcs labeled
            d1, d2, .., dm that continue with the trees
              ID3(R-{D}, C, S1), ID3(R-{D}, C, S2), .., ID3(R-{D}, C, Sm);
           end ID3;
     */
    internal class Id3Algorithm
    {
        public Tree ID3(Dictionary<string, List<string>> R, string C, string arcName, List<TrainingRecord> S)
        {
            // 1. Empty training set: return a single "failure" node.
            if (S.Count <= 0) return new Tree(false.ToString(), arcName, null);

            // 2. All records have the same category: return a leaf with that value.
            var prevValue = S[0].Success;
            foreach (var trainingRecord in S)
            {
                if (prevValue != trainingRecord.Success)
                {
                    prevValue = trainingRecord.Success;
                    break;
                }
            }
            if (prevValue == S[0].Success)
            {
                return new Tree(prevValue.ToString(), arcName, null);
            }

            // 3. No attributes left: return a leaf with the most frequent category.
            if (R.Count <= 0)
            {
                var sCount = S.Where(x => x.Success).Count();
                var fCount = S.Where(x => !x.Success).Count();
                var resValue = sCount > fCount;
                return new Tree(resValue.ToString(), arcName, null);
            }

            // 4. Pick the attribute with the largest information gain.
            double maxGain = double.MinValue;
            string maxAtrb = string.Empty;
            foreach (var attribute in R)
            {
                double currGain = Gain(attribute.Key, attribute.Value, S);
                if (currGain > maxGain)
                {
                    maxGain = currGain;
                    maxAtrb = attribute.Key;
                }
            }

            // Split S into subsets, one per value of the chosen attribute.
            var partitioning = new Dictionary<string, List<TrainingRecord>>();
            foreach (var posValue in R[maxAtrb])
            {
                var Si = S.Where(x => x.Attributes[maxAtrb] == posValue).ToList();
                partitioning.Add(posValue, Si);
            }

            // Recurse with R - {D}. A copy is used so that sibling subtrees
            // do not lose attributes removed deeper in the recursion.
            var remaining = new Dictionary<string, List<string>>(R);
            remaining.Remove(maxAtrb);
            var childTrees = new Dictionary<string, Tree>();
            foreach (var Si in partitioning)
            {
                childTrees.Add(Si.Key, ID3(remaining, C, Si.Key, Si.Value));
            }
            return new Tree(maxAtrb, arcName, childTrees);
        }
        // Information gain of splitting the records on attribute s.
        private double Gain(string s, List<string> posValues, List<TrainingRecord> trainingRecords)
        {
            return Info(trainingRecords) - Info(s, posValues, trainingRecords);
        }

        // Weighted entropy of the subsets produced by splitting on the attribute.
        private double Info(string attribute, List<string> posValues, List<TrainingRecord> list)
        {
            double nGeneral = list.Count;
            double sum = 0;
            foreach (var value in posValues)
            {
                var sCount = list.Where(x => (x.Attributes[attribute] == value) && x.Success).Count();
                var fCount = list.Where(x => (x.Attributes[attribute] == value) && (!x.Success)).Count();
                var n = (double)(sCount + fCount);
                if (n == 0) continue; // no records with this value; avoids 0 / 0 = NaN
                var iValue = I(new List<double>() { sCount / n, fCount / n });
                sum += (n / nGeneral) * iValue;
            }
            return sum;
        }

        // Entropy of the whole set of records.
        private double Info(List<TrainingRecord> trainingRecords)
        {
            int n = trainingRecords.Count;
            var sCount = trainingRecords.Where(x => x.Success == true).Count();
            var fCount = trainingRecords.Where(x => x.Success == false).Count();
            var p1 = sCount / (double)n;
            var p2 = fCount / (double)n;
            double info = I(new List<double>() { p1, p2 });
            return info;
        }

        // I(p1, .., pn) = -sum(pi * log2(pi)); terms with pi = 0 are skipped.
        private double I(List<double> P)
        {
            double sum = 0;
            foreach (var p in P)
            {
                if (p != 0)
                {
                    sum += p * Math.Log(p, 2);
                }
            }
            return -sum;
        }
    }
}

And here is the result at the Competition node, from debug mode:

That is the bold path in the tree below:

           Age
         / | \
        /  |  \
    new/   |mid \old
      /    |     \
    True Competition False
         /      \
        /        \
     no/          \yes
      /            \
    True             False

I’m going to implement all this stuff online tomorrow for the students who will be listening to me.

