Ye Olde Double Dispatch; The Invasive Visitor Pattern; and Pattern Matching.


As the title suggests we’re exploring The Ol’ Double Dispatch, Visitor Pattern and Pattern Matching. Given this topic can be language agnostic, I’ll choose C++ for examples. We all know it’s the language of languages. Kind of like Latin. It also gives me a good framework to work with because it’s close to the metal, which hopefully makes things less complicated, and way better than JAVA because, well… JAVA. Need I say more?

I’ll get to the back story, and an architecture choice which leads to a pervasive (invasive) design pattern, but first I’ll start with some history. Practitioners of C, would remember the paradigm of separating function from data, they remember this; because they have no other choice. In C, your tools are functions and structs, so that’s really all you got. Along the way some clever individuals invented something called virtual function dispatching. This allows you to have different function implementations based on the different type of data. This is a little something we like to call ‘polymorphism’ in the Computing world, you’ve probably heard of it. In fact, polymorphism is a fundamental pillar of Object Oriented programming. It is also, a form of what is called ‘single dispatch’.  Whereby you make a single function call and it’s dispatched to the proper implementation based on the data type.

By this point, you’re probably saying “You said you’d talk C++, but now you’re talking about C.” Well it’s important because it illustrates my point, and what would we be without C? Some type of cave person punching holes in cards? Give your head a shake. Anyways, the C++ compiler builds a virtual function table, and ensures that the correct function is called based on the type, all with the nice syntax of C++ to write it. So this example will be C++ syntax, to show virtual single dispatch.

// create a virtual base class
class base
{
public:
    virtual ~base() = default;
    virtual void do_something()
    {
        // do something
    }
};

// override the base class
class derived : public base
{
public:
   virtual void do_something() override
   {
       // do something derived
   }
}; 

int main()
{
     base b;
     derived d;
     base &a = b;
     base &c = d;
     
     // call the correct function, from the base, to the derived
     // base do_something
     a.do_something();

     // derived do_somethng
     c.do_something();
}

So Single Dispatch, virtual function table, and data / function segregation.

Now, there’s a second type of Single Dispatch, but it doesn’t exist in C (the latest C has it apparently), it’s called ‘function overloading’. In C++,  it’s done at compile time. Basically, the compiler picks the right function based on the type passed in. You know.. this.

void foo(int i)
{
    // do foo for int
}

void foo(double d)
{
    // do foo for double
}

int main()
{
   double pi{3.1415};
   int i{0};
   foo(i);
   foo(pi);
}

As I mentioned, this form of dispatching is only at compile time. The compiler has to know the type when selecting the proper overload. So something like this, won’t work.

class ibase
{
public:
    virtual ~ibase() = default;
    virtual void do_something() const = 0;
};

class derived1 : public ibase
{
public:
   virtual void do_something() const override
   {
   };
};

class derived2 : public ibase
{
public:
   virtual void do_something() const override
   {
   }
};

void foo(const derived1 &d)
{
    d.do_something();
}

void foo(const derived2 &d)
{
   d.do_something();
}


int main()
{
   // This won't compile, it's for illustration purpose
   derived1 d1;
   derived2 d2;
   ibase &b1 = d1;
   ibase &b2 = d2;
   foo(b1);
   foo(b2);
}

Now, I know what you’re thinking — ‘There’s no reason why you would ever do this. Just use polymorphism to solve this.’ You’re right! For this example.

However, the point is, that if you combine the two examples, you get ‘Double Dispatch’.  Now, if C++ supported this, that would be grand. But it doesn’t, and the problem lies with what’s illustrated above. You can’t use a runtime type to do function overloading.

From this point forward, we’re going to use a image system as an example. In our image system, the image can be of different types, and we want to be able to apply some different action to the image; adjust colour, crop, scale, black & white, etc. If we sketch our example in our fairytale C++ language that supports runtime type function overloading. It would look like this.

struct rgb
{
public:
  unsigned char red, green, blue;
};

class iimage
{
public:
    virtual ~iimage()  = default;
    virtual int height() = 0;
    virtual int width() = 0;
    virtual rgb get_pixel_color(int x, int y) = 0;
    virtual void set_pixel_color(int x, int y, rgb color) = 0;
};

class bmp_image : iimage
{
  // any none shared functions  & data
  // implement shared functions
};

class jpg_image : iimage
{
  // any none shared functions & data
  // implement shared functions
};

class iaction
{
public:
    virtual ~iaction() = default;
    virtual void apply(bmp_image &image) = 0;
    virtual void apply(jpg_image &image) = 0;
};

class adjust_color_action : iaction
{
    virtual void apply(bmp_image &image) override
    {
       // some specific algorithm for bmp images
    }
    
    virtual void apply(jpg_image &image) override
    {
       // some specific algorithm for jpg images
    }
};

void apply_action(iimage &image, iaction &action)
{
    action.apply(image);
};

int main()
{
    iimage &image = generate_image();
    iaction &action = get_user_action();
    apply_action(image, action);
}

So as you can see, if this was to compile, we’ve done two things, we’ve separated our algorithms from our data; actions & images, and given ourselves the ability to apply different algorithms based on the different types of data. We’re using both types of single dispatch at once, a runtime polymorphic virtual function dispatch, when we call ‘apply’. Then a runtime function overload, with the argument passed in. That right there — is Double Dispatch.

Unfortunately for us. That runtime function overload, doesn’t exist in C++. So we need to somehow invert our call, so that the compiler can be sure of the type when it compiles.

Enter Visitor Pattern.

By inverting the dispatch of the action, onto the image base, we can cause the dynamic dispatch to the action, while knowing at compile time the type getting passed in. It looks like this.

struct rgb
{
public:
 unsigned char red, green, blue;
};

class iaction;

class iimage
{
public:
 virtual ~iimage() = default;
 virtual int height() = 0;
 virtual int width() = 0;
 virtual rgb get_pixel_color(int x, int y) = 0;
 virtual void set_pixel_color(int x, int y, rgb color) = 0;
 virtual void apply(iaction &action) = 0;
};

class bmp_image;
class jpg_image;

class iaction 
{ 
public: 
    virtual ~iaction() = default; 
    virtual void visit(bmp_image &image) = 0; 
    virtual void visit(jpg_image &image) = 0; 
};
class bmp_image : iimage
{
    // any none shared functions & data
    // implement shared functions
    virtual void apply(iaction &action) override
    {
         // the *this, tells the compiler, that when you
         // call apply on the bmp_image, it will call the
         // bmp_image overload of the iaction
         action.visit(*this);
    }
};

class jpg_image : iimage
{
    // any none shared functions & data
    // implement shared functions
    virtual void apply(iaction &action) override
    {
        action.visit(*this);
    }
};

class adjust_color_action : iaction
{
  virtual void visit(bmp_image &image) override
  {
    // some specific algorithm for bmp images
  }
 
  virtual void visit(jpg_image &image) override
  {
   // some specific algorithm for jpg images
  }
};

void apply_action(iimage &image, iaction &action)
{
    image.apply(action);
};

int main()
{
   iimage &image = generate_image();
   iaction &action = get_user_action();
   apply_action(image, action);
}

Okay — so as you can see, in this example we’re emulating Double Dispatch, by inverting our calls. Which uses first polymorphism to make the correct call, and then a compile time overloaded function. Voila! Double Dispatch, kind of. This pattern lends itself very nicely when you need to separate your function, from your data. It allows you to easily expand data function, without having to change code in the data. However, additional data becomes a bit of a pain, because you need to change all of your function classes to support the new data class.

Now, I mentioned earlier that there is an architecture that the Visitor Pattern becomes quite pervasive, to a point of being almost invasive. It’s the dreaded ‘Anemic Domain Model’ architecture / pattern! Gasp! What is the ‘Anemic Domain Model’? Well according to Wikipedia, it is “Anemic domain model is the use of a software domain model where the domain objects contain little or no business logic (validations, calculations, business rules etc.).”

[ See Post In Defense of The Anemic Data Model ]

If you’ve architected your system in this way. You might start to see where the visitor pattern starts to become pervasive. Just because you chose to decouple function from data, doesn’t mean you throw OO out the window completely. Right? The issue you face, is that if a component in your data model wants to become a hierarchy, how do you deal with it? You can’t add function to the data. Because that’s the point of your model. You need a pattern where you can apply a dynamic algorithm, to a dynamic piece of data.

Enter Visitor Pattern.

Because RESTful APIs lend themselves nicely to this pattern, let’s reimagine our image processing example as a REST api.

Let’s say we want to PUT to a transform, in our image collection. We want to be able to 1) have multiple transforms, and accept different kinds of images.

Our API is

PUT api/images/transform
{
   "action":"adjust_color",
   "image":"bin_data"
}

Now, imagine we had some kind of mechanism, that would dispatch us the deserialization of our JSON blob to

void on_put_images_tranform(iimage &image, iaction &action);

You could design things differently, where you didn’t need the visitor pattern. Although, I think this is a nice blend of OO, patterns, and separation of concerns. So it nicely illustrates my point.

HOWEVER! The visitor pattern isn’t cheap, you need multiple classes, you need multiple interfaces and it does obfuscate data away from function.

Enter “Pattern Matching”.

What is Pattern Matching, you’re asking? How does it relate to the Visitor Pattern? Pattern matching, in regards to programming languages, is for all intents and purposes a switch statement on steroids.

Your average switch looks like this:

enum states { stopped, running, error };

std::string to_string(states s)
{
    switch(s)
    {
    case states::stopped:
         return "stopped";
    case states::running:
         return "running";
    case states::error:
         return "error";
    default:
         return "unknown";
    } 
};

Imagine this guy, if you could apply that to types.

I need to switch languages here. C++ has a library, called Mach7 which is a pattern matching library, and it allows for functionality similar to what I’m going to describe. I just find that the C# syntax is cleaner and easier to read. As of C# 7 pattern matching, and the ability switch on object type, is a thing. The first time I saw it, I thought ‘Honestly, who would ever use that??’ Apparently, me.

I’m going to reimagine our example, in a hybrid language. Because I want to use the C# pattern matching syntax, but I want to stick with the original C++ example.

Instead of needing to perform the inversion, we can use a runtime detection of type to dispatch to the right call.

Like so

void apply_action(iimage &image, iaction &action)
{
    switch(image)
    { 
    case bmp_image bmp:
         action.apply(bmp);
    case jpeg_image jpg:
         action.apply(jpg);
    default:
       throw unsupported_object_exception();
    }
}

int main() 
{ 
    iimage &image = generate_image(); 
    iaction &action = get_user_action(); 
    apply_action(image, action); 
}

As you can see, we’re very close to the initial Double Dispatch described at the start of the post. We’ve come full circle!

To recap, Double Dispatch is a form of multiple dispatch, allowing us to dispatch polymorphic function on polymorphic data. The Visitor Pattern, is a pattern that allows us to emulate Double Dispatch in languages that have no support for it. The Visitor Pattern is especially good at allowing us to separate function from our data. It becomes very pervasive in a pattern where we keep our data and logic separate (an anti-pattern), but want to have data hierarchies. Pattern Matching is a language construct that allows us use a switch statement on steroids, to switch on object type. Giving us almost Double Dispatch.

Happy Coding!

“The ultimate dream in life is to do what you love, and learn something from it” – Jennifer Love Hewitt

References

Virtual Functions in C [https://www.embedded.com/electronics-blogs/programming-pointers/4391967/Virtual-functions-in-C]

Double Dispatch with an Inverted Visitor Pattern [http://www.drdobbs.com/double-dispatch-with-an-inverted-visitor/184403497]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: