Using C# from C++.

As people might know, I frequent Reddit and StackOverflow, but mostly Reddit. I do this obviously for investing advice, but secondly to share programming thoughts and ideas. I think giving back to the community is an important part of life, and of the things I have to offer, programming is my best.

There are a great deal of resources out there that outline how to call native libraries from C#, from P/Invoke to tutorials on interoperating using C++/CLI. I'd hazard a guess this is because C# sits in the top 8 (59.7%) of our most loved languages according to StackOverflow's annual developer survey in 2020. Much to my dismay, C++ falls somewhere in the basement, with a measly 43.4% of developers who actually enjoy using the language. So if you're a C# developer, and you're interested in calling some black-box C++ library written by your ancestors (or our modern day C++ superstars!), you've got plenty of resources. However, if you're a C++ dev looking to add a bit of .NET to your life, you might have a hard time finding resources on how to go about it. Here's my attempt to fill that void.

As a disclaimer, I'll need you to know that this should be a last resort. You should research whether there are other options, like COM or a different native option. When we talk about C++, we think zero-overhead abstractions, we think performance, we think lightweight. What we're doing isn't any of that. When you use this technique, you're pulling the CLR into your application. So this is a sledgehammer solution; use it as such.

Calling C# libraries from C++ is actually easier than you'd think. C++/CLI is mostly used to go the reverse direction, exposing your native libraries to your managed code. However, you can use it to expose managed code into C++. To be completely honest, I don't know how this all works under the hood. I'd love it if someone could share!

You can find the supplemental code for this here. To keep my wordiness to a minimum, I'll just describe the setup in this post.

First create a C++/CLI project in Visual Studio. Then add your native class and export it.

// native_to_managed.h
#pragma once

#include <string>
#include <memory>

#ifdef DLL_EXPORT
#define DLL_API __declspec(dllexport)
#else
#define DLL_API __declspec(dllimport)
#endif

class DLL_API native_to_managed
{
public:
    native_to_managed();
    ~native_to_managed();
    void write_to_console(const std::string& message);

private:
    class impl;
    std::unique_ptr<impl> impl_;
};

You'll notice here the use of the familiar pimpl pattern, or in other words a "compiler firewall". This is necessary because this .h will be included by our purely native project, and it needs to hide the fact that it holds handles into the CLR. You'll notice the minimal include dependencies; this is the purpose of the firewall. This interface will be purely C++. If you need to pass complex objects which are also managed, you'll need to do the same for every object you want to export. You then implement native_to_managed as per the pimpl pattern, and forward the calls into the impl.

// native_to_managed.cpp

#include "native_to_managed.h"
#include "native_to_managed.impl.h"

native_to_managed::native_to_managed()
    :impl_(std::make_unique<impl>())
{
}

// You need this in the compilation unit so it knows about the impl.
native_to_managed::~native_to_managed() = default;

void native_to_managed::write_to_console(const std::string &message)
{
    impl_->write_to_console(message);
}

You'll notice that we've pulled the definition of the destructor into the .cpp file. You need to do this because this .cpp is where the compiler gets an understanding of what the actual "impl" is, via the include of native_to_managed.impl.h. Otherwise, because we're using std::unique_ptr, we'll get an error saying something about it not knowing how to delete an incomplete type. I've done this with =default, which uses the compiler-generated default destructor. I read somewhere that this is better than ~native_to_managed(){} in terms of what the compiler actually generates, so since then I always opt for =default when I have default dtors. But most people don't know you can do this in the .cpp, not just in the .h.

Inside the impl is where the magic happens. (do people still say that?)

//native_to_managed.impl.h
#pragma once
#include "native_to_managed.h"

#include <gcroot.h>
#include <msclr/marshal_cppstd.h>

using namespace dotnet_interop_net_lib;

class native_to_managed::impl
{
public:
    // A default-constructed gcroot holds a null handle, so create the
    // managed object up front before we start forwarding calls to it.
    impl()
        : managed_(gcnew ManagedClass())
    {
    }

    void write_to_console(const std::string &message)
    {
        managed_->WriteToConsole(msclr::interop::marshal_as<System::String ^>(message));
    }

private:
    gcroot<ManagedClass ^> managed_;
};

Technically, you can split the declaration and implementation up into a .h/.cpp. I’m just lazy, and chose to do them both in the .h. Follow your heart, or your company coding standards, or what your boss says to do, not what she does.

The most important thing to notice is the inclusion of gcroot<T>, a class that allows you to hang on to a garbage collector reference (handle) from your native class. You can see I'm holding on to a ManagedClass^; in C++/CLI the hat represents a reference to a GC object. You can use the gcroot to dereference and call into your managed class.

You'll also notice that I've used msclr::interop::marshal_as<System::String ^> to translate from our unmanaged memory (std::string) to our managed memory (System.String), again note the hat. Depending on your situation, you'll need to understand your own marshalling requirements.

The last very important thing is that your standard Visual Studio project referencing won't work. If you were using C# and you referenced this project, the library would get copied to the project output. However, if you go the reverse direction, Visual Studio fails to set up the references. So you have to set up the additional library dependencies to point to the lib.

Edit: To clarify, this will be in the linker input of your native project.
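
From the native side, the usage then looks like ordinary C++. Here's a minimal sketch of what the consuming project's code might look like; the file name is an assumption, and it presumes you've pointed the linker at the import .lib as described above.

// main.cpp in the purely native project.
// Only native_to_managed.h is included; nothing here knows about the CLR.
#include "native_to_managed.h"

#include <string>

int main()
{
    // Constructing the wrapper runs managed code, which loads the CLR on demand.
    native_to_managed bridge;
    bridge.write_to_console(std::string("Hello from native C++!"));
    return 0;
}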

Given this short setup — you’ll be able to call into a C#/Managed DLL. Remember, this is a sledgehammer so use it accordingly.

As always, Happy Coding!

I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. Abraham Maslow

What I’ve learned from 2 years of Productivity Research and Application Development

Over the course of the last two years I've embarked on a journey developing a productivity (read: to-do list) application. When I started out, I dreamt of a productivity app that worked for me, because I, like many others I can only assume, hadn't gotten much value out of the many applications I had tried and failed to use in the past. I embarked on a journey to fulfil that dream by creating an application founded on organizational psychology principles proven to increase our ability to produce, and reduce the anxiety and stress related to our workloads. On January 1st, 2021 I went live in production with a Beta version of my application, Fuss.

The application is still in the Beta stage of its lifecycle, because I've found it challenging to find users, and get feedback from those users. So I've had to use myself and my fiancée as guinea pigs for the application. I also have 2 or 3 other people who occasionally use the application and give feedback. Because /r/productivity prohibits self promotion, I wanted to take a step back from promoting my application, and focus this blog post on the productivity tips and tricks that I've learned in the past 2 years of research and development on my application.

Let’s first look at three definitions of “productivity”. The first, the Oxford dictionary definition of productivity.

productivity

noun
the rate at which a worker, a company or a country produces goods, and the amount produced, compared with how much time, work and money is needed to produce them — Oxford Dictionary

Wikipedia defines it as

Productivity describes various measures of the efficiency of production. — Wikipedia

And the well known James Clear defines it as

“Productivity is a measure of efficiency of a person completing a task. ” — James Clear, “The Productivity Guide”

In all of these definitions, the key takeaway is efficiency. The more efficient we can be at doing something, the more we can get done, and the less we expend getting it done. This means we get more done with less. A fantastic target to set our sights on. So how do we go about achieving efficiency in our everyday lives? And why do we even want to do that?

Well, it might seem like a stupid question, but the answer for me is time. For a human being, time is a finite resource. It's one of the only things in life money can't buy, and you only get so much of it. I didn't realize until my mid-twenties that we only get so much time, so we must spend it wisely.

Our genetics and psychology directly impact our behaviour, and how we spend our time. Think about how you spend your time when you're riding a high from some praise at work, vs. when you've just had a relationship end. Think about how your best friend does. This means that our individual productivity is unique to each person, and can be affected by life situation, health, and even diet. Your lifestyle directly impacts how you spend your time, and what you get done with the time you spend.

The good news is that it's not static. As your life evolves and changes, so does your ability to be productive. If you're the type of person who feels like you have never-ending projects, and that tasks take way longer than they should, you can change it. You just have to start. Here are three huge takeaways I learned while I was developing Fuss.

  1. Consistency is King.
  2. Get it out of your head, and prioritize it.
  3. Give yourself room to breathe.

1. Consistency is King.

Regardless of what book you read, be it Good to Great, Atomic Habits, or The Self-Discipline Blueprint, they all talk about consistency. One of the biggest psychological issues that we face as humans is how our ego impacts our ability to be consistent. The thing that truly productive people understand is that doing something is an investment, and doing it consistently has a compounding effect. Where many individuals would try something, have it be sub-standard, and quit, the productive individual knows that the first time they do something is likely not going to be the best, but eventually it will be. This comes from an ability to silence the inner critic long enough to find the courage to try again. Some people genetically have a higher internal locus of control, meaning they can seemingly brew motivation from their insides. For some of us that doesn't come naturally. Two things we can do to cultivate consistency, and muzzle that inner critic, are

1. Get a reason why – whether this reason is that you want the pretty girl to notice you, or you want to prove it to your parent. The reason doesn’t really matter, other than that you need to connect with it. It can’t be because your parent or spouse told you to. You have to truly want it, and connect with that reason.

2. Celebrate the small wins. Have you ever noticed that in digital to-do lists, when you complete a task, it's gone forever? That feels great in the moment. The rush of dopamine you receive completing that task feels great. But it's fleeting. What happens when you haven't completed a task in a while, or you're stuck working on something for more than 10 minutes? How do you get that fix? You celebrate the small wins, by looking at what you've accomplished thus far. You've come 80% of the way; it's no time to rest on your laurels, but you need to at least pat yourself on the back for coming this far. It's why paper and pencil to-do lists can be so satisfying. You can easily see what you've already done.

The way you cultivate discipline and consistency doesn’t matter, all that matters is that you do, and you earn that 1% compounding interest daily.

2. Get it out of your head, and prioritize it.

Human beings are fallible. It's the beauty of being human, and it's what makes us, well, human. We make mistakes, and we forget things; we forgive and we love. That's what being human is about. This is also what allows us to be creative and structured. But being forgetful, as human as it is, impacts our time. When you forget to do something, you often have to rush to do it, or you have to accept that it didn't get done. It's an awful feeling realizing you've forgotten something. It's oftentimes paired with anxiety and regret, and usually impacts future situations. Your mind knows this, so it does things to try and combat forgetfulness. Even if you have an eidetic memory you will still forget things. Let's start by understanding memory and focus. Most humans only have the ability to truly focus on something for 10-20 minutes, and the ability to hold roughly 4-7 things in their mind at one time. Your short term memory only lasts for about 15-30 seconds. Knowing this, we can try to understand what our minds do to combat forgetting to do something. In order for us not to forget, our central processing unit (brain) needs to constantly keep revisiting those tasks. Think about it: when you haven't written something down, your mind constantly revisits it. "I have to remember to wash the sheets", "Bernoulli's principle of fluid dynamics states that an increase….", "Remember to wash the sheets". Now, this wouldn't be too bad if we always only had 4 things to do. But that isn't reality. In a day we might have hundreds of things to do. So your mind rapidly revisits them. David Allen of Getting Things Done refers to these as Open Loops. These loops cause us anxiety, induce insomnia, and overall are very stress inducing. I'm sure we've all experienced that little nagging voice telling us all the things we need to do tomorrow at 3 o'clock in the morning before an interview.

The beautiful thing about paper or computers is that they don't forget. But a computer, or a piece of paper, will never be creative or productive on its own. In fact, computers and paper are nothing unless you put a human in the mix (don't get me started about AI). We can use these tools in a symbiotic relationship that utilizes both strengths: the human's ability to be creative and structured, and the paper or computer's ability to remember. This is what a to-do list is. What it does is free us from the chaos of these open loops, and let us focus on what's important. The actual task at hand.

But we all know a piece of paper and pencil isn't a panacea to solve all of our productivity woes. If it was, there wouldn't be an infinite number of to-do list applications on the internet. So, what's missing from our good old fashioned paper and pencil to-do list?

In my opinion 2 things:

  1. Lack of an Effective Prioritization Framework – If you’re reading this post, you’ve been there. You’ve dumped everything down for today, and you start ordering the tasks based on their priority. What you prioritized today, might be different than tomorrow. Why? Because there’s often no rhyme or reason other than “this is important, my wife will be pissed if I don’t do this”. Another thing humans suck at — prioritization. Because we let our emotions override what we should be doing, for what we want to be doing.
    What you should be doing, for any given to-do list, is the closest, easiest, most valuable item on the list. What this means is that you work on the smallest most impactful task, that is due soon. What was the definition of Productivity? Exactly. The more small impactful things you get done, the more productive you are. It also helps to keep you consistent by solidifying tiny big wins as you go. Lastly, it pushes difficult meaningless tasks to the back, so that you aren’t busy being unproductive. This is called the Impact-Effort or Priority-Difficulty matrix. You can read more about it here, or in Getting Things Done by David Allen.
  2. A Tendency to Overwhelm – The other problem with the common paper and pencil, and a lot of digital to-do lists, is that they keep things too visible for us. Just like seeing small wins by seeing that tasks are complete has a positive impact on our psyche, seeing a massive, never ending, infinite list has a negative impact. Remember that we really can only focus on 4-7 things at once, so any more than that and we have to rapidly context switch. We get a sense of overwhelm and discouragement that makes us want to turn away from our trusty paper's memory. Because it's too much. So we burn that to-do list, and we turn back to our trusty fallible human brain, as ironic as that is. This is where pen and paper fall down, and where machines really shine. Because humans are emotional, and paper and pencil can't do anything without human intervention, it becomes very difficult to build an effective prioritization method. Without priority, we must see everything all at once.
    Where machines are brilliant is that they can remove emotion from decisions, and they do it with perfect accuracy. That's what computers are meant to do, and what makes them a perfect prioritization companion. A deterministic algorithm means that given the same input, the computer will give the exact same output, every time. That's exactly what we need in an algorithm for prioritizing.
    When we have an effectively prioritized list, it allows us to do something marvelous. It allows us to only see what we need to see right now, and be certain that it’s the most important items. When we only see 10 things, we’re now no longer intimidated, we’re invigorated, because we’re up to the challenge.

3. Give yourself room to breathe.

Congratulations! You’ve made it this far. You now are consistently tracking and nailing things you need to do. You’re working off a list that doesn’t overwhelm you, and you’re confident that you’re doing the right thing. You know in the back of your head that the project has taken some interesting turns, and occasionally tasks keep popping up that just aren’t in alignment with the project anymore. Or worse yet, things that used to be unimportant, all of a sudden have become important.
The final puzzle piece to being more productive and efficient is review. If you look at the popular software development frameworks like Agile or Extreme Programming, they all contain an aspect of review, typically in the form of a retrospective. The idea is that you work a 2 week sprint, and then you take a moment to review what you did, and how you did it. The same thing goes for your to-do list. Because the doing side of your life is just a collection of projects that need, well, doing. So we have to take some time to review our lists. The popular Bullet Journal technique has this built in with pen and paper. As you carry forward through pages, you bring items from old pages forward. You can use this to clarify, trim, or remove completely. It's the equivalent of tidying your workspace, so you have room to breathe, and think.
What does this look like with a digital to-do list? Well, it's the time to recharacterize your tasks. You'll notice I didn't say reprioritize your tasks, because the way a task is prioritized never changes, but the characteristics of the tasks may, and where they sit in the line might. Maybe that low impact thing you had on the list is now high impact. Maybe the deadline got moved up. Maybe the task is no longer important to the project. These are things we have to use our human brains to do, and this is how we declutter our minds. This review time is a perfect time to reflect on what you did well, what didn't work so well, and what tasks really matter. If you're not doing this as part of your to-do list management, I urge you to try. Every two weeks, give yourself permission to take 15-20 minutes to review your lists of things to do. Recharacterize the tasks, sweep up old tasks that no longer make sense, and clarify the others. I would be willing to bet you find this time both satisfying and refreshing, while it opens and clarifies your thoughts and mind.

I want to thank you for reading, and hope that these 3 techniques can help you on your journey of becoming the most productive you yet. It has been wonderful sharing my experiences with you, and I hope that it sparks some conversations, or thoughts in you all. I hope that if you apply these techniques you’ll see marked improvement in both productivity and mental clarity.

If this blog post struck a chord, I’ll invite you to try out Fuss, and see how these techniques are built directly into the tool. If you’re interested in helping make Fuss better, I would love that. Right now, I have a freemium model to help cover some of the hosting costs. However, if you’d like to use the tool for free in exchange for feedback, I would love that too. You can email me here.

As always, stay well, and Happy Coding!

PL

Improved productivity means less human sweat, not more. — Henry Ford

GoF – Abstract Factory Pattern

What a wild summer it has been. With everything going on in the world, it's important to take a moment to reflect on what's going well, right now. With non-essential travel on pause, I haven't needed to travel for work, which means I've had more time to enjoy the summer with family, and more time to code and dream up blog posts. I think COVID has also had some positive impacts on society; I've definitely noticed a focus on hygiene in public these days, and that is wonderful.

In my reflection, I was recalling earlier this year when I read the famous Gang of Four book "Design Patterns: Elements of Reusable Object-Oriented Software" – Gamma, Helm, Johnson, Vlissides. If you've not heard of this book, it's basically considered the bible when it comes to design patterns in software. This was the first time I had read the book. Yes, I've developed software professionally for this long, and had never read this book. It was about time. I could basically feel the snickers and sneers as I walked the aisles of my local grocery store. "You've not read Design Patterns by GoF?" "What kind of software developer are you?" Now, in reality this was just my chronic case of impostor syndrome creeping up my spine. Even though I had read countless articles on design patterns, and dove Head First into a myriad of them when I read "Head First Design Patterns", it still didn't measure up; I hadn't read the real thing. So in January, I did it. I bought the e-book, and I read it. I wouldn't say it was a cover to cover read. Like I said, it wasn't my first pattern rodeo. A lot of the patterns I had learned about, seen, or even implemented before. So it was more of a refresher than anything else. The book is well written, though a bit dated. What I found nice was that the examples were in C++. Anyways, this isn't a review of the book. It was just something that came and went, back in January. Then during the COVID pandemic, I've been scratching my head for blog post ideas. I wish I possessed the type_traits that some of these other bloggers possess, to seemingly pull insightful ideas out of their brains and write delicate and succinct posts packed with code and useful information. I felt that I should go back to basics, and do a post or two on the cult classic GoF Design Patterns book, update it with modern C++, and put my own Peter spin on the patterns.

“You’re losing our attention; we don’t want to hear about your life. Get on with the post.” — Reddit

Call it what you want to call it. Kismet, fate, alphabetical sorting, the first design pattern in the book. I landed on the Abstract Factory pattern. If you’re unfamiliar with the pattern the definition is to “provide an interface for creating families of related objects without specifying their concrete classes”. What this means, is that the Abstract Factory, is an interface to an object that will create a bunch of objects, and hand them off as interfaces. I’ve taken the liberty to copy the UML from the book.

As you can see from the UML, the Abstract Factory is used by clients to put an opaque layer, or interface, between the actual implementation and the caller. As we all know, this is done through a programming paradigm called polymorphism. The factory is abstract to our client, as well as the objects that it is creating. So the abstract factory creates a family of abstract objects that the client can work with. The canonical example of this would be a windowing system, whereby you can have different display systems that all expose the same family of objects. Consider Linux and Windows windowing systems: they all have windows, button controls, scrollbars, etc. So when working with a windowing system, it could make sense to apply the Abstract Factory Pattern.

I'll use the original GoF example as my starting point. Though, I'm going to apply modern C++ to it, so it won't be line for line the same as the example in the book. As well, I won't detail the map components Maze, Wall, Room, Door, as I don't feel they're necessary to illustrate the point.

class AbstractMazeFactory
{
public:
    virtual ~AbstractMazeFactory() = default;
    virtual std::unique_ptr<Maze> MakeMaze() = 0;
    virtual std::unique_ptr<Wall> MakeWall() = 0;
    virtual std::unique_ptr<Room> MakeRoom(int room) = 0;
    virtual std::unique_ptr<Door> MakeDoor(Room &r1, Room &r2) = 0;
};

std::unique_ptr<Maze> CreateMaze(AbstractMazeFactory &factory)
{
    auto maze = factory.MakeMaze();
    auto room1 = factory.MakeRoom(1);
    auto room2 = factory.MakeRoom(2);
    // The one door is shared by both rooms, which doesn't fit unique ownership,
    // so promote the factory's unique_ptr to a shared_ptr (SetSide is assumed
    // to accept it; Room and Door aren't detailed here).
    std::shared_ptr<Door> door = factory.MakeDoor(*room1, *room2);

    // factory is a reference, so members are called with '.' rather than '->'.
    room1->SetSide(North, factory.MakeWall());
    room1->SetSide(East, door);
    room1->SetSide(South, factory.MakeWall());
    room1->SetSide(West, factory.MakeWall());

    maze->AddRoom(std::move(room1));

    room2->SetSide(North, factory.MakeWall());
    room2->SetSide(East, factory.MakeWall());
    room2->SetSide(South, factory.MakeWall());
    room2->SetSide(West, door);

    maze->AddRoom(std::move(room2));

    return maze;
}

We've separated the logic of creating the maze from the construction of the maze components, as well as their actual implementation. To take advantage, you just need to implement concrete components, and derive from AbstractMazeFactory to return those components.

class EnchantedMaze : public Maze { ... };
class EnchantedWall : public Wall { ... };
class EnchantedDoor : public Door { ... };
class EnchantedRoom : public Room { ... };

class MazeFactory : public AbstractMazeFactory { ... };

class EnchantedMazeFactory : public MazeFactory
{
public:

    virtual std::unique_ptr<Room> MakeRoom(int room) override
    {
        return std::make_unique<EnchantedRoom>(room, CastSpell());
    }

    virtual std::unique_ptr<Door> MakeDoor(Room &r1, Room &r2) override
    {
        return std::make_unique<DoorNeedingSpell>(r1, r2);
    }

private:
    Spell* CastSpell();
};

This decoupling enables you to change the implementation of your components without having to disrupt the creation algorithm. If you're worried about compile times, when you modify your components it's not necessary to recompile CreateMaze.
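
As a quick usage sketch (assuming the concrete classes above are fleshed out), the client code stays the same no matter which factory it's handed:

int main()
{
    // The same CreateMaze algorithm, two different product families.
    MazeFactory plain_factory;
    auto plain_maze = CreateMaze(plain_factory);

    EnchantedMazeFactory enchanted_factory;
    auto enchanted_maze = CreateMaze(enchanted_factory);
}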

The book outlines the following benefits and liabilities:

  1. It isolates concrete classes. The pattern funnels creation through a single interface, which ensures a single point at which your application must create objects. Because the application must interact with both the factory and its components via their interfaces, the details of their implementation remain opaque to the outside algorithms.
  2. It makes exchanging product families easy. Because a client interacts through a single interface, changing the product family is as easy as an assignment to a new factory. However, each factory must supply the entire product family as a whole.
  3. It promotes consistency among products. The factory dictates creation of a group of related items, which will likely be codependent. Since it forces a single source for creation, it makes enforcing this consistency easy.
  4. Supporting new products is difficult. Although changing the factory is easy, creating a new product family is often not. It means defining and implementing a new factory, as well as the entire product set.

You can start to smell when this pattern is helpful: if you're starting to invent creative ways to change the implementation of sets of components within your system, or if you're starting to mix the structure of your application with component implementation details.

The problem with this GoF pattern is that it results in a lot of boilerplate code. If we look at the AFP (Abstract Factory Pattern), you have to create the factory interface, and then for each family of products you have to implement the concrete classes. This type of repetition tends to get worked around in clever fashions, which can often become unmaintainable nightmares.

My intention with this post (series of posts?) was to make a library that makes using the GoF patterns accessible, and reduces the amount of boilerplate needed, as well as modernizing the solutions and being creative where I could. It's hard to top something that's stood the test of time like this. I also wanted to make the library easy to use, hard to use wrong, and to illustrate the use of the intended pattern.

When we boil down the Abstract Factory, we have two major components. We have the factory itself, and the family of components. The factory is a set of methods that create and return each component object, which means that for N components in a family, you'll have N factory functions in the factory. My first thought was to remove that boilerplate cost. But how? Well, if we ask ourselves what each factory function is, it's specifying the type to create at compile time. In C++ we have a neat tool for generalizing functions based on type at compile time. It's called template programming. So we can start there. Let's look at a simple widget factory example. How would we want to use it?

void create_screen(abstract_widget_factory &factory)
{
    auto w = factory.create<window>();
    auto b = factory.create<button>();

    b->set_text("Hello World");
    w->add_widget(std::move(b));   // hand ownership of the button to the window

    w->show();
}

My initial thought was to do something with template specialization, but the wall that I kept running into is that the Abstract Factory pattern makes heavy use of runtime polymorphism, while the template system is compile-time. What does this mean? In the create_screen example above, the actual concrete type of the factory doesn't need to be known past the fact that it is an abstract_widget_factory, and through the miracle of virtual function dispatching, the call will land in the appropriate function for the concrete class when it is dispatched. Template functions are similar, in that they allow you to apply a function to an interface (read: concept), but the actual type must be specified at compile time. The reason, to vastly simplify it, is that the compiler implements the template function for the given type, instead of you having to write it yourself. Because of this, the template system typically comes in handy when you're trying to replace boilerplate. In order to implement the function above, you would need to define the abstract_widget_factory along the lines of

class abstract_widget_factory
{
public:
   virtual ~abstract_widget_factory() = default;

   template <typename T>
   virtual std::unique_ptr<T> create() = 0;
};

This isn’t valid C++, and it makes sense as to why. Because in order to have the pure virtual function (=0), the inherited class must implement the function. But the function is a template, it can be any number of different instantiations of the function. Therefore, what does the concrete class actually implement? Thus, this is no-bueno.

Given that we know the set of objects we're creating, I thought I might be able to solve this with a concept. However, I came to the conclusion that this would just push the boilerplate I was trying to avoid in the runtime interface into the concept. I'll admit my level of comfort with concepts is quite low, and I only conceptualized the idea; I didn't actually try to implement it. If you have an idea of how to tackle this with concepts, I am all ears!

So where does this leave us? How can we combine runtime decisions, with compile-time decisions? What does that even mean, or look like? First, what are the runtime decisions we’re trying to make, and what are the compile time decisions? The compile time decision, is the choice of the abstract component we want to use, in this case Button, or Window. In the GoF example, Maze, Wall, Room, etc… The runtime choice is the actual factory class, which by nature of the pattern, should dictate the implementations of the components.

In my research, I found a Dr. Dobb's (RIP) article from 2001, by Herb Sutter and Jim Hyslop. The article is written in story form, which makes it a pretty fun read. The best part: the story was actually exactly the problem I was trying to solve. I was still trying to solve the problem on my own, so I avoided reading the article in depth until after I had a working model. Though, I did use it to gut check that a registry for creation was a track I wanted to go down. So I landed on this.

class abstract_factory
{
public:
     template <typename BaseT>
     std::unique_ptr<BaseT> create()
     {
          // semi-pseudo code
          auto creator = creators_[BaseT::type_id()];
          // something like creator->create<...>();
          // return it.
     }

protected:
    template <typename BaseT, typename DerivedT>
    void map()
    {
        // creators_[BaseT::type_id()] = I need something here;
    }

private:
    std::map<const char*, creator?> creators_;

};

Then, a client could use the factory something like this.

class linux_widget_factory : public abstract_factory
{
public:
     linux_widget_factory()
     {
          map<button, linux_button>();
          map<window, linux_window>();
     }
};

void create_screen(abstract_factory &factory)
{
    auto w = factory.create<window>();
    auto b = factory.create<button>();

    b->set_text("Hello World");
    w->add_widget(std::move(b));

    w->show();
}

int main()
{
    linux_widget_factory f;
    create_screen(f);
}

Future Peter here: one of the problems the Dr. Dobb's article suffered from was that the singleton factory was on a per-hierarchy basis, in that you could only create one base type with multiple derived types. As well, being a singleton doesn't allow for different factories for different families of components.

This implementation looks pretty good, and it works like the GoF Abstract Factory. You're able to hide the implementation of a family of objects behind one class, which forces the client to go through its interface. It's just: how do we make this work? What we're really trying to do is package up the creation into a function that can be called. When we package it up, we want to hide away any type specific information, because this will allow for type generalization, and then when we call the creation we want to, as if by a miracle, re-hydrate the type for the object we're creating. There's a technique for what we're trying to do. You might know what I'm talking about from some of my previous articles, or you're just familiar with it. It's called type erasure. Like the Sutter / Hyslop factory, we'll make our factory creation function work on a base type. The most base type: void. Then through the miracle of templates, we'll reassign the type when the client calls create.

If you’re unfamiliar with type erasure, I’ll give a quick overview, and leave a couple of references to articles I’ve read in the past at the bottom.

Basically, any kind of runtime polymorphism is a form of type erasure, you’re hiding away the concrete type to work with a more generic type. Typically when we refer to type erasure in C++, we mean hiding all the type information away. Since we want to work with all types, we’ll have to remove any type information.

Disclaimer: Don’t try what you’re about to see at work. Using type erasure in this way removes the type safety provided by the compiler.

// We want to hide the creation behind some interface.
class creator_base
{
public:
    virtual ~creator_base() = default;
    virtual void *create() = 0;
};

Now, we've turned creation of our objects upside down, and hidden it behind a very generic interface. When we call create, we'll get back a void pointer; that's all, a void pointer. Unbeknownst to the compiler it's actually our concrete object; we've just effectively turned off type safety for this call.

The next part of type erasure is to embed the actual type behind the interface. In our case, it looks something like this.

template <typename T>
class creator : public creator_base
{
public:
   virtual void* create() override final
   {
        auto t = std::make_unique<T>();
        return t.release();
   }
};

Easy — so now creator<linux_button> will create a linux_button, and return the opaque void handle to it. Now all that’s left when we’re creating is to flip the object over to the base type we want.

#include <map>
#include <memory>
#include <string>

class abstract_factory
{
public:
     template <typename BaseT>
     std::unique_ptr<BaseT> create()
     {
          auto &creator = creators_[BaseT::type_id()];
          void *obj = creator->create();
          // Re-hydrate the opaque pointer into the base type the caller asked for.
          // This assumes the BaseT subobject sits at the same address as the
          // DerivedT object (single inheritance), per the disclaimer above.
          return std::unique_ptr<BaseT>(static_cast<BaseT *>(obj));
     }

protected:
    template <typename BaseT, typename DerivedT>
    void map()
    {
        creators_[BaseT::type_id()] = std::make_unique<creator<DerivedT>>();
    }

private:
    // Keyed by std::string so lookups compare characters, not pointer identity.
    std::map<std::string, std::unique_ptr<creator_base>> creators_;
};

The last piece of this puzzle is the BaseT::type_id(). What I did in my first iteration was to just stamp this onto my base product classes, like this. (This is changed to a different pattern in the actual implementation.)

class button
{
public:
   static constexpr const char* type_id() { return "button"; }
public:
   virtual ~button() = default;
   virtual void click() =0; 
   virtual void set_text(const std::string &text) =0;
};
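
For completeness, a concrete product is just an ordinary derived class; linux_button here is a hypothetical implementation, and registering it is the single map<button, linux_button>() line from the factory above.

#include <iostream>
#include <string>

// Hypothetical concrete product implementing the button interface above.
class linux_button : public button
{
public:
    void click() override { std::cout << "clicked: " << text_ << "\n"; }
    void set_text(const std::string &text) override { text_ = text; }

private:
    std::string text_;
};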

My original goal was to reduce the boilerplate of the original GoF pattern, which I did. The major boilerplate occurs in the factory: each product create call would be at minimum 4 lines, which has been reduced to a single line registration function. I also think this has one up on the Dr. Dobb's implementation, in that we're using the compiler a bit more heavily, and we don't limit the product classes to a single hierarchy. You can find the eeyore library here.

All in all this was a fun exercise, and I look forward to doing it again. Until then stay healthy, and happy coding!

PL

I know quite certainly that I myself have no special talent; curiosity, obsession and dogged endurance, combined with self-criticism, have brought me to my ideas.
– Albert Einstein

YEG C++ Lightning Talks

Join us virtually for our YEG C++ lightning talks! Tuesday, Sep 15, 2020, 6:30 PM, online event.

Have you ever spent 15 minutes diving into the semantics of const correctness at a round table, only to have your Grandmother and Aunt look at you like you're out of your mind? Maybe you dream of abstract syntax trees, while your eyes glaze over as your friends dive into their COVID wedding plans. If any of this sounds like you, join us!

References:

What is Type Erasure – Arthur O’Dwyer
Conversations: Abstract Factory, Template Style – Herb Sutter & Jim Hyslop
Understanding Abstract Factory Pattern – Harshita Sahai

Leak Free, by Design

C++ is a brilliant language, one that has had my heart for many years. Though with the rise in web applications, more developers are finding themselves working on full stack web applications, which almost never utilize C++. This can be seen in the 2020 StackOverflow developer survey, where 69.7% of professional developers are utilizing JavaScript and 62.4% HTML / CSS, compared to C++'s measly 20.5%. This saddens me, because even though I too dabble in those languages, I would hazard a guess that a large majority of those professional developers have never even seen C++, let alone worked with it. It's the unfortunate side of demand side economics. Unfortunately, I suspect that the closest a vast majority of JavaScript developers have ever gotten to a memory leak is a post on /r/ProgrammerHumor. And the closest thing to a crash is a big red blob in the console. However, there's good reason why it's such an accessible language, and why people can throw together applications so rapidly. JavaScript is a scripting language; it was thrown together in '95 to be exactly that, and it is still roughly that. Even with 2.5 decades to evolve, it's still a web scripting language. You don't have to understand much past basic logic to put together an application, and with the wealth of libraries you don't need much to really make something cool. Now, contrast this with a language like C++, where without a bit of grit and determination, it's hard to get off the ground. Even once you do, you're working with a black console full of green text. Gone are the days where you could whip together a cool console app that asked your friends questions and spit out some silly answers, or even let them play a game. If it's not Fortnite, it's not cool, and if it's not on my phone I don't want to see it. It's no wonder why those StackOverflow numbers are the way they are. The sad thing is that many people don't understand that languages like JavaScript are built on the shoulders of giants, and as flashy as JavaScript web apps are, you don't build operating systems, real time mission critical systems, and real console or PC games in languages like JavaScript. As much as they want it to, it just can't happen. The problem is that people are afraid of languages like C++. It's scary to offer your 19 year old intern the keys to the Ferrari. If he doesn't understand what he's getting himself into, it's relatively easy to drive that thing into a wall. It's much easier to let him ride the JavaScript moped around the block. This is the problem that modern C++ developers need to address. In my opinion, with the recency of the modernization of C++ and the Microsoft open-source movement, there has been a resurgence in C++. But we need to fight for this, we need to educate, and we need to push the features of the language that make it safe, by design.

About eight weeks ago, I was contacted by a gentleman by the name of Artem Razin. He was contacting me because, in the relatively small C++ community, he was looking to get word out about his product Deleaker. He wanted me to write a review about the product. Truth be told, I'm not writing this blog to give product reviews, so I originally declined. I just want to work on my writing skills, and share some of my knowledge. However, I also want to be an active member of the community, and here someone was reaching out to test their product, and help them share it with the world. That being said, I downloaded and tested out the application. Needless to say, the application does what it says. It integrates into Visual Studio, it detects leaks of all kinds, and it does this with little to no impact on your application. The problem is, I don't have a real C++ application to try it on. Most of my hobby C++ is libraries and other fun toys, so I can't say how it runs in the real world. But my curiosity struck me more than anything for how this thing worked. So, Artem and I went back and forth as I questioned him about how the application did what it did, and how it got its start. I started putting together how I could craft a blog post about it, since my blog is about understanding, education, and uncovering mysteries. What could I write? Ahhh ha. I had it, my post would be about lifting the covers on an application like Deleaker. How would we go about detecting leaks in applications? When I started my research, it was pretty straightforward; I just had to figure out how to hook an application and detect memory allocations. On Windows, it's relatively straightforward. Though, as I worked on it, I thought to myself about JavaScript, and other languages where this wasn't necessary. I imagined a world where we didn't have to worry about this in C++. A world where leaks in C++ were a thing of the past. Then I realized that that is the theme of this post. Creating leak free applications, by design.

What’s wrong with this code?

void foo()
{
     faucet *moen = new faucet();
     moen->run();
}

I’ll give you a hint, it’s a memory leak, a rather obvious one of course. The problem is that leaks don’t always look like this. They can hide in plain sight.

void foo()
{
    faucet *moen = new faucet();
    moen->run();
    delete moen;
}

No leak, right? Wrong. How can that be, we've got a paired delete for our new? The problem is that since exceptions are the default mechanism for handling errors in C++, we have to be wary of them. This means that the run() method could potentially throw an exception. If it does, and it's caught somewhere higher up, we'll never call the delete, and we'll have the same outcome as the method above. A little leak.

If you're a veteran, you already know this, and you know the answer: use a smart pointer. For those of you who aren't veterans, and you don't know: a smart pointer is the encapsulation of a very powerful pattern in C++ called RAII, Resource Acquisition Is Initialization. What this means is that resources should be acquired in initialization, and subsequently let go in destruction. In this case, heap-allocated in construction, and deleted in destruction. This way, the code becomes memory safe just by using the "smart pointer". Smart pointers are used to represent ownership, and ownership is a very powerful concept in C++.

#include <memory>
void foo()
{
    auto moen = std::make_unique<faucet>();
    moen->run();
}

You'll notice that the code is very similar to our first try, except now it won't leak. I've included the <memory> header; that's where you'll find these tools. You'll also notice the removal of the * and type, for the keyword auto. For our JavaScript and C# readers, that's similar to the var keyword. The reason I've used it is that the type is no longer a faucet, nor is it a pointer. When we use make_unique, we're actually creating a wrapper object called a unique_ptr, which handles the allocation and deallocation of our faucet type through the miracle of templates, and it acts just like a pointer to a faucet. The type of moen, however, is now std::unique_ptr<faucet>.

Here we are at 1200 words, and the moral of the story is this: use smart pointers, and you'll never ever ever have a leak again in your life. Sorry Artem. 🙂 If only life was that simple. The problem with real code, and not contrived simplified blog code, is that it's never that cut and dry. The reason why JavaScript is easier is because you don't have to think about design. You can just code a tangled rats-nest of JavaScript, and it can still be a functional working application. Needless to say you can shoehorn your C++ application into the category of "functional" and "working" with enough trial and error as well. Obviously the answer to Leak Free code isn't just "use smart pointers".

faucet * make_faucet()
{
    faucet *moen = new faucet();
    moen->run();
    return moen;
}

In this example, we want to start our faucet, and pass it back to the caller. Unfortunately, unless the caller knows they need to call delete, we could have a leak. As well, we’ve also opened ourselves up for the potential of an exception causing the leak. Use smart pointers.

#include <memory>
std::unique_ptr<faucet> make_faucet()
{
    auto moen = std::make_unique<faucet>();
    moen->run();
    return moen;   // ownership transfers to the caller
}

Great. This works the same. Smart pointers are heating up!

#include <memory>
std::unique_ptr<faucet> make_faucet()
{
    auto moen = std::make_unique<faucet>();
    moen->run();
    return moen;
}

void turn_on_hot(std::unique_ptr<faucet> f)
{
    f->turn(dir::right, 0.75);
}

int main()
{
    auto f = make_faucet();
    turn_on_hot(f);
}

Oops! error C2280: 'std::unique_ptr<faucet,std::default_delete<_Ty>>::unique_ptr(const std::unique_ptr<_Ty,std::default_delete<_Ty>> &)': attempting to reference a deleted function. If you've programmed with modern C++ for any length of time using smart pointers, you know what this is. This is definitely scary to a beginner. Have you ever programmed in JavaScript and gotten a compilation error that told you you were referencing a deleted function? Oh wait. Nevermind. Regardless, for a beginner this error is difficult to reason about. The simple answer is that you can't copy a unique_ptr; the author has deleted this function. For good reason, if you understand it, but if you're coming from a world where you just pass everything to everything, and you don't even know what this is, it's scary af, as the kids say.
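
For what it's worth, here's a sketch of two common ways out of that error; install is a hypothetical function used only to show the ownership transfer.

// Option 1: the function only uses the faucet, so take a reference to it.
void turn_on_hot(faucet &f)
{
    f.turn(dir::right, 0.75);
}

// Option 2 (hypothetical install): the function takes over the faucet,
// so accept the unique_ptr by value and have the caller std::move it in.
void install(std::unique_ptr<faucet> f);

int main()
{
    auto f = make_faucet();
    turn_on_hot(*f);        // f still owns the faucet afterwards
    install(std::move(f));  // ownership explicitly handed over; f is now empty
}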

The problem is that smart pointers alone are not the key to writing safe, leak free applications. Just because I have a garage full of tools doesn't make me a mechanic. So where does this leave us? One of the most powerful features of C++, that many other languages don't have, is well-defined object lifetime. In C++, we can know exactly when objects come to life, and exactly when they're destroyed. In my opinion, this is one of the most powerful language features, especially when you're trying to write safe, leak-free code. In order to do this, we must both understand and design with this in mind.

What is Object Lifetime when it comes to C++? According to cppreference.com's definition of lifetime, it is "a runtime property: for any object or reference, there is a point of execution of a program when its lifetime begins, and there is a moment when it ends." For an object, its life begins when the storage for it is obtained and any initialization has been completed. The object's life ends when the destructor call starts, or the storage has been released. I'll leave understanding the other caveats, and the lifetime of references, as an exercise to the reader. Understanding this, we can start to plan our programs so that the lifetime of our objects is a well-known principle of our application. This becomes important, because it will impact the type of smart pointers you use, and how you go about passing those pointers around. The choices you make for lifetime will also impact how you model ownership in your application, but we'll get to that.

Let's start with our lifetime model. How long does an object live in our application? The simplest approach is often the best. The simplest approach here is to break object lifetime up into either short-lived or long-lived. In "Writing High Performance .NET Code", author Ben Watson suggests that to beat the garbage collector, you should make objects either really long lived, i.e. the lifetime of the application, or really short lived, i.e. function scope. This evades the most costly part of garbage collection, where the object is relocated. I feel that the same mentality holds true in C++, not for the cost of the garbage collection, but for the mental cost it incurs to reason about whether an object should still be there or not. If we know an object has a long lived lifetime, then we know it should be safe to pass this object around by reference without worry of it dangling. However, if we know it's a short-lived object, then we have to think twice before we pass references to it. This model also applies relatively to objects that contain children. So long as a parent object is the sole owner of the child, then it is safe to assume the parent will outlive the child. It makes code like this valid.

#include <vector>

class parent;
class child
{
public:
    child(parent &p) : life_(10), parent_(p) {}
    void feed();
private:
    int life_;
    parent &parent_;
};

class parent
{
public:
    parent(int life) : life_(life) {}
    void give_birth()
    {
          children_.emplace_back(*this);
    }
    void feed_children()
    {
         for(auto &child : children_)
             child.feed();
    }
    void subtract_life(int val)
    {
        life_-=val;
    }
private:
    int life_;
    std::vector<child> children_;
};

void child::feed() 
{
   life_ += 10;
   parent_.subtract_life(10);
}

int main()
{
    parent p(100);
    p.give_birth();
    p.feed_children();
}

In this example, we know that the parent owns all the little vampiric children, so they can be sure that it’s safe to suck the life from their parent.

It's important to remember the distinction between stack and heap allocated objects. A stack allocated object, like p above, lives from the point it's defined to the closing brace of the scope it resides within. A heap allocated object lives from the call to new until the call to delete. You can have long and short lived stack objects, and long and short lived heap objects. Though typically, you'll want short lived objects to live on the stack, and sometimes you'll want long lived objects to live on the heap. However, it can also make sense to have long lived objects that are on the stack, allocated at the beginning of your program and alive until it completes.
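
As a tiny illustration of those scopes, reusing the faucet class from the examples above:

#include <memory>

void lifetimes()
{
    faucet counter_top;                               // stack object: alive from here...
    {
        auto under_sink = std::make_unique<faucet>(); // heap object owned by a stack unique_ptr
        under_sink->run();
    }                                                 // ...under_sink (and its faucet) destroyed here
    counter_top.run();
}                                                     // ...counter_top destroyed here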

If we understand object lifetime and ownership model, it can start to paint a picture in our mind about how we can use RAII and object passing semantics to ensure that our applications are leak free. In fact, we can design our applications in a way that we know there isn’t a leak. We can do this with 3 basic steps.

  1. Subscribe to RAII, ensure any sandwich code is properly encapsulated in an RAII wrapper, which acquires the resource in the constructor, and releases the resource in the destructor. Use smart pointers for heap allocation of objects.
  2. Define a well known ownership and lifetime model. Define long and short lived objects, and have the longer lived objects own the shorter lived objects, where possible.
  3. Based on ownership and lifetime, choose the appropriate argument passing semantics.

The first two are relatively straightforward: we want to prefer stack allocation where possible, use smart pointers when we need to heap allocate, and use an RAII wrapper when we need to acquire and release a resource. Then we want a well-defined ownership and lifetime model, meaning that we know which objects are long-lived and which are short, and who owns what. Then, we let these two models dictate how we handle object interaction within our program. Point 1 is worth a quick illustration for resources that aren't plain memory, before we move on to argument passing.
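
Here's a minimal sketch of an RAII wrapper around the make_handle / free_handle pair used later in this post; the names are placeholders for whatever acquire / release API you're wrapping, and handle_guard itself is just an illustration.

int make_handle();       // placeholder acquire call
void free_handle(int);   // placeholder release call

class handle_guard
{
public:
    handle_guard() : handle_(make_handle()) {}   // acquire in the constructor
    ~handle_guard() { free_handle(handle_); }    // release in the destructor

    // Owning exactly one handle means copying makes no sense; forbid it.
    handle_guard(const handle_guard &) = delete;
    handle_guard &operator=(const handle_guard &) = delete;

    int get() const { return handle_; }

private:
    int handle_;
};

void use_resource()
{
    handle_guard guard;             // acquired here
    // ... use guard.get(), even if this code throws ...
}                                   // released here, no matter how we leave the scope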

void do_something(std::string s);
void do_something_else(const std::string &s);
void do_something_to(std::string &s);
void do_something_again(std::string *s);
void do_something_pass_ownership(std::string &&s);

If you're coming from a language like C# or JavaScript, these function declarations might look strange. Let's be honest, to a rookie C++ developer the difference is subtle, and more often than not the first pick is the default, just because in other languages you don't have to decorate your type when you're passing a variable. In C++ though, those signatures all behave differently, and all have different meanings, and when you're authoring an application they serve as self-documentation, allowing clients of your class to understand the function's effect on the ownership and lifetime model of your application.

void do_something(std::string s);

The first signature says to the caller "you'll have to pass me a copy of a valid object"; this implicitly means that a) the object is valid, and b) the callee can muck with it because it's a copy, and it has no effect on the original. Unfortunately, this is the go-to object passing method for beginner C++ developers, because it looks just like every other programming language's method to pass arguments. What's often misunderstood is the cost associated with this operation. For small, plain-old-data type objects, it's often faster to pass a copy rather than a reference, so it makes sense. But when we're talking about full objects, larger than a couple of bytes, things start to change. With something like a string or vector, you're paying to copy all the objects contained within. This means a heap allocation and a copy. So you want to save passing by value for small objects, and for when it's intentional, such as the case where you want to guarantee you're receiving an object, you want to muck with it, but you don't want to mess up the original.

void do_something_else(const std::string &s);

Next up, pass by const-ref. This should be the default argument passing semantic you reach for. This tells your caller "you have to pass me a valid object, and I won't modify it". If you start here as the base, you can change the signature based on your function's needs. If your need is that you just want to read the object being passed in, then this is great. The only thing asked of the caller is that the reference is to a valid object. This is where defining your ownership & lifetime model will allow you to ensure that what you're passing will in fact be referencing a valid object.

void do_something_to(std::string &s);

You want a valid object, and you want to mutate it. This is the signature for you. That's exactly what you're telling the caller: "pass me a valid object, and I'm going to mutate it". Just like with const-ref, passing by reference says you want a guarantee of an object, but removing the const lets you muck with the object they've passed in. Historically, this has been used for out-params, where you basically fill the object being passed in. However, this is modern C++: if you need to get an object out, just return it.

void do_something_again(std::string *s);

Now, the nice thing about passing by reference is that you've got a guarantee you're getting an object, whereas in some other languages you don't have a way to guarantee that, so you've got to do null checks for defensive programming. Well, in C++ we can explicitly ask for that type of parameter passing: pass by pointer (or pointer to const). This tells your caller something a little different than the "by reference" options above. It tells them "pass me a pointer to an object, or nothing (nullptr)". The difference between pass by reference and pass by pointer is the ability to pass nullptr to the function. This comes in extremely handy when you want to use nullptr to indicate the beginning or end, or some special case. However, it comes with the expectation that we are able to handle both cases. So always, always, always do a nullptr check when you accept a pointer parameter. Though, if you're never actually handling the null case, that should be an indicator to change your argument passing semantic.

void do_something_pass_ownership(std::string &&s);

This && is a bit weird, and for developers coming from garbage-collected languages such as C# and JavaScript, it really doesn't have a parallel. But this is the r-value reference passing semantic, and it's used to denote the passing of ownership. Arguments should be passed from the caller using std::move; this makes the transfer of ownership explicit, allowing you to maintain your ownership model.
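To tie the five signatures together, here's a small sketch of what the call sites might look like. The stub bodies are placeholders of mine, just so the snippet compiles on its own.

#include <iostream>
#include <string>
#include <utility>

void do_something(std::string s)                  { s += "!"; }               // works on its own copy
void do_something_else(const std::string &s)      { std::cout << s << '\n'; } // read only
void do_something_to(std::string &s)              { s += " (edited)"; }       // mutates the caller's object
void do_something_again(std::string *s)           { if (s) *s += "?"; }       // must handle nullptr
void do_something_pass_ownership(std::string &&s) { std::string mine = std::move(s); (void)mine; } // takes ownership

int main()
{
    std::string message = "hello";

    do_something(message);                            // copy; 'message' is untouched
    do_something_else(message);                       // const-ref; read only
    do_something_to(message);                         // ref; 'message' is mutated
    do_something_again(&message);                     // pointer to a valid object...
    do_something_again(nullptr);                      // ...or explicitly "nothing"
    do_something_pass_ownership(std::move(message));  // hand over ownership explicitly

    // 'message' is now in a valid but unspecified (moved-from) state.
}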

Now, armed with RAII, a solid ownership and lifetime model, and proper passing semantics you can start to craft code that is Leak Free, by Design.

#include <memory>
#include <vector>

int make_handle();
void free_handle(int);

class widget_manager;

struct window_event {};

class widget
{

public:
    virtual ~widget()=default;
    virtual void dispatch(const window_event &e) = 0;

};

class widget_manager
{
public:
    widget_manager()=default;
    widget_manager(const widget_manager &)=delete;

    void dispatch(const window_event &e)
    {
        for(auto &wptr : widgets_)
        {
            wptr->dispatch(e);
        }
    }
    void add_widget(std::unique_ptr<widget> ptr)
    {
        widgets_.emplace_back(std::move(ptr));
    }

private:
    std::vector<std::unique_ptr<widget>> widgets_;
};

class main_widget : public widget
{


public:

    main_widget(widget_manager &manager)
    :manager_(manager), handle_(make_handle())
    {
        
    }    
    ~main_widget() { free_handle(handle_); }

    void dispatch(const window_event &e) override
    {
        // do something main here

    }

private:
    widget_manager &manager_;
    int handle_; 
};


class widget_factory
{

public:
   std::unique_ptr<widget> make_main_widget(widget_manager &m)
   {
       return std::make_unique<main_widget>(m);
   }

};

class application
{

public:
    application(widget_manager &manager, widget_factory &factory)
    :manager_(manager), factory_(factory)
    {
            auto widget = factory_.make_main_widget(manager_);
            manager_.add_widget(std::move(widget));

    }  

    window_event get_window_event() 
    {
        // get a window event (a real implementation would signal when there are no more events)
        return window_event{};
    }
    void run()
    {
        while(1)
        {
            auto event = get_window_event();
            manager_.dispatch(event);   
        }
    }

private:
    widget_manager &manager_;
    widget_factory &factory_;

};

int main()
{
    widget_manager manager;
    widget_factory factory;
    application the_app(manager, factory);
}

The problem with contrived examples for complex problems is that they never quite illustrate the point you're trying to make. For me, at least. The point this code illustrates is that you can come up with a model for long and short lived objects. You can have long lived objects own shorter lived objects. Knowing this set of information can dictate how you want to pass these objects around. You want to use the appropriate smart pointers, i.e. unique_ptr for unique ownership and shared_ptr for shared ownership, and always use RAII wrapper classes when you have sandwich code, in this example make_handle() and free_handle(int) (a sketch of such a wrapper follows below). If you keep these rules front of mind when you're programming in C++, you'll be able to sleep a little easier knowing you're not going to leak resources.
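Since that last rule is easy to gloss over, here's a minimal sketch of what an RAII wrapper for the make_handle()/free_handle(int) pair might look like. The unique_handle name and the stub acquire/release bodies are mine, just to keep the sketch self-contained.

int make_handle() { static int next = 0; return ++next; }  // stand-in for the real acquire
void free_handle(int) {}                                    // stand-in for the real release

// Owns exactly one handle; releasing happens in exactly one place.
class unique_handle
{
public:
    unique_handle() : handle_(make_handle()) {}
    ~unique_handle() { if (owns_) free_handle(handle_); }

    // Non-copyable: a handle has one owner.
    unique_handle(const unique_handle &) = delete;
    unique_handle &operator=(const unique_handle &) = delete;

    // Movable: ownership can still be transferred explicitly.
    unique_handle(unique_handle &&other) noexcept
        : handle_(other.handle_), owns_(other.owns_)
    {
        other.owns_ = false;
    }

    int get() const { return handle_; }

private:
    int handle_;
    bool owns_ = true;
};

int main()
{
    unique_handle h;             // acquired here...
    // use h.get() wherever the raw handle is needed
}                                // ...released here, even if an exception is thrown

With something like this, main_widget could hold a unique_handle member instead of a raw int, and its hand-written destructor goes away.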

Now, if your application isn't modern, or you're dealing with legacy code, then you should look to a tool like DeLeaker to help you find those leaks. Once you spot them, you can hopefully refactor your code, pulling from some of the above guidelines to guide you in the Leak Free direction.

You can find DeLeaker here. It’s important to note that I’m not endorsing this product, nor was there any financial incentive, only a copy of DeLeaker itself. I just want to help out others in the community.

The other interesting topic that came to mind while writing this post is how leak detection software actually works. If you're interested in that, let me know. If you enjoyed reading this and would like to read more, make sure you subscribe and share!

Until then, stay healthy and Happy Coding!

PL

“Design and programming are human activities; forget that and all is lost.”
― Bjarne Stroustrup

Please rate this post to let me know what I should write more of.

References:
Guidelines Support Library (GSL)
GSL: Passing Parameters
Leak-Freedom… By Default – Herb Sutter
Resource Model – Bjarne Stroustrup, et al.

Don’t try this at work – Vector Pt. 3

Welcome back everyone, for at long last, the last in this vector series. It’s been a long haul, and I appreciate everyone who has come along for the journey.

In light of the happenings recently in the United States of America, I’d just like everyone to take a moment of silence; for those victims of prejudice, hate, and discrimination. Humans are social creatures, and we all need to realize we are here not in-spite of each other, we are here because of each other. It is our duty, no matter your role in society, your job, your socioeconomic standing, to treat all other individuals with the same standard of decency, respect, and humility. We all face challenges, some more than others, and as humans we must empathize with our fellow human, regardless of ethnicity, gender, or sexual preference. It is our unique ability as humans to be self-aware, we should use this power for good, and not evil. I challenge you to pause for a moment and imagine what it would be like to live in someone else’s shoes, someone who has to deal daily with prejudice and segregation; if only for a moment. I have been blessed with many opportunities in my life, and am thankful everyday for that. No matter our position in life, it is never our right to deny others a chance at the opportunities we have had. In fact, it is our responsibility, to help enable those opportunities for others. We mustn’t allow this to continue. After the protests have dispersed, and the smoke has settled. We must remember this, and as a society we must change. For if nothing changes, nothing changes. #blacklivesmatter

Thank you.

The last time we left off, we had a pretty much working vector implementation. Then we wanted to enable it to be used with standard algorithms. It was a bit of a trick though, because it already worked with the algorithms, using a direct pointer into the memory. Instead we wrote an iterator class that was a bit safer, and allowed us a bit more control over the raw pointer. Then we learned a bit of funny business with SFINAE, to enable specific functions only when the arguments are iterators. It was a pleasant stride that we took, and we learned a few things along the way. Taking a look at insert allowed us to meander down the valley of memory allocation and reorganization. Then we were able to stop and smell the SFINAE roses, though they often have an off scent. At the start of this journey, I'll be honest — I wasn't expecting it would be this much work. It is crazy to think about the amount of brain power that went into crafting our humble little standard vector that we use daily. It puts in perspective the amount of work that goes into the STL alone. Although, it was the premise of this series to illustrate why it's in our best interest to use the fantastic set of tools that ship with our compiler.

This adventure however, is coming to a conclusion. When I started, I had almost jokingly said I wanted to pit the vector we would create, against the MSVC standard vector. At that time of writing, I didn’t really know what I was going to do, (similarly with “use it with the algorithms”). It isn’t to say I didn’t have a direction, I just didn’t know what I was going to do. So I had to decide what exactly “pitting my vector” against Microsoft’s meant. Luckily, since it’s my blog, it’s my arena. Though if I didn’t do it justice, I knew that I would get down-voted into oblivion. LIKE THAT TIME I THOUGHT VIM COULDN’T NAVIGATE TO HEADER FILES. So, I had to actually put some brain power into this, other than just making some asinine competition up, and declaring that I win!

We really want to measure the amount of work the vector is doing. Often in computing, the amount of work is measured by the amount of processing time. This is why we say an operation or algorithm is fast or slow: it's the amount of time it takes to complete. It follows that the less a particular operation does, the faster it will be. The fastest program is no program at all. However, it's also the most useless program ever. The art of optimization is to do less work while still having the same outcome. There are a few things that affect the actual timing of operations. These range from processor speed, to program load (i.e. how many programs are running simultaneously), to memory consumption, etc. These though are all environmental, meaning that if you tested program A in the same environment as program B, those factors affect both the same way. The biggest effect on your operation's overall speed will be its algorithmic complexity. You might've heard of this; it's often expressed in Big-O notation. You'll hear people say things like an operation runs in constant time, a binary search is Log(N), or a linear search runs in linear time; worst of all, that an algorithm is exponential. Now, these "time" measurements aren't actual wall clock time, though they will affect the amount of ticks your algorithm takes to complete. They're a relative time scale, based on the amount of work the algorithm has to do. If an algorithm is linear, it means the time grows linearly in relation to the size of its input. When something is a constant time operation, it means that it has to do the same work regardless of the size of the input.

Let's explain this in terms of our vector. Say we want to find an item in our vector; in the first case we want the 10th item. Okay, so regardless of the ordering of our vector, if we want the 10th item we have to do the same constant work: get the start of the vector, advance it a fixed amount, and we have the item. Okay, okay, what about a vector of size < 10? Then we have to check the size first. Regardless of our vector's size, though, it's the same work for 1, 10, 100, 1 000, 10 000, or 1 000 000 000 items. This means it's a constant time operation. You would see this in the wall clock time: if the operation took 10 ms for a vector of 10 items, it would take roughly the same 10 ms regardless of the size.
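In code, that constant-time lookup is just the two steps described above, no matter how many items the vector holds (the function name here is mine):

#include <optional>
#include <vector>

// Constant time: check the size, then jump straight to the slot. The same two
// steps run whether the vector holds 10 items or a billion.
std::optional<int> tenth_item(const std::vector<int> &v)
{
    if (v.size() < 10)
        return std::nullopt;        // fewer than 10 items: there is no 10th
    return *(v.begin() + 9);        // start of the vector, advanced a fixed amount
}

int main()
{
    std::vector<int> v(100, 7);
    auto item = tenth_item(v);
    return item ? 0 : 1;
}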

What if we wanted to find an item that matched some predicate? Assume our vector is a vector<int> of random numbers, and we want to find, say, the value 10. What do we have to do? Well, we have to start at the beginning and iterate the vector until we find the value 10. Now, all things being completely random, you may find the element right at the beginning, somewhere in the middle, or maybe it's the very last element. Given one full iteration of the vector, you'll be able to either a) find the element or b) know it doesn't exist. This is a linear algorithm. The wall clock time will grow roughly linearly as the size increases, meaning it should take roughly 10 times longer to find an element in a vector of 100 vs. a vector of 10, with a linear search. In practice however, if you measured this once, you might see something strange: your vector of 10 might take more time than the vector of 100. Why is that? Well, it could simply be due to a roll of the dice. All things being random, you found the item at the end of the 10 item vector, and it was the first element in the 100 element vector. That's why we usually talk about the average (expected) case rather than a single run. With a linear search, the best case is the one where you find the element at the very start, and the worst case is at the end; the average case will land somewhere in the middle. So if you were to measure searching a 10 element vector vs. a 100 element vector, say, 1000 times, you would see that on average the larger vector took longer, and if you were to graph it you would see it grow linearly.

Next up, the dreaded super-linear growth. These are the algorithms we want to avoid as developers, because they're the type of algorithm that "works on my machine", but grinds to a halt at scale, because we test with 100 elements on our machine, and scale is 100 000 elements. These algorithms are insidious, and often come in the form of nested loops. To continue our example, suppose you wanted to search for a range of items within your vector. If we call the elements we search for the needle, and the elements we're searching the haystack, we could approach this problem as follows: start by iterating the haystack until we find an element that matches the first element of the needle, then continue iterating both collections until we either match the entire needle OR we fail to match and start the process over again (a sketch follows below). Now, the reason these types of algorithms are insidious is that they tend to be adequately fast, until you start getting into larger collections. So it is helpful to be able to spot them before it's too late. Each time you've got a nested loop where each loop has an unbounded end, you've got work that grows with the product of the input sizes (quadratic when both grow together), and each extra level of nesting makes it worse.
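Here's a rough sketch of that needle-in-haystack approach (the names are mine); the thing to notice is the two nested loops whose bounds both depend on the input sizes:

#include <cstddef>
#include <vector>

// Naive subsequence search: for every start position in the haystack, try to
// match the whole needle. The work grows with haystack size * needle size.
bool contains_range(const std::vector<int> &haystack, const std::vector<int> &needle)
{
    if (needle.empty())
        return true;

    for (std::size_t i = 0; i + needle.size() <= haystack.size(); ++i)   // outer loop: each start position
    {
        std::size_t j = 0;
        while (j < needle.size() && haystack[i + j] == needle[j])        // inner loop: match the needle
            ++j;
        if (j == needle.size())
            return true;    // matched the entire needle
        // otherwise, start the process over from the next position
    }
    return false;
}

int main()
{
    std::vector<int> haystack{4, 8, 15, 16, 23, 42};
    std::vector<int> needle{16, 23};
    return contains_range(haystack, needle) ? 0 : 1;
}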

Linear operations are often good enough, meaning that on average, at scale, they'll complete in an adequate amount of time. As long as this meets our timing requirements, and the user doesn't go gray waiting, a linear algorithm will suffice. What about the times when it won't? How can we do better? Well, when you're dealing with completely random data, you can't do better than the brute-force approach, because the data has no structure to exploit. For instance, to search for an item in a random collection, you're basically required to iterate the entire collection, without any shortcuts. How else can you be sure of your result? This means that in order to do better than linear, we have to add some structure to our data, so that we can make reasonable guarantees. This allows us to take shortcuts, and do less work.

Let's go back to the search we were doing earlier. We want to find an element with the value of 10, in a vector<int>. Now, when we did this before, we operated on a random set of data. There was no order or structure to our data, so we were forced to inspect each element. Let's instead assume the vector is sorted. Given this order, we can make reasonable assumptions that allow us to take shortcuts and do less work when we're searching. Of course the linear search still works, but why would we do that when the data is sorted? Assuming an ascending sort, we could search in the following manner. Start at the middle element: is it our value? If not, is it less than, or greater than, the value we're searching for? If it's less than the value we're searching for, and the list is in ascending order, then we know with certainty that the item resides in the top half of the list. This means in one operation we can discard half the list of items. We just discarded half the work. Now, if you take the same approach to the top half of the list, you've just discarded half-of-half. Keep doing this and you'll find your item, or you'll run out of items to search. As you can see, having the data in sorted order allows us to rapidly find the item by means of divide and conquer. Each iteration of our search effectively removes half of the work. This is what is called logarithmic time complexity, because each step divides the remaining set in 2. These types of operations are often referred to as binary operations, because at any given point you either go left or right, depending on the value. Now, a sorted order allows this type of approach on a list of items, but you can also structure your data in a way (e.g. a binary tree) that makes these types of binary operations very easy.

Now, you're probably asking "well, what about the time it takes to sort the vector?" Well, that does have a cost, and that's where the balance of computing comes in. Maybe adding items to your vector happens rarely, but you're constantly searching it, so it makes sense to sort the collection, pay for that at initialization, and then use a faster algorithm for searching. Given your data set, you might instead want to maintain a binary structure, where you pay a Log(N) cost for insertion as well as for searching. The important point is that structure and order are what allow us to craft smarter algorithms that do less work.
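To make that trade-off concrete, here's a small sketch using the standard algorithms (the data is made up): pay for std::sort once, and every lookup after that is a binary search.

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> values{42, 7, 10, 93, 3, 56, 21};

    // Pay the sorting cost once, up front...
    std::sort(values.begin(), values.end());

    // ...then every search is logarithmic instead of linear.
    bool found = std::binary_search(values.begin(), values.end(), 10);
    std::cout << (found ? "found 10" : "no 10") << '\n';
}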

I’ve digressed. What I really wanted to cover was measuring our vector, against MSVC vector.

I thought a while (maybe 14 minutes) about what I wanted to measure. I felt that the only real operation a vector did was grow, meaning that pretty much anything else was effectively a constant time operation. If you want to get an element, or assign to a pre-existing slot, this is a constant time operation. So where are the operations that will take some time? Well, I know that there is work to be done when the vector grows, due to the fact we have to allocate more space, copy into that new space, then free the old space. So the operations we really want to time are adding an item (emplace) and inserting items (insert). When we insert, we want to test inserting at the beginning, middle, and end, because we know from when we implemented it that those operations all take different paths. You can read about this in the previous post.

I’ll be honest — I had planned to dedicate a significant portion of this post to benchmarking, and methods behind that. Though at this point with my algorithmic complexity digression, it will make this post a novel to get into that. If that’s of interest to you, let me know and I’ll try to write a follow up post about it.

Originally, my plan was to look into using a third party benchmarking tool. Something like Google Benchmark, or nonius. However, I couldn’t help my own curiosity. To be completely honest, I think one should not re-invent the wheel, at work. But at home, for fun, for posterity (yes — if I ever have children, I want to show them my library of useless recreations.), or for education, I’m a firm believer we should make attempts at it. So, for better or for worse, instead of using an off-the-shelf solution. I opted to write my own lightweight benchmarking tool. You can find it here.

What I chose to measure is the following:

  1. Emplace 10, 100, 1 000, 10 000, 100 000, 1 000 000 int values.
  2. Emplace 10, 100, 1 000, 10 000, 100 000, 1 000 000 struct w/ move ctor.
  3. Insert 1 000 into 1 000, 10 000, 100 000 int.
  4. Insert 1 000 into 1 000, 10 000, 100 000 struct w/ move ctor.

Once we do that, we can then analyze the results. This would be the very first step on the road of optimization. I won’t digress down that rabbit hole.

Here are the graphed results of the different emplace measurements. On the x-axis is the number of iterations the test was run; on the y-axis is the time in nanoseconds. The blue line represents the MSVC standard vector, and the orange line is the htk vector. Well — no surprise here, htk wins. No need to look any closer at those results.

Just kidding. Put your pitch-forks away. I don’t actually think that the htk vector is more performant than the MSVC vector. There’s perfectly logical explanations to why it is coming in faster than the STL vector. One plausible answer would be that the numbers are just way off, I’ve never claimed to be a mathematician or statistician. The code could be wrong, there’s a possibility that I’m timing more than I’d like to be timing. However, given that the code is identical for both cases, it should be consistent. I will say however, that there is large variance in the data.

(all times in nanoseconds)

name                   | iterations | total     | mean     | median | min  | max     | s-dev
vector<int> emplace 10 | 10         | 61200     | 5160     | 2500   | 1700 | 27300   | 7515.344
vector<int> emplace 10 | 100        | 268200    | 2162     | 2100   | 1500 | 6000    | 530.2405
vector<int> emplace 10 | 1000       | 2993100   | 2391.5   | 2200   | 1600 | 20900   | 1051.978
vector<int> emplace 10 | 10000      | 24818400  | 2022.05  | 1900   | 1000 | 1900700 | 19016.71
vector<int> emplace 10 | 100000     | 121941400 | 997.8218 | 800    | 600  | 375300  | 2574.748

With such a large standard deviation, it means our data is pretty widely spread, and using Bryce's check to determine whether the values are normally distributed, hardly any of the tests looked normal. That being said, I ran the measurements multiple times and saw similar results, so let's go with it being somewhat correct.

We can start to reason about why the htk vector might be faster than the standard vector. Well — truth be told, there was one test where I knew htk would have the upper hand: the emplace 10 items test. If you recall, when we implemented our vector, the initialization of the vector allocated 10 slots. This means that we started with enough room for 10 elements. The standard vector had to allocate this initial memory, so it was pretty safe to say we'd win there.

In two of the cases, emplace 1000 and 10000, we didn't actually win. If you look at the 100000 iteration trials for both tests, the timing was almost identical. So likely, at best, for two of the cases we're equal, which makes sense. To illustrate this further, we'll take the 100000 iteration trials for each test (10, 100, 1000, 10000, and 100000 items) and graph them.

As we can see, the two are almost identical. Since our trial size is growing by a factor of 10 per test, it makes sense to plot the time on a logarithmic scale.

(x-axis: items emplaced; y-axis: nanoseconds)

This graph shows pretty clearly, almost exactly, what we expected. In the test with 10 elements, the htk vector is superior because it isn't doing any allocations. It makes sense that the 100 element test is slightly faster too, because we saved the time of that first allocation of 10. After that, it stops mattering; the two are pretty much neck and neck. Another reason I suspect the htk vector might have a slight upper hand, for the move_only object, is that it's much less safe. The htk vector probably doesn't even offer a basic exception guarantee, whereas the stl vector offers a strong guarantee, whereby if an exception is thrown while the collection is growing, it will effectively roll back to the state before trying to grow. With the htk vector, well, it doesn't do that. It at least tries to clean up for a basic guarantee, but I didn't really pay attention to ensuring it, so there could be cases where it does not offer that.

(x-axis: items emplaced; y-axis: nanoseconds)
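For context on that strong guarantee, the usual trick is to build the new state off to the side and only commit once everything has succeeded. Here's a rough sketch of the idea; this is not the actual MSVC code, and a real vector will also move elements when it can do so without risking an exception.

#include <string>
#include <vector>

// Build the grown state in a staging vector; if any copy throws, 'staging' is
// destroyed and 'v' is left exactly as it was. The swap at the end never throws.
template <typename T>
void grow_with_strong_guarantee(std::vector<T> &v, const T &extra)
{
    std::vector<T> staging;
    staging.reserve(v.size() + 1);
    for (const T &item : v)
        staging.push_back(item);   // may throw; original untouched
    staging.push_back(extra);      // may throw; original untouched
    v.swap(staging);               // commit
}

int main()
{
    std::vector<std::string> names{"alpha", "beta"};
    grow_with_strong_guarantee(names, std::string{"gamma"});
}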

Okay — so now for the embarrassment I think we were all expecting.

Overall — I'd say this is a blow out; the std vector is about 5x faster in all cases than our htk vector. So then we ask ourselves, why? Well, the answer lies back with what we did for emplace, and what we didn't do for insert. Well, kind of. With insert, when we need to grow, if the original items are trivial, we can just use memmove to very quickly move the contiguous data from one area of memory to another. Keeping in mind, this is trivial data, which is what allows us to just copy the bytes. This definitely optimized our growth code for trivial types. However, when we implemented insert we used this code:

// move the existing items in front of the insertion point into the new space
move_items(data_.first, data_.first + offset, next.first, copy_type{});
auto dest = first + offset;
// construct each newly inserted element one at a time (the slow path)
while (start != fin)
    allocator_.construct(*(dest++), *(start++));
// move the existing items after the insertion point
move_items(data_.first + offset, data_.last, dest, copy_type{});

You can see that we use our tag dispatch optimized function to move the items within our vector around, but when we're inserting our new elements, we just construct them one by one, instead of just moving them (read: memmove) into the newly allocated space. Now, you might be asking "how much slower could that actually be?" Well, in this case, it turns out to be roughly 5x slower, in fact. So we can change the code to something along the lines of…

// move the existing items in front of the insertion point into the new space
move_items(data_.first, data_.first + offset, next.first, copy_type{});
auto dest = first + offset;
// move the inserted range with the same tag-dispatch optimized path (memmove for trivial types)
move_items(start.ptr(), fin.ptr(), dest, copy_type{});
// move the existing items after the insertion point
move_items(data_.first + offset, data_.last, dest, copy_type{});

If you recall, when we wrote move_items, we took advantage of tag dispatching, to determine which function to use. The specific function that will get chosen will be the function that uses memmove.

void move_items(pointer first, pointer last, pointer &dest, memcpy_items_tag)
{
     memmove(dest, first, (last - first) * sizeof(T));
}

After changing the code, I re-ran the insert tests, and the results (unsurprisingly) looked like this.

You can see now, they're both really fast, given that they're within 200ns of each other. I'd say that they're pretty much identical. The only justification I can make for the slight difference is, again, the safety factor. We really don't care about any exception guarantees, which for fun is okay, but don't try it at work.

Okay — I’ll address it. I know it’s burning for some of you. That code I wrote above, doesn’t actually work. I mean — it works, for the test I had written, but not much else.

move_items(start.ptr(), fin.ptr(), dest, copy_type{});

This line here has this weird .ptr() call on both the iterators. I can explain. Basically, move_items is akin to the MSVC _Umove function. However, the MSVC functions work on iterators, and ours just so happens to work on pointers, from the legacy code we wrote in the first article. What does that mean? Well, it means that we directly pass the input of our move_items call into memmove. This was perfect at the time, because we had full access to the pointers, meaning our iterator was a pointer. But now, now we're working with iterators that aren't pointers. Just so we're on the same page, a pointer is an iterator, but an iterator is not a pointer. There is a level of indirection there that allows iterators to work the way they do over pointers, but it means we can't feed the iterator directly into the memmove function without the compiler yelling. But you just said that MSVC uses iterators in their function…. They do, but first they go through the process of undressing their iterators. No — wait, that's not right. Unwrapping their iterators. It's basically the process of removing the iterator facade so that we can, in this case, see the underlying pointer. But why didn't you do that, Peter? Well — if I'm honest, as much as I love rabbit holes, and going down them to come out the other side, I'm probably almost as tired of working on this vector as you readers are of reading about it. It's also another exercise in traits and meta programming that I didn't want to spend the time on. My bad. 🙂
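For the curious, the "unwrapping" idea isn't much code when your iterator exposes the pointer somehow. Here's a hedged sketch (the unwrap name is mine, and it assumes the iterator has an operator-> returning a raw pointer); also worth knowing is that C++20 ships std::to_address, which does this for contiguous iterators.

#include <iostream>
#include <memory>   // std::to_address (C++20)
#include <vector>

// Raw pointers are already "unwrapped"; pass them straight through.
template <typename T>
T *unwrap(T *p) { return p; }

// Anything iterator-shaped gets peeled back to the pointer it wraps.
// This assumes the iterator exposes operator-> returning a raw pointer,
// which the standard iterators (and ours) do.
template <typename IteratorT>
auto unwrap(IteratorT it) { return it.operator->(); }

int main()
{
    std::vector<int> v{1, 2, 3};

    int *p = unwrap(v.begin());   // the hand-rolled version
    std::cout << *p << '\n';      // prints 1

    // In C++20 the standard library already provides this:
    // int *q = std::to_address(v.begin());
}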

I had a better use of that time. I'm going to try and keep the promise that I made in the beginning, and again in the middle, and that now, as we approach the end, I still have not kept: I have not really talked at all about implementing OR timing an algorithm.

So — the other day, I was listening, or watching, I think listening, to a CppCast podcast, and Sean Parent was the guest. He was talking about how at Adobe one of the technical questions is that you have to implement a binary search, and then prove it. I thought, challenge accepted. Not that I'll be applying at Adobe any time soon, but it's a worthwhile exercise. Especially when it's not for all the marbles, like a job at Adobe.

template <typename IteratorT, typename ValueT>
bool binary_search(IteratorT begin, IteratorT end, const ValueT &v)
{
    if (begin == end)        // empty range: nothing to search
        return false;
    const auto mid = (end - begin) / 2;
    if (mid == 0)            // base case: a single element remains, check it directly
        return *begin == v;
    const auto midval = *(begin + mid);
    if (midval == v)
        return true;
    if (v < midval)
        return htk::binary_search(begin, begin + mid, v);   // search the bottom half
    else
        return htk::binary_search(begin + mid, end, v);     // search the top half
}

Okay — I did it. It's a recursive binary search function. I know what you're thinking, eww recursion. That's just how I think about it. So the next part is to prove it by induction. Dun. Dun. Dun. Now, Sean mentioned that he had an individual provide 30 pages of documentation for the binary search. I'll attempt to do one better…. 31 pages! Just kidding. I can barely muster 5000 words into a coherent blog post, let alone a 31 page proof.

Here it goes. Given an ascending sorted collection: our base case is when the midpoint (end - begin) / 2 is equal to 0, which (with the empty-range guard) means a single element remains. When this is the case, the value we're looking for is either that remaining value, or it does not exist in the collection. When the collection is larger, our midpoint is not equal to 0, and there are three cases. If the value at our midpoint is equal to our value, it exists in our collection. If it is not equal, and the search value is less than the midpoint value, then it can only exist in the bottom half of the collection. If the search value is greater than the midpoint value, then it can only exist in the top half of the collection.

Well, it’s not much, and it’s hardly a mathematical proof, but it’s my best try. I think it illustrates the point of binary search as divide and conquer.

To close off this post, my plan was to illustrate the magic of the binary_search over the plain old linear search. However, like always, color me surprised. I went off and used the chronograph library to time the different search functions. I wanted to test std::find, std::binary_search, and htk::binary_search (recursive) to see how they fare across a vector of integers. The test was as follows.

  1. Generate a random vector of size 10, 100, 1000, 10000, with values between 1-100
  2. Generate a random search value between 1 – 100
  3. Test 10, 100, 1000, 10000, 100000 times, each search function

As simple as this seems, it wasn’t as straightforward as I would have thought. My first hurdle was that my “random” generator function, wasn’t really random. The initial code looked like this.

template <typename T, typename VectorT>
VectorT random_numeric_vector(size_t size, T min = std::numeric_limits<T>::min(), T max = std::numeric_limits<T>::max())
{
    VectorT v;
    std::default_random_engine generator;
    std::uniform_int_distribution<T> distribution(min, max);

    for (size_t i = 0; i < size; ++i)
    {
        v.emplace_back(distribution(generator));
    }
    return v;
}

size_t next_random(size_t min, size_t max)
{
    std::default_random_engine generator;
    std::uniform_int_distribution<size_t> distribution(min, max);
    return distribution(generator);
}

A simple test case looked like this.

benchmark(s, "std::binary_search vector<int>(10)", [](session_run &r) {
    auto v = random_numeric_vector<int, std::vector<int>>(10, 1, 100);
    std::sort(v.begin(), v.end());
    auto value = next_random(1, 100);
    measure(r, [&v, &value]() {
        std::binary_search(v.begin(), v.end(), value);
    });
});

The results I was seeing across the two searches were very, very strange. The linear search sat somewhere in the realm of 100ns, for every collection size. The binary search was slower, somewhere in the range of ~150ns, and slightly higher for the larger collections, but not by much. So I thought, what the heck? I chalked it up to something to do with optimization. So, I chased down some code to tell the optimizer to eff off, and I ran it again. Same result. Over and over. In hindsight, I probably should've checked the documentation, or been smarter. It took putting a debugger on it to look and see: every time I would generate a new "random" vector, it had the same "random" values. As well, each time I would generate a "new" random search value, it just so happened to be equal to the first element in the collection. Thus, my pseudo random number generator, constructing a new default_random_engine per call, was giving me the exact same set of random values, which just so happened to make each random search value not so random. In fact, it was the best case for a linear search, making it return immediately in every iteration, for every vector size.

After all that, I fixed the issue by hoisting and reusing the default_random_engine, so I was actually getting a varying set of random values (a sketch of the fix follows below). So then, back to timing it. After I ensured the values were random, and the test was definitely executing the code I wanted, I had hoped to see the results we talked about earlier. You know, where the linear search time grows in step with the collection size, and where the binary search time grows and then, as the collection gets larger and larger, levels off. A logarithmic graph.
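For reference, the fixed generators looked roughly like this. It's a sketch: I've hoisted the engine into a function-local static and seeded it from std::random_device; the exact wiring in the repo may differ.

#include <cstddef>
#include <random>
#include <vector>

// One engine, created and seeded once, then reused by every caller.
std::default_random_engine &shared_engine()
{
    static std::default_random_engine generator{std::random_device{}()};
    return generator;
}

std::size_t next_random(std::size_t min, std::size_t max)
{
    std::uniform_int_distribution<std::size_t> distribution(min, max);
    return distribution(shared_engine());   // same engine every call, so the sequence advances
}

template <typename T, typename VectorT>
VectorT random_numeric_vector(std::size_t size, T min, T max)
{
    VectorT v;
    std::uniform_int_distribution<T> distribution(min, max);
    for (std::size_t i = 0; i < size; ++i)
        v.emplace_back(distribution(shared_engine()));
    return v;
}

int main()
{
    auto v = random_numeric_vector<int, std::vector<int>>(10, 1, 100);
    auto needle = next_random(1, 100);   // now varies run to run, and call to call
    (void)v; (void)needle;
}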

Then this happened. What. The. Fox. Said? The one I least expected, std::find, is fastest in almost all cases. They're basically equivalent up until a million items, where I would've expected the binary_search to prevail. The linear search performing like that is very strange. Not only does it outperform both the binary searches (iterative & recursive), it also doesn't require the input to be sorted, which on average took almost 33 ms for the 1 000 000 int vector. That is quite a bit of overhead to pay, just to be outperformed. What gives? This isn't what my CompSci professor told me; they couldn't have been lying, could they? Obviously, it makes logical sense that a linear search, where you could potentially have to iterate all elements, would take more time than one where we continually break the search in half. So, my professor wasn't lying, and this result is just very counter intuitive. We can make a bet that std::find isn't doing what we think. It just can't be. Let's prove that.

We are assuming it’s iterating the collection and checking, something like this.

template <typename IteratorT, typename ValueT>
bool linear_search(IteratorT begin, IteratorT end, const ValueT &v)
{
    for (; begin != end; ++begin)
    {
        if (*begin == v)
            return true;
    }
    return false;
}

Update 06.16.2020: After posting this story to reddit, it was pointed out by /u/staletic that std::find is not optimized by memchr. In fact, after I stepped through with the debugger, it was in fact, not doing this. I’m wondering if it’s just the sheer act of unwrapping the pointers to raw pointers that is making it that much faster. I’m going to run a test and see.

I ran the results of that test, and voila!

(times in nanoseconds)

Size               | 10      | 100     | 1000    | 10000    | 100000   | 1000000
std::find          | 249.341 | 195.602 | 205.812 | 171.916  | 266.472  | 513.4
std::binary_search | 96.202  | 129.227 | 163.582 | 206.146  | 409.685  | 1321.187
htk::binary_search | 108.538 | 143.235 | 161.051 | 196.494  | 504.58   | 1114.834
htk::linear_search | 85.808  | 126.395 | 421.989 | 3577.376 | 34773.36 | 391704.3

So, the first graph looks a lot more like we would've assumed it to look: the naive linear search time grows in step with the collection size (which we're growing by a factor of 10 each step). It also gets trounced by even the recursive binary_search, which makes the cost of sorting the collection totally worth it. That said, if you're able to avoid paying the sort cost, by maybe maintaining sort order on insert, you have a very quick search on your hands.

So what gives with std::find? Well, the truth is in the details. It looks like for types that aren't trivial, it does a typical linear search, in the style we implemented. But when it comes to trivial types, it can use a different mechanism: a call to memchr, a C API call that will find the pointer to a matching value in a block of memory (though, per the update above, when I stepped through the int case on my machine it wasn't actually taking that path). Whatever it's doing under the hood, I can tell you it's very quick. Meaning that if you have a collection of trivial types, it looks to be in your best interest to favor std::find for searching the collection, and not worry about having to sort the collection.
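For reference, memchr itself is dead simple to use, but it only ever matches a single byte value, which is presumably why a trick like that could only apply directly to byte-sized element types. A tiny illustration (the data and names here are mine):

#include <cstring>
#include <iostream>

int main()
{
    const char data[] = { 'h', 't', 'k', 'v', 'e', 'c' };

    // memchr scans a block of memory for a single byte value and returns a
    // pointer to the first match, or nullptr if it isn't there.
    const void *hit = std::memchr(data, 'v', sizeof data);

    std::cout << (hit != nullptr ? "found" : "not found") << '\n';
}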

Alright — I think I kept my promise. That was long winded, but a surprisingly interesting journey. We scratched the surface of micro-benchmarking, by understanding how to apply it to the work a vector does. As well, we took a stab at implementing (and proving) a binary_search algorithm. Then, we got to see it in action, and how it performed (or should I say got outperformed) against std::find and std::binary_search. Then we saw how a true linear search really compares to a binary search, when they're playing the same game. Overall, this was a very interesting walk down vector trail. I hope you enjoyed the read, and look forward to seeing you next time!

Until then, stay healthy, and Happy Coding!

PL

"A sudden bold and unexpected question doth many times surprise a man and lay him open."
― Francis Bacon

References
https://www.youtube.com/watch?v=zWxSZcpeS8Q&feature=youtu.be
https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=2441
https://www.bfilipek.com/2016/01/micro-benchmarking-libraries-for-c.html
https://github.com/google/benchmark/blob/c078337494086f9372a46b4ed31a3ae7b3f1a6a2/include/benchmark/benchmark.h
https://github.com/rmartinho/nonius