Mirror reflection library: Factories and factory generators in-depth

Mirror reflection library 0.5.13

Introduction

This part of the documentation takes a more in-depth look at factories, both in general and specifically at those generated by Mirror's factory generator utility. A brief overview of the generator can be found in Factory generator utility.

Factory patterns

The concrete factory, abstract factory, factory method and other related patterns are well-established solutions to recurring object instantiation problems in software design. These design patterns separate the user of the created, possibly polymorphic, object from the method and details of its construction.

The product type can have multiple constructors taking different sets of arguments. In some cases it is desirable that the decision about which constructor to use is made only at run-time, immediately before the construction, and not at compile time. Nor should the factory require the product to have a constructor with a specific signature, for example the default or copy constructor. The sources supplying arguments for the construction may vary: argument values may be converted from a dataset resulting from an SQL database query, read from an XML stream, or entered by the user through some kind of user interface. Existing instances, for example from an object pool, may also be used as arguments for the construction of new instances.

Manually implemented factories

Let us consider the following simple class modeling a vector or point in three-dimensional space. For the sake of simplicity the class has three public floating-point member variables storing the values of the x, y and z coordinates. It also has three explicitly defined constructors and the implicit copy constructor. Thus there are four ways to construct a new instance of vector: default construction initializing all coordinates to zero, initialization of all coordinates with a single value, initialization of each coordinate separately, and copy construction making the new vector a copy of another existing instance of vector:

  struct vector
  {
    double x,y,z;
    vector(void): x(0.0), y(0.0), z(0.0) { }
    vector(double _w): x(_w), y(_w), z(_w) { }
    vector(double _x, double _y, double _z): x(_x), y(_y), z(_z){}
  };
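
The four construction paths listed above can be exercised as in the following minimal sketch. The class is repeated so that the listing is self-contained; the helper same_coords and the function construct_all_four_ways are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// The vector class from the text, repeated for self-containment
struct vector
{
  double x, y, z;
  vector(void): x(0.0), y(0.0), z(0.0) { }
  vector(double _w): x(_w), y(_w), z(_w) { }
  vector(double _x, double _y, double _z): x(_x), y(_y), z(_z) { }
};

// Hypothetical helper, used only to compare coordinates in this sketch
inline bool same_coords(const vector& a, const vector& b)
{
  return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Uses each of the four ways of constructing a vector once
inline void construct_all_four_ways(void)
{
  vector a;                 // default construction: all coordinates zero
  vector b(2.0);            // a single value for all coordinates
  vector c(1.0, 2.0, 3.0);  // each coordinate separately
  vector d(c);              // implicit copy constructor
  assert(a.x == 0.0 && a.y == 0.0 && a.z == 0.0);
  assert(b.x == 2.0 && b.y == 2.0 && b.z == 2.0);
  assert(same_coords(c, d));
}
```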

A hand-coded factory for the vector type, which allows the constructor to be selected at run-time and the source of the constructor arguments to be specified, might look as shown in the following pseudo-code. For the sake of simplicity this implementation does not allow existing instances to be used as arguments for (copy) construction. This feature will be addressed later.

  class vector_factory
  {
  private:
    double extract_param(Data data, string param_id)
    {
      // somehow extract and convert a parameter from the data
    }
    int get_index(Data data)
    {
      // somehow pick the constructor to be used
    }
  public:
    vector* create(Data data)
    {
      switch(get_index(data))
      {
        case 0: return new vector();
        case 1: return new vector(
          extract_param(data, "w")
        );
        case 2: return new vector(
          extract_param(data, "x"),
          extract_param(data, "y"),
          extract_param(data, "z")
        );
        default: throw exception(...);
      }
      return 0;
    }
  };

This factory could then be used to create a vector in client code in the following way:

  Data get_data(void){ ... }
  vector_factory fact;
  vector* pv1 = fact.create(get_data());
  vector* pv2 = fact.create(get_data());

The vector_factory provides a public member function called create, which uses some kind of branching to dispatch to the selected constructor and supplies the required arguments to it. It also implements the member function get_index, which selects the constructor to be used, and the member function extract_param, which is responsible for the extraction and conversion of the argument values from an external representation. The notable feature is that regardless of the real external representation of the input Data type (XML, JSON, XDR, ASN.1, RDBS dataset, etc.), many parts of the factory class remain the same. The only things that change are the Data type and the implementation of the get_index and extract_param functions. This makes the implementation above unsatisfactory, because code is duplicated between factories creating the same product from different external input data representations.
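
To make the pseudo-code concrete, here is a minimal compilable sketch of such a hand-coded factory. It assumes, purely for illustration, that Data is a std::map from parameter names to values and that the constructor is picked from the keys that are present; neither assumption comes from Mirror.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct vector
{
  double x, y, z;
  vector(void): x(0.0), y(0.0), z(0.0) { }
  vector(double _w): x(_w), y(_w), z(_w) { }
  vector(double _x, double _y, double _z): x(_x), y(_y), z(_z) { }
};

// Assumption of this sketch: the external data is a name -> value map
typedef std::map<std::string, double> Data;

class vector_factory
{
private:
  // Looks the parameter up by name; throws if it is missing
  double extract_param(const Data& data, const std::string& param_id) const
  {
    Data::const_iterator pos = data.find(param_id);
    if(pos == data.end())
      throw std::runtime_error("missing parameter: " + param_id);
    return pos->second;
  }

  // Picks the constructor from the keys present in the data
  int get_index(const Data& data) const
  {
    if(data.empty()) return 0;
    if(data.count("w")) return 1;
    if(data.count("x") && data.count("y") && data.count("z")) return 2;
    return -1; // no constructor matches
  }
public:
  // The caller owns the returned instance, as in the pseudo-code
  vector* create(const Data& data) const
  {
    switch(get_index(data))
    {
      case 0: return new vector();
      case 1: return new vector(extract_param(data, "w"));
      case 2: return new vector(
        extract_param(data, "x"),
        extract_param(data, "y"),
        extract_param(data, "z")
      );
      default: throw std::runtime_error("no matching constructor");
    }
  }
};

// Example use: the key "w" selects the single-value constructor
inline void vector_factory_example(void)
{
  vector_factory fact;
  Data data;
  data["w"] = 2.5;
  vector* pv = fact.create(data);
  assert(pv->x == 2.5 && pv->y == 2.5 && pv->z == 2.5);
  delete pv;
}
```

Only the typedef and the two private member functions depend on the data representation; create would be identical for any other input format.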

It would be possible to refactor the implementation and move the parts that change into an external polymorphic abstract class, say factory_helper, which the vector_factory would use. The get_index and extract_param functions would become virtual functions of factory_helper, and Data would become an opaque type, for example void* or boost::any. Concrete implementations of factory_helper would handle the constructor picking and parameter extraction.
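
A minimal sketch of this refactoring follows. It uses const void* as the opaque type, and the concrete helper factory_helper_map (reading from a std::map) is a hypothetical example, not a class provided by Mirror. Note the unchecked casts: nothing stops a caller from passing a pointer to some other type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct vector
{
  double x, y, z;
  vector(void): x(0.0), y(0.0), z(0.0) { }
  vector(double _w): x(_w), y(_w), z(_w) { }
  vector(double _x, double _y, double _z): x(_x), y(_y), z(_z) { }
};

// Abstract helper; the input data is reduced to an opaque pointer
class factory_helper
{
public:
  virtual ~factory_helper(void) { }
  virtual int get_index(const void* data) const = 0;
  virtual double extract_param(
    const void* data,
    const std::string& param_id
  ) const = 0;
};

// Hypothetical concrete helper reading from a name -> value map
class factory_helper_map : public factory_helper
{
public:
  typedef std::map<std::string, double> map_type;

  int get_index(const void* data) const
  {
    // unchecked cast: any other pointee type is undefined behavior
    const map_type& m = *static_cast<const map_type*>(data);
    if(m.empty()) return 0;
    if(m.count("w")) return 1;
    if(m.count("x") && m.count("y") && m.count("z")) return 2;
    return -1;
  }

  double extract_param(const void* data, const std::string& param_id) const
  {
    const map_type& m = *static_cast<const map_type*>(data);
    map_type::const_iterator pos = m.find(param_id);
    if(pos == m.end())
      throw std::runtime_error("missing parameter: " + param_id);
    return pos->second;
  }
};

class vector_factory
{
private:
  const factory_helper& helper; // every call goes through a virtual function
public:
  explicit vector_factory(const factory_helper& h): helper(h) { }

  vector* create(const void* data) const
  {
    switch(helper.get_index(data))
    {
      case 0: return new vector();
      case 1: return new vector(helper.extract_param(data, "w"));
      case 2: return new vector(
        helper.extract_param(data, "x"),
        helper.extract_param(data, "y"),
        helper.extract_param(data, "z")
      );
      default: throw std::runtime_error("no matching constructor");
    }
  }
};

inline void polymorphic_factory_example(void)
{
  factory_helper_map helper;
  vector_factory fact(helper);
  factory_helper_map::map_type data;
  data["w"] = 1.5;
  vector* pv = fact.create(&data);
  assert(pv->x == 1.5 && pv->z == 1.5);
  delete pv;
}
```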

This approach, however, has two important drawbacks. The first is the performance penalty induced by run-time polymorphism, because virtual functions create a function in-lining barrier for the compiler. Depending on the data representation used, this can cause slowdowns of up to an order of magnitude compared to an implementation without virtual function calls. The second issue is the loss of type safety introduced by the opaque data type. This may cause fatal application errors when an incompatible data type is used with a factory. To address these issues one could employ C++ templates and make vector_factory a parameterized class. A concrete implementation of factory_helper could be passed as a policy template parameter to vector_factory, which would replace run-time polymorphism with compile-time polymorphism and remove the need for virtual functions. Each concrete implementation of factory_helper could also define its own concrete Data type, which would no longer have to be opaque to the vector_factory:

  // some class allowing to store constructor description
  class constructor_description { ... };

  // a factory helper for input data in XML format
  class factory_helper_xml
  {
  public:
    typedef xml_element Data;

    double extract_param(
      Data data,
      string param_id,
      constructor_description ctr_desc
    )
    { ... }

    int get_index(Data data, constructor_description ctr_desc)
    { ... }
  };

  // a factory helper for input data from an ODBC dataset
  class factory_helper_odbc { ... };

  template <class FactoryHelper>
  class vector_factory
  {
  private:
    constructor_description ctr_desc(void){ ... }
    FactoryHelper helper;
  public:
    vector* create(typename FactoryHelper::Data data)
    {
      switch(helper.get_index(data, ctr_desc()))
      {
        case 0: return new vector();
        case 1: return new vector(
          helper.extract_param(data, "w", ctr_desc())
        );
        case 2: return new vector(
          helper.extract_param(data, "x", ctr_desc()),
          helper.extract_param(data, "y", ctr_desc()),
          helper.extract_param(data, "z", ctr_desc())
        );
        default: throw exception(...);
      }
      return 0;
    }
  };

This factory would have the following usage in client code:

  vector_factory<factory_helper_xml> fact_xml;
  vector_factory<factory_helper_odbc> fact_dbs;
  vector* pv1 = fact_xml.create(get_xml_node());
  vector* pv2 = fact_dbs.create(query_database());

Now the method of constructor picking and parameter extraction is separate from the factories and can be easily interchanged, reusing the common code. However, this solution requires that vector_factory passes some kind of constructor description to the get_index and extract_param functions, since they are now decoupled from the factory and need this information. In addition, every time the constructors of vector change, the factory needs to be updated accordingly, and if a new class, say triangle, is added, a new triangle_factory class has to be implemented manually. For applications that need to create instances of many different types from external data, this would obviously be tedious and error-prone.
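
The policy-based design can be sketched in compilable form as follows. The layout of constructor_description (a list of parameter names per constructor) and the helper factory_helper_map are assumptions made for this illustration; the text above leaves both unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct vector
{
  double x, y, z;
  vector(void): x(0.0), y(0.0), z(0.0) { }
  vector(double _w): x(_w), y(_w), z(_w) { }
  vector(double _x, double _y, double _z): x(_x), y(_y), z(_z) { }
};

// Assumption: a constructor is described by the names of its parameters
struct constructor_description
{
  std::vector<std::vector<std::string> > param_names;
};

// Hypothetical helper for data held in a name -> value map
class factory_helper_map
{
public:
  typedef std::map<std::string, double> Data;

  double extract_param(const Data& data, const std::string& param_id,
                       const constructor_description&) const
  {
    Data::const_iterator pos = data.find(param_id);
    if(pos == data.end())
      throw std::runtime_error("missing parameter: " + param_id);
    return pos->second;
  }

  // Index of the first constructor whose parameter names are exactly
  // the keys present in the data, or -1 if there is none
  int get_index(const Data& data, const constructor_description& desc) const
  {
    for(std::size_t i = 0; i != desc.param_names.size(); ++i)
    {
      const std::vector<std::string>& names = desc.param_names[i];
      if(names.size() != data.size()) continue;
      bool all_present = true;
      for(std::size_t j = 0; j != names.size(); ++j)
        if(!data.count(names[j])) all_present = false;
      if(all_present) return int(i);
    }
    return -1;
  }
};

template <class FactoryHelper>
class vector_factory
{
private:
  FactoryHelper helper;

  static constructor_description ctr_desc(void)
  {
    constructor_description d;
    d.param_names.resize(3);          // index 0: vector()
    d.param_names[1].push_back("w");  // index 1: vector(double)
    d.param_names[2].push_back("x");  // index 2: vector(double, double, double)
    d.param_names[2].push_back("y");
    d.param_names[2].push_back("z");
    return d;
  }
public:
  vector* create(const typename FactoryHelper::Data& data) const
  {
    const constructor_description desc = ctr_desc();
    switch(helper.get_index(data, desc))
    {
      case 0: return new vector();
      case 1: return new vector(helper.extract_param(data, "w", desc));
      case 2: return new vector(
        helper.extract_param(data, "x", desc),
        helper.extract_param(data, "y", desc),
        helper.extract_param(data, "z", desc)
      );
      default: throw std::runtime_error("no matching constructor");
    }
  }
};

inline void policy_factory_example(void)
{
  vector_factory<factory_helper_map> fact;
  factory_helper_map::Data data;
  data["w"] = 7.0;
  vector* pv = fact.create(data);
  assert(pv->x == 7.0 && pv->z == 7.0);
  delete pv;
}
```

Passing a Data value of the wrong type to create is now a compile-time error, and the helper calls can be in-lined.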

Factory dissection

From the examples above it should be clear that factories are composed of distinct parts, responsible for various aspects of the construction process, which are interwoven. For clarification, the aspects are reiterated here:

Constructor description (meta-data describing the constructors)
Constructor selection
- Based on user preference (in case of user interface-based factories)
- Based on available data (matching the input data to constructor signatures)
Constructor dispatching (i.e. using the selected constructor)
Selecting the sources of the individual argument values
- Existing instances
- New instances (possibly converted from an external representation)
Getting the argument values
- Conversion from the external representation
- Recursive construction by the means of a factory
- Providing existing instances (for example from an object pool)
Input data validation

These tasks fit into two orthogonal categories:

Product type-related (constructor description, constructor dispatching and product instantiation)
Input data representation-related (constructor selection, argument source selection, input data validation and argument value getting)

Parts from each category can be combined with the parts from the other category to create a new factory, which promotes code reusability. Factories constructing a single product type from various input data representations share the product type-related parts and factories creating various products from a single input data representation reuse the components related to this representation. This makes maintenance and bug-fixes much more manageable because a code update in one of the parts propagates to all factories using it. Splitting the parts into the two categories has an additional advantage that they can be implemented separately and independently, provided that a common framework and interfaces for their composition and cooperation are defined.
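
The combination of the two categories can be illustrated with the following sketch, in which any product-related part can be paired with any data-related part. All names here (map_source, pair_list_source, point2d, circle, the builders and the factory template) are hypothetical and unrelated to Mirror's actual interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Input data representation-related parts: each knows one format only
struct map_source
{
  typedef std::map<std::string, double> Data;
  static double get(const Data& data, const std::string& name)
  {
    Data::const_iterator pos = data.find(name);
    if(pos == data.end()) throw std::runtime_error("missing: " + name);
    return pos->second;
  }
};

struct pair_list_source
{
  typedef std::vector<std::pair<std::string, double> > Data;
  static double get(const Data& data, const std::string& name)
  {
    for(std::size_t i = 0; i != data.size(); ++i)
      if(data[i].first == name) return data[i].second;
    throw std::runtime_error("missing: " + name);
  }
};

// Two hypothetical product types
struct point2d
{
  double x, y;
  point2d(double _x, double _y): x(_x), y(_y) { }
};

struct circle
{
  double radius;
  explicit circle(double _r): radius(_r) { }
};

// Product type-related parts: each knows one product's constructor only
struct point2d_builder
{
  typedef point2d product;
  template <class Source>
  static point2d build(const typename Source::Data& data)
  {
    return point2d(Source::get(data, "x"), Source::get(data, "y"));
  }
};

struct circle_builder
{
  typedef circle product;
  template <class Source>
  static circle build(const typename Source::Data& data)
  {
    return circle(Source::get(data, "radius"));
  }
};

// The composition: any builder can be paired with any source
template <class Builder, class Source>
struct factory
{
  typename Builder::product create(const typename Source::Data& data) const
  {
    return Builder::template build<Source>(data);
  }
};

inline void composition_example(void)
{
  map_source::Data m;
  m["x"] = 1.0; m["y"] = 2.0; m["radius"] = 3.0;
  factory<point2d_builder, map_source> point_from_map;
  factory<circle_builder, map_source> circle_from_map;
  assert(point_from_map.create(m).y == 2.0);
  assert(circle_from_map.create(m).radius == 3.0);

  pair_list_source::Data p;
  p.push_back(std::make_pair(std::string("radius"), 4.0));
  factory<circle_builder, pair_list_source> circle_from_pairs;
  assert(circle_from_pairs.create(p).radius == 4.0);
}
```

A fix in map_source::get reaches every factory that reads from a map, and a change to circle's constructor touches circle_builder only.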

Initialization and cleanup

Another important aspect of factory design and implementation is that instances of a factory may need to be properly initialized before, and cleaned up after, their use. Consider a GUI-based factory, where during the product construction a dialog is presented to the user, containing the necessary visual components for the selection of the constructor to be used and for the input of all required argument values. It would be a waste of resources to recreate the dialog from scratch for every constructed instance. The dialog should be created once, re-used to construct multiple instances of Product, and then freed to prevent resource leaks. Factories gathering input data from other data sources may require similar initialization and cleanup procedures.
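
One common way to obtain this behavior in C++ is to tie the resource to the lifetime of the factory object. The following sketch assumes a hypothetical input_dialog class standing in for a real GUI dialog; it returns fixed values instead of asking a user and counts live instances so that the create-once, free-once behavior can be observed.

```cpp
#include <cassert>

struct vector
{
  double x, y, z;
  vector(void): x(0.0), y(0.0), z(0.0) { }
  vector(double _w): x(_w), y(_w), z(_w) { }
  vector(double _x, double _y, double _z): x(_x), y(_y), z(_z) { }
};

// Hypothetical stand-in for an expensive resource such as a GUI dialog
class input_dialog
{
public:
  // number of dialogs currently alive
  static int& live_count(void) { static int n = 0; return n; }

  input_dialog(void) { ++live_count(); }   // initialization
  ~input_dialog(void) { --live_count(); }  // cleanup

  // a real dialog would ask the user; this one returns a fixed value
  double ask(const char*) { return 1.0; }
private:
  input_dialog(const input_dialog&);            // non-copyable
  input_dialog& operator=(const input_dialog&);
};

class dialog_vector_factory
{
private:
  input_dialog dialog; // created once, together with the factory
public:
  vector create(void)
  {
    double x = dialog.ask("x");
    double y = dialog.ask("y");
    double z = dialog.ask("z");
    return vector(x, y, z);
  }
}; // the dialog is destroyed together with the factory

inline void dialog_factory_example(void)
{
  assert(input_dialog::live_count() == 0);
  {
    dialog_vector_factory fact;
    vector a = fact.create();
    vector b = fact.create();
    // both products were made with the same single dialog
    assert(input_dialog::live_count() == 1);
    assert(a.x == 1.0 && b.z == 1.0);
  }
  assert(input_dialog::live_count() == 0);
}
```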

Mirror's factory generator

One way to partially or fully automate the process of factory implementation is to employ compile-time reflection and template meta-programming. If a compile-time reflection facility can provide meta-data describing the constructors of a class, these meta-data can be transformed by means of a template meta-program into the product-related parts usable in a factory implementation as described above. The Factory generator utility implemented by Mirror provides a framework for the automatic generation of factories constructing objects of a particular type from a particular external data representation. It defines three abstract concepts called Manufacturer, Manager and Suppliers and provides several reusable implementations of these concepts. The concepts define the interface for real classes handling the input data representation-related aspects of the construction. The Mirror library also provides the necessary meta-data describing the constructors, and the factory generator composes these meta-data with the application-specific Manufacturer, Manager and Suppliers into a concrete factory class. Instances of the generated factory can then be used on their own or as building blocks of other classes.

As briefly explained elsewhere, to generate a new factory one needs to supply the factory generator with two templates that share the following signature:

  template <class Product, class Traits>
  struct Manufacturer { ... };

  template <class Product, class Traits>
  struct Suppliers { ... };

Both these templates are responsible for creating the instances that are passed as arguments to a constructor during the construction. The fundamental difference between a Manufacturer and the Suppliers is that a Manufacturer should return a new instance, either created from scratch or converted from an external representation. The Suppliers, on the other hand, should return existing instances (for example from various object pools) if available. The Suppliers are then used in copy constructors and the Manufacturer in all other constructors having parameters.
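
The difference between the two roles can be illustrated with the following simplified sketch. It is not Mirror's actual interface: the constructor arguments, the text-based conversion, the pool member and the example_traits type are all assumptions made for this illustration. Only the template signature and the Product-returning call operator follow the description in this document.

```cpp
#include <cassert>
#include <deque>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical traits type; its contents are not relevant to this sketch
struct example_traits { };

// Manufacturer: returns a NEW instance on every call, here converted
// from an external (textual) representation
template <class Product, class Traits>
struct Manufacturer
{
  std::string text;

  explicit Manufacturer(const std::string& t): text(t) { }

  Product operator()(void) const
  {
    std::istringstream in(text);
    Product result = Product();
    if(!(in >> result))
      throw std::runtime_error("cannot convert: " + text);
    return result;
  }
};

// Suppliers: returns an EXISTING instance, here taken from a simple pool
template <class Product, class Traits>
struct Suppliers
{
  std::deque<Product> pool;

  Product operator()(void) const
  {
    if(pool.empty())
      throw std::runtime_error("no existing instance available");
    return pool.front();
  }
};

inline void functor_example(void)
{
  Manufacturer<int, example_traits> make_age("42");
  assert(make_age() == 42);

  Suppliers<std::string, example_traits> names;
  names.pool.push_back("existing");
  assert(names() == "existing");
}
```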

Both the Manufacturer and the Suppliers must have a function call operator returning the Product type. Instances of the instantiations of these templates are thus Product-returning functors, which are used by the generated factory in the following way. Suppose for example that a class Person has these two constructors:

  class Person
  {
  public:
    Person(string name, string surname, int age, double height_in_m);
    Person(const Person& p);
  };

If the first constructor is used, the factory uses four Manufacturer functors, one per parameter. They are instances of three distinct instantiations of the Manufacturer template, with the Product template parameter being string, int and double:

  // .. in an imaginary code deep in the factory generator sources
  Manufacturer<string, Traits> _name;
  Manufacturer<string, Traits> _surname;
  Manufacturer<int, Traits> _age;
  Manufacturer<double, Traits> _height_in_m;

  Product do_create(integral_constant<int, 0>)
  {
     return Product(
       _name(),
       _surname(),
       _age(),
       _height_in_m()
     );
  }

To support the second (copy) constructor the factory also does this:

  // not so far from the previous code
  Suppliers<Person, Traits> _Person_pool;

  Product do_create(integral_constant<int, 1>)
  {
    return Product(_Person_pool());
  }

The factory then needs to decide, based on the available external data, which of these two constructors to use. This is a job for the Manager, which is by convention a Manufacturer with void as Product. Thus somewhere inside the generated factory (a variation of) this code is also present:

  Manufacturer<void, Traits> manager;

  Product operator()(void)
  {
    return pick_and_create(manager.index());
  }
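
The fragments above can be assembled into one self-contained sketch of what a generated factory for Person might look like. This is a hand-written imitation and not the output of Mirror's generator: the class person_factory, its constructor arguments, the value-holding Manufacturer, the pointer-holding Suppliers and the example_traits type are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>

struct Person
{
  std::string name, surname;
  int age;
  double height_in_m;
  Person(const std::string& n, const std::string& s, int a, double h)
   : name(n), surname(s), age(a), height_in_m(h) { }
  // the copy constructor is implicitly defined
};

struct example_traits { }; // hypothetical traits type

// Simplified Manufacturer: hands out a value fixed at construction
template <class Product, class Traits>
struct Manufacturer
{
  Product value;
  explicit Manufacturer(const Product& v): value(v) { }
  Product operator()(void) const { return value; }
};

// The Manager: a Manufacturer with void as Product, picks the constructor
template <class Traits>
struct Manufacturer<void, Traits>
{
  int chosen;
  explicit Manufacturer(int c): chosen(c) { }
  int index(void) const { return chosen; }
};

// Simplified Suppliers: returns a copy of an existing instance
template <class Product, class Traits>
struct Suppliers
{
  const Product* existing;
  explicit Suppliers(const Product* p): existing(p) { }
  Product operator()(void) const
  {
    if(!existing) throw std::runtime_error("no existing instance");
    return *existing;
  }
};

template <class Traits>
class person_factory
{
private:
  typedef Person Product;

  Manufacturer<void, Traits> manager;
  Manufacturer<std::string, Traits> _name;
  Manufacturer<std::string, Traits> _surname;
  Manufacturer<int, Traits> _age;
  Manufacturer<double, Traits> _height_in_m;
  Suppliers<Person, Traits> _person_pool;

  // constructor 0: Person(string, string, int, double)
  Product do_create(std::integral_constant<int, 0>) const
  {
    return Product(_name(), _surname(), _age(), _height_in_m());
  }

  // constructor 1: Person(const Person&)
  Product do_create(std::integral_constant<int, 1>) const
  {
    return Product(_person_pool());
  }

  // turns the run-time index into a compile-time tag
  Product pick_and_create(int index) const
  {
    switch(index)
    {
      case 0: return do_create(std::integral_constant<int, 0>());
      case 1: return do_create(std::integral_constant<int, 1>());
      default: throw std::out_of_range("no such constructor");
    }
  }
public:
  person_factory(int ctr_index, const std::string& n, const std::string& s,
                 int a, double h, const Person* existing)
   : manager(ctr_index), _name(n), _surname(s), _age(a)
   , _height_in_m(h), _person_pool(existing) { }

  Product operator()(void) const
  {
    return pick_and_create(manager.index());
  }
};

inline void generated_factory_example(void)
{
  person_factory<example_traits> make(0, "John", "Doe", 30, 1.8, 0);
  Person p = make();
  assert(p.name == "John" && p.age == 30);

  person_factory<example_traits> copy(1, "", "", 0, 0.0, &p);
  assert(copy().surname == "Doe");
}
```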

TODO:

Copyright © 2006-2011 Matus Chochlik, University of Zilina, Zilina, Slovakia.
<matus.chochlik -at- fri.uniza.sk>
<chochlik -at -gmail.com>
Documentation generated on Fri Dec 16 2011 by Doxygen (version 1.7.3).
Important note: Although the 'boostified' version of Mirror uses the Boost C++ libraries Coding Guidelines and is implemented inside of the boost namespace, it IS NOT an officially reviewed and accepted Boost library. Mirror is being developed with the intention to be submitted for review for inclusion to the Boost C++ libraries.