James Walker



A programming impedance mismatch occurs when data needs to be transformed into a different architectural paradigm. The most prominent example involves object-oriented codebases and relational databases.

The mismatch surfaces whenever data is fetched from or inserted into a database, because the properties of objects or classes within the codebase need to be mapped to their corresponding database table fields.

Mapping and Relationships

Your classes won’t necessarily map directly to individual database tables. The construction of an object might require data from several tables to be used in aggregate.

You’ll also need to handle relationships between your data. Relational databases make this straightforward by allowing you to reference other records. You can use a JOIN to access all the data encapsulated by the relationship.

CREATE TABLE productTypes(
    ProductTypeUuid VARCHAR(32) PRIMARY KEY,
    ProductTypeName VARCHAR(255) NOT NULL UNIQUE
);
 
CREATE TABLE products(
    ProductUuid VARCHAR(32) PRIMARY KEY,
    ProductName VARCHAR(255) NOT NULL UNIQUE,
    ProductType VARCHAR(32) NOT NULL,
    FOREIGN KEY (ProductType) REFERENCES productTypes(ProductTypeUuid) ON DELETE CASCADE
);

Using plain SQL, you could get the combined properties of a Product and its ProductType with this simple query:

SELECT * FROM products
INNER JOIN productTypes ON ProductTypeUuid = ProductType;

The properties of the Product and its ProductType are then accessed from the same flat structure:

echo $record["ProductName"];
echo $record["ProductTypeName"];

This flat array quickly becomes limiting in complex applications. Developers naturally model the Product and ProductType entities as separate classes. The Product class could then hold an instance of a ProductType. Here’s how that looks:

final class ProductType {
 
    public function __construct(
        public string $Uuid,
        public string $Name) {}
 
}
 
final class Product {
 
    public function __construct(
        public string $Uuid,
        public string $Name,
        public ProductType $ProductType) {}
 
}

There’s now a significant impedance mismatch in the code. Some form of specialised mapping is required before records from the database query can be represented as Product instances.
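
To make that mapping step concrete, here is a minimal sketch of hydrating a Product from one flat row of the JOIN query. The hydrateProduct() function and the sample row values are illustrative assumptions, not part of any library. The classes are repeated so the snippet runs on its own under PHP 8.

```php
<?php

final class ProductType {
    public function __construct(
        public string $Uuid,
        public string $Name) {}
}

final class Product {
    public function __construct(
        public string $Uuid,
        public string $Name,
        public ProductType $ProductType) {}
}

/**
 * Hypothetical mapper: turns one flat JOIN row into nested objects.
 */
function hydrateProduct(array $record): Product {
    // The type columns are split out of the flat row into their own object
    $type = new ProductType(
        $record["ProductTypeUuid"],
        $record["ProductTypeName"]
    );
    return new Product(
        $record["ProductUuid"],
        $record["ProductName"],
        $type
    );
}

// Sample row shaped like the result of the INNER JOIN (values are made up)
$record = [
    "ProductUuid" => "p1",
    "ProductName" => "Desk Lamp",
    "ProductType" => "t1",
    "ProductTypeUuid" => "t1",
    "ProductTypeName" => "Lighting",
];

$product = hydrateProduct($record);
echo $product->ProductType->Name, PHP_EOL;
```

Even in this small case the mapper has to know which columns belong to which class, and that knowledge lives outside both the schema and the class definitions.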

Further complications arise when you want to access all the products of a particular type. Here’s how you might do that in code:

final class ProductType {
 
    public function __construct(
        public string $Uuid,
        public string $Name,
        public ProductCollection $Products) {}
 
}

Now the ProductType holds a ProductCollection, which would ultimately contain an array of Product instances. This creates a bi-directional relational reference – ProductType contains all its products and each Product contains its product type.

This form of modelling doesn’t exist within the relational paradigm. In the database, the connection is stored only once, as the foreign key on the products table. The use of bi-directional references simplifies developer access to object properties. However, it requires more complex mapping when transferred to and from the database. This is because SQL doesn’t natively understand the semantics of the model.
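
The extra mapping work can be sketched as follows. ProductCollection is never defined in this article, so a minimal version is assumed here, together with a hypothetical hydrateProductType() helper. The sample rows are made up and are assumed to all belong to the same product type.

```php
<?php

// Assumed minimal collection; the article does not define ProductCollection
final class ProductCollection {
    /** @var Product[] */
    private array $products = [];

    public function add(Product $product): void {
        $this->products[] = $product;
    }

    /** @return Product[] */
    public function all(): array {
        return $this->products;
    }
}

final class ProductType {
    public function __construct(
        public string $Uuid,
        public string $Name,
        public ProductCollection $Products) {}
}

final class Product {
    public function __construct(
        public string $Uuid,
        public string $Name,
        public ProductType $ProductType) {}
}

/**
 * Hypothetical mapper: builds one ProductType and its products from
 * JOIN rows that all belong to the same type.
 */
function hydrateProductType(array $records): ProductType {
    // The type must exist before its products can point back at it,
    // so it starts with an empty collection that is filled afterwards
    $first = $records[0];
    $type = new ProductType(
        $first["ProductTypeUuid"],
        $first["ProductTypeName"],
        new ProductCollection()
    );
    foreach ($records as $record) {
        $type->Products->add(
            new Product($record["ProductUuid"], $record["ProductName"], $type)
        );
    }
    return $type;
}

$records = [
    ["ProductUuid" => "p1", "ProductName" => "Desk Lamp", "ProductTypeUuid" => "t1", "ProductTypeName" => "Lighting"],
    ["ProductUuid" => "p2", "ProductName" => "Floor Lamp", "ProductTypeUuid" => "t1", "ProductTypeName" => "Lighting"],
];

$type = hydrateProductType($records);
echo count($type->Products->all()), PHP_EOL;
```

The order of construction matters: neither object can be fully built before the other exists, which is why the collection is created empty and populated in a second step.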

Considering Hierarchy

The model explained above creates a hierarchy in the codebase: Product sits below ProductType. This sounds logical and matches our expectations of the real world.

Relational databases don’t respect hierarchies. As all relationships are equivalent, relational databases have an inherently “flat” structure. We saw this earlier when fetching data with a JOIN.

The lack of hierarchy in SQL means all tables possess an equivalent priority to each other. An effect of this is that you can easily access the properties of records nested deep in your logical object hierarchy. Furthermore, there’s a lower risk of cyclical dependencies.

The example above shows that Product and ProductType can end up referring to each other in code; the flat nature of relational databases would prevent this specific example occurring. Cycles can still crop up in plain SQL but you’re less likely to encounter them than when modelling with object-oriented code.

Object-oriented programming relies on the composition of simple objects into more complex ones. Relational models have no such notion of composition, or of “simple” and “complex”: any record can reference any other.

Inheritance

Another OOP-exclusive feature is inheritance. It is common for a class to extend another, adding new behaviours. Relational databases are unable to replicate this. It’s impossible for a table to “extend” another table.

A codebase that uses inheritance will encounter difficulties when child objects are persisted or hydrated via a relational database. Within the database, you’ll usually need two tables. One stores the properties of the base object (that you’re extending), with another handling the properties of the child.

The mapping code then iterates over all the properties of the object. Properties deriving from the extended class will be inserted into the first table. Those directly defined on the child will end up in the second table.
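
One way to sketch the step described above is with PHP’s reflection API, which reports the class each property was declared on. The class names ParentItem and ChildItem, the splitByTable() helper and the class-to-table map are all assumptions made for illustration.

```php
<?php

// Class names are assumptions; PHP reserves "parent", so it cannot name a class
class ParentItem {
    public function __construct(public int $A) {}
}

final class ChildItem extends ParentItem {
    public function __construct(int $A, public int $B) {
        parent::__construct($A);
    }
}

/**
 * Hypothetical helper: groups an object's property values by the table
 * that stores the class each property was declared on.
 *
 * @param array<string, string> $tables map of class name => table name
 */
function splitByTable(object $obj, array $tables): array {
    $rows = [];
    foreach ((new ReflectionObject($obj))->getProperties() as $property) {
        // getDeclaringClass() reports where the property was defined,
        // which separates inherited properties from the child's own
        $class = $property->getDeclaringClass()->getName();
        $rows[$tables[$class]][$property->getName()] = $property->getValue($obj);
    }
    return $rows;
}

$rows = splitByTable(
    new ChildItem(1, 2),
    [ParentItem::class => "parent", ChildItem::class => "child"]
);
print_r($rows);
```

A real mapper would also need to write the linking key between the two rows and handle non-public properties, both of which are left out of this sketch.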

A similar system is required when mapping back to the codebase from the database. A SQL JOIN could be used to get all records corresponding to the base (extended) class, with the child properties included. Those records that contained the child properties would then be mapped to child class instances.

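CREATE TABLE parent(Id INTEGER PRIMARY KEY, A INTEGER);
CREATE TABLE child(Id INTEGER PRIMARY KEY, B INTEGER, ParentId INTEGER);

// "parent" is a reserved class name in PHP, so the classes carry an Item suffix
class ParentItem {
    public function __construct(public int $A) {}
}
 
final class ChildItem extends ParentItem {
    public function __construct(int $A, public int $B) {
        parent::__construct($A);
    }
}
 
// $records holds the rows returned by this query. The LEFT JOIN keeps
// parent rows that have no child row, leaving B as NULL for them:
// SELECT parent.A, child.B FROM parent LEFT JOIN child ON child.ParentId = parent.Id;
$objs = [];
foreach ($records as $record) {
    // isset() is false for NULL, so only rows with child data become ChildItem
    if (isset($record["B"])) {
        $objs[] = new ChildItem($record["A"], $record["B"]);
    } else {
        $objs[] = new ParentItem($record["A"]);
    }
}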

The introduction of inheritance necessitates the use of more involved mapping logic. The inability of relational databases to model the inheritance capabilities of object-oriented languages introduces this impedance mismatch.

Visibility and Encapsulation

A fundamental tenet of object-oriented programming is visibility and encapsulation. Properties are declared as public or private. An object’s internal representation can be concealed behind an interface that pre-formats data for the consumer.

Relational databases lack these controls. There’s no way to declare a field as private, nor is there an obvious need to do so. Each piece of data stored in a database has its own purpose; there should be no redundancy.

This isn’t necessarily true when transposed to an object-oriented paradigm. An object might use two database fields in aggregate to expose a new piece of computed information. The values of the two individual fields may be irrelevant to the application and consequently hidden from view.

CREATE TABLE products(Price INTEGER, TaxRate INTEGER);

final class Product {
 
    public function __construct(
        protected int $Price,
        protected int $TaxRate) {}
 
    public function getTotalPrice() : int {
        // Price plus tax; TaxRate is a percentage, so 20 means 20%
        return (int) round($this->Price * (1 + $this->TaxRate / 100));
    }
}

The application only cares about the total product price, including taxes. The unit price and tax rate are encapsulated by the Product class. Visibility controls (protected) conceal the values. The public interface consists of a single method that uses the two fields to compute a new value.

More generally, object-oriented programming advocates programming to interfaces. Direct access to properties is discouraged. Using class methods that implement an interface enables alternative implementations to be constructed in the future.
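
As a rough illustration of programming to an interface, the sketch below defines a hypothetical Priced interface with two implementations. The names Priced, TaxedProduct, FixedPriceProduct and formatPrice() are assumptions, as is the convention that prices are stored as integers in minor currency units.

```php
<?php

// Hypothetical interface: consumers depend on this, not on stored fields
interface Priced {
    public function getTotalPrice(): int;
}

// Computes the total from a stored price and a percentage tax rate
final class TaxedProduct implements Priced {
    public function __construct(
        protected int $Price,
        protected int $TaxRate) {}

    public function getTotalPrice(): int {
        return (int) round($this->Price * (1 + $this->TaxRate / 100));
    }
}

// Alternative implementation with a fixed total and no tax fields at all
final class FixedPriceProduct implements Priced {
    public function __construct(protected int $Total) {}

    public function getTotalPrice(): int {
        return $this->Total;
    }
}

// Calling code only sees the interface, never the underlying fields
function formatPrice(Priced $item): string {
    return number_format($item->getTotalPrice() / 100, 2);
}

echo formatPrice(new TaxedProduct(1000, 20)), PHP_EOL;
echo formatPrice(new FixedPriceProduct(1500)), PHP_EOL;
```

Because formatPrice() accepts any Priced object, the second implementation can be swapped in without the calling code knowing that it maps to different database fields.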

There is no direct counterpart to this in the relational world. Database views do provide a way to combine tables and abstract fields into new forms. You’re still working directly with field values though.

Summary

Object-relational impedance mismatches occur when an object-oriented codebase exchanges data with a relational database. There are fundamental differences in the way in which data is modelled. This necessitates the introduction of a mapping layer that transposes data between the formats.

The impedance mismatch problem is one of the key motivating factors in the adoption of ORMs within object-oriented languages. These enable the automatic hydration of complex codebase objects from relational data sources. Without a dedicated data mapping layer, only the simplest of applications will have a straightforward pathway to and from a relational database.
