Models That Match Reality

Data modeling is a big part of software design, and good models should
accurately approximate reality. Ideas from modeling in other disciplines, from
set theory, and from an aphorism in the Elm community can help us design better
models.

Inspired by a discussion on narrowing types from the Elm Discourse.

When we think of the word “model” in software, we often think of the M in
model-view-controller (MVC). If you’re from the Elm community, you might think
of the Model type that stores an application’s state in the Elm Architecture.

More broadly, models are simplified constructions that try to approximate and
describe reality. A model might be physical, such as a model train or a scale
model of a city. It can also be logical, such as a weather model, which is a set
of mathematical equations that try to describe weather systems.

Photo of a scale model of an 18th century city. In the foreground are some docks
along a river and the lower town containing many row-houses. Towering above the
lower town on top of a large bluff is the upper town, surrounded by stone walls.
Two large churches feature prominently. In the background, farmland can be seen
stretching away to the horizon.
Duberger scale model of Québec City, created in 1806 for the military
authorities. Jeangagnon, CC BY-SA 3.0 via Wikimedia Commons.

What about our models in the software world? They also try to approximate
reality. For example, a Registration might try to describe some aspects of a
business process in a vacation booking application.

Real-life customers and processes can be infinitely complex and we can’t capture
all of that in our modeling. Instead, we approximate and focus on the
characteristics that matter in the context of this particular application.

Just because all models are approximations doesn’t mean they can’t be

I find it helpful to think of reality and our model as sets. If we had a perfect
model, the set of values our model can describe and the set of values in reality
would be identical. If we drew this as a Venn diagram, the two circles would
perfectly overlay each other.

Venn diagram with a green circle labeled 'reality' perfectly overlapped by a blue octagon labeled 'our model'
A model that accurately describes reality

In practice, our models often exclude important details, or more commonly,
include values that are not part of the domain we are trying to describe. As
programmers, we often label these as “invalid values”.

Venn diagram with a green circle labeled 'reality', partly overlapped by a blue octagon labeled 'our model'. The part of the octagon that does not overlap the circle is labeled 'invalid values'.
An inaccurate model that describes a lot of values not part of reality

For example, as part of modeling a registration process we might ask a customer
if they want a one-way or round-trip flight. We might describe this process with
two booleans. I’m going to use Elm types to show concrete modeling examples
below, but the concept applies regardless of your modeling language.

type alias Flight =
  { roundTrip : Bool
  , oneWay : Bool
  }

This model is too broad. Reality says there are only 2 kinds of flights,
one-way and round-trip, and that these two choices are mutually exclusive.
However, our model describes 4 different kinds of flights, including 2 that are
nonsensical (both round-trip and one-way, and neither round-trip nor one-way).

In set terms, we would say that the “reality” set has a cardinality of 2,
while the “model” set has a cardinality of 4. Remember, we’re trying to get both
sets to be identical so this tells us that our model is too permissive. If our
program gets into one of these states, we will likely get garbage output.

We can try to create a different model that better describes the reality of
flight choices. The following can be read as “a Flight is OneWay OR RoundTrip”.
type Flight = OneWay | RoundTrip

Now, both our “reality” and “model” sets have a cardinality of 2 (and both
contain the same 2 values). Our model is much more accurate.

While the examples in the section above used Elm types, these modeling ideas are
not restricted to typed languages. We could do something similar with a Ruby on
Rails model as well:

class Flight < ApplicationRecord
  enum direction: [:one_way, :round_trip]
end
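
To make the set comparison concrete without Rails, here is a minimal plain-Ruby
sketch. BooleanFlight, DirectionFlight, and ALL_BOOLEAN_FLIGHTS are hypothetical
names invented for this illustration, and DirectionFlight only loosely imitates
the idea of restricting a field to a known list of values. It is not how Rails
implements enum.

```ruby
# Plain-Ruby sketch (no Rails). All names here are hypothetical.

# Model 1: two independent booleans, mirroring the two-boolean record.
BooleanFlight = Struct.new(:round_trip, :one_way) do
  # A flight matches reality only when exactly one flag is set.
  def valid?
    round_trip ^ one_way
  end
end

# Every value the two-boolean model is able to describe.
ALL_BOOLEAN_FLIGHTS =
  [true, false].product([true, false]).map do |round_trip, one_way|
    BooleanFlight.new(round_trip, one_way)
  end

# Model 2: a single direction restricted to the two real choices.
class DirectionFlight
  DIRECTIONS = %i[one_way round_trip].freeze

  attr_reader :direction

  def initialize(direction)
    # Reject anything outside the set of real flight kinds.
    unless DIRECTIONS.include?(direction)
      raise ArgumentError, "unknown direction: #{direction.inspect}"
    end

    @direction = direction
  end
end

puts ALL_BOOLEAN_FLIGHTS.size            # prints 4: the two-boolean model set
puts ALL_BOOLEAN_FLIGHTS.count(&:valid?) # prints 2: values that exist in reality
puts DirectionFlight::DIRECTIONS.size    # prints 2: the direction model set
```

The two-boolean model needs a valid? check because half of its values are
nonsensical. The direction model can refuse those values at construction time,
so no such check is needed afterwards.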

Elm programmers often refer to this kind of modeling improvement as “making
impossible states impossible”. You’ll often hear this brought up in discussions
about what types to use. However, the big idea behind it is much broader: align
your modeling with the reality you are trying to describe.

Want to learn more about data modeling? Here are some helpful resources from our
blog and around the internet.
