The more I dig into the logic of Perl, the more I find that it uses constructive (intuitionist) instead of classical logic. But Perl developers think in terms of the latter, not the former. This causes us to often reason incorrectly about our software.

For example, if we have this:

if ( $x < 3 ) {
    ...
} 
else {
    ...
}

We tend to think that, in the else block, $x is a number and $x >= 3 holds. That might be true in a strongly typed language (one that uses classical logic), but it’s not necessarily true in either Perl or constructive logic.

Here’s a counter-example in Perl:

my $x = {};  # a reference to a hash
if ($x < 3) {
    say "if";
}
else {
    say "else";
}

That will print else because, in numeric context, a reference evaluates to its memory address: my $x = {}; say $x + 0; will print a number like 4966099583. That’s the address of the hash that $x refers to.

You won’t even get a warning.

Not useful.
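To check this yourself, here is a minimal sketch using only core modules. It assumes a reasonably modern Perl (5.10 or later for say); refaddr comes from the core Scalar::Util module, and the warning handler is there only to confirm that nothing is emitted.

```perl
use strict;
use warnings;
use feature 'say';
use Scalar::Util qw(refaddr);

# Collect any warnings so we can confirm that none are raised.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $x         = {};        # a reference to a hash
my $as_number = $x + 0;    # numeric context yields the referent's address

say 'numified reference equals refaddr' if $as_number == refaddr($x);
say 'took the else branch'              unless $x < 3;
say 'warnings raised: ', scalar @warnings;
```

The comparison is perfectly legal as far as Perl is concerned, which is exactly why the else branch tells us so little.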

So the else branch in Perl simply says the previous case didn’t hold, but it says nothing about the actual value of $x, or even whether it’s a number at all.

We can assert that $x is an integer and die if it’s not, but many don’t bother with that.
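One way to make that assertion is sketched below. The assert_integer helper is hypothetical (it is not part of Perl or of any module mentioned here), and the regular expression is a deliberately strict definition of “integer” that you may want to loosen.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: die unless the argument is a plain integer.
sub assert_integer {
    my ($value) = @_;
    croak 'Expected an integer, got undef'       unless defined $value;
    croak 'Expected an integer, got a reference' if ref $value;
    croak "Expected an integer, got '$value'"    unless $value =~ /\A-?[0-9]+\z/;
    return $value;
}

# Only values that pass the assertion survive the grep.
my @accepted = grep { eval { assert_integer($_); 1 } }
  ( 2, 7, {}, undef, '3.5', 'abc' );
```

Once a value has passed a check like this, reaching the else branch of if ( $x < 3 ) really does mean $x >= 3.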

Classical versus Constructive

In classical logic, there’s the law of the excluded middle. Something is either true or it’s not. In constructive logic, things are true, not true, or other (unknown).

If $x is an integer, we know that it’s either 3 or it’s not, but we might not yet know which.

Note: Constructive (intuitionist) logic isn’t a mere mathematical curiosity. The well-known physicist Nicolas Gisin has written a short, easy-to-read paper showing how intuitionist logic might prove that there really is an “arrow of time” in physics.

I find the idea of constructive logic useful enough that I wrote a Perl module named Unknown::Values.

I proposed adding this new, three-value logic, or 3VL (true, false, unknown), to Perl and while there was some interest, some practical problems reared their ugly heads and the idea went nowhere.

Interestingly, one of the counterpoints was based on the confusion between classical and constructive logic. Suppose we complement undef with a new unknown keyword (which salary might return):

# using undef values:
my @overpaid = grep {
    defined $_->salary
    &&
    $_->salary > $cutoff
} @employees;

# same thing with unknown values:
my @overpaid = grep { $_->salary > $cutoff } @employees;

As you can see, using unknown values instead of undef values makes the code shorter and easier to read. unknown salaries would not pass the grep. You don’t have to remember the defined check. It just works. Of course, you can run it in the other direction:

my @underpaid = grep { $_->salary <= $cutoff } @employees;

And again, unknown values wouldn’t pass the grep. If you need those employees:

my @unknown_salary = grep { is_unknown $_->salary } @employees;
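To make the three greps concrete, here is a self-contained sketch. Sketch::Unknown and Sketch::Employee are stand-ins invented for this example, and the local is_unknown only mimics the function of the same name used above; none of this is the actual implementation of Unknown::Values. The only assumption is that every comparison against an unknown value returns false. It needs Perl 5.14 or later for the package block syntax.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Stand-in for an unknown value: every comparison is false.
package Sketch::Unknown {
    use overload
      '<'  => sub { 0 },
      '<=' => sub { 0 },
      '>'  => sub { 0 },
      '>=' => sub { 0 };
    sub new { return bless {}, shift }
}

# Minimal employee class with the two accessors the example needs.
package Sketch::Employee {
    sub new    { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub name   { return $_[0]{name} }
    sub salary { return $_[0]{salary} }
}

# Local stand-in for the is_unknown check.
sub is_unknown {
    my ($value) = @_;
    return blessed($value) && $value->isa('Sketch::Unknown');
}

my $cutoff    = 50_000;
my @employees = (
    Sketch::Employee->new( name => 'Alice', salary => 90_000 ),
    Sketch::Employee->new( name => 'Bob',   salary => 40_000 ),
    Sketch::Employee->new( name => 'Carol', salary => Sketch::Unknown->new ),
);

my @overpaid       = map { $_->name } grep { $_->salary > $cutoff } @employees;
my @underpaid      = map { $_->name } grep { $_->salary <= $cutoff } @employees;
my @unknown_salary = map { $_->name } grep { is_unknown( $_->salary ) } @employees;
```

Carol appears in neither the overpaid nor the underpaid list, and only the explicit is_unknown check finds her.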

But this was presented as a counter-argument:

if ( $x < 3 ) {
    ...;
}
else {
    ...;
}

With 3VL, surely the else would also have to be skipped because we don’t know if the condition is false? Thus, we could have an if/else block where the entire construct is skipped. That would be confusing madness!

Except that in 3VL, if the if condition doesn’t hold, the else can still fire because it represents an unknown value, as shown in the $x = {} example above.

The counter-argument might have worked for Java:

if ( someVar < 3 ) {
    System.out.println("Less Than");
}
else {
    System.out.println("Not Less Than");
}

In the above, if you hit the else block, you know both that:

someVar is a number.
someVar is not less than three.

Thus, Java follows classical logic. A statement is either true or false. There is no middle ground. Perl follows constructive logic, even though we tend to think (and program) in terms of classical logic.

Ignoring the type, for Perl this is more correct:

if ( $x < 3 ) {
    ...;
}
elsif ( $x >= 3 ) {
    ...;
}
else {
    ...; # something went wrong
}

That’s because until we explicitly make a positive assertion about a value, we cannot know if a statement is true or false. Thus, in Perl, an else block is frequently not a negation of the if condition, but a catch block for conditions which haven’t held.
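The three-branch form can be exercised with a small sketch. Sketch::Unknown is a hypothetical stand-in whose comparisons are always false (it is not the Unknown::Values implementation), and classify is a name made up for this example.

```perl
use strict;
use warnings;

# Stand-in for an unknown value: both comparisons used below are false.
package Sketch::Unknown {
    use overload
      '<'  => sub { 0 },
      '>=' => sub { 0 };
    sub new { return bless {}, shift }
}

# Make a positive assertion in each branch; else catches whatever is left.
sub classify {
    my ($x) = @_;
    if    ( $x < 3 )  { return 'less than 3' }
    elsif ( $x >= 3 ) { return 'at least 3' }
    else              { return 'neither' }
}

my @results = map { classify($_) } ( 1, 5, Sketch::Unknown->new );
```

Here 1 and 5 land in the first two branches, while the stand-in unknown falls through to the final else. A floating-point NaN, where the platform supports it, should take the same path.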

Again, Perl developers (and most developers using languages with dynamic types) tend to think in terms of classical logic when, in fact, we’re using constructive logic. It usually works until it doesn’t.

How Can We Fix This?

brian d foy found my Unknown::Values module interesting enough that he felt it should be in the Perl core:

Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core. https://t.co/xmD8yoEjIK
— brian d foy (@briandfoy_perl) December 17, 2021

His observations match my concerns with using undef in Perl. However, when I proposed this to the Perl 5 Porters, while some found it interesting, there were some serious concerns. In particular, Nicholas Clark wrote (edited for brevity):

For this:

$bar = $hash{$foo};

If $foo happens to be unknown, is $bar always unknown? Or is it undef if and only if %hash is empty?

That behaviour is arguably more consistent with what unknown means than the “always return unknown”.

...

Does trying to use an unknown value as a file handle trigger an exception? Or an infinite stream of unknowns on read attempts?

What is the range [$foo .. 42] where $foo is unknown?

I think that most logically it’s an empty list, but that does seem to end up eliminating unknown-ness. Hence if we have first class unknowns, should we be able to have arrays of unknown length?

Ovid: It has a high-value win in eliminating common types of errors we currently deal with

And massive risk in introducing a lot of surprises in code not written to expect unknowns, that is passed one within a data structure.

Basically all of CPAN.

...

There are about 400 opcodes in perl. I suspect that >90% are easy to figure out for “unknown” (for example as “what would a NaN do here?”) but a few really aren’t going to be obvious, or end up being trade offs between conceptual correctness and what it’s actually possible to implement.

Needless to say, this pretty much shot down the idea and these were annoyingly fair points.

In other words, leaking this abstraction could break a lot of code and people will be confused. For example, if DBIx::Class were to suddenly return unknown instead of undef for NULL values, while the semantics would adhere much more closely to SQL, the behavior would be very surprising to Perl developers receiving a chunk of data from the ORM and discovering that unknown doesn’t behave the same way as undef.

So how would we deal with this? I can think of a couple of ways. Each would use the feature pragma to lexically scope the changes.

The first way would be to change undef to use 3VL:

use feature 'unknown';

my @numbers = ( 1, 2, undef, 5, undef, 6 );
my @result = grep { $_ < 5 } @numbers;
# result now holds 1 and 2

In this approach, any undefined value would follow 3VL. However, if you returned that undef outside of the current lexical scope, it falls back to the current 2VL (two-value logic: true or false).
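For comparison, this sketch shows what the same grep does in today’s Perl, with no such feature enabled. It uses only core Perl; the warning handler is there just to count the “uninitialized value” warnings.

```perl
use strict;
use warnings;

my @numbers = ( 1, 2, undef, 5, undef, 6 );

my ( @result, @warnings );
{
    # Capture warnings instead of printing them.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Under 2VL, undef is coerced to 0, so it passes the < 5 test.
    @result = grep { $_ < 5 } @numbers;
}

my $defined_count = grep { defined } @result;
```

The result holds four elements (1, 2, and both undef values), and each undef triggers a warning on the way through. That is the behavior a 3VL undef would replace inside the lexical scope.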

However, we might find it useful (and easier) to have distinct unknown and undef behavior:

use feature 'unknown';

my @numbers = ( 1, 2, unknown, 5, undef, 6 );
my @result = grep { $_ < 5 } @numbers;
# result now holds 1 and 2, and `undef`

This would require unknown to behave like undef if it left the current lexical scope.

In other words, ensure that the developer who chooses to use 3VL has a tightly controlled scope.

But what do we do with the case of using unknown values as a filehandle or the keys to a hash or the index of an array?

$colors[unknown] = 'octarine';

Currently, for the Unknown::Values module, we have the “standard” version, which mostly returns false for everything. So sort and boolean operations ($x < 3) are well-defined, but almost anything else is fatal. (There is a “fatal” version which is fatal for just about anything, but I don’t think it’s very useful.)

Why Not use Null Objects?

When Piers Cawley first suggested that I use the Null Object pattern instead of unknown values, I balked. After all, you could have this:

my $employee = Employee->new($id); # not found, returns a null object

if ( $employee->salary < $limit ) {
    ...
}

That puts us right back where we started because when salary returns undef, it gets coerced to zero (probably with a warning) and the < $limit succeeds. That’s not the behavior we want.

Or maybe salary returns the invocant to allow for method chaining. Well, that fails because $employee->salary < $limit silently coerces salary to the address of the object (thanks, Perl!) and that’s probably not what we want, either.

That’s when I realized an old, bad habit was biting me. When presented with a solution, I tend to look for all of the ways it can go wrong. I need to look for all of the ways it can go right. The more Piers made his case, the more I realized this could make sense if I can use the operator overloading of Unknown::Values. I could write something like this (a bit oversimplified):

package Unknown::Values::Instance::Object {
    use Moose;
    extends 'Unknown::Values::Instance';
    sub AUTOLOAD { return $_[0] }
}

Now, any time you call a method on that object, it merely returns the object. When you eventually get down to comparing that object, the overloading magic in the base class always causes comparisons to return false and we get the desired 3VL behavior.
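Here is a dependency-free sketch of the same idea. Sketch::Unknown::Object is a hypothetical class that combines the AUTOLOAD trick with always-false comparisons in a single package, rather than inheriting them from a base class, and it does not use Moose. Treat it as an illustration of the mechanism, not as the module’s code.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

package Sketch::Unknown::Object {
    # Every comparison against this object is false.
    use overload
      '<'  => sub { 0 },
      '<=' => sub { 0 },
      '>'  => sub { 0 },
      '>=' => sub { 0 };

    sub new { return bless {}, shift }

    # Any unknown method call simply returns the object itself.
    sub AUTOLOAD { return $_[0] }

    # Define DESTROY so object destruction doesn't go through AUTOLOAD.
    sub DESTROY { }
}

my $limit    = 50_000;
my $employee = Sketch::Unknown::Object->new;    # "not found"

# Method chains of any length collapse back to the same object.
my $chained = $employee->manager->salary;

my $below = $employee->salary < $limit  ? 1 : 0;
my $above = $employee->salary >= $limit ? 1 : 0;
```

Both $below and $above end up false, so neither branch of a salary comparison fires for an employee who wasn’t found.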

We could go further and assert the type of the Unknown::Values::Instance::Object and do a ->can($method) check and blow up if the method isn’t there, but that’s probably not needed for a first pass.

But there’s more! In the Wikipedia article about null objects, they use an example of a binary tree to show how null objects sometimes need overloaded methods. Using syntax from the Corinna OOP project for Perl:

class Node {
    has $left  :reader :param { undef };
    has $right :reader :param { undef };
}

We can recursively calculate the size of the tree:

method tree_size() {

    # this recursive method counts 1 for the current node,
    # plus the sizes of its left and right subtrees
    return 1
      + $self->left->tree_size
      + $self->right->tree_size;
}

But the child nodes might not exist, so we need to manually account for that:

method tree_size() {
    my $size = 1;
    if ( $self->left ) {
        $size += $self->left->tree_size;
    }
    if ( $self->right ) {
        $size += $self->right->tree_size;
    }
    return $size;
}

That’s not too bad, but if we need to remember to check if the right and left are there every time we use them, sooner or later we’re going to have a bug.

So let’s fix that (note: this example is simplified since I’ll need to create a new version of Null objects just for Corinna).

class Null::Node :isa(Null::Object) {

    # break the recursion
    method tree_size () { 0 }
}

class Node {
    use Null::Node;
    has $left  :reader :param { Null::Node->new };
    has $right :reader :param { Null::Node->new };
}

Because we’ve overridden the corresponding methods, our tree_size() doesn’t need to check if the nodes are defined:

method tree_size() {
    return 1
      + $self->left->tree_size
      + $self->right->tree_size;
}

In this simple case, this may be overkill, but if we start to add a lot of methods to our Node class, we will have a Null object as default, using 3VL, and we don’t have to worry about building in extra methods that we don’t need. (But see the criticism section about null objects for more information.)
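Since the Corinna syntax above won’t run on a stock Perl, here is a rough translation into classic blessed-hash objects. Sketch::Node and Sketch::Null::Node are hypothetical names, and this null node is a plain class rather than a subclass of a real Null::Object; it only demonstrates how overriding tree_size breaks the recursion.

```perl
use strict;
use warnings;

package Sketch::Null::Node {
    sub new { return bless {}, shift }

    # break the recursion
    sub tree_size { return 0 }
}

package Sketch::Node {
    sub new {
        my ( $class, %args ) = @_;
        return bless {
            # missing children default to a null node, never undef
            left  => $args{left}  // Sketch::Null::Node->new,
            right => $args{right} // Sketch::Null::Node->new,
        }, $class;
    }
    sub left  { return $_[0]{left} }
    sub right { return $_[0]{right} }

    # no need to check whether the children exist
    sub tree_size {
        my ($self) = @_;
        return 1 + $self->left->tree_size + $self->right->tree_size;
    }
}

# A root with two children, one of which has a child of its own.
my $tree = Sketch::Node->new(
    left  => Sketch::Node->new,
    right => Sketch::Node->new( left => Sketch::Node->new ),
);
my $size = $tree->tree_size;
```

The tree above has four real nodes, and tree_size counts them without a single existence check.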

Unknown::Values with Null object support is on the CPAN and GitHub.

Caveat

Be aware that the latest version (0.100) of Unknown::Values has backwards-incompatible changes. In particular, any attempt to stringify an unknown value or object is fatal. That is a huge win because it protects us from this:

$hash{$unknown_color}  = 'octarine';
$array[$unknown_index] = 42;

Both of those are clearly bugs in the context of unknown values, so they’re now fatal.
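The mechanism can be sketched with overloaded conversion operators. Sketch::Unknown and its error messages are invented for this example and are not taken from Unknown::Values; the sketch overloads both string and numeric conversion so that hash keys and array indexes are each covered explicitly.

```perl
use strict;
use warnings;

# Stand-in for an unknown value that refuses to become a string or a number.
package Sketch::Unknown {
    use overload
      '""' => sub { die "Attempt to stringify an unknown value\n" },
      '0+' => sub { die "Attempt to numify an unknown value\n" };
    sub new { return bless {}, shift }
}

my $unknown = Sketch::Unknown->new;
my ( %hash, @array );

# Using the value as a hash key needs a string, so this dies.
my $hash_ok    = eval { $hash{$unknown} = 'octarine'; 1 };
my $hash_error = $@;

# Using the value as an array index needs a number, so this dies too.
my $array_ok    = eval { $array[$unknown] = 42; 1 };
my $array_error = $@;
```

Neither assignment takes effect: both containers stay empty, and the error surfaces at the line where the unknown value was misused.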

This doesn’t address all of Nicholas’ concerns, but it definitely goes a long way to handling many of them.

