Corinna’s First (mis)Steps

Note: Corinna is the name of Ovid’s love interest in the Amores.

I had been working on the design for Corinna, a proposed object system for the Perl core, for several months, inspired by the work of Stevan Little and pulling ideas from many other sources, including far too much reading about effective object-oriented design. Sawyer X, then the Perl pumpking (a now-defunct role; the pumpking oversaw Perl language development), urged me to stop working on the implementation and just focus on the design. Design something great and he’d find someone to implement it.

When I first announced Corinna (then known as “Cor”), it was met with enthusiasm, indifference, and hostility, depending on whom you asked. I was convinced that what I had proposed was solid enough that the community would buy into it easily. They did not.

Truth be told, they were right to not get excited. Ignoring the fact that people have wanted to get an object system in the Perl core for years and no one had succeeded, the fact is that Corinna wasn’t really that interesting in terms of an OO system for the Perl community. I had spent so much time researching good OO practices that I hadn’t taken into consideration the impact of decades of ingrained behaviors in the Perl community. Here’s what an early version looked like.

class Cache::LRU v0.01 {
    use Hash::Ordered;

    has cache    => ( default => method { Hash::Ordered->new } );
    has max_size => ( default => method { 20 } );

    method set ( $key, $value ) {
        if ( self->cache->exists($key) ) {
            self->cache->delete($key);
        }
        elsif ( self->cache->keys > self->max_size ) {
            self->cache->shift;
        }
        self->cache->set( $key, $value );
    }

    method get ($key) {
        if ( self->cache->exists($key) ) {
            my $value = self->cache->get($key);
            self->set( $key, $value );  # put it at the front
            return $value;
        }
        return;
    }
}

I won’t bore you with the details of the bad decisions I made, but you can read the initial Corinna specification here. Let’s look at one mistake I made in that specification:

Note: “slots” are internal data for the object. They provide no public API. By not defining standard is => 'ro', is => 'rw', etc., we avoid the trap of making it natural to expose everything. Instead, just a little extra work is needed by the developer to wrap slots with methods, thereby providing an affordance to keep the public interface smaller (which is generally accepted as good OO practice).

So you couldn’t call $self->cache or $self->max_size outside of the class.

Good OO Design

So I said encapsulation is “generally accepted as good OO practice.”

Yes, that’s true. And if you’re very familiar with OO design best practices, you might even agree. Java developers learned a long time ago not to make their object attributes public, but Perl, frankly, didn’t really have simple ways of enforcing encapsulation. In fact, in the early days of OO programming in Perl, you’d often see an object constructed like this:

package SomeClass;

sub new {
    my ( $class, $name ) = @_;
    bless {
        name   => $name,
        _items => [],
    }, $class;
}

1;

With that, even outside the class you could inspect data with $object->{name} and $object->{_items}. But why the leading underscore on _items? Because that was a signal to other developers that _items was meant to be private. It was asking them to not touch that because Perl didn’t offer an easy way to ensure encapsulation.
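To make the problem concrete, here is a minimal runnable sketch using the same SomeClass. The add_item and item_count methods are invented for illustration; the point is only that nothing prevents outside code from bypassing them.

```perl
use strict;
use warnings;

package SomeClass {
    sub new {
        my ( $class, $name ) = @_;
        return bless { name => $name, _items => [] }, $class;
    }

    # Hypothetical public API: the only intended way to add an item.
    sub add_item {
        my ( $self, $item ) = @_;
        push @{ $self->{_items} }, $item;
        return $self;
    }

    sub item_count {
        my ($self) = @_;
        return scalar @{ $self->{_items} };
    }
}

my $object = SomeClass->new('example');
$object->add_item('first');

# The leading underscore is only a polite request. Outside code can
# still reach into the hash and change anything it likes.
push @{ $object->{_items} }, 'smuggled in';
$object->{name} = 'renamed from outside';

print $object->item_count, "\n";    # 2: one legitimate item, one smuggled in
print $object->{name}, "\n";        # renamed from outside
```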

Something similar for Moo/se could be written like this:

package SomeClass;
use Moose;

has 'name' => (
    is       => 'ro',
    isa      => 'Str',
    required => 1,
);

has '_items' => (
    is       => 'ro',
    isa      => 'ArrayRef',
    default  => sub {[]},
    init_arg => undef,
);

1;

That’s almost the same thing, but now you have a reader for $object->_items and you still don’t get encapsulation. Bad, right?

Well, yeah. That’s bad. All Around the World has repeatedly gone into clients and fixed broken code which is broken for no other reason than not respecting encapsulation (drop me a line if you’d like to hire us).

There have been attempts to introduce “inside-out” objects which properly encapsulate their data, but these have never caught on. They are awkward to write and, most importantly, you can’t just call Dumper($object) to see its internal state while debugging. That alone may have killed them.
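For readers who have not seen one, here is a minimal sketch of an inside-out object, assuming a made-up InsideOut class with a single name attribute. The data lives in a lexical hash keyed by the object’s address rather than in the object itself, which is exactly why dumping the object tells you nothing.

```perl
use strict;
use warnings;
use Data::Dumper;

package InsideOut {
    use Scalar::Util qw(refaddr);

    # Attribute storage lives outside the object, keyed by its address.
    # Code outside this block cannot see %name_of at all.
    my %name_of;

    sub new {
        my ( $class, $name ) = @_;
        my $self = bless \do { my $anon }, $class;    # an empty scalar ref
        $name_of{ refaddr $self } = $name;
        return $self;
    }

    sub name { return $name_of{ refaddr $_[0] } }

    # Without this, entries would leak after the object goes away.
    sub DESTROY { delete $name_of{ refaddr $_[0] } }
}

my $object = InsideOut->new('example');
print $object->name, "\n";      # example
print Dumper($object);          # shows a blessed empty scalar; the name is not in it
```

The manual bookkeeping in DESTROY hints at the awkwardness: every attribute needs its own hash and its own cleanup.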

My Mistake

So if encapsulation is good, and violating encapsulation is bad, why were people upset with my proposal?

Amongst other things, if I wanted to expose the max_size slot directly, I had to write a helper method:

class Cache::LRU v0.01 {
    use Hash::Ordered;

    has cache    => ( default => method { Hash::Ordered->new } );
    has max_size => ( default => method { 20 } );

    method max_size () { return self->max_size }

    ...
}

If I wanted people to be able to change the max size value:

method max_size ($new_size) {
    if (@_ > 1) { # guaranteed to confuse new developers
        self->max_size($new_size);
    }
    else {
        return self->max_size;
    }
}
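For comparison, here is a sketch of the same combined getter/setter in plain Perl, using a hypothetical Cache::Sketch class. The max_size_buggy variant is one common way of getting it subtly wrong: it tests the truth of the argument rather than its presence.

```perl
use strict;
use warnings;

package Cache::Sketch {
    sub new {
        my ($class) = @_;
        return bless { max_size => 20 }, $class;
    }

    # Subtly wrong: a false value such as 0 is treated as "no argument",
    # so the call silently acts as a getter.
    sub max_size_buggy {
        my ( $self, $new_size ) = @_;
        $self->{max_size} = $new_size if $new_size;
        return $self->{max_size};
    }

    # Checks whether an argument was passed at all, so 0 can be set.
    sub max_size {
        my $self = shift;
        $self->{max_size} = shift if @_;
        return $self->{max_size};
    }
}

my $cache = Cache::Sketch->new;

$cache->max_size_buggy(0);
my $after_buggy = $cache->max_size;    # still 20: the 0 was ignored

$cache->max_size(0);
my $after_fixed = $cache->max_size;    # now 0
```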

And there are many, many ways to write the above and get it wrong. Corinna is supposed to make it easier to focus on writing good OO code. Here, it was letting developers fall back to writing bad code that Corinna could easily write. Why should people have to write it? If I offered some convenience here, how should it look? Enough people argued against the design that I realized I made a mistake. Maybe encapsulation is good, but my strictness was over the top. Which leads me to the entire point of this blog entry:

If you’re not effective, it doesn’t matter if you’re right.

Was I right to strive for such encapsulation? I honestly don’t know, but if it meant that no one would use my OO system, it didn’t matter. I had to change something.

Politics

It was pointed out to me that since I had been working in isolation, I hadn’t had the chance to hear, much less incorporate, the views of others. It was time for politics.

Of all the online definitions of politics, I like the opening of Wikipedia’s entry best:

Politics is the set of activities that are associated with making decisions in groups ...

That’s it. People often hate the word “politics” and say things like “politics has no place in the workplace” (or community, or mailing list, or whatever). But politics is nothing more than helping groups form a consensus. When you have a group as large as the Perl community, if you’re not interested in forming a consensus for a large project, you’ve already put yourself behind the eight ball and that’s what I had done.

So an IRC channel was created (irc.perl.org #cor), a GitHub project was started, and anyone with an interest in the project was welcome to share their thoughts.

To be honest, this was challenging for me. First, I don’t respond well to profanity or strong language and there were certainly some emotions flying in the early days. Second, nobody likes being told their baby is ugly, and people were telling me that. For example, when I was trying to figure out how to handle class data, here’s part of the conversation (if you leave a comment, I will delete it if it’s identifying this person):

developer: Ovid: this is not an attack on you or at all, I’m just about to rant about class data, ok
developer: class data is bullSHIT
developer: there is no such thing
developer: it’s a stupid-ass excuse in java land for them not having globals
developer: just make it a fucking global
developer: or a fucking constant method
developer: sub foo { 'value' }
developer: there, it’s a constant method, it’s fine, fuck off

When I see things like that, I tend to tune out. That’s not helpful when I need to understand where people are coming from. And I’ve read that multiple times and I still don’t see the technical argument there (to be fair, this went on for a bit, but I saw nothing explaining why global variables are better than class data. It could just be that I’m a bear of little brain).

I strongly disagree that globals are better than class data because I’ve worked on codebases with plenty of globals and, even if they’re immutable, there’s the lifecycle question of when they come into existence. Plus, class data is clearly associated with a class, so at the very least, when I call my $foo = SomeClass->data (assuming it’s needed outside the class), the class can maintain responsibility for what that data is. Or if the class data is private to the class (as it often is), a global breaks that entirely. But nonetheless, I have to get buy-in for Corinna and that means listening to people and trying to figure out what the real complaints were and how (and whether) to address them.

This worked well for slot encapsulation because we eventually came up with this:

slot $name :reader :writer;

The $name variable can be read via $object->name and can be set via $object->set_name($new_name). Yes, this tremendously violates encapsulation because Corinna does not yet have a way of enforcing type constraints, but mutability at least isn’t the default. And when we later incorporate type constraints, the external interface of the classes won’t have to change. I got buy-in for the design, at the cost of slightly compromising some of my design goals. I think it’s a good trade off and I think Corinna is better off for this.

But what about class data?

Class Data and Methods

Should we include class data and methods in Corinna? I discovered that for many people, the answer was obvious, but their obvious answers disagreed with other people’s obvious answers. What makes it worse is that this isn’t just a “yes or no” question. We have multiple questions.

1. Should we include class data and methods?
2. What should the semantics be?
3. What should the syntax be?

This isn’t easy. We got past one and two with difficulty, but number three has been a bit of a trainwreck. And I’m the conductor.

Inclusion

I think it was pretty much universally agreed that we had to have class methods. These wouldn’t have $self injected into them, nor would they have any access to instance data. In fact, done properly, we could make violations of that a compile-time failure. That’s a huge win for Perl. They might look like this:

slot $some_data; # instance data

common method foo() {
    # $class is available here
    # $self and $some_data are not available here
}

One use of those is for alternate constructors. In Moo/se, you have BUILDARGS for fiddling with arguments to new. In Corinna, at least for the MVP, you write an alternate constructor (we don’t have method overloading in Perl).

class Box {
    slot ( $height, $width, $depth ) :param;
    slot $volume :reader = $height * $width * $depth;

    common method new_cube ($length) {
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }
}

With the above, you can create a cube with Box->new_cube(3). No more messing around with BUILDARGS and trying to remember the syntax.

So if we should have class methods, should we have class data? Well, it’s been pointed out that it’s literally impossible to avoid:

class Box {
    my $num_instances = 0;      # class data!!

    slot ( $height, $width, $depth ) :param;
    slot $volume :reader = $height * $width * $depth;

    ADJUST   { $num_instances++ }
    DESTRUCT { $num_instances-- }

    common method new_cube ($length) {
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }

    common method inventory_count () { $num_instances }
}

So at a very practical level, whether or not we want class data, we have it. In fact, even if we omitted class methods, we’d still have class data. So let’s work with what we have.
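The same thing already happens in plain Perl today. As a rough sketch (a hash-based stand-in for the Box class, not Corinna syntax), a lexical declared inside the package block is closed over by every method and behaves as private class data:

```perl
use strict;
use warnings;

package Box {
    my $num_instances = 0;    # class data: visible only inside this block

    sub new {
        my ( $class, %args ) = @_;
        my $self = bless { %args }, $class;
        $num_instances++;
        return $self;
    }

    # Alternate constructor, called on the class rather than an instance.
    sub new_cube {
        my ( $class, $length ) = @_;
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }

    sub inventory_count { return $num_instances }

    sub DESTROY { $num_instances-- }
}

my $count_in_scope;
{
    my $box  = Box->new( height => 1, width => 2, depth => 3 );
    my $cube = Box->new_cube(3);
    $count_in_scope = Box->inventory_count;     # 2 while both objects are alive
}
my $count_after_scope = Box->inventory_count;   # 0 once they are destroyed
```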

Semantics

We have a lot of the semantics described here, but it could use some more work. However, the general idea of behavior isn’t controversial enough that I want to spend too much time here.

Syntax

Now we have a problem. In Moo/se, we have BUILD, DEMOLISH, and has, which have been renamed to ADJUST, DESTRUCT, and slot in Corinna in part because they’re different beasts. We don’t want to have BUILD in Corinna and have Moo/se developers think it’s the same thing. If we get an analog to BUILDARGS, it will probably be called CONSTRUCT for the same reason.

So one of our design goals is that different things should look different.

Another design goal is that we do not wish to overload the meaning of things. Thus, we agreed that reusing the class keyword (class method foo() {...} or class slot $foo) was probably a bad idea (it turns out to be spectacularly bad if we get to inner classes, but let’s not go there yet).

By the same reasoning that “different things should look different,” similar things should look similar. In Java, class data and methods are declared with the static keyword.

public class MyClass {
    private String name;

    // class data
    public static int numberOfItems;

    public MyClass(String name) {
        this.name = name;
    }

    // class method
    public static void setSomeClassData(int value) {
        MyClass.numberOfItems = value;
    }
}

A developer can easily understand how the two are analogous. But do we need this for Corinna? Here’s the “accidental” class data we could not avoid.

class Box {
    my $num_instances = 0;      # class data!!

    slot ( $height, $width, $depth ) :param;
    slot $volume :reader = $height * $width * $depth;

    ADJUST   { $num_instances++ }
    DESTRUCT { $num_instances-- }

    common method new_cube ($length) {
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }

    common method inventory_count () { $num_instances }
}

But we could get rid of the inventory_count method by supplying a reader (and even renaming it).

class Box {
    my $num_instances :reader(inventory_count) = 0;      # class data!!

    slot ( $height, $width, $depth ) :param;
    slot $volume :reader = $height * $width * $depth;

    ADJUST   { $num_instances++ }
    DESTRUCT { $num_instances-- }

    common method new_cube ($length) {
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }
}

So right off the bat, for new developers, we need to teach them when they can and cannot use slot attributes with my variables.

Also, Perl has the conflation of package and class, along with sub and method. Do we want to add my for class data and for lexical variables? And as Damian Conway has pointed out, static analysis tools are already hard enough to write for Perl, given the overloaded meaning of many keywords.

And if we do accept the notion that similar things should look similar, why would class methods and class data have different declarators? We can’t just say my method foo() {...} because that clearly implies it’s a private method.

Or we can adopt the approach other OO languages such as Java, C++, C#, and Swift have done and use a single keyword to explain the same concept: these things are bound to the class and not an instance of the class. For the aforementioned languages, that keyword was static, but it was strongly shot down as “not good for us” due to possible confusion with the state keyword and the fact that different languages sometimes use static to mean different things. Different things should look different.

shared seems good, but that implies threads to many people, so that was also shot down.

I’m not sure who came up with the word common (it may have been me back in February of 2021, according to IRC logs) and so far it seems like the least-bad alternative. (Another suggestion I proposed at that time was mutual.)

However, there are those who are strongly in favor of my, including adding attributes to it—if it’s in Corinna and not inside a method—and strongly object to common on the grounds that all methods defined in a class are common to every instance of that class. They have a point about common being a poor choice, but I don’t have a good one and I suspect that, over time, it won’t even be noticed (I may live to regret typing that).

So while I was trying to figure all of this out, Damian Conway posted an extensive endorsement of Corinna. To illustrate one of his points, he shared a class written in Dios, an OO system for Perl that he introduced in his “Three Little Words” presentation.

He wrote the following class.

use Dios;

class Account {
    state $next_ID = 'AAA0001';

    has $.name     is rw  is required;
    has $.balance  = 0;
    has $.ID       = $next_ID++;

    method deposit ($amount) {
        $balance += $amount;
    }

    method report ($fh = *STDOUT) {
        $fh->say( "$ID: $balance" );
    }
}

That’s almost how you would write it in Corinna, but that’s not what I really noticed.

I kept staring at that state variable he used to declare class data.

Everyone arguing for reusing an existing declarator to declare class data in Corinna was arguing for my. Here’s Damian, using state.

I couldn’t get that out of my mind. And then I started thinking about inner classes, but let’s not go there yet. Let’s talk about state and why this is important.

State Variables

perldoc -f state says:

state declares a lexically scoped variable, just like my. However, those variables will never be reinitialized, contrary to lexical variables that are reinitialized each time their enclosing block is entered. See “Persistent Private Variables” in perlsub for details.

What does that mean? Well, first, let’s run the following code.

use v5.36;    # enables say, state, and signatures

sub printit {
    state $this = 1;
    my $that = 1;
    $this++;
    $that++;
    say "state $this and my $that";
}
printit() for 1 .. 3;

That prints out:

state 2 and my 2
state 3 and my 2
state 4 and my 2

As you can see, state variables are like static variables in C. They are declared once and retain their value between calls. They kinda look like static members in Java.

Let’s look at state some more, this time returning an anonymous subroutine with the variables.

use v5.36;    # enables say, state, and signatures

sub printit ($name) {
    return sub {
        state $this = 1;
        my $that = 1;
        $this++;
        $that++;
        say "$name: state $this and my $that";
    }
}

my $first  = printit('first');
my $second = printit('second');

$first->()  for 1 .. 3;
$second->() for 1 .. 3;

And that prints out:

first: state 2 and my 2
first: state 3 and my 2
first: state 4 and my 2
second: state 2 and my 2
second: state 3 and my 2
second: state 4 and my 2

Hmm, perldoc -f state says that state variables are only initialized once, but in the case of returning an anonymous sub, we’ve created a new lexical scope and we have a different state variable.

Just for completeness, let’s define those variables inside the outer sub, but outside the inner sub.

use v5.36;    # enables say, state, and signatures

sub printit ($name) {
    state $this = 1;
    my $that = 1;
    return sub {
        $this++;
        $that++;
        say "$name: state $this and my $that";
    }
}

my $first  = printit('first');
my $second = printit('second');

$first->()  for 1 .. 3;
$second->() for 1 .. 3;

And that prints out:

first: state 2 and my 2
first: state 3 and my 3
first: state 4 and my 4
second: state 5 and my 2
second: state 6 and my 3
second: state 7 and my 4

So, depending on how we declare those variables and what the enclosing scope should be, we get different results. This is more or less as expected, though creating a new lexical scope and having the state variables re-initialized might surprise some because I don’t think it’s clearly documented.

But what does that mean for class data?

Part of my job is to ensure that Corinna doesn’t break existing Perl. However, I need to ensure that Corinna doesn’t hobble future Perl, either. That’s much harder because we can’t predict the future.

The Future

There are two things we would love to see in the future for Perl. One is inner classes and the second is anonymous classes. “Anonymous classes” already feels “Perlish” because we have anonymous subroutines and most Perl developers are familiar with the concept of closures. But let’s discuss inner classes first since many people are not familiar with them. Let’s look at some examples from the Java documentation.

class OuterClass {
    ...
    class InnerClass {
        ...
    }
    static class StaticNestedClass {
        ...
    }
}

The InnerClass has access to all static (class) and instance variables in OuterClass, while the StaticNestedClass class only has access to the static variables.

What this means is that you can put together a collection of classes and encapsulate the “helper” classes. When people talk about allowing classes to “trust” one another with their data but not share it with the outside world, this is a way to do that while still maintaining encapsulation.

For Corinna, it might look like this:

class OuterClass {
    ...
    class InnerClass {
        ...
    }
    common class StaticNestedClass {
        ...
    }
}

So we’d immediately have something we can reason about, with well-defined, battle-tested semantics from the Java language (if we’re allowed to steal from other languages, Java should be on that list. No language bigotry, please).

(As an aside, this is why we can’t reuse the class keyword for class data and methods. How would we describe a static inner class? class class StaticNestedClass {...}?)

Next, let’s consider an anonymous class. Here’s one way to think about it.

my $thing = class  {
    slot $foo;
    slot $name :param;
    ...
};
my $o1 = $thing->new(name => $name1);
my $o2 = $thing->new(name => $name2);

We could go the Java route and allow declaration and instantiation at the same time, but I don’t think that gains us anything:

my $object = (class  {
    slot $foo;
    slot $name :param;
    ...
})->new(name => $name1);

But consider this:

class OuterClass {
    ...
    private class InnerClass {
        ...
        method some_method (%args) {
            return class {
                state $class_data = 17; # or my $class_data = 17
                slot $name :param;
            };
        }
    }
}

So we have a private inner class which returns anonymous metaclass instances with state or my variables being used for class data. Are they shared across all metaclass instances or not? I would think “no”, but someone else might make a reasonable argument otherwise. And should it be state or my? Does either really connote “global to this class regardless of how the class is generated”?

And what we’re talking about is something speculative, years in the future, where the existing semantics of my or state might be perfectly appropriate. Or after we get there, we might discover that they’re not appropriate and we’ve backed ourselves into a corner because we decided to reuse a handy thing in the language.

Conclusion

There are no good answers here, but I had to make a call. And I decided to err on the side of established conventions in popular languages, and potential safety for the future of Perl. I also didn’t want to potentially overload the meaning of existing syntax.

There are already people who have let me know that they’re very upset with this decision. There are others who are fine with this decision; they just want to get Corinna in core. In this case, I don’t think there’s a “safe” political solution. So I decided to play it safe technically.

People might come back to me later and make a strong argument that I screwed up. That’s fine. I welcome those arguments and I might change my mind, but the arguments have raged since February with no sign of consensus. I had to make a call. I might not have made the right call, but it needed to be done.

Update

I’ve been getting some feedback on this. I’ve now been going back through tickets, emails, and IRC logs and I see that there was some agreement on class method foo () {...}, but a lot of disagreement over class $foo. There was some discussion of class slot $foo being OK. There has been so much discussion of this issue for months on end, on IRC, email, and GitHub, that I’ve not remembered all of the fine-grained detail well.

The use of class as both a class declarator and a non-access modifier overloads the meaning of class and violates the “different things should look different” principle (to be fair, it’s not the only place we’ve violated this). And this still doesn’t address the awkward case of declaring a static inner class: class class InnerClass {...}.

Looks like this debate is not done.


Politics in Programming