Corinna’s First (mis)Steps
Note: Corinna is the name of Ovid’s love interest in the Amores.
I had been working on the design for Corinna, a proposed object system for the Perl core, for several months, inspired by the work of Stevan Little, pulling ideas from many other sources, including far too much reading about effective object-oriented design. Sawyer X, then the Perl pumpking (a now-defunct role for the person overseeing Perl language development), urged me to stop working on the implementation and just focus on the design. Design something great and he’d find someone to implement it.
When I first announced Corinna (then known as “Cor”), it was met with enthusiasm, indifference, and hostility, depending on whom you asked. I was convinced that what I had proposed was solid enough that the community would buy into it easily. They did not.
Truth be told, they were right to not get excited. Ignoring the fact that people have wanted to get an object system in the Perl core for years and no one had succeeded, the fact is that Corinna wasn’t really that interesting in terms of an OO system for the Perl community. I had spent so much time researching good OO practices that I hadn’t taken into consideration the impact of decades of ingrained behaviors in the Perl community. Here’s what an early version looked like.
class Cache::LRU v0.01 {
use Hash::Ordered;
has cache => ( default => method { Hash::Ordered->new } );
has max_size => ( default => method { 20 } );
method set ( $key, $value ) {
if ( self->cache->exists($key) ) {
self->cache->delete($key);
}
elsif ( self->cache->keys > self->max_size ) {
self->cache->shift;
}
self->cache->set( $key, $value );
}
method get ($key) {
if ( $self->exists($key) ) {
my $value = $self->cache->get($key);
$self->set( $key, $value ); # put it at the front
return $value;
}
return;
}
}
I won’t bore you with the details of the bad decisions I made, but you can read the initial Corinna specification here. Let’s look at one mistake I made in that specification:
Note: “slots” are internal data for the object. They provide no public API. By not defining standard is => 'ro', is => 'rw', etc., we avoid the trap of making it natural to expose everything. Instead, just a little extra work is needed by the developer to wrap slots with methods, thereby providing an affordance to keep the public interface smaller (which is generally accepted as good OO practice).
So you couldn’t call $self->cache or $self->max_size outside of the class.
Good OO Design
So I said encapsulation is “generally accepted as good OO practice.”
Yes, that’s true. And if you’re very familiar with OO design best practices, you might even agree. Java developers learned a long time ago not to make their object attributes public, but Perl, frankly, didn’t really have simple ways of enforcing encapsulation. In fact, in the early days of OO programming in Perl, you’d often see an object constructed like this:
package SomeClass;
sub new {
my ( $class, $name ) = @_;
bless {
name => $name,
_items => [],
}, $class;
}
1;
With that, even outside the class you could inspect data with $object->{name} and $object->{_items}. But why the leading underscore on _items? Because that was a signal to other developers that _items was meant to be private. It was asking them to not touch that because Perl didn’t offer an easy way to ensure encapsulation.
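To make the problem concrete, here is a minimal sketch of what "asking nicely" amounts to. The class is the same shape as the one above; the calling code in main is hypothetical and only there to show that the underscore is a convention, not a barrier.

```perl
use strict;
use warnings;

package SomeClass;

sub new {
    my ( $class, $name ) = @_;
    return bless {
        name   => $name,
        _items => [],
    }, $class;
}

package main;

my $object = SomeClass->new('example');

# Nothing stops outside code from reaching into the hash reference,
# underscore or not.
push @{ $object->{_items} }, 'sneaky';

# It can even replace the "private" data with something of the wrong type.
$object->{_items} = 'no longer an array reference';

print "name:   $object->{name}\n";
print "_items: $object->{_items}\n";
```

Any method in the class that later assumes _items is an array reference will now fail, far away from the code that caused the problem.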
Something similar for Moo/se could be written like this:
package SomeClass;
use Moose;
has 'name' => (
is => 'ro',
isa => 'Str',
required => 1,
);
has '_items' => (
is => 'ro',
isa => 'ArrayRef',
default => sub {[]},
init_arg => undef,
);
1;
That’s almost the same thing, but now you have a reader for $object->_items and you still don’t get encapsulation. Bad, right?
Well, yeah. That’s bad. All Around the World has repeatedly gone into clients and fixed broken code which is broken for no other reason than not respecting encapsulation (drop me a line if you’d like to hire us).
There have been attempts to introduce “inside-out” objects which properly encapsulate their data, but these have never caught on. They are awkward to write and, most importantly, you can’t just call Dumper($object) to see its internal state while debugging. That alone may have killed them.
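For readers who have not met them, here is a minimal sketch of the inside-out technique, assuming Perl 5.14 or later for the package block syntax. The Counter class and its methods are hypothetical, invented for illustration; only refaddr (from Scalar::Util) and Dumper (from Data::Dumper) are real library calls.

```perl
use strict;
use warnings;
use Data::Dumper;

package Counter {
    use Scalar::Util qw(refaddr);

    # Instance data lives in a lexical hash keyed by each object's address,
    # so code outside this block has no way to reach it.
    my %count_for;

    sub new {
        my ($class) = @_;
        my $self = bless \do { my $anon }, $class;    # an empty scalar reference
        $count_for{ refaddr($self) } = 0;
        return $self;
    }

    sub increment {
        my ($self) = @_;
        return ++$count_for{ refaddr($self) };
    }

    sub count {
        my ($self) = @_;
        return $count_for{ refaddr($self) };
    }

    # Without this, data for destroyed objects would leak.
    sub DESTROY {
        my ($self) = @_;
        delete $count_for{ refaddr($self) };
    }
}

my $counter = Counter->new;
$counter->increment for 1 .. 3;

my $count = $counter->count;
print "count: $count\n";

# The dump shows only a blessed, empty scalar reference; the count is invisible.
print Dumper($counter);
```

The data really is encapsulated, but you pay for it with the bookkeeping in every method, the mandatory DESTROY, and a dump that tells you nothing.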
My Mistake
So if encapsulation is good, and violating encapsulation is bad, why were people upset with my proposal?
Amongst other things, if I wanted to expose the max_size slot directly, I had to write a helper method:
class Cache::LRU v0.01 {
use Hash::Ordered;
has cache => ( default => method { Hash::Ordered->new } );
has max_size => ( default => method { 20 } );
method max_size () { return self->max_size }
...
}
If I wanted people to be able to change the max size value:
method max_size ($new_size) {
if (@_ > 1) { # guaranteed to confuse new developers
self->max_size($new_size);
}
else {
return self->max_size;
}
}
And there are many, many ways to write the above and get it wrong. Corinna is supposed to make it easier to focus on writing good OO code. Here, it was letting developers fall back to writing bad code that Corinna could easily write. Why should people have to write it? If I offered some convenience here, how should it look? Enough people argued against the design that I realized I made a mistake. Maybe encapsulation is good, but my strictness was over the top. Which leads me to the entire point of this blog entry:
If you’re not effective, it doesn’t matter if you’re right.
Was I right to strive for such encapsulation? I honestly don’t know, but if it meant that no one would use my OO system, it didn’t matter. I had to change something.
Politics
It was pointed out to me that since I had been working in isolation, I hadn’t had the chance to hear, much less incorporate, the views of others. It was time for politics.
Of all the definitions of politics I’ve found online, I like the opening of Wikipedia’s the best:
Politics is the set of activities that are associated with making decisions in groups ...
That’s it. People often hate the word “politics” and say things like “politics has no place in the workplace” (or community, or mailing list, or whatever). But politics is nothing more than helping groups form a consensus. When you have a group as large as the Perl community, if you’re not interested in forming a consensus for a large project, you’ve already put yourself behind the eight ball and that’s what I had done.
So an IRC channel was created (irc.perl.org #cor), a GitHub project was started, and anyone with an interest in the project was welcome to share their thoughts.
To be honest, this was challenging for me. First, I don’t respond well to profanity or strong language and there were certainly some emotions flying in the early days. Second, nobody likes being told their baby is ugly, and people were telling me that. For example, when I was trying to figure out how to handle class data, here’s part of the conversation (if you leave a comment, I will delete it if it’s identifying this person):
- developer: Ovid: this is not an attack on you or at all, I’m just about to rant about class data, ok
- developer: class data is bullSHIT
- developer: there is no such thing
- developer: it’s a stupid-ass excuse in java land for them not having globals
- developer: just make it a fucking global
- developer: or a fucking constant method
- developer: sub foo { 'value' }
- developer: there, it’s a constant method, it’s fine, fuck off
When I see things like that, I tend to tune out. That’s not helpful when I need to understand where people are coming from. And I’ve read that multiple times and I still don’t see the technical argument there (to be fair, this went on for a bit, but I saw nothing explaining why global variables are better than class data. It could just be that I’m a bear of little brain).
I strongly disagree that globals are better than class data because I’ve worked on codebases with plenty of globals and, even if they’re immutable, there’s the lifecycle question of when they come into existence. Plus, class data is clearly associated with a class, so at the very least, when I call my $foo = SomeClass->data (assuming it’s needed outside the class), the class can maintain responsibility for what that data is. Or if the class data is private to the class (as it often is), a global breaks that entirely. But nonetheless, I have to get buy-in for Corinna and that means listening to people and trying to figure out what the real complaints were and how (and whether) to address them.
This worked well for slot encapsulation because we eventually came up with this:
slot $name :reader :writer;
The $name variable can be read via $object->name and can be set via $object->set_name($new_name). Yes, this tremendously violates encapsulation because Corinna does not yet have a way of enforcing type constraints, but mutability at least isn’t the default. And when we later incorporate type constraints, the external interface of the classes won’t have to change. I got buy-in for the design, at the cost of slightly compromising some of my design goals. I think it’s a good trade-off and I think Corinna is better off for this.
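As a rough comparison, here is the boilerplate those two attributes replace, written as plain hash-based Perl. This is a sketch, not how Corinna implements anything: the Person class is hypothetical, and the writer's return value in particular is my assumption.

```perl
use strict;
use warnings;

package Person;

sub new {
    my ( $class, %args ) = @_;
    return bless { name => $args{name} }, $class;
}

# Roughly what :reader would generate: a read-only accessor.
sub name {
    my ($self) = @_;
    return $self->{name};
}

# Roughly what :writer would generate: a separate set_ method, so that
# reading and writing never share one overloaded accessor.
# (Returning nothing here is an assumption made for this sketch.)
sub set_name {
    my ( $self, $new_name ) = @_;
    $self->{name} = $new_name;
    return;
}

package main;

my $person = Person->new( name => 'Ovid' );
$person->set_name('Corinna');

my $name = $person->name;
print "name: $name\n";
```

Splitting the reader and writer into two methods also avoids the @_ > 1 trick from the earlier max_size example.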
But what about class data?
Class Data and Methods
Should we include class data and methods in Corinna? I discovered that for many people, the answer was obvious, but their obvious answers disagreed with other people’s obvious answers. What makes it worse is that this isn’t just a “yes or no” question. We have multiple questions.
- Should we include class data and methods?
- What should the semantics be?
- What should the syntax be?
This isn’t easy. We got past one and two with difficulty, but number three has been a bit of a trainwreck. And I’m the conductor.
Inclusion
I think it was pretty much universally agreed that we had to have class methods. These wouldn’t have $self injected into them, nor would they have any access to instance data. In fact, done properly, we could make violations of that a compile-time failure. That’s a huge win for Perl. They might look like this:
slot $some_data; # instance data
common method foo() {
# $class is available here
# $self and $some_data are not available here
}
One use of those is for alternate constructors. In Moo/se, you have BUILDARGS for fiddling with arguments to new. In Corinna, at least for the MVP, you write an alternate constructor (we don’t have method overloading in Perl).
class Box {
slot ( $height, $width, $depth ) :param;
slot $volume :reader = $height * $width * $depth;
common method new_cube ($length) {
return $class->new(
height => $length,
width => $length,
depth => $length,
);
}
}
With the above, you can create a cube with Box->new_cube(3). No more messing around with BUILDARGS and trying to remember the syntax.
So if we should have class methods, should we have class data? Well, it’s been pointed out that it’s literally impossible to avoid:
class Box {
my $num_instances = 0; # class data!!
slot ( $height, $width, $depth ) :param;
slot $volume :reader = $height * $width * $depth;
ADJUST { $num_instances++ }
DESTRUCT { $num_instances-- }
common method new_cube ($length) {
return $class->new(
height => $length,
width => $length,
depth => $length,
);
}
common method inventory_count () { $num_instances }
}
So at a very practical level, whether or not we want class data, we have it. In fact, even if we omitted class methods, we’d still have class data. So let’s work with what we have.
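The same "accidental" class data already exists in plain Perl today. The following sketch mirrors the Box example using an ordinary blessed hash, assuming Perl 5.14 or later for the package block; new and DESTROY stand in for the Corinna ADJUST and DESTRUCT phases, and the calling code is hypothetical.

```perl
use strict;
use warnings;

package Box {
    # A lexical declared at package-block scope: every method below closes
    # over the same variable, which makes it class data in all but name.
    my $num_instances = 0;

    sub new {
        my ( $class, %args ) = @_;
        $num_instances++;
        return bless {%args}, $class;
    }

    # Alternate constructor, callable as Box->new_cube(3).
    sub new_cube {
        my ( $class, $length ) = @_;
        return $class->new(
            height => $length,
            width  => $length,
            depth  => $length,
        );
    }

    sub inventory_count { return $num_instances }

    sub DESTROY { $num_instances-- }
}

my $first = Box->new_cube(3);
my $count_inside;
{
    my $second = Box->new( height => 1, width => 2, depth => 3 );
    $count_inside = Box->inventory_count;    # two boxes alive here
}
my $count_after = Box->inventory_count;      # $second has been destroyed

print "inside: $count_inside, after: $count_after\n";
```

Nothing in the language marks $num_instances as belonging to the class; it is simply a closure variable that happens to sit in the right place.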
Semantics
We have a lot of the semantics described here, but it could use some more work. However, the general idea of behavior isn’t controversial enough that I want to spend too much time here.
Syntax
Now we have a problem. In Moo/se, we have BUILD, DEMOLISH, and has, which have been renamed to ADJUST, DESTRUCT, and slot in Corinna in part because they’re different beasts. We don’t want to have BUILD in Corinna and have Moo/se developers think it’s the same thing. If we get an analog to BUILDARGS, it will probably be called CONSTRUCT for the same reason.
So one of our design goals is that different things should look different. Another design goal is that we do not wish to overload the meaning of things. Thus, we agreed that reusing the class keyword (class method foo() {...} or class slot $foo) was probably a bad idea (it turns out to be spectacularly bad if we get to inner classes, but let’s not go there yet).
By the same reasoning that “different things should look different,” similar things should look similar. In Java, class data and methods are declared with the static keyword.
public class MyClass {
private String name;
// class data
public static int numberOfItems;
public MyClass(String name) {
this.name = name;
}
// class method
public static void setSomeClassData(int value) {
MyClass.numberOfItems = value;
}
}
A developer can easily understand how the two are analogous. But do we need this for Corinna? Here’s the “accidental” class data we could not avoid.
class Box {
my $num_instances = 0; # class data!!
slot ( $height, $width, $depth ) :param;
slot $volume :reader = $height * $width * $depth;
ADJUST { $num_instances++ }
DESTRUCT { $num_instances-- }
common method new_cube ($length) {
return $class->new(
height => $length,
width => $length,
depth => $length,
);
}
common method inventory_count () { $num_instances }
}
But we could get rid of the inventory_count method by supplying a reader (and even renaming it).
class Box {
my $num_instances :reader(inventory_count) = 0; # class data!!
slot ( $height, $width, $depth ) :param;
slot $volume :reader = $height * $width * $depth;
ADJUST { $num_instances++ }
DESTRUCT { $num_instances-- }
common method new_cube ($length) {
return $class->new(
height => $length,
width => $length,
depth => $length,
);
}
}
So right off the bat, for new developers, we need to teach them when they can and cannot use slot attributes with my variables.
Also, Perl has the conflation of package and class, along with sub and method. Do we want to add my for class data and for lexical variables? And as Damian Conway has pointed out, static analysis tools are already hard enough to write for Perl, given the overloaded meaning of many keywords.
And if we do accept the notion that similar things should look similar, why would class methods and class data have different declarators? We can’t just say my method foo() {...} because that clearly implies it’s a private method.
Or we can adopt the approach other OO languages such as Java, C++, C#, and Swift have done and use a single keyword to explain the same concept: these things are bound to the class and not an instance of the class. For the aforementioned languages, that keyword was static, but it was strongly shot down as “not good for us” due to possible confusion with the state keyword and the fact that different languages sometimes use static to mean different things. Different things should look different.
shared seems good, but that implies threads to many people, so that was also shot down.
I’m not sure who came up with the word common (it may have been me back in February of 2021, according to IRC logs) and so far it seems like the least-bad alternative. (Another suggestion I proposed at that time was mutual.)
However, there are those who are strongly in favor of my, including adding attributes to it (if it’s in Corinna and not inside a method), and who strongly object to common on the grounds that all methods defined in a class are common to every instance of that class. They have a point about common being a poor choice, but I don’t have a better one and I suspect that, over time, it won’t even be noticed (I may live to regret typing that).
So while I’m trying to figure all of this out, Damian Conway posted an extensive endorsement of Corinna. To illustrate one of his points, he shared a class written in Dios, which is an OO system for Perl that he introduced in his “Three Little Words” presentation.
He wrote the following class.
use Dios;
class Account {
state $next_ID = 'AAA0001';
has $.name is rw is required;
has $.balance = 0;
has $.ID = $next_ID++;
method deposit ($amount) {
$balance += $amount;
}
method report ($fh = *STDOUT) {
$fh->say( "$ID: $balance" );
}
}
That’s almost how you would write it in Corinna, but that’s not what I really noticed.
I kept staring at that state variable he used to declare class data.
Everyone arguing for reusing an existing declarator to declare class data in Corinna was arguing for my. Here’s Damian, using state.
I couldn’t get that out of my mind. And then I started thinking about inner classes, but let’s not go there yet. Let’s talk about state and why this is important.
State Variables
perldoc -f state says:
state declares a lexically scoped variable, just like my. However, those variables will never be reinitialized, contrary to lexical variables that are reinitialized each time their enclosing block is entered. See “Persistent Private Variables” in perlsub for details.
What does that mean? Well, first, let’s run the following code.
use v5.10; # enables say and state
sub printit {
state $this = 1;
my $that = 1;
$this++;
$that++;
say "state $this and my $that";
}
printit() for 1 .. 3;
That prints out:
state 2 and my 2
state 3 and my 2
state 4 and my 2
As you can see, state variables are like static variables in C. They are declared once and retain their value between calls. They kinda look like static members in Java.
Let’s look at state some more, this time returning an anonymous subroutine with the variables.
use v5.36; # enables say, state, and signatures
sub printit ($name) {
return sub {
state $this = 1;
my $that = 1;
$this++;
$that++;
say "$name: state $this and my $that";
}
}
my $first = printit('first');
my $second = printit('second');
$first->() for 1 .. 3;
$second->() for 1 .. 3;
And that prints out:
first: state 2 and my 2
first: state 3 and my 2
first: state 4 and my 2
second: state 2 and my 2
second: state 3 and my 2
second: state 4 and my 2
Hmm, perldoc -f state says that state variables are only initialized once, but in the case of returning an anonymous sub, we’ve created a new lexical scope and we have a different state variable.
Just for completeness, let’s define those variables inside the outer sub, but outside the inner sub.
use v5.36; # enables say, state, and signatures
sub printit ($name) {
state $this = 1;
my $that = 1;
return sub {
$this++;
$that++;
say "$name: state $this and my $that";
}
}
my $first = printit('first');
my $second = printit('second');
$first->() for 1 .. 3;
$second->() for 1 .. 3;
And that prints out:
first: state 2 and my 2
first: state 3 and my 3
first: state 4 and my 4
second: state 5 and my 2
second: state 6 and my 3
second: state 7 and my 4
So, depending on how we declare those variables and what the enclosing scope should be, we get different results. This is more or less as expected, though creating a new lexical scope and having the state variables re-initialized might surprise some because I don’t think it’s clearly documented.
But what does that mean for class data?
Part of my job is to ensure that Corinna doesn’t break existing Perl. However, I need to ensure that Corinna doesn’t hobble future Perl, either. That’s much harder because we can’t predict the future.
The Future
There are two things we would love to see in the future for Perl. One is inner classes and the second is anonymous classes. “Anonymous classes” already feels “Perlish” because we have anonymous subroutines and most Perl developers are familiar with the concept of closures. But let’s discuss inner classes first since many people are not familiar with them. Let’s look at some examples from the Java documentation.
class OuterClass {
...
class InnerClass {
...
}
static class StaticNestedClass {
...
}
}
The InnerClass has access to all static (class) and instance variables in OuterClass, while the StaticNestedClass class only has access to the static variables.
What this means is that you can put together a collection of classes and encapsulate the “helper” classes. When people talk about allowing classes to “trust” one another with their data but not share it with the outside world, this is a way to do that while still maintaining encapsulation.
For Corinna, it might look like this:
class OuterClass {
...
class InnerClass {
...
}
common class StaticNestedClass {
...
}
}
So we’d immediately have something we can reason about, with well-defined, battle-tested semantics from the Java language (if we’re allowed to steal from other languages, Java should be on that list. No language bigotry, please).
(As an aside, this is why we can’t reuse the class keyword for class data and methods. How would we describe a static inner class? class class StaticNestedClass {...}?)
Next, let’s consider an anonymous class. Here’s one way to think about it.
my $thing = class {
slot $foo;
slot $name :param;
...
};
my $o1 = $thing->new(name => $name1);
my $o2 = $thing->new(name => $name2);
We could go the Java route and allow declaration and instantiation at the same time, but I don’t think that gains us anything:
my $object = (class {
slot $foo;
slot $name :param;
...
})->new(name => $name1);
But consider this:
class OuterClass {
...
private class InnerClass {
...
method some_method (%args) {
return class {
state $class_data = 17; # or my $class_data = 17
slot $name :param;
};
}
}
}
So we have a private inner class which returns anonymous metaclass instances with state or my variables being used for class data. Are they shared across all metaclass instances or not? I would think “no”, but someone else might make a reasonable argument otherwise. And should it be state or my? Do either really connote “global to this class regardless of how the class is generated”?
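We can’t run anonymous classes today, but a loose analogy with closures shows why the answer is not obvious. In this sketch, make_class is a hypothetical stand-in for “a method that returns an anonymous class”, and each returned hash of closures stands in for one generated class. It is only an analogy: here both variables are declared in the generating sub rather than inside the class body, and as the earlier state examples showed, where the declaration sits changes the result.

```perl
use strict;
use warnings;
use feature qw(state);

# Hypothetical stand-in for a method that returns an anonymous class.
sub make_class {
    state $shared    = 0;    # initialized once; every generated "class" sees it
    my    $per_class = 0;    # a fresh variable for each call to make_class()

    my %class = (
        new             => sub { $shared++; $per_class++; return {} },
        shared_count    => sub { return $shared },
        per_class_count => sub { return $per_class },
    );
    return \%class;
}

my $class_a = make_class();
my $class_b = make_class();

$class_a->{new}->() for 1 .. 2;    # two "instances" of class a
$class_b->{new}->() for 1 .. 3;    # three "instances" of class b

printf "a: shared %d, per-class %d\n",
    $class_a->{shared_count}->(), $class_a->{per_class_count}->();
printf "b: shared %d, per-class %d\n",
    $class_b->{shared_count}->(), $class_b->{per_class_count}->();
```

Both generated “classes” report a shared count of 5, while the per-class counts are 2 and 3. Which of those behaviors a developer expects from “class data” in an anonymous class is exactly the open question.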
And what we’re talking about is something speculative, years in the future, where the existing semantics of my or state might be perfectly appropriate. Or after we get there, we might discover that they’re not appropriate and we’ve backed ourselves into a corner because we decided to reuse a handy thing in the language.
Conclusion
There are no good answers here, but I had to make a call. And I decided to err on the side of established conventions in popular languages, and potential safety for the future of Perl. I also didn’t want to potentially overload the meaning of existing syntax.
There are already people who have let me know that they’re very upset with this decision. There are others who are fine with this decision; they just want to get Corinna in core. In this case, I don’t think there’s a “safe” political solution. So I decided to play it safe technically.
People might come back to me later and make a strong argument that I screwed up. That’s fine. I welcome those arguments and I might change my mind, but the arguments have raged since February with no sign of consensus. I had to make a call. I might not have made the right call, but it needed to be done.
Update
I’ve been getting some feedback on this. I’ve now been going back through tickets, emails, and IRC logs and I see that there was some agreement on class method foo () {...}, but a lot of disagreement over class $foo. There was some discussion of class slot $foo being OK. There has been so much discussion of this issue for months on end, on IRC, email, and GitHub, that I’ve not remembered all of the fine-grained detail well.
The use of class as both a class declarator and a non-access modifier overloads the meaning of class and violates the “different things should look different” principle (to be fair, it’s not the only place we’ve violated this). And this still doesn’t address the awkward case of declaring a static inner class: class class InnerClass {...}.
Looks like this debate is not done.