Friday, September 4, 2009

Re: Re: Re: Moose Or No Moose

This has been a crazy week here in Moose-land as you might have read. I would like to attempt to sum this all up so that we can all move on and get back to programming (which is sooooooo much more fun than blogging).

Adam Kennedy responded to all this in his recent use.perl post, and then later came into the #moose IRC channel to discuss. In that post I think that he did an excellent job of answering the "Should Padre use Moose" question. His conclusion seemed to be "Not Yet" which I can respect given his well thought out reasons. I was also happy to see that Adam felt Moose had the "air of inevitability" about it, and he felt it was only a matter of time till the Moose community solved our startup time and memory efficiency issues.

For the record, I was never actually suggesting that Padre should immediately be re-written in Moose, but more that Padre and Moose might benefit from one another, and asking whether those benefits would perhaps outweigh the costs. For Adam (and I assume the rest of the Padre team) the answer was "Not Yet": the costs were too high and the benefits not compelling enough for them.

Then this morning I arrived at my desk to see Daisuke Maki had posted his 2 cents about this debate. Two lines in Daisuke's post jumped out at me. The first was this one:
I'd really like them [the Moose developers] to make a more thought-out response, and act like you actually care about the parties involved that have a slightly different view than yours
And the second was:
That will make my life easier. I can point people to the Moose roadmap or whatever and say, "look, they care. they /will/ address your concerns. let's just install Moose in your system for now, yeah?" and go on hacking with Moose.
So while I don't have any official roadmap yet, I will say that "we are working on it" and "we do care". If nothing more, this recent debate has served to light a fire under the Moose development team and forced us to take a closer look at ourselves. It is a challenge and one that we are ready to take on.

The first step is to admit that you have a problem.

So last night (after a challenge from Adam on IRC) Jesse (doy) Luehrs added memory usage stats to Shawn (Sartak) Moore's original script which produced this data and then Cory (gphat) Watson used his Chart::Clicker module, which is written in Moose ;) to produce this graph:

The purple line is startup time and the yellow line is memory usage. (As I pointed out before, that nice downward drop at the very end there is a direct result of the Japan Perl Association's sponsorship of Goro Fuji).

The second step is to do something about it.

A big part of the memory and CPU hit is from the meta-layer. These are all the objects that Moose has to create for your class, and then all the code Moose compiles and evals to make sure things are fast at runtime. This is largely a one-time (compile time) cost and if you never call the "meta" method on your Moose class then you never use it again. We currently have two projects going on to help reduce or even eliminate some of these costs.

A recent post by fREW Schmidt announced that he is currently working to refactor the Moose test suite in such a way that the meta-level tests are optional. This is to help make way for Matt Trout's proposed "Antlers" project, which aims to "compile" Moose classes such that the meta-layer (the part that consumes most of the memory and takes so much time at startup) is not loaded or used unless you actually specifically request it by calling the "meta" method itself.

This idea has been kicking around the Moose team's heads for a while, in fact it is an evolution/mutation of Yuval Kogman's MooseX::Compile module. When Yuval wrote his version over a year and a half ago, Moose did not have the proper features internally and so the project was shelved. Now we feel that Moose is ready, and with fREW's test suite refactoring, it should only be a matter of tuits before this can become a reality.

The second is a combination of Goro Fuji's work and some XS-ification that Yuval Kogman, Florian (rafl) Ragwitz and a few other Moose hackers have been doing. The aim of this is to eliminate much of the need to do the code evaling for accessor and constructor generation by moving them to XS. Since we don't want you to pay for any more features than you're using, Moose currently will compile only as much accessor and constructor code as you need and eval it into existence (which gets expensive). The hope is that by moving that code to XS we can eliminate the speed penalty of unused features and therefore not need to custom compile anymore.
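
To make the "compile only what you need and eval it" idea a little more concrete, here is a minimal sketch in plain Perl. This is not Moose's actual generator; the helper names build_accessor_source and install_accessor and the is / int_only options are made up for illustration. It only shows why each extra feature means more source text to build and eval at startup.

```perl
use strict;
use warnings;

# Hypothetical helper: build accessor source that contains only the
# features requested (read-only vs. read-write, optional integer check).
sub build_accessor_source {
    my ($name, %opt) = @_;
    my @body = ('my $self = shift;');
    if (($opt{is} // 'ro') eq 'rw') {
        push @body, 'if (@_) {';
        push @body, "    die '$name must be an integer' unless \$_[0] =~ /^-?[0-9]+\\z/;"
            if $opt{int_only};
        push @body, "    \$self->{'$name'} = \$_[0];";
        push @body, '}';
    }
    else {
        push @body, "die '$name is read-only' if \@_;";
    }
    push @body, "return \$self->{'$name'};";
    return "sub {\n" . join('', map { "    $_\n" } @body) . "}\n";
}

# Hypothetical helper: eval the generated source and install the
# resulting code ref as a method in $class.
sub install_accessor {
    my ($class, $name, %opt) = @_;
    my $source = build_accessor_source($name, %opt);
    my $code   = eval $source or die "could not compile accessor '$name': $@";
    no strict 'refs';
    *{"${class}::${name}"} = $code;
    return $source;
}

install_accessor('Point', 'x', is => 'rw', int_only => 1);
install_accessor('Point', 'y', is => 'ro');

my $point = bless { x => 1, y => 2 }, 'Point';
$point->x(10);
print $point->x, ' ', $point->y, "\n";
```

The read-only accessor never carries the setter or the type check, so it costs nothing at runtime for features it does not use; the price is the string building and eval done up front.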

The third step is to recruit new cult members.

Actually I am not sure if that is the 3rd step or not, but it sounds good to me. So if you love Moose and have some spare tuits lying around, please come and help pick some low hanging fruits and make this a reality.

Wednesday, September 2, 2009

Moose Startup Time over Time

I actually owe marcus an apology.

You were correct in that Moose startup time has not gotten significantly faster since 0.01. After our exchange Sartak decided to actually take a look and see. So he wrote this script, which produced this data, which jhannah then promptly turned into this graph (reproduced below).

It seems that since the 0.15 release we have mostly stayed within the 0.2 to 0.3 second range. It is interesting to look at the Changelogs for both Moose and Class::MOP, where you can actually see the feature additions or refactorings that correspond with the peaks and valleys.
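
For anyone who wants to take a similar measurement at home, here is a rough sketch of one way to do it. This is not Sartak's script; the startup_time helper is hypothetical and it simply averages the wall-clock time a fresh perl process takes to load a module and exit.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: average wall-clock seconds for a fresh perl
# process to load $module and exit, over $runs runs.
sub startup_time {
    my ($module, $runs) = @_;
    $runs ||= 5;
    my $total = 0;
    for (1 .. $runs) {
        my $start = [gettimeofday];
        system($^X, "-M$module", '-e', '1') == 0
            or die "could not load $module\n";
        $total += tv_interval($start);
    }
    return $total / $runs;
}

# Baseline: a module that is effectively free to load, so the number
# is close to bare interpreter startup.
my $baseline = startup_time('strict');
printf "strict: %.3fs\n", $baseline;

# Only measure Moose if it is installed on this machine.
if (eval { require Moose; 1 }) {
    printf "Moose:  %.3fs\n", startup_time('Moose');
}
```

Subtracting the baseline from the Moose figure gives a fairer idea of what Moose itself costs, since process startup is paid either way.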

Some of the most recent speedup is the direct result of Goro (gfx) Fuji's work on Class::MOP and Moose as sponsored by the Japan Perl Association. Much thanks to all involved in that.

Re: Re: Moose Or No Moose

Chris Prather recently asked "who is generally complaining about the slings and arrows of outrageous dependencies", which got me thinking (and I will get back to that thought in a moment).

Dave Cross also left a comment on my last blog post clarifying the relationship between his Array::Compare module and Test::Warn. Seems that the author of Test::Warn actually removed the usage of Array::Compare in a dev release several months ago, but simply forgot to remove it from the META.yml dependency listing. While I still think it is important to look downstream before you port to Moose, this clarification got me thinking as well.

Next comes Jonathan Rockway and his emacs fetish. Jonathan left a comment on Dave's blog which if you didn't know Jonathan, might be mistaken for an attempt to ignite the eternal Emacs vs. VI flamewar. But if you look beyond the emacs fanboi-ness of the comment you will see that Jonathan is (as usual) really making an excellent point about Padre, its extensibility as an editor, its dependencies and the choices its authors are making in relation to those things. This got me thinking about what Moose might be able to bring to the Padre party, good or bad.

So, now for my thoughts ...

To answer Chris's question, there are at least the two people who commented in the RT bug. The first is Adam Kennedy, who loves all things fast and tiny and has long complained about Moose's appetite for CPU and memory. And the second is Mark Stosberg who is a big advocate of vanilla CGI on which he and I have long disagreed. It is not surprising that neither of them like Moose. Are they representative of some kind of majority? Or just religious extremists worshipping at the altar of slim computing? It is hard to say, they both have valid concerns, but history and trends are pretty clear in how bloated software and ever improving hardware are constantly pushing one another forward (Moore's Law FTW).

So, as I said before, I agree-to-disagree with Adam on his points, but the more I think about it, I think perhaps he is just knee-jerking here and possibly even using RT in anger. In the RT bug he says
This adds a huge amount of additional dependencies
Now sure this is true in relation to Array::Compare, but we all know that Adam is really talking about Padre. This is a little ridiculous considering Moose (on 5.10.0) has only 21 dependencies, 6 of which are core modules, and has a 92% chance of installing (it would be 100%, but List::MoreUtils is dragging us down), while Padre (also on 5.10.0) has 104 dependencies, 32 of which are core modules, and only has a 24% chance of installing (and requires a threaded Perl as well). So really how many dependencies would Moose add to Padre? Of the non-core dependencies they share 5, so 21 - 6 core - 5 shared = 10, which when added to Padre's 62 non-core deps brings the grand total to 72.
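
The back-of-the-envelope sum above can be written out as a few lines of Perl. The figures are exactly the ones quoted in the paragraph (for Perl 5.10.0); the variable names are just for illustration.

```perl
use strict;
use warnings;

# Figures as quoted in the paragraph above (Perl 5.10.0).
my %moose = (total => 21, core => 6);
my $shared_non_core = 5;     # non-core dependencies Moose and Padre share
my $padre_non_core  = 62;    # Padre's non-core count as quoted in the post

# Dependencies Moose would bring that Padre does not already have.
my $added       = $moose{total} - $moose{core} - $shared_non_core;
my $grand_total = $padre_non_core + $added;

printf "Moose would add %d dependencies, for a total of %d\n",
    $added, $grand_total;
```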

Now, you might say "Wait a minute, that's a non-trivial amount of additional dependencies", and in fact you would be correct. However, the likelihood that Moose is already installed is fairly high and growing by the day as its popularity increases and already popular modules like Catalyst start using it. You see, Moose is not like other CPAN modules which simply provide a specific feature for a specific need, but instead is a tool to extend the language itself and which you use to create other CPAN modules. Because of this, the chance of having Moose already installed, either directly or because you install one of the approximately 600+ modules that use it, is pretty high.

Adam's next point about startup time is valid, this has long been an issue with Moose. We have made great strides on this issue over the years, but there is still a ways to go. How much this truly affects Padre though is debatable, as Jonathan Rockway pointed out, how many times a day/week/month are you really going to be starting up your text editor? Is startup time for a featureful IDE with possibly many plugins really that big of a deal? It certainly hasn't seemed to slow down the adoption for things like Eclipse.

And finally, to Adam's last comment ...
If this can't be fixed, we're going to need to remove Array::Compare (and everything that uses it) from Padre
this was actually the part that concerned me most because of what Adam is saying to all the authors of Padre's 62 non-core dependencies. While I do believe CPAN developers should look downstream before adding dependencies in the spirit of being a good neighbor, I think it is a little unreasonable for those upstream to dictate what is and is not an acceptable dependency.

Sure Dave's Array::Compare is a straightforward module and the benefits Moose brought to it were fairly minimal. But there are some pretty complex modules in Padre's dependency list, many of which could probably benefit greatly from Moose. And what about Padre plugins? Are they allowed to use Moose? What if they become popular plugins and the Padre folks want to merge them into the core? Do they have to shed the Moose before being allowed in? Where does this insanity end!?!?!

See, Moose is growing in popularity a lot lately. The number of CPAN modules that use Moose has been growing at a very steady pace. The #moose@irc.perl.org channel regularly has over 200 people in it. If you read the Perl Ironman blogs, you have no doubt seen a lot of Moose there lately. The number of Moose related questions on Perlmonks has been increasing lately (I should know, I answer many of them). There were 16 talks at YAPC::NA this year tagged with Moose (including a 6 hour Moose course) which made Moose one of the largest tags in the cloud. This was followed by 7 talks at YAPC::EU this year. There will be approx 5 or so talks at the upcoming YAPC::Asia (including the Moose course again given by Sartak). Dave Rolsky will be giving a Moose course at the Italian Perl Workshop. Moose is also central to the EPO's extended core effort. We even get a fair amount of twitter traffic (for whatever that is worth).

Because of this, I really think Moose has proved itself not to be just a fad, especially considering this momentum has been steadily growing over 3 and a half years now. While it might be reasonable of Adam to request upstream deps not use Moose right now, how much longer before this becomes a real problem for Padre? Will the culture of NIH set in? Should Padre embrace the Moose future now? Has anyone even benchmarked/profiled how much of an impact Moose would have on Padre?

So Internets, what do you think?


Tuesday, September 1, 2009

Re: Moose Or No Moose

Moose love controversy, in fact you could say that they are nature's attention whores (second only to those nasty hairless apes that seem to be everywhere!). If they aren't endorsing controversial politicians, having kinky sex or crashing expensive cars then you can likely find them partying all night at the hottest clubs. And of course all this attention leads not only to a lot of imitation but even some haters.

The same can be said of the Moose perl module (well, the controversy part anyway, modules can't have sex or drive cars, DUH!). The latest controversy is surrounding a blog post by David Cross in which he tells the story of an RT "bug" report he got after porting his Array::Compare module to Moose.

The bug report was submitted by Adam Kennedy who has a well known fetish for all things tiny. While I do not share Adam's love for CPU/Memory conservation, I do respect his efforts in trying to keep some of the critical tools of our beloved Perl infrastructure slim and easily installable. And in fact, in this case I do have to agree with Adam, before you port an existing module to Moose it helps to look downstream a little.

The benefits Moose provides for a module as straightforward as Array::Compare are actually pretty minimal, it is only using a few simple attributes and not much more. But Moose does come with a well known startup cost (don't let the haters tell you we are slow, once we get up and running things are plenty fast enough and getting faster), which Array::Compare was passing onto all its dependencies. One of those (now former) dependencies was Test::Warn which itself has 101 modules that in turn depend on it in their test suites. The result is that this simple change with minimal overall benefits has imposed a cost on a sizable chunk of the CPAN.

Now you could argue that Padre - The Perl IDE (the project that initially sparked all this) is an IDE and like other IDEs (Eclipse, Visual Studio, etc) it should take forever to start up so as to allow developers a leisurely morning coffee break. Or you could argue that it is Dave's module and he can do whatever he wants with it, downstream dependencies be damned. You could even say that Moose will eventually be fast enough so people should stop whining and just be patient. But in the end, we (CPAN) are a large and heavily interdependent community and I believe we should be respectful to our neighbors as much as possible.

Anyway, Nuff Said,... Peace out to all mah homeys attending YAPC::Asia, wish I could have come this year.

Thursday, August 27, 2009

Rollin' like a Moose-flavored steamroller!

So Chris Prather pointed out Adam Kennedy's latest CPAN Top 100 data to me and noted that Moose now has more downstream dependencies (see the "Volatile 100" tab) than DBI does (we have 1032 compared to DBI's 977). Now of course, DBI is a much more widely used module out in the DarkPAN and I harbor no delusions about what this comparison really means. However, Moose is growing like crazy the last few months and while we still don't have as many downstream dependencies as Class::Accessor (they have 1694, so just 662 left to go) we did recently surpass them on direct dependencies (Moose has 591 and C::A has 567). And looking over the list of authors of these Class::Accessor modules I do see a lot of known Moose conspirators.

It seems I have created a monster :)

Tuesday, July 28, 2009

Plumbing 101 - Fixing my leaky Intertubes

So this will be a short one, but I promised Yuval I would write it up.

My recent big refactoring project at work has been largely about switching a mad-cap collection of YAML config files, database tables, poorly inferred relationships and other sorts of random insanity that accumulates over 5 years of maintaining and extending an application, and inverting this unmaintainable mess into a nice clean happy KiokuDB based set of objects.  

So all is going well until the other day when I upped the number of FCGI backends and the concurrency exposed  a leak in my objects which was not visible when using only the 1 FCGI backend for development. So the first thing I did was to turn to my unit tests and see if I could detect the leak outside of my web-app using Test::Memory::Cycle. Of course I didn't find one there, that would have been too easy, it was buried deeper inside the web-app. So I then enlisted Yuval for help since his knowledge of the Perl guts far exceeds what mine will ever be. 

Unfortunately for me, this application pre-dates Catalyst so I could not use Catalyst::Model::KiokuDB, but I was able to cargo-cult the core of that module and stuff it into the homegrown web framework that this did run on. So while this didn't solve the problem it did allow me to watch the problem happen and gather nice statistics. I can't stress enough how important it is to get your monitoring tools set up first so you're not digging through false positives and/or useless information. Yuval and I spent a fair amount of time tweaking this until we finally found the right settings that gave us the perfect balance of information.

After this it was a lot of playing around with Devel::FindRef and Scalar::Util::weaken until I found the right settings and all my leaks were gone. One particularly evil leak was closures passed into Template::Toolkit params that referenced themselves inside the closure. This resulted in the entire template object leaking, including all associated parameters (in one case this meant hundreds of objects). I realized as I was fixing this, that this leak (and a few others) had existed for probably the entire 5 years this application has been running. Only now that it was leaking KiokuDB objects and causing visible issues did I actually notice.
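
For the curious, here is a stripped-down sketch of the self-referencing closure leak and the usual Scalar::Util::weaken fix. It is not the application code; the Tracked class and the make_leaky / make_clean subs are invented so that a DESTROY counter can show whether the captured object was ever freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;
{
    package Tracked;
    sub new     { bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# Leaky: the closure captures $cb, the very variable that holds the
# closure, so its reference count can never drop to zero.
sub make_leaky {
    my $payload = Tracked->new;
    my $cb;
    $cb = sub { my $n = shift; $n <= 0 ? $payload : $cb->($n - 1) };
    $cb->(3);
    return;
}

# Fixed: the closure only captures a weakened copy of itself, so the
# single strong reference in $cb goes away when the sub returns.
sub make_clean {
    my $payload = Tracked->new;
    my ($cb, $weak_cb);
    $cb = sub { my $n = shift; $n <= 0 ? $payload : $weak_cb->($n - 1) };
    weaken($weak_cb = $cb);
    $cb->(3);
    return;
}

make_leaky();
my $after_leaky = $destroyed;
make_clean();
my $after_clean = $destroyed;

print "destroyed after leaky call: $after_leaky\n";
print "destroyed after clean call: $after_clean\n";
```

In the leaky version everything the closure captured stays alive along with it, which is how one small callback can pin a whole template object and its parameters.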

So while I don't feel that I am now some kind of Master Plumber or anything, I do feel confident enough to fix a leaky intertube here or there. And as always I am amazed by the flexibility of Perl and the wonders that CPAN provides.  It is an odd combination of pride and shame to have finally cleaned up 5 year old leaks. 

Anyway, back to work ...

Thursday, July 16, 2009

More Thoughts on Parameterized Roles

So my last post on parameterized roles has really got me thinking. One of the first use cases I ran into for parameterized roles was MooseX::Storage. Chris Prather and I wrote it on the train into NYC one day and since there was no such thing as MooseX::Role::Parameterized yet we hacked it with an exported Storage subroutine which composes in multiple roles based on the parameters that were passed to it. This is actually still how MooseX::Storage works because, well, it Just Works(tm) so there is no need to change it. But I decided as a thought experiment and a test of the Role Functor idea from my last post to see if I could re-write MooseX::Storage in terms of it. Much to my delight it not only worked, but came out very cleanly.

Now, this is still written in MooseX::Declare inspired pseudo-code, so it is not yet a reality, but I am getting more and more convinced that this is something I really need to write. So anyway, here goes.

role COLLAPSER { requires 'pack', 'unpack' }
role FORMATTER { requires 'thaw', 'freeze' }
role IO        { requires 'load', 'store'  }

role DefaultCollapser with COLLAPSER {

    method pack {
        Collapser::Engine->new( object => $self )
                         ->collapse_object
    }

    method unpack ($class:, $data) {
        Collapser::Engine->new( class => $class )
                         ->expand_object( $data )
    }
}

role JSONFormatter [ 
        Collapser => (does => COLLAPSER) 
    ] with FORMATTER {

    method thaw ($class:, $json) {
        # JSON text in, object out
        $class->unpack( JSON::Any->decode( $json ) )
    }

    method freeze {
        # object in, JSON text out
        JSON::Any->encode( $self->pack )
    }
}

role SimpleFile [ 
        Formatter => (does => FORMATTER) 
    ] with IO {

    method load ($class:, $filename){
        my $fh   = IO::File->new( $filename, 'r' );
        my $data = do { local $/; <$fh>; };
        $class->thaw( $data );
    }

    method store ($filename) {
        my $fh = IO::File->new( $filename, 'w' );
        $fh->print( $self->freeze );
    }
}

I am obviously punting on a couple of details here to keep things simple for the example, but I think it gets the point across.  The nice part, in my opinion, is that the parameterization nicely captures the "levels" of serialization. For instance, here is what a class that does all the options would look like:

class Point 
 with SimpleFile( 
          Formatter => JSONFormatter( 
              Collapser => DefaultCollapser 
         ) 
    ) {
    has x => (is => 'rw', isa => 'Int', default => 0);
    has y => (is => 'rw', isa => 'Int', default => 0);

    method clear {
        $self->x(0);
        $self->y(0);
    }
}

And here is a class which does not do the load/store but just does the JSON freeze/thaw:

class Point
 with JSONFormatter( 
          Collapser => DefaultCollapser 
    ) {
    has x => (is => 'rw', isa => 'Int', default => 0);
    has y => (is => 'rw', isa => 'Int', default => 0);

    method clear {
        $self->x(0);
        $self->y(0);
    }
}

And here is a class which does only the simple pack/unpack:

class Point with DefaultCollapser {
    has x => (is => 'rw', isa => 'Int', default => 0);
    has y => (is => 'rw', isa => 'Int', default => 0);

    method clear {
        $self->x(0);
        $self->y(0);
    }
}

Overall I am quite happy with this, so now it is just a matter of finding the tuits to actually implement it.

Sunday, July 12, 2009

Thoughts on Parameterized Roles

I was discussing parameterized roles with Sartak and doy at YAPC::NA this year. Sartak is the author of the very cool MooseX::Role::Parameterized module, which implements pretty much unlimited parameterization abilities for roles. The sheer, unbridled flexibility embodied in that module is insane, which, of course, is both really cool and really scary at the same time. One of our discussion points was about how so much flexibility, if misused, pretty much destroys the benefits of allomorphism you get from roles. With enough parameterization the statement $object->does(SomeRole) has very little meaning anymore since SomeRole could easily be parameterized so that two instances of it do wildly different things. One of the thoughts discussed for solving this problem was to create a stricter set of different kinds of parameters that are allowed, essentially restricting the functionality to a sane subset through which we can provide some level of guaranteed allomorphism. While we pretty much rejected that idea for MooseX::Role::Parameterized, the idea stuck in my head.

So the other day on #moose, I was discussing parameterized roles again with Sartak and doy and I mentioned how I have always seen parameterized roles as being very close to ML Functors. The ML family of languages (Standard ML, OCaml, etc.) has an extremely powerful module system which not only has modules (structure in SML) and module signatures (the "type" of the module) but also functors. Functors are best described as modules which take another module as an argument and produce a third module as a result. The book "ML for the Working Programmer" (highly recommended, it is a great book) shows the following conceptual mapping to try and help describe the ML module system.

  structure ~ value
  signature ~ type
    functor ~ function

But as the book says, this is a helpful starting point, but it fails to convey the full possibilities of the ML module system. 

So at one point in this discussion I decided to try and sketch out how Functor-esque parameterized roles might look and I came up with this (using MooseX::Declare inspired pseudo-code).

role ORDERING { requires 'compare' }

role Sortable [Ordering => (does => 'ORDERING') ] {
    sub sort {
        my ($self, @elements) = @_;
        sort { $self->compare($a, $b) } @elements
    }
}

role StringOrder with ORDERING {
    sub compare {
        my (undef, $x, $y) = @_;
        $x cmp $y;
    }
}

role NumericOrder with ORDERING {
    sub compare {
        my (undef, $x, $y) = @_;
        $x <=> $y;
    }
}

role AlphabeticalOrder with ORDERING {
    sub compare {
        my (undef, $x, $y) = @_;
        lc($x) cmp lc($y);
    }
}

class BunchOfStrings with Sortable(StringOrder) {
    # ...
}

class BunchOfNumbers with Sortable(NumericOrder) {
    # ...
}

The first role ORDERING is just a role that requires the compare method and nothing more (an interface), which maps to the ML idea of a signature.

The second is the parameterized role Sortable which implements a sort method and expects a single role parameter Ordering which must be a role that does the ORDERING interface. This role maps to the ML idea of a Functor. If you notice the Sortable::sort method calls a compare method, which is a method of the ORDERING interface role. The idea here is that the role provided in the parameter Ordering will get composed into the Sortable role and provide the expected compare method. 

The next three roles are just examples of roles that do the ORDERING interface role. Basically one for each of the most common Perl sorting behaviors (at least the most common in my experience). These are pretty simple and straightforward, nothing special here.

After this are a few classes that show how this mechanism might get used. The Sortable(StringOrder) syntax shows the passing of the role parameter (in this case StringOrder) to the parameterized role Sortable. The result of this will produce a third role which is then composed into the BunchOfStrings class.

So, while this is much more restrictive than MooseX::Role::Parameterized, it is much more flexible than simply creating a restricted subset of parameterizable bits. It also (perhaps) solves the allomorphism issue since the "name" of the Sortable(StringOrder) role is simply Sortable(StringOrder) and this clearly provides a predictable and repeatable set of functionality.
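
Since the pseudo-code above cannot be run, here is a very rough approximation of the same shape in plain Perl, with ordinary packages standing in for roles and inheritance standing in for role composition. Everything here is an assumption made for the sketch: the make_sortable function, the Sortable::<Ordering> naming scheme, and the method name sort_elements (used instead of sort to stay clear of the builtin). It is only meant to show the "take one module, produce a predictably named third module" idea.

```perl
use strict;
use warnings;

# Orderings: plain packages that each provide compare(), standing in
# for roles that do the ORDERING interface.
{ package StringOrder;  sub compare { my (undef, $x, $y) = @_; $x cmp $y } }
{ package NumericOrder; sub compare { my (undef, $x, $y) = @_; $x <=> $y } }

# Hypothetical "functor": takes the name of an ordering package and
# returns the name of a generated package that provides sort_elements().
sub make_sortable {
    my ($ordering) = @_;
    $ordering->can('compare')
        or die "$ordering does not provide compare()\n";

    my $generated = "Sortable::$ordering";
    no strict 'refs';
    unless (defined &{"${generated}::sort_elements"}) {
        @{"${generated}::ISA"} = ($ordering);
        *{"${generated}::sort_elements"} = sub {
            my ($self, @elements) = @_;
            return sort { $self->compare($a, $b) } @elements;
        };
    }
    return $generated;
}

{
    package BunchOfNumbers;
    our @ISA = (main::make_sortable('NumericOrder'));
    sub new { bless {}, shift }
}

my @numbers = BunchOfNumbers->new->sort_elements(10, 9, 100);
my @strings = make_sortable('StringOrder')->sort_elements(10, 9, 100);

print "numeric: @numbers\n";
print "string:  @strings\n";
```

Calling make_sortable with the same ordering always gives back the same package name, which is the property that makes an isa (or, in the real thing, a does) check meaningful again.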

So anyway, I do not currently have the tuits to implement this and honestly I kind of want to let this stew for a little longer. It would not replace MooseX::Role::Parameterized but perhaps be called MooseX::Role::Functors or something and can be just another way to do it.

Why I don't like Autobox

So I was looking over Michael Schwern's perl5i module recently (after hearing about it at this year's YAPC::NA) and I noticed that it enables the autobox module. This reminded me of all the debates I have had with Matt Trout over the years about the various pros and cons of autobox. So I figured this would probably make a decent blog post, so here goes.

My core objection to autobox is that it is an illusion. It works by hijacking the normal perl method resolution process and right before perl says "Cannot call method 'foo' on unblessed reference" it checks specific packages to see if there are available methods. This gives the illusion that these core perl types are in fact objects, when in reality they are very much not. If they were proper objects, they would always be objects instead of just objects within the lexical scope of the autobox pragma. Here is some code that illustrates what (for me) is the big abstraction leak of autobox.

my $test;
{
    use autobox;
    my $foo = [ 1, 2, 3, 4, 5 ];
    warn $foo->length;
    $test = sub {
        warn $foo->length; # succeeds ...
        $foo;
    };
}

my $x = $test->();
warn $x->length; # fails

This example shows how the lexical scoping of the autobox pragma allows the $test closure to still work correctly, but once outside of the lexical scope the value is no longer autoboxed. This just seems really backwards to me because it requires the users of your code to also enable autobox in their code to use elements from your code. The result of them (for whatever reason) not doing this is that your internal usage of a data element can greatly differ from external usage of the same element. This is an API disconnect that does not sit well with me.

In short, autoboxing is a feature of the lexical environment and not something intrinsic to the element itself. 

My second issue with autobox is that it is very shallow. In languages where the core types are proper objects (Smalltalk, Ruby, Javascript, etc.) it is possible to subclass/extend these core types using normal OO practices. Autobox provides the illusion of normal OO, but as soon as you look any deeper than the surface the illusion starts to crumble at an alarming rate.

While it is possible to do something close to subclassing/extending with autobox code by using the following technique, it has some severe drawbacks and serious inconsistencies. 

{
    package ARRAY;
    sub length { scalar @{ $_[0] } }
    
    package MyArray;
    use base 'ARRAY';
    # do something silly here for illustration
    sub length { (shift)->SUPER::length + 1 }
}

{
    use autobox;
    my $foo = [ 1, 2, 3, 4, 5 ];
    warn $foo->length; # 5

    my $bar = bless [ 1, 2, 3, 4, 5 ] => 'MyArray';
    warn $bar->length; # 6
}

The most obvious issue is that this only works for reference types (ARRAY, HASH and CODE) since Perl only allows blessing of references. So you cannot use this with SCALAR, INTEGER, FLOAT, NUMBER, STRING and UNDEF which leaves out more than half of the functionality of autobox.

Also, the manual blessing of the subclassed array ref seems a little odd since it differs from how the regular autoboxed array ref works. Of course you could create a MyArray::new method to hide this if you want. If you did this then perhaps for consistency's sake you would want an ARRAY::new as well. But unless you blessed the array ref into the ARRAY package then a user of your code would need to have autoboxing enabled for ARRAY->new to return anything useful, because (as I said above) the autoboxing is not intrinsic functionality, but instead functionality of a given lexical environment.

Now, my last issue with autobox is that if used with the wrong kind of laziness it can expose the internals of an object, defeat encapsulation and make bad APIs. This was the original motivation behind my writing MooseX::AttributeHelpers after having written Moose::Autobox. Take this example for instance.

{
    package MyThings;
    use Moose;
    use Moose::Autobox;
    
    has 'things' => (
        is      => 'ro',
        isa     => 'ArrayRef',   
        default => sub { [] },
    );
    
    my $me = MyThings->new;
    
    $me->things->push( 1 );
}

It is very tempting to just let the autoboxing provide the API to add things to your object, but this exposes a lot of internal details to your object's consumer. If at some point you want to change how things are stored you will have a lot of work to do. Of course this is better than if users had been doing push(@{ $me->things }, 1) because you still have the encapsulation of the autoboxed APIs. But having to write an interface to match ARRAY for whatever you change things to use is just going to get nasty after a while.

Perceptive readers will also note that $me->things->push( 1 ) will not work unless autoboxing is enabled in that particular lexical environment, again placing a lot of responsibility on the users of your code just to use the API you're providing.

In contrast the MooseX::AttributeHelpers (soon to be core Moose) version is much more encapsulation friendly and is much more amenable to future changes to the storage type of things.

{
    package MyThings;
    use Moose;
    use MooseX::AttributeHelpers;
    
    has 'things' => (
        traits   => [ 'Collection::Array' ],
        is       => 'ro',
        isa      => 'ArrayRef',   
        default  => sub { [] },
        provides => {
            push => 'add_thing'
        }
    );
    
    my $me = MyThings->new;
    
    $me->add_thing( 1 );
}

If you change how things are stored, you simply need to rewrite the add_thing method. Everything is properly encapsulated within your object, as it should be.
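As a rough sketch of what such a change might look like, suppose things were moved from an array to a hash that counts how often each thing was added. The _things attribute and the hand-written add_thing below are purely illustrative assumptions, not something MooseX::AttributeHelpers generates for you.

```perl
use strict;
use warnings;

{
    package MyThings;
    use Moose;

    # Hypothetical new storage: a hash of thing => number of times added.
    has '_things' => (
        is      => 'ro',
        isa     => 'HashRef',
        default => sub { {} },
    );

    # Hand-written replacement for the previously generated add_thing method.
    sub add_thing {
        my ($self, $thing) = @_;
        $self->_things->{$thing}++;
        return;
    }
}

my $me = MyThings->new;

# The calling code is exactly the same as before the storage change.
$me->add_thing( 1 );
```

The point is only that callers of add_thing never see the difference, whereas callers of $me->things->push( 1 ) would all have to be updated.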

So anyway, that's enough of my autobox ranting. I think that autobox is an extremely interesting piece of software, and by no means do I think people should not use it if they are so inclined. But I think it should be used carefully and with full knowledge of its limitations and issues.


Saturday, June 13, 2009

Why make_immutable is recommended for Moose classes

Someone on perlmonks asked
Can you point me to a good explanation of why make_immutable is recommended?
And I realized in the documentation we really only say (in Moose::Manual::BestPractices)
making classes immutable speeds up a lot of things, most notably object construction.
So instead of burying my explanation deep inside Perlmonks I thought I would explain it here (and add to my Iron Man creds).

So, Moose metaclasses are built specifically so that they can be altered at any time from anywhere and still remain a valid and correct class. This is why there is no __PACKAGE__->finalize_class or similar type of method call required at the end of your Moose class definition. But doing things this way does come at a price in that some of the meta-level calls can be very expensive.

For instance, if you wanted to know all the attributes supported by a class, you would need to collect all the local attributes, then visit each superclass (recursively) and collect all those attributes while being sure to skip all overridden attributes. This can get quite expensive, and since we allow you to alter the inheritance structure or add/delete attributes via the MOP at any time, we cannot cache the results of that query (well, we could cache it, but then we would have to have all sorts of extra code to check the cache and invalidate it, etc. etc.).
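To make that walk a bit more concrete, here is a minimal sketch in plain Perl. It is not how Moose actually implements this: the %ATTRIBUTES registry and the all_attributes function are made up for illustration, standing in for what the metaclass tracks.

```perl
use strict;
use warnings;

# Hypothetical registry of locally declared attributes: class => { name => spec }.
# Moose keeps this information in the metaclass; a plain hash stands in for it here.
my %ATTRIBUTES = (
    'Animal' => { name  => 'ro', sound  => 'ro' },
    'Dog'    => { sound => 'rw', tricks => 'rw' },   # overrides 'sound'
);

{ package Animal; }
{ package Dog; our @ISA = ('Animal'); }

# Collect the local attributes first, then visit each superclass recursively,
# skipping any attribute that a more specific class has already provided.
sub all_attributes {
    my ($class, $seen) = @_;
    $seen ||= {};

    my $local = $ATTRIBUTES{$class} || {};
    for my $name (keys %$local) {
        $seen->{$name} = "$class ($local->{$name})"
            unless exists $seen->{$name};
    }

    no strict 'refs';
    all_attributes($_, $seen) for @{"${class}::ISA"};

    return $seen;
}

# Every call repeats the whole walk, because @ISA or the registry
# could have changed since the last time we asked.
my $attrs = all_attributes('Dog');
print "$_ => $attrs->{$_}\n" for sort keys %$attrs;
```

Here the sound attribute comes from Dog rather than Animal, because the subclass is visited first and the superclass version is skipped.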

So what you are doing when you make a Moose class immutable is actually saying "it is okay to cache things, I am not going to mess with the metaclass". At that point Moose takes the opportunity to memoize many of the MOP calls and install methods that throw exceptions when you try to alter the metaclass, effectively making the class read-only. However, this really only helps speed up calls to ->meta methods, so we also then take it one step further.

The example I gave above, of checking all attributes in a class, may seem kind of esoteric and not something one usually needs to care about, but this is exactly what Moose needs to do every time it creates an instance of an object. It needs to do this in order to properly initialize all the slots in an instance, fire any triggers, check any type constraints, perform any type coercions and call all BUILD methods in the inheritance graph in the correct order. By memoizing the computed list of all inherited attributes we are actually saving quite a lot of computation, but honestly that is not enough. So we actually take the opportunity to inline and compile our own optimized constructor method that does the exact same thing, but in much less time. The result is that object construction is significantly faster during the runtime of the program (which is when it really counts) and we instead take the compile-time hit of the code construction and evaluation. And since we are in there already we also inline a DESTROY method which correctly calls all the DEMOLISH methods in the correct order (Moose already, by default, will inline your attribute accessors, but if it didn't then it would do that as well).

So the short answer is that making your class immutable is good because it memoizes several metaclass methods and installs an optimized constructor and destructor for your class and therefore helps reduce a fair amount of the cost (during runtime) of all the abstraction that the MOP provides.
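For completeness, this is roughly what it looks like in practice. The Point class is just a made-up example; the make_immutable, is_immutable and add_attribute calls are the regular metaclass methods.

```perl
use strict;
use warnings;

{
    package Point;
    use Moose;

    has 'x' => ( is => 'ro', isa => 'Int', default => 0 );
    has 'y' => ( is => 'ro', isa => 'Int', default => 0 );

    # Tell Moose we are done changing the class, so it can memoize
    # the metaclass queries and inline the constructor and destructor.
    __PACKAGE__->meta->make_immutable;
}

# Object construction now goes through the inlined constructor.
my $point = Point->new( x => 1, y => 2 );

# The metaclass now refuses changes, so this call throws an exception.
my $ok = eval { Point->meta->add_attribute( 'z' => ( is => 'ro' ) ); 1 };

print "Point is immutable\n" if Point->meta->is_immutable && !$ok;
```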

Monday, June 8, 2009

YAPC::NA Moose Hackathon

We are planning a Moose hackathon after YAPC::NA this year. It should be a nice complement to Dave Rolsky's Moose course as well as the several Moose-related talks on the schedule, so if you can stay an extra day or two in Pittsburgh, come and hang out and talk/hack some Moose.

Saturday, June 6, 2009

Moose Bus Factor

One of the most common problems in many software projects is the Bus Factor. This can be especially true of Open Source projects even if they have a lot of contributors, because it is not often that a project will get another developer who is as steeped in the code as the original author.

I am happy to say that the Bus Factor for Moose is now a solid 2 and able to expand easily up to 5, and Class::MOP is the same if not better. This is not to count out the other 50+ contributors at all. Several of them have been steadily climbing in these lists over the past few months.

Here is the output of git-blame for Moose.

Total lines: 45996
       18954  41.21%  Stevan Little
       15270  33.20%  Dave Rolsky
        3556   7.73%  Yuval Kogman
        2414   5.25%  Shawn M Moore
        1038   2.26%  Chris Prather
         414   0.90%  John Napiorkowski
         387   0.84%  Tomas Doran
         333   0.72%  Guillermo Roditi
         298   0.65%  Jesse Luehrs
         274   0.60%  Todd Hepler
         259   0.56%  Lars Dieckow
         259   0.56%  Hans Dieter Pearcey
         195   0.42%  Anders Nor Berle
         187   0.41%  Matt S Trout
         186   0.40%  Nathan Gray
         186   0.40%  Aankhen
         182   0.40%  Jonathan Rockway
         177   0.38%  Jesse Vincent
         172   0.37%  Aran Clary Deltac
         150   0.33%  *initial checkin
         110   0.24%  John Goulah
         103   0.22%  Tomas Doran (t0m)
          89   0.19%  Dann
          88   0.19%  Marcel Grunauer
          88   0.19%  Ricardo SIGNES
          73   0.16%  Jess Robinson
          63   0.14%  Evan Carroll
          56   0.12%  Justin DeVuyst
          55   0.12%  Ash Berlin
          50   0.11%  Daisuke Maki (lestrrat)
          46   0.10%  Wallace Reis
          41   0.09%  Florian Ragwitz
          40   0.09%  Tokuhiro Matsuno
          33   0.07%  Adam J. Foxson
          32   0.07%  Marc Mims
          23   0.05%  Robert 'phaylon' Sedlacek
          18   0.04%  Cory G Watson
          18   0.04%  Scott McWhirter
          15   0.03%  Paul Jamieson Fenwick
          13   0.03%  michaelr
          11   0.02%  Tomas Doran (t0m
           7   0.02%  Adam Kennedy
           5   0.01%  Robert Boone
           5   0.01%  Nelo Onyiah
           4   0.01%  t0m
           4   0.01%  Mike Whitaker
           2   0.00%  Piotr Roszatycki
           2   0.00%  Eric Wilhelm
           2   0.00%  Paul Driver
           2   0.00%  Christian Hansen
           2   0.00%  hakim
           1   0.00%  Marcus Ramberg
           1   0.00%  John Douglas Porter
           1   0.00%  Jay Hannah
           1   0.00%  Shlomi Fish
           1   0.00%  Cory Watson

And here is the same for Class::MOP.

Total lines: 26107
       11109  42.55%  Stevan Little
        5107  19.56%  Dave Rolsky
        4525  17.33%  Chris Prather
        1919   7.35%  Yuval Kogman
        1606   6.15%  Florian Ragwitz
         702   2.69%  Guillermo Roditi
         647   2.48%  Shawn M Moore
          98   0.38%  nperez
          89   0.34%  Matt S Trout
          70   0.27%  Ricardo SIGNES
          50   0.19%  *initial checkin
          42   0.16%  Tomas Doran
          36   0.14%  Todd Hepler
          29   0.11%  Hans Dieter Pearcey
          26   0.10%  Jesse Luehrs
          24   0.09%  Marc Mims
          11   0.04%  Brandon L Black
           5   0.02%  Robert Boone
           4   0.02%  Scott McWhirter
           3   0.01%  Jonathan Rockway
           2   0.01%  Flavio Poletti
           2   0.01%  Shlomi Fish
           1   0.00%  Rob Kinyon

I am quite happy with this, as it now means that I can cross the street without fear.


Wednesday, June 3, 2009

Moose and DWIMery

So Ovid recently discovered that Moose does not create any accessors by default, which was surprising to him and truthfully has surprised many people over the years. We on #moose have discussed this many times and the general consensus has always been to leave it as it is. I explain in the comments to Ovid's post why this is so, but I figured that for my inaugural blog post I should expand on this topic.

DWIMery and the Slippery Slope

DWIMery ("Do What I Mean"-ery) can be a very valuable thing when designing APIs but it does come at a cost. The more specific your API, the easier it is to DWIM since the option set is likely pretty small and defaults are usually obvious. But the more general your API, the harder it is to strike a balance. The problem gets even more so when you are designing something like an object system or a language. A system like Moose needs to not only support doing what I mean, but also doing what everyone else means as well. 

Opinionated Software vs. TimToady

Recently there has been a trend towards more "opinionated" software (Ruby On Rails) and even "opinionated" languages (Python). The popularity of both these pieces of software shows that many people like this trend. However, Moose is Perl, and in Perl we subscribe to TIMTOWTDI (There Is More Than One Way To Do It). On some level, you could say that opinionated software is actually the antithesis of TIMTOWTDI.

Now this is not to say that Moose is not opinionated or is somehow the pinnacle of TIMTOWTDI. In fact Moose is actually pretty opinionated, and I strongly believe that too much TIMTOWTDI is one of the reasons that Perl has the negative reputation it has for maintainability and code clarity. But what Moose does differently is to be humble about its opinions and make it easy (for some value of "easy") to override those opinions and inject your own.

Chris Prather actually suggested just such a solution in one of his comments to Ovid's post. The syntax looks something like this:

package Foo;
use Moose -traits => ['ReadOnly'];

has 'bar';
has 'baz';

This could be accomplished by making a "trait" (the Moose term for a role that is applied to a meta-level object) which would affect the metaclass such that any time an attribute was created it would force a default read-only accessor to be created. While this sounds complicated it would actually be fairly simple, the trickiest part being dealing with merging your default read-only-ness with any user specified options.

Why Moose doesn't create accessors by default

Moose has always aimed to be as Perl-ish as possible, which means trying to embody the spirit of TIMTOWTDI. As I mentioned in one of my responses to Ovid's post, the choice of which type of accessors Moose should create is not so simple. My personal inclination is towards generating simple read-only accessors; others might expect read/write accessors to be the default (which is what other common Perl OO modules like Class::Accessor provide). But this ignores the suggestions that Damian made in Perl Best Practices, or the people who like semi-affordance accessors (->foo for reading and ->set_foo for writing), or the people who prefer public readers/private writers. The list can go on and on, and each and every one of these is an equally valid choice.

In my mind the only solution when faced with all these differing and equally valid viewpoints is to actually favor none of them, but allow all of them. And of course, this is exactly what Moose does. I believe that this is most in keeping with the spirit of TIMTOWTDI and therefore the most Perl-ish.
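To illustrate, here is a small sketch of how each of those styles can be asked for explicitly. The Example class and its attribute names are invented for this post; the is, reader and writer options are the standard ones accepted by has.

```perl
use strict;
use warnings;

{
    package Example;
    use Moose;

    # Simple read-only accessor: ->name
    has 'name' => ( is => 'ro' );

    # Read/write accessor: ->color to read, ->color($new) to write
    has 'color' => ( is => 'rw' );

    # Separate get_/set_ methods
    has 'weight' => ( reader => 'get_weight', writer => 'set_weight' );

    # Semi-affordance style: ->size to read, ->set_size to write
    has 'size' => ( reader => 'size', writer => 'set_size' );

    # Public reader, private writer
    has 'status' => ( reader => 'status', writer => '_set_status' );
}

my $e = Example->new(
    name   => 'moose',
    color  => 'brown',
    weight => 400,
    size   => 1,
    status => 'new',
);

$e->color('grey');
$e->set_weight(450);
$e->set_size(2);
$e->_set_status('seen');   # by convention, only called from inside the class
```

None of these is treated as more correct than the others; you simply say which one you mean.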