Ryan's code wonderings: September 2012

Sunday, September 30, 2012

Closures or software bags

I'm not sure closures are something you can really understand without actually working with them. I don't think I did (chances are you're smarter than me, though, so I may be the outlier). I read about them a number of times after first hearing about them, and I kind of felt like I might understand them, but it didn't really click until I wrote some code and experienced them myself. That said, I'm going to try to explain them, partly because I might actually understand them incorrectly, but mostly because I view them in a way I haven't found explained anywhere else. So here goes my explanation of closures, written after I didn't understand other people's explanations.

Oh yeah, last note before I start: closures are simple. Really, really simple. Everyone who tells you they aren't is cheating you, like when you were in school and everyone told you math is hard: true or not, they were undermining you right from the jump. I'm not saying this to give you a confidence boost, though. If you don't think they're simple, you're more likely to think you don't understand them when you actually do, because they're simple. OK, on with it already.

So I'm a visual thinker to start with, and most things tend to conjure up an image for me. For a closure, it's a bag. Specifically, a nice leather bag with a leather drawstring, the kind you'd expect some hippie at a Lord of the Rings conference to have on his belt. I think this is just because I associate closures with Greybeards, thanks to LISP. Either way the image fits for me, because if you found a closed leather bag like this, you could imagine literally anything being inside. Maybe it has some Canadian coins, maybe it has a miniature time machine, a crossbow, and/or a beady-eyed albino stretch-limo rat. It could also be empty, or have another bag inside of it; more on that later.

OK, so we have a funny-looking bag with undetermined contents. What of it? Well, somebody at some point made this bag, and somebody put stuff in it. Why? If I had a bag like this that I was going to put stuff into, I would put in things that work together. Let's imagine a different bag for a moment: a student's backpack. Inside you'll find topical books, notebooks, pens, and a calculator. These things all definitely work together. You can't operate the math book without the calculator (I too think this should be wrong, but you've seen these as well as I have), or the paper without one of the pens, and so on, so they have a purpose together as a whole. One could say you couldn't operate the school without this bag; we'll get there in a moment.

So now we have an extremely abstract concept. Why? Well, it basically holds true to what a closure is. Now that that's out of the way and you think I'm mental for wasting your time, I'll actually show a closure. I'm going to use JavaScript because it's commonly understood and closures are pretty much first class in it.

var baggy = function() { alert('woo'); };

OK, are you underwhelmed? Good. Now, don't think function is synonymous with closure. Sorry if that's what you just got from that. Here's what I hoped to show there: there is a bag(gy) with a little computation inside it which spits out a message box when evaluated. A message box machine sounds like something you'd find in a Greybeard's little leather bag.

Now, I want to address a common misconception about closures. I haven't talked about it yet, and some people who think they understand closures, or who have read other stuff about them, probably think my explanation so far is pretty far off. Why is that? Scope. I haven't said anything about it at all. A lot of explanations and articles on closures are based entirely on the fact that closures seem to behave strangely in relation to scope. Defining a closure by the rules it follows regarding scope, however, is like defining an apple by the count and arrangement of its seeds. It's useful if all you care about is the seeds (scope), but it won't really do if you want the whole apple.

Now that that's out of the way, and I've made it over a page without giving any of the explanations you came for, I'll try to give you something useful.

(function() {
  // Here is my current scope, look nothing interesting, so forget about scopes

  var someX = 40;

  var aClosure = function(x) {
    var y = 2;
    alert(x + y);
  };

  aClosure(someX);
}).call((function(){/* Here is my parent scope, also nothing so forget about scopes! */})());

Here we go, there's more code, so it must be showing something more complex and detailed, right? Eh, not much. aClosure is slightly more interesting than the last closure I showed you, though. So, back to the bag analogy: what's in that bag? Well, there's an opening in the bag that allows us to put things in it (the parameter), the value 2 held in a variable, and a computation that generates message boxes. Still pretty boring, though. It looks like a function and doesn't show anything we didn't already know. Come on, we wanted to see what this closure thing is that we keep hearing all this hype about!

(function() {
  // Here is my current scope, look nothing interesting, so forget about scopes

  var someX = 40;

  var aClosure = function(x) {
    var y = 2;
    alert(x + y);
  };

  var bClosure = function(x) {
    var thisVarExistsInThisClosure = x;
    aClosure(thisVarExistsInThisClosure);
  };

  bClosure(someX);
}).call((function(){/* Here is my parent scope, also nothing so forget about scopes! */})());

Now some of you read that and think, "Well, now you've got two closures and the code is getting contrived, will you just explain the bloody thing already??" Remember how I said they're a lot simpler than you think? That's what makes them difficult for a lot of people: you see them so often that you don't recognize them as anything special. Above you can see I have two bags now: the same one from before, and a new one, bClosure. The new one has an opening for putting a value into, a local variable to hold whatever you put in through that opening, and a pointer to our first closure. Wait, what was that last thing? If aClosure doesn't exist inside bClosure, why does bClosure have access to it? The answer to this is why people always think closures have something to do with scope.

Closures are bags, and they have three kinds of things inside them:

Variables, sometimes by value and sometimes by reference, though even a reference is a value; that value just happens to be a memory address.
References to functions that may do something useful. The notebook in a student's backpack is useless without a pen or pencil, remember?
Other bags, each holding another cluster of useful things that interact with each other to produce a result useful to the bag they're in. Think of a student's backpack inside a school: it's one of many things in a school that are mutually beneficial when combined.
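To make those three kinds of contents concrete, here is a minimal sketch. The names makeBackpack, pen, and pencilCase are made up for illustration, and it returns strings instead of calling alert so it runs outside a browser too.

```javascript
// Hypothetical example: one "backpack" closure holding all three kinds of contents.
function makeBackpack() {
  // 1. A variable, held by value.
  var pagesLeft = 2;

  // 2. A reference to a function that does something useful.
  var pen = function (text) { return 'written: ' + text; };

  // 3. Another bag: a closure with its own private contents.
  var pencilCase = (function () {
    var pencils = 3;
    return function () { pencils = pencils - 1; return pencils; };
  })();

  // The function we hand out is the bag; it can reach everything above.
  return function (text) {
    if (pagesLeft === 0) { return 'notebook full'; }
    pagesLeft = pagesLeft - 1;
    return pen(text) + ' (pencils left: ' + pencilCase() + ')';
  };
}

var backpack = makeBackpack();
console.log(backpack('math notes'));    // written: math notes (pencils left: 2)
console.log(backpack('history notes')); // written: history notes (pencils left: 1)
console.log(backpack('doodles'));       // notebook full
```

Nothing outside makeBackpack can touch pagesLeft, pen, or pencilCase directly; the only way in is through the bag that was handed out.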

Now that you know this, I'm going to give you an example and ask you to think how you imagine the underlying JavaScript creates this closure.

(function() {
  // Here is my current scope, look nothing interesting, so forget about scopes

  var someX = 40;

  var aClosure = function(x) {
    var y = 2;
    alert(x + y);
  };

  var bClosure = function() {
    var thisVarExistsInThisClosure = someX;
    aClosure(thisVarExistsInThisClosure);
  };

  bClosure();
}).call((function(){/* Here is my parent scope, also nothing so forget about scopes! */})());

If you didn't notice the difference between this example and the previous one, look at the number of parameters to bClosure. Why was I able to do this? If you had to write something that interpreted that code and generated a bag from it, what would it generate? Semantically it is identical to the last example I gave, isn't it? Almost. The slight difference is that previously you had to put something into the opening of your bag at execution time before the local variable would have a value. So what is it doing under the covers now to fill in the local variable for bClosure, since we aren't putting the value in through an opening?

Simple: when the bag is created, some things are put in it right away, namely anything it needs that's available at definition time; in our example, someX. So whatever the Greybeard put in the leather bag that I found is all still in there, and though the bag has changed hands, I can still combine the things in it to get some pretty cool results. That last part is why everyone's obsessed with scope in relation to closures: if someone else gets your bag, they can use your stuff, because you put your stuff in the bag. So closures are handy for sharing stuff.
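Here is a small sketch of the bag changing hands. The function names greybeard and finder are invented for the illustration, and the result is returned rather than alerted so it can run anywhere.

```javascript
// Hypothetical example: the Greybeard packs a bag, then someone else uses it.
function greybeard() {
  var someX = 40;          // lying around when the bag is made
  var bag = function () {  // someX goes into the bag when the bag is defined
    var y = 2;
    return someX + y;
  };
  return bag;              // the bag changes hands
}

function finder(foundBag) {
  // finder has no variable called someX, yet the bag still works.
  return foundBag();
}

console.log(finder(greybeard())); // 42
console.log(typeof someX);        // undefined: someX never leaked out of greybeard
```

One detail worth knowing: the bag holds the variable itself rather than a snapshot of its value, so if greybeard changed someX after creating the bag, the bag would see the new value.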

If that clicked for you, you're probably wondering why I wrote this in such a contrived fashion. Closures are an abstraction, and abstractions compress things. Showing you the same behavior without that abstraction meant less density. Sorry about that; I like density too, which is why I like closures. One last example now, though if you understood everything up to this point, you understand closures: at definition time, whatever its owner has can be put into the bag for use later. Told you it was simple.

(function() {
  // Here is my current scope, look nothing interesting, so forget about scopes
  someX = 40;

  aClosure = function() {
    y = 2;
    alert(this.someX + y);
  };

  aClosure();
}).call((function(){/* Here is my parent scope, also nothing so forget about scopes! */})());

Now here's where a lot of people actually run into closures: they're confused by the this keyword in JavaScript. It's tempting to read this as a reference to the bag your current bag was created in, and the example above does behave that way, but only by accident. In JavaScript, this is decided by how a function is called, not by where it was defined. Because someX is assigned without var, it becomes a property of the global object, and a plain call like aClosure() in non-strict code sets this to that same global object, so this.someX finds it. Remember how, when your bag was created, you were able to put whatever you had lying around at the time in it? Those things are reached by their plain names (someX), not through this. That's all there is to it: plain names get you what's in your bag, including whatever was placed there from the scope where your closure was defined, and this gets you whatever object the caller handed you.
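It's worth checking for yourself how this and a bag's contents differ. In JavaScript, this is bound by how a function is called, whereas captured variables are reached by their plain names. A minimal sketch, with makeBag and owner as made-up names:

```javascript
// Hypothetical example: captured variables vs. `this`.
function makeBag() {
  var someX = 40;  // captured by name when the inner function is defined
  return function () {
    // `this` is whatever the caller supplies; someX comes from the bag.
    var fromThis = (this && this.someX !== undefined) ? this.someX : 'nothing on this';
    return { captured: someX, viaThis: fromThis };
  };
}

var bag = makeBag();
var owner = { someX: 'owner value', bag: bag };

console.log(owner.bag());            // { captured: 40, viaThis: 'owner value' }
console.log(bag.call({ someX: 7 })); // { captured: 40, viaThis: 7 }
```

The captured someX stays 40 no matter who calls the bag, while this.someX changes with the caller.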

Now then, s/bag/closure/g, and as always I would love to hear feedback. As I said at the start, I could be way off base with this. That's the problem in software: you can have a complete conceptual inaccuracy and be oblivious to it. We've all been there, or at least I have. Lots.

Cheers and I hope this helps someone.

Monday, September 24, 2012

What about inheritance

Someone asked for my thoughts on a bit from a book the other day, and this is what I derived from it. I'm curious what others think of my derivation and explanation. Here's the context I was asked about:

From a book I am reading (http://shop.oreilly.com/product/0636920024231.do)

Opinions? (this relates to things besides javascript in my opinion)

Classical Inheritance is Obsolete

"Those who are unaware they are walking in darkness will never seek the light."

—Bruce Lee

In "Design Patterns", the Gang of Four recommend two important principles of object oriented design:

1. Program to an interface, not an implementation.

2. Favor object composition over class inheritance.

In a sense, the second principle could follow from the first, because inheritance exposes the parent class to all child classes. The child classes are all programming to an implementation, not an interface. Classical inheritance breaks the principle of encapsulation, and tightly couples the child classes to its ancestors.

Why is the seminal work on Object Oriented design so distinctly anti-inheritance? Because inheritance causes several problems:

1. Tight coupling. Inheritance is the tightest coupling available in OO design. Descendant classes have an intimate knowledge of their ancestor classes.

2. Inflexible hierarchies. Single parent hierarchies are rarely capable of describing all possible use cases. Eventually, all hierarchies are "wrong" for new uses -- a problem that necessitates code duplication.

3. Multiple inheritance is complicated. It's often desirable to inherit from more than one parent. That process is inordinately complex and its implementation is inconsistent with the process for single inheritance, which makes it harder to read and understand.

4. Brittle architecture. Because of tight coupling, it's often difficult to refactor a class with the "wrong" design, because much existing functionality depends on the existing design.

5. The Gorilla / Banana problem. Often there are parts of the parent that you don't want to inherit. Subclassing allows you to override properties from the parent, but it doesn't allow you to select which properties you want to inherit.

These problems are summed up nicely by Joe Armstrong in "Coders at Work", by Peter Seibel:

“The problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.”

—Joe Armstrong

Inheritance works beautifully for a short time, but eventually the app architecture becomes arthritic. When you've built up your entire app on a foundation of classical inheritance, the dependencies on ancestors run so deep that even reusing or changing trivial amounts of code can turn into a gigantic refactor. Deep inheritance trees are brittle, inflexible, and difficult to extend.

More often than not, what you wind up with in a mature classical OO application is a range of possible ancestors to inherit from, all with slightly different but often similar configurations. Figuring out which to use is not straightforward, and you soon have a haphazard collection of similar objects with unexpectedly divergent properties. Around this time, people start throwing around the word "rewrite" as if it's an easier undertaking than refactoring the current mess.

Many of the patterns in the GoF book were designed specifically to address these well-known problems. In many ways, the book itself can be read as a critique of the shortcomings of most classical OO languages, along with the accompanying lengthy workarounds. In short, patterns point out deficiencies in the language. You can reproduce all of the GoF patterns in JavaScript, but before you start using them as blueprints for your JavaScript code, you'll want to get a good handle on JavaScript's prototypal and functional capabilities.

For a long time, many people were confused about whether JavaScript is truly object oriented, because they felt that it lacked features from other OO languages. Setting aside the fact that JavaScript handles classical OO with less code than most class-based languages, coming to JavaScript and asking how to do classical inheritance is like picking up a touch screen mobile phone and asking where the rotary dial is. Of course people will be amused when the next thing out of your mouth is, "if it doesn't have a rotary dial, it's not a telephone!"

JavaScript can do most of the OO things you're accustomed to in other languages, such as inheritance, data privacy, polymorphism, and so on. However, JavaScript has many native capabilities that make some classical OO features and patterns obsolete. It's better to stop asking "how do I do inheritance in JavaScript?", and start asking, "what cool new things does JavaScript enable me to do?"

Lately I would agree with this. I think what is really being complained about here is something that deserves the complaint, because modern OO has gone bananas with it: statefulness.

Think about it for a minute, what do you do in a base class?

· Construction/initialization code which sets up the state of the object, where multiple objects are similar and therefore have common forms of state encapsulated in them, e.g.:

public class BaseClass {
    private Guid _thisInstancesToken;

    public BaseClass() { _thisInstancesToken = Guid.NewGuid(); }
}

· Utility methods that are usable by your child classes.

protected void NewToken()
{
    _thisInstancesToken = Guid.NewGuid();
}

The first is obviously state based: what initialization occurs in a base class that isn’t about the object’s state (or its environment, i.e. state in other objects)? There’s no reason to have base class initialization unless it is side-effectful (side effect, n.: affecting the system in a way other than returning something).

As for the second one, why would you ever call a utility method in your parent class rather than in a completely separate class? The only reason the parent class should have that method, instead of another class having it, is that by being the object’s parent it has access to the state and can therefore alter that state.

Now that you see how inheritance really exists only to serve statefulness, let’s talk about why we should worry about statefulness.

Statefulness leads to side effects, which create hidden relationships through shared state. These relationships are the tight coupling that causes the ‘arthritic’ behavior he is referring to.

For example:

public class Car { public bool Running; }
public class Driver
{
    public void StartCar(Car carToStart) { carToStart.Running = true; }
}

The StartCar method above causes a side effect in your program’s running state: simply by executing someDriver.StartCar(aCar); every part of the running process that has any relationship to aCar is affected. This side effect creates a tightly coupled relationship between Driver and all parts of the system that hold a reference to aCar. Yet when you’re changing those other parts of the system, you wouldn’t immediately think you have to pay attention to Driver; after all, they have no reference to Driver. And if Driver has this tight relationship to every part of the system that relies on a Car without it being obvious, what other parts of the system are maintaining similarly tight relationships by affecting the Car’s state?

Now look at the stateless way of doing the above and notice how it does not affect other parts of the system:

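class Car
{
    public bool Running;
    // Returns a field-by-field copy, leaving this instance untouched.
    public Car Clone() { return (Car)MemberwiseClone(); }
}
class Driver
{
    public Car StartCar(Car carToStart)
    {
        Car mutableCar = carToStart.Clone();
        mutableCar.Running = true;
        return mutableCar;
    }
}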

Now you see, the Driver's only relationship is to whoever calls StartCar, because the method has no side effects. There could be 400 threads, or objects, or what have you in the system referencing carToStart, but calling StartCar has zero effect on the behavior of those parts of the system; therefore, Driver has no relationship to them. If the method that calls StartCar takes the new car and replaces the old car with it, however, then the calling method has a side effect, and that method’s class is tightly relating itself to everyone who holds a reference to that instance of Car.
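Since the book being discussed is about JavaScript, here is a rough JavaScript rendering of the two StartCar variants side by side. The names startCarInPlace, startCarCopy, and dashboard are invented for this sketch; dashboard stands in for "some other part of the system" holding a reference to the car.

```javascript
// Hypothetical JavaScript rendering of the two StartCar variants.
function startCarInPlace(car) {
  car.running = true;  // side effect: every holder of `car` sees this change
}

function startCarCopy(car) {
  // Copying the single field is enough here because the car only holds a primitive.
  var copy = { running: car.running };
  copy.running = true;
  return copy;         // the original is untouched
}

var sharedCar = { running: false };
var dashboard = { car: sharedCar };  // another part of the system

var started = startCarCopy(sharedCar);
console.log(dashboard.car.running);  // false: dashboard is unaffected
console.log(started.running);        // true

startCarInPlace(sharedCar);
console.log(dashboard.car.running);  // true: dashboard changed without being asked
```

Notice that dashboard never mentions either function, yet only the in-place version can change what it observes.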

So I believe the rant mentioned is really griping because inheritance only exists to create state and cause side effects in it, and through side effects disparate parts of a system become tightly coupled in unexpected ways.

“The problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.”

—Joe Armstrong

Joe basically sums up the problem above: you end up with a base class messing with your entire jungle, and when your base class causes the flapping of a butterfly's wings in that jungle, you might get a hurricane in Jacksonville. :)

Now someone out there is saying, "Hey, wait a minute! You didn't say anything at all about the real purpose of inheritance! It abstracts your implementation from your contract! Polymorphism allows type substitution, what about that??" Well, you're half right: yes, polymorphism lets you code generally against contracts rather than concrete implementations. However, implementing a contract really isn't the same thing as subclassing. This is why in C# and Java you can implement multiple contracts (interfaces) but cannot extend multiple base classes. So I take from this that, though the syntax for implementing a contract and for extending a base class is often the same, that is merely for syntactic consistency. What is actually happening when your class implements a contract is quite different from what is happening underneath when it extends a base class. For a broader and more detailed explanation of the dangers of subclassing/subtyping, refer to http://okmij.org/ftp/Computation/Subtyping/Trouble.html and http://okmij.org/ftp/Computation/Subtyping/Preventing-Trouble.html

This fellow explains the topic in far more depth.
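To illustrate the difference between a contract and a base class, here is a small sketch in JavaScript. JavaScript has no interface keyword, so the "contract" here is just an agreed method name; car, lawnMower, and startAll are made-up names for the example.

```javascript
// Hypothetical example: a contract without a shared base class.
var car = {
  start: function () { return 'car started'; }
};
var lawnMower = {
  start: function () { return 'mower started'; }
};

// startAll is written against the contract ("has a start method"),
// not against any particular ancestor.
function startAll(things) {
  var results = [];
  for (var i = 0; i < things.length; i++) {
    results.push(things[i].start());
  }
  return results;
}

console.log(startAll([car, lawnMower])); // [ 'car started', 'mower started' ]
```

The two objects are substitutable for each other inside startAll even though neither inherits state or behavior from a common parent.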

Please share your thoughts and point out where mine are lacking.