Author: imccoy

How To Program

Have an idea: I’ll make the computer do…. that!
Sit down at the keyboard. Stare blankly. Start typing. Stop typing. Walk away.
Have an idea: I’ll make the computer do that…. like this!
Sit down at the keyboard. Stare blankly. Start typing. Stop typing. Walk away.
Have an idea: I’ll make the computer do that like this… producing this output!
Sit down at the keyboard. Start typing. Stop typing. See that you’ve got the output you wanted.
Have an idea: Now I’ll just generalise from that output!
Repeat.

Grand Unifying Theory

Consider a popular web page. It doesn’t matter which, as long as it’s dynamically constructed. When the server wants to send you that page, there are – loosely – two approaches. The easy one is to open the template and start sending stuff down to the client; when you get to a hole, fetch that thing from the backend and proceed. Each hole is filled sequentially and the latency is fairly high. The harder way is to fire off requests for all the backend stuff you’re gonna need, and build it all into a page when it all arrives, which gets you lower latency and potentially decreased server load. Of course, if your template is written as a pure function over state defined by the backend, then a compiler can take care of generating code using the latter technique. And then, everything is beautiful.

Consider a UI that does swish in-place updates whenever you edit the data. Again, there are two approaches; either you write Javascript to keep everything up to date within the system, or you use a framework that allows you to define models within the client and bind said models to UI elements. In the former case, it’s incredibly difficult to keep the code organized (and yourself sane), and in the other way you wind up duplicating model definitions between client and server. Of course, if your template is written as a pure function over state that changes in known ways for given requests, then a compiler can produce a template that keeps itself consistent as the user changes the state. And then, everything is beautiful.

Consider a site that has to keep some aggregates, its inputs combined, sliced and diced in various ways. You can write a batch process that grabs all the new stuff and rebuilds the aggregates, if you can parallelize hard enough and live with the delays. Or you can take an event based approach, so upon the arrival of any update the event processor takes care of twiddling all the things that depend on the event, if you’re willing to accept something rather tricky to debug and almost impossible to recover from if it goes wrong and breaks your precious production data. Of course, if you define your aggregates, your application state in general, as a pure function over user requests, then you can have a compiler that keeps those aggregates current by incremental updates when user requests come in. And then, everything is beautiful.

At the intersection of those three vectors is where I think the shining future of web development is to be found. Templates written as pure functions over the state, state written as pure functions over the input, and incremental computing techniques used to keep the state current and the UI current as inputs are provided.

Drawing

I want to draw this scene, it’s a pretty simple idea. We are in a graveyard, and there is a bicycle, sitting above the grave. But we see instead of the chain connecting the pedals to the rear wheel, it delves deep down into the earth, where it connects to a coffin mounted on an axle.

Except, you know, I can’t draw. And if I attempted to draw this particular scene, I think my attempt would set most of the great artists of history to spinning in their graves.

Web Development: We Are All Doing It Wrong

Well, almost all of us. Two projects are shining beacons of hope: Luna and Opa.

I’ve spent the last couple of weeks trying to write code that does it in my own particular style of less-wrong. When I say less-wrong, what am I actually talking about? At the moment, we write a HTML template on the server, then we write javascript to twiddle bits of the dom generated by that template; thus we at best create tightly interdependent templates and code, and at worst duplicate the two. We struggle to make a good experience with javascript available that also works when javascript is disabled. We write controllers and views that are so tightly coupled that they are really one piece written in two different languages.

My vision is to get away from these issues by writing templates at a slightly different level of declarativeness (in the Luna presentation, he describes this as data binding on steroids), and instead of working in terms of routes and requests, write code in terms of what the user intends with a click (which doesn’t mean you can’t have cool URIs).

This, then, is where I’m at. With caveats: most of these names, even these concepts, are more than a little bit provisional. And this is just the barest of beginnings, it doesn’t even get close to delivering on the vision. This is a teenytiny application that presents a whiteboard for running retrospectives. Anyone can add entries in any of the three sections. That’s about it.

data Section = Good | Bad | Confusing
  deriving (Show, Read, Eq)
data Entry = Entry String
  deriving (Show, Eq)

data Entity = EntitySection Section | EntityEntry Entry
  deriving (Show, Eq)

data Hap = NewEntry Section String | ShowRetro

defaultHap = ShowRetro

actionsForHap ShowRetro = []
actionsForHap (NewEntry section text) = [AddPropertyOn (EntitySection section) (EntityEntry $ Entry text)]

entityLand =
        haveEntity "Section" $
        haveEntity "Entry" $
        haveEntity "Text" $
        "Section" `relates_to_multiple` "Entry" $
        "Entry" `relates_to_one` "Text" $
        emptyEntityLand

possibleSectionStrings = ["Good", "Bad", "Confusing"]

newEntryHapperSection = HapperField { fieldName = "section", validators = [EnsureIsOneOf possibleSectionStrings] }
newEntryHapperText = HapperField { fieldName = "text", validators = [EnsureIsNotEmpty] }
newEntryHapper = Happer "newEntry" [newEntryHapperSection, newEntryHapperText] (\getter -> NewEntry (read $ getter (newEntryHapperSection)) (
getter newEntryHapperText))

allHappers = [newEntryHapper]

retro_entry entry = Tag "span" [] $ ValueFuncCall showEntryText [entry]

retro_entries section = Tag "div" [] $ Values [
                                      (Tag "h2" [] $ ValueFuncCall showValueAsString [section]),
                                      (ValueFuncCall (FuncMap retro_entry) [ValueFuncCall ("Section" `follow_to` "Entry") [section]])]

retro = Tag "div" [] $ Values [retro_entries(ValueEntity $ EntitySection Good),
                            retro_entries(ValueEntity $ EntitySection Bad),
                            retro_entries(ValueEntity $ EntitySection Confusing)]

sample_data = build_data entityLand
                     [
                      add_entity (EntitySection Good) (EntityEntry $ Entry "It's Okay"),
                      add_entity (EntitySection Bad) (EntityEntry $ Entry "It's Ugly")
                      ]

retro_page = Values [(Tag "h1" [] $ Text "Retro"),
                     retro,
                     HapperReceiver newEntryHapper [
                       HapperInput newEntryHapperSection (Dropdown possibleSectionStrings),
                       HapperInput newEntryHapperText Textfield
                     ]]

wrap_in_html body = Tag "html" [] $ Values [
                                    (Tag "head" [] $ Tag "title" [] $ Text "Waltz App"),
                                    (Tag "body" [] body)]

What’s interesting about this?

The definition of the markup for the page (retro_entries is a particularly fine example) makes it clear when a portion of the markup is a determined by some changing data in the system. This means that (a future version of) the framework can perform ajax-y incremental updates on the client side, without explicit dom-twiddling code being written for the application. When the set of entries for a section grows, the framework can take care of shoving that new node in there, so you get swish updates without having to maintain the synchronisation of JS and the markup.

Future work: make a more succinct language, probably based on haml or hamlet, that has the same properties.

A “Hap” (is in, what’s the haps?) is a user’s interest in doing something. A Happer is how a web request makes a Hap. They may ultimately turn out to be the same thing, but for now they’re distinct. Happers have a strong but lightweight link from their definition to their use, are related to URLs, and can be nested.

When you are requesting user input, you’re doing so to create a hap. So, you specify what Happer it is that you want. This is a strong link, so you don’t change one character and find that the application still runs but it’s not hooked together anymore (rails, ahem). But it’s a lightweight link, too: you’re just specifying what it is that you want to make, then laying out fields to do so.

A Happer is related to URLs. If you make a newEntry happer, than if that’s the only happer to be submitted by a form, the framework could indicate which happer is in use by POSTing to /entries. If there’s two happers to be submitted by a form, but both of them are related to updating an entry, then it might use a PUT to /entry/$id with hidden fields to indicate which inputs go with which happer.

Future work: Make the happers use URLs. Allow happers to be arbitrarily complicated (in order to allow the inevitable tables of inputs). Make it possible to use multiple happers in a single page. Work out if happers and haps are actually any different.

Once you’ve gone from a happer to a hap, you can get the actions for the hap. These actions are where any side-effects (writes to the database, etc) happen, and there’s two motivations for this; one practical and one philosophical. The practical motivation is that these side-effects are the things that are going to cause DOM modifications; this structure allows you to run happers -> haps -> actions both on the client, to update the dom, and on the server, to update the database and other users. The philosophical angle is that all existing Haskell web frameworks that I know of throw you into the IO monad in every request, so that any request can do any thing. This structure of haps creating actions which cause side effects should increase the amount of your application that lives outside the IO monad and therefore has pure code’s ease of reasoning.

Future work for actions are to make more sophisticated actions, to make actions that hit persistent databases, action processing to update the client ajaxily, and actions that cause other actions.

Feedback? There’s actual working code at https://github.com/imccoy/waltz-incremental. I haven’t done any of the really ambitious stuff yet, but it’s still seeming possible.

(Other ways of thinking about this:

how would we be building web apps if OOP and Relational Databases had never happened.
the twisted love-child of event sourcing, MVC and FRP
what would a web framework that really optimizes for pure functions look like?

or, perhaps, the delusions of a madman)

It’s feedback time!

I linked to http://asana.com/2011/06/peer-feedback-at-asana/ on le tweets today, and there was a bit of conversation. Which is great, I like conversation, but my responses are now too big to fit down the twube, so this here is one of those.

What is the ideal outcome of a corporate feedback session? Obviously, it’s for people to do more of the good stuff that they do, and to be able to improve on their weaker points. Some corporate processes might depend on it for compensation purposes, but that falls outside my idealized purview; and I, honestly, can’t think of anything important beyond those two obvious points.

We want people to improve, so we tell them about their bad points. This, this is fair enough. The problem is that there is no way of delivering this feedback that doesn’t risk making the recipient feel like some unspeakable trodden-on mess. I mean, I’m pretty sure it’s not just me that feels like that. And the worst thing is that in a great many cases, the things that make me feel that way are things I already knew. Now there’s something I know I need to change, and, I feel like garbage. But the thing I know, I already knew. The only new bit is that I feel worse than I did before.

Awesome. Feedback is such a great part of contemporary corporate culture, isn’t it?

How often do you change aspects of yourself? What makes that easier or harder? When are you your best self? For me, I change very slowly. It is easier when I know that I am making mistakes, but I feel that it’s okay and I’m being forgiven. And I am my best self when I am generally cheerful and feel myself to be loved.

That is why the asana post spoke to me so. It describes a beautiful world, where the focus is squarely on the truly important and useful aspects of feedback.

Rachel suggested that the whole feedback process could be replaced by each person presenting their self-improvement plan to the group. And I, being a creature of unspeakable hubris, responded by looking into my crystal ball. In it, I saw myself tearing my hair out as I attempted to produce a reasonable plan. This torture proceeds for some time, only to be followed by periods of guilt and self-recrimination throughout the year about the ineffectiveness of my plan, and the ineffectiveness of my self. Is this a straw man? I don’t know. All I can say is that I find the process much easier to imagine as harmful than helpful.

I haven’t spoken at all about the challenge of informing people about their problems. But I, being young, being foolish, say: lets worry about that after we’ve got workplace environments where people actually have a real chance at improvement; where they know that they will make mistakes, and yet be still beloved; where it is easy to be your best self. Where the backdrop to ‘you smell a bit today’ will be one of knowing that you’re awesome, and the person who told you is awesome, and that greater heights of awesomeness are within your reach.

Taking a REST

Most of my work is in rails, which promotes a particular vision (some would say particularly wrong) of REST. When we speak of REST, it is mostly because we want to have a set of endpoints with well-defined operations, where each operation acts on a single resource. These days, we have three goals:

Make a set of web pages that give a reasonable experience in the absence of javascript, which all have nice URLs
Have the same set of pages work really bloomin’ nicely with javascript turned on
Present a set of urls that represent resources and define actions on those resources.

We had a very spirited discussion about this at work yesterday, in which I declared that you can’t meet all 3 goals. Here’s why.

When using javascript to make pages really slick, there’s a set of url endpoints that you want, a la goal 3. Some of those endpoints will be identical with the urls you want your nice user-facing web app to have for goal 1. The first goal requires some properties, and the second demands others. Eventually, conflicts between them will force compromise, and then what it means to be a resource will be ambiguous, and then? Bugs! Bugs, glorious bugs.

What’s the solution? It could be to add a layer; your pure domain sits underneath the shiny js pages, which interact with it directly, and also underneath the non-js site, which aggregate it into noice pages. But might it be possible to twist it around, instead? When you access a UI endpoint, it issues you with a handful of resources, and each resource has it’s own url but is also accessible via the UI endpoint. The framework talks of what it means to update a resource in a particular context, and so can offer both endpoints without developer intervention.

Does that exist yet? Anyone tried to build something like it and able to tell me why it can’t be done?

On testing

Testing – particularly automated unit and acceptance testing – drives me a bit mad.

When we want to assert something about data flow, we make a bunch of assertions about particular, peculiar data values moving back and forth. When we want to assert that you can make a post from either the admin or the public-facing side and have it appear on both sides, we test those four facts independently, that a post can be both written and seen on both sides; or we test two pairs of operations.

What we mean (or at least what I mean), and try to get at with all this, is that conceptually there is a single entity with two read operations and two write operations available.

What would it take to have an application and a test framework that lets us say that?

While I am here, please allow me to complain about mocks. I know exactly how often I care about a particular method being called, and it is precisely never. What I care about is that whenever a particular thing happens, a bit of data crossing an abstraction boundary or a computation being performed, some method acts in a particular way. If you isolate the action of the method from the boundary or the computation, then you isolate it from its meaning and demolish the value of the test.

And I hate maintaining tests that aren’t valuable.

The story of “Semantic”

In the beginning was HTML, and the version of the HTML was 4, and the quality of the HTML was madness. And some people, some wise people, said “This is madness”.

They were right.

So they sought a path out of madness. They said, “this thing we have marked up as red and bold, we did that because we considered that to be important. And this thing we gave a background color because it’s the heading.” And so the things that were headings, they labeled as headings; and the things that were important, they labeled as important, and these labelings they called semantic.

And they were right to do so.

We said, then, that since we liked to avoid madness, we would call all things by their semantic names. And then we had a project with three headers and two sidebars, and we gave them all increasingly strained names, and then the designers gave us a box which was to hold the names of fribbles and the addresses of frobbles. And we asked ourselves: what, semantically, do fribble’s names and frobble’s addresses have in common? And the answer to that was, sadly, sweet fanny adams.

Yeah.

Sweet Fanny Adams.

It turns out that sometimes, a box on the side below the other box on the side, is just a box on the side below the other box on the side.

But we had trained ourselves to ask not just what things were, but what they were semantically. And we had forgotten that the semantics of a thing are nothing more or less than the meaning of the thing. And then when things were nothing more or less than a box on the side. we were paralyzed by trying to find a deeper meaning for the box on the side; we were too busy looking for the semantics of the thing, to let the thing simply be what it was.

And that is why the word “Semantic” makes me sad.

Statements about software development

Some of these, I may write more about. I often feel like these statements between them point in a direction, and some days I think that I could build something that they’re pointing at. Maybe I’ll just keep complaining, though!

A database is just a big ol’ global variable.
Mutable state is where bugs come from.
Uncontrolled side effects mean you may as well have mutable state everywhere.
Does your application’s database contain a stream of data, or a pile of facts? We think of it as a pile of facts, even when it’s a stream of data. We should stop doing that.
Your tests should only assert things that you actually care about.
Mocks in a dynamic language are almost unforgivable.
A relationship between two objects, doesn’t belong to either of them.
A picture does not describe a user experience.
Building pages in terms of html and then dealing with them in terms of DOM is stupid.
Why wouldn’t you want your language to stop you from putting round things in square holes?

ICFP Programming Contest, 2007

So, that was a while ago… http://save-endo.cs.uu.nl/ is the website.

I participated in it with a bunch of friends, but we never managed a particularly good implementation of the basic spec. We were able to render the starting image – in something like 4 hours. Anyway, the thing opens up into a scavenger hunt, of sorts, and I got tantalizing glimpses of that hunt but our implementation never let us get anywhere. I’ve recently picked the thing up again, resulting in something that renders the starting image in about 3 minutes (so far). It’s on github: https://github.com/imccoy/icfp07

It’s the most non-trivial thing I’ve ever written in haskell. As it turned out, getting it working and correct was not particularly hard (incidentally, finger trees rock), but I have had a bugger of a time getting rid of the space leaks so that the thing could actually run to completion. That makes haskell an interesting contradiction, for me: I love the way it’s type system and purity conspire to prevent surprises, but I hate the way it’s laziness creates more surprises.

I imagine I’ll keep poking at it, both in terms of optimizing the implementation further and trying to find more of the stuff they’ve hidden away in there.