Object Relational Mappers have a dream. Their dream is that the mechanics of creating, reading, updating and deleting can be abstracted away and need not be thought about.
I have a dream, too. My dream is that the fact that creating, inserting, updating and deleting is going on can be abstracted away and need not be thought about.
The ORM dream has a big ol’ flaw, to the extent that I am sceptical about the use of ORMs in general. The flaw is that the right way to do CRUD operations depends rather heavily on the domain and the application in question.
My dream should not be vulnerable to that particular flaw, though, because I’m trying to abstract away those operations entirely. Or so I thought right up until yesterday, when I realised that since my approach is to first convert the programmer’s code into a form that does have CRUD operations, I’m going to have the ORM problem from then onwards.
And it’s worse, because when you call the CRUD operations yourself you can exercise a fair degree of control over what goes on. When the CRUD invocations are automatically generated, that control would have to be exercised by hints and indirect suggestions.
In today’s fantabulous outing, marvel at three approaches to efficient implementation! Ride the wave of perplexity that is the function from a list of inputs to the current state! Gaze in wonder at the persistent storage of that current state!
Option 1 – incrementalisation
In this approach, we take a state function f, that takes a list of inputs and produces current state, and produce an incrementalised state function, that takes a change in the list of inputs to the resulting change in the current state. Liu gives a systematic approach for deriving the incrementalised versions of functions.
- It enables a neat way of making a highly responsive UI. As well as incrementalising the state function, you also incrementalise the view function (a function from state to HTML). On the client, you can run the incrementalised state function, pass its output to the incrementalised view function, and apply the resulting change to the DOM. (this works)
- It lends itself to a really cute way of integrating with a data store: write a function that takes the system state and returns a relational representation of that state. Then incrementalise that function. Any time the system state changes, you can pass that change into the incrementalised relational representation function and hit the database with its output. Shiny! (I haven’t done this yet)
- The wording of Liu’s paper is very careful. What they have is a systematic approach for deriving incrementalised versions of functions. What they don’t have is a fully-automated approach. There needs to be some user interaction or some hinting (in the general case) in order to produce an incrementalised version of a function.
- When you have an inner function that uses free variables, and the values of those free variables can be affected by a change, the incrementalised versions start to get pretty hairy and potentially rather inefficient. The current version of my code assumes that any free variables used will not change, which is obviously not true in general.
- At the moment I have an informally specified bug ridden implementation of less than half of Liu’s technique.
Option 2 – Partial Evaluation
Partial evaluation is where you take a program of n inputs, then provide some subset of those inputs and evaluate the program with respect to those inputs. This isn’t immediately applicable to this state-updating shenanigans, because there’s no provision there for changing inputs.
Instead of running the partial evaluator directly on a program that takes a list of inputs and returns a state, we can augment the program with a new top level function:
return state(cons(new_input, inputs))
Assuming the partial evaluator unfolds that cons, we’re in with a chance of getting a program that incorporates a new input without doing too much stuffing around. It’s not at all clear how you might get the state into a persistent data store with this approach, but worse then that, the resulting program won’t know how to capitalize on existing bits of state. This is probably surmountable by memoizing computation of bits of the state, but if you’re going to go there…
Option 3 – Adaptive Computation
Adaptive computation (after Acar) is where you think of a program as building up a dependency graph of values. The final output depends on a bunch of intermediate values, which depend on other intermediate values, which ultimately depend on the program’s inputs.
There’s no obvious barrier to storing that whole dependency graph in some sort of persistent store. Obviously, this is going to be a much weirder database than Option 1 gives you; but on the other hand, this approach naturally gives you much more fine-grained change propagation behaviour. That means you don’t have to worry about free variables making stuff confusing. More importantly, it is fully automatable. You just feed your program into the madhouse, and off it goes.
Similar AJAXyness to that of Option 1 is imaginable, but vastly more intricate: you must ship the portions of the dependency graph relevant to the client’s computation down to it. Determining in advance which portions will be relevant is quite likely to be vulnerable to the halting problem, so you’ll have to either send too little (and have a lazy loading strategy to get other data that you need on the client) or too much (and clog the intertubes, OMG).
This option is closest to the approach taken by Asana’s Luna and by Meteor.
Parenthetically: I was amused to discover that with Option 1, the efficient thing to do is to cons a new input onto the head of the list. With this option, the efficient thing to do is to replace the empty list at the end of the input list with a singleton list containing the new input.
I’m sticking with Option 1 for now. I’m about to attempt integrating with a database; if that goes badly, Option 3 is going to start looking much better. Again.
They’re not in detention, honest! It’s right there in the law (clause 11):
An offshore entry person who is being dealt with under subsection (3) is taken not to be in immigration detention (as defined in subsection 5(1)).
So that’s nice. They’re not in detention! What are they? Well, that’s the question, innit?
They are simply enjoying the hospitality of the Australian government on some island a long way from anywhere, perhaps with no way to leave unless they tell the aforementioned Australian government to send them home.
I say “perhaps”, because there’s no information about what the conditions in these processing places will be. I was looking at the law because I wanted to know what we were doing to protect these people who come to us and ask: please, protect us? I couldn’t find anything, and was hesitant to write about it because I’m not really qualified to read these things. But then I found this. Take us to the bridge, Sarah Hansen-Young:
The Houston report included many important recommendations, none of which this government or the opposition have said any word of since Monday. Where is the commitment to increase our refugee intake immediately? Where is the plan to resettle those who have been waiting for years in Indonesia? Where in this legislation is our commitment to upholding our commitment under the convention to treat people properly, to ensure that they are housed appropriately, to ensure that they have access to legal assistance and to ensure that we do not detain children? None of that stuff, none of those important safeguards which are listed in the Houston report, are in this legislation. The government say that they are implementing the Houston report and they have not even bothered to read the detail of the recommendations.
Bear in mind that I’m not just believing her, I have read the documents. She speaks truth. The legislation enables the first and worst recommendation, offshore deten^h^h^h^h^hprocessing, and government and parliament have done nothing to ensure conditions in those places are tolerable. Nor have they done anything with the report’s suggestions to improve the queue that asylum seekers are supposedly jumping.
Nothing about Houston Report 3.46′s appropriate accommodation, or its physical and mental health services. Nothing about its educational and vocational training, its monitoring arrangements and case management assistance. And most worryingly, nothing about its application assistance or its appeal mechanism.
Nothing about the Report’s 3.22′s increase in places for folks currently in Indonesia, or its “practical agenda of initiatives”, or 3.26′s capacity-building of UNHCR processes in the region.
Please, please, prove me wrong.
But enough of the heavy stuff! Let’s have some comic relief. From Migration Legislation Amendment (Regional Processing and Other Measures) Bill 2012, aka the Stop The Boats bill:
198AC Documents to be laid before Parliament
(1) This section applies if the Minister designates a country to be a regional processing country under subsection 198AB(1).
(2) The Minister must cause to be laid before each House of the Parliament:
(a) a copy of the designation; and
(b) a statement of the Minister’s reasons for thinking it is in the national interest to designate the country to be a regional processing country, referring in particular to any assurances of a kind referred to in paragraph 198AB(3)(a) that have been given by the country; and
(c) a copy of any written agreement between Australia and the country relating to the taking of persons to the country; and
(d) a statement about the Minister’s consultations with the Office of the United Nations High Commissioner for Refugees in relation to the designation, including the nature of those consultations; and
(e) a summary of any advice received from that Office in relation to the designation; and
(f) a statement about any arrangements that are in place, or are to be put in place, in the country for the treatment of persons taken to the country.
In other words, there’s a whole lot of stuff that will help the parliament to make the judgement about if it’s a good idea to send refugees to this country. The minister must provide it. Take particular note of that word “Must”, the third word in (2). Onwards:
(3) The Minister must comply with subsection (2) within 2 sitting days of each House of the Parliament after the day on which the designation is made.
What? What are you talking about? 198AB said that if the document wasn’t explicitly rejected by either house within five days, it would be accepted. I’m not clear on how “within 2 sitting days” is counted by these folks, but if there’s scope for the minister to provide the documents at the end of day 2 out of 5, that leaves 3 days for the houses to accept or reject. They’ve built a stalling tactic right in to the legislation! But that’s okay, surely it can’t get any worse…
(4) The sole purpose of laying the documents referred to in subsection (2) before the Parliament is to inform the Parliament of the matters referred to in the documents and nothing in the documents affects the validity of the designation. Similarly, the fact that some or all of those documents do not exist does not affect the validity of the designation.
The first sentence says it’s to help you decide about the designation, it doesn’t define or otherwise affect the designation (in network protocol standards, we’d say it’s not normative). Fair enough. But the second sentence seems to come awfully close to giving the Minister a pass on justifying the designation of a country. I thought it couldn’t get any worse? Surely now…
(5) A failure to comply with this section does not affect the validity of the designation.
Ah, this one is simple. The word ‘must’ in clause (2) was for cosmetic purposes only. The Minister can not only stall (as per (3)) or partially comply (as explicitly provided for by (4)), he can in fact skip the whole bit.
See? Comic relief, just like I promised. It must be funny, because the primal law of “if you don’t laugh, you’ll cry” is in effect. It’s in effect so hard. Still, at least it’s better than the Morrison amendment.
P.S. Big ups to the folks maintaining the Australian Parliament House website, by the way, you guys rock my world. Austlii’s consolidated legislation listings rate a mention here too, it’s good stuff. If you think the law is good stuff, that is.
P.P.S There is a more charitable reading of 198AC. The pre-amendment rules are that the minister gets to designate places of off-shore processing; in the worst case, the new rules degrade back to that situation.
I wrote earlier that the trade-off between deaths at sea and possible trauma in detention at Nauru could potentially be justified. That’s predicated on a pretty big assumption, the assumption that there are only two options to choose between. Rather than a policy that may encourage people to come by borderline unseaworthy boat, or a policy of deterrence, how about a policy where safe transport from Indonesia to Australia is provided by the Australian government? This policy is often mentioned, but usually with scorn: “what are we supposed to do, just charter a plane to bring them here?” I am interested in actual reasons not to do this.
The Houston Report doesn’t really answer my question, although it provides a hint. Buried in the document, deep within the sweaty groin of Attachment 6: “Australia’s International and Regional Engagement on Irregular Movement and International Protection”, is this note about practical arrangements developed under the Regional Cooperation Framework of the Bali Process on People Smuggling (etc etc): “Any arrangements should avoid creating pull factors to, or within, the region.” Creating a safe and effective mechanism for people who would otherwise be Irregular Migrant Arrivals would create a big-ol’ pull factor to Indonesia, and Indonesia doesn’t want that.
That’s all reading between the lines. The report doesn’t really talk about the possibility. But – should my reading be accurate – Indonesia’s distaste for the notion would probably not be absolute. Could they be persuaded?
The Houston Report advises re-opening Nauru and Manus Island as off-shore processing centres for asylum seekers. From my point of view, that sucks. I try to be all love-your-neighbour, yo, and shipping your neighbour off to a tiny island – where they are to stay for an unknown period of time – is not real loving. How did such a sucky thing come to pass?
There were three people on the panel that produced the report: Angus Houston, a retired air force man; Professor Michael L’Estrange, an academic and public servant with a rightish bent; and Paris Aristotle, a man with the most awesome name ever but also the Director of the Victorian Foundation for Survivors of Torture. Mr Aristotle’s schtick is supporting and advocating for refugees.
When asked about the possibility of trauma in the new Nauru processing centre, Mr Aristotle compares the potential trauma of prolonged presence on Nauru to that of losing relatives at sea. Given that the rate of boats being lost at sea en route to Christmas Island has increased recently, this issue of asylum seekers dying at sea is a real concern, and the trade-off is potentially justified.
Alas, alack, we can explain all this without considering the trade-off. The panel was assembled to answer a question, and that question takes the form of their Terms of Reference for the report. The question they were asked, Numero Uno, their raison d’être: “how best to prevent asylum seekers risking their lives by travelling to Australia by boat”. Or, as Tony Abbott would put it: “Stop the Boats”. There were other things they were asked to write about:
- source, transit and destination country aspects of irregular migration
- relevant international obligations
- the development of an inter-related set of proposals in support of asylum seeker issues, given Australia’s right to maintain it’s borders
- short, medium and long term approaches to assist in the development of an effective and sustainable approach to asylum seekers
- the legislative requirements for implementation
- the order of magnitude of costs of such policy options
None of these things are the physical or mental well-being of asylum seekers. None of these things are about respecting and caring for vulnerable people looking to us in a time of need. These questions were not asked.
They are kind of between the lines. Our “relevant international obligations” include some treaties that bind us to a certain minimal level of not being a dick. Imposing irreparable psychological damage probably isn’t a sustainable approach to asylum seekers. The well-being of asylum seekers might be considered an “asylum seeker issue” for proposals to be in support of. But this is all implicit stuff, it doesn’t rate a mention in the terms of inquiry.
Compare and contrast this with the lead entry, the Stop the Boats priority. This requirement can be read out of the “inter-related set of proposals” requirement, the “effective … approach to asylum seekers” terms, and maybe even the “destination country aspects”. But to leave it hidden away in there would be to allow the report to come up with any answer at all. The answer that might perhaps buy Tony Abbott’s silence on the issue is off-shore processing, and so the question must focus on deterrence. And thus we asked the panel not to advise us on how to humanely and effectively treat asylum seekers in Australia, but on how to stop them coming.
Paris Aristotle never had a choice. He can be as hard-headed as he wants; he is bound by the terms of reference to answer the wrong question. He is bound to give an answer focused on stopping the boats. He may also think it a justified trade-off in the present circumstances, I don’t know. But it doesn’t matter. His job was to answer the question he was asked. He could have affected the details, but the overall thrust of the report? He never had a choice.
It appears that if
- you’re using Android’s HttpURLConnection (2.3 is the specific source of the grievance), and you’re talking to certain web servers (ahem, in this case nginx with varnish in front of it, if the headers are to be believed).
- you call setDoOutput(true)
- you call setChunkedStreamingMode()
then calling getOutputStream() will cause your later call to getResponseCode() to return a 503 HTTP_UNAVAILABLE.
Isn’t that nice? My fix was to precompute the POST data so I could call setFixedLengthStreamingMode() with its length instead.
PS. You know how I said it was fun? I lied.
My grand purely-functional web application architecture project continues, and today I’d like to talk about a dictionary webapp. It’s very simple, having only two types of inputs, one of which creates words and one of which creates definitions.
type Word = String type Definition = String data Input = NewWord Word | NewDefinition Word Definition
The system state is a list of words, where each word is accompanied by a list of definitions (using some unsurprising helper functions, defined at the bottom of this post):
state :: [Input] -> [(Word, [Definition])] state inputs = map word (filter isNewWord inputs) where word w = (w, definitions w) definitions w = definitionsFor w inputs definitionsFor w = (map definitonFrom) . (filter $ (==w) . wordFrom) . (filter isNewDefinition)
That is, for each word-creating input, make a word entry. Gather all of the relevant definition entries into that word entry.
And from this I want to produce an implementation that does the minimum necessary to update the state when the inputs change. For an implementation to be practical, a NewDefinition input for a word should only impact on values that are directly related to that word. From that perspective, our state function is trouble. In particular, that list of (Word, [Definition]) is trouble.
Here’s why: The simplest thing that I can think of to produce a state-updating version of the function is to use a nice simple top-down algorithm. So from a regular map function,
map f  =  map f (x:xs) = (f x):(map f xs)
We produce something like this (where f’ is a state-updating version of f):
map' f'  =  map' f' (x:xs) = (f' x):(map' f' xs)
This implementation has an unpleasant implication. For any (f x) to change, every value in the initial x:xs has to be looked at. This is bad news from a practical-implementation point of view, because it means that adding a definition for word w actually has to touch the list elements for all of the words created prior to (or after, which one it is doesn’t matter all that much) w. It is quite possible to make an implementation like this that actually works , if you are not worried about the inevitable failure to scale.
At the same time, this bad news is no news at all, since immutable-everything value semantics demand that changing a value within a list also changes all of the lists that contain that value. That is, if we have a list (1:(2:(3:))), an update to 3 can only be made by making an update to the cons cell with value 2, and that update in turn can only be made by making an update to the cons cell with value 1. Those are the rules of the game.
Changing the Game
Why do implementations of the map-reduce framework not have this problem? After all, they are all about having a huge pile of input and generating some number – possibly a similarly huge pile – of outputs. How do the inputs find their way into the relevant outputs? When we say map-reduce, we are omitting the middle step, that of shuffling. The result of the computation is not `reduce (map input)’ but rather `reduce (shuffle (map input))’. The Map stage produces a pile of (key, value) pairs, the shuffle stage arranges those by key so you have a pile of (key, [value]) pairs, then each of those (key, [value]) pairs is reduced. The shuffle stage is all about sending values off to where they will have an effect. So inspired, we can introduce a function shuffle:
shuffle :: (Ord k) -> [a] -> (a -> k) -> ([a] -> v) -> Map k v -- the type signature kind of says it all for me, but -- here's an implementation anyway shuffle is fk fv = foldr (\i m -> addAtKey i (fk i) m) Map.empty is where addAtKey i = Map.alter (Just . maybe [i] (i:))
That is, for some collection of things `[a]‘ we have a function that says which bucket they should go in, and a function that says how to produce a value from the members of a bucket, and from all those things we produce a map of buckets to values. This idea is so simple and so obvious that it took me a month to think of it.
The point of all that is of course that we can implement shuffle’:
shuffle' (\is -> i:is) fk fv fv' = Map.alter f (fk i) v where f Nothing = Just $ fv [i] f (Just v) = Just $ fv' (\is -> i:is) v
So when we have an as-yet empty bucket (the `f Nothing’) case we just compute fv on the single element. When we have a bucket with something in it, we compute the effect of the bucket input’s change on the bucket’s output, apply that change to the bucket’s output, and just like that your uncle’s name is Robert. Probably. This is implementable too, and we can rework our dictionary app to use it quite readily:
definitions inputs = Map.mapWithKey definitionsFromInputsFor (shuffle wordFrom inputs) where definitionsFromInputsFor w inputs = map definitionFrom (definitionInputsFor w inputs)
We have lost something on the way, though. At the start of this post, we could write any function we liked to express our program’s state. Now, the set of functions we can write to do that is severely curtailed. On the other hand, they are efficiently implementable. Will that severe curtailment turn out to be a problem?
It raises another question. Is `shuffle’ actually giving us much, or is the win coming from some special privileges being given to the Map data structure? Maybe representing system state as a list is just a stupid thing to do (exhibit A, the vastly simpler shuffleized implementation).
On both of these questions, further work (as the academics like to say) is required.
Oh, right. Here’s those unsurprising helpers I was talking about. They’re, like, totes boredom:
isNewWord (NewWord _) = True isNewWord _ = False isNewDefinition (NewDefinition _ _) = True isNewDefinition _ = False wordFrom (NewWord w) = w wordFrom (NewDefinition w _) = w definitionFrom (NewDefinition _ d) = d definitionFrom _ = undefined