Usergenic

Thoughts on humane software development

Abstract Away Parallelism

An evented programming model (as seen in Ruby’s EventMachine and in Node.js) makes the management of asynchronous continuations and event chaining a visible and central concern for the application developer. While these concepts are important for understanding the low-level implementation of an evented system, they are often irrelevant to the problem domain an application is addressing.

How might we get much or all of the performance and parallelism benefit of an evented system without letting callback-chain management and explicit deferment invade our application code?

Let’s take a really simple example of a logical expression: x and y.

We know that this expression resolves to true only when both x and y evaluate to true. If the evaluation of x and the evaluation of y are independent of each other and could be executed in parallel, then we’ll pretend our language does that. (This assumes that the logical operators in our language are not short-circuiting in this form, meaning we don’t rely on them to keep a right operand from being evaluated before the left operand has been evaluated.)

What’s neat about the parallel evaluation of x and y is that as soon as either evaluation results in false, we can resolve the entire expression to false, since the value of the remaining operand is functionally irrelevant.
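The early-false behavior can be approximated in today's Ruby with threads. This is only a minimal sketch: `parallel_and` is a hypothetical helper (not part of Ruby or any library), operands are passed as lambdas, and error handling is left out, so an operand that raises is not dealt with here.

```ruby
# Hypothetical helper: evaluates every operand in its own thread and
# resolves to false as soon as any one of them comes back falsy.
def parallel_and(*operands)
  results = Queue.new
  threads = operands.map do |operand|
    Thread.new { results << operand.call }
  end

  # Results arrive in completion order, not in source order.
  operands.size.times { return false unless results.pop }
  true
ensure
  # The remaining operands are irrelevant once we have an answer.
  threads&.each(&:kill)
end

slow_true  = -> { sleep 1; true }
fast_false = -> { false }

# Prints false without waiting for the slow operand to finish.
puts parallel_and(slow_true, fast_false)
```

Killing the leftover threads is a blunt choice that only suits side-effect-free operands; a real implementation would need some form of cooperative cancellation.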

If we combine eager-parallelism with lazy-evaluation, we could say that z = (x and y) and not even need to wait for the result of the evaluation of (x and y) before continuing on to the next line. We know that z is promised to be the result of the evaluation and that is enough.
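As a rough illustration of that promise idea in plain Ruby, here is a sketch built on `Thread`. The `Promise` class is hypothetical and written for this example only; unlike the imagined language, the caller has to ask for `value` explicitly at the point where the result is needed.

```ruby
# Hypothetical Promise: starts the work eagerly on a background thread
# and only blocks when the value is actually demanded.
class Promise
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  # Blocks until the background computation finishes, then returns its result.
  def value
    @thread.value
  end
end

x = -> { sleep 0.2; true }
y = -> { true }

z = Promise.new { x.call && y.call } # returns immediately
puts "moving on while z is still being evaluated"
puts z.value                         # blocks here, only because we need z now
```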

Let’s look at the implications of this in a simple web-server request. Let’s assume the following code is Ruby, but with the features described above, where expressions and assignments are promises and logical operators like and aren’t short-circuiting:

Serve a webpage
def serve_page(request)

  session = DB.lookup_session(request.session_id)
  user    = DB.lookup_user(session.user_id)
  page    = DB.lookup_page(request.params['page_id'])

  return deny_access! if
    page.restricted? and
    not page.readable_by?(user) and
    not user.administrator?

  render page

end

In the above example, we can initiate the DB.lookup_session call right away and continue on to the next line. To get the user we’ll need to wait for the DB.lookup_session call to return, because the argument of DB.lookup_user needs the user_id from session. However, we can assign user to be a promise for that expression and move on to the next line as well.

We can actually fire off DB.lookup_page right away because we already have the request object and its params, so in our VM we can imagine that DB.lookup_session and DB.lookup_page are going on in parallel right now.

We get to the return deny_access! if ... expression and we will essentially have to yield while we wait for something to come back. If we get page back first and we find out it is not restricted, then we can skip the rest of the expression as the and chain is going to resolve to false. Parallelism win!

Now let’s say session comes back and the user promise moves on to starting DB.lookup_user based on session.user_id. If we happen to find out that user.administrator? is true, we can also short-circuit the and chain.

While short-circuiting the and chain doesn’t save us from having to wait for page to come back before the render page call, it could be valuable if it turns out that evaluating page.readable_by? requires further database calls, to an access-control list for example.
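For comparison, here is roughly what the same request looks like in today's Ruby when the promises are written out by hand. Everything in this sketch is an assumption made for illustration: the `promise` helper, the in-memory `DB` module with its artificial `sleep` latency, the `Struct` models, and the `:denied` / `:rendered` symbols standing in for deny_access! and render.

```ruby
# Hypothetical stand-ins for the post's session, user and page objects.
Session = Struct.new(:user_id)
User = Struct.new(:id, :admin) do
  def administrator?; admin; end
end
Page = Struct.new(:restricted, :reader_ids) do
  def restricted?; restricted; end
  def readable_by?(user); reader_ids.include?(user.id); end
end

# Hypothetical in-memory "database"; each lookup sleeps to mimic latency.
module DB
  SESSIONS = { "s1" => Session.new(7) }
  USERS    = { 7 => User.new(7, false) }
  PAGES    = { 1 => Page.new(false, []),
               2 => Page.new(true, []),
               3 => Page.new(true, [7]) }

  def self.lookup_session(id); sleep 0.1; SESSIONS.fetch(id); end
  def self.lookup_user(id);    sleep 0.1; USERS.fetch(id);    end
  def self.lookup_page(id);    sleep 0.1; PAGES.fetch(id);    end
end

# A thread doubles as a promise: Thread#value blocks until the result is ready.
def promise(&block)
  Thread.new(&block)
end

def serve_page(session_id, page_id)
  session = promise { DB.lookup_session(session_id) }
  user    = promise { DB.lookup_user(session.value.user_id) } # waits on session
  page    = promise { DB.lookup_page(page_id) }               # runs alongside session

  denied = page.value.restricted? &&
           !page.value.readable_by?(user.value) &&
           !user.value.administrator?

  denied ? :denied : :rendered
end

puts serve_page("s1", 1) # unrestricted page
puts serve_page("s1", 2) # restricted page, user is neither a reader nor an admin
```

The session and page lookups do overlap here, but the explicit `.value` calls are exactly the bookkeeping the imagined language would hide. Ruby's `&&` also still evaluates strictly left to right, so a user who turns out to be an administrator cannot resolve the chain before page has come back.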

This special handling is a form of parallelism in that we are able to kick off any promised database query while continuing on to the next expression, and the next, until we reach a demand that has to be satisfied before continuing.

Want.
