2014/10/24

back to the beginning ... async 101

Even the most humble of modern laptops has multiple cores at its disposal. When you work Resource Oriented, you benefit from the fact that resource requests are automatically spread over the available cores. Within one (root) request, however, you typically make subrequests sequentially. In most cases this is exactly what you want, as one subrequest provides the input for the next, and so on.

There are cases, however, where you can benefit from parallel processing. A webpage, for example, can be composed from several snippets which can be requested in parallel. In a previous post I discussed the XRL language, where the async attribute on each include asks for exactly that:

<html xmlns:xrl="http://netkernel.org/xrl">
    <xrl:include identifier="res:/elbeesee/demo/xrl/header" async="true"/>
    <xrl:include identifier="res:/elbeesee/demo/xrl/body" async="true"/>
    <xrl:include identifier="res:/elbeesee/demo/xrl/footer" async="true"/>
</html>


Another use case for parallel processing is batch processing. In my last post I developed an active:csvfreemarker component. It applies a freemarker template to every csv row in an input file and writes the result to an output file. It works. However, the files I want processed contain millions of rows, and applying a freemarker template does take a bit of time. Can parallel processing help? Yes it can! Here's the relevant bit of code:

// vCsvMap holds the current csv row (column name -> value); null means end of input.
while(vCsvMap != null) {
    int i = 0;
    List<INKFAsyncRequestHandle> vHandles = new ArrayList<INKFAsyncRequestHandle>();

    // Issue up to eight freemarker requests without waiting for any of them.
    while( (vCsvMap != null) && (i < 8) ) {
        INKFRequest freemarkerrequest = aContext.createRequest("active:freemarker");
        freemarkerrequest.addArgument("operator", "res:/resources/freemarker/" + aTemplate + ".freemarker");
        // Pass every csv column to the template as an upper-cased, by-value argument.
        for (Map.Entry<String,String> vCsvEntry : vCsvMap.entrySet()) {
            freemarkerrequest.addArgumentByValue(vCsvEntry.getKey().toUpperCase(), vCsvEntry.getValue());
        }
        freemarkerrequest.setRepresentationClass(String.class);
        INKFAsyncRequestHandle vHandle = aContext.issueAsyncRequest(freemarkerrequest);
        vHandles.add(vHandle);

        // Read the next row; null ends both loops.
        vCsvMap = vInReader.read(vHeader);
        i = i + 1;
    }
    // Join the handles in the order they were issued, so output rows keep the input order.
    for (int j=0; j<i; j++) {
        INKFAsyncRequestHandle vHandle = vHandles.get(j);
        String vOut = (String)vHandle.join();
        vOutWriter.append(vOut).append("\n");
    }
}

The freemarker requests are issued as async requests in groups of eight. Their results are then processed in order in the for-loop.
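
For readers without a NetKernel instance at hand, the same issue-a-batch-then-join-in-order pattern can be sketched in plain Java. This is only an analogy, not the NetKernel API: java.util.concurrent stands in for issueAsyncRequest and join, and the class name BatchedAsyncSketch, the helper processInBatches and the upper-casing "template" are all hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class BatchedAsyncSketch {

    // Hypothetical helper: applies aTransform to every row, aBatchSize rows at a time.
    static List<String> processInBatches(Iterator<String> aRows, int aBatchSize,
                                         Function<String, String> aTransform)
            throws InterruptedException, ExecutionException {
        ExecutorService vPool = Executors.newFixedThreadPool(aBatchSize);
        List<String> vResults = new ArrayList<>();
        try {
            while (aRows.hasNext()) {
                // Issue up to aBatchSize tasks without waiting for any of them.
                List<Future<String>> vHandles = new ArrayList<>();
                while (aRows.hasNext() && vHandles.size() < aBatchSize) {
                    final String vRow = aRows.next();
                    Callable<String> vTask = () -> aTransform.apply(vRow);
                    vHandles.add(vPool.submit(vTask));
                }
                // Collect in submission order, so results keep the input order.
                for (Future<String> vHandle : vHandles) {
                    vResults.add(vHandle.get());
                }
            }
        } finally {
            vPool.shutdown();
        }
        return vResults;
    }

    public static void main(String[] args) throws Exception {
        List<String> vRows = List.of("a", "b", "c", "d", "e");
        // Stand-in for the template: upper-case each row, two rows at a time.
        System.out.println(processInBatches(vRows.iterator(), 2, String::toUpperCase));
        // prints [A, B, C, D, E]
    }
}
```

Note that each batch waits for its slowest member before the next batch starts, which is one reason the batch size matters.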

Why eight? That number depends on several things: the number of cores available, the duration of each async request, and so on. You'll need to experiment a bit to see what fits your environment and requirements. So the number really should not be hard-coded. Bad me.
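
One possible way to get rid of the hard-coded number is sketched below. It assumes the batch size comes from a setting and falls back to the number of cores the JVM reports. The class BatchSize and the property name csvfreemarker.batchsize are hypothetical, made up for this illustration; they are not NetKernel settings, and the core count is only a starting point for experimenting.

```java
class BatchSize {

    // Hypothetical property name, used only for this illustration.
    static final String PROPERTY = "csvfreemarker.batchsize";

    // Returns the configured batch size if it is a positive integer,
    // otherwise the core count (never less than 1).
    static int resolve(String aConfigured, int aCores) {
        if (aConfigured != null) {
            try {
                int vValue = Integer.parseInt(aConfigured.trim());
                if (vValue > 0) {
                    return vValue;
                }
            } catch (NumberFormatException e) {
                // Not a number: fall through to the default below.
            }
        }
        return Math.max(1, aCores);
    }

    public static void main(String[] args) {
        int vBatchSize = resolve(System.getProperty(PROPERTY),
                                 Runtime.getRuntime().availableProcessors());
        System.out.println("batch size: " + vBatchSize);
    }
}
```

Running with -Dcsvfreemarker.batchsize=8 would reproduce the behaviour of the code above; leaving the property out uses the core count instead.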