2012/02/24

Wayback Machine

Have you ever taken a look at the Wayback Machine? In these days of the fast and the furious internet too few of us take time to look back at how things were in the not so distant past.

It may therefore be of interest to you to know that the more serious of the reverse engineers (I refuse to call them hackers, but that is a story for another day) do know about the Wayback Machine.

The way it works is that it constantly crawls the web and keeps a copy (by datestamp) of pages that have changed since the last crawl.

In ROC terms (you knew this was coming, right ?) the Wayback Machine holds on to expired (= no longer valid) representations of a resource. When I say no longer valid, I mean no longer valid now, for within the correct context - a given date - they are of course valid. There, the explanation of a time machine by a non-scientist!
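The idea of "valid within the context of a date" can be sketched in a few lines of Java. This is only an illustration of the principle, not how the Wayback Machine is actually implemented; the class name SnapshotArchive and its methods are made up for the occasion.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: one archived page, its snapshots keyed by crawl date.
class SnapshotArchive {
    private final TreeMap<LocalDate, String> snapshots = new TreeMap<>();

    // Keep a copy only if the content changed since the most recent earlier crawl.
    void crawl(LocalDate date, String content) {
        Map.Entry<LocalDate, String> last = snapshots.floorEntry(date);
        if (last == null || !last.getValue().equals(content)) {
            snapshots.put(date, content);
        }
    }

    // The representation that was valid on the given date, or null if there is none.
    String asOf(LocalDate date) {
        Map.Entry<LocalDate, String> entry = snapshots.floorEntry(date);
        return entry == null ? null : entry.getValue();
    }

    int size() {
        return snapshots.size();
    }

    public static void main(String[] args) {
        SnapshotArchive archive = new SnapshotArchive();
        archive.crawl(LocalDate.of(2011, 1, 1), "old page");
        archive.crawl(LocalDate.of(2012, 1, 1), "new page");
        // Expired today, but valid in the context of mid 2011.
        System.out.println(archive.asOf(LocalDate.of(2011, 6, 1)));
    }
}
```

The date is simply part of the request: ask for the same page with a different date and you get a different, equally valid, representation.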

NetKernel has its very own time machine. It is called the Visualizer. It holds on to representations of resources. How ? By tagging them in the cache. For all other purposes these representations may be expired, but as long as the Visualizer has them tagged, they are not removed from the cache and can be consulted.
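To make the tagging idea concrete, here is a toy cache in Java. It is an assumption-laden sketch of the principle only: the class TaggedCache and its methods are hypothetical and say nothing about how the NetKernel cache or the Visualizer are really built.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a cache in which tagged entries survive eviction.
class TaggedCache {
    private final Map<String, String> entries = new LinkedHashMap<>();
    private final Set<String> tagged = new HashSet<>();

    void put(String identifier, String representation) {
        entries.put(identifier, representation);
    }

    void tag(String identifier) {
        tagged.add(identifier);
    }

    void untag(String identifier) {
        tagged.remove(identifier);
    }

    String get(String identifier) {
        return entries.get(identifier);
    }

    // Drop every entry that nobody has tagged (for simplicity all entries count as expired).
    void evictUntagged() {
        entries.keySet().removeIf(identifier -> !tagged.contains(identifier));
    }

    public static void main(String[] args) {
        TaggedCache cache = new TaggedCache();
        cache.put("res:/page/one", "representation one");
        cache.put("res:/page/two", "representation two");
        cache.tag("res:/page/one");       // the observer holds on to this one
        cache.evictUntagged();
        System.out.println(cache.get("res:/page/one"));
        System.out.println(cache.get("res:/page/two"));
    }
}
```

Once the tag is released, the next eviction removes the entry like any other.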

To see it at work, start it, then consult a couple of pages in the Backend Fulcrum. Go back to the Visualizer, stop it and take a look (in detail) at what you just did. It is that simple.

There is no debugging tool on the market that can stand even close to it. Let me repeat that. There is no debugging tool on the market that can stand even close to it. Most of them are postmortem tools, you have to reconstruct what happened. The (more expensive) ones that allow real time debugging suffer from the quantum problem : "Once you stick in your instruments to measure, you've changed the thing you want to measure".

The Visualizer suffers from neither problem. It is a true time machine. Go check it out. Today!

For those of you that are not convinced ... the Wayback Machine and its counterparts get more hits from reverse engineers (and I'm not talking script kiddies here) than from any other group of users. These are people that know how to use things like SoftICE (now Syser), Java decompilers, JavaScript injection, ... Why would they bother with a rather boring website ? Out of historic interest ? I think not ...

2012/02/17

Time of the Tree

Before I discovered NetKernel, my physical library consisted of a copy (= immutable representation) of the LOTR books as well as everything Stephen King had written up to that point (I actually learned English from reading "The Tommyknockers" over and over again, so blame him if my English is far from perfect).

Since then countless books on mathematics have been added as well as a (treasured) copy of The Codebreakers (the 1967 edition), books on evolutionary biology, ... the list goes on and on ...

So I was rather pleased last week after reading the NetKernel newsletter that I could go back to the roots and read up on the Ents. For after all, trees need guardians don't they ?

Do I agree with what Peter says in the article ? Well, as a database administrator I've seen the demise of the hierarchical (tree) database, the coming and going of the network-oriented (CODASYL) database and the rise of the relational (Codd) database. And just now the latter is under fire from the NoSQL/document database. And the funny thing is that a document is a tree. Instead of there being something new, we've actually come full circle.

So yes, he might have a point. And the Hierarchical Data Structure (HDS) brings this data structure neatly to your code, because let's be frank, it is a pain if the data structure in your code differs from the one in the database. Let us - for example - take a look at the active:fls accessor. It takes an IHDSNode as input :

<root>
  <fls>
    <root>file:/var/tmp</root>
    <recursive/>
    <filter>*.tmp</filter>
  </fls>
</root>


Yes, I do know that is XML. It is just a representation, a way of looking at HDS. Every IHDSNode has a root element which may be ignored for practical purposes. Incoming XML (for example) is transrepted into a tree underneath this element. On the way out the element is removed.

This will make it more clear :


root : null
  fls : null
    root : file:/var/tmp
    recursive : null
    filter : *.tmp
  fls : null
    root : file:/tmp
    filter : *.bak


The above IHDSNode contains a forest, not just a single tree. Normally speaking you cannot transrept this to an XML representation (an XML document cannot contain a forest, it needs a single root element). But the transreptor has foreseen this (border) case and turns it into :

<hds>
  <fls>
    <root>file:/var/tmp</root>
    <recursive/>
    <filter>*.tmp</filter>
  </fls>
  <fls>
    <root>file:/tmp</root>
    <filter>*.bak</filter>
  </fls>
</hds>


Whereas the original example transrepts to :
<fls>
  <root>file:/var/tmp</root>
  <recursive/>
  <filter>*.tmp</filter>
</fls>


Since one has to be careful these days ... for the purpose of this blog entry I define a forest as a group of two or more trees.
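The wrapping rule described above (one tree comes out as is, a forest gets an hds element around it) is easy to mimic. The Java below is a minimal sketch under my own assumptions: HdsNode and ForestToXml are hypothetical stand-ins, not the NetKernel IHDSNode API or its real transreptor, and no XML escaping is done.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an HDS node: a name, an optional value, children.
class HdsNode {
    final String name;
    final String value; // null when the node has no value of its own
    final List<HdsNode> children = new ArrayList<>();

    HdsNode(String name, String value) {
        this.name = name;
        this.value = value;
    }

    HdsNode add(HdsNode child) {
        children.add(child);
        return this;
    }

    // Serialize this node and everything below it (sketch only, no escaping).
    String toXml() {
        if (children.isEmpty()) {
            return value == null
                    ? "<" + name + "/>"
                    : "<" + name + ">" + value + "</" + name + ">";
        }
        StringBuilder xml = new StringBuilder("<" + name + ">");
        for (HdsNode child : children) {
            xml.append(child.toXml());
        }
        return xml.append("</").append(name).append(">").toString();
    }
}

class ForestToXml {
    // The anonymous root is dropped; a forest is wrapped in a single hds element.
    static String serialize(HdsNode root) {
        if (root.children.size() == 1) {
            return root.children.get(0).toXml();
        }
        HdsNode wrapper = new HdsNode("hds", null);
        wrapper.children.addAll(root.children);
        return wrapper.toXml();
    }

    public static void main(String[] args) {
        HdsNode root = new HdsNode("root", null)
                .add(new HdsNode("fls", null).add(new HdsNode("root", "file:/var/tmp")))
                .add(new HdsNode("fls", null).add(new HdsNode("root", "file:/tmp")));
        System.out.println(serialize(root));
    }
}
```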


Did you notice the similarity with JSON ? Here is another possible representation :
{ "fls" : [
    {
      "root" : "file:/var/tmp",
      "recursive" : null,
      "filter" : "*.tmp"
    },
    {
      "root" : "file:/tmp",
      "filter" : "*.bak"
    }
  ]
}


Do you see my point ? I could not care less when I read that XML is dead and that JSON Rules. For I guarantee it, before the year is out, somebody with clout will get tired of typing curly brackets and invent something new (and if nobody does, I will). Bullshit (forgive me my French) ! Debating representations is moot. We care about resources. And the Time of the Tree has arrived (once more) ! Check it (HDS) out !

For your information, here's the Groovy program I did my experiments with :

import org.netkernel.layer0.nkf.INKFResponse;
import org.netkernel.layer0.representation.IHDSNode;
import org.netkernel.layer0.representation.impl.HDSBuilder;

// The builder starts out at the implicit root node
builder = new HDSBuilder()

// First tree : fls with three children
builder.pushNode("fls",null);
builder.addNode("root","file:/var/tmp");
builder.addNode("recursive",null);
builder.addNode("filter","*.tmp");
builder.popNode();

// Second tree : a sibling fls under the same root, which makes this a forest
builder.pushNode("fls",null);
builder.addNode("root","file:/tmp");
builder.addNode("filter","*.bak");
builder.popNode();

// Return the HDS as the response, always expired so it is rebuilt on every request
response = context.createResponseFrom(builder.getRoot());
response.setExpiry(INKFResponse.EXPIRY_ALWAYS);


That's it for this week ... Oh yes, for obvious reasons, Tom Bombadil is my favorite character in LOTR, but Treebeard does come in a close second.

2012/02/09

TRL

I've had some feedback that I worry too much about even-toed ungulates, so this week I'll play it safe and give a short example of the newest TRL - Text Recursion Language feature.

Now, TRL is itself a recent addition to the NetKernel toolkit and the little sister of XRL - XML Recursion Language. In retrospect it was a missing child, for with Freemarker + TRL for text and XSLT + XRL for XML those two formats (which will both be around for a long time to come) are now pretty well covered.

No idea what a future (?) JSLT and JRL (with the J for JSON) will look like, but the format has great momentum and at some point the need for both tools will arise.

Little sisters often take the lead (mine provided my schoolbooks with neat covers long before I could cut straight with scissors ... and she reminds me of that until this very day) and TRL is no exception. Last week asynchronicity was added to it. XRL will no doubt follow soon.

What does that mean ? Well, take this simple example :

<sequence>
  <request assignment="response">
    <identifier>active:trl</identifier>
    <argument name="template">
      <literal type="string">
        <![CDATA[
Hello ${
          <request>
            <identifier>active:groovy</identifier>
            <argument name="operator">
              <literal type="string">
                import org.netkernel.layer0.nkf.INKFResponse;
                sleep(1000);
                response = context.createResponseFrom("Tom");
                response.setExpiry(INKFResponse.EXPIRY_ALWAYS);
              </literal>
            </argument>
          </request>
} ${
          <request>
            <identifier>active:groovy</identifier>
            <argument name="operator">
              <literal type="string">
                import org.netkernel.layer0.nkf.INKFResponse;
                sleep(1000);
                response = context.createResponseFrom("Geudens");
                response.setExpiry(INKFResponse.EXPIRY_ALWAYS);
              </literal>
            </argument>
          </request>
}
        ]]>
      </literal>
    </argument>
  </request>
</sequence>

You can execute this in the Scripting Playpen (as DPML obviously). By the way, the <![CDATA[ ]]> is needed in order to put XML inside a string literal without having to escape it (thanks to Peter Rodgers for pointing that out).


The example itself doesn't do a lot, but time it with the Visualizer. You'll get something like 2010 ms. Which is to be expected with the two sleeps that are executed synchronously.

Now, try this :


<sequence>
  <request assignment="response">
    <identifier>active:trl</identifier>
    <argument name="template">
      <literal type="string">
        <![CDATA[
Hello $a{
          <request>
            <identifier>active:groovy</identifier>
            <argument name="operator">
              <literal type="string">
                import org.netkernel.layer0.nkf.INKFResponse;
                sleep(1000);
                response = context.createResponseFrom("Tom");
                response.setExpiry(INKFResponse.EXPIRY_ALWAYS);
              </literal>
            </argument>
          </request>
} $a{
          <request>
            <identifier>active:groovy</identifier>
            <argument name="operator">
              <literal type="string">
                import org.netkernel.layer0.nkf.INKFResponse;
                sleep(1000);
                response = context.createResponseFrom("Geudens");
                response.setExpiry(INKFResponse.EXPIRY_ALWAYS);
              </literal>
            </argument>
          </request>
}
        ]]>
      </literal>
    </argument>
  </request>
</sequence>

Time it again. You should get something like 1010 ms. We just halved execution time! The tradeoff - always be aware of those - is an extra thread. Still impressive. Go play with it! Try some recursion (a recursive example would have made the above completely unreadable).
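If you want to see the same effect outside NetKernel, the difference between waiting for each sub-request in turn and issuing both before waiting can be sketched in plain Java. This is an analogy only, it is not how TRL is implemented, and the class AsyncDemo and its method names are made up.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: two slow sub-requests, evaluated one by one or side by side.
class AsyncDemo {
    // Stand-in for a slow sub-request: wait, then return a value.
    static String slow(String value, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    // Comparable to ${...} ${...}: the second request starts after the first finishes.
    static String sequential(long millis) {
        return "Hello " + slow("Tom", millis) + " " + slow("Geudens", millis);
    }

    // Comparable to $a{...} $a{...}: both requests are issued before either is awaited.
    static String concurrent(long millis) {
        CompletableFuture<String> first = CompletableFuture.supplyAsync(() -> slow("Tom", millis));
        CompletableFuture<String> second = CompletableFuture.supplyAsync(() -> slow("Geudens", millis));
        return "Hello " + first.join() + " " + second.join();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String a = sequential(1000);
        long middle = System.nanoTime();
        String b = concurrent(1000);
        long end = System.nanoTime();
        System.out.println(a + " took " + (middle - start) / 1_000_000 + " ms");
        System.out.println(b + " took " + (end - middle) / 1_000_000 + " ms");
    }
}
```

Both variants produce the same text; only the elapsed time differs, and the price is paid in threads. Note that this sketch hands both sleeps to worker threads, so it is a little more generous with threads than the single extra one mentioned above.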

To finish this entry ... it has been suggested that if we - humans - had an even number of digits on each hand/foot, we might have invented binary (and thus started the digital age) a lot sooner.

2012/02/03

Scaffolding

From riding a dromedary near the pyramids in Giza, Egypt, we move across the Mediterranean Sea and to the time when this sea was called Mare Nostrum (our sea). You guessed it, we are going to Rome, Italy!

The ancient Romans were very keen on the following architectural structure.

That is called a free-standing arch. The Colosseum is full of them. So are the aqueducts that provided the Roman cities with water. In fact, every self-respecting emperor or general of any worth had one of these erected in his name. We know those today as triumphal arches.

It is impossible to build one. Really. Without the middle stone (called keystone), the structure is not stable and will collapse, but obviously you can not put the keystone in before everything else is in place. Try it yourself, it is impossible.

What the Romans figured out (so did the Egyptians when building the pyramids in fact) is that the impossible can be made possible if you use scaffolding. Today you can't see the pile of sand (or beams of wood) that supported the structure while it was being built. But it was there.

Right, before everybody thinks I've gone completely nuts and have started calling myself Julius Caesar ... what does this have to do with NetKernel?

Everything is a resource. You've heard that so often that by now you probably ignore it. But there are resources and resources. In the book I use the example of the data encrypted steganographically in an image of the National Art Gallery in Kuala Lumpur.

Do I know how to get that image? No. Do I know how to decrypt it? No. What I do know is how my request should look and what representation I expect. And NetKernel allows me to build a system based on that. As a scaffold I can use a file containing fixed data while I study the API of the gallery in Kuala Lumpur. Or maybe somebody else can do the studying for me. Maybe the whole API has yet to be developed. Whatever, NetKernel doesn't care! When the time comes, you put the keystone in (= let the requests resolve to the real application), check it, remove the scaffold and lo and behold ... a free-standing arch!

The same procedure goes for databases. Every new day brings a new database engine. While that has obvious benefits (keeps Oracle on its toes for one) it also makes choosing hard and customers may change their minds overnight (and a couple of times). Doesn't matter. I once built a very successful application where the customer didn't even realize I wasn't using the database at all. Once I realized it - using a database - was going to be a performance nightmare, I swapped out the database and swapped in a file-based system. The requests didn't change, nor did the results, NetKernel doesn't care. The performance went to subsecond responses. Everybody happy.
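Stripped of everything NetKernel does for you, the trick looks like the Java sketch below. The requester only knows an identifier and the representation it expects back; what sits behind the identifier can be swapped at will. The class Resolver and the identifier active:galleryImage are hypothetical, invented for this illustration, and have nothing to do with NetKernel's actual resolution mechanism.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: requests resolve by identifier, the endpoint behind it can change.
class Resolver {
    private final Map<String, Function<String, String>> bindings = new HashMap<>();

    // Bind (or rebind) an identifier to whatever currently serves it.
    void bind(String identifier, Function<String, String> endpoint) {
        bindings.put(identifier, endpoint);
    }

    // The requester never sees which endpoint answers.
    String request(String identifier, String argument) {
        Function<String, String> endpoint = bindings.get(identifier);
        if (endpoint == null) {
            throw new IllegalArgumentException("Unresolved request: " + identifier);
        }
        return endpoint.apply(argument);
    }

    public static void main(String[] args) {
        Resolver resolver = new Resolver();

        // Scaffold: fixed data while the real thing is studied or still being built.
        resolver.bind("active:galleryImage", name -> "fixed data for " + name);
        System.out.println(resolver.request("active:galleryImage", "painting"));

        // Keystone: the same request now resolves to the real implementation.
        resolver.bind("active:galleryImage", name -> "real data for " + name);
        System.out.println(resolver.request("active:galleryImage", "painting"));
    }
}
```

The line of code that makes the request is identical before and after the swap, which is the whole point.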

So, the lesson for today is that you should learn from history. Scaffolding is an age-old trick and NetKernel allows you to use it. Do so!

All that remains for me to say is ... Ave!