24 Apr 2014

The early part of the CosmicOS message describes a programming language that consists only of integers and nesting markers (parentheses). Symbols are allowed for convenience when developing the message, but get stripped away. For example:

(= 20 (+ 5 15))

is encoded exactly the same way as:

(2 20 (10 5 15))

This is done to cut down on ambiguity and complexity once we get to meta-programming. Instead of a special syntax, we agree that the first integer in any list is a look-up to some kind of memory. Currently, slot 2 holds a function for testing equality and slot 10 holds a function for adding two integers.

The memory store is important to the message, but the specific locations used are not. So far, they get assigned on need, and then are pinned down so that the message doesn’t change unnecessarily from build to build. This means that commonly used stuff is held in lower-index locations, leading to a shorter message overall.

On balance, I like not having a special syntax for identifiers versus integers (although I’m not religious about it) since it avoids a lot of complexity once we are describing how to evaluate code within the message itself. I do have a qualm though. It would perhaps be better to use higher-index locations (like say 345391) and avoid lower locations (like 10 or 2), as a courtesy to the eventual reader, so that in the early message there are no collisions between uses of integers as simple numbers and uses of integers as memory indices. When (+ 5 15) is encoded as (10 5 15) there isn’t much contrast between the lookup and the arguments. If instead we encoded as (345391 5 15), there would be a bigger contrast. And context-free searching for a sequence like 345391 in the message would yield relevant hits, rather than the chaff that a search for 10 or 2 would give.

The message would become longer, but could be simpler to understand. Expect this change in a future revision to CosmicOS.

All posts