May 27, 2021

The quirkiest programming language I ever worked in

I have worked with a number of programming languages over the years, producing non-trivial pieces of software. There are things to be said about programming languages, as they are the means by which we instruct the machine to do our bidding. As such, they are the glue between what we construct in our minds, and what the machine in the end actually does.

This leads to a tower of abstractions, where our world rests on four elephants, standing on top of a turtle, sailing through the empty space of our minds.

Figure 1. The world, as envisioned by Discworld

Programming languages come in all flavours, from low end, to low end but with a number of hard things fixed, to flawed, to high level (with guard rails), to practical, to designed for fun, to a giant experiment that turned out to work rather well, to a magically enchanted +2 platinum crossbow, to things you cannot avoid. There is a lot to choose from, and depending on your constraints, you might end up working in a language that is missing things you know from other programming languages.

OOiS

Of the languages I have worked with over the years, the quirkiest by far is called OOiS-script, a now dead language. Back in 2007, I joined a company called 24HR (they are awesome by the way), and ended up working with this language because of that.

OOiS-script was part of a CMS called OOiS (Object Oriented information System), which was also its own web server, which also hosted its own scripting language. The web server was written in C, with a parser for OOiS-script to boot. The first version of this was written back in 1999. By the time I joined, there was still active use, but no active development of the system.

The idea behind OOiS-script was that ordinary webmasters, who were not trained programmers, would be able to write simple snippets of code that spat out HTML. In practice, it turned out that only one customer actually used this feature: Lund University. Since it was only meant to be used by webmasters, the selection of features was… sparse.

Note
To be fair to OOiS, it had some really interesting concepts for CMSes, which I ruthlessly copied when I decided to write my own.

OOiS-script 101

When I joined, I was handed a neat little printed book with all the syntax that was supported. Needless to say, there were no modern code editors that supported this syntax. However, someone had put together an emacs mode that gave some highlighting, and so I ended up using emacs. 14 years later I still use emacs out of sheer laziness.

Defining a variable
^set(var[myvar]=3)

Defining variables was the first quirk. You see…​ that code translates into a giant hashmap with myvar holding the value 3. Seems innocent enough at first. A quick, easy implementation. The result however…​ was the interesting effect that all variables in your program were global, without exception.

The second interesting effect was that you could not reassign a variable to a value of a different primitive type than the one it was first given. That is, I could not assign the integer 3 to var[myvar] and later assign the string "foobar" to the same var[myvar].
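To make the two effects concrete, here is a minimal Python sketch of how such a store might behave. It is a model built from the description above, not the actual C implementation; `GLOBAL_VARS` and `set_var` are names made up for illustration.

```python
# Hypothetical model of the OOiS-script variable store (illustrative names only).
GLOBAL_VARS = {}  # one store shared by every script and every include


def set_var(name, value):
    """Assign a variable, refusing to change the type of an existing one."""
    if name in GLOBAL_VARS and type(GLOBAL_VARS[name]) is not type(value):
        raise TypeError(
            f"{name} already holds a {type(GLOBAL_VARS[name]).__name__}, "
            f"cannot assign a {type(value).__name__}"
        )
    GLOBAL_VARS[name] = value


set_var("myvar", 3)  # roughly ^set(var[myvar]=3)
set_var("myvar", 4)  # same primitive type, so this is accepted

try:
    set_var("myvar", "foobar")  # int to string is rejected
    type_error = None
except TypeError as err:
    type_error = str(err)

print(GLOBAL_VARS["myvar"], "|", type_error)
```

Every script touching `GLOBAL_VARS` sees the same names, which is the "everything is global" part; the type check models the reassignment restriction.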

Looping
^loop([counter,0,100])
  ^print(var[counter])<br>
^endloop()

Yep…​ OOiS-script was a template language as well. So you could write in any HTML you wanted, and it would happily spit it out. counter is put into the giant, global hashmap, and is available via var[].

Printing
^print("my stuff")
^print("<strong>I can print HTML as well, and forget the closing strong")
^print("<strong>I can print HTML as well, and forget the closing strong").escape()
^print(var[myvar])

Printing was interesting in that it didn’t escape by default.
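For comparison, this is what the difference between raw and escaped output looks like in Python, using the standard library's `html.escape`. It only illustrates the concept; it says nothing about how OOiS implemented `.escape()`.

```python
import html

snippet = "<strong>I can print HTML as well, and forget the closing strong"

raw = snippet                   # sent as-is, the browser treats <strong> as markup
escaped = html.escape(snippet)  # < and > become entities and show up as text

print(raw)
print(escaped)
```

With the raw variant, the unclosed tag makes the rest of the page bold; with the escaped variant the reader simply sees the literal text.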

Comment
<!!-- my comment --!!>

Comments were written in such a way that they almost looked like HTML comments, except for the extra exclamation marks. Better make sure they are correct, or your comment will show up in your HTML.

Shell
^set(var[output]=cmd("rm -rf /"))

You also had direct access to the shell. Since so much was not possible to write in OOiS-script, any advanced stuff was written in something else, and executed from within OOiS-script. This little beauty, and the fact that it was part of the default implementation, caused an entire web server to be deleted by someone who found an injection path in one of the web sites being served. Three days later we could finally restore all the websites, and two months later we had rewritten the website that was the offender.
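I do not know the exact injection path that was used, so the following Python sketch only shows the general class of problem: user input pasted into a shell command line. The function names and the `cat` command are assumptions for illustration, and nothing is executed here.

```python
import shlex


def build_command_unsafe(filename):
    # Hypothetical: request input concatenated straight into a shell command.
    return "cat " + filename


def build_command_quoted(filename):
    # shlex.quote turns the input into a single shell word.
    return "cat " + shlex.quote(filename)


user_input = "page.html; rm -rf /"

unsafe = build_command_unsafe(user_input)  # the shell would see two commands
quoted = build_command_quoted(user_input)  # the shell would see one odd filename

print(unsafe)
print(quoted)
```

Quoting is the bare minimum; passing an argument list to `subprocess.run` without a shell avoids the parsing problem altogether.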

Include
^include("path/to/script.html")

Heavily inspired by PHP, include could be used in code files anywhere and was best used to split up large pieces of code into something that resembled modules. Note that vars are still global using include.

Primitives

For primitives you had ints, floats, strings and arrays. Lists and hashmaps were not present.

Other stuff

You had syntax for working with XML, but not JSON (too new). You used Latin1 for character encoding and you even had support for UTF-8, but only for all the code points that translated to Latin1.
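A small Python check shows what "only the code points that translate to Latin1" means in practice. The helper name `fits_in_latin1` is made up for this sketch.

```python
def fits_in_latin1(text):
    """Return True if every character in text has a Latin1 equivalent."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


swedish = fits_in_latin1("Malmö")  # ö exists in Latin1
polish = fits_in_latin1("Łódź")    # Ł and ź do not

print(swedish, polish)
```

Swedish text passes without trouble, which is probably why the limitation could go unnoticed for a long time.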

And it was not possible to delete something. Once something was allocated, it stayed for the entire request. Upon returning a response, the memory was freed.

OOiS-script 102

So…​ how would you dynamically construct something then? Everything is global. One tiny little quirk to the rescue. See…​ var[myvar] can also be accessed via a string such as var["myvar"]. That means the following is possible.

Dynamic lookup
^set(var[x]="myvar")
^print(var[var[x]])

This will print 3, which was set earlier in myvar. The thing about this though… is that we accessed "myvar" (i.e. the string) via a second var called x.
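The same double lookup, modelled with a plain Python dict standing in for the global var store (the name `variables` is an assumption of this sketch):

```python
variables = {"myvar": 3}  # stand-in for the global var store
variables["x"] = "myvar"  # roughly ^set(var[x]="myvar")

# var[var[x]]: the inner lookup yields the string "myvar",
# which is then used as the key for the outer lookup.
result = variables[variables["x"]]

print(result)
```

One level of indirection is all it takes to build every other data structure in the rest of this post.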

Hashmap
^loop([i, 0, 10])
  ^set(var[var["mything"].concat[var[i]]]=i)
^endloop()

<!!-- Later in the code --!!>

^loop([i, 0, 10])
  ^print(var[var["mything"].concat[var[i]]])
^endloop()

If you needed a hashmap, you constructed one in the global hashmap that var used. All you needed to know was how you constructed the keys.
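In Python terms, the trick amounts to namespacing keys by hand inside one flat dict. This sketch assumes the loop runs over ten values and that the key is the prefix followed by the counter; the real loop bounds and `concat` semantics may have differed.

```python
variables = {}  # stand-in for the global var store


def key_for(prefix, i):
    # The "hashmap" only exists as a naming convention for keys.
    return f"{prefix}{i}"


# Fill the pseudo-hashmap.
for i in range(10):
    variables[key_for("mything", i)] = i

# Later in the code: read it back by rebuilding the same keys.
values = [variables[key_for("mything", i)] for i in range(10)]

print(values)
```

Nothing stops another script from using the same prefix, so picking a prefix was effectively picking a namespace.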

List

Lists were not present either, so the global var store was abused to hold the items, and an array was then constructed with the keys. Given that arrays needed to have a known size beforehand, it was often more practical to just use the hashmap trick and know what the keys would be.

Counters would then be used, together with the chosen keyword, to construct the keys for the list.
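A rough Python model of the counter approach follows. The key layout (`name_count`, `name_0`, `name_1`, and so on) and the helper names are my own choices for the sketch, not something OOiS prescribed.

```python
variables = {}  # stand-in for the global var store


def list_append(name, value):
    """Store value under the next free index and bump the counter."""
    count_key = f"{name}_count"
    count = variables.get(count_key, 0)
    variables[f"{name}_{count}"] = value
    variables[count_key] = count + 1


def list_items(name):
    """Rebuild the list by walking the counter from 0 upwards."""
    count = variables.get(f"{name}_count", 0)
    return [variables[f"{name}_{i}"] for i in range(count)]


for title in ["first", "second", "third"]:
    list_append("titles", title)

items = list_items("titles")
print(items)
```

The counter is just another global variable, so the "list" grows without needing a size up front.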

Woes

Building advanced functionality was possible, but always a pain to reason about, since reading the code required the scrutiny of a hawk. Manual testing was the only way to be sure that what you had written was working. This turned out to be a good habit, and something I have cherished in more sane programming languages afterwards. Your compiler can tell you everything is all right, and you still screwed up your logic. Test it manually, observe the result, verify.

XML woes

What was a real pain was navigating large XML files. The techniques I had to use there meant very large chunks of data were allocated to the global vars hashmap internally, and I had no way of deleting unused data. Nor could I reassign an already allocated var used in the XML navigation to something small, like an empty string, since that would clash with the types in the C hashmap that was used under the hood.

This became a big problem for Lund University, as the server had 4 gigs of RAM, and a cron-job that ran every morning at 6AM, which parsed a number of XML files and then promptly crashed as it ran out of memory. I believe the number of files was less than a thousand. The problem was solved by splitting up the cron-job into two jobs, with the XML files split in half. This made it possible to just barely squeeze by, with the slightly bigger job consuming about 3.9 gigs of RAM.

UTF-8 woes

In 2009 OOiS finally had to contend with the fact that UTF-8 is actually one honking great idea, and it hit OOiS-script full force when Lund University wanted a searchable web frontend for all of their publications. Since this was written in a language that could actually handle UTF-8, and Lund University publishes articles with character sets other than Latin1, this turned out to be a bit of a problem. OOiS (the server) was coupled with MySQL, which used the Latin1 encoding. Every piece of data that came from MySQL was Latin1 encoded, while all the data that came from the web service powering their publication database was in UTF-8.

It was eventually solved by limiting the things that were allowed on the template, and converting anything from MySQL used in the template to UTF-8. Python powered the integration with the publication database, and handled all the edge cases of Latin1 <-> UTF-8 conversions, since you also had to be able to search for publications. A search query was correctly sent in to the server as UTF-8, manhandled by OOiS which thought it was Latin1, sent to the Python integration as Latin1, converted back to proper UTF-8, sent to the publication server, and then the results were served back as UTF-8. The last part was thankfully not a problem, as there was no need to manipulate the data inside of OOiS.
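The core of that round trip can be reproduced in a few lines of Python. This is a simplified sketch of the idea, not the actual integration code, and the search term is just an example.

```python
query = "Malmö"  # what the user typed, sent to the server as UTF-8

utf8_bytes = query.encode("utf-8")

# OOiS treats the UTF-8 bytes as Latin1: every byte becomes one character.
misread = utf8_bytes.decode("latin-1")

# The integration undoes the damage: back to the raw bytes, then decode properly.
repaired = misread.encode("latin-1").decode("utf-8")

print(misread)
print(repaired)
```

The repair works because Latin1 maps every byte value to exactly one character, so no bytes are lost as long as nothing in between modifies the string.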

A house of cards was built, one that worked like a charm from the end user's perspective.

Moral of the story

I built some really nice stuff. It was harder than it should have been. Tools matter, but only so far as they propel you forward. If possible, choose nice tools.

Tags: war-stories