A Predator of Information

Our songs will all be silenced, but what of it? Go on singing.

Higher and Higher!

15 December 2018 1:37 AM (software)

I grew up on an Apple IIGS. It had two languages preinstalled: Applesoft Basic and Assembly. It didn't take long for me to discover Assembler, and I later asked for and got a macro Assembler for Christmas. It was Orca/M, a really good macro assembler made by someone with an IBM background.

i wrote a lot of assembly language, read books on systems programming, and wrote· bits of operating systems and small applications of my own. (As this was a microcomputer pre-memory-protection, ‘bits of operating system’ meant ‘libraries to read from and write to the disk drive.’)

Eventually I got a PC and installed Linux. I went away to college and engaged with Other Programmers for the first time. They insisted I should like High Level Languages like C and C++ and Java! Especially Java! Well. Portability I kinda', sorta' got; assembly language isn't the most portable thing in the world. Other than that, I wasn't impressed.

C didn't seem like an improvement. With a good macro-assembler and a standard calling convention you could whip up macros to simplify the tedium of making function calls or calls into the OS. C++ didn't seem like an improvement, either. Encapsulation just seemed like syntactic sugar. obj.foo() wasn't obviously better than foo(obj). Inheritance didn't seem like anything special, either; having a table of the addresses of subroutines is an old trick. Java seemed like C++ but slower. I wasn't interested.

So I spent a bit of my youth thinking that assembly was cool and awesome and high level languages were for the weak. You can imagine that I came to a different conclusion, given that this website runs on Scheme. There was something I encountered that changed everything…

And that was Forth. Forth was low-level. Heck, the Forth I was using let you drop into the assembly if you wanted to pretty easily. I could respect that. And it gave me something that all those high-falutin'-level languages didn't: A REPL. How cool is that? The closest things to a REPL I'd seen up to this point were the Apple II ROM Monitor (and the Apple II ROM Monitor was not actually fun to use) and the Applesoft Basic prompt.

Forth was very regular. You strung words together. Each word took things off the stack and put other things on the stack. The words that came with the language weren't special. If you wanted different control structures, you could make them, and they looked and worked just like the ones that came with the system. Even things like defining variables or defining a word were handled by words, often written in Forth itself. There was a global vocabulary (or in more complicated versions, a linked list of vocabularies searched in some order), and adding a new word to the vocabulary is a easily done with create. Each word can potentially have some code and a range of memory associated with it. This allows you to create whole new kinds of data easily. Want structures? Add them. Want objects? Add them. Want closures? Ridiculously easy. Add them.

Words could read text out of the input buffer and do whatever they wanted with it. So if 'create' wasn't to your liking, you could write your own word that parsed text however you wanted, rather than just waiting for the next blank space. Want rational numbers you can input as rat m/n? Knock yourself out. In traditional Forth systems the compiler was written in Forth and you could (and were encouraged to) use its guts to make a compiler of your own that did something slightly different, if you wanted.

Forth was less a language for writing programs and more a language for writing the language you actually wanted to write your program in. This, for the first time, gave me something I couldn't get from assembly language. Sure, I could write a compiler or interpreter in assembly, but that wasn't the same. That was a lot of work, and you couldn't use the language you were writing to write itself while you were writing it.

So, I'd finally found a high level language, not very high, but higher than assembly that intrigued me and that I really liked using. I got seriously interested in programming languages and looked for any other languages that had the same qualities: A language with a REPL. One that's very regular without much special syntax. One that lets you effectively write a new language in it.

Knock, knock.
Get the door.
It's Lisp!

Scheme, to be precise, was my Lisp of choice. In a sense, Scheme was more regular and uniform than Forth. Forth I had to worry precisely about where the bits and bytes were. Traditional Forth doesn't have a heap. You never deallocate memory. You just start out at one end of RAM and increment the pointer separating used from free memory with the allot word. This is not as bad as it sounds. A lot of embedded systems have to allocate all the memory they're going to use right at the start, and often Forth programs on a disk system would just allot themselves several block-sized buffers and read pages from disk (or write them out) as needed to do the work. (A traditional Forth system didn't use files, it just had blocks of a fixed size, often one kibibyte, and words to move them to or from disk.) As you can imagine, this was somewhat involved. You could create a changeable data structure, sure, but you had to decide on the capacity of each part right up front. There was no changing it later on. With Scheme, you just created stuff. When you were done with it, you stopped referring to it. That was liberating. I would never have thought so before, but when your focus is on experimenting in real-time to build the language you want for your domain, being freed from low-level details is great.

Forth wasn't shy about being close to the machine. There were grungy things in assembly and weird magical words that left markers and jumped around. It was extensible because it was this weird hodgepodge and nothing was hidden. Scheme took a different tack. There were a few primitive special forms, like define and if, that were magical, but you could still write things that looked and worked like what came with the compiler. Macros let you make arbitrary changes to the parse tree. (foo x y z) will normally invoke the function foo with the parameters x, y, and z. If foo is a macro, it will take the three parameters and use it to generate some arbitrary list which is then evaluated.

Scheme has a loop construct. It's called do. I don't care for it. It also has recursion. Functions can call themselves. It is also guaranteed to perform tail-call optimization. What this means is that if a function call is in tail position, that is the return value of that function would also be the return value of the caller, the function is jumped to without allocating a stack frame. This means that if a function calls itself repeatedly in tail-position, it won't use any more stack space and can continue indefinitely. If you have tail-recursion, you don't need a loop construct. If you want one anyway, you can make it with a macro.

Scheme has closures; more properly, all scheme functions close over their environment. If you define one function inside another, then return that inner function, any variables in the function that defined it are still visible. A variable scope depends entirely on what was defined inside what, not on which stack frame happens to be available at the time of execution. You can use this to get an encapsulation like effect where one function returns a list of functions all of which operate on the same data. It's used most often to write up quick functions and either return them or pass them into other things as arguments without having to worry about whether the function will outlive the data on which it depends.

Consider the expression (+ (* 3 2) 5). The continuation of (* 3 2) could be thought of as (lambda (x) (+ x 5)). That is, the continuation of a function application is whatever expression the value was going to be used in. If you make this explicit, it's called continuation passing style. The above example would look like ((lambda (x) (x (* 3 2))) (lambda (x) (+ x 5))). Scheme has first-class continuations: You can get the continuation at any point with the call-with-current-continuation (aka call/cc) function and stash it somewhere. Take our example of (+ (* 3 2) 5). If I capture the continuation, then I can effectively re-evaluate the form except with a different value for (* 3 2). This can be very interesting for nondeterministic algorithms or for exploring some search space where you want to satisfy a set of conditions. Whenever there's a failure, some different candidate can be tried. If you have a continuation from some expression way down the subtree that calls you, you can invoke that continuation to escape, throw away the remaining computation, and return early. This can be used for exceptions as well as breaking out of loops. Continuations can also be used for coroutines and all sorts of other stuff.

Perl also has closures. Nowadays even C++ has closures. They've become a common feature in new languages or ones being updated. I was writing Perl at the time, and mastering Scheme made me much, much better at programming in Perl, because it forced me to think in some ways that Perl allowed, but didn't particularly push me toward. This was a discovery even more interesting than the specific capabilities of Scheme and Forth.

This is the reason why, when someone asks me what programming language they should learn, I will almost always tell them something other than the popular run of Java, C#, C++, JavaScript, Python, etc. Chances are, they already have something pushing them to learn one of those, whether it's a job or some project they want to work on. Also, the most popular languages are very similar to each other. If you know one, the others will feel familiar. You do not gain anything by learning more than one.

That is why I recommend people pick up things like Forth, Haskell, Prolog (or Mercury), APL (or J, which is more functional and less special-charactery), Plaid, or Idris. They are different from each other. They focus on different things and will force you to think differently. I even thought learning SNOBOL was interesting.

So. Now you know how I went from an assembly language hacker who sneered at Java to a functional hacker who sneers at Java.

One response

  1. says:

    Interesting.

Leave a Reply