Slux development status

Slux is a "high-level" runtime-typed virtual machine, designed to be both general and an appropriate target for Interactive Fiction languages.

It offers garbage collection; dynamic typing; support for currying and closures; primitive types (31-bit integers, strings, names, objects, fixed-layout structs, and dynamically-sized heterogenous arrays) for speed; function-scoped, object-scoped, global-scoped, and dynamically-scoped variables; opcode-based for performance, but with opcodes cascading to message sends to allow "operator overloading"; pseudoclasses to define methods for the primitive types; virtual primitive types implemented in interpreted code; opcode variations allow targeting by multiple languages with different semantics; compiler-configurable multiple-inheritence lookup rules; efficiently encoded program files; save/restore and save/undo for IF.

Specification

As of Sep 15, 2002:

Please report bugs in the specification to buzzard at nothings dot org! I'm fairly certain some of the opcode links are broken, which means the opcodes are undefined, but I'm not really up to clicking on them all. If you see an opcode that you don't know what it is and it's not defined anywhere, tell me.

Compiling to Slux

A guide to compilation strategies for dealing with Slux:

Rationale

FastSlux, an implentation

FastSlux is a high-performance implementation of the Slux specification. (There is no 'simple' reference specification, because I don't have the time to do both.)

FastSlux features:

Status, as of Jan 20, 2003

Noticed that 'testprop' wasn't working; fixed it. Now *really* finished testing the object/property/plist opcodes.

About 90 opcodes tested.

I've started writing a test codebase to test all the opcodes. A lot of infrastructure (an interpreted routine to print out entities) and I've tested all the string operators so far.

I did some performance testing at Iain Merrick's suggestion, and got around a 8:1 ratio of FastSlux to C code, doing all integer operations which FS was constantly typechecking, and with the inefficiencies in the interpreter dispatch loop described below.

I was testing this function:

int fib(int x)
{
   if (x < 2) return x;
   return fib(x-1) + fib(x-2);
}
which I manually compiled to (destinations on the left, # means a constant):
   jumpge  r0,#2,skip
   return  r0
 skip:
   sub     r0,r0,#1
   func    r1,#$fib,[r0]  ; parameter list is 'r0'
   sub     r0,r0,#1
   func    r0,#$fib,[r0]
   add     r0,r0,r1
   return  r0
A compiler would probably use additional registers to store the temporaries, but wouldn't need any additional instructions.

Looked at disassembly of inner loop under MSVC. Mostly good; pays an unnecessary rangecheck on every switch, as expected; all three of the crucial variables are register allocated as I requested. Here is what gets executed between the time 'noop' gets dispatched and the computed jump of the next instruction.

004024C5   inc         esi
004024C6   jmp         _callEntryPoint+1A80h (00403df0)

00403DF0   xor         eax,eax
00403DF2   mov         al,byte ptr [esi]
00403DF4   cmp         eax,0BBh
00403DF9   jbe         _callEntryPoint+146h (004024b6)

004024B6   mov         edx,0FFFFFFFCh
004024BB   or          ecx,0FFh
004024BE   jmp         dword ptr [eax*4+403E40h]
esi is the instruction pointer; the first instruction is the entire implementation of noop. I have no idea what the edx and ecx stuff at the end there are. ecx and edx elsewhere are temporaries. I was thinking maybe this was a subexpression being pulled up before the switch, but I couldn't find edx or ecx being used anywhere without being overwritten first. And 'or ecx,0ffh' is a bizarre instruction anyway.

Here's the implementation of 'return 0'; rapid call and return is crucial.

004029F2   mov         edi,dword ptr [edi-4]
004029F5   mov         esi,dword ptr [edi-0Ch]
004029F8   mov         ebp,dword ptr [edi-8]
004029FB   mov         dword ptr [esp+10h],ebp
004029FF   movsx       edx,byte ptr [esi-1]
00402A03   mov         dword ptr [edi+edx*4],0
00402A0A   jmp         _callEntryPoint+1A80h (00403df0)

First it restores the VM frame pointer, edi. Then it loads the saved VM ip into esi, and the saved VM literal-pool pointer into ebp, and then saves tha onto the hardware stack even though it's register allocated; I don't know why. Next it fetches, out of the old instruction stream, the destination register that the function call return value is supposed to write to, and puts that in edx. Then it writes a 0 there. Then off to the next instruction.

Building a new stack frame is a bit slower, sadly.

Done features:

Still TODO:

The current codesize breakdown, in lines of code: