The Skorpio Programming Language

The development of this language is in progress!


Stack-Oriented
Milestones
Examples
Quick Start
- Help
- Simulation
- Compilation
- Testing
- Usage
Language Reference
- Data Types
- Built-in Words
- Functions
- Include
FAQ

Stack-Oriented

A stack-oriented language is one which primarily uses a stack, instead of (or in addition to) named variables, to manage data flow. This concept is closely related to that of concatenative languages, most of which are stack-based.

Milestones

● Compiled (Compiled language)

● Native (Native)

● Turing-complete (Turing completeness)

● Statically typed (Static type checking)

● Self-hosted (Written in itself, no more Python. Self-hosting)

Examples

"Hello, World!":

use "std.sko"
    
"Hello, World!\n" stdout fmt

Two simple programs:

the first one prints numbers from 10 to 0 in descending order (multi-line example);
the second one prints numbers from 0 to 10 in ascending order (one line example);

descending order

use "std.sko"
10 while cp -1 > do
    cp =>
    1 -
end

ascending order

use "std.sko"
0 while cp 11 < do cp => 1 + end
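
For readers new to stack notation, the two loops above can be modelled in Python with a plain list acting as the data stack. This is only a sketch of the semantics described in this README (`cp` duplicates the top element, `=>` prints and removes it); the helper name `run_loop` is hypothetical and this is not how skorpio.py is implemented.

```python
import operator


def run_loop(start, limit, compare, step):
    """Model `<start> while cp <limit> <cmp> do cp => 1 <+|-> end`."""
    stack = [start]
    printed = []
    while True:
        stack.append(stack[-1])           # cp: duplicate the counter
        stack.append(limit)               # push the limit
        b = stack.pop()
        a = stack.pop()
        if not compare(a, b):             # result of `>` / `<`, consumed by `do`
            break
        stack.append(stack[-1])           # cp
        printed.append(stack.pop())       # =>: print the copy and remove it
        stack.append(stack.pop() + step)  # `1 -` or `1 +`
    return printed


descending = run_loop(10, -1, operator.gt, -1)
ascending = run_loop(0, 11, operator.lt, +1)
print(descending)
print(ascending)
```

In both cases the counter itself stays on the stack between iterations; only the copies made by `cp` are consumed.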

Quick Start

> Help

$ ./skorpio.py -h
Usage: ./skorpio.py [OPTIONS] <SUBCOMMAND> [ARGS]
OPTIONS
     -dbg                     Enable debug mode 
     -I           <path>      Add the path to the include search list
     -E   <expansion-limit>   Function and use expansion limit. (Default 1000)
    
SUBCOMMANDS
     -s           <file>      Simulate the program
     -c [OPTIONS] <file>      Compile the program
     -h                       Print help to STDOUT and exit 0
    
OPTIONS
     -r                       Run the program after successful compilation
     -o         <file|dir>    Customize the output path
     --silent                 Silent mode. Hide infos about compilation phases

> Simulation

Simulation interprets the program directly instead of compiling it.

$ cat ./tests/arithmetics.sko
-- arithmetics.sko
    
-- add
1 2 + =>
    
-- substract
3 2 - =>
$ ./skorpio.py -s ./tests/arithmetics.sko
3
1

> Compilation

Compilation generates assembly code, assembles it with nasm, and then links it with GNU ld. Both tools must be available in your $PATH.

$ cat ./tests/arithmetics.sko
-- arithmetics.sko
-- add
1 2 + =>
-- substract
3 2 - =>
$ ./skorpio.py -c ./tests/arithmetics.sko
[INFO] Generating arithmetics.asm
[CMD] nasm -felf64 tests/arithmetics.asm
[CMD] ld -o tests/arithmetics tests/arithmetics.o
$ ./tests/arithmetics
3
1

The -r option runs the program after successful compilation:

$ ./skorpio.py -c -r ./tests/arithmetics.sko
[INFO] Generating arithmetics.asm
[CMD] nasm -felf64 tests/arithmetics.asm
[CMD] ld -o tests/arithmetics tests/arithmetics.o
[CMD] tests/arithmetics
3
1

The -o option lets you customize the output path:

$ mkdir output && ./skorpio.py -c -o output/ ./tests/arithmetics.sko
[INFO] Generating arithmetics.asm
[CMD] nasm -felf64 output/arithmetics.asm
[CMD] ld -o output/arithmetics output/arithmetics.o
$ ls output/
arithmetics*  arithmetics.asm  arithmetics.o

The output path can also be a file:

$ ./skorpio.py -c -o ./output ./tests/arithmetics.sko
[INFO] Generating output.asm
[CMD] nasm -felf64 ./output.asm
[CMD] ld -o ./output ./output.o
$ ls
output*  output.asm  output.o  assets/  LICENCE  skorpio.py*  README.md  test.py*  tests/

You can combine the -r and -o options:

$ mkdir output && ./skorpio.py -c -r -o output/ ./tests/arithmetics.sko
[INFO] Generating arithmetics.asm
[CMD] nasm -felf64 output/arithmetics.asm
[CMD] ld -o output/arithmetics output/arithmetics.o
[CMD] output/arithmetics
3
1

> Testing

Test cases are located in the ./tests/ folder. The *.txt files contain the inputs (command line arguments, stdin) and expected outputs (exit code, stdout, stderr) of the corresponding programs.

Run the ./test.py script to execute the programs and assert their outputs:

$ ./test.py run

To update the expected outputs of the programs, run the update subcommand:

$ ./test.py update

To update the expected command line arguments and stdin of a specific program, run the update input <path/to/program.sko> subcommand:

$ ./test.py update input ./tests/argv.sko <new> <cmd> <args>
[INFO] Provide the stdin for the test case. Press ^D when you are done...
Hello, World
^D
[INFO] Saving input to ./tests/argv.txt

The ./examples/ folder contains programs that are meant for showcasing the language rather than testing it:

$ ./test.py run ./examples/
$ ./test.py update input ./examples/name.sko
$ ./test.py update output ./examples/

> Usage

If you want to use the Skorpio compiler separately from its code base, you only need two things:

./skorpio.py - the compiler itself,
./std/ - the standard library.

By default, the compiler searches for files to include in ./ and ./std/. You can add more search paths with the -I flag before the subcommand: ./skorpio.py -I <custom-path> -c -r .... See ./skorpio.py -h for more info.
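
The lookup described above can be sketched in Python as follows. The function name `resolve_include`, the "first match wins" rule, and the order in which the directories are tried (./, then ./std/, then each -I path) are assumptions made for illustration; the README only states which directories are searched.

```python
import tempfile
from pathlib import Path


def resolve_include(name, extra_paths=(), base="."):
    """Return the first existing file called `name` on the search list, or None."""
    # Assumed order: ./ first, then ./std/, then every -I path as given.
    search_list = [Path(base), Path(base) / "std", *(Path(p) for p in extra_paths)]
    for directory in search_list:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


# Demo in a throwaway directory laid out like the repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "std").mkdir()
    (root / "std" / "std.sko").write_text("")
    (root / "vendor").mkdir()
    (root / "vendor" / "extra.sko").write_text("")

    found_std = resolve_include("std.sko", base=root) == root / "std" / "std.sko"
    found_extra = (
        resolve_include("extra.sko", [root / "vendor"], base=root)
        == root / "vendor" / "extra.sko"
    )
    missing = resolve_include("nope.sko", base=root)

print(found_std, found_extra, missing)
```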

Language Reference

This is what the language supports so far.

> Data Types

>> Integer

Currently an integer is anything that Python's int function can parse. When the compiler encounters an integer, it pushes it onto the data stack for processing by the relevant operations.

Example:

1 2 +

The code above pushes 1 and 2 onto the data stack and sums them with the + operation.
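
As a rough illustration of that rule, the following Python sketch evaluates a token stream in which every token is either `+` or something `int` accepts. The function `eval_tokens` is hypothetical and covers only this one operation; it is not taken from skorpio.py.

```python
def eval_tokens(source):
    """Evaluate whitespace-separated tokens made of integers and `+`."""
    stack = []
    for token in source.split():
        if token == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            # Anything int() can parse counts as an integer literal;
            # int() raises ValueError for everything else.
            stack.append(int(token))
    return stack


print(eval_tokens("1 2 +"))
```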

>> String

Currently a string is any sequence of bytes enclosed between two " characters. Newlines are not allowed inside strings. Escaping is handled by Python's unicode_escape codec. There is no way to escape " itself for now. There is no special support for Unicode either; a string is just a sequence of bytes.

When the compiler encounters a string:

1. the size of the string in bytes is pushed onto the data stack,
2. the bytes of the string are copied somewhere into the memory (the exact location is implementation specific),
3. the pointer to the beginning of the string is pushed onto the data stack.

Thus, a single string pushes two values onto the data stack: the size and the pointer.
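
The three steps can be modelled in Python as shown here. This is a sketch under stated assumptions: memory is a `bytearray`, strings are simply appended to its end (the real location is implementation specific), and the encode/decode chain around `unicode_escape` is one plausible way to turn the source text into bytes. The name `push_string` is hypothetical.

```python
def push_string(literal, stack, memory):
    """Push (size, pointer) for the text found between the quotes."""
    # Decode escapes such as \n with Python's unicode_escape codec.
    data = literal.encode("utf-8").decode("unicode_escape").encode("utf-8")
    address = len(memory)    # assumed placement: end of the used memory
    stack.append(len(data))  # step 1: size in bytes
    memory.extend(data)      # step 2: copy the bytes into memory
    stack.append(address)    # step 3: pointer to the first byte
    return data


stack, memory = [], bytearray()
push_string(r"Hello, World!\n", stack, memory)
print(stack)
```

The escaped `\n` counts as a single byte, so the size pushed for this literal is 14.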

Example:

use "std.sko"
"Hello, World!\n" stdout fmt

The fmt macro from the std.sko module expects two values on the data stack: the size of the buffer it needs to print to stdout and the pointer to the beginning of the buffer. Both values are provided by the string "Hello, World!\n".

>> Character

Currently a character is a single byte enclosed between two ' characters. Escaping is handled by Python's unicode_escape codec. There is no way to escape ' itself for now. There is no special support for Unicode either.

When the compiler encounters a character, it pushes its value as an integer onto the stack.

Example:

'X' =>

This program pushes the integer 88 onto the stack (since the ASCII code of the letter X is 88) and prints it with the => operation.
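
A minimal Python sketch of that conversion, assuming the text between the quotes is decoded with `unicode_escape` and must yield exactly one byte. The function name `char_value` and the error it raises are illustrative only.

```python
def char_value(literal):
    """Return the integer pushed for the text found between the ' marks."""
    decoded = literal.encode("utf-8").decode("unicode_escape")
    if len(decoded) != 1 or ord(decoded) > 255:
        raise ValueError("a character literal must be a single byte")
    return ord(decoded)


print(char_value("X"))
print(char_value(r"\n"))
```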

> Built-in Words

>> Stack Manipulation

cp - duplicate an element on top of the stack.
```
a -- a a
        
```
=> - print the element on top of the stack to stdout and remove it from the stack.
```
a b -- a
        
```
~ - swap two elements on the top of the stack.
```
a b -- b a
        
```
# - drop the top element of the stack.
```
a b -- a
        
```
over - copy the element below the top of the stack onto the top.
```
a b -- a b a
        
```
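
The stack effects above can be checked against a small Python model in which the data stack is a list whose last item is the top. The function names are hypothetical stand-ins for the words `cp`, `~`, `#` and `over`; `=>` is left out because it only adds printing to a drop.

```python
def cp(stack):
    stack.append(stack[-1])                      # a -- a a


def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]  # a b -- b a


def drop(stack):
    stack.pop()                                  # a b -- a


def over(stack):
    stack.append(stack[-2])                      # a b -- a b a


stack = [1, 2]
over(stack)
after_over = list(stack)
swap(stack)
after_swap = list(stack)
drop(stack)
after_drop = list(stack)
cp(stack)
after_cp = list(stack)
print(after_over, after_swap, after_drop, after_cp)
```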

>> Comparison

= - checks if two elements on top of the stack are equal. Removes the elements from the stack and pushes 1 if they are equal and 0 if they are not.
```
[a: int] [b: int] -- [a == b : bool]
        
```
!= - checks if two elements on top of the stack are not equal.
```
[a: int] [b: int] -- [a != b : bool]
        
```

> - applies the greater-than comparison to the top two elements.

[a: int] [b: int] -- [a > b  : bool]

< - applies the less-than comparison to the top two elements.

[a: int] [b: int] -- [a < b  : bool]

>= - applies the greater-than-or-equal comparison to the top two elements.
```
[a: int] [b: int] -- [a >= b : bool]
        
```
<= - applies the less-than-or-equal comparison to the top two elements.
```
[a: int] [b: int] -- [a <= b : bool]
        
```
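
The operand order matters for the ordered comparisons: `a` is the element below the top and `b` is the top. The Python sketch below models this. It assumes that every comparison pushes 1 or 0 the way `=` is documented to; the table `COMPARISONS` and the function `compare` are hypothetical names.

```python
import operator

COMPARISONS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def compare(stack, word):
    """Pop b, then a, and push 1 if `a <word> b` holds, else 0."""
    b = stack.pop()
    a = stack.pop()
    stack.append(int(COMPARISONS[word](a, b)))
    return stack


print(compare([3, 2], ">"))  # models `3 2 >`
print(compare([3, 2], "<"))  # models `3 2 <`
```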

>> Arithmetic

+ - sums up two elements on the top of the stack.

[a: int] [b: int] -- [a + b: int]

- - subtracts the top of the stack from the element below.
```
[a: int] [b: int] -- [a - b: int]
        
```

* - multiplies two elements on top of the stack.

[a: int] [b: int] -- [a * b: int]

divmod - perform Euclidean division between two elements on top of the stack.
```
[a: int] [b: int] -- [a / b: int] [a % b: int]
        
```
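
For the two non-commutative words, the following Python sketch shows which operand is which and in what order `divmod` leaves its results. It is only a model with hypothetical function names, and it is limited to non-negative operands: Python's built-in `divmod` floors the quotient, which coincides with Euclidean division whenever the divisor is positive, and the behaviour of Skorpio on negative operands is not specified in this README.

```python
def sub(stack):
    """[a] [b] -- [a - b]"""
    b = stack.pop()
    a = stack.pop()
    stack.append(a - b)
    return stack


def divmod_word(stack):
    """[a] [b] -- [a / b] [a % b], with the remainder ending up on top."""
    b = stack.pop()
    a = stack.pop()
    quotient, remainder = divmod(a, b)
    stack.append(quotient)
    stack.append(remainder)
    return stack


print(sub([3, 2]))          # models `3 2 -`
print(divmod_word([7, 2]))  # models `7 2 divmod`
```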

>> Bitwise

>> - right bit shift.

[a: int] [b: int] -- [a >> b: int]

<< - left bit shift.

[a: int] [b: int] -- [a << b: int]

or - bit or.

[a: int] [b: int] -- [a | b: int]

& - bit and.

[a: int] [b: int] -- [a & b: int]

> Control Flow

>> if-else condition

<condition> if
    <body>
else <condition> if
    <body>
else
    <body>
end

>> while loop

while <condition> do
    <body>
end

>> Memory

mem - pushes onto the stack the address of the beginning of the memory region that the program can read from and write to.

*s - store a given byte at the given address.

[byte: int] [place: ptr] --

&l - load a byte from the given address.

[place: ptr] -- [byte: int]

*64 - store an 8-byte word at the address on the stack.
```
[byte: int] [place: ptr] --
            
```
&64 - load an 8-byte word from the address on the stack.
```
[place: ptr] -- [byte: int]
            
```
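
The four memory words can be modelled in Python on top of a `bytearray`. This is a sketch, not the compiler's code: the buffer size, the function names, and the little-endian layout of 8-byte words (assumed because the compiler targets x86-64 via `nasm -felf64`) are all assumptions.

```python
memory = bytearray(64)  # stand-in for the region that `mem` points to


def store8(memory, value, address):
    """Model of *s: keep the low byte of `value` at `address`."""
    memory[address] = value & 0xFF


def load8(memory, address):
    """Model of &l: read one byte."""
    return memory[address]


def store64(memory, value, address):
    """Model of *64: write 8 bytes, least significant byte first."""
    masked = value & 0xFFFFFFFFFFFFFFFF
    memory[address:address + 8] = masked.to_bytes(8, "little")


def load64(memory, address):
    """Model of &64: read 8 bytes back into one integer."""
    return int.from_bytes(memory[address:address + 8], "little")


store8(memory, ord("X"), 0)
store64(memory, 0x0102030405060708, 8)
print(load8(memory, 0), hex(load64(memory, 8)), load8(memory, 8))
```

With this layout, loading a single byte from the address of a stored word returns its least significant byte.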

>> System

sys<n> - perform a syscall with n arguments where n is in range [0..6]. (sys0, sys1, ..., sys6)

> Functions

Define a new <keyword> that expands into a sequence of <tokens> during the compilation.

An example that defines the keyword fmt with the tokens 1 1 sys3:

fn fmt
    1 1 sys3
end
"Hello, World!\n" fmt
-- prints "Hello, World!"
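
Expansion of this kind can be sketched in Python as a token rewrite with a cap on the number of expansions, mirroring the -E option (default 1000). How skorpio.py actually counts expansions is not documented here, so the counting rule, the function name `expand` and the exception type are assumptions.

```python
def expand(tokens, functions, limit=1000):
    """Replace every function name by its body until none are left."""
    output = []
    pending = list(reversed(tokens))  # the next token is at the end
    expansions = 0
    while pending:
        token = pending.pop()
        if token in functions:
            expansions += 1
            if expansions > limit:
                raise RecursionError("expansion limit exceeded")
            # Queue the body so that it is processed in source order.
            pending.extend(reversed(functions[token]))
        else:
            output.append(token)
    return output


functions = {"fmt": ["1", "1", "sys3"]}
expanded = expand(['"Hello, World!\\n"', "fmt"], functions)
print(expanded)

# A function that expands to itself never terminates without the limit.
try:
    expand(["loop"], {"loop": ["loop"]}, limit=10)
    hit_limit = False
except RecursionError:
    hit_limit = True
print(hit_limit)
```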

> Include

Include the tokens of the file file.sko:

use "file.sko"

FAQ

Why would you use a stack-oriented language, and are there any practical advantages to such a paradigm?

(source for full answers)