Stargaze user manual

1. Lexical syntax
2. Built-in primitives
3. Extended primitives

(very early draft)

1. Lexical syntax

1.1. Comment

Comment in Stargaze starts with ;.

1.2. Booleans

Only the strings #t and #f are boolean values; they evaluate to true and false respectively.

1.3. Strings

There's only one kind of syntax for string literals: "{string-piece}*", where string-piece is one of:

Normal characters.
Escape sequences, which starts with \ and is of the following format:
- \r, \n, \b, \t, \\, \", \v, \f, for carriage return, linefeed, backspace, horizontal tab, slash, double quote, vertical tab and form feed respectively.
- \xHEX, where HEX is the codepoint of character in hexadecimal, e.g. \x61 is the same as a.

1.4. Characters

Two kinds of syntax for character literals:

#\NAME where NAME is the name for the character. NAME is often the character itself (e.g. #\a refers to the character a), but exceptions exist; see below.
#ch{HEX} where HEX is the codepoint in hexadecimal, e.g. #ch{61} is the same as #\a.

As for the first kind, Scheme R4RS only explicitly declares a #\space and a #\newline. Stargaze (partly) follows the convention in MIT Scheme, except that #\newline always corresponds to the ASCII LF regardless of the operating system. For this reason the following "character name"s are also defined in Stargaze:

All of the standard ASCII names for non-printing characters are supported. They can be used by prefixing with #\, e.g. #\NUL:

NUL     SOH     STX     ETX     EOT     ENQ     ACK     BEL
BS      HT      LF      VT      FF      CR      SO      SI
DLE     DC1     DC2     DC3     DC4     NAK     SYN     ETB
CAN     EM      SUB     ESC     FS      GS      RS      US
DEL

A few other names are also defined:
- #\esc, for #\ESC. MIT Scheme used the string #\altmode, which we
- #\backspace for #\BS
- #\linefeed for #\LF (also, as stated above, #\newline)
- #\tab for #\HT
- #\return for #\CR

1.5. Vector

Vector literals can be constructed by surrounding values with #{ and }.

1.6. Quotes

Four kinds of quote exists in Stargaze:

quote / ' , which turns whatever it prefixed into "the thing itself"; e.g. (if #t 3 4) evaluates to 3 but '(if #t 3 4) evaluates to (if #t 3 4) (that is, a list containing if, #t, 3 and 4.)
qquote / ` , which works like normal quote, but any of the sub-expression within can be "unquoted" and thus get evaluated like normal.
unquote / , , which evaluates whatever it prefixed within a quasiquote-ed context. e.g. `(if (leq 3 4) 3 4) evaluates to (if (leq 3 4) 3 4) but `(if ,(leq 3 4) 3 4) evaluates to (if #t 3 4).

These quotes are equivalent to their s-expr counterparts at all time, e.g. ''a would be equivalent to (quote (quote a)) which would evaluates to (quote a) (which also means that (list? ''a) would evaluate to #t.)

2. Built-in primitives

(def NAME BODY): Bind the value BODY to the name NAME in the current environment.
- (def (FUNCNAME ARGS ...) BODY): Short hand for (def FUNCNAME (fn (ARGS ...) BODY)).
(fn ARGLIST BODY): Anonymous function.
(if COND THEN_CLAUSE ELSE_CLAUSE): Simple branching.
(cond (COND1 CLAUSE1) ...): Multiway branching.
(let ((NAME1 EXP1) ...) BODY ...): let-binding.
(letrec ((NAME1 EXP1) ...) BODY ...): letrec-binding.
(quote ...): Quote its argument as symbolic values
(set! NAME VALUE): Assign VALUE to the name NAME.
(begin EXP1 ...): Executes EXP1 and the rest in the order they appear in the argument list. Returns the evaluated value of the last argument..
(equal EXP1 EXP2): Check if EXP1 and EXP2 has the same value.

2.1. Module-related

(include STR): Include a file.
(import MOD RENAME?):
- MOD can be one of two options:
  - A string, which refers to the module.
  - (STR PREFIX), which prefix all exported name in STR with PREFIX.
    - e.g. assumes that mymodule exports func1, func2 and func3; (import ("mymodule" xyz)) would import func1 as xyzfunc1, func2 as xyzfunc2, func3 as xyzfunc3.
- RENAME is one of two followings:
  - Nothing, in which case the form would be (import MOD); this would import all exported name in MOD.
  - A list of (NAME NEWNAME), which would only import the exported names described by NAME, which would bind to NEWNAME when importing. This renaming takes precedence over prefixes, so (import ("abc" abc) (myfunc1 defmyfunc1)) would import myfunc1 from module "abc" into defmyfunc1 instead of abcmyfunc1 or abcdefmyfunc1.
- Note that (import MOD) means import all names from MOD, while (import MOD ()) means import all the exported names in (), which means none of the exported names.

in the imported module; NEW_NAME would be the effective name in the importing module for NAME in the imported module.

(export NAME ...): Export

2.2. Miscellaneous

(atom? EXP): Check if EXP is an atom, i.e. a symbol, an integer, a boolean, a string, a character, or the empty list.
(closure? EXP): Check if EXP is a closure.
(primitive? EXP): Check if EXP is a primitive.
(procedure? EXP): Check if EXP is a closure or primitive.
- Note that in Stargaze lingo a "primitive" refers to primitives that can be applied like a function (unlike things like fn and quote, which are called "special forms" and are called "syntax forms" in other LISP-like languages) and a "closure" refers to any anonymous function created during the course of execution.

2.3. Integer

(int? EXP): Check if EXP is an integer.
(add EXP ...): Return the sum of the arguments.
(sub EXP1 ...): Return EXP1 - EXP2 - ....
(mul EXP ...): Return the product of the arguments.
(div EXP1 ...): Return EXP1 div EXP2 div ...
(mod EXP1 EXP2): Return EXP1 % EXP2.
(leq EXP1 EXP2), (lt EXP1 EXP2), (geq EXP1 EXP2), (gt EXP1 EXP2): Check if EXP1 is smaller or equal than, strictly smaller than, greater or equal than, and strictly greater than EXP2 respectively.
(intstr EXP): convert EXP into a string.

2.4. Floating-point

(float? EXP): Check if EXP is an floating point number.
(addf EXP ...): Return the sum of the arguments.
(subf EXP1 ...): Return EXP1 - EXP2 - ....
(mulf EXP ...): Return the product of the arguments.
(divf EXP1 ...): Return EXP1 div EXP2 div ...
(float INT): Convert INT to a floating point number.
- Returns the argument itself if it is already a floating-point number.
(floor FLOAT), (ceil FLOAT), (round FLOAT), (trunc FLOAT):
- floor returns the first integer that's smaller than or equal to its arugment.
- ceil returns the first integer that's bigger than or equal to its argument.
- round rounds its argument to the nearest integer.
- trunc directly removes the decimal part.
- Returns the argument itself if it is already an integer.
(leqf EXP1 EXP2), (ltf EXP1 EXP2), (geqf EXP1 EXP2), (gtf EXP1 EXP2): Check if EXP1 is smaller or equal than, strictly smaller than, greater or equal than, and strictly greater than EXP2 respectively.
(floatstr EXP): convert EXP into a string.
(eqnum EXP1 EXP2): Return if EXP1 and EXP2 has the same numerical value.

2.5. Pair

(cons EXP1 EXP2): Return the pair of EXP1 and EXP2
(car EXP1): Return the first component of the pair EXP1.
(cdr EXP1): Return the second component of the pair EXP2.
(set-car! EXP NEWCAR): Set the first component of EXP to NEWCAR.
(set-cdr! EXP NEWCDR): Set the second component of EXP to NEWCAR.
(w/car EXP NEWCAR): Equivalent to (cons NEWCAR (cdr EXP)).
(w/cdr EXP NEWCDR): Equivalent to (cons (car EXP) NEWCDR).

2.6. Character

(chr INT): Convert INT into the corresponding character.
(ord CHAR): Convert CHAR into the corresponding integer.
(char? EXP): Check if EXP is a character.

2.7. String

(strref STR I): Retrieve the I-th (starting from 0) character of the string STR.
(substr STR START END?): Retrieve the substring of the string STR starting from index START to index END. END is optional; if it's not provided, this function takes the substring starting from index START to the end of the string.
(strsym STR): Convert a string to a symbol.
(str? EXP): Check if EXP is a string.

2.8. Boolean

(and EXP1 ...): Returns #f if one of the EXP evalueates to #f, or else returns the last argument.
(or EXP1 ...): Returns the first value that is not #f (if any); or else, returns #f.
(not EXP): If EXP evaluates to #f then return #t; returns #f otherwise.

2.9. Symbol

(symstr SYM): Convert a symbol to a string.
(sym? EXP): Check if EXP is a symbol.

2.10. List

(list EXP1 ...): Combine its arguments into a list.
(nil? EXP): Check if EXP is an empty list.

2.11. Vector

(vec? EXP): Check if EXP is a vector.
(vector EXP1 ...): Combine its arguments into a vector.
(listvec LIST): Convert LIST into a vector.
(veclist VEC): Conver VEC into a list
(vecref VEC INT): Return the INT-th element from VEC.
(mkvec INT): Create a vector of size INT.
(vecset! VEC INT VALUE): Set the INT-th element of VEC to value VALUE.
(vec++ EXP1 ...):
(veclen VECTOR):

2.12. File input/output

stdin, stdout, stderr: Standard input, standard output and standard error.
(eof? EXP): Check if EXP is the EOF object.
(openinput STR): Open file STR as input.
(openoutput STR MODE?): Open file STR as output.
- When MODE is not used (i.e. the form is (openoutput STR)), the file is opened for writing.
- When MODE is "a", the file is opened for appending.
(close FILE): Close a file.
(readch FILE): Read a character from an input file.
(writech FILE): Write a character to an output file.

2.13. Iteration

Iteration is important (at least for now) since we don't have tail-call optimization.

(while COND BODY): Repeatedly execute BODY until COND evaluates to false. Returns nil.

2.14. Structure operations

(defstruct NAME DESCRIPTOR...): Define a "struct".
- (defstruct NAME FIELD1 ...): Define a simple record type.
- (defstruct NAME (TAGSYMBOL1 FIELD11 ...) ...): Define a struct that has multiple branch, similar to the algebraic datatypes / disjoint union in languages like Haskell and Standard ML.

When executed, a few new functions would be defined in the call site's environment:

(NAME? EXP): Check of EXP evaluates to a value that is a struct of type NAME.
(NAME EXP1 ...): Constructor for the struct type NAME. Defined only when NAME is defined as a simple record type.
(NAME/FIELD1 EXP), …: Field getters for the struct type NAME. Defined only when NAME is defined as a simple record type.
(TAGSYMBOL1 EXP1 ...), …: Constructors for the struct type NAME. Defined only when NAME is defined as a disjoint union.
(NAME/TAGSYMBOL1/FIELD1 EXP), …: Field getters for the struct type NAME. Defined only when NAME is defined as a disjoint union.

For example, this:

(defstruct MyStruct fieldName1 fieldName2)

would have these functions defined:

(MyStruct? EXP)
(MyStruct EXP1 EXP2)
(MyStruct/fieldName1 EXP)
(MyStruct/fieldName2 EXP)

and it would be like this in Haskell:

-- the name "MyStructConstructor1" itself is not important here because
-- for Stargaze when the definition has only 1 branch this part is considered
-- to be not exist.
data MyStruct = MyStructConstructor1 {
    fieldName1 :: Field1Type,
    fieldName2 :: Field2Type
}

or this in Nim:

type
  MyStruct = ref object
    fieldName1: Field1Type
    fieldName2: Field2Type

while this:

(defstruct Term
  (TInt iVal)
  (TVar vName)
  (TApp appHead appArgs)
  (TLambda lArgs lBody))

would have the following functions defined:

(Term? EXP)
(TInt EXP)
(TVar EXP)
(TApp EXP EXP)
(TLambda EXP EXP)
(Term/TInt/iVal EXP)
(Term/TVar/vName EXP)
(Term/TApp/appHead EXP)
(Term/TApp/appArgs EXP)
(Term/TLambda/lArgs EXP)
(Term/TLambda/lBody EXP)

and it would be like this in Haskell:

data Term
  = TInt Int
  | TVar String
  | TApp Term [Term]
  | TLambda [String] Term

and this in Nim:

type
  TermType* = enum
    TInt
    TVar
    TApp
    TLambea
  Term* = ref object
    case tType*: TermType
    of TInt:
      iVal*: int
    of TVar:
      vName*: string
    of TApp:
      appHead*: Term
      appArgs*: seq[Term]
    of TLambda:
      lArgs*: seq[string]
      lBody*: Term

Note that these two following examples are different: one is a simple record type, one is a disjoint union that has only 1 branch:

;; create a struct called "MyStruct" with 2 fields
;; this is similar to this one in Haskell:
;;     type MyStruct = (Field1Type, Field2Type)
;;     myStruct_field1 :: MyStruct -> Field1Type
;;     myStruct_field2 :: MyStruct -> Field2Type
;;     myStruct_field1 (r, _) = r
;;     myStruct_field2 (_, r) = r
(defstruct MyStruct field1 field2)

;; create a struct called "MyStruct" with 2 fields
;; this is like the following in Haskell:
;;     data MyStruct = MSConstr1 { field1 :: Field1Type, field2 ::  Field2Type }
(defstruct MyStruct (MSConstr1 field1 field2))

;; create a struct called "MyStruct" with 1 fields since the first symbol
;; is considered to be a tag. This is like the following in Haskell:
;;     data MyStruct = Field1 Field2Type
(defstruct MyStruct (Field1 Field2)

(struct? EXP): Check if EXP evaluates to a struct value.
(struct-major-label EXP): If EXP evaluates to a struct, return its "major label", which would be the NAME part of the struct.
(struct-secondary-label EXP): If EXP evaluates to a struct, return its "secondary label", which woould be the TAG-SYMBOL1 part of the struct.

For example, when we have the following definition:

(defstruct mystruct1 field1 field2)
(defstruct mystruct2 (MSConstr1 field1 field2))
(defstruct mystruct3 (Field1 Field2))

The following expressions all evaluates to a struct value:

(mystruct1 3 4)
(mystruct2 'MSConstr1 3 4)
(mystruct3 'Field1 4)
(mystruct3 (strsym "Field1") 4)
(mystruct3 (strsym (str++ "Fie" "ld" "1")) 4)

while these raises error:

(mystruct1 3)  ;; missing one field
(mystruct1 4 5 6)   ;; one field too many
(mystruct2 'MSConstr2 3 4)  ;; tag not allowed; must be MSConstr1, but MSConstr2 found.
(mystruct3 3 4)  ;; tag not allowed; must be 'Field1, but 3 found.

The "major label" of the examples above would be:

(struct-major-label (mystruct1 3 4))
   ;; 'mystruct1
(struct-major-label (mystruct2 'MSConstr1 3 4))
   ;; 'mystruct2
(struct-major-label (mystruct3 'Field1 4))
   ;; 'mystruct3
(struct-major-label (mystruct3 (strsym "Field1") 4))
   ;; 'mystruct3
(struct-major-label (mystruct3 (strsym (str++ "Fie" "ld" "1")) 4))
   ;; 'mystruct3

and the "secondary label" of the examples above would be:

(struct-secondary-label (mystruct1 3 4))
   ;; #f   (not applicable for simple struct)
(struct-major-label (mystruct2 'MSConstr1 3 4))
   ;; 'MSConstr1
(struct-major-label (mystruct3 'Field1 4))
   ;; 'Field1
(struct-major-label (mystruct3 (strsym "Field1") 4))
   ;; 'Field1
(struct-major-label (mystruct3 (strsym (str++ "Fie" "ld" "1")) 4))
   ;; 'Field1

(struct-arity EXP): If EXP evaluates to a struct, return its arity (i.e. the number of fields, major label & secondary label not included); or else, return #f.

Tags are not counted. Using the example above, the following to expressions both evaluate to the number 2:

(struct-arity (mystruct1 3 4))   ;; 2
(struct-arity (mystruct2 'MSConstr1 3 4))   ;; 2
    ;; not 3, because 'MSConstr1 is not counted

(structvec EXP): Convert EXP into a vector, which would be its field list (in order) with its major label & secondary label at the front. For example:

(structvec (mystruct1 3 4))
    ;; #{'mystruct1 #f 3 4}
(structvec (mystruct2 'MSConstr1 3 4))
    ;; #{'mystruct2 'MSConstr1 3 4}
(structvec (mystruct3 'Field1 4))
    ;; #{'mystruct3 'Field1 4}
(structvec (mystruct3 (strsym "Field1") 4))
    ;; #{'mystruct3 'Field1 4}
(structvec (mystruct3 (strsym (str++ "Fie" "ld" "1")) 4))
    ;; #{'mystruct3 'Field1 4}

(vecstruct EXP): Convert EXP (which should be a vector) into a struct. The vector should be in the form of #{majorLabel secondaryLabel field ...}. Basically structvec's inverse action; see examples above.

3. Extended primitives

A few functions that should be able to be defined using the language itself is defined as primitives for performance, even if there aren't much performance to begin with…

3.1. List libraries

(length LIST): Return the length of LIST.
(append EXP ...) / (list++ EXP ...): Combines EXP and the rest into one list.
(reverse LIST): Return a reversed version of LIST.
(map F LIST1 ...): Returns a new list whose members are the results of applying F on LIST1 and the rest, e.g. (map add (list 3 4) (list 5 6) (list 7 8)) is equivalent to (list (add 3 5 7) (add 4 6 8)).
(filter F LIST): Returns a new list consisting of all the members of LIST that satisfies F. F should be a function that takes 1 argument and returns a boolean.
(member EXP LIST): Check if EXP is in the list LIST. If it is, returns the part of LIST starting from EXP; if not, return the false value.
(assoc EXP LIST): Check if EXP is in the assoc list LIST. An assoc list in LISP-like languages is a kind of list that consists of key-value pairs. If there is a key-value pair that uses EXP as the key, this primitive will return that key-value pair; or else, it will return the false value.

3.2. Bitwise operations

(bit~ INT): Bitwise not.
(bit& INT ...): Bitwise and.
(bit^ INT ...): Bitwise xor.
(bit| INT ...): Bitwise or.
(bit<< INT INT): Left shift.
(bit>> INT INT): Right shift (left-side padded with 0)

3.3. String operations

(mkstr N CHAR): Makes a string of length N consisting of only character CHAR.
(strlen STR): Returns the length of the string STR.
(strlist STR): Convert a string to a list of characters.
(strvec STR): Convert a string to a vector of characters.
(str++ STR1 ...): Concatenates STR1 and the rest into one single string.