# saQut

> **A compiler built as a toolbox, not a black box:**
> every internal phase is a first-class, inspectable output.

```
saqut tokens file:fib.sqt              → token stream, JSON
saqut ast    file:fib.sqt              → full AST, JSON
saqut ast    file:fib.sqt --optimized  → constant-folded + DCE'd AST
saqut run    file:fib.sqt              → execute via IR + bytecode VM
```

Most compilers are black boxes. saQut is a **glass box.**

---

## What is it?

saQut is a **procedural language compiler** written in C++.
The language is small and C-flavoured on purpose: it is a vehicle, not the product.
The product is **a compilation pipeline where every stage is named, queryable, and machine-readable.**

You can pipe `saqut ast` into your own tool.
You can hand the optimized AST diff to a review script.
A stranger with no access to the source could write an LSP server from `saqut symbols` output alone.
That is the test saQut is designed to pass.

---

## The language looks like this

```c
int fibonacci(int n) {
    if (n <= 1) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int fibonacciIterative(int n) {
    int first = 0;
    int second = 1;
    for (int i = 0; i < n; i = i + 1) {
        int next = first + second;
        first = second;
        second = next;
    }
    return first;
}

int main() {
    int n = 10;
    print(fibonacci(n));
    print(fibonacciIterative(n));
    return 0;
}
```

- No mandatory `class` / `main` boilerplate
- Typed functions, `struct`, `int[]` arrays
- `int`, `float`, `bool`, `string` literal types
- Value semantics: no user-visible pointers
- Single FFI seam (`callhost`): the only door to the outside world

**Deliberately absent:** OOP, closures, generics, implicit int↔float coercion, `auto`.

---

## Build

**Requirements:** a C++17 compiler, CMake ≥ 3.16, Ninja

```bash
git clone https://github.com/abdussamedulutas/saqut
cd saqut
cmake -B build -G Ninja
cmake --build build
```

The binary lands at `build/saqut`.

**Tested on:** Linux (x86-64, Manjaro). macOS and Windows are untested, but the code base has no platform-specific code.

---

## CLI

| Command | What you get |
|---|---|
| `saqut tokens file:src.sqt` | Token stream with positions |
| `saqut ast file:src.sqt` | Full AST as JSON |
| `saqut ast file:src.sqt --optimized` | AST after constant folding + dead-code elimination |
| `saqut symbols file:src.sqt` | Symbol table dump |
| `saqut ir file:src.sqt` | IR instruction dump |
| `saqut run file:src.sqt` | Compile and run via bytecode VM |

Every output is designed to be piped, diffed, or consumed by other tools.

---

## Pipeline

```
Source
  │  Lexer + Tokenizer
  ▼
Tokens ──────────────────── saqut tokens
  │  Pratt parser + recursive descent
  ▼
AST ─────────────────────── saqut ast
  │  Symbol collector (two-pass)
  ▼
Symbol Table ────────────── saqut symbols
  │  Type checker + structural validator
  ▼
Annotated AST
  │  Optimization Manager (clone; original untouched)
  │    ├─ Constant Folding pass
  │    └─ Dead Code Elimination pass
  ▼
Optimized AST ───────────── saqut ast --optimized
  │  IR Generator
  ▼
IR ──────────────────────── saqut ir
  │  Bytecode VM (interpreter loop)
  ▼
Output ──────────────────── saqut run
```
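
The parser stage in the diagram names a Pratt parser for expressions. As a rough illustration of that technique only (this is not saQut's parser, and `PrattParser`, `bindingPower`, `parsePrefix`, `parseExpr`, and `parse` are hypothetical names), the sketch below parses integer arithmetic and renders the result as a fully parenthesized string instead of a real AST. Under these assumptions, `parse("1 + 2 * 3")` yields `(1 + (2 * 3))` and `parse("1 - 2 - 3")` yields `((1 - 2) - 3)`.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical mini-parser: integers, unary minus, + - * /, parentheses.
struct PrattParser {
    std::string src;
    std::size_t pos = 0;

    explicit PrattParser(std::string s) : src(std::move(s)) {}

    // Skip whitespace and return the next character ('\0' at end of input).
    char peek() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    // Binding power of an infix operator; 0 means "not an infix operator".
    static int bindingPower(char op) {
        switch (op) {
            case '+': case '-': return 10;
            case '*': case '/': return 20;
            default: return 0;
        }
    }

    // Prefix position: integer literal, unary minus, or parenthesized expression.
    std::string parsePrefix() {
        char c = peek();
        if (std::isdigit(static_cast<unsigned char>(c))) {
            std::string num;
            while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos]))) {
                num += src[pos++];
            }
            return num;
        }
        if (c == '-') {
            ++pos;
            return "(-" + parseExpr(30) + ")";  // unary minus binds tighter than * and /
        }
        if (c == '(') {
            ++pos;
            std::string inner = parseExpr(0);
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos;
            return inner;
        }
        throw std::runtime_error("unexpected character");
    }

    // Core Pratt loop: consume infix operators that bind tighter than minBp.
    std::string parseExpr(int minBp) {
        std::string left = parsePrefix();
        for (;;) {
            char op = peek();
            int bp = bindingPower(op);
            if (bp == 0 || bp <= minBp) break;
            ++pos;
            std::string right = parseExpr(bp);  // same power on the right: left-associative
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }
};

// Parse a whole input string and reject trailing garbage.
std::string parse(const std::string& text) {
    PrattParser p(text);
    std::string tree = p.parseExpr(0);
    if (p.peek() != '\0') throw std::runtime_error("trailing input");
    return tree;
}
```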

The optimizer works on a **clone** of the AST, so the original is preserved.
Constant folding and DCE run in a fixpoint loop until nothing changes.
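
A minimal sketch of that clone-then-fixpoint idea, assuming a toy AST that is far smaller than saQut's real one. The `Node` layout, the expression-valued `If` node, and the function names (`clone`, `foldConstants`, `eliminateDeadCode`, `optimize`) are all hypothetical. It shows why a single round is not always enough: in `(if (1 < 2) 3 else 4) + 5`, folding must first turn the condition into a literal, DCE must then drop the untaken branch, and only a second folding round can reduce the sum to `8`.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical mini-AST for illustration only.
enum class Kind { Int, Bool, Add, Less, If };

struct Node {
    Kind kind;
    long value = 0;                           // payload for Int / Bool
    std::vector<std::unique_ptr<Node>> kids;  // Add/Less: 2 kids; If: cond, then, else
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(Kind k, long v) {
    auto n = std::make_unique<Node>();
    n->kind = k;
    n->value = v;
    return n;
}

// Builds an inner node; in this sketch an If always carries an else branch.
NodePtr node(Kind k, NodePtr a, NodePtr b, NodePtr c = nullptr) {
    auto n = std::make_unique<Node>();
    n->kind = k;
    n->kids.push_back(std::move(a));
    n->kids.push_back(std::move(b));
    if (c) n->kids.push_back(std::move(c));
    return n;
}

// Deep copy, so the passes never touch the original tree.
NodePtr clone(const Node& n) {
    NodePtr copy = leaf(n.kind, n.value);
    for (const auto& kid : n.kids) copy->kids.push_back(clone(*kid));
    return copy;
}

// Constant folding: Add/Less over two Int literals becomes a literal.
bool foldConstants(Node& n) {
    bool changed = false;
    for (auto& kid : n.kids) changed |= foldConstants(*kid);
    bool binary = n.kind == Kind::Add || n.kind == Kind::Less;
    if (binary && n.kids[0]->kind == Kind::Int && n.kids[1]->kind == Kind::Int) {
        long a = n.kids[0]->value;
        long b = n.kids[1]->value;
        n.value = (n.kind == Kind::Add) ? a + b : (a < b ? 1 : 0);
        n.kind = (n.kind == Kind::Add) ? Kind::Int : Kind::Bool;
        n.kids.clear();
        changed = true;
    }
    return changed;
}

// Dead code elimination: an If with a literal condition keeps only the taken branch.
bool eliminateDeadCode(Node& n) {
    bool changed = false;
    for (auto& kid : n.kids) changed |= eliminateDeadCode(*kid);
    if (n.kind == Kind::If && n.kids[0]->kind == Kind::Bool) {
        NodePtr taken = std::move(n.kids[n.kids[0]->value ? 1 : 2]);
        n = std::move(*taken);  // replace the If node with its taken branch
        changed = true;
    }
    return changed;
}

// Clone first, then run both passes until neither reports a change.
NodePtr optimize(const Node& original) {
    NodePtr work = clone(original);
    bool changed = true;
    while (changed) {
        changed = foldConstants(*work);
        changed |= eliminateDeadCode(*work);
    }
    return work;
}
```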

---

## What works right now

| Stage | Status |
|---|---|
| Lexer / Tokenizer | ✅ |
| Pratt parser | ✅ |
| AST + JSON serialization | ✅ |
| Symbol table (two-pass collector) | ✅ |
| Type checker | ✅ |
| Structural validator | ✅ |
| Constant folding (int, bool, logical, unary) | ✅ |
| Dead code elimination | ✅ |
| IR generator + bytecode VM | ✅ |
| `saqut run` executes fibonacci | ✅ |
| `string` type | ✅ |
| `struct` | 🚧 |
| `int[]` arrays | 🚧 |
| Standard library / FFI beyond `print` | 🚧 |

---

## Philosophy in two sentences

**Glass:** every compilation stage (tokens, AST, symbols, IR) is a stable, queryable output, separately inspectable and pipeable.
**Cage:** no user pointers, value semantics, and a single FFI door keep the VM deterministic, which makes record-replay and time-travel debugging a natural extension, not an afterthought.
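
To make the record-replay point concrete, here is a small sketch under stated assumptions: a toy stack VM whose only source of nondeterminism is a host call standing in for `callhost`. The instruction set, `Instr`, `Host`, `run`, and `replay` are hypothetical and do not reflect saQut's actual IR or VM. Recording appends every value that enters through the host door to a log; replaying feeds the log back in place of the host, so the same program reaches the same final stack without touching the outside world.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical instruction set for illustration only.
enum class Op { Push, Add, CallHost };

struct Instr {
    Op op;
    long arg = 0;  // Push: literal value; CallHost: host function id
};

// The single door to the outside world: takes a function id, returns a value.
using Host = std::function<long(long id)>;

// Runs the program and appends every value returned by the host to `log`.
std::vector<long> run(const std::vector<Instr>& program, const Host& host,
                      std::vector<long>& log) {
    std::vector<long> stack;
    for (const Instr& in : program) {
        switch (in.op) {
            case Op::Push:
                stack.push_back(in.arg);
                break;
            case Op::Add: {
                if (stack.size() < 2) throw std::runtime_error("stack underflow");
                long b = stack.back();
                stack.pop_back();
                stack.back() += b;
                break;
            }
            case Op::CallHost: {
                long result = host(in.arg);  // the only nondeterministic step
                log.push_back(result);
                stack.push_back(result);
                break;
            }
        }
    }
    return stack;
}

// Replay: the same VM loop, with the host replaced by the recorded log.
std::vector<long> replay(const std::vector<Instr>& program,
                         const std::vector<long>& recorded) {
    std::size_t next = 0;
    Host fromLog = [&](long) {
        if (next >= recorded.size()) throw std::runtime_error("log exhausted");
        return recorded[next++];
    };
    std::vector<long> scratch;  // replay re-records into a throwaway log
    return run(program, fromLog, scratch);
}
```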

The long version is in [`docs/architecture.md`](docs/architecture.md).

---

## Design records

Architectural decisions live in `docs/`:

| File | Coverage |
|---|---|
| [`docs/fikirler.md`](docs/fikirler.md) | ADR-001–005: backend strategy, parser, header-only, token, IR |
| [`docs/adr-frontend-analiz.md`](docs/adr-frontend-analiz.md) | ADR-006–019: analysis, optimization, execution model, FFI, memory |
| [`docs/roadmap-frontend.md`](docs/roadmap-frontend.md) | Phase-by-phase implementation plan |
| [`docs/architecture.md`](docs/architecture.md) | Full architecture reference (Turkish) |

---

## License

Source-available, commercial use restricted.
Free for: personal use, learning, writing and running saQut programs, internal tooling.
Requires permission for: hosting as a service, embedding sub-components commercially, redistributing as a product.

See [`LICENSE.md`](LICENSE.md) for the full terms.
Commercial licensing: saqutsoftware+gitea@gmail.com