--- /dev/null
+# Memory growth in sula: findings
+
+Investigation into why the sula process's resident memory (RSS) kept
+increasing on every served request. Summary: there were **two distinct
+causes** — a large, fixable heap leak, and a small, residual arena leak that
+is an architectural limitation of Scryer Prolog.
+
+## Symptom
+
+Each served file increased RSS and it never came back down. Heavy use drove
+the process past 15 GB. Re-requesting the same page repeatedly kept growing
+RSS monotonically.
+
+## Cause 1 (major): Scryer never GCs the heap; it reclaims it only on backtracking
+
+Scryer Prolog has **no automatic global-stack (heap) garbage collector**.
+There is no `gc` Prolog flag, and the marking code in
+`scryer-prolog/src/machine/gc.rs` is term-traversal machinery that is *not*
+wired into the run loop. The only mechanism that reclaims heap is
+**backtracking**: returning to a choicepoint restores the heap top `H` to the
+value saved in that choicepoint.
+
+The original accept loop `with_connection_loop/3` was deterministic and
+forward-recursive. Every request, `serve_text/3` does
+`phrase_from_file(seq(Body), File)` (`response.pl`), which reads the **entire
+file onto the heap**, plus request-parsing terms. Because the loop never
+backtracked, none of that heap was ever reclaimed — it grew by ~filesize per
+request.
+
+Note: **TCO / last-call optimization does not help.** TCO reclaims the
+*environment (local) stack*, never the heap. Adding a `!` to make the loop
+tail-recursive did nothing for this.
+
+### Reproduction
+
+A properly tail-recursive loop reading a 270 KB file 2000× climbed to ~20 GB
+RSS. The identical work in a failure-driven loop stayed flat:
+
+```prolog
+% Grows without bound (deterministic, never backtracks):
+loop(N) :- N =< 0 -> true ; phrase_from_file(seq(_), "f.dat"), N1 is N-1, loop(N1).
+
+% Flat (backtracks to between/3 every iteration -> heap truncated):
+loop(N) :- between(1, N, _), ( phrase_from_file(seq(_), "f.dat") -> true ; true ), fail.
+loop(_).
+```
+
+### Fix (in this repo)
+
+`with_connection_loop/3` was rewritten as a **failure-driven loop**:
+
+```prolog
+with_connection_loop(Context, Socket, Kont) :-
+ repeat,
+ ( catch(setup_call_cleanup(<accept>, <handle>, <close>), E, handle_conn_error(E))
+ -> true ; true ),
+ fail.
+```
+
+Each handled connection backtracks to `repeat`, so Scryer truncates the heap
+back to a fixed point and the whole per-request allocation (file contents,
+parse terms) is reclaimed. This eliminated the large per-request growth.
+
+## Cause 2 (residual): Scryer never reclaims arena slabs during execution
+
+After the heap fix, RSS still grew ~4–8 KB per request and never recovered.
+
+Scryer allocates sockets, TLS streams, bignums, etc. in an **arena**: an
+append-only singly-linked list of heap-`Box`ed slabs (`arena.rs`, `fn alloc`).
+Slabs are freed **only** in `Drop for Arena` (`arena.rs:585`), i.e. at process
+exit. There is no incremental sweep, and **backtracking does not touch the
+arena** (it is separate from the global stack).
+
+`close/1` calls `remove_stream` and `drop_payload` (frees the inner payload,
+e.g. the rustls buffers), but the **slab husk** (header + struct shell) stays
+linked forever. Every Gemini connection allocates a TCP stream slab + a TLS
+stream slab, so ~4–8 KB/request is never reclaimed while the process runs.
+
+### Reproduction
+
+A failure-driven `open/close` loop (so the heap *is* reclaimed) still grows
+~30 MB/s, monotonically:
+
+```
+t=1 RSS 53184 kB
+t=2 RSS 82604 kB
+t=3 RSS 112076 kB
+... # linear, never decreases
+```
+
+### Why it isn't cheaply fixable
+
+A stream is represented on the heap as a `Cons` cell pointing at the arena
+slab (`streams.rs`). Eagerly freeing the slab in `close/1` would leave that
+heap cell dangling until backtracking truncates it — safe for sula's
+failure-driven loop, but a use-after-free risk for arbitrary programs. The
+correct general fix is a real arena mark-and-sweep GC (mark from heap, stacks,
+and `indices.streams`; free unmarked slabs), which is a large, higher-risk
+change.
+
+### Decision
+
+**Accept the residual leak** as a known Scryer limitation and mitigate
+operationally:
+
+- periodic restart, or
+- systemd: `MemoryMax=512M` + `Restart=always`.
+
+This is documented inline in `with_connection_loop/3` in `sula.pl`.
+
+## Companion changes in the Scryer fork
+
+These reduce the *peak* per-connection footprint but do not fix the slab-husk
+growth (Cause 2). On `../scryer-prolog`, branch `js/fixes`, commit
+`52d70bcb`:
+
+1. `Stream::close` (`src/machine/streams.rs`) — the `NamedTcp` and `NamedTls`
+ arms now call `drop_payload()` after shutting down the socket / sending
+ `close_notify`, releasing the rustls `ServerConnection` buffers immediately
+ instead of waiting for the arena `Drop` at exit (matches file/byte/pipe
+ stream variants).
+2. `tls_accept_client` (`src/machine/system_calls.rs`) — caches the
+ `Arc<ServerConfig>` keyed by the raw cert/key bytes, so the certificate
+ chain and (4096-bit) private key are not re-parsed on every connection.
--- /dev/null
+# Scryer Prolog patches for sula
+
+sula runs on a patched build of Scryer Prolog. The fork lives at
+`../scryer-prolog` on branch **`js/fixes`**. These are the patches applied on
+top of upstream, in order, starting at `0c9d76c8`:
+
+| Commit | Summary |
+|------------|------------------------------------------------------|
+| `0c9d76c8` | `copy_stream/2` |
+| `8aaa6f57` | Fix `phrase_from_stream` infinite loop |
+| `20df3c69` | Make `tcp_accept` poll for INTERRUPT (catchable C-c) |
+| `840a56ff` | Switch TLS to rustls + expose client certificate |
+| `52d70bcb` | Release TLS/TCP connection state on close + cache config |
+
+For the memory-related context behind `52d70bcb`, see
+[`scryer-findings.md`](scryer-findings.md).
+
+---
+
+## `0c9d76c8` — `copy_stream/2`
+
+Adds a new builtin `copy_stream(+In, +Out)` (exported from `library(sockets)`,
+backed by `'$copy_stream'/2`).
+
+- **What:** copies all remaining bytes from `In` to `Out` in a single native
+ call using a 64 KiB buffer loop (`src/machine/system_calls.rs`,
+ `copy_stream`), without materialising the content on the Prolog heap.
+- **Why:** serving a file by reading it into a list (`phrase_from_file`/`seq`)
+ and writing it back puts the whole file on the heap. For binary content
+ (file → TLS socket) this is wasteful; `copy_stream/2` streams it directly.
+- **Used by:** `serve_binary/3` in `response.pl`.
+- **Touches:** `build/instructions_template.rs`, `src/instructions.rs`,
+ `src/machine/dispatch.rs`, `src/machine/system_calls.rs`,
+ `src/lib/sockets.pl` (the usual 5 places needed to register a new builtin).
+
+## `8aaa6f57` — Fix `phrase_from_stream` infinite loop
+
+- **What:** in `src/lib/pio.pl`, `buffer_prepare_for_n/5` now treats
+ `get_n_chars/3` returning `[]` as **EOF** (sets the buffer tail to `[]` and
+ stops), instead of looping.
+- **Why:** on streams whose `at_end_of_stream/1` never reports `true` — notably
+ **pipes** (e.g. the `openssl` subprocess used in `cert.pl` to read the
+ certificate subject) — the old code looped forever once the stream drained.
+ This made `phrase_from_stream/2` hang.
+- **Used by:** `cert_is_for_hostname/2` in `cert.pl` (parses `openssl x509`
+ output over a pipe).
+
+## `20df3c69` — Make `tcp_accept` poll for INTERRUPT
+
+- **What:** `socket_server_accept` (`src/machine/system_calls.rs`) sets the
+ listener non-blocking and polls `accept()` in a loop, sleeping 100 ms on
+ `WouldBlock`. On each poll it checks the global `machine::INTERRUPT` flag; if
+ a SIGINT arrived it restores blocking mode and returns an `Err` carrying an
+ interrupt error so the dispatch site's `try_or_throw!` performs the
+ throw/backtrack correctly.
+- **Why:** a blocking `accept()` syscall swallows Ctrl-C — the signal can't be
+ turned into a catchable `'$interrupt_thrown'` exception while the thread is
+ parked in the kernel. sula's top-level handler
+ (`handle_top_level_error(error('$interrupt_thrown', _))`) relies on that
+ exception to shut down cleanly. Polling lets the interrupt land in Prolog.
+- **Trade-off:** up to ~100 ms latency between SIGINT and shutdown; negligible
+ accept latency since a ready connection returns immediately.
+
+## `840a56ff` — Switch TLS to rustls and expose the client certificate
+
+The largest patch. Replaces the `native-tls` backend with **rustls**
+(`aws-lc-rs` provider) and reworks the server-side TLS API for Gemini's
+certificate model.
+
+- **Dependencies (`Cargo.toml`):** `tls` feature now pulls in `rustls` (0.22,
+ `aws_lc_rs`), `rustls-native-certs`, `rustls-pemfile`, `aws-lc-rs`, `hex`;
+ drops `native-tls`.
+- **Context API (`src/lib/tls.pl`):** `tls_server_context/2` now takes
+ `[certificate(Cert), key(Key)]` — **PEM** cert chain + private key (PKCS#8 /
+ PKCS#1 / SEC1) — instead of a PKCS#12 archive + password.
+- **Negotiation API:** `tls_server_negotiate/3` → **`tls_server_negotiate/4`**,
+ gaining a `-ClientCert` argument. Backed by `'$tls_accept_client'/5`
+ (`src/machine/system_calls.rs`, `tls_accept_client`).
+- **Client certificate (`ClientCert`):** the server *requests* a client cert
+ but does not require one (Gemini's TOFU model). After the handshake,
+ `ClientCert` is bound to either the atom `none` (anonymous client) or the
+ **SHA-256 digest of the client's leaf certificate**, returned as a list of
+ byte values.
+- **`GeminiClientVerifier`** (custom `rustls` `ClientCertVerifier`): offers
+ client auth, does not make it mandatory, sends no CA hints, and accepts any
+ presented certificate **as-is** (no signature/expiry validation) — i.e.
+ trust-on-first-use, matching Gemini semantics. Signature-scheme support is
+ delegated to the aws-lc-rs provider.
+- **Streams (`src/machine/streams.rs`):** the `NamedTls` stream variant is
+ reworked to wrap a rustls `StreamOwned<ServerConnection, _>`.
+- **Used by:** `cert.pl` (`with_tls_connection/3` calls
+ `tls_server_negotiate/4`, threads `ClientCert` to the request handler) and
+ `sula.pl` (`req_serve/2` receives `_ClientCert`).
+
+## `52d70bcb` — Release connection state on close + cache `ServerConfig`
+
+Memory-footprint fixes for the per-connection TLS path (full context in
+[`scryer-findings.md`](scryer-findings.md)).
+
+- **`Stream::close` (`src/machine/streams.rs`):** the `NamedTcp` and `NamedTls`
+ arms now call `drop_payload()` after the socket shutdown / `close_notify`, so
+ the rustls `ServerConnection` and its buffers are released immediately rather
+ than lingering until the arena is dropped at process exit (matches the
+ file/byte/pipe stream variants).
+- **`tls_accept_client` (`src/machine/system_calls.rs`):** caches the
+ `Arc<ServerConfig>` keyed by the raw cert/key bytes, avoiding a re-parse of
+ the certificate chain and (4096-bit) private key on every connection.
+- **Note:** this reduces *peak* footprint but does not stop the residual
+ ~4–8 KB/request growth from arena stream-slab husks — that is a Scryer
+ limitation (no incremental arena reclamation); see `scryer-findings.md`.