From: Markus Triska
Date: Wed, 8 Apr 2020 18:38:47 +0000 (+0200)
Subject: extend description of strings and partial strings
X-Git-Tag: v0.8.119~17^2~1^2
X-Git-Url: https://git.sagredo.dev/?a=commitdiff_plain;h=fac6d549860877bbec7eca1a6b2bb63244681389;p=scryer-prolog.git

extend description of strings and partial strings

Also, explain in more detail what this feature means to Prolog
application programmers, and the strategic direction of Scryer.
---

diff --git a/README.md b/README.md
index adae3e8b..263b6227 100644
--- a/README.md
+++ b/README.md
@@ -196,28 +196,50 @@ true.
 
 New operators can be defined using the `op` declaration.
 
-### Partial strings
+### Strings and partial strings
 
-Scryer has three specialized non-ISO predicates for handling so-called
-"partial strings." Partial strings imitate difference lists of
-characters, but their characters are packed in UTF-8 format, a much
-more efficient alternative to how lists of characters are represented
-in many other Prologs.
+In Scryer Prolog, the default value of the Prolog flag `double_quotes`
+is `chars`, which is also the recommended setting. This means that
+double-quoted strings are interpreted as lists of *characters*, in the
+tradition of Marseille Prolog.
 
-To use partial strings, the `iso_ext` library must be loaded:
+For example, the following query succeeds:
 
-`?- use_module(library(iso_ext)).`
+```
+?- "abc" = [a,b,c].
+   true.
+```
 
-If `X` is a free variable, the query
+Internally, strings are represented very compactly in packed
+UTF-8 encoding. A naive representation of strings as lists of
+characters would use one memory cell per character, one
+memory cell per list constructor, and one memory cell for
+each tail that occurs in the list. Since one memory cell takes
+8 bytes on 64-bit machines, the packed representation used by
+Scryer Prolog yields an up to **24-fold reduction** of
+memory usage, and corresponding reduction of memory accesses when
+creating and processing strings.
+
+Scryer Prolog uses the same efficient encoding for *partial* strings,
+which appear to Prolog code as partial lists of characters. The
+predicate `partial_string/3` from `library(iso_ext)` lets you
+construct partial strings explicitly. For example:
 
-`?- partial_string("abc", X, _), X = [a, b, c | Y], partial_string(X),
-partial_string_tail(X, Tail), Tail == Y.`
+```
+?- partial_string("abc", Ls0, Ls).
+   Ls0 = [a,b,c|Ls].
+```
 
-will succeed, posting:
+In this case, and as the answer illustrates, `Ls0` is
+indistinguishable from a partial list with tail `Ls`, while
+the efficient packed representation is used internally.
 
-`Tail = Y, X = [a,b,c|Y].`
+An important design goal of Scryer Prolog is to *automatically* use
+the efficient string representation whenever possible. Therefore, it
+is only very rarely necessary to use `partial_string/3` explicitly.
 
-By all appearances, partial strings are plain Prolog lists.
+Definite clause grammars as provided by `library(dcgs)` are ideally
+suited for reasoning about strings.
 
 ### Modules
 
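
The README text added by this patch says that definite clause grammars from `library(dcgs)` are well suited for reasoning about strings. The following is a minimal sketch of that idea, assuming the default `double_quotes` value of `chars`. The nonterminals `any_chars//1` and `greeting//1` are hypothetical names chosen for illustration; they are not part of the patch or of any library.

```prolog
:- use_module(library(dcgs)).

% Hypothetical helper: any_chars(Cs) describes an arbitrary list of
% characters Cs. It is not part of the patch or of any library.
any_chars([]) --> [].
any_chars([C|Cs]) --> [C], any_chars(Cs).

% Hypothetical example: greeting(Name) describes the string "hello "
% followed by the characters of Name. With double_quotes set to chars,
% the double-quoted terminal is an ordinary list of characters.
greeting(Name) --> "hello ", any_chars(Name).

% Example queries (assuming the default double_quotes = chars):
%   ?- phrase(greeting(Name), "hello world").
%   ?- phrase(greeting("Prolog"), Cs).
```

The first query should bind `Name` to the characters of `world`, and the second should describe `Cs` as the characters of `hello Prolog`. Because double-quoted strings are plain lists of characters, the same grammar can be used both to parse and to generate, and no explicit call to `partial_string/3` is needed.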