% PKCS#8 v2 format as generated by `ed25519_new_keypair/1`. Sign Data
% with Key, yielding Signature as a list of hexadecimal characters.
+/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+ Side-channel attacks on Ed25519 predicates
+ ==========================================
+
+ Ed25519 predicates where the private key occurs as part of the
+ arguments are potentially subject to side-channel attacks, since
+ key pairs are represented as strings in the context of Ed25519.
+
+ The compact string representation used by Scryer Prolog means that
+ different characters may occupy different numbers of bytes: Due
+ to UTF-8 encoding, characters with codes 1..127 occupy exactly 1 byte,
+ characters with codes 128..2047 occupy exactly 2 bytes, and '0'
+ is represented as a list element occupying an entire cell and
+ dedicated list constructor in addition to string termination and
+ possibly padding.
+
+ This difference is located at the level of the Rust engine. To
+ Prolog code, any two characters look conceptually the same (i.e.,
+ they are both atoms of length 1), and the internal difference in
+ representation cannot be observed at all.
+
+ Very precise timing information or other measurements about
+ operations that reason about such strings may yield information
+ that is meant to stay secret. For example, if Ks is a secret key
+ stored as a list of characters, then the time it takes to run
+ phrase(..., Ks) may reveal the number of bytes in Ks that are 0 or
+ greater than 127.
+
+ To test whether it is possible to detect such differences, I use
+ exp(N) which succeeds exactly 2^N times:
+
+ exp(E) :-
+ N is 2^E,
+ between(1, N, _).
+
+ Here is an example query that uses partial_string/1 to traverse
+ various strings consisting uniformly of characters with the same
+ code, such as 0, 32, 255 and others, in the hope to detect
+ differences in timing if only in such extreme cases:
+
+ ?- length(Ls, 256),
+ member(Code, [12,0,55,0,0,32,255,10,0,127,64]),
+ portray_clause(byte=Code),
+ maplist(=(Code), Ls),
+ atom_codes(A, Ls),
+ atom_chars(A, Cs),
+ time((exp(21),partial_string(Cs),false)).
+ %@ byte=12.
+ %@ % CPU time: 1.537s, 14_680_107 inferences
+ %@ byte=0.
+ %@ % CPU time: 1.526s, 14_680_107 inferences
+ %@ byte=55.
+ %@ % CPU time: 1.534s, 14_680_107 inferences
+ %@ byte=0.
+ %@ % CPU time: 1.556s, 14_680_107 inferences
+ %@ byte=0.
+ %@ % CPU time: 1.520s, 14_680_107 inferences
+ %@ byte=32.
+ %@ % CPU time: 1.524s, 14_680_107 inferences
+ %@ byte=255.
+ %@ % CPU time: 1.522s, 14_680_107 inferences
+ %@ byte=10.
+ %@ % CPU time: 1.526s, 14_680_107 inferences
+ %@ byte=0.
+ %@ % CPU time: 1.522s, 14_680_107 inferences
+ %@ byte=127.
+ %@ % CPU time: 1.522s, 14_680_107 inferences
+ %@ byte=64.
+ %@ % CPU time: 1.517s, 14_680_107 inferences
+ %@ false.
+
+ This shows that there is enough variety between runs that
+ traversing a list with 256 elements that are all '\x0\' may even,
+ and unexpectedly, be faster than traversing a list consisting
+ entirely of characters with character code 32, which in turn may be
+ slower than processing a list with 256 characters that all have
+ code 255 and thus occupy twice as much space. This holds even over
+ millions of runs. Reasons for such variety can include CPU power
+ saving mechanisms, dynamic optimizations, prefetching heuristics,
+ branch prediction algorithms, varying system loads etc.
+
+ This gives rise to the suspicion that any such timing differences
+ would be extremely hard to exploit, at least on the architecture I
+ tested it on, also since partial_string/1 is a very low-level
+ operation and any actual processing (using phrase/2 etc.) would
+ introduce additional overheads that in all likelihood far outweigh
+ any differences that can be measured with partial_string/1.
+
+ Any resulting differences in timing and resource use, if they are
+ measurable at all in any way, can at most reveal one bit per byte.
+ Note also that Ed25519 private keys are chosen randomly, and hence
+ half of their bytes are expected to be in 128..255. The predicates
+ remain completely safe to use in all scenarios where no information
+ about the private key can be gathered by unauthorized parties.
+
+ Still, the concern remains: We know that different keys may occupy
+ different numbers of bytes in the internal compact representation
+ of strings used by Scryer Prolog, and it may be possible to exploit
+ these differences to obtain information that is meant to be kept
+ secret. We must therefore keep an eye on this issue. For example,
+ it may become a concern on very slow devices such as ID-cards where
+ Scryer Prolog may be deployed in the future and where such timing
+ differences may be detectable, or if Scryer Prolog itself becomes
+ so fast that the relative overhead of such low-level operations
+ becomes greater and thus more easily measurable.
+
+ Possible mitigations in such situations would be to:
+
+ 1. use lists of integers to represent Ed25519 key pairs, resulting
+ in a 24-fold space increase. This may be prohibitive in
+ applications that manage a great number of keys. The Rust code
+ would not be affected by this change, since it already operates
+ on bytes. A Prolog application may implement this privately, and
+ also encourage an API change of this library.
+ 2. introduce a compact internal representation for lists of bytes,
+ which appear to Prolog programs as lists of characters.
+
+ (2) seems to be the better solution despite the implementation
+ overhead: In addition to the improved security properties due to
+ the elimination of side-channel attacks when reasoning about keys,
+ all applications that reason about binary data would benefit from a
+ more compact representation of binary data. Such an additional
+ compact representation should only be attempted if the amount of
+ Rust code it impacts is kept to the absolute minimum, certainly
+ much smaller than what the compact string representation as it is
+ currently implemented affects.
+
+ *No* solution would be to:
+
+ - eliminate the compact string representation from the engine and
+ use plain lists of characters to represent Ed25519 key pairs,
+ - continue to use lists of characters for Ed25519 key pairs,
+ and ensure that they are never coalesced into compact strings
+ (a future GC compaction step may need to be adapted for this)
+
+ This is because atom names are still represented in UTF-8 encoding,
+ and are hence also susceptible to side-channel attacks due to their
+ using different numbers of bytes for different codes.
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
+
ed25519_sign(KeyPair, Data0, Signature, Options) :-
must_be_octet_chars(KeyPair, ed25519_sign/4),
length(Prefix, 16),