From 6b8a679a516f1ececa2e6e9f4d9ea3c4cd4ab07b Mon Sep 17 00:00:00 2001
From: Markus Triska <triska@metalevel.at>
Date: Sun, 25 Feb 2024 11:13:38 +0100
Subject: [PATCH] explain potential side-channel attacks due to compact string
 representation

This legitimate concern was already raised by @infogulch in:

https://github.com/mthom/scryer-prolog/issues/1309#issuecomment-1080028854

Thank you a lot!
---
 src/lib/crypto.pl | 140 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 140 insertions(+)

diff --git a/src/lib/crypto.pl b/src/lib/crypto.pl
index 94def2c0..53e3bdb9 100644
--- a/src/lib/crypto.pl
+++ b/src/lib/crypto.pl
@@ -750,6 +750,146 @@ ed25519_keypair_public_key(Pair, PublicKey) :-
 %  PKCS#8 v2 format as generated by `ed25519_new_keypair/1`. Sign Data
 %  with Key, yielding Signature as a list of hexadecimal characters.
 
+/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+   Side-channel attacks on Ed25519 predicates
+   ==========================================
+
+   Ed25519 predicates where the private key occurs as part of the
+   arguments are potentially subject to side-channel attacks, since
+   key pairs are represented as strings in the context of Ed25519.
+
+   The compact string representation used by Scryer Prolog means that
+   different characters may occupy different numbers of bytes: Due
+   to UTF-8 encoding, characters with codes 1..127 occupy exactly 1 byte,
+   characters with codes 128..2047 occupy exactly 2 bytes, and '0'
+   is represented as a list element occupying an entire cell and
+   dedicated list constructor in addition to string termination and
+   possibly padding.
+
+   This difference is located at the level of the Rust engine. To
+   Prolog code, any two characters look conceptually the same (i.e.,
+   they are both atoms of length 1), and the internal difference in
+   representation cannot be observed at all.
+
+   Very precise timing information or other measurements about
+   operations that reason about such strings may yield information
+   that is meant to stay secret. For example, if Ks is a secret key
+   stored as a list of characters, then the time it takes to run
+   phrase(..., Ks) may reveal the number of bytes in Ks that are 0 or
+   greater than 127.
+
+   To test whether it is possible to detect such differences, I use
+   exp(N) which succeeds exactly 2^N times:
+
+        exp(E) :-
+            N is 2^E,
+            between(1, N, _).
+
+   Here is an example query that uses partial_string/1 to traverse
+   various strings consisting uniformly of characters with the same
+   code, such as 0, 32, 255 and others, in the hope to detect
+   differences in timing if only in such extreme cases:
+
+        ?- length(Ls, 256),
+           member(Code, [12,0,55,0,0,32,255,10,0,127,64]),
+           portray_clause(byte=Code),
+           maplist(=(Code), Ls),
+           atom_codes(A, Ls),
+           atom_chars(A, Cs),
+           time((exp(21),partial_string(Cs),false)).
+        %@ byte=12.
+        %@    % CPU time: 1.537s, 14_680_107 inferences
+        %@ byte=0.
+        %@    % CPU time: 1.526s, 14_680_107 inferences
+        %@ byte=55.
+        %@    % CPU time: 1.534s, 14_680_107 inferences
+        %@ byte=0.
+        %@    % CPU time: 1.556s, 14_680_107 inferences
+        %@ byte=0.
+        %@    % CPU time: 1.520s, 14_680_107 inferences
+        %@ byte=32.
+        %@    % CPU time: 1.524s, 14_680_107 inferences
+        %@ byte=255.
+        %@    % CPU time: 1.522s, 14_680_107 inferences
+        %@ byte=10.
+        %@    % CPU time: 1.526s, 14_680_107 inferences
+        %@ byte=0.
+        %@    % CPU time: 1.522s, 14_680_107 inferences
+        %@ byte=127.
+        %@    % CPU time: 1.522s, 14_680_107 inferences
+        %@ byte=64.
+        %@    % CPU time: 1.517s, 14_680_107 inferences
+        %@    false.
+
+   This shows that there is enough variety between runs that
+   traversing a list with 256 elements that are all '\x0\' may even,
+   and unexpectedly, be faster than traversing a list consisting
+   entirely of characters with character code 32, which in turn may be
+   slower than processing a list with 256 characters that all have
+   code 255 and thus occupy twice as much space. This holds even over
+   millions of runs. Reasons for such variety can include CPU power
+   saving mechanisms, dynamic optimizations, prefetching heuristics,
+   branch prediction algorithms, varying system loads etc.
+
+   This gives rise to the suspicion that any such timing differences
+   would be extremely hard to exploit, at least on the architecture I
+   tested it on, also since partial_string/1 is a very low-level
+   operation and any actual processing (using phrase/2 etc.) would
+   introduce additional overheads that in all likelihood far outweigh
+   any differences that can be measured with partial_string/1.
+
+   Any resulting differences in timing and resource use, if they are
+   measurable at all in any way, can at most reveal one bit per byte.
+   Note also that Ed25519 private keys are chosen randomly, and hence
+   half of their bytes are expected to be in 128..255. The predicates
+   remain completely safe to use in all scenarios where no information
+   about the private key can be gathered by unauthorized parties.
+
+   Still, the concern remains: We know that different keys may occupy
+   different numbers of bytes in the internal compact representation
+   of strings used by Scryer Prolog, and it may be possible to exploit
+   these differences to obtain information that is meant to be kept
+   secret. We must therefore keep an eye on this issue. For example,
+   it may become a concern on very slow devices such as ID-cards where
+   Scryer Prolog may be deployed in the future and where such timing
+   differences may be detectable, or if Scryer Prolog itself becomes
+   so fast that the relative overhead of such low-level operations
+   becomes greater and thus more easily measurable.
+
+   Possible mitigations in such situations would be to:
+
+    1. use lists of integers to represent Ed25519 key pairs, resulting
+       in a 24-fold space increase. This may be prohibitive in
+       applications that manage a great number of keys. The Rust code
+       would not be affected by this change, since it already operates
+       on bytes. A Prolog application may implement this privately, and
+       also encourage an API change of this library.
+    2. introduce a compact internal representation for lists of bytes,
+       which appear to Prolog programs as lists of characters.
+
+   (2) seems to be the better solution despite the implementation
+   overhead: In addition to the improved security properties due to
+   the elimination of side-channel attacks when reasoning about keys,
+   all applications that reason about binary data would benefit from a
+   more compact representation of binary data. Such an additional
+   compact representation should only be attempted if the amount of
+   Rust code it impacts is kept to the absolute minimum, certainly
+   much smaller than what the compact string representation as it is
+   currently implemented affects.
+
+   *No* solution would be to:
+
+    - eliminate the compact string representation from the engine and
+      use plain lists of characters to represent Ed25519 key pairs,
+    - continue to use lists of characters for Ed25519 key pairs,
+      and ensure that they are never coalesced into compact strings
+      (a future GC compaction step may need to be adapted for this)
+
+   This is because atom names are still represented in UTF-8 encoding,
+   and are hence also susceptible to side-channel attacks due to their
+   using different numbers of bytes for different codes.
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
+
 ed25519_sign(KeyPair, Data0, Signature, Options) :-
         must_be_octet_chars(KeyPair, ed25519_sign/4),
         length(Prefix, 16),
-- 
2.54.0