Ondřej Vejpustek via bitcoin-dev
2018-01-17 11:39:42 UTC
There is a rule of thumb which says it is safer plaintext to have low
https://en.wikipedia.org/wiki/Redundancy_(information_theory), i. e.
it's better to encrypt random or compressed data than natural language.
This rule is based on Shannon's information theory which means that a
breach of the rule usually doesn't induce a vulnerability (there is no
known generic attack). This rule is application of a precautionary
Nevertheless, here are some examples of cryptographic attacks which may
be considered as a consequence of the breach of the rule:
* Related Message Attack by Coppersmith, Franklin, Patarin, Reiter
- given RSA ciphertext of two plaintexts x and a*x + b, where a, b are
known, it's possible to effectively compute x provided public exponent
is three. From the informaton-theoretic point of view the second message
is redundant, because it's determined by the first one. Which means that
relative redundancy of both messages is at least one half.
* Stereotyped Messages by Coppersmith
(https://www.di.ens.fr/~fouque/ens-rennes/coppersmith.pdf, section 7) -
given RSA ciphertext and (1-1/e) fraction of plaintext (where e is
public exponent), it's possible to effectively compute x. Message is
highly redundant, because only 1/e of the message is unknown. Relative
redundancy of the message is at least (1-1/e).
Consider a few notes:
* Nowadays there exists more complicated variants of mentioned attacks
which have weaker premisses.
* There is a considerable similarity between RSA and SSS. Both schemes
are algebraically-based (rather than boolean function based).
* CRCs (and error-correcting codes generally) introduce redundancy
into the message. Moreover the redundancy is induced by a linear
relationship among message (compare with the premise of the Related
* Related Message Attack wouldn't be possible if you had two
plaintexts x and hash(x). The relationship between messages has to be
(algebraically) uncomplicated. From the information-theoretic point of
view the situation is the same, but from the practical point of view it
is completely different.
To sum it up, there is a precautionary principle which tells us not to
increase redundancy of a message unless it is introduced in a
complicated way (for example by a hash function). That's why we use SHA
rather than CRC. One more reason why we stick to the principle is that
there's no randomisation in our scheme (such as padding or
initialisation vector). We understood advantages of error-correctings
codes over hash functions (minimal codewords distance property,
performance) and we considered it thoroughly.