linux-kernel - [PATCH] lib/lzo: fix ambiguous encoding bug in lzo-rle

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-Id: <20200507100203.29785-1-dave.rodgman@arm.com>
Date:   Thu,  7 May 2020 11:02:03 +0100
From:   Dave Rodgman <dave.rodgman@....com>
To:     akpm@...ux-foundation.org
Cc:     nd@....com, dave.rodgman@....com, pushbx@...kai.org, w@....eu,
        sergey.senozhatsky.work@...il.com, markus@...rhumer.com,
        minchan@...nel.org, ngupta@...are.org, yuchao0@...wei.com,
        linux-kernel@...r.kernel.org
Subject: [PATCH] lib/lzo: fix ambiguous encoding bug in lzo-rle

In some rare cases, for input data over 32 KB, lzo-rle could encode
two different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e. data may be corrupted - although
zram is not affected because it operates over 4 KB pages).

This modifies the compressor without changing the decompressor or the
bitstream format, such that:
- there is no change to how data produced by the old compressor is
  decompressed
- an old decompressor will correctly decode data from the updated
  compressor
- performance and compression ratio are not affected
- we avoid introducing a new bitstream format

In testing over 12.8M real-world files totalling 903 GB, three files
were affected by this bug. I also constructed 37M semi-random 64 KB
files totalling 2.27 TB, and saw no affected files. Finally I tested
over files constructed to contain each of the ~1024 possible bad input
sequences; for all of these cases, updated lzo-rle worked correctly.

There is no significant impact to performance or compression ratio.

Cc: stable@...r.kernel.org
Signed-off-by: Dave Rodgman <dave.rodgman@....com>
---
 Documentation/lzo.txt    |  8 ++++++--
 lib/lzo/lzo1x_compress.c | 13 +++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/lzo.txt b/Documentation/lzo.txt
index ca983328976b..f65b51523014 100644
--- a/Documentation/lzo.txt
+++ b/Documentation/lzo.txt
@@ -159,11 +159,15 @@ Byte sequences
            distance = 16384 + (H << 14) + D
            state = S (copy S literals after this block)
            End of stream is reached if distance == 16384
+           In version 1 only, to prevent ambiguity with the RLE case when
+           ((distance & 0x803f) == 0x803f) && (261 <= length <= 264), the
+           compressor must not emit block copies where distance and length
+           meet these conditions.
 
         In version 1 only, this instruction is also used to encode a run of
-        zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
+           zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
            In this case, it is followed by a fourth byte, X.
-           run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4.
+           run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4
 
       0 0 1 L L L L L  (32..63)
            Copy of small block within 16kB distance (preferably less than 34B)
diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index 717c940112f9..8ad5ba2b86e2 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -268,6 +268,19 @@ lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8)
 						| (m_len - 2));
 			else {
+				if (unlikely(((m_off & 0x403f) == 0x403f)
+						&& (m_len >= 261)
+						&& (m_len <= 264))
+						&& likely(bitstream_version)) {
+					// Under lzo-rle, block copies
+					// for 261 <= length <= 264 and
+					// (distance & 0x80f3) == 0x80f3
+					// can result in ambiguous
+					// output. Adjust length
+					// to 260 to prevent ambiguity.
+					ip -= m_len - 260;
+					m_len = 260;
+				}
 				m_len -= M4_MAX_LEN;
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8));
 				while (unlikely(m_len > 255)) {
-- 
2.23.0