Date:	Wed, 11 May 2011 17:12:02 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Rob Landley <rob@...dley.net>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: Don't understand comment in arch/x86/boot/compressed/misc.c

Rob Landley <rob@...dley.net> writes:

> It talks about when decompression in place is safe to do:
>
>  * Getting to provable safe in place decompression is hard.
>  * Worst case behaviours need to be analyzed.
> ...
>  * The buffer for decompression in place is the length of the
>  * uncompressed data, plus a small amount extra to keep the algorithm safe.
>  * The compressed data is placed at the end of the buffer.  The output
>  * pointer is placed at the start of the buffer and the input pointer
>  * is placed where the compressed data starts.  Problems will occur
>  * when the output pointer overruns the input pointer.
>  *
>  * The output pointer can only overrun the input pointer if the input
>  * pointer is moving faster than the output pointer.  A condition only
>  * triggered by data whose compressed form is larger than the uncompressed
>  * form.
>
> You have an output pointer at a lower address catching up to an input
> pointer at a higher address.  If the input pointer is moving FASTER
> than the output pointer, wouldn't the gap between them grow rather
> than shrink?

The wording might be clearer, but the basic concept in context seems
fine.  The entire section is talking about how many bytes more than the
uncompressed size of the data you need to guarantee you won't overrun
your compressed data.

For gzip that is a smidge over a single compressed block.

In the worst case you have to assume that none of your blocks
actually compressed.

So an input pointer going faster than an output pointer is a problem
if you try to limit yourself to exactly the area of memory that the
decompressed data lives in.  In that case the input pointer will
run off the end.

> The concern seems to be about COMPRESSING in place, rather than
> decompressing...?

No.  In theory there is some data that when compressed will grow.  In
the best case that data will grow only by a single bit.

In case a picture will help.

  Decompressed data goes here       Compressed data comes from here
  |                                 |
0 v->                               v->
+---------------------------------------+-----+------------+
|                                       |extra|decompressor|
+---------------------------------------+-----+------------+

The question is how large that extra chunk needs to be.  This matters
either when nothing compresses (the worst case for extra space) or
especially when you get a chunk of blocks at the end that don't
compress (a plausible but almost worst case scenario).

Things have changed a bit since I wrote that analysis.  The computation
of the worst case space has moved to mkpiggy.c, support for other
compression methods has been added, and we now have a mini ELF loader
in misc.c which adds an extra step to everything.  But the overall
concepts remain valid.

Eric