Date:	Wed, 19 Sep 2007 17:22:46 -0400
From:	Andy Lutomirski <luto@...ealbox.com>
To:	linux-kernel@...r.kernel.org, andi@...stfloor.org,
	kernel1@...erdogtech.com
Subject: Re: A little coding style nugget of joy

Andi Kleen wrote:
> Matt LaPlante <kernel1@...erdogtech.com> writes:
> 
>> Since everyone loves random statistics, here are a few gems to give you a break from your busy day:
>>
>> Number of lines in the 2.6.22 Linux kernel source that include one or more trailing whitespaces: 135209
>> Bytes saved by removing said whitespace: 151809
> 
> You don't actually save anything on disk on most file systems
> (essentially everything except reiserfs on current Linux)
> because all files are rounded to block size (normally 4K) 
> 
> Same in page cache.

This is a terrible assumption in general (e.g. if filesize % blocksize
is close to uniformly distributed).  If you remove one byte from a file
stored with blocksize B, then you either save zero bytes with
probability 1-1/B or you save B bytes with probability 1/B (the file
has to end exactly one byte past a block boundary).  The expected
number of bytes saved is B*(1/B) = 1.  Since expectation is linear, if
you remove x bytes, the expected number of bytes saved is x (even if
more than one byte is removed per file).
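
The expectation argument above can be checked by brute force.  What
follows is a minimal Python sketch under the same assumption (file size
modulo block size uniformly distributed); the function names and the
default block size of 4096 are illustrative choices, not anything from
the kernel tree.

```python
def blocks_used(size: int, block_size: int) -> int:
    """Number of whole blocks needed to store `size` bytes (ceiling division)."""
    return -(-size // block_size)


def expected_bytes_saved(removed: int, block_size: int = 4096) -> float:
    """Average on-disk saving from removing `removed` bytes from a file,
    taken over every possible value of size % block_size (assumed uniform)."""
    # Start high enough that size - removed never goes negative.
    base = block_size * (removed // block_size + 1)
    total = 0
    for residue in range(block_size):
        size = base + residue
        freed = blocks_used(size, block_size) - blocks_used(size - removed, block_size)
        total += freed * block_size
    return total / block_size
```

Under that assumption, expected_bytes_saved(x) comes out equal to x for
any x, even though most individual files save nothing.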

In my tree, about half of the files have size >= 4k, so the uniformity
assumption is probably not _that_ far off the mark.

Alternatively, an average of about 16 bytes is removed per file, and
there are 11 files which are <= 16 bytes short of a 4k boundary, so
it's not at all unreasonable that we'd save 40-50k.
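
An estimate like this can also be replaced by a direct count over a
source tree.  The Python sketch below is one way to do that, assuming
4096-byte blocks and treating only spaces and tabs at the end of a line
as trailing whitespace; the helper names are hypothetical, and it
ignores filesystem details such as tail packing or inline data.

```python
import os

BLOCK_SIZE = 4096  # assumed block size, matching the "4k" figure above


def blocks_used(size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks needed to store `size` bytes."""
    return -(-size // block_size)


def trailing_whitespace_bytes(data: bytes) -> int:
    """Count spaces and tabs that sit at the end of lines."""
    return sum(len(line) - len(line.rstrip(b" \t")) for line in data.split(b"\n"))


def disk_bytes_saved(root: str, block_size: int = BLOCK_SIZE) -> int:
    """Bytes of block-rounded storage freed by stripping trailing
    whitespace from every regular file under `root`."""
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            removed = trailing_whitespace_bytes(data)
            before = blocks_used(len(data), block_size)
            after = blocks_used(len(data) - removed, block_size)
            saved += (before - after) * block_size
    return saved
```

Calling disk_bytes_saved() on the top of a kernel tree would give the
block-rounded figure to compare against the raw byte count.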

> 
> And in tar files bzip2/gzip is very good at compacting them.

That's true.

--Andy