lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <cover.1369092449.git.aquini@redhat.com>
Date:	Mon, 20 May 2013 21:04:23 -0300
From:	Rafael Aquini <aquini@...hat.com>
To:	linux-kernel@...r.kernel.org
Cc:	linux-mm@...ck.org, akpm@...ux-foundation.org, hughd@...gle.com,
	shli@...nel.org, kzak@...hat.com, jmoyer@...hat.com,
	riel@...hat.com, lwoodman@...hat.com, mgorman@...e.de
Subject: [RFC PATCH 00/02] swap: allowing a more flexible DISCARD policy

Howdy folks,

While working on a backport for the following changes:

3399446 swap: discard while swapping only if SWAP_FLAG_DISCARD
052b198 swap: don't do discard if no discard option added

We found ourselves around an interesting discussion on how limiting 
the behavior with regard to user-visible swap areas configuration has become
after applying the aforementioned changesets.

Before commit 3399446, if the swap backing device supported DISCARD,
then a batched discard was issued at swapon(8) time, and fine-grained DISCARDs
were issued in between freeing swap page-clusters and re-writing to them.
As noticed at 3399446's commit message, the fine-grained discards often
didn't help on improving performance as expected, and were potentially causing
more trouble than desired. So, commit 3399446 introduced a new swapon flag,
to make the fine-grained discards while swapping conditional.
However a batched discard would have been issued everytime swapon(8) was
turning a new swap area available.

This batched operation that remained at sys_swapon was considered troublesome
for some setups, and specially because a sysadmin was not flagging swapon(8)
to do discards -- http://www.spinics.net/lists/linux-mm/msg31741.html 
then, commit 052b198 got merged to address the scenario described above.
After this last commit, now we can either only do both batched and fine-grained
discards for swap, or none of them.

As depicted above, this seems to be not very flexible as it could be,
and the whole discussion we had (internally) left us wondering if does upstream
feel it would be useful to allow for both a batched discard as well as a
fine-grained discard option for swap? (By batched, here, it could mean just
the one time operation at swapon, or something similar to what fstrim does).

In fact, we all agreed with having no discards sent down at all as the default
behaviour. But thinking a little more about the use cases where device supports
discard:
a) and can do it quickly;
b) but it's slow to do in small granularities (or concurrent with other
   I/O);
c) but the implementation is so horrendous that you don't even want to
   send one down;

And assuming that the sysadmin considers it useful to send the discards down
at all, we would (probably) want the following solutions:

1) do the fine-grained discards if swap device is capable of doing so;
2) do batched discards, either at swapon or via something like fstrim; or
3) turn it off completely (default behavior nowadays)


i.e.: Today, if we have a swap device like (b), we cannot perform (2) even if
it might be regardeed as interesting, or necessary to the workload because it
would imply (1), and the device cannot do that and perform optimally.


With all that in mind, and in order to attempt to sort out the (un)flexibility
problem exposed above, I came up with the following patches:

01 (kernel)     swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER
02 (util-linux) swapon: add "cluster-discard" support


Sorry for the long email.

Your feedback is very much appreciated!
-- Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ