Message-ID: <CALCETrWA6aZC_3LPM3niN+2HFjGEm_65m9hiEdpBtEZMn0JhwQ@mail.gmail.com>
Date:	Wed, 11 Nov 2015 20:49:58 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Linux API <linux-api@...r.kernel.org>,
	Hugh Dickins <hughd@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Jason Evans <je@...com>, Daniel Micay <danielmicay@...il.com>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Shaohua Li <shli@...nel.org>, Michal Hocko <mhocko@...e.cz>,
	yalin wang <yalin.wang2010@...il.com>
Subject: Re: [PATCH v3 01/17] mm: support madvise(MADV_FREE)

On Wed, Nov 11, 2015 at 8:32 PM, Minchan Kim <minchan@...nel.org> wrote:
>
> Linux has no way to free pages lazily, while other operating systems
> already support this through madvise(MADV_FREE).
>
> The gain is clear: under memory pressure, the kernel can discard freed
> pages rather than swapping them out or triggering the OOM killer.
>
> When the madvise syscall is called, the VM clears the dirty bit of the ptes
> in the range.  Under memory pressure, the VM checks the dirty bit in the
> page table; if the pte is still clean, the page is a "lazyfree page", so
> the VM can discard it instead of swapping it out.  If there was a store to
> the page before the VM picked it for reclaim, the dirty bit is set again,
> so the VM swaps the page out instead of discarding it.
>

I realize that this lends itself to an efficient implementation, but
it's certainly the case that the kernel *could* use the accessed bit
instead of the dirty bit to give more sensible user semantics, and the
semantics that rely on the dirty bit make me uncomfortable from an ABI
perspective.
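
To make the user-visible difference concrete, here is a minimal sketch of
what the dirty-bit scheme described above means for a program: a load after
MADV_FREE does not cancel the lazy free, only a store does. The function
name store_after_madv_free_survives is made up for illustration, the
MADV_FREE fallback define assumes the Linux value for older libc headers,
and a kernel without MADV_FREE support is expected to fail the madvise call
(reported here as -1).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and madvise() in glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* Linux value, for libc headers that predate it */
#endif

/*
 * Hypothetical helper: returns 1 if a store made after MADV_FREE is
 * still visible, 0 if it was lost, and -1 if the setup failed or the
 * kernel rejected MADV_FREE.
 */
int store_after_madv_free_survives(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;
    size_t len = (size_t)ps;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    volatile unsigned char *p = map;
    p[0] = 0x11; /* fault the page in and dirty it */

    if (madvise(map, len, MADV_FREE) != 0) {
        munmap(map, len);
        return -1;
    }

    /*
     * A load does not dirty the page, so it stays eligible for
     * discard. The value seen may be the old 0x11 or 0.
     */
    unsigned char seen = p[0];
    (void)seen;

    /* A store dirties the page again, so this data must be kept. */
    p[0] = 0x22;
    int ok = (p[0] == 0x22);

    munmap(map, len);
    return ok;
}
```

With accessed-bit semantics, the load in the middle would already have
cancelled the free; with the dirty-bit semantics only the final store does.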

I also think that the kernel should commit to either zeroing the page
or leaving it unchanged in response to MADV_FREE (even if the decision
of which to do is made later on).  I think that your patch series does
this, but only after a few of the patches are applied (the swap entry
freeing), and I think that it should be a real guaranteed part of the
semantics and maybe have a test case.
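
As a rough idea of what such a test case could look like (a sketch only,
not part of the patch series): fill some anonymous pages with a known
pattern, call MADV_FREE, and check that every page reads back either
entirely unchanged or entirely zero. The names check_madv_free_contents,
page_is_filled and FILL are made up for illustration, and the MADV_FREE
fallback define assumes the Linux value for older libc headers. The check
also assumes that a page can only go from "unchanged" to "zeroed" while
nothing writes to it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and madvise() in glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* Linux value, for libc headers that predate it */
#endif

#define FILL 0xA5 /* arbitrary non-zero test pattern */

/* Returns 1 if every byte of the page equals value, else 0. */
static int page_is_filled(const volatile unsigned char *page,
                          size_t size, unsigned char value)
{
    for (size_t i = 0; i < size; i++)
        if (page[i] != value)
            return 0;
    return 1;
}

/*
 * Hypothetical test helper: returns 0 if every page is either unchanged
 * or all zero after MADV_FREE, 1 if some page is neither, and -1 if the
 * setup failed or the kernel rejected MADV_FREE.
 */
int check_madv_free_contents(size_t npages)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || npages == 0)
        return -1;
    size_t page_size = (size_t)ps;
    size_t len = npages * page_size;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    memset(map, FILL, len); /* fault in and dirty every page */

    if (madvise(map, len, MADV_FREE) != 0) {
        munmap(map, len);
        return -1;
    }

    int bad = 0;
    const volatile unsigned char *p = map;
    for (size_t i = 0; i < npages; i++) {
        const volatile unsigned char *page = p + i * page_size;

        if (page_is_filled(page, page_size, FILL))
            continue; /* page left unchanged */
        /*
         * The page may have been discarded while it was being
         * scanned, so rescan: it must now read as all zero.
         */
        if (!page_is_filled(page, page_size, 0)) {
            bad = 1;
            break;
        }
    }

    munmap(map, len);
    return bad;
}
```

A real test would also want to run this under memory pressure, since
without reclaim every page will most likely just read back unchanged.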

--Andy