lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 26 Sep 2016 10:06:20 +0900
From:   Minchan Kim <minchan@...nel.org>
To:     Shaohua Li <shli@...nel.org>
Cc:     "Huang, Ying" <ying.huang@...el.com>,
        Rik van Riel <riel@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        tim.c.chen@...el.com, dave.hansen@...el.com, andi.kleen@...el.com,
        aaron.lu@...el.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Hugh Dickins <hughd@...gle.com>,
        Minchan Kim <minchan@...nel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vladimir Davydov <vdavydov@...tuozzo.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping
 out

On Sun, Sep 25, 2016 at 12:18:49PM -0700, Shaohua Li wrote:
> On Fri, Sep 23, 2016 at 10:32:39AM +0800, Huang, Ying wrote:
> > Rik van Riel <riel@...hat.com> writes:
> > 
> > > On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote:
> > >> On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote:
> > >> > 
> > >> > - It will help the memory fragmentation, especially when the THP is
> > >> >   heavily used by the applications.  The 2M continuous pages will
> > >> > be
> > >> >   free up after THP swapping out.
> > >> 
> > >> So this is impossible without THP swapin. While 2M swapout makes a
> > >> lot of
> > >> sense, I doubt 2M swapin is really useful. What kind of application
> > >> is
> > >> 'optimized' to do sequential memory access?
> > >
> > > I suspect a lot of this will depend on the ratio of storage
> > > speed to CPU & RAM speed.
> > >
> > > When swapping to a spinning disk, it makes sense to avoid
> > > extra memory use on swapin, and work in 4kB blocks.
> > 
> > For spinning disk, the THP swap optimization will be turned off in
> > current implementation.  Because huge swap cluster allocation based on
> > swap cluster management, which is available only for non-rotating block
> > devices (blk_queue_nonrot()).
> 
> For 2m swapin, as long as one byte is changed in the 2m, next time we must do
> 2m swapout. There is huge waste of memory and IO bandwidth and increases
> unnecessary memory pressure. 2M IO will very easily saturate a very fast SSD

I agree. No doubt THP swapout is helpful for overall performance but
THP swapin should be more careful. It would cause memory pressure which
could evict warm pages which mitigates THP's benefit. THP swapin also
would increase minor fault latency, too.

If we want to swap in a THP, I think we need something to guarantee that
subpages in a THP swapped out were hot and temporal locality so that
it's worth to swap in a THP page to lose other memory kept in in memory.

Maybe it would not matter so much in MADVISE mode where userspace knows
pros and cons and choosed it. The problem would be there in ALWAYS mode.

One of idea is we can raise bar to collapse THP page higher, for example,
reducing khugepaged_max_ptes_none and introducing khugepaged_max_pte_ref.
With that, khugepaged would collapse 4K pages into a THP only if most of
subpages are mapped and hot.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ