Date:   Thu, 19 Oct 2017 12:19:16 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Michal Hocko <mhocko@...e.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Minchan Kim <minchan@...nel.org>,
        Yisheng Xie <xieyisheng1@...wei.com>,
        Ingo Molnar <mingo@...nel.org>,
        Greg Thelen <gthelen@...gle.com>,
        Hugh Dickins <hughd@...gle.com>, Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: mlock: remove lru_add_drain_all()

On Wed, Oct 18, 2017 at 11:24 PM, Anshuman Khandual
<khandual@...ux.vnet.ibm.com> wrote:
> On 10/19/2017 04:47 AM, Shakeel Butt wrote:
>> Recently we have observed high latency in mlock() in our generic
>> library. We noticed that users have started using tmpfs files even
>> without swap, and the latency was due to expensive remote LRU cache
>> draining.
>
> With and without this patch I don't see much difference in the number
> of instructions executed in the kernel for the mlock() system call on
> a POWER8 platform just after reboot (all the pagevecs might not have
> been filled by then, though). There is an improvement, but it is very
> small.
>
> Could you share your latency numbers and explain how this patch
> improves them?
>

The latency is very dependent on the workload and the number of cores
on the machine. On a production workload, customers were complaining
that a single mlock() call was taking around 10 seconds on tmpfs files
which were already in memory.

>>
>> Is lru_add_drain_all() required by mlock()? The answer is no, and the
>> reason it is still in mlock() is to rapidly move mlocked pages to the
>> unevictable LRU. Without lru_add_drain_all(), the mlocked pages which
>> were on a pagevec at mlock() time will be moved to evictable LRUs but
>> will eventually be moved back to the unevictable LRU by reclaim. So, we
>
> Wont this affect the performance during reclaim ?
>

Yes, but reclaim is already a slow path, and to seriously impact
reclaim we would need a very antagonistic workload that is hard to
trigger (i.e. for each mlock() on a CPU, the pages being mlocked
happen to be in the pagevec caches of other CPUs).

>> can safely remove lru_add_drain_all() from mlock(). Also there is no
>> need for a local lru_add_drain(), as it will be called deep inside
>> __mm_populate() (in follow_page_pte()).
>
> The following commit which originally added lru_add_drain_all()
> during mlock() and mlockall() has similar explanation.
>
> 8891d6da ("mm: remove lru_add_drain_all() from the munlock path")
>
> "In addition, this patch add lru_add_drain_all() to sys_mlock()
> and sys_mlockall().  it isn't must.  but it reduce the failure
> of moving to unevictable list.  its failure can rescue in
> vmscan later.  but reducing is better."
>
> Which sounds like we either have to handle the movement to the
> unevictable LRU during reclaim, or it can be done here to speed up
> reclaim later on.
>
