Message-ID: <20200120081710.GA18028@richard>
Date:   Mon, 20 Jan 2020 16:17:10 +0800
From:   Wei Yang <richardw.yang@...ux.intel.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     David Rientjes <rientjes@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Wei Yang <richardw.yang@...ux.intel.com>, hannes@...xchg.org,
        vdavydov.dev@...il.com, ktkhai@...tuozzo.com,
        kirill.shutemov@...ux.intel.com, yang.shi@...ux.alibaba.com,
        cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, alexander.duyck@...il.com,
        stable@...r.kernel.org
Subject: Re: [Patch v4] mm: thp: remove the defer list related code since
 this will not happen

On Mon, Jan 20, 2020 at 08:22:37AM +0100, Michal Hocko wrote:
>On Sat 18-01-20 15:36:06, David Rientjes wrote:
>> On Sat, 18 Jan 2020, Andrew Morton wrote:
>> 
>> > On Sat, 18 Jan 2020 07:38:36 +0800 Wei Yang <richardw.yang@...ux.intel.com> wrote:
>> > 
>> > > If compound is true, this means it is a PMD mapped THP. Which implies
>> > > the page is not linked to any defer list. So the first code chunk will
>> > > not be executed.
>> > > 
>> > > Also with this reason, it would not be proper to add this page to a
>> > > defer list. So the second code chunk is not correct.
>> > > 
>> > > Based on this, we should remove the defer list related code.
>> > > 
>> > > Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>> > > 
>> > > Signed-off-by: Wei Yang <richardw.yang@...ux.intel.com>
>> > > Suggested-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> > > Cc: <stable@...r.kernel.org>    [5.4+]
>> > 
>> > This patch is identical to "mm: thp: grab the lock before manipulating
>> > defer list", which is rather confusing.  Please let people know when
>> > this sort of thing is done.
>> > 
>> > The earlier changelog mentioned a possible race condition.  This
>> > changelog does not.  In fact this changelog fails to provide any
>> > description of any userspace-visible runtime effects of the bug. 
>> > Please send along such a description for inclusion, as always.
>> > 
>> 
>> The locking concern that Wei was originally looking at is no longer an 
>> issue because we determined that the code in question could simply be 
>> removed.
>> 
>> I think the following can be added to the changelog:
>> 
>> ----->o-----
>> 
>> When migrating memcg charges of thp memory, there are two possibilities:
>> 
>>  (1) The underlying compound page is mapped by a pmd and thus is not
>>      on a deferred split queue (it's mapped), or
>> 
>>  (2) The compound page is not mapped by a pmd and is awaiting split on a
>>      deferred split queue.
>> 
>> The current charge migration implementation does *not* migrate charges for 
>> thp memory on the deferred split queue, it only migrates charges for pages 
>> that are mapped by a pmd.
>> 
>> Thus, to migrate charges, the underlying compound page cannot be on a 
>> deferred split queue; no list manipulation needs to be done in 
>> mem_cgroup_move_account().
>> 
>> With the current code, the underlying compound page is moved to the 
>> deferred split queue of the memcg its memory is not charged to, so 
>> subsequent reclaim will consider these pages for the wrong memcg.  Remove 
>> the deferred split queue handling in mem_cgroup_move_account() entirely.
>
>I believe this still doesn't describe the underlying problem to the full
>extent. What happens with the page on the deferred list when it
>shouldn't be there in fact? Unless I am missing something deferred_split_scan
>will simply split that huge page. Which is a bit unfortunate but nothing
>really critical. This should be mentioned in the changelog.
>

As I understand it, if we split the huge page when it is not necessary, we
will probably see lower performance due to more TLB misses. Beyond that, I
don't see any other impact.
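
To make the scenario concrete, here is a minimal userspace sketch in C. It is
only a toy model, not the kernel code: struct toy_thp, toy_move_account() and
toy_deferred_split_scan() are hypothetical names made up for illustration, and
the model assumes every queued page is split successfully, which the real scan
does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
struct toy_thp {
    bool pmd_mapped;        /* mapped by a single PMD entry */
    bool on_deferred_queue; /* queued for a deferred split */
    bool split;             /* already split into base pages */
    int  charged_memcg;     /* id of the memcg the page is charged to */
};

/*
 * Charge migration in the model: only PMD-mapped pages are moved, and
 * such pages are not on a deferred split queue, so no queue handling
 * is done here. Returns true if the charge was moved.
 */
static bool toy_move_account(struct toy_thp *page, int to_memcg)
{
    if (!page->pmd_mapped)
        return false;
    assert(!page->on_deferred_queue);
    page->charged_memcg = to_memcg;
    return true;
}

/*
 * Scan in the model: every queued page is split (assumed to always
 * succeed), so it stops being PMD mapped. Returns the number of
 * pages split.
 */
static size_t toy_deferred_split_scan(struct toy_thp *pages, size_t n)
{
    size_t nr_split = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].on_deferred_queue && !pages[i].split) {
            pages[i].split = true;
            pages[i].pmd_mapped = false;
            pages[i].on_deferred_queue = false;
            nr_split++;
        }
    }
    return nr_split;
}
```

In this model, a PMD-mapped page whose charge is moved without touching any
queue survives a later scan intact. If the page is wrongly queued after the
move, the scan splits it, and the PMD mapping (with its TLB benefit) is lost.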

>With that clarified, feel free to add
>
>Acked-by: Michal Hocko <mhocko@...e.com>
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me
