Message-ID: <44001748-05AC-49B2-88F5-371618C12AD9@cs.rutgers.edu>
Date: Sun, 12 Feb 2017 18:25:09 -0600
From: Zi Yan <zi.yan@...rutgers.edu>
To: "Kirill A. Shutemov" <kirill@...temov.name>
CC: Andrea Arcangeli <aarcange@...hat.com>,
Minchan Kim <minchan@...nel.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<kirill.shutemov@...ux.intel.com>, <akpm@...ux-foundation.org>,
<vbabka@...e.cz>, <mgorman@...hsingularity.net>,
<n-horiguchi@...jp.nec.com>, <khandual@...ux.vnet.ibm.com>,
Zi Yan <ziy@...dia.com>
Subject: Re: [PATCH v3 03/14] mm: use pmd lock instead of racy checks in
zap_pmd_range()
Hi Kirill,
>>>> The crash scenario I guess is like:
>>>> 1. A huge page pmd entry is in the middle of being changed into either a
>>>> pmd_protnone or a pmd_migration_entry. It is cleared to pmd_none.
>>>>
>>>> 2. At the same time, the application frees the vma this page belongs to.
>>>
>>> Em... no.
>>>
>>> This shouldn't be possible: your 1. must be done under down_read(mmap_sem).
>>> And we only be able to remove vma under down_write(mmap_sem), so the
>>> scenario should be excluded.
>>>
>>> What do I miss?
>>
>> You are right. This problem will not happen in the upstream kernel.
>>
>> The problem comes from my customized kernel, where I migrate pages away
>> instead of reclaiming them when memory is under pressure. I did not take
>> any mmap_sem when I migrate pages. So I got this error.
>>
>> It is a false alarm. Sorry about that. Thanks for clarifying the problem.
>
> I think there's still a race between MADV_DONTNEED and
> change_huge_pmd(.prot_numa=1) resulting in skipping THP by
> zap_pmd_range(). It need to be addressed.
>
> And MADV_FREE requires a fix.
>
> So, minus one non-bug, plus two bugs.
>
You said a huge page pmd entry can only be changed under down_read(mmap_sem).
That is only true for huge pages, right?
In mm/compaction.c, the kernel does not take down_read(mmap_sem) during memory
compaction. In other words, base page migration does not hold mmap_sem, which
is why the kernel needs to hold the PTE page table lock in zap_pte_range().
Am I right about this?
If so, IMHO, once we eventually need to compact 2MB pages to form 1GB pages,
zap_pmd_range() will have to take the pmd lock to make that kind of compaction
possible.
Do you agree?
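To make the question concrete, here is a minimal user-space sketch of the
locking pattern I have in mind. It is only a model, not kernel code: the
model_* names are hypothetical, a pthread mutex stands in for the pmd page
table lock, and the three states are a simplification of what a pmd entry
can hold. The point is that the zap side inspects the entry only while
holding the lock, so it can never observe the transient "none" state that
the migration side passes through.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_pmd_state {
	MODEL_PMD_NONE,		/* entry is empty */
	MODEL_PMD_HUGE,		/* entry maps a huge page */
	MODEL_PMD_MIGRATION,	/* entry holds a migration entry */
};

struct model_pmd {
	pthread_mutex_t lock;	/* stands in for the pmd page table lock */
	enum model_pmd_state state;
};

/*
 * Migration side: the entry is transiently cleared before the migration
 * entry is installed. Both steps happen under the lock, so the cleared
 * state is never visible to a reader that also takes the lock.
 */
static void model_migrate(struct model_pmd *pmd)
{
	assert(pmd != NULL);
	pthread_mutex_lock(&pmd->lock);
	if (pmd->state == MODEL_PMD_HUGE) {
		pmd->state = MODEL_PMD_NONE;		/* transient */
		pmd->state = MODEL_PMD_MIGRATION;
	}
	pthread_mutex_unlock(&pmd->lock);
}

/*
 * Zap side: check the entry only while holding the lock, instead of
 * testing it locklessly and skipping it if it looks empty.
 * Returns true if an entry was torn down.
 */
static bool model_zap(struct model_pmd *pmd)
{
	bool zapped = false;

	assert(pmd != NULL);
	pthread_mutex_lock(&pmd->lock);
	if (pmd->state != MODEL_PMD_NONE) {
		pmd->state = MODEL_PMD_NONE;
		zapped = true;
	}
	pthread_mutex_unlock(&pmd->lock);
	return zapped;
}
```

In this model, a zap that runs after a migration finds the migration entry
and tears it down, rather than skipping the range because it happened to
read the entry while it was cleared.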
--
Best Regards
Yan Zi