Message-Id: <20171212124434.597052315@linuxfoundation.org>
Date: Tue, 12 Dec 2017 13:44:30 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	stable@...r.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Hillf Danton <hillf.zj@...baba-inc.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jack Wang <jinpu.wang@...fitbricks.com>
Subject: [PATCH 4.9 060/148] thp: fix MADV_DONTNEED vs. numa balancing race

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

commit ced108037c2aa542b3ed8b7afd1576064ad1362a upstream.

In the prot_numa case we are under down_read(mmap_sem). It's critical
not to clear the pmd intermittently, to avoid a race with MADV_DONTNEED,
which is also under down_read(mmap_sem):

	CPU0:				CPU1:
				change_huge_pmd(prot_numa=1)
				 pmdp_huge_get_and_clear_notify()
	madvise_dontneed()
	 zap_pmd_range()
	  pmd_trans_huge(*pmd) == 0 (without ptl)
	  // skip the pmd
				 set_pmd_at();
				 // pmd is re-established

The race makes MADV_DONTNEED miss the huge pmd and not clear it,
which may break userspace.

Found by code analysis; never seen triggered.

Link: http://lkml.kernel.org/r/20170302151034.27829-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Cc: Andrea Arcangeli <aarcange@...hat.com>
Cc: Hillf Danton <hillf.zj@...baba-inc.com>
Cc: <stable@...r.kernel.org>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
[jwang: adjust context for 4.9]
Signed-off-by: Jack Wang <jinpu.wang@...fitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
 mm/huge_memory.c |   34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1531,7 +1531,39 @@ int change_huge_pmd(struct vm_area_struc
 		if (prot_numa && pmd_protnone(*pmd))
 			goto unlock;
 
-		entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
+		/*
+		 * In case prot_numa, we are under down_read(mmap_sem). It's critical
+		 * to not clear pmd intermittently to avoid race with MADV_DONTNEED
+		 * which is also under down_read(mmap_sem):
+		 *
+		 *	CPU0:				CPU1:
+		 *				change_huge_pmd(prot_numa=1)
+		 *				 pmdp_huge_get_and_clear_notify()
+		 * madvise_dontneed()
+		 *  zap_pmd_range()
+		 *   pmd_trans_huge(*pmd) == 0 (without ptl)
+		 *   // skip the pmd
+		 *				 set_pmd_at();
+		 *				 // pmd is re-established
+		 *
+		 * The race makes MADV_DONTNEED miss the huge pmd and don't clear it
+		 * which may break userspace.
+		 *
+		 * pmdp_invalidate() is required to make sure we don't miss
+		 * dirty/young flags set by hardware.
+		 */
+		entry = *pmd;
+		pmdp_invalidate(vma, addr, pmd);
+
+		/*
+		 * Recover dirty/young flags.  It relies on pmdp_invalidate to not
+		 * corrupt them.
+		 */
+		if (pmd_dirty(*pmd))
+			entry = pmd_mkdirty(entry);
+		if (pmd_young(*pmd))
+			entry = pmd_mkyoung(entry);
+
 		entry = pmd_modify(entry, newprot);
 		if (preserve_write)
 			entry = pmd_mkwrite(entry);
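
To see the window concretely, below is a minimal userspace sketch of the
pre-fix behaviour. It is an illustration only, not kernel code: every
identifier in it is invented for the example. An atomic long stands in
for the pmd and two threads play the CPUs from the diagram above; the
sleeps only widen the window so the race is easy to observe in one run.

/* build with: cc -pthread -o race race.c */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static atomic_long pmd = 0x1000;	/* "huge pmd" is mapped */

/* CPU1: change_huge_pmd(prot_numa=1), old behaviour */
static void *prot_numa_thread(void *arg)
{
	long entry = atomic_exchange(&pmd, 0); /* pmdp_huge_get_and_clear_notify() */
	usleep(1000);			/* widen the window for the demo */
	entry |= 1;			/* stand-in for pmd_modify(entry, newprot) */
	atomic_store(&pmd, entry);	/* set_pmd_at(): pmd is re-established */
	return NULL;
}

/* CPU0: madvise_dontneed() -> zap_pmd_range() */
static void *zap_thread(void *arg)
{
	if (atomic_load(&pmd) != 0)	/* pmd_trans_huge(*pmd) check, without ptl */
		atomic_store(&pmd, 0);	/* zap it */
	/* else: skip the pmd -- this is the bug */
	return NULL;
}

int main(void)
{
	pthread_t cpu0, cpu1;

	pthread_create(&cpu1, NULL, prot_numa_thread, NULL);
	usleep(100);			/* let CPU1 enter the window first */
	pthread_create(&cpu0, NULL, zap_thread, NULL);
	pthread_join(cpu0, NULL);
	pthread_join(cpu1, NULL);

	long v = atomic_load(&pmd);
	printf("pmd after MADV_DONTNEED: %#lx %s\n", (unsigned long)v,
	       v ? "(survived -- MADV_DONTNEED missed it)" : "(zapped)");
	return 0;
}

With the patch applied the window disappears: pmdp_invalidate() replaces
the entry rather than clearing it, so a concurrent zap_pmd_range() never
observes an empty pmd, and the saved entry is afterwards fixed up with
any dirty/young bits the hardware set in the meantime.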