lists.openwall.net - Open Source and information security mailing list archives
Date: Thu, 14 Jan 2021 14:38:58 +0900
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: madvise(MADV_REMOVE) deadlocks on shmem THP

On (21/01/13 20:31), Hugh Dickins wrote:
> > We are running into lockups during the memory pressure tests on our
> > boards, which essentially NMI panic them. In short, the test case is:
> >
> > - THP shmem
> >     echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> >
> > - And a user-space process doing madvise(MADV_HUGEPAGE) on new mappings,
> >   and madvise(MADV_REMOVE) when it wants to remove the page range
> >
> > The problem boils down to the reverse locking chain:
> >
> > kswapd does
> >
> >     lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)
> >
> > madvise() process does
> >
> >     down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
> >
> >
> >     CPU0                                        CPU1
> >
> >     kswapd                                      vfs_fallocate()
> >      shrink_node()                               shmem_fallocate()
> >       shrink_active_list()                        unmap_mapping_range()
> >        page_referenced() << lock page:PG_locked >>
> >                                                    unmap_mapping_pages() << down_write(mapping->i_mmap_rwsem) >>
> >         rmap_walk_file()                            zap_page_range_single()
> >          down_read(mapping->i_mmap_rwsem) << W-locked on CPU1 >>
> >                                                      unmap_page_range()
> >           rwsem_down_read_failed()                    __split_huge_pmd()
> >            __rwsem_down_read_failed_common()           __lock_page() << PG_locked on CPU0 >>
> >             schedule()                                  wait_on_page_bit_common()
> >                                                          io_schedule()
>
> Very interesting, Sergey: many thanks for this report.

Thanks for the quick feedback.

> There is no doubt that kswapd is right in its lock ordering:
> __split_huge_pmd() is in the wrong to be attempting lock_page().
> Which used not to be done, but was added in 5.8's c444eb564fb1 ("mm:
> thp: make the THP mapcount atomic against __split_huge_pmd_locked()").

Hugh, I forgot to mention, we are facing these issues on 4.19. Let me
check if (maybe) we have cherry-picked c444eb564fb1.

	-ss