Date:   Thu, 14 Jan 2021 12:33:10 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Hugh Dickins <hughd@...gle.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: madvise(MADV_REMOVE) deadlocks on shmem THP

Hi,

We are running into lockups during memory pressure tests on our
boards, which essentially end in an NMI panic. In short, the test case is:

- THP shmem
    echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled

- A user-space process that calls madvise(MADV_HUGEPAGE) on new mappings,
  and madvise(MADV_REMOVE) when it wants to remove a page range
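
For illustration only, the user-space side of such a test could look
roughly like the sketch below. This is not our actual test program: the
names thp_shmem_cycle(), align_up() and HUGE_SZ are made up for this
example, the 2 MiB huge page size is an assumption, and a shared
anonymous mapping is used simply because it is shmem-backed. Running it
alone is not expected to lock anything up; the problem needs concurrent
reclaim under memory pressure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: PMD-sized THP is 2 MiB (e.g. x86-64, arm64 with 4K pages). */
#define HUGE_SZ (2UL << 20)

/* Round len up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t len, size_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/*
 * One map / advise / touch / remove cycle on a shmem-backed mapping.
 * Returns 0 on success, -1 if mmap() or madvise(MADV_REMOVE) fails.
 */
static int thp_shmem_cycle(size_t len)
{
	char *p;
	int rc;

	len = align_up(len, HUGE_SZ);

	/* MAP_SHARED | MAP_ANONYMOUS memory is backed by shmem. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Only a hint; a failure (e.g. no THP support) is ignored here. */
	(void)madvise(p, len, MADV_HUGEPAGE);

	/* Touch the range so that pages are actually allocated. */
	memset(p, 0xa5, len);

	/* Punch a hole in the backing store for the whole range. */
	rc = madvise(p, len, MADV_REMOVE);

	munmap(p, len);
	return rc ? -1 : 0;
}
```

A stress loop would call thp_shmem_cycle() repeatedly from several
processes while something else keeps the system under memory pressure.
Note that mmap() does not guarantee a 2 MiB aligned address, so only
the aligned part of the range can be backed by huge pages.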

The problem boils down to two paths taking the same pair of locks in
opposite order:
	kswapd does

		lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)

	madvise() process does

		down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
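
The same ordering problem can be modelled in user space. The sketch
below is only an analogy, not kernel code: a pthread mutex stands in
for the page lock and a pthread rwlock for i_mmap_rwsem, and all
function names are made up for this example. Each side takes its first
lock, then uses a trylock for the second one so that the program
reports the inversion instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* User-space stand-ins for PG_locked and mapping->i_mmap_rwsem. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_barrier_t sync_point;
static int reclaim_rc, unmap_rc;

/* kswapd-like order: page lock first, then the rwsem for read. */
static void *reclaim_side(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&page_lock);
	pthread_barrier_wait(&sync_point);	/* both first locks are held */
	reclaim_rc = pthread_rwlock_tryrdlock(&i_mmap_rwsem);
	pthread_barrier_wait(&sync_point);	/* both attempts are recorded */
	if (reclaim_rc == 0)
		pthread_rwlock_unlock(&i_mmap_rwsem);
	pthread_mutex_unlock(&page_lock);
	return NULL;
}

/* madvise()-like order: the rwsem for write first, then the page lock. */
static void *unmap_side(void *arg)
{
	(void)arg;
	pthread_rwlock_wrlock(&i_mmap_rwsem);
	pthread_barrier_wait(&sync_point);
	unmap_rc = pthread_mutex_trylock(&page_lock);
	pthread_barrier_wait(&sync_point);
	if (unmap_rc == 0)
		pthread_mutex_unlock(&page_lock);
	pthread_rwlock_unlock(&i_mmap_rwsem);
	return NULL;
}

/* Returns 1 if both sides were blocked, i.e. blocking locks would deadlock. */
static int abba_would_deadlock(void)
{
	pthread_t a, b;
	int rc;

	rc = pthread_barrier_init(&sync_point, NULL, 2);
	assert(rc == 0);
	rc = pthread_create(&a, NULL, reclaim_side, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, unmap_side, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_barrier_destroy(&sync_point);

	return reclaim_rc == EBUSY && unmap_rc == EBUSY;
}
```

With the trylocks replaced by blocking lock calls, both threads would
sleep forever once they have passed the first barrier, which is the
situation the two CPUs end up in.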



CPU0                                                       CPU1

kswapd                                                     vfs_fallocate()
 shrink_node()                                              shmem_fallocate()
  shrink_active_list()                                       unmap_mapping_range()
   page_referenced() << lock page:PG_locked >>                unmap_mapping_pages()  << down_write(mapping->i_mmap_rwsem) >>
    rmap_walk_file()                                           zap_page_range_single()
     down_read(mapping->i_mmap_rwsem) << W-locked on CPU1>>     unmap_page_range()
      rwsem_down_read_failed()                                   __split_huge_pmd()
       __rwsem_down_read_failed_common()                          __lock_page()  << PG_locked on CPU0 >>
        schedule()                                                 wait_on_page_bit_common()
                                                                    io_schedule()

	-ss
