lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <89009285-c75d-0f09-5b08-d133c42a18f9@nokia.com>
Date:   Tue, 5 Apr 2022 12:20:15 +0200
From:   Alexander Sverdlin <alexander.sverdlin@...ia.com>
To:     Nicholas Piggin <npiggin@...il.com>,
        Alexander Duyck <alexanderduyck@...com>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Hugh Dickins <hughd@...gle.com>, Yu Zhao <yuzhao@...gle.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Lee Schermerhorn <lee.schermerhorn@...com>,
        Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>,
        Sasha Levin <sashal@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Cc:     linux-kernel@...r.kernel.org
Subject: mm: swap: locking in release_pages()

Dear mm developers!

After experiencing a crash in release_pages() [1] I'm trying to understand the locking in the release_pages():

No matter if we consider v5.17 or v5.4 (as in my case), they both have similar locking patterns:

v5.17:

if (PageLRU(page)) {
        struct lruvec *prev_lruvec = lruvec;

        lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
                                                        &flags);
        if (prev_lruvec != lruvec)
                lock_batch = 0;

        del_page_from_lru_list(page, lruvec);
        __clear_page_lru_flags(page);
}

v5.4:

if (PageLRU(page)) {
        struct pglist_data *pgdat = page_pgdat(page);

        if (pgdat != locked_pgdat) {
                if (locked_pgdat)
                        spin_unlock_irqrestore(&locked_pgdat->lru_lock,
                                                        flags);
                lock_batch = 0;
                locked_pgdat = pgdat;
                spin_lock_irqsave(&locked_pgdat->lru_lock, flags);
        }

        lruvec = mem_cgroup_page_lruvec(page, locked_pgdat);
        VM_BUG_ON_PAGE(!PageLRU(page), page);
        __ClearPageLRU(page);
        del_page_from_lru_list(page, lruvec, page_off_lru(page));
}

What I don't understand here is, what guarantees us that "if (PageLRU(page))" condition
is still valid after we swap the locks in "if (pgdat != locked_pgdat)" case?
If we check under one lock and VM_BUG_ON_PAGE() under another lock, what actually stops
it from crashing as below or BUG() from time to time?

1. Crash of v5.4.170 on an ARM32 machine:

Unable to handle kernel NULL pointer dereference at virtual address 00000104
pgd = e138149d
[00000104] *pgd=84d2fd003, *pmd=8ffd6f003
Internal error: Oops: a07 [#1] PREEMPT SMP ARM
...
CPU: 1 PID: 6172 Comm: AaSysInfoRColle Tainted: G    B      O      5.4.170-... #1
Hardware name: Keystone
PC is at release_pages+0x194/0x358
LR is at release_pages+0x10c/0x358
pc : [<c03367a4>]    lr : [<c033671c>]    psr: 600c0093
sp : ccec7da8  ip : 00000000  fp : 00000002
r10: 600c0013  r9 : c0e054a0  r8 : 00000001
r7 : 00000000  r6 : ccec7e68  r5 : 00000000  r4 : d435e280
r3 : 00000122  r2 : 00000100  r1 : 00000001  r0 : 600c0013
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5387d  Table: 4d2b2cc0  DAC: 55555555
Process AaSysInfoRColle (pid: 6172, stack limit = 0x36239b43)
Stack: (0xccec7da8 to 0xccec8000)
7da0:                   c0a11970 c0ea3380 00000000 c0e04e48 cb8fbef8 ccec7dbc
7dc0: ccec7dbc c0e04e48 d435e280 ccec7e64 d435e280 00000000 ffffffff 00000000
7de0: 00000000 cb8fbef8 00000000 c0338654 00000001 c0345f0c ccec7e28 d0d0d0d0
7e00: 7fffffff c03af3e0 000000fc fffffe00 00000000 00000000 cb8fbdf8 00000000
7e20: 00000000 2f2f2f2f 00000000 c03ac448 cefdc02d c03af738 cefdc02d fefefeff
7e40: ccec7ea0 00000002 ccec7f60 c03ad6e8 00000010 00000000 00000000 ffffffff
7e60: ccec7ea0 00000101 d435e280 0000000a fffffffe c03ad6f0 00000000 ccec7ea0
7e80: ccec7f60 c03bf24c ccec7ef0 c03b9738 cb8fbdf8 c03bf24c ccec7f68 c03b1eac
7ea0: 62395b18 c0e04e48 20d0cab6 ccec7ee0 cb8fbdf8 ced8e580 ccec7f68 ffffffff
7ec0: ffffffff cb8fbdf8 fffffffe c0346468 ffffffff ffffffff 00000000 c03ee290
7ee0: 00000000 c03d16e8 cb8fbdf8 cb8fbe7c 00000007 00000000 00000000 cbb26580
7f00: c0268e64 ccec7f04 ccec7f04 c0e04e48 cb8fbdf8 c0e04e48 c0a108e4 cb8fbdf8
7f20: cb8fbeb8 c0a108e4 ccec7f68 00000000 ffffff9c c03bf790 c9cced48 00000000
7f40: cefdc000 c03b3950 ccec7f68 ccec7f58 cd2a2540 00000002 00000000 00000000
7f60: c3043a90 cd768440 79377e71 00000047 cefdc02d c03c55dc 0000000a c0e04e48
7f80: b0a9b7e8 b0a9b7e8 b6d695a1 aaa0fd60 0000000a c0201204 ccec6000 0000000a
7fa0: 000002a4 c0201000 b0a9b7e8 b6d695a1 b0a9b7e8 b07056b0 a5c2b128 00000000
7fc0: b0a9b7e8 b6d695a1 aaa0fd60 0000000a a5c2a344 a5c2a348 a5c2a474 000002a4
7fe0: b6e5191c a5c2a2bc b68922d3 b68e15d8 600c0030 b0a9b7e8 00000000 00000000
[<c03367a4>] (release_pages) from [<c0338654>] (__pagevec_release+0x28/0x44)
[<c0338654>] (__pagevec_release) from [<c0345f0c>] (shmem_undo_range+0x358/0x714)
[<c0345f0c>] (shmem_undo_range) from [<c0346468>] (shmem_evict_inode+0x11c/0x2ac)
[<c0346468>] (shmem_evict_inode) from [<c03bf790>] (evict+0xa0/0x1b8)
[<c03bf790>] (evict) from [<c03b3950>] (do_unlinkat+0x1d0/0x2a0)
[<c03b3950>] (do_unlinkat) from [<c0201000>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xccec7fa8 to 0xccec7ff0)
7fa0:                   b0a9b7e8 b6d695a1 b0a9b7e8 b07056b0 a5c2b128 00000000
7fc0: b0a9b7e8 b6d695a1 aaa0fd60 0000000a a5c2a344 a5c2a348 a5c2a474 000002a4
7fe0: b6e5191c a5c2a2bc b68922d3 b68e15d8
Code: e5933000 e3130020 1a000065 e1c420d4 (e5823004)
Kernel panic - not syncing: Fatal exception
 

-- 
Best regards,
Alexander Sverdlin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ