Message-ID: <05ea6668-3dca-23ed-56c8-bbf8079d93cd@arm.com>
Date:   Fri, 30 Jun 2023 11:07:57 +0100
From:   Ryan Roberts <ryan.roberts@....com>
To:     John Hubbard <jhubbard@...dia.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
        Adrian Hunter <adrian.hunter@...el.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Alex Williamson <alex.williamson@...hat.com>,
        Alexander Potapenko <glider@...gle.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andrey Konovalov <andreyknvl@...il.com>,
        Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Christian Brauner <brauner@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Daniel Vetter <daniel@...ll.ch>,
        Dave Airlie <airlied@...il.com>,
        Dimitri Sivanich <dimitri.sivanich@....com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Ian Rogers <irogers@...gle.com>,
        Jason Gunthorpe <jgg@...pe.ca>, Jiri Olsa <jolsa@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Lorenzo Stoakes <lstoakes@...il.com>,
        Mark Rutland <mark.rutland@....com>,
        Matthew Wilcox <willy@...radead.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...nel.org>,
        Muchun Song <muchun.song@...ux.dev>,
        Namhyung Kim <namhyung@...nel.org>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        SeongJae Park <sj@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Uladzislau Rezki <urezki@...il.com>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Yu Zhao <yuzhao@...gle.com>
Subject: Re: [PATCH] mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte
 comparison

On 30/06/2023 02:32, John Hubbard wrote:
> The following crash happens for me when running the -mm selftests
> (below). Specifically, it happens while running the uffd-stress
> subtests:
> 
> kernel BUG at mm/hugetlb.c:7249!
> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109
> Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018
> RIP: 0010:huge_pte_alloc+0x12c/0x1a0
> ...
> Call Trace:
>  <TASK>
>  ? __die_body+0x63/0xb0
>  ? die+0x9f/0xc0
>  ? do_trap+0xab/0x180
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? do_error_trap+0xc6/0x110
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? handle_invalid_op+0x2c/0x40
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? exc_invalid_op+0x33/0x50
>  ? asm_exc_invalid_op+0x16/0x20
>  ? __pfx_put_prev_task_idle+0x10/0x10
>  ? huge_pte_alloc+0x12c/0x1a0
>  hugetlb_fault+0x1a3/0x1120
>  ? finish_task_switch+0xb3/0x2a0
>  ? lock_is_held_type+0xdb/0x150
>  handle_mm_fault+0xb8a/0xd40
>  ? find_vma+0x5d/0xa0
>  do_user_addr_fault+0x257/0x5d0
>  exc_page_fault+0x7b/0x1f0
>  asm_exc_page_fault+0x22/0x30
> 
> That happens because a BUG() statement in huge_pte_alloc() attempts to
> check that a pte, if present, is a hugetlb pte, but it does so in a
> non-lockless-safe manner that leads to a false BUG() report.
> 
> We got here due to a couple of bugs, each of which by itself was not
> quite enough to cause a problem:
> 
> First of all, before commit c33c794828f2 ("mm: ptep_get() conversion"),
> the BUG() statement in huge_pte_alloc() was itself fragile: it relied
> upon compiler behavior to only read the pte once, despite using it twice
> in the same conditional.
> 
> Next, commit c33c794828f2 ("mm: ptep_get() conversion") broke that
> delicate situation, by causing all direct pte reads to be done via
> READ_ONCE(). And so READ_ONCE() got called twice within the same BUG()
> conditional, leading to comparing (potentially, occasionally) different
> versions of the pte, and thus to false BUG() reports.

Thanks for finding and fixing this - sorry for the issue. FWIW, I've re-reviewed
the whole ptep_get conversion patch looking for other instances of this pattern
- I didn't spot any other issues.
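
For anyone following along, here is a minimal user-space sketch of the failure
mode described above. It is not kernel code: fake_pte_t, fake_slot,
fake_ptep_get() and the FAKE_* bits are hypothetical stand-ins, and the
"concurrent update" is simulated by having each read return the next value in a
fixed sequence. The assumed scenario is that the first read sees a present huge
entry and the second read sees a cleared one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
typedef uint64_t fake_pte_t;
#define FAKE_PRESENT ((fake_pte_t)1 << 0)
#define FAKE_HUGE    ((fake_pte_t)1 << 1)

static bool fake_pte_present(fake_pte_t v) { return (v & FAKE_PRESENT) != 0; }
static bool fake_pte_huge(fake_pte_t v)    { return (v & FAKE_HUGE) != 0; }

/*
 * Simulates an entry that another CPU is changing: every read returns the
 * next value in the sequence, and the last value repeats once exhausted.
 */
struct fake_slot {
	const fake_pte_t *values;
	size_t n;
	size_t next;
};

static fake_pte_t fake_ptep_get(struct fake_slot *s)
{
	assert(s->n > 0);
	size_t i = s->next < s->n ? s->next++ : s->n - 1;
	return s->values[i];
}

/* Old pattern: two separate reads inside one condition. */
static bool check_two_reads(struct fake_slot *s)
{
	return fake_pte_present(fake_ptep_get(s)) &&
	       !fake_pte_huge(fake_ptep_get(s));
}

/* Fixed pattern: one snapshot, both tests applied to the same value. */
static bool check_one_snapshot(struct fake_slot *s)
{
	fake_pte_t v = fake_ptep_get(s);

	return fake_pte_present(v) && !fake_pte_huge(v);
}
```

With the sequence { FAKE_PRESENT | FAKE_HUGE, 0 }, check_two_reads() reports a
violation even though neither value is a present non-huge entry, while
check_one_snapshot() does not. A return value of true here corresponds to the
BUG_ON() condition firing.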

> 
> Fix this by taking a single snapshot of the pte before using it in the
> BUG conditional.
> 
> Now, that commit is only partially to blame here, but people doing
> bisections will invariably land there, so this will help them find a fix
> for a real crash. And also, the previous behavior was unlikely to ever
> expose this bug--it was fragile, yet not actually broken.
> 
> So that's why I chose this commit for the Fixes tag, rather than the
> commit that created the original BUG() statement.
> 
> Fixes: c33c794828f2 ("mm: ptep_get() conversion")
> Cc: Adrian Hunter <adrian.hunter@...el.com>
> Cc: Al Viro <viro@...iv.linux.org.uk>
> Cc: Alex Williamson <alex.williamson@...hat.com>
> Cc: Alexander Potapenko <glider@...gle.com>
> Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Andrey Konovalov <andreyknvl@...il.com>
> Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>
> Cc: Christian Brauner <brauner@...nel.org>
> Cc: Christoph Hellwig <hch@...radead.org>
> Cc: Daniel Vetter <daniel@...ll.ch>
> Cc: Dave Airlie <airlied@...il.com>
> Cc: Dimitri Sivanich <dimitri.sivanich@....com>
> Cc: Dmitry Vyukov <dvyukov@...gle.com>
> Cc: Ian Rogers <irogers@...gle.com>
> Cc: Jason Gunthorpe <jgg@...pe.ca>
> Cc: Jiri Olsa <jolsa@...nel.org>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Cc: Lorenzo Stoakes <lstoakes@...il.com>
> Cc: Mark Rutland <mark.rutland@....com>
> Cc: Matthew Wilcox <willy@...radead.org>
> Cc: Miaohe Lin <linmiaohe@...wei.com>
> Cc: Michal Hocko <mhocko@...nel.org>
> Cc: Mike Kravetz <mike.kravetz@...cle.com>
> Cc: Mike Rapoport (IBM) <rppt@...nel.org>
> Cc: Muchun Song <muchun.song@...ux.dev>
> Cc: Namhyung Kim <namhyung@...nel.org>
> Cc: Naoya Horiguchi <naoya.horiguchi@....com>
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
> Cc: Pavel Tatashin <pasha.tatashin@...een.com>
> Cc: Roman Gushchin <roman.gushchin@...ux.dev>
> Cc: Ryan Roberts <ryan.roberts@....com>
> Cc: SeongJae Park <sj@...nel.org>
> Cc: Shakeel Butt <shakeelb@...gle.com>
> Cc: Uladzislau Rezki (Sony) <urezki@...il.com>
> Cc: Vincenzo Frascino <vincenzo.frascino@....com>
> Cc: Yu Zhao <yuzhao@...gle.com>
> Signed-off-by: John Hubbard <jhubbard@...dia.com>
> ---
>  mm/hugetlb.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index bce28cca73a1..73fbeb8f979f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7246,7 +7246,12 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>  				pte = (pte_t *)pmd_alloc(mm, pud, addr);
>  		}
>  	}
> -	BUG_ON(pte && pte_present(ptep_get(pte)) && !pte_huge(ptep_get(pte)));
> +
> +	if (pte) {
> +		pte_t pteval = ptep_get(pte);

Given the PTL is not held here, I think this should technically be
ptep_get_lockless()?

Thanks,
Ryan


> +
> +		BUG_ON(pte_present(pteval) && !pte_huge(pteval));
> +	}
>  
>  	return pte;
>  }
> 
> base-commit: bf1fa6f15553df04f2bdd06190ccd5f388ab0777
