Message-ID: <ZQhkfIqwcuTrKxK+@casper.infradead.org>
Date:   Mon, 18 Sep 2023 15:53:48 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
Cc:     linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        linux-nvme@...ts.infradead.org
Subject: Re: BUG: KCSAN: data-race in folio_batch_move_lru / mpage_read_end_io

On Mon, Sep 18, 2023 at 02:15:05PM +0200, Mirsad Todorovac wrote:
> > This is what I'm currently running with, and it doesn't trigger.
> > I'd expect it to if we were going to hit the KCSAN bug.
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 0c5be12f9336..d22e8798c326 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4439,6 +4439,7 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
> >   	page = __alloc_pages_slowpath(alloc_gfp, order, &ac);
> >   out:
> > +	VM_BUG_ON_PAGE(page && (page->flags & (PAGE_FLAGS_CHECK_AT_PREP &~ (1 << PG_head))), page);
> >   	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
> >   	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
> >   		__free_pages(page, order);
> 
> Hi,
> 
> Caught another instance of this bug involving folio_batch_move_lru. I can't seem to make it
> happen reliably, which fits the nature of a data race, if I have understood the conditions correctly.

Were you running with this patch at the time, or was this actually
vanilla?  The problem is that, if my diagnosis is correct, both of the
tasks mentioned are victims; we have a prematurely freed page.  While
btrfs is clearly a user, it may not be btrfs's fault that the
page was also allocated as an anon page.

I'm trying to gather more data, and running with this patch will give
us more -- because it'll dump the entire struct page instead of just
the page->flags, like KCSAN is currently doing.
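
To make the condition in the quoted VM_BUG_ON_PAGE() line easier to follow, here is a minimal userspace sketch of the same mask test. Everything prefixed with demo_/DEMO_ is hypothetical: the bit positions and the contents of the mask are made up for illustration and are not the kernel's definitions (the real ones live in include/linux/page-flags.h). The sketch only shows the shape of the check: take the set of flags that should be clear on a freshly allocated page, ignore the head bit, and report whether any of the remaining bits is set.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical bit positions, for illustration only. The kernel's
 * enum pageflags uses different values that depend on the config.
 */
enum demo_pageflags {
	DEMO_PG_locked,
	DEMO_PG_lru,
	DEMO_PG_head,
	DEMO_PG_writeback,
	DEMO_PG_dirty,
};

/*
 * Assumed stand-in for PAGE_FLAGS_CHECK_AT_PREP: the flags that are
 * expected to be clear when a page comes out of the allocator.
 */
#define DEMO_FLAGS_CHECK_AT_PREP		\
	((1UL << DEMO_PG_locked) |		\
	 (1UL << DEMO_PG_lru) |			\
	 (1UL << DEMO_PG_head) |		\
	 (1UL << DEMO_PG_writeback) |		\
	 (1UL << DEMO_PG_dirty))

/* Simplified stand-in for struct page; only the flags word is modelled. */
struct demo_page {
	unsigned long flags;
};

/*
 * Mirrors the condition in the patch: true if the page exists and has
 * any "should be clear" flag set, with the head bit excluded.
 */
static bool demo_page_has_stale_flags(const struct demo_page *page)
{
	unsigned long mask = DEMO_FLAGS_CHECK_AT_PREP & ~(1UL << DEMO_PG_head);

	return page != NULL && (page->flags & mask) != 0;
}
```

With these made-up values, a page whose only set bit is DEMO_PG_head passes, a page with DEMO_PG_lru set trips the check, and a NULL page (failed allocation) is skipped, just as the `page &&` guard in the patch skips it.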
