Message-ID: <d7ff01cf-6681-4470-a6d1-d7d1081edef1@alu.unizg.hr>
Date: Tue, 3 Oct 2023 22:12:29 +0200
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Matthew Wilcox <willy@...radead.org>
Cc: linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
linux-nvme@...ts.infradead.org
Subject: Re: BUG: KCSAN: data-race in folio_batch_move_lru / mpage_read_end_io

On 9/18/23 16:53, Matthew Wilcox wrote:
> On Mon, Sep 18, 2023 at 02:15:05PM +0200, Mirsad Todorovac wrote:
>>> This is what I'm currently running with, and it doesn't trigger.
>>> I'd expect it to if we were going to hit the KCSAN bug.
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 0c5be12f9336..d22e8798c326 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -4439,6 +4439,7 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
>>> page = __alloc_pages_slowpath(alloc_gfp, order, &ac);
>>> out:
>>> + VM_BUG_ON_PAGE(page && (page->flags & (PAGE_FLAGS_CHECK_AT_PREP &~ (1 << PG_head))), page);
>>> if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
>>> unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
>>> __free_pages(page, order);
>>
>> Hi,
>>
>> Caught another instance of this bug involving folio_batch_move_lru. I can't seem to make it
>> happen reliably, given the nature of data races, if I understood them well.
>
> Were you running with this patch at the time, or was this actually
> vanilla? The problem is that, if my diagnosis is correct, both of the
> tasks mentioned are victims; we have a prematurely freed page. While
> btrfs is clearly a user, it may not be btrfs's fault that the
> page was also allocated as an anon page.
>
> I'm trying to gather more data, and running with this patch will give
> us more -- because it'll dump the entire struct page instead of just
> the page->flags, like KCSAN is currently doing.

As I climb the learning curve, I am becoming more aware of what you are talking about.
I still have to learn to handle patches, diffs, fixes and pulls together and
consistently.

Sometimes I feel like I am in the Borg maturation chamber when I try to learn the Linux kernel,
and I wonder if this is the Author of my story trying to make up "for the years that the locust
had eaten". Or is it that I am just losing the plot?

I have learned that I was conceited and did not respect the work you guys have done over the
thirty years that I wasted for one reason or another: objective difficulties and personal weaknesses.

Forgive me this moment of truth.

I certainly feel more motivated to catch the real culprit, rather than just the symptoms.

I will rebuild with your patch again and try to reproduce the problem.

Best regards,
Mirsad Todorovac