linux-kernel - Re: [PATCH 1/1] mm/page_alloc: Leave IRQs enabled for per-cpu page allocations


Message-ID: <20221011082530.p2fk44dhglxulsou@techsingularity.net>
Date:   Tue, 11 Oct 2022 09:25:30 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Yu Zhao <yuzhao@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Nicolas Saenz Julienne <nsaenzju@...hat.com>,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Michal Hocko <mhocko@...nel.org>,
        Hugh Dickins <hughd@...gle.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH 1/1] mm/page_alloc: Leave IRQs enabled for per-cpu page
 allocations

On Mon, Oct 10, 2022 at 10:45:43PM +0200, Vlastimil Babka wrote:
> On 10/10/22 16:22, Mel Gorman wrote:
> > On Wed, Aug 24, 2022 at 10:58:26PM -0600, Yu Zhao wrote:
> > > On Wed, Aug 24, 2022 at 8:18 AM Mel Gorman <mgorman@...hsingularity.net> wrote:
> > > > 
> > > > The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
> > > > allocating from the PCP must not re-enter the allocator from IRQ context.
> > > > In each instance where IRQ-reentrancy is possible, the lock is acquired using
> > > > pcp_spin_trylock_irqsave() even though IRQs are disabled and re-entrancy
> > > > is impossible.
> > > > 
> > > > Demoting the lock to pcp_spin_lock avoids an IRQ disable/enable in the common
> > > > case at the cost of some IRQ allocations taking a slower path. If the PCP
> > > > lists need to be refilled, the zone lock still needs to disable IRQs but
> > > > that will only happen on PCP refill and drain. If an IRQ is raised when
> > > > a PCP allocation is in progress, the trylock will fail and fall back to
> > > > using the buddy lists directly. Note that this may not be a universal win
> > > > if an interrupt-intensive workload also allocates heavily from interrupt
> > > > context and contends heavily on the zone->lock as a result.
> > > 
> > > Hi,
> > > 
> > > This patch caused the following warning. Please take a look.
> > > 
> > > Thanks.
> > > 
> > >    WARNING: inconsistent lock state
> > >    6.0.0-dbg-DEV #1 Tainted: G S      W  O
> > >    --------------------------------
> > 
> > I finally found time to take a closer look at this and I cannot reproduce
> > it against 6.0. What workload triggered the warning, on what platform and
> > can you post the kernel config used please? It would also help if you
> > can remember what git commit the patch was tested upon.
> > 
> > Thanks and sorry for the long delay.
> 
> I didn't (try to) reproduce this, but FWIW the report looked legit to me, as
> after the patch, pcp_spin_trylock() has to be used for both allocation and
> freeing to be IRQ safe. free_unref_page() uses it, so it's fine. But as the
> stack trace in the report shows, free_unref_page_list() does pcp_spin_lock()
> and not _trylock, and that's IMHO the problem.
> 

I completely agree. It was a surprise to me that IO completion would
happen in soft IRQ context, even though blk_done_softirq indicates that
this is normal, and I didn't manage to trigger that case myself. I wondered
if there was an easy way to force it, which would have made testing
easier. I can live without the reproduction case; I'll cc Yu Zhao after
6.1-rc1 comes out and I've fixed this.
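
For illustration only, the rule Vlastimil describes (once IRQs stay
enabled, every path that can run in IRQ context must use a trylock and
fall back rather than spin) can be modelled in user space. This is a rough
sketch, not the kernel implementation: struct pcp_model and all model_*
helpers are hypothetical names, the "interrupt" is simulated by a nested
call, and the fallback merely counts pages where the kernel would take
zone->lock with IRQs disabled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a per-cpu page list; not kernel code. */
struct pcp_model {
	atomic_flag lock;	/* models the PCP spinlock, taken with IRQs enabled */
	int pcp_pages;		/* pages freed through the PCP fast path */
	int zone_pages;		/* pages freed through the fallback path */
};

static void model_init(struct pcp_model *p)
{
	atomic_flag_clear(&p->lock);
	p->pcp_pages = 0;
	p->zone_pages = 0;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(struct pcp_model *p)
{
	return !atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire);
}

static void model_unlock(struct pcp_model *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Free one page: use the PCP list if the lock is free, else fall back. */
static void model_free_page(struct pcp_model *p)
{
	if (model_trylock(p)) {
		p->pcp_pages++;
		model_unlock(p);
	} else {
		/*
		 * The lock holder was interrupted on this CPU, so spinning here
		 * would never succeed. Assumption: the real fallback frees to
		 * the buddy lists under zone->lock with IRQs disabled.
		 */
		p->zone_pages++;
	}
}

/* Models a free issued from (soft)IRQ context while task context holds the lock. */
static void model_free_during_interrupt(struct pcp_model *p)
{
	bool locked = model_trylock(p);	/* task context takes the lock */

	assert(locked);
	model_free_page(p);		/* simulated interrupt frees a page */
	model_unlock(p);
}
```

Calling model_free_page() on an idle model takes the fast path, while
model_free_during_interrupt() ends up on the fallback path. A variant that
spun on the lock instead of using the trylock would hang in the second
case, which is the situation the lockdep warning points at for
free_unref_page_list().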

-- 
Mel Gorman
SUSE Labs