Message-ID: <aWsa9SdO9kLJTfz4@snowbird>
Date: Fri, 16 Jan 2026 21:15:33 -0800
From: Dennis Zhou <dennis@...nel.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH v2] percpu: add basic double free check

On Fri, Jan 16, 2026 at 07:15:48PM -0800, Andrew Morton wrote:
> On Thu, 15 Jan 2026 18:32:16 -0800 Dennis Zhou <dennis@...nel.org> wrote:
>
> > This adds a basic double free check by validating the first bit of the
> > allocation in alloc_map and bound_map are set. If the alloc_map bit is
> > not set, then this means the area is currently unallocated. If the
> > bound_map bit is not set, then we are not freeing from the beginning of
> > the allocation.
> >
> > This is a respin of [1] adding the requested changes from me and
> > Christoph.
> >
> > ...
> >
> > @@ -1276,18 +1277,24 @@ static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
> >  static int pcpu_free_area(struct pcpu_chunk *chunk, int off)
> >  {
> >  	struct pcpu_block_md *chunk_md = &chunk->chunk_md;
> > +	int region_bits = pcpu_chunk_map_bits(chunk);
> >  	int bit_off, bits, end, oslot, freed;
> >
> >  	lockdep_assert_held(&pcpu_lock);
> > -	pcpu_stats_area_dealloc(chunk);
> >
> >  	oslot = pcpu_chunk_slot(chunk);
> >
> >  	bit_off = off / PCPU_MIN_ALLOC_SIZE;
> > +	if (unlikely(bit_off < 0 || bit_off >= region_bits))
> > +		return 0;
>
> This (which looks sensible) wasn't changelogged?
>
Sorry, that's my fault. I can respin and add it to the changelog if
you'd like.
> > @@ -2242,6 +2252,13 @@ void free_percpu(void __percpu *ptr)
> >
> >  	spin_lock_irqsave(&pcpu_lock, flags);
> >  	size = pcpu_free_area(chunk, off);
> > +	if (size == 0) {
> > +		spin_unlock_irqrestore(&pcpu_lock, flags);
> > +
> > +		if (__ratelimit(&_rs))
> > +			WARN(1, "percpu double free or bad ptr\n");
>
> Is ratelimiting really needed? A WARN_ON_ONCE is enough to tell people
> that this kernel is wrecked?
>
I can see running multiple tests where repeated warnings would give me
additional debug signal about how badly I screwed up. In production, a
WARN_ON_ONCE is definitely more than enough, but we might as well offer
the chance to trigger it more than once.
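
To make the once-versus-ratelimited trade-off concrete, here is a
minimal userspace sketch. It is not the kernel's __ratelimit() or
WARN_ON_ONCE() implementation; struct model_ratelimit, its fields, and
both helpers are hypothetical names made up for illustration, and time
is passed in as a plain tick counter.

```c
#include <stdbool.h>

/* Hypothetical model: allow at most `burst` warnings per `interval` ticks. */
struct model_ratelimit {
	long interval;      /* window length in ticks (assumed unit) */
	int burst;          /* warnings allowed per window */
	long window_start;  /* tick at which the current window began */
	int printed;        /* warnings already allowed in this window */
};

/* Returns true if a warning at tick `now` should be emitted. */
static bool model_ratelimit_ok(struct model_ratelimit *rs, long now)
{
	if (now - rs->window_start >= rs->interval) {
		/* Window expired: start a new one and reset the budget. */
		rs->window_start = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Returns true only on the first call for a given flag. */
static bool model_once_ok(bool *warned)
{
	if (*warned)
		return false;
	*warned = true;
	return true;
}
```

With the once-style helper, a second bad free in the same boot is
silent. With the ratelimited helper, it can warn again once the window
has rolled over, which is the extra signal mentioned above.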
> > +		return;
> > +	}
>
> The patch does appear to do that which it set out to do. But do we
> want to do it? Is there a history of callers double-freeing percpu
> memory? Was there some bug which would have been more rapidly and
> easily solved had this change been in place?
>
Originally, Sebastian reported in [1] (linked in the patch) that he ran
into a double free. Maybe he can elaborate on how that bug was
introduced.

As for whether we want to do it: I think it doesn't hurt, and it makes
it more explicit that something very wrong occurred. Percpu memory
really expects its users to be well behaved. If you do happen to
accidentally double free without the warning, in a contrived case you
could experience unexplained behavior for some time before crashing in
a spot that would leave you scratching your head. If anything, I think
there could be an argument to fail louder.
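
For readers following along, the check described in the changelog can
be modelled in userspace roughly as below. This is a sketch under
stated assumptions, not the patch itself: struct model_chunk, the
MODEL_* constants, and the helper names are hypothetical, the maps are
plain bool arrays rather than kernel bitmaps, and the caller chooses
the allocation offset instead of the real area search.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_MIN_ALLOC_SIZE 4   /* bytes per map bit; assumed for this sketch */
#define MODEL_MAP_BITS       64  /* map bits per chunk; assumed for this sketch */

/* Hypothetical stand-in for the chunk's two maps, one bool per bit. */
struct model_chunk {
	bool alloc_map[MODEL_MAP_BITS];      /* set for every allocated bit */
	bool bound_map[MODEL_MAP_BITS + 1];  /* set at each allocation start and end */
};

static void model_init(struct model_chunk *c)
{
	memset(c, 0, sizeof(*c));
	c->bound_map[0] = true;
	c->bound_map[MODEL_MAP_BITS] = true;
}

/* Mark [bit_off, bit_off + bits) allocated; the caller picks a free range. */
static void model_alloc(struct model_chunk *c, int bit_off, int bits)
{
	for (int i = 0; i < bits; i++) {
		c->alloc_map[bit_off + i] = true;
		c->bound_map[bit_off + i] = (i == 0);
	}
	c->bound_map[bit_off + bits] = true;
}

/* Returns bytes freed, or 0 for a bad offset, a double free, or a
 * pointer into the middle of an allocation. */
static int model_free(struct model_chunk *c, int off)
{
	int bit_off = off / MODEL_MIN_ALLOC_SIZE;
	int end;

	/* Range check: the offset must fall inside the chunk's map. */
	if (off < 0 || bit_off >= MODEL_MAP_BITS)
		return 0;
	/* alloc_map clear: area is not allocated (double free).
	 * bound_map clear: not the first bit of an allocation. */
	if (!c->alloc_map[bit_off] || !c->bound_map[bit_off])
		return 0;

	/* The allocation runs up to the next boundary bit. */
	end = bit_off + 1;
	while (end < MODEL_MAP_BITS && !c->bound_map[end])
		end++;
	for (int i = bit_off; i < end; i++)
		c->alloc_map[i] = false;
	return (end - bit_off) * MODEL_MIN_ALLOC_SIZE;
}
```

In this model the boundary bits are left in place on free, so a second
free of the same offset is caught by the alloc_map test, while a free
from the middle of a live allocation is caught by the bound_map test.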

[1] https://lore.kernel.org/linux-mm/20250904143514.Yk6Ap-jy@linutronix.de/

Thanks,
Dennis