linux-kernel - Re: [patch 02/20] posix-timers: Ensure timer ID search-loop limit is valid

Message-ID: <ZFoVg9UmItpIaA69@lothringen>
Date:   Tue, 9 May 2023 11:42:27 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        syzbot+5c54bd3eb218bb595aa9@...kaller.appspotmail.com,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Sebastian Siewior <bigeasy@...utronix.de>,
        Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [patch 02/20] posix-timers: Ensure timer ID search-loop limit is
 valid

On Sat, May 06, 2023 at 01:36:22AM +0200, Thomas Gleixner wrote:
> On Sat, May 06 2023 at 00:58, Thomas Gleixner wrote:
> > On Fri, May 05 2023 at 16:50, Frederic Weisbecker wrote:
> > So the whole thing works like this:
> >
> >    start = READ_LOCKLESS(sig->next_id);
> >
> >    // Enforce that id and start are different to not terminate right away
> >    id = ~start;
> >
> > loop:
> >    if (id == start)
> >    	goto fail;
> >    lock()
> >         id = sig->next_id;                      <-- stable readout
> >         sig->next_id = (id + 1) & INT_MAX;      <-- prevent going negative
> >
> >         if (unused_id(id)) {
> >            add_timer_to_hash(timer, id);
> >            unlock();
> >            return id;
> >         }
> >    id++;
> >    unlock();
> >    goto loop;
> >
> > As the initial lockless readout is guaranteed to be in the positive
> > space, how is that supposed to be looping forever?
> 
> Unless you think about the theoretical case of an unlimited number of
> threads sharing the signal_struct which all concurrently try to allocate
> a timer id and then releasing it immediately again (to avoid resource
> limit exhaustion). Theoretically possible, but is this a real concern
> with a timer ID space of 2G?

I didn't go that far actually, it was just me misunderstanding that loop and
especially the (id = ~start) part. Now I get it.

I guess the for statement can just be:

for (; start != id; id++)
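
For illustration only, here is a minimal single-threaded userspace sketch of
the bounded wrap-around search discussed above. It is not the kernel code:
the names id_space, alloc_id, release_id and ID_MASK are hypothetical, the ID
space is shrunk to 8 entries so exhaustion is easy to reach, a plain bool
array stands in for the hash table lookup, and there is no locking. The loop
stops once next_id has wrapped back to where the search started, so a full ID
space results in a failure instead of an endless loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy ID space: 8 IDs instead of the 2G (INT_MAX) in the mail. */
#define ID_MASK 0x7u

struct id_space {
	unsigned int next_id;		/* plays the role of sig->next_id */
	bool used[ID_MASK + 1];		/* stands in for the hash table lookup */
};

/* Returns a free ID, or -1 once every ID has been tried exactly once. */
static int alloc_id(struct id_space *s)
{
	const unsigned int start = s->next_id;

	do {
		unsigned int id = s->next_id;

		/* Advance and wrap so the value never leaves the valid range. */
		s->next_id = (id + 1) & ID_MASK;

		if (!s->used[id]) {
			s->used[id] = true;
			return (int)id;
		}
	} while (s->next_id != start);

	return -1;	/* one full pass without finding a free ID */
}

/* Marks an ID as free again so a later search can hand it out. */
static void release_id(struct id_space *s, unsigned int id)
{
	assert(id <= ID_MASK);
	s->used[id] = false;
}
```

Starting from a fresh space with next_id = 6, successive calls hand out
6, 7, 0, 1, ... and the call after the eighth allocation returns -1.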

> 
> I'm sure that it's incredibly hard to exploit this, but what's really
> bothering me is the hash table itself. The only reason why we have that
> is CRIU.
> 
> The only alternative solution I could come up with is a partitioned
> xarray where the index space would be segmented for each TGID, i.e.
> 
>        segment.start = TGID * MAX_TIMERS_PER_PROCESS
>        segment.end    = segment.start + MAX_TIMERS_PER_PROCESS - 1
> 
> where MAX_TIMERS_PER_PROCESS could be a copious 2^16 which would work for
> both 32bit and 64bit TID limits.
> 
> That would avoid the hash table lookups and the related issues, but OTOH
> it would require allocating one extra page per TGID even if the application
> uses only a single posix timer.
> 
> Not sure whether that's worth it though.

Not sure either...
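
For reference, the segment arithmetic from the quoted proposal could be
sketched as below. This is only an illustration of the index computation,
not of an xarray: timer_segment, segment_for_tgid and timer_index are
hypothetical names, MAX_TIMERS_PER_PROCESS is taken as the 2^16 suggested
above, and 64-bit arithmetic is assumed so the multiplication cannot
overflow in this userspace example.

```c
#include <assert.h>
#include <stdint.h>

/* 2^16 timers per process, as floated in the proposal. */
#define MAX_TIMERS_PER_PROCESS (UINT64_C(1) << 16)

/* Inclusive index range reserved for one thread group (hypothetical type). */
struct timer_segment {
	uint64_t start;
	uint64_t end;
};

/* segment.start = TGID * MAX_TIMERS_PER_PROCESS, end is the last slot. */
static struct timer_segment segment_for_tgid(uint32_t tgid)
{
	struct timer_segment seg;

	seg.start = (uint64_t)tgid * MAX_TIMERS_PER_PROCESS;
	seg.end = seg.start + MAX_TIMERS_PER_PROCESS - 1;
	return seg;
}

/* Global index of a per-process timer ID inside that segment. */
static uint64_t timer_index(uint32_t tgid, uint32_t timer_id)
{
	assert(timer_id < MAX_TIMERS_PER_PROCESS);
	return segment_for_tgid(tgid).start + timer_id;
}
```

With these definitions TGID 1 owns indices 65536 through 131071, and timer 5
of that process maps to index 65541.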

Thanks.