Date:   Mon, 6 Mar 2023 12:57:11 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Jakub Kicinski <kuba@...nel.org>, peterz@...radead.org,
        jstultz@...gle.com, edumazet@...gle.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        "Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH 2/3] softirq: avoid spurious stalls due to need_resched()

On Sun, Mar 05, 2023 at 09:43:23PM +0100, Thomas Gleixner wrote:
> That said, I have no brilliant solution for that off the top of my head,
> but I'm not comfortable with applying more adhoc solutions which are
> contrary to the efforts of e.g. the audio folks.
> 
> I have some vague ideas how to approach that, but I'm traveling all of
> next week, so I neither will be reading much email, nor will I have time
> to think deeply about softirqs. I'll resume when I'm back.

IIUC, the problem is that some (rare?) softirq vector callbacks rely on the
fact that they cannot be interrupted by other local vectors, and they use
that to protect against concurrent per-CPU state access, right?

And there is no automatic way to detect those cases, otherwise we would have
fixed them all with spinlocks already.

So I fear the only (in-)sane idea I can think of is to do it the same way
we did with the BKL. Some sort of pushdown: vector callbacks known to have
no such subtle interactions can re-enable softirqs.

For example, known-safe timers (either because they have no such interactions
or because they handle them correctly via spinlocks) can carry a
TIMER_SOFTIRQ_SAFE flag to say so. RCU callbacks could carry something similar.
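
To make the idea a bit more concrete, here is a rough userspace model of it,
not kernel code. TIMER_SOFTIRQ_SAFE does not exist today, and struct
model_timer, softirqs_enabled, count_context() and run_expired_timers() are
all made-up names for illustration. The only point is the opt-in: callbacks
without the flag keep today's behaviour, flagged ones run with other vectors
allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag from the proposal above; not an existing kernel flag. */
#define TIMER_SOFTIRQ_SAFE 0x1u

/* Simplified stand-in for a timer; not the kernel's struct timer_list. */
struct model_timer {
	unsigned int flags;
	void (*fn)(struct model_timer *t);
};

/* Models whether other local softirq vectors may run on this CPU. */
static bool softirqs_enabled = true;

static unsigned int ran_with_softirqs_enabled;
static unsigned int ran_with_softirqs_disabled;

/* Example callback: records the context it was invoked in. */
static void count_context(struct model_timer *t)
{
	(void)t;
	if (softirqs_enabled)
		ran_with_softirqs_enabled++;
	else
		ran_with_softirqs_disabled++;
}

/*
 * Pushdown: the vector still starts with softirqs disabled, and only
 * callbacks that opted in via TIMER_SOFTIRQ_SAFE run with them re-enabled.
 */
static void run_expired_timers(struct model_timer *timers, size_t n)
{
	softirqs_enabled = false;		/* entering softirq processing */

	for (size_t i = 0; i < n; i++) {
		assert(timers[i].fn != NULL);

		if (timers[i].flags & TIMER_SOFTIRQ_SAFE)
			softirqs_enabled = true;	/* known safe: allow other vectors */

		timers[i].fn(&timers[i]);

		softirqs_enabled = false;	/* back to the conservative default */
	}

	softirqs_enabled = true;		/* leaving softirq processing */
}
```

Converting a callback is then a one-line change (setting the flag) once it has
been audited, which is what makes the approach iterative.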

Of course this is going to be a tremendous amount of work, but it has the
advantage of being iterative and it will pay off in the long run. Also, I'm
confident that the hottest places will be handled quickly. And most of them
are likely to be in core networking code.

Because I fear no hack will ever fix that otherwise, and we have tried a lot.

Thanks.
