lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 25 Jan 2024 11:51:28 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mikulas Patocka <mpatocka@...hat.com>, Tejun Heo <tj@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org, 
	dm-devel@...ts.linux.dev, Mike Snitzer <msnitzer@...hat.com>, 
	Ignat Korchagin <ignat@...udflare.com>, Damien Le Moal <damien.lemoal@....com>, 
	Bob Liu <bob.liu@...cle.com>, Hou Tao <houtao1@...wei.com>, 
	Nathan Huckleberry <nhuck@...gle.com>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH] softirq: fix memory corruption when freeing tasklet_struct

On Thu, 25 Jan 2024 at 10:30, Mikulas Patocka <mpatocka@...hat.com> wrote:
>
> There's a problem with the tasklet API - there is no reliable way how to
> free a structure that contains tasklet_struct. The problem is that the
> function tasklet_action_common calls task_unlock(t) after it called the
> callback. If the callback does something that frees tasklet_struct,
> task_unlock(t) would write into free memory.

Ugh.

I see what you're doing, but I have to say, I dislike this patch
immensely. It feels like a serious misdesign that is then papered over
with a hack.

I'd much rather see us trying to move away from tasklets entirely in
cases like this. Just say "you cannot do that".

In fact, of the two cases that want this new functionality, at least
dm-verity already makes tasklets a conditional feature that isn't even
enabled by default, and that was only introduced in the last couple of
years.

So I think dm-verity would be better off just removing tasklet use,
and we should check whether there are better models for handling the
latency issue.

The dm-crypt.c case looks different, but similar. I'm not sure why it
doesn't just use the workqueue for the "in interrupt" case. Like
dm-verity, it already does have a workqueue option, and it's a
setup-time option to say "don't use the workqueue for reads / writes".
But it feels like the code should just say "tough luck, in interrupt
context we *will* use workqueues".

So honestly, both of the cases you bring up seem to be just BUGGY. The
fix is not to extend tasklets to a new thing, the fix is to say "those
two uses of tasklets were broken, and should go away".

End result: I would suggest:

 - just get rid of the actively buggy use of tasklets. It's not
necessary in either case.

 - look at introducing a "low-latency atomic workqueue" that looks
*exactly* like a regular workqueue, but has the rule that it's per-cpu
and functions on it cannot sleep

because I think one common issue with workqueues - which are better
designed than tasklets - is that scheduling latency.

I think if we introduced a workqueue that worked more like a tasklet -
in that it's run in softirq context - but doesn't have the interface
mistakes of tasklets, a number of existing workqueue users might
decide that that is exactly what they want.

So we could have a per-cpu 'atomic_wq' that things can be scheduled
on, and that runs from softirqs just like tasklets, and shares the
workqueue queueing infrastructure but doesn't use the workqueue
threads.

Yes, the traditional use of workqueues is to be able to sleep and do
things in process context, so that sounds a bit odd, but let's face
it, we

 (a) already have multiple classes of workqueues

 (b) avoiding deep - and possibly recursive - stack depths is another
reason people use workqueues

 (c) avoiding interrupt context is a real concern, even if you don't
want to sleep

and I really *really* would like to get rid of tasklets entirely.

They started as this very specific hardcoded softirq thing used by
some drivers, and then the notion was generalized.

And I think it was generalized badly, as shown by this example.

I have added Tejun to the cc, so that he can throw his hands up in
horror and say "Linus, you're crazy, your drug-fueled idea would be
horrid because of Xyz".

But *maybe* Tejun has been taking the same drugs I have, and goes
"yeah, that would fit well".

Tejun? Please tell me I'm not on some bad crack..

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ