Date:   Tue, 29 Mar 2022 10:33:19 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Tejun Heo <tj@...nel.org>
Cc:     Marc Kleine-Budde <mkl@...gutronix.de>,
        Peter Hurley <peter@...leysoftware.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Esben Haabendal <esben@...nix.com>,
        Steven Walter <stevenrwalter@...il.com>,
        linux-kernel@...r.kernel.org,
        Oleksij Rempel <o.rempel@...gutronix.de>,
        Pengutronix Kernel Team <kernel@...gutronix.de>,
        André Pribil <a.pribil@...k-ipc.com>,
        Jiri Slaby <jirislaby@...nel.org>,
        linux-rt-users@...r.kernel.org
Subject: Re: [RFC PATCH 0/2] RT scheduling policies for workqueues

On 29/03/2022 08.30, Sebastian Andrzej Siewior wrote:
> On 2022-03-28 07:39:25 [-1000], Tejun Heo wrote:
>> Hello,
> Hi,
> 
>> I wonder whether it'd be useful to provide a set of wrappers which can make
>> switching between workqueue and kworker easy. Semantics-wise, they're
>> already mostly aligned and it shouldn't be too difficult to e.g. make an
>> unbound workqueue be backed by a dedicated kthread_worker instead of the
>> shared pool depending on a flag, or even allow switching dynamically.

Well, that would certainly not make it any easier for userspace to
discover the thread it needs to chrt().

> This could work. For the tty layer it could use the 'lowlatency' attribute
> to decide which implementation makes sense.

I have patches that only touch the tty layer, but tying the choice to the
lowlatency attribute is quite painful (this has also come up in previous
discussions): the lowlatency flag can be flipped from userspace at any
time, and synchronizing which variant is in use and switching dynamically
is at least beyond my skills to make work robustly. So in my patches, the
choice is made at open() time. However, I'm still not convinced code like
I'm still not convinced code like

 struct tty_bufhead {
        struct tty_buffer *head;        /* Queue head */
        struct work_struct work;
+       struct kthread_work kwork;
+       struct kthread_worker *kworker;
        ...
 };

 bool tty_buffer_restart_work(struct tty_port *port)
 {
-       return queue_work(system_unbound_wq, &port->buf.work);
+       struct tty_bufhead *buf = &port->buf;
+
+       if (buf->kworker)
+               return kthread_queue_work(buf->kworker, &buf->kwork);
+       else
+               return queue_work(system_unbound_wq, &buf->work);
 }

etc. is the way to go.
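For what it's worth, the open()-time selection would amount to something
like the sketch below. This is only an illustration: the function name,
the flush_to_ldisc_kwork() callback and the error handling are all
invented here, and a real version would also have to tear the worker down
on close.

  /*
   * Hypothetical sketch: create a dedicated kthread_worker at open() time
   * when the port is marked low_latency, otherwise keep queueing on the
   * shared system_unbound_wq.
   */
  static int tty_buffer_alloc_kworker(struct tty_port *port)
  {
        struct tty_bufhead *buf = &port->buf;

        if (!port->low_latency)
                return 0;       /* stay on system_unbound_wq */

        buf->kworker = kthread_create_worker(0, "ttybuf/%s",
                                             port->tty->name);
        if (IS_ERR(buf->kworker)) {
                int err = PTR_ERR(buf->kworker);

                buf->kworker = NULL;
                return err;
        }
        kthread_init_work(&buf->kwork, flush_to_ldisc_kwork);
        return 0;
  }

The upside is that the dedicated thread then exists from open() onwards,
so whoever configures the system has a stable target for chrt().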

===

Here's another idea: in an ideal world, the irq thread itself [people who
care about latency use threaded interrupts] could just do the work
immediately - then the admin has only one kernel thread to configure
properly. However, as Sebastian pointed out, doing that leads to a lockdep
splat [1], and it also means that there's no work item involved, so some
other thread calling tty_buffer_flush_work() might not actually wait for a
concurrent flush_to_ldisc() to finish. So could we create a
struct hybrid_work { } which, when enqueued, does something like

  static bool current_is_irqthread(void)
  {
        return in_task() && kthread_func(current) == irq_thread;
  }

  static bool hwork_queue(struct hybrid_work *hwork,
                          struct workqueue_struct *wq)
  {
        if (current_is_irqthread())
                /* No notification needed; the irq thread would poll its
                 * task_works in its main loop. */
                return !task_work_add(current, &hwork->twork, TWA_NONE);

        return queue_work(wq, &hwork->work);
  }

(with extra bookkeeping so that _flush and _cancel_sync methods can also
be created). It would require the irq thread to learn to run its queued
task_works in its main loop, which in turn would require finding some
other way to do the irq_thread_dtor() cleanup, but that should be doable.
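To make the bookkeeping a bit more concrete, the object could embed both
backends and dispatch to a single user callback via trampolines - a rough
sketch, with all names invented:

  struct hybrid_work {
        struct work_struct      work;   /* backend 1: regular workqueue */
        struct callback_head    twork;  /* backend 2: task_work on the
                                           irq thread */
        void (*func)(struct hybrid_work *hwork);
  };

  /* Trampoline used when the work ran through the workqueue backend. */
  static void hwork_wq_fn(struct work_struct *work)
  {
        struct hybrid_work *hwork =
                container_of(work, struct hybrid_work, work);

        hwork->func(hwork);
  }

  /* Trampoline used when the work ran as a task_work on the irq thread. */
  static void hwork_twork_fn(struct callback_head *twork)
  {
        struct hybrid_work *hwork =
                container_of(twork, struct hybrid_work, twork);

        hwork->func(hwork);
  }

A _flush implementation would additionally have to record, at queue time,
which backend the work went to, so that it knows whether to flush the
work item or the task_work.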

While the implementation of hybrid_work might be a bit complex, I think
this would have potential for being used in other situations, and for the
users, the API would be as simple as the current workqueue/kthread_work
APIs. By letting the irq thread do more (or all) of the work, we'd
probably also win some latency due to fewer threads being involved and
better cache locality. And the admin/BSP is already setting the RT
priorities of the [irq/...] threads.

Rasmus

[1]
https://lore.kernel.org/linux-rt-users/20180711080957.f6txdmzrrrrdm7ig@linutronix.de/
