linux-kernel - Re: wg-crypt-wg0 process

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHmME9oORiNMN6EyAWOy3Fxzo33etkQxQtQbKdPqw_fWdw9wtw@mail.gmail.com>
Date:   Wed, 30 Dec 2020 13:39:28 +0100
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Fatih USTA <fatihusta86@...il.com>
Cc:     WireGuard mailing list <wireguard@...ts.zx2c4.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: wg-crypt-wg0 process

Hi Fatih,

Thanks for the report and the detailed test case. From what I can see,
this behavior presents itself both with the explicit ip link del and
without. When running with debugging enabled, I can see this in dmesg:

[558758.361056] wireguard: wg0: Keypair 244 destroyed for peer 21
[558758.546649] wireguard: wg0: Peer 21 (10.150.150.2:51820) destroyed
[558758.563317] wireguard: wg0: Interface destroyed
[558758.567803] wireguard: wg0: Keypair 243 destroyed for peer 22
[558758.733287] wireguard: wg0: Peer 22 (10.150.150.1:51820) destroyed
[558758.749991] wireguard: wg0: Interface destroyed

The fact that I see "Interface destroyed" for both interfaces means
that wg_destruct() is being called, which includes these calls:

        destroy_workqueue(wg->handshake_receive_wq);
        destroy_workqueue(wg->handshake_send_wq);
        destroy_workqueue(wg->packet_crypt_wq);

In doing so, the output of ps changes from:

$ ps aux|grep wg0
root      200479  0.0  0.0      0     0 ?        I    13:06   0:00
[kworker/0:2-wg-crypt-wg0]
root      201226  0.0  0.0      0     0 ?        I    13:08   0:00
[kworker/1:4-wg-crypt-wg0]
root      201476  0.0  0.0      0     0 ?        I<   13:11   0:00
[wg-crypt-wg0]
root      201484  0.0  0.0      0     0 ?        I<   13:11   0:00
[wg-crypt-wg0]

to:

$ ps aux|grep wg0
root      200479  0.0  0.0      0     0 ?        I    13:06   0:00
[kworker/0:2-wg-crypt-wg0]
root      201226  0.0  0.0      0     0 ?        I    13:08   0:00
[kworker/1:4-wg-crypt-wg0]

What I suspect is happening is that destroying the workqueue does not
actually destroy the kthreads that they were using, so that they can
be reused (and eventually relabeled) by other drivers. Looking at the
stack of those indicates this is probably the case:

$ cat /proc/200479/stack
[<0>] worker_thread+0xba/0x3c0
[<0>] kthread+0x114/0x130
[<0>] ret_from_fork+0x1f/0x30

So it's just hanging out there idle waiting to be scheduled by
something new. In fact, while I was writing this email, that worker
already seems to have been reclaimed by another driver:

$ cat /proc/200479/comm
kworker/0:2-events

Now it's called "events".

This is happening because the kthread isn't actually destroyed, and
task->comm is being hijacked. In proc_task_name we have:

        if (p->flags & PF_WQ_WORKER)
               wq_worker_comm(tcomm, sizeof(tcomm), p);
       else
               __get_task_comm(tcomm, sizeof(tcomm), p);

That top condition holds for workqueue workers, and wq_worker_comm
winds up scnprintf'ing the current worker description in there:

                        /*
                        * ->desc tracks information (wq name or
                        * set_worker_desc()) for the latest execution.  If
                        * current, prepend '+', otherwise '-'.
                        */
                       if (worker->desc[0] != '\0') {
                               if (worker->current_work)
                                       scnprintf(buf + off, size - off, "+%s",
                                                 worker->desc);
                               else
                                       scnprintf(buf + off, size - off, "-%s",
                                                 worker->desc);

But worker->desc isn't set until process_one_work is called:

        /*
        * Record wq name for cmdline and debug reporting, may get
        * overridden through set_worker_desc().
        */
       strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN);

And it's never unset after the work is done and it's waiting idle in
worker_thread for the scheduler to reschedule it and eventually call
process_one_work on a new work unit.

It would be easy to just null out that string after the work is done
with something like:

        worker->current_func(work);
        worker->desc[0] = '\0';

But I guess this has never sufficiently bothered anyone before. I
suppose I could submit a patch and see how it's received. But it also
looks like the scnprintf above in wq_worker_comm distinguishes these
cases anyway. If there's a + it means that the work is active and if
there's a - it means that it's an old leftover thread. So maybe this
is fine as-is?

Jason