linux-kernel - Re: systemd-rfkill regression on 5.11 and later kernels

Message-ID: <s5ho8fgixl9.wl-tiwai@suse.de>
Date:   Thu, 18 Mar 2021 11:07:46 +0100
From:   Takashi Iwai <tiwai@...e.de>
To:     Johannes Berg <johannes@...solutions.net>
Cc:     Emmanuel Grumbach <emmanuel.grumbach@...el.com>,
        linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: systemd-rfkill regression on 5.11 and later kernels

On Thu, 18 Mar 2021 10:36:19 +0100,
Johannes Berg wrote:
> 
> Hi Takashi,
> 
> Oh yay :-(
> 
> > we've received a bug report about rfkill change that was introduced in
> > 5.11.  While the systemd-rfkill expects the same size of both read and
> > write, the kernel rfkill write cuts off to the old 8 bytes while read
> > gives 9 bytes, hence it leads the error:
> >   https://github.com/systemd/systemd/issues/18677
> >   https://bugzilla.opensuse.org/show_bug.cgi?id=1183147
> 
> > As far as I understand from the log in the commit 14486c82612a, this
> > sounds like the intended behavior.
> 
> Not really? I don't even understand why we get this behaviour.
> 
> The code is this:
> 
> rfkill_fop_write():
> 
> ...
>         /* we don't need the 'hard' variable but accept it */
>         if (count < RFKILL_EVENT_SIZE_V1 - 1)
>                 return -EINVAL;
> 
> # this is actually 7 - RFKILL_EVENT_SIZE_V1 is 8
> # (and obviously we get past this if and don't get -EINVAL)
> 
>         /*
>          * Copy as much data as we can accept into our 'ev' buffer,
>          * but tell userspace how much we've copied so it can determine
>          * our API version even in a write() call, if it cares.
>          */
>         count = min(count, sizeof(ev));
> 
> # sizeof(ev) should be 9 since 'ev' is the new struct
> 
>         if (copy_from_user(&ev, buf, count))
>                 return -EFAULT;
> 
> ...
> 	ret = 0;
> ...
> 	return ret ?: count;
> 
> 
> 
> 
> Ah, no, I see. The bug says:
> 
> 	EDIT: above is with kernel-core-5.10.16-200.fc33.x86_64.
> 
> So you've recompiled systemd with 5.11 headers, but are running against
> 5.10 now, where the short write really was intentional - it lets you
> detect that the new fields weren't handled by the kernel. If 
> 
> 
> The other issue is basically this (pre-fix) systemd code:
> 
> l = read(c.rfkill_fd, &event, sizeof(event));
> ...
> if (l != RFKILL_EVENT_SIZE_V1) /* log/return error */
> 
> 
> 
> So ... honestly, I don't have all that much sympathy, when the uapi
> header explicitly says we want to be able to change the size. But I
> guess "no regressions" rules are hard, so ... dunno. I guess we can add
> a version/size ioctl and keep using 8 bytes unless you send that?

OK, I took a deeper look again, and actually there are two issues in
the systemd-rfkill code:

* It expects read() to return exactly 8 bytes even though it reads a
  whole struct rfkill_event record.  If the code is rebuilt with the
  latest kernel headers, it breaks due to the change of rfkill_event.
  That's the error the openSUSE bug report points to.

* When systemd-rfkill is built with the latest kernel headers but runs
  on the old kernel code, the write size check fails, as you mentioned
  above.  That's the other part of the github issue.
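
For illustration only, the first point could be handled on the reader
side roughly as in the sketch below.  This is not the systemd code; the
demo_* names are made up for this mail, the structs are local mirrors
rather than the ones from <linux/rfkill.h>, and the name of the ninth
byte (hard_block_reasons) is what I recall from the 5.11 header, so
treat it as an assumption.  The idea is to accept any read of at least
8 bytes and zero-fill whatever the kernel did not provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_EVENT_SIZE_V1 8

/* Hypothetical mirror of the original 8-byte event layout. */
struct demo_event_v1 {
	uint32_t idx;
	uint8_t type, op, soft, hard;
} __attribute__((packed));

/* Hypothetical mirror of the extended 9-byte layout (assumed field name). */
struct demo_event_v2 {
	uint32_t idx;
	uint8_t type, op, soft, hard;
	uint8_t hard_block_reasons;
} __attribute__((packed));

static_assert(sizeof(struct demo_event_v1) == DEMO_EVENT_SIZE_V1,
	      "V1 layout is 8 bytes");
static_assert(sizeof(struct demo_event_v2) == DEMO_EVENT_SIZE_V1 + 1,
	      "extended layout adds one byte");

/*
 * Interpret 'len' bytes returned by read() as an event.
 * Returns 0 on success, -1 if the record is shorter than V1.
 * Fields the kernel did not send stay zero; extra bytes are ignored.
 */
static int demo_parse_event(const void *buf, long len,
			    struct demo_event_v2 *out)
{
	size_t n;

	if (len < DEMO_EVENT_SIZE_V1)
		return -1;

	n = (size_t)len < sizeof(*out) ? (size_t)len : sizeof(*out);
	memset(out, 0, sizeof(*out));
	memcpy(out, buf, n);
	return 0;
}
```

With a check like this, an 8-byte record from an old kernel and a
9-byte record from a new one are both accepted, and only a record
shorter than 8 bytes is treated as an error.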

So, with my kernel dev hat on, I share your feeling that this is an
application bug.  OTOH, the extension of rfkill_event is, well, not
really as safe as expected.
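
To make the write side concrete, here is a small userspace model of the
size handling in the rfkill_fop_write() snippet you quoted.  It is only
a sketch: the demo_* names and the two check helpers are invented for
this mail, -1 stands in for -EINVAL, and the "tolerant" check is one
possible reading of what an application could do, not what systemd
does.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EVENT_SIZE_V1  8	/* size of the original event struct */
#define DEMO_EVENT_SIZE_NEW 9	/* size after the 5.11 extension */

static_assert(DEMO_EVENT_SIZE_NEW == DEMO_EVENT_SIZE_V1 + 1,
	      "the extension adds a single byte");

/*
 * Model of the count handling in the quoted rfkill_fop_write():
 * reject writes shorter than V1 - 1, otherwise accept at most
 * 'kernel_ev_size' bytes and report how many were taken.
 */
static long demo_kernel_write(size_t count, size_t kernel_ev_size)
{
	if (count < DEMO_EVENT_SIZE_V1 - 1)
		return -1;	/* stands in for -EINVAL */

	return (long)(count < kernel_ev_size ? count : kernel_ev_size);
}

/* Strict check: fails whenever the kernel accepts fewer bytes than sent. */
static int demo_write_ok_strict(long written, size_t sent)
{
	return written == (long)sent;
}

/* Tolerant check: a write that covered the V1 part counts as success. */
static int demo_write_ok_tolerant(long written)
{
	return written >= DEMO_EVENT_SIZE_V1;
}
```

A 9-byte write against a kernel whose struct is 8 bytes returns 8 in
this model, so the strict check fails while the tolerant one passes.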

IMO, if systemd-rfkill is the only one that hits such a problem, we
may just let the systemd code be fixed, as it's obviously buggy.  But
who knows...

Is the extension of rfkill_event mandatory?  Can the new entry be
provided in a different way, such as another sysfs record?
IOW, if we revert the change, would it break anything else new?


thanks,

Takashi