linux-kernel - Re: [PATCH] platform/chrome: cros_ec_proto: Lock device when updating MKBP version


Message-ID: <CAMxeMi3VFN91FpGb3dgobz9aXt+Ok8rEqGkidwrGxNNk43O=6g@mail.gmail.com>
Date: Tue, 30 Jul 2024 10:05:16 +0200
From: Patryk Duda <patrykd@...gle.com>
To: Tzung-Bi Shih <tzungbi@...nel.org>
Cc: Guenter Roeck <groeck@...omium.org>, Benson Leung <bleung@...omium.org>, 
	chrome-platform@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/chrome: cros_ec_proto: Lock device when updating
 MKBP version

On Tue, Jul 30, 2024 at 8:04 AM Tzung-Bi Shih <tzungbi@...nel.org> wrote:
>
> On Mon, Jul 29, 2024 at 01:57:09PM +0200, Patryk Duda wrote:
> > On Mon, Jul 29, 2024 at 5:47 AM Tzung-Bi Shih <tzungbi@...nel.org> wrote:
> > >
> > > On Thu, Jul 25, 2024 at 05:57:13PM +0000, Patryk Duda wrote:
> > > > The cros_ec_get_host_command_version_mask() function requires that the
> > > > caller hold the ec_dev->lock mutex before calling it. This requirement
> > > > was not met, and as a result two commands could be sent to the device
> > > > at the same time.
> > >
> > > To clarify:
> > > - What would happen if multiple cros_ec_get_host_command_version_mask() calls
> > >   ran at the same time?
> > In the best case, the MCU will receive both commands glued together and
> > will ignore them, which results in a timeout in the kernel. In the worst
> > case, the request and/or response buffers will be corrupted.
> >
> > > - What are the callers?  I'm trying to understand the source of parallelism.
> > This is a race between interrupt handling and an ioctl call from userspace.
> >
> > Interrupt handling path
> > cros_ec_irq_thread()
> > cros_ec_handle_event()
> > cros_ec_get_next_event() - Queries host command version without taking
> > ec_dev->lock mutex first
> > cros_ec_get_host_command_version_mask()
> > cros_ec_send_command()
> > cros_ec_xfer_command()
> > cros_ec_uart_pkt_xfer()
> >
> > Command from userspace
> > cros_ec_chardev_ioctl()
> > cros_ec_chardev_ioctl_xcmd()
> > cros_ec_cmd_xfer() - Locks ec_dev->lock mutex before sending command
> > cros_ec_send_command()
> > cros_ec_xfer_command()
> > cros_ec_uart_pkt_xfer()
> >
> > >
> > > The patch also needs an unlock at [1].
> > >
> > > [1]: https://elixir.bootlin.com/linux/v6.10/source/drivers/platform/chrome/cros_ec_proto.c#L819
> >
> > Yeah. I'll fix it in v2
>
> I'm wondering if it's simpler to just lock and unlock around calling
> cros_ec_get_host_command_version_mask().  What do you think?
>
Initially, I thought it would be good to keep the ec_dev->mkbp_event_supported
update under the mutex (similar to cros_ec_query_all(), which is called with
the mutex locked), but mkbp_event_supported is also used without the mutex
held.

I don't see any obvious risks with updating the MKBP version outside the mutex.
Do you want me to change it?
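
To illustrate the narrower locking, here is a minimal user-space sketch. It is
not the kernel code: pthreads stand in for the kernel mutex, every fake_* name
and the 0x7 version mask are made up for illustration, and the transfer is
reduced to a counter that records whether two commands were ever in flight at
once. One thread plays the interrupt path and locks only around the version
query; the other plays the ioctl path.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for struct cros_ec_device. */
struct fake_ec_device {
	pthread_mutex_t lock;     /* plays the role of ec_dev->lock */
	atomic_int in_flight;     /* commands currently "on the wire" */
	atomic_int overlaps;      /* times a command started while another ran */
	int mkbp_event_supported; /* written by the interrupt path only */
	int iterations;           /* commands each path sends */
};

/* Stand-in for the transport. The caller must hold dev->lock. */
static void fake_xfer(struct fake_ec_device *dev)
{
	if (atomic_fetch_add(&dev->in_flight, 1) != 0)
		atomic_fetch_add(&dev->overlaps, 1);
	sched_yield(); /* widen the window in which an overlap could occur */
	atomic_fetch_sub(&dev->in_flight, 1);
}

/* 1-based position of the highest set bit, 0 for an empty mask. */
static int highest_set_bit(unsigned int mask)
{
	int pos = 0;

	while (mask) {
		pos++;
		mask >>= 1;
	}
	return pos;
}

/* Interrupt path: lock only around the version query. */
static void *fake_irq_path(void *arg)
{
	struct fake_ec_device *dev = arg;

	for (int i = 0; i < dev->iterations; i++) {
		unsigned int ver_mask;

		pthread_mutex_lock(&dev->lock);
		fake_xfer(dev);  /* the version query itself */
		ver_mask = 0x7;  /* made-up reply: versions 0..2 supported */
		pthread_mutex_unlock(&dev->lock);

		/* Updated outside the lock, as discussed above. */
		dev->mkbp_event_supported = highest_set_bit(ver_mask);
	}
	return NULL;
}

/* Userspace command path: the lock is held for the whole transfer. */
static void *fake_ioctl_path(void *arg)
{
	struct fake_ec_device *dev = arg;

	for (int i = 0; i < dev->iterations; i++) {
		pthread_mutex_lock(&dev->lock);
		fake_xfer(dev);
		pthread_mutex_unlock(&dev->lock);
	}
	return NULL;
}

/* Runs both paths concurrently and returns the number of overlaps seen. */
int fake_run_both_paths(struct fake_ec_device *dev, int iterations)
{
	pthread_t irq, user;
	int rc;

	pthread_mutex_init(&dev->lock, NULL);
	atomic_init(&dev->in_flight, 0);
	atomic_init(&dev->overlaps, 0);
	dev->mkbp_event_supported = 0;
	dev->iterations = iterations;

	rc = pthread_create(&irq, NULL, fake_irq_path, dev);
	assert(rc == 0);
	rc = pthread_create(&user, NULL, fake_ioctl_path, dev);
	assert(rc == 0);
	(void)rc;

	pthread_join(irq, NULL);
	pthread_join(user, NULL);
	pthread_mutex_destroy(&dev->lock);
	return atomic_load(&dev->overlaps);
}
```

Calling fake_run_both_paths(&dev, 2000) from a small main() (built with
-pthread) should report zero overlaps, since every fake_xfer() call happens
with the lock held. Dropping the lock/unlock pair from fake_irq_path() is the
simulated equivalent of the bug and lets the counter go above zero.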

> > > > The problem was observed while using the UART backend, which doesn't use
> > > > any additional locks, unlike the SPI backend, which locks the controller
> > > > until the response is received.
> > >
> > > Is it a general issue if multiple commands are sent to the EC at a time?
> > > If so, that should be serialized in the UART transport.
> >
> > Host Commands only support one command at a time. It's enforced by the 'lock'
> > mutex in the cros_ec_device structure. We just need to use it properly.
>
> I see.  Please add the Fixes tag if you get a chance to send the next version:
> Fixes: f74c7557ed0d ("platform/chrome: cros_ec_proto: Update version on GET_NEXT_EVENT failure")