[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <22bc31d7-b599-dc6a-2ef0-0dfc551a4715@i-love.sakura.ne.jp>
Date: Tue, 9 Mar 2021 22:56:17 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: Shuah Khan <skhan@...uxfoundation.org>, shuah@...nel.org,
valentina.manea.m@...il.com, gregkh@...uxfoundation.org
Cc: linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races
leading to gpf
On 2021/03/09 20:04, Tetsuo Handa wrote:
> On 2021/03/09 1:27, Shuah Khan wrote:
>> Yes. We might need synchronization between events, threads, and shutdown
>> in usbip_host side and in connection polling and threads in vhci.
>>
>> I am also looking at the shutdown sequences closely as well since the
>> local state is referenced without usbip_device lock in these paths.
>>
>> I am approaching these problems as peeling the onion an expression so
>> we can limit the changes and take a spot fix approach. We have the
>> goal to address these crashes and not introduce regressions.
>
> I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
> without introducing regressions. While ud->lock is held when checking ud->status,
> current attach/detach code is racy about read/update of ud->status . I think we
> can close race in attach/detach code via a simple usbip_event_mutex serialization.
>
Commit 04679b3489e048cd ("Staging: USB/IP: add client driver") says
/*
* The important thing is that only one context begins cleanup.
* This is why error handling and cleanup become simple.
* We do not want to consider race condition as possible.
*/
static void vhci_shutdown_connection(struct usbip_device *ud)
but why are we allowing multiple contexts to begin startup?
That's where a subtle race window syzbot is reporting happened.
This driver has never thought the possibility that multiple userspace
processes can concurrently begin startup. Then, it is quite natural that
we make startup simple and safe, by enforcing that only one context
begins startup.
Without serialization between startup/cleanup/event_handler(),
"Fix the above problems:" in your changelog cannot become true.
First step should be closing time-of-check to time-of-use bug
via entire serialization as if "nonpreemptible UP kernel".
Powered by blists - more mailing lists