[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aJ6U3DQn876wGS4C@codewreck.org>
Date: Fri, 15 Aug 2025 11:01:00 +0900
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Nalivayko Sergey <Sergey.Nalivayko@...persky.com>
Cc: v9fs-developer@...ts.sourceforge.net, netdev@...r.kernel.org,
Eric Van Hensbergen <ericvh@...il.com>,
Wang Hai <wanghai38@...wei.com>,
Latchesar Ionkov <lucho@...kov.net>, lvc-project@...uxtesting.org,
stable@...r.kernel.org
Subject: Re: [PATCH] net: 9p: fix double req put in p9_fd_cancelled
Nalivayko Sergey wrote on Tue, Jul 15, 2025 at 06:48:15PM +0300:
> This happens because of a race condition between:
>
> - The 9p client sending an invalid flush request and later cleaning it up;
> - The 9p client in p9_read_work() canceled all pending requests.
>
> Thread 1 Thread 2
> ...
> p9_client_create()
> ...
> p9_fd_create()
> ...
> p9_conn_create()
> ...
> // start Thread 2
> INIT_WORK(&m->rq, p9_read_work);
> p9_read_work()
> ...
> p9_client_rpc()
> ...
> ...
> p9_conn_cancel()
> ...
> spin_lock(&m->req_lock);
> ...
> p9_fd_cancelled()
> ...
> ...
> spin_unlock(&m->req_lock);
> // status rewrite
> p9_client_cb(m->client, req, REQ_STATUS_ERROR)
> // first remove
> list_del(&req->req_list);
> ...
>
> spin_lock(&m->req_lock)
> ...
> // second remove
> list_del(&req->req_list);
> spin_unlock(&m->req_lock)
> ...
>
> Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in
> p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
> client where the req_list could be deleted simultaneously by both
> p9_read_work and p9_fd_cancelled functions, but for the case where req->status
> equals REQ_STATUS_RCVD.
Sorry for the delay,
Thanks for the investigation, this makes sense and deserves fixing.
> Add an explicit check for REQ_STATUS_ERROR in p9_fd_cancelled before
> processing the request. Skip processing if the request is already in the error
> state, as it has been removed and its resources cleaned up.
Looking at the other status, it's quite unlikely but if other thread
would make it FLSHD we should also skip these -- and I don't think it's
possible as far as the logic goes but if it's not sent yet we would have
nothing to flush either, so it's probably better to invert the check,
and make it `if (req != SENT) return` ?
client.c already checks `READ_ONCE(oldreq->status) == REQ_STATUS_SENT`
before calling cancelled but that's without lock, so basically we're
checking nothing raced since that check, and it's not limited to RCVD
and ERROR.
If you can send a v2 with that I'll pick it up.
Thanks,
--
Dominique Martinet | Asmadeus
Powered by blists - more mailing lists