[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19d7d71b-c911-45cc-9671-235d98720be6@linux.alibaba.com>
Date: Fri, 23 Feb 2024 11:36:14 +0800
From: Wen Gu <guwen@...ux.alibaba.com>
To: "Antipov, Dmitriy" <Dmitriy.Antipov@...tline.com>,
"dmantipov@...dex.ru" <dmantipov@...dex.ru>,
"wenjia@...ux.ibm.com" <wenjia@...ux.ibm.com>
Cc: "lvc-project@...uxtesting.org" <lvc-project@...uxtesting.org>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"jaka@...ux.ibm.com" <jaka@...ux.ibm.com>
Subject: Re: [lvc-project] [PATCH] [RFC] net: smc: fix fasync leak in
smc_release()
On 2024/2/21 23:02, Antipov, Dmitriy wrote:
> On Wed, 2024-02-21 at 21:09 +0800, Wen Gu wrote:
>
>> 1. on = 1; ioctl(sock, FIOASYNC, &on), a fasync entry is added to
>> smc->sk.sk_socket->wq.fasync_list;
>>
>> 2. Then fallback happend, and swapped the socket:
>> smc->clcsock->file = smc->sk.sk_socket->file;
>> smc->clcsock->file->private_data = smc->clcsock;
>> smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list;
>> smc->sk.sk_socket->wq.fasync_list = NULL;
>>
>> 3. on = 0; ioctl(sock, FIOASYNC, &on), the fasync entry is removed
>> from smc->clcsock->wq.fasync_list,
>> (Is there a race between 2 and 3 ?)
>
> 1) IIUC yes. The following sequence from smc_switch_to_fallback():
>
> smc->clcsock->file = smc->sk.sk_socket->file;
> smc->clcsock->file->private_data = smc->clcsock;
>
> is executed with locked smc->sk.sk_socket but unlocked smc->clcsock.
> This way,
>
> struct socket *sock = filp->private_data;
>
> in sock_fasync() introduces an undefined behavior (because an
> actual value of 'private_data' is unpredictable). So there are
> two possible scenarios. When
>
> on = 1; ioctl(sock, FIOASYNC, &on);
> on = 0; ioctl(sock, FIOASYNC, &on);
>
> actually modifies fasync list of the same socket, this works as
> expected. If kernel sockets behind 'sock' gets swapped between
> calls to ioctl(), fasync list of the first socket has an entry
> which can't be removed by the second ioctl().
>
Thank you for the explanation, Dmitriy.
So the race could be:
sock_fasync | smc_switch_to_fallback
------------------------------------------------------------------
/* smc->sk.sk_socket->wq.fasync_list|
is empty at the beginning */ |
|
/* attempt to add fasync_struct */ |
sock = filp->private_data; |
(smc->sk.sk_socket) | /* fallback happens */
| lock_sock(sk);
| file->private_data = smc->clcsock
| smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list
| smc->sk.sk_socket->wq.fasync_list = NULL
| release_sock(sk);
lock_sock(sk); |
fasync_helper(on=1) |
(successed to add the entry in |
smc->sk.sk_socket->wq.fasync_list) |
release_sock(sk); |
|
/* the fasync_struct entry can't be |
removed later, since |
file->private_data has been changed |
to clcsock */ |
Now clcsock->wq.fasync_list is empty and the fasync_struct entry is
always left in smc->sk.sk_socket->wq.fasync_list.
If we only remove fasync_struct entries in smc->sk.sk_socket->wq.fasync_list
during smc_release, it indeed solves the leak, but it fails to address
the problem of fasync_struct entries being incorrectly inserted into the
old socket (they should have been placed in smc->clcsock->wq.fasync_list
if fallback happens).
One solution to this issue I can think of is to check whether
filp->private_data has been changed when the sock_fasync holds the sock lock,
but it inevitably changes the general code..
diff --git a/net/socket.c b/net/socket.c
index ed3df2f749bf..a28435195854 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1443,6 +1443,11 @@ static int sock_fasync(int fd, struct file *filp, int on)
return -EINVAL;
lock_sock(sk);
+ /* filp->private_data has changed */
+ if (on && unlikely(sock != filp->private_data)) {
+ release_sock(sk);
+ return -EAGAIN;
+ }
fasync_helper(fd, filp, on, &wq->fasync_list);
if (!wq->fasync_list)
Let's see if anyone else has a better idea.
Best regards,
Wen Gu
>
>> 4. Then close the file, __fput() calls file->f_op->fasync(-1, file, 0),
>> then sock_fasync() calls fasync_helper(fd, filp, on, &wq->fasync_list)
>> and fasync_remove_entry() removes entries in smc->clcsock->wq.fasync_list.
>> Now smc->clcsock->wq.fasync_list is empty.
>
> 2) No. In the second (bad) scenario from the above, an attempt to remove
> fasync entry from smc->clcsock->wq.fasync_list always fails because
> fasync entry was actually linked to smc->sk.sk_socket->wq.fasync_list.
>
> Note sock_fasync() doesn't check the value returned from fasync_helper().
> How dumb.
>
>> 5. __fput() calls file->f_op->release(inode, file), then sock_close calls
>> __sock_release, then ops->release calls smc_release(), and __smc_release()
>> calls smc_restore_fallback_changes() to restore socket:
>> if (smc->clcsock->file) { /* non-accepted sockets have no file yet */
>> smc->clcsock->file->private_data = smc->sk.sk_socket;
>> smc->clcsock->file = NULL;
>> smc_fback_restore_callbacks(smc);
>> }
>
> 3) Yes. And it's too late because __fput() calls file->f_op->fasync(-1, ...,
> 0) _before_ file->f_op->release(). So even if you have sockets swapped back,
> no one will take care about freeing fasync list.
>
>
>> 6. Then back to __sock_release, check if sock->wq.fasync_list (that is
>> smc->sk.sk_socket->wq.fasync_list) is empty and it is empty.
>
> 4) No. It's not empty because an attempt to remove fasync entry has failed
> at [2] just because it was made against the wrong socket.
>
>
> For your convenience, there is a human-readable reproducer loosely modeled
> around the one generated by syzkaller. You can try it on any system running
> recently enough kernel with CONFIG_SMC enabled (root is not required), and
> receiving a few (or may be a lot of) "__sock_release: fasync list not empty"
> messages clearly indicates an issue. NOTE: this shouldn't crash the system
> and/or make it unusable, but remember that each message comes with a small
> kernel memory leak.
>
> Dmitry
>
> #include <signal.h>
> #include <unistd.h>
> #include <pthread.h>
> #include <sys/ioctl.h>
> #include <sys/socket.h>
>
> int sock;
>
> void *
> loop0 (void *arg)
> {
> struct msghdr msg = { 0 };
>
> while (1)
> {
> sock = socket (AF_SMC, SOCK_STREAM, 0);
> sendmsg (sock, &msg, MSG_FASTOPEN);
> close (sock);
> }
> return NULL;
> }
>
> void *
> loop1 (void *arg)
> {
> int on;
>
> while (1)
> {
> on = 1;
> ioctl (sock, FIOASYNC, &on);
> on = 0;
> ioctl (sock, FIOASYNC, &on);
> }
>
> return NULL;
> }
>
> int
> main (int argc, char *argv[])
> {
> pthread_t a, b;
> struct sigaction sa = { 0 };
>
> sa.sa_handler = SIG_IGN;
> sigaction (SIGIO, &sa, NULL);
>
> pthread_create (&a, NULL, loop0, NULL);
> pthread_create (&b, NULL, loop1, NULL);
>
> pause ();
>
> pthread_join (a, NULL);
> pthread_join (b, NULL);
>
> return 0;
> }
>
>
Powered by blists - more mailing lists