lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e62fdbf-a5d1-4650-b0b1-2a2698ed2040@linux.ibm.com>
Date: Thu, 7 Mar 2024 09:58:51 +0100
From: Jan Karcher <jaka@...ux.ibm.com>
To: Dmitry Antipov <dmantipov@...dex.ru>, Wen Gu <guwen@...ux.alibaba.com>,
        "wenjia@...ux.ibm.com" <wenjia@...ux.ibm.com>
Cc: "lvc-project@...uxtesting.org" <lvc-project@...uxtesting.org>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [lvc-project] [PATCH] [RFC] net: smc: fix fasync leak in
 smc_release()



On 06/03/2024 19:07, Dmitry Antipov wrote:
> 
> Well, the whole picture is somewhat more complicated. Consider the
> following diagram (an underlying kernel socket is in [], e.g. [smc->sk]):
> 
> Thread 0                        Thread 1
> 
> ioctl(sock, FIOASYNC, [1])
> ...
> sock = filp->private_data;
> lock_sock(sock [smc->sk]);
> sock_fasync(sock, ..., 1)       ; new fasync_struct linked to smc->sk
> release_sock(sock [smc->sk]);
>                                  ...
>                                  lock_sock([smc->sk]);
>                                  ...
>                                  smc_switch_to_fallback()
>                                  ...
>                                  smc->clcsock->file->private_data = 
> smc->clcsock;
>                                  ...
>                                  release_sock([smc->sk]);
> ioctl(sock, FIOASYNC, [0])
> ...
> sock = filp->private_data;
> lock_sock(sock [smc->clcsock]);
> sock_fasync(sock, ..., 0)       ; nothing to unlink from smc->clcsock
>                                  ; since fasync entry was linked to smc->sk
> release_sock(sock [smc->clcsock]);
>                                  ...
>                                  close(sock [smc->clcsock]);
>                                  __fput(...);
>                                  file->f_op->fasync(sock, [0])   ; 
> always failed -
>                                                                  ; 
> should use
>                                                                  ; 
> smc->sk instead
>                                  file->f_op->release()
>                                     ...
>                                     smc_restore_fallback_changes()
>                                     ...
>                                     file->private_data = smc->sk.sk_socket;

Thank you Dmitry for your detailed explanations.
It helped me a lot trying to understand the problem.
And I'm still in the progress of trying to understand it.
 From my naive point of view:
Would it help if we catch the ioctl and check if there is an active 
fallback and either stall or route the ioctl to the correct socket?
I've seen that there is an ioctl function handle in the proto_ops.
So on a very abstract level we could do the following:
1. Indicate an active Fallback at the start of the fallback to tcp.
2. Catch ioctls.
3. Check if there is an active fallback.
	NO: Pass the ioctl.
	YES: Wait for the fallback to complete and pass after.

If this blocks too much we can obviously do some finer checks there.
E.g.: Just check if the private data is already at attached to the socket.

Do you think this would be a suiteable solution?
I'm going to look into your proposal Dmitry to see how you solved the 
problem and to understand it better.

Thanks
- Jan

> 
> That is, smc_restore_fallback_changes() restores filp->private_data to
> smc->sk. If __fput() would have called file->f_op->release() _before_
> file->f_op->fasync(), the fix would be as simple as adding
> 
> smc->sk.sk_socket->wq.fasync_list = smc->clcsock->wq.fasync_list;
> 
> to smc_restore_fallback_changes(). But since file->f_op->fasync() is called
> before file->f_op->release(), the former always makes an attempt to 
> unlink fasync
> entry from smc->clcsock instead of smc->sk, thus introducing the memory 
> leak.
> 
> And an idea with shared wait queue was intended in attempt to eliminate
> this chicken-egg lookalike problem completely.
> 
> Dmitry
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ