[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240516123120.99486-1-kuniyu@amazon.com>
Date: Thu, 16 May 2024 21:31:21 +0900
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <mhal@...x.co>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>
Subject: Re: [PATCH net] af_unix: Fix garbage collection of embryos carrying OOB with SCM_RIGHTS
From: Michal Luczaj <mhal@...x.co>
Date: Thu, 16 May 2024 12:20:48 +0200
> GC attempts to explicitly drop oob_skb before purging the hit list.
>
> The problem is with embryos: kfree_skb(u->oob_skb) is never called on an
> embryo socket, as those sockets are not directly stacked by the SCC walk.
>
> The python script below [0] sends a listener's fd to its embryo as OOB
> data. While GC does collect the embryo's queue, it fails to drop the OOB
> skb's refcount. The skb which was in embryo's receive queue stays as
> unix_sk(sk)->oob_skb and keeps the listener's refcount [1].
>
> Tell GC to dispose embryo's oob_skb.
>
> [0]:
> from array import array
> from socket import *
>
> addr = '\x00unix-oob'
> lis = socket(AF_UNIX, SOCK_STREAM)
> lis.bind(addr)
> lis.listen(1)
>
> s = socket(AF_UNIX, SOCK_STREAM)
> s.connect(addr)
> scm = (SOL_SOCKET, SCM_RIGHTS, array('i', [lis.fileno()]))
> s.sendmsg([b'x'], [scm], MSG_OOB)
> lis.close()
>
> [1]
> $ grep unix-oob /proc/net/unix
> $ ./unix-oob.py
> $ grep unix-oob /proc/net/unix
> 0000000000000000: 00000002 00000000 00000000 0001 02 0 @unix-oob
> 0000000000000000: 00000002 00000000 00010000 0001 01 6072 @unix-oob
>
> Fixes: 4090fa373f0e ("af_unix: Replace garbage collection algorithm.")
> Signed-off-by: Michal Luczaj <mhal@...x.co>
> ---
> net/unix/garbage.c | 25 ++++++++++++++++---------
> 1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> index 1f8b8cdfcdc8..8e9431955ab7 100644
> --- a/net/unix/garbage.c
> +++ b/net/unix/garbage.c
> @@ -342,6 +342,18 @@ enum unix_recv_queue_lock_class {
> U_RECVQ_LOCK_EMBRYO,
> };
>
> +static void unix_collect_queue(struct unix_sock *u, struct sk_buff_head *hitlist)
> +{
> + skb_queue_splice_init(&u->sk.sk_receive_queue, hitlist);
> +
> +#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
> + if (u->oob_skb) {
> + WARN_ON_ONCE(skb_unref(u->oob_skb));
> + u->oob_skb = NULL;
> + }
> +#endif
> +}
> +
> static void unix_collect_skb(struct list_head *scc, struct sk_buff_head *hitlist)
> {
> struct unix_vertex *vertex;
> @@ -365,18 +377,11 @@ static void unix_collect_skb(struct list_head *scc, struct sk_buff_head *hitlist
>
> /* listener -> embryo order, the inversion never happens. */
> spin_lock_nested(&embryo_queue->lock, U_RECVQ_LOCK_EMBRYO);
> - skb_queue_splice_init(embryo_queue, hitlist);
> + unix_collect_queue(unix_sk(skb->sk), hitlist);
> spin_unlock(&embryo_queue->lock);
> }
> } else {
> - skb_queue_splice_init(queue, hitlist);
> -
> -#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
> - if (u->oob_skb) {
> - kfree_skb(u->oob_skb);
> - u->oob_skb = NULL;
> - }
> -#endif
> + unix_collect_queue(u, hitlist);
> }
>
> spin_unlock(&queue->lock);
> @@ -583,6 +588,8 @@ static void __unix_gc(struct work_struct *work)
> skb_queue_walk(&hitlist, skb) {
> if (UNIXCB(skb).fp)
> UNIXCB(skb).fp->dead = true;
> +
> + WARN_ON_ONCE(refcount_read(&skb->users) != 1);
Given we will refactor OOB with no additional refcount, this will not
make sense. Rather, I'd add a test case in a selftest to catch the
future regression.
And I noticed that I actually tried to catch this in
tools/testing/selftests/net/af_unix/scm_rights.c
, and what is missing is... :S
---8<---
>From f6f47b3cdd22e7c4ed48bf1de089babd09c606e0 Mon Sep 17 00:00:00 2001
From: Kuniyuki Iwashima <kuniyu@...zon.com>
Date: Thu, 16 May 2024 12:10:45 +0000
Subject: [PATCH] selftest: af_unix: Make SCM_RIGHTS into OOB data.
scm_rights.c covers various test cases for inflight file descriptors
and garbage collector for AF_UNIX sockets.
Currently, SCM_RIGHTS messages are sent with 3-bytes string, and it's
not good for MSG_OOB cases, as SCM_RIGTS cmsg goes with the first 2-bytes,
which is non-OOB data.
Let's send SCM_RIGHTS messages with 1-byte character to pack SCM_RIGHTS
into OOB data.
Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
diff --git a/tools/testing/selftests/net/af_unix/scm_rights.c b/tools/testing/selftests/net/af_unix/scm_rights.c
index bab606c9f1eb..2bfed46e0b19 100644
--- a/tools/testing/selftests/net/af_unix/scm_rights.c
+++ b/tools/testing/selftests/net/af_unix/scm_rights.c
@@ -197,8 +197,8 @@ void __send_fd(struct __test_metadata *_metadata,
const FIXTURE_VARIANT(scm_rights) *variant,
int inflight, int receiver)
{
-#define MSG "nop"
-#define MSGLEN 3
+#define MSG "x"
+#define MSGLEN 1
struct {
struct cmsghdr cmsghdr;
int fd[2];
---8<---
With this, we can catch the case properly,
# RUN scm_rights.stream_listener_oob.self_ref ...
# scm_rights.c:119:self_ref:Expected 0 (0) == ret (6)
# self_ref: Test terminated by assertion
# FAIL scm_rights.stream_listener_oob.self_ref
not ok 13 scm_rights.stream_listener_oob.self_ref
# RUN scm_rights.stream_listener_oob.triangle ...
# scm_rights.c:119:triangle:Expected 0 (0) == ret (18)
# triangle: Test terminated by assertion
# FAIL scm_rights.stream_listener_oob.triangle
not ok 14 scm_rights.stream_listener_oob.triangle
# RUN scm_rights.stream_listener_oob.cross_edge ...
# scm_rights.c:119:cross_edge:Expected 0 (0) == ret (24)
# cross_edge: Test terminated by assertion
# FAIL scm_rights.stream_listener_oob.cross_edge
not ok 15 scm_rights.stream_listener_oob.cross_edge
# FAILED: 12 / 15 tests passed.
# Totals: pass:12 fail:3 xfail:0 xpass:0 skip:0 error:0
And with your patch, all good !
# RUN scm_rights.stream_listener_oob.self_ref ...
# OK scm_rights.stream_listener_oob.self_ref
ok 13 scm_rights.stream_listener_oob.self_ref
# RUN scm_rights.stream_listener_oob.triangle ...
# OK scm_rights.stream_listener_oob.triangle
ok 14 scm_rights.stream_listener_oob.triangle
# RUN scm_rights.stream_listener_oob.cross_edge ...
# OK scm_rights.stream_listener_oob.cross_edge
ok 15 scm_rights.stream_listener_oob.cross_edge
# PASSED: 15 / 15 tests passed.
# Totals: pass:15 fail:0 xfail:0 xpass:0 skip:0 error:0
Could you remove the WARN_ON_ONCE() and repost with my patch
above ?
Thanks!
Powered by blists - more mailing lists