Message-ID: <20240306194412.11255-1-kuniyu@amazon.com>
Date: Wed, 6 Mar 2024 11:44:12 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuni1840@...il.com>, <kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v4 net-next 02/15] af_unix: Allocate struct unix_edge for each inflight AF_UNIX fd.
From: Paolo Abeni <pabeni@...hat.com>
Date: Tue, 05 Mar 2024 09:47:04 +0100
> On Thu, 2024-02-29 at 18:22 -0800, Kuniyuki Iwashima wrote:
> > As with the previous patch, we preallocate an array of struct unix_edge
> > in skb's scm_fp_list, one entry per inflight AF_UNIX fd.
> >
> > Here we just preallocate the memory and do not use it immediately,
> > because sendmsg() could fail after this point.  The actual use comes
> > in the next patch.
> >
> > When we queue skb with inflight edges, we will set the inflight
> > socket's unix_sock as unix_edge->predecessor and the receiver's
> > unix_sock as successor, and then we will link the edge to the
> > inflight socket's unix_vertex.edges.
> >
> > Note that we set the cloned scm_fp_list.edges to NULL in scm_fp_dup()
> > so that MSG_PEEK does not change the shape of the directed graph.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > ---
> > include/net/af_unix.h | 6 ++++++
> > include/net/scm.h | 5 +++++
> > net/core/scm.c | 2 ++
> > net/unix/garbage.c | 6 ++++++
> > 4 files changed, 19 insertions(+)
> >
> > diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> > index c270877a5256..55c4abc26a71 100644
> > --- a/include/net/af_unix.h
> > +++ b/include/net/af_unix.h
> > @@ -33,6 +33,12 @@ struct unix_vertex {
> > unsigned long out_degree;
> > };
> >
> > +struct unix_edge {
> > + struct unix_sock *predecessor;
> > + struct unix_sock *successor;
> > + struct list_head vertex_entry;
> > +};
> > +
> > struct sock *unix_peer_get(struct sock *sk);
> >
> > #define UNIX_HASH_MOD (256 - 1)
> > diff --git a/include/net/scm.h b/include/net/scm.h
> > index e34321b6e204..5f5154e5096d 100644
> > --- a/include/net/scm.h
> > +++ b/include/net/scm.h
> > @@ -23,12 +23,17 @@ struct scm_creds {
> > kgid_t gid;
> > };
> >
> > +#ifdef CONFIG_UNIX
> > +struct unix_edge;
> > +#endif
> > +
> > struct scm_fp_list {
> > short count;
> > short count_unix;
> > short max;
> > #ifdef CONFIG_UNIX
> > struct list_head vertices;
> > + struct unix_edge *edges;
> > #endif
> > struct user_struct *user;
> > struct file *fp[SCM_MAX_FD];
> > diff --git a/net/core/scm.c b/net/core/scm.c
> > index 87dfec1c3378..1bcc8a2d65e3 100644
> > --- a/net/core/scm.c
> > +++ b/net/core/scm.c
> > @@ -90,6 +90,7 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp)
> > fpl->max = SCM_MAX_FD;
> > fpl->user = NULL;
> > #if IS_ENABLED(CONFIG_UNIX)
> > + fpl->edges = NULL;
> > INIT_LIST_HEAD(&fpl->vertices);
> > #endif
> > }
> > @@ -383,6 +384,7 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl)
> > new_fpl->max = new_fpl->count;
> > new_fpl->user = get_uid(fpl->user);
> > #if IS_ENABLED(CONFIG_UNIX)
> > + new_fpl->edges = NULL;
> > INIT_LIST_HEAD(&new_fpl->vertices);
> > #endif
> > }
> > diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> > index 75bdf66b81df..f31917683288 100644
> > --- a/net/unix/garbage.c
> > +++ b/net/unix/garbage.c
> > @@ -127,6 +127,11 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
> > list_add(&vertex->entry, &fpl->vertices);
> > }
> >
> > + fpl->edges = kvmalloc_array(fpl->count_unix, sizeof(*fpl->edges),
> > + GFP_KERNEL_ACCOUNT);
>
> If I read correctly, the total amount of additional memory used will
> be proportional to the number of vertices plus the number of edges.
Correct.
> Can you please provide the real figures for some reasonable unix fd
> numbers (say, a small table), to verify this memory usage is
> reasonable?
First, regarding the per-sock change.

Before this series, unix_sock is 1920 bytes.
After this series, unix_sock is 1856 bytes and unix_vertex is 72 bytes,
so the delta is (1856 + 72) - 1920 = 8 bytes.

However, there is an 8-byte hole in unix_sock after the series, and the
last field, oob_skb, can be moved there; then the size decreases by 4
bytes in total.  So we can ignore the delta with a follow-up patch that
moves oob_skb (and other fields) to pack unix_sock.
---8<---
$ pahole -C unix_sock vmlinux
struct unix_sock {
	...
	spinlock_t		lock;		/* 1592	    64 */

	/* XXX 8 bytes hole, try to pack */

	...

	/* XXX 4 bytes hole, try to pack */

	struct sk_buff *	oob_skb;	/* 1840	     8 */

	/* size: 1856, cachelines: 29, members: 13 */
};
---8<---
Second, regarding unix_edge.

sk_buff is 224 bytes, and scm_fp_list, which holds the struct file
pointers in the fixed-size array fp[253], is 2064 bytes.
After this series, one unix_edge (48 bytes) is added for each inflight
AF_UNIX fd:

  Before this series: 224 + 2064 = 2288 bytes
  After this series : 2288 + 48 * n bytes
In our distro, systemd is the major user receiving AF_UNIX fds at boot
time, ssh login, etc., and each skb carries only one AF_UNIX fd.  In
that case, the skb allocation grows by 48 bytes (about 2%).

If we fully populate an skb with 253 AF_UNIX fds, it will grow by about
12KB (48 * 253 = 12144 bytes), but this is unlikely.