Message-ID: <20221114210856.34981-1-kuniyu@amazon.com>
Date: Mon, 14 Nov 2022 13:08:56 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuni1840@...il.com>, <kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table.
From: Paolo Abeni <pabeni@...hat.com>
Date: Mon, 14 Nov 2022 21:55:11 +0100
> On Mon, 2022-11-14 at 12:21 -0800, Kuniyuki Iwashima wrote:
> > From: Paolo Abeni <pabeni@...hat.com>
> > Date: Fri, 11 Nov 2022 09:53:31 +0100
> > > On Thu, 2022-11-10 at 20:00 -0800, Kuniyuki Iwashima wrote:
> > > > @@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
> > > >  	return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > >  }
> > > >
> > > > +static int proc_udp_hash_entries(struct ctl_table *table, int write,
> > > > +				 void *buffer, size_t *lenp, loff_t *ppos)
> > > > +{
> > > > +	struct net *net = container_of(table->data, struct net,
> > > > +				       ipv4.sysctl_udp_child_hash_entries);
> > > > +	int udp_hash_entries;
> > > > +	struct ctl_table tbl;
> > > > +
> > > > +	udp_hash_entries = net->ipv4.udp_table->mask + 1;
> > > > +
> > > > +	/* A negative number indicates that the child netns
> > > > +	 * shares the global udp_table.
> > > > +	 */
> > > > +	if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
> > > > +		udp_hash_entries *= -1;
> > > > +
> > > > +	tbl.data = &udp_hash_entries;
> > > > +	tbl.maxlen = sizeof(int);
> > >
> > > I see the procfs code below will only use tbl.data and tbl.maxlen, but
> > > perhaps it is cleaner to explicitly memset tbl to 0 first.
> >
> > Will add memset()
> >
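The memset() suggestion above can be illustrated outside the kernel. What follows is only a user-space sketch: `struct demo_ctl_table` and the `demo_*` helpers are hypothetical stand-ins, not the kernel's `struct ctl_table` or any code from the patch. It shows that zeroing the on-stack struct first leaves every field the code does not set in a known state.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl table entry; NOT the kernel's
 * struct ctl_table, just enough fields to show the pattern.
 */
struct demo_ctl_table {
	const char *procname;
	void *data;
	int maxlen;
	unsigned short mode;
	void *extra1;
	void *extra2;
};

/* Zero the whole struct, then set only the two fields the reader needs. */
static void demo_fill_table(struct demo_ctl_table *tbl, int *value)
{
	memset(tbl, 0, sizeof(*tbl));
	tbl->data = value;
	tbl->maxlen = sizeof(*value);
}

/* Start from a deliberately dirty struct to mimic uninitialized stack. */
static void demo_check_fill_table(void)
{
	struct demo_ctl_table tbl;
	int value = 42;

	memset(&tbl, 0xff, sizeof(tbl));
	demo_fill_table(&tbl, &value);

	assert(tbl.data == &value);
	assert(tbl.maxlen == (int)sizeof(int));
	assert(tbl.procname == NULL);
	assert(tbl.mode == 0);
	assert(tbl.extra1 == NULL && tbl.extra2 == NULL);
}
```

Calling demo_check_fill_table() from a small test program exercises the pattern. Without the memset() in demo_fill_table(), the last three assertions would fail on the dirty struct.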
> >
> > >
> > > >
> > > > +
> > > > +	return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > > +}
> > > > +
> > > >  #ifdef CONFIG_IP_ROUTE_MULTIPATH
> > > >  static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
> > > >  					  void *buffer, size_t *lenp,
> > >
> > > [...]
> > >
> > > > @@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
> > > >  }
> > > >  EXPORT_SYMBOL(udp_flow_hashrnd);
> > > >
> > > > -static int __net_init udp_sysctl_init(struct net *net)
> > > > +static void __net_init udp_sysctl_init(struct net *net)
> > > >  {
> > > > -	net->ipv4.udp_table = &udp_table;
> > > > -
> > > >  	net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
> > > >  	net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
> > > >
> > > >  #ifdef CONFIG_NET_L3_MASTER_DEV
> > > >  	net->ipv4.sysctl_udp_l3mdev_accept = 0;
> > > >  #endif
> > > > +}
> > > > +
> > > > +static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
> > > > +{
> > > > +	unsigned long hash_size, bitmap_size;
> > > > +	struct udp_table *udptable;
> > > > +	int i;
> > > > +
> > > > +	udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
> > > > +	if (!udptable)
> > > > +		goto out;
> > > > +
> > > > +	udptable->log = ilog2(hash_entries);
> > > > +	udptable->mask = hash_entries - 1;
> > > > +
> > > > +	hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
> > > > +	bitmap_size = hash_entries *
> > > > +		BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
> > >
> > > Ouch, I'm very sorry. I did not realize we need a bitmap per hash
> > > bucket. This leads to a constant 8k of additional memory overhead per
> > > netns, independent of the arch's long bit size.
> >
> > Ugh, it will be 64K per netns ... ?
> >
> >   hash_entries              : 2 ^ n
> >   BITS_TO_LONGS             : 2 ^ -m        # arch specific
                                      -(m + 3)
> >   udp_bitmap_size(udptable) : 2 ^ (16 - n)
> >   sizeof(unsigned long)     : 2 ^ m         # arch specific
> >
> >   (2 ^ n) * (2 ^ -m) * (2 ^ (16 - n)) * (2 ^ m)
                    (-m - 3)
> > = 2 ^ (n - m + 16 - n + m)
          13
> > = 2 ^ 16
          13
> > = 64 K
       8 K
>
> For the record, I still think it's 8k ;)
>
> BITS_TO_LONGS(n) * sizeof(unsigned long) is always equal to n/8
> regardless of the arch, while the above math gives BITS_TO_LONGS(n) *
> sizeof(unsigned long) == n.
Ah, right!
My math was bad :p
Thank you!
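
For anyone re-checking the arithmetic, here is a rough user-space sketch of the bitmap_size expression. The `demo_*` names are made up for illustration, and demo_udp_bitmap_bits() is only an assumed stand-in for udp_bitmap_size(), taken from the 2 ^ (16 - n) figure quoted above. The range 128..1024 is likewise an assumption, chosen so that each bucket's bit count stays a multiple of the word size on both 32-bit and 64-bit builds.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build machine (32 or 64 in practice). */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Same rounding as BITS_TO_LONGS(): number of longs needed for nbits. */
static size_t demo_bits_to_longs(size_t nbits)
{
	return (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
}

/* Assumed stand-in for udp_bitmap_size(): 2^16 ports split over the buckets. */
static size_t demo_udp_bitmap_bits(size_t hash_entries)
{
	return 65536 / hash_entries;
}

/* Mirrors: hash_entries * BITS_TO_LONGS(bits) * sizeof(unsigned long). */
static size_t demo_pernet_bitmap_bytes(size_t hash_entries)
{
	return hash_entries *
	       demo_bits_to_longs(demo_udp_bitmap_bits(hash_entries)) *
	       sizeof(unsigned long);
}

/* Every power of two in the assumed range gives the same 8 KiB total. */
static void demo_check_bitmap_bytes(void)
{
	size_t entries;

	for (entries = 128; entries <= 1024; entries *= 2)
		assert(demo_pernet_bitmap_bytes(entries) == 8192);
}
```

The n/8 shortcut relies on n being a multiple of the word size. Once a bucket's bit count drops below one long, BITS_TO_LONGS() rounds up to a full word and the total grows past 8k.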