[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <161643489069.6320.12260867980480523074@kwain.local>
Date: Mon, 22 Mar 2021 18:41:30 +0100
From: Antoine Tenart <atenart@...nel.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: davem@...emloft.net, kuba@...nel.org, alexander.duyck@...il.com,
netdev@...r.kernel.org, kernel test robot <oliver.sang@...el.com>
Subject: Re: [PATCH net-next] net-sysfs: remove possible sleep from an RCU read-side critical section
Quoting Matthew Wilcox (2021-03-22 17:54:39)
> On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote:
> > xps_queue_show is mostly made of an RCU read-side critical section and
> > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not
> > allowed as this call may sleep and such behaviours aren't allowed in RCU
> > read-side critical sections. Fix this by using GFP_NOWAIT instead.
>
> This would be another way of fixing the problem that is slightly less
> complex than my initial proposal, but does allow for using GFP_KERNEL
> for fewer failures:
>
> @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index,
> {
> struct xps_dev_maps *dev_maps;
> unsigned long *mask;
> - unsigned int nr_ids;
> + unsigned int nr_ids, new_nr_ids;
> int j, len;
>
> - rcu_read_lock();
> - dev_maps = rcu_dereference(dev->xps_maps[type]);
> + dev_maps = READ_ONCE(dev->xps_maps[type]);
Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids
as we're not in an RCU read-side critical section?
> /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0
> * when dev_maps hasn't been allocated yet, to be backward compatible.
> @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index,
> (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues);
>
> mask = bitmap_zalloc(nr_ids, GFP_KERNEL);
> - if (!mask) {
> - rcu_read_unlock();
> + if (!mask)
> return -ENOMEM;
> - }
> +
> + rcu_read_lock();
> + dev_maps = rcu_dereference(dev->xps_maps[type]);
> + /* if nr_ids shrank in the meantime, do not overrun array.
> + * if it increased, we just won't show the new ones
> + */
> + new_nr_ids = dev_maps ? dev_maps->nr_ids :
> + (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues);
> + if (new_nr_ids < nr_ids)
> + nr_ids = new_nr_ids;
>
> if (!dev_maps || tc >= dev_maps->num_tc)
> goto out_no_maps;
My feeling is there is not much value in having a tricky allocation
logic for reads from xps_cpus and xps_rxqs. While we could come up with
something, returning -ENOMEM on memory pressure should be fine.
Antoine
Powered by blists - more mailing lists