netdev - Re: [PATCH net-next] net-sysfs: remove possible sleep from an RCU read-side critical section

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <161643489069.6320.12260867980480523074@kwain.local>
Date:   Mon, 22 Mar 2021 18:41:30 +0100
From:   Antoine Tenart <atenart@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     davem@...emloft.net, kuba@...nel.org, alexander.duyck@...il.com,
        netdev@...r.kernel.org, kernel test robot <oliver.sang@...el.com>
Subject: Re: [PATCH net-next] net-sysfs: remove possible sleep from an RCU read-side critical section

Quoting Matthew Wilcox (2021-03-22 17:54:39)
> On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote:
> > xps_queue_show is mostly made of an RCU read-side critical section and
> > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not
> > allowed as this call may sleep and such behaviours aren't allowed in RCU
> > read-side critical sections. Fix this by using GFP_NOWAIT instead.
> 
> This would be another way of fixing the problem that is slightly less
> complex than my initial proposal, but does allow for using GFP_KERNEL
> for fewer failures:
> 
> @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index,
>  {
>         struct xps_dev_maps *dev_maps;
>         unsigned long *mask;
> -       unsigned int nr_ids;
> +       unsigned int nr_ids, new_nr_ids;
>         int j, len;
>  
> -       rcu_read_lock();
> -       dev_maps = rcu_dereference(dev->xps_maps[type]);
> +       dev_maps = READ_ONCE(dev->xps_maps[type]);

Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids
as we're not in an RCU read-side critical section?

>         /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0
>          * when dev_maps hasn't been allocated yet, to be backward compatible.
> @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index,
>                  (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues);
>  
>         mask = bitmap_zalloc(nr_ids, GFP_KERNEL);
> -       if (!mask) {
> -               rcu_read_unlock();
> +       if (!mask)
>                 return -ENOMEM;
> -       }
> +
> +       rcu_read_lock();
> +       dev_maps = rcu_dereference(dev->xps_maps[type]);
> +       /* if nr_ids shrank in the meantime, do not overrun array.
> +        * if it increased, we just won't show the new ones
> +        */
> +       new_nr_ids = dev_maps ? dev_maps->nr_ids :
> +                       (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues);
> +       if (new_nr_ids < nr_ids)
> +               nr_ids = new_nr_ids;
>  
>         if (!dev_maps || tc >= dev_maps->num_tc)
>                 goto out_no_maps;

My feeling is there is not much value in having a tricky allocation
logic for reads from xps_cpus and xps_rxqs. While we could come up with
something, returning -ENOMEM on memory pressure should be fine.

Antoine