lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 18 Dec 2020 17:41:08 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Antoine Tenart <atenart@...nel.org>,
        David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net 1/4] net-sysfs: take the rtnl lock when storing xps_cpus

On Fri, Dec 18, 2020 at 4:30 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Thu, 17 Dec 2020 17:25:18 +0100 Antoine Tenart wrote:
> > Callers to netif_set_xps_queue should take the rtnl lock. Failing to do
> > so can lead to race conditions between netdev_set_num_tc and
> > netif_set_xps_queue, triggering various oops:
> >
> > - netif_set_xps_queue uses dev->tc_num as one of the parameters to
> >   compute the size of new_dev_maps when allocating it. dev->tc_num is
> >   also used to access the map, and the compiler may generate code to
> >   retrieve this field multiple times in the function.
> >
> > - netdev_set_num_tc sets dev->tc_num.
> >
> > If new_dev_maps is allocated using dev->tc_num and then dev->tc_num is
> > set to a higher value through netdev_set_num_tc, later accesses to
> > new_dev_maps in netif_set_xps_queue could lead to accessing memory
> > outside of new_dev_maps; triggering an oops.
> >
> > One way of triggering this is to set an iface up (for which the driver
> > uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
> > xps_cpus in a concurrent thread. With the right timing an oops is
> > triggered.
> >
> > Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
>
> Let's CC Alex
>
> > Signed-off-by: Antoine Tenart <atenart@...nel.org>
>
> Two things: (a) is the datapath not exposed to a similar problem?
> __get_xps_queue_idx() uses dev->tc_num in a very similar fashion.

I think we are shielded from this by the fact that if you change the
number of tc the Tx path has to be torn down and rebuilt since you are
normally changing the qdisc configuration anyway.

> Should we perhaps make the "num_tcs" part of the XPS maps which is
> under RCU protection rather than accessing the netdev copy?

So it looks like the issue is the fact that we really need to
synchronize netdev_reset_tc, netdev_set_tc_queue, and
netdev_set_num_tc with __netif_set_xps_queue.

> (b) if we always take rtnl_lock, why have xps_map_mutex? Can we
> rearrange things so that xps_map_mutex is sufficient?

It seems like the quick and dirty way would be to look at updating the
3 functions I called out so that they were holding the xps_map_mutex
while they were updating things, and for __netif_set_xps_queue to
expand out the mutex to include the code starting at "if (dev->num_tc)
{".

Powered by blists - more mailing lists