Date:   Fri, 06 Jan 2017 12:13:23 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     David Miller <davem@...emloft.net>
Cc:     michael.chan@...adcom.com, netdev@...r.kernel.org
Subject: Re: [PATCH net] net: Fix inconsistent rtnl_lock usage on
 dev_get_stats().

On Fri, 2017-01-06 at 13:01 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Fri, 06 Jan 2017 09:32:56 -0800
> 
> > This makes no sense to me.
> > 
> > RTNL is absolutely not needed to get device stats.
> > 
> > We try to not add RTNL, especially when not required.
> > 
> > Sure, RTNETLINK dumps currently hold RTNL, but we had various attempts
> > in the past to get rid of this behavior.
> > 
> > If a device driver expects RTNL being locked, it is clearly a bug that
> > needs a fix anyway.
> 
> This is extremely problematic when the driver has to synchronize some
> piece of state between the get stats method and open/close.  It is
> exactly the case we are trying to solve in tg3, and lots of drivers
> end up hitting the same exact issue.
> 
> If open/close can happen asynchronously to get stats, it is very hard
> to make dynamically allocated data structures or DMA buffers usable
> from the stats call.

Yes, I had some issues lately with mlx4. Netdevices are protected by
RCU, so adding proper RCU logic for the stats is doable.

> 
> Drivers in this situation will just add a mutex specifically for this
> situation if we don't consistently apply RTNL locking here.

Well, there are cases where RTNL is quite contended, but monitoring
tools still want to read /proc/net/dev or various sysfs attributes
(netstat_show() can be called very often for
/sys/class/net/*/statistics/*) in a reasonable amount of time.
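
To make the access pattern concrete, here is a rough sketch of what such
a monitoring poller might look like. It is only an illustration: the
function name read_net_stats and its base parameter are hypothetical,
and a real tool would likely do more error handling. Each file read
below is one sysfs attribute read, so a single pass costs one read per
counter per device.

```python
import os


def read_net_stats(base="/sys/class/net"):
    """Return {interface: {counter: value}} from sysfs-style statistics files.

    `base` is a hypothetical parameter so the walk can be pointed at a
    test directory instead of the real sysfs tree.
    """
    stats = {}
    if not os.path.isdir(base):
        return stats
    for ifname in sorted(os.listdir(base)):
        stat_dir = os.path.join(base, ifname, "statistics")
        if not os.path.isdir(stat_dir):
            # Skip entries that are not network devices.
            continue
        counters = {}
        for name in sorted(os.listdir(stat_dir)):
            try:
                with open(os.path.join(stat_dir, name)) as f:
                    counters[name] = int(f.read().strip())
            except (OSError, ValueError):
                # The device may have gone away between listdir() and open().
                continue
        stats[ifname] = counters
    return stats
```

Run in a loop every few seconds across many devices, this adds up to a
large number of reads, which is why the cost of each individual read
matters.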


I fear that such a change will add drifts, when devices are constantly
added/removed.

