Message-ID: <50c9ac34-3971-ac1b-0b5e-1bfa004fd4cf@oracle.com>
Date: Thu, 3 May 2018 10:01:40 -0400
From: Steven Sistare <steven.sistare@...cle.com>
To: Pavel Tatashin <pasha.tatashin@...cle.com>
Cc: Alexander Duyck <alexander.duyck@...il.com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
LKML <linux-kernel@...r.kernel.org>,
Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
Netdev <netdev@...r.kernel.org>,
Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [PATCH 1/2] ixgbe: release lock for the duration of
ixgbe_suspend_close()
On 5/3/2018 9:46 AM, Alexander Duyck wrote:
> On Thu, May 3, 2018 at 6:30 AM, Pavel Tatashin
> <pasha.tatashin@...cle.com> wrote:
>> Hi Alex,
>
> Hi Pavel,
>
>>> I'm not a fan of dropping the mutex while we go through
>>> ixgbe_close_suspend. I'm concerned it will result in us having a
>>> number of races on shutdown.
>>
>> I would agree, but ixgbe_close_suspend() is already called without this
>> mutex from ixgbe_close(). That path is also executed during machine
>> shutdown, but only when kexec'ed. So either this is an existing bug or
>> there are no races. And because ixgbe_close() is called from userland,
>> a little earlier than ixgbe_shutdown(), I think there are no races.
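(For anyone following the thread: the change being discussed is roughly
of this shape, i.e. releasing RTNL only for the duration of the slow
close itself. This is a simplified sketch, not the actual diff:)

static int __ixgbe_shutdown(struct pci_dev *pdev, bool *enable_wake)
{
	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
	struct net_device *netdev = adapter->netdev;

	rtnl_lock();
	netif_device_detach(netdev);
	if (netif_running(netdev)) {
		/* Drop the global mutex while this device closes so that
		 * other devices shutting down in parallel are not blocked
		 * behind it.
		 */
		rtnl_unlock();
		ixgbe_close_suspend(adapter);
		rtnl_lock();
	}
	rtnl_unlock();

	/* ... WoL and PCI power-state handling elided ... */
	return 0;
}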
>
> All callers of ixgbe_close are supposed to already be holding the RTNL
> mutex. That is the whole reason we need to take it here: the shutdown
> path doesn't otherwise hold the mutex. The mutex was added here because
> shutdown was racing with the open/close calls.
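(Simplified, from memory -- details may differ from the exact tree under
discussion. ixgbe_close() is the driver's .ndo_stop hook; it never takes
rtnl itself, it relies on the core holding it, e.g. __dev_close_many()
does ASSERT_RTNL() before calling ndo_stop:)

int ixgbe_close(struct net_device *netdev)
{
	struct ixgbe_adapter *adapter = netdev_priv(netdev);

	ixgbe_ptp_stop(adapter);

	/* Caller (the core, under rtnl_lock) guarantees exclusion here. */
	if (netif_device_present(netdev))
		ixgbe_close_suspend(adapter);

	ixgbe_fdir_filter_exit(adapter);
	ixgbe_release_hw_control(adapter);

	return 0;
}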
>
>>
>>> If anything, I think we would need to find a replacement for the mutex
>>> that can operate at the per-netdev level if you are wanting to
>>> parallelize this.
>>
>> Yes, if a lock is necessary, it can be replaced in this place (and
>> added to ixgbe_close()) with something more scalable.
>
> I wouldn't suggest adding yet another lock for all this. Instead it
> might make more sense to start breaking up the locking. Since most of
> the networking subsystem relies on the rtnl_lock call, it might be time
> to cut back on its use in cases where all the resources needed are
> specific to one device, and instead have a per-device lock that could
> cover those kinds of cases.
Hi Pavel, you might want to pull the lock optimization out of this patch
series. The parallelization by itself is valuable, and optimizations for
individual devices, including locking, can come later.
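For what it's worth, the per-device lock Alex describes might end up
looking something like the sketch below. This is purely hypothetical,
not existing ixgbe code: the state_lock field and ixgbe_shutdown_one()
are made-up names, and ixgbe_open()/ixgbe_close() would need to take the
same lock for it to serialize anything.

/*
 * Hypothetical sketch only -- a per-adapter mutex serializing
 * open/close/suspend for one device, so the global rtnl_lock would not
 * need to be held across the slow ixgbe_close_suspend() call.
 */
struct ixgbe_adapter {
	struct net_device *netdev;
	struct mutex state_lock;	/* protects open/close/suspend state */
	/* ... the real struct has many more fields ... */
};

static void ixgbe_shutdown_one(struct ixgbe_adapter *adapter)
{
	mutex_lock(&adapter->state_lock);	/* per-device, not rtnl */
	netif_device_detach(adapter->netdev);
	if (netif_running(adapter->netdev))
		ixgbe_close_suspend(adapter);	/* existing driver helper */
	mutex_unlock(&adapter->state_lock);
}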
- Steve