Message-ID: <c78a8972-4454-9ade-1e6d-4322e6190ebe@lab.ntt.co.jp>
Date: Thu, 18 Oct 2018 18:53:54 +0900
From: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Jason Wang <jasowang@...hat.com>, netdev@...r.kernel.org,
virtualization@...ts.linux-foundation.org, tglx@...utronix.de,
"Michael S. Tsirkin" <mst@...hat.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [RFC] virtio_net: add local_bh_disable() around
u64_stats_update_begin
On 2018/10/18 18:30, Sebastian Andrzej Siewior wrote:
> On 2018-10-18 18:19:21 [+0900], Toshiaki Makita wrote:
>> On 2018/10/18 18:08, Sebastian Andrzej Siewior wrote:
>>> Again: lockdep saw the lock in softirq context once and in process
>>> context once and this is what triggers the warning. It does not matter
>>> if NAPI is enabled or not during the access in process context. If you
>>> want to allow this you need further lockdep annotation…
>>>
>>> … but: refill_work() disables NAPI for &vi->rq[1] and refills + updates
>>> stats while NAPI is enabled for &vi->rq[0].
>>
>> Do you mean this is false positive? rq[0] and rq[1] never race with each
>> other...
>
> Why? So you can't refill rq[1] and then be interrupted and process NAPI
> for rq[0]?
I may not understand what you are trying to say, but rq[0] and rq[1] are
different objects, so they can be processed concurrently without
affecting each other.
>
> But as I said. If lockdep saw the lock acquired in softirq (what it
> did) _and_ in process context (what it did as well) _once_ then this is
> enough evidence for the warning.
> If you claim that this can not happen due to NAPI guard [0] then this is
> something lockdep does not know about.
The point is that this is not a problem with the stats. try_fill_recv()
should never be called concurrently for the same rq in the first place.
The NAPI guard is supposed to satisfy this requirement, so if the NAPI
guard logic is buggy, we need to fix it.
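To illustrate why writers for the same rq must be serialized, here is a
minimal user-space sketch. It is not kernel code and does not use
u64_stats_sync; struct rq_stats, stats_begin(), stats_end(),
reader_would_accept() and torn_read_possible() are hypothetical names
that only model a sequence counter guarding a 64-bit value split into
two 32-bit halves. The "softirq" is simulated by nesting one update
inside another in a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of per-queue stats guarded by a sequence counter. */
struct rq_stats {
    unsigned int seq;  /* odd while a writer is inside its update */
    uint32_t lo, hi;   /* one 64-bit counter stored as two halves */
};

void stats_begin(struct rq_stats *s) { s->seq++; }
void stats_end(struct rq_stats *s)   { s->seq++; }

/* A reader only trusts a snapshot taken while seq is even. */
bool reader_would_accept(const struct rq_stats *s)
{
    return (s->seq & 1) == 0;
}

/*
 * Simulate an update of `outer` in process context that is interrupted,
 * halfway through, by a softirq updating `inner`. Returns true if a
 * reader on another CPU would accept `outer` while it is half updated.
 */
bool torn_read_possible(struct rq_stats *outer, struct rq_stats *inner)
{
    bool accepted;

    stats_begin(outer);   /* process context enters the write section */
    outer->lo += 1;       /* first half written, second half pending  */

    stats_begin(inner);   /* simulated softirq enters its own section */
    accepted = reader_would_accept(outer); /* remote reader samples   */
    inner->lo += 1;
    inner->hi += 1;
    stats_end(inner);

    outer->hi += 1;       /* second half written */
    stats_end(outer);

    /* The sequence is even again once every writer has left. */
    assert((outer->seq & 1) == 0);
    return accepted;
}
```

In this model, torn_read_possible(&rq1, &rq0) returns false because
rq[1]'s sequence stays odd for the whole update, whereas
torn_read_possible(&rq1, &rq1) returns true: the nested writer makes the
sequence even while the outer update is still half done. That is the
situation the NAPI guard is expected to rule out for a single rq.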
> [0] which I currently don't understand and therefore sent the patch [1]
> as Jason pointed out that in the ->ndo_open case the work is
> scheduled and then NAPI is enabled (which means the worker could
> disable NAPI and refill but before it finishes, ->ndo_open would
> continue and enable NAPI).
It may be related, but I have not investigated it in depth. I'll check
later whether it can cause this problem.
> [1] 20181018084753.wefvsypdevbzoadg@...utronix.de
--
Toshiaki Makita