lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 5 May 2020 21:25:45 -0400 From: "Michael S. Tsirkin" <mst@...hat.com> To: Eric Dumazet <eric.dumazet@...il.com> Cc: Thomas Gleixner <tglx@...utronix.de>, netdev@...r.kernel.org, Jason Wang <jasowang@...hat.com>, Peter Zijlstra <peterz@...radead.org> Subject: Re: [BUG] Inconsistent lock state in virtnet poll On Tue, May 05, 2020 at 06:19:09PM -0700, Eric Dumazet wrote: > > > On 5/5/20 5:43 PM, Michael S. Tsirkin wrote: > > On Tue, May 05, 2020 at 03:40:09PM -0700, Eric Dumazet wrote: > >> > >> > >> On 5/5/20 3:30 PM, Thomas Gleixner wrote: > >>> "Michael S. Tsirkin" <mst@...hat.com> writes: > >>>> On Tue, May 05, 2020 at 02:08:56PM +0200, Thomas Gleixner wrote: > >>>>> > >>>>> The following lockdep splat happens reproducibly on 5.7-rc4 > >>>> > >>>>> ================================ > >>>>> WARNING: inconsistent lock state > >>>>> 5.7.0-rc4+ #79 Not tainted > >>>>> -------------------------------- > >>>>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > >>>>> ip/356 [HC0[0]:SC1[1]:HE1:SE0] takes: > >>>>> f3ee4cd8 (&syncp->seq#2){+.?.}-{0:0}, at: net_rx_action+0xfb/0x390 > >>>>> {SOFTIRQ-ON-W} state was registered at: > >>>>> lock_acquire+0x82/0x300 > >>>>> try_fill_recv+0x39f/0x590 > >>>> > >>>> Weird. Where does try_fill_recv acquire any locks? > >>> > >>> u64_stats_update_begin(&rq->stats.syncp); > >>> > >>> That's a 32bit kernel which uses a seqcount for this. sequence counts > >>> are "lock" constructs where you need to make sure that writers are > >>> serialized. > >>> > >>> Actually the problem at hand is that try_fill_recv() is called from > >>> fully preemptible context initialy and then from softirq context. > >>> > >>> Obviously that's for the open() path a non issue, but lockdep does not > >>> know about that. OTOH, there is other code which calls that from > >>> non-softirq context. > >>> > >>> The hack below made it shut up. It's obvioulsy not ideal, but at least > >>> it let me look at the actual problem I was chasing down :) > >>> > >>> Thanks, > >>> > >>> tglx > >>> > >>> 8<----------- > >>> --- a/drivers/net/virtio_net.c > >>> +++ b/drivers/net/virtio_net.c > >>> @@ -1243,9 +1243,11 @@ static bool try_fill_recv(struct virtnet > >>> break; > >>> } while (rq->vq->num_free); > >>> if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) { > >>> + local_bh_disable(); > >> > >> Or use u64_stats_update_begin_irqsave() whic is a NOP on 64bit kernels > > > > I applied this, but am still trying to think of something that > > is 0 overhead for all configs. > > Maybe we can select a lockdep class depending on whether napi > > is enabled? > > > Do you _really_ need 64bit counter for stats.kicks on 32bit kernels ? > > Adding 64bit counters just because we can might be overhead anyway. Well 32 bit kernels don't fundamentally kick less than 64 bit ones, and we kick more or less per packet, sometimes per batch, people expect these to be in sync .. > > > > > >>> u64_stats_update_begin(&rq->stats.syncp); > >>> rq->stats.kicks++; > >>> u64_stats_update_end(&rq->stats.syncp); > >>> + local_bh_enable(); > >>> } > >>> > >>> return !oom; > >>> > >
Powered by blists - more mailing lists