Message-ID: <476f1719-6fab-88fe-11ab-8e1d738275e3@cumulusnetworks.com>
Date:	Tue, 21 Jun 2016 07:22:38 -0600
From:	David Ahern <dsa@...ulusnetworks.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: chain_key collision with 4.7.0-rc2+

On 6/21/16 7:13 AM, Peter Zijlstra wrote:
> Please do not wrap dmesg output when pasting.

It is not wrapped on my end: neither the copy in my Sent folder nor the
output quoted in your response is wrapped.

>
>> IPv4 one:
>
>> [  189.171084] ======================
>> [  189.171555] [chain_key collision ]
>> [  189.172020] 4.7.0-rc2+ #310 Tainted: G        W
>> [  189.172729] ----------------------
>> [  189.173192] sshd/1463: Hash chain already cached but the contents don't match!
>> [  189.174236] Held locks:depth: 6
>> [  189.174707]  class_idx:1253 -> chain_key:00000000000004e5 (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff8144a84c>] lock_sock+0xb/0xd
>> [  189.176376]  class_idx:26 -> chain_key:00000000009ca01a (rcu_read_lock){......}, at: [<ffffffff814421fb>] rcu_lock_acquire+0x0/0x20
>> [  189.178014]  class_idx:898 -> chain_key:0000001394034382 (rcu_read_lock_bh){......}, at: [<ffffffff814421fb>] rcu_lock_acquire+0x0/0x20
>> [  189.179712]  class_idx:898 -> chain_key:0002728068704382 (rcu_read_lock_bh){......}, at: [<ffffffff81400393>] rcu_lock_acquire+0x0/0x20
>> [  189.181374]  class_idx:1221 -> chain_key:4e500d0e087044c5 (dev->qdisc_running_key ?: &qdisc_running_key){+.....}, at: [<ffffffff81408a7e>] __dev_queue_xmit+0x40c/0x72d
>> [  189.183400]  class_idx:1222 -> chain_key:01a1c10e0898ad0c (_xmit_ETHER#2){+.-...}, at: [<ffffffff814287e6>] sch_direct_xmit+0x6f/0x167
>> [  189.185178] Locks in cached chain:depth: 6
>> [  189.185791]  class_idx:1252 -> chain_key:00000000000004e4 (((&ndev->rs_timer))){+.-...}
>> [  189.186957]  class_idx:26 -> chain_key:00000000009c801a (rcu_read_lock){......}
>> [  189.187991]  class_idx:898 -> chain_key:0000001390034382 (rcu_read_lock_bh){......}
>> [  189.189023]  class_idx:898 -> chain_key:0002720068704382 (rcu_read_lock_bh){......}
>> [  189.190050]  class_idx:1221 -> chain_key:4e400d0e087044c5 (dev->qdisc_running_key ?: &qdisc_running_key){+.....}
>> [  189.191370]  class_idx:1220 -> chain_key:01a1c10e0898ad0c (&(&list->lock)->rlock#3){+.-...}
>
>> and IPv6:
>
>> [   22.082615] ======================
>> [   22.083126] [chain_key collision ]
>> [   22.083646] 4.7.0-rc2+ #310 Tainted: G        W
>> [   22.084410] ----------------------
>> [   22.084927] sshd/1401: Hash chain already cached but the contents don't match!
>> [   22.086010] Held locks:depth: 6
>> [   22.086509]  class_idx:1255 -> chain_key:00000000000004e7 (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff8144a84c>] lock_sock+0xb/0xd
>> [   22.088253]  class_idx:26 -> chain_key:00000000009ce01a (rcu_read_lock){......}, at: [<ffffffff814cd8a2>] rcu_read_lock+0x0/0x68
>> [   22.089999]  class_idx:898 -> chain_key:000000139c034382 (rcu_read_lock_bh){......}, at: [<ffffffff814a031b>] rcu_lock_acquire+0x0/0x20
>> [   22.091811]  class_idx:898 -> chain_key:0002738068704382 (rcu_read_lock_bh){......}, at: [<ffffffff81400393>] rcu_lock_acquire+0x0/0x20
>> [   22.093639]  class_idx:1223 -> chain_key:4e700d0e087044c7 (dev->qdisc_running_key ?: &qdisc_running_key){+.....}, at: [<ffffffff81408a7e>] __dev_queue_xmit+0x40c/0x72d
>> [   22.095824]  class_idx:1224 -> chain_key:01a1c10e0898ed06 (_xmit_ETHER#2){+.-...}, at: [<ffffffff814287e6>] sch_direct_xmit+0x6f/0x167
>> [   22.097616] Locks in cached chain:depth: 6
>> [   22.098197]  class_idx:1248 -> chain_key:00000000000004e0 (((&ndev->rs_timer))){+.-...}
>> [   22.099345]  class_idx:26 -> chain_key:00000000009c001a (rcu_read_lock){......}
>> [   22.100419]  class_idx:898 -> chain_key:0000001380034382 (rcu_read_lock_bh){......}
>> [   22.101535]  class_idx:898 -> chain_key:0002700068704382 (rcu_read_lock_bh){......}
>> [   22.102656]  class_idx:1223 -> chain_key:4e000d0e087044c7 (dev->qdisc_running_key ?: &qdisc_running_key){+.....}
>> [   22.104095]  class_idx:1222 -> chain_key:01a1c10e0898ed06 (&(&list->lock)->rlock#3){+.-...}
>
>
> Right, so those hashes do indeed match up pretty nicely.
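
To make the match concrete, here is a minimal Python sketch that recomputes
the chain_key column of the IPv4 report from its class_idx column. It assumes
each step rotates the 64-bit key left by 13 bits and XORs in the class index;
that step is inferred from the logged values (it reproduces every chain_key
shown), not quoted from the kernel source, and the names KEY_BITS and
chain_keys are made up for this illustration.

```python
MASK64 = (1 << 64) - 1
KEY_BITS = 13  # assumed rotation width, inferred from the logged keys


def iterate_chain_key(key, class_idx):
    """Rotate the 64-bit key left by KEY_BITS, then XOR in the class index."""
    rotated = ((key << KEY_BITS) & MASK64) | (key >> (64 - KEY_BITS))
    return rotated ^ class_idx


def chain_keys(class_indices):
    """Return the running chain key after each lock in the chain."""
    key = 0
    keys = []
    for idx in class_indices:
        key = iterate_chain_key(key, idx)
        keys.append(key)
    return keys


# class_idx columns copied from the IPv4 report
held = [1253, 26, 898, 898, 1221, 1222]    # "Held locks"
cached = [1252, 26, 898, 898, 1221, 1220]  # "Locks in cached chain"

for name, chain in (("held", held), ("cached", cached)):
    print(name, " ".join(f"{k:016x}" for k in chain_keys(chain)))
```

Under this assumption, five rotations by 13 bits add up to 65 bits, so the
one-bit difference between the first class indices (1253 vs 1252) wraps
around to bit 1. That is exactly where the last class indices (1222 vs 1220)
differ, so the two differences cancel and both chains end at
01a1c10e0898ad0c.
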
>
> Now note that this isn't a new issue, just a new warning. Such
> collisions were silent before and could/can/do hide actual lock cycles.
>
> In any case, could you perhaps try with commit (from tip/locking/core)
> to see if that makes it any better:
>
>   dfaaf3fa01d6 ("locking/lockdep: Use __jhash_mix() for iterate_chain_key()")

I'll add it and run for a while. I tend to see the trace about once a day,
depending on my workload (i.e., how much time I get to work on code).

>
> I still have to consider a better solution to the problem, because
> fundamentally collisions are unavoidable, and we should deal better with
> them. But our mixing function did suck somewhat, so making collisions
> less likely might give us a little more time to ponder.
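
As a rough illustration of what a stronger mixing step could look like, the
Python sketch below folds each class index into the key with a Jenkins
lookup3-style mix, which is what the name __jhash_mix() suggests. The
rotation constants, the split of the key into two 32-bit halves, and all
function names here are assumptions of this sketch, not quoted from the
commit. With this kind of mix, a difference in one class index spreads over
the whole key, so the two IPv4 chains from the report would be expected to
end in different keys.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, r):
    """Rotate a 32-bit word left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def jhash_mix(a, b, c):
    """Jenkins lookup3-style mix of three 32-bit words (assumed constants)."""
    a = (a - c) & MASK32
    a ^= rol32(c, 4)
    c = (c + b) & MASK32
    b = (b - a) & MASK32
    b ^= rol32(a, 6)
    a = (a + c) & MASK32
    c = (c - b) & MASK32
    c ^= rol32(b, 8)
    b = (b + a) & MASK32
    a = (a - c) & MASK32
    a ^= rol32(c, 16)
    c = (c + b) & MASK32
    b = (b - a) & MASK32
    b ^= rol32(a, 19)
    a = (a + c) & MASK32
    c = (c - b) & MASK32
    c ^= rol32(b, 4)
    b = (b + a) & MASK32
    return a, b, c


def iterate_chain_key(key, class_idx):
    """Mix class_idx into the two 32-bit halves of a 64-bit key."""
    k0 = key & MASK32
    k1 = (key >> 32) & MASK32
    _, k0, k1 = jhash_mix(class_idx & MASK32, k0, k1)
    return k0 | (k1 << 32)


def chain_key(class_indices):
    """Fold a whole lock chain into a single 64-bit key."""
    key = 0
    for idx in class_indices:
        key = iterate_chain_key(key, idx)
    return key


# class_idx columns copied from the IPv4 report
held = [1253, 26, 898, 898, 1221, 1222]
cached = [1252, 26, 898, 898, 1221, 1220]
print(f"{chain_key(held):016x} {chain_key(cached):016x}")
```

A better mix only makes collisions rarer. Any 64-bit key can still collide,
so comparing the cached chain's contents remains necessary.
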
>
> But if you're really annoyed by it all, disable DEBUG_LOCKDEP for a
> while and rest assured that it's no more broken than it was before --
> just louder :-)
>

That is not really an option if I want to make sure I am not adding locking
issues, and a habit of ignoring splats defeats the purpose of enabling
lock debugging. So hopefully you come up with a better answer.

Thanks for the quick response

