Message-ID: <20210104153836.GS3021@hirez.programming.kicks-ass.net>
Date:   Mon, 4 Jan 2021 16:38:36 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Waiman Long <longman@...hat.com>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will.deacon@....com>,
        linux-kernel@...r.kernel.org, Bart Van Assche <bvanassche@....org>,
        Paul McKenney <paulmck@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH] locking/lockdep: Use local_irq_save() with call_rcu()

On Tue, Dec 22, 2020 at 05:55:53PM -0500, Waiman Long wrote:
> The following lockdep splat was hit:
> 
>  [  560.638354] WARNING: CPU: 79 PID: 27458 at kernel/rcu/tree_plugin.h:1749 call_rcu+0x6dc/0xf00
>     :
>  [  560.647761] RIP: 0010:call_rcu+0x6dc/0xf00
>  [  560.647763] Code: 0f 8f 29 04 00 00 e8 93 da 1c 00 48 8b 3c 24 57 9d 0f 1f 44 00 00 e9 19 fa ff ff 65 8b 05 38 83 c4 49 85 c0 0f 84 cd fb ff ff <0f> 0b e9 c6 fb ff ff e8 b8 45 51 00 4c 89 f2 48 b8 00 00 00 00 00
>  [  560.647764] RSP: 0018:ff11001050097b58 EFLAGS: 00010002
>  [  560.647766] RAX: 0000000000000001 RBX: ffffffffbb1f3360 RCX: 0000000000000001
>  [  560.647766] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb99bac9c
>  [  560.647767] RBP: 1fe220020a012f73 R08: 000000010004005c R09: dffffc0000000000
>  [  560.647768] R10: dffffc0000000000 R11: 0000000000000003 R12: ff1100105b7f70e1
>  [  560.647769] R13: ffffffffb635d8a0 R14: ff1100105b7f72d8 R15: ff1100105b7f7040
>  [  560.647770] FS:  00007fd9b3437080(0000) GS:ff1100105b600000(0000) knlGS:0000000000000000
>  [  560.647771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [  560.647772] CR2: 00007fd9b30112bc CR3: 000000105e898006 CR4: 0000000000761ee0
>  [  560.647773] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [  560.647773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  [  560.647774] PKRU: 55555554
>  [  560.647774] Call Trace:
>  [  560.647778]  ? invoke_rcu_core+0x180/0x180
>  [  560.647782]  ? __is_module_percpu_address+0xed/0x440
>  [  560.647787]  lockdep_unregister_key+0x2ab/0x5b0
>  [  560.647791]  destroy_workqueue+0x40b/0x610
>  [  560.647862]  xlog_dealloc_log+0x216/0x2b0 [xfs]
>     :
> 
> This splat is caused by the fact that lockdep_unregister_key() uses
> raw_local_irq_save() which doesn't update the hardirqs_enabled
> percpu flag.  The call_rcu() function, however, will call
> lockdep_assert_irqs_disabled() to check the hardirqs_enabled flag which
> remained set in this case.
> 
> Fix this problem by using local_irq_save()/local_irq_restore() pairs
> whenever call_rcu() is being called.

I'm not sure I much like all this... :/

> I think raw_local_irq_save() function can be used if no external
> function is being called except maybe printk() as it means another
> lockdep problem exists.

The reason lockdep is using raw_local_irq_save() is to avoid calling
into itself again; notably, local_irq_restore() will end up in
mark_held_locks().
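
To make the trade-off concrete, here is a rough userspace model, not kernel
code. Every m_* name is hypothetical, and the tracing hook is only a
stand-in for the lockdep IRQ-state tracking that the non-raw helpers end up
calling. It shows that the raw variant never re-enters the tracking code but
leaves the tracked flag stale, while the traced variant keeps the flag
accurate at the cost of calling back into the tracker.

```c
#include <stdbool.h>

/* Userspace model only; none of this is the kernel's implementation. */
static bool m_hw_irqs_enabled = true;   /* stands in for the CPU interrupt flag */
static bool m_hardirqs_enabled = true;  /* stands in for lockdep's per-CPU flag */
static unsigned int m_lockdep_entries;  /* how often the tracker was entered */

/* Hypothetical hook: models the tracing side of the non-raw helpers. */
static void m_trace_hook(bool enabled)
{
	m_lockdep_entries++;
	m_hardirqs_enabled = enabled;
}

/* Raw variant: changes the "hardware" state, tells the tracker nothing. */
static bool m_raw_irq_save(void)
{
	bool was = m_hw_irqs_enabled;

	m_hw_irqs_enabled = false;
	return was;
}

static void m_raw_irq_restore(bool was)
{
	m_hw_irqs_enabled = was;
}

/* Traced variant: same state change, plus a call into the tracker. */
static bool m_irq_save(void)
{
	bool was = m_raw_irq_save();

	m_trace_hook(false);
	return was;
}

static void m_irq_restore(bool was)
{
	m_trace_hook(was);
	m_raw_irq_restore(was);
}

/* Returns true when the modelled "IRQs must be disabled" check would warn. */
static bool m_assert_irqs_disabled_warns(void)
{
	return m_hardirqs_enabled;
}
```

In this model, calling m_assert_irqs_disabled_warns() after m_raw_irq_save()
reports a warning even though interrupts are off, which is the shape of the
reported splat.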

> Fixes: a0b0fd53e1e67 ("locking/lockdep: Free lock classes that are no longer in use")

Seems dubious, as the lockdep_assert_irqs_disabled() that triggered was
added after that patch.

I'm thinking another solution would be to increment the lockdep
recursion count before calling RCU, because then we'll fail
__lockdep_enabled and the assertion gets killed. Hmm?
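
As a rough illustration of that idea, the following is a userspace model and
not a patch. All sim_* names are hypothetical, and it assumes that
__lockdep_enabled amounts to "debug_locks is set and the recursion count is
zero" and that the assertion only fires while that holds. Under those
assumptions, bumping the recursion count around the RCU call silences the
check without touching the stale hardirqs_enabled flag.

```c
#include <stdbool.h>

/* Userspace model only; assumed semantics, not the kernel's definitions. */
static bool sim_debug_locks = true;
static bool sim_hardirqs_enabled = true;   /* stale: the raw save did not clear it */
static unsigned int sim_lockdep_recursion; /* models the per-CPU recursion count */

/* Assumed meaning of __lockdep_enabled. */
static bool sim_lockdep_enabled(void)
{
	return sim_debug_locks && sim_lockdep_recursion == 0;
}

/* Returns true when the modelled lockdep_assert_irqs_disabled() would warn. */
static bool sim_assert_irqs_disabled_warns(void)
{
	return sim_lockdep_enabled() && sim_hardirqs_enabled;
}

/* Models call_rcu(): reports whether the assertion inside it fired. */
static bool sim_call_rcu(void)
{
	return sim_assert_irqs_disabled_warns();
}

/*
 * Models the unregister path after a raw IRQ save. With bump_recursion
 * set, the recursion count is raised only around the RCU call.
 */
static bool sim_unregister_key(bool bump_recursion)
{
	bool warned;

	if (bump_recursion)
		sim_lockdep_recursion++;
	warned = sim_call_rcu();
	if (bump_recursion)
		sim_lockdep_recursion--;

	return warned;
}
```

Here sim_unregister_key(false) reports the warning and
sim_unregister_key(true) does not; whether this is the right trade-off for
the real code is exactly the open question above.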
