linux-kernel - Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush


Message-ID: <20180703213539.i3mozfuhi7j7xefm@linutronix.de>
Date:   Tue, 3 Jul 2018 23:35:39 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Tejun Heo <tj@...nel.org>
Cc:     linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()

On 2018-07-03 13:24:24 [-0700], Tejun Heo wrote:
> (cc'ing Peter and Ingo for lockdep)
> 
> Hello, Sebastian.
Hi Tejun,

> On Tue, Jul 03, 2018 at 06:45:44PM +0200, Sebastian Andrzej Siewior wrote:
> > All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> > either with spin_lock_irq() or spin_lock_irqsave().
> 
> So, irq is always disabled in cgroup_rstat_flush_locked().

On non-RT kernels, yes. On RT-enabled kernels spin_lock_irq.*() is
turned into a sleeping spinlock, which does not disable interrupts.

> > cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
> > is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
> > in IRQ context and therefore requires _irqsave() locking suffix in
> > cgroup_rstat_flush_locked().
> 
> Yes, the cpu locks should be irqsafe too; however, as irq is always
> disabled in that function, save/restore is redundant, no?

As I pointed out above, on -RT only the irq-disabling variants of a
raw_spinlock_t lock really disable interrupts; the spinlock_t ones do
not. That is the difference between those two.

> > Since there is no difference between spin_lock_t and raw_spin_lock_t
> > on !RT lockdep does not complain here. On RT lockdep complains because
> > the interrupts were not disabled here and a deadlock is possible.
> 
> We at least used to do this in the kernel - manipulating irqsafe locks
> with spin_lock/unlock() if the irq state is known, whether enabled or
> disabled, and ISTR lockdep being smart enough to track actual irq
> state to determine irq safety.  Am I misremembering or is this
> different on RT kernels?

No, you remember correctly. On !RT kernels spin_lock_irq() disables
interrupts, so raw_spin_lock() is taken with interrupts already disabled
and everything is fine. On RT kernels spin_lock_irq() does not disable
interrupts, so raw_spin_lock() acquires the lock with interrupts enabled
and lockdep rightly complains.
lockdep sees the hardirq path via:

 {IN-HARDIRQ-W} state was registered at:
   lock_acquire+0x9e/0x250
   _raw_spin_lock_irqsave+0x38/0x50
   cgroup_rstat_updated+0x57/0x100
   cgroup_base_stat_cputime_account_end.isra.6+0x17/0x60
   __cgroup_account_cputime_field+0x49/0x60
   account_system_index_time+0xdb/0x1f0
   account_system_time+0x3f/0x70
   account_process_tick+0x59/0x80
   update_process_times+0x1d/0x50
   tick_sched_handle+0x20/0x60
   tick_sched_timer+0x37/0x80
   __hrtimer_run_queues+0x12c/0x6d0
   hrtimer_interrupt+0xed/0x240
   smp_apic_timer_interrupt+0x89/0x3c0
   apic_timer_interrupt+0xf/0x20
   pin_current_cpu+0xa/0x120
   migrate_disable+0x9a/0x200
   rt_spin_lock+0x1d/0x60
   put_unused_fd+0x2c/0x50
   do_sys_open+0x23a/0x250
   __x64_sys_openat+0x1b/0x20
   do_syscall_64+0x50/0x190
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
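
To make the two cases above concrete, here is a small user-space toy
model. It is not kernel code: every model_*() helper and the
irqs_enabled flag are made up for illustration and only track whether
"interrupts" would be enabled while the inner raw lock is held. It
assumes, as described above, that spin_lock_irq() disables interrupts
only on !RT.

```c
#include <stdbool.h>

/* Toy model only: a single flag standing in for the CPU's interrupt state. */
static bool irqs_enabled = true;

/* Models spin_lock_irq(): disables interrupts on !RT, leaves them alone on RT. */
static void model_spin_lock_irq(bool rt)
{
	if (!rt)
		irqs_enabled = false;
}

/* Models spin_unlock_irq(): re-enables interrupts on !RT, no-op on RT. */
static void model_spin_unlock_irq(bool rt)
{
	if (!rt)
		irqs_enabled = true;
}

/* Models raw_spin_lock_irqsave(): saves the state, then disables interrupts. */
static bool model_raw_spin_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models raw_spin_unlock_irqrestore(): puts the saved state back. */
static void model_raw_spin_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/*
 * Returns true if the inner raw lock would be held with interrupts
 * enabled, which is the situation lockdep complains about.
 */
static bool inner_lock_held_with_irqs_on(bool rt, bool use_irqsave)
{
	bool unsafe;

	irqs_enabled = true;
	model_spin_lock_irq(rt);		/* outer cgroup_rstat_lock */

	if (use_irqsave) {
		bool flags = model_raw_spin_lock_irqsave();

		unsafe = irqs_enabled;		/* state while the raw lock is held */
		model_raw_spin_unlock_irqrestore(flags);
	} else {
		/* Models plain raw_spin_lock(): interrupt state is untouched. */
		unsafe = irqs_enabled;
	}

	model_spin_unlock_irq(rt);
	return unsafe;
}
```

In this model only the combination "RT, plain raw_spin_lock()" leaves
interrupts enabled while the inner lock is held. With the _irqsave()
variant the result is the same on RT and !RT, and the saved state is
restored afterwards.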

> Thanks.

Sebastian