linux-kernel - Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush

Message-ID: <20180711110513.7gqhlf6odqoxnext@linutronix.de>
Date:   Wed, 11 Jul 2018 13:05:13 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Tejun Heo <tj@...nel.org>
Cc:     linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()

On 2018-07-03 23:35:39 [+0200], To Tejun Heo wrote:
> On 2018-07-03 13:24:24 [-0700], Tejun Heo wrote:
> > (cc'ing Peter and Ingo for lockdep)
> > 
> > Hello, Sebastian.
> Hi Tejun,
> 
> > On Tue, Jul 03, 2018 at 06:45:44PM +0200, Sebastian Andrzej Siewior wrote:
> > > All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> > > either with spin_lock_irq() or spin_lock_irqsave().
> > 
> > So, irq is always disabled in cgroup_rstat_flush_locked().
> 
> on non-RT kernels. On RT-enabled kernels spin_lock_irq.*() is
> turned into a sleeping spinlock, which does not disable interrupts.
> 
> > > cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
> > > is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
> > > in IRQ context and therefore requires _irqsave() locking suffix in
> > > cgroup_rstat_flush_locked().
> > 
> > Yes, the cpu locks should be irqsafe too; however, as irq is always
> > disabled in that function, save/restore is redundant, no?
> 
> as I pointed out above, only the raw_spinlock_t really disables
> interrupts on -RT. That is the difference between those two.
> 
> > > Since there is no difference between spinlock_t and raw_spinlock_t
> > > on !RT, lockdep does not complain here. On RT, lockdep complains because
> > > the interrupts were not disabled here and a deadlock is possible.
> > 
> > We at least used to do this in the kernel - manipulating irqsafe locks
> > with spin_lock/unlock() if the irq state is known, whether enabled or
> > disabled, and ISTR lockdep being smart enough to track actual irq
> > state to determine irq safety.  Am I misremembering or is this
> > different on RT kernels?
> 
> No, this is correct. So on !RT kernels the spin_lock_irq() disables
> interrupts and the raw_spin_lock() is taken with interrupts already
> disabled, so everything is good. On RT kernels the spin_lock_irq() does
> not disable interrupts, the raw_spin_lock() acquires the lock with
> interrupts enabled, and lockdep rightly complains.
> lockdep sees the hardirq path via:
> 
>  {IN-HARDIRQ-W} state was registered at:
>    lock_acquire+0x9e/0x250
>    _raw_spin_lock_irqsave+0x38/0x50
>    cgroup_rstat_updated+0x57/0x100
>    cgroup_base_stat_cputime_account_end.isra.6+0x17/0x60
>    __cgroup_account_cputime_field+0x49/0x60
>    account_system_index_time+0xdb/0x1f0
>    account_system_time+0x3f/0x70
>    account_process_tick+0x59/0x80
>    update_process_times+0x1d/0x50
>    tick_sched_handle+0x20/0x60
>    tick_sched_timer+0x37/0x80
>    __hrtimer_run_queues+0x12c/0x6d0
>    hrtimer_interrupt+0xed/0x240
>    smp_apic_timer_interrupt+0x89/0x3c0
>    apic_timer_interrupt+0xf/0x20
>    pin_current_cpu+0xa/0x120
>    migrate_disable+0x9a/0x200
>    rt_spin_lock+0x1d/0x60
>    put_unused_fd+0x2c/0x50
>    do_sys_open+0x23a/0x250
>    __x64_sys_openat+0x1b/0x20
>    do_syscall_64+0x50/0x190
>    entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> > Thanks.

ping.

Sebastian
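
The difference described in the quoted reply above (spin_lock_irq()
disables interrupts on !RT but not on RT, while the raw_spin_lock_irqsave()
variant disables them on both) can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code:
the model_*() helpers, the irqs_enabled flag and inner_lock_is_irq_safe()
are hypothetical names invented for the illustration, and no real locking
is performed. It only tracks whether interrupts would be enabled at the
point where the inner raw lock (cgroup_rstat_cpu_lock in the thread) is
held.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt state of the current CPU (hypothetical model). */
static bool irqs_enabled = true;

/*
 * Model of spin_lock_irq()/spin_unlock_irq() on the outer lock.
 * Assumption taken from the thread: on !RT this disables interrupts,
 * on RT it becomes a sleeping lock and leaves interrupts enabled.
 */
static void model_spin_lock_irq(bool rt)
{
	if (!rt)
		irqs_enabled = false;
}

static void model_spin_unlock_irq(bool rt)
{
	if (!rt)
		irqs_enabled = true;
}

/* Model of raw_spin_lock_irqsave(): save the state, then disable. */
static bool model_raw_spin_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Model of raw_spin_unlock_irqrestore(): put the saved state back. */
static void model_raw_spin_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/*
 * Returns true if the inner raw lock would be held with interrupts
 * disabled. With use_irqsave == false the inner lock is modelled as a
 * plain raw_spin_lock(), which does not touch the interrupt state.
 */
bool inner_lock_is_irq_safe(bool rt, bool use_irqsave)
{
	bool flags = false;
	bool safe;

	irqs_enabled = true;
	model_spin_lock_irq(rt);		/* outer lock */
	if (use_irqsave)
		flags = model_raw_spin_lock_irqsave();
	safe = !irqs_enabled;			/* state while inner lock is held */
	if (use_irqsave)
		model_raw_spin_unlock_irqrestore(flags);
	model_spin_unlock_irq(rt);

	assert(irqs_enabled);			/* state is restored on exit */
	return safe;
}
```

In this model inner_lock_is_irq_safe(false, false) is true (the !RT case
where the plain raw_spin_lock() is fine), inner_lock_is_irq_safe(true,
false) is false (the RT case lockdep reports), and passing use_irqsave =
true makes both configurations safe, which is the change the patch
subject describes.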