[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <626d056e-1489-d406-62cc-4b981ff94175@redhat.com>
Date: Tue, 30 Mar 2021 13:43:23 -0400
From: Waiman Long <longman@...hat.com>
To: Daniel Thompson <daniel.thompson@...aro.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Bharata B Rao <bharata@...ux.vnet.ibm.com>,
Phil Auld <pauld@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched/debug: Use sched_debug_lock to serialize use of
cgroup_path[] only
On 3/30/21 6:42 AM, Daniel Thompson wrote:
> On Mon, Mar 29, 2021 at 03:32:35PM -0400, Waiman Long wrote:
>> The handling of sysrq keys should normally be done in an user context
>> except when MAGIC_SYSRQ_SERIAL is set and the magic sequence is typed
>> in a serial console.
> This seems to be a poor summary of the typical calling context for
> handle_sysrq() except in the trivial case of using
> /proc/sysrq-trigger.
>
> For example on my system then the backtrace when I do sysrq-h on a USB
> keyboard shows us running from a softirq handler and with interrupts
> locked. Note also that the interrupt lock is present even on systems that
> handle keyboard input from a kthread due to the interrupt lock in
> report_input_key().
I will reword this part of the patch. I don't have a deep understanding
of how the different way of keyword input work and thanks for showing me
that there are other ways of getting keyboard input.
>
>> Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is taken
>> with interrupt disabled for the whole duration of the calls to print_*_stats()
>> and print_rq() which could last for the quite some time if the information dump
>> happens on the serial console.
>>
>> If the system has many cpus and the sched_debug_lock is somehow busy
>> (e.g. parallel sysrq-t), the system may hit a hard lockup panic, like
> <snip>
>
>> The purpose of sched_debug_lock is to serialize the use of the global
>> cgroup_path[] buffer in print_cpu(). The rests of the printk() calls
>> don't need serialization from sched_debug_lock.
>>
>> Calling printk() with interrupt disabled can still be/proc/sysrq-trigger
>> problematic. Allocating a stack buffer of PATH_MAX bytes is not
>> feasible. So a compromised solution is used where a small stack buffer
>> is allocated for pathname. If the actual pathname is short enough, it
>> is copied to the stack buffer with sched_debug_lock release afterward
>> before printk(). Otherwise, the global group_path[] buffer will be
>> used with sched_debug_lock held until after printk().
> Does this actually fix the problem in any circumstance except when the
> sysrq is triggered using /proc/sysrq-trigger?
I have a reproducer that generates hard lockup panic when there are
multiple instances of sysrq-t via /proc/sysrq-trigger. This is probably
less a problem on console as I don't think we can do multiple
simultaneous sysrq-t there. Anyway, my goal is to limit the amount of
time that irq is disabled. Doing a printk can take a while depending on
whether there are contention in the underlying locks or resources. Even
if I limit the the critical sections to just those printk() that outputs
cgroup path, I can still cause the panic.
Cheers,
Longman
The approach used by this patch should minimize the chance of a panic
happening. However, if there are many tasks with very long cgroup paths,
I suppose that panic may still happen under some extreme conditions. So
I won't say this will completely fix the problem until the printk()
rework that makes printk work more like printk_deferred() is merged.
Powered by blists - more mailing lists