lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220815144143.zjsiamw5y22bvgki@suse.de>
Date:   Mon, 15 Aug 2022 15:41:43 +0100
From:   Mel Gorman <mgorman@...e.de>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        stable@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Peter Xu <peterx@...hat.com>, Hugh Dickins <hughd@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        John Hubbard <jhubbard@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH v2] sched/all: Change all BUG_ON() instances in the
 scheduler to WARN_ON_ONCE()

On Fri, Aug 12, 2022 at 11:29:18AM +0200, Ingo Molnar wrote:
> From: Ingo Molnar <mingo@...nel.org>
> Date: Thu, 11 Aug 2022 08:54:52 +0200
> Subject: [PATCH] sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE()
> 
> There's no good reason to crash a user's system with a BUG_ON(),
> chances are high that they'll never even see the crash message on
> Xorg, and it won't make it into the syslog either.
> 
> By using a WARN_ON_ONCE() we at least give the user a chance to report
> any bugs triggered here - instead of getting silent hangs.
> 
> None of these WARN_ON_ONCE()s are supposed to trigger, ever - so we ignore
> cases where a NULL check is done via a BUG_ON() and we let a NULL
> pointer through after a WARN_ON_ONCE().
> 
> There's one exception: WARN_ON_ONCE() arguments with side-effects,
> such as locking - in this case we use the return value of the
> WARN_ON_ONCE(), such as in:
> 
>  -       BUG_ON(!lock_task_sighand(p, &flags));
>  +       if (WARN_ON_ONCE(!lock_task_sighand(p, &flags)))
>  +               return;
> 
> Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Ingo Molnar <mingo@...nel.org>
> Link: https://lore.kernel.org/r/YvSsKcAXISmshtHo@gmail.com
> ---
>  kernel/sched/autogroup.c |  3 ++-
>  kernel/sched/core.c      |  2 +-
>  kernel/sched/cpupri.c    |  2 +-
>  kernel/sched/deadline.c  | 26 +++++++++++++-------------
>  kernel/sched/fair.c      | 10 +++++-----
>  kernel/sched/rt.c        |  2 +-
>  kernel/sched/sched.h     |  6 +++---
>  7 files changed, 26 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
> index fa9ce9d83683..a286e726eb4b 100644
> --- a/kernel/sched/cpupri.c
> +++ b/kernel/sched/cpupri.c
> @@ -147,7 +147,7 @@ int cpupri_find_fitness(struct cpupri *cp, struct task_struct *p,
>  	int task_pri = convert_prio(p->prio);
>  	int idx, cpu;
>  
> -	BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);
> +	WARN_ON_ONCE(task_pri >= CPUPRI_NR_PRIORITIES);
>  
>  	for (idx = 0; idx < task_pri; idx++) {
>  

Should the return value be used here to clamp task_pri to
CPUPRI_NR_PRIORITIES? task_pri is used for index which in __cpupri_find
then accesses beyond the end of an array and the future behaviour will be
very unpredictable.

> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 0ab79d819a0d..962b169b05cf 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2017,7 +2017,7 @@ static struct task_struct *pick_task_dl(struct rq *rq)
>  		return NULL;
>  
>  	dl_se = pick_next_dl_entity(dl_rq);
> -	BUG_ON(!dl_se);
> +	WARN_ON_ONCE(!dl_se);
>  	p = dl_task_of(dl_se);
>  
>  	return p;

It's a somewhat redundant check, it'll NULL pointer dereference shortly
afterwards but it'll be a bit more obvious.

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 914096c5b1ae..28f10dccd194 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2600,7 +2600,7 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
>  	if (!join)
>  		return;
>  
> -	BUG_ON(irqs_disabled());
> +	WARN_ON_ONCE(irqs_disabled());
>  	double_lock_irq(&my_grp->lock, &grp->lock);
>  
>  	for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++) {

Recoverable with a goto no_join. It'll be a terrible recovery because
there is no way IRQs should be disabled here. Something else incredibly
bad happened before this would fire.

> @@ -7279,7 +7279,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
>  		return;
>  
>  	find_matching_se(&se, &pse);
> -	BUG_ON(!pse);
> +	WARN_ON_ONCE(!pse);
>  
>  	cse_is_idle = se_is_idle(se);
>  	pse_is_idle = se_is_idle(pse);

Similar to pick_task_dl.

> @@ -10134,7 +10134,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		goto out_balanced;
>  	}
>  
> -	BUG_ON(busiest == env.dst_rq);
> +	WARN_ON_ONCE(busiest == env.dst_rq);
>  
>  	schedstat_add(sd->lb_imbalance[idle], env.imbalance);
>  

goto out if it triggers? It'll just continue to be unbalanced.

> @@ -10430,7 +10430,7 @@ static int active_load_balance_cpu_stop(void *data)
>  	 * we need to fix it. Originally reported by
>  	 * Bjorn Helgaas on a 128-CPU setup.
>  	 */
> -	BUG_ON(busiest_rq == target_rq);
> +	WARN_ON_ONCE(busiest_rq == target_rq);
>  
>  	/* Search for an sd spanning us and the target CPU. */
>  	rcu_read_lock();

goto out_unlock if it fires?

For the rest, I didn't see obvious recovery paths that would allow the
system to run predictably. Any of them firing will have unpredictable
consequences (e.g. move_queued_task firing would be fun if it was a per-cpu
kthread). Depending on which warning triggers, the remaining life of the
system may be very short but maybe long enough to be logged even if system
locks up shortly afterwards.

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ