linux-kernel - Re: [PATCH v4] sched/numa: Fix divide by zero for sysctl_numa_balancing_scan

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3fdc9c92-4a4f-b716-c8bc-056111feb3e0@oracle.com>
Date:   Mon, 1 May 2023 12:21:17 -0400
From:   chris hyser <chris.hyser@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, dietmar.eggemann@....com,
        vincent.guittot@...aro.org, Chen Yu <yu.c.chen@...el.com>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH v4] sched/numa: Fix divide by zero for
 sysctl_numa_balancing_scan_size.

On 4/29/23 10:56, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 11:26:33AM -0400, chris hyser wrote:
>> Commit 6419265899d9 ("sched/fair: Fix division by zero
>> sysctl_numa_balancing_scan_size") prevented a divide by zero by using
>> sysctl mechanisms to return EINVAL for a sysctl_numa_balancing_scan_size
>> value of zero. When moved from a sysctl to a debugfs file, this checking
>> was lost.
>>
>> This patch puts zero checking back in place.
>>
>> Cc: stable@...r.kernel.org
>> Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs")
>> Tested-by: Chen Yu <yu.c.chen@...el.com>
>> Signed-off-by: Chris Hyser <chris.hyser@...cle.com>
> 
> I suppose.. but is it really worth the hassle? I mean, this is debug
> stuff, just don't write 0 in then?

My understanding of the history is that this was always debug, someone 
found the divide by zero and a convenient sysctl mechanism was used to 
fix it. It did also cleanup a little compiler weirdness. I did not see 
any justifications in the discussion of the inclusion of that patch 
other than showing the nasty stack trace you get when the machine dies 
after writing a zero. It is a major inconvenience, completely 
preventable and technically a regression, but as you point out the new 
fix is a lot more code.

In terms of actually wanting to fix this, I'm a bit confused. It clearly 
was worth fixing the first time around (it has your sign-off), and the 
only thing that has changed is that that fix no longer works.

> 
> If we do find we want this (why?!) then should we not invest in a better
> debugfs_create_u32_minmax() or something so that we don't get to add 40+
> lines for everthing we want to add limits on?

I will look at a way to greatly simplify the bounds checking here as you 
suggest.

-chrish