linux-kernel - Re: [PATCH v2 4/4] mm/hugetl.c: warn out if expected count of huge pages adjustment is not achieved

Message-ID: <b94f4dc1-5c53-68ca-2023-0aa4de4df8b7@oracle.com>
Date:   Thu, 23 Jul 2020 11:21:52 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Baoquan He <bhe@...hat.com>,
        Anshuman Khandual <anshuman.khandual@....com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org, david@...hat.com,
        akpm@...ux-foundation.org
Subject: Re: [PATCH v2 4/4] mm/hugetl.c: warn out if expected count of huge
 pages adjustment is not achieved

On 7/23/20 2:11 AM, Baoquan He wrote:
> On 07/23/20 at 11:46am, Anshuman Khandual wrote:
>>
>>
>> On 07/23/2020 08:52 AM, Baoquan He wrote:
>>> A customer complained that no message is logged when the number of
>>> persistent huge pages is not changed to the exact value written to
>>> the sysfs or proc nr_hugepages file.
>>>
>>> In the current code, a best effort is made to satisfy requests made
>>> via the nr_hugepages file.  However, requests may be only partially
>>> satisfied.
>>>
>>> Log a message if the code was unsuccessful in fully satisfying a
>>> request. This includes both increasing and decreasing the number
>>> of persistent huge pages.
>>
>> But is kernel expected to warn for all such situations where the user
>> requested resources could not be allocated completely ? Otherwise, it
>> does not make sense to add a warning for just one such situation.
> 
> It's not for just one such situation, we have already had one to warn
> out in mm/hugetlb.c, please check hugetlb_hstate_alloc_pages().

Those are a little different in that they are warnings based on kernel
command line parameters.

> As Mike said, in one time of persistent huge page number setting,
> comparing the old value with the new value is good enough for customer
> to get the information. However, if customer want to detect and analyze
> previous setting failure, logging message will be helpful. So I think
> logging the failure or partial success makes sense.

I can understand the argument against adding a new warning for this.
You could even argue that this condition has existed since the time
hugetlb was added to the kernel which was long ago.  And, nobody has
complained enough to add a warning.  I have even heard of a sysadmin
practice of asking for a ridiculously large amount of hugetlb pages
just so that the kernel will allocate as many as possible.  They do
not 'expect' to get the ridiculous amount they asked for.  In such
cases, this will be a new warning in their log.

As mentioned in a previous e-mail, when one sets nr_hugepages by writing
to the sysfs or proc file, one needs to read the file to determine if the
number of requested pages was actually allocated.  Anyone who does not
do this is just asking for trouble.  Yet, I imagine that it may happen.
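
For illustration, the write-then-read-back check described above might look
roughly like the sketch below.  It assumes the default-size pool is adjusted
through /proc/sys/vm/nr_hugepages and that the caller has permission to write
it.  The helper names set_nr_hugepages() and shortfall_message() are
hypothetical and exist only for this example; this is not code from the patch.

```python
from pathlib import Path

# Assumption: the default huge page pool is controlled via this procfs file.
NR_HUGEPAGES = Path("/proc/sys/vm/nr_hugepages")


def set_nr_hugepages(requested: int, path: Path = NR_HUGEPAGES) -> int:
    """Write the requested count, then read the file back.

    Returns the count reported after the write, which may differ from
    `requested` because the kernel only makes a best effort.
    """
    path.write_text(f"{requested}\n")
    return int(path.read_text().strip())


def shortfall_message(requested: int, actual: int):
    """Return a description of a partially satisfied request, or None."""
    if actual == requested:
        return None
    return f"requested {requested} huge pages, got {actual}"
```

A caller would compare the two values, e.g. shortfall_message(n,
set_nr_hugepages(n)), and treat any non-None result as a partial success
rather than assuming the write took full effect.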

To be honest, I do not see this log message as something that would be
helpful to end users.  Rather, I could see this as being useful to support
people.  Support always asks for system logs and this could point out a
possible issue with hugetlb usage.

I do not feel strongly one way or another about adding the warning.  Since
it is fairly trivial and could help diagnose issues I am in favor of adding
it.  If people feel strongly that it should not be added, I am open to
those arguments.
-- 
Mike Kravetz