linux-kernel - Re: [PATCH] nfsd: fix incorrect high limit in clamp() on over-allocation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d4f9d5c2-6919-421f-b1f3-bda6986e878a@oracle.com>
Date: Mon, 23 Dec 2024 12:49:40 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
Cc: mailhol.vincent@...adoo.fr, NeilBrown <neilb@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "J. Bruce Fields" <bfields@...ldses.org>,
        David Laight <David.Laight@...lab.com>,
        Jeff Layton <jlayton@...nel.org>,
        Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
        Tom Talpey <tom@...pey.com>, linux-nfs@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfsd: fix incorrect high limit in clamp() on
 over-allocation

On 12/23/24 11:06 AM, Geert Uytterhoeven wrote:
> Hi Chuck,
> 
> On Mon, Dec 9, 2024 at 3:48 PM Chuck Lever <chuck.lever@...cle.com> wrote:
>> On 12/9/24 7:25 AM, Vincent Mailhol via B4 Relay wrote:
>>> From: Vincent Mailhol <mailhol.vincent@...adoo.fr>
>>>
>>> If over allocation occurs in nfsd4_get_drc_mem(), total_avail is set
>>> to zero. Consequently,
>>>
>>>     clamp_t(unsigned long, avail, slotsize, total_avail/scale_factor);
>>>
>>> gives:
>>>
>>>     clamp_t(unsigned long, avail, slotsize, 0);
>>>
>>> resulting in a clamp() call where the high limit is smaller than the
>>> low limit, which is undefined: the result could be either slotsize or
>>> zero depending on the order of evaluation.
>>>
>>> Luckily, the two instructions just below the clamp() recover the
>>> undefined behaviour:
>>>
>>>     num = min_t(int, num, avail / slotsize);
>>>     num = max_t(int, num, 1);
>>>
>>> If avail = slotsize, the min_t() sets it back to 1. If avail = 0, the
>>> max_t() sets it back to 1.
>>>
>>> So this undefined behaviour has no visible effect.
>>>
>>> Anyway, remove the undefined behaviour in clamp() by only calling it
>>> and only doing the calculation of num if memory is still available.
>>> Otherwise, if over-allocation occurred, directly set num to 1 as
>>> intended by the author.
>>>
>>> While at it, apply below checkpatch fix:
>>>
>>>     WARNING: min() should probably be min_t(unsigned long, NFSD_MAX_MEM_PER_SESSION, total_avail)
>>>     #100: FILE: fs/nfsd/nfs4state.c:1954:
>>>     +          avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
>>>
>>> Fixes: 7f49fd5d7acd ("nfsd: handle drc over-allocation gracefully.")
>>> Signed-off-by: Vincent Mailhol <mailhol.vincent@...adoo.fr>
>>> ---
>>> Found by applying below patch from David:
>>>
>>>     https://lore.kernel.org/all/34d53778977747f19cce2abb287bb3e6@AcuMS.aculab.com/
>>>
>>> Doing so yield this report:
>>>
>>>     In function ‘nfsd4_get_drc_mem’,
>>>         inlined from ‘check_forechannel_attrs’ at fs/nfsd/nfs4state.c:3791:16,
>>>         inlined from ‘nfsd4_create_session’ at fs/nfsd/nfs4state.c:3864:11:
>>>     ././include/linux/compiler_types.h:542:38: error: call to ‘__compiletime_assert_3707’ declared with attribute error: clamp() low limit (unsigned long)(slotsize) greater than high limit (unsigned long)(total_avail/scale_factor)
>>>       542 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>>>           |                                      ^
>>>     ././include/linux/compiler_types.h:523:4: note: in definition of macro ‘__compiletime_assert’
>>>       523 |    prefix ## suffix();    \
>>>           |    ^~~~~~
>>>     ././include/linux/compiler_types.h:542:2: note: in expansion of macro ‘_compiletime_assert’
>>>       542 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>>>           |  ^~~~~~~~~~~~~~~~~~~
>>>     ./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
>>>        39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>>>           |                                     ^~~~~~~~~~~~~~~~~~
>>>     ./include/linux/minmax.h:114:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
>>>       114 |  BUILD_BUG_ON_MSG(statically_true(ulo > uhi),    \
>>>           |  ^~~~~~~~~~~~~~~~
>>>     ./include/linux/minmax.h:121:2: note: in expansion of macro ‘__clamp_once’
>>>       121 |  __clamp_once(val, lo, hi, __UNIQUE_ID(v_), __UNIQUE_ID(l_), __UNIQUE_ID(h_))
>>>           |  ^~~~~~~~~~~~
>>>     ./include/linux/minmax.h:275:36: note: in expansion of macro ‘__careful_clamp’
>>>       275 | #define clamp_t(type, val, lo, hi) __careful_clamp((type)(val), (type)(lo), (type)(hi))
>>>           |                                    ^~~~~~~~~~~~~~~
>>>     fs/nfsd/nfs4state.c:1972:10: note: in expansion of macro ‘clamp_t’
>>>      1972 |  avail = clamp_t(unsigned long, avail, slotsize,
>>>           |          ^~~~~~~
>>>
>>> Because David's patch is targetting Andrew's mm tree, I would suggest
>>> that my patch also goes to that tree.
>>> ---
>>>    fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++---------------------
>>>    1 file changed, 25 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 741b9449f727defc794347f1b116c955d715e691..eb91460c434e30f6df70f66d937f8c0f334b8e1b 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -1944,35 +1944,39 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca, struct nfsd_net *nn
>>>    {
>>>        u32 slotsize = slot_bytes(ca);
>>>        u32 num = ca->maxreqs;
>>> -     unsigned long avail, total_avail;
>>> -     unsigned int scale_factor;
>>>
>>>        spin_lock(&nfsd_drc_lock);
>>> -     if (nfsd_drc_max_mem > nfsd_drc_mem_used)
>>> +     if (nfsd_drc_max_mem > nfsd_drc_mem_used) {
>>> +             unsigned long avail, total_avail;
>>> +             unsigned int scale_factor;
>>> +
>>>                total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
>>> -     else
>>> +             avail = min_t(unsigned long,
>>> +                           NFSD_MAX_MEM_PER_SESSION, total_avail);
>>> +             /*
>>> +              * Never use more than a fraction of the remaining memory,
>>> +              * unless it's the only way to give this client a slot.
>>> +              * The chosen fraction is either 1/8 or 1/number of threads,
>>> +              * whichever is smaller.  This ensures there are adequate
>>> +              * slots to support multiple clients per thread.
>>> +              * Give the client one slot even if that would require
>>> +              * over-allocation--it is better than failure.
>>> +              */
>>> +             scale_factor = max_t(unsigned int,
>>> +                                  8, nn->nfsd_serv->sv_nrthreads);
>>> +
>>> +             avail = clamp_t(unsigned long, avail, slotsize,
>>> +                             total_avail/scale_factor);
>>> +             num = min_t(int, num, avail / slotsize);
>>> +             num = max_t(int, num, 1);
>>> +     } else {
>>>                /* We have handed out more space than we chose in
>>>                 * set_max_drc() to allow.  That isn't really a
>>>                 * problem as long as that doesn't make us think we
>>>                 * have lots more due to integer overflow.
>>>                 */
>>> -             total_avail = 0;
>>> -     avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
>>> -     /*
>>> -      * Never use more than a fraction of the remaining memory,
>>> -      * unless it's the only way to give this client a slot.
>>> -      * The chosen fraction is either 1/8 or 1/number of threads,
>>> -      * whichever is smaller.  This ensures there are adequate
>>> -      * slots to support multiple clients per thread.
>>> -      * Give the client one slot even if that would require
>>> -      * over-allocation--it is better than failure.
>>> -      */
>>> -     scale_factor = max_t(unsigned int, 8, nn->nfsd_serv->sv_nrthreads);
>>> -
>>> -     avail = clamp_t(unsigned long, avail, slotsize,
>>> -                     total_avail/scale_factor);
>>> -     num = min_t(int, num, avail / slotsize);
>>> -     num = max_t(int, num, 1);
>>> +             num = 1;
>>> +     }
>>>        nfsd_drc_mem_used += num * slotsize;
>>>        spin_unlock(&nfsd_drc_lock);
>>>
>>>
>>> ---
>>> base-commit: fac04efc5c793dccbd07e2d59af9f90b7fc0dca4
>>> change-id: 20241209-nfs4state_fix-bc6f1c1fc1d1
> 
>> We're replacing this code wholesale in 6.14. See:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=8233f78fbd970cbfcb9f78c719ac5a3aac4ea053
> 
> Bad commit reference?

Expired commit reference. That commit lives in a testing branch, which
has subsequently been rebased. The current reference is:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=94af736b97fbd8d02d66b3a0271f9c618f446bf0


> And hence this is still failing in next-20241220...

I haven't moved these commits to the nfsd-next branch yet.

Is there an urgency for this fix? Is this a problem in LTS kernels
that might need a backport? 94af736 is not intended to be backported.


-- 
Chuck Lever