linux-kernel - Re: [PATCH 12/24] getgrent_r.3: Use sizeof() to get buffer size (instead of hardcoding macro name)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b17a4e7c-477b-4eaa-b66f-a9455fb36cbb@gmail.com>
Date:   Thu, 24 Sep 2020 15:09:52 +0200
From:   Alejandro Colomar <colomar.6.4.3@...il.com>
To:     "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>,
        Stefan Puiu <stefan.puiu@...il.com>
Cc:     lnx-man <linux-man@...r.kernel.org>, linux-kernel@...r.kernel.org,
        Walter Harms <wharms@....de>
Subject: Re: [PATCH 12/24] getgrent_r.3: Use sizeof() to get buffer size
 (instead of hardcoding macro name)

Hi Michael,

On 2020-09-24 13:38, Michael Kerrisk (man-pages) wrote:
 > On 9/24/20 11:35 AM, Alejandro Colomar wrote:
 >> Hi,
 >>
 >> On 2020-09-23 22:35, Michael Kerrisk (man-pages) wrote:
 >>> On 9/15/20 12:03 PM, Stefan Puiu wrote:
 >>>> Hi,
 >>>>
 >>>> On Fri, Sep 11, 2020 at 6:28 PM Alejandro Colomar
 >>>> <colomar.6.4.3@...il.com> wrote:
 >>>>>
 >>>>> Hi Stefan,
 >>>>>
 >>>>> On 2020-09-11 16:35, Stefan Puiu wrote:
 >>>>>    > Hi,
 >>>>>    >
 >>>>>    > On Fri, Sep 11, 2020 at 12:15 AM Alejandro Colomar
 >>>>>    > <colomar.6.4.3@...il.com> wrote:
 >>>>>    >>
 >>>>>    >> Signed-off-by: Alejandro Colomar <colomar.6.4.3@...il.com>
 >>>>>    >> ---
 >>>>>    >>   man3/getgrent_r.3 | 2 +-
 >>>>>    >>   1 file changed, 1 insertion(+), 1 deletion(-)
 >>>>>    >>
 >>>>>    >> diff --git a/man3/getgrent_r.3 b/man3/getgrent_r.3
 >>>>>    >> index 81d81a851..76deec370 100644
 >>>>>    >> --- a/man3/getgrent_r.3
 >>>>>    >> +++ b/man3/getgrent_r.3
 >>>>>    >> @@ -186,7 +186,7 @@ main(void)
 >>>>>    >>
 >>>>>    >>       setgrent();
 >>>>>    >>       while (1) {
 >>>>>    >> -        i = getgrent_r(&grp, buf, BUFLEN, &grpp);
 >>>>>    >> +        i = getgrent_r(&grp, buf, sizeof(buf), &grpp);
 >>>>>    >
 >>>>>    > I'm worried that less attentive people might copy/paste 
parts of this
 >>>>>    > in their code, where maybe buf is just a pointer, and expect 
it to
 >>>>>    > work. Maybe leaving BUFLEN here is useful as a reminder that 
they need
 >>>>>    > to change something to adapt the code?
 >>>>>    >
 >>>>>    > Just my 2 cents,
 >>>>>    > Stefan.
 >>>>>    >
 >>>>> That's a very good point.
 >>>>>
 >>>>> So we have 3 options and I will propose now a 4th one.  Let's see all
 >>>>> of them and see which one is better for the man pages.
 >>>>>
 >>>>> 1.-     Use the macro everywhere.
 >>>>>
 >>>>> pros:
 >>>>> - It is still valid when the buffer is a pointer and not an array.
 >>>>> cons:
 >>>>> - Hardcodes the initializer.  If the array is later initialized 
with a
 >>>>>      different value, it may produce a silent bug, or a 
compilation break.
 >>>>>
 >>>>> 2.-     Use sizeof() everywhere, and the macro for the initializer.
 >>>>>
 >>>>> pros:
 >>>>> - It is valid as long as the buffer is an array.
 >>>>> cons:
 >>>>> - If the code gets into a function, and the buffer is then a pointer,
 >>>>>      it will definitively produce a silent bug.
 >>>>>
 >>>>> 3.-     Use sizeof() everywhere, and a magic number for the 
initializer.
 >>>>>
 >>>>> The same as 2.
 >>>>>
 >>>>> 4.-     Use ARRAY_BYTES() macro
 >>>>>
 >>>>> pros:
 >>>>> - It is always safe and when code changes, it may break 
compilation, but
 >>>>>      never a silent bug.
 >>>>> cons:
 >>>>> - Add a few lines of code.  Maybe too much complexity for an example.
 >>>>>      But I'd say that it is the only safe option, and in real code it
 >>>>>      should probably be used more, so maybe it's good to show a 
good practice.
 >>>>
 >>>> If you ask me, I think examples should be simple and easy to
 >>>> understand, and easy to copy/paste in your code. I'd settle for easy
 >>>> enough, not perfect or completely foolproof. If you need to look up
 >>>> obscure gcc features to understand an example, that's not very
 >>>> helpful. So I'd be more inclined to prefer version 1 above. But let's
 >>>> see Michael's opinion on this.
 >>>>
 >>>> Just my 2c,
 >>>
 >>> So, the fundamental problem is that C is nearly 50 years old.
 >>> It's a great high-level assembly language, but when it comes
 >>> to nuances like this it gets pretty painful. One can do macro
 >>> magic of the kind you suggest, but I agree with Stefan that it
 >>> gets confusing and distracting for the reader. I think I also
 >>> lean to solution 1. Yes, it's not perfect, but it's easy to
 >>> understand, and I don't think we can or should try and solve
 >>> the broken-ness of C in the manual pages.
 >>>
 >>> Thanks,
 >>>
 >>> Michael
 >>>
 >>>
 >>
 >> I was reverting the 3 patches I introduced (they changed from solution 1
 >> to solution 2), and also was grepping for already existing solution 2 in
 >> the pages (it seems that solution 2 was a bit more extended than
 >> solution 1).
 >>
 >> While doing that, I've been thinking about it again...
 >>
 >> There's a good thing about sizeof (even though I admit it's very
 >> insecure; and I never use it for myself), especially for the man pages:
 >>
 >> I'll copy here a sample from getnameinfo.3 to ilustrate it:
 >>
 >> [[
 >> .EX
 >> struct sockaddr *addr;     /* input */
 >> socklen_t addrlen;         /* input */
 >> char hbuf[NI_MAXHOST], sbuf[NI_MAXSERV];
 >>
 >> if (getnameinfo(addr, addrlen, hbuf, sizeof(hbuf), sbuf,
 >>               sizeof(sbuf), NI_NUMERICHOST | NI_NUMERICSERV) == 0)
 >>       printf("host=%s, serv=%s\en", hbuf, sbuf);
 >> .EE
 >> ]]
 >>
 >> Here, it's clear to the reader that the 4th argument to 'getnameinfo()'
 >> is the size of the buffer passed as the 3rd argument.
 >>
 >> If the function call was changed to
 >>
 >> [[
 >> getnameinfo(addr, addrlen, hbuf, NI_MAXHOST, sbuf,
 >>               sizeof(sbuf), NI_NUMERICHOST | NI_NUMERICSERV)
 >> ]]
 >>
 >> then it would be less clear, and the reader should go back and forth to
 >> see where that comes from.  In this short example it is relatively very
 >> clear, but in some examples it might be less clear.
 >>
 >> Would you maintain your preference for solution 1?
 >>
 >>
 >> Also... I am trying to patch glibc to provide a safe version of
 >> 'nitems()', and shortly after they accept that patch (if they do), I'll
 >> send another one to add a safe 'array_bytes()' based on 'nitems()'.
 >>
 >> Maybe the examples could use 'array_bytes()'; although is will be a
 >> glibc extension, and non-existent in any other POSIX systems, of course,
 >> which would make the examples non-portable, but still can be solved with
 >> a simple
 >>
 >> [[
 >> #if !defined(array_bytes)
 >> #define array_bytes() sizeof()
 >> #endif
 >> ]]
 >>
 >> But again it complicates the examples...
 >
 > (And I'd rather not...)
 >
 >> I'm not sure at all about what should be done.  Please comment.  If you
 >> still prefer solution 1, I'll send you a patch with the revert + fixes,
 >> but I think it's very delicate.
 >
 > Okay -- I agree with your perspective of the getnameinfo example.
 >
 > So, I think maybe solution 1 is clearer sometimes, and other times
 > solution 2 is clearer. I don't feel too strongly about this

Me neither.

 > (because we can't solve the bugger problem, which is C).
 > I'm fine with not reverting the three patches you
 > refer to. I'm going to leave this decision to you :-)!
 >
 > Thanks,
 >
 > Michael
 >
 >

I'll keep it as is now. :)

Thanks,

Alex