linux-kernel - Re: [RFC net-next] bonding: netlink error message support for options

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <616dbc51-e87d-dd11-da73-e9d7229ed8ce@redhat.com>
Date:   Tue, 17 May 2022 23:37:13 -0400
From:   Jonathan Toppins <jtoppins@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>,
        Stephen Hemminger <stephen@...workplumber.org>
Cc:     netdev@...r.kernel.org, Jay Vosburgh <j.vosburgh@...il.com>,
        Veaceslav Falico <vfalico@...il.com>,
        Andy Gospodarek <andy@...yhouse.net>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC net-next] bonding: netlink error message support for options

On 5/17/22 19:54, Jakub Kicinski wrote:
> On Tue, 17 May 2022 15:44:19 -0700 Stephen Hemminger wrote:
>> On Tue, 17 May 2022 16:31:19 -0400
>> Jonathan Toppins <jtoppins@...hat.com> wrote:
>>
>>>      This is an RFC because the current NL_SET_ERR_MSG() macros do not support
>>>      printf like semantics so I rolled my own buffer setting in __bond_opt_set().
>>>      The issue is I could not quite figure out the life-cycle of the buffer, if
>>>      rtnl lock is held until after the text buffer is copied into the packet
>>>      then we are ok, otherwise, some other type of buffer management scheme will
>>>      be needed as this could result in corrupted error messages when modifying
>>>      multiple bonds.
>>
>> Might be better for others in long term if NL_SET_ERR_MSG() had printf like
>> semantics. Surely this isn't going to be first or last case.
>>
>> Then internally, it could print right to the netlink message.
> 
> Dunno. I think pointing at the bad attr + exposing per-attr netlink
> parsing policy + a string for a human worked pretty well so far.
> IMHO printf() is just a knee jerk reaction, especially when converting
> from netdev_err().

For some subsystems it is not a convert from netdev_err, it is an AND. 
In this RFC there are instances where changing the message from 
netdev_err() to the macro was trivial;

@@ -240,12 +243,14 @@ static int bond_changelink(struct net_device 
*bond_dev, st
ruct nlattr *tb[],
                 int arp_interval = 
nla_get_u32(data[IFLA_BOND_ARP_INTERVAL]);

                 if (arp_interval && miimon) {
-                       netdev_err(bond->dev, "ARP monitoring cannot be 
used with MII monitoring\n");
+                       NL_SET_ERR_MSG(extack,
+                                      "ARP monitoring cannot be used 
with MII monitoring");
                         return -EINVAL;
                 }

These are trivial because the path does not have to care about sysfs or 
some other legacy configuration interface. These macros become rather 
annoying to use once a system needs to support multiple configuration 
paths and is trying to utilize as much common configuration code[0] as 
possible so that all interfaces largely operate the same way.

> 
> Augmenting structured information is much, much better long term.
> 
> To me the never ending stream of efforts to improve printk() is a
> proof that once we let people printf() at will, efforts to contain
> it will be futile.
> 
At least for bonding I was trying to reuse the most amount of code which 
needs to deal with both sysfs and netlink. And I don't think it is a 
good idea to split the code paths, so if I am suppose to use statically 
allocated strings to support netlink errors that basically means 
anything that has to support multiple interfaces gets to sprinkle `if 
(extack)` everywhere[0]. Not great. The ownership model of the error 
buffer seems odd to me with the current macros, I am suppose to set a 
pointer in a structure subsystem X didn't allocate and has no control 
over its lifetime. Then netlink takes this pointer and does whatever 
with it. And somehow subsystem X is suppose to guarantee the pointer's 
lifetime exists forever, making a `const static char[]` buffer the only 
option. I don't understand why netlink doesn't provide the buffer and a 
subsystem just populates it. Using memcpy or snprintf doesn't matter, to 
me its a lifetime issue that makes the API not great to work with when 
you have to handle cases other than netlink.

Also as Joe Perches points out in this thread[1,2] the way the macros 
are written it is bloating the kernel because the error messages are 
getting duplicated for subsystems that need to support multiple 
configuration interfaces.

-Jon

[0] 
https://lore.kernel.org/netdev/e6b78ce8f5904a5411a809cf4205d745f8af98cb.1628650079.git.jtoppins@redhat.com/
[1] https://lore.kernel.org/netdev/cover.1628306392.git.jtoppins@redhat.com/
[2] 
https://lore.kernel.org/netdev/c8b69905c995ab887633ef11862705ee66c60aad.camel@perches.com/