netdev - Re: [PATCH RESEND net-next] netlink: add tracepoint at NL_SET_ERR

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210202123007.GE3288@horizon.localdomain>
Date:   Tue, 2 Feb 2021 09:30:07 -0300
From:   Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH RESEND net-next] netlink: add tracepoint at NL_SET_ERR_MSG

On Mon, Feb 01, 2021 at 05:34:00PM -0800, Jakub Kicinski wrote:
> On Mon,  1 Feb 2021 15:12:19 -0300 Marcelo Ricardo Leitner wrote:
> > Often userspace won't request the extack information, or they don't log it
> > because of log level or so, and even when they do, sometimes it's not
> > enough to know exactly what caused the error.
> > 
> > Netlink extack is the standard way of reporting erros with descriptive
> > error messages. With a trace point on it, we then can know exactly where
> > the error happened, regardless of userspace app. Also, we can even see if
> > the err msg was overwritten.
> > 
> > The wrapper do_trace_netlink_extack() is because trace points shouldn't be
> > called from .h files, as trace points are not that small, and the function
> > call to do_trace_netlink_extack() on the macros is not protected by
> > tracepoint_enabled() because the macros are called from modules, and this
> > would require exporting some trace structs. As this is error path, it's
> > better to export just the wrapper instead.
> > 
> > Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
> 
> Did you measure the allyesconfig bloat from this?

Now I did:

$ ./scripts/bloat-o-meter -t out/{orig,new}/vmlinux
...
Total: Before=212077464, After=212108056, chg +0.01%

$ size out/{orig,new}/vmlinux
   text    data     bss     dec     hex filename
267409181       333328965       83018348        683756494       28c14bce      out/orig/vmlinux
267413171       333337273       83010156        683760600       28c15bd8      out/new/vmlinux

with the commit on top of 46eb3c108fe1744d0a6abfda69ef8c1d4f0e92d4.
It's not much because it's adding just a function call to the macro,
rather than the tracepoint itself.

> How valuable is it to have the tracepoint in at the time it's set?

To know exactly its source. It's very helpful to track down some
EINVALs reported to userspace. If not, we have to rely on the errmsg
and grep the code afterwards.

Also, if the message is a common one, one may not be able to easily
distinguish them. Ideally this shouldn't happen, but when debugging
applications such as OVS, where lots of netlink requests are flying,
it saves us time. I can, for example, look at a perf capture and
search for cls_flower or so. Otherwise, it will all show up as
"af_netlink: <err_msg>"

Also, it allows tracking when a previous errmsg (which would have been
a warning) is overwritten with a new one.

> IIRC extack is passed in to the callbacks from the netlink core, it's
> just not reported to user space if not requested. So we could capture
> the message in af_netlink.c, no?

If 4k is too much, yes. It looses practicality, per above, but should
be doable.

  Marcelo