Date:   Tue, 4 Oct 2022 17:04:00 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Kees Cook <keescook@...omium.org>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+3a080099974c271cd7e9@...kaller.appspotmail.com>,
        bpf@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
        fw@...len.de, harshit.m.mogalapalli@...cle.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        pabeni@...hat.com, syzkaller-bugs@...glegroups.com,
        linux-hardening@...r.kernel.org
Subject: Re: [syzbot] upstream boot error: WARNING in netlink_ack

On Tue, 4 Oct 2022 16:40:32 -0700 Kees Cook wrote:
> On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> > This has been weighing on my conscience a little, I don't like how we
> > still depend on putting one length in the skb and then using a
> > different one for the actual memcpy(). How would you feel about this
> > patch on top (untested):  
> 
> tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Can do, but you'll need to tell me how..

	__DECLARE_FLEX_ARRAY(char, nlmsg_contents)

?

> > +				 u32 size)
> > +{
> > +	if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> > +		return NULL;
> > +
> > +	if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)  
> 
> why does a fixed size mean no memset?

Copy and paste, it seems to originate from:

0c19b0adb8dd ("netlink: avoid memset of 0 bytes sparse warning")

Any idea why sparse would not like empty memsets?

> >  	rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
> > -			NLMSG_ERROR, payload, flags);
> > +			NLMSG_ERROR, sizeof(*errmsg), flags);
> > +	if (!rep)
> > +		goto err_bad_put;
> >  	errmsg = nlmsg_data(rep);
> >  	errmsg->error = err;
> > -	unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
> > -					 ? nlh->nlmsg_len : sizeof(*nlh),
> > -		      /* Bounds checked by the skb layer. */);
> > +	memcpy(&errmsg->msg, nlh, sizeof(*nlh));
> > +
> > +	if (!(flags & NLM_F_CAPPED)) {  
> 
> Should it test this flag, or test if the sizes show the need for "extra"
> payload length?
> 
> I always found the progression of sizes here to be confusing. "payload"
> starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added, but only if
> "(err && !(nlk->flags & NETLINK_F_CAP_ACK))" was true.

struct nlmsgerr is one of the least badly documented structs we have in
netlink so let me start with a copy & paste:

struct nlmsgerr {
	int		error;
	struct nlmsghdr msg;
	/*
	 * followed by the message contents unless NETLINK_CAP_ACK was set
	 * or the ACK indicates success (error == 0)
	 * message length is aligned with NLMSG_ALIGN()
	 */
	/*
	 * followed by TLVs defined in enum nlmsgerr_attrs
	 * if NETLINK_EXT_ACK was set
	 */
};

*Why* that's the behavior - 🤷
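
To make the layout in that comment concrete, here is a rough userspace
sketch of how the ACK payload length falls out of it. This is not the
af_netlink.c code: hdr_sketch, err_sketch, align4() and
ack_payload_len() are made-up names, the structs only mirror the uapi
field layout, and cap_ack stands in for the NETLINK_CAP_ACK state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct nlmsghdr (hypothetical name, same field layout). */
struct hdr_sketch {
	uint32_t nlmsg_len;	/* length of the message including this header */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* Stand-in for struct nlmsgerr (hypothetical name). */
struct err_sketch {
	int error;
	struct hdr_sketch msg;
};

/* Round up to 4 bytes, the same alignment NLMSG_ALIGN() uses. */
static size_t align4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Length of the ACK payload per the struct comment: the fixed part,
 * then the aligned original message contents unless the ACK is capped
 * or reports success, then whatever room the ext ack TLVs need.
 */
static size_t ack_payload_len(const struct hdr_sketch *nlh, int err,
			      bool cap_ack, size_t tlvlen)
{
	size_t total = sizeof(struct err_sketch);

	assert(nlh != NULL);

	if (err && !cap_ack && nlh->nlmsg_len > sizeof(*nlh))
		total += align4(nlh->nlmsg_len - sizeof(*nlh));

	return total + tlvlen;
}
```

So only a failed, uncapped request echoes the contents back; success
and capped ACKs stop at the copied header (plus TLVs, if any).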

> Why is nlmsg_len(nlh) _wrong_ if the rest of its contents are
> correct? 

This is an ack message, to be clear, doesn't mean anything was wrong.
It just carries errno.

> If this was "0" in the other state, the logic would just be:
> 
> 	nlh_bytes = nlmsg_len(nlh);
> 	total  = sizeof(*errmsg);
> 	total += nlh_bytes;
> 	total += tlvlen;
> 
> and:
> 
> 	nlmsg_new(total, ...);
> 	... nlmsg_put(..., sizeof(*errmsg), ...);
> 	...
> 	errmsg->error = err;
> 	errmsg->msg = *nlh;
> 	if (nlh_bytes) {
> 		data = nlmsg_append(..., nlh_bytes);
> 		...
> 		memcpy(data, nlh->nlmsg_contents, nlh_bytes);
> 	}
> 
> > +		size_t data_len = nlh->nlmsg_len - sizeof(*nlh);  
> 
> I think data_len here is also "payload - sizeof(*errmsg)"? So if it's
> >0, we need to append the nlh contents.

I was trying to avoid using payload in case it has overflowed :S

> > +		void *data;
> > +
> > +		data = nlmsg_append(skb, rep, data_len);
> > +		if (!data)
> > +			goto err_bad_put;
> > +
> > +		/* the nlh + 1 is probably going to make you unhappy? */  
> 
> Right, the compiler may think it is an object no larger than
> sizeof(*nlh). My earliest attempt at changes here introduced a
> flex-array for the contents, and split the memcpy:
> https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154bb00@rasmusvillemoes.dk/
> which is basically the solution you have here, except it wasn't having
> the nlmsg_*-helpers do the bounds checking.
> 
> > +		memcpy(data, nlh + 1, data_len);  
> 
> So with the struct nlmsghdr::nlmsg_contents member, this becomes:
> 
> 		memcpy(data, nlh->nlmsg_contents, data_len);
> 
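
For illustration, a minimal userspace sketch of that split copy, with
the header and the contents going through two separately bounded
memcpy() calls. Assumptions: hdr_flex_sketch and copy_split() are
made-up names, a plain C99 flexible array member stands in for whatever
declaration the uapi header would end up using, and data_room plays the
role of the bounds check the nlmsg_* helpers would do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header layout with a trailing flexible array member. */
struct hdr_flex_sketch {
	uint32_t nlmsg_len;	/* length of the message including this header */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
	char nlmsg_contents[];	/* adds nothing to sizeof() */
};

/*
 * Copy the fixed header and the variable contents separately.
 * Returns the number of content bytes copied, or -1 if nlmsg_len is
 * shorter than the header or the contents do not fit in data_room.
 */
static long copy_split(void *dst_hdr, void *dst_data, size_t data_room,
		       const struct hdr_flex_sketch *nlh)
{
	size_t data_len;

	assert(dst_hdr && dst_data && nlh);

	if (nlh->nlmsg_len < sizeof(*nlh))
		return -1;

	data_len = nlh->nlmsg_len - sizeof(*nlh);
	if (data_len > data_room)
		return -1;

	/* Fixed-size part: the object size is known at compile time. */
	memcpy(dst_hdr, nlh, sizeof(*nlh));
	/* Variable part: the source is the flex array, not "nlh + 1". */
	memcpy(dst_data, nlh->nlmsg_contents, data_len);

	return (long)data_len;
}
```

The point of the named member is that the second memcpy() reads from an
array of unknown length rather than from past the end of a
sizeof(*nlh)-sized object.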
