Message-ID: <202210041600.7C90DF917@keescook>
Date:   Tue, 4 Oct 2022 16:40:32 -0700
From:   Kees Cook <keescook@...omium.org>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+3a080099974c271cd7e9@...kaller.appspotmail.com>,
        bpf@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
        fw@...len.de, harshit.m.mogalapalli@...cle.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        pabeni@...hat.com, syzkaller-bugs@...glegroups.com,
        linux-hardening@...r.kernel.org
Subject: Re: [syzbot] upstream boot error: WARNING in netlink_ack

On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> On Tue, 04 Oct 2022 07:36:55 -0700 Kees Cook wrote:
> > This is fixed in the pending netdev tree coming for the merge window.
> 
> This has been weighing on my conscience a little, I don't like how we
> still depend on putting one length in the skb and then using a
> different one for the actual memcpy(). How would you feel about this
> patch on top (untested):

tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Rambling below...

> 
> diff --git a/include/net/netlink.h b/include/net/netlink.h
> index 4418b1981e31..6ad671441dff 100644
> --- a/include/net/netlink.h
> +++ b/include/net/netlink.h
> @@ -931,6 +931,29 @@ static inline struct nlmsghdr *nlmsg_put(struct sk_buff *skb, u32 portid, u32 se
>  	return __nlmsg_put(skb, portid, seq, type, payload, flags);
>  }
>  
> +/**
> + * nlmsg_append - Add more data to a nlmsg in a skb
> + * @skb: socket buffer to store message in
> + * @nlh: message header
> + * @payload: length of message payload
> + *
> + * Append data to an existing nlmsg, used when constructing a message
> + * with multiple fixed-format headers (which is rare).
> + * Returns NULL if the tailroom of the skb is insufficient to store
> + * the extra payload.
> + */
> +static inline void *nlmsg_append(struct sk_buff *skb, struct nlmsghdr *nlh,

nlh not needed here?

> +				 u32 size)
> +{
> +	if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> +		return NULL;
> +
> +	if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)

why does a fixed size mean no memset?
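
For reference, a small userspace sketch of the padding arithmetic behind
that question. MSG_ALIGN() and pad_len() are stand-ins written for this
example (the 4-byte alignment mirrors NLMSG_ALIGNTO); this is not the
kernel code. It only illustrates that the pad is zero whenever size is
already a multiple of 4, which the compiler can presumably prove at build
time only when size is a constant:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for NLMSG_ALIGN(): round len up to a multiple of 4. */
#define MSG_ALIGNTO ((size_t)4)
#define MSG_ALIGN(len) (((len) + MSG_ALIGNTO - 1) & ~(MSG_ALIGNTO - 1))

/* Hypothetical helper: number of pad bytes that would need zeroing. */
static size_t pad_len(size_t size)
{
	size_t aligned = MSG_ALIGN(size);

	assert(aligned >= size);	/* fails only if the round-up overflowed */
	return aligned - size;
}
```

With these definitions pad_len(16) is 0 and pad_len(17) is 3, so for an
aligned size the memset() would have a zero length either way.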

> +		memset(skb_tail_pointer(skb) + size, 0,
> +		       NLMSG_ALIGN(size) - size);
> +	return __skb_put(NLMSG_ALIGN(size));
> +}
> +
>  /**
>   * nlmsg_put_answer - Add a new callback based netlink message to an skb
>   * @skb: socket buffer to store message in
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index a662e8a5ff84..bb3d855d1f57 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2488,19 +2488,28 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
>  		flags |= NLM_F_ACK_TLVS;
>  
>  	skb = nlmsg_new(payload + tlvlen, GFP_KERNEL);
> -	if (!skb) {
> -		NETLINK_CB(in_skb).sk->sk_err = ENOBUFS;
> -		sk_error_report(NETLINK_CB(in_skb).sk);
> -		return;
> -	}
> +	if (!skb)
> +		goto err_bad_put;
>  
>  	rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
> -			NLMSG_ERROR, payload, flags);
> +			NLMSG_ERROR, sizeof(*errmsg), flags);
> +	if (!rep)
> +		goto err_bad_put;
>  	errmsg = nlmsg_data(rep);
>  	errmsg->error = err;
> -	unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
> -					 ? nlh->nlmsg_len : sizeof(*nlh),
> -		      /* Bounds checked by the skb layer. */);
> +	memcpy(&errmsg->msg, nlh, sizeof(*nlh));
> +
> +	if (!(flags & NLM_F_CAPPED)) {

Should it test this flag, or test if the sizes show the need for "extra"
payload length?

I always found the progression of sizes here to be confusing. "payload"
starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added, but only if
"(err && !(nlk->flags & NETLINK_F_CAP_ACK))" was true. Why is
nlmsg_len(nlh) _wrong_ if the rest of its contents are correct? If the
extra length were simply "0" in the other state, the logic would just be:

	nlh_bytes = nlmsg_len(nlh);
	total  = sizeof(*errmsg);
	total += nlh_bytes;
	total += tlvlen;

and:

	nlmsg_new(total, ...);
	... nlmsg_put(..., sizeof(*errmsg), ...);
	...
	errmsg->error = err;
	errmsg->msg = *nlh;
	if (nlh_bytes) {
		data = nlmsg_append(..., nlh_bytes);
		...
		memcpy(data, nlh->nlmsg_contents, nlh_bytes);
	}
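
The size bookkeeping sketched above can be tried out in userspace. This is
a minimal mock under stated assumptions, not the netlink implementation:
struct mock_nlmsghdr, struct mock_nlmsgerr, mock_payload_len() and
ack_total() are names invented for the example, and "cap_ack" stands in
for the NETLINK_F_CAP_ACK test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the fixed netlink message header (layout is an assumption). */
struct mock_nlmsghdr {
	uint32_t len;	/* total message length, header included */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Mock of the error message: error code followed by the echoed header. */
struct mock_nlmsgerr {
	int error;
	struct mock_nlmsghdr msg;
};

/* Bytes that follow the header, i.e. what nlmsg_len() would report. */
static size_t mock_payload_len(const struct mock_nlmsghdr *nlh)
{
	assert(nlh->len >= sizeof(*nlh));
	return nlh->len - sizeof(*nlh);
}

/*
 * Total bytes to reserve for the ACK. nlh_bytes is 0 when the original
 * payload is not echoed back (no error, or the ACK is capped), so the
 * three terms can always be summed unconditionally.
 */
static size_t ack_total(const struct mock_nlmsghdr *nlh, int err,
			bool cap_ack, size_t tlvlen)
{
	size_t nlh_bytes = (err && !cap_ack) ? mock_payload_len(nlh) : 0;

	return sizeof(struct mock_nlmsgerr) + nlh_bytes + tlvlen;
}
```

The point of the shape is that the "do we echo the payload?" decision is
made once, when nlh_bytes is computed, and every later step only needs to
test nlh_bytes.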

> +		size_t data_len = nlh->nlmsg_len - sizeof(*nlh);

I think data_len here is also "payload - sizeof(*errmsg)"? So if it's >0,
we need to append the nlh contents.

> +		void *data;
> +
> +		data = nlmsg_append(skb, rep, data_len);
> +		if (!data)
> +			goto err_bad_put;
> +
> +		/* the nlh + 1 is probably going to make you unhappy? */

Right, the compiler may think it is an object no larger than sizeof(*nlh).
My earliest attempt at changes here introduced a flex-array for the
contents, and split the memcpy:
https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154bb00@rasmusvillemoes.dk/
which is basically the solution you have here, except that it didn't have
the nlmsg_*() helpers do the bounds checking.

> +		memcpy(data, nlh + 1, data_len);

So with the struct nlmsghdr::nlmsg_contents member, this becomes:

		memcpy(data, nlh->nlmsg_contents, data_len);
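
To illustrate the difference outside the kernel, here is a hedged
userspace sketch. struct mock_nlmsghdr and its "contents" member are
hypothetical stand-ins for struct nlmsghdr and the proposed
nlmsg_contents; mock_msg_new() and mock_copy_contents() exist only for
this example. The flexible array names the trailing bytes explicitly, so
the copy source is a member of the object rather than "one past" a
fixed-size header:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct mock_nlmsghdr {
	uint32_t len;	/* total message length, header included */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
	unsigned char contents[];	/* stand-in for nlmsg_contents */
};

/* Allocate a message carrying payload_len bytes; the caller frees it. */
static struct mock_nlmsghdr *mock_msg_new(const void *payload,
					  size_t payload_len)
{
	struct mock_nlmsghdr *nlh = calloc(1, sizeof(*nlh) + payload_len);

	if (!nlh)
		return NULL;
	nlh->len = (uint32_t)(sizeof(*nlh) + payload_len);
	if (payload_len)
		memcpy(nlh->contents, payload, payload_len);
	return nlh;
}

/*
 * Copy the payload out through the flexible array member.
 * Returns the number of bytes copied, or 0 if dst is too small.
 */
static size_t mock_copy_contents(void *dst, size_t dst_len,
				 const struct mock_nlmsghdr *nlh)
{
	size_t data_len;

	assert(nlh->len >= sizeof(*nlh));
	data_len = nlh->len - sizeof(*nlh);
	if (data_len > dst_len)
		return 0;
	memcpy(dst, nlh->contents, data_len);
	return data_len;
}
```

In this mock layout nlh->contents and (unsigned char *)(nlh + 1) are the
same address; only what the compiler is told about the object differs.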

-- 
Kees Cook