Message-ID: <7f06216e-1e66-433e-a247-2445dac22498@gmail.com>
Date: Tue, 20 May 2025 16:10:42 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Stanislav Fomichev <stfomichev@...il.com>,
 Al Viro <viro@...iv.linux.org.uk>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
 kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, willemb@...gle.com,
 sagi@...mberg.me, almasrymina@...gle.com, kaiyuanz@...gle.com,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next] net: devmem: remove min_t(iter_iov_len) in
 sendmsg

On 5/17/25 05:29, Stanislav Fomichev wrote:
> On 05/17, Al Viro wrote:
>> On Fri, May 16, 2025 at 08:53:09PM -0700, Stanislav Fomichev wrote:
>>> On 05/17, Al Viro wrote:
>>>> On Fri, May 16, 2025 at 07:17:23PM -0700, Stanislav Fomichev wrote:
>>>>>> Wait, in the same commit there's
>>>>>> +       if (iov_iter_type(from) != ITER_IOVEC)
>>>>>> +               return -EFAULT;
>>>>>>
>>>>>> shortly prior to the loop iter_iov_{addr,len}() are used.  What am I missing now?
>>>>>
>>>>> Yeah, I want to remove that part as well:
>>>>>
>>>>> https://lore.kernel.org/netdev/20250516225441.527020-1-stfomichev@gmail.com/T/#u
>>>>>
>>>>> Otherwise, sendmsg() with a single IOV is not accepted, which makes no
>>>>> sense.
>>>>
>>>> Wait a minute.  What's there to prevent a call with two ranges far from each other?
>>>
>>> It is perfectly possible to have a call with two disjoint ranges,
>>> net_devmem_get_niov_at should correctly resolve it to the IOVA in the
>>> dmabuf. Not sure I understand why it's an issue, can you pls clarify?
>>
>> Er...  OK, the following is given an from with two iovecs.
>>
>> 	while (length && iov_iter_count(from)) {
>> 		if (i == MAX_SKB_FRAGS)
>> 			return -EMSGSIZE;
>>
>> 		virt_addr = (size_t)iter_iov_addr(from);
>>
>> OK, that's iov_base of the first one.
>>
>> 		niov = net_devmem_get_niov_at(binding, virt_addr, &off, &size);
>> 		if (!niov)
>> 			return -EFAULT;
>> Whatever it does, it does *NOT* see iov_len of the first iovec.  Looks like
>> it tries to set something up, storing the length of what it had set up
>> into size
>>
>> 		size = min_t(size_t, size, length);
>> ... no more than length, OK.  Suppose length is considerably more than iov_len
>> of the first iovec.
>>
>> 		size = min_t(size_t, size, iter_iov_len(from));
>> ... now trim it down to iov_len of that sucker.  That's what you want to remove,
>> right?  What happens if iov_len is shorter than what we have in size?
>>
>> 		get_netmem(net_iov_to_netmem(niov));
>> 		skb_add_rx_frag_netmem(skb, i, net_iov_to_netmem(niov), off,
>> 				      size, PAGE_SIZE);
>> Still not looking at that iov_len...
>>
>> 		iov_iter_advance(from, size);
>> ... and now that you've removed the second min_t, size happens to be greater
>> than that iovec[0].iov_len.  So we advance into the second iovec, skipping
>> size - iovec[0].iov_len bytes after iovev[1].iov_base.
>> 		length -= size;
>> 		i++;
>> 	}
>> ... and proceed into the second iteration.
>>
>> Would you agree that behaviour ought to depend upon the iovec[0].iov_len?
>> If nothing else, it affects which data do you want to be sent, and I don't
>> see where would anything even look at that value with your change...
> 
> Yes, I think you have a point. I was thinking that net_devmem_get_niov_at
> will expose max size of the chunk, but I agree that the iov might have
> requested smaller part and it will bug out in case of multiple chunks...
> 
> Are you open to making iter_iov_len more ubuf friendly? Something like
> the following:
> 
> static inline size_t iter_iov_len(const struct iov_iter *i)
> {
> 	if (i->iter_type == ITER_UBUF)
> 		return i->count;
> 	return iter_iov(i)->iov_len - i->iov_offset;
> }
> 
> Or should I handle the iter_type here?
> 
> if (iov_iter_type(from) == ITER_IOVEC)
> 	size = min_t(size_t, size, iter_iov_len(from));
> /* else
> 	I don't think I need to clamp to iov_iter_count() because length
> 	should take care of it */

FWIW, since it's not devmem specific, I looked through the callers:
io_uring handles ubuf separately, and read_write.c and madvise.c advance
strictly by iov_iter_count() and hence always consume ubuf iters in
one go. So the only one with a real problem is devmem tx, which
hasn't been released yet.

With that said, it feels error prone, and IMO we should either make the
helper work well with ubuf as Stan suggested, or force _all_ users to
check whether the iter is ubuf. I can't speak for madvise, but the
helper isn't in any hot / important path of either io_uring or rw, so
we likely don't care about the extra check.

-- 
Pavel Begunkov

