linux-kernel - Re: [PATCH bpf v2] lib/buildid: use __kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <7gyxkpozyno7hl2jz5k2v2k5yo6gpvr3i5whqrgqlc5eahxvjz@p7p2a2aezsbt>
Date: Mon, 22 Dec 2025 11:41:07 -0800
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Matthew Wilcox <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, 
	Andrii Nakryiko <andrii@...nel.org>, Shaurya Rane <ssrane_b23@...vjti.ac.in>, 
	"Darrick J . Wong" <djwong@...nel.org>, Christoph Hellwig <hch@...radead.org>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Meta kernel team <kernel-team@...a.com>, bpf@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	syzbot+09b7d050e4806540153d@...kaller.appspotmail.com, Christoph Hellwig <hch@....de>
Subject: Re: [PATCH bpf v2] lib/buildid: use __kernel_read() for sleepable
 context

On Mon, Dec 22, 2025 at 05:31:50AM +0000, Matthew Wilcox wrote:
> On Thu, Dec 18, 2025 at 09:58:43PM -0800, Shakeel Butt wrote:
> > On Fri, Dec 19, 2025 at 04:07:51AM +0000, Matthew Wilcox wrote:
> > > > I am assuming that __kernel_read() can return less data than the
> > > > requested. Is that assumption incorrect?
> > > 
> > > I think it can, but I don't think a second call will get any more data.
> > > For example, it could hit EOF.  What led you to think that calling it in
> > > a loop was the right approach?
> > 
> > I am kind of following the convention of a userspace application doing
> > read() syscall i.e. repeatedly call read() until you hit an error or EOF
> > in which case 0 will be returned or you successfully read the amount of
> > data you want. I am handling negative error and 0 and for 0, I am
> > returning -EIO as that would be unexpected end of an ELF file.
> 
> Oh, you sweet summer child.  I hope Father Christmas leaves you an
> extra special present in your stocking this week!
> 
> While it would be lovely to believe that userspace does that kind of loop,
> it just doesn't.  That's why mounting NFS filesystems with the "intr"
> option (before I made it a no-op) was dangerous -- userspace just isn't
> prepared for short reads.  I mean, we're lucky if userspace even checks
> for errors, let alone does this kind of advanced "oh, we got fewer bytes
> than we wanted, keep trying" scheme.
> 
> A filesystem's ->read_iter() implementation can stop short of reading
> all bytes requested if:
> 
>  - We hit EIO.  No amount of asking will return more bytes, the data's
>    just not there any more.
>  - We hit EOF.  There's no more data to read.
>  - We're unable to access the buffer.  That's only possible for user
>    buffers, not kernel buffers.
>  - We receive a fatal signal.  I suppose there is the tiniest chance
>    that the I/O completes while we're processing the "we returned early"
>    loop, but in practice, the user has asked that we die now, and even
>    trying again is rude.  Just die as quickly as we can.
> 
> I can't think of any other cases.  It's just not allowed to return
> short reads to userspace (other than EIO/EOF), and that drives all
> the behaviour.

Thanks for the explanation. Is calling kernel_read() again after it
returned less amount of data unsafe or unnecessary?

> 
> > Anyways the question is if __kernel_read() returns less amount of data
> > than requested, should we return error instead of retrying? I looked
> > couple of callers of __kernel_read() & kernel_read(). Some are erroring
> > out if received data is less than requested (e.g. big_key_read()) and
> > some are calling in the loop (e.g. kernel_read_file()).
> 
> kernel_read_file() is wrong.  Thanks for reporting it; I'll send a patch.

There are couple more like do_read() and do_verify() in
drivers/usb/gadget/function/f_mass_storage.c. There might be more but I
have not looked into every caller.