Message-ID: <20251115180547.GC147495@unreal>
Date: Sat, 15 Nov 2025 20:05:47 +0200
From: Leon Romanovsky <leon@...nel.org>
To: David Laight <david.laight.linux@...il.com>
Cc: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-nvme@...ts.infradead.org
Subject: Re: [PATCH 1/2] nvme-pci: Use size_t for length fields to handle
larger sizes

On Sat, Nov 15, 2025 at 05:33:41PM +0000, David Laight wrote:
> On Sat, 15 Nov 2025 18:22:45 +0200
> Leon Romanovsky <leon@...nel.org> wrote:
>
> > From: Leon Romanovsky <leonro@...dia.com>
> >
> > This patch changes the length variables from unsigned int to size_t.
> > Using size_t ensures that we can handle larger sizes, as size_t is
> > always equal to or larger than the previously used u32 type.
>
> Where are requests larger than 4GB going to come from?
The main goal is to reuse the phys_vec structure. It is going to represent PCI
regions exposed through the VFIO DMABUF interface, and their length can exceed
what a u32 can hold.
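For illustration only (the BAR base and size below are made-up numbers), such
a region cannot be described once len is a u32:

	/* hypothetical 16G PCI BAR region exposed via VFIO DMABUF */
	struct phys_vec vec = {
		.paddr = 0x4000000000ULL,	/* example BAR base */
		.len   = 16ULL << 30,		/* 0x400000000: does not fit in u32 */
	};

With a u32 len the initializer would silently truncate to 0; a size_t len keeps
the full value on 64-bit kernels.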
>
> > Originally, u32 was used because blk-mq-dma code evolved from
> > scatter-gather implementation, which uses unsigned int to describe length.
> > This change will also allow us to reuse the existing struct phys_vec in places
> > that don't need scatter-gather.
> >
> > Signed-off-by: Leon Romanovsky <leonro@...dia.com>
> > ---
> > block/blk-mq-dma.c | 14 +++++++++-----
> > drivers/nvme/host/pci.c | 4 ++--
> > 2 files changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> > index e9108ccaf4b0..cc3e2548cc30 100644
> > --- a/block/blk-mq-dma.c
> > +++ b/block/blk-mq-dma.c
> > @@ -8,7 +8,7 @@
> >
> > struct phys_vec {
> > phys_addr_t paddr;
> > - u32 len;
> > + size_t len;
> > };
> >
> > static bool __blk_map_iter_next(struct blk_map_iter *iter)
> > @@ -112,8 +112,8 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
> > struct phys_vec *vec)
> > {
> > enum dma_data_direction dir = rq_dma_dir(req);
> > - unsigned int mapped = 0;
> > unsigned int attrs = 0;
> > + size_t mapped = 0;
> > int error;
> >
> > iter->addr = state->addr;
> > @@ -296,8 +296,10 @@ int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
> > blk_rq_map_iter_init(rq, &iter);
> > while (blk_map_iter_next(rq, &iter, &vec)) {
> > *last_sg = blk_next_sg(last_sg, sglist);
> > - sg_set_page(*last_sg, phys_to_page(vec.paddr), vec.len,
> > - offset_in_page(vec.paddr));
> > +
> > + WARN_ON_ONCE(overflows_type(vec.len, unsigned int));
>
> I'm not at all sure you need that test.
> blk_map_iter_next() has to guarantee that vec.len is valid.
> (probably even less than a page size?)
> Perhaps this code should be using a different type for the addr:len pair?
I added this test for future-proofing; that is why it doesn't "return" on
overflow, but prints a stack dump and continues. It can't happen today.
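A minimal sketch of the intended behaviour, using overflows_type() from
<linux/overflow.h> and SZ_4G from <linux/sizes.h>:

	size_t len = SZ_4G;	/* 0x100000000, one past UINT_MAX */

	/* fires once with a stack dump, then execution simply continues */
	WARN_ON_ONCE(overflows_type(len, unsigned int));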
>
> > + sg_set_page(*last_sg, phys_to_page(vec.paddr),
> > + (unsigned int)vec.len, offset_in_page(vec.paddr));
>
> You definitely don't need the explicit cast.
We narrow the type from u64 (size_t on 64-bit) to u32. Why don't we need the cast?
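A minimal example of the narrowing in question, assuming a 64-bit size_t:

	size_t len = 0x100000123ULL;
	unsigned int n1 = len;			/* implicit narrowing -> 0x123 */
	unsigned int n2 = (unsigned int)len;	/* explicit cast, same value */

Both forms truncate identically; the cast only makes the truncation visible.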
Thanks