netdev - Re: [PATCH net v6 1/2] net/ps3_gelic_net: Fix RX sk

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230228123135.251edc25@kernel.org>
Date:   Tue, 28 Feb 2023 12:31:35 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Alexander Lobakin <aleksander.lobakin@...el.com>
Cc:     Geoff Levand <geoff@...radead.org>, <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Alexander Lobakin <alexandr.lobakin@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net v6 1/2] net/ps3_gelic_net: Fix RX sk_buff length

On Tue, 28 Feb 2023 16:47:25 +0100 Alexander Lobakin wrote:
> >> +		return -ENOMEM;
> >> +	}  
> > 
> > And generally reshuffling the code.
> > 
> > Once again - please don't do any of that in a bug fix.
> > Describe precisely what the problem is and fix that problem,  
> 
> IIRC the original problem is that the skb linear parts are not always
> aligned to a boundary which this particular HW requires. So initially
> there was something like "allocate len + alignment - 1, then
> PTR_ALIGN()",

Let's focus on where the bug is. At a quick look I'm guessing that 
the bug is that we align the CPU buffer instead of the DMA buffer.
We should pass the entire allocate len + alignment - 1 as length 
to dma_map() and then align the dma_addr. dma_addr is what the device
sees. Not the virtual address of skb->data.

If I'm right the bug is not in fact directly addressed by the patch.
I'm guessing the patch helps because at least the patch passes the
aligned length to the dma_map(), rather than GELIC_NET_MAX_MTU (which
is not a multiple of 128).

> but I said that it's a waste of memory and we shouldn't do
> that, using napi_alloc_frag_align() instead.
> I guess if that would've been described, this could go as a fix? I don't
> think wasting memory is a good fix, even if we need to change the
> allocation scheme...

In general doing such a rewrite may be fine, if the author is an expert
and the commit message is very precise. Here we are at v6 already and
IMHO the problem is not even well understood.

Let's focus on understanding the problem and writing a _minimal_ fix.

The waste of memory claim can't be taken seriously when the MTU define
is bumped by 500 with no mention in the commit msg, as you also noticed.

Minimal fix, please.