lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 5 Jul 2013 09:50:38 +0200
From:	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>
To:	Willy Tarreau <w@....eu>
Cc:	netdev@...r.kernel.org,
	Gregory CLEMENT <gregory.clement@...e-electrons.com>,
	Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH] net: convert mvneta to build_skb()

Dear Willy Tarreau,

On Fri, 5 Jul 2013 09:43:30 +0200, Willy Tarreau wrote:

> > Thanks Willy. Sorry for asking such a stupid question, but I'm not very
> > familiar with how this mechanism works. Can you explain why a single
> > 'frag_size' field per port is sufficient? My concern is that this
> > frag_size seems to be a per-packet information, but we have potentially
> > multiple packets being received, and multiple RX queues. Is one single
> > 'frag_size' per network interface sufficient?
> 
> I had the exact same question when Eric sent me an experimental patch to
> do this on mv64xxx_eth a few months ago. Then I checked how the frag_size
> is computed. As you can see below, it does not depend on a per-packet size
> but on a per-port size which is in fact the MTU ("pkt_size" is the misleading
> part here) :
> 
>       skb_size = SKB_DATA_ALIGN(pp->pkt_size + MVNETA_MH_SIZE + NET_SKB_PAD) +
>                  SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> 
> So if skb_size depends solely on pp (struct mvneta_port), then it makes
> sense to have frag_size stored at the same place.
> 
> In practice we don't really need to store the frag_size in the struct, we
> just need to know if the data were allocated using netdev_alloc_frag()
> or kmalloc() to know how to free them. So a single bit would be enough,
> and I thought about just doing a +1 on the pointer when we need to free
> using kmalloc(). But that would add unneeded extra work in the fast path
> to fix the pointers. And since we need to pass frag_size to build_skb()
> it was a reasonable solution in my opinion.

Aah, okay makes sense. So now, the question that comes up is why this
skb_size calculation is done in every call of mvneta_rx_refill() ? It
should be computed once, at the same time pkt_size is calculated, and
stored in the mvneta_port structure? Then you just need to test whether
it is smaller or larger than PAGE_SIZE to decide whether to use
netdev_alloc_frag() vs. kmalloc().

I.e, I would turn all the:

	if (pp->frag_size)
		...
	else
		...

into:

	if (pp->skb_size <= PAGE_SIZE)
		...
	else
		...

Of course, you can always hide this test behind a small macro or inline
function, to make it something like:

	if (mvneta_uses_small_skbs(pp))
		...
	else
		...

A better name than mvneta_uses_small_skbs() would of course be useful.

> > For example, in mvneta_rx_refill(), you store the skb_size in
> > pp->frag_size, and then you later re-use it in mvneta_rxq_drop_pkts.
> > What guarantees you that mvneta_rx_refill() hasn't be called in the
> > mean time for a different packet, in a different rxq, for the same
> > network interface, and the value of pp->frag_size has been overridden?
> 
> It's not a problem since the refill() applies pp->pkt_size which doesn't
> change between calls. It's only changed in mvneta_change_mtu() which
> first stops the device. So I think it's safe.

Yeah, sure, now that I see that pp->frag_size is constant for a given
MTU configuration, it looks safe. But I find the current code that
retests and reassigns pp->frag_size at every call of rxq_refill() to be
very confusing.

Thanks!

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ