Message-ID: <CANJ5vP+_aK0PMc6-hwbu6PduVNDe8jbWR19cN5C+k4mWQzdTGg@mail.gmail.com>
Date:	Wed, 8 Jan 2014 19:16:18 -0800
From:	Michael Dalton <mwdalton@...gle.com>
To:	"Michael S. Tsirkin" <mst@...hat.com>
Cc:	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
	Eric Dumazet <edumazet@...gle.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Jason Wang <jasowang@...hat.com>,
	lf-virt <virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH net-next v2 3/4] virtio-net: auto-tune mergeable rx buffer
 size for improved performance

Hi Michael,

On Wed, Jan 8, 2014 at 5:42 PM, Michael S. Tsirkin <mst@...hat.com> wrote:
> Sorry that I didn't notice early, but there seems to be a bug here.
> See below.
Yes, that is definitely a bug. The virtio spec permits out-of-order (OOO)
completions, but the current code assumes in-order completion. Thanks for
catching this.
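
To spell out the failure mode (a tiny hypothetical userspace illustration,
not the patch code): if per-buffer contexts are consumed strictly in FIFO
order, an out-of-order completion pairs a buffer with another buffer's
truesize.

#include <assert.h>

/* Hypothetical per-buffer metadata ring, consumed strictly in FIFO order. */
struct buf_ctx {
        void *buf;
        unsigned int truesize;
};

int main(void)
{
        char a[64], b[64];
        struct buf_ctx ring[2] = { { a, 1536 }, { b, 4096 } };
        unsigned int tail = 0;

        /* The device may complete 'b' before 'a' (the spec allows it), but
         * an in-order consumer still hands back ring[0], i.e. the truesize
         * that belongs to 'a'. */
        void *completed = b;
        struct buf_ctx *ctx = &ring[tail++];
        assert(ctx->buf != completed); /* mismatch: wrong truesize applied */
        return 0;
}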

> Don't need full int really, it's up to 4K/cache line size,
> 1 byte would be enough, maximum 2 ...
> So if all we want is extra 1-2 bytes per buffer, we don't really
> need this extra level of indirection I think.
> We can just allocate them before the header together with an skb.
I'm not sure if I'm parsing the above correctly, but do you mean using a
few bytes at the beginning of the packet buffer to store truesize? I
think that will break Jason's virtio-net RX frag coalescing
code. To coalesce consecutive RX packet buffers, our packet buffers must
be physically adjacent, and any extra bytes before the start of the
buffer would break that.
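
A toy illustration of the constraint (hypothetical helper, not Jason's
actual code): coalescing can extend the previous frag only when the next
buffer begins exactly where the previous one ended, so even a couple of
metadata bytes ahead of each buffer defeat it.

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Can a new buffer extend the previous frag? Only if it starts exactly
 * where the previous buffer ended. */
static bool can_coalesce(const char *prev_buf, size_t prev_len,
                         const char *new_buf)
{
        return new_buf == prev_buf + prev_len;
}

int main(void)
{
        char page[4096];

        /* Buffers carved back-to-back from one page coalesce... */
        assert(can_coalesce(page, 1536, page + 1536));
        /* ...but reserving even 2 bytes of metadata ahead of the next
         * buffer breaks the adjacency. */
        assert(!can_coalesce(page, 1536, page + 1536 + 2));
        return 0;
}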

We could allocate an SKB per packet buffer, but if we often have multi-buffer
packets (e.g., netperf benefiting from GSO/GRO), we would be
allocating 1 SKB per packet buffer instead of 1 SKB per MAX_SKB_FRAGS
buffers. How do you feel about either of the alternatives below:

(1) Modify the existing mrg_buf_ctx to chain together free entries
We can use the 'buf' pointer in mergeable_receive_buf_ctx to chain
together free entries so that we can support OOO completions. This would
be similar to how the virtqueue manages its free sg entries.
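
Roughly what I have in mind, as a self-contained userspace sketch
(made-up names, not driver code): free context entries are threaded
through the 'buf' pointer, so they can come back in whatever order
completions arrive.

#include <assert.h>
#include <stddef.h>

struct mrg_ctx {
        void *buf;              /* packet buffer, or next free entry when unused */
        unsigned int truesize;
};

static struct mrg_ctx ctx[4];
static struct mrg_ctx *free_list;

static void ctx_init(void)
{
        for (size_t i = 0; i + 1 < 4; i++)
                ctx[i].buf = &ctx[i + 1];
        ctx[3].buf = NULL;
        free_list = &ctx[0];
}

static struct mrg_ctx *ctx_alloc(void)
{
        struct mrg_ctx *c = free_list;

        if (c)
                free_list = c->buf;     /* pop */
        return c;
}

static void ctx_free(struct mrg_ctx *c)
{
        c->buf = free_list;             /* push, in any completion order */
        free_list = c;
}

int main(void)
{
        ctx_init();
        struct mrg_ctx *a = ctx_alloc(), *b = ctx_alloc();

        ctx_free(b);                    /* out-of-order completion is fine */
        ctx_free(a);
        assert(ctx_alloc() == a);
        return 0;
}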

(2) Combine the buffer pointer and truesize into a single void* value
Your point about there only being a byte needed to encode truesize is
spot on, and I think we could leverage this to eliminate the out-of-band
metadata ring entirely. If we were willing to change the packet buffer
alignment from L1_CACHE_BYTES to 256 (or, more precisely, max(256, L1_CACHE_BYTES)), we
could encode the truesize in the least significant 8 bits of the buffer
address (encoded as truesize >> 8 as we know all sizes are a multiple
of 256). This would allow packet buffers up to 64KB in length.
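
Concretely, as a userspace sketch (hypothetical helper names, not driver
code): with 256-byte alignment the low 8 bits of the buffer address are
zero, so a single void * can carry both the pointer and truesize >> 8.

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MRG_BUF_ALIGN 256   /* would be max(256, L1_CACHE_BYTES) in the driver */

static void *mrg_ctx_pack(void *buf, unsigned int truesize)
{
        /* truesize must be a non-zero multiple of 256, below 64KB */
        return (void *)((uintptr_t)buf | (truesize >> 8));
}

static void *mrg_ctx_to_buf(void *ctx)
{
        return (void *)((uintptr_t)ctx & ~(uintptr_t)(MRG_BUF_ALIGN - 1));
}

static unsigned int mrg_ctx_to_truesize(void *ctx)
{
        return ((uintptr_t)ctx & (MRG_BUF_ALIGN - 1)) << 8;
}

int main(void)
{
        void *buf = aligned_alloc(MRG_BUF_ALIGN, 2048);

        if (!buf)
                return 1;
        void *ctx = mrg_ctx_pack(buf, 2048);
        assert(mrg_ctx_to_buf(ctx) == buf);
        assert(mrg_ctx_to_truesize(ctx) == 2048);
        free(buf);
        return 0;
}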

Is there another approach you would prefer over either of these? If the
cleanliness issues and larger alignment aren't too bad, I think (2)
sounds promising and would allow us to eliminate the metadata ring
entirely while still permitting RX frag coalescing.

Best,

Mike
