Date:   Mon, 24 Apr 2017 11:22:35 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     vyasevic@...hat.com, Vladislav Yasevich <vyasevich@...il.com>,
        netdev@...r.kernel.org
Cc:     virtio-dev@...ts.oasis-open.org, mst@...hat.com,
        maxime.coquelin@...hat.com,
        virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH RFC (resend) net-next 0/6] virtio-net: Add support for
 virtio-net header extensions



On 04/21/2017 21:08, Vlad Yasevich wrote:
> On 04/21/2017 12:05 AM, Jason Wang wrote:
>> On 04/20/2017 23:34, Vlad Yasevich wrote:
>>> On 04/17/2017 11:01 PM, Jason Wang wrote:
>>>> On 04/16/2017 00:38, Vladislav Yasevich wrote:
>>>>> Currently the virtio net header is fixed size, and adding things to it is rather
>>>>> difficult to do.  This series attempts to add the infrastructure as well as some
>>>>> extensions that try to resolve some deficiencies we currently have.
>>>>>
>>>>> First, the vnet header only has space for 16 flags.  This may not be enough
>>>>> in the future.  The extensions will provide space for 32 possible extension
>>>>> flags and 32 possible extensions.  These flags will be carried in the
>>>>> first pseudo extension header, the presence of which will be determined by
>>>>> the flag in the virtio net header.
>>>>>
>>>>> The extensions themselves will immediately follow the extension header itself.
>>>>> They will be added to the packet in the same order as they appear in the
>>>>> extension flags.  No padding is placed between the extensions, and any
>>>>> extensions negotiated but not used by a given packet will convert to
>>>>> trailing padding.
>>>> Do we need an explicit padding (e.g. an extension) which could be controlled by each side?
>>> I don't think so.  The size of the vnet header is set based on the extensions negotiated.
>>> The one part I am not crazy about is that in the case of a packet not using any extensions,
>>> the data is still placed after the entire vnet header, which essentially adds a lot
>>> of padding.  However, that's really no different than if we simply grew the vnet header.
>>>
>>> The other thing I've tried before is putting extensions into their own sg buffer, but that
>>> made it slower.
>> Yes.
>>
>>>>> For example:
>>>>>     | vnet mrg hdr | ext hdr | ext 1 | ext 2 | ext 5 | .. pad .. | packet data |
>>>> Just some rough thoughts:
>>>>
>>>> - Would it be better to use TLV instead of a bitmap here? One advantage of TLV is that the
>>>> length is not limited by the length of the bitmap.
>>> but the disadvantage is that we add at least 4 bytes of just TL data per extension.  That
>>> makes this thing even longer.
>> Yes, and it looks like the length is still limited by, e.g., the length of T.
> Not only that, but it is also limited by skb->cb as a whole.  So putting
> extensions into a TLV style means we have fewer extensions for now, until we get rid of
> skb->cb usage.
>
>>>> - For 1.1, do we really want something like the vnet header? AFAIK, it is not used by modern
>>>> NICs; would it be better to pack all metadata into the descriptor itself? This may need some
>>>> changes in tun/macvtap, but looks more PCIe friendly.
>>> That would really be ideal and I've looked at this.  There are small issues of exposing
>>> the 'net metadata' of the descriptor to taps so they can be filled in.  The alternative
>>> is to use a different control structure for tap->qemu|vhost channel (that can be
>>> implementation specific) and have qemu|vhost populate the 'net metadata' of the descriptor.
>> Yes, this needs some thought. For vhost, things look a little bit easier; we can probably
>> use msg_control.
>>
> We can use msg_control in qemu as well, can't we?

AFAIK, it needs some changes since we don't export socket to userspace.

>   It really is a question of who is doing
> the work and the number of copies.
>
> I can take a closer look at how it would look if we extend the descriptor with
> type-specific data.  I don't know whether other users of virtio would benefit from it?

Not sure, but we can have a common descriptor header followed by device-specific 
metadata. This probably needs some prototype benchmarking to see the benefits 
first.

Thanks
