[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <27c7ae97-c838-37b7-99c3-9bc4b0d6e69f@kot-begemot.co.uk>
Date: Thu, 12 Oct 2017 16:57:11 +0100
From: Anton Ivanov <anton.ivanov@...-begemot.co.uk>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Network Development <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
user-mode-linux-devel@...ts.sourceforge.net
Subject: Re: BUG:af_packet fails to TX TSO frames
Also confirmed using my UML patchset. This is inside the UML guest.
root@...nky:~# iperf -c 192.168.98.1
------------------------------------------------------------
Client connecting to 192.168.98.1, TCP port 5001
TCP window size: 414 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.98.146 port 36744 connected with 192.168.98.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 4.52 GBytes 3.89 Gbits/sec
This is GSO on a raw socket Virtual NIC to host. Normal speed without
GSO on same machine is ~ 500.
So for the time being I can turn off TSO for anything but v4 in the User
Mode Linux patchset (declare the guest NIC as TSOV4 capable only) and
enable it once the host kernel is fixed. Alternatively, there is the
rather ugly approach of using multiple FDs - one for v4, one for v6, one
for...
A.
On 10/12/17 16:44, Anton Ivanov wrote:
> Found it.
>
> Two bugs canceling each other.
> The bind sequence in: psock_txring_vnet.c is wrong.
>
> It does the following addr.sll_protocol = htons(ETH_P_IP);
> before calling bind.
>
> If you set addr.sll_protocol to ETH_P_ALL where it should have been in
> the first place the test program blows up with -ENOBUFS
>
> I think what is happening is that this value is taken into account
> when looking at "what should I use to segment it with" in
> skb_mac_gso_segment which is invoked at the end of the verification
> chain which starts in packet_direct_xmit in af_packet.c
>
> I have not tried the other test cases like setting it to ETH_P_IP and
> giving it IPv6 traffic or the opposite, but my guess is that these
> will fail too if they need GSO to be applied.
>
> A.
>
> On 10/12/17 15:12, Anton Ivanov wrote:
>>
>>
>> On 10/12/17 14:39, Willem de Bruijn wrote:
>>>> If I produce a real vnet frame out of a live kernel frame using
>>>> virtio_net_hdr_from_skb() and try to send it it fails on the check in
>>>> af_packet, while succeeding for tap. If I remove the af_packet
>>>> check the
>>>> frame is accepted by the hardware too.
>>>>
>>>> If I produce it a synthetic frame + vnet header using the test
>>>> program - it
>>>> works. Go figure.
>>> Besides looking at the raw frame bytes, also compare the setup
>>> of virtio_net_header, as well as the tcp checksum field. The stack
>>> expects the pseudo header to have already been calculated.
>>
>> I am feeding it a skb which is coming up in the tx routine of a User
>> Mode Linux device which is marked as NETIF_F_HW_CSUM and SG - that
>> results in a skb with csum-ed headers, body set up for CSUM_PARTIAL
>> and multiple fragments (always at least 1 more frag besides the TCP
>> head).
>>
>> That has everything in order as expected by virtio_net_hdr_from_skb
>> and this is what I use to generate the vnet header. It works
>> correctly for csum and GRO with af_packet and it works correctly for
>> everything using a tap device. It fails only on GSO + af_packet TX.
>>
>> What I am doing is the same thing virtio_net does - it just takes the
>> output of virtio_net_hdr_from_skb and does nothing more. There should
>> be no need to do anything more :(
>>
>> It should just work.
>>
>> Unless there is a gremlin somewhere in the machinery and that gremlin
>> needs some light to be flushed out.
>>>
>>>> I am going to continue digging into it.
>>>>
>>>> At the very least I now have a positive test case which uses the same
>>>> semantics as my code so I have something to compare to.
>>> Glad to hear that the test is helpful. I wrote it because I
>>> have run into these exact same issues in the past.
>>
>> It is. I have changes ready for it so it also supports vector IO,
>> need to finish fighting with it.
>>
>> A.
>>
>>>
>>
>
Powered by blists - more mailing lists