Date:   Sat, 28 Mar 2020 14:36:23 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Yi Yang (杨燚)-云服务集团 
        <yangyi01@...pur.com>
Cc:     "willemdebruijn.kernel@...il.com" <willemdebruijn.kernel@...il.com>,
        "yang_y_yi@....com" <yang_y_yi@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "u9012063@...il.com" <u9012063@...il.com>
Subject: Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: fix TPACKET_V3 performance issue in case of TSO

On Sat, Mar 28, 2020 at 4:37 AM Yi Yang (杨燚)-云服务集团 <yangyi01@...pur.com> wrote:
>
>
> -----Original Message-----
> From: Willem de Bruijn [mailto:willemdebruijn.kernel@...il.com]
> Sent: March 27, 2020 11:17
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@...pur.com>
> Cc: willemdebruijn.kernel@...il.com; yang_y_yi@....com; netdev@...r.kernel.org; u9012063@...il.com
> Subject: Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: fix TPACKET_V3 performance issue in case of TSO
>
> > On Wed, Mar 25, 2020 at 8:45 PM Yi Yang (杨燚)-云服务集团 <yangyi01@...pur.com> wrote:
> > >
> > > By the way, even with an hrtimer we could not guarantee such a large performance improvement. Every frame has a different size, so you cannot know after how many microseconds a frame will be available. Firing the timer early is wasted work and firing it late reduces performance, so I still think the approach in this patch is the best so far.
> > >
> >
> > The key differentiating feature of TPACKET_V3 is the use of blocks to efficiently pack packets and amortize wake ups.
> >
> > If you want immediate notification for every packet, why not just use TPACKET_V2?
> >
> > For non-TSO packets, TPACKET_V3 performs much better than TPACKET_V2, but for TSO packets it performs badly. We prefer TPACKET_V3 for its better performance.
>
> At high rate, blocks are retired and userspace is notified as soon as a packet arrives that does not fit and requires dispatching a new block. As such, max throughput is not timer dependent. The timer exists to bound notification latency when packet arrival rate is slow.
>
> [Yi Yang] In our iperf3 TCP test with TSO enabled, even with a packet size of about 64K and a block size of 64K + 4K (to accommodate the tpacket_vX header), we do not see high performance without this patch. I think the small packets that precede the 64K packets determine the achievable performance: according to my trace, the TCP packet size grows gradually from less than 100 to 64K, so how long this ramp-up takes seems to decide the performance. So yes, I don't think an hrtimer can fix this issue very efficiently. In addition, I noticed that even after reaching 64K the packet size pattern is 1514, 64K, 64K, 64K, 64K, ..., 1514, 64K. Maybe those 1514 packets have a big impact on performance, but that is just a guess.

Again, the main issue is that the timer does not matter at high rate.
The 3 Gbps you report corresponds to ~6000 TSO packets per second, or
a 167 usec inter-arrival time. The timer, whether 1 or 4 ms, should
never be needed.
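
To make that arithmetic explicit, here is a small sketch. It assumes
every TSO packet is exactly 64 KiB, which is an illustration value and
not something measured in this thread; the function name is made up
for the example. With that assumption the result lands slightly below
the rounded ~6000 packets per second quoted above.

```python
def tso_arrival(throughput_bps, packet_bytes):
    """Return (packets per second, inter-arrival time in usec)."""
    pps = throughput_bps / (packet_bytes * 8)
    return pps, 1e6 / pps


# 3 Gbps of 64 KiB packets (assumed size, see lead-in).
pps, gap_us = tso_arrival(3e9, 64 * 1024)
print(f"{pps:.0f} packets/s, {gap_us:.0f} usec between packets")

# How many packets arrive within one block timeout of 1 or 4 ms.
for timeout_ms in (1, 4):
    per_timeout = timeout_ms * 1000 / gap_us
    print(f"{timeout_ms} ms timeout spans about {per_timeout:.1f} packets")
```

If a block holds only one or a few such packets, several arrivals fit
into every timeout period, so the block should fill and retire before
the timer has a chance to fire.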

There are too many unknown variables here. Besides block size, what is
tp_block_nr? What is the drop rate? Are you certain that you are not
causing drops by not reading fast enough? What happens when you
increase tp_block_size or tp_block_nr? It may be worthwhile to pin
iperf to one (set of) core(s) and the packet socket reader to another.
Let the reader busy spin and do minimal processing: just return blocks
to the kernel.
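
For reference, a rough sketch of how those ring parameters fit
together. The numeric constants are the values from linux/if_packet.h
and linux/socket.h as I remember them; PAGE_SIZE of 4096 and the
helper names are assumptions for illustration. Opening the socket
needs CAP_NET_RAW, and the mmap() and block-return loop are left out.

```python
import os
import socket
import struct

SOL_PACKET = 263
PACKET_RX_RING = 5
PACKET_VERSION = 10
TPACKET_V3 = 2
TPACKET_ALIGNMENT = 16
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def make_tpacket_req3(block_size, block_nr, frame_size, retire_tov_ms=0):
    """Pack a struct tpacket_req3 (seven unsigned ints) after basic checks."""
    if block_size % PAGE_SIZE:
        raise ValueError("tp_block_size must be page aligned")
    if frame_size % TPACKET_ALIGNMENT:
        raise ValueError("tp_frame_size must be a multiple of 16")
    frames_per_block = block_size // frame_size
    if frames_per_block == 0:
        raise ValueError("tp_frame_size larger than tp_block_size")
    # The kernel requires tp_frame_nr == frames_per_block * tp_block_nr.
    frame_nr = frames_per_block * block_nr
    # Fields: block_size, block_nr, frame_size, frame_nr,
    #         retire_blk_tov (ms), sizeof_priv, feature_req_word
    return struct.pack("=7I", block_size, block_nr, frame_size, frame_nr,
                       retire_tov_ms, 0, 0)


def open_rx_ring(req):
    """Create a packet socket with a TPACKET_V3 rx ring (needs CAP_NET_RAW)."""
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                      socket.htons(0x0003))  # ETH_P_ALL
    s.setsockopt(SOL_PACKET, PACKET_VERSION, TPACKET_V3)
    s.setsockopt(SOL_PACKET, PACKET_RX_RING, req)
    return s


def pin_reader(cpu):
    """Pin the calling process to one core (Linux only)."""
    os.sched_setaffinity(0, {cpu})


# 64K + 4K blocks as in the test above; 64 blocks is an arbitrary choice.
req = make_tpacket_req3(68 * 1024, 64, 2048, retire_tov_ms=1)
print(f"ring memory: {68 * 1024 * 64 // 1024} KiB")
```

Raising tp_block_nr only grows the ring, while raising tp_block_size
also changes how many packets fit in one block, so the two experiments
answer different questions.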

If unsure about that, it may be interesting to instrument the kernel
and count how many block retire operations are from
prb_retire_rx_blk_timer_expired and how many from tpacket_rcv.
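
As a stand-in until real counters are available, this is a toy model
of the two retire paths. It is not the kernel algorithm: it assumes
evenly spaced packets, a fixed number of packets per block, and a
timeout that starts when the first packet lands in an empty block.
All names are invented for the example.

```python
def count_retires(n_packets, gap_us, packets_per_block, timeout_us):
    """Return (retires caused by a full block, retires caused by the timer)."""
    fill = 0
    timer = 0
    in_block = 0       # packets in the currently open block
    opened_at = 0.0    # arrival time of the first packet in that block
    for i in range(n_packets):
        now = i * gap_us
        # Timer path: the open block held data past its timeout.
        if in_block > 0 and now - opened_at >= timeout_us:
            timer += 1
            in_block = 0
        # Fill path: the arriving packet does not fit in the open block.
        if in_block == packets_per_block:
            fill += 1
            in_block = 0
        if in_block == 0:
            opened_at = now
        in_block += 1
    return fill, timer


# One 64K packet per block, 167 usec apart, 1 ms timeout.
print(count_retires(6000, 167, 1, 1000))
# Same ring, but one packet every 5 ms: only the timer retires blocks.
print(count_retires(100, 5000, 1, 1000))
```

In this model the high-rate case never reaches the timer path. If the
real counters show many retires from prb_retire_rx_blk_timer_expired
at 3 Gbps, that would point at something other than plain block
turnover.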

Note that do_vnet only changes whether a struct virtio_net_hdr is
prefixed to the data. Having that disabled (the common case) does not
stop GSO packets from arriving.
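
To show what that prefix carries when it is enabled, a minimal parsing
sketch. The 10-byte layout and GSO type values follow
linux/virtio_net.h as I recall them; native byte order is assumed
here, and the class and function names are illustrative only.

```python
import struct
from typing import NamedTuple

# u8 flags, u8 gso_type, u16 hdr_len, u16 gso_size,
# u16 csum_start, u16 csum_offset
VNET_HDR = struct.Struct("=BBHHHH")

VIRTIO_NET_HDR_GSO_NONE = 0
VIRTIO_NET_HDR_GSO_TCPV4 = 1
VIRTIO_NET_HDR_GSO_UDP = 3
VIRTIO_NET_HDR_GSO_TCPV6 = 4
VIRTIO_NET_HDR_GSO_ECN = 0x80  # modifier bit, not a type of its own


class VnetHdr(NamedTuple):
    flags: int
    gso_type: int
    hdr_len: int
    gso_size: int
    csum_start: int
    csum_offset: int


def parse_vnet_hdr(buf):
    """Decode the header that precedes the packet data."""
    return VnetHdr(*VNET_HDR.unpack_from(buf))


def is_gso(hdr):
    """True if the packet is a GSO packet, ignoring the ECN bit."""
    return (hdr.gso_type & 0x7F) != VIRTIO_NET_HDR_GSO_NONE


# Synthetic header for a TCPv4 GSO packet (values made up for the example).
raw = VNET_HDR.pack(1, VIRTIO_NET_HDR_GSO_TCPV4, 66, 1448, 34, 16)
hdr = parse_vnet_hdr(raw)
print(hdr, is_gso(hdr))
```

Without the prefix the same large packet still arrives in the ring;
the reader just has no gso_size to tell it how the payload would have
been segmented.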
