[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170825085816.3425a70c@xeon-e3>
Date: Fri, 25 Aug 2017 08:58:16 -0700
From: Stephen Hemminger <stephen@...workplumber.org>
To: "Waskiewicz Jr, Peter" <peter.waskiewicz.jr@...el.com>
Cc: "Keller, Jacob E" <jacob.e.keller@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [RFC PATCH] net: limit maximum number of packets to mark with
xmit_more
On Fri, 25 Aug 2017 15:36:22 +0000
"Waskiewicz Jr, Peter" <peter.waskiewicz.jr@...el.com> wrote:
> On 8/25/17 11:25 AM, Jacob Keller wrote:
> > Under some circumstances, such as with many stacked devices, it is
> > possible that dev_hard_start_xmit will bundle many packets together, and
> > mark them all with xmit_more.
> >
> > Most drivers respond to xmit_more by skipping tail bumps on packet
> > rings, or similar behavior as long as xmit_more is set. This is
> > a performance win since it means drivers can avoid notifying hardware of
> > new packets repeat daily, and thus avoid wasting unnecessary PCIe or other
> > bandwidth.
> >
> > This use of xmit_more comes with a trade off because bundling too many
> > packets can increase latency of the Tx packets. To avoid this, we should
> > limit the maximum number of packets with xmit_more.
> >
> > Driver authors could modify their drivers to check for some determined
> > limit, but this requires all drivers to be modified in order to gain
> > advantage.
> >
> > Instead, add a sysctl "xmit_more_max" which can be used to configure the
> > maximum number of xmit_more skbs to send in a sequence. This ensures
> > that all drivers benefit, and allows system administrators the option to
> > tune the value to their environment.
> >
> > Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
> > ---
> >
> > Stray thoughts and further questions....
> >
> > Is this the right approach? Did I miss any other places where we should
> > limit? Does the limit make sense? Should it instead be a per-device
> > tuning nob instead of a global? Is 32 a good default?
>
> I actually like the idea of a per-device knob. A xmit_more_max that's
> global in a system with 1GbE devices along with a 25/50GbE or more just
> doesn't make much sense to me. Or having heterogeneous vendor devices
> in the same system that have different HW behaviors could mask issues
> with latency.
>
> This seems like another incarnation of possible buffer-bloat if the max
> is too high...
>
> >
> > Documentation/sysctl/net.txt | 6 ++++++
> > include/linux/netdevice.h | 2 ++
> > net/core/dev.c | 10 +++++++++-
> > net/core/sysctl_net_core.c | 7 +++++++
> > 4 files changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
> > index b67044a2575f..3d995e8f4448 100644
> > --- a/Documentation/sysctl/net.txt
> > +++ b/Documentation/sysctl/net.txt
> > @@ -230,6 +230,12 @@ netdev_max_backlog
> > Maximum number of packets, queued on the INPUT side, when the interface
> > receives packets faster than kernel can process them.
> >
> > +xmit_more_max
> > +-------------
> > +
> > +Maximum number of packets in a row to mark with skb->xmit_more. A value of zero
> > +indicates no limit.
>
> What defines "packet?" MTU-sized packets, or payloads coming down from
> the stack (e.g. TSO's)?
xmit_more is only a hint to the device. The device driver should ignore it unless
there are hardware advantages. The device driver is the place with HW specific
knowledge (like 4 Tx descriptors is equivalent to one PCI transaction on this device).
Anything that pushes that optimization out to the user is only useful for benchmarks
and embedded devices.
Powered by blists - more mailing lists