Date: Fri, 29 Sep 2023 17:06:53 +1000
From: "Nicholas Piggin" <npiggin@...il.com>
To: "Ilya Maximets" <i.maximets@....org>, <netdev@...r.kernel.org>
Cc: <dev@...nvswitch.org>
Subject: Re: [ovs-dev] [RFC PATCH 0/7] net: openvswitch: Reduce stack usage

On Wed Sep 27, 2023 at 6:36 PM AEST, Ilya Maximets wrote:
> On 9/27/23 02:13, Nicholas Piggin wrote:
> > Hi,
> > 
> > We've got a report of a stack overflow on ppc64le with a 16kB kernel
> > stack. Openvswitch is just one of many things in the stack, but it
> > does cause recursion and contributes to some usage.
> > 
> > Here are a few patches for reducing stack overhead. I don't know the
> > code well so consider them just ideas. GFP_ATOMIC allocations
> > introduced in a couple of places might be controversial, but there
> > is still some savings to be had if you skip those.
> > 
> > Here is one place detected where the stack reaches >14kB before
> > overflowing a little later. I massaged the output so it just shows
> > the stack frame address on the left.
>
> Hi, Nicholas.  Thanks for the patches!
>
> Though it looks like OVS is not really playing a huge role in the
> stack trace below.  How much of the stack does the patch set save
> in total?  How much patches 2-7 contribute (I posted a patch similar
> to the first one last week, so we may not count it)?

Stack usage was tested on the same path (with this backported to a
RHEL9 kernel), and the saving was 2080 bytes. That's enough to get
us out of trouble here, but a config that caused more recursion
could still be a problem.

>
> Also, most of the changes introduced here has a real chance to
> noticeably impact performance.  Did you run any performance tests
> with this to assess the impact?

Aaron posted some numbers, as you will have seen: 2-4% for that
patch, but I suspect the rest have a much smaller impact.

Patch 2 might matter if you were doing a lot of push_nsh operations,
but that is probably less important since it's outside the recursive
path.

>
> One last thing is that at least some of the patches seem to change
> non-inlined non-recursive functions.  Seems unnecessary.
>
> Best regards, Ilya Maximets.
>

One thing I do notice in the trace:

> > 
> > [c00000037d480b40] __kmalloc+0x8c/0x5e0
> > [c00000037d480bc0] virtqueue_add_outbuf+0x354/0xac0
> > [c00000037d480cc0] xmit_skb+0x1dc/0x350 [virtio_net]
> > [c00000037d480d50] start_xmit+0xd4/0x3b0 [virtio_net]
> > [c00000037d480e00] dev_hard_start_xmit+0x11c/0x280
> > [c00000037d480e80] sch_direct_xmit+0xec/0x330
> > [c00000037d480f20] __dev_xmit_skb+0x41c/0xa80
> > [c00000037d480f90] __dev_queue_xmit+0x414/0x950
> > [c00000037d481070] ovs_vport_send+0xb4/0x210 [openvswitch]
> > [c00000037d4810f0] do_output+0x7c/0x200 [openvswitch]
> > [c00000037d481140] do_execute_actions+0xe48/0xeb0 [openvswitch]
> > [c00000037d481300] ovs_execute_actions+0x78/0x1f0 [openvswitch]
> > [c00000037d481380] ovs_dp_process_packet+0xb4/0x2e0 [openvswitch]
> > [c00000037d481450] ovs_vport_receive+0x8c/0x130 [openvswitch]
> > [c00000037d481660] internal_dev_xmit+0x40/0xd0 [openvswitch]
> > [c00000037d481690] dev_hard_start_xmit+0x11c/0x280
> > [c00000037d481710] __dev_queue_xmit+0x634/0x950
> > [c00000037d4817f0] neigh_hh_output+0xd0/0x180
> > [c00000037d481840] ip_finish_output2+0x31c/0x5c0
> > [c00000037d4818e0] ip_local_out+0x64/0x90
> > [c00000037d481920] iptunnel_xmit+0x194/0x290
> > [c00000037d4819c0] udp_tunnel_xmit_skb+0x100/0x140 [udp_tunnel]
> > [c00000037d481a80] geneve_xmit_skb+0x34c/0x610 [geneve]
> > [c00000037d481bb0] geneve_xmit+0x94/0x1e8 [geneve]
> > [c00000037d481c30] dev_hard_start_xmit+0x11c/0x280
> > [c00000037d481cb0] __dev_queue_xmit+0x634/0x950
> > [c00000037d481d90] ovs_vport_send+0xb4/0x210 [openvswitch]
> > [c00000037d481e10] do_output+0x7c/0x200 [openvswitch]
> > [c00000037d481e60] do_execute_actions+0xe48/0xeb0 [openvswitch]
> > [c00000037d482020] ovs_execute_actions+0x78/0x1f0 [openvswitch]
> > [c00000037d4820a0] ovs_dp_process_packet+0xb4/0x2e0 [openvswitch]
> > [c00000037d482170] clone_execute+0x2c8/0x370 [openvswitch]

                       ^^^^^

clone_execute is an action which can be deferred AFAIKS, but it is
not deferred until several recursions deep.

If we always deferred when possible, we might avoid such a big
stack (at least for this config). Is it very costly to defer? Would
it help here, or would the deferred action just be processed right
away and cause basically the same call chain?

Thanks,
Nick
