Message-ID: <20250313115018.7xe77nJ-@linutronix.de>
Date: Thu, 13 Mar 2025 12:50:18 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Ilya Maximets <i.maximets@....org>
Cc: netdev@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
dev@...nvswitch.org, Eric Dumazet <edumazet@...gle.com>,
Simon Horman <horms@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Paolo Abeni <pabeni@...hat.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [ovs-dev] [PATCH net-next 11/18] openvswitch: Use nested-BH
locking for ovs_actions.
On 2025-03-10 17:56:09 [+0100], Ilya Maximets wrote:
> >>> + local_lock_nested_bh(&ovs_actions.bh_lock);
> >>
> >> Wouldn't this cause a warning when we're in a syscall/process context?
> >
> > My understanding is that it is only invoked in softirq context. Did I
> > misunderstand it?
>
> It can be called from the syscall/process context while processing
> OVS_PACKET_CMD_EXECUTE request.
>
> > Otherwise the this_cpu_ptr() above should complain
> > that preemption is not disabled, and if preemption is indeed not
> > disabled, how do you ensure that you don't get preempted after the
> > __this_cpu_inc_return() in several tasks (at the same time), leading
> > to exceeding the OVS_RECURSION_LIMIT?
>
> We disable BH in this case, so it should be safe (on non-RT). See the
> ovs_packet_cmd_execute() for more details.
Yes, exactly. So if BH is disabled, then local_lock_nested_bh() can
safely acquire a per-CPU spinlock on PREEMPT_RT here. This basically
mimics local_bh_disable()'s guarantee of exclusive access to per-CPU
data structures, but on a smaller scope.
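For illustration, a minimal sketch of the idea (field names as in the
patch, everything else abbreviated, so not the literal code):

  #include <linux/local_lock.h>
  #include <linux/percpu.h>

  struct ovs_action {
          int exec_level;
          struct task_struct *owner;
          local_lock_t bh_lock;
  };

  static DEFINE_PER_CPU(struct ovs_action, ovs_actions) = {
          .bh_lock = INIT_LOCAL_LOCK(bh_lock),
  };

          /* e.g. the syscall path (ovs_packet_cmd_execute()), where BH is
           * disabled explicitly; in softirq context it already is.
           */
          local_bh_disable();

          /* On !PREEMPT_RT this is only a lockdep annotation, the
           * exclusion comes from BH being disabled. On PREEMPT_RT it
           * acquires a per-CPU spinlock, so the section remains exclusive
           * even though BH-disabled code is preemptible there.
           */
          local_lock_nested_bh(&ovs_actions.bh_lock);
          /* ... exclusive access to this CPU's ovs_actions here ... */
          local_unlock_nested_bh(&ovs_actions.bh_lock);

          local_bh_enable();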
> >>> + ovs_act->owner = current;
> >>> + }
> >>> +
> >>> level = __this_cpu_inc_return(ovs_actions.exec_level);
> >>> if (unlikely(level > OVS_RECURSION_LIMIT)) {
> >>> net_crit_ratelimited("ovs: recursion limit reached on datapath %s, probable configuration error\n",
> >>> @@ -1710,5 +1718,10 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>
> >>> out:
> >>> __this_cpu_dec(ovs_actions.exec_level);
> >>> +
> >>> + if (level == 1) {
> >>> + ovs_act->owner = NULL;
> >>> + local_unlock_nested_bh(&ovs_actions.bh_lock);
> >>> + }
> >>
> >> Seems dangerous to lock every time the owner changes but unlock only
> >> once on level 1. Even if this works fine, it seems unnecessarily
> >> complicated. Maybe it's better to just lock once before calling
> >> ovs_execute_actions() instead?
> >
> > My understanding is this can be invoked recursively. That means on first
> > invocation owner == NULL and then you acquire the lock, at which point
> > exec_level goes 0 -> 1. On the recursive invocation owner == current and
> > you skip the lock, but exec_level goes 1 -> 2.
> > On the return path, once level is 1, it means the dec made exec_level go
> > 1 -> 0, so you unlock the lock.
>
> My point is: why locking here with some extra non-obvious logic of owner
> tracking if we can lock (unconditionally?) in ovs_packet_cmd_execute() and
> ovs_dp_process_packet() instead? We already disable BH in one of those
> and take appropriate RCU locks in both. So, feels like a better place
> for the extra locking if necessary. We will also not need to move around
> any code in actions.c if the code there is guaranteed to be safe by holding
> locks outside of it.
I think I considered it but dropped it because it looks like one can
call the other.
ovs_packet_cmd_execute() is a unique entry point to ovs_execute_actions().
This one could take the lock unconditionally.
Then we have ovs_dp_process_packet() as the second entry point towards
ovs_execute_actions(), and it is the tricky one. One path originates from
netdev_frame_hook(), which is the "normal" packet-receive path.
Then, within ovs_execute_actions(), there is ovs_vport_send(), which could
use internal_dev_recv() for forwarding. This one injects the packet back
into the networking stack, so it could come back via netdev_frame_hook().
Then there is the internal forwarding via internal_dev_xmit(), which also
ends up in ovs_execute_actions(). Here I don't know whether this can
originate from within the recursion.
After looking at this and seeing internal_dev_recv(), I decided to move
the locking into ovs_execute_actions(), where the recursion check itself
is.
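To make the resulting pattern in ovs_execute_actions() a bit easier to
see, here is a condensed view of what the patch does (not the literal
diff):

          struct ovs_action *ovs_act = this_cpu_ptr(&ovs_actions);
          int level;

          /* Take the per-CPU lock only on the outermost entry; a
           * recursive entry on this CPU sees owner == current and must
           * not try to acquire it again (the lock is not recursive on
           * PREEMPT_RT).
           */
          if (ovs_act->owner != current) {
                  local_lock_nested_bh(&ovs_actions.bh_lock);
                  ovs_act->owner = current;
          }

          level = __this_cpu_inc_return(ovs_actions.exec_level);
          if (unlikely(level > OVS_RECURSION_LIMIT)) {
                  /* complain and drop the packet, as before */
          }
          ...
  out:
          __this_cpu_dec(ovs_actions.exec_level);

          /* level == 1 means this was the outermost invocation and the
           * decrement above brought exec_level back to 0, so release.
           */
          if (level == 1) {
                  ovs_act->owner = NULL;
                  local_unlock_nested_bh(&ovs_actions.bh_lock);
          }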
> > The locking part happens only on PREEMPT_RT because !PREEMPT_RT has
> > softirqs disabled, which guarantees that there will be no preemption.
> >
> > tools/testing/selftests/net/openvswitch should cover this?
>
> It's not a comprehensive test suite, it covers some cases, but it
> doesn't test anything related to preemptions specifically.
From looking at the traces, everything originates from
netdev_frame_hook() and there is sometimes one recursion from within
ovs_execute_actions(). I haven't seen anything else.
> >> Also, the name of the struct ovs_action doesn't make a lot of sense,
> >> I'd suggest to call it pcpu_storage or something like that instead.
> >> I.e. have a more generic name as the fields inside are not directly
> >> related to each other.
> >
> > Understood. ovs_pcpu_storage maybe?
>
> It's OK, I guess, but see also a point about locking inside datapath.c
> instead and probably not needing to change anything in actions.c.
If you say that adding a lock to ovs_dp_process_packet() and another to
ovs_packet_cmd_execute() is the way to go, then I can certainly update.
However, based on what I wrote above, I am not sure.
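For completeness, the caller-side variant in ovs_packet_cmd_execute()
would be roughly (sketch only):

          /* ovs_packet_cmd_execute(), process context */
          local_bh_disable();
          local_lock_nested_bh(&ovs_actions.bh_lock);
          err = ovs_execute_actions(dp, packet, sf_acts, &flow->key);
          local_unlock_nested_bh(&ovs_actions.bh_lock);
          local_bh_enable();

ovs_dp_process_packet() would need the same, and that is the part I am
unsure about, given the possible re-entry described above.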
> >> Best regards, Ilya Maximets.
Sebastian