[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201127193134.GA23450@linux.home>
Date: Fri, 27 Nov 2020 20:31:34 +0100
From: Guillaume Nault <gnault@...hat.com>
To: Tom Parkin <tparkin@...alix.com>
Cc: netdev@...r.kernel.org, jchapman@...alix.com
Subject: Re: [PATCH net-next 1/2] ppp: add PPPIOCBRIDGECHAN and
PPPIOCUNBRIDGECHAN ioctls
On Thu, Nov 26, 2020 at 12:24:25PM +0000, Tom Parkin wrote:
> This new ioctl pair allows two ppp channels to be bridged together:
> frames arriving in one channel are transmitted in the other channel
> and vice versa.
Thanks!
Some comments below (mostly about locking).
> The practical use for this is primarily to support the L2TP Access
> Concentrator use-case. The end-user session is presented as a ppp
> channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP
> over a serial link) and is switched into a PPPoL2TP session for
> transmission to the LNS. At the LNS the PPP session is terminated in
> the ISP's network.
>
> When a PPP channel is bridged to another it takes a reference on the
> other's struct ppp_file. This reference is dropped when the channels
> are unbridged, which can occur either explicitly on userspace calling
> the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the
> bridge is unregistered.
>
> In order to implement the channel bridge, struct channel is extended
> with a new field, 'bridge', which points to the other struct channel
> making up the bridge.
>
> This pointer is RCU protected to avoid adding another lock to the data
> path.
>
> To guard against concurrent writes to the pointer, the existing struct
> channel lock 'downl' use is extended (rather than adding a new lock).
> Order of lock acquisition is maintained: i.e. the channel 'upl' lock is
> always acquired before 'downl' in code paths that need to hold both.
>
> Signed-off-by: Tom Parkin <tparkin@...alix.com>
> ---
> drivers/net/ppp/ppp_generic.c | 147 ++++++++++++++++++++++++++++++++-
> include/uapi/linux/ppp-ioctl.h | 2 +
> 2 files changed, 147 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index 7d005896a0f9..5e563bfb8e2a 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -170,11 +170,12 @@ struct channel {
> struct list_head list; /* link in all/new_channels list */
> struct ppp_channel *chan; /* public channel data structure */
> struct rw_semaphore chan_sem; /* protects `chan' during chan ioctl */
> - spinlock_t downl; /* protects `chan', file.xq dequeue */
> + spinlock_t downl; /* protects `chan', 'bridge', file.xq dequeue */
> struct ppp *ppp; /* ppp unit we're connected to */
> struct net *chan_net; /* the net channel belongs to */
> struct list_head clist; /* link in list of channels per unit */
> rwlock_t upl; /* protects `ppp' */
> + struct channel *bridge; /* "bridged" ppp channel */
Missing __rcu annotation (as reported by kernel test robot):
struct channel __rcu *bridge;
With RCU protection, it might make sense to use ->upl, instead of
->downl, to protect the update side. Since ->upl is used to protect the
pointer to the parent unit, it probably makes sense to use it for
->bridge too, which somehow replaces the parent unit (as both are
mutually exclusive). Also, using ->upl would avoid some lock nesting
when updating ->bridge.
> #ifdef CONFIG_PPP_MULTILINK
> u8 avail; /* flag used in multilink stuff */
> u8 had_frag; /* >= 1 fragments have been sent */
> @@ -606,6 +607,95 @@ static struct bpf_prog *compat_ppp_get_filter(struct sock_fprog32 __user *p)
> #endif
> #endif
>
> +/* Bridge one PPP channel to another.
> + * When two channels are bridged, ppp_input on one channel is redirected to
> + * the other's ops->start_xmit handler.
> + * In order to safely bridge channels we must reject channels which are already
> + * part of a bridge instance, or which form part of an existing unit.
> + * Once successfully bridged, each channel holds a reference on the other
> + * to prevent it being freed while the bridge is extant.
> + */
> +static int ppp_bridge_channels(struct channel *pch, struct channel *pchb)
> +{
> + int ret = -EALREADY;
> +
> + /* We need to take each channel upl for access to the 'ppp' field,
> + * and each channel downl for write access to the 'bridge' field.
> + */
> +
> + read_lock_bh(&pch->upl);
> + if (pch->ppp)
> + goto out0;
> +
> + spin_lock(&pch->downl);
> +
> + read_lock_bh(&pchb->upl);
> + if (pchb->ppp)
> + goto out1;
You're verifying that ->ppp isn't set, however, you haven't added a
test in ppp_connect_channel() to avoid setting ->ppp when ->bridge
is already set. Therefore, it'd still be possible to set both ->ppp and
->bridge on a channel.
> + spin_lock(&pchb->downl);
> +
> + if (pch->bridge || pchb->bridge)
> + goto out2;
> +
> + rcu_assign_pointer(pch->bridge, pchb);
> + refcount_inc(&pchb->file.refcnt);
> +
> + rcu_assign_pointer(pchb->bridge, pch);
> + refcount_inc(&pch->file.refcnt);
> +
> + ret = 0;
> +
> +out2:
> + spin_unlock(&pchb->downl);
> +out1:
> + read_unlock_bh(&pchb->upl);
> + spin_unlock(&pch->downl);
> +out0:
> + read_unlock_bh(&pch->upl);
> +
> + return ret;
> +}
Locking looks dangerous here: given that ppp_bridge_channels() is
called with pn->all_channels_lock held, that's 5 nested locks.
Since we have to hold the channels anyway, why not incrementing the
refcount immediately and unlock everything as soon as possible?
That is, instead of doing:
LOCK(all_channels_lock)
LOCK(channel->upl)
LOCK(channel->downl)
LOCK(bridge->upl)
LOCK(bridge->downl)
assign_pointers
UNLOCK()
...
UNLOCK()
what about something like:
LOCK(all_channels_lock)
bridge = find_channel()
refcount_inc(&bridge->refcount)
UNLOCK(all_channels_lock)
LOCK(channel->upl)
LOCK(channel->downl)
set ->bridge
UNLOCK(channel->downl)
UNLOCK(channel->upl)
refcount_inc(&channel->refcount) // so that bridge holds a ref
LOCK(bridge->upl)
LOCK(bridge->downl)
set ->bridge
UNLOCK(bridge->downl)
UNLOCK(bridge->upl)
We could even avoid locking ->downl if ->bridge was protected directly
by ->upl. That way we'd avoid nesting locks entirely.
> +static int ppp_unbridge_channels(struct channel *pch)
> +{
> + struct channel *pchb;
> +
> + rcu_read_lock();
> +
> + pchb = rcu_dereference(pch->bridge);
> + if (!pchb) {
> + rcu_read_unlock();
> + return -ENXIO;
> + }
> +
> + if (pch != rcu_dereference(pchb->bridge)) {
> + rcu_read_unlock();
> + return -ENXIO;
> + }
Looks like we have a TOCTOU problem here: ->bridge might change before
we lock ->downl.
> + spin_lock(&pch->downl);
> + spin_lock(&pchb->downl);
I think we can have a deadlock here. Since ppp_unbridge_channels()
isn't running under the protection of an external lock, we could have
the bridge channel call this function concurrently. Then we'd have
lock inversion:
ppp_unbridge_channels(channel) ppp_unbridge_channels(bridge)
LOCK(channel->downl) LOCK(bridge->downl)
LOCK(bridge->downl) LOCK(channel->downl) // deadlock
Here again I think we should avoid nesting locks and clear each
->bridge pointer independently.
> + rcu_assign_pointer(pch->bridge, NULL);
> + rcu_assign_pointer(pchb->bridge, NULL);
Nit, we can use RCU_INIT_POINTER() when resetting a pointer with NULL.
> + spin_unlock(&pchb->downl);
> + spin_unlock(&pch->downl);
> +
> + rcu_read_unlock();
> +
> + synchronize_rcu();
> +
> + if (refcount_dec_and_test(&pch->file.refcnt))
> + ppp_destroy_channel(pch);
> +
> + if (refcount_dec_and_test(&pchb->file.refcnt))
> + ppp_destroy_channel(pchb);
> +
> + return 0;
> +}
> +
> static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> {
> struct ppp_file *pf;
> @@ -641,8 +731,9 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> }
>
> if (pf->kind == CHANNEL) {
> - struct channel *pch;
> + struct channel *pch, *pchb;
> struct ppp_channel *chan;
> + struct ppp_net *pn;
>
> pch = PF_TO_CHANNEL(pf);
>
> @@ -657,6 +748,22 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> err = ppp_disconnect_channel(pch);
> break;
>
> + case PPPIOCBRIDGECHAN:
> + if (get_user(unit, p))
> + break;
> + err = -ENXIO;
> + pn = ppp_pernet(current->nsproxy->net_ns);
> + spin_lock_bh(&pn->all_channels_lock);
> + pchb = ppp_find_channel(pn, unit);
> + if (pchb)
> + err = ppp_bridge_channels(pch, pchb);
> + spin_unlock_bh(&pn->all_channels_lock);
> + break;
> +
> + case PPPIOCUNBRIDGECHAN:
> + err = ppp_unbridge_channels(pch);
> + break;
> +
> default:
> down_read(&pch->chan_sem);
> chan = pch->chan;
> @@ -2089,6 +2196,35 @@ static bool ppp_decompress_proto(struct sk_buff *skb)
> return pskb_may_pull(skb, 2);
> }
>
> +/* Attempt to handle a frame via. a bridged channel, if one exists.
> + * If the channel is bridged, the frame is consumed by the bridge.
> + * If not, the caller must handle the frame by normal recv mechanisms.
> + * Returns true if the frame is consumed, false otherwise.
> + */
> +static bool ppp_channel_bridge_input(struct channel *pch, struct sk_buff *skb)
> +{
> + struct channel *pchb;
> +
> + rcu_read_lock();
> + pchb = rcu_dereference(pch->bridge);
> + if (pchb) {
> + spin_lock(&pchb->downl);
> + if (pchb->chan) {
> + skb_scrub_packet(skb, !net_eq(pch->chan_net, pchb->chan_net));
> + if (!pchb->chan->ops->start_xmit(pchb->chan, skb))
> + kfree_skb(skb);
> + } else {
> + /* channel got unregistered */
> + kfree_skb(skb);
> + }
> + spin_unlock(&pchb->downl);
> + }
> + rcu_read_unlock();
> +
> + /* If pchb is set then we've consumed the packet */
> + return pchb;
> +}
Maybe "return !!pchb;". I always find it unexpected to store a pointer
into a bool. But maybe it's just me.
Also, I believe the code could be made more readable by returning early
in unhandled cases, instead of nesting all the conditions.
> void
> ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
> {
> @@ -2100,6 +2236,10 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
> return;
> }
>
> + /* If the channel is bridged, transmit via. bridge */
> + if (ppp_channel_bridge_input(pch, skb))
> + return;
> +
> read_lock_bh(&pch->upl);
> if (!ppp_decompress_proto(skb)) {
> kfree_skb(skb);
> @@ -2796,8 +2936,11 @@ ppp_unregister_channel(struct ppp_channel *chan)
> list_del(&pch->list);
> spin_unlock_bh(&pn->all_channels_lock);
>
> + ppp_unbridge_channels(pch);
> +
> pch->file.dead = 1;
> wake_up_interruptible(&pch->file.rwait);
> +
> if (refcount_dec_and_test(&pch->file.refcnt))
> ppp_destroy_channel(pch);
> }
> diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
> index 7bd2a5a75348..8dbecb3ad036 100644
> --- a/include/uapi/linux/ppp-ioctl.h
> +++ b/include/uapi/linux/ppp-ioctl.h
> @@ -115,6 +115,8 @@ struct pppol2tp_ioc_stats {
> #define PPPIOCATTCHAN _IOW('t', 56, int) /* attach to ppp channel */
> #define PPPIOCGCHAN _IOR('t', 55, int) /* get ppp channel number */
> #define PPPIOCGL2TPSTATS _IOR('t', 54, struct pppol2tp_ioc_stats)
> +#define PPPIOCBRIDGECHAN _IOW('t', 53, int) /* bridge one channel to another */
> +#define PPPIOCUNBRIDGECHAN _IO('t', 54) /* unbridge channel */
>
> #define SIOCGPPPSTATS (SIOCDEVPRIVATE + 0)
> #define SIOCGPPPVER (SIOCDEVPRIVATE + 1) /* NEVER change this!! */
> --
> 2.17.1
>
Powered by blists - more mailing lists