Date:   Sun, 10 Nov 2019 22:38:41 -0800
From:   Xiaodong Xu <stid.smth@...il.com>
To:     Steffen Klassert <steffen.klassert@...unet.com>
Cc:     herbert@...dor.apana.org.au, davem@...emloft.net,
        chenborfc@....com, netdev@...r.kernel.org
Subject: Re: [PATCH] xfrm: release device reference for invalid state

Thanks for reviewing the patch, Steffen. Please check my replies below.

On Sun, Nov 10, 2019 at 10:17 PM Steffen Klassert
<steffen.klassert@...unet.com> wrote:
>
> Please make sure to always Cc netdev@...r.kernel.org on networking
> patches.
>
> Also, what is the difference between this patch and the one you sent
> before? Please add version numbers to your patches and describe the
> changes between the versions.
>
The main difference in the new version is that 'family' is no longer
assigned when the state is invalid. Assigning it requires accessing
x->outer_mode, and I'm not sure x->outer_mode is still accessible once
the state is invalid.
I'll add a version number to my patch.

> On Fri, Nov 08, 2019 at 12:20:59AM -0800, Xiaodong Xu wrote:
> > An ESP packet could be decrypted in async mode if the input handler for
> > this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
> > reference in skb is held. Later xfrm_input() will be invoked again to
> > resume the processing.
> > If the transform state is still valid it would continue to release the
> > device reference and there won't be a problem; however if the transform
> > state is not valid when async resumption happens, the packet will be
> > dropped while the device reference is still being held.
> > When the device is deleted for some reason and the reference to this
> > device is not properly released, the kernel will keep logging like:
> >
> > unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
> >
> > The issue is observed when running IPsec traffic over a PPPoE device based
> > on a bridge interface. By terminating the PPPoE connection on the server
> > end multiple times, the PPPoE device on the client side will eventually
> > get stuck on the above warning message.
> >
> > This patch will check the async mode first and continue to release device
> > reference in async resumption, before it is dropped due to invalid state.
> >
> > Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
> > Signed-off-by: Xiaodong Xu <stid.smth@...il.com>
> > Reported-by: Bo Chen <chenborfc@....com>
> > Tested-by: Bo Chen <chenborfc@....com>
> > ---
> >  net/xfrm/xfrm_input.c | 30 +++++++++++++++++++++---------
> >  1 file changed, 21 insertions(+), 9 deletions(-)
> >
> > diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> > index 9b599ed66d97..80c5af7cfec7 100644
> > --- a/net/xfrm/xfrm_input.c
> > +++ b/net/xfrm/xfrm_input.c
> > @@ -474,6 +474,13 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
> >       if (encap_type < 0) {
> >               x = xfrm_input_state(skb);
> >
> > +             /* An encap_type of -1 indicates async resumption. */
> > +             if (encap_type == -1) {
> > +                     async = 1;
> > +                     seq = XFRM_SKB_CB(skb)->seq.input.low;
> > +                     goto resume;
> > +             }
> > +
> >               if (unlikely(x->km.state != XFRM_STATE_VALID)) {
> >                       if (x->km.state == XFRM_STATE_ACQ)
> >                               XFRM_INC_STATS(net, LINUX_MIB_XFRMACQUIREERROR);
>
> Why not just drop the reference here if the state became invalid
> after async resumption?
>
I also considered releasing the device reference immediately after
checking the state in the async resumption case. However, it seems more
natural to me to simply jump to the 'resume' label in the async case.
If more resources end up being held across the async resumption in the
future, we won't have to worry about releasing each of them before
dropping the packet.
But if you prefer the other way, I am OK with that too.

Regards,
Xiaodong
