linux-kernel - RE: [PATCH v2] packet: track ring entry use using a shadow ring to prevent RX ring overrun

Message-ID: <e942c1bbe48b451b92620396746986ab@XCH-RTP-016.cisco.com>
Date:   Wed, 23 May 2018 11:08:33 +0000
From:   "Jon Rosen (jrosen)" <jrosen@...co.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
CC:     "David S. Miller" <davem@...emloft.net>,
        Willem de Bruijn <willemb@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        David Windsor <dwindsor@...il.com>,
        "Rosen, Rami" <rami.rosen@...el.com>,
        "Reshetova, Elena" <elena.reshetova@...el.com>,
        "Mike Maloney" <maloney@...gle.com>,
        Benjamin Poirier <bpoirier@...e.com>,
        "Thomas Gleixner" <tglx@...utronix.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2] packet: track ring entry use using a shadow ring to
 prevent RX ring overrun

> >>> I think the bigger issues as you've pointed out are the cost of
> >>> the additional spin lock and should the additional state be
> >>> stored in-band (fewer cache lines) or out-of-band (less risk of
> >>> breaking due to unpredictable application behavior).
> >>
> >> We don't need the spinlock if clearing the shadow byte after
> >> setting the status to user.
> >>
> >> Worst case, user will set it back to kernel while the shadow
> >> byte is not cleared yet and the next producer will drop a packet.
> >> But next producers will make progress, so there is no deadlock
> >> or corruption.
> >
> > I thought so too for a while but after spending more time than I
> > care to admit I realized the following sequence was occurring:
> >
> >    Core A                       Core B
> >    ------                       ------
> >    - Enter spin_lock
> >    -   Get tp_status of head (X)
> >        tp_status == 0
> >    -   Check inuse
> >        inuse == 0
> >    -   Allocate entry X
> >        advance head (X+1)
> >        set inuse=1
> >    - Exit spin_lock
> >
> >      <very long delay>
> >
> >                                 <allocate N-1 entries
> >                                 where N = size of ring>
> >
> >                                 - Enter spin_lock
> >                                 -   get tp_status of head (X+N)
> >                                     tp_status == 0 (but slot
> >                                     in use for X on core A)
> >
> >    - write tp_status of         <--- trouble!
> >      X = TP_STATUS_USER         <--- trouble!
> >    - write inuse=0              <--- trouble!
> >
> >                                 -   Check inuse
> >                                     inuse == 0
> >                                 -   Allocate entry X+N
> >                                     advance head (X+N+1)
> >                                     set inuse=1
> >                                 - Exit spin_lock
> >
> >
> > At this point Core A just passed slot X to userspace with a
> > packet and Core B has just been assigned slot X+N (same slot as
> > X) for its new packet. Both cores A and B end up filling in that
> > slot.  Tracking this down was one of the reasons it took me a
> > while to produce these updated diffs.
> 
> Is this not just an ordering issue? Since inuse is set after tp_status,
> it has to be tested first (and barriers are needed to avoid reordering).

I changed the code as you suggest to do the inuse check first and
removed the extra added spin_lock/unlock and it seems to be working.
I was able to run through the night without an issue (normally I would
hit the ring corruption in 1 to 2 hours).

Thanks for pointing that out, I should have caught that myself.  Next
I'll look at your suggestion for where to put the shadow ring.
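
For reference, the ordering discussed above can be sketched in userspace
C11 roughly as follows.  This is only an illustration under stated
assumptions, not the patch itself: struct slot, slot_try_claim() and
slot_publish() are hypothetical names, the status values are simplified
to two states, and C11 acquire/release atomics stand in for whatever
barriers the kernel code would actually use.  The claim side is assumed
to still run under the existing producer lock; only the publish side is
lockless.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified status values; the real tp_status carries more flags. */
enum { TP_STATUS_KERNEL = 0, TP_STATUS_USER = 1 };

/* Hypothetical model of one ring entry plus its shadow byte. */
struct slot {
	atomic_uint  tp_status;	/* visible to the application */
	atomic_uchar inuse;	/* shadow byte, kernel-private */
};

/*
 * Claim a slot for a new packet (assumed to be called with the
 * producer lock held).  inuse is tested BEFORE tp_status: since the
 * publish side clears inuse last, seeing inuse == 0 means any earlier
 * tp_status store from that publish is already visible here.
 */
static bool slot_try_claim(struct slot *s)
{
	if (atomic_load_explicit(&s->inuse, memory_order_acquire))
		return false;	/* another core is still filling it */
	if (atomic_load_explicit(&s->tp_status, memory_order_acquire)
	    != TP_STATUS_KERNEL)
		return false;	/* application still owns it */
	atomic_store_explicit(&s->inuse, 1, memory_order_relaxed);
	return true;
}

/*
 * Hand a filled slot to the application, without taking the lock.
 * tp_status is written first and inuse is cleared second, both with
 * release ordering, so the two stores cannot be observed reversed by
 * a claimer that uses the acquire loads above.
 */
static void slot_publish(struct slot *s)
{
	assert(atomic_load_explicit(&s->inuse, memory_order_relaxed) == 1);
	atomic_store_explicit(&s->tp_status, TP_STATUS_USER,
			      memory_order_release);
	atomic_store_explicit(&s->inuse, 0, memory_order_release);
}
```

With the checks in the opposite order (tp_status first, then inuse), a
claimer could read a stale tp_status == 0, have the publish complete in
between, and then read inuse == 0, which is the double assignment shown
in the timeline above.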