netdev - Re: [PATCH v2] kcm: remove any offset before parsing messages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAPDqMeoH9aLqk=j=fU=xHmpa=U2N-Tzu8pspiH4wqef=YzPOdw@mail.gmail.com>
Date:   Fri, 22 Feb 2019 13:01:47 -0800
From:   Tom Herbert <tom@...ntonium.net>
To:     Dominique Martinet <asmadeus@...ewreck.org>
Cc:     Tom Herbert <tom@...bertland.com>,
        David Miller <davem@...emloft.net>,
        Doron Roberts-Kedes <doronrk@...com>,
        Dave Watson <davejwatson@...com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages

On Fri, Feb 22, 2019 at 12:28 PM Dominique Martinet
<asmadeus@...ewreck.org> wrote:
>
> Tom Herbert wrote on Fri, Feb 22, 2019:
> > > > So basically it sounds like you're interested in supporting TCP
> > > > connections that are half closed. I believe that the error in half
> > > > closed is EPIPE, so if the TCP socket returns that it can be ignored
> > > > and the socket can continue being attached and used to send data.
> > >
> > > Did you mean 'can continue being attached and used to receive data'?
> >
> > No, I meant shutdown on receive side when FIN is receved. TX is still
> > allowed to drain an queued bytes. To support shutdown on the TX side
> > would require additional logic since we need to effectively detach the
> > transmit path but retain the receive path. I'm not sure this is a
> > compelling use case to support.
>
Dominque,

> Hm, it must be a matter of how we see thing but from what I understand
> it's exactly the other way around. The remote closed the connection, so
> trying to send anything would just yield a RST, so TX doesn't make
> sense.
> On the other hand, anything that had been sent by the remote before the
> FIN and is on the local side's memory should still be receivable.

That's right, any data sent before the FIN can be received. After the
FIN, there is not more data to received. But, the FIN doesn't say
anything about the reverse path. For instance, if the peer did
shutdown(SHUT_WR), then it sent the FIN so it no longer transmits on
the connection, but still can receive data.

>
> When you think about it as a TCP stream it's really weird: data coming,
> data coming, data coming, FIN received.
> But in the networking stack that received FIN short-circuits all the
> data that was left around and immediately raises an EPIPE error.
>
Right, it doesn't help that most people automatically assume a
received FIN means the connection is closed! (although in practice
that assumption works pretty well for most applications).

> I don't see what makes this FIN packet so great that it should be
> processed before the data; we should only see that EPIPE when we're
> done reading the data before it or trying to send something.

It's not supposed to be. If your seeing a problem, it might because
state_change is processed before the received data is drained. If
plain recvmsg were being used, zero bytes is on returned after all
outstanding received data on the socket has been read. We might need
some additional logic in KCM to handle this.

>
> I'll check tomorrow/next week but I'm pretty sure the packets before
> that have been ack'd at a tcp level as well, so losing them in the
> application level is really unexpected.
>
> > > I can confirm getsockopt with SO_ERROR gets me EPIPE, but I don't see
> > > how to efficiently ignore EPIPE until POLLIN gets unset -- polling on
> > > both the csock and kcm socket will do many needless wakeups on only the
> > > csock from what I can see, so I'd need some holdoff timer or something.
> > > I guess it's possible though.
> >
> > We might need to clear the error somehow. May a read of zero bytes?
>
> Can try.
>
> > > After a bit more debugging, this part works (__strp_recv() is called
> > > again); but the next packet that is treated properly is rejected because
> > > by the time __strp_recv() was called again a new skb was read and the
> > > length isn't large enough to go all the way into the new packet, so this
> > > test fails:
> > >                         } else if (len <= (ssize_t)head->len -
> > >                                           skb->len - stm->strp.offset) {
> > >                                 /* Length must be into new skb (and also
> > >                                  * greater than zero)
> > >                                  */
> > >                                 STRP_STATS_INCR(strp->stats.bad_hdr_len);
> > >                                 strp_parser_err(strp, -EPROTO, desc);
> > >
> > > So I need to figure a way to say "call this function again without
> > > reading more data" somehow, or make this check more lax e.g. accept any
> > > len > 0 after a retry maybe...
> > > Removing that branch altogether seems to work at least but I'm not sure
> > > we'd want to?
> >
> > I like the check since it's conservative and covers the normal case.
> > Maybe just need some more logic?
>
> I can add a "retrying" state and not fail here if we ewre retrying for
> whatever reason perhaps...
> But I'm starting to wonder how this would work if my client didn't keep
> on sending data, I'll try to fail on the last client's packet and see
> how __strp_recv is called again through the timer, with the same skb
> perhaps?

Right, a timer or some sort of aynchronous callbask is needed. Like I
said, I don't think dealing with memory failure like this is unique to
KCM.

Thanks again for looking into this, I think this is good progress!

Tom

>
> --
> Dominique