linux-kernel - Re: Steam is broken on new kernels

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANn89iJPcD9cOrFUHR_sSVyjxzqYGwB2mG-Crf5vhxc7L+LgsA@mail.gmail.com>
Date:   Wed, 26 Jun 2019 08:38:01 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Guenter Roeck <linux@...ck-us.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Pierre-Loup A. Griffais" <pgriffais@...vesoftware.com>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: Steam is broken on new kernels

On Wed, Jun 26, 2019 at 8:22 AM Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
>
> On Wed, Jun 26, 2019 at 06:20:17AM +0200, Eric Dumazet wrote:
> > On Wed, Jun 26, 2019 at 5:43 AM Guenter Roeck <linux@...ck-us.net> wrote:
> > >
> > > On 6/25/19 7:29 PM, Greg Kroah-Hartman wrote:
> > > > On Tue, Jun 25, 2019 at 07:02:20PM -0700, Guenter Roeck wrote:
> > > >> Hi Greg,
> > > >>
> > > >> On Sat, Jun 22, 2019 at 09:37:53AM +0200, Greg Kroah-Hartman wrote:
> > > >>> On Fri, Jun 21, 2019 at 10:28:21PM -0700, Linus Torvalds wrote:
> > > >>>> On Fri, Jun 21, 2019 at 6:03 PM Pierre-Loup A. Griffais
> > > >>>> <pgriffais@...vesoftware.com> wrote:
> > > >>>>>
> > > >>>>> I applied Eric's path to the tip of the branch and ran that kernel and
> > > >>>>> the bug didn't occur through several logout / login cycles, so things
> > > >>>>> look good at first glance. I'll keep running that kernel and report back
> > > >>>>> if anything crops up in the future, but I believe we're good, beyond
> > > >>>>> getting distros to ship this additional fix.
> > > >>>>
> > > >>>> Good. It's now in my tree, so we can get it quickly into stable and
> > > >>>> then quickly to distributions.
> > > >>>>
> > > >>>> Greg, it's commit b6653b3629e5 ("tcp: refine memory limit test in
> > > >>>> tcp_fragment()"), and I'm building it right now and I'll push it out
> > > >>>> in a couple of minutes assuming nothing odd is going on.
> > > >>>
> > > >>> This looks good for 4.19 and 5.1, so I'll push out new stable kernels in
> > > >>> a bit for them.
> > > >>>
> > > >>> But for 4.14 and older, we don't have the "hint" to know this is an
> > > >>> outbound going packet and not to apply these checks at that point in
> > > >>> time, so this patch doesn't work.
> > > >>>
> > > >>> I'll see if I can figure anything else later this afternoon for those
> > > >>> kernels...
> > > >>>
> > > >>
> > > >> I may have missed it, but I don't see a fix for the problem in
> > > >> older stable branches. Any news ?
> > > >>
> > > >> One possibility might be be to apply the part of 75c119afe14f7 which
> > > >> introduces TCP_FRAG_IN_WRITE_QUEUE and TCP_FRAG_IN_RTX_QUEUE, if that
> > > >> is acceptable.
> > > >
> > > > That's what people have already discussed on the stable mailing list a
> > > > few hours ago, hopefully a patch shows up soon as I'm traveling at the
> > > > moment and can't do it myself...
> > > >
> > >
> > > Sounds good. Let me know if nothing shows up; I'll be happy to do it
> > > if needed.
> >
> >
> > Without the rb-tree for rtx queues, old kernels are vulnerable to SACK
> > attacks if sk_sndbuf is too big,
> > so I would simply  add a cushion in the test, instead of trying to
> > backport an illusion of the rb-tree fixes.
> >
> >
> >
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index a8772e11dc1cb42d4319b6fc072c625d284c7ad5..a554213afa4ac41120d781fe64b7cd18ff9b56e8
> > 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -1274,7 +1274,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff
> > *skb, u32 len,
> >         if (nsize < 0)
> >                 nsize = 0;
> >
> > -       if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> > +       if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf + 131072)) {
> >                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
> >                 return -ENOMEM;
> >         }
>
> That's a funny magic number, can we document what it means?

This is because TCP can cook skb with about 64KB of payload in
tcp_sendmsg() before
checking if memory limits are exceeded. (This is mentioned in commit
b6653b3629e5b88202be3c9abc44713973f5c4b4
" tcp: refine memory limit test in tcp_fragment()" changelog)

Then, if this giant TSO skb needs to be split in ~45 smaller skbs of
one segment each,
the resulting truesize might be twice bigger.

You could use 2 * 65536 if that looks better, and possibly a macro,
 but I feel that adding a macro for this one particular spot and
stable kernels might be overkill ?

>
> And yes, it's a much simpler patch, I'd rather take this than the fake
> backport.