Message-ID: <20170118020942.GA37198@google.com>
Date:   Tue, 17 Jan 2017 18:09:43 -0800
From:   Brian Norris <briannorris@...omium.org>
To:     Dmitry Torokhov <dmitry.torokhov@...il.com>
Cc:     Amitkumar Karwar <akarwar@...vell.com>,
        Nishant Sarmukadam <nishants@...vell.com>,
        linux-kernel@...r.kernel.org, Kalle Valo <kvalo@...eaurora.org>,
        linux-wireless@...r.kernel.org, Cathy Luo <cluo@...vell.com>
Subject: Re: [PATCH v2 2/3] mwifiex: pcie: don't loop/retry interrupt status
 checks

On Tue, Jan 17, 2017 at 12:44:55PM -0800, Dmitry Torokhov wrote:
> On Tue, Jan 17, 2017 at 11:48:22AM -0800, Brian Norris wrote:
> > Also, FWIW, I did some fairly non-scientific tests of this on my
> > systems, and I didn't see much difference. I can run better tests, and
> > even collect data on how often we loop here vs. see new interrupts.
> 
> That would be great. Maybe packet aggregation takes care of interrupts
> arriving "too closely" together most of the time, I dunno.

OK, so I did some basic accounting of how many times this while loop
runs in a row. I don't know if the numbers are highly illuminating, but
here goes. They're listed as histograms, where the first column is the
number of samples that exhibited the behavior and the second column is
the number of passes through the loop before exiting (i.e., before
seeing no more INT_STATUS). A sketch of the instrumented loop follows
the histograms:

Idle (just scanning for networks occasionally, and loading a web page or
two) for a minute or two:
      1 0
    265 1
      2 2

Downloading a Chrome .deb package via wget, in a loop:
    857 0
  36406 1
  32386 2
   2230 3
    153 4
     11 5

Running a perf client test (i.e., TX traffic) in a loop:
   1694 0
 247897 1
  25530 2
    441 3
     18 4

So it seems like in some cases it's at least *possible* to get some
savings on 10-50% of interrupts when under load. (E.g., in the second
example, (32386 + 2230 + 153 + 11) / 72043 ~= 48% of interrupt-status
checks take 2, 3, 4, or 5 passes through the loop; in the third example
it's closer to 10%.)
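
For the curious, here's a self-contained userspace sketch of the kind of
accounting described above -- NOT the actual instrumentation or mwifiex
code; all names here are made up for illustration. It counts consecutive
passes through a (fake) status-read loop per interrupt, then prints a
histogram in the same "<samples> <loops>" format as the tables above:

/*
 * Userspace sketch, not mwifiex code: all names are hypothetical.
 * Simulates counting how many consecutive passes an interrupt handler
 * makes through its status-read loop before the (fake) INT_STATUS
 * register reads back empty.
 */
#include <stdio.h>
#include <stdlib.h>

#define MAX_LOOPS 16

static unsigned long histogram[MAX_LOOPS];

/* Stand-in for reading PCIE_HOST_INT_STATUS: nonzero means the device
 * still reports pending interrupt causes. */
static unsigned int fake_read_int_status(void)
{
	return rand() % 3 ? 1 : 0;	/* fake: "pending" ~2/3 of the time */
}

static void handle_one_interrupt(void)
{
	unsigned int loops = 0;

	/*
	 * The pre-patch handler re-reads the status register and keeps
	 * processing until it reads back zero; the patch drops this
	 * retry loop and relies on the next interrupt firing instead.
	 */
	while (fake_read_int_status() && loops < MAX_LOOPS - 1)
		loops++;	/* one more pass that found work to do */

	histogram[loops]++;
}

int main(void)
{
	for (int i = 0; i < 100000; i++)
		handle_one_interrupt();

	for (int i = 0; i < MAX_LOOPS; i++)
		if (histogram[i])
			printf("%7lu %d\n", histogram[i], i);

	return 0;
}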

Now, I also collected some performance numbers with iperf, between a
Raspberry Pi iperf server and an ARM64 system running mwifiex. On the
whole, the TX side was probably bottlenecked by the RPi, but the RX
side was pretty good.

I'll attach the full numbers, but the percentage deltas are as follows:

                                Mean     Median
                                ------   ------
% change, bidirectional (RX):   -0.3     -4.5
% change, bidirectional (TX):    1.034    4.45
% change, TX only:              12.96    13.35
% change, RX only:              -6.5     -3

I'm not sure I have a good explanation for the gain in TX performance.
Perhaps it's partly the reduced overhead (e.g., fewer unnecessary
register reads). Perhaps it's also because I had IEEE power-save
enabled, so without this patch, performance could (theoretically) be
harmed by the issue mentioned in the commit description (i.e.,
occasional slow PCIe reads) -- though I guess we probably don't enter
power-save often during iperf tests.

So, there could definitely be some methodology mistakes or other
variables involved, but these results don't seem to show any
particularly bad performance loss; if they did, we might consider other
approaches, like NAPI, for tuning (see the sketch below).
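
For reference, the usual NAPI shape that alludes to is sketched below.
Nothing here is from mwifiex or this patch: struct my_priv and the
my_*() helpers are hypothetical, though the napi_*()/netif_napi_add()
calls are the stock kernel API (4.x-era signatures). The hard IRQ
handler just masks device interrupts and schedules polling; the poll
callback then does a bounded amount of work per softirq pass.

/*
 * Sketch only, not mwifiex code; my_* names are hypothetical.
 */
#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct my_priv {
	struct net_device *ndev;
	struct napi_struct napi;
};

/* Hypothetical device-specific helpers. */
static void my_mask_device_interrupts(struct my_priv *priv);
static void my_unmask_device_interrupts(struct my_priv *priv);
static int my_process_rx(struct my_priv *priv, int budget);

static irqreturn_t my_isr(int irq, void *dev_id)
{
	struct my_priv *priv = dev_id;

	/* Hard IRQ does the bare minimum: mask and defer to softirq. */
	my_mask_device_interrupts(priv);
	napi_schedule(&priv->napi);
	return IRQ_HANDLED;
}

static int my_poll(struct napi_struct *napi, int budget)
{
	struct my_priv *priv = container_of(napi, struct my_priv, napi);
	int work = my_process_rx(priv, budget);	/* never exceeds budget */

	if (work < budget) {
		/* Out of work: stop polling and unmask interrupts. */
		napi_complete_done(napi, work);
		my_unmask_device_interrupts(priv);
	}
	return work;
}

/* At probe time:
 *	netif_napi_add(priv->ndev, &priv->napi, my_poll, NAPI_POLL_WEIGHT);
 *	napi_enable(&priv->napi);
 */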

Brian

(Attachment: "summary.csv", text/csv, 1141 bytes)
