Message-ID: <CALCETrUyh38ktYNN566o8s_vsjG3UysQVNjmyN5vVU6+GJQ6UA@mail.gmail.com>
Date:   Fri, 17 Mar 2017 08:26:16 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
Cc:     Andy Lutomirski <luto@...nel.org>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        LKML <linux-kernel@...r.kernel.org>,
        Chris Wilson <chris@...is-wilson.co.uk>
Subject: Re: Perf regression after enabling nvme autonomous power state transitions

On Fri, Mar 17, 2017 at 3:58 AM, Tvrtko Ursulin
<tvrtko.ursulin@...ux.intel.com> wrote:
>
> Hi Andy, all,
>
> I have bisected and verified an interesting performance regression caused by
> commit c5552fde102fcc3f2cf9e502b8ac90e3500d8fdf "nvme: Enable autonomous
> power state transitions".
>
> Having that patch or not accounts for approx. 3% perf difference in a test
> which is, and this is the best part, not even i/o bound by any stretch of
> the imagination.

Weird.  Is there something in the workload that is periodically
reading or otherwise doing a synchronous IO?  APST shouldn't affect
throughput in any meaningful way, but it certainly affects latency of
sporadic IO.
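
If it helps to check that, a small probe along these lines makes the
latency directly visible -- time a single small synchronous write and see
whether it occasionally jumps by roughly the exit latency of a deep state.
(Hypothetical sketch; the path is a placeholder for a file on the nvme
filesystem, nothing here is taken from your test.)

/*
 * Minimal latency probe (sketch): time one small write plus fsync so that
 * any APST wake-up latency shows up directly.  Adjust the path as needed.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    struct timespec t0, t1;
    int fd;

    memset(buf, 0, sizeof(buf));

    fd = open("/mnt/nvme/latency-probe", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf) || fsync(fd) != 0) {
        perror("write/fsync");
        return 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("4KiB write+fsync: %ld us\n",
           (long)((t1.tv_sec - t0.tv_sec) * 1000000L +
                  (t1.tv_nsec - t0.tv_nsec) / 1000L));

    close(fd);
    return 0;
}

Run it a few times with pauses in between; with APST on, the occasional
outlier should line up with the exit latency of whichever state the device
had dropped into.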

>
> The test is multi-process with overall medium CPU usage and high GPU (Intel)
> usage. Average runtime is around 13 seconds during which it writes out
> around 14MiB of data.
>
> It does so in chunks during the whole runtime but doesn't do anything
> special, just normal O_RDWR | O_CREAT | O_TRUNC, so in practice this is all
> written to the device only at the end of the test run, in one chunk, via
> the background write-out I suspect.
>
> The 3% mentioned earlier translates to approx. 430ms longer average runtime
> with the above patch.

That is surprisingly long.  If your device is telling the truth about
its latencies, this should be literally impossible.
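
Rough arithmetic, using the exit latencies from the power state table you
pasted below: a wake-up from the deepest state should cost at most its
advertised exlat of 10000us = 10ms, and from state 3 only 300us.  430ms of
extra wall-clock time would therefore need on the order of 43 resumes from
the deepest state (or well over a thousand from state 3) packed into a
13-second run, and the driver only programs a transition after the device
has been idle for many times longer than the state's total latency in the
first place.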

> ps    0 : mp:9.00W operational enlat:5 exlat:5 rrt:0 rrl:0
>           rwt:0 rwl:0 idle_power:- active_power:-

This is the normal on state.

> ps    1 : mp:4.60W operational enlat:30 exlat:30 rrt:1 rrl:1
>           rwt:1 rwl:1 idle_power:- active_power:-
> ps    2 : mp:3.80W operational enlat:30 exlat:30 rrt:2 rrl:2
>           rwt:2 rwl:2 idle_power:- active_power:-

These won't be used at all: they're operational states, and APST only
transitions into non-operational states.

> ps    3 : mp:0.0700W non-operational enlat:10000 exlat:300 rrt:3 rrl:3
>           rwt:3 rwl:3 idle_power:- active_power:-
> ps    4 : mp:0.0050W non-operational enlat:2000 exlat:10000 rrt:4 rrl:4
>           rwt:4 rwl:4 idle_power:- active_power:-

These two have their latencies specified in microseconds and will be
used autonomously.

Can you re-test with
/sys/class/nvme/nvme0/power/pm_qos_latency_tolerance_us set to the
following values and report back?

0 - will turn APST off entirely
1 - will leave APST on but will not program any transitions
11000 - should allow state 3 but not state 4
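
(For reference, the 11000 number comes from adding entry and exit latency:
state 3 totals 10000 + 300 = 10300us, which fits under 11000us, while
state 4 totals 2000 + 10000 = 12000us, which doesn't -- assuming I have the
heuristic right that the tolerance is compared against the sum.)

Setting the value is just a write to that sysfs file as root; if it's more
convenient to script the three runs, a trivial helper like the sketch below
does the same thing (plain echo works just as well):

/*
 * Tiny sketch of a helper that writes its single argument (the latency
 * tolerance in microseconds, e.g. 0, 1 or 11000) to the sysfs attribute
 * named above.  Needs root.
 */
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *path =
        "/sys/class/nvme/nvme0/power/pm_qos_latency_tolerance_us";
    FILE *f;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <latency_tolerance_us>\n", argv[0]);
        return 1;
    }

    f = fopen(path, "w");
    if (!f) {
        perror(path);
        return 1;
    }

    if (fprintf(f, "%s\n", argv[1]) < 0 || fclose(f) != 0) {
        perror("write");
        return 1;
    }

    return 0;
}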

Thanks
Andy
