Message-ID: <VI1PR09MB2638EBA2A3A3DD752B9EAA9AC7509@VI1PR09MB2638.eurprd09.prod.outlook.com>
Date:   Fri, 14 May 2021 11:48:14 +0100
From:   David Coe <david.coe@...e.co.uk>
To:     Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        iommu@...ts.linux-foundation.org
Cc:     peterz@...radead.org, mingo@...hat.com, joro@...tes.org,
        Jon.Grimm@....com, amonakov@...ras.ru
Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to
 IOMMU PMC power-gating

Hi all!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
> On certain AMD platforms, when the IOMMU performance counter source
> (csource) field is zero, power-gating for the counter is enabled, which
> prevents write access and returns zero for read access.
> 
> This can cause invalid perf results, especially when event multiplexing
> is needed (i.e. more events than available counters), since
> the current logic keeps track of the previously read counter value
> and subsequently re-programs the counter to continue counting the event.
> With power-gating enabled, we cannot guarantee successful re-programming
> of the counter.
> 
> Work around this issue by:
> 
> 1. Modifying the ordering of setting/reading counters and enabling/
>     disabling csources to only access the counter when the csource
>     is set to non-zero.
> 
> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
>     can be simplified to always start counting with value zero,
>     and accumulate the counter value when stopping without the need
>     to keep track and reprogram the counter with the previously read
>     counter value.
> 
> This has been tested on systems with and without power-gating.
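
For anyone skimming the thread, the two points above can be sketched as a
small user-space model. This is only an illustration under my own
assumptions, not the driver code: the struct and every sim_*/hw_* name are
hypothetical, and the "hardware" is simulated so that a counter whose
csource is zero drops writes and reads back zero, as the patch description
says.

```c
#include <stdint.h>

/* Hypothetical model of one IOMMU counter; not the kernel's data structures. */
struct sim_counter {
    uint64_t hw_value; /* simulated hardware counter register */
    uint64_t csource;  /* counter source; 0 means power-gated */
    uint64_t total;    /* event count accumulated in software */
};

/* Simulated register write: dropped while the counter is power-gated. */
static void hw_write(struct sim_counter *c, uint64_t v)
{
    if (c->csource != 0)
        c->hw_value = v;
}

/* Simulated register read: returns zero while the counter is power-gated. */
static uint64_t hw_read(const struct sim_counter *c)
{
    return c->csource != 0 ? c->hw_value : 0;
}

/* Simulated hardware activity: events are only counted while csource is set. */
static void sim_tick(struct sim_counter *c, uint64_t events)
{
    if (c->csource != 0)
        c->hw_value += events;
}

/* Start: set the csource first, then always begin counting from zero. */
static void sim_start(struct sim_counter *c, uint64_t csource)
{
    c->csource = csource;
    hw_write(c, 0);
}

/* Stop: read and accumulate while csource is still non-zero, then clear it. */
static void sim_stop(struct sim_counter *c)
{
    c->total += hw_read(c);
    c->csource = 0;
}
```

Because every start begins from zero, nothing has to be written back into
the counter with a previously saved value, so a dropped write while gated
can no longer corrupt the running total.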

I've just noticed kernel-5.13-rc1 includes your full iommu enchilada. A 
quick test with Ubuntu's mainline PPA debs (and a home-spun perf) gives 
what seem very satisfactory results on a Ryzen 2400G. Bravo!

  Performance counter stats for 'system wide':

                  0       amd_iommu_0/cmd_processed/           (33.32%)
                  0       amd_iommu_0/cmd_processed_inv/       (33.34%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/    (33.38%)
                615       amd_iommu_0/int_dte_hit/             (33.44%)
                  5       amd_iommu_0/int_dte_mis/             (33.44%)
              1,347       amd_iommu_0/mem_dte_hit/             (33.46%)
             19,127       amd_iommu_0/mem_dte_mis/             (33.44%)
                 71       amd_iommu_0/mem_iommu_tlb_pde_hit/   (33.43%)
                754       amd_iommu_0/mem_iommu_tlb_pde_mis/   (33.41%)
              1,777       amd_iommu_0/mem_iommu_tlb_pte_hit/   (33.36%)
             20,163       amd_iommu_0/mem_iommu_tlb_pte_mis/   (33.32%)
                  0       amd_iommu_0/mem_pass_excl/           (33.25%)
                  0       amd_iommu_0/mem_pass_pretrans/       (33.28%)
             27,283       amd_iommu_0/mem_pass_untrans/        (33.27%)
                  0       amd_iommu_0/mem_target_abort/        (33.29%)
                645       amd_iommu_0/mem_trans_total/         (33.32%)
                  0       amd_iommu_0/page_tbl_read_gst/       (33.28%)
                183       amd_iommu_0/page_tbl_read_nst/       (33.30%)
                 45       amd_iommu_0/page_tbl_read_tot/       (33.30%)
                  0       amd_iommu_0/smi_blk/                 (33.32%)
                  0       amd_iommu_0/smi_recv/                (33.28%)
                  0       amd_iommu_0/tlb_inv/                 (33.27%)
                  0       amd_iommu_0/vapic_int_guest/         (33.28%)
                613       amd_iommu_0/vapic_int_non_guest/     (33.26%)

        9.998673791 seconds time elapsed
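
A note on the percentages in parentheses: they show the share of the
measurement window during which each event actually occupied a counter,
which is what you get when events are multiplexed. As far as I understand
it, perf reports a count scaled up by enabled time over running time. A
minimal sketch of that scaling follows; scale_count is a hypothetical
helper of mine, not a perf function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate the full-window count of a multiplexed
 * event from the raw count and the enabled/running times (same unit).
 */
static uint64_t scale_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
    if (running == 0)
        return 0; /* event never got a counter; nothing to extrapolate */

    /* Extrapolate linearly and round to the nearest integer. */
    return (uint64_t)((double)raw * (double)enabled / (double)running + 0.5);
}
```

For example, an event that ran for a third of the window and counted 100
raw events would be reported as roughly 300.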

Running Windows 10 etc. under QEMU/KVM produces nothing untoward. 
Again, congratulations and many thanks.

-- 
David
