linux-kernel - [PATCH 0/2] iommu: amd: Fix intremap IO_PAGE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20200902045110.4679-1-suravee.suthikulpanit@amd.com>
Date:   Wed,  2 Sep 2020 04:51:08 +0000
From:   Suravee Suthikulpanit <suravee.suthikulpanit@....com>
To:     linux-kernel@...r.kernel.org, iommu@...ts.linux-foundation.org
Cc:     joro@...tes.org, sean.m.osborne@...cle.com,
        james.puthukattukaran@...cle.com, joao.m.martins@...cle.com,
        boris.ostrovsky@...cle.com, jon.grimm@....com,
        Suravee Suthikulpanit <suravee.suthikulpanit@....com>
Subject: [PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs

Interrupt remapping IO_PAGE_FAULT has been observed under system w/
large number of VMs w/ pass-through devices. This can be reproduced with
64 VMs + 64 pass-through VFs of Mellanox MT28800 Family [ConnectX-5 Ex],
where each VM runs small-packet netperf test via the pass-through device
to the netserver running on the host. All VMs are running in reboot loop,
to trigger IRTE updates.

In addition, to accelerate the failure, irqbalance is triggered periodically
(e.g. 1-5 sec), which should generate large amount of updates to IRTE.
This setup generally triggers IO_PAGE_FAULT within 3-4 hours.

Investigation has shown that the issue is in the code to update IRTE
while remapping is enabled. Please see patch 2/2 for detail discussion.

This serires has been tested running in the setup mentioned above
upto 96 hours w/o seeing issues.

Thanks,
Suravee

Suravee Suthikulpanit (2):
  iommu: amd: Restore IRTE.RemapEn bit after programming IRTE
  iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE

 drivers/iommu/amd/Kconfig |  2 +-
 drivers/iommu/amd/init.c  | 21 +++++++++++++++++++--
 drivers/iommu/amd/iommu.c | 19 +++++++++++++++----
 3 files changed, 35 insertions(+), 7 deletions(-)

-- 
2.17.1