Date:   Wed, 16 Sep 2020 11:56:05 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Jiang Biao <benbjiang@...il.com>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Jiang Biao <benbjiang@...cent.com>,
        Bin Lai <robinlai@...cent.com>
Subject: Re: [PATCH] driver/pci: reduce the single block time in
 pci_read_config

On Sun, Sep 13, 2020 at 12:27:09PM +0800, Jiang Biao wrote:
> On Thu, 10 Sep 2020 at 09:59, Bjorn Helgaas <helgaas@...nel.org> wrote:
> > On Thu, Sep 10, 2020 at 09:54:02AM +0800, Jiang Biao wrote:
> > > On Thu, 10 Sep 2020 at 09:25, Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > On Mon, Aug 24, 2020 at 01:20:25PM +0800, Jiang Biao wrote:
> > > > > From: Jiang Biao <benbjiang@...cent.com>
> > > > >
> > > > > pci_read_config() could block for several ms in kernel space, mainly
> > > > > caused by the while loop that calls pci_user_read_config_dword().
> > > > > A single pci_user_read_config_dword() call can consume 130us+:
> > > > >               |    pci_user_read_config_dword() {
> > > > >               |      _raw_spin_lock_irq() {
> > > > > ! 136.698 us  |        native_queued_spin_lock_slowpath();
> > > > > ! 137.582 us  |      }
> > > > >               |      pci_read() {
> > > > >               |        raw_pci_read() {
> > > > >               |          pci_conf1_read() {
> > > > >   0.230 us    |            _raw_spin_lock_irqsave();
> > > > >   0.035 us    |            _raw_spin_unlock_irqrestore();
> > > > >   8.476 us    |          }
> > > > >   8.790 us    |        }
> > > > >   9.091 us    |      }
> > > > > ! 147.263 us  |    }
> > > > > and dozens of loop iterations can add up to several ms.
> > > > >
> > > > > If several lspci commands are executed concurrently, ms+ scheduling
> > > > > latency can be observed.
> > > > >
> > > > > Add a scheduling point in the loop to improve the latency, as
> > > > > sketched below.
> > > >
> > > > Thanks for the patch, this makes a lot of sense.
> > > >
> > > > Shouldn't we do the same in pci_write_config()?
> > >
> > > Yes, IMHO, that could be helpful too.
> >
> > If it's feasible, it would be nice to actually verify that it makes a
> > difference.  I know config writes should be faster than reads, but
> > they're certainly not as fast as a CPU can pump out data, so there
> > must be *some* mechanism that slows the CPU down.
> >
> We failed to build a test case that reproduces the latency via the
> setpci command; AFAIU, setpci is realistically used much less often
> than lspci.  So the latency on the pci_write_config() path cannot be
> verified for now.  Could we apply this patch alone to remove the
> verified latency introduced by the pci_read_config() path? :)
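
The fix described in the quoted commit message is the common
cond_resched() pattern: drop a voluntary scheduling point into a
long-running loop so that other runnable tasks can get the CPU between
iterations.  A minimal sketch of the idea; the helper name and loop
structure here are simplified assumptions, not the exact mainline code
(the real pci_read_config() lives in drivers/pci/pci-sysfs.c and also
handles alignment and access sizes):

#include <linux/pci.h>
#include <linux/sched.h>	/* cond_resched() */

/* Illustrative helper only; see the caveats above. */
static void read_user_config_range(struct pci_dev *dev, unsigned int off,
				   unsigned int end, u32 *buf)
{
	while (off < end) {
		/* Per the quoted trace, one read can take 130us+ when
		 * contending on the config-space lock. */
		pci_user_read_config_dword(dev, off, buf++);
		off += 4;
		/* Voluntary scheduling point: bounds the latency other
		 * tasks see when many such reads run back to back. */
		cond_resched();
	}
}

cond_resched() is essentially free when no reschedule is pending, which
is why the same pattern appears in so many of the commits listed below.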

Thanks for trying!  I'll apply the patch as-is.  I'd like to include a
note in the commit log about the user-visible effect of this.  I
looked through recent similar commits:

  928da37a229f ("RDMA/umem: Add a schedule point in ib_umem_get()")
  47aaabdedf36 ("fanotify: Avoid softlockups when reading many events")
  9f47eb5461aa ("fs/btrfs: Add cond_resched() for try_release_extent_mapping() stalls")
  0a3b3c253a1e ("mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls")
  b7e3debdd040 ("mm/memory_hotplug.c: fix false softlockup during pfn range removal")
  d35bd764e689 ("dm writecache: add cond_resched to loop in persistent_memory_claim()")
  da97f2d56bbd ("mm: call cond_resched() from deferred_init_memmap()")
  ab8b65be1831 ("KVM: PPC: Book3S: Fix some RCU-list locks")
  48c963e31bc6 ("KVM: arm/arm64: Release kvm->mmu_lock in loop to prevent starvation")
  e84fe99b68ce ("mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()")
  4005f5c3c9d0 ("wireguard: send/receive: cond_resched() when processing worker ringbuffers")
  3fd44c86711f ("io_uring: use cond_resched() in io_ring_ctx_wait_and_kill()")
  7979457b1d3a ("net: bridge: vlan: Add a schedule point during VLAN processing")
  2ed6edd33a21 ("perf: Add cond_resched() to task_function_call()")
  1edaa447d958 ("dm writecache: add cond_resched to avoid CPU hangs")
  ce9a4186f9ac ("macvlan: add cond_resched() during multicast processing")
  7be1b9b8e9d1 ("drm/mm: Break long searches in fragmented address spaces")
  bb699a793110 ("drm/i915/gem: Break up long lists of object reclaim")
  9424ef56e13a ("ext4: add cond_resched() to __ext4_find_entry()")

and many of them mention softlockups, RCU CPU stall warnings, or
watchdogs triggering.  Are you seeing one of those, or are you
measuring latency some other way?

Bjorn
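
The pci_write_config() counterpart Bjorn asked about would follow the
same pattern.  Since the thread could not construct a setpci-based
reproducer, the following is a hypothetical sketch only (again with an
assumed helper name; not verified or merged code):

/* Hypothetical write-path counterpart; unverified per the thread. */
static void write_user_config_range(struct pci_dev *dev, unsigned int off,
				    unsigned int end, const u32 *buf)
{
	while (off < end) {
		pci_user_write_config_dword(dev, off, *buf++);
		off += 4;
		cond_resched();	/* same scheduling point as the read path */
	}
}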
