Message-Id: <20210708092755.15660-1-dwagner@suse.de>
Date:   Thu,  8 Jul 2021 11:27:50 +0200
From:   Daniel Wagner <dwagner@...e.de>
To:     linux-nvme@...ts.infradead.org
Cc:     linux-kernel@...r.kernel.org,
        James Smart <james.smart@...adcom.com>,
        Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
        Ming Lei <ming.lei@...hat.com>,
        Sagi Grimberg <sagi@...mberg.me>,
        Daniel Wagner <dwagner@...e.de>
Subject: [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully

Hi,

I've tested this on top of Ming's patches 'blk-mq: fix
blk_mq_alloc_request_hctx'[1], which fix all problems (including the
hang in nvme_wait_freeze()).

Thanks,
Daniel

[1] https://lore.kernel.org/linux-nvme/20210629074951.1981284-1-ming.lei@redhat.com/

v1:
 - https://lore.kernel.org/linux-nvme/20210625101649.49296-1-dwagner@suse.de/
v2:
 - collected Reviewed-by tags
 - added 'update hardware queues' for all transports
 - added fix for fc hang in nvme_wait_freeze_timeout


Initial cover letter:

This is a follow-up on the crash I reported in

  https://lore.kernel.org/linux-block/20210608183339.70609-1-dwagner@suse.de/

Moving the hardware check up made the crash go away. Unfortunately, I
don't understand why this fixes the crash. The per-cpu access is
crashing, but I can't see why blk_mq_update_nr_hw_queues() fixes this
problem.

Even though I can't explain why it fixes it, I think it makes sense to
update the hardware queue mapping before we recreate the IO
queues. Thus I avoided saying in the commit message that it fixes
something.

Also, during testing I observed that we hang indefinitely in
blk_mq_freeze_queue_wait(). Again, I can't explain why we get stuck
there, but given that a common pattern for nvme_wait_freeze() is to
use it with a timeout, I think the timeout should be used here too :)

Anyway, someone with more understanding of the stack can explain the
problems.


Daniel Wagner (4):
  nvme-fc: Update hardware queues before using them
  nvme-rdma: Update number of hardware queues before using them
  nvme-fc: Wait with a timeout for queue to freeze
  nvme-fc: Freeze queues before destroying them

Hannes Reinecke (1):
  nvme-tcp: Update number of hardware queues before using them

 drivers/nvme/host/fc.c   | 26 +++++++++++++++++---------
 drivers/nvme/host/rdma.c | 13 ++++++-------
 drivers/nvme/host/tcp.c  | 14 ++++++--------
 3 files changed, 29 insertions(+), 24 deletions(-)

-- 
2.29.2
