Message-ID: <253ec223-98e1-4e7e-b138-0a83ea1a7b0e@flourine.local>
Date: Wed, 7 Aug 2024 14:40:11 +0200
From: Daniel Wagner <dwagner@...e.de>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>, 
	Sagi Grimberg <sagi@...mberg.me>, Thomas Gleixner <tglx@...utronix.de>, 
	Christoph Hellwig <hch@....de>, "Martin K. Petersen" <martin.petersen@...cle.com>, 
	John Garry <john.g.garry@...cle.com>, "Michael S. Tsirkin" <mst@...hat.com>, 
	Jason Wang <jasowang@...hat.com>, Kashyap Desai <kashyap.desai@...adcom.com>, 
	Sumit Saxena <sumit.saxena@...adcom.com>, Shivasharan S <shivasharan.srikanteshwara@...adcom.com>, 
	Chandrakanth patil <chandrakanth.patil@...adcom.com>, Sathya Prakash Veerichetty <sathya.prakash@...adcom.com>, 
	Suganath Prabu Subramani <suganath-prabu.subramani@...adcom.com>, Nilesh Javali <njavali@...vell.com>, 
	GR-QLogic-Storage-Upstream@...vell.com, Jonathan Corbet <corbet@....net>, 
	Frederic Weisbecker <frederic@...nel.org>, Mel Gorman <mgorman@...e.de>, Hannes Reinecke <hare@...e.de>, 
	Sridhar Balaraman <sbalaraman@...allelwireless.com>, "brookxu.cn" <brookxu.cn@...il.com>, 
	linux-kernel@...r.kernel.org, linux-block@...r.kernel.org, linux-nvme@...ts.infradead.org, 
	linux-scsi@...r.kernel.org, virtualization@...ts.linux.dev, megaraidlinux.pdl@...adcom.com, 
	mpi3mr-linuxdrv.pdl@...adcom.com, MPT-FusionLinux.pdl@...adcom.com, storagedev@...rochip.com, 
	linux-doc@...r.kernel.org
Subject: Re: [PATCH v3 15/15] blk-mq: use hk cpus only when isolcpus=io_queue
 is enabled

On Tue, Aug 06, 2024 at 10:55:09PM GMT, Ming Lei wrote:
> On Tue, Aug 06, 2024 at 02:06:47PM +0200, Daniel Wagner wrote:
> > When isolcpus=io_queue is enabled all hardware queues should run on the
> > housekeeping CPUs only. Thus ignore the affinity mask provided by the
> > driver. Also we can't use blk_mq_map_queues because it will map all CPUs
> > to the first hctx unless the CPU is the one the hctx has its affinity
> > set to, e.g. 8 CPUs with an isolcpus=io_queue,2-3,6-7 config
> 
> What is the expected behavior if someone still tries to submit IO on isolated
> CPUs?

If a user thread issues an IO, the IO is handled by a housekeeping
CPU, which will cause some noise on the submitting CPU. As far as I was
told this is acceptable. What our customers really don't want is IO
that doesn't originate from their application ever hitting the isolated
CPUs.

> BTW, I don't see any change in blk_mq_get_ctx()/blk_mq_map_queue() in this
> patchset,

I was trying to figure out what you explained last time regarding the
hangs, but I didn't really understand under which conditions this
problem occurs.

> that means one random hctx (or even NULL) may be used for submitting
> IO from isolated CPUs, then there can be an IO hang risk during CPU
> hotplug, or a kernel panic when submitting a bio.

Can you elaborate a bit more? I must be missing something important here.

Anyway, my understanding is that when the last CPU of a hctx goes
offline, the affinity is broken and reassigned to an online housekeeping
CPU. We also ensure that all in-flight IO has finished and that we don't
submit any new IO to a CPU which is going offline.

FWIW, I tried really hard to trigger an IO hang with CPU hotplug.
