Message-ID: <alpine.DEB.2.20.1711132220240.2097@nanos>
Date:   Mon, 13 Nov 2017 22:33:49 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Sagi Grimberg <sagi@...mberg.me>
cc:     Jens Axboe <axboe@...com>, Jes Sorensen <jsorensen@...com>,
        Tariq Toukan <tariqt@...lanox.com>,
        Saeed Mahameed <saeedm@....mellanox.co.il>,
        Networking <netdev@...r.kernel.org>,
        Leon Romanovsky <leonro@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Kernel Team <kernel-team@...com>,
        Christoph Hellwig <hch@....de>
Subject: Re: [RFD] Managed interrupt affinities [ Was: mlx5 broken affinity ]

On Mon, 13 Nov 2017, Sagi Grimberg wrote:
> > > >    #1 Before the core tries to move the interrupt so it can veto
> > > >       the move if it cannot allocate new resources or whatever is
> > > >       required to operate after the move.
> > > 
> > > What would the core do if a driver vetoes a move?
> > 
> > Return the error code from write_affinity() as it does with any other
> > failure to set the affinity.
> 
> OK, so this would mean that the driver queue no longer has a vector,
> correct? So are the semantics that it needs to clean up its resources,
> or should it expect another callback for that?

The driver queue still has the old vector, i.e.:

echo XXX > /proc/irq/YYY/affinity

     write_irq_affinity(newaffinity)

	newvec = reserve_new_vector();

	ret = subsys_pre_move_callback(...., newaffinity);

	if (ret) {
		drop_new_vector(newvec);
		return ret;
	}

	shutdown(oldvec);
	install(newvec);

	subsys_post_move_callback(....)

	startup(newvec);

subsys_pre_move_callback()

	ret = do_whatever();
	if (ret)
		return ret;

	/*
	 * Make sure nothing is queued anymore and outstanding
	 * requests are completed. Same as for managed CPU hotplug.
	 */
	stop_and_drain_queue();
	return 0;

subsys_post_move_callback()

	install_new_data();

	/* Reenable queue. Same as for managed CPU hotplug */
	reenable_queue();

	free_old_data();
	return;
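
To make the driver side concrete, here is a minimal C sketch of what
those two callbacks could look like for a queue based driver. All types
and helpers below (my_queue, my_alloc_buffers_on(), my_queue_quiesce(),
my_queue_restart(), my_free_buffers()) are made up for illustration;
none of this is an existing kernel API.

/* Assumed helpers, declared only so the sketch is self-contained */
void *my_alloc_buffers_on(const struct cpumask *mask);
void my_queue_quiesce(struct my_queue *q);
void my_queue_restart(struct my_queue *q);
void my_free_buffers(void *bufs);

struct my_queue {
	void *bufs;		/* resources tied to the current vector */
	void *pending_bufs;	/* resources prepared for the new vector */
};

static int my_pre_move(struct my_queue *q, const struct cpumask *newaffinity)
{
	/*
	 * Allocate the resources for the new placement up front, so the
	 * move can be vetoed without touching the live queue.
	 */
	q->pending_bufs = my_alloc_buffers_on(newaffinity);
	if (!q->pending_bufs)
		return -ENOMEM;	/* veto: core drops the new vector */

	/*
	 * Make sure nothing is queued anymore and outstanding requests
	 * are completed. Same as for managed CPU hotplug.
	 */
	my_queue_quiesce(q);
	return 0;
}

static void my_post_move(struct my_queue *q)
{
	void *old = q->bufs;

	/* Install the data prepared in the pre-move callback */
	q->bufs = q->pending_bufs;
	q->pending_bufs = NULL;

	/* Reenable the queue on the new vector */
	my_queue_restart(q);

	my_free_buffers(old);
}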

Does that clarify the mechanism?

> > > This looks like it can work to me, but I'm probably not familiar enough
> > > to see the full picture here.
> > 
> > On the interrupt core side this is workable; I just need the input from
> > the driver^Wsubsystem side if this can be implemented sanely.
> 
> Can you explain what you mean by "subsystem"? I thought that the
> subsystem would be the irq subsystem (which means you are the one to
> provide the needed input :) ) and that the driver would pass in something
> like msi_irq_ops to pci_alloc_irq_vectors() if it supports the driver
> requirements that you listed, and NULL to tell the core to leave it alone
> and do what it sees fit (or pass msi_irq_ops with a flag that means that).
> 
> An ops structure is a very common way for drivers to communicate with a
> subsystem core.

So if you look at the above pseudocode, the subsys_*_move_callbacks are
probably subsystem-specific, i.e. block or networking.

Those subsystem callbacks might either handle it at the subsystem level
directly or call into the particular driver.

That's certainly out of the scope of what the generic interrupt code can do :)
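
Whatever the final split turns out to be, the plumbing could look
roughly like the sketch below. To be clear, this is purely illustrative:
neither struct msi_irq_ops nor pci_alloc_irq_vectors_ops() exist today
(the current pci_alloc_irq_vectors() takes no ops argument), and the
callback signatures are assumptions.

/* Hypothetical ops structure; signatures are made up for the sketch */
struct msi_irq_ops {
	/*
	 * Called before the core moves the vector. A non-zero return
	 * vetoes the move and the old vector stays fully operational.
	 */
	int (*pre_move)(void *data, unsigned int irq,
			const struct cpumask *newaffinity);

	/* Called after the new vector is installed, before startup */
	void (*post_move)(void *data, unsigned int irq);
};

/* Assumed driver callbacks wired into the ops above */
static int my_pre_move_cb(void *data, unsigned int irq,
			  const struct cpumask *newaffinity);
static void my_post_move_cb(void *data, unsigned int irq);

static int my_probe(struct pci_dev *pdev)
{
	static const struct msi_irq_ops my_ops = {
		.pre_move  = my_pre_move_cb,
		.post_move = my_post_move_cb,
	};

	/*
	 * A driver that can satisfy the requirements passes its ops;
	 * NULL keeps today's behaviour where the core does what it
	 * sees fit. pci_alloc_irq_vectors_ops() is a made-up name.
	 */
	return pci_alloc_irq_vectors_ops(pdev, 1, num_online_cpus(),
					 PCI_IRQ_MSIX, &my_ops, pdev);
}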

Thanks,

	tglx
