linux-kernel - Re: [PATCH] x86_64: Dynamically allocate arch specific system vectors


Message-ID: <48975A75.4010609@sgi.com>
Date:	Mon, 04 Aug 2008 14:37:25 -0500
From:	Alan Mayer <ajm@....com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Cliff Wickman <cpw@....com>, jeremy@...p.org,
	rusty@...tcorp.com.au, suresh.b.siddha@...el.com, mingo@...e.hu,
	torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	Dean Nelson <dcn@....com>
Subject: Re: [PATCH] x86_64:  Dynamically allocate arch specific system vectors



Eric W. Biederman wrote:
> Alan Mayer <ajm@....com> writes:
> 
>> Okay, I think we have it now.  assign_irq_vector *almost* does what we need.
>> One minor thing is that assign_irq_vector ANDs against cpu_online_map.  We would
>> need cpu_possible_map, so we get the vector on offline cpus that may come
>> online.  The other thing is that assign_irq_vector doesn't allow the
>> specification of interrupt priorities.  It would need to be modified to handle
>> returning either a high priority vector or a low priority vector.  Would
>> modifying the api for assign_irq_vector be the proper approach?
> 
> I don't know if it makes sense to modify assign_irq_vector or to 
> have a companion function that uses the same data structures.
> 
> I think I would work on the companion function and if the code
> can be made sufficiently similar merge the two functions.
> 
Okay, if I understand you, here's what we can do.  We currently have 
a function that does pretty much what the combination of create_irq() 
and __assign_irq_vector() does.  We can accomplish the same thing that our
routine does using create_irq() and __assign_irq_vector() if we make 
the following changes:

__assign_irq_vector(int irq, cpumask_t mask) ==>
	__assign_irq_vector(int irq, cpumask_t mask, int priority);

priority has three values:  priority_none, priority_low, priority_high
priority_none means do everything the way it is done now.
priority_low means do everything the way it is done now, except use 
cpu_possible_map rather than cpu_online_map.
priority_high means search the interrupt vectors from the top down, 
rather than from the bottom up and use cpu_possible_map rather than 
cpu_online_map.
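
To make the three priority values concrete, here is a minimal userspace
sketch of the search order described above.  It is not the kernel's
__assign_irq_vector(): sketch_assign_vector(), cpu_bitmask, the SKETCH_*
constants and the 8-cpu bitmaps are all assumptions made for illustration.
The sketch also reserves the same vector on every cpu in the masked set,
which is what a system vector needs and not necessarily how device irqs are
handled.  Searching from the top for priority_high reflects the fact that
the x86 local APIC takes an interrupt's priority class from the upper four
bits of its vector number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes and vector range for this sketch, not kernel values. */
#define SKETCH_NR_CPUS      8
#define SKETCH_FIRST_VECTOR 0x20
#define SKETCH_LAST_VECTOR  0xef

enum vector_priority { priority_none, priority_low, priority_high };

/* One bit per cpu; stands in for the kernel's cpumask_t. */
typedef uint32_t cpu_bitmask;

static cpu_bitmask sketch_online_map   = 0x0f; /* cpus 0-3 are online   */
static cpu_bitmask sketch_possible_map = 0xff; /* cpus 0-7 are possible */

/* vector_used[cpu][vector] is true once that vector is taken on that cpu. */
static bool vector_used[SKETCH_NR_CPUS][256];

/* True if 'vector' is unused on every cpu in 'cpus'. */
static bool vector_free_on(cpu_bitmask cpus, int vector)
{
	assert(vector >= 0 && vector < 256);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if ((cpus & (1u << cpu)) && vector_used[cpu][vector])
			return false;
	return true;
}

/* Returns the reserved vector, or -1 if none is available. */
int sketch_assign_vector(cpu_bitmask mask, enum vector_priority prio)
{
	/* priority_none keeps the current behaviour: online cpus only. */
	cpu_bitmask cpus = mask & (prio == priority_none ? sketch_online_map
							  : sketch_possible_map);
	/* priority_high walks downward from the top of the range. */
	int step = (prio == priority_high) ? -1 : 1;
	int vector = (prio == priority_high) ? SKETCH_LAST_VECTOR
					     : SKETCH_FIRST_VECTOR;

	if (cpus == 0)
		return -1;

	for (; vector >= SKETCH_FIRST_VECTOR && vector <= SKETCH_LAST_VECTOR;
	     vector += step) {
		if (!vector_free_on(cpus, vector))
			continue;
		/* Reserve the same vector on every selected cpu. */
		for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
			if (cpus & (1u << cpu))
				vector_used[cpu][vector] = true;
		return vector;
	}
	return -1;
}
```

With cpus 0-3 online and cpus 0-7 possible, a priority_low request for all
cpus reserves the lowest free vector on all eight cpus, while a
priority_high request reserves the highest free one.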

create_irq(void) ==> create_irq(int priority, cpumask_t *mask)
priority_none means do everything the way it is done now, passing in 
TARGET_CPUS as the mask, but also sending the priority arg. into 
__assign_irq_vector().
priority_low and priority_high mean use create_irq()'s mask arg. as the
mask passed to __assign_irq_vector().
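
The mask selection in the proposed create_irq() could look roughly like the
following userspace sketch.  Everything in it is hypothetical:
sketch_create_irq(), sketch_assign() (a stand-in for
__assign_irq_vector() that only records what it was asked for),
SKETCH_TARGET_CPUS, the 16-entry irq table, and the choice to fall back to
SKETCH_TARGET_CPUS when the caller passes a NULL mask.  It assumes irqs are
searched from the highest number down and that an irq with a nonzero vector
is already in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum vector_priority { priority_none, priority_low, priority_high };

/* One bit per cpu; stands in for the kernel's cpumask_t. */
typedef uint32_t cpu_bitmask;

/* Hypothetical values for this sketch, not kernel values. */
#define SKETCH_NR_IRQS     16
#define SKETCH_TARGET_CPUS ((cpu_bitmask)0x0f)

struct sketch_irq_cfg {
	int vector;                /* 0 means the irq is unused */
	cpu_bitmask domain;        /* cpus the vector was requested on */
	enum vector_priority prio; /* priority the caller asked for */
};

static struct sketch_irq_cfg sketch_cfg[SKETCH_NR_IRQS];
static int sketch_next_vector = 0x20;

/* Stand-in allocator: records the request, returns 0 on success. */
static int sketch_assign(int irq, cpu_bitmask mask, enum vector_priority prio)
{
	assert(irq >= 0 && irq < SKETCH_NR_IRQS);
	if (mask == 0)
		return -1;
	sketch_cfg[irq].vector = sketch_next_vector++;
	sketch_cfg[irq].domain = mask;
	sketch_cfg[irq].prio = prio;
	return 0;
}

/* Returns the new irq number, or -1 if none could be set up. */
int sketch_create_irq(enum vector_priority prio, const cpu_bitmask *mask)
{
	/* priority_none ignores the caller's mask, as create_irq() does today. */
	cpu_bitmask use = (prio == priority_none || mask == NULL)
				? SKETCH_TARGET_CPUS : *mask;

	for (int irq = SKETCH_NR_IRQS - 1; irq >= 0; irq--) {
		if (sketch_cfg[irq].vector != 0)
			continue;	/* already allocated */
		/* First unused irq: it either gets a vector or we give up. */
		return sketch_assign(irq, use, prio) == 0 ? irq : -1;
	}
	return -1;
}
```

A caller that wants a high priority system vector on every possible cpu
would pass priority_high together with a mask covering those cpus; existing
callers would pass priority_none and get the current behaviour.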

We would add an additional small routine on top of create_irq() to do 
any massaging of the irq_desc, etc. that we need for these system vectors.

Is that what you were thinking about?

			--ajm

>> The interrupts don't necessarily fire on all cpus, it's just that they *can*
>> fire on any cpu.  For example, the GRU triggers an interrupt (it is very
>> IPI'ish) to a particular cpu in the event of a GRU TLB fault. That cpu handles
>> the fault and returns.  But the fault can happen on any cpu, so all cpus need to
>> be registered for the same vector and irq. This is probably splitting hairs; it
>> is certainly no different in principle from timer interrupts or processor TLB
>> faults.
> 
> Reasonable.  As long as you don't need to read a status register to figure
> out what to do that sounds reasonable.  This does sound very much like
> splitting hairs on a very platform specific capability.
> 
> If we can generalize the mechanism to things like per cpu timer
> interrupts and such so that we reduce the total amount of code we
> have to maintain I would find it a very compelling mechanism.
> 
>> As far as kernel_stat is concerned, I see your point.  NR_CPUS on our
>> machines is going to be big (4K? 8K? something like that).  NR_IRQS is also
>> going to be big because of that.  It's unfortunate since the actual number of
>> interrupt sources is going to be an order of magnitude smaller, at least.
> 
> The number of interrupt sources is going to be smaller only because
> SGI machines have or at least appear to have poor I/O compared to most
> of the rest of machines in existence.  NR_CPUS*16 is a fairly
> reasonable estimate on most machines in existence.  In the short term
> it is going to get worse in the presence of MSI-X.  I was talking to a
> developer at Intel last week about 256 irqs for one card.  I keep
> having dreams about finding a way to just keep stats for a few cpus
> but alas I don't think that is going to happen.  Silly us.
> 
> Eric
> 
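
For a rough sense of the kernel_stat sizes discussed above, the arithmetic
can be sketched as below.  It assumes one counter per irq per cpu and takes
NR_IRQS as NR_CPUS*16 from Eric's estimate; the helper name
sketch_irq_stat_bytes() and the 4-byte counter width used in the example
are assumptions for illustration, not values stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed when every cpu keeps one counter per irq.
 * nr_irqs is estimated as nr_cpus * irqs_per_cpu.
 */
uint64_t sketch_irq_stat_bytes(uint64_t nr_cpus, uint64_t irqs_per_cpu,
			       size_t counter_size)
{
	uint64_t nr_irqs = nr_cpus * irqs_per_cpu;

	assert(counter_size > 0);
	return nr_cpus * nr_irqs * (uint64_t)counter_size;
}
```

Under those assumptions, 4096 cpus with 4-byte counters come to
4096 * 65536 * 4 bytes = 1 GiB, and 8192 cpus come to 4 GiB, since the
total grows with the square of the cpu count.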

-- 
It's getting to the point
Where I'm no fun anymore.
--
Alan J. Mayer
SGI
ajm@....com
WORK: 651-683-3131
HOME: 651-407-0134