Message-ID: <20080919084856.GF17592@elte.hu>
Date: Fri, 19 Sep 2008 10:48:56 +0200
From: Ingo Molnar <mingo@...e.hu>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Jack Steiner <steiner@....com>, "H. Peter Anvin" <hpa@...or.com>,
Dean Nelson <dcn@....com>, Alan Mayer <ajm@....com>,
jeremy@...p.org, rusty@...tcorp.com.au, suresh.b.siddha@...el.com,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
Yinghai Lu <Yinghai.lu@....com>
Subject: Re: [RFC 0/4] dynamically allocate arch specific system vectors

* Eric W. Biederman <ebiederm@...ssion.com> wrote:
> Jack Steiner <steiner@....com> writes:
>
> > On Wed, Sep 17, 2008 at 03:15:07PM -0700, Eric W. Biederman wrote:
> >> Jack Steiner <steiner@....com> writes:
> >>
> >> > On Wed, Sep 17, 2008 at 12:15:42PM -0700, H. Peter Anvin wrote:
> >> >> Dean Nelson wrote:
> >> >> >
> >> >> > sgi-gru driver
> >> >> >
> >> >> >The GRU is not an actual external device that is connected to an IOAPIC.
> >> >> >The gru is a hardware mechanism that is embedded in the node controller
> >> >> >(UV hub) that directly connects to the cpu socket. Any cpu (with
> >> >> >permission) can do direct loads and stores to the gru. Some of these
> >> >> >stores will result in an interrupt being sent back to the cpu that
> >> >> >did the store.
> >> >> >
> >> >> >The interrupt vector used for this interrupt is not in an IOAPIC. Instead
> >> >> >it must be loaded into the GRU at boot or driver initialization time.
> >> >> >
> >> >>
> >> >> Could you clarify there: is this one vector number per CPU, or are you
> >> >> issuing a specific vector number and just varying the CPU number?
> >> >
> >> > It is one vector for each cpu.
> >> >
> >> > It is more efficient for software if the vector # is the same for all cpus
> >> Why? Especially in terms of irq counting that would seem to lead to cache
> >> line conflicts.
> >
> > Functionally, it does not matter. However, if the IRQ is not a per-cpu IRQ, a
> > very large number of IRQs (and vectors) may be needed. The GRU requires 32
> > interrupt lines on each blade. A large system can currently support up to
> > 512 blades.
>
> Every vendor of high end hardware is saying they intend to provide
> 1 or 2 queues per cpu and 1 irq per queue. So the GRU is not special in
> that regard. Also a very large number of IRQs is not a problem as
> soon as we start dynamically allocating them, which is currently
> in progress.
>
> Once we start dynamically allocating irq_desc structures we can put
> them in node-local memory and guarantee there is no data shared between
> cpus.
>
> > After looking thru the MSI code, we are starting to believe that we should
> > separate the GRU requirements from the XPC requirements. It looks like XPC
> > can easily use the MSI infrastructure. XPC needs a small number of IRQs, and
> > interrupts are typically targeted to a single cpu. They can also be
> > retargeted using the standard methods.
>
> Alright.
>
> I would be completely happy if there were interrupts whose affinity we
> cannot change, and that are always targeted at a single cpu.
>
> > The GRU, OTOH, is more like a timer interrupt or like a co-processor interrupt.
> > GRU interrupts can occur on any cpu using the GRU. When interrupts do occur,
> > all that needs to happen is to call an interrupt handler. I'm thinking of
> > something like the following:
> >
> >   - permanently reserve 2 system vectors in include/asm-x86/irq_vectors.h
> >   - in uv_system_init(), call alloc_intr_gate() to route the
> >     interrupts to a function in the file containing uv_system_init().
> >   - initialize the GRU chipset with the vector, etc, ...
> >   - if an interrupt occurs and the GRU driver is NOT loaded, print
> >     an error message (rate limited or one time)
> >
> >   - provide a special UV hook for the GRU driver to register/deregister a
> >     special callback function for GRU interrupts
>
> That would work. So far the GRU doesn't sound that special.
>
> For a lot of this I would much rather solve the general case here,
> giving us a solution that works for all high end interrupts, rather
> than one specific solution just for the GRU. Especially since it
> looks like we have most of the infrastructure already present to solve
> the general case and we have to develop and review the specific case
> from scratch.

ok, great.

Dean, just to make sure the useful bits are not lost now that the
direction has been changed: could you please repost the patchset but
without the driver API bits? It's still all a nice and useful
generalization and cleanup of the x86 vector allocation code, and we can
check in -tip how well it works in practice.

Ingo
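
[Illustration, not part of the original message.] The hook Jack outlines in
the quoted list (route a reserved vector to a fixed entry function, warn once
if no driver is loaded, let the GRU driver register and deregister a callback)
can be modelled in plain userspace C roughly as below. This is only a sketch
under stated assumptions: every name in it (uv_register_gru_handler,
uv_unregister_gru_handler, uv_gru_vector_entry, gru_intr_handler_t) is
hypothetical, no kernel API is used, and locking is ignored entirely.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical callback type: invoked with the cpu that took the interrupt. */
typedef void (*gru_intr_handler_t)(int cpu);

static gru_intr_handler_t gru_handler; /* NULL while no driver is registered */
static int warned_once;                /* "one time" variant of the warning  */
static int stray_count;                /* interrupts seen with no handler    */
static int handled_count;              /* bumped by the demo handler below   */

/* Hypothetical hook: returns 0 on success, -1 if fn is NULL or a handler
 * is already registered. */
int uv_register_gru_handler(gru_intr_handler_t fn)
{
    if (fn == NULL || gru_handler != NULL)
        return -1;
    gru_handler = fn;
    return 0;
}

/* Hypothetical hook: drop the current handler, if any. */
void uv_unregister_gru_handler(void)
{
    gru_handler = NULL;
}

/* Stand-in for the function the reserved system vector would be routed to. */
void uv_gru_vector_entry(int cpu)
{
    if (gru_handler != NULL) {
        gru_handler(cpu);
        return;
    }
    stray_count++;
    if (!warned_once) {
        warned_once = 1;
        fprintf(stderr, "GRU interrupt on cpu %d, no driver loaded\n", cpu);
    }
}

/* Demo handler standing in for the GRU driver's callback. */
void demo_gru_handler(int cpu)
{
    (void)cpu;
    handled_count++;
}
```

A caller would invoke uv_gru_vector_entry() for each simulated interrupt;
calls made before uv_register_gru_handler(demo_gru_handler) increment
stray_count and print the warning only once, while later calls reach the
demo handler. A real implementation would need to make handler replacement
safe against interrupts running concurrently on other cpus, which this model
does not attempt.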