Message-ID: <20140722072337.GG15237@phenom.ffwll.local>
Date: Tue, 22 Jul 2014 09:23:37 +0200
From: Daniel Vetter <daniel@...ll.ch>
To: Oded Gabbay <oded.gabbay@....com>
Cc: Jerome Glisse <j.glisse@...il.com>,
Andrew Lewycky <Andrew.Lewycky@....com>,
Michel Dänzer <michel.daenzer@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
linux-mm <linux-mm@...ck.org>,
Evgeny Pinchuk <Evgeny.Pinchuk@....com>,
Alexey Skidanov <Alexey.Skidanov@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver

On Mon, Jul 21, 2014 at 10:23:43PM +0300, Oded Gabbay wrote:
> But Jerome, the core problem still remains in effect, even with your
> suggestion. If an application, either via userspace queue or via ioctl,
> submits a long-running kernel, then the CPU in general can't stop the
> GPU from running it. And if that kernel does while(1); then that's it,
> game over, no matter how you submitted the work. So I don't really
> see the big advantage in your proposal. Only in CZ can we stop this wave
> (by CP H/W scheduling only). What you are saying is basically: I won't
> allow people to use compute on a Linux KV system because it _may_ get the
> system stuck.
>
> So even if I really wanted to, and I may agree with you theoretically on
> that, I can't fulfill your desire to make the "kernel being able to
> preempt at any time and be able to decrease or increase user queue
> priority so overall kernel is in charge of resources management and it
> can handle rogue client in proper fashion". Not in KV, and I guess not
> in CZ either.

At least on Intel, the execlist mechanism used for preemption can be
driven by both the CPU and the firmware scheduler. So we can actually
preempt when doing CPU scheduling.

It sounds like current AMD hw doesn't have any preemption at all. And
without preemption I don't think we should ever consider allowing
userspace to submit work directly to the hw and overload it. Imo the kernel
_must_ sit in between and reject clients that don't behave. Of course you
can only ever react (worst case with a gpu reset; there's code floating
around for that on intel-gfx), but at least you can do something.

If userspace has a direct submit path to the hw then this gets really
tricky, if not impossible.
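
To make the "kernel sits in between" point concrete, here is a minimal
sketch of the idea, simulated in plain userspace C. It is not i915 or
amdkfd code: struct gpu, struct client, submit(), hangcheck_tick(),
HANG_TIMEOUT_TICKS and the return codes are all hypothetical names made
up for illustration. It assumes a single engine without preemption, so
the only reaction available to the kernel is reset-and-ban after the
fact.

```c
#include <stdbool.h>
#include <stddef.h>

#define HANG_TIMEOUT_TICKS 3     /* hypothetical hangcheck budget, in timer ticks */
#define JOB_NEVER_FINISHES (-1)  /* models a shader doing while(1); */

struct client {
    int id;
    bool banned;                 /* set once this client has hung the engine */
};

struct gpu {
    struct client *running;      /* owner of the job on the engine, NULL if idle */
    int ticks_left;              /* work remaining; <= 0 means it never completes */
    int ticks_run;               /* how long the current job has been running */
    int resets;                  /* number of engine resets performed so far */
};

/*
 * Kernel-mediated submission: every job goes through here, so the kernel
 * can refuse clients that misbehaved earlier.
 * Returns 0 on success, -1 if the client is banned, -2 if the engine is busy.
 */
static int submit(struct gpu *g, struct client *c, int job_ticks)
{
    if (c->banned)
        return -1;
    if (g->running != NULL)
        return -2;
    g->running = c;
    g->ticks_left = job_ticks;
    g->ticks_run = 0;
    return 0;
}

/*
 * Called once per timer tick. Without preemption the kernel can only react:
 * if a job overruns its budget, reset the engine and ban the submitter.
 */
static void hangcheck_tick(struct gpu *g)
{
    if (g->running == NULL)
        return;

    g->ticks_run++;

    /* Normal completion: the job used up its remaining work. */
    if (g->ticks_left > 0 && --g->ticks_left == 0) {
        g->running = NULL;
        return;
    }

    /* Budget exceeded: treat as a hang, reset, and ban the client. */
    if (g->ticks_run >= HANG_TIMEOUT_TICKS) {
        g->running->banned = true;
        g->running = NULL;
        g->resets++;
    }
}
```

With a direct userspace submit path there is no submit() for the kernel
to say no in, which is why the ban half of this scheme gets hard to
enforce there.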
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch