Date:   Wed, 25 Apr 2018 11:31:45 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Andrey Grodzovsky <Andrey.Grodzovsky@....com>
Cc:     David.Panariti@....com,
        Michel Dänzer <michel@...nzer.net>,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        oleg@...hat.com, amd-gfx@...ts.freedesktop.org,
        Alexander.Deucher@....com, akpm@...ux-foundation.org,
        Christian.Koenig@....com
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

Andrey Grodzovsky <Andrey.Grodzovsky@....com> writes:

> On 04/25/2018 11:29 AM, Eric W. Biederman wrote:
>
>> Another issue is changing wait_event_killable to wait_event_timeout,
>> where I need to understand what timeout value is acceptable for all
>> the drivers using the scheduler, or maybe it should come as a
>> property of drm_sched_entity.
>>
>> It would not surprise me if you could pick a large value like 1 second
>> and issue a warning if that timeout ever triggers.  It sounds like the
>> condition where we wait indefinitely today is because something went
>> wrong in the driver.
>
> We wait here for all in-flight GPU jobs that belong to the dying
> entity to complete. The driver submits the GPU jobs, but the content
> of a job is not under the driver's control and could take a long time
> to finish or even hang (e.g. a graphics or compute shader). I guess
> that's why the wait was originally indefinite.


I am ignorant of what user space expects or what the semantics of the
subsystem are here, so I might be completely off base.  But this
wait-for-a-long-time behavior I would expect much more from an
f_op->flush or an f_op->fsync method.

fsync so it could be obtained without closing the file descriptor.
flush so that you could get a return value out to close.
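
As a rough user-space illustration of those two paths (a sketch only,
not anything from the drm scheduler), the helper below writes to an
ordinary file and then collects status first from fsync(), which leaves
the descriptor open, and then from close(), which on Linux reports what
the file's ->flush method returned.  The name write_and_collect_status
and the use of a regular file are assumptions made for the example; a
driver would have to implement the corresponding f_op methods itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from this thread): write a buffer to `path`,
 * then collect completion status from fsync() and from close().
 * Returns 0 on success or a negative errno value.
 */
int write_and_collect_status(const char *path, const char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    ssize_t n = write(fd, buf, len);
    if (n < 0 || (size_t)n != len) {
        int err = (n < 0) ? errno : EIO;
        close(fd);
        return -err;
    }

    /* fsync(): may block until pending work is done; fd stays open. */
    if (fsync(fd) < 0) {
        int err = errno;
        close(fd);
        return -err;
    }

    /* close(): on Linux the file's ->flush method runs here and its
     * return value is what close() reports to the caller. */
    if (close(fd) < 0)
        return -errno;

    return 0;
}
```

A caller that wants to know whether everything completed can check the
return value; a caller that does not care can ignore it, which is the
usual fate of close() errors.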

But I honestly don't know semantically what your userspace applications
expect and/or require, so I can really only say: those are weird semantics.

Eric
