Date: Fri, 01 Mar 2024 09:38:39 +0100
From: Johannes Berg <johannes@...solutions.net>
To: "Souza, Jose" <jose.souza@...el.com>, "intel-xe@...ts.freedesktop.org"
	 <intel-xe@...ts.freedesktop.org>, "linux-kernel@...r.kernel.org"
	 <linux-kernel@...r.kernel.org>
Cc: "Vivi, Rodrigo" <rodrigo.vivi@...el.com>, "quic_mojha@...cinc.com"
	 <quic_mojha@...cinc.com>, "Cavitt, Jonathan" <jonathan.cavitt@...el.com>
Subject: Re: [PATCH v2 2/4] devcoredump: Add dev_coredumpm_timeout()

On Wed, 2024-02-28 at 17:56 +0000, Souza, Jose wrote:
> 
> In my opinion, the timeout should depend on the type of device driver.
> 
> In the case of server-class Ethernet cards, where corporate users automate most tasks, five minutes might even be considered excessive.
> 
> For our case, GPUs, users might experience minor glitches and only search for what happened after finishing their current task (writing an email,
> ending a gaming match, watching a YouTube video, etc.).
> If they land on https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html or the future Xe version of that page, following the
> instructions alone may take inexperienced Linux users more than five minutes.

That's all not wrong, but I don't see why you wouldn't automate this
even on end user machines? I feel you're boxing the problem in by
wanting to solve it entirely in the kernel?

> I have set the timeout to one hour in the Xe driver, but this could increase if we start receiving user complaints.

At an hour now, people will probably start arguing that "indefinitely"
is about right? But at that point you're probably back to persisting
them on disk anyway? Or maybe glitches happen during logout/shutdown ...

Anyway, I don't want to block this because I just don't care enough
about how you do things, but I think the kernel is the wrong place to
solve this problem... The intent here was to give some userspace time to
grab it (and yes for that 5 minutes is already way too long), not the
users. That's also part of the reason we only hold on to a single
instance, since I didn't want it to keep consuming more and more memory
for it if it happens repeatedly.
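
To illustrate the kind of userspace automation meant here, below is a minimal sketch rather than a definitive tool. It assumes the devcoredump sysfs layout in which each pending dump appears as /sys/class/devcoredump/devcdN/data and any write to that file tells the kernel the dump can be freed. The function name collect_devcoredumps and the output directory are hypothetical, chosen only for this example.

```python
import glob
import os
import shutil


def collect_devcoredumps(sysfs_root="/sys/class/devcoredump",
                         out_dir="/var/tmp/devcoredumps"):
    """Copy every pending device coredump to out_dir, then release it.

    Assumption: dumps are exposed as <sysfs_root>/devcd*/data and a write
    to that file lets the kernel free the dump before its timeout expires.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for entry in sorted(glob.glob(os.path.join(sysfs_root, "devcd*"))):
        data_path = os.path.join(entry, "data")
        if not os.path.exists(data_path):
            continue
        # Hypothetical naming scheme: one file per devcdN entry.
        dest = os.path.join(out_dir, os.path.basename(entry) + ".bin")
        with open(data_path, "rb") as src, open(dest, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # Release the dump so the kernel does not keep holding the memory.
        with open(data_path, "wb") as release:
            release.write(b"1")
        saved.append(dest)
    return saved
```

Something like this could be started from a udev rule or a small daemon as soon as a devcd device appears, so the dump is persisted on disk within seconds and the in-kernel timeout stays short.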

johannes
