Message-ID: <57eec58a-6aae-4958-996d-2785da985f04@oracle.com>
Date: Sat, 5 Apr 2025 12:25:00 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
Cc: Takashi Iwai <tiwai@...e.de>, linux-fsdevel@...r.kernel.org,
        stable@...r.kernel.org, regressions@...ts.linux.dev,
        linux-kernel@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
        Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [REGRESSION] Chrome and VSCode breakage with the commit
 b9b588f22a0c

On 4/5/25 3:43 AM, Paul Menzel wrote:
> Dear Greg,
> 
> 
> Thank you for replying on a Saturday.
> 
> On 05.04.25 at 09:29, Greg KH wrote:
>> On Sat, Apr 05, 2025 at 08:32:13AM +0200, Paul Menzel wrote:
> 
>>> On 29.03.25 at 15:57, Chuck Lever wrote:
>>>> On 3/29/25 8:17 AM, Takashi Iwai wrote:
>>>>> On Sun, 23 Feb 2025 09:53:10 +0100, Takashi Iwai wrote:
>>>
>>>>>> we received a bug report showing a regression on the 6.13.1 kernel
>>>>>> against 6.13.0.  The symptom is that Chrome and VSCode stopped
>>>>>> working with Gnome Scaling, as reported on the openSUSE Tumbleweed
>>>>>> bug tracker
>>>>>>     https://bugzilla.suse.com/show_bug.cgi?id=1236943
>>>>>>
>>>>>> Quoting from there:
>>>>>> """
>>>>>> I use the latest TW on Gnome with a 4K display and 150%
>>>>>> scaling. Everything has been working fine, but recently both Chrome
>>>>>> and VSCode (installed from official non-openSUSE channels) stopped
>>>>>> working with Scaling.
>>>>>> ....
>>>>>> I am using VSCode with:
>>>>>> `--enable-features=UseOzonePlatform
>>>>>> --enable-features=WaylandWindowDecorations
>>>>>> --ozone-platform-hint=auto` and
>>>>>> for Chrome, I select `Preferred Ozone platform` == `Wayland`.
>>>>>> """
>>>>>>
>>>>>> Surprisingly, the bisection pointed to the backport of the commit
>>>>>> b9b588f22a0c049a14885399e27625635ae6ef91 ("libfs: Use d_children list
>>>>>> to iterate simple_offset directories").
>>>>>>
>>>>>> Indeed, the revert of this patch on the latest 6.13.4 was
>>>>>> confirmed to fix the issue.  Also, the reporter verified that the
>>>>>> latest 6.14-rc release is still affected, too.
>>>>>>
>>>>>> For now I have no concrete idea how the patch could break the
>>>>>> behavior of a graphical application like the above.  Let us know
>>>>>> if you need something for debugging.  (Or, easiest of all, join the
>>>>>> bugzilla entry and ask there; or open another bug report wherever
>>>>>> you like.)
>>>>>>
>>>>>> BTW, I'll be traveling tomorrow, so my reply will be delayed.
>>>
>>>>>> #regzbot introduced: b9b588f22a0c049a14885399e27625635ae6ef91
>>>>>> #regzbot monitor: https://bugzilla.suse.com/show_bug.cgi?id=1236943
>>>>>
>>>>> After all, this seems to be a bug in Chrome and its variants, which
>>>>> was surfaced by the kernel commit above: as the commit changes the
>>>>> directory enumeration, it also changed the list order returned from
>>>>> libdrm drmGetDevices2(), and it screwed up the application that had
>>>>> only worked by chance beforehand.  That said, the bug itself was
>>>>> already present.  The Chrome upstream tracker:
>>>>>     https://issuetracker.google.com/issues/396434686
>>>>>
>>>>> #regzbot invalid: problem has always existed on Chrome and related
>>>>> code
>>>
>>>> Thank you very much for your report and for chasing this to conclusion.
>>>
>>> Doesn’t marking this as invalid contradict Linux’s no-regression
>>> policy to never break user space, so users can always update the
>>> Linux kernel? Shouldn’t this commit still be reverted, and another
>>> way be found that keeps the old ordering?
>>>
>>> Greg, Sasha, in stable/linux-6.13.y the two commits below would need
>>> to be reverted:
>>>
>>> 180c7e44a18bbd7db89dfd7e7b58d920c44db0ca
>>> d9da7a68a24518e93686d7ae48937187a80944ea
>>>
>>> For stable/linux-6.12.y:
>>>
>>> 176d0333aae43bd0b6d116b1ff4b91e9a15f88ef
>>> 639b40424d17d9eb1d826d047ab871fe37897e76
>>
>> Unless the changes are also reverted in Linus's tree, we'll be keeping
>> these in.  Please work with the maintainers to resolve this in mainline
>> and we will be glad to mirror that in the stable trees as well.
> 
> Commit b9b588f22a0c (libfs: Use d_children list to iterate simple_offset
> directories) does not have a Fixes: tag or Cc: stable@...r.kernel.org. I
> do not understand why it was applied to the stable series at all [1],
> and why it cannot be reverted when it breaks userspace.

I NACK'd the upstream revert for two reasons. First, I expected a
root-cause analysis (RCA) before 6.14 final, though that did not happen.
Second, the Chrome issue was the only reported problem, and it was
specific to a particular hardware configuration and the /latest
developer release/ of Chrome. Neither v6.14.0 nor a Chrome developer
release is going to be put in front of users who do not expect to
encounter issues.

Note that the libfs series addresses several issues. Commit b9b588f22a0c
itself addresses CVE-2024-46701 [1] (in v6.6). I did not add a "Cc:
stable" for commit b9b588f22a0c because it cannot be cherry-picked onto
v6.6; it has to be adjusted manually to apply.

The final RCA reported in [2] shows that there is nothing incorrect
about b9b588f22a0c.

In addition, the next Chrome release will carry a fix for the clearly
incorrect library behavior -- applications cannot depend on the order
of directory entry iteration, because that can change arbitrarily, and
not just because of file system implementation quirks. You will note
that even after sorting the directory entries, the library still had
problems discovering the accelerated graphics device.
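
As a rough illustration of the point about iteration order, here is a
minimal Python sketch. The function names and the name-based predicate
are hypothetical and chosen only for the example; this is not how libdrm
or Chrome identify devices. The idea is that a caller should sort the
entries if it needs a stable order, and should select an entry by one of
its properties rather than by its position in the listing.

```python
import os


def list_entries_sorted(path):
    """Return entry names in a deterministic order.

    os.scandir() yields entries in arbitrary order, so the caller
    imposes an order itself instead of trusting the file system.
    """
    with os.scandir(path) as it:
        return sorted(entry.name for entry in it)


def pick_entry(path, predicate):
    """Select an entry by property, never by position in the listing.

    `predicate` is a hypothetical caller-supplied test; returns the
    first matching name in sorted order, or None if nothing matches.
    """
    for name in list_entries_sorted(path):
        if predicate(name):
            return name
    return None
```

Sorting alone only makes the result reproducible. It is the predicate
that makes the selection correct, which matches the observation above
that sorting by itself did not fix device discovery.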

Reverting now might follow the letter of the rule about "no regressions",
but IMHO moving forward from here is the more constructive approach.


-- 
Chuck Lever

[1] https://nvd.nist.gov/vuln/detail/CVE-2024-46701
[2] https://issuetracker.google.com/issues/396434686?pli=1
