Message-ID: <21c5f793-5894-5101-6c7a-bc27c59b9487@themaw.net>
Date: Fri, 20 Oct 2023 21:01:30 +0800
From: Ian Kent <raven@...maw.net>
To: Anders Roxell <anders.roxell@...aro.org>,
Arnd Bergmann <arnd@...db.de>
Cc: Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
lkft-triage@...ts.linaro.org, linux-fsdevel@...r.kernel.org,
autofs@...r.kernel.org, Bill O'Donnell <bodonnel@...hat.com>,
Christian Brauner <brauner@...nel.org>,
Dan Carpenter <dan.carpenter@...aro.org>
Subject: Re: autofs: add autofs_parse_fd()
On 20/10/23 17:57, Anders Roxell wrote:
> On Fri, 20 Oct 2023 at 11:02, Arnd Bergmann <arnd@...db.de> wrote:
>> On Fri, Oct 20, 2023, at 09:48, Naresh Kamboju wrote:
>>> On Fri, 20 Oct 2023 at 12:07, Arnd Bergmann <arnd@...db.de> wrote:
>>>> On Thu, Oct 19, 2023, at 17:27, Naresh Kamboju wrote:
>>>>> We boot qemu-x86_64 and x86_64 with a 64-bit kernel and a 32-bit rootfs;
>>>>> we call this compat mode boot testing. Recently it started failing to
>>>>> reach the login prompt.
>>>>>
>>>>> We have not seen any kernel crash logs.
>>>>>
>>>>> Anders, bisection is pointing to first bad commit,
>>>>> 546694b8f658 autofs: add autofs_parse_fd()
>>>>>
>>>>> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
>>>>> Reported-by: Anders Roxell <anders.roxell@...aro.org>
>>>> I tried to find something in that commit that would be different
>>>> in compat mode, but don't see anything at all -- this appears
>>>> to be just a simple refactoring of the code, unlike the commits
>>>> that immediately follow it and that do change the mount
>>>> interface.
>>>>
>>>> Unfortunately this makes it impossible to just revert the commit
>>>> on top of linux-next. Can you double-check your bisection by
>>>> testing 546694b8f658 and the commit before it again?
>>> I will try what you suggested.
>>>
>>> Is this information helpful?
>>> In linux-next, the regression first appeared in next-20230926.
>>>
>>> GOOD: next-20230925
>>> BAD: next-20230926
>>>
>>> $ git log --oneline next-20230925..next-20230926 -- fs/autofs/
>>> dede367149c4 autofs: fix protocol sub version setting
>>> e6ec453bd0f0 autofs: convert autofs to use the new mount api
>>> 1f50012d9c63 autofs: validate protocol version
>>> 9b2731666d1d autofs: refactor parse_options()
>>> 7efd93ea790e autofs: reformat 0pt enum declaration
>>> a7467430b4de autofs: refactor super block info init
>>> 546694b8f658 autofs: add autofs_parse_fd()
>>> bc69fdde0ae1 autofs: refactor autofs_prepare_pipe()
>> Right, and it looks like the bottom five patches of this
>> should be fairly harmless as they only try to move code
>> around in preparation of the later changes, and even the
>> other ones should not cause any difference between a 32-bit
>> or a 64-bit /sbin/mount binary.
>>
>> If the native (full 64-bit or full 32-bit) test run still
>> works with the same version, there may be some other difference
>> here.
>>
>>>> What are the exact mount options you pass to autofs in your fstab?
>>> mount output shows like this,
>>> systemd-1 on /proc/sys/fs/binfmt_misc type autofs
>>> (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1421)
>> This is only the binfmt-misc mount, which should not
>> prevent your rootfs from getting mounted, but it's possible
>> that failure to mount this prevents you from running
>> 32-bit binaries.
>>
>> I see this comes from the "proc-sys-fs-binfmt_misc.automount"
>> service in systemd. I see this is defined in
>> https://github.com/systemd/systemd/blob/main/units/proc-sys-fs-binfmt_misc.automount
>> but I don't know exactly what its purpose is here. On a
>> 64-bit system, you normally use compat_binfmt_elf.ko to run
>> 32-bit binaries, and this does not require any specific mount
>> points. Alternatively, you could use binfmt_misc.ko with
>> the procfs mount to configure running arbitrary binary
>> formats such as arm32 on x86_64 with qemu-user emulation.
>>
>> I double-checked your rootfs image from
>> https://storage.tuxboot.com/debian/bookworm/i386/rootfs.ext4.xz
>> to ensure that this indeed contains i386 executables rather than
>> arm32 ones, and that is all fine.
>>
>> I also see in your log file at
>> https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230926/testrun/20125035/suite/boot/test/gcc-13-lkftconfig-compat/log
>> that it is running the i386 binaries from the rootfs, but
>> it does get stuck soon after trying to set up the binfmt-misc
>> mount at the end of the log:
>>
>> [  OK  ] Reached target local-fs.target - Local File Systems.
>> Starting systemd-binfmt.se…et Up Additional Binary Formats...
>> Starting systemd-tmpfiles-… Volatile Files and Directories...
>> Starting systemd-udevd.ser…ger for Device Events and Files...
>> [ 15.869404] igb 0000:01:00.0 eno1: renamed from eth0 (while UP)
>> [ 15.883753] igb 0000:02:00.0 eno2: renamed from eth1
>> [ 20.053885] (udev-worker) (175) used greatest stack depth: 12416 bytes left
>> quit
>>
>> I'm a bit out of ideas at that point, my best guess now is
>> that your bisection points to something in autofs that makes
>> it hang while setting up autofs, but that neither autofs
>> nor binfmt-misc are actually being used otherwise.
>>
>> Maybe you can try to modify your rootfs to disable or remove
>> the systemd-binfmt.service, to confirm that autofs is not
>> actually needed here but does cause the crash?
> I removed systemd-binfmt.service from the rootfs and booted
> 546694b8f658 ("autofs: add autofs_parse_fd()") and now it booted fine.

I don't suppose you could try an automount after the boot has completed?

It seems a bit odd. It must be some sort of object lifetime inconsistency,
but if that were the case automounts would at least fail to function, hmm ...
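
One way to run such a post-boot check is sketched below. It is only an
illustration, not something taken from this thread: the helper names
fstypes_at() and check_automount() are hypothetical, and the mount point is
assumed to be the /proc/sys/fs/binfmt_misc autofs mount shown in the quoted
mount output. Walking into the directory should trigger the automount, and
/proc/self/mounts should then show what is stacked there.

```python
import os

# Assumption: the autofs trigger is the mount point from the quoted mount output.
AUTOMOUNT_POINT = "/proc/sys/fs/binfmt_misc"


def fstypes_at(mounts_text, mount_point):
    """Return the filesystem types mounted at mount_point, in mount order."""
    types = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/self/mounts fields: device, mount point, fstype, options, ...
        if len(fields) >= 3 and fields[1] == mount_point:
            types.append(fields[2])
    return types


def check_automount(mount_point=AUTOMOUNT_POINT):
    """Access the autofs trigger, then report what is mounted on it."""
    try:
        # Opening the directory should make the kernel request the mount.
        os.listdir(mount_point)
    except OSError as err:
        print(f"access to {mount_point} failed: {err}")
    with open("/proc/self/mounts") as mounts:
        return fstypes_at(mounts.read(), mount_point)
```

Calling check_automount() on the booted system and seeing only "autofs" in
the result (or a hang in the listdir call) would suggest the trigger itself
is not working; seeing a second filesystem type stacked on top would suggest
automounts still function.
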
Ian