Message-ID: <a5dfbe4f-b6fc-e282-2a3c-3e487493336c@themaw.net>
Date: Mon, 23 Oct 2023 15:35:24 +0800
From: Ian Kent <raven@...maw.net>
To: Arnd Bergmann <arnd@...db.de>,
Dan Carpenter <dan.carpenter@...aro.org>,
Anders Roxell <anders.roxell@...aro.org>
Cc: Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
lkft-triage@...ts.linaro.org, linux-fsdevel@...r.kernel.org,
autofs@...r.kernel.org, Bill O'Donnell <bodonnel@...hat.com>,
Christian Brauner <brauner@...nel.org>
Subject: Re: autofs: add autofs_parse_fd()
On 23/10/23 08:48, Ian Kent wrote:
> On 20/10/23 21:09, Ian Kent wrote:
>> On 20/10/23 19:23, Arnd Bergmann wrote:
>>> On Fri, Oct 20, 2023, at 12:45, Dan Carpenter wrote:
>>>> On Fri, Oct 20, 2023 at 11:55:57AM +0200, Anders Roxell wrote:
>>>>> On Fri, 20 Oct 2023 at 08:37, Arnd Bergmann <arnd@...db.de> wrote:
>>>>>> On Thu, Oct 19, 2023, at 17:27, Naresh Kamboju wrote:
>>>>>>> We boot qemu-x86_64 and x86_64 with a 64-bit kernel and a 32-bit
>>>>>>> rootfs; we call this compat mode boot testing. Recently it started
>>>>>>> failing to reach the login prompt.
>>>>>>>
>>>>>>> We have not seen any kernel crash logs.
>>>>>>>
>>>>>>> Anders, bisection is pointing to first bad commit,
>>>>>>> 546694b8f658 autofs: add autofs_parse_fd()
>>>>>>>
>>>>>>> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
>>>>>>> Reported-by: Anders Roxell <anders.roxell@...aro.org>
>>>>>> I tried to find something in that commit that would be different
>>>>>> in compat mode, but don't see anything at all -- this appears
>>>>>> to be just a simple refactoring of the code, unlike the commits
>>>>>> that immediately follow it and that do change the mount
>>>>>> interface.
>>>>>>
>>>>>> Unfortunately this makes it impossible to just revert the commit
>>>>>> on top of linux-next. Can you double-check your bisection by
>>>>>> testing 546694b8f658 and the commit before it again?
>>>>> I tried these two patches again:
>>>>> 546694b8f658 ("autofs: add autofs_parse_fd()") - doesn't boot
>>>>> bc69fdde0ae1 ("autofs: refactor autofs_prepare_pipe()") - boots
>>>>>
>>>> One difference that I notice between those two patches is that we no
>>>> longer call autofs_prepare_pipe(). We just call autofs_check_pipe().
>>> Indeed, so some of the f_flags end up being different. I assumed
>>> this was done intentionally, but it might be worth checking if
>>> the patch below makes any difference when the flags get put
>>> back the way they were. This is probably not the correct fix, but
>>> may help figure out what is going on. It should apply to anything
>>> from 546694b8f658 ("autofs: add autofs_parse_fd()") to the current
>>> linux-next:
>>>
>>> --- a/fs/autofs/inode.c
>>> +++ b/fs/autofs/inode.c
>>> @@ -358,6 +358,11 @@ static int autofs_fill_super(struct super_block *s, struct fs_context *fc)
>>> pr_debug("pipe fd = %d, pgrp = %u\n",
>>> sbi->pipefd, pid_nr(sbi->oz_pgrp));
>>> + /* We want a packet pipe */
>>> + sbi->pipe->f_flags |= O_DIRECT;
>>> + /* We don't expect -EAGAIN */
>>> + sbi->pipe->f_flags &= ~O_NONBLOCK;
>>> +
>>
>> That makes sense, we do want a packet pipe and that does also mean
>> we don't want a non-blocking pipe, it will be interesting to see
>> if that makes a difference. It's been a long time since Linus
>> implemented that packet pipe and I can't remember now what the
>> case was that led to it.
>
> After thinking about this over the weekend I'm pretty sure my mistake
> is dropping the call to autofs_prepare_pipe() without adding the tail
> end of it into autofs_parse_fd().
>
> To explain, here is a bit of history, which I'll include in the fix
> description.
>
> During autofs v5 development I decided to stay with the existing usage
> instead of changing to a packed structure for autofs <=> user space
> communications, which turned out to be a mistake on my part.
>
> Problems arose and they were fixed by allowing for the 64 bit to 32 bit
> size difference in the automount(8) code.
>
> Along the way systemd started to use autofs and eventually encountered
> this problem too. systemd refused to compensate for the length
> difference, insisting it be fixed in the kernel. Fortunately Linus
> implemented the packetized pipe, which resolved the problem in a
> straightforward and simple way.
>
> So I'm pretty sure that the cause of the problem is the inadvertent
> dropping of the flags setting in autofs_fill_super() that Arnd spotted,
> although I don't think putting it in autofs_fill_super() is the right
> thing to do.
>
> I'll produce a patch today which includes most of this explanation for
> future travelers ...
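
For anyone following along, the packet pipe behaviour described above can
be illustrated from user space. This is only a sketch, not the kernel fix:
the function name second_read_first_byte() is made up for the example, and
the 304 and 300 byte sizes are illustrative stand-ins for a sender and a
reader that disagree about the packet length. It assumes Linux with
pipe2() and O_DIRECT support on pipes.

```c
#define _GNU_SOURCE		/* for pipe2() and O_DIRECT */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define SENDER_PKT_LEN 304	/* illustrative: size the writer uses */
#define READER_PKT_LEN 300	/* illustrative: size the reader expects */

/*
 * Hypothetical helper: write two SENDER_PKT_LEN-byte packets (filled with
 * 'A', then 'B') into a pipe created with pipe_flags, then issue two
 * READER_PKT_LEN-byte reads. Returns the first byte seen by the second
 * read, or -1 on error.
 */
int second_read_first_byte(int pipe_flags)
{
	int fds[2];
	unsigned char pkt[SENDER_PKT_LEN], buf[READER_PKT_LEN];
	int ret = -1;

	if (pipe2(fds, pipe_flags) < 0)
		return -1;

	memset(pkt, 'A', sizeof(pkt));
	if (write(fds[1], pkt, sizeof(pkt)) != (ssize_t)sizeof(pkt))
		goto out;
	memset(pkt, 'B', sizeof(pkt));
	if (write(fds[1], pkt, sizeof(pkt)) != (ssize_t)sizeof(pkt))
		goto out;

	/* First read consumes (at most) the reader's idea of one packet. */
	if (read(fds[0], buf, sizeof(buf)) != (ssize_t)sizeof(buf))
		goto out;
	/* Second read shows whether the reader is still packet-aligned. */
	if (read(fds[0], buf, sizeof(buf)) <= 0)
		goto out;
	ret = buf[0];
out:
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

With a packet pipe, second_read_first_byte(O_DIRECT) should return 'B':
the excess bytes of the first packet are discarded, so the second read
starts on a packet boundary. With an ordinary byte-stream pipe,
second_read_first_byte(0) should return 'A', because the leftover bytes of
the first packet are still queued ahead of the second one.
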

So I have a patch.

I'm of two minds about whether to try to reproduce this using the
provided instructions. In my experience, similar test automation systems
that claim to provide a reproducer end up being a huge waste of time and
significantly frustrating.

Can someone please test the patch for me once I post it?

Ian
>
>
>>
>>
>> Ian
>>
>>> sbi->flags &= ~AUTOFS_SBI_CATATONIC;
>>> /*