[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9217caeb-0d7e-b101-33f0-859da175a6ef@themaw.net>
Date: Mon, 23 Oct 2023 08:48:50 +0800
From: Ian Kent <raven@...maw.net>
To: Arnd Bergmann <arnd@...db.de>,
Dan Carpenter <dan.carpenter@...aro.org>,
Anders Roxell <anders.roxell@...aro.org>
Cc: Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
lkft-triage@...ts.linaro.org, linux-fsdevel@...r.kernel.org,
autofs@...r.kernel.org, Bill O'Donnell <bodonnel@...hat.com>,
Christian Brauner <brauner@...nel.org>
Subject: Re: autofs: add autofs_parse_fd()
On 20/10/23 21:09, Ian Kent wrote:
> On 20/10/23 19:23, Arnd Bergmann wrote:
>> On Fri, Oct 20, 2023, at 12:45, Dan Carpenter wrote:
>>> On Fri, Oct 20, 2023 at 11:55:57AM +0200, Anders Roxell wrote:
>>>> On Fri, 20 Oct 2023 at 08:37, Arnd Bergmann <arnd@...db.de> wrote:
>>>>> On Thu, Oct 19, 2023, at 17:27, Naresh Kamboju wrote:
>>>>>> The qemu-x86_64 and x86_64 booting with 64bit kernel and 32bit
>>>>>> rootfs we call
>>>>>> it as compat mode boot testing. Recently it started to failed to
>>>>>> get login
>>>>>> prompt.
>>>>>>
>>>>>> We have not seen any kernel crash logs.
>>>>>>
>>>>>> Anders, bisection is pointing to first bad commit,
>>>>>> 546694b8f658 autofs: add autofs_parse_fd()
>>>>>>
>>>>>> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
>>>>>> Reported-by: Anders Roxell <anders.roxell@...aro.org>
>>>>> I tried to find something in that commit that would be different
>>>>> in compat mode, but don't see anything at all -- this appears
>>>>> to be just a simple refactoring of the code, unlike the commits
>>>>> that immediately follow it and that do change the mount
>>>>> interface.
>>>>>
>>>>> Unfortunately this makes it impossible to just revert the commit
>>>>> on top of linux-next. Can you double-check your bisection by
>>>>> testing 546694b8f658 and the commit before it again?
>>>> I tried these two patches again:
>>>> 546694b8f658 ("autofs: add autofs_parse_fd()") - doesn't boot
>>>> bc69fdde0ae1 ("autofs: refactor autofs_prepare_pipe()") - boots
>>>>
>>> One difference that I notice between those two patches is that we no
>>> long call autofs_prepare_pipe(). We just call autofs_check_pipe().
>> Indeed, so some of the f_flags end up being different. I assumed
>> this was done intentionally, but it might be worth checking if
>> the patch below makes any difference when the flags get put
>> back the way they were. This is probably not the correct fix, but
>> may help figure out what is going on. It should apply to anything
>> from 546694b8f658 ("autofs: add autofs_parse_fd()") to the current
>> linux-next:
>>
>> --- a/fs/autofs/inode.c
>> +++ b/fs/autofs/inode.c
>> @@ -358,6 +358,11 @@ static int autofs_fill_super(struct super_block
>> *s, struct fs_context *fc)
>> pr_debug("pipe fd = %d, pgrp = %u\n",
>> sbi->pipefd, pid_nr(sbi->oz_pgrp));
>> + /* We want a packet pipe */
>> + sbi->pipe->f_flags |= O_DIRECT;
>> + /* We don't expect -EAGAIN */
>> + sbi->pipe->f_flags &= ~O_NONBLOCK;
>> +
>
>
> That makes sense, we do want a packet pipe and that does also mean
>
> we don't want a non-blocking pipe, it will be interesting to see
>
> if that makes a difference. It's been a long time since Linus
>
> implemented that packet pipe and I can't remember now what the
>
> case was that lead to it.
After thinking about this over the weekend I'm pretty sure my mistake
is dropping the call to autofs_prepare_pipe() without adding the tail
end of it into autofs_parse_fd().
To explain a bit of history which I'll include in the fix description.
During autofs v5 development I decided to stay with the existing usage
instead of changing to a packed structure for autofs <=> user space
communications which turned out to be a mistake on my part.
Problems arose and they were fixed by allowing for the 64 bit to 32 bit
size difference in the automount(8) code.
Along the way systemd started to use autofs and eventually encountered
this problem too. systemd refused to compensate for the length difference
insisting it be fixed in the kernel. Fortunately Linus implemented the
packetized pipe which resolved the problem in a straight forward and
simple way.
So I pretty sure that the cause of the problem is the inadvertent dropping
of the flags setting in autofs_fill_super() that Arnd spotted although I
don't think putting it in autofs_fill_super() is the right thing to do.
I'll produce a patch today which includes most of this explanation for
future travelers ...
>
>
> Ian
>
>> sbi->flags &= ~AUTOFS_SBI_CATATONIC;
>> /*
Powered by blists - more mailing lists