Message-ID: <20220627163640.74890-1-kuniyu@amazon.com>
Date: Mon, 27 Jun 2022 09:36:40 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <sachinp@...ux.ibm.com>
CC: <davem@...emloft.net>, <kuniyu@...zon.com>,
<linux-next@...r.kernel.org>, <linuxppc-dev@...ts.ozlabs.org>,
<netdev@...r.kernel.org>
Subject: Re: [powerpc] Fingerprint systemd service fails to start (next-20220624)
Hi Sachin,

Thanks for the report.

From: Sachin Sant <sachinp@...ux.ibm.com>
Date: Mon, 27 Jun 2022 10:28:27 +0530
> With the latest -next I have observed a peculiar issue on IBM Power
> server running -next(5.19.0-rc3-next-20220624) .
>
> Fingerprint authentication systemd service (fprintd) fails to start while
> attempting OS login after kernel boot. There is a visible delay of 18-20
> seconds before being prompted for OS login password.
>
> Kernel 5.19.0-rc3-next-20220624 on an ppc64le
>
> ltcden8-lp6 login: root
> <<=======. delay of 18-20 seconds
> Password:
>
> Following messages(fprintd service) are seen in /var/log/messages:
>
> systemd[1]: Startup finished in 1.842s (kernel) + 1.466s (initrd) + 29.230s (userspace) = 32.540s.

The kernel seems to finish its part quickly, while userspace takes much
longer, probably because something keeps retrying until it times out. The
service_start_timeout (25000ms in the dbus-daemon message below) looks like
that timeout.

> NetworkManager[1100]: <info> [1656304146.6686] manager: startup complete
> dbus-daemon[1027]: [system] Activating via systemd: service name='net.reactivated.Fprint' unit='fprintd.service' requested by ':1.21' (uid=0 pid=1502 comm="/bin/login -p -- ")
> systemd[1]: Starting Fingerprint Authentication Daemon...
> fprintd[2521]: (fprintd:2521): fprintd-WARNING **: 00:29:08.568: Failed to open connection to bus: Could not connect: Connection refused

I think this message comes from here:

https://github.com/freedesktop/libfprint-fprintd/blob/master/src/main.c#L183-L189

I'm not sure what the program does, but my guess is that it failed to find
a peer socket in the hash table during a connect() or sendmsg() syscall and
got -ECONNREFUSED from unix_find_bsd() or unix_find_abstract().

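
For reference, here is a minimal userspace sketch of how connect() on an
AF_UNIX socket ends up with ECONNREFUSED when no peer can be found. It is
not what fprintd does; the helper names and the abstract socket name are
hypothetical, and it assumes Linux (abstract names are Linux-specific).

```python
import errno
import os
import socket
import tempfile


def try_connect(address):
    """Return 0 on success, or the errno raised by connect()."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(address)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        client.close()


def stale_pathname_errno():
    """Bind a pathname (BSD) socket, close it, then connect to the leftover file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "stale.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.close()  # the socket file stays on disk, but no socket owns it
        return try_connect(path)


def unbound_abstract_errno():
    """Connect to an abstract name (leading NUL byte) that nobody has bound."""
    name = "\0hypothetical-unbound-name-%d" % os.getpid()
    return try_connect(name)


if __name__ == "__main__":
    for label, err in (("pathname", stale_pathname_errno()),
                       ("abstract", unbound_abstract_errno())):
        print(label, errno.errorcode.get(err, err))
```

In both cases the name resolves to nothing the kernel can look up as a live
socket, which is the situation I suspect here.
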
> systemd[1]: fprintd.service: Main process exited, code=exited, status=1/FAILURE
> systemd[1]: fprintd.service: Failed with result 'exit-code'.
> systemd[1]: Failed to start Fingerprint Authentication Daemon.
> dbus-daemon[1027]: [system] Failed to activate service 'net.reactivated.Fprint': timed out (service_start_timeout=25000ms)
>
> Mainline (5.19.0-rc3) or older -next does not have this problem.
>
> Git bisect between mainline & -next points to the following patch:
>
> # git bisect bad
> cf2f225e2653734e66e91c09e1cbe004bfd3d4a7 is the first bad commit
> commit cf2f225e2653734e66e91c09e1cbe004bfd3d4a7
>
> Date: Tue Jun 21 10:19:12 2022 -0700
>
> af_unix: Put a socket into a per-netns hash table.
>
> I don’t know how the above identified patch is related to the failure,
> but given that I can consistently recreate the issue assume the bisect
> result can be trusted.

Before the commit, all sockets on the host were linked in a single global
hash table; after the commit, each socket is linked in its own network
namespace's hash table. So I believe there should be no change visible to
userspace.

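
As a toy model only (not the kernel code), the intended equivalence can be
sketched like this. The class names and socket names are hypothetical, and
the model assumes that a lookup in the old global table skipped sockets
belonging to other namespaces.

```python
class GlobalTable:
    """Pre-commit model: one table for the host; entries carry their netns."""

    def __init__(self):
        self._entries = []  # list of (netns, name) pairs

    def bind(self, netns, name):
        self._entries.append((netns, name))

    def find(self, netns, name):
        # Walk the shared table and skip sockets from other namespaces.
        return any(ns == netns and n == name for ns, n in self._entries)


class PerNetnsTable:
    """Post-commit model: each namespace has its own table."""

    def __init__(self):
        self._tables = {}  # netns -> set of bound names

    def bind(self, netns, name):
        self._tables.setdefault(netns, set()).add(name)

    def find(self, netns, name):
        return name in self._tables.get(netns, set())


def compare_lookups():
    """Run the same binds and lookups against both models."""
    old, new = GlobalTable(), PerNetnsTable()
    for table in (old, new):
        table.bind("netns-a", "@example")
        table.bind("netns-b", "@other")
    queries = [("netns-a", "@example"), ("netns-a", "@other"),
               ("netns-b", "@other"), ("netns-c", "@example")]
    return [(old.find(*q), new.find(*q)) for q in queries]
```

If both models always agree, userspace should see no difference, so a
failure like the one reported would point to a bug in how sockets are
inserted or looked up rather than in the design itself.
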
> I have attached dmesg log for reference. Let me know if any additional
> Information is required.

* Could you provide
  * dmesg and /var/log/messages from a successful case? (without the commit)
  * the unit file
  * repro steps

* Is it reproducible after login? (e.g. systemctl restart)
  * If so, please provide
    * the result of strace -t -ff

* Does it happen only on powerpc? How about x86 or arm64?

* What does the service do?
  * connect() or sendmsg()
  * protocol family
  * abstract or BSD socket

Best regards,
Kuniyuki