Message-ID: <20170411150040.k2uq5uzvllrindkq@thunk.org>
Date: Tue, 11 Apr 2017 11:00:40 -0400
From: Theodore Ts'o <tytso@....edu>
To: linux-kernel@...r.kernel.org
Cc: torvalds@...ux-foundation.org, gregkh@...uxfoundation.org
Subject: [REGRESSION] 4.11-rc: systemd doesn't see most devices
There is a frustrating regression in 4.11 that I've been trying to
track down. The symptoms are that a large number of systemd devices
don't show up. So instead of "systemctl | grep .device | wc -l"
listing some 50+ lines which look like this:
sys-devices-pci0000:00-0000:00:14.0-usb1-1\x2d7-1\x2d7:1.0-bluetooth-hci0.device loaded active plugged /sys/devices/pci0000:00/0000:00:14.0/usb1/1-7/1-7:1.0/bluetooth/hci0
I only get 5-10 lines of devices. This is problematic because it
means that the wifi firmware is not automatically loaded. More
annoyingly, because the device mapper systemd devices are missing:
sys-devices-virtual-block-dm\x2d0.device loaded active plugged /sys/devices/virtual/block/dm-0
... the boot hangs for 90 seconds because it can't fsck devices that
systemd doesn't think exist yet. (I'm using LVM on top of an
encrypted block device, and it doesn't think the dm-crypt device is
created, although given that the root file system is an LVM volume,
obviously LVM and the LUKS setup had worked just fine --- and people
wonder why some folks hate systemd. :-)
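The device-count check described above can be scripted so that good and bad boots are compared the same way each time. What follows is a minimal sketch, not part of the original report: `count_device_units` and `has_device_unit` are hypothetical helper names, and both read unit listings from stdin so they can be tried on saved output as well as on a live system.

```shell
#!/bin/sh
# Count lines on stdin whose first field is a systemd .device unit name.
count_device_units() {
    awk '$1 ~ /\.device$/ { n++ } END { print n + 0 }'
}

# Succeed (exit 0) if the given unit name appears as the first field of
# some line on stdin. The name is passed through the environment rather
# than "awk -v" so that backslash sequences such as \x2d stay literal.
has_device_unit() {
    UNIT="$1" awk '$1 == ENVIRON["UNIT"] { found = 1 } END { exit !found }'
}

# Usage on a live system (assumes systemctl's list-units options):
#   systemctl list-units --type=device --no-legend --plain | count_device_units
#   systemctl list-units --type=device --no-legend --plain |
#       has_device_unit 'sys-devices-virtual-block-dm\x2d0.device'
```

On a healthy boot of the machine described here the count would be expected to be 50 or more, and the dm-0 unit would be present. The exact threshold is a judgment call for each system.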
The failure past v4.10-5879-gcaa59428971d starts becoming
flaky, so sometimes I have to reboot three times or more before the
failure shows up. This is why the bisect has been taking so long, and
so while I'm *fairly* certain that the failure is somewhere in the
staging branch merge, it's possible that one of the earlier "git
bisect good" marks is in error. I have been trying multiple reboots
before concluding that a bisection point is "good" but this takes a
huge amount of time, since having GRUB unlock an encrypted LVM volume
takes a long time, and I have to type the decryption password twice at
each boot.
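The "try several times before calling it good" rule for a flaky failure can be written down as a small helper. This is a sketch under assumptions: `repeat_check` is a hypothetical name, and the check command is whatever test detects the failure. In the situation described here each trial needs a full reboot, so the loop would have to be driven by hand or across boots rather than run in one shell session. The exit codes follow the `git bisect run` convention, where 0 means good and 1 means bad.

```shell
#!/bin/sh
# Run a check command up to N times and report "bad" on the first failure.
# Only if all N runs pass is the commit reported as "good".
# Usage: repeat_check N command [args...]
repeat_check() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            echo "bad after $i run(s)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "good after $runs run(s)"
    return 0
}
```

A single failing run is enough to mark a commit bad, while a "good" verdict is only as trustworthy as the number of clean runs behind it, which is why an intermittent failure makes "good" marks the weak point of a bisection.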
The end of the bisection doesn't make any sense, and so at this point
I've given up, and am posting this to LKML with Linus and Greg cc'ed,
in the hopes that someone else has seen this, or understands what sort
of failure would cause systemd to not think various devices are
present and/or finished initializing. I'm using a Debian testing
distribution, and it would be really good to figure out what the ?!@#
is going on, since if 4.11 releases with this, I suspect a lot of
people will be affected. Unfortunately, while it's not particularly
reliable deep into the bisection, at -rc3 or -rc5 it's **damned**
reproducible. I know how to work around the systemd brain damage for
now by using rc.local, futzing with the dependencies, loading the
Wifi module by hand, and (sometimes) living without Audio, but
this requires a lot of hacking, and it's not, shall we say, a
particularly nice user experience. :-(
- Ted
P.S. I've also attached the output of "systemctl | grep .device" so you
can see what happens in a good and bad case, in case that helps.
git bisect start
# good: [v4.10] Linux 4.10
git bisect good c470abd4fde40ea6a0846a2beab642a578c0b8cd
# bad: [v4.10-5879-gcaa59428971d] Merge tag 'staging-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad caa59428971d5ad81d19512365c9ba580d83268c
# good: [v4.10-2518-g1e74a2eb1f5c] Merge tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
git bisect good 1e74a2eb1f5cc7f2f2b5aa9c9eeecbcf352220a3
# good: [v4.10-4456-g3051bf36c25d] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good 3051bf36c25d5153051704291782f8d44e744d36
# good: [v4.10-5190-ge30aee9e10bb] Merge tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
git bisect good e30aee9e10bb5168579e047f05c3d13d09e23356
# good: [v4.10-5202-gb2064617c74f] Merge tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
git bisect good b2064617c74f301dab1448f1f9c8dbb3c8021058
# good: [v4.10-rc3-325-g994261dc8f3d] devicetree: sort the Garmin vendor prefix properly.
git bisect good 994261dc8f3dfcd19feecafae3040e932c8f90cf
# good: [v4.10-rc3-501-gc2351249f140] staging: lustre: libcfs: avoid stomping on module param cpu_pattern
git bisect good c2351249f140aebdf911d86d7a1542c40b20fca3
# good: [v4.10-rc7-589-g1b2d7f198140] staging: comedi: dt2815: usleep_range is preferred over udelay
git bisect good 1b2d7f198140aae48dd93c41abde37283312a98c
# bad: [v4.10-rc7-633-g4d0bdcb10c43] staging: rtl8192e: Aligning the * on each line in block comments
git bisect bad 4d0bdcb10c43056489b69186ee43669f2a73b8f9
# bad: [v4.10-rc7-611-g30d69ada0771] Staging: rtl8192u: r819xU_firmware.c - style fix
git bisect bad 30d69ada0771ed25b63ec56887faea31d8f551bd
# good: [v4.10-rc7-600-g8a0e4b9e469c] staging: android: ion: fix coding style issue
git bisect good 8a0e4b9e469c7bbaa206d81e2e7515e1abb1aa00
# bad: [v4.10-rc7-605-gdc223652c6de] staging: r8712u: Fix macros used to read/write the TX/RX descriptors
git bisect bad dc223652c6de409fe5e073b0f631f0413a90e69f
# good: [v4.10-rc7-602-g76b94eb1eb88] staging: set msi_domain_ops as __ro_after_init
git bisect good 76b94eb1eb882e3a8beb51b7d287a21a38a56938
# bad: [v4.10-rc7-604-g221c46d28957] staging: rtl8712u: Fix endian settings for structs describing network packets
git bisect bad 221c46d28957bd6e2158abc2179ce4a8c9ce07d3
# bad: [v4.10-rc7-603-ge2288bce8ec8] staging: rtl8712: Fix some Sparse endian messages
git bisect bad e2288bce8ec828ce36f6443e298e04d41fddff7e
# first bad commit: [v4.10-rc7-603-ge2288bce8ec8] staging: rtl8712: Fix some Sparse endian messages
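Since any of the "good" marks in a log like the one above may be wrong when the failure is intermittent, it can help to pull out just those commits for re-testing. A minimal sketch, with `list_good_commits` as a hypothetical helper name, reading a bisect log on stdin:

```shell
#!/bin/sh
# Print the commit IDs that were marked "good" in a git bisect log
# supplied on stdin. Comment lines (starting with '#') are ignored
# because their first field is not "git".
list_good_commits() {
    awk '$1 == "git" && $2 == "bisect" && $3 == "good" { print $4 }'
}

# Usage inside a repository with a bisection in progress:
#   git bisect log | list_good_commits
```

If a re-test shows that one of these commits is actually bad, the saved log can be edited accordingly and fed back with `git bisect replay <logfile>` instead of starting the bisection over.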
View attachment "device-4.10" of type "text/plain" (10808 bytes)
View attachment "device-4.10.0-rc2-00031-g38b0a526ec33" of type "text/plain" (746 bytes)