Message-ID: <CAOYeF9VsmqKMcQjo1k6YkGNujwN-nzfxY17N3F-CMikE1tYp+w@mail.gmail.com>
Date: Mon, 15 Jan 2024 13:13:49 +0100
From: Allison Karlitskaya <allison.karlitskaya@...hat.com>
To: linux-kernel@...r.kernel.org, linux-block@...r.kernel.org, 
	Jens Axboe <axboe@...nel.dk>
Subject: PROBLEM: BLKPG_DEL_PARTITION with GENHD_FL_NO_PART used to return
 ENXIO, now returns EINVAL

hi,

[1.] One line summary of the problem:
BLKPG_DEL_PARTITION on an empty loopback device used to return ENXIO
but now returns EINVAL, breaking partprobe

[2.] Full description of the problem/report:
We recently caught this problem in our CI for Cockpit:
https://github.com/cockpit-project/bots/pull/5793

The summary is that if you do something like this:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=50
$ partprobe $(losetup --find --show /tmp/foo)

Then this will fail with the following error message:

Error: Partition(s) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64 on
/dev/loop2 have been written, but we have been unable to inform the
kernel of the change, probably because it/they are in use.  As a
result, the old partition(s) will remain in use.  You should reboot
now before making further changes.

... whereas it used to succeed.  That's down to this ioctl call
(made by partprobe) changing its result between kernel versions:

-ioctl(3, BLKPG, {op=BLKPG_DEL_PARTITION, flags=0, datalen=152,
data={start=0, length=0, pno=1, devname="", volname=""}}) = -1 ENXIO
(No such device or address)
+ioctl(3, BLKPG, {op=BLKPG_DEL_PARTITION, flags=0, datalen=152,
data={start=0, length=0, pno=1, devname="", volname=""}}) = -1 EINVAL
(Invalid argument)
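
For anyone who wants to check the errno without going through
partprobe, the same call can be issued directly.  The following is a
minimal sketch, not what partprobe itself does internally: the
constants and struct layouts are transcribed from <linux/blkpg.h>, and
the helper name del_partition_errno is made up for illustration.  It
assumes a 64-bit Linux system (where the partition struct is 152
bytes, matching the datalen in the trace above) and that the caller
has enough privilege to issue BLKPG on the device.

```python
import ctypes
import errno
import fcntl
import os
import sys

# Values transcribed from <linux/blkpg.h>; BLKPG is _IO(0x12, 105).
BLKPG = 0x1269
BLKPG_DEL_PARTITION = 2


class BlkpgPartition(ctypes.Structure):
    """Mirror of struct blkpg_partition."""
    _fields_ = [
        ("start", ctypes.c_longlong),
        ("length", ctypes.c_longlong),
        ("pno", ctypes.c_int),
        ("devname", ctypes.c_char * 64),
        ("volname", ctypes.c_char * 64),
    ]


class BlkpgIoctlArg(ctypes.Structure):
    """Mirror of struct blkpg_ioctl_arg."""
    _fields_ = [
        ("op", ctypes.c_int),
        ("flags", ctypes.c_int),
        ("datalen", ctypes.c_int),
        ("data", ctypes.c_void_p),
    ]


def del_partition_errno(fd, pno):
    """Hypothetical helper: issue BLKPG_DEL_PARTITION for partition
    number pno on fd and return 0 on success or the errno on failure."""
    part = BlkpgPartition(start=0, length=0, pno=pno)
    arg = BlkpgIoctlArg(
        op=BLKPG_DEL_PARTITION,
        flags=0,
        datalen=ctypes.sizeof(part),
        data=ctypes.addressof(part),
    )
    try:
        fcntl.ioctl(fd, BLKPG, arg)
    except OSError as exc:
        return exc.errno
    return 0


# Only runs when invoked as a script with a device path argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    fd = os.open(sys.argv[1], os.O_RDONLY)
    try:
        err = del_partition_errno(fd, 1)
    finally:
        os.close(fd)
    print(errno.errorcode.get(err, err) if err else "success")
```

Run as root with the loop device path as the only argument (for
example the one printed by losetup --find --show).  Going by the trace
above, the expectation is that it prints ENXIO on the older kernel and
EINVAL on the newer one.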

This is observed on Ubuntu jammy with partprobe from parted
3.4-2build1.  I've confirmed that the original parted-3.4 download
from https://ftp.gnu.org/gnu/parted/ is also impacted in the same way.

[3.] Keywords:
block, partition, BLKPG_DEL_PARTITION, loop device, EINVAL, ENXIO

[4.] Kernel information:
Linux ubuntu 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC
2024 x86_64 x86_64 x86_64 GNU/Linux

This is the version currently in jammy-proposed.  The likely culprit
is this commit:

  https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit/?id=49a502554e8aa853a0357f287121d4cdf4442a58

which is also upstream as 1a721de8489fa559ff4471f73c58bb74ac5580d3.

There has been discussion on linux-kernel before about this:
https://marc.info/?l=linux-kernel&m=169753467305218&w=2

but now we have a pretty clear case of "breaks userspace in the wild".

[4.1.] Kernel version (from /proc/version):

Linux version 5.15.0-94-generic (buildd@...02-amd64-096) (gcc (Ubuntu
11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
#104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024

[4.2.] Kernel .config file:

I pasted a copy here:

https://paste.centos.org/view/8d6506bc

but it won't be around for more than 24 hours.  It's just the config
file present in /boot on the affected install.

[5.] Most recent kernel version which did not have the bug:

We last tested 5.15.0-91-generic and found it to be working with the
previous behaviour (i.e. returning ENXIO).

[7.] A small shell script or example program which triggers the
     problem (if possible)

as above:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=50
$ partprobe $(losetup --find --show /tmp/foo)

[8.] Environment
[8.1.] Software (add the output of the ver_linux script here)
[8.2.] Processor information (from /proc/cpuinfo):
[8.3.] Module information (from /proc/modules):
[8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[8.5.] PCI information ('lspci -vvv' as root)
[8.6.] SCSI information (from /proc/scsi/scsi)
[8.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):
[X.] Other notes, patches, fixes, workarounds:


I don't expect there would be anything relevant here, but feel free to
ask.  It's a qemu x86_64 VM image running on my Intel laptop.  If you
want to test this, check out

   https://github.com/cockpit-project/bots/tree/image-refresh-ubuntu-2204-20240114-225118

and run

  ./vm-run -q ubuntu-2204

at which point you should be presented with instructions about how to
ssh to the machine.

Thanks for the attention!

Allison Karlitskaya

