Message-ID: <87pllv90ow.ffs@tglx>
Date: Fri, 13 Dec 2024 14:23:11 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Ming Lei <ming.lei@...hat.com>
Cc: David Woodhouse <dwmw2@...radead.org>,
	Stefan Hajnoczi <stefanha@...hat.com>,
	Jason Wang <jasowang@...hat.com>,
	"x86@...nel.org" <x86@...nel.org>, hpa <hpa@...or.com>,
	dyoung <dyoung@...hat.com>, kexec <kexec@...ts.infradead.org>,
	linux-ext4 <linux-ext4@...r.kernel.org>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	Stefano Garzarella <sgarzare@...hat.com>,
	eperezma <eperezma@...hat.com>,
	Paolo Bonzini <bonzini@...hat.com>,
	Petr Mladek <pmladek@...e.com>,
	John Ogness <jogness@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Jens Axboe <axboe@...nel.dk>,
	"Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: Lockdep warnings on kexec (virtio_blk, hrtimers)

On Fri, Dec 13 2024 at 19:48, Ming Lei wrote:
> On Fri, Dec 13, 2024 at 12:31:24PM +0100, Thomas Gleixner wrote:
>> I'd rather say, that's a kexec problem. On the same instance a loop test
>> of suspend to ram with pm_test=core just works fine. That's equivalent
>> to the kexec scenario. It goes down to syscore_suspend() and skips the
>> actual suspend low level magic. It then resumes with syscore_resume()
>> and brings the machine back up.
>>
>> That runs for 2 hours now, while the kexec muck dies within 2
>> minutes....
>>
>> And if you look at the difference of these implementations, you might
>> notice that kexec just implemented some rudimentary version of the
>> actual suspend logic. Based on let's hope it works that way.
>>
>> This is just insane and should be rewritten to actually reuse the suspend
>> mechanism, which is way better tested than this kexec jump muck.
>
> But kexec is supposed to align with reboot/shutdown, instead of suspend,
> and it is calling ->shutdown() for notifying driver & device.

That's only true for the case where the new kernel takes over.
In the case KEXEC_JUMP=y and kexec_image->preserve_context == true, then it is
supposed to align with suspend/resume and if you look at the code then it
actually mimics suspend/resume in the most dilettanteish way.

It's a patently bad idea to clobber the kernel with kexec jump "fixes"
instead of using the well tested and established suspend/resume
machinery. All it takes is to:

   1) disable the wakeup logic

   2) provide a mechanism to invoke machine_kexec() instead of the
      actual suspend mechanism.

No?

Thanks

        tglx
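
The two steps listed above can be pictured with a small user-space model. This is only a sketch under loud assumptions, not kernel code and not a proposed patch: `struct sleep_ops`, `run_sequence()`, `record()`, `step_is()` and the step labels are all hypothetical names invented for illustration. Only `syscore_suspend()`, `syscore_resume()` and `machine_kexec()` are taken from the message, and here they appear purely as strings in a trace. The point it illustrates is that one shared sequence can serve both suspend and kexec jump if the wakeup arming is optional and the lowest-level "enter" step is a callback.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space model: none of these names are kernel APIs. */
enum { MAX_STEPS = 16 };
static const char *steps[MAX_STEPS];
static int nsteps;

/* Append one step label to the trace. */
static void record(const char *name)
{
    if (nsteps < MAX_STEPS)
        steps[nsteps++] = name;
}

/* True if trace entry i exists and carries the given label. */
static bool step_is(int i, const char *name)
{
    return i >= 0 && i < nsteps && strcmp(steps[i], name) == 0;
}

struct sleep_ops {
    bool arm_wakeup;      /* step 1: set to false for kexec jump */
    void (*enter)(void);  /* step 2: what runs at the lowest point */
};

static void enter_suspend(void) { record("suspend_enter"); }
static void enter_kexec(void)   { record("machine_kexec"); }

/* One shared down/up sequence; only the two knobs above differ. */
static void run_sequence(const struct sleep_ops *ops)
{
    nsteps = 0;
    record("freeze_processes");
    record("suspend_devices");
    if (ops->arm_wakeup)
        record("arm_wakeup");
    record("syscore_suspend");
    ops->enter();
    record("syscore_resume");
    record("resume_devices");
    record("thaw_processes");
}

static const struct sleep_ops suspend_ops    = { true,  enter_suspend };
static const struct sleep_ops kexec_jump_ops = { false, enter_kexec };
```

Calling `run_sequence(&kexec_jump_ops)` in this model walks the same path as `run_sequence(&suspend_ops)`, except that no wakeup is armed and the trace contains "machine_kexec" where the suspend variant has "suspend_enter".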