Date:	Thu, 26 Feb 2009 10:58:56 +0200
From:	Tarkan Erimer <tarkan.erimer@...knet.net.tr>
To:	linux-kernel@...r.kernel.org
Subject: Failover Kernel

Hi all,

I'm thinking about a kernel feature called "Failover Kernel". The basic 
idea is to keep two kernels in memory (a running "Primary Kernel" and a 
standby "Backup Kernel") so that the system can recover when the running 
kernel panics or crashes.

This feature could work like this:

- The "Backup Kernel" could be specified and loaded into memory via a boot 
line option like: "failover_kernel=/boot/vmlinuz-2.6.26"
- The running primary kernel sends keepalives to the "Backup Kernel" to 
signal that it is alive.
- The running primary kernel can write a journal (as journaled 
filesystems do) with the information the backup kernel needs to recover.
- When the primary kernel crashes and can no longer send keepalives, 
the backup kernel recovers from this journal, continues from where the 
primary kernel left off, and becomes the primary.
- When the "Backup Kernel" becomes "Primary", it loads the previous one as 
"Backup Kernel" again, or this could be left to manual action: after the 
recovery, the user could decide which kernel to load as backup via a 
utility like "kexec".
- At kernel compile time, the user can choose the failover timeout. 
For example, "Recover after 10 ms of inactivity (no keepalives 
received)".
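
The scheme above can be sketched as a small user-space simulation. This 
is only a toy model in Python, not kernel code: the class FailoverMonitor, 
its methods and the journal entries are all hypothetical names made up for 
illustration, and the 10 ms timeout is the example value from the list. 
It ignores the hard part, which is how a second kernel could actually take 
over live hardware and process state.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverMonitor:
    """Toy model of the backup kernel's view of the primary (hypothetical names)."""
    timeout_ms: int = 10                 # "Recover after 10 ms of inactivity"
    last_keepalive_ms: int = 0
    journal: list = field(default_factory=list)
    role: str = "backup"

    def keepalive(self, now_ms: int) -> None:
        # The primary signals that it is still alive.
        self.last_keepalive_ms = now_ms

    def record(self, entry: str) -> None:
        # The primary appends state the backup would need in order to recover.
        self.journal.append(entry)

    def check(self, now_ms: int) -> list:
        # The backup takes over once keepalives have been missing for
        # longer than the timeout, and returns the journal to replay.
        if self.role == "backup" and now_ms - self.last_keepalive_ms > self.timeout_ms:
            self.role = "primary"
            return list(self.journal)    # entries to replay, oldest first
        return []


monitor = FailoverMonitor(timeout_ms=10)
monitor.keepalive(now_ms=0)
monitor.record("mounted /dev/sda1")      # made-up journal entry
monitor.keepalive(now_ms=5)
monitor.record("started pid 42")         # made-up journal entry

still_waiting = monitor.check(now_ms=12)  # 7 ms since last keepalive: no failover
replayed = monitor.check(now_ms=20)       # 15 ms since last keepalive: take over
print(monitor.role, replayed)
```

In this model the first check() finds the primary still within its 
timeout and does nothing; the second one promotes the backup and hands 
back the journal entries in the order they were written.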


Possible usage scenarios:

- For people whose datacenter is remote, it is a big problem to compile a 
new kernel and reboot into one that crashes or does not boot. You are left 
with a completely non-functioning system, and a hard reset and manual 
intervention are required. With a "Failover Kernel" feature, the system 
would simply switch back to the "Backup Kernel" (the known stable kernel 
of the system) and keep working without any manual action.

- Your system has run fine for several months, and one day you hit a bug 
and the kernel crashes or panics. With "Failover Kernel", the system 
would switch to the "Backup Kernel" quickly (within some milliseconds or 
a few seconds) and could continue to work normally.

I'm not a coder, and I don't know whether this is technically possible. 
Kernel hackers, what's your opinion? Is it really feasible? If so, how 
could it be implemented?

Many thanks for reading this long (and maybe stupid) post! :-)

Tarkan ERIMER


