Message-ID: <49A659D0.2040903@turknet.net.tr>
Date: Thu, 26 Feb 2009 10:58:56 +0200
From: Tarkan Erimer <tarkan.erimer@...knet.net.tr>
To: linux-kernel@...r.kernel.org
Subject: Failover Kernel
Hi all,
I'm thinking about a kernel feature called "Failover Kernel". The basic
idea is to keep two kernels in memory (a running "Primary Kernel" and a
standby "Backup Kernel") so that the system can recover from a kernel
panic or crash.
This feature could work like this:
- The "Backup Kernel" could be specified and loaded into memory via a
boot line option such as "failover_kernel=/boot/vmlinuz-2.6.26".
- The running primary kernel sends keepalives to the "Backup Kernel" to
signal that it is alive.
- The running primary kernel can write a journal (as journaled
filesystems do) with the information the backup kernel needs to recover.
- When the primary kernel crashes and can no longer send keepalives,
the backup kernel recovers from this journal, continues from where the
primary kernel left off, and becomes the primary.
- Once the "Backup Kernel" has become "Primary", it could load the
previous kernel as the "Backup Kernel" again, or this could be left to
the user, who could decide after the recovery which kernel to load as
backup via a utility like "kexec".
- At kernel compile time, the user can choose the failover timeout. For
example, "Recover after 10 ms of inactivity (no keepalives received)."
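
To make the keepalive, journal, and timeout steps above concrete, here is
a minimal user-space sketch in Python. It is only a toy model of the
proposed logic, not kernel code, and every name in it (FailoverMonitor,
keepalive, record, check) is hypothetical. It assumes the backup side can
read a millisecond clock and that journal entries are plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverMonitor:
    """Toy model of the backup kernel's view of the primary (hypothetical)."""

    timeout_ms: int                 # e.g. the "10 ms of inactivity" setting
    last_keepalive_ms: int = 0      # time of the most recent keepalive
    journal: list = field(default_factory=list)
    role: str = "backup"

    def keepalive(self, now_ms: int) -> None:
        """Called on behalf of the primary to signal that it is alive."""
        self.last_keepalive_ms = now_ms

    def record(self, entry: str) -> None:
        """Append recovery information written by the primary."""
        self.journal.append(entry)

    def check(self, now_ms: int) -> list:
        """Take over if the primary has been silent for too long.

        Returns the journal entries to replay on takeover, otherwise an
        empty list.
        """
        silent_for = now_ms - self.last_keepalive_ms
        if self.role == "backup" and silent_for > self.timeout_ms:
            self.role = "primary"
            return list(self.journal)
        return []
```

With timeout_ms=10 and a last keepalive at t=5, check(12) does nothing,
while check(16) switches the role to "primary" and returns the journal
for replay. How a real kernel would capture enough state in such a
journal is exactly the open question of this proposal.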
Usage scenarios for this feature could be:
- For people whose datacenter is remote, it is a big problem when a
newly compiled kernel crashes or fails to boot after a reboot. You are
left with a completely crashed, non-functioning system, and a hard reset
and manual intervention are required. With a "Failover Kernel" feature,
the system would simply switch back to the "Backup Kernel" (the known
stable kernel of the system) and continue to work without any manual
action.
- Your system has run fine for the last several months, and one day you
hit a bug and the kernel crashes or panics. With "Failover Kernel", the
system would switch to the "Backup Kernel" quickly (within some
milliseconds or a few seconds) to recover and continue to work normally.
I'm not a coder, and I don't know whether this is technically possible.
Kernel hackers, what's your opinion? Could it really be done? If so, how
could we implement it?
Many thanks for reading this long (and maybe stupid) post! :-)
Tarkan ERIMER