Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 14 Feb 2012 19:53:43 +0530
From:      Maninya M <maninya@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   OS support for fault tolerance
Message-ID:  <CAC46K3mc=V=oBOQnvEp9iMTyNXKD1Ki_%2BD0Akm8PM7rdJwDF8g@mail.gmail.com>

next in thread | raw e-mail | index | archive | help
For multicore desktop computers, suppose one of the cores fails, the
FreeBSD OS crashes. My question is about how I can make the OS tolerate
this hardware fault.
The strategy is to checkpoint the state of each core at specific intervals
of time in main memory. Once a core fails, its previous state is retrieved
from the main memory, and the processes that were running on it are
rescheduled on the remaining cores.

I read that the OS tolerates faults in large servers. I need to make it do
this for a Desktop OS. I assume I would have to change the scheduler
program. I am using FreeBSD 9.0 on an Intel core i5 quad core machine.
How do I go about doing this? What exactly do I need to save for the
"state" of the core? What else do I need to know?
I have absolutely no experience with kernel programming or with FreeBSD.
Any pointers to good sources about modifying the source-code of FreeBSD
would be greatly appreciated.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAC46K3mc=V=oBOQnvEp9iMTyNXKD1Ki_%2BD0Akm8PM7rdJwDF8g>