I guess many of you already know about epckpt, a patch written
by Eduardo Pinheiro that adds process checkpoint/restart capability to the
Linux kernel. CRAK does a similar thing. In fact, I started this
project from epckpt's code, but the two are now very different.
The major differences are:
* CRAK is a kernel module (!!)
* CRAK doesn't do any bookkeeping (thus no run-time overhead)
* CRAK uses a different strategy to checkpoint parallel processes (user
space vs. kernel space, and signals vs. semaphores)
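
To illustrate the last point, here is a minimal user-space sketch of
signal-based coordination: stop every process in a group, take the
checkpoint while nothing is running, then resume. It assumes the
parenthetical above means CRAK coordinates from user space with signals;
that reading, and every function name below, is hypothetical and is not
taken from CRAK's code. The checkpoint step itself is only a placeholder
comment.

```python
import os
import signal
import time


def spawn_worker():
    """Fork a child that just idles; stands in for one parallel process."""
    pid = os.fork()
    if pid == 0:
        while True:
            time.sleep(1)
    return pid


def stop_all(pids):
    """Send SIGSTOP to every process, then wait until each has stopped."""
    for pid in pids:
        os.kill(pid, signal.SIGSTOP)
    for pid in pids:
        # WUNTRACED makes waitpid report stopped children.
        # Note: waitpid only works because these are our own children.
        _, status = os.waitpid(pid, os.WUNTRACED)
        if not os.WIFSTOPPED(status):
            raise RuntimeError("process %d did not stop" % pid)


def resume_all(pids):
    """Send SIGCONT so the stopped processes continue running."""
    for pid in pids:
        os.kill(pid, signal.SIGCONT)


def demo(count=3):
    """Stop a group of workers, 'checkpoint' them, resume, and clean up."""
    pids = [spawn_worker() for _ in range(count)]
    stop_all(pids)
    # A real tool would save the state of every stopped process here.
    resume_all(pids)
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    return len(pids)
```

Calling demo() forks three idle workers, stops and resumes them as a
group, and returns the number of processes it handled. The point of
stopping everything first is that no process in the group can change
state while another is being saved.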
Moreover, I've added network socket support, successfully in the sense
that it works for simple cases such as telnet. For academic reasons I
have not put this portion of the code online yet, but I'll do so as soon
as possible.
The main website is at http://www.cs.columbia.edu/~huaz/research/crak.htm.
CRAK works with kernels 2.2.19 and 2.4.4 (support for the latter is still
beta). You can also learn more about checkpointing at
http://www.checkpointing.org (maintained by Eduardo Pinheiro).
As for reliability, CRAK is not 100% reliable. Originally I wanted to
make it more reliable before announcing it, but I have since realized (and
been convinced) that letting people know about it earlier could help reach
that goal sooner.
All comments/praise/criticism are welcome. Thanks.
----------------------------------------------------------------
Hua Zhong
Central Research Facilities Department of Computer Science
Columbia University New York, NY 10027
Email: huaz@cs.columbia.edu http://www.cs.columbia.edu/~huaz
----------------------------------------------------------------