Message-ID: <CAJ0PZbSRX6Rqn5NnnMkVGrPovtSuc2NWasysOYnjr3j9rON1VQ@mail.gmail.com>
Date: Wed, 2 Nov 2011 18:44:31 +0900
From: MyungJoo Ham <myungjoo.ham@...il.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Fries <david@...es.net>, netdev@...r.kernel.org,
linux-pm@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [linux-pm] hibernate hangs TCP Re: [EXAMPLE CODE] Parasite thread
injection and TCP connection hijacking
On Mon, Oct 31, 2011 at 5:16 AM, Tejun Heo <tj@...nel.org> wrote:
> (cc'ing Rafael and linux-pm)
>
> On Sat, Oct 29, 2011 at 11:48:21PM -0500, David Fries wrote:
>> I saw the write up on this on lwn.net, pretty creative by the way, and
>> it got me thinking about a different checkpoint/restart problem I've
>> been running into. Specifically in hibernating to disk. In the
>> hibernate case active TCP connections hang after resuming, while an
>> idle TCP connection will continue after the system is back up. My
>> observation is the kernel checkpoints itself to memory, enables
>> devices, writes out that checkpoint image to storage, then powers off.
>> The problem is if TCP packets are received while writing to storage,
>> the kernel will continue to queue and ack those TCP packets, but the
>> running kernel and its network state is shortly lost. When the
>> computer resumes, those TCP byte sequences hang the TCP connection for
>> an extended period of time while the resumed computer refuses to
>> acknowledge the data that was received after checkpointing and the now
>> running kernel knew nothing about, and the other computer tries in
>> vain to resend any data that hadn't yet been acknowledged, which is
>> always after the data that was lost, until one of them eventually
>> gives up.
>>
>> I've been wondering if it was safe or possible to leave any network
>> interfaces down after the checkpoint, or what the right solution would
>> be. I didn't think marking every TCP connection with a ZOMBIE_KERNEL
>> bit just after the kernel checkpoint (for the kernel is walking dead
>> and won't remember anything that happens), and then prevent any TCP
>> acks from being sent for those connections would be the right
>> solution. I've taken to unplugging the physical lan cable,
>> hibernating to disk, and plugging it back in after the system is down,
>> to avoid the problem. Any ideas?
>
> Hmmm... sounds like taking down network interfaces before starting
> hibernation sequence should be enough, which shouldn't be too
> difficult to implement from userland. Rafael, what do you think?
>
> Thanks.
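For reference, the userland approach suggested above could look roughly
like the sketch below. It only builds the list of commands a wrapper
script would run and executes nothing. The helper name plan_hibernate
and the interface names are made up for illustration; the commands
themselves are the usual iproute2 "ip link set" and a write of "disk"
to /sys/power/state. It assumes the caller passes only interfaces that
were up beforehand.

```python
def plan_hibernate(interfaces, skip=("lo",)):
    """Return, in order, the commands a hibernate wrapper would run.

    Hypothetical helper: nothing is executed here, so the plan can be
    inspected (or logged) before handing it to subprocess.run().
    """
    targets = [name for name in interfaces if name not in skip]
    # 1. Take the interfaces down so that no segment can be ACKed after
    #    the snapshot has been taken.
    downs = [["ip", "link", "set", "dev", name, "down"] for name in targets]
    # 2. Ask the kernel to hibernate; the write returns after resume.
    hibernate = ["sh", "-c", "echo disk > /sys/power/state"]
    # 3. Bring the same interfaces back up once the system has resumed.
    ups = [["ip", "link", "set", "dev", name, "up"] for name in targets]
    return downs + [hibernate] + ups


if __name__ == "__main__":
    for cmd in plan_hibernate(["lo", "eth0"]):
        print(" ".join(cmd))
```

Note that taking an interface down also flushes its routes, so a real
script would have to restore them or leave that to the network manager.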
Um... it seems that the "thaw" callbacks of the network interfaces or of
TCP should do something about this.
Probably, the "thaw" callbacks should make sure that the TCP
connections are closed?
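To make the problem concrete, here is a toy model of what David
describes. It is not kernel code; the function and field names are
made up, and sequence-number wraparound is ignored. The point is only
that the peer's idea of what was delivered runs ahead of what the
resumed kernel remembers.

```python
def after_resume(checkpoint_rcv_nxt, bytes_acked_after_checkpoint):
    """Compare both ends' view of one TCP stream after a resume.

    checkpoint_rcv_nxt: next sequence number the hibernating side
        expected when the snapshot was taken.
    bytes_acked_after_checkpoint: data it received and ACKed while the
        image was being written (state that is not in the image).
    """
    # The peer has seen ACKs for everything up to here and will never
    # resend anything before it.
    peer_snd_una = checkpoint_rcv_nxt + bytes_acked_after_checkpoint
    # The resumed kernel comes back from the image, so it still expects
    # the sequence number it had at the snapshot.
    resumed_rcv_nxt = checkpoint_rcv_nxt
    lost = peer_snd_una - resumed_rcv_nxt
    return {
        "peer_snd_una": peer_snd_una,
        "resumed_rcv_nxt": resumed_rcv_nxt,
        "lost_bytes": lost,
        # Any gap means the receiver waits for bytes the sender
        # considers delivered, so the connection stalls.
        "stalled": lost > 0,
    }
```

With no traffic in that window (the idle case) the gap is zero and the
connection continues; with any ACKed data in it, the two ends disagree
until one of them gives up.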
Cheers,
MyungJoo
>
> --
> tejun
> _______________________________________________
> linux-pm mailing list
> linux-pm@...ts.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
>
--
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab, DMC Business, Samsung Electronics