[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CA+HkQJs1ycDqGwxxZ-6LphP5Kpm-Hx1e4-GJEMGVqX0KEn_POQ@mail.gmail.com>
Date: Wed, 19 Dec 2012 15:05:34 +0100
From: Džiugas Baltrūnas <dziugas@...ula.no>
To: Jiri Slaby <jslaby@...e.cz>
Cc: Shachar Shemesh <shachar@...eu.tv>, Greg KH <greg@...ah.com>,
LKML <linux-kernel@...r.kernel.org>, kzak@...hat.com,
Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: tty ldisc: Close/Reopen race prevention should check the proper flag
Hello,
It appears that the original patch [1] was committed in 3.6 tree, but
not in 3.2. Therefore I can confirm that with the 3.2.33 kernel and
without having the patch applied, I still see the pppd process going
into the "uninterruptible sleep" state occasionally. I'm using Huawei
3G modems (E173u-2, E3131, E353u-2) and typically I'm facing the issue
when there is a poor radio signal. My watchdog daemon forks pppd
process using the following command line:
pppd /dev/ttyUSB0 460800 unit 0 connect "/usr/sbin/chat -V -v -s ABORT
'BUSY' ABORT 'NO CARRIER' ABORT 'VOICE' ABORT 'NO DIALTONE' ABORT 'NO
DIAL TONE' ABORT 'NO ANSWER' ABORT 'DELAYED' REPORT CONNECT TIMEOUT 6
'' 'ATQ0' 'OK-AT-OK' 'ATZ' TIMEOUT 3 'OK\d-AT-OK' 'ATI' 'OK' 'ATZ'
'OK' 'ATQ0 V1 E1 S0=0 &C1 &D2' 'OK-AT-OK'
'AT+CGDCONT=1,\"IP\",\"internet\"' 'OK' 'AT+CGREG=2' 'OK' 'ATD*99#'
TIMEOUT 30 CONNECT ''" nodetach debug lock crtscts modem noauth novj
nobsdcomp nodefaultroute noipdefault lcp-echo-interval 2
lcp-echo-failure 3 child-timeout 5
The same watchdog daemon monitors whether a pppX interface has an IP
address assigned and if not, it sends the SIGTERM signal to the pppd
process, waits for it to exit and then restarts it by forking another
pppd instance. This is usually (but now always) the case when the pppd
process is blocked in the "uninterruptible sleep" state and the kernel
message (INFO: task pppd:PID blocked for more than 120 seconds).
Additionally, both with and without the patch applied, chat process
(forked by pppd) is usually left orphan and ignores SIGTERM (but dies
after SIGKILL).
I'm wondering if there are any reasons why this patch is not present
in the 3.2 tree and I'm also looking forward for the 'right' patch to
address the issue. When the pppd process is the "uninterruptible
sleep" state, the only workaround that we have is to reset modems
using a secondary serial interface for AT commands, so that the USB
device reconnection is triggered.
Regards,
Džiugas Baltrūnas
[1] https://github.com/torvalds/linux/commit/40c9f61eae9098212b6906f29f30f08f7a19b5e2
On 19 September 2012 21:25, Jiri Slaby <jslaby@...e.cz> wrote:
> On 07/10/2012 06:54 AM, Shachar Shemesh wrote:
>> From: Shachar Shemesh <shachar@...eu.tv>
>>
>> Commit acfa747b introduced the TTY_HUPPING flag to distinguish
>> closed TTY from currently closing ones. The test in tty_set_ldisc
>> still remained pointing at the old flag. This causes pppd to
>> sometimes lapse into uninterruptible sleep when killed and
>> restarted.
>>
>> Signed-off-by: Shachar Shemesh <shachar@...eu.tv>
>> ---
>> Tested with 3.2.20 kernel.
>>
>> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
>> index 24b95db..a662a24 100644
>> --- a/drivers/tty/tty_ldisc.c
>> +++ b/drivers/tty/tty_ldisc.c
>> @@ -658,7 +658,7 @@ int tty_set_ldisc(struct tty_struct *tty, int ldisc)
>> goto enable;
>> }
>>
>> - if (test_bit(TTY_HUPPED, &tty->flags)) {
>> + if (test_bit(TTY_HUPPING, &tty->flags)) {
>> /* We were raced by the hangup method. It will have stomped
>> the ldisc data and closed the ldisc down */
>> clear_bit(TTY_LDISC_CHANGING, &tty->flags);
>
> Yes, that makes the issue go away, but does not seem to be right too.
> There are two issues I see:
> * TTY_HUPPED has no use now. That is incorrect. Here should be a test
> for both flags, I think.
> * The change forces the set_ldisc path to always re-open the ldisc even
> if it the terminal is HUPPED.
>
> The bug is not in the kernel. It was in login(1) (util-linux). And it
> should be fixed by now. See:
> http://git.kernel.org/?p=utils/util-linux/util-linux.git;a=commitdiff;h=2e7035646eb85851171cc2e989bfa858a4f00cd4
>
>
> I'm for an in-fact revert of the patch. But temporarily I would still
> return EIO. However let's do it even after the reopen is done, so that
> we get rid of the user-visible regression. Like in the attached patch.
> The regression was introduced by commit c65c9bc3e (tty: rewrite the
> ldisc locking). That is !three! years ago. And that is suspicious in itself.
>
>
> BTW Why pppd thinks it is a good idea to ignore EIO from TIOCSETD ioctl?
>
> regards,
>
>
> >From b3e5a7f778be46b4b8dd38d5565010755ccafd67 Mon Sep 17 00:00:00 2001
> From: Jiri Slaby <jslaby@...e.cz>
> Date: Wed, 19 Sep 2012 20:23:59 +0200
> Subject: [PATCH 1/1] TTY: forbid setting ldisc on HUPPED tty, step 1
>
> Commit c65c9bc3e (tty: rewrite the ldisc locking) introduced a check
> into tty_set_ldisc whether the terminal is already hung. In that case
> it returns -EIO.
>
> Later, commit aa3c8af86 (tty ldisc: Close/Reopen race prevention
> should check the proper flag) effectively disabled the check by
> replacing HUPPED by HUPPING flag. Those are two different flags and
> the former check was actually correct. In principle we revert here to
> the previous behavior plus we add a check if we are HUPPING. Because
> it makes no sense to change ldisc when some other thread is making a
> HUP but it had to unlock BTM temporarily.
>
> The bug which commit aa3c8af86 above tried to avoid lays in fact in
> login(1). It was fixed by util-linux commit 2e7035646 (login: close
> tty before vhangup()) already. It was reproducible for example by the
> following setup:
> 1) add a user ppp with shell set to /usr/bin/pppd
> 2) run agetty on ttyS0 to wait for a user
> 3) user connects via modem on ttyS0, types ppp to getty
> 4) 'login ppp' is executed
> 5) login performs hangup without properly closing ttyS0
> 6) login executes pppd
> 7) pppd does TIOCSETD attempt to N_PPP
> 8) kernel denies step 7) as ttyS0 is HUPPED
> 9) pppd calls some N_PPP ldisc's ioctls regardless -- uninterruptible
> sleep in tty_ioctl, trying to have a ref on tty->ldisc
>
> Now to sum up the fixes:
> * The fix in util-linux is that 5 is changed to properly close all
> ttyS0 instances.
> * The workaround from commit aa3c8af86 was that 8) succeeds.
> * This patch is the same as aa3c8af86 except we still return -EIO and
> warn the user this will not be supported in the future. This is
> because we do not want to have a user-visible regression here.
>
> "Step 1" because step two would be to immediatelly return -EIO like in
> the attached false branch in the patch.
>
> Signed-off-by: Jiri Slaby <jslaby@...e.cz>
> Cc: Shachar Shemesh <shachar@...eu.tv>
> Cc: Alan Cox <alan@...rguk.ukuu.org.uk>
> ---
> drivers/tty/tty_ldisc.c | 37 ++++++++++++++++++++++++++++---------
> 1 file changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index f4e6754..7cca3fd 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -563,6 +563,7 @@ int tty_set_ldisc(struct tty_struct *tty, int ldisc)
> struct tty_ldisc *o_ldisc, *new_ldisc;
> int work, o_work = 0;
> struct tty_struct *o_tty;
> + int retval_hupped = 0;
>
> new_ldisc = tty_ldisc_get(ldisc);
> if (IS_ERR(new_ldisc))
> @@ -659,14 +660,32 @@ int tty_set_ldisc(struct tty_struct *tty, int ldisc)
> goto enable;
> }
>
> - if (test_bit(TTY_HUPPING, &tty->flags)) {
> - /* We were raced by the hangup method. It will have stomped
> - the ldisc data and closed the ldisc down */
> - clear_bit(TTY_LDISC_CHANGING, &tty->flags);
> - mutex_unlock(&tty->ldisc_mutex);
> - tty_ldisc_put(new_ldisc);
> - tty_unlock(tty);
> - return -EIO;
> + if (test_bit(TTY_HUPPED, &tty->flags) ||
> + test_bit(TTY_HUPPING, &tty->flags)) {
> + /*
> + * Until login(1) is fixed everywhere, let's fall through.
> + * Change to goto with -EIO after 3.10.
> + */
> + if (1) {
> + char ttyn[64], comm[TASK_COMM_LEN];
> + printk_ratelimited(KERN_WARNING "%s: %s calls TIOCSETD on a hung TTY (%s). This is deprecated and will stop working soon.\n",
> + __func__, get_task_comm(comm, current),
> + tty_name(tty, ttyn));
> + retval_hupped = -EIO;
> + } else {
> + /*
> + * We were raced by the hangup method. It will have
> + * stomped the ldisc data and closed the ldisc down
> + */
> + tty_ldisc_put(new_ldisc);
> + retval = -EIO;
> + /*
> + * We have to enable the original ldisc so that
> + * processes see N_TTY and fail instead of waiting
> + * infinitely.
> + */
> + goto enable;
> + }
> }
>
> /* Shutdown the current discipline. */
> @@ -709,7 +728,7 @@ enable:
> schedule_work(&o_tty->port->buf.work);
> mutex_unlock(&tty->ldisc_mutex);
> tty_unlock(tty);
> - return retval;
> + return retval ? : retval_hupped;
> }
>
> /**
> --
> 1.7.12
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists