netdev - Re: CPU stuck due to the taprio hrtimer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3ff7f761-dd46-40f9-ab34-4eb05d4f2ecd@kylinos.cn>
Date: Fri, 28 Jun 2024 10:48:47 +0800
From: luyun <luyun@...inos.cn>
To: Vinicius Costa Gomes <vinicius.gomes@...el.com>, jhs@...atatu.com,
 xiyou.wangcong@...il.com, jiri@...nulli.us
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: CPU stuck due to the taprio hrtimer


在 2024/6/28 07:30, Vinicius Costa Gomes 写道:
> Yun Lu <luyun@...inos.cn> writes:
>
>> Hello,
>>
>> When I run a taprio test program on the latest kernel(v6.10-rc4), CPU stuck
>> is detected immediately, and the stack shows that CPU is stuck on taprio
>> hrtimer.
>>
>> The reproducer program link:
>> https://github.com/xyyluyun/taprio_test/blob/main/taprio_test.c
>> gcc taprio_test.c -static -o taprio_test
>>
>> In this program, start the taprio hrtimer which clockid is set to REALTIME, and
>> then adjust the system time by a significant value backwards. Thus, CPU will enter
>> an infinite loop in the__hrtimer_run_queues function, getting stuck and unable to
>> exit or respond to any interrupts.
>>
>> I have tried to avoid this problem by apllying the following patch, and it does work.
>> But I am not sure if this can be the final solution?
>>
>> Thanks.
>>
>> Signed-off-by: Yun Lu <luyun@...inos.cn>
>> ---
>>   net/sched/sch_taprio.c | 24 ++++++++++++++++++++++++
>>   1 file changed, 24 insertions(+)
>>
>> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
>> index a0d54b422186..2ff8d34bdbac 100644
>> --- a/net/sched/sch_taprio.c
>> +++ b/net/sched/sch_taprio.c
>> @@ -104,6 +104,7 @@ struct taprio_sched {
>>   	u32 max_sdu[TC_MAX_QUEUE]; /* save info from the user */
>>   	u32 fp[TC_QOPT_MAX_QUEUE]; /* only for dump and offloading */
>>   	u32 txtime_delay;
>> +	ktime_t offset;
>>   };
>>   
>>   struct __tc_taprio_qopt_offload {
>> @@ -170,6 +171,19 @@ static ktime_t sched_base_time(const struct sched_gate_list *sched)
>>   	return ns_to_ktime(sched->base_time);
>>   }
>>   
>> +static ktime_t taprio_get_offset(const struct taprio_sched *q)
>> +{
>> +	enum tk_offsets tk_offset = READ_ONCE(q->tk_offset);
>> +	ktime_t time = ktime_get();
>> +
>> +	switch (tk_offset) {
>> +	case TK_OFFS_MAX:
>> +		return 0;
>> +	default:
>> +		return ktime_sub_ns(ktime_mono_to_any(time, tk_offset), time);
>> +	}
>> +}
>> +
>>   static ktime_t taprio_mono_to_any(const struct taprio_sched *q, ktime_t mono)
>>   {
>>   	/* This pairs with WRITE_ONCE() in taprio_parse_clockid() */
>> @@ -918,6 +932,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>>   	int num_tc = netdev_get_num_tc(dev);
>>   	struct sched_entry *entry, *next;
>>   	struct Qdisc *sch = q->root;
>> +	ktime_t now_offset = taprio_get_offset(q);
>>   	ktime_t end_time;
>>   	int tc;
>>   
>> @@ -957,6 +972,14 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>>   	end_time = ktime_add_ns(entry->end_time, next->interval);
>>   	end_time = min_t(ktime_t, end_time, oper->cycle_end_time);
>>   
>> +	if (q->offset != now_offset) {
>> +		ktime_t diff = ktime_sub_ns(now_offset, q->offset);
>> +
>> +		end_time = ktime_add_ns(end_time, diff);
>> +		oper->cycle_end_time = ktime_add_ns(oper->cycle_end_time, diff);
>> +		q->offset = now_offset;
>> +	}
>> +
> I think what we should do here is a bit different. Let me try to explain
> what I have in mind with some context.
>
> A bit of context: The idea of taprio is to enforce "TSN" traffic
> schedules, these schedules require time synchronization, for example via
> PTP, and in those cases, time jumps are not expected or a sign that
> something is wrong.
>
> In my mind, a time jump, specially a big one, kind of invalidates the
> schedule, as the schedule is based on an absolute time value (the
> base_time), and when time jumps that reference in time is lost.
>
> BUT making the user's system unresponsive is a bug, a big one, as if
> this happens in the real world, the user will be unable to investigate
> what made the system have so big a time correction.
>
> So my idea is to warn the user that the time jumped, say that the user
> needs to reconfigure the schedule, as it is now invalid, and disable the
> schedule.
>
> Does this make sense?
>
> Ah, and thanks for the report.
>
Yeah, I understand what you mean,  and your idea is more reasonable.

Thank you for confirming this bug, my patch is only for temporary 
testing to avoid it.

BRs.