lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1201dfde-c552-4f61-b2bd-b3415eb7b4ed@redhat.com>
Date: Tue, 29 Oct 2024 17:08:37 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: "Chia-Yu Chang (Nokia)" <chia-yu.chang@...ia-bell-labs.com>,
 "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
 "davem@...emloft.net" <davem@...emloft.net>,
 "stephen@...workplumber.org" <stephen@...workplumber.org>,
 "jhs@...atatu.com" <jhs@...atatu.com>,
 "edumazet@...gle.com" <edumazet@...gle.com>,
 "kuba@...nel.org" <kuba@...nel.org>, "dsahern@...nel.org"
 <dsahern@...nel.org>, "ij@...nel.org" <ij@...nel.org>,
 "ncardwell@...gle.com" <ncardwell@...gle.com>,
 "Koen De Schepper (Nokia)" <koen.de_schepper@...ia-bell-labs.com>,
 "g.white@...leLabs.com" <g.white@...leLabs.com>,
 "ingemar.s.johansson@...csson.com" <ingemar.s.johansson@...csson.com>,
 "mirja.kuehlewind@...csson.com" <mirja.kuehlewind@...csson.com>,
 "cheshire@...le.com" <cheshire@...le.com>, "rs.ietf@....at"
 <rs.ietf@....at>, "Jason_Livingood@...cast.com"
 <Jason_Livingood@...cast.com>, "vidhi_goel@...le.com" <vidhi_goel@...le.com>
Cc: Olga Albisser <olga@...isser.org>,
 "Olivier Tilmans (Nokia)" <olivier.tilmans@...ia.com>,
 Henrik Steen <henrist@...rist.net>, Bob Briscoe <research@...briscoe.net>
Subject: Re: [PATCH v4 net-next 1/1] sched: Add dualpi2 qdisc

On 10/29/24 16:27, Chia-Yu Chang (Nokia) wrote:
> On Tuesday, October 29, 2024 1:56 PM Paolo Abeni <pabeni@...hat.com>  wrote:
>> On 10/22/24 00:12, chia-yu.chang@...ia-bell-labs.com wrote:
>>> +/* Default alpha/beta values give a 10dB stability margin with 
>>> +max_rtt=100ms. */ static void dualpi2_reset_default(struct 
>>> +dualpi2_sched_data *q) {
>>> +     q->sch->limit = 10000;                          /* Max 125ms at 1Gbps */
>>> +
>>> +     q->pi2.target = 15 * NSEC_PER_MSEC;
>>> +     q->pi2.tupdate = 16 * NSEC_PER_MSEC;
>>> +     q->pi2.alpha = dualpi2_scale_alpha_beta(41);    /* ~0.16 Hz * 256 */
>>> +     q->pi2.beta = dualpi2_scale_alpha_beta(819);    /* ~3.20 Hz * 256 */
>>> +
>>> +     q->step.thresh = 1 * NSEC_PER_MSEC;
>>> +     q->step.in_packets = false;
>>> +
>>> +     dualpi2_calculate_c_protection(q->sch, q, 10);  /* wc=10%, 
>>> + wl=90% */
>>> +
>>> +     q->ecn_mask = INET_ECN_ECT_1;
>>> +     q->coupling_factor = 2;         /* window fairness for equal RTTs */
>>> +     q->drop_overload = true;        /* Preserve latency by dropping */
>>> +     q->drop_early = false;          /* PI2 drops on dequeue */
>>> +     q->split_gso = true;
> 
>> This is a very unexpected default. Splitting GSO packets earlier WRT the H/W constaints definitely impact performances in a bad way.
> 
>> Under which condition this is expected to give better results?
>> It should be at least documented clearly.
> 
> I see a similar operation exists in other qdisc (e.g., sch_tbf.c and sch_cake). They both walk through segs of skb_list.
> Instead, I see other qdisc use "skb_list_walk_safe" macro, so I was thinking to follow a similar approach in dualpi2 (or other comments please let me know).
> Or do you suggest we should force gso-splitting like in other qdisc?

The main point is not traversing an skb list, but the segmentation this
scheduler performs by default. Note that the sch_tbf case is slightly
different, as it segments skbs as a last resort to avoid dropping
packets exceeding the burst limit.
You should provide some more wording or test showing when and how such
splitting is advantageous - i.e. as done in the cake scheduler in commit
2db6dc2662bab14e59517ab4b86a164cc4d2db42.

The reason for the above is that performing unneeded S/W segmentation
is, generally speaking, a huge loss.

Thanks,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ