Message-ID: <BYAPR18MB2423B06D1366DB7D32A82C13AC800@BYAPR18MB2423.namprd18.prod.outlook.com>
Date:   Thu, 11 Jun 2020 21:49:06 +0000
From:   Derek Chickles <dchickles@...vell.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Satananda Burla <sburla@...vell.com>,
        Felix Manlunas <fmanlunas@...vell.com>
CC:     "frederic@...nel.org" <frederic@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: liquidio vs smp_call_function_single_async()

> From: Peter Zijlstra <peterz@...radead.org>
> Sent: Monday, June 8, 2020 6:05 AM
> To: Derek Chickles <dchickles@...vell.com>; Satananda Burla
> <sburla@...vell.com>; Felix Manlunas <fmanlunas@...vell.com>
> Cc: frederic@...nel.org; linux-kernel@...r.kernel.org;
> davem@...emloft.net; kuba@...nel.org; netdev@...r.kernel.org
> Subject: liquidio vs smp_call_function_single_async()
> 
> Hi,
> 
> I'm going through the smp_call_function_single_async() users, and stumbled
> over your liquidio thingy. It does:
> 
> 		call_single_data_t *csd = &droq->csd;
> 
> 		csd->func = napi_schedule_wrapper;
> 		csd->info = &droq->napi;
> 		csd->flags = 0;
> 
> 		smp_call_function_single_async(droq->cpu_id, csd);
> 
> which is almost certainly a bug. What guarantees that csd is unused when
> you do this? What happens if the remote CPU is already running RX and
> consumes the packets before the IPI lands, and then this CPU gets another
> interrupt?
> 
> AFAICT you then call this thing again, causing list corruption.

Hi Peter,

I think you're right that this might be a functional bug, but it won't cause list
corruption. We don't rely on the IPI to process packets, only to move NAPI
processing to another CPU. Separate register counters indicate whether, and
how many, new packets have arrived, and those are re-read once NAPI runs on
the target CPU.

I think a patch to check whether NAPI is already scheduled would address the
unexpected rescheduling issue here. Otherwise, it can probably stay as is,
since there is no harm.
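
Roughly, something along these lines (untested sketch only; the
test_bit(NAPI_STATE_SCHED, ...) check is just one way to express "already
scheduled", and the droq fields are the same ones from the code you quoted):

		/* Only re-arm the CSD/IPI if NAPI isn't already scheduled
		 * for this droq; if it is, the csd may still be in flight
		 * and the extra IPI is redundant anyway.
		 */
		if (!test_bit(NAPI_STATE_SCHED, &droq->napi.state)) {
			call_single_data_t *csd = &droq->csd;

			csd->func = napi_schedule_wrapper;
			csd->info = &droq->napi;
			csd->flags = 0;

			smp_call_function_single_async(droq->cpu_id, csd);
		}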
 
Thanks,
Derek
