lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 3 Mar 2014 18:58:32 +0100
From:	Gerald Schaefer <gerald.schaefer@...ibm.com>
To:	Amir Vadai <amirv@...lanox.com>
Cc:	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Eyal Perry <eyalpe@...lanox.com>
Subject: Re: mlx4_core: DMA-API: device driver tries to sync DMA memory it
 has not allocated

On Sun, 2 Mar 2014 10:40:47 +0200
Amir Vadai <amirv@...lanox.com> wrote:

> On 28/02/14 22:24 +0100, Gerald Schaefer wrote:
> > On Thu, 27 Feb 2014 19:01:09 +0200
> > Amir Vadai <amirv@...lanox.com> wrote:
> > 
> > > On 10/12/13 14:44 +0100, Gerald Schaefer wrote:
> > > > Hi,
> > > > 
> > > > During network stress test with CONFIG_DMA_API_DEBUG=y we get the following
> > > > message on the server side:
> > > > 
> > > > [57523.955982] mlx4_core 0000:00:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000143d2c202] [size=1526 bytes]
> > > > [57523.956018] ------------[ cut here ]------------
> > > > [57523.956021] WARNING: at /home/dailybuild/builder/packages/BUILD/linux-3.12.3-20131209/lib/dma-debug.c:986
> > > > [57523.956022] Modules linked in: mlx4_en dm_multipath scsi_dh mlx4_core eadm_sch dm_mod lcs autofs4
> > > > [57523.956034] CPU: 3 PID: 20 Comm: ksoftirqd/3 Not tainted 3.12.3-65.x.20131209-s390xdefault #1
> > > > [57523.956037] task: 000000026f47cb60 ti: 000000026f484000 task.ti: 000000026f484000
> > > > [57523.956044] Krnl PSW : 0404e00180000000 00000000004a0422 (check_sync+0xf2/0x52c)
> > > > [57523.956047]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
> > > > [57523.956047] Krnl GPRS: 0000000000000111 000000008b767e9c 0000000000000092 0000000000000000
> > > > [57523.956051]            00000000004a041e 0000000000000000 00000000000005f6 0000000143d2c202
> > > > [57523.956053]            000000026f487a98 0000000000000001 0000000000a71144 000000026f10dd60
> > > > [57523.956055]            0000000000000000 000000000075bb18 00000000004a041e 000000026f487980
> > > > [57523.956064] Krnl Code: 00000000004a0412: c020002143d7        larl    %r2,8c8bc0
> > > > [57523.956064]            00000000004a0418: c0e50012717a        brasl   %r14,6ee70c
> > > > [57523.956064]           #00000000004a041e: a7f40001            brc     15,4a0420
> > > > [57523.956064]           >00000000004a0422: e330a0000012        lt      %r3,0(%r10)
> > > > [57523.956064]            00000000004a0428: a7740201            brc     7,4a082a
> > > > [57523.956064]            00000000004a042c: a7f401f3            brc     15,4a0812
> > > > [57523.956064]            00000000004a0430: d5078030c030        clc     48(8,%r8),48(%r12)
> > > > [57523.956064]            00000000004a0436: a7c40070            brc     12,4a0516
> > > > [57523.956081] Call Trace:
> > > > [57523.956084] ([<00000000004a041e>] check_sync+0xee/0x52c)
> > > > [57523.956087]  [<00000000004a0a26>] debug_dma_sync_single_for_cpu+0x66/0x74
> > > > [57523.956094]  [<000003ff801fcae0>] mlx4_en_complete_rx_desc+0x15c/0x240 [mlx4_en]
> > > > [57523.956098]  [<000003ff801fe1fa>] mlx4_en_process_rx_cq+0x56a/0xb58 [mlx4_en]
> > > > [57523.956101]  [<000003ff801fe8ea>] mlx4_en_poll_rx_cq+0x92/0xf8 [mlx4_en]
> > > > [57523.956105]  [<00000000005d558c>] net_rx_action+0xa0/0x2f0
> > > > [57523.956109]  [<000000000013cd10>] __do_softirq+0x1e8/0x480
> > > > [57523.956111]  [<000000000013cfe4>] run_ksoftirqd+0x3c/0x80
> > > > [57523.956114]  [<000000000016f496>] smpboot_thread_fn+0x2c6/0x2e8
> > > > [57523.956118]  [<0000000000163834>] kthread+0xd0/0xe4
> > > > [57523.956123]  [<00000000006ff19e>] kernel_thread_starter+0x6/0xc
> > > > [57523.956125]  [<00000000006ff198>] kernel_thread_starter+0x0/0xc
> > > > [57523.956127] 1 lock held by ksoftirqd/3/20:
> > > > [57523.956129]  #0:  (&(&dma_entry_hash[i].lock)->rlock){-.-...}, at: [<000000000049f1d6>] get_hash_bucket+0x4a/0x64
> > > > [57523.956135] Last Breaking-Event-Address:
> > > > [57523.956137]  [<00000000004a041e>] check_sync+0xee/0x52c
> > > > [57523.956139] ---[ end trace ec82af84e1d16c5f ]---
> > > > 
> > > > --
> > > 
> > > Hi Gerald,
> > > 
> > > It seems that I missed this mail.
> > > 
> > > I didn't see this warning in our lab in x86 platform.
> > > Is it reproduced easily on PPC?
> > > Any hints how to reproduce it in our lab?
> > 
> > Hi Amir,
> > 
> > I could narrow it down to a uperf testrun with the following profile
> > and CONFIG_DMA_API_DEBUG=y (it usually takes multiple runs to show):
> > 
> > <profile name="test250">
> >   <group nthreads="250">
> >     <transaction iterations="1">
> >       <flowop type="accept" options="remotehost=10.100.1.36 protocol=tcp"/>
> >     </transaction>
> >     <transaction duration="300">
> >       <flowop type="write" options="size=1"/>
> >       <flowop type="read" options="size=1"/>
> >     </transaction>
> >     <transaction iterations="1">
> >       <flowop type="disconnect" />
> >     </transaction>
> >   </group>
> > </profile>
> > 
> > > 
> > > Do you have a 'good' version that didn't have the warning?
> > 
> > No, it showed on 3.12 and 3.13, before that we had other warnings but
> > not this one as far as I remember.

I checked again with 3.10 and 3.11, and couldn't reproduce the warning
there. However, there is another warning on those kernel versions:

mlx4_core 0000:00:00.0: DMA-API: device driver failed to check map error[device address=0x00000001046af186] [size=258 bytes] [mapped as single]

That one was fixed with commit 237a3a3b "net/mlx4_en: Fix handling of dma_map failure".
When I apply commit 237a3a3b on top of 3.11, I can also reproduce the
new "sync DMA" warning there.

So, either the new "sync DMA" warning was introduced with commit 237a3a3b,
or it was just occluded by the previous "map error" warning.

> > 
> > Thanks,
> > Gerald
> > 
> 
> Ok, We're preparing a setup to reproduce the issue.
> 
> Amir
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ