[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e27098e-ac8d-98f0-3a9a-ea25242e24ec@de.ibm.com>
Date: Wed, 24 Jun 2020 20:09:55 +0200
From: Christian Borntraeger <borntraeger@...ibm.com>
To: Luis Chamberlain <mcgrof@...nel.org>
Cc: Christoph Hellwig <hch@...radead.org>, ast@...nel.org,
axboe@...nel.dk, bfields@...ldses.org,
bridge@...ts.linux-foundation.org, chainsaw@...too.org,
christian.brauner@...ntu.com, chuck.lever@...cle.com,
davem@...emloft.net, dhowells@...hat.com,
gregkh@...uxfoundation.org, jarkko.sakkinen@...ux.intel.com,
jmorris@...ei.org, josh@...htriplett.org, keescook@...omium.org,
keyrings@...r.kernel.org, kuba@...nel.org,
lars.ellenberg@...bit.com, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-security-module@...r.kernel.org, nikolay@...ulusnetworks.com,
philipp.reisner@...bit.com, ravenexp@...il.com,
roopa@...ulusnetworks.com, serge@...lyn.com, slyfox@...too.org,
viro@...iv.linux.org.uk, yangtiezhu@...ngson.cn,
netdev@...r.kernel.org, markward@...ux.ibm.com,
linux-s390 <linux-s390@...r.kernel.org>
Subject: Re: linux-next: umh: fix processed error when UMH_WAIT_PROC is used
seems to break linux bridge on s390x (bisected)
On 24.06.20 19:58, Christian Borntraeger wrote:
>
>
> On 24.06.20 18:09, Luis Chamberlain wrote:
>> On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
>>>
>>>
>>> On 24.06.20 16:43, Christoph Hellwig wrote:
>>>> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>>>>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>>>>> linux-bridge on s390?
>>>>
>>>> Are we even sure this is s390 specific and doesn't happen on other
>>>> architectures with the same bridge setup?
>>>
>>> Fair point. AFAIK nobody has tested this yet on x86.
>>
>> Regardless, can you enable dynamic debug prints, to see if the kernel
>> reveals anything on the bridge code which may be relevant:
>>
>> echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control
>>
>> Luis
>
> When I start a guest the following happens with the patch:
>
> [ 47.420237] virbr0: port 2(vnet0) entered blocking state
> [ 47.420242] virbr0: port 2(vnet0) entered disabled state
> [ 47.420315] device vnet0 entered promiscuous mode
> [ 47.420365] virbr0: port 2(vnet0) event 16
> [ 47.420366] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 47.420373] virbr0: toggle option: 12 state: 0 -> 0
> [ 47.420536] virbr0: port 2(vnet0) entered blocking state
> [ 47.420538] virbr0: port 2(vnet0) event 16
> [ 47.420539] virbr0: br_fill_info event 16 port vnet0 master virbr0
>
> and the nothing happens.
>
>
> without the patch
> [ 33.805410] virbr0: hello timer expired
> [ 35.805413] virbr0: hello timer expired
> [ 36.184349] virbr0: port 2(vnet0) entered blocking state
> [ 36.184353] virbr0: port 2(vnet0) entered disabled state
> [ 36.184427] device vnet0 entered promiscuous mode
> [ 36.184479] virbr0: port 2(vnet0) event 16
> [ 36.184480] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 36.184487] virbr0: toggle option: 12 state: 0 -> 0
> [ 36.184636] virbr0: port 2(vnet0) entered blocking state
> [ 36.184638] virbr0: port 2(vnet0) entered listening state
> [ 36.184639] virbr0: port 2(vnet0) event 16
> [ 36.184640] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 36.184645] virbr0: port 2(vnet0) event 16
> [ 36.184646] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 37.805478] virbr0: hello timer expired
> [ 38.205413] virbr0: port 2(vnet0) forward delay timer
> [ 38.205414] virbr0: port 2(vnet0) entered learning state
> [ 38.205427] virbr0: port 2(vnet0) event 16
> [ 38.205430] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 38.765414] virbr0: port 2(vnet0) hold timer expired
> [ 39.805415] virbr0: hello timer expired
> [ 40.285410] virbr0: port 2(vnet0) forward delay timer
> [ 40.285411] virbr0: port 2(vnet0) entered forwarding state
> [ 40.285418] virbr0: topology change detected, propagating
> [ 40.285420] virbr0: decreasing ageing time to 400
> [ 40.285427] virbr0: port 2(vnet0) event 16
> [ 40.285432] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [ 40.765408] virbr0: port 2(vnet0) hold timer expired
> [ 41.805415] virbr0: hello timer expired
> [ 42.765426] virbr0: port 2(vnet0) hold timer expired
> [ 43.805425] virbr0: hello timer expired
> [ 44.765426] virbr0: port 2(vnet0) hold timer expired
> [ 45.805418] virbr0: hello timer expired
>
> and continuing....
Just reverting the umh.c parts like this makes the problem go away.
diff --git a/kernel/umh.c b/kernel/umh.c
index f81e8698e36e..79f139a7ca03 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -154,8 +154,8 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
* the real error code is already in sub_info->retval or
* sub_info->retval is 0 anyway, so don't mess with it then.
*/
- if (KWIFEXITED(ret))
- sub_info->retval = KWEXITSTATUS(ret);
+ if (ret)
+ sub_info->retval = ret;
}
/* Restore default kernel sig handler */
Powered by blists - more mailing lists