lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 9 Jan 2017 05:21:29 +0000
From:   "Zheng, Lv" <lv.zheng@...el.com>
To:     "Zheng, Lv" <lv.zheng@...el.com>, Borislav Petkov <bp@...en8.de>,
        "Rafael J. Wysocki" <rafael@...nel.org>
CC:     "Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
        "Moore, Robert" <robert.moore@...el.com>,
        J?rg R?del <joro@...tes.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Linux ACPI <linux-acpi@...r.kernel.org>
Subject: RE: 174cc7187e6f ACPICA: Tables: Back port
 acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

Hi, Borislav

> From: Zheng, Lv
> Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> Hi,
> 
> > From: linux-acpi-owner@...r.kernel.org [mailto:linux-acpi-owner@...r.kernel.org] On Behalf Of Zheng,
> > Lv
> > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> >
> > Hi,
> >
> > > From: linux-acpi-owner@...r.kernel.org [mailto:linux-acpi-owner@...r.kernel.org] On Behalf Of
> > Borislav
> > > Petkov
> > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > early_acpi_os_unmap_memory() from Linux kernel
> > >
> > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > > >  drivers/iommu/amd_iommu_init.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > > >  	 */
> > > >  	ret = check_ivrs_checksum(ivrs_base);
> > > >  	if (ret)
> > > > -		return ret;
> > > > +		goto out;
> > > >
> > > >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > > >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> > >
> > > Good catch, this one needs to be applied regardless.
> > >
> > > However, it doesn't fix my issue though.
> > >
> > > But I think I have it - I went and applied the well-proven debugging
> > > technique of sprinkling printks around. Here's what I'm seeing:
> > >
> > > early_amd_iommu_init()
> > > |-> acpi_put_table(ivrs_base);
> > > |-> acpi_tb_put_table(table_desc);
> > > |-> acpi_tb_invalidate_table(table_desc);
> > > |-> acpi_tb_release_table(...)
> > > |-> acpi_os_unmap_memory
> > > |-> acpi_os_unmap_iomem
> > > |-> acpi_os_map_cleanup
> > > |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> > >
> > > Now that function goes and sends IPIs, i.e., schedule_work()
> > > but this is too early - we haven't even done workqueue_init().
> > > Actually, from looking at the callstack, we do
> > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > > comes next.
> > >
> > > And this makes sense because the splat rIP points to __queue_work() but
> > > we haven't done that yet.
> > >
> > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > > should not put the table but WTH do I know?!
> > >
> > > In any case, commenting out:
> > >
> > >         acpi_put_table(ivrs_base);
> > >         ivrs_base = NULL;
> > >
> > > and the end of early_amd_iommu_init() makes the box boot again.
> >
> > So please help to comment out these 2 lines (with descriptions and do not delete them).
> > Until acpi_os_unmap_memory() is able to handle such an early case.
> 
> IMO, synchronize_rcu_expedited() should be improved:
> If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.


Hmm, I think this problem has its history.
In APEI code, NMI handlers cannot utilize spinlock.
So the developers there used RCU to synchronize NMI handlers before the register region is unmapped.
At that time, there might not be pre-map/post-unmap code prepared in the APEI drivers.
So the APEI developers relied on map/unmap logics implemented in the ACPICA register read/write APIs where the OSL map/unmap is invoked.

That's why the RCU code is in acpi_os_xxx().
If:
1. there is map/unmap/read/write operations available for APEI developers to invoke;
2. RCU synchronization is invoked before invoking the last unmap operation;
3. map/unmap/read/write order is strictly ensured inside of the APEI drivers.
Then we can remove RCU stuffs from acpi_os_xxx.
And the root cause of this issue can be fixed.

I'm not sure if this approach is possible, but can give it a try.
Before that I think all such acpi_put_table() should just be commented out.

Thanks and best regards
Lv

> 
> Thanks and best regards
> Lv
> 
> >
> > Thanks and best regards
> > Lv
> >
> > >
> > > --
> > > Regards/Gruss,
> > >     Boris.
> > >
> > > Good mailing practices for 400: avoid top-posting and trim the reply.

Powered by blists - more mailing lists