Message-ID: <20160829235403.GA14177@localhost>
Date:   Mon, 29 Aug 2016 18:54:03 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Roland Singer <roland.singer@...ertbit.com>
Cc:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-acpi@...r.kernel.org, dri-devel@...ts.freedesktop.org
Subject: Re: Kernel Freeze with American Megatrends BIOS

On Mon, Aug 29, 2016 at 09:55:56PM +0200, Roland Singer wrote:
> Just tried it and the system didn't freeze immediately. However, it
> still freezes after some time (a few minutes while working).
> 
> Seems to be pci_read_config_dword. Where exactly is this defined?

pci_read_config_dword() is defined in include/linux/pci.h.  It calls
pci_bus_read_config_dword() which is defined by the PCI_OP_READ() macro
in drivers/pci/access.c.
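
To make that call chain easier to follow, here is a rough user-space
sketch of it.  This is not the kernel code: every mock_* name, the
device_present flag, and the 0x12345678 ID value are hypothetical
stand-ins, and the real PCI_OP_READ() expansion also takes pci_lock
around the ops->read() call.  It only illustrates that the inline
wrapper forwards to the bus-level accessor, which calls the bus's read
op, and that a read from a device that does not respond typically
comes back as all ones, which is what is_card_disabled() tests for.

```c
#include <stdint.h>

#define MOCK_SUCCESSFUL          0x00
#define MOCK_BAD_REGISTER_NUMBER 0x87  /* mirrors PCIBIOS_BAD_REGISTER_NUMBER */

struct mock_bus;

/* Hypothetical, cut-down stand-in for struct pci_ops. */
struct mock_pci_ops {
    int (*read)(struct mock_bus *bus, unsigned int devfn,
                int where, int size, uint32_t *val);
};

/* Hypothetical, cut-down stand-in for struct pci_bus. */
struct mock_bus {
    const struct mock_pci_ops *ops;
    int device_present;  /* knob for this sketch only, not a kernel field */
};

/* Hypothetical, cut-down stand-in for struct pci_dev. */
struct mock_dev {
    struct mock_bus *bus;
    unsigned int devfn;
};

/* Fake config-space read: all ones when nothing answers on the bus. */
static int mock_ops_read(struct mock_bus *bus, unsigned int devfn,
                         int where, int size, uint32_t *val)
{
    (void)devfn; (void)where; (void)size;
    *val = bus->device_present ? 0x12345678u : 0xFFFFFFFFu;
    return MOCK_SUCCESSFUL;
}

static const struct mock_pci_ops mock_ops = { mock_ops_read };

/* Roughly what PCI_OP_READ(dword, u32, 4) generates, minus the locking. */
static int mock_bus_read_config_dword(struct mock_bus *bus,
                                      unsigned int devfn,
                                      int pos, uint32_t *value)
{
    uint32_t data = 0;
    int res;

    if (pos & 3)  /* dword reads must be 4-byte aligned */
        return MOCK_BAD_REGISTER_NUMBER;
    res = bus->ops->read(bus, devfn, pos, 4, &data);
    *value = data;
    return res;
}

/* Shape of the inline wrapper in include/linux/pci.h. */
static int mock_read_config_dword(const struct mock_dev *dev,
                                  int where, uint32_t *val)
{
    return mock_bus_read_config_dword(dev->bus, dev->devfn, where, val);
}

/* Same test as the quoted is_card_disabled(): 1 only if all bits are set. */
static int mock_is_card_disabled(const struct mock_dev *dev)
{
    uint32_t cfg_word = 0;

    mock_read_config_dword(dev, 0, &cfg_word);
    return !~cfg_word;
}
```

In this sketch, clearing device_present makes mock_is_card_disabled()
return 1, and setting it makes it return 0.  The real accessor ends up
in the host bridge's config read method, so that is the path to look at
if the read itself is what hangs.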

If I understand correctly, this:

  dis_dev_get();
  pci_read_config_dword(dis_dev, 0, &cfg_word);
  dis_dev_put();

causes an immediate system hang, but if you only do this:

  dis_dev_get();
  dis_dev_put();

the system hangs a few minutes later.  Right?

> On 29.08.2016 at 21:07, Bjorn Helgaas wrote:
> > On Mon, Aug 29, 2016 at 08:46:17PM +0200, Roland Singer wrote:
> >> Hi Bjorn,
> >>
> >> I am using the bbswitch kernel module to switch off/on the GPU and
> >> to obtain the GPU power state.
> >> Obtaining the GPU state immediately after starting the graphical user
> >> session freezes the system.
> >>
> >> This code triggers something, which is responsible for the freeze.
> >>
> >> ---
> >> // Returns 1 if the card is disabled, 0 if enabled
> >> static int is_card_disabled(void) {
> >>     u32 cfg_word;
> >>     // read first config word which contains Vendor and Device ID. If all bits
> >>     // are enabled, the device is assumed to be off
> >>     pci_read_config_dword(dis_dev, 0, &cfg_word);
> >>     // if one of the bits is not enabled (the card is enabled), the inverted
> >>     // result will be non-zero and hence logical not will make it 0 ("false")
> >>     return !~cfg_word;
> >> }
> >>
> >> static int bbswitch_proc_show(struct seq_file *seqfp, void *p) {
> >>     // show the card state. Example output: 0000:01:00:00 ON
> >>     dis_dev_get();
> >>     seq_printf(seqfp, "%s %s\n", dev_name(&dis_dev->dev),
> >>              is_card_disabled() ? "OFF" : "ON");
> >>     dis_dev_put();
> >>     return 0;
> >> }
> >> ---
> >>
> >> Either dis_dev_get or pci_read_config_dword is the trigger.
> > 
> > What happens if you remove the call to is_card_disabled()?  Does the
> > system still freeze if you only do the dis_dev_get()/dis_dev_put()?
> > 
> >> Link to the bbswitch module source code:
> >> https://github.com/Bumblebee-Project/bbswitch/blob/master/bbswitch.c#L333
> >>
> >>
> >> On 29.08.2016 at 18:02, Bjorn Helgaas wrote:
> >>> [+cc linux-acpi, linux-kernel, dri-devel]
> >>>
> >>> Hi Roland,
> >>>
> >>> I have no idea how to debug this problem.  Are you seeing something
> >>> that suggests it may be a PCI problem?
> >>>
> >>> On Tue, Aug 23, 2016 at 11:23:45AM +0200, Roland Singer wrote:
> >>>> Hi,
> >>>>
> >>>> I hope somebody can help me fix this kernel problem, which affects the following machines:
> >>>>
> >>>> - Clevo P651RA (i7-6700HQ/GTX 965M, part of the P6xxRx family which are also affected)
> >>>> - MSI GE62 Apache Pro (i7-6700HQ/GTX 960M)
> >>>> - Gigabyte P35V5 (i7-6700HQ/GTX 970M)
> >>>> - Razer Blade 14" (2016) (i7-6700HQ/GTX 970M) (BIOS 5.11, 04/07/2016)
> >>>>
> >>>>
> >>>> The kernel freezes if the graphical user session (Xorg & Wayland) is
> >>>> started with a switched off discrete GPU card (NVIDIA).
> >>>> If the discrete GPU is switched off after the graphical session start,
> >>>> then everything works as expected, until the graphical session is restarted.
> >>>>
> >>>> This problem seems to be linked to specific BIOS settings. If the computer
> >>>> is started with the following command line:
> >>>>
> >>>> acpi_osi=! acpi_osi="Windows 2009"
> >>>>
> >>>> then the kernel freeze no longer occurs. However, this required a special
> >>>> ACPI DSDT firmware patch for the Razer Blade 2016 laptop:
> >>>>
> >>>> https://github.com/m4ng0squ4sh/razer_blade_14_2016_acpi_dsdt
> >>>>
> >>>> I strongly recommend fixing this in the kernel, and I am ready to help
> >>>> solve this problem given some guidance.
> >>>>
> >>>> Here is a link to the GitHub issue with further information:
> >>>>
> >>>> https://github.com/Bumblebee-Project/Bumblebee/issues/764#issuecomment-241212595
> >>>>
> >>>> Here is some more detailed information:
> >>>>
> >>>> https://github.com/Lekensteyn/acpi-stuff/blob/master/Clevo-P651RA/notes.txt
> >>>>
> >>>> Hope somebody can help.
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> >> the body of a message to majordomo@...r.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
