lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 10 May 2012 15:58:53 -0600
From:	Stephen Warren <swarren@...dotorg.org>
To:	Kay Sievers <kay@...y.org>
CC:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	William Douglas <william.douglas@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linux-next@...r.kernel.org
Subject: Re: Regression due to 7ff9554 "printk: convert byte-buffer to variable-length
 record buffer"

On 05/10/2012 02:26 PM, Kay Sievers wrote:
> On Thu, May 10, 2012 at 10:20 PM, Stephen Warren <swarren@...dotorg.org> wrote:
>> On 05/10/2012 02:15 PM, Kay Sievers wrote:
>>> On Thu, May 10, 2012 at 10:13 PM, Stephen Warren <swarren@...dotorg.org> wrote:
>>>> On 05/10/2012 02:09 PM, Stephen Warren wrote:
>>>>> On 05/10/2012 02:06 PM, Kay Sievers wrote:
>>>>>> On Thu, May 10, 2012 at 9:54 PM, Stephen Warren <swarren@...dotorg.org> wrote:
>>>>>>> On 05/09/2012 12:31 PM, Stephen Warren wrote:
>>>>>>>> For me, next-20120508 prints nothing when booted, and I think also
>>>>>>>> hangs. To solve this, I reverted:
>>>>>>>>
>>>>>>>> 7ff9554 printk: convert byte-buffer to variable-length record buffer
>>>>>>>>
>>>>>>>> In order to build, I also had to revert:
>>>>>>>>
>>>>>>>> c4e00da driver-core: extend dev_printk() to pass structured data
>>>>>>>>
>>>>>>>> Note: I'm running on an ARM system using a serial console, with
>>>>>>>> earlyprintk enabled.
>>>>>>>
>>>>>>> This issue still occurs in next-20120510.
>>>>>>>
>>>>>>> I've tracked it down to the assignment of msg->ts_nsec near the end of
>>>>>>> log_store(). If I comment this out, everything works. The problem is the
>>>>>>> assignment, not the call to local_clock():
>>>>>>>
>>>>>>> fails:
>>>>>>>        msg->ts_nsec = local_clock();
>>>>>>> fails:
>>>>>>>        msg->ts_nsec = 0;//local_clock();
>>>>>>> works:
>>>>>>>        //msg->ts_nsec = local_clock();
>>>>>>
>>>>>> Weird.
>>>>>>
>>>>>> What happens if you change it to:
>>>>>>   cpu_clock(logbuf_cpu);
>>>>>> ?
>>>>>>
>>>>>> If it works, the timestamps look ok?
>>>>>
>>>>> I doubt that would work - after all, assigning 0 fails, but not
>>>>> performing the assignment at all works. But, I'll go try it...
>>>>
>>>> Calling cpu_clock() instead of local_clock() fails in the same way.
>>>
>>> Ok, didn't really see the assign to 0 you tried, sorry. :)
>>>
>>> And 'dmesg' works when you run the box with the line commented out?
>>>
>>> And 'cat /dev/kmsg'?
>>
>> Yes, both work and produce reasonable output.
> 
> Really weird. I have zero ideas at what this could be. The messages in
> /dev/kmsg have seqnums, you see the 0 as the first?
> 
> 5,0,0;Linux version 3.4.0-rc6+ ...
> 6,1,0;Command line: root=/dev/sda1 ...
> 
>> "cat /dev/kmsg" does hang
>> at the end of the log until I CTRL-C it - is that expected?
> 
> Yeah, 'cat' is a blocking read(), tools would use poll().

Found the bug. __log_buf isn't correctly aligned, so the very first
message gets the struct log written to an unaligned address. Subsequent
entries would be correctly aligned in log_store(), except the kernel has
already crashed by then.

The ts_nsec field is a 64-bit write, which triggers the alignment
requirement. I'm not quite sure why the previous 16-bit writes for the
various len fields didn't experience an issue, since __log_buf ended up
at an odd address for me. Perhaps ARM doesn't have an alignment
requirement for 16-bit writes, but does for 32-bit and greater.

I'll post a patch in a minute.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ