lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 31 Dec 2011 15:41:32 +0200
From:	Konstantinos Skarlatos <k.skarlatos@...il.com>
To:	Jeff Layton <jlayton@...hat.com>
CC:	linux-kernel@...r.kernel.org, linux-cifs@...r.kernel.org
Subject: Re: cifs: ls of mount point gives input/output error (probably related
 to CIFS: getdents() broken for large dirs)

On 31/12/2011 2:49 μμ, Konstantinos Skarlatos wrote:
> On Σάββατο, 31 Δεκέμβριος 2011 1:59:22 μμ, Jeff Layton wrote:
>> On Fri, 30 Dec 2011 20:00:46 +0200
>> Konstantinos Skarlatos<k.skarlatos@...il.com> wrote:
>>
>>> On 30/12/2011 3:11 μμ, Jeff Layton wrote:
>>>> On Fri, 30 Dec 2011 11:04:59 +0200
>>>> Konstantinos Skarlatos<k.skarlatos@...il.com> wrote:
>>>>
>>>>> On 29/12/2011 3:54 μμ, Konstantinos Skarlatos wrote:
>>>>>> On Πέμπτη, 29 Δεκέμβριος 2011 3:39:30 μμ, Jeff Layton wrote:
>>>>>>> On Thu, 29 Dec 2011 12:30:18 +0200
>>>>>>> Konstantinos Skarlatos<k.skarlatos@...il.com> wrote:
>>>>>>>
>>>>>>>> On 29/12/2011 4:04 πμ, Jeff Layton wrote:
>>>>>>>>> On Thu, 29 Dec 2011 02:08:57 +0200
>>>>>>>>> Konstantinos Skarlatos<k.skarlatos@...il.com> wrote:
>>>>>>>>>
>>>>>>>>>> I mount via cifs a windows XP share, df gives me correct sizes,
>>>>>>>>>> but when
>>>>>>>>>> I ls the mount point i get input/output error.
>>>>>>>>>> strace: http://pastebin.com/WXf8M1nu
>>>>>>>>>>
>>>>>>>>>> mount --verbose -t cifs -o
>>>>>>>>>> username=administrator,password=blahblah
>>>>>>>>>> //192.168.0.11/jobs /mnt/backups/montaz/jobs
>>>>>>>>>> mount.cifs kernel mount options:
>>>>>>>>>> ip=192.168.0.11,unc=\\192.168.0.11\jobs,,ver=1,user=administrator,pass=********
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> df
>>>>>>>>>> //192.168.0.11/jobs 114464
>>>>>>>>>> 105196 9268 92% /mnt/backups/montaz/jobs
>>>>>>>>>>
>>>>>>>>>> ls /mnt/backups/montaz/jobs/
>>>>>>>>>> ls: reading directory /mnt/backups/montaz/jobs/: Input/output
>>>>>>>>>> error
>>>>>>>>>> total 0
>>>>>>>>>>
>>>>>>>>>> the fun thing is that i can cd to a lower level directory, and ls
>>>>>>>>>> works
>>>>>>>>>> fine there! only the mount point has the problem
>>>>>>>>>>
>>>>>>>>>> ls /mnt/backups/montaz/jobs/test
>>>>>>>>>> total 44K
>>>>>>>>>> drwxr-xr-x 1 root root 0 Apr 30 2010 blah blah/
>>>>>>>>>> ......
>>>>>>>>>>
>>>>>>>>>> kernel version 3.2rc7
>>>>>>>>>>
>>>>>>>>>> this seems to be related to :
>>>>>>>>>> https://lkml.org/lkml/2011/8/1/427
>>>>>>>>>> Re: [3.0.0+][Regression][Bisected] CIFS: getdents() broken for
>>>>>>>>>> large dirs
>>>>>>>>>>
>>>>>>>>> Hmmm, maybe. What makes you think that it's related? What sort of
>>>>>>>>> server are you seeing this against?
>>>>>>>> Windows XP service pack 2 (greek)
>>>>>>>
>>>>>>> How many files are in the directory?
>>>>>>>
>>>>>> 140 folders and 20 files
>>>>>>
>>>>> Attached is a tcp dump of my session.
>>>> I tried reproducing this here, but wasn't able to. Testing against my
>>>> xp box worked fine.
>>>>
>>>> Most likely, the FIND_FILE responses are falling afoul of the code in
>>>> coalesce_t2 or check2ndT2. Unfortunately that code is pretty
>>>> complicated and I'm not certain what the problem actually is...
>>>>
>>>> One thing that's interesting is that the total data being sent in the
>>>> request is rather large (16336 bytes). I think that's legit, but maybe
>>>> it's exceeding the end of the buffer once we try to coalesce it.
>>>>
>>>> Would it be possible to get the cFYI output from this test?
>>> I did not get a cFYI output from that test, but i redid a
>>> mount-ls-umount and am attaching the tcpdump
>>> Also here http://pastebin.com/J20uC6kU you can find the cifsFYI and the
>>> contents of /proc/fs/cifs/DebugData form the same test
>>>>
>>>> Is this a regression? Did it work with earlier kernels and only
>>>> recently start failing?
>>>>
>>> I do not know, and i am a bit afraid to downgrade this machine below 3.0
>>> due to some changes arch linux has introduced recently. I can always set
>>> up a few virtual machines though, and i can even request permission from
>>> my company to give you shell access if you like. Which kernel versions
>>> would you like me to test?
>>
>>
>> Ok, that tells us a little:
>>
>> -------------------[snip]---------------------
>> [96268.787078] fs/cifs/cifssmb.c: In FindFirst for
>> [96268.787083] fs/cifs/transport.c: For smb_command 50
>> [96268.787086] fs/cifs/transport.c: Sending smb: total_len 88
>>
>> ...FIND_FIRST command is sent
>>
>> [96268.787690] fs/cifs/connect.c: RFC1002 header 0x1104
>> [96268.787697] fs/cifs/connect.c: missing 12048 bytes from transact2,
>> check next response
>> [96268.787865] fs/cifs/connect.c: RFC1002 header 0x1104
>> [96268.787870] fs/cifs/connect.c: missing 12036 bytes from transact2,
>> check next response
>> [96268.788037] fs/cifs/connect.c: RFC1002 header 0x1104
>> [96268.788042] fs/cifs/connect.c: missing 12036 bytes from transact2,
>> check next response
>> [96268.788371] fs/cifs/connect.c: RFC1002 header 0xdb0
>> [96268.788375] fs/cifs/connect.c: missing 12888 bytes from transact2,
>> check next response
>>
>> ...all four parts of the response are collected here
>>
>> [96268.788391] fs/cifs/transport.c: cifs_sync_mid_result: cmd=50
>> mid=12 state=16
>>
>> ...but the state at this point is MID_RESPONSE_MALFORMED
>>
>> [96268.788395] fs/cifs/cifssmb.c: Error in FindFirst = -5
>> [96268.788397] fs/cifs/readdir.c: initiate cifs search rc -5
>> [96268.788398] fs/cifs/readdir.c: CIFS VFS: leaving cifs_readdir (xid
>> = 737644) rc = -5
>>
>> ...which makes readdir return -EIO
>>
>> -------------------[snip]---------------------
>>
>> Based on that, it looks like something in one of these frames caused
>> coalesce_t2() to return an error. I don't see the problem right offhand
>> in the capture, but T2 response handling is pretty complex so it can be
>> hard to see.
>>
>> Would it be possible for you to rebuild your kernel (or just cifs.ko)
>> with this patch? Once you do that, rerun the test with cFYI turned up,
>> and it should help point out what the problem is.
>>
>> Thanks,
>
> Ok i am now rebuilding the kernel and will report when i have results.
http://pastebin.com/scgyDjhT and attached tcpdump

Download attachment "cifs-traffic.pcap" of type "application/octet-stream" (22933 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ