Just passing by

  • 1 Post
  • 4 Comments
Joined 1 year ago
cake
Cake day: June 9th, 2023

help-circle
  • Hi.

    There is one usb drive in an nvme enclosure without their own power supply. I know the brand and I can find the chipset however what I need to understand is the issue from the logs.

    The error usb 2-4: Enable of device-initiated U1 failed. seems common for USB devices not working.

    What does it point to and what to look for to understand it?

    Thanks.

    PS: Just for curiosity I did swap the enclosure days ago and the cable but had the same issue, so the error message is not specific to it. Also I was using this enclosure as the main disk for one of my PI with no issue, so power via USB or cable should not be the problem. Not that I want to use that as metric, I need data/logs from the OS.


  • Hi! Thanks for the pointers. Unfortunately dmesg and system logs where the first places I looked at, but I found nothing at the time. I tried it again now to give you the output of a zpool clear, you can obviously ignore the failed email attempt. journalctl:

    Jun 07 08:06:24 truenas kernel: WARNING: Pool 'tank-02' has encountered an uncorrectable I/O failure and has been suspended.
    Jun 07 08:06:24 truenas zed[799040]: eid=309 class=statechange pool='tank-02' vdev=xxx-xxx-xxx-xxx-xxx vdev_state=ONLINE
    Jun 07 08:06:24 truenas zed[799049]: eid=310 class=statechange pool='tank-02' vdev=xxx-xxx-xxx-xxx-xxx vdev_state=FAULTED
    Jun 07 08:06:24 truenas zed[799057]: eid=313 class=data pool='tank-02' priority=3 err=28 flags=0x20004000 bookmark=0:0:0:1
    Jun 07 08:06:24 truenas zed[799058]: eid=311 class=vdev_clear pool='tank-02' vdev=xxx-xxx-xxx-xxx-xxx vdev_state=FAULTED
    Jun 07 08:06:24 truenas zed[799067]: eid=312 class=data pool='tank-02' priority=3 err=28 flags=0x20004000 bookmark=0:62:0:0
    Jun 07 08:06:24 truenas zed[799081]: eid=316 class=io_failure pool='tank-02'
    Jun 07 08:06:24 truenas zed[799082]: eid=315 class=data pool='tank-02' priority=3 err=28 flags=0x20004000 bookmark=0:0:-1:0
    Jun 07 08:06:24 truenas zed[799090]: eid=314 class=data pool='tank-02' priority=3 err=28 flags=0x20004000 bookmark=0:0:1:0
    Jun 07 08:06:24 truenas find_alias_for_smtplib.py[799114]: sending mail to 
                                                               To: root
                                                               Subject: ZFS device fault for pool tank-02 on truenas
                                                               MIME-Version: 1.0
                                                               Content-Type: text/plain; charset="ANSI_X3.4-1968"
                                                               Content-
    Jun 07 08:06:24 truenas find_alias_for_smtplib.py[799114]: No aliases found to send email to root
    Jun 07 08:06:24 truenas zed[799144]: error: statechange-notify.sh: eid=310: mail exit=1
    

    dmesg says even less.

    I also tried to reboot the machine with the drive detached and then attach it at runtime while tailing dmesg and journalctl. Now, they are pretty verbose, so will only add here any interesting part (I didn’t notice anything new however):

    [...]
    [  221.952569] usb 2-4: Enable of device-initiated U1 failed.
    [  221.954164] usb 2-4: Enable of device-initiated U2 failed.
    [  221.965756] usbcore: registered new interface driver usb-storage
    [  221.983528] usb 2-4: Enable of device-initiated U1 failed.
    [  221.983997] usb 2-4: Enable of device-initiated U2 failed.
    [  221.987603] scsi host2: uas
    [  221.987831] usbcore: registered new interface driver uas
    [...]
    [  222.040564] sd 2:0:0:0: Attached scsi generic sg1 type 0
    [  222.049860] sd 2:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
    [  222.051867] sd 2:0:0:0: [sdb] Write Protect is off
    [  222.051879] sd 2:0:0:0: [sdb] Mode Sense: 37 00 00 08
    [  222.056719] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [  222.058407] sd 2:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
    [  222.058413] sd 2:0:0:0: [sdb] Optimal transfer size 33553920 bytes
    [  222.252607]  sdb: sdb1
    [  222.253015] sd 2:0:0:0: [sdb] Attached SCSI disk
    [  234.935926] usb 2-4: USB disconnect, device number 2
    [  234.983962] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
    [  235.227936] sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
    [...]
    

    Thanks for the advices, it was worth another try. Anything more that comes to mind?


  • Thank you! A new path to check :) I didn’t find this in my search until now, so I added it on my documentation.


    Unfortunately it doesn’t tell me much, but I am really happy there is some more new info here. I can see some FAILED steps but it may be just connected to the fact it is a striped volume?

    1717612906   spa.c:6623:spa_import(): spa_import: importing tank-02
    1717612906   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config trusted): LOADING
    1717612906   vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/xxx-xxx-xxx-xxx-xxxx': best uberblock found for spa tank-02. txg 6462
    1717612906   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config untrusted): using uberblock with txg=6462
    1717612906   spa.c:8925:spa_async_request(): spa=tank-02 async request task=4
    1717612906   spa_misc.c:404:spa_load_failed(): spa_load(tank-02, config trusted): FAILED: cannot open vdev tree after invalidating some vdevs
    1717612906   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config trusted): UNLOADING
    1717612906   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config trusted): spa_load_retry: rewind, max txg: 6461
    1717612906   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config trusted): LOADING
    1717612907   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config untrusted): vdev tree has 1 missing top-level vdevs.
    1717612907   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config untrusted): current settings allow for maximum 0 missing top-level vdevs at this stage.
    1717612907   spa_misc.c:404:spa_load_failed(): spa_load(tank-02, config untrusted): FAILED: unable to open vdev tree [error=2]
    1717612907   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config untrusted): UNLOADING
    

    It goes on and after a while:

    1717614235   spa_misc.c:2311:spa_import_progress_set_notes_impl(): 'tank-02' Finished importing
    1717614235   spa.c:8925:spa_async_request(): spa=tank-02 async request task=2048
    1717614235   spa_misc.c:418:spa_load_note(): spa_load(tank-02, config trusted): LOADED
    1717614235   metaslab.c:2445:metaslab_load_impl(): metaslab_load: txg 6464, spa tank-02, vdev_id 0, ms_id 95, smp_length 0, unflushed_allocs 0, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 1362018 ms, loading_time 0 ms, ms_max_size 8589934592, max size error 8589934592, old_weight 840000000000001, new_weight 840000000000001
    

    But I see no other issue otherwise. Any other new path/logs/ways I can query the system?


  • Thanks, I understand the point of view. So maybe let me rephrase it. ZFS is not telling me more info that the one I posted above (maybe this is all it sees like you said). Do you know of any other way to make ZFS more verbose on the issue or giving me more info? If not, it is ok but I have a second question: Where would you look on which is the culprit amongst “bad USB controller, firmware, cable, or driver” without trying-by-switching them out? Thank you for your advice.