External USB and Firewire Hard Drives

USB hard drives such as the Maxtor 5000DV work in Linux. A 2.5 kernel is needed to keep the kernel from locking up when large numbers of files are copied: kernel 2.4.20 and earlier cause the system to hang during use, whereas 2.5.53 works well. Unfortunately, 2.5.53 has problems with, of all things, the mouse.

  1. It's essential to use USB 2.0 (EHCI), otherwise the drive will be too slow to be useful.
  2. Enable USB, USB_BANDWIDTH, USB_UHCI, and USB_EHCI_HCD in the kernel.
  3. Enable BLK_DEV_SD and BLK_DEV_SR in kernel.
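
The options above can be checked against a kernel configuration file before rebuilding. The following is only a sketch: the function name missing_kernel_opts is made up for this example, and it assumes the usual .config format, in which an enabled option appears as CONFIG_<name>=y or CONFIG_<name>=m. Exact option names vary between kernel versions, so adjust the list to match your source tree.

```shell
# Print every option that is neither built in (=y) nor modular (=m).
# Usage: missing_kernel_opts CONFIG_FILE OPTION...
# (hypothetical helper, not part of any kernel tool)
missing_kernel_opts() {
    config_file=$1
    shift
    for opt in "$@"; do
        # An enabled option looks like CONFIG_<name>=y or CONFIG_<name>=m.
        if ! grep -Eq "^CONFIG_${opt}=(y|m)\$" "$config_file"; then
            echo "$opt"
        fi
    done
}
```

For example, missing_kernel_opts /usr/src/linux/.config USB USB_EHCI_HCD BLK_DEV_SD BLK_DEV_SR prints nothing if all four options are enabled. The path shown is just the conventional location of the kernel source.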

The USB drive will be automatically assigned the next SCSI drive letter at boot-up. If you have no SCSI drives, the USB drive will be /dev/sda and its first partition /dev/sda1. If the disk is plugged into the USB port while the system is running, it should be identified by the kernel immediately.

Finding out whether your computer has a USB 2.0 controller
You must have an EHCI (USB 2.0) controller for the drive to work at a reasonable speed. The Linux USB driver does not give any error messages if an EHCI device is not found. To find which controller you have, type lspci -vv. If it says "UHCI" or "OHCI", the controller is USB 1.1; if it says "EHCI", it can handle USB 2.0. (Note that a computer can have both USB 1.1 and USB 2.0 controllers simultaneously. Also, USB 2.0 does not necessarily mean high speed: the USB 2.0 specification covers three speeds, "low speed" (1.5 Mb/s), "full speed" (12 Mb/s), and "hi speed" (up to 480 Mb/s).)

lspci -vv
00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub 
  (rev 12) (prog-if 00 [UHCI])
  Subsystem: Sony Corporation: Unknown device 80f0
  Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- 
  VGASnoop- ParErr- Stepping- SERR- FastB2B-
  Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=
  medium >TAbort- <TAbort- <MAbort- >SERR- 
  <PERR-
  Latency: 0
  Interrupt: pin D routed to IRQ 3
  Region 4: I/O ports at b400 [size=32]

00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub  
  (rev 12) (prog-if 00 [UHCI])
  Subsystem: Sony Corporation: Unknown device 80f0
  Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- 
  VGASnoop- ParErr- Stepping- SERR- FastB2B-
  Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=
  medium >TAbort- <TAbort- <MAbort- >SERR- 
  <PERR-
  Latency: 0
  Interrupt: pin C routed to IRQ 9
  Region 4: I/O ports at b000 [size=32]

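Both entries are UHCI, which means that this computer had only USB 1.1 controllers. A USB 2.0 card looks like this (two OHCI functions for USB 1.1 devices plus one Enhanced Host Controller function for USB 2.0):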

00:0c.0 USB Controller: NEC Corporation USB 
  (rev 41) (prog-if 10 [OHCI])
  Subsystem: Unknown device 3083:0035
  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ 
  VGASnoop- ParErr- Stepping- SERR+ FastB2B-
  Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=
  medium >TAbort- <TAbort- <MAbort- >SERR-
  <PERR-
  Latency: 64 (250ns min, 10500ns max), cache line 
  size 08
  Interrupt: pin A routed to IRQ 11
  Region 0: Memory at dfff7000 (32-bit,
  non-prefetchable) [size=4K]
  Capabilities: [40] Power Management version 2
          Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
          PME(D0+,D1+,D2+,D3hot+,D3cold-)
          Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.1 USB Controller: NEC Corporation USB (rev 41) 
  (prog-if 10 [OHCI])
  Subsystem: Unknown device 3083:0035
  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ 
  VGASnoop- ParErr- Stepping- SERR+ FastB2B-
  Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=
  medium >TAbort- <TAbort- <MAbort- >SERR-
  <PERR-
  Latency: 64 (250ns min, 10500ns max), cache line 
  size 08
  Interrupt: pin B routed to IRQ 9
  Region 0: Memory at dfffc000 (32-bit,
  non-prefetchable) [size=4K]
  Capabilities: [40] Power Management version 2
          Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
          PME(D0+,D1+,D2+,D3hot+,D3cold-)
          Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.2 USB Controller: NEC Corporation USB Enhanced 
  Host Controller (rev 02) (prog-if 20)
  Subsystem: Unknown device 3083:00e0
  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ 
  VGASnoop- ParErr- Stepping- SERR+ FastB2B-
  Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=
  medium >TAbort- <TAbort- <MAbort- >SERR-
  <PERR-
  Latency: 64 (4000ns min, 8500ns max), cache line 
  size 08
  Interrupt: pin C routed to IRQ 5
  Region 0: Memory at dfffd700 (32-bit,
  non-prefetchable) [size=256]
  Capabilities: [40] Power Management version 2
          Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
          PME(D0+,D1+,D2+,D3hot+,D3cold-)
          Status: D0 PME-Enable- DSel=0 DScale=0 PME- 

Note that lspci did not print the string "EHCI" for the USB 2.0 function (00:0c.2), which shows up only as "Enhanced Host Controller" with prog-if 20; more recent kernels will usually have an entry for EHCI. Dmesg also may or may not print "EHCI", despite the presence of a functioning USB 2.0 card, so you should look for phrases like "high speed USB", "USB 1.1 Controller", or "USB 2.0" as well.
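
Because the "EHCI" label may be missing, it can be more dependable to read the prog-if value, which is 00 for UHCI, 10 for OHCI, and 20 for EHCI. The sketch below does that for lspci text read from standard input. The function name usb_controller_types is invented for this example, and it assumes that each device entry starts on an unindented line containing the word "USB", as in the listings above.

```shell
# Read "lspci -v" style text on stdin and print the address and
# interface type of each USB host controller, based on prog-if.
# (hypothetical helper; assumes the entry header line mentions "USB")
usb_controller_types() {
    awk '
        # An unindented line starts a new device entry; remember its address.
        /^[0-9a-f]/ { addr = $1; is_usb = ($0 ~ /USB/) }
        # Inside a USB entry, decode the two hex digits after "prog-if ".
        is_usb && match($0, /prog-if [0-9a-f][0-9a-f]/) {
            code = substr($0, RSTART + 8, 2)
            if      (code == "00") kind = "UHCI (USB 1.1)"
            else if (code == "10") kind = "OHCI (USB 1.1)"
            else if (code == "20") kind = "EHCI (USB 2.0)"
            else                   kind = "unknown (prog-if " code ")"
            print addr, kind
        }
    '
}
```

Typical use would be lspci -v | usb_controller_types. If no line of the output mentions EHCI, the machine has no USB 2.0 controller that the kernel can see.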

Peak transfer rates of 2 megabytes per second and average transfer rates of 104,000 bytes per second (about 17 times faster than transferring data over a modem) were obtained with a USB 1.1 controller installed. At this average rate, backing up a typical 40 GB drive would take 4.4 days. Incremental backups, using commands like

   cp -pRuv /home /external/ 

would be faster. During the test, the system load alternated between 3 and 4, with occasional spikes to 7.
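
The backup-time estimate is simply the drive size divided by the average transfer rate. The sketch below reproduces the arithmetic; the function name backup_days is made up for this example, and it assumes that 40 GB means 40,000,000,000 bytes.

```shell
# Estimate how many days a full copy takes.
# Usage: backup_days SIZE_IN_BYTES RATE_IN_BYTES_PER_SECOND
# (hypothetical helper; 86400 is the number of seconds in a day)
backup_days() {
    awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.2f\n", bytes / rate / 86400 }'
}
```

backup_days 40000000000 104000 prints 4.45, in line with the figure quoted above for USB 1.1. The same calculation at 1,800,000 bytes per second gives 0.26 days, or a little over six hours.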

When an Adaptec AUA-4000A USB 2.0 PCI card was installed, dmesg showed that USB 2.0 was automatically activated:

drivers/usb/core/hcd.c: new USB bus registered, assigned bus number 1
ehci-hcd 02:09.2: USB 2.0 enabled, EHCI 0.95, driver 2002-Nov-29 

With the USB 2.0 card, average transfer rates of 1.8 MB/s were obtained. A cheap 5-port USB 2.0 card with an NEC chipset also worked without problems at first, but failed after two months of use.

Changing disks in Maxtor 5000DV External USB Hard Disk

It is possible to replace the 120GB HD with a larger disk. The hard disks inside the case are ordinary ATA disks. I substituted a 200GB Maxtor DiamondMax Plus 8 ATA/133. The larger disk was recognized when the following procedure was used:

  1. Boot up with the old 120GB disk. If you replace the disk first, it won't be recognized by Linux.
  2. Open the case by removing the two black clamps on either side. There are no screws on the plastic case.
  3. Unplug the disk's power supply and replace the disk, making sure the jumper is on J48. (Note: I also tweaked the jumper inside the Maxtor, just in case. It's not clear whether this is necessary).
  4. While the cover was off, I also drilled some ventilation holes in the top of the case. The sealed case causes the disk drive to get extremely hot, which causes premature drive failure.
  5. Plug the drive's power supply back in and reconnect the USB cable.
  6. Type fdisk /dev/sda and add a Linux partition. On the first attempt, it only made a 120GB partition. After power-cycling it, on the second attempt fdisk created a partition of the correct size.
  7. Reboot and check dmesg to make sure it's recognized as the correct size. It should say:
      hub 1-0:0: debounce: port 3: delay 100ms stable 4 status 0x501
      hub 1-0:0: new USB device on port 3, assigned address 4
      WARNING: USB Mass Storage data integrity not assured
      USB Mass Storage device found at 4
      SCSI device sda: cache data unavailable
      SCSI device sda: 398295040 512-byte hdwr sectors (203927 MB)
       sda: sda1 

  8. Format the partition with ext3 (for example, mke2fs -j /dev/sda1).
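
The size that dmesg reports in step 7 can be cross-checked from the sector count, since the kernel counts a megabyte as 1,000,000 bytes here. This is a small sketch; the function name sectors_to_mb is invented for this example.

```shell
# Convert a sector count to whole megabytes (1 MB = 1,000,000 bytes).
# Usage: sectors_to_mb SECTOR_COUNT [SECTOR_SIZE_IN_BYTES]
# (hypothetical helper; the sector size defaults to 512 bytes)
sectors_to_mb() {
    awk -v n="$1" -v size="${2:-512}" \
        'BEGIN { printf "%d\n", n * size / 1000000 }'
}
```

sectors_to_mb 398295040 prints 203927, matching the dmesg line above. A result near 120000 would mean the old disk geometry is still being reported.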

Fixing PSAUX mouse in 2.5.53 kernel

The mouse tends to go crazy in the later 2.5 kernels, selecting windows at random, and sending the following message to syslog:

   kernel: psmouse.c: Lost synchronization, throwing 3 bytes away. 

To fix this, edit drivers/input/mouse/psmouse.c and change

  if (psmouse->pktcnt && time_after(jiffies, psmouse->last + HZ/20)) 

to

  if (psmouse->pktcnt && time_after(jiffies, psmouse->last + 1*HZ))  

and rebuild the kernel.
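
Since HZ is the number of jiffies per second, the change lengthens the resynchronization timeout from 1/20 second (50 ms) to one full second. If you would rather not edit the file by hand, the substitution can be scripted. This is only a sketch: the function name patch_psmouse_timeout is made up, and it assumes the line appears in your source tree exactly as shown above.

```shell
# Replace the HZ/20 timeout with 1*HZ in the given source file.
# Usage: patch_psmouse_timeout PATH_TO_PSMOUSE_C
# (hypothetical helper; writes a temporary copy, then replaces the file)
patch_psmouse_timeout() {
    src=$1
    sed 's|psmouse->last + HZ/20|psmouse->last + 1*HZ|' "$src" > "$src.new" &&
        mv "$src.new" "$src"
}
```

Run it as patch_psmouse_timeout drivers/input/mouse/psmouse.c from the top of the kernel source, and keep a copy of the original file in case the pattern does not match your version.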

WARNINGS

  1. Be sure to use cp -R and not cp -r when copying directories. cp -r will lock up the entire system if it tries to copy a FIFO or files in /dev.
  2. Don't put an entry for the drive in /etc/fstab. If the drive is disconnected for some reason, the system may not boot up.
  3. Don't mount a USB drive over NFS. NFS adds additional load on the CPU, which may be enough to lock your system entirely under heavy use.
  4. It is necessary to reset the system clock after using the USB drive, because the kernel loses timer ticks during heavy transfers. We get the following message:
     carbon kernel: Losing too many ticks! 
     carbon kernel: Falling back to a sane timesource.

  5. The disk must be powered up before attaching the USB cable to the computer.
  6. Don't use any of the early 2.5 kernels (<2.5.20); they have serious bugs and are very dangerous. Some later 2.5 kernels (e.g., 2.5.70) also have problems with USB.
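
Warning 2 is easy to check for mechanically. The sketch below reports whether a device has an active (uncommented) entry in an fstab-style file; the function name fstab_lists_device is invented for this example.

```shell
# Succeed if DEVICE is the first field of any line in the fstab file.
# Usage: fstab_lists_device DEVICE [FSTAB_FILE]
# (hypothetical helper; commented-out lines do not count, because their
#  first field starts with "#")
fstab_lists_device() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/etc/fstab}"
}
```

For example: fstab_lists_device /dev/sda1 && echo "remove the /dev/sda1 entry from /etc/fstab".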

Firewire

For Firewire:

  1. Set the following:
         CONFIG_IEEE1394
         CONFIG_IEEE1394_PCILYNX
         CONFIG_IEEE1394_OHCI1394
         CONFIG_IEEE1394_SBP2
         CONFIG_IEEE1394_SBP2_PHYS_DMA
         CONFIG_IEEE1394_RAWIO
         CONFIG_I2C
         CONFIG_I2C_ALGOBIT 

  2. Set "enable onboard 1394 controller" in the BIOS.
  3. The ieee1394 documentation says that the sbp2 driver can only be built as a module. You must compile the kernel with module support and load the modules in the following order:
    1. ieee1394 (e.g. insmod ieee1394)
    2. ohci1394 (e.g. insmod ohci1394)
    3. sbp2 (e.g. insmod sbp2)
    However, "make modules" does not seem to work anymore in the 2.5 kernels, giving a large number of unresolved symbols in the modules. CONFIG_IEEE1394_PCILYNX, which is essential for using Maxtor external hard drives on the Sony PCVRX570, is not implemented in the 2.4 kernels (at least up to 2.4.20).
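
The load order in step 3 can be wrapped in a small function that stops at the first failure. This is a sketch only: the function name load_sbp2_stack is made up, and the optional argument, which replaces insmod with another command, is an addition for this example so that the sequence can be previewed without touching the kernel.

```shell
# Load the Firewire storage modules in dependency order.
# Usage: load_sbp2_stack [LOADER_COMMAND]
# (hypothetical helper; LOADER_COMMAND defaults to insmod)
load_sbp2_stack() {
    loader=${1:-insmod}
    for module in ieee1394 ohci1394 sbp2; do
        # $loader is deliberately unquoted so it may contain arguments.
        $loader "$module" || return 1
    done
}
```

load_sbp2_stack "echo insmod" only prints the three commands; load_sbp2_stack with no argument, run as root, actually loads the modules.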

To manually add/detect a new SBP-2 device:

 echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi

To manually remove an SBP-2 device after it has been unplugged:

 echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi

To check which SBP-2/SCSI devices are currently registered:

 cat /proc/scsi/scsi

To check whether SCSI disk support is currently enabled in your kernel:

 grep sd /proc/devices

The Firewire drive is supposed to be recognized as a SCSI drive (e.g., /dev/sda) at boot-up.

After scanning for new SCSI devices (above), you may access any attached SBP-2 storage devices as if they were SCSI devices (e.g. mount /dev/sda1, fdisk, mkfs, etc.).
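
The four numbers in the add-single-device and remove-single-device commands are the SCSI host, channel, ID, and LUN, in that order. The sketch below builds the command from those values. The function name scsi_single_device is invented for this example, and the optional last argument (a file to write to instead of /proc/scsi/scsi) is an addition that lets the command be inspected before it is used.

```shell
# Add or remove one SCSI device by writing to the SCSI proc interface.
# Usage: scsi_single_device add|remove HOST CHANNEL ID LUN [TARGET_FILE]
# (hypothetical helper; TARGET_FILE defaults to /proc/scsi/scsi)
scsi_single_device() {
    action=$1 host=$2 channel=$3 id=$4 lun=$5
    target=${6:-/proc/scsi/scsi}
    case $action in
        add|remove) ;;
        *) echo "usage: scsi_single_device add|remove HOST CHANNEL ID LUN" >&2
           return 2 ;;
    esac
    echo "scsi $action-single-device $host $channel $id $lun" > "$target"
}
```

scsi_single_device add 0 0 0 0 is equivalent to the first echo command above. If the drive shows up on a different host or ID in /proc/scsi/scsi, change the numbers to match.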

However, despite the above, and despite a lack of error messages at bootup, I was unable to get the Maxtor 5000DV to work as a Firewire device.

The Maxtor 5000DV drives do not seem to be very reliable. We purchased three of them, and two died with a "click of death" within a few months. The hard disk inside is an ordinary ATA disk, however, and is easy to replace. The cheap "USB 2.0 Hi-Speed" cards were also found to be unreliable.

SCSI Problems with external USB drives

On one of our external USB drives, after a few months of use we began to get SCSI errors such as the following:

kernel: SCSI error : <1 0 0 0> return code = 0x8000002
kernel: Current sda: sense = 70  0
kernel: Raw sense data:0x70 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
kernel: end_request: I/O error, dev sda, sector 10479175
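
To see whether the errors cluster in one region of the disk, the failing sectors can be pulled out of the log. This is a sketch that assumes each end_request message sits on a single line in the format shown above; the function name io_error_sectors is made up for this example.

```shell
# Read kernel log text on stdin and print "device sector" for each
# block-layer I/O error.
# (hypothetical helper; lines that do not match are ignored)
io_error_sectors() {
    sed -n 's|.*end_request: I/O error, dev \([^,]*\), sector \([0-9][0-9]*\).*|\1 \2|p'
}
```

For example, dmesg | io_error_sectors | sort -u lists each failing device and sector once.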

When the disk was formatted with Reiserfs, we also got a kernel oops, after which a reboot was needed to keep the kernel from hanging within a short time. This bug is still present in kernel 2.6.1:

carbon kernel: vs-13070: reiserfs_read_locked_inode: 
i/o failure occurred trying to find stat data of 
[288141 288142 0x0 SD]
... repeated ...
kernel: end_request: I/O error, dev sda, sector 11350263
kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000018
kernel:  printing eip:
kernel: c01b1010
kernel: *pde = 00000000
kernel: Oops: 0000 [#1]
... etc ...

You should reboot immediately when you get a kernel oops to prevent severe filesystem damage. To avoid these problems, format external drives with ext3 (mke2fs -jc /dev/sda1) rather than Reiserfs. In any event, mkreiserfs does not seem to work in the 2.6 kernels, giving the message:
 mkreiserfs:  Kernel 2.6.1 is running. 
 You should run either 2.4 or 2.2 to be able 
 to create reiserfs filesystem 

After we power-cycled the computer and disk drive, we found that the partition on the external disk drive had vanished. Re-running fdisk on the removable drive and formatting it with ext3 has fixed the problem so far, which suggests that the cause was software rather than bad hardware.

