Turn Off Error Recovery in RAID Drives: TLER, ERC, and CCTL

Hard disk drives encounter errors from time to time, so it’s a good thing that most have the ability to recover data anyway. But RAID systems usually have their own error recovery capabilities and can be thrown off when a hard disk pauses I/O. So it’s a good idea to use hard disk drives that allow you to disable or limit error recovery in RAID systems.


Error Recovery Basics

Hard disk drives have more points of failure than most other modern computer components: They are physical devices that rely on magnetism and mechanical precision, not just solid state electronics. And ever-increasing drive density magnifies the challenge of always returning valid data. In fact, magnetic disk media is surprisingly unreliable, with hard drives often relying on error recovery technologies to cover for read and write errors.

The most basic form of error recovery on hard disk drives is the error-correcting code (ECC) stored with each sector, which detects read errors and can correct small ones on the fly. When that is not enough, the disk drive can usually re-try the read, adjusting the heads slightly to recover the correct value. Once an error is detected and the correct data is recovered, the disk drive will either re-write the data in place or mark that spot as bad and re-map it to another physical location.

All this should happen very quickly, but the application must wait for it to complete. Under light load, this process is barely noticeable. But systems with heavy I/O can escalate this wait time to unacceptable levels. In busy systems, an error recovery can take many seconds or even minutes to complete.

RAID and Error Recovery

Multi-drive systems, including RAID and similar solutions, can’t tolerate long waits for error recovery. Most RAID controllers assume that a drive that hasn’t completed an I/O request within a few seconds has failed. The controller will then mark the entire disk drive as “offline” and attempt to rebuild using an available spare disk or simply take the entire RAID set offline to avoid data loss. This can prove problematic, since a RAID rebuild can take hours or days to complete!

It’s not the fault of the RAID system, either. There has to be some threshold where a disk is declared to have failed. It wouldn’t be practical (or even desirable) to escalate the I/O wait “up the stack” and pause all operations until a disk recovers (if ever). So most RAID solutions or controllers set a threshold of a few seconds.

The rule of thumb for RAID controllers is 8 seconds, though this can vary. Some controllers wait 10, 20, or 30 seconds, for example, and many let you configure the timeout. ZFS will generally wait as long as needed for error recovery, and this can dramatically impact performance.

Time-Limited Error Recovery

Disk drives intended for RAID use typically implement some form of time limiting for error recovery. Western Digital calls this Time Limited Error Recovery (TLER), while Seagate calls it Error Recovery Control (ERC) and Samsung and Hitachi call it Command Completion Time Limit (CCTL).

Regardless of what it’s called, the drive will limit the wait time on any error recovery command to a settable value, typically 7 seconds by default. The drive will usually report a failed I/O up the stack and attempt to re-try the error recovery at a later time. Meanwhile, the RAID controller will likely recover the data from parity or erasure code and continue operation.
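
The interaction between the drive's limit and the controller's timeout can be sketched as a small shell function. This is a simplified model, not a description of how any particular controller behaves: the name io_outcome and its three arguments (the time the drive would need, the drive's error recovery limit, and the controller timeout, all in tenths of a second) are assumptions made for illustration, and a limit of 0 stands for recovery with no time limit.

```shell
# io_outcome NEEDED ERC TIMEOUT  (all in tenths of a second; hypothetical model)
#   NEEDED  - time the drive would need to finish its own error recovery
#   ERC     - the drive's error recovery limit, 0 meaning "no limit"
#   TIMEOUT - how long the RAID controller waits before declaring failure
io_outcome() {
    needed=$1 erc=$2 timeout=$3
    elapsed=$needed
    gave_up=no
    # With a limit set, the drive stops trying once the limit is reached.
    if [ "$erc" -gt 0 ] && [ "$needed" -gt "$erc" ]; then
        elapsed=$erc
        gave_up=yes
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
        echo "drive-dropped"        # controller timed out before the drive answered
    elif [ "$gave_up" = yes ]; then
        echo "error-reported"       # RAID layer can rebuild the block from redundancy
    else
        echo "recovered-by-drive"   # drive fixed the error within the allowed time
    fi
}

io_outcome 20 70 80    # quick recovery
io_outcome 300 70 80   # slow recovery, 7.0 second limit
io_outcome 300 0 80    # slow recovery, no limit
```

With these example numbers the three calls print recovered-by-drive, error-reported, and drive-dropped in turn. Only the last case costs the array a member.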

ZFS, and other software RAID systems, will typically “react” the same way when TLER is enabled, recovering data and remapping that block.

Note that most desktop hard disk drives do not have this capability. Error recovery is always turned on and recovery will take as long as necessary. This is one reason that conventional desktop disk drives are not appropriate for use in RAID solutions.

Checking and Setting TLER

If a hard drive is to be used in a RAID or similar setup, it is desirable to have TLER or ERC enabled and set to a value under 8 seconds.

Most UNIX-like systems have the “smartmontools” package, which includes the smartctl command. This can be used to query TLER and similar settings. For example, here is the result of that command in FreeNAS (FreeBSD) for a Western Digital Red NAS drive:

# smartctl -l scterc /dev/da2

smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
 Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
 Read: 70 (7.0 seconds)
 Write: 70 (7.0 seconds)
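
When checking many drives it can help to pull just the two numbers out of that report. The following is a minimal sketch that assumes the output layout shown above. parse_scterc is a hypothetical helper name, not part of smartmontools, and a drive that does not support the feature will produce different text, which this function simply treats as a failure.

```shell
# parse_scterc: read `smartctl -l scterc` output on stdin and print the
# read/write limits in tenths of a second (hypothetical helper).
parse_scterc() {
    awk '
        $1 == "Read:"  { r = $2 }
        $1 == "Write:" { w = $2 }
        END {
            if (r == "" || w == "") exit 1   # expected lines not found
            print "read=" r " write=" w
        }
    '
}

# Sample input copied from the report above; on a real system you would
# pipe in the output of: smartctl -l scterc /dev/da2
parse_scterc <<'EOF'
SCT Error Recovery Control:
 Read: 70 (7.0 seconds)
 Write: 70 (7.0 seconds)
EOF
```

For the sample report this prints read=70 write=70, matching the 7.0 seconds shown in parentheses.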

This tool can also set TLER on a drive. The two values are the read and write limits in tenths of a second, so 70 means 7.0 seconds:

# smartctl -l scterc,70,70 /dev/da2
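
An array has several member drives, and limits set this way are commonly reported to be lost when a drive is power cycled, so the command is often repeated for every member at boot. The loop below is one hedged way to do that. apply_erc is a hypothetical helper, the device names in the usage lines are placeholders, and the SMARTCTL override is an assumption added only so the loop can be tried as a dry run without touching real disks.

```shell
# apply_erc LIMIT DEVICE...: set the same read and write limit, in tenths
# of a second, on each listed drive (hypothetical helper).
# SMARTCTL may name a replacement command, e.g. SMARTCTL=echo for a dry run.
apply_erc() {
    limit=$1
    shift
    for dev in "$@"; do
        "${SMARTCTL:-smartctl}" -l "scterc,${limit},${limit}" "$dev" ||
            echo "could not set error recovery limit on $dev" >&2
    done
}

# Dry run: print the arguments that would be passed instead of running smartctl.
SMARTCTL=echo
apply_erc 70 /dev/da0 /dev/da1
```

Unsetting SMARTCTL makes the same call run the real smartctl, which normally requires root privileges.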

Western Digital provides a DOS utility, WDTLER.EXE, with similar functionality.

Stephen’s Stance

One reason to use enterprise or NAS hard disk drives is the capability to limit error recovery for smoother performance. I strongly recommend only using such drives with RAID systems, especially ZFS (as in FreeNAS)!


Comments

Howard Marks says

May 30, 2017 at 7:56 pm

I’d recommend 2-3 seconds. If the drive hasn’t recovered the data in 2 seconds I want the RAID or related data protection to take over and not cause a 5-8sec stutter.
Dan Bilzerian says

June 4, 2017 at 6:47 am

I agree with you.
Truman HW says

May 25, 2018 at 7:31 pm

What about making it a function of the [real] MTBF times the number of drives?

Isn’t 3 seconds quick for something that’s a rare event?
TimC says

May 29, 2019 at 2:40 pm

A short TLER/ERC limit may cause the RAID controller to drop a drive and label it faulty even though data recovery may be possible. A major problem occurs if drive 1 is dropped, degrading the array, and then a 2nd drive is dropped (for RAID5, or 3 drives for RAID6) before the data has been copied to the hot-spare(s). The act of a full array rebuild can cause other drives from the same batch to fail close to each other. RAID is for convenience and some protection but you still need a 2nd/3rd device/tape/location for BACKUP.