
Stephen Foskett, Pack Rat

Understanding the accumulation of data


Turn Off Error Recovery in RAID Drives: TLER, ERC, and CCTL

May 30, 2017 By Stephen

Hard disk drives encounter errors from time to time, so it’s a good thing that most have the ability to recover data anyway. But RAID systems usually have their own error recovery capabilities and can be thrown off when a hard disk pauses I/O. So it’s a good idea to use hard disk drives that allow you to disable or limit error recovery in RAID systems.


Error Recovery Basics

Hard disk drives have more points of failure than most other modern computer components: They are physical devices that rely on magnetism and mechanical precision, not just solid state electronics. And ever-increasing drive density magnifies the challenge of always returning valid data. In fact, magnetic disk media is surprisingly unreliable, with hard drives often relying on error recovery technologies to cover for read and write errors.

The most basic form of error recovery on hard disk drives relies on a check code stored with each sector: a CRC such as CRC32C or, more commonly in modern drives, a stronger error-correcting code. This reliably uncovers read errors. In most cases, the drive can retry a read, adjusting the heads slightly to recover the correct value. Once an error is detected and the correct data is recovered, the drive will either rewrite the data in place or mark that spot as bad and remap it to another physical location.
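As a rough illustration of how a check code uncovers a bad read, here is a minimal sketch. It uses the plain CRC-32 from Python's zlib module as a stand-in (not CRC32C, and not whatever a given drive's firmware really implements), and the function names are hypothetical.

```python
import zlib


def store_sector(data):
    """Return the sector data together with its CRC-32 check value."""
    return data, zlib.crc32(data)


def read_is_valid(data, stored_crc):
    """Recompute the check value on read and compare it to the stored one."""
    return zlib.crc32(data) == stored_crc


# A 512-byte "sector" and its check value, as written to the platter.
sector, crc = store_sector(b"x" * 512)

# Simulate a media error: one bit flips somewhere in the sector.
corrupted = bytearray(sector)
corrupted[100] ^= 0x01

print(read_is_valid(sector, crc))            # clean read
print(read_is_valid(bytes(corrupted), crc))  # damaged read
```

A mismatch only says that the read was bad. Getting the right data back still takes a retry or a stronger error-correcting code, which is where the recovery time comes from.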

All this should happen very quickly, but the application must wait for it to complete. Under light load, this process is barely noticeable. But systems with heavy I/O can escalate this wait time to unacceptable levels. In busy systems, an error recovery can take many seconds or even minutes to complete.

RAID and Error Recovery

Multi-drive systems, including RAID and similar solutions, can’t tolerate long waits for error recovery. Most RAID controllers assume that a drive that hasn’t completed an I/O request within a few seconds has failed. The controller will then mark the entire disk drive as “offline” and attempt to rebuild using an available spare disk or simply take the entire RAID set offline to avoid data loss. This can prove problematic, since a RAID rebuild can take hours or days to complete!

It’s not the fault of the RAID system, either. There has to be some threshold where a disk is declared to have failed. It wouldn’t be practical (or even desirable) to escalate the I/O wait “up the stack” and pause all operations until a disk recovers (if ever). So most RAID solutions or controllers set a threshold of a few seconds.

The rule of thumb for RAID controllers is 8 seconds, though this can vary. Some controllers wait for 10, 20, or 30 seconds, for example, and this can be configured on many. ZFS will generally wait as long as needed for error recovery, and this can dramatically impact performance.
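The threshold logic can be sketched in a few lines. This is a toy model, not any real controller's firmware: the function name is hypothetical and the 8-second default is just the rule of thumb quoted above.

```python
def controller_action(recovery_seconds, controller_timeout=8.0):
    """Decide what a (simplified) RAID controller does with a slow drive.

    recovery_seconds: how long the drive took to answer the I/O request.
    controller_timeout: how long the controller waits before giving up.
    """
    if recovery_seconds < controller_timeout:
        return "keep drive online"
    return "mark drive offline"


# A drive that gives up after 7 seconds stays in the array.
print(controller_action(7.0))

# A desktop drive that retries for 45 seconds gets dropped.
print(controller_action(45.0))

# The same slow drive survives on a controller configured for 60 seconds.
print(controller_action(45.0, controller_timeout=60.0))
```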

Time-Limited Error Recovery

Disk drives intended for RAID use typically implement some form of time limiting for error recovery. Western Digital calls this Time Limited Error Recovery (TLER), while Seagate calls it Error Recovery Control (ERC) and Samsung and Hitachi call it Command Completion Time Limit (CCTL).

Regardless of what it’s called, the drive will limit the wait time on any error recovery command to a settable value, typically 7 seconds by default. The drive will usually report a failed I/O up the stack and attempt to re-try the error recovery at a later time. Meanwhile, the RAID controller will likely recover the data from parity or erasure code and continue operation.
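Here is a minimal sketch of that sequence, under simplifying assumptions: time is simulated rather than measured, every retry is assumed to cost half a second, and the array is a single-parity XOR stripe. All names are hypothetical and none of this reflects actual drive or controller firmware.

```python
def read_with_time_limit(attempt_read, limit_seconds=7.0, retry_cost_seconds=0.5):
    """Retry attempt_read() until it succeeds or the time budget runs out.

    Returns (data, elapsed_seconds); data is None if the limit was hit,
    which is the drive "reporting a failed I/O up the stack".
    """
    elapsed = 0.0
    while elapsed + retry_cost_seconds <= limit_seconds:
        elapsed += retry_cost_seconds
        data = attempt_read()
        if data is not None:
            return data, elapsed
    return None, elapsed


def reconstruct_from_parity(surviving_blocks, parity):
    """Rebuild the missing block by XORing parity with the surviving blocks."""
    out = bytearray(parity)
    for block in surviving_blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Two data blocks and their XOR parity, as in a tiny single-parity stripe.
d0, d1 = b"\x01\x02", b"\x10\x20"
parity = bytes(a ^ b for a, b in zip(d0, d1))

# The drive holding d0 never manages to read it back.
data, elapsed = read_with_time_limit(lambda: None)
if data is None:
    data = reconstruct_from_parity([d1], parity)

print(elapsed, data == d0)
```

The point of the time limit is the first function: the drive gives up in bounded time, so the array gets a chance to run the second one instead of declaring the whole drive dead.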

ZFS, and other software RAID systems, will typically “react” the same way when TLER is enabled, recovering data and remapping that block.

Note that most desktop hard disk drives do not have this capability. Error recovery is always turned on and recovery will take as long as necessary. This is one reason that conventional desktop disk drives are not appropriate for use in RAID solutions.

Checking and Setting TLER

If a hard drive is to be used in a RAID or similar setup, it is desirable to have TLER or ERC enabled and set to a value under 8 seconds.

Most UNIX-like systems offer the “smartmontools” package, which includes the smartctl command. This can be used to query TLER and similar settings. For example, here is the result of that command in FreeNAS (FreeBSD) for a Western Digital Red NAS drive:

# smartctl -l scterc /dev/da2

smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
 Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
 Read: 70 (7.0 seconds)
 Write: 70 (7.0 seconds)
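If you want to check many drives, this output is easy to parse. The sketch below works on the text shown above. The function names are my own, and the handling of a "Disabled" value is an assumption about how smartctl reports a drive with the limit turned off.

```python
import re

SAMPLE = """SCT Error Recovery Control:
 Read: 70 (7.0 seconds)
 Write: 70 (7.0 seconds)
"""


def parse_scterc(output):
    """Return {'read': seconds, 'write': seconds} from smartctl output.

    The raw numbers are tenths of a second. A value of None means the
    line said "Disabled" (assumed wording for a drive with no limit set).
    """
    result = {}
    for line in output.splitlines():
        match = re.match(r"\s*(Read|Write):\s+(\d+|Disabled)", line)
        if match:
            key = match.group(1).lower()
            raw = match.group(2)
            result[key] = None if raw == "Disabled" else int(raw) / 10
    return result


def is_raid_friendly(settings, limit_seconds=8.0):
    """True if both read and write limits are set and under the threshold."""
    values = [settings.get("read"), settings.get("write")]
    return all(v is not None and v < limit_seconds for v in values)


settings = parse_scterc(SAMPLE)
print(settings, is_raid_friendly(settings))
```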

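This tool can also set TLER on a drive. The two values are the read and write limits in tenths of a second, so the following sets both to 7.0 seconds, under the 8-second rule of thumb:

# smartctl -l scterc,70,70 /dev/da2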

Western Digital provides a DOS utility, WDTLER.EXE, with similar functionality.
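Because smartctl takes the limits in tenths of a second, it is easy to get the units wrong when scripting this. A small hypothetical helper like the one below builds the argument list from values in seconds. It only constructs the command and does not run it.

```python
def scterc_set_command(device, read_seconds, write_seconds):
    """Build the smartctl argument list that sets read/write ERC limits.

    smartctl expects the limits in tenths of a second, so 7.0 seconds
    becomes 70. Nothing is executed here.
    """
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    return ["smartctl", "-l", "scterc,%d,%d" % (read_ds, write_ds), device]


command = scterc_set_command("/dev/da2", 7.0, 7.0)
print(" ".join(command))
```

The returned list could be handed to subprocess.run() on a system where smartctl is installed and you have root access.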

Stephen’s Stance

One reason to use enterprise or NAS hard disk drives is the capability to limit error recovery for smoother performance. I strongly recommend only using such drives with RAID systems, especially ZFS (as in FreeNAS)!

