Avoid costly premature SSD drive failure using SMART evaluation tools



There is a wide variety of SSDs on the market today, and choice is seen as a good thing, increasing competition and driving prices down. But how do you evaluate the different SSD options and determine which drive is best for your application? How do you become confident that the SSD deployed in your product is going to last? And how do you avoid the pitfalls of early SSD failure: costly product returns, recalls, retrofitting of alternative SSDs and potential loss of reputation? By using SMART evaluation tools, you can:

• Accurately estimate the life of your SSD
• Effectively evaluate SSD products and choose the right one for your application
• Implement predictive failure monitoring
• Avoid costly implications from premature SSD card failure

Why do solid state drives wear out?

Flash memory devices ‘Read’ and ‘Write’ in pages. A ‘Read’ is relatively straightforward: a read command is issued with an address and the corresponding data is returned. A ‘Write’ can only occur to pages that have been erased or marked for erasing. Therefore, host write commands invoke flash erase cycles before data is written to the flash. This write/erase cycle causes cell wear, which results in a limited write life.

NAND flash memory is susceptible to wear from the repeated program and erase cycles that are common in data storage applications and in systems using a Flash Translation Layer (FTL). Constantly programming and erasing the same memory location eventually wears that block of memory out, and the flash controller retires it. The retired block is replaced with a spare block, until all the spare blocks are used. The end result is that the SSD has a limited lifetime.

The erase process involves applying a relatively high voltage to the flash cell. This degrades the insulating oxide layer of the cell by a minute amount. The degraded layer becomes more permeable to charge, so the cell finds it harder to hold onto charge. A cell which previously held a ‘0’ or ‘1’ can no longer hold the charge properly, and the controller may not correctly determine whether the state of the cell is ‘0’ or ‘1’. Some of these errors can be corrected using error correction code (ECC), but ECC is overwhelmed once the number of errors exceeds what it can correct.

consult. design. integrate.


How SMART tools can help

S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology; often written as SMART) is a monitoring system included in solid-state drives (SSDs) that detects and reports on various indicators of drive reliability, with the intent of enabling anticipation of hardware failures. When S.M.A.R.T. data indicates a possible imminent drive failure, software running on the host system may notify the user. Stored data can then be copied to another storage device, preventing data loss, and the failing drive can be replaced.
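
The host-side check described above can be sketched as follows. This is a minimal illustration, assuming the SMART attributes have already been read from the drive and that each one reports a normalised value that falls towards a vendor-defined threshold as the drive degrades. The attribute names, values and the `margin` parameter are hypothetical; real attribute sets differ between drive vendors.

```python
from dataclasses import dataclass


@dataclass
class SmartReading:
    name: str       # attribute name (hypothetical; vendor specific in practice)
    value: int      # normalised value, assumed to fall as the drive degrades
    threshold: int  # vendor-defined failure threshold for this attribute


def failing_attributes(readings, margin=0):
    """Return the names of attributes at or below threshold + margin.

    A positive margin raises the warning before the drive itself would
    report the attribute as failed, which leaves time to copy data off.
    """
    return [r.name for r in readings if r.value <= r.threshold + margin]


# Illustrative readings only, not taken from a real drive.
readings = [
    SmartReading("spare_blocks_remaining", value=92, threshold=10),
    SmartReading("wear_levelling_count", value=12, threshold=10),
]

for name in failing_attributes(readings, margin=5):
    print(f"Warning: {name} is close to its failure threshold")
```

Choosing the margin is a trade-off: a larger margin gives earlier warning but may flag drives that still have useful life left.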

How do SMART tools predict how long your SSD will last?

By using an SSD in an application for a specific length of time (e.g. 50-100 days) and then using SMART tools to extract data about the write/erase cycles, it is possible to calculate the percentage of the drive that has worn out. Using this data we can understand wear and estimate how long the drive will last. If the drive is being used in a production system, with representative data from the customer’s specific use case, this can provide a useful indication of the expected lifetime of the drive.

SMART tools, when used in this way, still have limitations. For example, they may not take into account accelerated wear caused by conditions such as use at high operating temperatures. Acal BFi can improve the accuracy of SMART tool predictions and minimise the impact of situations like this.
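
The estimate described above can be sketched as a simple linear extrapolation. This is an illustration only: it assumes wear accumulates at a constant rate, that the drive reports an average erase count, and that the flash has a known rated number of program/erase cycles. The function names and the figures in the example are hypothetical.

```python
def wear_fraction(avg_erase_count, rated_pe_cycles):
    """Fraction of the rated program/erase cycles already consumed."""
    if rated_pe_cycles <= 0:
        raise ValueError("rated_pe_cycles must be positive")
    return avg_erase_count / rated_pe_cycles


def estimate_total_lifetime_days(days_observed, fraction_worn):
    """Linear extrapolation: total days until the rated cycles are used up."""
    if fraction_worn <= 0:
        raise ValueError("no measurable wear yet; observe for longer")
    return days_observed / fraction_worn


# Hypothetical example: after 100 days the average erase count is 60
# on flash rated for 3000 program/erase cycles, i.e. 2% worn.
worn = wear_fraction(avg_erase_count=60, rated_pe_cycles=3000)
total_days = estimate_total_lifetime_days(days_observed=100, fraction_worn=worn)
remaining_days = total_days - 100

print(f"Worn: {worn:.1%}, estimated total life: {total_days / 365:.1f} years")
```

Because the extrapolation is linear, it inherits the limitations noted above: anything that accelerates wear later in the drive's life, such as higher operating temperatures, will shorten the real lifetime relative to this figure.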


How can Acal BFi help customers to select an appropriate drive for their application?

Acal BFi can support customers through the selection process using our strong relationships with suppliers, such as Delkin Devices. Our suppliers focus entirely on the embedded market.

Typical features being offered:

• Fixed Bill of Materials and fixed firmware per part number, with no batch-to-batch product variation
• End of life process with last time buy opportunity
• Wide range of SSD form factors and support to decide which form factor is suitable for your specific application
• Devices manufactured by Delkin in the USA and Swissbit in Germany
• Conformal coating options (some form factors)
• Customisation options (labelling, programming, configuration, others)
• Low minimum order quantities
• Local, good quality support: talk to people who understand SSD design
• SMART tools available to determine expected SSD lifetime and to implement predictive failure monitoring
• Extended lifecycle products (some form factors) for customers who have high qualification costs and need longer term availability
• Support for applications where products are required to tolerate enhanced shock and vibration

Custom services from Acal BFi:

• Supply of custom formatted and/or programmed SSDs
• Custom labelling
• Conformal coated units
• Units tested to specific shock and vibration environments
• Devices specifically designed to withstand environments with corrosive chemical vapours

Typical applications being supported:

• Demanding industrial applications
• Transportation e.g. rail, automotive (aftersales), military vehicles, public transportation
• Outdoor equipment e.g. pollution and environmental monitoring equipment
• Point of sales equipment and ticketing sales
• Power generation e.g. wind turbines in difficult to access environments
• Healthcare e.g. diagnostic equipment, ophthalmology, assisted living
• Portable equipment e.g. warehouse picking equipment
• Building access control e.g. locally recording building entry with fobs and photos
• Military and Aerospace e.g. recording video and instrument data in armoured vehicles


The issues of theoretically predicting the lifetime of an SSD

An SSD’s lifetime can be predicted with mathematical models. However, certain factors make the actual usage difficult to determine and the resulting estimates inaccurate. For example, operating system write requests, temporary files and unpredictable background processes can vastly increase the number of erase cycles. In addition, most controllers write data in fixed-size pages and erase it in larger blocks, while the data being written rarely arrives in page-sized chunks, so the available write/erase cycles are used up faster than a simple calculation would indicate. This effect is quantified by the write amplification factor: the amount of data actually written to the flash divided by the amount of data the host asked to write. The write amplification factor is particularly difficult to predict and can massively reduce the life expectancy of the SSD. Because of these variables, a lifetime estimate from a mathematical model is not accurate and should not be relied upon for critical applications.
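
The effect of write amplification on a theoretical estimate can be illustrated with a short sketch. It assumes the simplest possible case, in which every host write is stored on its own in whole pages, and it uses the common first-order approximation that the host data a drive can absorb is its capacity multiplied by the rated program/erase cycles and divided by the write amplification factor. The page size, capacity and cycle rating below are hypothetical, and real controllers behave in more complex ways.

```python
import math


def write_amplification(host_write_bytes, page_bytes):
    """Write amplification for one isolated host write stored in whole pages."""
    pages = math.ceil(host_write_bytes / page_bytes)
    return pages * page_bytes / host_write_bytes


def host_bytes_before_wearout(capacity_bytes, rated_pe_cycles, waf):
    """First-order estimate of the host data a drive can absorb before wear-out."""
    return capacity_bytes * rated_pe_cycles / waf


# Hypothetical drive: 32 GB capacity, 3000 rated cycles, 4096-byte pages.
capacity = 32 * 10**9
for host_write in (4096, 512):
    waf = write_amplification(host_write, page_bytes=4096)
    endurance_tb = host_bytes_before_wearout(capacity, 3000, waf) / 10**12
    print(f"{host_write}-byte writes: WAF {waf:.1f}, about {endurance_tb:.0f} TB of host writes")
```

In this simplified model, an application that writes 512 bytes at a time wears the drive eight times faster than one that writes full 4096-byte pages, even though the host sees the same volume of data. This is one reason why measured wear from a representative workload is more trustworthy than a calculation.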

Summary

Choosing a suitable SSD for an industrial embedded application should not be done as an afterthought. The same care and consideration that is applied to other critical components should be taken when selecting an SSD. Not all SSDs are created equal. Experience of storage types and typical storage failures is an advantage when selecting drives that will not fail within the useful lifetime of the products they are installed in. Acal BFi and its suppliers have proven experience in SSD selection for demanding applications and can help to take some of the guesswork out of this selection process.
