Identifying the Key Features of Enterprise-Class SSDs
By Scott Harlin and Grant Van Patten  |  May 18, 2016

The Host Managed SSD Solution
Adding user controls to SSDs by means of APIs to enable system-level management is the basis of HMS technology. By coordinating the background tasks of every SSD in the pool, the entire population shares the burden of a given workload and works collaboratively with the host to improve overall system performance and efficiency. This system-level view of the internal state of pooled SSDs is the opposite of the single-SSD HMS concept, where no system-level knowledge is available. The primary use case for OCZ HMS technology is therefore to obtain consistent and predictable latency at low over-provisioning (OP). For many users, consistency is the critical parameter: feedback from large hyperscale customers indicates that a 1 ms average latency is sufficient, but a spike to, say, 6 ms would be unacceptable. In enterprise SSDs, this consistency is typically achieved by raising OP to 28%, which also increases the average cost of the drive. The main objective of OCZ HMS technology is thus to bring entry-level enterprise SSDs to the required HMS working point.

Saber 1000 HMS Overview
The main concept of the Saber 1000 HMS solution is to enable storage system vendors to optimize pools of SSDs with an API algorithm of their choosing. The API primitives include:

  - Enabling/disabling HMS functionality
  - Starting/stopping background operations
  - Getting background operation status
  - Starting/stopping/forcing metadata log dumps
  - Getting geometry/endurance/free block/device status
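
To make the discussion concrete, the sketch below shows how these primitives might surface to management-layer code. The class and method names are illustrative assumptions rather than OCZ's actual bindings (those ship with the HMS Software API Library described at the end of this article); the later sketches in this article build on this interface.

class HmsDevice:
    """One pooled Saber 1000 SSD as seen by the management layer (hypothetical)."""

    def __init__(self, path):
        self.path = path  # block-device node, e.g. "/dev/sdb" (illustrative)

    # Control primitives
    def enable_hms(self): ...           # Enable HMS functionality
    def disable_hms(self): ...          # Disable HMS functionality
    def disable_gc(self): ...           # Stop background garbage collection
    def enable_gc(self): ...            # Resume background garbage collection
    def force_log_dump(self): ...       # Force a metadata log dump

    # Query primitives
    def background_op_status(self): ... # Get background operation status
    def get_free_pages(self): ...       # Get free page/block count
    def get_erase_count(self): ...      # Erase count of maximal/median block
    def get_geometry(self): ...         # Read device geometry
    def get_endurance(self): ...        # Read device endurance information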

By controlling such flash functions as scheduled garbage collection, log dumps and other system-level capabilities via the management layer, better aggregate efficiency is achieved, translating into improved system performance, increased endurance and lower energy usage. In the future, additional points of control could be added for other background operations, such as dynamic over-provisioning (OP), dynamic power control or multi-streaming.

In the event of a sudden power loss or failure, the Saber 1000 supports Power Failure Management (PFM), so data written to the drive still resides there when power is restored. The data is kept intact for IT management to continue using, and for many hyperscale, web-hosting and distributed computing environments, PFM is all that is required.

Saber 1000 HMS Technical Requirements
This section addresses the HMS technical requirements that affect Saber 1000 SSD performance and latency, and includes techniques for applying HMS technology to mitigate background garbage collection, inconsistent I/O responses, log dumps, write cliffs, etc. A quick review of these technical requirements follows:

Overcoming Background Processing Impact on Latency
One of the main goals of HMS technology is to prevent background operations from affecting performance consistency. The HMS Reference Design assures consistent performance in the aggregated SSD pool using the following APIs:

Primitive       Description
Get Free Pages  Retrieves the number of free pages currently available for
                programming (i.e., data that can be written to the SSD)
Disable GC      Stop garbage collection
Enable GC       Resume normal background garbage collection operation

The management layer aggregates the storage devices into a pool and provides volumes over it. The mapping between the volumes and the Saber SSDs is dynamic, such that each volume can span the entire pool of SSDs and its mapping can change over time. This is a fundamental aspect of the LUN management structure: each arriving write command can be forwarded to any device, balancing the workload across the device pool according to any policy. Thus, for each arriving write command the management layer can apply this rule-base (sketched below):

  1. Get the free pages available for each Saber 1000 SSD
  2. Select the Saber 1000 SSD(s) with maximum free pages (or enough free pages) and write the data
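
A minimal sketch of this rule-base, using the hypothetical HmsDevice interface above; write() stands in for the normal block-level write path, which is not itself an HMS primitive:

def place_write(pool, data):
    """Steps 1-2: query free pages, then write to the drive with the most."""
    target = max(pool, key=lambda dev: dev.get_free_pages())
    target.write(data)  # write() is the regular block I/O path, not an HMS call
    return target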

The above rule-base eliminates the chance of a write stalling due to a shortage of free blocks. Taking a more active approach, garbage collection can be stopped on some drives, so that reads and writes can be served from those drives with minimal garbage-collection latency impact. The HMS rule-base can therefore be modified to overcome the background-processing impact on latency as follows (a sketch of this policy follows the list):

  1. Get the free pages available for each Saber 1000 SSD
  2. Select the Saber 1000 SSD(s) with maximum free pages (or enough free pages) and stop each respective garbage collection operation
  3. Force garbage collection on the other Saber 1000 SSDs
  4. Direct every incoming write command to the stopped-GC SSD(s)
  5. Fetch every read command from a stopped-GC SSD
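
A sketch of this rotation policy, again using the hypothetical HmsDevice interface. The active_fraction knob is an assumed policy parameter, and "forcing" garbage collection is approximated here by re-enabling it on the resting drives:

def rotate_gc(pool, active_fraction=0.5):
    """Split the pool: drives with the most free pages serve I/O, the rest collect."""
    ranked = sorted(pool, key=lambda dev: dev.get_free_pages(), reverse=True)
    cutoff = max(1, int(len(ranked) * active_fraction))
    active, collecting = ranked[:cutoff], ranked[cutoff:]
    for dev in active:
        dev.disable_gc()   # step 2: no GC on the drives that will serve I/O
    for dev in collecting:
        dev.enable_gc()    # step 3: let the remaining drives reclaim blocks
    return active, collecting

Steps 4 and 5 then amount to routing all incoming reads and writes to the active set until the next rotation.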

Overcoming Endurance Impact on Latency
Flash media is limited in the number of erase cycles the memory supports before degradation, a limit determined by the density of the flash itself. As deployed SSDs are subject to malfunctions, errors and failures, this translates to failed commands and application failures. Therefore, each SSD controller applies a wear-leveling mechanism whereby blocks are erased evenly to keep the deviation in erase counts as small as possible. By leveling erase counts across all blocks, the drive's write potential is maximized. The following APIs support this implementation:

Primitive              Description
Get Erase Count        Get the erase count of the maximal/median block
Read Device Geometry   Retrieve information about the device geometry
Read Device Endurance  Retrieve device endurance information
Read Debug Statistics  Retrieve debug statistics from the device

Therefore, the HMS rule-base can be modified to overcome the endurance impact on latency as follows (a sketch follows the list):

  1. Get the free pages available for each Saber 1000 SSD
  2. Select the Saber 1000 SSD(s) with the minimal erase count and maximum free pages (or enough free pages) and direct write commands to it
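
A sketch of this endurance-aware placement; min_free is an assumed threshold, and ranking by erase count with free pages as a tie-breaker is one possible policy, not a prescribed one:

def place_write_wear_aware(pool, data, min_free=1024):
    """Write to the lowest-wear drive that still has enough free pages."""
    candidates = [dev for dev in pool if dev.get_free_pages() >= min_free]
    if not candidates:
        candidates = pool  # every drive is low on space: fall back to the whole pool
    target = min(candidates,
                 key=lambda dev: (dev.get_erase_count(), -dev.get_free_pages()))
    target.write(data)  # write() again stands in for the normal I/O path
    return target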

Overcoming Latency Variations Due to Metadata Updates
On some SSDs, metadata is written frequently, and this operation impacts latency. In the Saber 1000 Series, for example, this operation occurs about every 14 seconds and its effect is clearly evident: during the metadata write, SSD read/write performance is impacted, and any operation that coincides with this log dump will experience a latency spike. The following API can be used to control the log dump operation:

Primitive       Description
Force Log Dump  Trigger a log dump operation

System software can use this API to schedule a log dump while preventing any access to the Saber 1000 SSD until the operation has completed.
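
A sketch of such scheduling. quiesce() and resume() are hypothetical stand-ins for however the management layer fences a device from host I/O, and the "idle" status string is an assumed return value of the background-operation status query:

import time

def scheduled_log_dump(dev, poll_interval=0.01):
    """Force a metadata log dump while the SSD is fenced from host I/O."""
    dev.quiesce()  # hypothetical: stop routing new I/O to this SSD
    try:
        dev.force_log_dump()
        while dev.background_op_status() != "idle":  # "idle" is assumed
            time.sleep(poll_interval)
    finally:
        dev.resume()  # hypothetical: re-admit the SSD to the pool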

Summary
Hyperscale, Cloud and Big Data datacenter requirements have diverged from traditional server or SAN/NAS storage to a new usage model that adds controls to SSDs. System-level information on the internal state of SSDs gives IT managers the ability to direct I/O accordingly to improve overall performance, endurance or efficiency. In the classic sense of the whole being greater than the sum of its parts, the population of deployed SSDs can share the burden of critical requirements and, by working collaboratively with the host, deliver high value and greater functionality.

The primary use case for this initial HMS offering is obtaining consistent and predictable latency at low over-provisioning; reaching the highest IOPS performance is less important than achieving consistent latency of I/O responses. HMS will become a requirement for SSD vendors, enabling IT managers to control and manage these costs and to maximize SSD pool performance. The Saber 1000 HMS solution will enable Toshiba/OCZ to gain early mindshare with hyperscale and large datacenter customers, working with them to embed HMS API-enabled SSDs into their datacenter management stacks (or HMS reference designs), and to serve enterprise customers who want to control their own software management layer.

Three core development tools are available with the Saber 1000 HMS solution:

  1. HMS Reference Design (foundation from which a custom solution could be built)
  2. HMS Programmer’s Guide (instruction set to integrate HMS APIs into software stacks)
  3. HMS Software API Library (adds source code-level freedom to achieve custom solutions)

About the authors
Scott Harlin is Marketing Communications Director at OCZ
Grant Van Patten is Product Manager at OCZ
