VMAX FASTVP Best Practice Essentials

I spend a lot of time talking to VMAX customers about FASTVP deployment and administration best practices. FASTVP has lots of 'nerd knobs' -- and the mere presence of these optional parameters can sometimes cause storage admins to overthink things, leading to a needlessly complex deployment. But the truth is, keeping things simple and following a few basic best practices will generally result in an array that is both easier to manage and more efficient.

There are several FASTVP whitepapers available that provide information on FASTVP architecture, deployment, and best practices -- for example, this whitepaper at support.emc.com. John Adams also has a great general VMAX performance presentation here -- he presents this at EMC World regularly, so be sure to catch his session if you'll be at EMC World this year. With this post, I intend to condense these best practices down to a few simple, easily consumable recommendations. These recommendations generally assume a typical FASTVP config with three tiers -- the EFD/SSD ultra performance tier, the 10k or 15k RPM performance tier, and the capacity-oriented 7.2k RPM tier. For the sake of simplicity (which will be a theme throughout this post), I'll refer to the ultra performance tier as "EFD", the 10k/15k tier as "FC", and the 7.2k tier as "SATA."

On to the recommendations!

Recommendation #1 -- Start with a solid foundation: Drive types, RAID types, and balance

The task of properly designing the hardware configuration is primarily owned by your EMC or partner Presales Systems Engineer. But this topic is important enough to cover here anyway. "Balance" is the most important point. You want your drives balanced evenly across the entire backend of the VMAX. Most importantly, this applies to EFDs -- but it applies to mechanical drives as well. We have a few rules of thumb to help with this (there's a quick sanity-check sketch after the list):

  • For VMAXe (VMAX 10k serial number 959), VMAX "Classic", VMAX 20k, and VMAX/SE -- configure multiples of 8 EFDs per engine. In a perfect world, you'll also want your FC and SATA drives to be added in multiples of 8 per engine. This is because there are 8 backend CPU cores per engine, and you want your drives evenly distributed among all of those backend cores.
  • For VMAX 10k (serial 987) and VMAX 40k -- configure multiples of 16 EFDs per engine. Again, your FC and SATA drives should ideally be added in multiples of 16 per engine as well. For these array models, there are 16 backend CPU cores per engine. In the case of the 10k, the backend cores are logical cores via hyperthreading, whereas on the 40k they are physical cores.
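
To make the balance rule concrete, here's a minimal Python sketch -- purely illustrative, with made-up drive counts and model names -- that checks whether a proposed configuration spreads drives across engines in the recommended multiples:

    # Illustrative sanity check for per-engine drive balance (hypothetical numbers).
    # VMAX 10k (987) and 40k have 16 backend cores per engine; the other models have 8.
    CORES_PER_ENGINE = {"VMAX_20K": 8, "VMAX_40K": 16}

    def check_balance(model, engines, drives_per_tier):
        """Warn if a tier's drive count isn't a multiple of (cores per engine x engines)."""
        multiple = CORES_PER_ENGINE[model] * engines
        for tier, count in drives_per_tier.items():
            if count % multiple:
                print(f"{tier}: {count} drives is NOT a multiple of {multiple}")
            else:
                print(f"{tier}: {count} drives -- balanced ({count // engines} per engine)")

    # Example: a hypothetical 2-engine VMAX 40k
    check_balance("VMAX_40K", engines=2, drives_per_tier={"EFD": 32, "FC": 96, "SATA": 64})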

We also have recommendations for the RAID types in each tier:

  • For the EFD tier, choose RAID5 (3+1)
  • For the FC tier, choose RAID1 -- this is explained in further detail in its own section below.
  • For the SATA tier, choose RAID6 (6+2) -- avoid RAID5 for resiliency reasons, and avoid RAID6 (14+2) because it is not capable of performing optimized/coalesced full stripe writes.

Recommendation #2 -- Bind to the FC tier

Before a thin device (TDEV) can be used, it must be bound to a pool. This binding relationship simply defines the pool where new allocations will be initially written. By new allocations, I mean writes to new logical block addresses (LBAs) that have not been written to yet. When a new write comes in from a host, that write must land in a particular pool. The binding relationship determines which pool this will be.

Binding to the FC pool provides several benefits. First, your new writes land in the middle pool, where they can be easily promoted or demoted by FASTVP as the workload dictates. Second, a significant portion of your writes, at least initially, will likely be new allocations. Ideally we want to capture as many writes as possible into the pool with the lowest RAID write penalty. Assuming you've followed Recommendation #5 (Mirror the FC tier), binding to FC will indeed direct new writes to the pool with the lowest write overhead. This reduces overall load on the drives and the DAs (backend controllers). And finally, binding everything to the FC pool gives you one central pool to manage oversubscription -- see Recommendation #7 for more information on this.
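
One way to picture the binding relationship is as a tiny allocator. This is a conceptual Python sketch with made-up names, not how Enginuity actually implements it: a write to an LBA that already has an allocation is served wherever FASTVP currently keeps that extent, while a write to a brand-new LBA is allocated from the bound pool.

    # Conceptual model of thin-device binding (illustrative only).
    class ThinDevice:
        def __init__(self, bound_pool):
            self.bound_pool = bound_pool   # pool that receives new allocations
            self.extent_location = {}      # extent -> pool it currently lives in

        def write(self, extent):
            if extent in self.extent_location:
                # Rewrite of existing data: goes wherever FASTVP has placed the extent.
                return self.extent_location[extent]
            # First write to this extent: allocate from the bound pool.
            self.extent_location[extent] = self.bound_pool
            return self.bound_pool

    tdev = ThinDevice(bound_pool="FC_Pool")
    print(tdev.write(extent=42))   # new allocation -> FC_Pool
    print(tdev.write(extent=42))   # rewrite -> wherever the extent lives (still FC_Pool here)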

Recommendation #3 -- Associate everything to a 100/100/100 FAST Policy

Generally speaking, FASTVP does a really good job making promotion/demotion decisions on its own. Assuming you're following the final recommendation, FASTVP will be analyzing and moving data all the time, 24x7x365. By associating everything to a 100/100/100 policy, you're giving FASTVP free rein to make its own decisions without restrictions. In most cases, this is the best way to go.
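
If you haven't worked with FAST policies before, the three numbers are per-tier caps: each value is the maximum percentage of an associated storage group's capacity that is allowed to reside in that tier (EFD/FC/SATA in the order I'm using here). A quick Python illustration, with hypothetical numbers:

    # Per-tier capacity caps implied by a FAST policy (illustrative numbers).
    def tier_caps_gb(sg_capacity_gb, policy):
        """Maximum GB of the storage group each tier is allowed to hold."""
        return {tier: sg_capacity_gb * pct / 100 for tier, pct in policy.items()}

    # A 10 TB storage group under a 100/100/100 policy: no tier is restricted.
    print(tier_caps_gb(10240, {"EFD": 100, "FC": 100, "SATA": 100}))
    # The same group under a restrictive 0/50/100 policy: locked out of EFD,
    # and no more than half of it may ever sit on FC.
    print(tier_caps_gb(10240, {"EFD": 0, "FC": 50, "SATA": 100}))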

Configuring too many policies, or associating workloads to policies that don't have access to the higher-performing tiers, often has undesirable effects on other workloads. Administrators who take this approach will typically associate storage groups that are less important to the business (e.g. Dev, Test, UAT) with "lower" policies. The problem is, while these workloads may be less critical than production, they don't necessarily generate less IO than production.

When you trap heavy workloads in the lower tiers -- particularly in SATA, which should be configured as RAID6 -- it can have a negative effect on the entire array, which can degrade performance for your critical workloads. A heavy workload that is trapped in the SATA tier will increase utilization for all of the SATA drives, which are shared with other workloads. More importantly, a heavy workload trapped in SATA will increase utilization of the DAs -- because of the RAID6 parity penalty. The DAs are shared components, so when they get hot, it affects everything on the backend -- including your EFD and FC tiers.

So keep it simple, and start by associating everything to a 100/100/100 policy. You may eventually run across some exceptions -- but generally speaking, starting with 100/100/100 is the simplest and best-performing option.

And if you _really_ want to keep specific workloads down in the SATA/FC tiers only -- consider using Host IO Limits to prevent these workloads from over-utilizing the backend. But here we're starting to get into "complex" territory, so unless you've got a solid SLO-based automation layer on top of this (e.g. ViPR), consider whether or not the extra effort associated with managing this is really worth it.

Recommendation #4 -- Enable VP Allocation by FAST Policy and associate everything to a Policy

Typically, the first objection to Recommendation #2 is that the FC pool tends not to have very much capacity. When you bind everything to FC, your oversubscription rate for this pool is very high. So what happens when the FC pool fills up? Generally speaking, having an oversubscribed pool reach 100% capacity is really bad. Like crossing the streams kind of bad.

But if you're using VP Allocation by FAST Policy, it's OK to cross the streams. You can oversubscribe the FC pool -- often to the tune of 500-600% -- and if the FC pool fills up, Allocation by FAST Policy will allow new host allocations to "spill over" into the other tiers in your FAST policy. Typically, this will be the SATA tier.

But this feature only kicks in for TDEVs that are associated with a policy that has access to SATA capacity. So the second part of this recommendation echoes Recommendation #3 -- associate everything to a 100/100/100 policy, so VP Allocation by FAST Policy works. If you have certain devices that you _really_ don't want on SATA, but you want them to have access to FC and EFD, you could associate them to a 100/100/1 policy. This will allow new writes to spill over to SATA, and then the FAST compliance algorithm will start promoting those "spillovers" back to FC/EFD (assuming free space becomes available). Just bear in mind that this deviates from the "keep it simple" philosophy I'm trying to espouse here.
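
Here's a conceptual sketch of the spillover behavior -- again illustrative Python with made-up pool names, not the actual Enginuity allocation logic. With VP Allocation by FAST Policy enabled, a new allocation that can't fit in the bound pool is placed in another pool from the device's FAST policy (typically SATA, since that's where the free space usually is):

    # Conceptual model of VP Allocation by FAST Policy spillover (illustrative only).
    def allocate(extent_gb, bound_pool, other_policy_pools, free_gb):
        """Try the bound pool first, then spill over to other pools in the FAST policy.
        (The real pool selection is Enginuity's; this just shows the idea.)"""
        for pool in [bound_pool] + other_policy_pools:
            if free_gb.get(pool, 0) >= extent_gb:
                free_gb[pool] -= extent_gb
                return pool
        raise RuntimeError("All pools in the FAST policy are full")

    free = {"EFD_Pool": 0, "FC_Pool": 0, "SATA_Pool": 5000}   # the bind pool (FC) is full
    print(allocate(1, "FC_Pool", ["SATA_Pool", "EFD_Pool"], free))   # spills over to SATA_Pool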

Recommendation #5 -- Mirror the FC tier (RAID1)

As mentioned before, ideally your FC tier should be Mirrored (RAID1). To most customers, this sounds anachronistic and inefficient at first. But the reality is, for most workloads, a Mirrored FC tier is actually cheaper and more resilient than a RAID5 FC tier. Ideally, most of your workload will be captured by the EFD tier. The EFD tier is often capable of servicing around 40-50% of your workload. The rest of it needs to be serviced by mechanical drives, and of those mechanical drives, it's typically the FC tier that picks up most of what's left over -- often in the 40% range. Point being, the FC tier is still servicing a significant amount of workload, and should be optimized for performance, not capacity. The SATA tier is where your capacity comes from.

Assuming you're binding everything to FC as recommended, the FC tier will be picking up a significant amount of writes. The RAID write penalty impacts both drives and DAs. By configuring the FC tier as RAID1, we reduce the RAID write penalty by 50% versus RAID5, or 67% versus RAID6. Because the parity penalty is handled by both disks and DAs, we often require more engines and drives for a RAID5 or RAID6 FC tier versus a RAID1 FC tier -- thus driving up the cost of the overall solution when parity RAID is used for FC.
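
The arithmetic behind that claim, as a rough Python sketch (it ignores full-stripe writes and cache coalescing, but shows the proportions): RAID1 turns one host write into 2 backend I/Os, RAID5 into 4 (read old data, read old parity, write new data, write new parity), and RAID6 into 6.

    # Backend I/Os generated per host write for each RAID type (the classic
    # write penalty; ignores full-stripe writes and cache coalescing).
    WRITE_PENALTY = {"RAID1": 2, "RAID5": 4, "RAID6": 6}

    def backend_ios(host_write_iops, raid_type):
        return host_write_iops * WRITE_PENALTY[raid_type]

    host_writes = 10_000   # hypothetical write IOPS landing on the FC tier
    for raid in ("RAID1", "RAID5", "RAID6"):
        print(raid, backend_ios(host_writes, raid))
    # RAID1: 20000 -- half the backend I/Os of RAID5, a third of RAID6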

Recommendation #6 -- Do not preallocate

Preallocating devices (i.e. reserving space before a host begins using it) is not recommended in a FASTVP environment. Some administrators like to preallocate in order to reduce the "first write" penalty -- there is some degree of overhead (often measured in microseconds) associated with the initial allocation work of writing to a brand new block vs. updating an existing block. But if you preallocate, FASTVP will begin tracking performance on that preallocated capacity; and because that data is doing literally nothing, FAST will demote all of those preallocated blocks to the lowest tier. Given that most administrators preallocate for performance reasons, this achieves the exact opposite result of what was intended.

For customers who are preallocating in order to avoid oversubscription, I typically advise that they apply the next recommendation -- Control oversubscription by managing the subscription cap on the FC tier.

Recommendation #7 -- Control oversubscription by managing the subscription cap on the "bind" tier (FC)

When talking about these recommendations, I'm often asked how customers can control oversubscription if they're binding everything to FC and avoiding preallocation. As long as you're keeping things simple and following the rest of the recommendations here, it's actually fairly straightforward to cap oversubscription. First, start by making sure you've bound everything to the FC pool. Then set the subscription cap on the EFD and SATA pools to zero -- this will prevent you from binding any more TDEVs to those pools.

Now you need to set a subscription cap on the FC pool -- the only pool you're binding to -- that will allow you to use all of the capacity in the array (across all tiers), without oversubscribing the array, as a whole, beyond what you're comfortable with. Typically this will result in an FC subscription cap of around 500% to 600%.

Here are a couple of examples. Consider an array with 100TB usable across three tiers -- 2TB EFD, 20TB FC, and 78TB SATA.

If you want to be able to use all 100TB, and you don't want to oversubscribe, then you'll need to bind no more than 100TB of TDEVs (the array's total usable capacity) against the 20TB FC pool. Simply divide the total amount of TDEVs you want to be able to provision (100TB) by the usable capacity of the FC pool (20TB), and you'll get the subscription cap you need to apply to the FC pool. In this case, 100TB / 20TB = 500%.

If you want to oversubscribe the array by no more than 20%, then you'll need to bind no more than 120TB of TDEVs (20% more than the array's total usable capacity) against the 20TB FC pool. We can apply the same formula from the previous example here as well: 120TB / 20TB = 600%.
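
The same arithmetic as a tiny Python helper, using the hypothetical capacities from the examples above:

    # Subscription cap for the bind pool = capacity you want to be able to provision,
    # divided by the bind pool's usable capacity (illustrative numbers).
    def fc_subscription_cap_percent(total_provisioned_tb, fc_pool_usable_tb):
        return 100 * total_provisioned_tb / fc_pool_usable_tb

    array_usable_tb = 2 + 20 + 78   # EFD + FC + SATA = 100 TB usable
    print(fc_subscription_cap_percent(array_usable_tb, 20))         # 500.0 -> no oversubscription
    print(fc_subscription_cap_percent(array_usable_tb * 1.2, 20))   # 600.0 -> 20% oversubscribed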

Recommendation #8 -- Reduce the Pool Reserved Capacity (PRC) on the EFD tier

By default, the VMAX comes with a 10% global Pool Reserved Capacity (PRC) on every pool. This PRC is essentially a portion of capacity in each pool that FASTVP cannot write to. It is reserved for new host writes only. We reserve this space so that FASTVP cannot fill a pool to 100% capacity -- only new host writes can do that. But if you've been following all of the previous recommendations -- particularly binding only to the FC pool, and managing oversubscription at the FC pool -- then this reserved space is only desirable on the pool that you're binding everything to: the FC pool.

So keep the FC pool's PRC set to 10% -- or higher, if that's what you're comfortable with. But for those pools where you're not binding anything (EFD and SATA), override the PRC to 1% -- the lowest possible setting. This is particularly important for the EFD tier, where capacity is expensive -- you want to use as much EFD capacity as you can. Reducing the EFD PRC to 1% will allow FASTVP to use 99% of the EFD pool's capacity, without having any devices explicitly bound to EFD.
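
As a quick illustration of what the PRC means for the capacity FASTVP can actually use (hypothetical pool size):

    # Capacity FASTVP can place data into = pool usable capacity * (1 - PRC).
    def fast_usable_tb(pool_usable_tb, prc_percent):
        return pool_usable_tb * (1 - prc_percent / 100)

    print(fast_usable_tb(2, 10))   # EFD pool at the default 10% PRC -> 1.8 TB available to FAST
    print(fast_usable_tb(2, 1))    # EFD pool at a 1% PRC -> 1.98 TB available to FAST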

Recommendation #9 -- Use the defaults for everything else (mostly)

For everything else, just stick with the defaults.

The performance and movement time windows should be open all the time -- there's rarely a need to restrict FASTVP from analyzing or moving data within particular time windows. FAST is generally intelligent enough to differentiate between your typical daytime transactional workloads and your nightly backup workloads and batch jobs.

Storage Group priority, which allows you to allocate higher promotion priority to certain storage groups, is very rarely used. Just leave it to the default of 2.

The defaults for Initial Analysis Period and Workload Analysis Period -- a week -- are generally fine.

Finally, the one setting you might consider tweaking is the FAST Relocation Rate (FRR). This defines how aggressive the FASTVP movement engine is when moving data. The default value is 5; setting this to a higher value will decrease the aggressiveness of the movement engine. Setting it lower obviously does the opposite. In most cases, the default of 5 is fine. But if you're just turning on FAST for the first time, you may want to start with a less aggressive setting, like a 7 or 8, so FAST slowly moves things around to the most appropriate tiers. Once things have normalized, set it back to 5.

The other case where you might want to change the FRR is if your DAs are already running at high utilization levels, or if you'll be upgrading to a recent version of 5876. In 5876.229.145, the aggressiveness of the FASTVP movement engine was increased -- so what was an FRR of "5" on 5875 is more like a 2 or 3 in 5876.229. So if your DAs are already running hot and you're planning to upgrade to 5876.229 or later -- you probably want a less aggressive FRR, around 7 or 8. See this support article for more information.

In summary -- keep it simple, follow these best practices, and you'll have an environment that is easier to manage and performs better. Please feel free to drop me a line in the comments, on Twitter, or via email if you have any questions or if I've missed something.

Written on March 13, 2014