Tuesday, August 9, 2016

Facebook Pursues Hierarchical Flash Storage Strategy

Building data center infrastructure at Facebook is all about understanding engagement, said Vijay Rao, Director of Technology Strategy at Facebook, speaking at the Flash Memory Summit in Santa Clara, California.

Facebook's growth metrics are well known: whether measured in posts, comments, likes, photos, messages, or videos, the curve moves strongly up and to the right.

Designing infrastructure for this type of runaway engagement means planning for spikes, such as Mother's Day.  In terms of raw power, Rao estimated Facebook's overall compute power at about 7.5 quadrillion instructions per second. Facebook engineers now have seven types of servers to choose from depending on the application they are supporting.

On the storage side, Facebook is pursuing a hierarchical storage model in which different technologies are employed depending on the "temperature" of the data. Hot data resides in DRAM or NVM-based DIMMs, while warm data is stored in PCIe NVM flash. User content grows colder over time, so archival photos and posts eventually move to cold storage systems.
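To make the temperature idea concrete, here is a minimal Python sketch of a tier-selection rule. The three tiers come from Rao's description; the age thresholds, the `pick_tier` function, and the use of "days since last access" as the temperature signal are illustrative assumptions, not Facebook's actual policy.

```python
from enum import Enum


class Tier(Enum):
    """Storage tiers as described in the talk."""
    HOT = "DRAM or NVM-based DIMM"
    WARM = "PCIe NVM flash"
    COLD = "cold storage"


# Hypothetical thresholds, chosen only for illustration.
HOT_MAX_AGE_DAYS = 7
WARM_MAX_AGE_DAYS = 90


def pick_tier(days_since_last_access: int) -> Tier:
    """Map a piece of content to a tier based on how recently it was accessed."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access <= HOT_MAX_AGE_DAYS:
        return Tier.HOT
    if days_since_last_access <= WARM_MAX_AGE_DAYS:
        return Tier.WARM
    return Tier.COLD
```

Under these assumed thresholds, a photo viewed yesterday would stay in the hot tier, while one untouched for a year would land in cold storage. A real system would likely weigh access frequency and content type as well.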

Facebook's overall strategy is to pursue a "disaggregated rack" architecture with designs contributed to the Open Compute Project.

Earlier this year at the Open Compute Project Summit, Facebook introduced "Lightning", a flexible NVMe JBOF (just a bunch of flash) box. It is designed to provide a PCIe Gen 3 connection from end to end (CPU to SSD). It leverages the existing Open Vault (Knox) SAS JBOD infrastructure to provide a faster time to market, maintain a common look and feel, enable modularity in PCIe switch solutions, and enable flexibility in SSD form factors.

Rao highlighted three other solid-state storage innovations:

Next-generation non-volatile memory (NVM), which promises significant boosts in performance and scalability using 3D technologies and other breakthroughs.

Facebook's own AVA card, which puts four M.2 modules on a card.

WORM (write once, read many) flash, which promises very large capacities (>100TB) with low endurance (150 write cycles) for low-cost, long-term storage.
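A quick back-of-the-envelope calculation shows why 150 write cycles can be enough for archival use. The sketch below takes 100TB as a lower bound on capacity and ignores write amplification; the helper names and the rewrite rate of 12 full-device rewrites per year are assumptions made purely for illustration.

```python
CAPACITY_TB = 100    # lower bound quoted in the talk (>100TB)
WRITE_CYCLES = 150   # endurance quoted in the talk


def lifetime_writes_tb(capacity_tb: float, write_cycles: int) -> float:
    """Total data that can be written before the rated endurance is used up."""
    return capacity_tb * write_cycles


def years_until_worn(write_cycles: int, full_rewrites_per_year: float) -> float:
    """Years of service at a given rate of full-device rewrites."""
    return write_cycles / full_rewrites_per_year


total_tb = lifetime_writes_tb(CAPACITY_TB, WRITE_CYCLES)
print(total_tb)                            # 15000 TB, i.e. 15 PB
print(years_until_worn(WRITE_CYCLES, 12))  # 12.5 years at one full rewrite per month
```

Since cold data is rarely rewritten, even one full rewrite per month is a pessimistic assumption, which suggests the endurance limit is unlikely to be the binding constraint for this class of storage.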

