HotStorage'22

#### Hello Bytes, Bye Blocks: PCIe Storage Meets Compute Express Link for Memory Expansion

Myoungsoo Jung

**Computer Architecture and MEmory systems Laboratory** 



# High-Level Summary

#### There is a need for storage devices as working memory

Can achieve ~50x larger memory



Capacity comparison: when a DRAM chip and a NAND flash chip have the same price (excluding controller price)
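The same-price comparison above is simple arithmetic, sketched below. The per-GB chip prices are hypothetical placeholders chosen only so that the result matches the ~50x figure on the slide; they are not numbers from the talk.

```python
def capacity_ratio(dram_usd_per_gb: float, nand_usd_per_gb: float) -> float:
    """Return how many times more NAND capacity a fixed budget buys than DRAM."""
    if dram_usd_per_gb <= 0 or nand_usd_per_gb <= 0:
        raise ValueError("prices must be positive")
    return dram_usd_per_gb / nand_usd_per_gb


# Hypothetical chip prices (assumption, controller cost excluded as on the slide).
DRAM_USD_PER_GB = 5.00
NAND_USD_PER_GB = 0.10

ratio = capacity_ratio(DRAM_USD_PER_GB, NAND_USD_PER_GB)
print(f"Same budget buys about {ratio:.0f}x more NAND capacity than DRAM")
```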

#### Emerging CXL can make it possible (+ cache-coherent)



#### We propose a "storage-integrated memory expander"



- Hardware prototyping
- User guides for better use



#### 1. Long-Standing Dream

#### 2. Why CXL?

#### 3. Storage-Integrated Memory Expander

#### 4. User Guide #1 - Pooling

#### 5. User Guide #2 - Storage-Aware Annotation



## PCIe Storage as Working Memory



Working memory



# Benefits: Larger Memory






- Scientific analysis
- Recommendation system
- Health care
- ML-based automotive system

Achieving larger memory (with data persistence) can **open a new door** for big data applications (more exploration and more analysis).



# Conventional Attempts

Industry prototypes (PCIe interface):

- Samsung: 2B-SSD (photo labels: electrolytic capacitor, NAND flash memory)
- Microsemi: NVRAM cards

NVMe standard:

- NVM Express: PMR

#### 8.14 Persistent Memory Region

The Persistent Memory Region (PMR) is an optional region of general purpose PCI Express read/write persistent memory that may be used for a variety of purposes. The controller indicates support for the PMR by setting CAP.PMRS (refer to section 3.1.3.1) to '1' and indicates whether the controller supports command data and metadata transfers to or from the PMR by setting support flags in the PMRCAP property. When command data and metadata transfers to or from PMR are supported, all data and metadata associated with a particular command shall be either entirely located in the Persistent Memory Region or outside the Persistent Memory Region.

The PMR's PCI Express address range is used for external memory read and write requests to the PMR. The PCI Express address range and size of the PMR is defined by the PCI Base Address Register (BAR) indicated by PMRCAP.BIR. The PMR consumes the entire address region exposed by the BAR and supports all the required features of the PCI Express programming model (i.e., it in no way restricts what is otherwise permitted by PCI Express).

There were several attempts to use SSDs as working memory by supporting byte-addressability.


# Commonality of Conventional Attempts

The conventional attempts expose the SSD's internal DRAM to bridge the gap between byte and block I/O granularity and to hide the long latency of the SSD's backend block media as much as possible.
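The idea shared by these attempts, a DRAM buffer that accepts byte-granular accesses in front of block-granular media, can be sketched as a small model. Everything here is an illustrative assumption rather than any product's design: the class names `BlockBackend` and `ByteWindow`, the 4 KB block size, and the explicit `flush` step are all hypothetical.

```python
BLOCK_SIZE = 4096  # bytes per backend block (illustrative value)


class BlockBackend:
    """Stand-in for the SSD's block media: whole-block reads and writes only."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.block_reads = 0
        self.block_writes = 0

    def read_block(self, lba):
        self.block_reads += 1
        return self.blocks[lba]

    def write_block(self, lba, data):
        assert len(data) == BLOCK_SIZE
        self.block_writes += 1
        self.blocks[lba] = bytes(data)


class ByteWindow:
    """Byte-granular loads and stores served from a DRAM buffer."""

    def __init__(self, backend):
        self.backend = backend
        self.buffer = {}    # lba -> bytearray held in the device DRAM
        self.dirty = set()  # lbas modified since the last flush

    def _block(self, lba):
        # Fetch the whole block from the media on the first touch only.
        if lba not in self.buffer:
            self.buffer[lba] = bytearray(self.backend.read_block(lba))
        return self.buffer[lba]

    def load(self, addr):
        lba, offset = divmod(addr, BLOCK_SIZE)
        return self._block(lba)[offset]

    def store(self, addr, value):
        lba, offset = divmod(addr, BLOCK_SIZE)
        self._block(lba)[offset] = value
        self.dirty.add(lba)

    def flush(self):
        # Write modified blocks back to the media as whole blocks.
        for lba in sorted(self.dirty):
            self.backend.write_block(lba, self.buffer[lba])
        self.dirty.clear()


backend = BlockBackend(num_blocks=4)
window = ByteWindow(backend)
window.store(5000, 0xAB)   # one-byte store into block 1
value = window.load(5000)  # served from the buffer, no extra block read
window.flush()
```

In this model a single-byte store costs one block read and, later, one block write, which is the granularity gap the buffer is meant to hide.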


#### 1. Long-Standing Dream

#### 2. Why CXL? (Not Conventional Attempts, PCIe BAR?)

#### 3. Storage-Integrated Memory Expander

#### 4. User Guide #1 - Pooling

#### 5. User Guide #2 - Storage-Aware Annotation



#### **Limitations of Conventional Attempts**





# Good Enough PCIe Bandwidth





#### But, Non-Cacheable Access










#### **Advocation: CXL for Memory Expansion**

Instead of the conventional approaches, we advocate an emerging cache-coherent interconnect technology, CXL (Compute Express Link).
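Why cacheability matters can be illustrated with a toy count of device round trips for the same access trace. This is a sketch under strong assumptions: an unbounded CPU cache with no evictions, a 64-byte line, and a made-up trace. It does not model real PCIe or CXL timing.

```python
def device_round_trips(addresses, cacheable, line_size=64):
    """Count how many accesses in the trace must travel to the device."""
    cached_lines = set()
    trips = 0
    for addr in addresses:
        line = addr // line_size
        if cacheable and line in cached_lines:
            continue  # hit: served from the CPU cache
        trips += 1    # miss, or non-cacheable region: go to the device
        if cacheable:
            cached_lines.add(line)
    return trips


# Hypothetical byte addresses touching two cache lines repeatedly.
trace = [0, 8, 16, 64, 0, 72, 8]

uncached_trips = device_round_trips(trace, cacheable=False)
cached_trips = device_round_trips(trace, cacheable=True)
print(f"non-cacheable: {uncached_trips} trips, cacheable: {cached_trips} trips")
```

With a non-cacheable mapping every access reaches the device, whereas a cacheable mapping only pays for the first touch of each line in this idealized model.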





# CXL: Multi Protocols
















# Cacheable Access




#### Side-by-Side Comparison






#### 1. Long-Standing Dream

#### 2. Why CXL?

#### 3. Storage-Integrated Memory Expander

#### 4. User Guide #1 - Pooling

#### 5. User Guide #2 - Storage-Aware Annotation



#### **Design #1: Device Type Consideration**

To enable a storage-integrated memory expander over CXL, we first have to decide on the CXL device type.
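The device-type decision can be sketched as a lookup over protocol combinations. The Type 1/2/3 protocol sets follow the CXL specification's device classes; the selection helpers and their names are hypothetical and only mirror the reasoning of this talk (device memory must be host-visible, and device-side caching is undesirable for an SSD).

```python
from enum import Flag, auto


class CxlProtocol(Flag):
    IO = auto()     # CXL.io
    CACHE = auto()  # CXL.cache
    MEM = auto()    # CXL.mem


# Protocol combinations per device type, as defined by the CXL specification.
DEVICE_TYPES = {
    "Type 1": CxlProtocol.IO | CxlProtocol.CACHE,
    "Type 2": CxlProtocol.IO | CxlProtocol.CACHE | CxlProtocol.MEM,
    "Type 3": CxlProtocol.IO | CxlProtocol.MEM,
}


def candidates_for_memory_expander():
    """Types that expose device memory to the host through CXL.mem."""
    return [name for name, protos in DEVICE_TYPES.items()
            if CxlProtocol.MEM in protos]


def without_device_cache(names):
    """Drop types that also use CXL.cache (assumed selection criterion)."""
    return [name for name in names
            if CxlProtocol.CACHE not in DEVICE_TYPES[name]]


best_fit = without_device_cache(candidates_for_memory_expander())
print(best_fit)
```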





#### CXL Device: Mix & Match CXL Protocols






# Best-Fit: Type 3 CXL Device





# Limits of Type 2 CXL Device





With a Type 2 device, the address spaces that the CPU and the SSD manage must be kept coherent. In other words, it causes excessive cache coherency traffic, which slows down memory access.







## Design #2: Enable CXL SSD (Type 3)





















#### Simple Modification is Enough




#### **Performance Projection**



- RISC-V 64-bit O3
- 128 KB L1 cache
- 4 MB L2 cache
- Z-NAND emulation
- 32 GB capacity
- OpenExpress-based

Note: separate, customized FPGA board (16 nm)



Using our CXL hardware prototype, we project how much a CXL SSD affects system performance.



#### Experimental Group












#### 1. Long-Standing Dream

#### 2. Why CXL?

#### 3. Storage-Integrated Memory Expander

#### 4. User Guide #1 - Pooling

#### 5. User Guide #2 - Storage-Aware Annotation



#### **Needs: Memory Pooling**



Memory resource pool



#### **Guide #1:** Resource Expansion



Memory resource pool



#### #1-1: CXL Switch






#### **Guide #2:** Resource Pooling

As the second step, once multiple memory expanders are in place, resource pooling across multiple hosts should be supported.
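A minimal sketch of what pooling means at the bookkeeping level: several hosts draw capacity from a shared set of expanders. The `MemoryPool` class, the first-fit policy, and the host and expander names are all hypothetical illustrations; the talk does not specify an allocation policy, and real pooling is carried out by the CXL switch and devices.

```python
class MemoryPool:
    """Tracks free capacity per expander and grants it to hosts first-fit."""

    def __init__(self, expander_capacities_gb):
        self.free_gb = dict(expander_capacities_gb)  # expander -> free GB
        self.grants = []                             # (host, expander, GB)

    def allocate(self, host, size_gb):
        """Grant size_gb to host, spanning expanders when one is not enough."""
        if size_gb > sum(self.free_gb.values()):
            raise RuntimeError("pool has insufficient free capacity")
        remaining = size_gb
        granted = []
        for expander, free in self.free_gb.items():
            if remaining == 0:
                break
            take = min(free, remaining)
            if take:
                self.free_gb[expander] -= take
                granted.append((host, expander, take))
                remaining -= take
        self.grants.extend(granted)
        return granted


# Two hypothetical 32 GB expanders shared by two hosts.
pool = MemoryPool({"expander0": 32, "expander1": 32})
grant_a = pool.allocate("hostA", 40)
grant_b = pool.allocate("hostB", 16)
print(grant_a, grant_b, pool.free_gb)
```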



Memory resource pool



#### **#2-1:** Switch Virtualization




#### **#2-2:** Device Virtualization




#### 5. User Guide #2 - Storage-Aware Annotation



#### **Needs: Latency & Persistence Control**





Unexpected Case #1: Internal Tasks



Accesses can be delayed by the SSD's internal tasks, such as read reclaim and garbage collection.



Unexpected Case #2: Internal Buffer



policy.








### Conclusion

We study how CXL can be applied to PCIe storage to bridge the semantic gap between bytes and blocks. In particular, we:

- Succeed in converting block semantics to byte semantics over CXL
- Project the performance with a CXL hardware prototype
- Explore new opportunities to use the memory expander efficiently



## Thank You

Contact: Myoungsoo Jung (mj@camelab.org)

