Energy Efficient Dynamic Memory Bank and NV Swap Device Management

Kwangyoon Lee and Bumyong Choi
Department of Computer Science and Engineering
University of California, San Diego
{kwl002,buchoi}@cs.ucsd.edu

Abstract

As demand for mobile devices increases, prolonging battery life has become a focus of mobile device manufacturers. While manufacturers support a partial self-refresh capability for MSDRAMs (Mobile SDRAMs), most operating systems do not use this feature because of the complexity it adds to memory management. Using this capability correctly could reduce the power consumed by MSDRAMs while a system is in suspended mode. The goal of this project is to implement the partial self-refresh capability on ARM Linux and to save as much power as possible while the system is in suspended mode.

1 Introduction

As demand for mobile devices increases, prolonging battery life has become a focus of mobile device manufacturers. There have been numerous efforts to reduce power consumption; turning off the display and suspending the processor are the most common examples. Additionally, manufacturers use MSDRAM (Mobile SDRAM) because it operates on a low supply voltage and provides a PSR (Partial Self-Refresh) feature. While SDRAM must be self-refreshed to retain its contents, MSDRAM allows users to enable or disable self-refresh on predetermined sets of banks [2].

Mobile devices such as PDAs support a suspended mode that turns off most components except the MSDRAM and a component that wakes the system up. If the system stays in suspended mode for a long time, the energy consumed to maintain the MSDRAM's self-refresh becomes significant. Even though MSDRAMs support PSR, most OSs (operating systems) do not use it because it is difficult to manage correctly and efficiently.
The goal of this project is to implement the partial self-refresh capability on ARM Linux and to save as much power as possible while the system is in suspended mode, by turning off the maximum number of memory banks without losing any information stored in memory. Section 2 introduces the background and motivation, Section 3 covers our methodology, Section 4 discusses the experiment, Section 5 analyzes the potential power saving, and Section 6 concludes.

2 Background

To prolong battery life in mobile systems, both hardware and software must be able to control power usage dynamically. In software, the OS defines several operating modes and switches to the appropriate mode to save power when not all system components need to be on. For example, ARM Linux 2.6.x for Intel PXA27x development platforms [4] currently has more complex power management capabilities; however, its basic operating modes are running, idle, and suspend. Moreover, OS kernels and device drivers must be able to manage and adjust these operating modes to adapt to changing power demand.
Table 1: Operating States

Operating State   Description
Active            All system components are actively running and consume maximal power
Idle              No process is running and many system components go idle
Suspend           Most system components are turned off, but the DRAM must retain its contents with self-refresh
Off               Every system component is shut down

Figure 1: General Linux Memory Structure [8]

In parallel, hardware can also control power consumption based on the operating mode. Mobile CPUs have several power management features, such as dynamic power/frequency management and different levels of operating modes. The entire hardware platform, including peripheral devices, is also designed to operate in different levels of operating modes.

2.1 Linux memory management

Linux 2.6.9 has a complicated but efficient memory management system [1][6]. The whole memory space is managed as three memory zones: ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. In ARM Linux 2.6.9, only ZONE_DMA is activated for efficiency. The heart of Linux memory management is the well-known Buddy System [5][7]. The smallest unit in the Buddy System is a 4KB page, and page pools of different orders are maintained to minimize internal and external fragmentation. All pages in Linux are mapped to a page descriptor array, through which the kernel manages free and used pages. Figure 1 and Figure 2 show the basic model of the Linux memory management architecture. Most importantly, free pages and used pages can be traversed from simple entry points in each zone structure.

Figure 2: ARM Linux Memory Structure
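The block arithmetic behind the Buddy System mentioned in Section 2.1 can be illustrated with a minimal sketch. This is not kernel code: the function names are hypothetical and chosen only for illustration. It assumes the 4KB page size stated above and the standard buddy rule that a block of order k spans 2^k pages and is paired with the block whose starting page frame number differs only in bit k.

```python
PAGE_SIZE = 4 * 1024  # 4KB page, the smallest unit named in Section 2.1


def block_size(order: int) -> int:
    """Size in bytes of a buddy block of the given order (2**order pages)."""
    return PAGE_SIZE << order


def buddy_of(pfn: int, order: int) -> int:
    """Page frame number of the buddy of the order-sized block starting at pfn."""
    if pfn % (1 << order) != 0:
        raise ValueError("pfn is not aligned to the block order")
    # The buddy differs from this block only in bit `order` of the frame number.
    return pfn ^ (1 << order)


def smallest_order(num_pages: int) -> int:
    """Smallest order whose block holds at least num_pages pages."""
    if num_pages < 1:
        raise ValueError("num_pages must be positive")
    order = 0
    while (1 << order) < num_pages:
        order += 1
    return order
```

A request for 5 pages is served from an order-3 block of 8 pages under this rule, so 3 pages are lost to internal fragmentation; keeping separate pools per order is how the allocator trades that loss against external fragmentation.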
Table 2: Possible combinations of enabled banks under PSR mode

Enabled banks   Description
Full            All 8 memory banks are enabled
Half            The first 4 banks are enabled and the others are disabled
Quarter         The first 2 banks are enabled and the others are disabled
Eighth          Only the first bank is enabled and the others are disabled

Figure 3: compaction

2.2 Infineon HYB25L512160AC Mobile SDRAM

The Intel PXA27x processor development platform has two discrete Infineon 512Mb MSDRAMs. Thus, the total memory size is 128MB. The HYB25L512160AC MSDRAM has a special power-related feature, PSR (Partial Self-Refresh). This MSDRAM has eight banks and can enable or disable predetermined sets of memory banks. However, the combinations of enabled banks are limited to those listed in Table 2.

2.3 Swapping to NV storage

To maximize the number of banks that can be disabled, we need to minimize the load on memory. The obvious way is to keep memory usage low by terminating unnecessary applications; however, this task is cumbersome, and it is usually unclear which applications are unnecessary. The solution is to swap memory contents out to NV (non-volatile) memory [9] and restore them when the system resumes from suspended mode. In that case, the MSDRAM would refresh only the minimum set of banks needed to preserve the critical information that keeps the kernel operational. However, in this experiment, we were not able to implement this feature because of the limited capacity of the NV storage on the platform.

3 Methodology

Two different approaches can be used to exploit the PSR feature efficiently. In the first, the memory manager is modified to place used pages in the lower banks of memory, keeping memory contents as compact as possible. In Linux, however, the memory management code depends heavily on the Buddy System, and modifying it in this way is not feasible because the Buddy System is not designed with compaction of memory pages in mind.
The other method is relatively straightforward. The Linux memory management code is left unmodified, and a new memory compaction routine is added. Memory compaction is performed in the following phases:

1. Scan all memory pages and build free[MAX_PAGES] and used[MAX_PAGES] arrays, in which each entry indicates whether the corresponding page is in use.
2. Calculate the total size of the used pages and determine which PSR mode to use based on that size.
3. Move the used pages in OFF-banks to free pages in ON-banks.
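The three phases above can be sketched as follows. This is a minimal illustration under stated assumptions, not the kernel code used in the project. It assumes that page frames map linearly onto the two 64MB modules and that the eight banks of each module occupy consecutive, equally sized address ranges; the real mapping depends on the memory controller. All names (select_psr_mode, plan_compaction, and so on) are hypothetical.

```python
PAGE_SIZE = 4 * 1024
MODULES = 2                        # two discrete MSDRAM modules
BANKS_PER_MODULE = 8
MODULE_BYTES = 64 * 1024 * 1024    # 512Mb per module
PAGES_PER_BANK = MODULE_BYTES // BANKS_PER_MODULE // PAGE_SIZE
PAGES_PER_MODULE = PAGES_PER_BANK * BANKS_PER_MODULE
MAX_PAGES = PAGES_PER_MODULE * MODULES

# PSR modes from Table 2, as (name, banks kept refreshed per module),
# ordered from the most to the least power-saving.
PSR_MODES = [("eighth", 1), ("quarter", 2), ("half", 4), ("full", 8)]


def is_on_bank(pfn, on_banks):
    """True if page frame pfn lies in a bank that stays refreshed.

    Assumed layout: banks are consecutive ranges inside each module, and the
    same PSR setting applies to both modules.
    """
    bank = (pfn % PAGES_PER_MODULE) // PAGES_PER_BANK
    return bank < on_banks


def select_psr_mode(used_count):
    """Phase 2: pick the smallest PSR mode whose ON-banks hold all used pages."""
    for name, on_banks in PSR_MODES:
        if used_count <= on_banks * PAGES_PER_BANK * MODULES:
            return name, on_banks
    raise ValueError("more used pages than physical memory")


def plan_compaction(used):
    """Phases 2 and 3: return (mode name, list of (src, dst) page moves).

    `used` is the Phase 1 result: used[pfn] is True if the page is in use.
    """
    name, on_banks = select_psr_mode(sum(used))
    free_on = [p for p in range(MAX_PAGES)
               if not used[p] and is_on_bank(p, on_banks)]
    used_off = [p for p in range(MAX_PAGES)
                if used[p] and not is_on_bank(p, on_banks)]
    # The mode was chosen so that free ON pages never run out before
    # the used OFF pages do, so zip() pairs every source page.
    return name, list(zip(used_off, free_on))


def restore_plan(moves):
    """De-compaction: copy each page back from dst to its original src."""
    return [(dst, src) for src, dst in moves]
```

Because the plan records each page's original location, de-compaction after resume only needs to replay the moves in reverse, which is why the unmodified memory manager never observes the relocation.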
Figure 4: compaction

Now all the used pages are gathered in the lower banks, as depicted in Figure 4. The system then enters suspended mode when a suspend command is issued, and the PSR mode is enabled automatically based on the information from Phase 2. The banks that are refreshed retain their contents, while the other banks lose theirs. When the user presses one of the predefined buttons to return the system to normal mode, the CPU resets and the memory pages are restored by referring to the mapping information from Phase 1.

The Intel PXA27x processor DVK has two discrete 64MB MSDRAM chips. A PSR mode setting is applied to both MSDRAM modules simultaneously, as Figure 4 illustrates. Thus, the compaction/de-compaction algorithm handles two non-contiguous ON-bank regions, one in each module, under the PSR mode.

Copying pages back and forth during compaction and de-compaction incurs overhead. Nonetheless, only pages in the OFF-banks are copied, and if usage is balanced across all banks, the average number of pages copied does not exceed half of all used pages.

4 Experiment

The experiment was conducted on the Intel PXA27x Processor DVK (Developer's Kit) with a modified ARM Linux 2.6.9. The DVK includes two Infineon HYB25L512160AC-7.5 modules, each with a capacity of 512Mb. The modified Linux includes our compaction/de-compaction algorithm and new power management code that selects an appropriate PSR mode.

The experiment consists of (1) allocating memory, (2) suspending with PSR, and (3) verifying the memory content. We wrote an application that allocates a desired amount of memory. For example, if we want to test the half PSR mode, 50-75% of memory should be allocated. For suspending the system, we provide a forced mode and an auto-detect mode.
The forced mode forces the MSDRAM to disable certain banks, and the auto-detect mode selects a PSR mode that does not turn off any bank that holds data. By experimenting with the forced mode, we demonstrated that applications are corrupted if banks that are in use are forced off. In auto-detect mode, applications resume from suspended mode without crashing; in forced mode, however, applications or the kernel crash if the amount of memory in use exceeds the capacity of the banks that are kept alive. More importantly, the overhead of the compaction/de-compaction algorithm was low enough to be unnoticeable to users; hence, our approach is practical.

5 Analysis

A suspended mode that can select an optimal PSR mode is essential for devices that require a longer data retention period without recharging. The PDA is a prominent candidate for this benefit because most users find that they use their devices for less than two hours a day. The more time the PDA spends in suspended mode, the less frequently it needs to be charged, because less power is consumed during suspended mode. The following analysis discusses how power consumption is affected by the different operating modes. Total energy consumption in a system is defined as follows:
\[ E_{\text{total}} = P_{\text{active}} T_{\text{active}} + P_{\text{suspend}} T_{\text{suspend}} \qquad (1) \]

$P_{\text{active}}$ is normally over a hundred times greater than $P_{\text{suspend}}$. Thus, unless $T_{\text{suspend}}$ exceeds $T_{\text{active}}$ by a factor of about 100 or more, the total energy consumption is dominated by the active mode. However, if we focus on the maximum operational time without losing data, the suspended mode is the only factor that affects the length of device operation time.

Table 3: Power consumption by memory [3]

Temp (°C)   Full bank (µW)   Half bank (µW)   Quarter bank (µW)   Eighth bank (µW)
85          2640             1898             1485                1238
70          1815             1440             1320                1155

Table 3 shows sample parameters that reflect values from a real PDA. For a fixed amount of stored energy, the total data retention time in suspended mode is inversely proportional to $P_{\text{suspend}}$. From Table 3, we can conclude that if the PSR mode is used, the total retention time will be at least 20% longer. For example, counting memory power alone, the half-bank mode at 70 °C draws 1440 µW instead of 1815 µW, which extends retention time by about 26% (1815/1440 ≈ 1.26). Moreover, to obtain the best results, the minimum number of banks should be refreshed.

In a real Linux environment, if a device is used over a long period of time, most pages are consumed by code pages, disk buffer caches, and various other small memory allocations made by the OS. The PSR mode therefore cannot be applied even though enough memory is effectively available, because these pages have not yet been freed. The cached pages should be released and swappable pages should be swapped out. As a result, we can keep the minimum number of banks alive to minimize power consumption.

6 Conclusion

The PSR mode in the MSDRAM is an elegant feature. However, no widely used OS, including the latest release of Linux, supports this feature because of the complexity of the required changes to the memory management code. We proposed an efficient page compaction/de-compaction algorithm that can be ported to any OS because it is simple and platform independent. However, before compacting memory pages, we should maximize the effect of compaction by reclaiming unused pages where possible. First, discardable pages should be dropped to obtain more free pages.
Second, swappable pages should be swapped out to an NV (Non-Volatile) swap device. These extra steps free more banks in the MSDRAM, so the PSR feature can be used more aggressively. Unfortunately, the Intel PXA27x Processor DVK does not have enough NV memory (NOR flash memory on this platform) for a swap partition. For devices with larger NV memory capacity, this is a promising way to maximize the power saving.

References

[1] Daniel P. Bovet and Marco Cesati. Understanding the Linux Kernel. O'Reilly Media, Inc., Sebastopol, CA, third edition.
[2] Infineon Technologies. Application Note - Power Saving Modes for Mobile-RAMs, 2004.
[3] Infineon Technologies. HYB25L512160AC-7.5 Data Sheet, Rev 1.2, 2004.
[4] Intel Corporation. Intel PXA27x Processor Family Developer's Manual, 2005.
[5] Robert Love. Linux Kernel Development. Sams Publishing, 2003.
[6] Mel Gorman. Understanding the Linux Virtual Memory Manager. Prentice Hall, 2004.
[7] James L. Peterson and Theodore A. Norman. Buddy systems. Commun. ACM, 20(6):421-431, 1977.
[8] Hao-ran Liu. Physical memory management in Linux.
[9] Seon-yeong Park, Dawoon Jung, Jeong-uk Kang, Jin-soo Kim, and Joonwon Lee. CFLRU: a replacement algorithm for flash memory. In CASES '06: Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, pages 234-241, New York, NY, USA, 2006. ACM Press.