Abstract
Applications need to have their working set in main memory to run
efficiently. Under memory contention, data has to be read from
and written to a slow backing store, such as disk. Compressed caching is a way to keep more pages in main memory by compressing the oldest pages.
This reduces disk I/O because more of the needed data remains available in main memory. The applications that gain most from compressed caching have low entropy in their data and, as a rule, reuse recently used data. Entropy quantifies the information contained in a message (in our case, a page), usually in bits or bits per symbol, and gives an absolute limit on the best possible lossless compression for a page. The opposite characteristics apply to applications that perform worse with compressed caching than without it: they do not reuse recently used data and have high entropy, causing the compression ratio to be poor and the cache to have a low hit rate on the compressed pages.
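The entropy bound mentioned above can be made concrete with a small sketch that treats each byte of a page as a symbol and computes the Shannon entropy, H = -Σ p_i log2 p_i, in bits per byte (the function name and page size are illustrative, not part of the thesis implementation):

```python
import math
from collections import Counter

def page_entropy_bits(page: bytes) -> float:
    """Shannon entropy of a page in bits per byte (symbol).

    This lower-bounds the average compressed size per byte that any
    lossless compressor can achieve on this byte distribution.
    """
    counts = Counter(page)
    n = len(page)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A page of identical bytes has zero entropy (maximally compressible),
# while a page cycling through all 256 byte values has 8 bits/byte
# (incompressible by a byte-level model):
zero_page = bytes(4096)
mixed_page = bytes(i % 256 for i in range(4096))
```

A page with entropy close to 8 bits per byte gains nothing from compressed caching, which is exactly the "high entropy" case described above.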
In this master's thesis, we design, implement, and evaluate compressed
caching for Linux (version 2.6.22). In our implementation, we modify the PFRA (page frame reclaiming algorithm) to convert pages to compressed pages instead of evicting them, and to convert them back when they are accessed in the page cache. The compressed pages are kept in a list in the same order as they are inserted by the PFRA. We adapt the size of the compressed cache based on how it is used; when the compressed cache needs to shrink,
the oldest compressed page is evicted.
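The insertion-ordered list with evict-oldest shrinking described above can be sketched as follows; the class name, the use of zlib, and the page-id interface are assumptions for illustration, not the kernel data structures of the thesis:

```python
import zlib
from collections import OrderedDict

class CompressedCache:
    """Sketch of an ordered compressed cache: pages enter at the tail in
    reclaim order, and shrinking evicts the oldest (head) entry."""

    def __init__(self, max_pages: int):
        self.max_pages = max_pages
        # page id -> compressed data, kept in insertion (reclaim) order
        self.store = OrderedDict()

    def compress_page(self, page_id, data: bytes) -> None:
        """Called instead of evicting a page: keep it compressed in memory.
        If the cache is over its limit, the oldest entry is dropped
        (in the real system it would go to the backing store)."""
        self.store[page_id] = zlib.compress(data)
        while len(self.store) > self.max_pages:
            self.store.popitem(last=False)  # evict oldest

    def decompress_page(self, page_id):
        """On access, convert the page back to uncompressed form (a hit),
        or return None if it was already evicted (a miss)."""
        compressed = self.store.pop(page_id, None)
        return None if compressed is None else zlib.decompress(compressed)
```

A hit in this list avoids a disk read at the cost of a decompression, which is the core trade-off the thesis evaluates.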
For applications unfriendly to compressed caching, we extend an earlier approach, which disables compressed caching globally when it is not useful, with a more fine-grained cache-disabling algorithm. We do this by defining "memory areas" and disabling compressed caching for them individually based on their recent behavior. We thus build on earlier approaches to compressed caching and measure the impact on performance.
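One way to picture per-area disabling is bookkeeping of recent compression ratios and hit rates for each memory area; the class, the specific thresholds, and the decision rule below are hypothetical illustrations of the idea, not the algorithm defined in the thesis:

```python
class MemoryArea:
    """Hypothetical per-area bookkeeping: compressed caching is disabled
    for an area whose recent pages compress poorly or are rarely reused."""

    def __init__(self, max_ratio=0.8, min_hit_rate=0.1):
        self.max_ratio = max_ratio        # compressed/original must stay below this
        self.min_hit_rate = min_hit_rate  # fraction of accesses hitting the cache
        self.hits = 0
        self.misses = 0
        self.compressed_bytes = 0
        self.original_bytes = 0

    def record_compression(self, original: int, compressed: int) -> None:
        self.original_bytes += original
        self.compressed_bytes += compressed

    def record_access(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def caching_enabled(self) -> bool:
        if self.original_bytes == 0:
            return True  # no history yet: try compressing
        ratio = self.compressed_bytes / self.original_bytes
        accesses = self.hits + self.misses
        hit_rate = self.hits / accesses if accesses else 1.0
        return ratio < self.max_ratio and hit_rate >= self.min_hit_rate
```

An area with high-entropy, rarely reused pages fails both checks and falls back to ordinary eviction, while well-behaved areas keep benefiting from the cache.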
We define workloads and run tests to measure the performance of compressed caching compared to an unmodified Linux kernel. We vary the amount of main memory available and the number of processes running simultaneously
and observe the impact on performance. We then evaluate
the results and find a reduction of more than 40% in running time for some tests.
We discuss our findings and conclude that compressed caching reduces
disk I/O when there is memory contention, and can therefore increase the performance of applications that cannot keep their complete working set in main memory uncompressed, but have low enough entropy to keep it there in compressed form.