
Monday, September 5, 2022

[FIXED] Why did Redis run out of memory even with maxmemory volatile-lru?


Issue

A few days ago my Redis instance went down. All new write attempts failed with this error:

OOM command not allowed when used memory > 'maxmemory'

It only recovered after I flushed all the data. It runs on a VPS with 24 GB of RAM that only runs Redis, with this configuration:

maxmemory 20gb
maxmemory-policy volatile-lru
save ""

I use this instance to store sessions (note, though, that persistence is disabled). Every session is written into Redis with an expiration time of 2 days, which means that all the keys have a TTL:

# Keyspace
db0:keys=1426936,expires=1425758,avg_ttl=87980766
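
For illustration, a session write in this setup looks roughly like this (the key session:abc123 and the payload are made-up examples; 172800 seconds = 2 days):

redis-cli SET session:abc123 "<serialized session data>" EX 172800
redis-cli TTL session:abc123   # remaining time to live, in seconds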

Then why did it run out of memory? If the eviction policy is volatile-lru and all the keys have an expiration time set, why did it fail when it reached the maxmemory limit instead of evicting keys to free up memory?

Another thing to consider: the load of my application is very constant and stable, with no peaks. The sessions are stored with an expiration time of 2 days. It's now been six days since I restarted the instance and flushed everything, and Redis reports used_memory_human:781.54M. But when I check my server stats, I can see that memory usage had been slowly increasing until the incident. And when I say slowly, I mean really slowly: it took almost a year to reach the maxmemory=20gb limit.

But wait! How is that possible if sessions expire in two days? Could the incident be related to the fragmentation ratio? I mean, sessions expire in two days, and new sessions are being written to Redis all the time. Is it possible that the fragmentation ratio increased slowly over the course of a year, making Redis fail even though it had been configured with an eviction policy?

Or is there another theoretical situation where Redis can't free up memory fast enough? Thanks in advance!


Solution

After a lot of testing and reading, I concluded that the incident was caused by memory fragmentation. I was able to solve it by turning on "activedefrag".
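
For anyone who wants to do the same, active defragmentation can be turned on at runtime and then persisted, roughly like this (note that activedefrag requires Redis built with jemalloc, which is the default allocator on Linux):

redis-cli CONFIG SET activedefrag yes
redis-cli CONFIG REWRITE

or permanently in redis.conf:

activedefrag yes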

It turns out that memory fragmentation is something to expect in certain scenarios, and mine is one of them: a Redis instance that handles millions of sessions, where a lot of small keys are constantly being written and also deleted (because of expiration).

I was able to verify this: a few days after the incident, having flushed the data and restarted Redis, I saw mem_fragmentation_ratio increase very slowly, from 1.05 up to 2 or even more! Then I turned on activedefrag and mem_fragmentation_ratio started to decrease back down to 1.05. It's now been a week of smooth running, and mem_fragmentation_ratio never goes beyond 1.08 :)
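
In case it helps, this is the kind of check I mean; mem_fragmentation_ratio is reported in the memory section of INFO (roughly used_memory_rss divided by used_memory):

redis-cli INFO memory | grep -E 'used_memory_human|used_memory_rss_human|mem_fragmentation_ratio'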

In case you have doubts, I can say that the performance cost of turning on activedefrag is almost negligible.

Here is some interesting reading: https://serverfault.com/questions/971804/are-there-situations-were-activedefrag-should-be-kept-disabled-in-redis-5-with



Answered By - Lisandro
Answer Checked By - Timothy Miller (PHPFixing Admin)