PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Saturday, November 12, 2022

[FIXED] How does django handle multiple memcached servers?

 November 12, 2022     django, memcached, python, sharding     No comments   

Issue

In the django documentation it says this:

...

One excellent feature of Memcached is its ability to share cache over multiple servers. This means you can run Memcached daemons on multiple machines, and the program will treat the group of machines as a single cache, without the need to duplicate cache values on each machine. To take advantage of this feature, include all server addresses in LOCATION, either separated by semicolons or as a list.

...

Django's cache framework - Memcached

How exactly does this work? I've read some answers on this site that suggest this is accomplished by sharding across the servers based on hashes of the keys.

Multiple memcached servers question

How does the MemCacheStore really work with multiple servers?

That's fine, but I need a much more specific and detailed answer than that. Using django with pylibmc or python-memcached how is this sharding actually performed? Does the order of IP addresses in the configuration setting matter? What if two different web servers running the same django app have two different settings files with the IP addresses of the memcached servers in a different order? Will that result in each machine using a different sharding strategy that causes duplicate keys and other inefficiencies?

What if a particular machine shows up in the list twice? For example, what if I were to do something like this where 127.0.0.1 is actually the same machine as 172.19.26.240?

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': [
            '127.0.0.1:11211',
            '172.19.26.240:11211',
            '172.19.26.242:11211',
        ]
    }
}

What if one of the memcached servers has more capacity than the others? If machine one has as 64MB memcached and machine 2 has a 128MB, Will the sharding algorithm take that into account and give machine 2 a greater proportion of the keys?

I've also read that if a memcached server is lost, then those keys are lost. That is obvious when sharding is involved. What's more important is what will happen if a memcached server goes down and I leave its IP address in the settings file? Will django/memcached simply fail to get any keys that would have been sharded to that failed server, or will it realize that server has failed and come up with a new sharding strategy? If there is a new sharding strategy, does it intelligently take the keys that were originally intended for the failed server and divide them among the remaining servers, or does it come up with a brand new strategy as if the first server didn't exist and result in keys being duplicated?

I tried reading the source code of python-memcached, and couldn't figure this out at all. I plan to try reading the code of libmemcached and pylibmc, but I figured asking here would be easier if someone already knew.


Solution

It's the actual memcached client who does the sharding. Django only passes the configuration from settings.CACHES to the client.

The order of the servers doesn't matter*, but (at least for python-memcached) you can specify a 'weight' for each of the servers:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': [
                ('cache1.example.org:11211', 1),
                ('cache2.example.org:11211', 10),
            ],
}

I think that a quick look at memcache.py (from python-memcached) and especially memcached.Client._get_server should answer the rest of your questions:

def _get_server(self, key):
    if isinstance(key, tuple):
        serverhash, key = key
    else:
        serverhash = serverHashFunction(key)

    for i in range(Client._SERVER_RETRIES):
        server = self.buckets[serverhash % len(self.buckets)]
        if server.connect():
            #print "(using server %s)" % server,
            return server, key
        serverhash = serverHashFunction(str(serverhash) + str(i))
    return None, None

I would expect that the other memcached clients are implemented in a similar way.


Clarification by @Apreche: The order of servers does matter in one case. If you have multiple web servers, and you want them all to put the same keys on the same memcached servers, you need to configure them with the same server list in the same order with the same weights



Answered By - Jakub Roztocil
Answer Checked By - Gilberto Lyons (PHPFixing Admin)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing