MemCache分布式缓存的一个bug

Memcached分布式缓存策略不是由服务器端至支持的，多台服务器之间并不知道彼此的存在。分布式的实现是由客户端代码（Memcached.ClientLibrary）通过缓存key-server映射来实现的，基本原理就是对缓存key求hash值，用hash值对服务器数量进行模运算，该key值被分配到模运算结果为索引的那台server上。

Memcached.ClientLibrary对缓存key计算hashcode的核心算法如下：

  1 /// <summary>

  2 /// Returns appropriate SockIO object given

  3 /// string cache key and optional hashcode.

  4 ///

  5 /// Trys to get SockIO from pool.  Fails over

  6 /// to additional pools in event of server failure.

  7 /// </summary>

  8 /// <param name="key">hashcode for cache key</param>

  9 /// <param name="hashCode">if not null, then the int hashcode to use</param>

 10 /// <returns>SockIO obj connected to server</returns>

 11 public SockIO GetSock(string key, object hashCode)

 12 {

 13     string hashCodeString = "<null>";

 14     if(hashCode != null)

 15         hashCodeString = hashCode.ToString();

 16

 17     if(Log.IsDebugEnabled)

 18     {

 19         Log.Debug(GetLocalizedString("cache socket pick").Replace("$$Key$$", key).Replace("$$HashCode$$", hashCodeString));

 20     }

 21

 22     if (key == null || key.Length == 0)

 23     {

 24         if(Log.IsDebugEnabled)

 25         {

 26             Log.Debug(GetLocalizedString("null key"));

 27         }

 28         return null;

 29     }

 30

 31     if(!_initialized)

 32     {

 33         if(Log.IsErrorEnabled)

 34         {

 35             Log.Error(GetLocalizedString("get socket from uninitialized pool"));

 36         }

 37         return null;

 38     }

 39

 40     // if no servers return null

 41     if(_buckets.Count == 0)

 42         return null;

 43

 44     // if only one server, return it

 45     if(_buckets.Count == 1)

 46         return GetConnection((string)_buckets[0]);

 47

 48     int tries = 0;

 49

 50     // generate hashcode

 51     int hv;

 52     if(hashCode != null)

 53     {

 54         hv = (int)hashCode;

 55     }

 56     else

 57     {

 58

 59         // NATIVE_HASH = 0

 60         // OLD_COMPAT_HASH = 1

 61         // NEW_COMPAT_HASH = 2

 62         switch(_hashingAlgorithm)

 63         {

 64             case HashingAlgorithm.Native:

 65                 hv = key.GetHashCode();

 66                 break;

 67

 68             case HashingAlgorithm.OldCompatibleHash:

 69                 hv = OriginalHashingAlgorithm(key);

 70                 break;

 71

 72             case HashingAlgorithm.NewCompatibleHash:

 73                 hv = NewHashingAlgorithm(key);

 74                 break;

 75

 76             default:

 77                 // use the native hash as a default

 78                 hv = key.GetHashCode();

 79                 _hashingAlgorithm = HashingAlgorithm.Native;

 80                 break;

 81         }

 82     }

 83

 84     // keep trying different servers until we find one

 85     while(tries++ <= _buckets.Count)

 86     {

 87         // get bucket using hashcode

 88         // get one from factory

 89         int bucket = hv % _buckets.Count;

 90         if(bucket < 0)

 91             bucket += _buckets.Count;

 92

 93         SockIO sock = GetConnection((string)_buckets[bucket]);

 94

 95         if(Log.IsDebugEnabled)

 96         {

 97             Log.Debug(GetLocalizedString("cache choose").Replace("$$Bucket$$", _buckets[bucket].ToString()).Replace("$$Key$$", key));

 98         }

 99

100         if(sock != null)

101             return sock;

102

103         // if we do not want to failover, then bail here

104         if(!_failover)

105             return null;

106

107         // if we failed to get a socket from this server

108         // then we try again by adding an incrementer to the

109         // current key and then rehashing

110         switch(_hashingAlgorithm)

111         {

112             case HashingAlgorithm.Native:

113                 hv += ((string)("" + tries + key)).GetHashCode();

114                 break;

115

116             case HashingAlgorithm.OldCompatibleHash:

117                 hv += OriginalHashingAlgorithm("" + tries + key);

118                 break;

119

120             case HashingAlgorithm.NewCompatibleHash:

121                 hv += NewHashingAlgorithm("" + tries + key);

122                 break;

123

124             default:

125                 // use the native hash as a default

126                 hv += ((string)("" + tries + key)).GetHashCode();

127                 _hashingAlgorithm = HashingAlgorithm.Native;

128                 break;

129         }

130     }

131

132     return null;

133 }

根据缓存key得到服务器的核心代码

从源码中（62--82行代码）可以发现，计算hashcode的算法共三种：

（1）HashingAlgorithm.Native:
即使用.NET本身的hash算法，速度快，但与其他client可能不兼容，例如需要和java、ruby的client共享缓存的情况；

（2）HashingAlgorithm.OldCompatibleHash:
可以与其他客户端兼容，但速度慢；

（3）HashingAlgorithm.NewCompatibleHash:
可以与其他客户端兼容，据称速度快。

进一步分析发现，Memcached.ClientLibrary默认计算缓存key的hashcode的方式就是HashingAlgorithm.Native，而HashingAlgorithm.Native计算hashcode的算法为“hv
= key.GetHashCode()”，即用了.net类库string类型自带的GetHashCode()方法。

Bug就要浮现出来了，根据微软（http://msdn.microsoft.com/zh-cn/library/system.object.gethashcode.aspx）对GetHashCode的解释：the
.NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method
returns may differ between .NET Framework versions and platforms, such as 32-bit
and 64-bit platforms。string类型的GetHashCode()函数并不能保证不同平台同一个字符串返回的hash值相同，这样问题就出来了，对于不同服务器的同一缓存key来说，产生的hashcode可能不同，同一key对应的数据可能缓存到了不同的MemCache服务器上，数据的一致性无法保证，清除缓存的代码也可能失效。

// 64位 4.0

[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]

public unsafe override int GetHashCode()

{

    if (HashHelpers.s_UseRandomizedStringHashing)

    {

        return string.InternalMarvin32HashString(this, this.Length, 0L);

    }

    IntPtr arg_25_0;

    IntPtr expr_1C = arg_25_0 = this;

    if (expr_1C != 0)

    {

        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);

    }

    char* ptr = arg_25_0;

    int num = 5381;

    int num2 = num;

    char* ptr2 = ptr;

    int num3;

    while ((num3 = (int)(*(ushort*)ptr2)) != 0)

    {

        num = ((num << 5) + num ^ num3);

        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));

        if (num3 == 0)

        {

            break;

        }

        num2 = ((num2 << 5) + num2 ^ num3);

        ptr2 += (IntPtr)4 / 2;

    }

    return num + num2 * 1566083941;

}
// 64位 2.0

// string

[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]

public unsafe override int GetHashCode()

{

    IntPtr arg_0F_0;

    IntPtr expr_06 = arg_0F_0 = this;

    if (expr_06 != 0)

    {

        arg_0F_0 = (IntPtr)((int)expr_06 + RuntimeHelpers.OffsetToStringData);

    }

    char* ptr = arg_0F_0;

    int num = 5381;

    int num2 = num;

    char* ptr2 = ptr;

    int num3;

    while ((num3 = (int)(*(ushort*)ptr2)) != 0)

    {

        num = ((num << 5) + num ^ num3);

        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));

        if (num3 == 0)

        {

            break;

        }

        num2 = ((num2 << 5) + num2 ^ num3);

        ptr2 += (IntPtr)4 / 2;

    }

    return num + num2 * 1566083941;

}
//32位 4.0

[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]

public unsafe override int GetHashCode()

{

    if (HashHelpers.s_UseRandomizedStringHashing)

    {

        return string.InternalMarvin32HashString(this, this.Length, 0L);

    }

    IntPtr arg_25_0;

    IntPtr expr_1C = arg_25_0 = this;

    if (expr_1C != 0)

    {

        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);

    }

    char* ptr = arg_25_0;

    int num = 352654597;

    int num2 = num;

    int* ptr2 = (int*)ptr;

    int i;

    for (i = this.Length; i > 2; i -= 4)

    {

        num = ((num << 5) + num + (num >> 27) ^ *ptr2);

        num2 = ((num2 << 5) + num2 + (num2 >> 27) ^ ptr2[(IntPtr)4 / 4]);

        ptr2 += (IntPtr)8 / 4;

    }

    if (i > 0)

    {

        num = ((num << 5) + num + (num >> 27) ^ *ptr2);

    }

    return num + num2 * 1566083941;

}

GetHashCode几种版本的实现代码

解决问题的方法就是不要用MemCache默认的hash算法，实现方式有两种：

（1）初始化MemCache服务器的时候，指定为MemCahce自带其它的hash算法，代码为“this.pool.HashingAlgorithm
= HashingAlgorithm.OldCompatibleHash;”。

（2）自定义hash算法，调用set()、get()、delete()等方式时传递hash值，这几个方法有参数传递hashcode的重载。

参考资料：分析Memcached客户端如何把缓存数据分布到多个服务器上(转)、memcached client - memcacheddotnet (Memcached.ClientLibrary)
1.1.5、memcache分布式实现、Object.GetHashCode 方法、关于 HashCode做key的可能性。

时间： 2024-11-06 10:38:24

MemCache分布式缓存的一个bug

MemCache分布式缓存的一个bug的相关文章

C# Memcache分布式缓存简单入门

memcache 分布式缓存

memcache分布式缓存

什么是分布式缓存？

php5.4之分布式缓存memcache（windows7下安装配置）

分布式缓存--MVC+EF+Memcache

Nginx+Memcache+一致性hash算法实现页面分布式缓存（转）

分布式缓存——memcache原理

分布式缓存Memcache和Redis