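A Brief Look at HashMap Source Code and the Infinite Loop under Multithreading

These notes cover how HashMap's basic operations are implemented in the source, and why a get() call can spin in an infinite loop when the map is used from multiple threads.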

I. Introduction to HashMap

1. Main fields and methods of HashMap

(Default initial capacity) DEFAULT_INITIAL_CAPACITY = 16

(Maximum capacity) MAXIMUM_CAPACITY = 1 << 30

(Default load factor) DEFAULT_LOAD_FACTOR = 0.75f

(Entry array) Entry[] table

(Number of Entry instances) size

put(K key, V value) method

get(Object key) method

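2. HashMap structure and operations (constructor, put, get):

The structure is an array plus linked lists; each mapping is stored as an Entry<K,V> instance.

a. Constructor:

Structurally, a HashMap is an array. Creating a new HashMap initializes that array (default length 16, load factor 0.75):

Source: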

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
        init();
    }

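Example: table[0], table[1], table[2], table[3] ... table[15]

b. put method

When put is called, it first computes the hash of the key, then ANDs that hash with (array length minus one) to obtain the array index, and stores the key-value pair in the array as an Entry instance:

put source: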

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
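        int hash = hash(key.hashCode()); // compute the hash of the key
        int i = indexFor(hash, table.length); // derive the array index from the hash and the array length
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            // on a hash collision (e is not null), walk the linked list
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { // an equal key exists: replace its value
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        // no equal key found: add a new Entry at the head of bucket i
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }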

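indexFor source (computes the array index). The method is very simple: it ANDs the key's hash with the array length minus 1. Because the length is always a power of two, this gives the same result as hash mod length.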

    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        return h & (length-1);
    }

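Example: to add the key-value pair (7, 77), call put(7, 77). The computed hash of 7 is 7 (you can verify this yourself, so I won't explain further). indexFor then performs the AND: 0111 & 1111 = 0111, so the index is 7.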

table[0],table[1],table[2],table[3]...table[7]=7...table[15]

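Next, add element 8 with put(8, 88). The computed hash of 8 is still 8, and 1000 & 1111 = 1000, so the index is 8.

table[0],table[1],table[2],table[3]...table[7]=7,table[8]=8...table[15]

Next, add element 22 with put(22, 2222). The computed hash of 22 is 23, and 10111 & 1111 = 0111, so the index is 7. Since table[7] already holds the entry (7, 77), this is a hash collision. put first walks the chain (e = e.next) and finds no equal key, so HashMap places the new entry (22, 2222) at the head of table[7] and points its next reference at the earlier entry (7, 77), forming a linked list.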
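The index arithmetic in this example can be checked with a small sketch. The hash function below is the supplementary hash as I recall it from the older (pre-Java 8) HashMap source that this article quotes, so treat it as an assumption and not an authoritative copy. The class name HashIndexDemo is made up for illustration.

```java
// Sketch only: reproduces the index calculation for the keys 7, 8 and 22.
final class HashIndexDemo {

    // Assumption: the supplementary hash used by the pre-Java 8 HashMap.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Same as the indexFor shown above; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int[] keys = {7, 8, 22};
        for (int key : keys) {
            // Integer.hashCode() returns the int value itself.
            int h = hash(Integer.valueOf(key).hashCode());
            System.out.println("key=" + key + " hash=" + h
                    + " index=" + indexFor(h, 16));
        }
    }
}
```

Under that assumption, keys 7 and 22 both land on index 7 of a 16-slot table, which is the collision described in the text.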

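When the number of entries exceeds the current capacity * load factor (16 * 0.75 = 12), HashMap calls resize: it creates a new array twice the size of the old one and then moves the existing entries into the new array.

resize source: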

    /**
     * Rehashes the contents of this map into a new array with a
     * larger capacity.  This method is called automatically when the
     * number of keys in this map reaches its threshold.
     *
     * If current capacity is MAXIMUM_CAPACITY, this method does not
     * resize the map, but sets threshold to Integer.MAX_VALUE.
     * This has the effect of preventing future calls.
     *
     * @param newCapacity the new capacity, MUST be a power of two;
     *        must be greater than current capacity unless current
     *        capacity is MAXIMUM_CAPACITY (in which case value
     *        is irrelevant).
     */
    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }
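To see the threshold and the doubling in action, here is a stripped-down sketch of an int-to-int map. It is not the real HashMap: the class ResizeSketch, its Node type, the use of the key itself as the hash, and the `size++ >= threshold` check (written as I recall it from the older source) are all assumptions made for illustration.

```java
// Sketch only: mimics head insertion, the threshold check and the
// transfer loop of the older HashMap. Single-threaded use only.
final class ResizeSketch {
    static final class Node {
        final int key;
        int value;
        Node next;
        Node(int key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    Node[] table = new Node[16];
    int size = 0;
    int threshold = (int) (16 * 0.75f); // 12

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    void put(int key, int value) {
        int i = indexFor(key, table.length); // assumption: hash == key
        for (Node e = table[i]; e != null; e = e.next) {
            if (e.key == key) { e.value = value; return; }
        }
        table[i] = new Node(key, value, table[i]); // insert at the head
        if (size++ >= threshold) resize(2 * table.length);
    }

    void resize(int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node head : table) {          // the transfer step
            Node e = head;
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.key, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
        table = newTable;
        threshold = (int) (newCapacity * 0.75f);
    }

    Integer get(int key) {
        for (Node e = table[indexFor(key, table.length)]; e != null; e = e.next) {
            if (e.key == key) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        ResizeSketch map = new ResizeSketch();
        for (int k = 0; k < 12; k++) map.put(k, k * 10);
        System.out.println("after 12 puts: capacity=" + map.table.length);
        map.put(12, 120);
        System.out.println("after 13 puts: capacity=" + map.table.length);
        System.out.println("get(12)=" + map.get(12));
    }
}
```

In this sketch the table stays at 16 slots for the first 12 entries and doubles to 32 when the 13th is added.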

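c. get method:

When get is called, it derives the array index from the key's hash and compares the key of the entry in that slot with the argument (if the slot holds a linked list, it keeps walking the list). If the hashes are equal and the keys are the same reference or equals returns true, that entry's value is returned.

get source: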

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode()); // compute the hash
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            // walk the linked list in this slot; return the value if an equal key is found
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }
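The Javadoc above points out that a null return from get is ambiguous. A minimal sketch of telling the two cases apart with containsKey follows; the class NullValueDemo and its describe helper are hypothetical names, not part of HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: distinguishing "no mapping" from "mapped to null".
final class NullValueDemo {

    // Hypothetical helper for illustration.
    static String describe(Map<String, Integer> map, String key) {
        Integer value = map.get(key);
        if (value != null) return "mapped to " + value;
        // get returned null: ask containsKey which case this is.
        return map.containsKey(key) ? "mapped to null" : "no mapping";
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", null);
        System.out.println(describe(map, "a")); // mapped to 1
        System.out.println(describe(map, "b")); // mapped to null
        System.out.println(describe(map, "c")); // no mapping
    }
}
```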

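II. The HashMap Infinite Loop under Java Multithreading

Excerpted from: http://blog.csdn.net/xiaohui127/article/details/11928865

The normal rehash process

The original post illustrates this with a diagram, described below.

Assume our hash algorithm simply takes key mod the table size (that is, the array length).

At the top is the old hash table. Its size is 2, so the keys 3, 7 and 5 all collide in table[1] after mod 2.

The next three steps show the table being resized to 4 and every <key,value> pair being rehashed.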
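The single-threaded rehash described here can be sketched as follows, using the same simplification (hash is key mod table length) and the same keys 3, 7 and 5. The class RehashDemo and its Node type are stand-ins invented for this example; only the loop body mirrors the transfer code discussed in this article.

```java
// Sketch only: rehashing table[1] = 3 -> 7 -> 5 from size 2 to size 4.
final class RehashDemo {
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Moves every node into a new table, inserting at the head of each bucket.
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < oldTable.length; j++) {
            Node e = oldTable[j];
            oldTable[j] = null;
            while (e != null) {
                Node next = e.next;
                int i = e.key % newCapacity; // assumed hash: key mod size
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static String dump(Node[] table) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < table.length; i++) {
            sb.append("[").append(i).append("]");
            for (Node e = table[i]; e != null; e = e.next) {
                sb.append(" -> ").append(e.key);
            }
            sb.append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] old = new Node[2];
        old[1] = new Node(3, new Node(7, new Node(5, null)));
        System.out.print(dump(old));
        System.out.print(dump(transfer(old, 4)));
    }
}
```

After the transfer, key 5 sits alone in bucket 1, and bucket 3 holds 7 -> 3: head insertion reverses the relative order of nodes that stay in the same bucket. That reversal is what the concurrent case trips over.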

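Rehash under concurrency

1) Suppose we have two threads (marked in red and light blue in the original diagram).

Let's look again at this detail in the transfer code (called from resize):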

                do {
                    Entry<K,V> next = e.next; // <-- suppose thread 1 is suspended right after executing this line
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);

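Meanwhile, thread 2 runs to completion. That leaves us in the following state.

Note that thread 1's e points to key(3) and its next points to key(7). After thread 2's rehash, both now point into the list that thread 2 rebuilt, and the order of that list has been reversed.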

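2) Thread 1 is scheduled back in.

First it executes newTable[i] = e;
then e = next, which makes e point to key(7);
and in the next iteration, next = e.next makes next point to key(3).

3) All is still well.

Thread 1 carries on. It takes key(7) off the list, puts it at the head of newTable[i], and then moves e and next forward.

4) The circular link appears.

e.next = newTable[i] makes key(3).next point to key(7).

Note: at this point key(7).next already points to key(3), so a circular linked list has formed.

So when thread 1 calls get(11) on the HashMap, tragedy strikes: an infinite loop. Key 11 maps to the same bucket (11 mod 4 = 3), no node in the cycle matches it, and e.next never becomes null.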
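The interleaving can be replayed deterministically in a single thread, which makes the cycle easy to inspect. The sketch below is not real concurrent code: it hard-codes the schedule from the steps above (thread 1 pauses after next = e.next, thread 2 finishes, thread 1 resumes), uses key mod 4 as the assumed hash, and caps the lookup with a step limit so the demo terminates. RehashCycleDemo, buildCycle and lookupSpins are hypothetical names.

```java
// Sketch only: single-threaded replay of the two-thread rehash race.
final class RehashCycleDemo {
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    static Node[] buildCycle() {
        // Old table of size 2: table[1] = 3 -> 7 -> 5
        Node n5 = new Node(5, null);
        Node n7 = new Node(7, n5);
        Node n3 = new Node(3, n7);

        // Thread 1 executes "next = e.next" and is then suspended.
        Node e = n3;
        Node next = e.next; // key(7)

        // Thread 2 runs the whole transfer into its own new table,
        // leaving table2[3] = 7 -> 3 and table2[1] = 5.
        Node[] table2 = new Node[4];
        for (Node p = n3; p != null; ) {
            Node pNext = p.next;
            int i = p.key % 4;
            p.next = table2[i];
            table2[i] = p;
            p = pNext;
        }

        // Thread 1 resumes with its stale e and next, filling its own table.
        Node[] table1 = new Node[4];
        while (true) {
            int i = e.key % 4;
            e.next = table1[i];
            table1[i] = e;
            e = next;
            if (e == null) break;
            next = e.next;
        }
        return table1; // table1[3] is now 3 -> 7 -> 3 -> ...
    }

    // Walks a bucket the way get() does, but gives up after maxSteps.
    // Returns true if the walk hit the step limit without reaching null.
    static boolean lookupSpins(Node[] table, int key, int maxSteps) {
        int steps = 0;
        for (Node e = table[key % table.length]; e != null; e = e.next) {
            if (e.key == key) return false;
            if (++steps >= maxSteps) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Node[] table = buildCycle();
        Node head = table[3];
        System.out.println("cycle: " + (head.next.next == head));
        System.out.println("get(11) spins: " + lookupSpins(table, 11, 1000));
    }
}
```

In this replay, thread 1's own transfer loop still terminates; the damage only shows up later, when a lookup for a key that is absent from the cyclic bucket keeps following next forever.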

Time: 2024-10-05 01:28:46
