`
BrokenDreams
  • 浏览: 249063 次
  • 性别: Icon_minigender_1
  • 来自: 北京
博客专栏
68ec41aa-0ce6-3f83-961b-5aa541d59e48
Java并发包源码解析
浏览量:97973
社区版块
存档分类
最新评论

Jdk1.6 Collections Framework源码解析(6)-IdentityHashMap

阅读更多
        这篇总结一下java.util.IdentityHashMap。从类名上可以猜到,这个类本质应该还是一个散列表,只是前面有Identity修饰,是一种特殊的HashMap。
        简单的说,IdentityHashMap和HashMap的区别在于对key的比较。
        HashMap中会调用key的hashCode方法,hashCode方法可能会根据具体情况进行重写。在比较key时会用equals方法进行比较,equals方法也可能被重写。
    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
                       //调用key的hashCode方法
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
                                   //比较:k1==k2 或者 k1.equals(k2)
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

        IdentityHashMap中会调用System.identityHashCode(x)来获得对象的hashCode(也就是对象的hashCode方法没有被覆盖情况下的返回值),仅用“==”来进行后面key的比较。
    public V put(K key, V value) {
        Object k = maskNull(key);
        Object[] tab = table;
        int len = tab.length;
                //在hash方法中会调用System.identityHashCode(x)
        int i = hash(k, len);

        Object item;
        while ( (item = tab[i]) != null) {
            //只用==进行比较
            if (item == k) {
		V oldValue = (V) tab[i + 1];
                tab[i + 1] = value;
                return oldValue;
            }
            i = nextKeyIndex(i, len);
        }

        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        if (++size >= threshold)
            resize(len); // len == 2 * current capacity.
        return null;
    }

    private static int hash(Object x, int length) {
        int h = System.identityHashCode(x);
        // Multiply by -127, and left-shift to use least bit as part of hash
        return ((h << 1) - (h << 8)) & (length - 1);
    }


        看个具体的例子:
	public static void main(String[] args) {
		String k1 = new String("a");
		String v1 = new String("A");
		String k2 = new String("a");
		String v2 = new String("A");
		HashMap<String, String> hashMap = new HashMap<String, String>();
		hashMap.put(k1, v1);
		hashMap.put(k2, v2);
		System.out.println("hashMap:"+hashMap);
		
		IdentityHashMap<String, String> identityHashMap = new IdentityHashMap<String, String>();
		identityHashMap.put(k1, v1);
		identityHashMap.put(k2, v2);
		System.out.println("identityHashMap:"+identityHashMap);
	}

        运行结果:
hashMap:{a=A}
identityHashMap:{a=A, a=A}

        IdentityHashMap和HashMap内部都实现了散列表,但有区别,体现在对散列冲突的处理上。HashMap中以分离链表的方式来解决散列冲突,也就是将散列在同一个桶内的数据组织成一个链表结构;IdentityHashMap中则以开放寻址的方式来解决散列冲突,当发生散列冲突时,数据被放入下一个空闲地址(也叫线性探测法)。
        所以IdentityHashMap和HashMap在实现细节上区别很大,来看一下吧。
public class IdentityHashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>, java.io.Serializable, Cloneable
{
    /**
     * The initial capacity used by the no-args constructor.
     * MUST be a power of two.  The value 32 corresponds to the
     * (specified) expected maximum size of 21, given a load factor
     * of 2/3.
     */
    private static final int DEFAULT_CAPACITY = 32;

    /**
     * The minimum capacity, used if a lower value is implicitly specified
     * by either of the constructors with arguments.  The value 4 corresponds
     * to an expected maximum size of 2, given a load factor of 2/3.
     * MUST be a power of two.
     */
    private static final int MINIMUM_CAPACITY = 4;

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<29.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 29;

    /**
     * The table, resized as necessary. Length MUST always be a power of two.
     */
    private transient Object[] table;

    /**
     * The number of key-value mappings contained in this identity hash map.
     *
     * @serial
     */
    private int size;

    /**
     * The number of modifications, to support fast-fail iterators
     */
    private transient volatile int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     */
    private transient int threshold;

    /**
     * Value representing null keys inside tables.
     */
    private static final Object NULL_KEY = new Object();

        几个地方要注意下,MAXIMUM_CAPACITY是2的29次方,还记得HashMap的是2的30次方,为啥会差2呗呢?这个问题先记住。还有数组的类型是Object类型,由于数组的每个位置只放一个数据(目前看起来是这样),所以下面出现NULL_KEY,表示一个为null的键,为了区别于null。那value放在哪呢?继续往下看。
    /**
     * Constructs a new, empty identity hash map with a default expected
     * maximum size (21).
     */
    public IdentityHashMap() {
        init(DEFAULT_CAPACITY);
    }

    /**
     * Constructs a new, empty map with the specified expected maximum size.
     * Putting more than the expected number of key-value mappings into
     * the map may cause the internal data structure to grow, which may be
     * somewhat time-consuming.
     *
     * @param expectedMaxSize the expected maximum size of the map
     * @throws IllegalArgumentException if <tt>expectedMaxSize</tt> is negative
     */
    public IdentityHashMap(int expectedMaxSize) {
        if (expectedMaxSize < 0)
            throw new IllegalArgumentException("expectedMaxSize is negative: "
                                               + expectedMaxSize);
        init(capacity(expectedMaxSize));
    }

    /**
     * Returns the appropriate capacity for the specified expected maximum
     * size.  Returns the smallest power of two between MINIMUM_CAPACITY
     * and MAXIMUM_CAPACITY, inclusive, that is greater than
     * (3 * expectedMaxSize)/2, if such a number exists.  Otherwise
     * returns MAXIMUM_CAPACITY.  If (3 * expectedMaxSize)/2 is negative, it
     * is assumed that overflow has occurred, and MAXIMUM_CAPACITY is returned.
     */
    private int capacity(int expectedMaxSize) {
        // Compute min capacity for expectedMaxSize given a load factor of 2/3
        int minCapacity = (3 * expectedMaxSize)/2;

        // Compute the appropriate capacity
        int result;
        if (minCapacity > MAXIMUM_CAPACITY || minCapacity < 0) {
            result = MAXIMUM_CAPACITY;
        } else {
            result = MINIMUM_CAPACITY;
            while (result < minCapacity)
                result <<= 1;
        }
        return result;
    }

    /**
     * Initializes object to be an empty map with the specified initial
     * capacity, which is assumed to be a power of two between
     * MINIMUM_CAPACITY and MAXIMUM_CAPACITY inclusive.
     */
    private void init(int initCapacity) {
        // assert (initCapacity & -initCapacity) == initCapacity; // power of 2
        // assert initCapacity >= MINIMUM_CAPACITY;
        // assert initCapacity <= MAXIMUM_CAPACITY;

        threshold = (initCapacity * 2)/3;
        table = new Object[2 * initCapacity];
    }

    /**
     * Constructs a new identity hash map containing the keys-value mappings
     * in the specified map.
     *
     * @param m the map whose mappings are to be placed into this map
     * @throws NullPointerException if the specified map is null
     */
    public IdentityHashMap(Map<? extends K, ? extends V> m) {
        // Allow for a bit of growth
        this((int) ((1 + m.size()) * 1.1));
        putAll(m);
    }

        还记得HashMap中有个加载因子,这里没有这个属性,在代码中固定为2/3了。要注意init方法中的table = new Object[2 * initCapacity],这里怎么变成了容量的2倍了?继续看看添加方法吧。
    /**
     * Associates the specified value with the specified key in this identity
     * hash map.  If the map previously contained a mapping for the key, the
     * old value is replaced.
     *
     * @param key the key with which the specified value is to be associated
     * @param value the value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     * @see     Object#equals(Object)
     * @see     #get(Object)
     * @see     #containsKey(Object)
     */
    public V put(K key, V value) {
        Object k = maskNull(key);
        Object[] tab = table;
        int len = tab.length;
        int i = hash(k, len);

        Object item;
        while ( (item = tab[i]) != null) {
            if (item == k) {
		V oldValue = (V) tab[i + 1];
                tab[i + 1] = value;
                return oldValue;
            }
            i = nextKeyIndex(i, len);
        }

        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        if (++size >= threshold)
            resize(len); // len == 2 * current capacity.
        return null;
    }

        首先是对"Null"key的处理,将其转化成前面的NULL_KEY对象,以便计算hash值,同时能和真正为null的位置区分。
        接下来通过一个hash函数计算hash值:
    /**
     * Returns index for Object x.
     */
    private static int hash(Object x, int length) {
        int h = System.identityHashCode(x);
        // Multiply by -127, and left-shift to use least bit as part of hash
        return ((h << 1) - (h << 8)) & (length - 1);
    }

        这个hash函数中,语句((h<<1) - (h<<8)) & (length-1),后半部分之前分析过,就是取余。前面的(h<<1) - (h<<8)会保留h的后7位和一个0作为后8位,前面的由于减操作变的不确定了。这么做的意图还不是很清晰(路过高手指点下),但这样做会产生一个结果就是整合hash函数得到的结果都是偶数。为什么是偶数?继续往下看。
        接下来会用得到的hash值作为内部数据的下标来获取数组中对应的数据,如果对应位置没有数据的话(没有发生冲突),会把Key放到这个位置,然后把Value放到下一个位置(hash + 1)。也就是说,所有的Key都保存在内部数组的偶数下标位置,所有的Value都保存在所有的奇数下标位置。现在明白了为什么初始化时内部容量变成了2倍,为什么hash函数的结果是偶数了吧。所以当产生冲突的时候,会继续寻找下一个可用位置:
    /**
     * Circularly traverses table of size len.
     */
    private static int nextKeyIndex(int i, int len) {
        return (i + 2 < len ? i + 2 : 0);
    }

        有了put方法的分析,get方法也就显而易见了。
    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key == k)},
     * then this method returns {@code v}; otherwise it returns
     * {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        Object k = maskNull(key);
	Object[] tab = table;
        int len = tab.length;
        int i = hash(k, len);
        while (true) {
	    Object item = tab[i];
            if (item == k)
                return (V) tab[i + 1];
            if (item == null)
                return null;
            i = nextKeyIndex(i, len);
        }
    }

        添加和获取的方法实现起来比较容易,但删除一个数据就蛋疼了。需要考虑冲突的问题,如果删除的位置之前发生过n次冲突,那么删除动作不仅要删除当前位置的数据,还要把之前发生过冲突的n个元素的位置往前调整(不严格的说),否则线性探测链就会断掉,会影响其他操作(如get)的正确性。来看一下具体实现:
    /**
     * Removes the mapping for this key from this map if present.
     *
     * @param key key whose mapping is to be removed from the map
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V remove(Object key) {
        Object k = maskNull(key);
        Object[] tab = table;
        int len = tab.length;
        int i = hash(k, len);

        while (true) {
            Object item = tab[i];
            if (item == k) {
                modCount++;
                size--;
                V oldValue = (V) tab[i + 1];
                tab[i + 1] = null;
                tab[i] = null;
                closeDeletion(i);
                return oldValue;
            }
            if (item == null)
                return null;
            i = nextKeyIndex(i, len);
        }

    }

        可以看到,前面的代码和put类似,只是删除东西完成之后还要调用一个closeDeletion方法。
    /**
     * Rehash all possibly-colliding entries following a
     * deletion. This preserves the linear-probe
     * collision properties required by get, put, etc.
     *
     * @param d the index of a newly empty deleted slot
     */
    private void closeDeletion(int d) {
        // Adapted from Knuth Section 6.4 Algorithm R
        Object[] tab = table;
        int len = tab.length;

        // Look for items to swap into newly vacated slot
        // starting at index immediately following deletion,
        // and continuing until a null slot is seen, indicating
        // the end of a run of possibly-colliding keys.
        Object item;
        for (int i = nextKeyIndex(d, len); (item = tab[i]) != null;
             i = nextKeyIndex(i, len) ) {
            // The following test triggers if the item at slot i (which
            // hashes to be at slot r) should take the spot vacated by d.
            // If so, we swap it in, and then continue with d now at the
            // newly vacated i.  This process will terminate when we hit
            // the null slot at the end of this run.
            // The test is messy because we are using a circular table.
            int r = hash(item, len);
            if ((i < r && (r <= d || d <= i)) || (r <= d && d <= i)) {
                tab[d] = item;
                tab[d + 1] = tab[i + 1];
                tab[i] = null;
                tab[i + 1] = null;
                d = i;
            }
        }
    }

        大概过程是从删除的位置开始往下找(通过nextKeyIndex方法),当存在下一个元素的时候,通过重新计算元素hash值的方式,判断下一个元素放到本容器中时是否产生过冲突。如果产生过冲突,那么将该元素放到当前位置(d、d+1),将该元素之前位置(i、i+1)清空。然后把i赋给d,继续往下找。知道找到null位置为止。
        有了上面的分析,其他代码很容易看懂了。java.util.IdentityHashMap总结到这里。
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics