Java删除集合中的元素

16 min

写在前面

我真的了解Java吗?这其实是我经常好奇的问题之一,什么叫了解一门语言。会用?能用?遇到问题会解决问题?

当然这只是随意的漫谈,聊回这篇博客本来的问题,如何删除集合中为固定值的元素。

问题描述

import java.util.ArrayList;
import java.util.List;

public class CollectionDelete {
    public static void main(String[] args) {
        List<String> stringList = new ArrayList<>();
        stringList.add("hello");
        stringList.add("zhima");
        stringList.add("good");
        stringList.add("luck");
        stringList.add("karlyn");
        stringList.add("zhima");
        //假设我有这样一个List,我要删除集合里值为“zhima”的元素,我该如何删除呢
      	//是否可以像下面这样的删除方法删除呢
        for (String str:stringList){
            if(str.equals("zhima")){
                stringList.remove(str);
            }
        }
    }
}

答案是否定的,会抛出异常

Exception in thread "main" java.util.ConcurrentModificationException
	at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1013)
	at java.base/java.util.ArrayList$Itr.next(ArrayList.java:967)
	at CollectionDelete.main(CollectionDelete.java:14)

探究原因

其实我们应该能理解无法删除的原因,因为还在遍历这个集合,但是我们已经在修改这个集合了,其实这是不合理的,这会导致我们遍历的错误。但是我们还是要从源码的角度来看看,抛这个异常是怎么实现的。

首先,异常并不是在删除的时候抛出,而是删除完成之后,去寻找下一个元素的时被抛出的,所以抛出这个异常的不是remove方法,而是for (String str:stringList)的操作

这个操作叫做增强型循环,也叫for-each循环,是Java中典型的语法糖之一,这个语法糖在编译的时候有两种情况。

如果遍历的是一个数组的话,编译的时候等价于

int[] numbers = {1, 2, 3, 4, 5};
for (int i = 0; i < numbers.length; i++) {
    int num = numbers[i];
    System.out.println(num);
}

如果是集合的话,编译的时候等价于

List<String> stringList = new ArrayList<>();
stringList.add("hello");
stringList.add("world");

Iterator<String> iterator = stringList.iterator();
while (iterator.hasNext()) {
    String str = iterator.next();
    System.out.println(str);
}

然后我们再来看看ArrayList的Itr的定义

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    // prevent creating a synthetic constructor
    Itr() {}

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

这里我们,当调用Itr.next()方法的时候会调用checkForComodification();方法

这个方法实际上是什么呢?

final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

比较修改次数是否和期待的修改次数相同,在定义迭代器的时候,会直接设置为List的modCount值,而如果调用List.remove操作,则会修改modCount的值,具体见下面的源码

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present.  If the list does not contain the element, it is
 * unchanged.  More formally, removes the element with the lowest index
 * {@code i} such that
 * {@code Objects.equals(o, get(i))}
 * (if such an element exists).  Returns {@code true} if this list
 * contained the specified element (or equivalently, if this list
 * changed as a result of the call).
 *
 * @param o element to be removed from this list, if present
 * @return {@code true} if this list contained the specified element
 */
public boolean remove(Object o) {
    final Object[] es = elementData;
    final int size = this.size;
    int i = 0;
    found: {
        if (o == null) {
            for (; i < size; i++)
                if (es[i] == null)
                    break found;
        } else {
            for (; i < size; i++)
                if (o.equals(es[i]))
                    break found;
        }
        return false;
    }
    fastRemove(es, i);
    return true;
}

/**
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(Object[] es, int i) {
    modCount++;
    final int newSize;
    if ((newSize = size - 1) > i)
        System.arraycopy(es, i + 1, es, i, newSize - i);
    es[size = newSize] = null;
}

使用增强型循环的时候,其实并没有新建一个迭代器,而是直接使用了ArrayList的内部类Itr的next方法,该方法会用expectedModCount和ArrayList对象的modCount进行比较。

/**
 * Returns an iterator over the elements in this list in proper sequence.
 *
 * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
 *
 * @return an iterator over the elements in this list in proper sequence
 */
public Iterator<E> iterator() {
    return new Itr();
}

所以我们的for-each循环里的删除之后,再次调用迭代器之后就会出现迭代器期待的修改次数和实际修改次数的不匹配。

如何删除

那么实际上该如何删除呢?

迭代器为我们提供了一种删除方式。

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CollectionDelete {
    public static void main(String[] args) {
        List<String> stringList = new ArrayList<>();
        stringList.add("hello");
        stringList.add("zhima");
        stringList.add("good");
        stringList.add("luck");
        stringList.add("karlyn");
        stringList.add("zhima");
        Iterator<String> stringIterator = stringList.iterator();
        //假设我有这样一个List,我要删除集合里值为“zhima”的元素,我该如何删除呢
        while(stringIterator.hasNext()){
            String str = stringIterator.next();
            if(str.equals("zhima")){
                stringIterator.remove();
            }
        }
      	System.out.println(stringList);
    }
}

也就是通过迭代器删除。

[hello, good, luck, karlyn]

那么迭代器的删除是如何实现的呢?

public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        ArrayList.this.remove(lastRet);
        cursor = lastRet;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}

其实就是多做了一步工作,也就是更新了expectedModCount,这样他就会与modCount一致了,在进行下一次next()方法调用的时候就不会报错了。

如果是Map呢?

如果这段代码要删除的是Map中Value为指定值的Value呢?该如何删除呢?

import java.util.*;

public class CollectionDelete {
    public static void main(String[] args) {
        HashMap<Integer, String> map = new HashMap<>();
        map.put(1, "AA");
        map.put(2, "BB");
        map.put(3, "AA");
        map.put(4, "CC");

        Iterator<Map.Entry<Integer, String>> iterator = map.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<Integer, String> entry = iterator.next();
            if ("AA".equals(entry.getValue())) {
                iterator.remove();
            }
        }

        System.out.println(map);
    }
}

那这时候其实我就又一点疑惑,我能不能用KeySet作为迭代器,然后使用迭代器删除KeySet中的Key呢?

同时我们又回存在疑惑,如果删除KeySet中的Key,但是没有处理Value,会不会有内存泄漏的风险呢?

import java.util.*;

public class CollectionDelete {
    public static void main(String[] args) {
        HashMap<Integer, String> map = new HashMap<>();
        map.put(1, "AA");
        map.put(2, "BB");
        map.put(3, "AA");
        map.put(4, "CC");

        Iterator<Integer> iterator = map.keySet().iterator();
        while (iterator.hasNext()) {
            Integer key = iterator.next();
            if ("AA".equals(map.get(key))) {
                iterator.remove();
            }
        }

        System.out.println(map);
    }
}

首先,回答第一个问题,能否删除,答案是能删除。

输出结果为:

{2=BB, 4=CC}

那么会不会造成Value的内存泄漏呢?也就是说,是否还有指向Value的引用呢?或者说,再直白一些,HashMap的value集合是否还能获取到value呢?

我们先看看EntrySet的iterator方法。

public final Iterator<Map.Entry<K,V>> iterator() {
            return new EntryIterator();
        }

然后我们接下来看找到EntryIterator,发现它继承了抽象类HashIterator

    final class EntryIterator extends HashIterator
        implements Iterator<Map.Entry<K,V>> {
        public final Map.Entry<K,V> next() { return nextNode(); }
    }

那么我们接下来再看看KeySet的iterator方法

final class KeySet extends AbstractSet<K> {
    public final int size()                 { return size; }
    public final void clear()               { HashMap.this.clear(); }
    public final Iterator<K> iterator()     { return new KeyIterator(); }
final class KeyIterator extends HashIterator
    implements Iterator<K> {
    public final K next() { return nextNode().key; }
}

final class ValueIterator extends HashIterator
    implements Iterator<V> {
    public final V next() { return nextNode().value; }
}

可以看到,无论是KeySet还是ValueSet的迭代器都是继承了HashIterator这个抽象类,那么他们是怎么提供删除的呢?

abstract class HashIterator {
    Node<K,V> next;        // next entry to return
    Node<K,V> current;     // current entry
    int expectedModCount;  // for fast-fail
    int index;             // current slot

    HashIterator() {
        expectedModCount = modCount;
        Node<K,V>[] t = table;
        current = next = null;
        index = 0;
        if (t != null && size > 0) { // advance to first entry
            do {} while (index < t.length && (next = t[index++]) == null);
        }
    }

    public final boolean hasNext() {
        return next != null;
    }

    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }

    public final void remove() {
        Node<K,V> p = current;
        if (p == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        current = null;
        removeNode(p.hash, p.key, null, false, false);
        expectedModCount = modCount;
    }
}

这时候我们就发现了,它删除的其实就是current的Node,也就是说,其实删除的就是key value对。

或者换而言之,KeySet的迭代器,其实就是EntrySet的迭代器,只不过重写了next方法,让next的输出更少了一些。

至此,本文完结!

稚嫩的一次源码解读,写的不好,烦请见谅。

补充一条

其实在做这个实验的时候,我还发现了另一点

List<String> stringList = Arrays.asList(new String[]{"hello","zhima","karlyn","nice","day","good","luck","zhima"});

使用Arrays.asList创建的List不能修改,也就是没有实现List接口的remove方法,如果对他remove的话,会报错不支持的操作异常,UnsupportedOperationException

这点在迭代器的接口里也考虑到了,真优雅啊Java:

/**
 * Removes from the underlying collection the last element returned
 * by this iterator (optional operation).  This method can be called
 * only once per call to {@link #next}.
 * <p>
 * The behavior of an iterator is unspecified if the underlying collection
 * is modified while the iteration is in progress in any way other than by
 * calling this method, unless an overriding class has specified a
 * concurrent modification policy.
 * <p>
 * The behavior of an iterator is unspecified if this method is called
 * after a call to the {@link #forEachRemaining forEachRemaining} method.
 *
 * @implSpec
 * The default implementation throws an instance of
 * {@link UnsupportedOperationException} and performs no other action.
 *
 * @throws UnsupportedOperationException if the {@code remove}
 *         operation is not supported by this iterator
 *
 * @throws IllegalStateException if the {@code next} method has not
 *         yet been called, or the {@code remove} method has already
 *         been called after the last call to the {@code next}
 *         method
 */
default void remove() {
    throw new UnsupportedOperationException("remove");
}

不能修改的原因

通过 Arrays.asList() 创建的 List 是一个固定大小的列表(fixed-size list),它不能直接进行结构性修改(如添加或删除元素)。这是因为 Arrays.asList() 返回的是一个由数组支持的 List 实现,而不是一个普通的 ArrayList

具体原因

  1. Arrays.asList() 的实现
    • Arrays.asList() 方法返回的是 java.util.Arrays.ArrayList,这是 Arrays 类中的一个内部静态类,而不是 java.util.ArrayList
    • 这个内部类的实现是对原始数组的一个包装,它的大小是固定的,与原始数组共享数据。
  2. 固定大小的限制
    • 因为底层仍然是数组,所以不能动态调整大小。任何试图改变列表大小的操作(如 add()remove())都会抛出 UnsupportedOperationException
    • 但是,你可以对列表中的元素进行修改(例如通过索引设置新值),因为这不会改变数组的大小。

看源码!

public static <T> List<T> asList(T... a) {
    return new ArrayList<>(a);
}

private static class ArrayList<E> extends AbstractList<E>
        implements RandomAccess, java.io.Serializable
    {
        @java.io.Serial
        private static final long serialVersionUID = -2764017481108945198L;
        @SuppressWarnings("serial") // Conditionally serializable
        private final E[] a;

        ArrayList(E[] array) {
            a = Objects.requireNonNull(array);
        }

        @Override
        public int size() {
            return a.length;
        }

        @Override
        public Object[] toArray() {
            return Arrays.copyOf(a, a.length, Object[].class);
        }

        @Override
        @SuppressWarnings("unchecked")
        public <T> T[] toArray(T[] a) {
            int size = size();
            if (a.length < size)
                return Arrays.copyOf(this.a, size,
                                     (Class<? extends T[]>) a.getClass());
            System.arraycopy(this.a, 0, a, 0, size);
            if (a.length > size)
                a[size] = null;
            return a;
        }

        @Override
        public E get(int index) {
            return a[index];
        }

        @Override
        public E set(int index, E element) {
            E oldValue = a[index];
            a[index] = element;
            return oldValue;
        }

        @Override
        public int indexOf(Object o) {
            E[] a = this.a;
            if (o == null) {
                for (int i = 0; i < a.length; i++)
                    if (a[i] == null)
                        return i;
            } else {
                for (int i = 0; i < a.length; i++)
                    if (o.equals(a[i]))
                        return i;
            }
            return -1;
        }

        @Override
        public boolean contains(Object o) {
            return indexOf(o) >= 0;
        }

        @Override
        public Spliterator<E> spliterator() {
            return Spliterators.spliterator(a, Spliterator.ORDERED);
        }

        @Override
        public void forEach(Consumer<? super E> action) {
            Objects.requireNonNull(action);
            for (E e : a) {
                action.accept(e);
            }
        }

        @Override
        public void replaceAll(UnaryOperator<E> operator) {
            Objects.requireNonNull(operator);
            E[] a = this.a;
            for (int i = 0; i < a.length; i++) {
                a[i] = operator.apply(a[i]);
            }
        }

        @Override
        public void sort(Comparator<? super E> c) {
            Arrays.sort(a, c);
        }

        @Override
        public Iterator<E> iterator() {
            return new ArrayItr<>(a);
        }
    }

注意我们看这个ArrayList,它其实是Arrays的一个内部类!

这也可以解释为什么asList一定要传引用类型的数组了,因为定义了范型!

为什么它能不实现remove和add接口呢?

那当然是因为它继承了抽象类AbstractList啊,这个抽象类直接对这两个方法抛异常了!

public abstract class AbstractList<E> extends AbstractCollection<E> implements List<E> {
    /**
     * Sole constructor.  (For invocation by subclass constructors, typically
     * implicit.)
     */
    protected AbstractList() {
    }

    /**
     * Appends the specified element to the end of this list (optional
     * operation).
     *
     * <p>Lists that support this operation may place limitations on what
     * elements may be added to this list.  In particular, some
     * lists will refuse to add null elements, and others will impose
     * restrictions on the type of elements that may be added.  List
     * classes should clearly specify in their documentation any restrictions
     * on what elements may be added.
     *
     * @implSpec
     * This implementation calls {@code add(size(), e)}.
     *
     * <p>Note that this implementation throws an
     * {@code UnsupportedOperationException} unless
     * {@link #add(int, Object) add(int, E)} is overridden.
     *
     * @param e element to be appended to this list
     * @return {@code true} (as specified by {@link Collection#add})
     * @throws UnsupportedOperationException if the {@code add} operation
     *         is not supported by this list
     * @throws ClassCastException if the class of the specified element
     *         prevents it from being added to this list
     * @throws NullPointerException if the specified element is null and this
     *         list does not permit null elements
     * @throws IllegalArgumentException if some property of this element
     *         prevents it from being added to this list
     */
    public boolean add(E e) {
        add(size(), e);
        return true;
    }

    /**
     * {@inheritDoc}
     *
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public abstract E get(int index);

    /**
     * {@inheritDoc}
     *
     * @implSpec
     * This implementation always throws an
     * {@code UnsupportedOperationException}.
     *
     * @throws UnsupportedOperationException {@inheritDoc}
     * @throws ClassCastException            {@inheritDoc}
     * @throws NullPointerException          {@inheritDoc}
     * @throws IllegalArgumentException      {@inheritDoc}
     * @throws IndexOutOfBoundsException     {@inheritDoc}
     */
    public E set(int index, E element) {
        throw new UnsupportedOperationException();
    }

    /**
     * {@inheritDoc}
     *
     * @implSpec
     * This implementation always throws an
     * {@code UnsupportedOperationException}.
     *
     * @throws UnsupportedOperationException {@inheritDoc}
     * @throws ClassCastException            {@inheritDoc}
     * @throws NullPointerException          {@inheritDoc}
     * @throws IllegalArgumentException      {@inheritDoc}
     * @throws IndexOutOfBoundsException     {@inheritDoc}
     */
    public void add(int index, E element) {
        throw new UnsupportedOperationException();
    }

    /**
     * {@inheritDoc}
     *
     * @implSpec
     * This implementation always throws an
     * {@code UnsupportedOperationException}.
     *
     * @throws UnsupportedOperationException {@inheritDoc}
     * @throws IndexOutOfBoundsException     {@inheritDoc}
     */
    public E remove(int index) {
        throw new UnsupportedOperationException();
    }

其实ArrayList也继承了抽象类AbstractList,但是它自己重写了这两个方法!

至此,大完结!