2020/12/15

Java: Removing duplicate elements from a List

The following are several ways to remove duplicate elements from a List in Java.

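Pass the List to the HashSet constructor. The resulting HashSet holds only the distinct elements, because a Set cannot contain duplicates. Note that HashSet does not guarantee any iteration order.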

List<Integer> integerList = Arrays.asList(1, 2, 2, 3, 3, 4, 4, 4, 5);

Set<Integer> integerSet = new HashSet<>(integerList);
System.out.println(integerSet); // [1, 2, 3, 4, 5]
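Because HashSet makes no promise about iteration order, the original order of the List can be lost. If the order matters, one option is LinkedHashSet, which iterates in insertion order. The following is a minimal sketch; the class name OrderPreservingDedup and the helper dedupe() are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderPreservingDedup {

    // Hypothetical helper (not from the original post): removes duplicates
    // while keeping the first occurrence of each element in its original position.
    static <T> List<T> dedupe(List<T> input) {
        // LinkedHashSet ignores repeated elements and remembers insertion order.
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> integerList = Arrays.asList(5, 3, 3, 1, 5, 2);
        System.out.println(dedupe(integerList)); // [5, 3, 1, 2]
    }
}
```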

Alternatively, convert the List to a Stream and call distinct() to drop the extra duplicates.

List<Integer> integerList = Arrays.asList(1, 2, 2, 3, 3, 4, 4, 4, 5);

integerList = integerList.stream().distinct().collect(Collectors.toList());
System.out.println(integerList); // [1, 2, 3, 4, 5]

If the elements of the List are objects of a custom class, that class must override equals() and hashCode() for either approach to work.
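To see why the overrides matter, the sketch below uses a hypothetical class PlainOrder (not part of the original post) that inherits equals() and hashCode() from Object. Two instances with the same id are then treated as different elements, so neither HashSet nor distinct() removes anything.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IdentityEqualityDemo {

    // Hypothetical class for illustration: it does NOT override equals()/hashCode(),
    // so two instances are equal only when they are the very same object.
    static class PlainOrder {
        final long id;

        PlainOrder(long id) {
            this.id = id;
        }
    }

    // Counts the elements that survive HashSet-based deduplication.
    static int distinctCount(List<PlainOrder> orders) {
        Set<PlainOrder> set = new HashSet<>(orders);
        return set.size();
    }

    public static void main(String[] args) {
        List<PlainOrder> orders = Arrays.asList(new PlainOrder(1), new PlainOrder(1));
        System.out.println(distinctCount(orders));              // 2: both instances are kept
        System.out.println(orders.stream().distinct().count()); // 2: distinct() relies on equals() as well
    }
}
```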

For example, the Order class below is identified by its id, so the overridden equals() and hashCode() use only id for comparison and hashing.

Order

package com.abc.demo;

import java.util.Objects;

public class Order {

    private long id; // key
    private long amount;

    Order(long id, long amount) {
        this.id = id;
        this.amount = amount;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Order order = (Order) o;
        return id == order.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }

    @Override
    public String toString() {
        return String.format("id=%s, amount=%s", id, amount);
    }
}

The following creates a List<Order> that contains duplicate Order objects and uses HashSet to remove the duplicates.

List<Order> orderList = Arrays.asList(
        new Order(1, 10),
        new Order(2, 100),
        new Order(2, 100), // duplicate
        new Order(3, 1000),
        new Order(3, 1000) // duplicate
);

Set<Order> orderSet = new HashSet<>(orderList);
System.out.println(orderSet); // [id=1, amount=10, id=2, amount=100, id=3, amount=1000]

The same duplicates can be removed with Stream.distinct().

List<Order> orderList = Arrays.asList(
        new Order(1, 10),
        new Order(2, 100),
        new Order(2, 100), // duplicate
        new Order(3, 1000),
        new Order(3, 1000) // duplicate
);

orderList = orderList.stream().distinct().collect(Collectors.toList());
System.out.println(orderList); // [id=1, amount=10, id=2, amount=100, id=3, amount=1000]


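To remove duplicates based on several properties that are not used by equals() and hashCode(), use a TreeSet. It is a sorted Set whose ordering rule can be defined with a Comparator, and it treats two elements as duplicates when that Comparator returns 0 for them.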

For example, the equals() and hashCode() methods of the OrderPayment class below distinguish instances by id only.

OrderPayment

package com.abc.demo;

import java.util.Objects;

public class OrderPayment {

    private long id;
    private String orderId;
    private String paymentId;

    public OrderPayment(long id, String orderId, String paymentId) {
        this.id = id;
        this.orderId = orderId;
        this.paymentId = paymentId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        OrderPayment that = (OrderPayment) o;
        return id == that.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }

    @Override
    public String toString() {
        return "OrderPayment{" +
                "id=" + id +
                ", orderId='" + orderId + '\'' +
                ", paymentId='" + paymentId + '\'' +
                '}';
    }

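    // getters used by the examples below (setters omitted)
    public long getId() { return id; }
    public String getOrderId() { return orderId; }
    public String getPaymentId() { return paymentId; }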
}

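However, the List<OrderPayment> below must be deduplicated according to whether orderId and paymentId match, so the TreeSet is constructed with a Comparator that compares the concatenation of orderId and paymentId.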

List<OrderPayment> orderPaymentList = Arrays.asList(
        new OrderPayment(1, "H001", "C001"),
        new OrderPayment(1, "H001", "C001"), // duplicate
        new OrderPayment(2, "H001", "C002"),
        new OrderPayment(3, "H002", "C003"),
        new OrderPayment(3, "H002", "C003") // duplicate
);

Set<OrderPayment> set = new TreeSet<>(
        Comparator.comparing(orderPayment -> 
                orderPayment.getOrderId() + orderPayment.getPaymentId()));

set.addAll(orderPaymentList);

orderPaymentList = new ArrayList<>(set);
System.out.println(orderPaymentList); // [OrderPayment{id=1, orderId='H001', paymentId='C001'}, OrderPayment{id=2, orderId='H001', paymentId='C002'}, OrderPayment{id=3, orderId='H002', paymentId='C003'}]
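One caveat with a concatenated key: two different pairs can produce the same string, for example ("H00", "1C001") and ("H001", "C001"). A possible alternative is to compare the fields one after another with Comparator.thenComparing(). The sketch below assumes a simplified stand-in class, Payment, with only the two compared fields; the names CompositeKeyDedup, Payment, and dedupe() are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CompositeKeyDedup {

    // Simplified stand-in for OrderPayment, reduced to the two compared fields.
    static class Payment {
        private final String orderId;
        private final String paymentId;

        Payment(String orderId, String paymentId) {
            this.orderId = orderId;
            this.paymentId = paymentId;
        }

        String getOrderId() { return orderId; }
        String getPaymentId() { return paymentId; }
    }

    // Hypothetical helper: deduplicates by (orderId, paymentId), field by field.
    static List<Payment> dedupe(List<Payment> payments) {
        Comparator<Payment> byOrderThenPayment = Comparator
                .comparing(Payment::getOrderId)
                .thenComparing(Payment::getPaymentId);
        Set<Payment> set = new TreeSet<>(byOrderThenPayment);
        set.addAll(payments); // elements that compare as 0 to an existing one are skipped
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<Payment> payments = Arrays.asList(
                new Payment("H001", "C001"),
                new Payment("H001", "C001"), // duplicate
                new Payment("H00", "1C001")  // different pair, same concatenated string
        );
        System.out.println(dedupe(payments).size()); // 2
    }
}
```

With the concatenated key, the third element would have been dropped as a duplicate; comparing field by field keeps it.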

Alternatively, use Stream.collect() together with a TreeSet to remove the extra duplicates.

List<OrderPayment> orderPaymentList = Arrays.asList(
        new OrderPayment(1, "H001", "C001"),
        new OrderPayment(1, "H001", "C001"), // duplicate
        new OrderPayment(2, "H001", "C002"),
        new OrderPayment(3, "H002", "C003"),
        new OrderPayment(3, "H002", "C003") // duplicate
);

orderPaymentList = orderPaymentList.stream().collect(Collectors.collectingAndThen(
        Collectors.toCollection(() -> new TreeSet<>(
                Comparator.comparing(orderPayment -> 
                        orderPayment.getOrderId() + orderPayment.getPaymentId()
        ))), ArrayList::new));

System.out.println(orderPaymentList); // [OrderPayment{id=1, orderId='H001', paymentId='C001'}, OrderPayment{id=2, orderId='H001', paymentId='C002'}, OrderPayment{id=3, orderId='H002', paymentId='C003'}]
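A TreeSet returns its elements sorted by the Comparator, not in their original order. If the first-encounter order should be kept, one possible variation is to collect into a LinkedHashMap keyed by the compared fields and keep the first value seen for each key. This is only a sketch: EncounterOrderDedup, Payment, and dedupe() are hypothetical names, and Payment is a simplified stand-in for OrderPayment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class EncounterOrderDedup {

    // Simplified stand-in for OrderPayment, reduced to the two compared fields.
    static class Payment {
        private final String orderId;
        private final String paymentId;

        Payment(String orderId, String paymentId) {
            this.orderId = orderId;
            this.paymentId = paymentId;
        }

        String getOrderId() { return orderId; }
        String getPaymentId() { return paymentId; }

        @Override
        public String toString() { return orderId + "/" + paymentId; }
    }

    // Hypothetical helper: keeps the first element seen for each (orderId, paymentId) pair.
    static List<Payment> dedupe(List<Payment> payments) {
        Map<List<String>, Payment> firstByKey = payments.stream()
                .collect(Collectors.toMap(
                        p -> Arrays.asList(p.getOrderId(), p.getPaymentId()), // composite key
                        Function.identity(),
                        (first, second) -> first, // on a key clash, keep the earlier element
                        LinkedHashMap::new));     // preserves insertion order of keys
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<Payment> payments = Arrays.asList(
                new Payment("H002", "C003"),
                new Payment("H001", "C001"),
                new Payment("H002", "C003") // duplicate
        );
        System.out.println(dedupe(payments)); // [H002/C003, H001/C001]
    }
}
```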


2 comments:

  1. In Java 8, a List can also be deduplicated with Stream's distinct().

    The API documentation is here:
    https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#distinct--

  2. @JAVA吉他手, thank you for sharing. I'm still really not comfortable with lambdas.
