本篇為LeetCode上演算法的簡單問題,Unique Email Addresses。
題目如下:
Every email consists of a local name and a domain name, separated by the @ sign.
For example, in
alice@leetcode.com,aliceis the local name, andleetcode.comis the domain name.Besides lowercase letters, these emails may contain
'.'s or'+'s.If you add periods (
'.') between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. For example,"alice.z@leetcode.com"and"alicez@leetcode.com"forward to the same email address. (Note that this rule does not apply for domain names.)If you add a plus (
'+') in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered, for examplem.y+name@email.comwill be forwarded tomy@email.com. (Again, this rule does not apply for domain names.)It is possible to use both of these rules at the same time.
Given a list of
emails, we send one email to each address in the list. How many different addresses actually receive mails?
Example 1:Input: ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"] Output: 2 Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails
一個email包含了兩部份,分別為local name(帳號名稱)及domain name(網域名稱),兩者以@符號隔開。例如alice@leetcode.com中的alice是local name;而leetcode.com為domain name。
除了小寫字母外,email可能包含'.' or '+'符號。如果你在local name中加了'.'符號,其效果等同沒加。例如"alice.z@leetcode.com"及"alicez@leetcode.com"是相同的郵件地址。注意此規則僅適用在local name,domain name並不適用此規則。
如果在local name中加了'+',則加號'+'後面的任意符號都會被忽略,例如m.y+name@email.com與my@email.com相同。同樣地,此規則不適用domain name。
以上兩符號可能同時出現在local name中。
輸入一email清單,請查出在這清單中有幾個email地址?
其他條件
- 郵件清單長度介於1到100。
- 郵件地址的長度介於1到100。
- 每個郵件地址確實只有一個
'@'符號
第一次想到的解法步驟如下:
- 跑迴圈遍歷輸入的郵件地址清單。
- 將郵件地址拆為local name及domain name部分。
- 將local name中
'+'後的字符移除以及將'.'移除。 - 重新將local name + @ + domain name串起放入
HashSet中。 - 回傳
HashSet中裝載的郵件數目。
public static int numUniqueEmails(String[] emails) {
Set<String> set = new HashSet<>();
for (String email : emails) {
String[] ss = email.split("@");
String localName = ss[0];
String domainName = ss[1];
set.add(new StringBuilder(
localName.substring(0, localName.indexOf("+")).replaceAll("\\.", ""))
.append(domainName).toString());
}
return set.size();
}
不過這樣的效能不好,在leetCode跑分為53 ms左右。
下面使用Java 8的lambda改寫,效能更差,107ms
public static int numUniqueEmails(String[] emails) {
return Arrays.stream(emails).map((email) -> {
String[] ss = email.split("@");
String localName = ss[0];
String domainName = ss[1];
localName = localName.substring(0, localName.indexOf("+")).replaceAll("\\.", "");
email = new StringBuilder(localName).append("@").append(domainName).toString();
return email;
}).collect(Collectors.toMap(email -> email, email -> 1, (email1, email2) -> {
return email1;
})).size();
}
參考了討論區其他人的解法,改寫為下面則縮短到17ms。
public static int numUniqueEmails(String[] emails) {
Set<String> set = new HashSet<>();
for (String email : emails) {
StringBuilder name = new StringBuilder(30);
char[] chars = email.toCharArray();
int i = 0;
while (i < chars.length) {
if (chars[i] == '@') {
break;
}
name.append(chars[i]);
i++;
}
StringBuilder sb = new StringBuilder();
for (char c : name.toString().toCharArray()) {
if (c == '+') {
break;
}
if (c == '.') {
continue;
}
sb.append(c);
}
set.add(sb.append(email.substring(i)).toString());
}
return set.size();
}
主要差異在於使用String.indexOf()及relaceAll()的效能都沒有直接依需求判斷每個字符是否為'.'及'+'來的快速,因為兩個方法中都會再跑迴圈。而改寫後對'.'及'+'的處理只要跑一次迴圈就好。
從這題了解到雖然Java String或其他工具等提供了好用的方法,但若要更好的表現,程式碼則要依問題去特別設計。
參考:
沒有留言:
張貼留言