本篇為LeetCode上演算法的簡單問題,Unique Email Addresses。
題目如下:
Every email consists of a local name and a domain name, separated by the @ sign.
For example, in
alice@leetcode.com
,alice
is the local name, andleetcode.com
is the domain name.Besides lowercase letters, these emails may contain
'.'
s or'+'
s.If you add periods (
'.'
) between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. For example,"alice.z@leetcode.com"
and"alicez@leetcode.com"
forward to the same email address. (Note that this rule does not apply for domain names.)If you add a plus (
'+'
) in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered, for examplem.y+name@email.com
will be forwarded tomy@email.com
. (Again, this rule does not apply for domain names.)It is possible to use both of these rules at the same time.
Given a list of
emails
, we send one email to each address in the list. How many different addresses actually receive mails?
Example 1:
Input: ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"] Output: 2 Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails
一個email包含了兩部份,分別為local name(帳號名稱)及domain name(網域名稱),兩者以@
符號隔開。例如alice@leetcode.com
中的alice
是local name;而leetcode.com
為domain name。
除了小寫字母外,email可能包含'.'
or '+'
符號。如果你在local name中加了'.'
符號,其效果等同沒加。例如"alice.z@leetcode.com"
及"alicez@leetcode.com"
是相同的郵件地址。注意此規則僅適用在local name,domain name並不適用此規則。
如果在local name中加了'+'
,則加號'+'
後面的任意符號都會被忽略,例如m.y+name@email.com
與my@email.com
相同。同樣地,此規則不適用domain name。
以上兩符號可能同時出現在local name中。
輸入一email清單,請查出在這清單中有幾個email地址?
其他條件
- 郵件清單長度介於1到100。
- 郵件地址的長度介於1到100。
- 每個郵件地址確實只有一個
'@'
符號
第一次想到的解法步驟如下:
- 跑迴圈遍歷輸入的郵件地址清單。
- 將郵件地址拆為local name及domain name部分。
- 將local name中
'+'
後的字符移除以及將'.'
移除。 - 重新將local name + @ + domain name串起放入
HashSet
中。 - 回傳
HashSet
中裝載的郵件數目。
public static int numUniqueEmails(String[] emails) {
Set<String> set = new HashSet<>();
for (String email : emails) {
String[] ss = email.split("@");
String localName = ss[0];
String domainName = ss[1];
set.add(new StringBuilder(
localName.substring(0, localName.indexOf("+")).replaceAll("\\.", ""))
.append(domainName).toString());
}
return set.size();
}
不過這樣的效能不好,在leetCode跑分為53 ms左右。
下面使用Java 8的lambda改寫,效能更差,107ms
public static int numUniqueEmails(String[] emails) {
return Arrays.stream(emails).map((email) -> {
String[] ss = email.split("@");
String localName = ss[0];
String domainName = ss[1];
localName = localName.substring(0, localName.indexOf("+")).replaceAll("\\.", "");
email = new StringBuilder(localName).append("@").append(domainName).toString();
return email;
}).collect(Collectors.toMap(email -> email, email -> 1, (email1, email2) -> {
return email1;
})).size();
}
參考了討論區其他人的解法,改寫為下面則縮短到17ms。
public static int numUniqueEmails(String[] emails) {
Set<String> set = new HashSet<>();
for (String email : emails) {
StringBuilder name = new StringBuilder(30);
char[] chars = email.toCharArray();
int i = 0;
while (i < chars.length) {
if (chars[i] == '@') {
break;
}
name.append(chars[i]);
i++;
}
StringBuilder sb = new StringBuilder();
for (char c : name.toString().toCharArray()) {
if (c == '+') {
break;
}
if (c == '.') {
continue;
}
sb.append(c);
}
set.add(sb.append(email.substring(i)).toString());
}
return set.size();
}
主要差異在於使用String.indexOf()
及relaceAll()
的效能都沒有直接依需求判斷每個字符是否為'.'
及'+'
來的快速,因為兩個方法中都會再跑迴圈。而改寫後對'.'
及'+'
的處理只要跑一次迴圈就好。
從這題了解到雖然Java String
或其他工具等提供了好用的方法,但若要更好的表現,程式碼則要依問題去特別設計。
參考:
沒有留言:
張貼留言