HDOJ 2473 HDU 2473 Junk-Mail Filter ACM 2473 IN HDU
Posted on 2010-08-26 09:40 by MiYu. Category: ACM (union-find). Original work by MiYu; when reposting, please credit: reposted from ______________白白の屋
Problem URL:
http://acm.hdu.edu.cn/showproblem.php?pid=2473
Problem description:
Junk-Mail Filter
Time Limit: 15000/8000 MS (Java/Others)
Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 1785
Accepted Submission(s): 521
1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.
We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:
a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.
b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.
Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
Sample Input:
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0
Sample Output:
Case #1: 3
Case #2: 2
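Problem analysis:
The problem roughly says this: there are N emails, numbered 0 to N-1, and two kinds of operations.
M : merge, which combines two kinds of email into one.
S : separate, which takes one email out so that it forms a set of its own.
At the end we have to report the number of sets. From this it is easy to see that this is a union-find problem, but a naive approach will usually get TLE.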
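To make the two operations concrete, here is a brute-force reference. It is only my guess at what a "naive approach" might look like (the name NaiveFilter and its members are hypothetical, not from the original solution). Every merge relabels the whole array in O(N), so with N up to 10^5 and M up to 10^6 it would be far too slow, but it is handy for checking a faster solution on small inputs.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical brute-force reference (not the author's code):
// label[i] is the group id of email i.
struct NaiveFilter {
    std::vector<int> label;
    int nextLabel;  // next unused group id

    explicit NaiveFilter(int n) : label(n), nextLabel(n) {
        for (int i = 0; i < n; ++i) label[i] = i;  // every email starts alone
    }

    // "M x y": give every member of x's group the label of y's group. O(N).
    void merge(int x, int y) {
        assert(0 <= x && x < (int)label.size());
        assert(0 <= y && y < (int)label.size());
        int from = label[x], to = label[y];
        if (from == to) return;
        for (int i = 0; i < (int)label.size(); ++i)
            if (label[i] == from) label[i] = to;
    }

    // "S x": move x into a brand-new group of its own. O(1).
    void separate(int x) {
        assert(0 <= x && x < (int)label.size());
        label[x] = nextLabel++;
    }

    // Number of distinct labels currently in use.
    int countGroups() const {
        return (int)std::set<int>(label.begin(), label.end()).size();
    }
};
```

Feeding it the operations of the first sample case (N = 5) leaves the groups {0,1,2}, {3} and {4}.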
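The input is also very large, so do not read it with cin; it will time out. In addition, G++ and C++ submissions can generally differ in running time by about a factor of two when handling large amounts of data, so submitting as C++ is usually recommended.
Here is the code first: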
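If scanf itself ever turns out to be too slow, a common trick is to read characters directly. The sketch below is only an illustration under my own assumptions: readInt and readCommand are hypothetical helpers that do not appear in the solution, and they assume the input contains only non-negative integers and the letters M and S separated by whitespace.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: read the next non-negative integer from `in`,
// one character at a time. Returns -1 if end of file comes before any digit.
inline int readInt(std::FILE *in) {
    assert(in != NULL);
    int c = std::getc(in);
    while (c != EOF && (c < '0' || c > '9')) c = std::getc(in);  // skip non-digits
    if (c == EOF) return -1;
    int value = 0;
    while (c >= '0' && c <= '9') {
        value = value * 10 + (c - '0');
        c = std::getc(in);
    }
    return value;  // note: the one character after the number is consumed too
}

// Hypothetical helper: read the next command letter, 'M' or 'S'.
// Returns 0 at end of file.
inline char readCommand(std::FILE *in) {
    assert(in != NULL);
    int c = std::getc(in);
    while (c != EOF && c != 'M' && c != 'S') c = std::getc(in);
    return c == EOF ? 0 : (char)c;
}
```

In a submission these would be called with stdin, for example `N = readInt(stdin)`. Whether this actually beats scanf on the judge is something to measure rather than assume.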
/*
Original work by MiYu; when reposting, please credit: reposted from ______________白白の屋
http://www.cnblog.com/MiYu
Author By : MiYu
Test : 1
Program : 2473
*/
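#include <cstdio>     // scanf, printf
#include <iostream>
#include <algorithm>  // sort
using namespace std;
int set[1350005];  // parent array: N real nodes + N initial virtual roots + up to M spare roots
int a[125000];     // root of every email, used for the final count
int N,M;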
// Find the root of x, with path compression.
inline int find ( int x )
{
    return x != set[x] ? set[x] = find ( set[x] ) : set[x];
}
// Union the sets containing x and y.
inline void merge ( int x, int y )
{
    x = find ( x );
    y = find ( y );
    if ( x == y ) return ;
    set[x] = y;
}
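int main ()
{
    int ca = 1;
    while ( scanf ( "%d%d",&N,&M ), M || N )
    {
        // Real nodes 0 -> N-1 each hang under their own virtual root i+N.
        for ( int i = 0; i < N; ++ i )
            set[i] = i+N;
        // This is the key step. It costs more memory but saves a lot of time.
        // The purpose is to make nodes 0 -> N-1 leaf nodes, so that an S operation
        // on one of them cannot affect any other node. find uses path compression,
        // which only ever re-points a node to a root, and roots are always virtual
        // nodes, so the nodes we operate on are guaranteed to stay leaves.
        for ( int i = N; i <= N + N + M; ++ i )
            set[i] = i;
        int sep = N + N;  // next unused root, handed out by S operations
        int x,y; char ch[5];
        for ( int i = 0; i != M; ++ i )
        {
            scanf ( "%s",ch );
            switch ( ch[0] )
            {
                case 'M': scanf ( "%d%d",&x,&y ); merge ( x,y ); break;
                case 'S': scanf ( "%d",&x ); set[x] = sep ++; break;  // the initialization above exists for this step
            }
        }
        // Collect the root of every email, sort, and count the distinct values.
        for ( int i = 0; i != N; ++ i )
            a[i] = find ( i );
        sort ( a, a + N );
        int nCount = 1;
        for ( int i = 1; i < N; ++ i )
            if ( a[i] != a[i-1] ) nCount ++;
        printf("Case #%d: %d\n",ca++,nCount);
    }
    return 0;
}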
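The same virtual-root idea can also be packaged as a small reusable structure. This is just a sketch with hypothetical names (DeletableDSU, separate, countSets), not a rewrite of the accepted solution. It differs from the code above in two ways that I chose myself: find is iterative rather than recursive, to avoid deep recursion on long chains, and the final count uses a marker array instead of sorting.

```cpp
#include <cassert>
#include <vector>

// Sketch of the virtual-root idea, with hypothetical names.
// Slots 0..n-1 are the real emails, n..2n-1 are their initial virtual
// roots, and 2n..2n+maxOps-1 are spare roots, one per possible "S" command.
struct DeletableDSU {
    int n;
    int nextFree;  // next unused spare root
    std::vector<int> parent;

    DeletableDSU(int n_, int maxOps)
        : n(n_), nextFree(2 * n_), parent(2 * n_ + maxOps) {
        for (int i = 0; i < n; ++i) parent[i] = i + n;                // real node -> virtual root
        for (int i = n; i < (int)parent.size(); ++i) parent[i] = i;  // every virtual slot is a root
    }

    // Iterative find with path compression.
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {  // re-point the whole path at the root
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    void merge(int x, int y) {
        x = find(x);
        y = find(y);
        if (x != y) parent[x] = y;
    }

    // "S x": hang x under a fresh root. This is safe because no node ever
    // points at a real node, so nothing else is detached along with x.
    void separate(int x) {
        assert(0 <= x && x < n);
        assert(nextFree < (int)parent.size());
        parent[x] = nextFree++;
    }

    // Count the distinct roots among the real emails using a marker array.
    int countSets() {
        std::vector<char> seen(parent.size(), 0);
        int count = 0;
        for (int i = 0; i < n; ++i) {
            int r = find(i);
            if (!seen[r]) { seen[r] = 1; ++count; }
        }
        return count;
    }
};
```

For the first sample case one would construct `DeletableDSU d(5, 6)` and replay the six commands; the marker array makes the final count linear in the array size instead of O(N log N), at the cost of one extra byte per slot.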