Description
World-renowned Prof. A. N. Agram's current research deals with large anagram groups. He has just found a new application for his theory on the distribution of characters in English language texts. Given such a text, you are to find the largest anagram groups.
A text is a sequence of words. A word w is an anagram of a word v if and only if there is some permutation p of character positions that takes w to v. Then, w and v are in the same anagram group. The size of an anagram group is the number of words in that group. Find the 5 largest anagram groups.
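The definition above can be checked directly by comparing letter counts. The following is a minimal sketch, assuming lowercase a-z input as the problem states; the name isAnagram is a hypothetical helper and not part of the problem or the solution below.

```cpp
#include <array>
#include <string>

// Hypothetical helper: true when a and b contain exactly the same letters
// with the same multiplicities, i.e. one is a permutation of the other.
// Assumes both strings contain only 'a'..'z'.
bool isAnagram(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    std::array<int, 26> count{};           // zero-initialized letter counts
    for (char ch : a) ++count[ch - 'a'];   // add the letters of a
    for (char ch : b) --count[ch - 'a'];   // remove the letters of b
    for (int c : count)
        if (c != 0) return false;          // some letter count differs
    return true;
}
```

For example, isAnagram("trace", "crate") holds, while isAnagram("beta", "eat") does not because the lengths differ.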
Input
The input contains words composed of lowercase alphabetic characters, separated by whitespace (spaces or newlines). It is terminated by EOF. You can assume there will be no more than 30000 words.
Output
Output the 5 largest anagram groups. If there are fewer than 5 groups, output them all. Sort the groups by decreasing size. Break ties lexicographically by the lexicographically smallest element. For each group output, print its size and its member words. Sort the member words lexicographically and print equal words only once.
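The ordering rule can be written as a comparator. This is a sketch only: AnagramGroup, groupBefore and sortGroups are hypothetical names introduced for illustration, and it assumes every group holds at least one word.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical container for one anagram group.
struct AnagramGroup {
    int size = 0;                    // number of words, duplicates included
    std::set<std::string> members;   // distinct words, kept in sorted order
};

// Larger groups come first; ties go to the group whose smallest word is
// lexicographically smaller. Assumes both member sets are non-empty.
bool groupBefore(const AnagramGroup& a, const AnagramGroup& b) {
    if (a.size != b.size) return a.size > b.size;
    return *a.members.begin() < *b.members.begin();
}

// Sort all groups into output order.
void sortGroups(std::vector<AnagramGroup>& groups) {
    std::sort(groups.begin(), groups.end(), groupBefore);
}
```

Because std::set iterates in sorted order, its first element is the lexicographically smallest member, and walking the set prints the words sorted and without repeats.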
Sample Input
undisplayed trace tea singleton eta eat displayed crate cater carte caret beta beat bate ate abet
Sample Output
Group of size 5: caret carte cater crate trace .
Group of size 4: abet bate beat beta .
Group of size 4: ate eat eta tea .
Group of size 1: displayed .
Group of size 1: singleton .
Source
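Sort the characters of each word in ascending order to get a new string, which is the lexicographically smallest arrangement of its letters. Two words belong to the same group exactly when these smallest strings are equal. So a trie records the smallest string, and the trie node where it ends is mapped to an index into an array of group structs. Each group stores its distinct words and its word count. Because equal words must be printed only once, the words are kept in a set; as a result the comparison function must take its parameters by reference, otherwise copying the sets makes sorting very slow. Finally, the groups are sorted by count.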
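As a smaller illustration of the same grouping idea, the sketch below swaps the trie for a std::map keyed by the sorted word. This is not the author's method; Bucket, canonicalKey and groupWords are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical record for one group: total word count plus distinct words.
struct Bucket {
    int count = 0;
    std::set<std::string> words;
};

// Canonical key: the word's letters in ascending order ("trace" -> "acert").
std::string canonicalKey(std::string word) {
    std::sort(word.begin(), word.end());
    return word;
}

// Group words by canonical key. std::map stands in for the trie here.
std::map<std::string, Bucket> groupWords(const std::vector<std::string>& words) {
    std::map<std::string, Bucket> groups;
    for (const std::string& w : words) {
        Bucket& b = groups[canonicalKey(w)];
        ++b.count;           // repeated words still count toward the size
        b.words.insert(w);   // but each distinct word is stored once
    }
    return groups;
}
```

The map costs a string comparison per lookup step where the trie walks one node per character, but it avoids sizing a large static array up front.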
Code:
#include <iostream>
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <set>
#include <string>
#define MAX 30001
using namespace std;

struct group {
    int num;         // words in the group, duplicates included
    set<string> gr;  // distinct member words, kept sorted
    group() { num = 0; }
} g[MAX];

int trie[MAX * 100][26], to[MAX * 100];  // to[] maps an end node to a group index
int pos, num;                            // last trie node used, number of groups
int snum[26];                            // letter counts of the current word

// Note: pass by reference; copying the sets on each comparison is very slow.
bool cmp(const group &a, const group &b) {
    if (a.num == b.num) return *(a.gr.begin()) < *(b.gr.begin());
    return a.num > b.num;
}

void Insert(char *s) {
    int len = strlen(s);
    string s1 = s;
    string s2 = "";
    // Counting sort: s2 is s with its letters in ascending order.
    for (int i = 0; s[i]; i++) {
        snum[s[i] - 'a']++;
    }
    for (int i = 0; i < 26; i++) {
        while (snum[i]) {
            s2 += 'a' + i;
            snum[i]--;
        }
    }
    // Walk the trie along s2, creating nodes as needed.
    int i = 0, c = 0;
    while (i < len) {
        int d = s2[i] - 'a';
        if (!trie[c][d]) trie[c][d] = ++pos;
        c = trie[c][d];
        i++;
    }
    if (!to[c]) to[c] = ++num;
    g[to[c]].gr.insert(s1);
    g[to[c]].num++;
}

int main() {
    char str[100];
    while (scanf("%99s", str) == 1) {
        Insert(str);
    }
    sort(g + 1, g + 1 + num, cmp);
    // Print at most 5 groups, or all of them if there are fewer.
    for (int i = 1; i <= 5 && i <= num; i++) {
        printf("Group of size %d:", g[i].num);
        for (set<string>::iterator it = g[i].gr.begin(); it != g[i].gr.end(); it++) {
            printf(" %s", (*it).c_str());
        }
        puts(" .");
    }
    return 0;
}
Original post: https://www.cnblogs.com/8023spz/p/9629787.html