C: A Simple Hash Table

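In my third-round interview at Tencent I was asked to write a hash table. I was nervous and didn't write it well... so I failed...

Once I got home I buckled down and wrote one right away!

Nothing more to say... let me go cry in the bathroom for a while...

%>_<%

It turns out that performing on the spot and having solid fundamentals are what really matter!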

A hash table that resolves collisions with separate chaining (C, written and tested in VS2008):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct node {
    int count;          // how many times this value was inserted
    char *value;        // heap-allocated copy of the string
    struct node *next;  // next node in the same bucket
} node;

// collisions are resolved with separate chaining
typedef struct hash_table {
    int size;     // table size (number of buckets)
    node **list;  // array of chains; each chain is one bucket
} hash_table;

//==================================================//
// declarations

int hash_func(const char *str, int tableSize);
hash_table *hash_table_init(int tableSize);
node *hash_table_new_node(const char *str, size_t len);
int hash_table_insert(const char *str, hash_table *head);
node *hash_table_find(const char *str, hash_table *head);
void hash_table_clear(hash_table *head);
void hash_table_print(hash_table *head);

//==================================================//
// implementation

// hash function: return the position (bucket index) of str in the hash table
int hash_func(const char *str, int tableSize) {
    unsigned int hashVal = 0;

    // equivalent to hashVal = hashVal * 33 + c;
    // the cast keeps bytes >= 0x80 from being added as negative values
    while (*str != '\0')
        hashVal += (hashVal << 5) + (unsigned char)*str++;

    return (int)(hashVal % (unsigned int)tableSize);
}

// init & create hash table; returns NULL on failure
hash_table *hash_table_init(int tableSize) {
    hash_table *head;
    int i;

    if (tableSize <= 0)
        return NULL;

    head = (hash_table *)malloc(sizeof(hash_table));
    if (NULL == head)
        return NULL;
    // the table size should preferably be prime so that mod spreads keys as evenly as possible
    head->size = tableSize;

    // array of chains; each chain is one bucket
    head->list = (node **)malloc(sizeof(node *) * tableSize);
    if (NULL == head->list) {
        free(head);
        return NULL;
    }

    // initialize each hash list
    for (i = 0; i < head->size; i++)
        head->list[i] = NULL;

    return head;
}

// return one new node holding a copy of str, or NULL if allocation fails
node *hash_table_new_node(const char *str, size_t len) {
    node *newNode = (node *)malloc(sizeof(node));
    if (NULL == newNode)
        return NULL;

    newNode->value = (char *)malloc(len + 1);
    if (NULL == newNode->value) {
        free(newNode);
        return NULL;
    }
    memcpy(newNode->value, str, len);
    newNode->value[len] = '\0';
    newNode->count = 1;
    newNode->next = NULL;

    return newNode;
}

// insert one string into the hash table; returns its position, or -1 on failure
int hash_table_insert(const char *str, hash_table *head) {
    size_t len;
    int pos;
    node *p, *q = NULL, *newNode;

    if (NULL == head || NULL == str)
        return -1;

    len = strlen(str);
    // get str's position in the hash table
    pos = hash_func(str, head->size);

    printf("[insert] %s at pos: %d\n", str, pos);

    // locate list
    for (p = head->list[pos]; p != NULL; p = p->next) {
        // compare whole strings: memcmp over strlen(str) bytes would also
        // match a stored value that merely starts with str
        if (strcmp(p->value, str) == 0) {
            p->count++; // found the same value, count+1
            return pos;
        }
        q = p; // store the previous node
    }

    // if not found, then insert one new node
    newNode = hash_table_new_node(str, len);
    if (NULL == newNode)
        return -1;
    /*
    //===================================================================//
    // method 1:
    // insert into the head of list
    // (different strings with the same hash simply share this chain; the
    //  string comparison above tells them apart, so this works as well)
    newNode->next = head->list[pos];
    head->list[pos] = newNode;
    */
    //===================================================================//
    // method 2:
    // insert into the tail of list
    // p is NULL after the loop, so q is needed to remember the last node
    if (NULL == q)
        head->list[pos] = newNode; // empty chain
    else
        q->next = newNode;

    return pos;
}

// find the node which stored str & return it (NULL if not found)
node *hash_table_find(const char *str, hash_table *head) {
    node *p;

    if (NULL == head || NULL == str)
        return NULL;

    p = head->list[hash_func(str, head->size)];

    for ( ; p != NULL; p = p->next)
        if (strcmp(p->value, str) == 0)
            break; // found

    return p; // NULL when the chain was exhausted
}

// free the whole hash table, including head itself;
// the caller must not use head afterwards
void hash_table_clear(hash_table *head) {
    node *p = NULL, *q = NULL;
    int i;

    if (NULL == head)
        return;

    for (i = 0; i < head->size; i++) {
        p = head->list[i];
        while (p != NULL) {
            q = p->next;     // remember the next node before freeing
            free(p->value);  // free(NULL) is a no-op
            free(p);
            p = q;
        }
    }
    free(head->list);
    free(head);
}

// print the whole hash table
void hash_table_print(hash_table *head) {
    int i;
    node *p = NULL;

    if (NULL == head) {
        printf("hash table is NULL! \n");
        return;
    }

    for (i = 0; i < head->size; i++) {
        p = head->list[i];
        printf("//============list %d============//\n", i);
        while (p != NULL) {
            if (p->value)
                printf("%s:%d ", p->value, p->count);
            else
                printf("(NULL):(0) ");
            p = p->next;
        }
        printf("\n");
    }
}

// test
int main(void) {
    node *find;
    // create
    hash_table *head = hash_table_init(10);
    if (NULL == head)
        return 1;

    // insert
    hash_table_insert("test 1", head);
    hash_table_insert("test 2", head);
    hash_table_insert("test 2", head);
    hash_table_insert("test 3", head);
    hash_table_insert("test 4", head);

    hash_table_print(head);

    // find
    find = hash_table_find("test 2", head);
    if (find != NULL)
        printf("\n[Find] %s:%d\n\n", find->value, find->count);

    // clear
    hash_table_clear(head);
    head = NULL; // the table was freed; reset the pointer before using it again

    hash_table_print(head);

    // create
    head = hash_table_init(6);
    if (NULL == head)
        return 1;

    // insert
    hash_table_insert("test 1", head);
    hash_table_insert("test 2", head);
    hash_table_insert("test 2", head);
    hash_table_insert("test 3", head);
    hash_table_insert("test 4", head);

    hash_table_print(head);

    // release the second table as well
    hash_table_clear(head);

    return 0;
}

Time: 2024-10-30 02:00:39
