Radix Sort

Original post, published 2024-11-28 08:14:34

License: CC 4.0 BY-SA

I. Counting Sort

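Counting sort is a non-comparison sorting algorithm. It sorts by counting how many times each value occurs in the array, and it suits inputs whose values fall in a small range. The main idea is to use an extra count array to record the occurrences of each value, then accumulate the counts to determine where each element belongs in the sorted array.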

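Basic steps of counting sort:

Determine the value range: find the minimum and maximum of the array to be sorted, called min and max.
Create the count array: create an array count of size max - min + 1 to record how many times each value occurs. Index i stands for the value i + min, and count[i] is the number of times that value occurs.
Count frequencies: traverse the input array and record the frequency of each value in the count array.
Accumulate the counts: turn the count array into running totals, which give the position of each value in the final sorted array. Combined with filling the output while scanning the input from right to left, this keeps the sort stable (equal elements keep their relative order).
Build the sorted result: use the accumulated counts to place each element into the target array.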
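
The five steps above can be sketched as follows. This is a minimal illustration rather than the article's own listing: the function name counting_sort_stable and the use of std::vector are assumptions made for this example, and it assumes the value range max - min is small enough to allocate a count array for it.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stable counting sort for ints (hypothetical helper, not from the article).
// Assumes max - min is small enough for a count array of that size.
std::vector<int> counting_sort_stable(const std::vector<int>& input)
{
	if (input.empty()) return {};

	// Step 1: determine the value range.
	const int min = *std::min_element(input.begin(), input.end());
	const int max = *std::max_element(input.begin(), input.end());

	// Step 2: count[i] belongs to the value i + min.
	std::vector<int> count(max - min + 1, 0);

	// Step 3: count how often each value occurs.
	for (int v : input) count[v - min]++;

	// Step 4: running totals, so count[i] = number of elements <= i + min.
	for (std::size_t i = 1; i < count.size(); i++) count[i] += count[i - 1];

	// Step 5: scan the input from right to left so equal elements keep
	// their relative order; each placement uses up one slot.
	std::vector<int> output(input.size());
	for (std::size_t i = input.size(); i-- > 0;)
	{
		output[--count[input[i] - min]] = input[i];
	}
	return output;
}
```

For plain integers the stability is invisible, but the same structure matters once each element carries extra data beside its key, which is exactly what radix sort relies on.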

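Pros and cons of counting sort:

Pros:

Low time complexity: when the value range is limited, counting sort runs in O(n + k), where n is the number of elements and k is the size of the value range (max - min + 1).
Stability: counting sort is a stable sorting algorithm, meaning equal elements keep their original relative order after sorting.

Cons:

High space complexity: if the value range is large (k is large), the count array needs a lot of extra space.
Limited applicability: counting sort is normally used for integers or discrete data with integer keys. When the value range is very large, counting sort is no longer practical.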

#include<iostream>

using namespace std;

int* count_sort(int* input, int len, int min, int max)
{
	// arr[v - min] counts how many times the value v occurs.
	// The trailing () value-initializes every counter to 0.
	int* arr = new int[max - min + 1]();
	for (int i = 0; i < len; i++)
	{
		arr[input[i] - min]++;
	}

	int* ans = new int[len];
	// index is the next free slot in ans, so ans is filled in order.
	int index = 0;
	for (int i = min; i <= max; i++)
	{
		for (int j = 0; j < arr[i - min]; j++)
		{
			ans[index++] = i;
		}
	}

	delete[] arr; // the count array is no longer needed
	return ans;   // the caller owns ans and must delete[] it
}

int main()
{
	int arr[] = { 3,6,4,5,6,3,6,4,5,5,6,4,3 };
	int len = sizeof(arr) / sizeof(int);
	int* ans = count_sort(arr, len, 3, 6);

	for (int i = 0; i < len; i++) cout << arr[i] << " ";
	cout << endl;
	for (int i = 0; i < len; i++)
	{
		cout << ans[i] << " ";
	}
	cout << endl;

	delete[] ans; // release the array allocated by count_sort
	return 0;
}

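This works well when the array contains many repeated elements. But if most of the values between the minimum and the maximum never occur in the array, the algorithm needs to be redesigned. Suppose the array holds a single 1 and one thousand copies of 1000: this approach would allocate a count array of length 1000 to track just two distinct values, which is clearly wasteful.

2. General solution

1) First attempt

Code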

#include<iostream>
#include<unordered_map>
#include<algorithm>

using namespace std;

struct Compare
{
	bool operator()(unordered_map<int,int>::iterator x,unordered_map<int,int>::iterator y)
	{
		return x->first < y->first;
	}
};

int main()
{
	int arr[] = {3,3,5,9,9,3,4,5,3,4,5,9,9,9,9};
	int len = sizeof(arr) / sizeof(int);

	unordered_map<int, int> map;
	for (int i = 0; i < len; i++)
	{
		if (map.find(arr[i]) != map.end())
		{
			map.find(arr[i])++;
		}
		else
		{
			map.emplace(arr[i], 1);
		}
	}

	Compare compare;
	sort(map.begin(), map.end(),compare);

	int index = 0;
	for (auto it = map.begin(); it != map.end(); it++)
	{
		for (int i = 0; i < it->second; i++)
		{
			arr[index++] = it->first;
		}
	}

	for (int i = 0; i < len; i++)
	{
		cout << arr[i] << " ";
	}
	
	return 0;
}

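Problems:

1. unordered_map cannot be passed to sort:
unordered_map is a hash table container with no defined element order. sort requires random-access iterators, such as those of vector or array, while unordered_map only provides forward iterators, and its elements have a const key, so they cannot be swapped around. Copy the key-value pairs into a sortable container such as vector and sort that container instead.
2. The count in unordered_map is updated incorrectly:
map.find(arr[i]) returns an iterator, so applying ++ to it acts on the iterator, not on the stored count. Modify the value through the iterator's second member instead, that is, it->second++.
3. The sort comparator:
Compare's operator() is meant to order entries by key (first) in ascending order, which is the right idea. However, sort passes elements to the comparator, not iterators, so its parameters should be key-value pairs (for example const pair<int,int>&). And since unordered_map cannot be sorted directly, the entries must first be moved into another container.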
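
To make the second point concrete, here is a small sketch of updating the count through the iterator that find() returns. The function name count_occurrences is made up for this example; the calls used (find, end, emplace) are the standard unordered_map members.

```cpp
#include <unordered_map>
#include <vector>

// Count how often each value occurs (hypothetical helper for illustration).
std::unordered_map<int, int> count_occurrences(const std::vector<int>& values)
{
	std::unordered_map<int, int> counts;
	for (int v : values)
	{
		auto it = counts.find(v);   // look the key up once
		if (it != counts.end())
		{
			it->second++;           // increment the stored count, not the iterator
		}
		else
		{
			counts.emplace(v, 1);   // first occurrence of this value
		}
	}
	return counts;
}
```

Keeping the iterator in a variable also avoids looking the key up twice.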

2) Second attempt

#include<iostream>
#include<unordered_map>
#include<algorithm>
#include<vector>

using namespace std;

struct Compare
{
	bool operator()(const pair<int,int>& x,const pair<int,int>& y)
	{
		return x.first < y.first;
	}
} compare;

int main()
{
	int arr[] = {3,3,5,9,9,3,4,5,3,4,5,9,9,9,9};
	int len = sizeof(arr) / sizeof(int);

	unordered_map<int, int> map;
	for (int i = 0; i < len; i++)
	{
		if (map.find(arr[i]) != map.end())
		{
			map[arr[i]]++;
		}
		else
		{
			map[arr[i]] = 1;
		}
	}


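	// Copy the (key, count) pairs into a vector, which sort() can reorder.
	// The variable is named entries so it does not hide the vector type.
	vector<pair<int, int>> entries(map.begin(), map.end());
	sort(entries.begin(), entries.end(), compare);

	int index = 0;
	for (const auto& p : entries)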
	{
		for (int i = 0; i < p.second; i++)
		{
			arr[index++] = p.first;
		}
	}

	for (int i = 0; i < len; i++)
	{
		cout << arr[i] << " ";
	}
	
	return 0;
}

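Main improvements:

The key-value pairs are copied into a vector before sorting, the hash map is accessed with [], and a range-based for loop is used to write the result back.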
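
One possible further simplification, which is a different technique from the article's unordered_map plus sort approach: std::map keeps its keys in ascending order, so no separate sorting step is needed. The function name sort_by_counting is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Sort by counting with an ordered map (hypothetical helper).
// Iterating a std::map visits the keys in ascending order.
void sort_by_counting(std::vector<int>& values)
{
	std::map<int, int> counts;
	// operator[] creates a missing entry with count 0, so no find() is needed.
	for (int v : values) counts[v]++;

	std::size_t index = 0;
	for (const auto& entry : counts)
	{
		// Write each key back as many times as it occurred.
		for (int i = 0; i < entry.second; i++)
		{
			values[index++] = entry.first;
		}
	}
}
```

The trade-off is that every insertion into std::map costs O(log d) for d distinct keys, whereas the hash map version pays for the ordering once, in the final sort of the d entries.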

Supplement

II. Radix Sort

1) Partitioning by prefix counts

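1. Principle: consider single-digit numbers. First count how many times each digit occurs, then compute for each digit how many elements are less than or equal to it. For example, with one 0, three 1s and four 2s, there are 8 elements less than or equal to 2. So when a right-to-left scan of the original array reaches a 2, we know it belongs at index 7, because that 2 must be the last of the elements that are less than or equal to 2. Each placement then decrements the count, so the next 2 goes to index 6.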
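
The arithmetic in this example can be checked with a short sketch. The function name prefix_counts is an assumption for illustration, and the input is assumed to contain only digits 0 to 9.

```cpp
#include <array>
#include <cassert>
#include <vector>

// For each digit d in 0..9, returns how many elements are <= d.
// Hypothetical helper; assumes every element is a single digit.
std::array<int, 10> prefix_counts(const std::vector<int>& digits)
{
	std::array<int, 10> count{};   // all ten counters start at 0
	for (int d : digits)
	{
		assert(d >= 0 && d <= 9);
		count[d]++;                // how often each digit occurs
	}
	for (int d = 1; d < 10; d++)
	{
		count[d] += count[d - 1];  // running total: elements <= d
	}
	return count;
}
```

With one 0, three 1s and four 2s as input, the entry for digit 2 is 8, so the last 2 lands at index 8 - 1 = 7, matching the reasoning above.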

2. Code

I feel I am allocating more arrays here than necessary.

#include<iostream>


int main()
{
	int count[10];
	int arr[] = { 3,2,6,1,8,3,2,7,1,0,8,4,2,1,2,2,2,2,4,9};
	int assist[10];
	int len = sizeof(arr) / sizeof(int);


	// Zero-initialize the count and assist arrays.
	for (int i = 0; i < 10; i++)
	{
		count[i] = 0;
		assist[i] = 0;
	}

	for (int i = 0; i < len; i++)
	{
		count[arr[i]]++;
	}

	// A second array records how many elements are <= each digit x.
	// It is built as a running total of the per-digit counts and drives the sort.
	assist[0] = count[0];
	for (int i = 1; i < 10; i++)
	{
		assist[i] += count[i] + assist[i - 1];
	}


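	// Place each element into copy, scanning arr from right to left
	// so that equal elements keep their relative order.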
	int* copy = new int[len];
	for(int i = len - 1; i >= 0; i--)
	{
		copy[(assist[arr[i]]--) - 1] = arr[i];
	}

	// Copy the sorted data from copy back into arr, printing as we go.
	for (int i = 0; i < len; i++)
	{
		arr[i] = copy[i];
		std::cout << arr[i] << " ";
	}

	delete[] copy;
	return 0;
	
}

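2) Extracting the digits of a number

Copy the original value, then loop with while: value % 10 gives the lowest digit and value / 10 drops it, until nothing is left.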
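
A minimal sketch of that loop follows. The names digits_of and digit_at are made up for this example, and the code assumes non-negative input. It uses do/while rather than a plain while so that 0 still yields one digit.

```cpp
#include <cassert>
#include <vector>

// Digits of a non-negative value, least significant digit first.
// Hypothetical helper for illustration.
std::vector<int> digits_of(int value)
{
	assert(value >= 0);
	std::vector<int> digits;
	int rest = value;                 // work on a copy of the original value
	do
	{
		digits.push_back(rest % 10);  // lowest remaining digit
		rest /= 10;                   // drop that digit
	} while (rest > 0);
	return digits;
}

// The digit of value at one place, where offset is 1, 10, 100, ...
// This is the expression the radix sort listings use with Base = 10.
int digit_at(int value, int offset)
{
	return (value / offset) % 10;
}
```

Radix sort only ever needs one digit per pass, so the digit_at form is the one that shows up in the sorting loop.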

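3) The actual implementation

Principle:

I can't explain it clearly myself; go watch Zuo Chengyun's (左程云) video on Bilibili, lecture 28.

1. First, sorting non-negative numbers

a. First attempt

Code: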

#include<iostream>

using namespace std;

struct Solution
{
public:
	int bits;
	int* arr;
	int len;
	int Base = 10;

	Solution(int x,int y, int* input) : bits(x), len(y)
	{
		arr = new int[len];
		for (int i = 0; i < len; i++) arr[i] = input[i];
	}

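	void radixSort()
	{
		int* count = new int[10];
		int* assist = new int[len];
		// count is zeroed only once here, before the loop.
		for (int i = 0; i < 10; i++) count[i] = 0;

		// The number of digits in the largest value decides how many passes run.
		for (int offset = 1; bits > 0;offset *= Base, bits--)
		{
			// Sort on the current digit using prefix-count partitioning.
			// First count how often each digit occurs at this place.
			for (int i = 0; i < len; i++)
			{
				count[(arr[i] / offset) % Base]++;
			}

			// Why must the original array be scanned from right to left?
			// Example: 32, 34, 36, where 5 numbers have a tens digit <= 3.
			/* Scanning left to right would put 32 at index 4 and 34 at index 3,
			* which is clearly wrong. Radix sort orders the array one digit at a
			* time, and each pass must preserve the order established by the
			* previous digit, so the scan has to run from right to left.
			*/

			// Turn count into prefix counts: count[d] becomes the number of
			// elements whose current digit is <= d.
			for (int i = 1; i < Base; i++)
			{
				count[i] = count[i] + count[i - 1];
			}


			for (int i = len - 1; i >= 0; i--)
			{
				// Extract the current digit of arr[i] to find its slot in count.
				assist[--count[(arr[i] / offset) % Base]] = arr[i];
			}

			// Copy the result back into arr before the next pass.
			for (int i = 0; i < len; i++) arr[i] = assist[i];
		}

		// "delete[] count, assist;" would free only count (comma operator),
		// so each array gets its own delete[].
		delete[] count;
		delete[] assist;
	}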

	void print_arr()
	{
		cout << "this is sorted array ." << endl;
		for (int i = 0; i < len; i++)
		{
			cout << arr[i] << " ";
		}
		cout << endl;
	}
};

int main()
{
	int arr[] = { 38,32,29,44,65,66,17,28,59 };
	int len = sizeof(arr) / sizeof(int);

	// bits = 2 because the largest value, 66, has two digits.
	Solution sorted_arr(2, len, arr);
	sorted_arr.print_arr();
	sorted_arr.radixSort();
	sorted_arr.print_arr();
	return 0;
}

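At first I noticed that arr had no memory allocated, and fixed that. The result was still wrong: sorting on the ones digit worked, but higher digits did not. It turned out that the count array was not being reset between passes.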

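void radixSort()
{
	int* count = new int[10];
	int* assist = new int[len];

	// The number of digits in the largest value decides how many passes run.
	// A local counter is used so the member bits is left unchanged.
	for (int offset = 1, remaining = bits; remaining > 0; offset *= Base, remaining--)
	{
		// This step is essential: count must be reset before each digit,
		// otherwise the previous pass's data corrupts this pass.
		for (int i = 0; i < 10; i++) count[i] = 0;

		// Sort on the current digit using prefix-count partitioning.
		// First count how often each digit occurs at this place.
		for (int i = 0; i < len; i++)
		{
			count[(arr[i] / offset) % Base]++;
		}

		// Why must the original array be scanned from right to left?
		// Example: 32, 34, 36, where 5 numbers have a tens digit <= 3.
		/* Scanning left to right would put 32 at index 4 and 34 at index 3,
		* which is clearly wrong. Radix sort orders the array one digit at a
		* time, and each pass must preserve the order established by the
		* previous digit, so the scan has to run from right to left.
		*/

		// Turn count into prefix counts: count[d] becomes the number of
		// elements whose current digit is <= d.
		for (int i = 1; i < Base; i++)
		{
			count[i] = count[i] + count[i - 1];
		}

		for (int i = len - 1; i >= 0; i--)
		{
			// Extract the current digit of arr[i] to find its slot in count.
			assist[--count[(arr[i] / offset) % Base]] = arr[i];
		}

		// Copy the result back into arr before the next pass.
		for (int i = 0; i < len; i++) arr[i] = assist[i];
	}

	// "delete[] count, assist;" would free only count (comma operator),
	// so each array gets its own delete[].
	delete[] count;
	delete[] assist;
}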
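
As a possible follow-up, the caller does not have to pass the digit count by hand: it can be derived from the largest value. The sketch below is a standalone variant under stated assumptions, not the article's Solution struct. The names digit_count and radix_sort are made up for this example, std::vector replaces the raw arrays, and the input is assumed to contain only non-negative ints.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of decimal digits in a non-negative value (0 counts as one digit).
int digit_count(int value)
{
	int digits = 1;
	while (value >= 10)
	{
		value /= 10;
		digits++;
	}
	return digits;
}

// Least-significant-digit radix sort, base 10, for non-negative ints.
// Hypothetical free-function variant of the radixSort member above.
void radix_sort(std::vector<int>& values)
{
	if (values.empty()) return;
	assert(*std::min_element(values.begin(), values.end()) >= 0);

	const int base = 10;
	const int bits = digit_count(*std::max_element(values.begin(), values.end()));
	std::vector<int> assist(values.size());

	// long long so the final offset *= base cannot overflow for 10-digit ints.
	long long offset = 1;
	for (int pass = 0; pass < bits; pass++, offset *= base)
	{
		int count[base] = {0};   // fresh counters for every pass

		// Count the digits at the current place.
		for (int v : values) count[(v / offset) % base]++;

		// Prefix counts: count[d] = number of elements with digit <= d.
		for (int d = 1; d < base; d++) count[d] += count[d - 1];

		// Right-to-left scan keeps the order from the previous pass.
		for (std::size_t i = values.size(); i-- > 0;)
		{
			assist[--count[(values[i] / offset) % base]] = values[i];
		}

		// The buffer now holds this pass's result; swap instead of copying.
		values.swap(assist);
	}
}
```

Declaring count inside the loop makes the reset that fixed the first attempt automatic, and swapping the two vectors avoids the element-by-element copy back after each pass.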