C++ Basic Algorithms 5: Discretization

Latest recommended article published on 2025-07-14 15:44:13


License: CC 4.0 BY-SA


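1. The concept of discretization

Discretization is a technique commonly used in competitive programming and data processing. It maps sparse values spread over a very large range onto a compact range of consecutive integers while preserving their relative order.

For example:

Original data: [1, 10000, 1000000, 5] → after discretization: [1, 3, 4, 2]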
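The mapping in this example can be reproduced with a short function. This is a minimal sketch, not the author's code: the name `discretize` is made up for illustration, and it uses `std::lower_bound` for the lookup step.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: map each value to its 1-based rank among the
// distinct values of the input.
std::vector<int> discretize(const std::vector<int>& values) {
    // Sort a copy and drop duplicates to get the distinct values in order.
    std::vector<int> sorted_unique(values);
    std::sort(sorted_unique.begin(), sorted_unique.end());
    sorted_unique.erase(std::unique(sorted_unique.begin(), sorted_unique.end()),
                        sorted_unique.end());

    std::vector<int> ranks;
    ranks.reserve(values.size());
    for (int v : values) {
        // v is guaranteed to be present, so lower_bound lands exactly on it.
        auto it = std::lower_bound(sorted_unique.begin(), sorted_unique.end(), v);
        ranks.push_back(static_cast<int>(it - sorted_unique.begin()) + 1);
    }
    return ranks;
}
```

Calling `discretize({1, 10000, 1000000, 5})` yields `{1, 3, 4, 2}`; equal inputs receive equal ranks, so the relative order is preserved.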

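2. Main steps of discretization

The standard three-step method:

Collect all values that need to be discretized
Sort and remove duplicates
Build the mapping (binary search)

Template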

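vector<int> alls; // all values to be discretized

// 1. Collect every value
alls.push_back(x1);
alls.push_back(x2);
// ...

// 2. Sort and remove duplicates
sort(alls.begin(), alls.end());
alls.erase(unique(alls.begin(), alls.end()), alls.end());

// 3. Mapping function (binary search); assumes x is present in alls
int find(int x) {
    int l = 0, r = (int)alls.size() - 1;
    while (l < r) {
        int mid = (l + r) >> 1;
        if (alls[mid] >= x) r = mid;  // answer is at mid or to its left
        else l = mid + 1;             // answer is strictly to the right
    }
    return r + 1; // map to 1, 2, ..., n
}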
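For values that are present in the sorted array, the hand-written binary search above behaves like `std::lower_bound`. The sketch below shows that variant; the name `find_rank` and the choice to pass the vector as a parameter are assumptions made for illustration, not part of the original template.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: 1-based rank of x in a sorted, duplicate-free vector.
// Assumes x was collected into alls before sorting, so it is present.
int find_rank(const std::vector<int>& alls, int x) {
    // lower_bound returns an iterator to the first element >= x.
    auto it = std::lower_bound(alls.begin(), alls.end(), x);
    return static_cast<int>(it - alls.begin()) + 1;
}
```

With `alls = {1, 5, 10000, 1000000}`, `find_rank(alls, 5)` returns 2. If x is missing, the result is the rank of the next larger value, and for an x larger than every element the two versions differ (the hand-written loop returns n, this one returns n + 1). That is why every value to be looked up should be collected in step 1.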

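Example problem: consider an infinitely long number line on which the value at every coordinate is initially 0.

First, perform n operations; each operation adds c to the value at position x.

Then answer m queries; each query gives two integers l and r, and you must output the sum of all values in the interval [l, r].

Code: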
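To make the task concrete, here is a naive baseline that does not use discretization. It is only a sketch for cross-checking answers: the function name `range_sums_naive` and the sample data in the note below are made up. Every query walks all touched positions inside [l, r], which is what the prefix-sum solution avoids.

```cpp
#include <map>
#include <utility>
#include <vector>

// Naive baseline (hypothetical helper): keep only the touched positions in
// an ordered map and walk the keys inside [l, r] for every query.
std::vector<long long> range_sums_naive(
        const std::vector<std::pair<int, int>>& adds,      // (x, c) operations
        const std::vector<std::pair<int, int>>& queries) { // (l, r) queries
    std::map<int, long long> value_at;
    for (const auto& op : adds) value_at[op.first] += op.second;

    std::vector<long long> answers;
    for (const auto& q : queries) {
        long long sum = 0;
        // lower_bound(l) is the first key >= l; stop once a key exceeds r.
        for (auto it = value_at.lower_bound(q.first);
             it != value_at.end() && it->first <= q.second; ++it) {
            sum += it->second;
        }
        answers.push_back(sum);
    }
    return answers;
}
```

For the made-up operations (1, 2), (3, 6), (7, 5) and queries [1, 3], [4, 6], [7, 8], the answers are 8, 0 and 5.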

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

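const int N = 3e5 + 10;  // must exceed n + 2m: one x per operation, l and r per query
int a[N], s[N];          // a: value at each discretized position; s: prefix sums of a
vector<int> loc;         // every coordinate that needs to be discretized
vector<pair<int,int>> in, qu; // in: the (x, c) add operations; qu: the (l, r) queries

// Binary search that maps an original coordinate to its discretized index
int find(int x) {
    int l = 0, r = (int)loc.size() - 1;
    while (l < r) {
        int mid = (l + r) >> 1;  // same as (l + r) / 2 here
        if(loc[mid] >= x) r = mid;
        else l = mid + 1;
    }
    return r + 1;  // discretized index, starting from 1
}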

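int main() {
    int n, m;
    cin >> n >> m;  // n: number of add operations, m: number of queries

    // Part 1: read the add operations
    for (int i = 1; i <= n; i++) {
        int x, c;
        cin >> x >> c;
        loc.push_back(x);          // collect the coordinate to discretize
        in.push_back({x, c});      // remember the operation (position, increment)
        // push_back takes a single pair; the braces build pair<int,int>(x, c).
        // make_pair(x, c) would work equally well.
    }

    // Part 2: read the queries
    for (int i = 1; i <= m; i++) {
        int l, r;
        cin >> l >> r;
        loc.push_back(l);         // collect the left endpoint
        loc.push_back(r);         // collect the right endpoint
        qu.push_back({l, r});     // remember the query
    }

    // Part 3: discretize
    sort(loc.begin(), loc.end());  // sort all coordinates
    // unique moves duplicates to the back and returns the new logical end
    loc.erase(unique(loc.begin(), loc.end()), loc.end());

    // Part 4: apply the operations on the discretized array
    for(auto item : in) {
        int res = find(item.first);  // discretized index of the original coordinate
        a[res] += item.second;       // accumulate the increment at that index
    }

    // Part 5: build the prefix-sum array
    for(int i = 1; i <= (int)loc.size(); i++) {
        s[i] = s[i-1] + a[i];  // s[i] = a[1] + ... + a[i]
    }

    // Part 6: answer the queries
    for(auto item : qu) {
        int res = find(item.first);   // discretized index of the left endpoint
        int res2 = find(item.second); // discretized index of the right endpoint
        int res3 = s[res2] - s[res - 1]; // sum over [l, r]
        cout << res3 << endl;         // print the answer
    }

    return 0;
}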
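The program above adds every query endpoint to `loc` so that `find` always hits an existing coordinate. A possible alternative, shown here only as a sketch and not as the author's method, discretizes the updated positions alone and locates each query with `std::lower_bound` and `std::upper_bound`. The name `range_sums` and the use of `long long` sums are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical variant: discretize only the x coordinates of the updates.
std::vector<long long> range_sums(
        const std::vector<std::pair<int, int>>& adds,      // (x, c) operations
        const std::vector<std::pair<int, int>>& queries) { // (l, r) queries
    // Collect, sort and deduplicate the updated coordinates.
    std::vector<int> xs;
    for (const auto& op : adds) xs.push_back(op.first);
    std::sort(xs.begin(), xs.end());
    xs.erase(std::unique(xs.begin(), xs.end()), xs.end());

    // After the loops below, prefix[i] is the total over the first i
    // distinct coordinates.
    std::vector<long long> prefix(xs.size() + 1, 0);
    for (const auto& op : adds) {
        auto idx = std::lower_bound(xs.begin(), xs.end(), op.first) - xs.begin();
        prefix[idx + 1] += op.second;
    }
    for (std::size_t i = 1; i < prefix.size(); ++i) prefix[i] += prefix[i - 1];

    std::vector<long long> answers;
    for (const auto& q : queries) {
        // lo: how many coordinates are < l; hi: how many are <= r.
        auto lo = std::lower_bound(xs.begin(), xs.end(), q.first) - xs.begin();
        auto hi = std::upper_bound(xs.begin(), xs.end(), q.second) - xs.begin();
        answers.push_back(hi >= lo ? prefix[hi] - prefix[lo] : 0);
    }
    return answers;
}
```

The trade-off is a smaller coordinate array (at most n entries instead of n + 2m) in exchange for slightly less uniform lookup code, since the left and right endpoints need different searches.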

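3. The vector container

vector is one of the most flexible and widely used containers in C++: a dynamically sized array that can hold elements of almost any type.

For example: vector<int>, vector<pair<int,int>>, vector<vector<int>>, vector<user-defined type>

Commonly used member functions: push_back(), at(), front(), back(), pop_back().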
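The member functions listed above can be tried out on a small vector. The function name `vector_demo` and the values are arbitrary and chosen only for illustration.

```cpp
#include <vector>

// Hypothetical demo: exercises the member functions listed above.
std::vector<int> vector_demo() {
    std::vector<int> v;
    v.push_back(10);         // v = {10}
    v.push_back(20);         // v = {10, 20}
    v.push_back(30);         // v = {10, 20, 30}

    int first = v.front();   // 10, the first element
    int last = v.back();     // 30, the last element
    int middle = v.at(1);    // 20; at() throws std::out_of_range on a bad index

    v.pop_back();            // removes 30, v = {10, 20}
    v.push_back(first + middle + last);  // appends 60, v = {10, 20, 60}
    return v;
}
```

Unlike `operator[]`, `at()` checks the index, which can make off-by-one mistakes easier to spot while debugging.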

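4. pair<int,int>

pair is a utility class template from the C++ Standard Template Library (STL); pair<int,int> stores two integers as a single unit.

Note in particular that push_back() accepts exactly one argument, so the pair has to be built before it is passed in: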

qu.push_back({l, r});   // braced initializer list
qu.push_back(make_pair(l, r));  // make_pair deduces the pair type
qu.push_back(pair<int, int>(l, r));  // explicit constructor call

qu.push_back(l, r);  // Error! push_back accepts only one argument
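As a side note that the original post does not cover, `emplace_back` forwards its arguments to the element's constructor, so it can take the two integers directly. The sketch below puts the valid forms side by side; the function name `build_queries` and the numbers are made up.

```cpp
#include <utility>
#include <vector>

// Hypothetical demo: four ways to append a pair to a vector.
std::vector<std::pair<int, int>> build_queries() {
    std::vector<std::pair<int, int>> qu;
    qu.push_back({1, 3});                     // braced initializer list
    qu.push_back(std::make_pair(4, 6));       // make_pair
    qu.push_back(std::pair<int, int>(7, 8));  // explicit constructor call
    qu.emplace_back(9, 10);                   // builds the pair in place from two arguments
    return qu;
}
```

All four calls append one `pair<int,int>`; the returned vector holds (1, 3), (4, 6), (7, 8) and (9, 10).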