Two Ways to Implement Sensitive-Word Detection in Text with Java

1. Implementation based on the DFA algorithm

1.1 Introduction to DFA

DFA stands for Deterministic Finite Automaton. It determines the next state from an event and the current state, that is, event + state = nextstate.
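The rule event + state = nextstate can be pictured as a lookup in a transition table. The following is only a minimal sketch: the class name `DfaTransitionSketch`, the state names `START`, `SEEN_R`, `MATCHED`, and the method `next` are made up for illustration and do not come from the original article. It assumes Java 9 or later for `Map.of`.

```java
import java.util.HashMap;
import java.util.Map;

public class DfaTransitionSketch {
    // Transition table: current state -> (event -> next state).
    // All state names here are hypothetical and chosen for illustration.
    private static final Map<String, Map<Character, String>> TRANSITIONS = new HashMap<>();

    static {
        // A tiny automaton that recognizes the single word "rm":
        // START --'r'--> SEEN_R --'m'--> MATCHED
        TRANSITIONS.put("START", Map.of('r', "SEEN_R"));
        TRANSITIONS.put("SEEN_R", Map.of('m', "MATCHED"));
    }

    // event + state = nextstate; returns null when no transition is defined.
    static String next(String state, char event) {
        Map<Character, String> row = TRANSITIONS.get(state);
        return row == null ? null : row.get(event);
    }

    public static void main(String[] args) {
        String state = "START";
        for (char c : "rm".toCharArray()) {
            state = next(state, c);
        }
        System.out.println(state); // prints MATCHED
    }
}
```

Because each (state, event) pair maps to at most one next state, the automaton is deterministic: the same input always follows the same path.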

1.2 Building the model

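For example, when checking bash script commands, suppose we designate "rm", "reboot", "shutdown", "::", "/dev/null", and "rmr" as sensitive words. We then need to build a detection model from these six sensitive words, represented here in JSON format: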

{"r":{"e":{"b":{"isEnd":"0","o":{"isEnd":"0","o":{"t":{"deepCount":"6","isEnd":"1"},"isEnd":"0"}}},"isEnd":"0"},"isEnd":"0","m":{"r":{"deepCount":"3","isEnd":"1"},"deepCount":"2","isEnd":"1"}},":":{":":{"deepCount":"2","isEnd":"1"},"isEnd":"0"},"s":{"h":{"u":{"t":{"d":{"isEnd":"0","o":{"w":{"isEnd":"0","n":{"deepCount":"8","isEnd":"1"}},"isEnd":"0"}},"isEnd":"0"},"isEnd":"0"},"isEnd":"0"},"isEnd":"0"},"/":{"d":{"e":{"v":{"isEnd":"0","/":{"isEnd":"0","n":{"u":{"l":{"l":{"deepCount":"9","isEnd":"1"},"isEnd":"0"},"isEnd":"0"},"isEnd":"0"}}},"isEnd":"0"},"isEnd":"0"},"isEnd":"0"}}
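As an illustration of how a nested structure like the JSON above could be produced and then used, here is a minimal sketch. It is not the article's own code: the class name `SensitiveWordModelSketch` and the methods `build` and `longestMatch` are hypothetical. It assumes each node is a `Map<String, Object>` whose single-character keys point to child nodes, with `isEnd` and `deepCount` stored as strings to mirror the JSON.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SensitiveWordModelSketch {

    // Builds the nested map: one level per character. A node that ends a
    // word gets isEnd = "1" and deepCount = the word length, as in the JSON.
    @SuppressWarnings("unchecked")
    static Map<String, Object> build(List<String> words) {
        Map<String, Object> root = new HashMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // an empty word would wrongly mark the root as an end node
            }
            Map<String, Object> node = root;
            for (int i = 0; i < word.length(); i++) {
                String ch = String.valueOf(word.charAt(i));
                Map<String, Object> child = (Map<String, Object>) node.get(ch);
                if (child == null) {
                    child = new HashMap<>();
                    child.put("isEnd", "0");
                    node.put(ch, child);
                }
                node = child;
            }
            node.put("isEnd", "1");
            node.put("deepCount", String.valueOf(word.length()));
        }
        return root;
    }

    // Returns the length of the longest sensitive word that begins at
    // index start in text, or 0 if none begins there.
    @SuppressWarnings("unchecked")
    static int longestMatch(Map<String, Object> root, String text, int start) {
        Map<String, Object> node = root;
        int matched = 0;
        for (int i = start; i < text.length(); i++) {
            Object next = node.get(String.valueOf(text.charAt(i)));
            if (!(next instanceof Map)) {
                break; // no transition for this character
            }
            node = (Map<String, Object>) next;
            if ("1".equals(node.get("isEnd"))) {
                matched = i - start + 1;
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, Object> root = build(
                List.of("rm", "reboot", "shutdown", "::", "/dev/null", "rmr"));
        System.out.println(longestMatch(root, "rmr -rf /tmp", 0)); // prints 3
        System.out.println(longestMatch(root, "sudo reboot", 5));  // prints 6
        System.out.println(longestMatch(root, "echo hi", 0));      // prints 0
    }
}
```

Scanning continues past the first end node, so "rmr" is reported in preference to its prefix "rm". Whether the original implementation prefers the longest or the shortest match is not shown in the source.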

Code implementation:

private static Map initSensitiveWordMap(){
        String key = null;
        