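Artificial intelligence, transfer learning: a knowledge-base question answering system that uses adversarial transfer learning with an attention mechanism for Chinese named entity recognition and a BERT model for sentence similarity analysis (CSDN download resource)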

495 files in total

py: 333

dll: 57

pyd: 27

Question answering system

Attention mechanism

Named entity recognition

121 views | Uploaded 2025-01-01 21:47:10 | 76.17MB ZIP


Archive contents (495 files):

activate 2KB

activate.bat 1KB

deactivate.bat 368B

sysconfig.cfg 3KB

pyvenv.cfg 76B

q_t_a_df_training.csv 2.21MB

q_t_a_df_testing.csv 1.51MB

clean_triple.csv 1.13MB

lstm+crf-4440.data-00000-of-00001 32.41MB

python37.dll 3.67MB

libcrypto-1_1-x64.dll 2.37MB

tcl86t.dll 1.65MB

tk86t.dll 1.41MB

sqlite3.dll 1.16MB

ucrtbase.dll 993KB

msvcp140.dll 611KB

libssl-1_1-x64.dll 517KB

vccorlib140.dll 378KB

xlwings64-0.11.8.dll 359KB

concrt140.dll 322KB

xlwings32-0.11.8.dll 287KB

msvcp140_2.dll 191KB

vcomp140.dll 152KB

vcruntime140.dll 85KB

api-ms-win-crt-private-l1-1-0.dll 70KB

python3.dll 51KB

msvcp140_1.dll 31KB

api-ms-win-crt-math-l1-1-0.dll 27KB

api-ms-win-crt-multibyte-l1-1-0.dll 26KB

api-ms-win-crt-stdio-l1-1-0.dll 24KB

api-ms-win-crt-string-l1-1-0.dll 24KB

api-ms-win-crt-runtime-l1-1-0.dll 23KB

api-ms-win-crt-convert-l1-1-0.dll 22KB

api-ms-win-core-file-l1-1-0.dll 22KB

api-ms-win-core-localization-l1-2-0.dll 21KB

api-ms-win-crt-time-l1-1-0.dll 21KB

api-ms-win-core-processthreads-l1-1-0.dll 20KB

api-ms-win-crt-filesystem-l1-1-0.dll 20KB

api-ms-win-core-synch-l1-1-0.dll 20KB

api-ms-win-crt-process-l1-1-0.dll 19KB

api-ms-win-core-processenvironment-l1-1-0.dll 19KB

api-ms-win-crt-heap-l1-1-0.dll 19KB

api-ms-win-core-sysinfo-l1-1-0.dll 19KB

api-ms-win-crt-conio-l1-1-0.dll 19KB

api-ms-win-core-libraryloader-l1-1-0.dll 19KB

api-ms-win-core-console-l1-1-0.dll 19KB

api-ms-win-core-processthreads-l1-1-1.dll 19KB

api-ms-win-core-synch-l1-2-0.dll 19KB

api-ms-win-core-heap-l1-1-0.dll 19KB

api-ms-win-core-memory-l1-1-0.dll 19KB

api-ms-win-core-rtlsupport-l1-1-0.dll 19KB

api-ms-win-core-timezone-l1-1-0.dll 19KB

api-ms-win-crt-utility-l1-1-0.dll 19KB

api-ms-win-crt-environment-l1-1-0.dll 19KB

api-ms-win-crt-locale-l1-1-0.dll 19KB

api-ms-win-core-file-l2-1-0.dll 18KB

api-ms-win-core-interlocked-l1-1-0.dll 18KB

api-ms-win-core-errorhandling-l1-1-0.dll 18KB

api-ms-win-core-debug-l1-1-0.dll 18KB

api-ms-win-core-file-l1-2-0.dll 18KB

api-ms-win-core-util-l1-1-0.dll 18KB

api-ms-win-core-namedpipe-l1-1-0.dll 18KB

api-ms-win-core-datetime-l1-1-0.dll 18KB

api-ms-win-core-string-l1-1-0.dll 18KB

api-ms-win-core-handle-l1-1-0.dll 18KB

api-ms-win-core-profile-l1-1-0.dll 18KB

setuptools-40.8.0-py3.7.egg 559KB

t64.exe 100KB

w64.exe 97KB

python.exe 92KB

t32.exe 91KB

pythonw.exe 90KB

w32.exe 87KB

pip.exe 73KB

pip3.exe 73KB

easy_install.exe 73KB

easy_install-3.7.exe 73KB

pip3.7.exe 73KB

lstm+crf-4440.index 5KB

nlpcc-iccpol-2016.kbqa.kb 30KB

LICENSE 11KB

recommend_articles.log 2.17MB

README.md 41KB

multilingual.md 11KB

CONTRIBUTING.md 1KB

lstm+crf-4440.meta 14.94MB

not-zip-safe 1B

nlpcc2016_cws_label.npy 26.53MB

nlpcc2016_cws_word.npy 26.53MB

nlpcc2016_vector.npy 11.87MB

nlpcc2016_train_label.npy 8.32MB

nlpcc2016_train_word.npy 8.32MB

nlpcc2016_test_word.npy 5.5MB

nlpcc2016_test_label.npy 5.5MB

nlpcc2016_cws_length.npy 340KB

nlpcc2016_train_length.npy 107KB

nlpcc2016_test_length.npy 71KB

cacert.pem 269KB

PKG-INFO 3KB

kbqa.png 41KB


# BERT

**\*\*\*\*\* New November 15th, 2018: SOTA SQuAD 2.0 System \*\*\*\*\***

We released code changes to reproduce our 83% F1 SQuAD 2.0 system, which is currently 1st place on the leaderboard by 3%. See the SQuAD 2.0 section of the README for details.

**\*\*\*\*\* New November 5th, 2018: Third-party PyTorch and Chainer versions of BERT available \*\*\*\*\***

NLP researchers from HuggingFace made a [PyTorch version of BERT available](https://github.com/huggingface/pytorch-pretrained-BERT) which is compatible with our pre-trained checkpoints and is able to reproduce our results. Sosuke Kobayashi also made a [Chainer version of BERT available](https://github.com/soskek/bert-chainer) (Thanks!) We were not involved in the creation or maintenance of the PyTorch implementation so please direct any questions towards the authors of that repository.

**\*\*\*\*\* New November 3rd, 2018: Multilingual and Chinese models available \*\*\*\*\***

We have made two new BERT models available:

*   **[`BERT-Base, Multilingual`](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip)**: 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
*   **[`BERT-Base, Chinese`](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)**: Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters

We use character-based tokenization for Chinese, and WordPiece tokenization for all other languages. Both models should work out-of-the-box without any code changes. We did update the implementation of `BasicTokenizer` in `tokenization.py` to support Chinese character tokenization, so please update if you forked it. However, we did not change the tokenization API.
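
As a rough illustration of what character-based tokenization means for Chinese, the sketch below puts whitespace around each CJK ideograph so that every character becomes its own token while other words stay whole. It is a simplified stand-in, not the actual `BasicTokenizer` code: the function names `is_cjk_char` and `split_cjk_chars` are hypothetical, only a few Unicode ideograph blocks are checked, and the WordPiece step for the remaining words is left out.

```python
from typing import List


def is_cjk_char(ch: str) -> bool:
    """Return True if `ch` falls in one of a few CJK ideograph blocks.

    Assumption: this short list of ranges is illustrative, not exhaustive.
    """
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF     # CJK Extension A
        or 0x20000 <= cp <= 0x2A6DF   # CJK Extension B
        or 0xF900 <= cp <= 0xFAFF     # CJK Compatibility Ideographs
    )


def split_cjk_chars(text: str) -> List[str]:
    """Split on whitespace, treating every CJK ideograph as its own token."""
    pieces = []
    for ch in text:
        # Surround ideographs with spaces so str.split() isolates them.
        pieces.append(f" {ch} " if is_cjk_char(ch) else ch)
    return "".join(pieces).split()


if __name__ == "__main__":
    print(split_cjk_chars("BERT模型 works"))  # ['BERT', '模', '型', 'works']
```

Non-Chinese text passes through unchanged apart from whitespace splitting, which is where WordPiece would take over in a full tokenizer.
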
For more, see the [Multilingual README](https://github.com/google-research/bert/blob/master/multilingual.md).

**\*\*\*\*\* End new information \*\*\*\*\***

## Introduction

**BERT**, or **B**idirectional **E**ncoder **R**epresentations from **T**ransformers, is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks.

Our academic paper which describes BERT in detail and provides full results on a number of tasks can be found here: [https://arxiv.org/abs/1810.04805](https://arxiv.org/abs/1810.04805).

To give a few numbers, here are the results on the [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/) question answering task:

SQuAD v1.1 Leaderboard (Oct 8th 2018) | Test EM  | Test F1
------------------------------------- | :------: | :------:
1st Place Ensemble - BERT             | **87.4** | **93.2**
2nd Place Ensemble - nlnet            | 86.0     | 91.7
1st Place Single Model - BERT         | **85.1** | **91.8**
2nd Place Single Model - nlnet        | 83.5     | 90.1

And several natural language inference tasks:

System                  | MultiNLI | Question NLI | SWAG
----------------------- | :------: | :----------: | :------:
BERT                    | **86.7** | **91.1**     | **86.3**
OpenAI GPT (Prev. SOTA) | 82.2     | 88.1         | 75.0

Plus many other tasks.

Moreover, these results were all obtained with almost no task-specific neural network architecture design.

If you already know what BERT is and you just want to get started, you can [download the pre-trained models](#pre-trained-models) and [run a state-of-the-art fine-tuning](#fine-tuning-with-bert) in only a few minutes.

## What is BERT?

BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering).

BERT outperforms previous methods because it is the first *unsupervised*, *deeply bidirectional* system for pre-training NLP.

*Unsupervised* means that BERT was trained using only a plain text corpus, which is important because an enormous amount of plain text data is publicly available on the web in many languages.

Pre-trained representations can also either be *context-free* or *contextual*, and contextual representations can further be *unidirectional* or *bidirectional*. Context-free models such as [word2vec](https://www.tensorflow.org/tutorials/representation/word2vec) or [GloVe](https://nlp.stanford.edu/projects/glove/) generate a single "word embedding" representation for each word in the vocabulary, so `bank` would have the same representation in `bank deposit` and `river bank`. Contextual models instead generate a representation of each word that is based on the other words in the sentence.

BERT was built upon recent work in pre-training contextual representations, including [Semi-supervised Sequence Learning](https://arxiv.org/abs/1511.01432), [Generative Pre-Training](https://blog.openai.com/language-unsupervised/), [ELMo](https://allennlp.org/elmo), and [ULMFit](http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html), but crucially these models are all *unidirectional* or *shallowly bidirectional*. This means that each word is only contextualized using the words to its left (or right).
For example, in the sentence `I made a bank deposit` the unidirectional representation of `bank` is only based on `I made a` but not `deposit`. Some previous work does combine the representations from separate left-context and right-context models, but only in a "shallow" manner. BERT represents "bank" using both its left and right context (`I made a ... deposit`) starting from the very bottom of a deep neural network, so it is *deeply bidirectional*.

BERT uses a simple approach for this: We mask out 15% of the words in the input, run the entire sequence through a deep bidirectional [Transformer](https://arxiv.org/abs/1706.03762) encoder, and then predict only the masked words. For example:

```
Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
```

In order to learn relationships between sentences, we also train on a simple task which can be generated from any monolingual corpus: Given two sentences `A` and `B`, is `B` the actual next sentence that comes after `A`, or just a random sentence from the corpus?

```
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence
```

```
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
```

We then train a large model (12-layer to 24-layer Transformer) on a large corpus (Wikipedia + [BookCorpus](http://yknzhu.wixsite.com/mbweb)) for a long time (1M update steps), and that's BERT.

Using BERT has two stages: *Pre-training* and *fine-tuning*.

**Pre-training** is fairly expensive (four days on 4 to 16 Cloud TPUs), but is a one-time procedure for each language (current models are English-only, but multilingual models will be released in the near future). We are releasing a number of pre-trained models from the paper which were pre-trained at Google.
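
The two pre-training tasks described above (masking 15% of the input and deciding whether sentence `B` really follows sentence `A`) can be sketched in plain Python. This is only an illustration under stated assumptions, not the data pipeline used to train the released models: `mask_tokens` and `make_sentence_pair` are hypothetical helper names, every selected token is simply replaced by a single `[MASK]` symbol, the random sentence is drawn from the same short list rather than from a large corpus, and the 50/50 split between the two labels is an assumed default.

```python
import random
from typing import Dict, List, Optional, Tuple

MASK_TOKEN = "[MASK]"


def mask_tokens(
    tokens: List[str],
    mask_prob: float = 0.15,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Dict[int, str]]:
    """Replace roughly `mask_prob` of the tokens with [MASK].

    Returns the masked sequence and a {position: original_token} map
    that plays the role of the prediction labels.
    """
    if not tokens:
        return [], {}
    rng = rng or random.Random()
    # Mask at least one token, but never more than the sequence holds.
    num_to_mask = min(len(tokens), max(1, round(len(tokens) * mask_prob)))
    positions = rng.sample(range(len(tokens)), num_to_mask)
    masked = list(tokens)
    labels: Dict[int, str] = {}
    for pos in sorted(positions):
        labels[pos] = tokens[pos]
        masked[pos] = MASK_TOKEN
    return masked, labels


def make_sentence_pair(
    sentences: List[str],
    index: int,
    rng: random.Random,
    random_prob: float = 0.5,  # assumed split between the two labels
) -> Tuple[str, str, str]:
    """Build one (sentence A, sentence B, label) example."""
    if len(sentences) < 3 or not 0 <= index < len(sentences) - 1:
        raise ValueError("need at least 3 sentences and a non-final index")
    sentence_a = sentences[index]
    if rng.random() < random_prob:
        # Simplification: pick any sentence except A and its true successor.
        candidates = [
            s for i, s in enumerate(sentences) if i not in (index, index + 1)
        ]
        return sentence_a, rng.choice(candidates), "NotNextSentence"
    return sentence_a, sentences[index + 1], "IsNextSentence"


if __name__ == "__main__":
    tokens = "the man went to the store . he bought a gallon of milk .".split()
    masked, labels = mask_tokens(tokens, rng=random.Random(0))
    print(" ".join(masked))
    print(labels)
```

With the 14-token example sentence, 15% rounds to two masked positions, which matches the two masks shown in the example above; which positions get masked depends on the random seed.
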
Most NLP researchers will never need to pre-train their own model from scratch.

**Fine-tuning** is inexpensive. All of the results in the paper can be replicated in at most 1 hour on a single Cloud TPU, or a few hours on a GPU, starting from the exact same pre-trained model. SQuAD, for example, can be trained in around 30 minutes on a single Cloud TPU to achieve a Dev F1 score of 91.0%, which is the single system state-of-the-art.

The other important aspect of BERT is that it can be adapted to many types of NLP tasks very easily. In the paper, we demonstrate state-of-the-art results on sentence-level (e.g., SST-2), sentence-pair-level (e.g., MultiNLI), word-level (e.g., NER), and span-level (e.g., SQuAD) tasks with almost
