answer_equivalence

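Description:

The Answer Equivalence dataset contains human ratings of predictions made by several models on the SQuAD dataset. The ratings establish whether a predicted answer is 'equivalent' to the gold answer, taking both the question and the context into account.

More specifically, 'equivalent' means that the predicted answer contains at least the same information as the gold answer and adds no superfluous information. The dataset contains annotations for:

- predictions from BiDAF on SQuAD dev
- predictions from XLNet on SQuAD dev
- predictions from Luke on SQuAD dev
- predictions from Albert on SQuAD training, dev and test examples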

Homepage: https://github.com/google-research-datasets/answer-equivalence-dataset
Source code: tfds.datasets.answer_equivalence.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 45.86 MiB
Dataset size: 47.24 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'ae_dev'`	4,446
`'ae_test'`	9,724
`'dev_bidaf'`	7,522
`'dev_luke'`	4,590
`'dev_xlnet'`	7,932
`'train'`	9,090
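
The split names in the table can be passed to `tfds.load`. The following is a minimal sketch, assuming `tensorflow_datasets` is installed and the data can be downloaded. `SPLIT_SIZES`, `check_split`, and `load_split` are hypothetical helper names used only for illustration; the counts are copied from the table above.

```python
# Example counts per split, copied from the table above.
SPLIT_SIZES = {
    "ae_dev": 4446,
    "ae_test": 9724,
    "dev_bidaf": 7522,
    "dev_luke": 4590,
    "dev_xlnet": 7932,
    "train": 9090,
}


def check_split(name: str) -> str:
    """Return `name` unchanged if it is a known split, else raise ValueError."""
    if name not in SPLIT_SIZES:
        known = ", ".join(sorted(SPLIT_SIZES))
        raise ValueError(f"unknown split {name!r}; expected one of: {known}")
    return name


def load_split(name: str):
    """Load one split as a tf.data.Dataset (downloads data on first use)."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load("answer_equivalence", split=check_split(name))


if __name__ == "__main__":
    # Print each split with its share of all annotated examples.
    total = sum(SPLIT_SIZES.values())
    for split, count in SPLIT_SIZES.items():
        print(f"{split:10s} {count:6d} {count / total:6.1%}")
    print(f"{'total':10s} {total:6d}")
```

Note that `check_split` only accepts the plain split names listed here; it deliberately rejects TFDS slicing expressions, which would need to be passed to `tfds.load` directly.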

Feature structure:

FeaturesDict({
    'candidate': Text(shape=(), dtype=string),
    'context': Text(shape=(), dtype=string),
    'gold_index': int32,
    'qid': Text(shape=(), dtype=string),
    'question': Text(shape=(), dtype=string),
    'question_1': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_2': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_3': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_4': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'reference': Text(shape=(), dtype=string),
    'score': float32,
})
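
To make the structure above concrete, here is a minimal sketch that converts one record into a typed Python object. It assumes the record is a plain dictionary such as one yielded by `tfds.as_numpy`, where text features arrive as `bytes`. `AEExample`, `_text`, and `to_example` are hypothetical names, and the sample record in the demo is made up for illustration, not taken from the dataset.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple

NUM_RATING_CLASSES = 3  # num_classes of question_1 .. question_4 above


@dataclass(frozen=True)
class AEExample:
    """One record, with field names following the FeaturesDict above."""
    qid: str
    question: str
    context: str
    reference: str
    candidate: str
    gold_index: int
    ratings: Tuple[int, ...]  # class ids of question_1 .. question_4
    score: float


def _text(value: Any) -> str:
    """Decode a text feature that may arrive as bytes or as str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def to_example(record: Mapping[str, Any]) -> AEExample:
    """Convert a feature dictionary into an AEExample, validating class ids."""
    ratings = tuple(int(record[f"question_{i}"]) for i in range(1, 5))
    if any(not 0 <= r < NUM_RATING_CLASSES for r in ratings):
        raise ValueError(f"rating class id out of range: {ratings}")
    return AEExample(
        qid=_text(record["qid"]),
        question=_text(record["question"]),
        context=_text(record["context"]),
        reference=_text(record["reference"]),
        candidate=_text(record["candidate"]),
        gold_index=int(record["gold_index"]),
        ratings=ratings,
        score=float(record["score"]),
    )


if __name__ == "__main__":
    # Made-up record for illustration only; not real dataset content.
    sample = {
        "qid": b"example-id", "question": b"Who wrote it?",
        "context": b"It was written by A. Author.", "reference": b"A. Author",
        "candidate": b"Author", "gold_index": 0,
        "question_1": 0, "question_2": 1, "question_3": 2, "question_4": 0,
        "score": 0.5,
    }
    print(to_example(sample))
```

The class names behind the three `ClassLabel` ids are not listed on this page, so the sketch keeps the ratings as integer ids rather than guessing their meaning.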

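Feature documentation:

Feature	Class	Dtype
	FeaturesDict
candidate	Text	string
context	Text	string
gold_index	Tensor	int32
qid	Text	string
question	Text	string
question_1	ClassLabel	int64
question_2	ClassLabel	int64
question_3	ClassLabel	int64
question_4	ClassLabel	int64
reference	Text	string
score	Tensor	float32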

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
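
The rendered example table is not reproduced here. A minimal sketch for producing one locally, assuming `tensorflow_datasets` and `pandas` are installed and the data can be downloaded; `preview_examples` is a hypothetical helper name, and its default split and row count are arbitrary choices:

```python
def preview_examples(split: str = "ae_dev", n: int = 4):
    """Return the first `n` examples of `split` as a pandas DataFrame."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Imported lazily: this pulls in TensorFlow and may download the dataset.
    import tensorflow_datasets as tfds

    ds, info = tfds.load("answer_equivalence", split=split, with_info=True)
    # Passing `info` lets TFDS format each column according to its feature type.
    return tfds.as_dataframe(ds.take(n), info)


if __name__ == "__main__":
    print(preview_examples())
```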

Citation:

@article{bulian-etal-2022-tomayto,
      title={Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation},
      author={Jannis Bulian and Christian Buck and Wojciech Gajewski and Benjamin Boerschinger and Tal Schuster},
      year={2022},
      eprint={2202.07654},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
