언어 코드: ISO 639-1 (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) 언어 / 언어 코드 (예: 영어의 경우 'en', 우즈베키스탄의 경우 'uz', 일본어 (로마자)의 경우 'ja-Latn')를 문자열로 나타냅니다.
확률: 이 예측의 신뢰도 점수로, 0과 1 사이의 확률로 부동 소수점 값으로 표현됩니다.
구성 옵션
이 태스크에는 다음과 같은 구성 옵션이 있습니다.
옵션 이름
설명
값 범위
기본값
max_results
반환할 점수가 가장 높은 언어 예측의 최대 개수를 설정합니다(선택사항). 이 값이 0보다 작으면 사용 가능한 모든 결과가 반환됩니다.
모든 양수
-1
score_threshold
모델 메타데이터에 제공된 값 (있는 경우)을 재정의하는 예측 점수 기준점을 설정합니다. 이 값 미만의 결과는 거부됩니다.
모든 부동 소수점 수
설정되지 않음
category_allowlist
허용되는 언어 코드의 선택적 목록을 설정합니다. 비어 있지 않으면 이 세트에 언어 코드가 없는 언어 예측이 필터링됩니다. 이 옵션은 category_denylist와 상호 배타적이며 둘 다 사용하면 오류가 발생합니다.
모든 문자열
설정되지 않음
category_denylist
허용되지 않는 언어 코드 목록(선택사항)을 설정합니다. 비어 있지 않으면 이 세트에 언어 코드가 있는 언어 예측이 필터링됩니다. 이 옵션은 category_allowlist와 상호 배타적이며 둘 다 사용하면 오류가 발생합니다.
모든 문자열
설정되지 않음
모델
이 태스크로 개발을 시작할 때 기본 권장 모델이 제공됩니다.
언어 감지기 모델 (권장)
이 모델은 가볍게 (315KB) 빌드되며 임베딩 기반 신경망 분류 아키텍처를 사용합니다. 이 모델은 ISO 639-1 언어 코드를 사용하여 언어를 식별하며 110개 언어를 식별할 수 있습니다. 모델에서 지원하는 언어 목록은 ISO 639-1 코드별로 언어가 나열된 라벨 파일을 참고하세요.
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["필요한 정보가 없음","missingTheInformationINeed","thumb-down"],["너무 복잡함/단계 수가 너무 많음","tooComplicatedTooManySteps","thumb-down"],["오래됨","outOfDate","thumb-down"],["번역 문제","translationIssue","thumb-down"],["샘플/코드 문제","samplesCodeIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-01-13(UTC)"],[],[],null,["# Language detection guide\n\nThe MediaPipe Language Detector task lets you identify the language of a piece of text. This\ntask operates on text data with a machine learning (ML) model and outputs a list\nof predictions, where each prediction consists of an\n[ISO 639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) language code\nand a probability.\n\n[Try it!arrow_forward](https://mediapipe-studio.webapps.google.com/demo/language_detector)\n\nGet Started\n-----------\n\nStart using this task by following one of these implementation guides for your\ntarget platform. These platform-specific guides walk you through a basic\nimplementation of this task, including a recommended model, and code example\nwith recommended configuration options:\n\n- **Android** - [Code example](https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/language_detector/android) - [Guide](./android)\n- **Python** - [Code example](https://colab.research.google.com/github/googlesamples/mediapipe/blob/main/examples/language_detector/python/%5BMediaPipe_Python_Tasks%5D_Language_Detector.ipynb) - [Guide](./python)\n- **Web** - [Code example](https://codepen.io/mediapipe-preview/pen/RweLdpK) - [Guide](./web_js)\n\nTask details\n------------\n\nThis section describes the capabilities, inputs, outputs, and configuration\noptions of this task.\n\n### Features\n\n- **Score threshold** - Filter results based on prediction scores\n- **Label allowlist and denylist** - Specify the categories detected\n\n| Task inputs | Task outputs |\n|-------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Language Detector accepts the following input data type: - String | Language Detector outputs a list of predictions containing: - Language code: An ISO 639-1 (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) language / locale code (e.g. \"en\" for English, \"uz\" for Uzbek, \"ja-Latn\" for Japanese (romaji)) as a string. \u003c!-- --\u003e - Probability: the confidence score for this prediction, expressed as a probability between zero and one as floating point value. |\n\n### Configurations options\n\nThis task has the following configuration options:\n\n| Option Name | Description | Value Range | Default Value |\n|----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|---------------|\n| `max_results` | Sets the optional maximum number of top-scored language predictions to return. If this value is less than zero, all available results are returned. | Any positive numbers | `-1` |\n| `score_threshold` | Sets the prediction score threshold that overrides the one provided in the model metadata (if any). Results below this value are rejected. | Any float | Not set |\n| `category_allowlist` | Sets the optional list of allowed language codes. If non-empty, language predictions whose language code is not in this set will be filtered out. This option is mutually exclusive with `category_denylist` and using both results in an error. | Any strings | Not set |\n| `category_denylist` | Sets the optional list of language codes that are not allowed. If non-empty, language predictions whose language code is in this set will be filtered out. This option is mutually exclusive with `category_allowlist` and using both results in an error. | Any strings | Not set |\n\nModels\n------\n\nWe offer a default, recommended model when you start developing with this task.\n| **Attention:** This MediaPipe Solutions Preview is an early release. [Learn more](/edge/mediapipe/solutions/about#notice).\n\n### Language detector model (recommended)\n\nThis model is built to be lightweight (315 KB) and uses embedding-based, neural\nnetwork classification architecture. The model identifies language using an\n[ISO 639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) language\ncode, and can identify 110 languages. For a list of languages supported by the\nmodel, see the\n[label file](https://storage.googleapis.com/mediapipe-tasks/language_detector/labels.txt),\nwhich lists languages by their ISO 639-1 code.\n\n| Model name | Input shape | Quantization type | Model card | Versions |\n|---------------------------------------------------------------------------------------------------------------------------------------------|--------------|-------------------|---------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------|\n| [Language Detector](https://storage.googleapis.com/mediapipe-models/language_detector/language_detector/float32/1/language_detector.tflite) | string UTF-8 | none (float32) | [info](https://storage.googleapis.com/mediapipe-assets/LanguageDetector%20Model%20Card.pdf) | [Latest](https://storage.googleapis.com/mediapipe-models/language_detector/language_detector/float32/1/language_detector.tflite) |\n\nTask benchmarks\n---------------\n\nHere's the task benchmarks for the whole pipeline based on the above\npre-trained models. The latency result is the average latency on Pixel 6 using\nCPU / GPU.\n\n| Model Name | CPU Latency | GPU Latency |\n|-------------------|-------------|-------------|\n| Language Detector | 0.31ms | - |"]]