CreateCustomModelDeployment
Deploys a custom model for on-demand inference in Amazon Bedrock. After
you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the modelId
parameter when you submit prompts and generate responses with model inference.
For more information about setting up on-demand inference for custom models, see Set up inference for a custom model.
Request Syntax
POST /model-customization/custom-model-deployments HTTP/1.1
Content-type: application/json
{
   "clientRequestToken": "string",
   "description": "string",
   "modelArn": "string",
   "modelDeploymentName": "string",
   "tags": [
      {
         "key": "string",
         "value": "string"
      }
   ]
}
URI Request Parameters
The request does not use any URI parameters.
Request Body
The request accepts the following data in JSON format.
- clientRequestToken
-
A unique, case-sensitive identifier to ensure that the operation completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 256.
Pattern:
[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: No
- description
-
A description for the custom model deployment to help you identify its purpose.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 2048.
Pattern:
.*
Required: No
- modelArn
-
The Amazon Resource Name (ARN) of the custom model to deploy for on-demand inference. The custom model must be in the Active state.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 1011.
Pattern:
arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:custom-model/(imported|[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([a-z0-9-]{1,63}[.]){0,2}[a-z0-9-]{1,63}([:][a-z0-9-]{1,63}){0,2})/[a-z0-9]{12}
Required: Yes
- modelDeploymentName
-
The name for the custom model deployment. The name must be unique within your AWS account and Region.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
([0-9a-zA-Z][_-]?){1,63}
Required: Yes
- tags
-
Tags to assign to the custom model deployment. You can use tags to organize and track your AWS resources for cost allocation and management purposes.
Type: Array of Tag objects
Array Members: Minimum number of 0 items. Maximum number of 200 items.
Required: No
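The constraints above can be checked on the client before the request is sent. The following is a minimal sketch, not part of the API or of any AWS SDK: the helper names build_request_body and new_client_request_token are hypothetical, the regular expressions are copied from the Pattern entries above, and the description is checked for length only. It assumes a UUID is an acceptable idempotency token, which holds for the documented clientRequestToken pattern and length.

```python
import json
import re
import uuid

# Patterns copied from the parameter descriptions in this section.
TOKEN_RE = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
NAME_RE = re.compile(r"([0-9a-zA-Z][_-]?){1,63}")
MODEL_ARN_RE = re.compile(
    r"arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:custom-model/"
    r"(imported|[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([a-z0-9-]{1,63}[.]){0,2}"
    r"[a-z0-9-]{1,63}([:][a-z0-9-]{1,63}){0,2})/[a-z0-9]{12}"
)


def _check(field, value, pattern, min_len, max_len):
    """Raise ValueError if value breaks the documented length or pattern."""
    if not (min_len <= len(value) <= max_len) or not pattern.fullmatch(value):
        raise ValueError(f"{field} violates the documented length or pattern")


def new_client_request_token():
    """Hypothetical helper: a random UUID string used as the idempotency token."""
    return str(uuid.uuid4())


def build_request_body(model_deployment_name, model_arn, description=None,
                       tags=None, client_request_token=None):
    """Return the JSON request body, including only the fields that were supplied."""
    _check("modelDeploymentName", model_deployment_name, NAME_RE, 1, 63)
    _check("modelArn", model_arn, MODEL_ARN_RE, 20, 1011)
    body = {"modelDeploymentName": model_deployment_name, "modelArn": model_arn}
    if description is not None:
        # Only the length constraint is enforced for the description.
        if not 1 <= len(description) <= 2048:
            raise ValueError("description must be 1 to 2048 characters long")
        body["description"] = description
    if tags is not None:
        if len(tags) > 200:
            raise ValueError("tags accepts at most 200 items")
        body["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    if client_request_token is not None:
        _check("clientRequestToken", client_request_token, TOKEN_RE, 1, 256)
        body["clientRequestToken"] = client_request_token
    return json.dumps(body)
```

Reusing the same token when a request is retried is what lets the service recognize the retry as a duplicate, so generate the token once per logical deployment rather than once per HTTP attempt.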
Response Syntax
HTTP/1.1 202
Content-type: application/json
{
   "customModelDeploymentArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 202 response.
The following data is returned in JSON format by the service.
- customModelDeploymentArn
-
The Amazon Resource Name (ARN) of the custom model deployment. Use this ARN as the modelId parameter when invoking the model with the InvokeModel or Converse operations.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 1011.
Pattern:
arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:custom-model-deployment/[a-z0-9]{12}
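As an illustration of handling the response, the sketch below checks for the HTTP 202 status and validates the returned ARN against the pattern above before it is reused as a modelId. The function name extract_deployment_arn is hypothetical, and the sketch assumes the caller already has the status code and the raw response body as text.

```python
import json
import re

# Pattern copied from the customModelDeploymentArn description above.
DEPLOYMENT_ARN_RE = re.compile(
    r"arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:"
    r"custom-model-deployment/[a-z0-9]{12}"
)


def extract_deployment_arn(status_code, body_text):
    """Return the deployment ARN from a successful response, or raise."""
    if status_code != 202:
        raise RuntimeError(f"expected HTTP 202, got {status_code}")
    arn = json.loads(body_text)["customModelDeploymentArn"]
    if len(arn) > 1011 or not DEPLOYMENT_ARN_RE.fullmatch(arn):
        raise ValueError("customModelDeploymentArn does not match the documented pattern")
    return arn
```

The returned string is the value to pass as modelId in later InvokeModel or Converse calls.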
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
-
The request is denied because of missing access permissions.
HTTP Status Code: 403
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- ResourceNotFoundException
-
The specified resource Amazon Resource Name (ARN) was not found. Check the Amazon Resource Name (ARN) and try your request again.
HTTP Status Code: 404
- ServiceQuotaExceededException
-
The number of requests exceeds the service quota. Resubmit your request later.
HTTP Status Code: 400
- ThrottlingException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 429
- TooManyTagsException
-
The request contains more tags than can be associated with a resource (50 tags per resource). The maximum number of tags includes both existing tags and those included in your current request.
HTTP Status Code: 400
- ValidationException
-
Input validation failed. Check your request parameters and retry the request.
HTTP Status Code: 400
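The error list above can be turned into a small lookup table for client-side handling. This is a sketch under stated assumptions rather than documented behavior: treating only InternalServerException and ThrottlingException as safe to retry unchanged is a policy chosen for illustration, and the names should_retry and backoff_delay, along with the delay values, are hypothetical.

```python
# HTTP status codes as listed in the Errors section above.
ERROR_STATUS = {
    "AccessDeniedException": 403,
    "InternalServerException": 500,
    "ResourceNotFoundException": 404,
    "ServiceQuotaExceededException": 400,
    "ThrottlingException": 429,
    "TooManyTagsException": 400,
    "ValidationException": 400,
}

# Assumption: only these errors are retried with an unchanged request.
RETRY_UNCHANGED = {"InternalServerException", "ThrottlingException"}


def should_retry(error_name):
    """Return True if the same request may reasonably be sent again."""
    return error_name in RETRY_UNCHANGED


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff in seconds for a zero-based attempt number, capped at cap."""
    return min(cap, base * (2 ** attempt))
```

Errors such as ValidationException or TooManyTagsException need a corrected request, so repeating the original call would fail the same way.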
Examples
Example request
This example illustrates one usage of CreateCustomModelDeployment.
POST /model-customization/custom-model-deployments HTTP/1.1
Content-type: application/json
{
   "clientRequestToken": "unique-deployment-token-456",
   "description": "Production deployment of my custom model for customer support chatbot",
   "modelArn": "arn:aws:bedrock:us-west-2:123456789012:custom-model-deployment/abc123def456",
   "modelDeploymentName": "customer-support-model-deployment",
   "tags": [
      {
         "key": "Environment",
         "value": "Production"
      },
      {
         "key": "Application",
         "value": "CustomerSupport"
      },
      {
         "key": "CostCenter",
         "value": "Engineering"
      }
   ]
}