CloudFormation Templates

TorchServe provides configurable CloudFormation templates to spin up AWS instances running TorchServe.

The following instructions require the AWS CLI (aws-cli) to be installed.

Single EC2 instance

To spin up a single EC2 instance running TorchServe, use the ec2.yaml template.
Run the following command with the name of an EC2 key pair and, optionally, an instance type (default: c5.4xlarge):

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
cd cloudformation/
aws cloudformation create-stack \
  --stack-name torchserve \
  --region us-west-2 \
  --template-body file://ec2.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=KeyName,ParameterValue=<ec2-keypair-name> \
               ParameterKey=InstanceType,ParameterValue=<instance-type>

Once the CloudFormation stack has been created, you can find the TorchServeManagementURL and TorchServeInferenceURL of the instance on the stack's Outputs tab in the AWS console. Test the deployment with the following commands:

> curl --insecure -X POST "<TorchServeManagementURL>/models?initial_workers=1&synchronous=false&url=https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar"
{
  "status": "Processing worker updates..."
}

> curl --insecure "<TorchServeInferenceURL>/ping"
{
  "status": "Healthy"
}

> curl --insecure "<TorchServeManagementURL>/models"
{
    "models": [
        {
            "modelName": "squeezenet1_1",
            "modelUrl": "https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar"
        }
     ]
}

> curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg

> curl --insecure "<TorchServeInferenceURL>/predictions/squeezenet1_1" -T kitten.jpg
[
    {
        "tabby": 0.2752002477645874
    },
    {
        "lynx": 0.2546876072883606
    },
    {
        "tiger_cat": 0.24254210293293
    },
    {
        "Egyptian_cat": 0.2213735282421112
    },
    {
        "cougar": 0.0022544863168150187
    }
]
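
The two URLs used in the commands above can also be read from the command line instead of the console. The following is a minimal sketch, assuming the stack name and region from the create-stack command above; the helper names wait_for_stack and stack_output are hypothetical and not part of TorchServe or the AWS CLI.

```shell
# Hypothetical helper: block until stack creation has finished.
# Usage: wait_for_stack <stack-name> <region>
wait_for_stack() {
  aws cloudformation wait stack-create-complete \
    --stack-name "$1" \
    --region "$2"
}

# Hypothetical helper: print the value of one stack output.
# Usage: stack_output <stack-name> <region> <output-key>
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --region "$2" \
    --query "Stacks[0].Outputs[?OutputKey=='$3'].OutputValue" \
    --output text
}

# Example (needs AWS credentials and the stack created above):
#   wait_for_stack torchserve us-west-2
#   MGMT_URL="$(stack_output torchserve us-west-2 TorchServeManagementURL)"
#   INFER_URL="$(stack_output torchserve us-west-2 TorchServeInferenceURL)"
#   curl --insecure "$INFER_URL/ping"
```

Keeping the calls inside functions means nothing runs until you invoke them, so the snippet can be sourced into an existing shell session.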

Multi-node EC2 deployment behind Elastic Load Balancer (ELB)

To spin up an EC2 Auto Scaling group (ASG) cluster running TorchServe behind an ELB, use the ec2-asg.yaml template.
NOTE: Multi-node deployments require the model path to be provided up front as a template parameter; registering and unregistering models is not currently supported.
Run the following command with the name of an EC2 key pair and, optionally, an instance type (default: c5.4xlarge):

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
cd cloudformation/
aws cloudformation create-stack \
  --stack-name torchserve \
  --region us-west-2 \
  --template-body file://ec2-asg.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=KeyName,ParameterValue=<ec2-keypair-name> \
               ParameterKey=InstanceType,ParameterValue=<instance-type> \
               ParameterKey=MinNodeNumber,ParameterValue=<min-nodes> \
               ParameterKey=MaxNodeNumber,ParameterValue=<max-nodes> \
               ParameterKey=ModelPath,ParameterValue=<model-mar-url>

For example:

aws cloudformation create-stack \
  --stack-name torchserve \
  --region us-east-1 \
  --template-body file://ec2-asg.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=KeyName,ParameterValue=useastcfntemplate \
               ParameterKey=ModelPath,ParameterValue="https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar"

Once the CloudFormation stack has been created, you can find the TorchServeManagementURL and TorchServeInferenceURL values on the stack's Outputs tab in the AWS console. Test the deployment with the following commands:

> curl "<TorchServeInferenceURL>/ping"
{
  "status": "Healthy"
}

> curl "<TorchServeManagementURL>/models"
{
  "models": [
    {
      "modelName": "squeezenet1_1",
      "modelUrl": "squeezenet1_1.mar"
    }
  ]
}

> curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg

> curl "<TorchServeInferenceURL>/predictions/squeezenet1_1" -T kitten.jpg
[
    {
        "tabby": 0.2752002477645874
    },
    {
        "lynx": 0.2546876072883606
    },
    {
        "tiger_cat": 0.24254210293293
    },
    {
        "Egyptian_cat": 0.2213735282421112
    },
    {
        "cougar": 0.0022544863168150187
    }
]
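
The endpoint may not answer immediately after the stack is created, so it can help to poll /ping before sending predictions. This is a minimal sketch, not part of the templates: wait_healthy is a hypothetical helper, and the attempt count and delay defaults are arbitrary assumptions.

```shell
# Hypothetical helper: poll <TorchServeInferenceURL>/ping until it reports "Healthy".
# Usage: wait_healthy <inference-url> [max-attempts] [delay-seconds]
wait_healthy() {
  local url="$1"
  local attempts="${2:-30}"   # assumed default, adjust as needed
  local delay="${3:-10}"      # assumed default, in seconds
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # The ping response shown above contains "status": "Healthy".
    if curl --silent --max-time 5 "$url/ping" | grep -q '"Healthy"'; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Example:
#   wait_healthy "<TorchServeInferenceURL>" && \
#     curl "<TorchServeInferenceURL>/predictions/squeezenet1_1" -T kitten.jpg
```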

CloudWatch Logging

Once the instance is up and running, TorchServe logs are published to CloudWatch under the log group <stack-name>/<ec2-instance-id>/TorchServe, e.g. torchserve/i-0649487ecbe691676/TorchServe.
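
The log group can also be followed from a terminal. The sketch below assumes AWS CLI v2, which provides the aws logs tail command; the helper names log_group_name and tail_torchserve_logs are hypothetical.

```shell
# Hypothetical helper: build the log group name described above.
# Usage: log_group_name <stack-name> <ec2-instance-id>
log_group_name() {
  printf '%s/%s/TorchServe\n' "$1" "$2"
}

# Hypothetical helper: stream new log events (assumes AWS CLI v2).
# Usage: tail_torchserve_logs <stack-name> <ec2-instance-id> <region>
tail_torchserve_logs() {
  aws logs tail "$(log_group_name "$1" "$2")" --region "$3" --follow
}

# Example:
#   tail_torchserve_logs torchserve i-0649487ecbe691676 us-west-2
```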

Restarting and terminating

To stop or restart TorchServe, SSH into the host:

ssh -i <path-to-ec2-keypair-private-key> ubuntu@<ec2-dns>

cd /
sudo bash
export PATH="/home/ubuntu/miniconda/bin:$PATH"
conda init bash
# IMPORTANT: You may need to close and restart your shell after running 'conda init'.
conda activate torchserve
torchserve --stop
torchserve --start --model-store ./model_store --ts-config /etc/torchserve/config.properties

To terminate the instance and delete the stack, run: aws cloudformation delete-stack --stack-name <stack-name>
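
The delete-stack call returns before the resources are actually removed. If a script needs to block until deletion has finished, one option is to follow it with the CLI's stack-delete-complete waiter. A minimal sketch, where delete_stack_and_wait is a hypothetical helper and the region argument is an assumption to adapt:

```shell
# Hypothetical helper: delete a stack and block until deletion finishes.
# Usage: delete_stack_and_wait <stack-name> <region>
delete_stack_and_wait() {
  aws cloudformation delete-stack --stack-name "$1" --region "$2" &&
    aws cloudformation wait stack-delete-complete --stack-name "$1" --region "$2"
}

# Example:
#   delete_stack_and_wait torchserve us-west-2
```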
