SlideShare a Scribd company logo
Serverless
Generative AI
AWS User Groups of Florida
Fort Lauderdale, FL, USA
February 27th, 2024
Patrick Hannah
CTO
CloudHesive
AWS User Groups of Florida – Updates
We are back to In-Person Meetups and working towards a monthly cadence
Always open to ideas on how we can improve the content and format!
Collaborate with us after the MeetUp!
Future MeetUps – Presenters? Topics? Formats?
Slideshare – Keep an eye on our MeetUp Page – we will post a link to the Slides
Slack – Keep the conversation going
Today’s MeetUp Format
Feel free to ask questions throughout the session!
Dedicated Q&A at the end
Topic
In this session, I will unravel the complexities of serverless
generative AI, offering insights into its architecture, applications,
and potential impact on businesses across various industries.
Whether you're a seasoned AWS practitioner or just starting your
journey into cloud computing, this presentation promises to
broaden your horizons and spark new ideas.
Inspiration
“I'm wondering if there is a feature request to create something like a saved
query in Athena that can be executed via a CloudWatch Event?”
The AWS Step Functions service integration with Amazon Athena enables you to use
Step Functions to start and stop query execution, and get query results
AWS User Groups of Florida MeetUp - AWS API Architectures - Scott
Hendrickson, Partner Solutions Architect, AWS
Data sources and resolvers are how AWS AppSync translates GraphQL requests and
fetches information from your AWS resources
AWS Well Architected Framework Serverless Application Lens
If your Lambda function is not performing custom logic while integrating with other
AWS services, chances are that it may be unnecessary
Who doesn’t like connecting things together?
Compute’s Transition to Serverless
Compute - EC2 Bare Metal (Intel, AMD, Graviton, M1)
Compute - EC2 Virtual > Bare Metal (Xen, KVM/Nitro)
Containers - Fargate > ContainderD (was DockerD) > EC2
Serverless - Lambda > Firecracker (Micro VM) > EC2
Serverless’ Flavors
High Level Abstractions
SaaS (Connect)
Hybrid Abstractions
PaaS (DynamoDB)
Low Level Abstractions
IaaS (Lambda)
Service Categories
Analytics
Application Integration
AR & VR
AWS Cost Management
Blockchain
Business Applications
Compute
Customer Engagement
Database
Developer Tools
End User Computing
Game Tech
Internet of Things
Machine Learning
Management & Governance
Media Services
Migration & Transfer
Mobile
Networking & Content Delivery
Quantum Technologies
Robotics
Satellite
Security, Identity, & Compliance
Storage
Workload Personas
Migrated
Server Based
Migrated & Optimized
Blends of Server and Service Based
Serverless/Native
Service Based
Orchestrated
ECS, EKS, K8s
Inherited
Wildcard!
Hybrid
Wildcard!
Well Architected Framework
Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization
Sustainability
Cloud Workload Lifecycle Management
Workload
Architecture
Monitoring
Automation
Processes
Integration
Workload + Architecture Drives Service Selection
Containers
Container File
Versioning
Multi-threaded/Single-task
Minutes to Days
Per VM/Per Hour
Virtual Machines
AMI
Patching
Multi-threaded/Multi-task
Hours to Months
Per VM/Per Hour
Functions/Services
Code
Versioning
Single-threaded/Single-task
Microseconds to Seconds
Per Memory/Second/Per Request
Automation + Processes Drives Lifecycle Management Selection
Organizations
Cross-Account Asset Management + Governance
Control Tower
Account vending/default standardization
Service Catalog
Workload platform vending/default standardization
CloudFormation
IaC
Ephemeral Compute + API Managed Data/Control Plane for
Persistence Tiers
Hands off/Lights out
Processes
Patching
Backup/Restore Testing
Failover Testing (AZ)
Credential Rotation/Credential Audit
Event Response Testing
Incident Response Testing
Performance Testing
Performance/Cost Review
Vulnerability/Penetration Testing
Integration
AI/ML Options
Generalized
Specialized
“Balanced”
Generative AI in the context of AWS
Amazon Bedrock
Amazon SageMaker, Studio and Canvas (and Redshift Inferences)
NVIDIA GPU-powered Amazon EC2 instances
AWS Tranium
AWS Inferentia
Amazon EC2 UltraClusters
Amazon Q: Business, AWS, QuickSight, Connect, Supply Chain, Code
Catalyst, IDE, Code Transformation, Query Editor (Redshift)
PartyRock
AWS CodeWhisperer
AWS HealthScribe
Generative AI in the context of AWS
Services that accelerate development for AWS
Services that are powered by it – No-code data connectors/Zero
ETL, Instance Selection, Console to Code (and AppComposer),
Natural Language Querying, Code Scanning, Datazone
(Descriptions)
Services that accelerate development for you – Lex
(Conversational FAQ, Slot Resolution, Bot builder, Utterance
Generator), Personalize (Themes), Transcribe (Summarization)
Services improved by it – Alexa
Rationalization
Why Serverless – how does serverless change how we incept,
launch, and iterate product?
Why GenAI – how does Generative AI change how we think
about solving problems with data?
Foundational Model
Bedrock Operationalization
Non-functional
Regional Considerations
FM Subscription
Throughput/Quotas
Security
Operational Monitoring
Traffic Flow (Private Link)
Functional
Prompt Engineering
Tokens
Model Parameters
Inference Parameters
Sessions
Serverless Generative AI on AWS, AWS User Groups of Florida
Serverless Generative AI on AWS, AWS User Groups of Florida
Databases that can be used to store Vector Embeddings
OpenSearch/Serverless
Redis Enterprise and MemoryDB
Pinecone
Aurora (Postgres)
RDS (Postgres)
MongoDB
DocumentDB
Neptune
Machine Learning
Amazon Augmented AI - Easily implement human review of machine learning predictions
Amazon CodeGuru - Intelligent recommendations for building and running modern applications
Amazon Comprehend - Analyze Unstructured Text
Amazon Comprehend Medical - Amazon Comprehend Medical uses machine learning to extract
insights and relationships from medical text.
AWS DeepComposer - AWS DeepComposer allows developers of all skill levels to get started with
Generative AI.
AWS DeepLens - Deep Learning Enabled Video Camera
AWS DeepRacer - Fully autonomous 1/18th scale race car, driven by machine learning
Amazon DevOps Guru - ML-powered cloud operations service to improve application availability.
Amazon Forecast - Amazon Forecast is a fully-managed service for accurate time-series
forecasting
Amazon Fraud Detector - Detect more online fraud faster using machine learning
Amazon HealthLake - Making sense of health data
Amazon Kendra - Highly accurate enterprise search service powered by machine learning
AWS HealthImaging
Amazon Lex - Build Voice and Text Chatbots
Amazon Lookout for Equipment - Detect abnormal equipment behavior by analyzing sensor data
Amazon Lookout for Metrics - Accurately detect anomalies in your business metrics and quickly
understand why
Amazon Lookout for Vision - Identify defects using computer vision to automate quality inspection.
Amazon Monitron - End-to-end system for equipment monitoring
Amazon Omics - Transform omics data into insights.
AWS Panorama - Enabling computer vision applications at the edge
Amazon Personalize - Amazon Personalize helps you easily add real-time recommendations to
your apps
Amazon Polly - Turn Text into Lifelike Speech
Amazon Rekognition - Search and Analyze Images
Amazon SageMaker - Build, Train, and Deploy Machine Learning Models
Amazon Textract - Easily extract text and data from virtually any document
Amazon Transcribe - Powerful Speech Recognition
Amazon Translate - Powerful Neural Machine Translation
Amazon Bedrock
Serverless Generative AI on AWS, AWS User Groups of Florida
Primary Services
API Tier
API Gateway – API Management
AppSync – GraphQL API
Application (Execution)/Code Tier
Lambda – Serverless Compute
Data Store Tier
DynamoDB – Key/Value Data Base
Service Tier
Event Bridge/Step Functions – Event Bus, Low Code/No Code Workflow
Athena – Interactive Query Service
S3 – Object Storage
Glue – Data Integration Service
Options for APIs
Client > API Gateway HTTP > Things
Client > API Gateway REST > Things
Client > AppSync GraphQL > Things
Client > Application Load Balancer > Lambda
Client > Lambda Function URLs
Client > CloudFront (Authorizer) > Lambda
Client > AWS IoT
Options to call AWS services w/o Lambda
APIs
API Gateway > AWS Services
AppSync > GraphQL > Resolvers > AWS Services
Event
Step Functions > AWS Services
EventBridge
API Gateway Integrations
AWS
Service
Lambda
AWS Proxy
Service
Lambda
HTTP
HTTP Proxy
Mock
AppSync Resolvers
DynamoDB
RDS
OpenSearch
Lambda
HTTP
Sync versus Async
Can the payload fit in the size/time constraints
What is the impact to the client?
Step Functions Optimized Integrations
Lambda
Batch
DynamoDB
ECS/Fargate
SNS
SQS
Glue, DataBrew
SageMaker
EMR
CodeBuild
Athena
EKS
API Gateway
EventBridge
Step Functions
HTTP Destinations (New) - https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/
Bedrock (New)- https://aws.amazon.com/about-aws/whats-new/2023/11/aws-step-functions-optimized-integration-bedrock/
Options for Event Buses/Messaging/Queuing
DynamoDB > Triggers
CloudWatch Logs > Metrics > Alarms / Lambda
CloudWatch Metrics > Destination
Kinesis > Lambda
Event Bridge (DLQ Support) > Lambda
SQS (DLQ Support) > Lambda
SNS (DLQ Support) > Lambda
(DLQ Support) Lambda
Twitter @radzikowski_m
Serverless Data Stores - The Easy Button
S3 Query – Query objects in S3, through S3
Athena (and S3 and Glue) – Query objects in S3, Presto
AppFlow – Data Integration Platform
Profiles
Wisdom
Tasks
Serverless Data Stores
DynamoDB – Key/Value
Timescale – Time Series
Keyspaces – Cassandra
QLDB – Ledger
Aurora – Relational
Prometheus – Prometheus
Grafana – Grafana
MWAA – Airflow
General Considerations
Multi-Region? Single-Region? Which Region(s)?
Which Services?
What will they cost? How are they metered/billed?
How far do we need to scale?
What compliance requirements do we need to meet?
What tools do we have in our reach? (Frameworks, Patterns,
etc.)
API Gateway
Development (Isolation, Stages, SAM)
Client Security (Certificates, API Keys, Authorizers)
Gateway Security (WAF, Throttling)
Endpoint Type (Edge optimized, Regional, Private, API Cache)
Integration (Methods, Proxy, Response Codes)
Operationalization (CloudWatch Logs, CloudWatch Metrics,
Access Logging, X-Ray
Testing (Direct, PostMan)
Lambda
Runtime
Pre-Warming
Sizing/Timeouts
Development (Isolation, Versions, SAM, Cloud9, Parameterization)
Integration (Methods, Response Codes)
Security (KMS, Execution Role)
Operationalization (CloudWatch Logs, CloudWatch Metrics, X-Ray)
Testing (Direct)
“The Rest”
Development (Coding Best Practices, Runtime, RDBMS, DevOps)
Data Stores that are not Serverless (Sizing, CloudWatch, Logs, Events,
Backup/Recovery, Multi-AZ, Database “Stuff”)
Trade-off
VPC (Public Subnets, Private Subnets, Security Groups)
Typical of Legacy Integrations, Non-Serverless Data Stores, etc.
General (What are all of the things we need to think about when we create a
new AWS account?)
“Landing Zone”
Conclusion
AWS continues to increase the breadth and depth of their service offerings
I wish it did that
I didn’t know I needed that
It’s easier to get started today than it was yesterday
Simplicity
Support
Cost
Lessons Learned
Regional Availability
Flexibility of implementation to change FMs (or even support custom FMs) and tune FM specific parameters
Conclusion
Generative AI and API Access to Generative AI services (like Bedrock) can be an easy button
Not an end all – value can be found in context, which takes us back to needing a strong data foundation
Priorities are still priorities – customers don’t care about Generative AI if your customers have needs unfulfilled by the product or by Generative AI
Customers may also need to be led to it – if the customer isn’t asking, pushing it on them won’t help – they need education
Consider sustainability when choosing an approach – Maslow’s Hammer
Don’t forget about team enablement
Limited by your imagination and ability to execute
References
https://docs.aws.amazon.com/wellarchitected/latest/serverless-applications-lens/wellarchitected-
serverless-applications-lens.pdf – Well Architected Serverless Application Lens
https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-aws-proxy.html – API
Gateway Service Proxy Example
https://docs.aws.amazon.com/apigateway/latest/developerguide/websocket-api-chat-app.html – API
Gateway Websocket Example
https://docs.aws.amazon.com/appsync/latest/devguide/tutorials.html – AppSync Tutorials
https://docs.aws.amazon.com/appsync/latest/devguide/tutorial-dynamodb-resolvers.html – AppSync
Tutorial DynamoDB Resolver
https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html – Lambda URLS
https://docs.aws.amazon.com/step-functions/latest/dg/connect-supported-services.html – Step Functions
Supported Services
https://docs.aws.amazon.com/step-functions/latest/dg/sample-athena-query.html – Step Functions
Athena Query
0800-860-2040
sales-latam@cloudhesive.com
cloudhesive.com
Fort Lauderdale
2419 E. Commercial Blvd, Ste. 300
Ft. Lauderdale, Florida
USA
Buenos Aires
Av. Del Libertador 6680, Piso 6
CABA, Ciudad de Buenos Aires
Argentina
Santiago de Chile
Cerro El Plomo 5420 SB1, Oficina 15
Nueva Las Condes, Santiago de Chile
Chile

More Related Content

Similar to Serverless Generative AI on AWS, AWS User Groups of Florida (20)

PPTX
5 incredible (and uncommon) serverless patterns
DavidVictoria12
 
PDF
AWS re-Invent re-Cap general deck 2022-2023 .pdf
Rohini Gaonkar
 
PDF
Serverless Architectural Patterns 
and Best Practices - Madhu Shekar - AWS
CodeOps Technologies LLP
 
PDF
2022 Presentation | Serverless Innovation with AWS
Dhaval Nagar
 
PPTX
Getting Started with Serverless Architectures
AWS Summits
 
PDF
Jumpstart your idea with AWS Serverless [Oct 2020]
Dhaval Nagar
 
PDF
Serverless use cases with AWS Lambda
Boaz Ziniman
 
PPTX
Serverless Architectural Patterns I AWS Dev Day 2018
AWS Germany
 
PDF
Getting Started with Serverless Architectures
Rohini Gaonkar
 
PPTX
From Monolithic to Modern Apps: Best Practices
Tom Laszewski
 
PPTX
Aws re invent 2018 recap
CloudHesive
 
PDF
Introduction to Serverless Computing and AWS Lambda - AWS IL Meetup
Boaz Ziniman
 
PDF
Serverless use cases with AWS Lambda - More Serverless Event
Boaz Ziniman
 
PDF
Modern Applications Web Day | Impress Your Friends with Your First Serverless...
AWS Germany
 
PDF
Mainstream Serverless
Dhaval Nagar
 
PDF
Serverless Meetup - 12 gennaio 2017
Luca Bianchi
 
PDF
20200520 - Como empezar a desarrollar aplicaciones serverless
Marcia Villalba
 
PDF
Let Your Business Logic go Serverless | AWS Summit Tel Aviv 2019
AWS Summits
 
PDF
Serverless on AWS: Architectural Patterns and Best Practices
Vladimir Simek
 
PPTX
Aws serverless architecture
genesesoftware
 
5 incredible (and uncommon) serverless patterns
DavidVictoria12
 
AWS re-Invent re-Cap general deck 2022-2023 .pdf
Rohini Gaonkar
 
Serverless Architectural Patterns 
and Best Practices - Madhu Shekar - AWS
CodeOps Technologies LLP
 
2022 Presentation | Serverless Innovation with AWS
Dhaval Nagar
 
Getting Started with Serverless Architectures
AWS Summits
 
Jumpstart your idea with AWS Serverless [Oct 2020]
Dhaval Nagar
 
Serverless use cases with AWS Lambda
Boaz Ziniman
 
Serverless Architectural Patterns I AWS Dev Day 2018
AWS Germany
 
Getting Started with Serverless Architectures
Rohini Gaonkar
 
From Monolithic to Modern Apps: Best Practices
Tom Laszewski
 
Aws re invent 2018 recap
CloudHesive
 
Introduction to Serverless Computing and AWS Lambda - AWS IL Meetup
Boaz Ziniman
 
Serverless use cases with AWS Lambda - More Serverless Event
Boaz Ziniman
 
Modern Applications Web Day | Impress Your Friends with Your First Serverless...
AWS Germany
 
Mainstream Serverless
Dhaval Nagar
 
Serverless Meetup - 12 gennaio 2017
Luca Bianchi
 
20200520 - Como empezar a desarrollar aplicaciones serverless
Marcia Villalba
 
Let Your Business Logic go Serverless | AWS Summit Tel Aviv 2019
AWS Summits
 
Serverless on AWS: Architectural Patterns and Best Practices
Vladimir Simek
 
Aws serverless architecture
genesesoftware
 

More from CloudHesive (20)

PPTX
CloudHesive x Datadog Multi Generational Observability
CloudHesive
 
PPTX
Modernization of your AWS based SaaS platform - Short
CloudHesive
 
PPTX
Modernization of your AWS based SaaS platform
CloudHesive
 
PPTX
Amazon Connect & AI - Shaping the Future of Customer Interactions - GenAI and...
CloudHesive
 
PPTX
Amazon Connect & AI - Shaping the Future of Customer Interactions - GenAI and...
CloudHesive
 
PPTX
Accelerating Business and Research Through Automation and Artificial Intellig...
CloudHesive
 
PPTX
Amazon Connect Rethink Your Contact Center with CloudHesive.pptx
CloudHesive
 
PPTX
ConnectPath Introduction
CloudHesive
 
PDF
Modernize your contact center with ConnectPath CX v2.pdf
CloudHesive
 
PDF
Modernize your contact center with ConnectPath CX — Chart.pdf
CloudHesive
 
PPTX
End User Computing at CloudHesive.pptx
CloudHesive
 
PPTX
Analytics at CloudHesive
CloudHesive
 
PPTX
Supporting your CMMC initiatives with Sumo Logic
CloudHesive
 
PDF
Best Practices and Resources to Effectively Manage and Optimize Your AWS Costs
CloudHesive
 
PPTX
Serverless data and analytics on AWS for operations
CloudHesive
 
PPTX
reInvent reCap 2022
CloudHesive
 
PDF
AWS Advanced Analytics Automation Toolkit (AAA)
CloudHesive
 
PDF
AWS Control Tower
CloudHesive
 
PPTX
Security on AWS, 2021 Edition Meetup
CloudHesive
 
PPTX
Security on AWS
CloudHesive
 
CloudHesive x Datadog Multi Generational Observability
CloudHesive
 
Modernization of your AWS based SaaS platform - Short
CloudHesive
 
Modernization of your AWS based SaaS platform
CloudHesive
 
Amazon Connect & AI - Shaping the Future of Customer Interactions - GenAI and...
CloudHesive
 
Amazon Connect & AI - Shaping the Future of Customer Interactions - GenAI and...
CloudHesive
 
Accelerating Business and Research Through Automation and Artificial Intellig...
CloudHesive
 
Amazon Connect Rethink Your Contact Center with CloudHesive.pptx
CloudHesive
 
ConnectPath Introduction
CloudHesive
 
Modernize your contact center with ConnectPath CX v2.pdf
CloudHesive
 
Modernize your contact center with ConnectPath CX — Chart.pdf
CloudHesive
 
End User Computing at CloudHesive.pptx
CloudHesive
 
Analytics at CloudHesive
CloudHesive
 
Supporting your CMMC initiatives with Sumo Logic
CloudHesive
 
Best Practices and Resources to Effectively Manage and Optimize Your AWS Costs
CloudHesive
 
Serverless data and analytics on AWS for operations
CloudHesive
 
reInvent reCap 2022
CloudHesive
 
AWS Advanced Analytics Automation Toolkit (AAA)
CloudHesive
 
AWS Control Tower
CloudHesive
 
Security on AWS, 2021 Edition Meetup
CloudHesive
 
Security on AWS
CloudHesive
 
Ad

Recently uploaded (20)

PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PDF
Using FME to Develop Self-Service CAD Applications for a Major UK Police Force
Safe Software
 
PDF
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
DOCX
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
PPTX
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
PDF
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
PDF
July Patch Tuesday
Ivanti
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
Using FME to Develop Self-Service CAD Applications for a Major UK Police Force
Safe Software
 
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
July Patch Tuesday
Ivanti
 
Ad

Serverless Generative AI on AWS, AWS User Groups of Florida

  • 1. Serverless Generative AI AWS User Groups of Florida Fort Lauderdale, FL, USA February 27th, 2024 Patrick Hannah CTO CloudHesive
  • 2. AWS User Groups of Florida – Updates We are back to In-Person Meetups and working towards a monthly cadence Always open to ideas on how we can improve the content and format! Collaborate with us after the MeetUp! Future MeetUps – Presenters? Topics? Formats? Slideshare – Keep an eye on our MeetUp Page – we will post a link to the Slides Slack – Keep the conversation going Today’s MeetUp Format Feel free to ask questions throughout the session! Dedicated Q&A at the end
  • 3. Topic In this session, I will unravel the complexities of serverless generative AI, offering insights into its architecture, applications, and potential impact on businesses across various industries. Whether you're a seasoned AWS practitioner or just starting your journey into cloud computing, this presentation promises to broaden your horizons and spark new ideas.
  • 4. Inspiration “I'm wondering if there is a feature request to create something like a saved query in Athena that can be executed via a CloudWatch Event?” The AWS Step Functions service integration with Amazon Athena enables you to use Step Functions to start and stop query execution, and get query results AWS User Groups of Florida MeetUp - AWS API Architectures - Scott Hendrickson, Partner Solutions Architect, AWS Data sources and resolvers are how AWS AppSync translates GraphQL requests and fetches information from your AWS resources AWS Well Architected Framework Serverless Application Lens If your Lambda function is not performing custom logic while integrating with other AWS services, chances are that it may be unnecessary
  • 5. Who doesn’t like connecting things together?
  • 6. Compute’s Transition to Serverless Compute - EC2 Bare Metal (Intel, AMD, Graviton, M1) Compute - EC2 Virtual > Bare Metal (Xen, KVM/Nitro) Containers - Fargate > ContainderD (was DockerD) > EC2 Serverless - Lambda > Firecracker (Micro VM) > EC2
  • 7. Serverless’ Flavors High Level Abstractions SaaS (Connect) Hybrid Abstractions PaaS (DynamoDB) Low Level Abstractions IaaS (Lambda)
  • 8. Service Categories Analytics Application Integration AR & VR AWS Cost Management Blockchain Business Applications Compute Customer Engagement Database Developer Tools End User Computing Game Tech Internet of Things Machine Learning Management & Governance Media Services Migration & Transfer Mobile Networking & Content Delivery Quantum Technologies Robotics Satellite Security, Identity, & Compliance Storage
  • 9. Workload Personas Migrated Server Based Migrated & Optimized Blends of Server and Service Based Serverless/Native Service Based Orchestrated ECS, EKS, K8s Inherited Wildcard! Hybrid Wildcard!
  • 10. Well Architected Framework Operational Excellence Security Reliability Performance Efficiency Cost Optimization Sustainability
  • 11. Cloud Workload Lifecycle Management Workload Architecture Monitoring Automation Processes Integration
  • 12. Workload + Architecture Drives Service Selection Containers Container File Versioning Multi-threaded/Single-task Minutes to Days Per VM/Per Hour Virtual Machines AMI Patching Multi-threaded/Multi-task Hours to Months Per VM/Per Hour Functions/Services Code Versioning Single-threaded/Single-task Microseconds to Seconds Per Memory/Second/Per Request
  • 13. Automation + Processes Drives Lifecycle Management Selection Organizations Cross-Account Asset Management + Governance Control Tower Account vending/default standardization Service Catalog Workload platform vending/default standardization CloudFormation IaC Ephemeral Compute + API Managed Data/Control Plane for Persistence Tiers Hands off/Lights out
  • 14. Processes Patching Backup/Restore Testing Failover Testing (AZ) Credential Rotation/Credential Audit Event Response Testing Incident Response Testing Performance Testing Performance/Cost Review Vulnerability/Penetration Testing
  • 17. Generative AI in the context of AWS Amazon Bedrock Amazon SageMaker, Studio and Canvas (and Redshift Inferences) NVIDIA GPU-powered Amazon EC2 instances AWS Tranium AWS Inferentia Amazon EC2 UltraClusters Amazon Q: Business, AWS, QuickSight, Connect, Supply Chain, Code Catalyst, IDE, Code Transformation, Query Editor (Redshift) PartyRock AWS CodeWhisperer AWS HealthScribe
  • 18. Generative AI in the context of AWS Services that accelerate development for AWS Services that are powered by it – No-code data connectors/Zero ETL, Instance Selection, Console to Code (and AppComposer), Natural Language Querying, Code Scanning, Datazone (Descriptions) Services that accelerate development for you – Lex (Conversational FAQ, Slot Resolution, Bot builder, Utterance Generator), Personalize (Themes), Transcribe (Summarization) Services improved by it – Alexa
  • 19. Rationalization Why Serverless – how does serverless change how we incept, launch, and iterate product? Why GenAI – how does Generative AI change how we think about solving problems with data?
  • 21. Bedrock Operationalization Non-functional Regional Considerations FM Subscription Throughput/Quotas Security Operational Monitoring Traffic Flow (Private Link) Functional Prompt Engineering Tokens Model Parameters Inference Parameters Sessions
  • 24. Databases that can be used to store Vector Embeddings OpenSearch/Serverless Redis Enterprise and MemoryDB Pinecone Aurora (Postgres) RDS (Postgres) MongoDB DocumentDB Neptune
  • 25. Machine Learning Amazon Augmented AI - Easily implement human review of machine learning predictions Amazon CodeGuru - Intelligent recommendations for building and running modern applications Amazon Comprehend - Analyze Unstructured Text Amazon Comprehend Medical - Amazon Comprehend Medical uses machine learning to extract insights and relationships from medical text. AWS DeepComposer - AWS DeepComposer allows developers of all skill levels to get started with Generative AI. AWS DeepLens - Deep Learning Enabled Video Camera AWS DeepRacer - Fully autonomous 1/18th scale race car, driven by machine learning Amazon DevOps Guru - ML-powered cloud operations service to improve application availability. Amazon Forecast - Amazon Forecast is a fully-managed service for accurate time-series forecasting Amazon Fraud Detector - Detect more online fraud faster using machine learning Amazon HealthLake - Making sense of health data Amazon Kendra - Highly accurate enterprise search service powered by machine learning AWS HealthImaging Amazon Lex - Build Voice and Text Chatbots Amazon Lookout for Equipment - Detect abnormal equipment behavior by analyzing sensor data Amazon Lookout for Metrics - Accurately detect anomalies in your business metrics and quickly understand why Amazon Lookout for Vision - Identify defects using computer vision to automate quality inspection. Amazon Monitron - End-to-end system for equipment monitoring Amazon Omics - Transform omics data into insights. AWS Panorama - Enabling computer vision applications at the edge Amazon Personalize - Amazon Personalize helps you easily add real-time recommendations to your apps Amazon Polly - Turn Text into Lifelike Speech Amazon Rekognition - Search and Analyze Images Amazon SageMaker - Build, Train, and Deploy Machine Learning Models Amazon Textract - Easily extract text and data from virtually any document Amazon Transcribe - Powerful Speech Recognition Amazon Translate - Powerful Neural Machine Translation Amazon Bedrock
  • 27. Primary Services API Tier API Gateway – API Management AppSync – GraphQL API Application (Execution)/Code Tier Lambda – Serverless Compute Data Store Tier DynamoDB – Key/Value Data Base Service Tier Event Bridge/Step Functions – Event Bus, Low Code/No Code Workflow Athena – Interactive Query Service S3 – Object Storage Glue – Data Integration Service
  • 28. Options for APIs Client > API Gateway HTTP > Things Client > API Gateway REST > Things Client > AppSync GraphQL > Things Client > Application Load Balancer > Lambda Client > Lambda Function URLs Client > CloudFront (Authorizer) > Lambda Client > AWS IoT
  • 29. Options to call AWS services w/o Lambda APIs API Gateway > AWS Services AppSync > GraphQL > Resolvers > AWS Services Event Step Functions > AWS Services EventBridge
  • 30. API Gateway Integrations AWS Service Lambda AWS Proxy Service Lambda HTTP HTTP Proxy Mock
  • 32. Sync versus Async Can the payload fit in the size/time constraints What is the impact to the client?
  • 33. Step Functions Optimized Integrations Lambda Batch DynamoDB ECS/Fargate SNS SQS Glue, DataBrew SageMaker EMR CodeBuild Athena EKS API Gateway EventBridge Step Functions HTTP Destinations (New) - https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/ Bedrock (New)- https://aws.amazon.com/about-aws/whats-new/2023/11/aws-step-functions-optimized-integration-bedrock/
  • 34. Options for Event Buses/Messaging/Queuing DynamoDB > Triggers CloudWatch Logs > Metrics > Alarms / Lambda CloudWatch Metrics > Destination Kinesis > Lambda Event Bridge (DLQ Support) > Lambda SQS (DLQ Support) > Lambda SNS (DLQ Support) > Lambda (DLQ Support) Lambda
  • 36. Serverless Data Stores - The Easy Button S3 Query – Query objects in S3, through S3 Athena (and S3 and Glue) – Query objects in S3, Presto AppFlow – Data Integration Platform Profiles Wisdom Tasks
  • 37. Serverless Data Stores DynamoDB – Key/Value Timescale – Time Series Keyspaces – Cassandra QLDB – Ledger Aurora – Relational Prometheus – Prometheus Grafana – Grafana MWAA – Airflow
  • 38. General Considerations Multi-Region? Single-Region? Which Region(s)? Which Services? What will they cost? How are they metered/billed? How far do we need to scale? What compliance requirements do we need to meet? What tools do we have in our reach? (Frameworks, Patterns, etc.)
  • 39. API Gateway Development (Isolation, Stages, SAM) Client Security (Certificates, API Keys, Authorizers) Gateway Security (WAF, Throttling) Endpoint Type (Edge optimized, Regional, Private, API Cache) Integration (Methods, Proxy, Response Codes) Operationalization (CloudWatch Logs, CloudWatch Metrics, Access Logging, X-Ray Testing (Direct, PostMan)
  • 40. Lambda Runtime Pre-Warming Sizing/Timeouts Development (Isolation, Versions, SAM, Cloud9, Parameterization) Integration (Methods, Response Codes) Security (KMS, Execution Role) Operationalization (CloudWatch Logs, CloudWatch Metrics, X-Ray) Testing (Direct)
  • 41. “The Rest” Development (Coding Best Practices, Runtime, RDBMS, DevOps) Data Stores that are not Serverless (Sizing, CloudWatch, Logs, Events, Backup/Recovery, Multi-AZ, Database “Stuff”) Trade-off VPC (Public Subnets, Private Subnets, Security Groups) Typical of Legacy Integrations, Non-Serverless Data Stores, etc. General (What are all of the things we need to think about when we create a new AWS account?) “Landing Zone”
  • 42. Conclusion AWS continues to increase the breadth and depth of their service offerings I wish it did that I didn’t know I needed that It’s easier to get started today than it was yesterday Simplicity Support Cost Lessons Learned Regional Availability Flexibility of implementation to change FMs (or even support custom FMs) and tune FM specific parameters Conclusion Generative AI and API Access to Generative AI services (like Bedrock) can be an easy button Not an end all – value can be found in context, which takes us back to needing a strong data foundation Priorities are still priorities – customers don’t care about Generative AI if your customers have needs unfulfilled by the product or by Generative AI Customers may also need to be led to it – if the customer isn’t asking, pushing it on them won’t help – they need education Consider sustainability when choosing an approach – Maslow’s Hammer Don’t forget about team enablement Limited by your imagination and ability to execute
  • 43. References https://docs.aws.amazon.com/wellarchitected/latest/serverless-applications-lens/wellarchitected- serverless-applications-lens.pdf – Well Architected Serverless Application Lens https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-aws-proxy.html – API Gateway Service Proxy Example https://docs.aws.amazon.com/apigateway/latest/developerguide/websocket-api-chat-app.html – API Gateway Websocket Example https://docs.aws.amazon.com/appsync/latest/devguide/tutorials.html – AppSync Tutorials https://docs.aws.amazon.com/appsync/latest/devguide/tutorial-dynamodb-resolvers.html – AppSync Tutorial DynamoDB Resolver https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html – Lambda URLS https://docs.aws.amazon.com/step-functions/latest/dg/connect-supported-services.html – Step Functions Supported Services https://docs.aws.amazon.com/step-functions/latest/dg/sample-athena-query.html – Step Functions Athena Query
  • 44. 0800-860-2040 [email protected] cloudhesive.com Fort Lauderdale 2419 E. Commercial Blvd, Ste. 300 Ft. Lauderdale, Florida USA Buenos Aires Av. Del Libertador 6680, Piso 6 CABA, Ciudad de Buenos Aires Argentina Santiago de Chile Cerro El Plomo 5420 SB1, Oficina 15 Nueva Las Condes, Santiago de Chile Chile

Editor's Notes