© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI architecture
patterns in production
Oleksii Ivanchenko
Solutions Architect
AWS
Generative AI evolution
2023: Exploration and experimentation
2024: Early gen AI applications in production, lessons learned
2025 and beyond: Broader adoption, deeper innovation, business value realization
Generative AI adoption mindsets
Pioneers: Leading the charge to transform their industry with AI
Adopters: Focused on implementing quick wins and building skills
Observers: Interested and watching the space as it evolves
AWS Generative AI Stack
APPLICATIONS TO BOOST PRODUCTIVITY
• Amazon Q Business: insights and automation
• Amazon Q Developer: software development lifecycle
MODELS AND TOOLS TO BUILD GENERATIVE AI APPS
• Amazon Bedrock: Amazon models and partner models
• Capabilities: Guardrails, Agents, Customization
INFRASTRUCTURE TO BUILD AND TRAIN AI MODELS
• High performance compute: AWS Trainium, AWS Inferentia, GPUs
• Managed infrastructure: Amazon SageMaker AI
Common architecture patterns
1. Employee productivity
Powered by Amazon Q
Amazon Q Business
• Delivers quick, accurate, and relevant answers to your business questions, securely and privately
• Executes actions using out-of-the-box or custom plugins
• Respects existing access control based on user permissions
• Connects to over 40 popular enterprise applications and document repositories
• Enables administrators to easily apply guardrails to customize and control responses
• Streamlines daily tasks with user-created lightweight applications
Knowledge management with Amazon Q Business
[Architecture diagram]
• An authenticated user on the corporate network sends a query through the Amazon Q web experience (Amazon Q web experience role)
• Identity provider: authenticates the user and supplies user and group info
• Data sources, each with its own role: S3, Salesforce, SharePoint, Confluence, Google Drive; Amazon Q ingests document content and permissions information
• Plugins (Amazon Q plugins role): Jira, ServiceNow
• Amazon Q application role, Amazon CloudWatch, AWS CloudTrail
• Channels: communication tools
• The user receives a permissions-filtered response
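The permissions-filtered response in the diagram can be illustrated with a minimal Python sketch. It is not how Amazon Q Business is implemented; the `Document` type, the `allowed_groups` field, and the keyword match are hypothetical stand-ins. It only shows the idea that permissions are ingested together with content and applied to each user's query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical: groups ingested alongside the content


def visible_documents(index, user_groups):
    """Keep only documents that share at least one group with the user."""
    groups = frozenset(user_groups)
    return [doc for doc in index if doc.allowed_groups & groups]


def permission_filtered_search(index, query, user_groups):
    """Naive keyword search restricted to documents the user may read."""
    terms = query.lower().split()
    return [
        doc.doc_id
        for doc in visible_documents(index, user_groups)
        if any(term in doc.text.lower() for term in terms)
    ]


# Toy index; group membership would come from the identity provider.
INDEX = [
    Document("rfp-template", "RFP response template", frozenset({"sales", "presales"})),
    Document("salary-bands", "Salary bands and RFP staffing costs", frozenset({"hr"})),
]
```

Both toy documents mention "RFP", yet a user in the `sales` group only gets `rfp-template` back, because the filter runs before matching.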
Adastra
[About Adastra]
• Large IT consulting company based in the Czech Republic and Canada
• Focuses on data, analytics, and machine learning
[Pain points]
• Difficulty finding information across multiple source documents
• Time-consuming search through hundreds of documents on MS SharePoint for market research and RFPs
[Solution]
Consolidation of organizational knowledge into an Amazon Q Business-powered AI engine integrated with MS SharePoint
[Results]
• Quick no-code implementation
• Streamlined knowledge management
• RFP process accelerated by 70%
• Competitive edge maintained by focusing on innovation
2. Generative AI For Builders
Powered by Amazon Bedrock
The easiest way to build and scale generative AI applications with powerful tools and foundation models
• Choice of leading FMs through a single API
• Model customization
• Retrieval Augmented Generation (RAG)
• Agents that execute multistep tasks
• Security, privacy, and data governance
• Model evaluation and RAG evaluation
Amazon Bedrock
THE EASIEST AND FASTEST WAY TO BUILD AND SCALE GENERATIVE AI APPLICATIONS
INFERENCE AT SCALE
• Flexible options: On demand, Provisioned throughput, Batch
• Optimized inference: Prompt Caching, Intelligent Prompt Routing, Latency-optimized inference
• Global reach: Worldwide regions, Cross-region inference
TOOLS
• Customize with your data: Fine tuning, Model Distillation, Knowledge Bases, Bedrock Data Automation
• Orchestrate and execute: Flows, Agents
• Secure and responsible: Security, Guardrails, Automated Reasoning checks
• Developer experience: Prompt optimization, Prompt management, IDE, Open source integration (LangChain, LangGraph, LlamaIndex, …)
MODEL CHOICE AND EVALUATION
• Base foundation models
• More models: Bedrock Marketplace, Custom model import
• Evaluation: Programmatic, RAG, Human, LLM as a judge
SECURITY
• VPC PrivateLink
• Encryption
• No customer data to model providers, or to AWS
• GDPR, SOC, ISO, CSA, HIPAA
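Retrieval Augmented Generation (RAG)
[Diagram, steps 1-6]
• Document data flow: new/updated documents from the document store pass through an embeddings model, which represents text as a vector (for example 0.011 -0.011 0.032 ... -0.011) stored in the vector store
• Query data flow: the user's question passes through an embeddings model, relevant documents are retrieved from the vector store as context, and the question + context go to the LLM, which returns the answer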
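The two flows in the RAG diagram can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `embed` is a bag-of-words stand-in for a real embeddings model, `VectorStore` is an in-memory list rather than a vector database, and the final LLM call is left out (only the question + context prompt is built). All names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory store of (vector, text) pairs."""

    def __init__(self):
        self.items = []

    def add(self, text):
        # Document data flow: embed the document and store the vector.
        self.items.append((embed(text), text))

    def retrieve(self, question, k=2):
        # Query data flow: embed the question and return the k closest texts.
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, context):
    """Combine question + retrieved context into the prompt sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


store = VectorStore()
for doc in [
    "Adastra accelerated its RFP process by 70%.",
    "Dende.ai reduced information processing time by 40%.",
    "Lyrebird Studio lowered inference cost by 31%.",
]:
    store.add(doc)

question = "How much did Lyrebird lower inference cost?"
prompt = build_prompt(question, store.retrieve(question, k=1))
```

In a real system the bag-of-words vectors would be replaced by dense embeddings, and `prompt` would be passed to the LLM to produce the answer.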
Advanced RAG
[Diagram, steps 1-7]
• Document data flow: new/updated documents from the document store pass through an embeddings model
• Query data flow, in the order the components appear:
• LLM cache for the user's query
• Query translation (LLM) producing an optimized query
• Query-based routing to a vector DB (through an embeddings model), RDS, or a graph DB to retrieve a document set
• Post-retrieval optimization with a re-ranking model, producing re-ranked context
• Retrieval evaluation (LLM); a bad search result leads to a search API and search result storage
• Query + retrieved context go to the LLM, which returns the answer
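Three of the Advanced RAG stages (cache lookup, query-based routing, and re-ranking) can be sketched as plain Python functions. This is an illustrative sketch only: the routing keywords, the term-overlap re-ranker, and the `retrievers` and `llm` callables are hypothetical placeholders for real models and data stores.

```python
def normalize(query):
    """Cheap query translation: lowercase and collapse whitespace."""
    return " ".join(query.lower().split())


def route(query):
    """Query-based routing with hypothetical keyword rules."""
    q = normalize(query)
    if any(phrase in q for phrase in ("how many", "total", "average")):
        return "sql"
    if any(phrase in q for phrase in ("related to", "depends on")):
        return "graph"
    return "vector"


def rerank(query, passages, top_n=2):
    """Toy re-ranker: order passages by the number of shared query terms."""
    terms = set(normalize(query).split())
    ranked = sorted(
        passages,
        key=lambda p: len(terms & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]


def answer(query, cache, retrievers, llm):
    """Cache lookup, then route, retrieve, re-rank, and generate."""
    key = normalize(query)
    if key in cache:
        return cache[key]  # cache hit: skip retrieval and generation
    passages = retrievers[route(key)](key)
    result = llm(key, rerank(key, passages))
    cache[key] = result
    return result
```

Here `cache` is any dict-like object, `retrievers` maps a route name to a function returning passages, and `llm` takes the query and the re-ranked context. Repeating a query that differs only in case or spacing is served from the cache.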
Amazon Bedrock Knowledge Bases - RAG
Smart data ingestion
• Data connectors
• Parsing choice: FM, BDA
• Chunking strategies: hierarchical, semantic, and custom
• Real-time sync for custom data sources
Intelligent RAG retrieval
• Custom prompt and inference parameters
• Hybrid search
• Metadata filtering
• Query reformulation
• Auto-generated query filters
• Rerank API
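Hybrid search and metadata filtering can be illustrated together with a small sketch. It assumes semantic and keyword scores have already been computed per chunk and simply blends them with a weight `alpha`; this weighting scheme and every name below are assumptions for illustration, not the Amazon Bedrock Knowledge Bases implementation.

```python
def matches(metadata, filters):
    """Metadata filtering: every filter key must equal the chunk's value."""
    return all(metadata.get(key) == value for key, value in filters.items())


def hybrid_search(chunks, semantic_scores, keyword_scores, filters, alpha=0.5, k=3):
    """Blend semantic and keyword scores over chunks that pass the filter.

    alpha=1.0 is purely semantic, alpha=0.0 is purely keyword-based.
    """
    scored = []
    for chunk in chunks:
        if not matches(chunk["metadata"], filters):
            continue
        cid = chunk["id"]
        score = alpha * semantic_scores.get(cid, 0.0) + (1 - alpha) * keyword_scores.get(cid, 0.0)
        scored.append((score, cid))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:k]]


# Toy data: chunk "c" scores highest but belongs to a different department.
CHUNKS = [
    {"id": "a", "metadata": {"dept": "hr"}},
    {"id": "b", "metadata": {"dept": "hr"}},
    {"id": "c", "metadata": {"dept": "it"}},
]
SEMANTIC = {"a": 0.9, "b": 0.2, "c": 0.95}
KEYWORD = {"a": 0.1, "b": 0.9, "c": 0.9}
```

With the filter `{"dept": "hr"}`, chunk `c` is excluded regardless of its score, and changing `alpha` changes which of the remaining chunks ranks first.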
Dende.ai
[About Dende.ai]
• Dende.ai is an e-learning platform
• Provides a personalized AI tutor that helps students learn and review key concepts from their own study material
• 40k+ students, 120+ countries, 100+ languages
[Pain points]
• Needed to reduce the loading time for generating tests and improve customer experience and retention
• Previous solutions suffered from latency and performance issues
[Solution]
Leveraged LLMs with Amazon Bedrock and its key capabilities, integrated in a scalable serverless architecture on AWS
[Results]
Reduced information processing time by 40% compared to previous solutions while maintaining a high level of quality and reliability
Intelligent Document Processing
[Pipeline diagram: IDP with GenAI, from a tech. blog post]
• Data capture: documents in an S3 bucket
• Classification (Amazon Comprehend OR Amazon Bedrock): categorize and tag documents with classification; categorized documents are routed based on business rules/type of info needed
• Extraction (Amazon Textract OCR): text, forms, tables
• Enrichment: derive key insights with extraction and enrichment. Amazon Comprehend: entities, PII redaction, and more. Amazon Bedrock: Q&A, tables, summarization, normalization, chatbots, and more
• Post-processing: completeness check, rule-based validation and review
• Verification and human review: route low-confidence predictions to a human in the loop; route high-confidence predictions onward
• Search-based verification: vector DB with Amazon Bedrock OR Amazon Q
• Send data to downstream databases/apps
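The post-processing and review stages of the pipeline (completeness check, then routing by confidence) can be sketched as follows. The `Extraction` type, the 0.9 threshold, and the field names are hypothetical. In practice the confidence scores would come from the extraction service and the threshold from business rules.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    field: str
    value: str
    confidence: float  # assumed to be in the range 0.0-1.0


def route_extractions(extractions, threshold=0.9, required=()):
    """Split extractions into auto-accepted and human-review queues.

    Also returns required fields that are missing entirely
    (a simple completeness check).
    """
    auto, review = [], []
    for item in extractions:
        if item.confidence >= threshold:
            auto.append(item)      # high confidence: send downstream
        else:
            review.append(item)    # low confidence: human in the loop
    found = {item.field for item in extractions}
    missing = [name for name in required if name not in found]
    return auto, review, missing


results = [
    Extraction("invoice_number", "INV-001", 0.99),
    Extraction("total", "1,250.00", 0.72),
]
auto, review, missing = route_extractions(
    results, threshold=0.9, required=("invoice_number", "total", "due_date")
)
```

In this toy run the invoice number is accepted automatically, the total goes to human review, and `due_date` is reported as missing.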
3. Data Science
Powered by AWS-designed silicon
Purpose-built accelerators for generative AI
• AWS Inferentia: Lowest cost per inference in the cloud for running deep learning (DL) models
• AWS Inferentia2: High performance at the lowest cost per inference for LLMs and diffusion models
• AWS Trainium2: The most cost-efficient, high-performance training of LLMs and diffusion models
Image generation and modification
[Architecture diagram, AWS Cloud]
• Mobile client, AWS AppSync
• Upload image: Amazon ECS (signed URL service), S3 Express One Zone
• Upload image notification: AWS Lambda (upload event handler), Amazon SNS (image data SNS)
• AWS Lambda (face edit request handler), Amazon SQS (face edit request queue)
• Amazon EC2 ASG (Inf2): polls messages from the queue and publishes a success/fail message to Amazon SNS
• AWS Lambda (face edit response handler), AWS Lambda (image process event handler)
• Save state: Amazon DynamoDB (image process state)
• Send image process result to the mobile client
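The queue-based flow in this architecture (enqueue a request, let workers poll, record state, publish a success/fail message) can be mimicked in-process with the Python standard library. This is only a sketch of the pattern: `queue.Queue`, the `state` dict, and the `notifications` list are local stand-ins for Amazon SQS, Amazon DynamoDB, and Amazon SNS, and `run_inference` is a hypothetical callable.

```python
import queue


def submit(request_queue, state, image_id):
    """Request handler: record the state and enqueue the edit request."""
    state[image_id] = "QUEUED"
    request_queue.put(image_id)


def drain(request_queue, state, run_inference, notifications):
    """Worker: poll until the queue is empty, saving state for each image."""
    while True:
        try:
            image_id = request_queue.get_nowait()
        except queue.Empty:
            return
        state[image_id] = "PROCESSING"
        try:
            run_inference(image_id)
            state[image_id] = "SUCCEEDED"
        except Exception:
            state[image_id] = "FAILED"
        # Publish the success/fail message for downstream handlers.
        notifications.append((image_id, state[image_id]))


def fake_inference(image_id):
    """Hypothetical model call that fails for one specific image."""
    if image_id == "corrupt.png":
        raise ValueError("cannot decode image")


requests, state, notifications = queue.Queue(), {}, []
for name in ("selfie.png", "corrupt.png"):
    submit(requests, state, name)
drain(requests, state, fake_inference, notifications)
```

Decoupling upload from inference through a queue lets the inference fleet scale independently of request spikes, and the state table lets the client learn the outcome later.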
Lyrebird Studio
[About Lyrebird Studio]
• Lyrebird Studio is a leading global developer and publisher of 50+ gen AI-powered photo and video editing mobile apps
• Apps shown: Cosmo, FaceLab, ToonApp
• 2.4+ billion downloads
• Handles ~5 million inferences/day, with inference response times from milliseconds to seconds
• 150+ countries, 20+ languages
[Pain point and solution]
• A shortage of A100 GPUs, latency, and high costs became major bottlenecks to achieving business goals
• Explored AWS Inferentia2 accelerators as an alternative
[Results]
• Lowered inference cost by 31%
• Accelerated inference times by 19%
• Achieved 24% higher throughput
Thank you!
Oleksii Ivanchenko
LinkedIn
