SlideShare a Scribd company logo
Privacy and
Security in the
Age of
Generative AI
Benjamin Bengfort, Ph.D. @ C4AI 2025
UNC5267
North Korea has used Western Language LLMs to
generate fake resumes and profiles to apply for
thousands of remote work jobs in western tech
companies.
Once hired, these “workers” (usually laptop farms in
China or Russia that are supervised by a handful of
individuals) use remote access tools to gain
unauthorized access to corporate infrastructure.
https://cloud.google.com/blog/topics/threat-intelligence/mitigating-dprk-it-worker-threat
https://www.forbes.com/sites/rashishrivastava/2024/08/27/the-prompt-north-korean-operati
ves-are-using-ai-to-get-remote-it-jobs/
AI Targeted Phishing
60% of participants in a recent study fell victim to AI
generated spear phishing content, a similar
success rate compared to non-AI generated
messages by human experts.
LLMs reduce the cost of generating spear phishing
messages by 95% while increasing their
effectiveness.
https://hbr.org/2024/05/ai-will-increase-the-quantity-and-quality-of-phishing-scams
F. Heiding, B. Schneier, A. Vishwanath, J. Bernstein and P. S. Park, "Devising and
Detecting Phishing Emails Using Large Language Models," in IEEE Access, vol. 12, pp.
42131-42146, 2024, doi: 10.1109/ACCESS.2024.3375882.
AI Generated Malware
OpenAI is playing a game of whack a mole trying to
ban the accounts of malicious actors who are using
ChatGPT to quickly generate malware as payloads
in targeted attacks using zip files, VBScripts, etc.
“The code is clearly AI generated because it is well
commented and most malicious actors want to
obfuscate what they’re doing to security
researchers.”
https://www.bleepingcomputer.com/news/security/openai-confirms-threat-actors-use-chatg
pt-to-write-malware/
https://www.bleepingcomputer.com/news/security/hackers-deploy-ai-written-malware-in-tar
geted-attacks/
Hugging Face Attacks
While Hugging Face does have excellent security
best practices and code scanning alerts; it is still a
vector of attack because of arbitrary code execution
in pickle __reduce__ and torch.load.
For example, the baller423/goober2 repository
had a model uploaded that initiates a reverse shell
to an IP address allowing the attacker to access the
model compute environment.
https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-si
lent-backdoor/
Data Trojans in DRL
AI Agents can be exploited to cause harm using
data poisoning or trojans injected during the
training phase of deep reinforcement learning.
Poisoning as little as 0.025% of the training data
allowed the inclusion of a classification backdoor
causing the agent to call a remote function.
A simple agent whose task is constrained is usually
allowed admin level privileges in its operation.
Panagiota, Kiourti, et al. "Trojdrl: Trojan attacks on deep reinforcement learning agents. in
proc. 57th acm/ieee design automation conference (dac), 2020, march 2020." Proc. 57th
ACM/IEEE Design Automation Conference (DAC), 2020. 2020.
Adversarial self-replicating prompts: prompts that
when processed by Gemini Pro, ChatGPT 4.0 and
LLaVA caused the model to replicate the input as
output to engage in malicious activities.
Additionally, these inputs compel the agent to
propagate to new agents by exploiting connectivity
within the GenAI ecosystem.
2 methods: flow-steering and RAG poisoning.
GenAI Worms
Cohen, Stav, Ron Bitton, and Ben Nassi. "Here Comes The AI Worm: Unleashing
Zero-click Worms that Target GenAI-Powered Applications." arXiv preprint
arXiv:2403.02817 (2024).
A custom AI agent built to translate natural
language prompts into bash commands using
Anthropic’s Claude LLM.
Prompt: “Access desktop using SSH”
SSH was successful but the agent continued by
updating the old Linux kernel, then investigated
why apt was taking so long and eventually bricked
the computer by rewriting the Grub boot loader.
Rogue Agents
https://decrypt.co/284574/ai-assistant-goes-rogue-and-ends-up-bricking-a-users-computer
Generally, prompts that are intended to cause an
LLM to leak sensitive information or to perform a
task in a manner not proscribed by the application
to the attacker’s benefit.
Extended case: the manipulation of a valid user’s
prompt in order to cause the LLM to take an
unexpected action or cause irrelevant output.
Prompt Injection
https://decrypt.co/284574/ai-assistant-goes-rogue-and-ends-up-bricking-a-users-computer
Liu, Yupei, et al. "Formalizing and benchmarking prompt injection attacks and defenses."
33rd USENIX Security Symposium (USENIX Security 24). 2024.
Targeting function calling LLMs that perform Google
searches and include the results into a prompt (e.g.
search based RAG); researchers showed that by
embedding hidden prompts into the retrieved
websites, they could manipulate LLMs to expose
private user data and information.
Indirect Prompt
Injection
https://thehill.com/opinion/cybersecurity/3953399-hijacked-ai-assistants-can-now-hack-you
r-data/
Greshake, Kai, et al. "Not what you've signed up for: Compromising real-world
llm-integrated applications with indirect prompt injection." Proceedings of the 16th ACM
Workshop on Artificial Intelligence and Security. 2023.
You can type just about anything into ChatGPT. But
users recently discovered that asking anything
about "David Mayer" caused ChatGPT to shut
down the conversation with the terse reply, "I'm
unable to produce a response."
A message shown at the bottom of the screen
doubled up on the David-dislike, saying, "There
was an error generating a response.
David Mayer
https://www.newsweek.com/chatgpt-openai-david-mayer-error-ai-1994100
https://www.cnet.com/tech/services-and-software/chatgpt-wont-answer-questions-about-c
ertain-names-heres-what-we-know/
Function calling (also referred to as “skills” or “tool
use” allows LLMs to make API calls based on the
descriptions of the tools available and their
parameters.
However, give an LLM a tool … it wants to use that
tool! Even prompts such as “tell me a joke” might
lead to unexpected tool use.
For more on this - come tonight!
Function Calling
https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling
A specialized form of indirect prompt injection;
exploits the fact that AI models see the complete
tool descriptions, including hidden instructions,
while users typically only see simplified versions in
their UI.
The attack modifies the tool instructions and can
use shadowing to exploit trusted servers. Because
MCP (Model Context Protocol) uses these tool calls
and has a trusted execution context, attackers can
gain access to sensitive files such as SSH keys.
Tool Poisoning: MCP
https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks
Important Lessons
Expect the Unexpected
Generative AI is not a deterministic
computer program that will behave within
expected pre-defined parameters. Treat
AI as stochastic and unpredictable.
Data governance and security in the form
of access controls is not optional when
doing machine learning and AI tasks.
Data security is as important as compute
environment security.
Do not trust the internet! Verify, escape,
scrub, and scan anything that comes from
the web! Make sure that you and your
models have guardrails.
We desperately need a mechanism to
identify what is human generated text or
imagery and what is AI generated.
Classifiers and/or watermarking is not
sufficient!
Guardrails!
Data Governance is Key
Certify Authorship
Important Lessons
Expect the Unexpected
Generative AI is not a deterministic
computer program that will behave within
expected pre-defined parameters. Treat
AI as stochastic and unpredictable.
Data governance and security in the form
of access controls is not optional when
doing machine learning and AI tasks.
Data security is as important as compute
environment security.
Do not trust the internet! Verify, escape,
scrub, and scan anything that comes from
the web! Make sure that you and your
models have guardrails.
We desperately need a mechanism to
identify what is human generated text or
imagery and what is AI generated.
Classifiers and/or watermarking is not
sufficient!
Guardrails!
Data Governance is Key
Certify Authorship
Important Lessons
Expect the Unexpected
Generative AI is not a deterministic
computer program that will behave within
expected pre-defined parameters. Treat
AI as stochastic and unpredictable.
Data governance and security in the form
of access controls is not optional when
doing machine learning and AI tasks.
Data security is as important as compute
environment security.
Do not trust the internet! Verify, escape,
scrub, and scan anything that comes from
the web! Make sure that you and your
models have guardrails.
We desperately need a mechanism to
identify what is human generated text or
imagery and what is AI generated.
Classifiers and/or watermarking is not
sufficient!
Guardrails!
Data Governance is Key
Certify Authorship
Important Lessons
Expect the Unexpected
Generative AI is not a deterministic
computer program that will behave within
expected pre-defined parameters. Treat
AI as stochastic and unpredictable.
Data governance and security in the form
of access controls is not optional when
doing machine learning and AI tasks.
Data security is as important as compute
environment security.
Do not trust the internet! Verify, escape,
scrub, and scan anything that comes from
the web! Make sure that you and your
models have guardrails.
We desperately need a mechanism to
identify what is human generated text or
imagery and what is AI generated.
Classifiers and/or watermarking is not
sufficient!
Guardrails!
Data Governance is Key
Certify Authorship
Happy to take comments and questions
online or chat after the talk!
benjamin@rotational.io
https://rtnl.link/SEmP0wIrMft
rotational.io
Thanks!
Some images in this presentation were AI generated using Gemini Pro
Special thanks to Ali Haidar and John Bruns at Anomali for
providing some of the threat intelligence research.
@bbengfort

More Related Content

PDF
Privacy and Security in the Age of Generative AI
Benjamin Bengfort
 
PDF
Artificial Intelligence in cybersecurity
SmartlearningUK
 
PDF
PROFITABLE USES OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TO SECURE OUR...
IJNSA Journal
 
PDF
Securing Cloud Using Fog: A Review
IRJET Journal
 
PPTX
Secure AI Development: Strategies for Safe Innovation in a Machine-Led World
sayalikerimova20
 
PDF
Lessons Learned from Developing Secure AI Workflows.pdf
Priyanka Aash
 
PDF
Implementing Function Calling LLMs without Fear.pdf
Benjamin Bengfort
 
PDF
Phishing Detection using Decision Tree Model
IRJET Journal
 
Privacy and Security in the Age of Generative AI
Benjamin Bengfort
 
Artificial Intelligence in cybersecurity
SmartlearningUK
 
PROFITABLE USES OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TO SECURE OUR...
IJNSA Journal
 
Securing Cloud Using Fog: A Review
IRJET Journal
 
Secure AI Development: Strategies for Safe Innovation in a Machine-Led World
sayalikerimova20
 
Lessons Learned from Developing Secure AI Workflows.pdf
Priyanka Aash
 
Implementing Function Calling LLMs without Fear.pdf
Benjamin Bengfort
 
Phishing Detection using Decision Tree Model
IRJET Journal
 

Similar to SpyHunter Crack Latest Version FREE Download 2025 (20)

PDF
Role of Generative AI in Cybersecurity.pdf
imoliviabennett
 
PPTX
Product security by Blockchain, AI and Security Certs
LabSharegroup
 
PDF
MACHINE LEARNING APPROACH TO LEARN AND DETECT MALWARE IN ANDROID
IRJET Journal
 
PPTX
Automation: Embracing the Future of SecOps
IBM Security
 
PPTX
Ethical hacking
Alapan Banerjee
 
PDF
Two Aspect Endorsement Access Control for web Based Cloud Computing
IRJET Journal
 
PDF
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
PDF
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
PDF
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
PDF
Role of Generative AI in Cybersecurity.pdf
SoluLab1231
 
PDF
Ijsrdv8 i10355
aissmsblogs
 
PPTX
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
DataScienceConferenc1
 
DOCX
Create a software key logger
GiralFaurel
 
PDF
API SECURITY by krishna murari and vikas maurya
Krishna Murari
 
PPSX
IBM: Cognitive Security Transformation for the Enrgy Sector
FMA Summits
 
PDF
FireTail at API Days Australia 2024 - The Double-edge sword of AI for API Sec...
JeremySnyder8
 
PDF
Machine Learning: A Game-Changer in Cybersecurity
Home
 
PDF
A Survey of Keylogger in Cybersecurity Education
ijtsrd
 
PPTX
Machine learning in Cyber Security
RajathV2
 
PDF
Improve network safety through better visibility – Netmagic
Netmagic Solutions Pvt. Ltd.
 
Role of Generative AI in Cybersecurity.pdf
imoliviabennett
 
Product security by Blockchain, AI and Security Certs
LabSharegroup
 
MACHINE LEARNING APPROACH TO LEARN AND DETECT MALWARE IN ANDROID
IRJET Journal
 
Automation: Embracing the Future of SecOps
IBM Security
 
Ethical hacking
Alapan Banerjee
 
Two Aspect Endorsement Access Control for web Based Cloud Computing
IRJET Journal
 
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
Empowering Cloud-native Security: the Transformative Role of Artificial Intel...
gerogepatton
 
Role of Generative AI in Cybersecurity.pdf
SoluLab1231
 
Ijsrdv8 i10355
aissmsblogs
 
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
DataScienceConferenc1
 
Create a software key logger
GiralFaurel
 
API SECURITY by krishna murari and vikas maurya
Krishna Murari
 
IBM: Cognitive Security Transformation for the Enrgy Sector
FMA Summits
 
FireTail at API Days Australia 2024 - The Double-edge sword of AI for API Sec...
JeremySnyder8
 
Machine Learning: A Game-Changer in Cybersecurity
Home
 
A Survey of Keylogger in Cybersecurity Education
ijtsrd
 
Machine learning in Cyber Security
RajathV2
 
Improve network safety through better visibility – Netmagic
Netmagic Solutions Pvt. Ltd.
 
Ad

More from channarbrothers93 (8)

PDF
Epic Pen Pro Crack FREE Download link 2p25
channarbrothers93
 
PDF
Epic Pen Pro Crack FREE Download LINK 2025
channarbrothers93
 
PDF
K7 Ultimate Security Crack FREE latest version 2025
channarbrothers93
 
PDF
CCleaner Pro Crack Latest Version FREE Download 2025
channarbrothers93
 
PDF
Avast Free Antivirus Crack FREE Downlaod 2025
channarbrothers93
 
PDF
Bandicam Crack FREE Download Latest Version 2025
channarbrothers93
 
PDF
WTFAST Crack Latest Version FREE Downlaod 2025
channarbrothers93
 
PDF
uTorrent Pro Crack Latest Version free 2025
channarbrothers93
 
Epic Pen Pro Crack FREE Download link 2p25
channarbrothers93
 
Epic Pen Pro Crack FREE Download LINK 2025
channarbrothers93
 
K7 Ultimate Security Crack FREE latest version 2025
channarbrothers93
 
CCleaner Pro Crack Latest Version FREE Download 2025
channarbrothers93
 
Avast Free Antivirus Crack FREE Downlaod 2025
channarbrothers93
 
Bandicam Crack FREE Download Latest Version 2025
channarbrothers93
 
WTFAST Crack Latest Version FREE Downlaod 2025
channarbrothers93
 
uTorrent Pro Crack Latest Version free 2025
channarbrothers93
 
Ad

Recently uploaded (20)

PPTX
The-Dawn-of-AI-Reshaping-Our-World.pptxx
parthbhanushali307
 
PPTX
AI-Ready Handoff: Auto-Summaries & Draft Emails from MQL to Slack in One Flow
bbedford2
 
PPTX
Presentation about variables and constant.pptx
kr2589474
 
PDF
49784907924775488180_LRN2959_Data_Pump_23ai.pdf
Abilash868456
 
PPTX
Visualising Data with Scatterplots in IBM SPSS Statistics.pptx
Version 1 Analytics
 
PDF
Microsoft Teams Essentials; The pricing and the versions_PDF.pdf
Q-Advise
 
PDF
QAware_Mario-Leander_Reimer_Architecting and Building a K8s-based AI Platform...
QAware GmbH
 
PDF
lesson-2-rules-of-netiquette.pdf.bshhsjdj
jasmenrojas249
 
PDF
Teaching Reproducibility and Embracing Variability: From Floating-Point Exper...
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
PPTX
GALILEO CRS SYSTEM | GALILEO TRAVEL SOFTWARE
philipnathen82
 
PDF
Protecting the Digital World Cyber Securit
dnthakkar16
 
PPTX
Odoo Integration Services by Candidroot Solutions
CandidRoot Solutions Private Limited
 
PDF
Key Features to Look for in Arizona App Development Services
Net-Craft.com
 
PPTX
Why Use Open Source Reporting Tools for Business Intelligence.pptx
Varsha Nayak
 
PDF
Salesforce Implementation Services Provider.pdf
VALiNTRY360
 
PDF
Jenkins: An open-source automation server powering CI/CD Automation
SaikatBasu37
 
PDF
advancepresentationskillshdhdhhdhdhdhhfhf
jasmenrojas249
 
PPT
Why Reliable Server Maintenance Service in New York is Crucial for Your Business
Sam Vohra
 
PPTX
Role Of Python In Programing Language.pptx
jaykoshti048
 
PDF
Bandai Playdia The Book - David Glotz
BluePanther6
 
The-Dawn-of-AI-Reshaping-Our-World.pptxx
parthbhanushali307
 
AI-Ready Handoff: Auto-Summaries & Draft Emails from MQL to Slack in One Flow
bbedford2
 
Presentation about variables and constant.pptx
kr2589474
 
49784907924775488180_LRN2959_Data_Pump_23ai.pdf
Abilash868456
 
Visualising Data with Scatterplots in IBM SPSS Statistics.pptx
Version 1 Analytics
 
Microsoft Teams Essentials; The pricing and the versions_PDF.pdf
Q-Advise
 
QAware_Mario-Leander_Reimer_Architecting and Building a K8s-based AI Platform...
QAware GmbH
 
lesson-2-rules-of-netiquette.pdf.bshhsjdj
jasmenrojas249
 
Teaching Reproducibility and Embracing Variability: From Floating-Point Exper...
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
GALILEO CRS SYSTEM | GALILEO TRAVEL SOFTWARE
philipnathen82
 
Protecting the Digital World Cyber Securit
dnthakkar16
 
Odoo Integration Services by Candidroot Solutions
CandidRoot Solutions Private Limited
 
Key Features to Look for in Arizona App Development Services
Net-Craft.com
 
Why Use Open Source Reporting Tools for Business Intelligence.pptx
Varsha Nayak
 
Salesforce Implementation Services Provider.pdf
VALiNTRY360
 
Jenkins: An open-source automation server powering CI/CD Automation
SaikatBasu37
 
advancepresentationskillshdhdhhdhdhdhhfhf
jasmenrojas249
 
Why Reliable Server Maintenance Service in New York is Crucial for Your Business
Sam Vohra
 
Role Of Python In Programing Language.pptx
jaykoshti048
 
Bandai Playdia The Book - David Glotz
BluePanther6
 

SpyHunter Crack Latest Version FREE Download 2025

  • 1. Privacy and Security in the Age of Generative AI Benjamin Bengfort, Ph.D. @ C4AI 2025
  • 2. UNC5267 North Korea has used Western Language LLMs to generate fake resumes and profiles to apply for thousands of remote work jobs in western tech companies. Once hired, these “workers” (usually laptop farms in China or Russia that are supervised by a handful of individuals) use remote access tools to gain unauthorized access to corporate infrastructure. https://cloud.google.com/blog/topics/threat-intelligence/mitigating-dprk-it-worker-threat https://www.forbes.com/sites/rashishrivastava/2024/08/27/the-prompt-north-korean-operati ves-are-using-ai-to-get-remote-it-jobs/
  • 3. AI Targeted Phishing 60% of participants in a recent study fell victim to AI generated spear phishing content, a similar success rate compared to non-AI generated messages by human experts. LLMs reduce the cost of generating spear phishing messages by 95% while increasing their effectiveness. https://hbr.org/2024/05/ai-will-increase-the-quantity-and-quality-of-phishing-scams F. Heiding, B. Schneier, A. Vishwanath, J. Bernstein and P. S. Park, "Devising and Detecting Phishing Emails Using Large Language Models," in IEEE Access, vol. 12, pp. 42131-42146, 2024, doi: 10.1109/ACCESS.2024.3375882.
  • 4. AI Generated Malware OpenAI is playing a game of whack a mole trying to ban the accounts of malicious actors who are using ChatGPT to quickly generate malware as payloads in targeted attacks using zip files, VBScripts, etc. “The code is clearly AI generated because it is well commented and most malicious actors want to obfuscate what they’re doing to security researchers.” https://www.bleepingcomputer.com/news/security/openai-confirms-threat-actors-use-chatg pt-to-write-malware/ https://www.bleepingcomputer.com/news/security/hackers-deploy-ai-written-malware-in-tar geted-attacks/
  • 5. Hugging Face Attacks While Hugging Face does have excellent security best practices and code scanning alerts; it is still a vector of attack because of arbitrary code execution in pickle __reduce__ and torch.load. For example, the baller423/goober2 repository had a model uploaded that initiates a reverse shell to an IP address allowing the attacker to access the model compute environment. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-si lent-backdoor/
  • 6. Data Trojans in DRL AI Agents can be exploited to cause harm using data poisoning or trojans injected during the training phase of deep reinforcement learning. Poisoning as little as 0.025% of the training data allowed the inclusion of a classification backdoor causing the agent to call a remote function. A simple agent whose task is constrained is usually allowed admin level privileges in its operation. Panagiota, Kiourti, et al. "Trojdrl: Trojan attacks on deep reinforcement learning agents. in proc. 57th acm/ieee design automation conference (dac), 2020, march 2020." Proc. 57th ACM/IEEE Design Automation Conference (DAC), 2020. 2020.
  • 7. Adversarial self-replicating prompts: prompts that when processed by Gemini Pro, ChatGPT 4.0 and LLaVA caused the model to replicate the input as output to engage in malicious activities. Additionally, these inputs compel the agent to propagate to new agents by exploiting connectivity within the GenAI ecosystem. 2 methods: flow-steering and RAG poisoning. GenAI Worms Cohen, Stav, Ron Bitton, and Ben Nassi. "Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications." arXiv preprint arXiv:2403.02817 (2024).
  • 8. A custom AI agent built to translate natural language prompts into bash commands using Anthropic’s Claude LLM. Prompt: “Access desktop using SSH” SSH was successful but the agent continued by updating the old Linux kernel, then investigated why apt was taking so long and eventually bricked the computer by rewriting the Grub boot loader. Rogue Agents https://decrypt.co/284574/ai-assistant-goes-rogue-and-ends-up-bricking-a-users-computer
  • 9. Generally, prompts that are intended to cause an LLM to leak sensitive information or to perform a task in a manner not proscribed by the application to the attacker’s benefit. Extended case: the manipulation of a valid user’s prompt in order to cause the LLM to take an unexpected action or cause irrelevant output. Prompt Injection https://decrypt.co/284574/ai-assistant-goes-rogue-and-ends-up-bricking-a-users-computer Liu, Yupei, et al. "Formalizing and benchmarking prompt injection attacks and defenses." 33rd USENIX Security Symposium (USENIX Security 24). 2024.
  • 10. Targeting function calling LLMs that perform Google searches and include the results into a prompt (e.g. search based RAG); researchers showed that by embedding hidden prompts into the retrieved websites, they could manipulate LLMs to expose private user data and information. Indirect Prompt Injection https://thehill.com/opinion/cybersecurity/3953399-hijacked-ai-assistants-can-now-hack-you r-data/ Greshake, Kai, et al. "Not what you've signed up for: Compromising real-world llm-integrated applications with indirect prompt injection." Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security. 2023.
  • 11. You can type just about anything into ChatGPT. But users recently discovered that asking anything about "David Mayer" caused ChatGPT to shut down the conversation with the terse reply, "I'm unable to produce a response." A message shown at the bottom of the screen doubled up on the David-dislike, saying, "There was an error generating a response. David Mayer https://www.newsweek.com/chatgpt-openai-david-mayer-error-ai-1994100 https://www.cnet.com/tech/services-and-software/chatgpt-wont-answer-questions-about-c ertain-names-heres-what-we-know/
  • 12. Function calling (also referred to as “skills” or “tool use” allows LLMs to make API calls based on the descriptions of the tools available and their parameters. However, give an LLM a tool … it wants to use that tool! Even prompts such as “tell me a joke” might lead to unexpected tool use. For more on this - come tonight! Function Calling https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling
  • 13. A specialized form of indirect prompt injection; exploits the fact that AI models see the complete tool descriptions, including hidden instructions, while users typically only see simplified versions in their UI. The attack modifies the tool instructions and can use shadowing to exploit trusted servers. Because MCP (Model Context Protocol) uses these tool calls and has a trusted execution context, attackers can gain access to sensitive files such as SSH keys. Tool Poisoning: MCP https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks
  • 14. Important Lessons Expect the Unexpected Generative AI is not a deterministic computer program that will behave within expected pre-defined parameters. Treat AI as stochastic and unpredictable. Data governance and security in the form of access controls is not optional when doing machine learning and AI tasks. Data security is as important as compute environment security. Do not trust the internet! Verify, escape, scrub, and scan anything that comes from the web! Make sure that you and your models have guardrails. We desperately need a mechanism to identify what is human generated text or imagery and what is AI generated. Classifiers and/or watermarking is not sufficient! Guardrails! Data Governance is Key Certify Authorship
  • 15. Important Lessons Expect the Unexpected Generative AI is not a deterministic computer program that will behave within expected pre-defined parameters. Treat AI as stochastic and unpredictable. Data governance and security in the form of access controls is not optional when doing machine learning and AI tasks. Data security is as important as compute environment security. Do not trust the internet! Verify, escape, scrub, and scan anything that comes from the web! Make sure that you and your models have guardrails. We desperately need a mechanism to identify what is human generated text or imagery and what is AI generated. Classifiers and/or watermarking is not sufficient! Guardrails! Data Governance is Key Certify Authorship
  • 16. Important Lessons Expect the Unexpected Generative AI is not a deterministic computer program that will behave within expected pre-defined parameters. Treat AI as stochastic and unpredictable. Data governance and security in the form of access controls is not optional when doing machine learning and AI tasks. Data security is as important as compute environment security. Do not trust the internet! Verify, escape, scrub, and scan anything that comes from the web! Make sure that you and your models have guardrails. We desperately need a mechanism to identify what is human generated text or imagery and what is AI generated. Classifiers and/or watermarking is not sufficient! Guardrails! Data Governance is Key Certify Authorship
  • 17. Important Lessons Expect the Unexpected Generative AI is not a deterministic computer program that will behave within expected pre-defined parameters. Treat AI as stochastic and unpredictable. Data governance and security in the form of access controls is not optional when doing machine learning and AI tasks. Data security is as important as compute environment security. Do not trust the internet! Verify, escape, scrub, and scan anything that comes from the web! Make sure that you and your models have guardrails. We desperately need a mechanism to identify what is human generated text or imagery and what is AI generated. Classifiers and/or watermarking is not sufficient! Guardrails! Data Governance is Key Certify Authorship
  • 18. Happy to take comments and questions online or chat after the talk! [email protected] https://rtnl.link/SEmP0wIrMft rotational.io Thanks! Some images in this presentation were AI generated using Gemini Pro Special thanks to Ali Haidar and John Bruns at Anomali for providing some of the threat intelligence research. @bbengfort