Deploying AgentQnA on Intel® Xeon® Processors

This document outlines the single-node deployment process for an AgentQnA application built from GenAIComps microservices on an Intel® Xeon® server. The steps cover pulling Docker images, deploying the containers with Docker Compose, and running and validating the agent microservices.

Table of Contents

  1. AgentQnA Quick Start Deployment

  2. Configuration Parameters

  3. AgentQnA Docker Compose Files

  4. Validate Services

  5. Interact with the agent system with UI

  6. Register other tools with the AI agent

  7. Conclusion

AgentQnA Quick Start Deployment

This section describes how to quickly deploy and test the AgentQnA service manually on an Intel® Xeon® processor. The basic steps are:

  1. Access the Code

  2. Configure the Deployment Environment

  3. Deploy the Services Using Docker Compose

  4. Ingest Data into the Vector Database

  5. Cleanup the Deployment

Access the Code

Clone the GenAIExamples repository and access the AgentQnA Intel® Xeon® platform Docker Compose files and supporting scripts:

export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AgentQnA

Then check out a released version, such as v1.4:

git checkout v1.4

Configure the Deployment Environment

To set up environment variables for deploying AgentQnA services, first set the parameters specific to the deployment environment and then source the set_env.sh script in this directory:

export host_ip="External_Public_IP"           # ip address of the node
export HF_TOKEN="Your_HuggingFace_API_Token"  # the huggingface API token you applied
export http_proxy="Your_HTTP_Proxy"           # http proxy if any
export https_proxy="Your_HTTPs_Proxy"         # https proxy if any
export no_proxy=localhost,127.0.0.1,$host_ip  # additional no proxies if needed
export NGINX_PORT=${your_nginx_port}          # your usable port for nginx, 80 for example

[Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference

To use OpenAI models, generate an API key by following OpenAI's instructions.

To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.

Then set the environment variable OPENAI_API_KEY with the key contents:

export OPENAI_API_KEY=<your-openai-key>

Then, set up environment variables for the selected hardware by sourcing the corresponding set_env.sh:

source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
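
Before moving on, it can help to confirm that the key variables from the steps above are exported in the current shell. This is an optional sanity check that only uses the variables defined earlier in this section:

# Optional sanity check: confirm key variables are set in the current shell
echo "host_ip=${host_ip}"
echo "NGINX_PORT=${NGINX_PORT}"
[ -n "$HF_TOKEN" ] && echo "HF_TOKEN is set" || echo "HF_TOKEN is NOT set"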

Deploy the Services Using Docker Compose

Docker Compose makes it convenient to launch the whole system, which includes microservices for the LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are three Docker Compose files so users can pick and choose: a different retrieval tool can be used instead of the DocIndexRetriever example provided in the GenAIExamples repo, and the telemetry containers can be left out.

On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.

cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon

OpenAI Models

The command below will launch the multi-agent system with the DocIndexRetriever as the retrieval tool for the Worker RAG agent.

docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
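
Once the command returns, an optional way to verify that the expected services came up is to list them with the same set of compose files:

# Optional: list the services launched by this deployment
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml ps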

Models on Remote Server

When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add compose_remote.yaml to the docker compose command and set additional environment variables.

Notes

  • OPENAI_API_KEY is already set in a previous step.

  • model overrides the value set for this environment variable in set_env.sh.

  • LLM_ENDPOINT_URL is the base URL provided by the owner of the on-prem machine or the cloud service provider. It follows the format "https://<remote-server-base-url>", for example "https://api.inference.example.com".

export model=<name-of-model-card>
export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
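
If the launch fails or the agents cannot reach the model, a quick reachability check of the remote endpoint can help. The sketch below assumes the remote server exposes the standard OpenAI-compatible /v1/models route; adjust the path if your provider differs:

# Optional: confirm the remote endpoint accepts the key (assumes an OpenAI-compatible /v1/models route)
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer ${OPENAI_API_KEY}" \
  ${LLM_ENDPOINT_URL}/v1/models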

Build image from source

Please refer to the table below to build different microservices from source:

| Microservice | Deployment Guide      |
| ------------ | --------------------- |
| Agent        | Agent build guide     |
| UI           | Basic UI build guide  |

Ingest Data into the Vector Database

The run_ingest_data.sh script uses an example JSONL file to ingest sample documents into the vector database. Other ways to ingest data, and the other document types supported, are described in the OPEA dataprep microservice in the opea-project/GenAIComps repo.

cd  $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool/
bash run_ingest_data.sh

Note: This is a one-time operation.
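
To ingest your own documents instead of the bundled example data, the dataprep microservice can be called directly. The request below is a minimal sketch only: the port (6007), route (/v1/dataprep/ingest), and file name are assumptions, so check the dataprep README in opea-project/GenAIComps for the exact endpoint in your version.

# Hypothetical example: upload a local file to the dataprep service (port and route are assumptions)
curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_document.pdf"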

Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command:

# for OpenAI Models
docker compose -f compose_openai.yaml down
# for Models on Remote Server
docker compose -f compose_remote.yaml down

Configuration Parameters

Key parameters are configured via environment variables set before running docker compose up.

| Environment Variable                | Description                                                                              | Default (Set Externally)                      |
| ----------------------------------- | ---------------------------------------------------------------------------------------- | --------------------------------------------- |
| ip_address                          | External IP address of the host machine. Required.                                       | your_external_ip_address                       |
| OPENAI_API_KEY                      | Your OpenAI API key for model access. Required.                                          | your_openai_api_key                            |
| model                               | Model ID for the AgentQnA LLM. Configured within the compose.yaml environment.           | gpt-4o-mini-2024-07-18                         |
| TOOLSET_PATH                        | Local path to the tool YAML file. Configured in compose.yaml.                             | $WORKDIR/GenAIExamples/AgentQnA/tools/         |
| CRAG_SERVER                         | CRAG server URL. Derived from ip_address and port 8080.                                  | http://${ip_address}:8080                      |
| WORKER_AGENT_URL                    | Worker agent URL. Derived from ip_address and port 9095.                                 | http://${ip_address}:9095/v1/chat/completions  |
| SQL_AGENT_URL                       | SQL agent URL. Derived from ip_address and port 9096.                                    | http://${ip_address}:9096/v1/chat/completions  |
| http_proxy / https_proxy / no_proxy | Network proxy settings (if required).                                                    | ""                                             |

AgentQnA Docker Compose Files

In the context of deploying an AgentQnA pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks. The table below outlines the configurations available as part of the application. These configurations can be used as templates and extended to other components available in GenAIComps.

| File                | Description                                                                                                                                  |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| compose_openai.yaml | Default compose file using an OpenAI-compatible API as the serving framework                                                                  |
| compose_remote.yaml | Connects to a remote LLM service (such as a self-hosted or third-party API). All other configurations remain the same as the default.        |

Validate Services

  1. First look at logs for each of the agent docker containers:

# worker RAG agent
docker logs rag-agent-endpoint

# worker SQL agent
docker logs sql-agent-endpoint

# supervisor agent
docker logs react-agent-endpoint

Look for the message “HTTP server setup successful” to confirm that each agent container has started successfully. A quick way to check all three at once is sketched after this list.

  2. Use Python to validate that each agent is working properly:

# RAG worker agent
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "Tell me about Michael Jackson song Thriller" --agent_role "worker" --ext_port 9095

# SQL agent
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "How many employees in company" --agent_role "worker" --ext_port 9096

# supervisor agent: this will test a two-turn conversation
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port 9090
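
For a quicker check than reading each log by hand, the sketch below loops over the three agent containers named above and greps for the readiness message, then sends one request directly to the supervisor agent's OpenAI-compatible endpoint (the same URL the UI connects to in the next section). The model value is a placeholder since the agent serves a single pipeline.

# Check all three agent containers for the readiness message
for c in rag-agent-endpoint sql-agent-endpoint react-agent-endpoint; do
  docker logs "$c" 2>&1 | grep -q "HTTP server setup successful" \
    && echo "$c: ready" || echo "$c: not ready yet"
done

# Send a single request to the supervisor agent's OpenAI-compatible endpoint
curl http://${ip_address}:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "opea-agent", "messages": [{"role": "user", "content": "Tell me about Michael Jackson song Thriller"}]}'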

Interact with the agent system with UI

The UI microservice is launched in the previous step along with the other microservices. To access the UI, open a web browser to http://${ip_address}:5173. Note that ip_address here is the host IP of the UI microservice.

  1. Click on the arrow above Get started. Create an admin account with a name, email, and password.

  2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user’s initial, go to Admin Settings->Connections. Under Manage OpenAI API Connections, click on the + to add a connection. Fill in these fields:

  • URL: http://${ip_address}:9090/v1 (do not forget the /v1)

  • Key: any value

  • Model IDs: any name, e.g. opea-agent, then press + to add it

Click “Save”.

(Screenshot: opea-agent-setting)

  3. Test the OPEA agent with the UI. Return to New Chat and ensure the model (e.g. opea-agent) is selected near the upper left. Enter any prompt to interact with the agent.

(Screenshot: opea-agent-test)

Register other tools with the AI agent

The tools folder contains YAML and Python files for additional tools for the supervisor and worker agents. Refer to the “Provide your own tools” section of the agent instructions to add tools and customize the AI agents.

Conclusion

This guide provides a comprehensive workflow for deploying, configuring, and validating the AgentQnA system on Intel® Xeon® processors, enabling flexible integration with both OpenAI-compatible and remote LLM services.