This folder provides the code artifact for our work on applying LLM-based agents to microservice self-management. More details can be found on the paper website.
Following the instructions below, you can reproduce the experimental results of the paper on the microservice demo project Sock Shop. You can also use this code to explore self-management capabilities on your own microservices.
Prerequisites: a Linux environment, Docker, and RabbitMQ
```bash
# [optional] create and activate a conda environment
# conda create -n acv_llm python=3.11
# conda activate acv_llm

# clone the repo (or just download the code in this folder)
git clone https://github.com/microsoft/ACV.git
cd self_managing_systems/microservice/paper_artifact_arXiv_2407_14402

# check that the prerequisites are installed
bash scripts/check_prerequisites.sh

# install the Python requirements
pip install -r requirements.txt
pip install azure-identity-broker --upgrade

# install the required software (Ubuntu only for now; on other systems, see scripts/setup.sh for details)
bash scripts/setup.sh
```
Before running the actual experiments, you need to configure your LLMs. These configurations are used by the `ClusterManager` and `Maintainer` agents. Create your own config file `conf/secret.yaml` by copying `conf/secret_template.yaml` and editing the endpoint configuration as follows:
**OpenAI**
```yaml
backend: "OpenAI" # The backend type, "OpenAI" for the OpenAI API.
OpenAI:
  model: "gpt-4" # Only OpenAI models are supported.
  api_key: "sk-" # Your OpenAI API key.
```
**AzureOpenAI**
```yaml
backend: "AzureOpenAI" # The backend type, "AzureOpenAI" for the Azure OpenAI API.
AzureOpenAI:
  model: "my-gpt-4-deployment" # The deployment name in Azure OpenAI.
  api_type: "azure" # Use "azure" for Azure OpenAI.
  api_key: <API_KEY> # Your Azure OpenAI API key.
  base_url: "https://ENDPOINT.openai.azure.com/" # The endpoint of your Azure OpenAI API.
  api_version: "2024-02-01" # Defaults to "2024-02-01".
```
**Other**
```yaml
backend: "Other" # The backend type, "Other" for a local API.
Other:
  model: "llama-7B" # The model name in your local API.
  api_key: "" # Your local API key (optional).
  base_url: "http://localhost:1234" # The endpoint of your local API.
```
Execute the following command for a quick trial:
```bash
python -m src.empirical_low_level_L1_2 \
    --instance metric_collection_1 \
    --suffix stable \
    --cache_seed 42
```
While the code is running, you can watch the chat history with the agent in `results/metric_collection_1-stable.md`, and all logs are located in `logs/`.
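For example, you can follow the chat history from a second terminal; the output file name appears to combine the `--instance` and `--suffix` values from the command above:

```bash
# follow the chat history as it is written (file name = <instance>-<suffix>.md)
tail -f results/metric_collection_1-stable.md

# list the log files
ls logs/
```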
Experiment on L1 and L2 tasks applied to the low-level autonomic agent:
```bash
python -m src.empirical_low_level_L1_2 \
    --instance restart \
    --suffix stable \
    --cache_seed 42
```
Experiment on L3, L4 and L5 tasks applied to the low-level autonomic agent:
```bash
python -m src.empirical_low_level_L3_4_5 \
    --instance pod_failure \
    --suffix stable \
    --cache_seed 42
```
Experiment on L1 and L2 tasks applied to the high-level group manager:
```bash
python -m src.empirical_high_level_L1_2 \
    --task 'Reduce the total P99 latency of "catalogue" and "front-end" to under 400 ms.' \
    --components catalogue,front-end \
    --timeout 900 \
    --cache_seed 42
```
Experiment on L3, L4 and L5 tasks applied to the high-level group manager:
```bash
python -m src.empirical_high_level_L3_4_5 \
    --instance pod_failure \
    --components catalogue,front-end \
    --timeout 900 \
    --cache_seed 42
```
Note: each of the above commands runs an experiment with one of the test cases in `data/dataset` and stores the chat history in `results/`. The other test cases can be found in the `data/dataset` folder.
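As a sketch of batch-running several test cases, assuming the `--instance` values correspond to entries under `data/dataset` (check the folder for the actual names; `network_delay` below is a made-up placeholder):

```bash
# list the available test cases
ls data/dataset

# batch-run L3-L5 test cases with a fixed cache seed
# "network_delay" is a hypothetical name -- substitute instances found above
for instance in pod_failure network_delay; do
    python -m src.empirical_low_level_L3_4_5 \
        --instance "$instance" \
        --suffix stable \
        --cache_seed 42
done
```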
You can also explore your own tasks in our environment. Here are some examples to try.
Before you start, set up the environment with the following command:
```bash
bash scripts/create_project.sh
```
Execute the following command to try working mechanism 1:
```bash
python -m src.working_mechanism_1 \
    --task 'Report CPU usage of your component.' \
    --component catalogue \
    --cache_seed 42
```
Execute the following command to try working mechanism 2:
```bash
python -m src.working_mechanism_2 \
    --task 'Reduce the total P99 latency of catalogue and front-end to under 400 ms.' \
    --components catalogue,front-end \
    --timeout 900 \
    --cache_seed 42
```
You can design your own task by changing the `--task` and `--components` parameters. Explore and enjoy the trip!
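For example, a custom task could look like the following; the task description and component choice are illustrative only (the flags are the ones documented above):

```bash
# hypothetical custom task -- the task text and component are examples only
python -m src.working_mechanism_2 \
    --task 'Keep the CPU usage of "carts" below 50%.' \
    --components carts \
    --timeout 900 \
    --cache_seed 42
```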
After you finish your task, you can tear down the environment with the following command:
```bash
bash scripts/remove_project.sh
```
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
If you use this code in your research, please cite our paper:
```bibtex
@misc{zhang2024visionautonomiccomputingllms,
      title={The Vision of Autonomic Computing: Can LLMs Make It a Reality?},
      author={Zhiyang Zhang and Fangkai Yang and Xiaoting Qin and Jue Zhang and Qingwei Lin and Gong Cheng and Dongmei Zhang and Saravan Rajmohan and Qi Zhang},
      year={2024},
      eprint={2407.14402},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2407.14402},
}
```
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.