Structuring projects and configuration management for LLM-powered Apps.
You can say that this is part 1.5 of my mini-series Building end-to-end LLM-powered applications without OpenAI's API.
What is covered so far in the series
- Integrating custom LLM using langchain (A GPT4ALL example)
- Configuration management for LLM-powered applications
- Connecting LLM to the knowledge base (coming soon)
- Serving LLM using Fast API (coming soon)
- Fine-tuning an LLM using transformers and integrating it into the existing pipeline for domain-specific use cases (coming soon)
- Putting it all together into a simple cloud-native LLM-powered application (coming soon)
Let's get started
But why? Why should I care about having configuration files? Because before building end-to-end applications, having properly organized, structured configurations is very important for managing a project. There are several reasons why you should care about configuration management. Some of them, with respect to our project, are as follows:
- Managing model paths. You might need to change a model path or an S3 link, and that path may be referenced in several project files.
- Hyperparameter management. Suppose you have a fine-tuning pipeline or an inference pipeline. Hyperparameters like learning rate, optimizer choice, etc. become hard to manage if they are hard-coded inside project files. In our case, this includes settings like whether to stream, temperature, top_p, top_k, etc.
- Configurations act like settings. Just as almost every application (even your Android/iOS apps) has in-app settings, configuration files act as settings that can be tweaked from outside the code yet still control how your code behaves. Config files are also very useful when writing infrastructure code to serve applications.
A lack of configuration management can cause serious problems in production, when scaling systems or tweaking/updating parameters.
The same goes for project structure. A modular project structure is essential: modern cloud-native applications require projects to be structured such that most modules are decoupled (independent) of each other and can be maintained or scaled independently. Now that you have a sense of why config files matter, let's jump right into building our project structure and writing our configuration files. For the project structure we will use Cookiecutter to generate our project boilerplate, and for configuration management we will use Hydra.
Structuring our Project
Cookiecutter is a command-line utility that creates a project structure from pre-existing project templates. You can install cookiecutter with
pip install cookiecutter
Then we will use a popular template repository called cookiecutter data science. We can set it up with this command
cookiecutter -c v1 https://github.com/drivendata/cookiecutter-data-science
Doing this will take you to a simple CLI prompt where you type things like the app name, description, etc., and it will create a base template (if you would rather script this step, see the sketch after the tree below). But as this is not a pure data science project, I deleted some files and folders, and the structure I use now looks like this.
.
├── configs
│   ├── config.yaml
│   ├── knowledge_base
│   │   └── default.yaml
│   └── model
│       └── default.yaml
├── data
│   ├── diff_lm.pdf
│   └── llama2.pdf
├── LICENSE
├── main.py
├── Makefile
├── README.md
├── requirements.txt
├── setup.py
├── src
│   ├── data
│   │   └── __init__.py
│   ├── __init__.py
│   └── models
│       └── __init__.py
├── test_environment.py
├── tests
└── tox.ini
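As mentioned above, if you prefer to script the scaffolding rather than answer the prompts interactively, cookiecutter also exposes a Python API. Here is a minimal sketch under assumptions: the `extra_context` keys and values shown (`project_name`, `repo_name`, `author_name`) are illustrative and should match whatever the template's `cookiecutter.json` actually asks for.

# scaffold.py -- a minimal sketch using cookiecutter's Python API
# (the extra_context keys/values are illustrative assumptions, not from the original post)
from cookiecutter.main import cookiecutter

cookiecutter(
    "https://github.com/drivendata/cookiecutter-data-science",
    checkout="v1",       # same as the -c v1 flag on the CLI
    no_input=True,       # skip the interactive prompts
    extra_context={
        "project_name": "llm-chat-app",
        "repo_name": "llm_chat_app",
        "author_name": "anindya",
    },
)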
Here is some information you need to know about what each folder is for.
- `configs`: This is mainly to store different configurations for each of the components in our application, for example: models, knowledge base, vector database, etc.
- `data`: This is used to save all the documents for our knowledge base. In practice, however, documents should be stored in a service like Amazon S3.
- `src`: This contains all our source files. It has subfolders like `data`, which will hold all the files to create and manage our knowledge base, and `models`, which will hold the files to create and manage our LLMs.
- `tests`: Here we will keep all our testing files: unit tests, integration tests, etc.
- `main.py`: For now, this is our main Python file, which uses the source files as helpers and runs the main code. We will run this file for our chatbot.
- `Makefile`: Useful to quickly set up the project: installing dependencies, linting, formatting, quick testing, etc.
The code from our previous blog will go inside the `src/models` folder. Today our focus will be on writing configuration files under the `configs` folder. I assume readers are already familiar with my previous blog on how we integrated a custom open-source LLM (using `gpt4all` and `langchain`), because today we are going to write configuration files for that model.
The plot
In short, let's imagine this plot: suppose your app allows users to chat with different kinds of data and with different variants of open-source LLMs. Those open-source LLMs are accessed through different providers like `LLAMA.cpp`, `GPT4ALL`, `HuggingFace`, etc. Each of the models, written with different libraries, might have different optimal configurations. How are you going to manage those? The answer is writing configuration files. Here we will write the configuration for our `gpt4all` LLM.
Hydra 101
We start by installing Hydra. We can do this with the following command
pip install hydra-core==1.1.0
Hydra operates on top of `OmegaConf`, which is a YAML-based hierarchical configuration system with support for merging configurations from multiple sources (files, CLI arguments, environment variables), providing a consistent API regardless of how the configuration was created.
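To get a feel for what OmegaConf does on its own, here is a minimal sketch, independent of Hydra; the values are just examples taken from the configs we use below.

# omegaconf_demo.py -- a minimal OmegaConf sketch, independent of Hydra
from omegaconf import OmegaConf

# build a hierarchical config from a plain dict (it could also come from a YAML file)
base = OmegaConf.create({
    "gpt4all": {
        "gpt4all_model_name": "ggml-gpt4all-j-v1.3-groovy.bin",
        "gpt4all_backend": "llama",
    }
})

# merge overrides from another source (e.g. CLI arguments or a second file)
override = OmegaConf.create({"gpt4all": {"gpt4all_backend": "gptj"}})
cfg = OmegaConf.merge(base, override)

print(cfg.gpt4all.gpt4all_backend)   # -> gptj (attribute-style access)
print(OmegaConf.to_yaml(cfg))        # dump the merged config back to YAML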
Writing your first configuration files
Let's get started by making a simple YAML file named `config.yaml`. The contents of the file look like this:
gpt4all:
  gpt4all_model_name: ggml-gpt4all-j-v1.3-groovy.bin
  gpt4all_model_folder_path: /home/anindya/.local/share/nomic.ai/GPT4All/
  gpt4all_backend: llama
The above is a configuration file I wrote using YAML. I could also have written the same thing like this:
gpt4all_model_name: ggml-gpt4all-j-v1.3-groovy.bin
gpt4all_model_folder_path: /home/anindya/.local/share/nomic.ai/GPT4All/
gpt4all_backend: llama
This one is also valid; it all depends. Let's call the first YAML format 1 and the second format 2. Format 1 lets me add configurations for different model providers all in one place. For example, if I now want to add Hugging Face configurations, I can do something like this:
gpt4all:
  gpt4all_model_name: ggml-gpt4all-j-v1.3-groovy.bin
  gpt4all_model_folder_path: /home/anindya/.local/share/nomic.ai/GPT4All/
  gpt4all_backend: llama

hugging_face:
  hugging_face_model_id: some model name
  hugging_face_adapter_name: some adapter id
  hugging_face_dataset: some dataset source id
Whereas in format 2, I might do the same but in separate files. In that case, one file would be named `gpt4all_config.yaml` and another `huggingface_config.yaml`. It all depends on how complex our project is, and based on that we have to choose our format.
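For completeness, here is a hedged sketch of how format 2 could be handled even without Hydra: each provider gets its own file and we merge them manually with OmegaConf. The file names are the hypothetical ones mentioned above, so this assumes both files exist next to the script.

# merge_configs.py -- a sketch of format 2: one YAML file per provider, merged manually
# (assumes the two hypothetical provider files mentioned above exist in the working directory)
from omegaconf import OmegaConf

gpt4all_cfg = OmegaConf.load("gpt4all_config.yaml")
hf_cfg = OmegaConf.load("huggingface_config.yaml")

# merge both providers into a single config object
cfg = OmegaConf.merge(gpt4all_cfg, hf_cfg)
print(OmegaConf.to_yaml(cfg))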
Now let's load the format 1 file using Hydra and print the configurations.
# file name: main.py
import hydra

@hydra.main(config_path='.', config_name='config')
def main(cfg):
    print('GPT4ALL Configurations')
    print(' Model name: ', cfg.gpt4all.gpt4all_model_name)
    print(' Model stored in path: ', cfg.gpt4all.gpt4all_model_folder_path)
    print(' Model using LLM backend of: ', cfg.gpt4all.gpt4all_backend)

    print('\nHuggingFace Configurations')
    print(' Hugging face model name: ', cfg.hugging_face.hugging_face_model_id)
    print(' Hugging face adapter name: ', cfg.hugging_face.hugging_face_adapter_name)
    print(' Hugging face dataset name: ', cfg.hugging_face.hugging_face_dataset)

if __name__ == '__main__':
    main()
Intuitively, all Hydra essentially does here is take whatever configuration we provide through a bunch of YAML files and expose it through a nice object-oriented interface, so the values are easy to access. Running this prints the following:
GPT4ALL Configurations
Model name: ggml-gpt4all-j-v1.3-groovy.bin
Model stored in path: /home/anindya/.local/share/nomic.ai/GPT4All/
Model using LLM backend of: llama
HuggingFace Configurations
Hugging face model name: some model name
Hugging face adapter name: some adapter id
Hugging face dataset name: some dataset source id
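Under the hood, the `cfg` object Hydra passes to `main` is an OmegaConf `DictConfig`, so besides the dotted attribute access shown above you can also turn it into a plain Python dict when some library insists on one. A minimal sketch, reusing the same dummy config as above:

# a minimal sketch: the cfg Hydra passes in is an OmegaConf DictConfig
import hydra
from omegaconf import OmegaConf

@hydra.main(config_path='.', config_name='config')
def main(cfg):
    # convert to a plain (nested) Python dict, resolving any interpolations
    plain = OmegaConf.to_container(cfg, resolve=True)
    print(type(plain))                              # <class 'dict'>
    print(plain['gpt4all']['gpt4all_model_name'])   # ordinary dict access

if __name__ == '__main__':
    main()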
There are several other things Hydra is famous for. One of them is overriding configurations: we can change configuration values at runtime and run our app accordingly. Here is an example of running the app with overrides:
python3 main.py \
gpt4all.gpt4all_model_name="my latest model name" \
gpt4all.gpt4all_model_folder_path="my latest folder" \
gpt4all.gpt4all_backend="GPTJ"
And this will print
GPT4ALL Configurations
Model name: my latest model name
Model stored in path: my latest folder
Model using LLM backend of: GPTJ
HuggingFace Configurations
Hugging face model name: some model name
Hugging face adapter name: some adapter id
Hugging face dataset name: some dataset source id
I guess this is enough to get you started with our existing project. However, if you are interested in learning more, check out Hydra's excellent documentation. Also, a huge part of this 101 was inspired by the awesome blog by Raviraja Ghanta, who used Hydra to show how to manage configurations while building end-to-end ML applications.
Applying Hydra to our project to manage our model
Previously I showed you how to create a single YAML file and use it as our configuration. But in real scenarios we will have much more complex things to handle. Just take our app as an example: right now we are only dealing with the model, but in the future we will have to do the same for our vector database (knowledge base), API, serving, etc. Hence it is better to create configs for the different components. Here is our `configs` folder structure.
configs
├── config.yaml
├── knowledge_base
│   └── default.yaml
└── model
    └── default.yaml
Think of `config.yaml` inside `configs` as the config orchestrator: it knows where the configurations of each component are located, and through `config.yaml` we can load and control the other components. Inside `model` and `knowledge_base` there is a `default.yaml` which holds the default configurations for our model and our knowledge base (we will talk about the knowledge base in later parts of this blog). Now let's take a look at our `config.yaml` file.
defaults:
- model: default
- knowledge_base: default
If we just use our intuition, this is not hard to understand. It says that our defaults are as follows: the default model configuration comes from `default.yaml` inside `configs/model`, and the same goes for our knowledge base. Similarly, `configs/model` might contain other config files like `model_dev.yaml` (all model configurations for the development phase) or `model_production.yaml` (untouched configurations for production, which we can also make read-only using Hydra if we want), and then we can reference those from `config.yaml` instead. But for now, let's keep this setting.
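To sanity-check that the composition is working (and, as just mentioned, to lock a config down as read-only if we want), a small hedged sketch like this can help; it only prints the composed config and freezes it, nothing more.

# inspect_config.py -- a sketch to print the composed config and optionally freeze it
import hydra
from omegaconf import OmegaConf

@hydra.main(config_path='./configs', config_name='config')
def main(cfg):
    # dump the fully composed config (model + knowledge_base defaults) as YAML
    print(OmegaConf.to_yaml(cfg))

    # optionally make it read-only so nothing downstream mutates it by accident
    OmegaConf.set_readonly(cfg, True)

if __name__ == '__main__':
    main()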
Once that is done, let's take a peek at what our `configs/model/default.yaml` looks like.
gpt4all_model:
  gpt4all_model_name: ggml-gpt4all-j-v1.3-groovy.bin
  gpt4all_model_folder_path: /home/anindya/.local/share/nomic.ai/GPT4All/
  gpt4all_backend: llama
  gpt4all_allow_streaming: true
  gpt4all_allow_downloading: false
  gpt4all_temperature: 1
  gpt4all_top_p: 0.1
  gpt4all_top_k: 40
  gpt4all_n_batch: 8
  gpt4all_n_threads: 4
  gpt4all_n_predict: 256
  gpt4all_max_tokens: 200
  gpt4all_repeat_last_n: 64
  gpt4all_penalty: 1.18
We have now put all the tweakable parameters inside the config, and we have referenced it from `config.yaml`. Next, let's use it inside our `main.py` file to test whether our configs are working or not. While running `main.py` we will also tweak some parameters, as we did previously in the dummy example, to check that overriding works here too. Here is our `main.py` file.
import hydra
from src.models.gpt4all_model import MyGPT4ALL

# reference the ./configs folder to tell hydra where the configs are located
# also tell hydra that our master config (which manages all other configs)
# is named config
@hydra.main(config_path='./configs', config_name='config')
def main(cfg):
    # instantiate the model and populate the arguments using hydra
    chat_model = MyGPT4ALL(
        model_folder_path=cfg.model.gpt4all_model.gpt4all_model_folder_path,
        model_name=cfg.model.gpt4all_model.gpt4all_model_name,
        allow_download=cfg.model.gpt4all_model.gpt4all_allow_downloading,
        allow_streaming=cfg.model.gpt4all_model.gpt4all_allow_streaming,
    )
    while True:
        query = input('Enter your Query: ')
        if query == 'exit':
            break
        # use hydra to fill the **kwargs
        response = chat_model(
            query,
            n_predict=cfg.model.gpt4all_model.gpt4all_n_predict,
            temp=cfg.model.gpt4all_model.gpt4all_temperature,
            top_p=cfg.model.gpt4all_model.gpt4all_top_p,
            top_k=cfg.model.gpt4all_model.gpt4all_top_k,
            n_batch=cfg.model.gpt4all_model.gpt4all_n_batch,
            repeat_last_n=cfg.model.gpt4all_model.gpt4all_repeat_last_n,
            repeat_penalty=cfg.model.gpt4all_model.gpt4all_penalty,
            max_tokens=cfg.model.gpt4all_model.gpt4all_max_tokens,
        )
        print()

if __name__ == '__main__':
    main()
This file is simple enough to understand, I believe. All it does is instantiate the `chat_model` using our Hydra configs and fill the `**kwargs` from those same configs while chatting with the model. If we just run `python3 main.py`, it runs with the defaults as expected. But suppose we want to tweak some parameters at runtime. Take a look at this, for example:
PYTHONPATH=. python3 main.py \
model.gpt4all_model.gpt4all_model_name=ggml-mpt-7b-instruct.bin \
model.gpt4all_model.gpt4all_temperature=1 \
model.gpt4all_model.gpt4all_top_k=50 \
model.gpt4all_model.gpt4all_max_tokens=10 \
model.gpt4all_model.gpt4all_penalty=1.00
I changed the model name, temperature, top_k, max_tokens, and penalty, so these parameters are overridden before the code runs. And the best part: this gives us a nice CLI interface without us having to build one. Isn't this awesome?
Conclusion
Congratulations! You just completed a very important part of the production-grade ML life cycle, i.e. configuration management, and we can apply this knowledge when building our LLM-powered applications. It might not look very cool right now, but I assure you, it makes life easier when applications become too complex to handle. In the next part of this blog, we will learn how to connect documents and build a knowledge base out of them, to make LLMs more robust and able to answer questions over unseen, private docs. All the code used here and previously lives in this GitHub repo; feel free to check it out. Until next time.
References and Acknowledgements
- Config management using Hydra by Raviraja Ghanta https://deep-learning-blogs.vercel.app/blog/mlops-hydra-config
- Mastering Config management using Hydra https://towardsdatascience.com/mastering-configuration-ml-with-hydra-ef138f1c1852