Introduction

I am going to install Ubuntu Server on a PC with a GPU to build a headless server that runs all my AI workloads locally.

I will run local LLMs using Ollama and image generation using Stable Diffusion, with Open WebUI providing the GUI.

I will be using my old laptop with an NVIDIA GPU, dual-booting Ubuntu Server alongside the pre-installed Windows 10.

Note: All the AI tools mentioned can also be installed and run on any computer (with any OS); a dedicated Linux GPU server is not compulsory.

Prerequisites:
- A computer with at least 8 GB of RAM & a GPU (preferably NVIDIA)
- An empty USB drive (at least 8 GB)
- An extra drive on the computer to install Ubuntu Server on (if dual-booting with a previously installed OS)
Preparing the Server (setting up dependencies)
Installation of Ubuntu Server
- Create a bootable USB drive flashed with the Ubuntu Server image: The ISO file can be downloaded from Ubuntu's website. Software such as Rufus or balenaEtcher can be used to flash the image onto the USB drive.
- Boot from the USB drive: Turn off the computer and boot into the BIOS by repeatedly pressing the BIOS key for your PC. From the boot menu, select the USB drive.
- Install Ubuntu Server: Press Enter to select Try or Install Ubuntu Server. It is a fairly standard OS installation: press Enter to confirm, arrow keys to move between options, and Space to select.
- Choose your language
- Select your keyboard layout
- Choose the installation type as Ubuntu Server, and also select Search for third-party drivers
- Set up your network configuration. It is recommended to use a wired LAN connection and to manually bind a static IP to your PC instead of using DHCP
- Set a proxy if required
- Let Ubuntu test the default mirror and proceed
- Set up your storage configuration. It is recommended to allot a full disk for the installation
- Re-confirm your installation drive, so as not to wipe anything important, then continue
- Fill in your information, including your name, a name for the server, a username, and a password
- If Ubuntu found any GPU drivers, it will prompt for installation confirmation
- Skip Ubuntu Pro for now
- Select Install OpenSSH server so you can access the server remotely from another PC over SSH. Also import SSH keys from GitHub so you don't have to authenticate every time you SSH into the server, and select Allow Password Authentication so that devices without authorised keys can log in with credentials
- Select any snaps (packages) you want installed automatically
- Let Ubuntu Server install on your PC, then reboot, removing the installation media (the USB drive) while rebooting.
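The static IP recommended above can be set interactively in the installer's network step, but it can also be configured after installation with netplan. A minimal sketch, assuming a file such as /etc/netplan/01-static.yaml; the interface name (eth0) and all addresses below are assumptions, so check yours with ip a, and apply with sudo netplan apply:

```
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses: [192.168.1.50/24]
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [192.168.1.1]
```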
Post-Install Setup
Now whenever the PC turns on, you will be greeted by the GRUB bootloader menu, where you can select which OS to boot into: Ubuntu or any previously installed OS. Select Ubuntu and, after it boots up, log in using the username and password set during installation.
- Updating repositories and packages:

```shell
sudo apt update && sudo apt upgrade -y
```
- Extending the main partition to use all the free storage available: by default, the installer allocates only part of the LVM volume group, leaving roughly half of the drive unused.

```shell
sudo lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv && sudo resize2fs /dev/ubuntu-vg/ubuntu-lv
```
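After extending the volume, it's worth confirming that the root filesystem actually sees the new space:

```shell
# The "Size" column for / should now reflect (nearly) the whole drive
df -h /
```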
- Prevent the laptop from sleeping when the lid is closed & disable the screensaver (optional, for laptops only):

```shell
sudo nano /etc/systemd/logind.conf
```

- Uncomment and set the HandleLidSwitch settings to ignore, then save & exit:

```
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore
```

- Restart logind to apply the change:

```shell
sudo systemctl restart systemd-logind.service
```
- Configure console blanking (the "screensaver") via the kernel command line:

```shell
sudo nano /etc/default/grub
```

- Set the time for the display to blank (in seconds), then save & exit:

```
GRUB_CMDLINE_LINUX="consoleblank=60"
```

- Run sudo update-grub and reboot for the change to take effect.
Setting up the GPU
- Installing GPU drivers (see the Ubuntu documentation):

```shell
sudo ubuntu-drivers install
```

- Reboot the computer:

```shell
sudo reboot
```

- Testing the (NVIDIA) drivers:

```shell
nvidia-smi
```
- Installing the NVIDIA Container Toolkit (NVIDIA GPUs only; see the official documentation):

```shell
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

```shell
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```
Docker installation & setup (for running containers)

```shell
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```

```shell
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
```shell
## Manage Docker without prefacing the docker command with sudo
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
```
```shell
## Configure Docker to start on boot
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
```
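At this point Docker itself can be smoke-tested with Docker's official hello-world image; after the newgrp docker step (or a re-login) no sudo should be needed:

```shell
# Pulls a tiny test image and prints a confirmation message if the
# daemon, networking, and permissions are all working
docker run hello-world
```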
Configuring (NVIDIA) GPU-acceleration support for Docker containers:

```shell
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
Also, reboot the PC : sudo reboot
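After the reboot, GPU passthrough can be sanity-checked by running nvidia-smi inside a container; it should print the same GPU table as on the host:

```shell
# Any image works since the driver is injected by the NVIDIA runtime;
# plain ubuntu keeps the download small
docker run --rm --gpus all ubuntu nvidia-smi
```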
Install Portainer (GUI for managing Docker containers)
Documentation

```shell
docker volume create portainer_data
```

```shell
docker run -d -p 8000:8000 -p 9443:9443 --name portainer --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer-ce:latest
```

To check the container status: docker ps
Access the Portainer GUI at: https://{server_ip}:9443
Ollama (for local LLMs)

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

```shell
sudo systemctl edit ollama.service
```

Add these lines to change the bind address & expose Ollama on the network:

```
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
```

Then restart Ollama:

```shell
sudo systemctl daemon-reload
sudo systemctl restart ollama
```
To run a model: ollama run {model_name} (any model from the Ollama library; it is downloaded automatically on first run).
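With OLLAMA_HOST set to 0.0.0.0, the API is also reachable from other machines on the LAN; a quick check from another PC ({server_ip} is a placeholder, as before):

```shell
# Lists the models available on the server as JSON
curl http://{server_ip}:11434/api/tags
```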
Automatic1111 (Stable Diffusion, for local image generation)
Documentation
- Prerequisites:

```shell
sudo apt install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev git
```
- Set up pyenv:

```shell
curl -fsSL https://pyenv.run | bash
```

```shell
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init - bash)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
```

- Restart the shell (or run source ~/.bashrc) so the pyenv command is picked up, then install Python 3.10.6 (the version Automatic1111 targets):

```shell
pyenv install 3.10.6
pyenv global 3.10.6
```
- Install the Stable Diffusion WebUI:

```shell
wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh
```

```shell
chmod +x webui.sh
./webui.sh
```
```shell
## Transfer a downloaded model to the Ubuntu server
scp /path/to/downloaded/model/{model_name}.safetensors {username}@{server.ip}:~/stable-diffusion-webui/models/Stable-diffusion/
```

Note: You may also delete the default SD 1.5 model.
- Additional configuration:

```shell
cd stable-diffusion-webui
nano webui-user.sh
```

Uncomment the COMMANDLINE_ARGS line and set the following arguments:

```shell
export COMMANDLINE_ARGS="--xformers --medvram --opt-split-attention --listen --api"
```

Note: --xformers is only for NVIDIA GPUs.
Note: Use the webui.sh script inside the stable-diffusion-webui directory to run Stable Diffusion with the specified arguments.
The Stable Diffusion WebUI should be accessible at: {server_ip}:7860
Open WebUI (for an interactive GUI)
Installation:

```shell
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

The UI should be accessible at: {server_ip}:8080
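For anyone preferring Docker Compose over a long docker run command, the same container can be described declaratively. A sketch of an equivalent docker-compose.yml, started with docker compose up -d:

```
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    network_mode: host
    environment:
      - OLLAMA_BASE_URL=http://127.0.0.1:11434
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:
```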
Additional configuration: You can set a default model and enable Memory under Settings -> Personalization. Some other settings worth enabling are under Admin Panel -> Settings:
- Under Documents, enable PDF Extract Images (OCR)
- Under Web Search, enable Web Search and set the engine to DuckDuckGo
Note: Open WebUI also has built-in functionality to use documents for context as well. It is also possible to add system prompts to customize the model's responses.
To integrate image generation (with prompt generation):
- Go to Admin Panel -> Settings -> Images
- Set the Image Generation Engine to Automatic1111, and set the base URL to http://127.0.0.1:7860
- Select the default model and enable Image Generation
- Now just press the + icon in chat and select Image to generate images.
Note: If you have low VRAM (<= 4 GB), image generation might cause out-of-memory errors. To help with that, Ollama can be set to unload the model from memory after every generated response.

```shell
sudo systemctl edit ollama.service
```

Add this environment variable under [Service]:

```
Environment="OLLAMA_KEEP_ALIVE=0"
```

Note: This can result in slower response times, as Ollama needs to load the model into memory every time.
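To verify the setting took effect, list the loaded models right after a chat response completes; with OLLAMA_KEEP_ALIVE=0 the list should be empty almost immediately:

```shell
# Shows which models are currently loaded in memory, and for how long
ollama ps
```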