Introduction

I am going to install Ubuntu Server on a PC with a GPU to build a headless server that runs all my AI workloads locally.

I will run local LLMs using Ollama and image generation using Stable Diffusion, with Open WebUI providing the GUI.

I will be using my old laptop with an NVIDIA GPU, dual-booting Ubuntu Server alongside the pre-installed Windows 10.

Note: All the AI tools mentioned can also be installed and run on any computer (with any OS); a dedicated Linux GPU server is not compulsory.

Prerequisites:
- A computer with at least 8 GB of RAM & a GPU (preferably NVIDIA)
- An empty USB drive (at least 8 GB)
- An extra drive on the computer to install Ubuntu Server on (if dual-booting with a previously installed OS)
Preparing the Server (setting up dependencies)
Installation of Ubuntu Server
- Create a bootable USB drive flashed with the Ubuntu Server image: The ISO file can be downloaded from Ubuntu's website. Software such as Rufus or balenaEtcher can be used to flash the image onto the USB drive.
- Boot from the USB drive: Turn off the computer and boot into the BIOS by repeatedly pressing the BIOS key for your PC. From the boot menu, select the USB drive.
- Install Ubuntu Server: Press Enter to select Try or Install Ubuntu Server. It is a fairly standard OS installation: press Enter to confirm, arrow keys to move between options, and Space to select.
- Choose your language
- Select your keyboard layout
- Choose the installation type as Ubuntu Server, and also select Search for third-party drivers
- Set up your network configuration. It is recommended to use a wired LAN connection and to manually bind a static IP to your PC instead of using DHCP
- Set a proxy if required
- Let Ubuntu test the default mirror and proceed
- Set up your storage configuration. It is recommended to allot a full disk for the installation
- Re-confirm your installation drive, so as not to wipe anything important, then continue
- Fill in your information, including your name, a name for the server, a username, and a password
- If Ubuntu found any GPU drivers, it will prompt for installation confirmation
- Skip Ubuntu Pro for now
- Select Install OpenSSH server so you can access the server remotely from another PC over SSH. Also import SSH keys from GitHub so you don't have to authenticate every time you SSH into the server, and select Allow Password Authentication so that devices without authorised keys can log in with credentials
- Select any snaps (packages) you want installed automatically
- Let Ubuntu Server install on your PC, then reboot, removing the installation media (the USB drive) while rebooting.
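The static IP recommended above can be set interactively in the installer's network step, but it can also be configured after installation with netplan. A minimal sketch, assuming a file such as /etc/netplan/01-static.yaml; the interface name (eth0) and all addresses below are assumptions, so check yours with ip a, and apply with sudo netplan apply:

```
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses: [192.168.1.50/24]
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [192.168.1.1]
```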
Post-Install Setup
Now whenever the PC turns on, you will be greeted by the GRUB bootloader menu, where you can select which OS to boot into: Ubuntu or any previously installed OS. Select Ubuntu and, after it boots up, log in using the username and password set during installation.
- Updating repositories and packages:

```shell
sudo apt update && sudo apt upgrade -y
```
- Extending the main partition to use all the free storage available: by default, the installer allocates only part of the LVM volume group, leaving roughly half of the drive unused.

```shell
sudo lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv && sudo resize2fs /dev/ubuntu-vg/ubuntu-lv
```
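After extending the volume, it's worth confirming that the root filesystem actually sees the new space:

```shell
# The "Size" column for / should now reflect (nearly) the whole drive
df -h /
```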
- Prevent the laptop from sleeping when the lid is closed & disable the screensaver (optional, for laptops only):

```shell
sudo nano /etc/systemd/logind.conf
```

- Uncomment and set the HandleLidSwitch settings to ignore, then save & exit:

```
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore
```

- Restart logind to apply the change:

```shell
sudo systemctl restart systemd-logind.service
```
- Configure console blanking (the "screensaver") via the kernel command line:

```shell
sudo nano /etc/default/grub
```

- Set the time for the display to blank (in seconds), then save & exit:

```
GRUB_CMDLINE_LINUX="consoleblank=60"
```

- Run sudo update-grub and reboot for the change to take effect.
Setting up the GPU
- Installing GPU drivers (see the Ubuntu documentation):

```shell
sudo ubuntu-drivers install
```

- Reboot the computer:

```shell
sudo reboot
```

- Testing the (NVIDIA) drivers:

```shell
nvidia-smi
```
- Installing the NVIDIA Container Toolkit (NVIDIA GPUs only; see the official documentation):

```shell
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

```shell
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```
Docker installation & setup (for running containers)

```shell
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```

```shell
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
```shell
## Manage Docker without prefacing the docker command with sudo
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
```
```shell
## Configure Docker to start on boot
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
```
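At this point Docker itself can be smoke-tested with Docker's official hello-world image; after the newgrp docker step (or a re-login) no sudo should be needed:

```shell
# Pulls a tiny test image and prints a confirmation message if the
# daemon, networking, and permissions are all working
docker run hello-world
```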
Configuring (NVIDIA) GPU-acceleration support for Docker containers:

```shell
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
Also, reboot the PC : sudo reboot
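After the reboot, GPU passthrough can be sanity-checked by running nvidia-smi inside a container; it should print the same GPU table as on the host:

```shell
# Any image works since the driver is injected by the NVIDIA runtime;
# plain ubuntu keeps the download small
docker run --rm --gpus all ubuntu nvidia-smi
```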
Install Portainer (GUI for managing Docker containers)
Documentation

```shell
docker volume create portainer_data
```

```shell
docker run -d -p 8000:8000 -p 9443:9443 --name portainer --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer-ce:latest
```

To check the container status: docker ps
Access the Portainer GUI at: https://{server_ip}:9443
Ollama (for local LLMs)

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

```shell
sudo systemctl edit ollama.service
```

Add these lines to change the bind address & expose Ollama on the network:

```
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
```

Then restart Ollama:

```shell
sudo systemctl daemon-reload
sudo systemctl restart ollama
```
To run a model: ollama run {model_name} (any model from the Ollama library; it is downloaded automatically on first run).
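With OLLAMA_HOST set to 0.0.0.0, the API is also reachable from other machines on the LAN; a quick check from another PC ({server_ip} is a placeholder, as before):

```shell
# Lists the models available on the server as JSON
curl http://{server_ip}:11434/api/tags
```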
Automatic1111 (Stable Diffusion, for local image generation)
Documentation
- Prerequisites:

```shell
sudo apt install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev git
```
- Set up pyenv:

```shell
curl -fsSL https://pyenv.run | bash
```

```shell
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init - bash)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
```

- Restart the shell (or run source ~/.bashrc) so the pyenv command is picked up, then install Python 3.10.6 (the version Automatic1111 targets):

```shell
pyenv install 3.10.6
pyenv global 3.10.6
```
- Install the Stable Diffusion WebUI:

```shell
wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh
```

```shell
chmod +x webui.sh
./webui.sh
```
```shell
## Transfer a downloaded model to the Ubuntu server
scp /path/to/downloaded/model/{model_name}.safetensors {username}@{server.ip}:~/stable-diffusion-webui/models/Stable-diffusion/
```

Note: You may also delete the default SD 1.5 model.
- Additional configuration:

```shell
cd stable-diffusion-webui
nano webui-user.sh
```

Uncomment the COMMANDLINE_ARGS line and set the following arguments:

```shell
export COMMANDLINE_ARGS="--xformers --medvram --opt-split-attention --listen --api"
```

Note: --xformers is only for NVIDIA GPUs.
Note: Use the webui.sh script inside the stable-diffusion-webui directory to run Stable Diffusion with the specified arguments.
The Stable Diffusion WebUI should be accessible at: {server_ip}:7860
Open WebUI (for an interactive GUI)
Installation:

```shell
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

The UI should be accessible at: {server_ip}:8080
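For anyone preferring Docker Compose over a long docker run command, the same container can be described declaratively. A sketch of an equivalent docker-compose.yml, started with docker compose up -d:

```
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    network_mode: host
    environment:
      - OLLAMA_BASE_URL=http://127.0.0.1:11434
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:
```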
Additional configuration: You can set a default model and enable Memory under Settings -> Personalization. Some other settings worth enabling are under Admin Panel -> Settings:
- Under Documents, enable PDF Extract Images (OCR)
- Under Web Search, enable Web Search and set the engine to DuckDuckGo
Note: Open WebUI also has built-in functionality to use documents for context as well. It is also possible to add system prompts to customize the model's responses.
To integrate image generation (with prompt generation):
- Go to Admin Panel -> Settings -> Images
- Set the Image Generation Engine to Automatic1111, and set the base URL to http://127.0.0.1:7860
- Select the default model and enable Image Generation
- Now just press the + icon in chat and select Image to generate images.
Note: If you have low VRAM (<= 4 GB), image generation might cause out-of-memory errors. To help with that, Ollama can be set to unload the model from memory after every generated response.

```shell
sudo systemctl edit ollama.service
```

Add this environment variable under [Service]:

```
Environment="OLLAMA_KEEP_ALIVE=0"
```

Note: This can result in slower response times, as Ollama needs to load the model into memory every time.
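To verify the setting took effect, list the loaded models right after a chat response completes; with OLLAMA_KEEP_ALIVE=0 the list should be empty almost immediately:

```shell
# Shows which models are currently loaded in memory, and for how long
ollama ps
```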