My AI Playground

Open-weights language models have evolved tremendously in recent months. Today it is entirely possible to run models on your own machine that are as capable as the leading commercial LLMs. The problem is that for everyday users, running these models locally is still far too complicated. It involves command lines, server configuration, manual weight downloads, GPU drivers, quantization, and many other unfriendly technical details.

My AI Playground was born out of exactly that frustration. It is a free open-source project I created to completely automate and simplify that experience, bundling everything needed to run AI locally, including Google’s latest and most powerful Gemma 4 family, and delivering a browser chat interface very similar to what people already know from ChatGPT, Gemini, or Claude.

The goal is to democratize access to these models: you download a Windows installer, run it, and by the end you have an application ready to chat with an AI running 100% locally, without cloud dependency, without sending data outside the machine, and without usage fees.

Features

Automated installer (Inno Setup) that downloads and configures all dependencies and Gemma 4 models
Modern responsive web interface inspired by leading AI platforms
Multimodal input: text, images, audio, and files such as PDF, Word, Excel, and PowerPoint in the same conversation
Local audio transcription with faster-whisper and speech synthesis
Integrated web search with source citation to mitigate model knowledge cutoffs
Real-time streaming responses with auto-continue and message editing
Full conversation history, custom instructions, and message search
GPU or CPU acceleration with support for multiple architectures
Multilingual interface support including Portuguese, English, Spanish, and French
Cross-platform support for Windows, macOS, and Linux
100% local by default: no data leaves your machine during standard chat usage

Technologies involved

From a technical perspective, the project let me exercise a very complete and current stack: Python with FastAPI and SQLAlchemy (SQLite) on the backend; React 19 with TypeScript and Vite on the frontend; llama.cpp as the inference server for GGUF models; faster-whisper for audio transcription; and Inno Setup plus PowerShell and shell scripts to orchestrate the installer and the application lifecycle, including the system tray, splash screen, and orphan process detection.

View on GitHub Download