SmolVLM Flutter App

Offline Real-time AI Camera Assistant (Flutter + LLaMA.cpp)

SmolVLM App Screenshot

Project Overview

SmolVLM Flutter App is a real-time, offline AI camera assistant built using Flutter and LLaMA.cpp. It captures live camera frames and sends them to a locally hosted LLaMA multimodal server running SmolVLM-500M, which responds with intelligent, natural language descriptions of the scene.

This project demonstrates how AI and computer vision can run directly on-device, making it useful for accessibility, education, smart agriculture, and robotics, all without relying on cloud services.

Key Features:

  • 📷 Captures images using the front or back camera with seamless switching
  • 🔄 Sends frames to the AI server every few seconds
  • 🧠 Generates real-time smart feedback using the SmolVLM-500M model
  • 🛠️ Fully offline setup: no internet required
  • 🧼 Handles UI layout and preview stretching for a clean UX

Technologies Used:

Flutter, Dart, Camera plugin, SmolVLM-500M-Instruct-f16.gguf (model), llama-server from LLaMA.cpp, Base64 communication

How It Works:

  1. Flutter app captures camera frames every few seconds.
  2. Encodes the frame as a Base64 image.
  3. Sends the image to a locally running LLaMA server.
  4. Server uses SmolVLM to generate a description.
  5. App displays the AI-generated feedback in real time.
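Steps 2–4 above can be sketched as a small Python client. This is a minimal illustration, not the app's Dart code: it assumes llama-server is running locally with its OpenAI-compatible `/v1/chat/completions` endpoint on the default port 8080, and the prompt text and `max_tokens` value are placeholders.

```python
import base64
import json
import urllib.request

# Assumed local llama-server endpoint (OpenAI-compatible chat API).
# Example launch (flags may vary by llama.cpp version):
#   llama-server -m SmolVLM-500M-Instruct-f16.gguf --mmproj <projector.gguf> --port 8080
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_payload(jpeg_bytes: bytes, prompt: str = "Describe what you see.") -> dict:
    """Step 2-3: wrap a camera frame as a Base64 data URI in a chat request."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
    }

def describe_frame(jpeg_bytes: bytes) -> str:
    """Step 3-5: POST one frame to the local server, return its description."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(build_payload(jpeg_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: first choice's message content.
    return body["choices"][0]["message"]["content"]
```

The Flutter app performs the same encode-and-POST cycle on a timer every few seconds, replacing the previous description with each new response.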

Use Cases:

  • 🔍 Accessibility for visually impaired users
  • 📚 Educational tools for visual recognition
  • 🛠️ Real-time debugging or documentation assistant
  • 🌾 Smart farming applications (object/plant recognition)
  • 🤖 Robotics vision and autonomy support

📸 Demo

SmolVLM App Demo GIF

🎥 Project Demo Video

🤝 Contributions

Open to contributions, feature suggestions, or bug reports! Feel free to fork the repo, open issues, or connect on LinkedIn.