GitHub - PabRubio/caption-gen

Project Overview

Caption-gen is an image caption generation web application using Hugging Face transformers. It provides a Flask web interface for uploading images and generating descriptive captions using the Salesforce BLIP model.

Commands

Setup and Installation

# Activate virtual environment
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
pip install flask pillow torch gunicorn

Running the Application

# Development mode (Flask development server on port 5000)
python app.py

# Standalone caption generation (processes selfie.jpg)
python main.py

# Production mode (systemd service)
sudo systemctl start caption-gen
sudo systemctl status caption-gen
sudo systemctl stop caption-gen

Development Commands

# Run specific model tests
python main2.py   # DeepSeek model test
python main3.py   # Llama model test

Architecture

Core Components

Web Application (app.py): Flask server with two routes:
- / - Serves the HTML interface
- /generate_caption - REST API endpoint accepting base64-encoded images
Frontend (index.html): Simple JavaScript-based interface for image upload and caption display
Model Pipeline: Uses Salesforce/blip-image-captioning-large with configured generation parameters (beam search, repetition penalty, etc.)
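
The two routes described above could be wired up roughly as follows. This is a minimal sketch, not the repository's app.py: the JSON field names "image" and "caption", the helper names decode_image_field, handle_caption_request, and create_app, and the error responses are all assumptions made for illustration. The request logic lives in a plain function so it can be exercised without a running server or a loaded model.

```python
import base64
import binascii


def decode_image_field(value):
    """Decode a base64 string, tolerating a 'data:image/...;base64,' prefix."""
    if not isinstance(value, str) or not value:
        raise ValueError("missing image data")
    # A frontend that reads files as data URLs sends a prefix; keep only the payload.
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("image is not valid base64") from exc


def handle_caption_request(payload, captioner):
    """Return (response_body, http_status) for a parsed JSON payload.

    `captioner` is any callable that maps raw image bytes to a caption string.
    """
    if not isinstance(payload, dict):
        return {"error": "expected a JSON object"}, 400
    try:
        image_bytes = decode_image_field(payload.get("image"))
    except ValueError as exc:
        return {"error": str(exc)}, 400
    return {"caption": captioner(image_bytes)}, 200


def create_app(captioner):
    """Build the Flask app; Flask is imported lazily so the helpers stay dependency-free."""
    from flask import Flask, jsonify, request, send_file

    app = Flask(__name__)

    @app.route("/")
    def index():
        # Assumes index.html lives in the application's root directory.
        return send_file("index.html")

    @app.route("/generate_caption", methods=["POST"])
    def generate_caption():
        body, status = handle_caption_request(request.get_json(silent=True), captioner)
        return jsonify(body), status

    return app
```

Passing the captioner in as a callable keeps the model out of the routing code, so a stub such as `lambda image_bytes: "test caption"` is enough to try the endpoint locally.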

Production Deployment

The application is configured as a systemd service (caption-gen.service) that:

Runs Gunicorn bound to a Unix socket at /home/pabrubio/.caption-gen.pabrubio.hackclub.app.webserver.sock
Uses 1 worker with a 600-second timeout
Automatically restarts on failure
Waits for network connectivity before starting
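
The Gunicorn settings listed above can also be written as a Gunicorn configuration file, which Gunicorn reads as Python. This is a sketch under assumptions: the repository's caption-gen.service may pass these values as command-line flags instead, and the file name gunicorn.conf.py is hypothetical here.

```python
# gunicorn.conf.py (hypothetical file; values taken from the list above)

# Listen on the Unix socket instead of a TCP port.
bind = "unix:/home/pabrubio/.caption-gen.pabrubio.hackclub.app.webserver.sock"

# Each worker is a separate process, so one worker means one loaded copy of the model.
workers = 1

# Workers silent for longer than this many seconds are killed and restarted.
timeout = 600
```

It could be started with `gunicorn -c gunicorn.conf.py app:app`, assuming app.py exposes a Flask object named `app`.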

Model Variants

main.py: Enhanced caption generation using BLIP + OPT-125M with LangChain
main2.py: DeepSeek R1 Distill model testing
main3.py: Meta Llama 3.2 1B Instruct model testing

Key Implementation Details

Images are sent as base64-encoded data in JSON POST requests
The BLIP model is run with tuned generation parameters (beam search, repetition penalty, etc.) to improve caption quality
Transformers log output is limited to errors using set_verbosity_error()
The service runs from /home/pabrubio/pub/caption-gen in production
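
As a rough illustration of the generation parameters and the log-level call mentioned above, the sketch below loads BLIP and captions an image. The numeric values in GENERATION_KWARGS are placeholders, since this README does not list the repository's actual settings, and the function names load_blip, caption_image, and caption_bytes are hypothetical.

```python
import io

MODEL_NAME = "Salesforce/blip-image-captioning-large"

# Illustrative values only; the repository's real settings are not given in this README.
GENERATION_KWARGS = {
    "num_beams": 5,             # beam search width
    "repetition_penalty": 1.2,  # discourage repeated phrases
    "max_new_tokens": 50,       # cap on caption length
}


def load_blip(model_name=MODEL_NAME):
    """Load the BLIP processor and model, limiting transformers logs to errors first."""
    from transformers import BlipForConditionalGeneration, BlipProcessor
    from transformers import logging as hf_logging

    hf_logging.set_verbosity_error()
    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)
    return processor, model


def caption_image(image, processor, model, **overrides):
    """Generate a caption for a PIL image; keyword overrides replace the defaults."""
    inputs = processor(images=image, return_tensors="pt")
    params = {**GENERATION_KWARGS, **overrides}
    output_ids = model.generate(**inputs, **params)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()


def caption_bytes(image_bytes, processor, model):
    """Decode raw image bytes with Pillow, then caption the resulting RGB image."""
    from PIL import Image

    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    return caption_image(image, processor, model)
```

With the dependencies installed, `processor, model = load_blip()` followed by `caption_bytes(open("selfie.jpg", "rb").read(), processor, model)` would caption the sample image. The first call to load_blip downloads the model weights.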

Latest commit: PabRubio, "Create README.md", 1b029dd, Aug 5, 2025 (17 commits)

Name	Last commit message	Last commit date
LICENSE	Initial commit	Jul 29, 2025
README.md	Create README.md	Aug 5, 2025
app.py	Fix app.py	Aug 4, 2025
caption-gen.service	Create caption-gen.service	Aug 5, 2025
index.html	Create index.html	Aug 5, 2025
main.py	Improve pipeline	Aug 4, 2025
main2.py	Test deepseek	Aug 3, 2025
main3.py	Test llama 3	Aug 4, 2025
requirements.txt	Install requirements.txt	Jul 31, 2025
selfie.jpg	Create main.py	Aug 2, 2025
