Reproducing ChatGPT-4o’s Ghibli-Style Image Generation on a Single GPU
Don’t trust OpenAI? Concerned about privacy or subscription costs? Here’s how to create Studio Ghibli-style images with a single NVIDIA GPU and open-source tools.
TL;DR – jump straight to the hosted demos:
• Text-to-Image workflow 👉 https://agireact.com/workflow/ghibli_style
• Image-to-Image workflow 👉 https://agireact.com/gstyle
Why the Hype?
OpenAI’s ChatGPT-4o sparked a “Ghibli moment” – social feeds are now saturated with whimsical, pastel-tinted illustrations reminiscent of Spirited Away and My Neighbor Totoro.
But OpenAI’s generator is either paid (ChatGPT Plus at $20/mo) or rate-limited (2 images/day for free users). Running your own model keeps your data on-prem and costs nothing beyond the hardware and electricity.
Hardware
- RTX 3090 (any 12 GB+ NVIDIA GPU works)
Software Stack
- ComfyUI – node-based workflow editor
- Flux models – FLUX.1 [dev] / FLUX.1 [schnell], or Stable Diffusion 3.5 Medium/Large
- LoRA fine-tunes – Flux-Ghibsky Illustration
- ControlNet – for precise edge guidance (Image-to-Image)
- Vision LLM – e.g. minicpm-v via ComfyUI-Ollama for automatic prompt extraction
Workflows
Text-to-Image
Load a Flux base model, chain the Ghibli LoRA through a LoRA loader node, wire the result into your sampler, and start sampling.
Example prompt:
(ghibli style), a girl on a bicycle, sunset, soft pastel colours
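ComfyUI workflows are just JSON graphs, so the text-to-image chain above can also be assembled programmatically and submitted to ComfyUI’s HTTP API. The sketch below is a minimal, hedged example: the node class names (CheckpointLoaderSimple, LoraLoader, CLIPTextEncode, KSampler, etc.) are standard ComfyUI nodes, but the checkpoint/LoRA file names, node ids, and strength values are placeholder assumptions – swap in whatever sits in your local `models/` folders.

```python
def build_ghibli_workflow(prompt: str,
                          checkpoint: str = "flux1-schnell.safetensors",
                          lora: str = "flux-ghibsky.safetensors",
                          seed: int = 42) -> dict:
    """Assemble a minimal text-to-image graph in ComfyUI's /prompt API
    format: checkpoint -> LoRA -> text encode -> sampler -> decode -> save.
    Node ids are arbitrary strings; an input like ["1", 0] means
    "output slot 0 of node 1"."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": checkpoint}},
        # LoRA is applied to both the model and the CLIP encoder
        "2": {"class_type": "LoraLoader",
              "inputs": {"model": ["1", 0], "clip": ["1", 1],
                         "lora_name": lora,
                         "strength_model": 0.8, "strength_clip": 0.8}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"clip": ["2", 1], "text": prompt}},
        "4": {"class_type": "CLIPTextEncode",          # empty negative prompt
              "inputs": {"clip": ["2", 1], "text": ""}},
        "5": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["2", 0], "positive": ["3", 0],
                         "negative": ["4", 0], "latent_image": ["5", 0],
                         "seed": seed, "steps": 20, "cfg": 3.5,
                         "sampler_name": "euler", "scheduler": "simple",
                         "denoise": 1.0}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0], "filename_prefix": "ghibli"}},
    }
```

To run it, POST `{"prompt": build_ghibli_workflow(...)}` as JSON to your local ComfyUI server’s `/prompt` endpoint (by default `http://127.0.0.1:8188/prompt`); the rendered image lands in ComfyUI’s `output/` directory.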
Image-to-Image
- Run a Canny ControlNet over your photo to lock in its edges
- Let the vision LLM generate a descriptive prompt from the photo
- Apply the Ghibli LoRA
- Denoise at low strength (0.3–0.5) so the original composition survives
The full ComfyUI graph is in the repo and demoed on the site above.
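Why does a low denoise strength preserve the photo? In most img2img samplers (ComfyUI’s KSampler `denoise` parameter, diffusers’ `strength`), a strength of d on an N-step schedule roughly means only the last d·N steps are re-run on a partially re-noised version of your image, so most of the source structure survives. A minimal sketch of that arithmetic, under the assumption of this simple linear model:

```python
def img2img_steps(total_steps: int, denoise: float) -> tuple[int, int]:
    """Rough model of img2img scheduling: with denoise strength d,
    about d * total_steps sampling steps actually execute, and the
    rest are skipped (the input image stands in for those steps).
    Returns (skipped_steps, executed_steps)."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise strength must be in [0, 1]")
    executed = int(total_steps * denoise)
    return total_steps - executed, executed
```

At the recommended 0.3–0.5 range with a 30-step schedule, only 9–15 steps run, which is why the output still reads as your photo, just re-painted in the Ghibli palette. At denoise 1.0 the image contributes nothing and you are back to pure text-to-image.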
Fun Fact
Indian users adopted the workflow fastest – uploading selfies to “Ghibli-fy” themselves!
Conclusion
Running locally gives you unlimited generations, lower latency, and full data ownership. Give it a go and share your Ghibli masterpieces!