Generate Image

Verified intermediate

Generate images using AI. Use when asked to generate, create, or make images, textures, icons, sprites, artwork, visual assets, or mockups. Supports OpenAI (gpt-image-2) and Google Gemini (Nano Banana). Requires an API key for the chosen provider.


Installation

Install with CLI (recommended)

gh skills-hub install generate-image

Don't have the extension? Run gh extension install samueltauil/skills-hub first.

Download and extract to your repository:

.github/skills/generate-image/

Extract the ZIP to .github/skills/ in your repo. The folder name must match generate-image for Copilot to auto-discover it.

Skill Files (1)

SKILL.md 3.7 KB

---
name: generate-image
description: >-
  Generate images using AI. Use when asked to generate, create, or make images, textures,
  icons, sprites, artwork, visual assets, or mockups. Supports OpenAI (gpt-image-2) and
  Google Gemini (Nano Banana). Requires an API key for the chosen provider.
argument-hint: "[description of the image to generate]"
license: MIT
metadata:
  version: "2.1.0"
  providers: "openai, gemini"
---

# Generate Image

You are an image generation assistant. When invoked, follow the workflow below.

## Workflow

1. **Check for API keys** — check whether `SKILL_IMAGE_GEN_OPENAI_KEY` and/or `SKILL_IMAGE_GEN_GEMINI_KEY` are set in the environment.
2. **If one key is set** — use that provider. No need to ask.
3. **If both are set** — pick based on context (OpenAI for polish, Gemini for speed), or ask if the user has a preference.
4. **If no keys are set** — run the Onboarding section.
5. **Generate the image** using the appropriate API reference.
6. **Tell the user** where the image was saved.
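
The key check in steps 1 to 4 can be sketched as a small function. This is a minimal illustration, not part of the skill itself: the function name `choose_provider`, the `prefer` parameter, and the fallback to OpenAI when both keys are set and no preference is given are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional

OPENAI_VAR = "SKILL_IMAGE_GEN_OPENAI_KEY"
GEMINI_VAR = "SKILL_IMAGE_GEN_GEMINI_KEY"


def choose_provider(
    env: Mapping[str, str] = os.environ,
    prefer: Optional[str] = None,
) -> Optional[str]:
    """Return "openai", "gemini", or None when onboarding is needed."""
    available = []
    if env.get(OPENAI_VAR):
        available.append("openai")
    if env.get(GEMINI_VAR):
        available.append("gemini")

    if not available:
        return None  # step 4: no keys set, run onboarding
    if len(available) == 1:
        return available[0]  # step 2: only one key set, use it
    if prefer in available:
        return prefer  # step 3: honor a stated or inferred preference
    # Assumption: arbitrary fallback. The workflow says to pick by context
    # or ask the user, so a real agent would decide here instead.
    return "openai"
```

Passing a plain dictionary as `env` makes the logic easy to check without touching the real environment.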

## Onboarding

Only run this if no keys are set. Guide the user conversationally.

1. Ask which provider they'd like to use:
   - **OpenAI (gpt-image-2)** — High quality, excellent text rendering, paid per image
   - **Google Gemini (Nano Banana)** — Fast, free tier available, great for iteration
2. Direct them to get an API key:
   - OpenAI → https://platform.openai.com/api-keys
   - Gemini → https://aistudio.google.com/apikey
3. Once they provide the key, set `SKILL_IMAGE_GEN_OPENAI_KEY` or `SKILL_IMAGE_GEN_GEMINI_KEY` in the current session and persist it to the appropriate shell profile.
4. Proceed to generate the image they originally asked for.

## API Reference: OpenAI

**Method:** `POST`
**URL:** `https://api.openai.com/v1/images/generations`

**Headers:**
- `Authorization: Bearer <SKILL_IMAGE_GEN_OPENAI_KEY>`
- `Content-Type: application/json`

**Body (JSON):**
```json
{
  "model": "gpt-image-2",
  "prompt": "<user prompt>",
  "n": 1,
  "size": "1024x1024",
  "quality": "medium"
}
```

| Field | Default | Options |
|---|---|---|
| model | `gpt-image-2` | `gpt-image-2`, `gpt-image-1` |
| size | `1024x1024` | `1024x1024`, `1024x1536`, `1536x1024`, `auto` |
| quality | `medium` | `low`, `medium`, `high` |

**Response:** `data[0].b64_json` contains the base64-encoded image. Decode it and save to the output path. If `data[0].url` is present instead, download the image from that URL.
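
As a rough sketch of the request and response handling described above, the following uses only the Python standard library. The helper names `build_openai_request` and `save_openai_image` are hypothetical; the URL, headers, body fields, and response fields are the ones listed in this section.

```python
import base64
import json
import urllib.request
from pathlib import Path

OPENAI_URL = "https://api.openai.com/v1/images/generations"


def build_openai_request(prompt, api_key, size="1024x1024", quality="medium"):
    """Build (but do not send) the POST request described above."""
    body = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "n": 1,
        "size": size,
        "quality": quality,
    }
    return urllib.request.Request(
        OPENAI_URL,
        data=json.dumps(body).encode("utf-8"),  # json.dumps escapes the prompt
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_openai_image(response, out_path):
    """Save the first image from a parsed JSON response and return its path."""
    item = response["data"][0]
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    if item.get("b64_json"):
        out.write_bytes(base64.b64decode(item["b64_json"]))
    elif item.get("url"):
        # Fallback: the response carries a download URL instead of base64 data.
        with urllib.request.urlopen(item["url"]) as resp:
            out.write_bytes(resp.read())
    else:
        raise ValueError("response contains neither b64_json nor url")
    return out
```

To send the request, pass the result of `build_openai_request` to `urllib.request.urlopen`, parse the body with `json.load`, and hand the resulting dictionary to `save_openai_image`.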

## API Reference: Google Gemini (Nano Banana)

**Method:** `POST`
**URL:** `https://generativelanguage.googleapis.com/v1beta/models/<model>:generateContent`

**Headers:**
- `x-goog-api-key: <SKILL_IMAGE_GEN_GEMINI_KEY>`
- `Content-Type: application/json`

**Body (JSON):**
```json
{
  "contents": [{"parts": [{"text": "Generate an image: <user prompt>"}]}],
  "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]}
}
```

| Field | Default | Options |
|---|---|---|
| model (in URL) | `gemini-2.0-flash-exp` | `gemini-2.0-flash-exp`, `gemini-2.5-flash-image` |

**Response:** Iterate over `candidates[0].content.parts[]` and find the part that has `inlineData.data` (the base64-encoded image) and `inlineData.mimeType`. Decode the data and save it.

**Error cases:** an `error` key in the response (API error), `promptFeedback.blockReason` (prompt blocked by safety filters), or a candidate `finishReason` of `"SAFETY"` (output filtered).
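
The response handling and error cases above could be combined along these lines. This is a sketch under assumptions: the function name `extract_gemini_image` and the choice to raise `RuntimeError` are illustrative, and the input is assumed to be the already-parsed JSON response.

```python
import base64


def extract_gemini_image(response):
    """Return (image_bytes, mime_type) from a parsed response, or raise."""
    if "error" in response:
        raise RuntimeError(f"API error: {response['error']}")

    block_reason = response.get("promptFeedback", {}).get("blockReason")
    if block_reason:
        raise RuntimeError(f"prompt blocked: {block_reason}")

    candidates = response.get("candidates") or []
    if not candidates:
        raise RuntimeError("no candidates in response")

    candidate = candidates[0]
    if candidate.get("finishReason") == "SAFETY":
        raise RuntimeError("output filtered for safety")

    # Parts may mix text and image data; return the first part with inline data.
    for part in candidate.get("content", {}).get("parts", []):
        inline = part.get("inlineData")
        if inline and inline.get("data"):
            return base64.b64decode(inline["data"]), inline.get("mimeType", "")

    raise RuntimeError("no image part found in response")
```

The returned MIME type can be used to pick a file extension before saving.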

## Agent Guidelines

- Choose an output path that fits the project: prefer an existing asset directory (e.g., `assets/` or `images/`), and fall back to the current directory.
- For game textures, enrich prompts with "seamless", "tileable", "game asset".
- For batch generation, make multiple API calls in parallel.
- If the user asks to switch providers or what options are available, explain both and help them set up.
- Always create the output directory before saving.
- Ensure special characters in the user's prompt are properly escaped in the JSON body.
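
Three of the guidelines above (escaping, directory creation, and parallel batches) can be illustrated with the standard library. The helper names are hypothetical, and `generate_one` stands in for whatever function performs a single API call.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def encode_body(prompt):
    """Serialize the body with json.dumps so quotes and newlines are escaped."""
    return json.dumps({"model": "gpt-image-2", "prompt": prompt, "n": 1}).encode("utf-8")


def prepare_output_path(directory, filename):
    """Create the output directory if needed and return the full file path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


def generate_batch(prompts, generate_one, max_workers=4):
    """Run generate_one for each prompt in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(generate_one, prompts))
```

Building the body with `json.dumps` rather than string formatting avoids malformed JSON when a prompt contains quotes, backslashes, or line breaks.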

License (MIT)

Source: github/awesome-copilot


MIT License

Copyright GitHub, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

Security Scan

Passed

Every skill undergoes a two-pass automated security scan before being published to the Hub.


Pass 1 — Pattern analysis scans every file in the skill against 13 security rules for known dangerous patterns:

Script & command detection — Shell commands, exec/spawn calls, subprocess invocations, and curl-pipe-to-shell patterns.
Prompt injection markers — Phrases that attempt to override safety guidelines, bypass restrictions, or manipulate AI behavior.
Sensitive data & secrets — Hardcoded API keys, credentials, tokens, and access to sensitive system files.
Obfuscation patterns — Base64 decode-and-execute, dynamic code evaluation, and unsafe deserialization.
Data exfiltration risks — Environment variables sent to external URLs, writes to sensitive paths, and SQL injection patterns.

Pass 2 — AI deep scan uses GitHub Copilot to semantically analyze skill content for threats that regex can't catch:

Intent analysis — Detects code that appears benign line-by-line but is malicious in aggregate, such as disguised data exfiltration.
Social engineering — Instructions that trick users into running dangerous commands or sharing credentials.
Supply chain risks — References to untrusted packages, suspicious download URLs, or dependency confusion patterns.