# maya-baseten-deploys

Production model deployments on [Baseten](https://baseten.co) H100 GPUs. Each subdirectory is a self-contained deployment with its own Truss model, API layer, and docs.

## Deployments

| Model | Status | Latency | GPU | Description |
|-------|--------|---------|-----|-------------|
| [**firered-lora**](firered-lora/) | Live | 3.5s | H100 | AI image editing — FireRed-Edit-1.1 + Lightning LoRA |

---

## FireRed LoRA — Samples

Send a photo and a text prompt, and get back an edited image in about **3.5 seconds**. Reported scores: 9.33/10 quality, 10.0/10 identity preservation.

<p align="center">
<img src="firered-lora/samples/original.jpg" width="180"><br>
<em>Source</em>
</p>

<table>
<tr>
<td align="center" width="33%">
<img src="firered-lora/samples/sample_01_cowboy.png" width="250"><br>
<em>"Put a tiny cowboy hat on the cat"</em>
</td>
<td align="center" width="33%">
<img src="firered-lora/samples/sample_02_tuxedo.png" width="250"><br>
<em>"Make the cat wear a formal black tuxedo with a bow tie"</em>
</td>
<td align="center" width="33%">
<img src="firered-lora/samples/sample_03_sunglasses.png" width="250"><br>
<em>"Add cool aviator sunglasses"</em>
</td>
</tr>
<tr>
<td align="center">
<img src="firered-lora/samples/sample_06_superhero.png" width="250"><br>
<em>"Add a red superhero cape"</em>
</td>
<td align="center">
<img src="firered-lora/samples/sample_07_crown.png" width="250"><br>
<em>"Put a royal golden crown on the cat"</em>
</td>
<td align="center">
<img src="firered-lora/samples/sample_09_painting.png" width="250"><br>
<em>"Turn this into an oil painting in renaissance style"</em>
</td>
</tr>
<tr>
<td align="center">
<img src="firered-lora/samples/sample_04_space.png" width="250"><br>
<em>"Change background to outer space"</em>
</td>
<td align="center">
<img src="firered-lora/samples/sample_05_livingroom.png" width="250"><br>
<em>"Put in a cozy living room with fireplace"</em>
</td>
<td align="center">
<img src="firered-lora/samples/sample_08_snow.png" width="250"><br>
<em>"Change background to snowy winter wonderland"</em>
</td>
</tr>
</table>

> All 9 edits generated via the production API in ~3.5s each on H100. Full API docs in [`firered-lora/`](firered-lora/).

---

## Architecture

```
                    ┌─────────────────────────────┐
                    │    maya-baseten-deploys      │
                    │         (this repo)          │
                    └──────┬──────────┬────────────┘
                           │          │
                    ┌──────▼──┐  ┌────▼─────┐
                    │firered- │  │  veena/   │
                    │  lora/  │  │  (next)   │
                    └────┬────┘  └──────────┘
                         │
              ┌──────────┼──────────┐
              ▼          ▼          ▼
         ┌────────┐ ┌────────┐ ┌────────┐
         │ model/ │ │  api/  │ │samples/│
         │ Truss  │ │FastAPI │ │  demo  │
         │(Baseten│ │(Vercel)│ │ images │
         │  GPU)  │ │        │ │        │
         └────────┘ └────────┘ └────────┘
```

## How it works

1. **Each deployment** lives in its own folder with a Truss model (`model/`) and API server (`api/`)
2. **Push to Baseten** — `truss push` deploys the GPU model
3. **Push to Vercel** — `vercel --prod` deploys the API layer
4. **Scale from Claude Code** — hit `/api/gpu/scale?replicas=N` to add/remove GPUs
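
As a rough illustration of step 4, the sketch below builds the scale URL and optionally calls it using only the standard library. The path `/api/gpu/scale` and the `replicas` query parameter come from the list above. The base URL, the HTTP method, and the absence of authentication are assumptions, so check `api/server.py` for the real contract before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_scale_url(api_base: str, replicas: int) -> str:
    """Return the scale endpoint URL for the requested replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    query = urlencode({"replicas": replicas})
    return f"{api_base.rstrip('/')}/api/gpu/scale?{query}"


def scale(api_base: str, replicas: int, method: str = "GET", timeout: float = 30.0) -> int:
    """Call the scale endpoint and return the HTTP status code.

    Assumption: the endpoint accepts `method` without extra headers.
    The README does not state the method or any auth scheme.
    """
    request = Request(build_scale_url(api_base, replicas), method=method)
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `build_scale_url("https://example.vercel.app", 2)` (a placeholder host) yields `https://example.vercel.app/api/gpu/scale?replicas=2`. Keeping URL construction separate from the network call makes the helper easy to check without touching live GPUs.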

## Adding a new deployment

```
new-model/
├── model/
│   ├── __init__.py
│   └── model.py       # load() + predict()
├── config.yaml         # Baseten Truss config
├── api/
│   ├── server.py       # FastAPI endpoints
│   └── requirements.txt
├── samples/            # Demo images
└── README.md
```
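
The layout above can be scaffolded with a short script. This is a minimal sketch, not the repo's own tooling: the file list mirrors the tree, while the `Model` stub follows the usual Truss convention of a class with `load()` and `predict()`. The `"prompt"` and `"output"` keys and the echo behaviour are placeholders invented for illustration. `config.yaml`, `server.py`, and the other files are created empty and still need real content before `truss push` or `vercel --prod` will work.

```python
from pathlib import Path

# Placeholder model: load() runs once at startup, predict() runs per request.
MODEL_STUB = '''\
class Model:
    """Minimal Truss-style model skeleton."""

    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Replace with real weight loading.
        self._model = lambda prompt: prompt

    def predict(self, model_input):
        # Hypothetical request/response keys; echoes the prompt back.
        return {"output": self._model(model_input["prompt"])}
'''

# Relative file paths from the tree above, mapped to their initial contents.
LAYOUT = {
    "model/__init__.py": "",
    "model/model.py": MODEL_STUB,
    "config.yaml": "",
    "api/server.py": "",
    "api/requirements.txt": "",
    "README.md": "",
}


def scaffold(root: Path) -> list[Path]:
    """Create the deployment skeleton under `root`, skipping existing files."""
    created = []
    for relative, content in LAYOUT.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.write_text(content)
            created.append(path)
    # samples/ holds demo images, so it starts as an empty directory.
    (root / "samples").mkdir(parents=True, exist_ok=True)
    return created
```

Calling `scaffold(Path("new-model"))` creates the folder in the current directory. Existing files are left untouched, so running it again on the same folder returns an empty list.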

---

Built by [MayaResearch](https://github.com/MayaResearch)
