excel.aws.monce.ai / architecture
Two boxes. One bearer token. The rest is rendering.
Two AWS-hosted surfaces. Each does one thing — but the add-in box now also runs a small dispatcher, so it is no longer "compute = none."
| Surface | Box | Purpose | Compute |
|---|---|---|---|
| excel.aws.monce.ai | t4g.micro (this) | Add-in chrome and Snake dispatcher: routes =PREDICT & siblings to local or cloud based on row count. | In-process algorithmeai for <500 rows; HTTPS forwards to snakebatch below. |
| snakebatch.aws.monce.ai | t3.small + Lambdas | Distributed v6 Snake training + predict. Authoritative model storage. | Lambda fan-out (v6-worker, 10 GB). |
The Excel add-in calls same-origin /v6/* on this box. The dispatcher (under
app/routes/snake.py) inspects the payload and decides: under 500 rows the call runs
in-process here (no Lambda fee, ~30 ms–3 s wall-clock); at or above 500 rows it forwards to
snakebatch.aws.monce.ai/v5/train with test_items inline (the only path the
SDK currently supports end-to-end — see §9). The threshold is one env var: CLOUD_THRESHOLD.
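The routing decision itself is tiny. A minimal sketch of the threshold check, assuming a hypothetical function name (the real logic lives in app/routes/snake.py):

```python
import os

# CLOUD_THRESHOLD is the one env var that moves the local/cloud boundary.
CLOUD_THRESHOLD = int(os.environ.get("CLOUD_THRESHOLD", "500"))

def pick_mode(data: list, threshold: int = CLOUD_THRESHOLD) -> str:
    """Return "local" for in-process algorithmeai training,
    "cloud" to forward to snakebatch.aws.monce.ai/v5/train."""
    return "local" if len(data) < threshold else "cloud"
```

Everything else in the dispatcher is plumbing around this one comparison.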
║ User types =PREDICT(A1:E120, "Délai", G1:K30) in cell L1
║
║ Office.js runtime calls our registered custom function
║ functions.js hosted at excel.aws.monce.ai/functions.js
║
║ function awaits Office.onReady, reads workbook setting
║ monceai_api_key → sk-monceai-...
║
║ POST https://excel.aws.monce.ai/v6/train # same-origin, no CORS preflight
║ Authorization: Bearer sk-monceai-...
║ body: { data: [...], config: {target_index, n_layers}, test_items: [...] }
║
║ dispatcher (this box) reads len(data):
║ # <500 rows → in-process algorithmeai.Snake
║ # ≥500 rows → POST snakebatch.aws.monce.ai/v5/train (test_items inline)
║ |
║ v (cloud path only)
║ snakebatch's api Lambda (eu-west-3) authenticates, fans out:
║ v6-worker(n=2000) splits binary → 2 children → ... → leaves call Snake()
║
║ predictions returned + telemetry:
║ { predictions: [...], _mode: "local"|"cloud", _elapsed_ms: 1623 }
║
║ custom function returns matrix; Excel spills L1:L30
║
║ taskpane "Last call" tile shows mode + ms
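Every /v6/* response carries the two telemetry fields the tile renders. A sketch of the formatting step, assuming the response shape shown above (the helper name is hypothetical):

```python
def last_call_summary(resp: dict) -> str:
    """Build the "Last call" tile text from the _mode / _elapsed_ms
    telemetry that rides along with every /v6/* response."""
    return f"{resp['_mode']} ({resp['_elapsed_ms']} ms)"
```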
| Layer | Choice | Why |
|---|---|---|
| OS | Ubuntu 22.04 LTS arm64 | Free, supported, ARM = cheaper compute. |
| Hardware | t4g.micro (2 burst vCPU, 1 GB RAM) | ~$7/mo. The "potato" target; fits 5 concurrent users. |
| App | FastAPI + uvicorn workers (gunicorn supervisor) | Async, type-safe, identical pattern to snakebatch. |
| Workers | 2 (capped at 1000 reqs/worker, 120s timeout) | Reduced from 4 to leave RAM for in-process Snake training. 120s timeout absorbs cloud cold-starts. |
| Snake runtimes | algorithmeai 5.4.4 (local) + monceai 1.2.0 (cloud SDK) in the venv | Local mode trains in-process; SDK forwards to snakebatch when above threshold. |
| Reverse proxy | nginx + certbot (Let's Encrypt) | HTTPS, gzip, rate limiting. |
| DNS | Route53 A record at zone aws.monce.ai | Same hosted zone as snakebatch. |
| IaC | Terraform (hashicorp/aws ~> 5.0) | One main.tf: EC2 + SG + IAM + Route53. |
| Deploy | rsync + systemctl restart, ~30s | No CodeDeploy, no Docker. Smallest moving parts. |
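The worker row above translates to a few lines of gunicorn config, which is itself Python. A sketch, assuming a gunicorn.conf.py file (the actual service may pass these as CLI flags instead):

```python
# gunicorn.conf.py (sketch; values mirror the stack table)
workers = 2                   # down from 4: leave RAM for in-process Snake training
worker_class = "uvicorn.workers.UvicornWorker"  # FastAPI behind a gunicorn supervisor
max_requests = 1000           # recycle each worker after 1000 requests to bound memory creep
timeout = 120                 # absorb snakebatch Lambda cold-starts on the cloud path
```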
| Path | Returns | Auth |
|---|---|---|
| / | Landing + install CTA | public |
| /install | Win/Mac/Web platform buttons | public |
| /manifest.xml | Office Add-in manifest | public |
| /functions.json | Custom Functions metadata | public |
| /functions.js | Custom Functions JS — calls same-origin /v6/* | public |
| /taskpane.html | Add-in sidebar (key paste, status, formula tips) | public |
| /v6/{potential, train, candle, fill, audit} | Snake dispatcher — local or cloud per row count | Bearer (personal key) |
| /account | Token balance, model storage, key rotation (v0.6) | public (UI), Bearer for API |
| /dashboard | Live usage stats from snake-batch-usage DDB | public |
| /api/dashboard?period= | JSON for the dashboard | public |
| /auth/{signup,verify,poll,balance} | Magic-link flow | stubbed pending v0.6 Lambda |
| /paper · /economics · /architecture | Documentation pages | public |
| /health | Service status JSON | public |
v0.5 ships these in the snakebatch Terraform module — not here — because they touch
snake-batch-* resources (DynamoDB tables, S3 buckets, IAM scopes) that already live there.
This EC2 only reads balances and renders UIs.
| Lambda | Memory / Timeout | Trigger | Purpose |
|---|---|---|---|
snake-batch-auth | 512 MB / 10s | HTTP via API Gateway | signup, verify, poll, rotate. Calls SES + DDB. |
snake-batch-billing | 256 MB / 5s | Inline from api Lambda | Atomic balance decrement (DDB ConditionExpression). |
snake-batch-meter | 1 GB / 60s | EventBridge cron, daily 00:30 UTC | Scan S3 model storage, charge storage tokens. |
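The billing Lambda's atomicity comes from DynamoDB's ConditionExpression: the balance check and the decrement travel in one request, so concurrent calls can never drive a balance negative. A sketch of the update parameters (table and attribute names are assumptions):

```python
def debit_params(api_key: str, cost: int) -> dict:
    """Arguments for a boto3 DynamoDB update_item call that debits `cost`
    tokens atomically. DynamoDB raises ConditionalCheckFailedException
    instead of applying the update if the balance is insufficient."""
    return {
        "TableName": "snake-batch-users",        # assumed table name
        "Key": {"api_key": {"S": api_key}},
        "UpdateExpression": "SET balance = balance - :c",
        "ConditionExpression": "balance >= :c",  # check + debit in one request
        "ExpressionAttributeValues": {":c": {"N": str(cost)}},
    }
```

In the Lambda, this dict would be splatted into `client.update_item(**debit_params(key, cost))` and the conditional-check failure mapped to an "insufficient tokens" response.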
                 api_key
                    |
                    | bearer
                    v
          ┌────── HTTPS ──────┐
          |                   |
     excel.aws         snakebatch.aws
      (chrome)           (compute)
          |                   |
          | GET only          | POST /v6/*
          | (no DB writes)    | with bearer
          v                   v
      DDB read          DDB read+write
  snake-batch-usage   snake-batch-{users,usage}
     (dashboard)        (auth, billing)
                              |
                              v
                     S3 per-user prefix
                   jobs/<sha256(email)>/...
                    IAM-scoped per key
The EC2 has no write access to user records or model storage — on purpose. If this box is ever compromised the blast radius is "dashboard reads stop"; balance and models stay safe behind the Lambda IAM boundary.
Every trained model lands in S3 under s3://snake-batch-monce/jobs/<sha256(email)>/<model_id>/model_stripped.json
the moment training finishes. The retention policy is indefinite by default. We do not garbage-collect.
| Property | Behavior |
|---|---|
| Retention | Forever, until the user explicitly deletes via POST /v6/models/<id>/delete. |
| Workbook deleted on disk | Model still in S3. Re-fetchable by model_id. |
| Workbook XML part stripped (Document Inspector) | Add-in falls back to S3 by model_id stored in workbook custom property; predictions resume. |
| Account paused | Storage retained, charges suspended. Re-activate → instant access. |
| Account closed (user request) | 30-day soft-delete window, then permanent S3 prefix wipe. |
| List endpoint | GET /v6/models → every model the bearer key owns, with size + trained_at + last_used. |
| Rehydrate endpoint | POST /v6/models/<id>/rehydrate → returns the JSON; add-in re-embeds in the workbook. |
| Cross-user isolation | S3 IAM policy on the api Lambda restricts to jobs/<sha256(this user)>/* — user cannot read another user's prefix. |
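The per-user prefix in that last row is just a hash of the account email, which is what makes a static IAM resource pattern possible. A sketch:

```python
import hashlib

def user_prefix(email: str) -> str:
    """S3 key prefix the api Lambda's IAM policy is scoped to:
    jobs/<sha256(email)>/, one isolated subtree per account."""
    return f"jobs/{hashlib.sha256(email.encode()).hexdigest()}/"
```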
The published pip install monceai SDK exposes Snake.get_batch_prediction(items)
which posts to snakebatch.aws.monce.ai/v6/batch/<model_id>. As of May 2026 that
endpoint is not yet shipped — it 404s. The only working cloud predict path is /v5/train
with test_items inline, which re-trains every call (correct results, suboptimal cost).
This box's dispatcher works around it with a thin CloudSnake shim
(app/routes/snake.py) that calls /v5/train directly. When /v6/batch
ships, that shim is deleted and replaced with a one-line from monceai import Snake. Until
then, cloud-mode predictions cost an extra training pass per call — baked into the
economics table as the cloud row.
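What the shim amounts to, sketched with stdlib urllib (the class and method names are assumptions; the real code lives in app/routes/snake.py):

```python
import json
import urllib.request

V5_TRAIN = "https://snakebatch.aws.monce.ai/v5/train"  # the only working cloud path

class CloudSnake:
    """Stand-in for monceai's Snake until /v6/batch ships: every cloud
    predict re-posts the full training payload with test_items inline."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def build_request(self, data, config, test_items) -> urllib.request.Request:
        # Pure helper so the payload shape is testable without the network.
        body = json.dumps(
            {"data": data, "config": config, "test_items": test_items}
        ).encode()
        return urllib.request.Request(
            V5_TRAIN,
            data=body,
            method="POST",
            headers={"Authorization": f"Bearer {self.api_key}",
                     "Content-Type": "application/json"},
        )

    def predict(self, data, config, test_items):
        # Re-trains on every call: correct results, one extra training pass.
        req = self.build_request(data, config, test_items)
        with urllib.request.urlopen(req, timeout=120) as resp:
            return json.loads(resp.read())["predictions"]
```

When /v6/batch lands, the whole class collapses to `from monceai import Snake`.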
/manifest.xml and friends are served as static files; AppSource will fetch the manifest on listing approval, not at runtime, so cache hit rate is meaningless.

Bearer auth in v0.5 is a single MONCEAI_PERSONAL_KEY env var (sha256-compared) instead of a users table. v0.6 swaps this for the snakebatch DDB-backed flow without touching the client.