CI: Lemonade v10.7.0 breaks embedding-model jobs (nomic-embed-text-v2-moe / llama-server ≥ b6524) #1831

## Summary

The Lemonade Server bump to **v10.7.0** (#1571) broke every CI job that loads the embedding model. `llama-server` in v10.7.0 (build ≥ b6524) cannot load `nomic-embed-text-v2-moe-GGUF`, so all RAG/embedding CI jobs fail at server startup — before any test runs. This is a known upstream regression: **lemonade-sdk/lemonade#612**.

This is **not flaky** — it is deterministic and reproduces on every branch. The tell: jobs that load an *LLM* model (API Tests, Chat, Code, Unit Tests) pass; only jobs that load the *embedding* model fail.

## Error

```
model_load_error: Failed to load model 'nomic-embed-text-v2-moe-GGUF': llama-server failed to start
```
Some jobs surface only the higher-level form: `[ERROR] Server health check failed after 60 seconds` or `Server failed to start`.
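For triage, a CI log can be matched against the signatures quoted in this section. The following is a minimal sketch, assuming the job log is available as plain text; the function name `classify_failure` and the labels it returns are hypothetical and not part of the GAIA codebase.

```python
import re

# Signatures copied from the error messages quoted in this issue.
MODEL_LOAD_RE = re.compile(
    r"model_load_error: Failed to load model '(?P<model>[^']+)': "
    r"llama-server failed to start"
)
HEALTH_CHECK_RE = re.compile(r"Server health check failed after \d+ seconds")


def classify_failure(log_text: str) -> str:
    """Return a coarse label for a CI log (labels are illustrative only)."""
    match = MODEL_LOAD_RE.search(log_text)
    if match:
        # Most specific form: the failing model name is present in the log.
        return f"model_load_error:{match.group('model')}"
    if HEALTH_CHECK_RE.search(log_text) or "Server failed to start" in log_text:
        # Higher-level form: the server never came up, model name not shown.
        return "server_start_failure"
    return "unrelated"
```

The higher-level form does not name the model, so a `server_start_failure` label alone is not enough to attribute a failure to this regression.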

## Affected workflows / jobs

| Job | Workflow |
|-----|----------|
| RAG Integration Tests | `.github/workflows/test_rag.yml` |
| Test Lemonade Embeddings API | `.github/workflows/test_embeddings.yml` |
| Lemonade Server Smoke Test (stx) | `.github/workflows/test_lemonade_server.yml` |
| Example Agents Integration Tests (stx) | `.github/workflows/test_examples.yml` (intermittent) |

## Evidence it is environment-wide

`Test RAG` / `Test Lemonade Embeddings` fail across multiple unrelated branches in the same window — e.g. `claudia/task-8fa7ecef` (#1455), `feat/npu-flm-embedder`, `autofix/issue-1745`, `autofix/issue-1743`. Onset (~2026-06-18) coincides with the v10.7.0 bump landing on `main`.

## Root cause

`LEMONADE_VERSION` (in `src/gaia/version.py`) was bumped to v10.7.0 in #1571. v10.7.0 ships `llama-server` ≥ b6524, which per upstream **lemonade-sdk/lemonade#612** does not work with `nomic-embed-text-v2-moe`. LLM GGUFs are unaffected, which is why only the embedding path breaks.
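The boundary described above can be written down as a small guard. This is only a sketch that encodes the claim made in this issue (builds at or above b6524 combined with `nomic-embed-text-v2-moe`); the helper names are hypothetical, and how a workflow would obtain the `llama-server` build tag is not covered here.

```python
import re

# Boundary taken from this issue: builds at or above b6524 are reported as affected.
FIRST_BAD_BUILD = 6524
AFFECTED_MODEL_PREFIX = "nomic-embed-text-v2-moe"


def parse_build(tag: str) -> int:
    """Parse a build tag such as 'b6524' into the integer 6524."""
    match = re.fullmatch(r"b(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized build tag: {tag!r}")
    return int(match.group(1))


def is_affected(build_tag: str, model_name: str) -> bool:
    """True only for the embedding model on a build at or past the boundary."""
    return (
        parse_build(build_tag) >= FIRST_BAD_BUILD
        and model_name.startswith(AFFECTED_MODEL_PREFIX)
    )
```

For example, `is_affected("b6524", "nomic-embed-text-v2-moe-GGUF")` is true, while the same model on `b6523`, or any other model name on a newer build, is not flagged.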

## Fix

- **In progress:** #1788 pins Lemonade back to **10.6.0** (rolls `llama-server` below the b6524 boundary).
- **Revert condition:** un-pin once upstream lemonade-sdk/lemonade#612 is resolved with a `llama-server` build that loads `nomic-embed-text-v2-moe`.
- **Diagnostics follow-up:** the RAG/embeddings/smoke workflows currently swallow the `llama-server` child stderr and only surface "Server failed to start" — they should print the child exit code/stderr so the next backend regression is debuggable from the CI log alone.
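The diagnostics follow-up amounts to capturing the child's stderr and exit code instead of discarding them. Below is a minimal sketch of that pattern using only the Python standard library. It assumes the CI step launches the server process itself; in practice `llama-server` is started by Lemonade Server, so the real change may belong there or in how its logs are collected. The function name and the `startup_window` parameter are hypothetical.

```python
import subprocess
import sys
import tempfile


def start_with_captured_stderr(cmd, startup_window=5.0):
    """Start `cmd` with stderr redirected to a log file.

    Returns (proc, exit_code, stderr_text). If the process is still running
    after `startup_window` seconds, exit_code is None and stderr_text is "".
    """
    # A file is used instead of a pipe so a long-running server cannot block
    # on a full pipe buffer that nobody is reading.
    log = tempfile.NamedTemporaryFile(mode="w+", suffix=".stderr.log", delete=False)
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=log)
        try:
            proc.wait(timeout=startup_window)
        except subprocess.TimeoutExpired:
            return proc, None, ""  # still running: treat as started
        log.seek(0)
        return proc, proc.returncode, log.read()
    finally:
        log.close()


def report_early_exit(exit_code, stderr_text):
    """Print what the CI log needs to make a startup failure debuggable."""
    print(f"server exited during startup with code {exit_code}", file=sys.stderr)
    print(stderr_text, file=sys.stderr)
```

A workflow step could call `report_early_exit` whenever `exit_code` is not `None`, so the underlying backend error appears next to "Server failed to start" in the job log.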

## Related upstream issues

- lemonade-sdk/lemonade#612 — root cause (llama-server ≥ b6524 vs nomic-embed-text-v2-moe)
- lemonade-sdk/lemonade#941 — embedding model load failure
- lemonade-sdk/lemonade#1586 — `/api/embed` regressions after lemonade > 10.0.0

