Services

The protea.services package contains business-logic modules that routers delegate to. Services are pure Python: they accept a SQLAlchemy session and return domain objects or raise domain exceptions. Routers map those exceptions to HTTP status codes. This separation allows the same logic to be exercised from CLI tools or batch scripts without importing FastAPI.
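The separation described above can be sketched with the standard library alone. Everything in this example is hypothetical and for illustration only: `ConfigNotFoundError`, `EmbeddingConfig`, `FakeSession`, `get_config`, and the status-code table are not taken from protea. `FakeSession` imitates only the `get()` method of a SQLAlchemy session.

```python
from dataclasses import dataclass


class ConfigNotFoundError(Exception):
    """Hypothetical domain exception raised by a service."""


@dataclass
class EmbeddingConfig:
    """Hypothetical domain object returned by a service."""
    id: int
    name: str


class FakeSession:
    """Stand-in for a SQLAlchemy session; only get() is imitated."""

    def __init__(self, rows):
        self._rows = rows

    def get(self, model, key):
        return self._rows.get(key)


def get_config(session, config_id: int) -> EmbeddingConfig:
    """Service function: no web-framework imports, only domain types."""
    config = session.get(EmbeddingConfig, config_id)
    if config is None:
        raise ConfigNotFoundError(f"config {config_id} not found")
    return config


# Router-side mapping from domain exceptions to HTTP status codes (assumed values).
STATUS_FOR_EXCEPTION = {ConfigNotFoundError: 404}


def call_like_a_router(session, config_id: int):
    """Imitates a router: delegate to the service, then translate errors."""
    try:
        return 200, get_config(session, config_id)
    except tuple(STATUS_FOR_EXCEPTION) as exc:
        return STATUS_FOR_EXCEPTION[type(exc)], str(exc)
```

Because `get_config` depends only on a session-like object, a CLI tool or batch script could call it directly and handle `ConfigNotFoundError` in whatever way suits that context.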

Public service modules

Internal helper modules

The following modules are internal helpers that implement specific phases of each service. They are documented here for completeness but are not intended to be called directly by routers or external code.

Annotations service helpers

Embeddings service helpers

Embedding-config request-body validation helpers for embeddings_service.

The orchestrator validate_embedding_config_body lives here, split into per-field-group helpers so that neither it nor any helper exceeds the §3 method-LOC ceiling and embeddings_service.py can shrink toward the file-LOC budget.

The function is re-exported from protea.services.embeddings_service for backwards compatibility with router and CLI callers.

Validation uses manual isinstance checks rather than Pydantic models, which preserves the exact response payload shape and message wording that the existing tests assert on.

protea.services._embeddings_validation_helpers.validate_embedding_config_body(body: dict[str, Any]) → dict[str, Any]

Validate a request body for POST /embeddings/configs.

Returns the canonicalised dict (defaults filled in) on success. Raises InvalidEmbeddingConfigError (imported lazily to avoid the circular dep with embeddings_service) with the full list of failures otherwise; the router translates that to a 422 with the same shape it produced before extraction.

Decomposed into per-field-group helpers so neither this orchestrator nor any helper breaches the 60-LOC method ceiling.
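A minimal sketch of this collect-all-failures pattern follows. It is not the real implementation: the field names `model_name` and `max_length`, the default of 1024, the error wording, and both helper functions are assumptions made for illustration, and `InvalidEmbeddingConfigError` is redefined locally as a stand-in for the real exception.

```python
from typing import Any


class InvalidEmbeddingConfigError(Exception):
    """Local stand-in for the real exception; carries every failure found."""

    def __init__(self, errors: list[str]) -> None:
        super().__init__("; ".join(errors))
        self.errors = errors


def _check_model_fields(body: dict[str, Any], errors: list[str]) -> dict[str, Any]:
    """Hypothetical field-group helper: validates 'model_name'."""
    name = body.get("model_name")
    if not isinstance(name, str) or not name:
        errors.append("model_name must be a non-empty string")
    return {"model_name": name}


def _check_limit_fields(body: dict[str, Any], errors: list[str]) -> dict[str, Any]:
    """Hypothetical field-group helper: validates 'max_length' (assumed default 1024)."""
    max_length = body.get("max_length", 1024)
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(max_length, bool) or not isinstance(max_length, int) or max_length <= 0:
        errors.append("max_length must be a positive integer")
    return {"max_length": max_length}


def validate_body_sketch(body: dict[str, Any]) -> dict[str, Any]:
    """Orchestrator: run every helper, then raise once with all failures."""
    errors: list[str] = []
    canonical: dict[str, Any] = {}
    for helper in (_check_model_fields, _check_limit_fields):
        canonical.update(helper(body, errors))
    if errors:
        raise InvalidEmbeddingConfigError(errors)
    return canonical
```

Each helper appends to a shared error list instead of raising, so the caller receives the full list of failures in one response while the orchestrator stays a short loop.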

Scoring service helpers

Training-data row formatting for scoring_service.iter_training_data.

Lifts the 30-column TSV row builder out of the streaming generator so the orchestrator stays under the §3 method-LOC ceiling.

protea.services._scoring_training_helpers.format_training_row(pred: Any, go_id: str, aspect: str | None, label: int) → str

Render one labeled training-data TSV row (no trailing newline).

Column order matches TRAINING_TSV_COLUMNS. label is binary (0 / 1). Uses a lazy import of _format_optional from _scoring_streaming_helpers (where it canonically lives) to avoid a circular dependency with the re-exporting module.
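The row-building idea can be sketched on a much smaller scale. The example below uses a hypothetical five-column layout rather than the real 30-column TRAINING_TSV_COLUMNS, assumes prediction attributes named `protein_id` and `distance`, and assumes that the optional-value formatter renders None as an empty cell. None of these details are confirmed by the source.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Hypothetical column subset; the real TRAINING_TSV_COLUMNS has 30 entries.
COLUMNS_SKETCH = ("protein_id", "go_id", "aspect", "distance", "label")


def _format_optional_sketch(value: Any) -> str:
    """Assumed behaviour: None becomes an empty cell, anything else str()."""
    return "" if value is None else str(value)


def format_row_sketch(pred: Any, go_id: str, aspect: Optional[str], label: int) -> str:
    """Render one TSV row in COLUMNS_SKETCH order, without a trailing newline."""
    cells = [
        _format_optional_sketch(getattr(pred, "protein_id", None)),
        go_id,
        _format_optional_sketch(aspect),
        _format_optional_sketch(getattr(pred, "distance", None)),
        str(label),
    ]
    return "\t".join(cells)


# Usage: the caller (a streaming generator, for instance) adds the newline.
example_pred = SimpleNamespace(protein_id="P12345", distance=0.25)
example_row = format_row_sketch(example_pred, "GO:0008150", None, 1)
```

Leaving the newline to the caller keeps the formatter a pure function from inputs to a single string, so it can be tested without any streaming machinery.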

See also

  • HTTP API: routers that call into these service modules.

  • Infrastructure: ORM models and session utilities used by services.