HTTP API¶
The PROTEA HTTP API is a FastAPI application that exposes six routers.
All state mutations flow through this layer: it writes Job rows to
PostgreSQL and publishes messages to RabbitMQ. The API is stateless between
requests — the session factory and AMQP URL are injected via app.state
at startup, keeping every router free of global state and infrastructure
imports.
All endpoints return JSON. Error responses follow FastAPI’s default
{"detail": "..."} format. Timestamps are ISO 8601 UTC strings.
UUID identifiers are lowercase hyphenated strings.
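These conventions can be checked mechanically on any response. A minimal sketch, assuming a job dict with id and created_at fields (the field names are illustrative, not part of the documented schema):

```python
from datetime import datetime
from uuid import UUID


def check_response_conventions(job: dict) -> bool:
    """Check the documented conventions: an ISO 8601 UTC timestamp
    and a lowercase hyphenated UUID identifier."""
    # fromisoformat() on older Pythons does not accept a trailing "Z"
    ts = datetime.fromisoformat(job["created_at"].replace("Z", "+00:00"))
    uid = job["id"]
    return (
        ts.tzinfo is not None
        and ts.utcoffset().total_seconds() == 0  # UTC offset of zero
        and uid == str(UUID(uid))  # str(UUID) is lowercase hyphenated
    )
```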
Application factory¶
protea.api.app creates the FastAPI application, registers all routers,
and wires the session factory and AMQP URL into app.state at startup.
It also configures CORS and mounts any static middleware.
- protea.api.app.create_app(project_root: Path | None = None) FastAPI¶
Jobs router¶
The /jobs router is the primary interface for job lifecycle management.
Jobs are created by POST /jobs with an operation name, a
queue_name, and an optional JSON payload. The API creates a Job
row in QUEUED status, commits, then publishes the UUID to RabbitMQ —
in that order, so workers always find the row before they try to claim it.
Job status and the structured event timeline can be polled via
GET /jobs/{id} and GET /jobs/{id}/events respectively. The frontend
uses 2-second polling on the events endpoint to render a live progress
timeline.
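The frontend's polling loop can be sketched as a small helper. The fetch and is_done callables stand in for real HTTP calls to GET /jobs/{id}/events and GET /jobs/{id}; how they are wrapped is an assumption here, not part of the API:

```python
import time
from typing import Any, Callable


def poll_job_events(
    fetch: Callable[[], list[dict[str, Any]]],
    is_done: Callable[[], bool],
    interval: float = 2.0,
    max_polls: int = 100,
) -> list[dict[str, Any]]:
    """Poll the events endpoint (via the injected `fetch` callable)
    every `interval` seconds until `is_done` reports a terminal
    job status, returning the last event list seen."""
    events: list[dict[str, Any]] = []
    for _ in range(max_polls):
        events = fetch()
        if is_done():
            break
        time.sleep(interval)
    return events
```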
- class protea.api.routers.jobs.CreateJobRequest(*, operation: Annotated[str, MinLen(min_length=1)], queue_name: Annotated[str, MinLen(min_length=1)], payload: dict[str, Any] = <factory>, meta: dict[str, Any] = <factory>)¶
Bases: BaseModel
- meta: dict[str, Any]¶
- model_config = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- operation: str¶
- payload: dict[str, Any]¶
- queue_name: str¶
- classmethod strip_and_require(v: str) str¶
- protea.api.routers.jobs.cancel_job(job_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Mark a job (and any queued child jobs) as CANCELLED.
Already-finished jobs (SUCCEEDED/FAILED) are returned as-is with no state change. Note: workers processing a batch mid-flight will complete their current message before stopping.
- protea.api.routers.jobs.create_job(body: CreateJobRequest, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Create a Job row and publish its ID to the specified RabbitMQ queue.
The job transitions QUEUED → RUNNING → SUCCEEDED/FAILED as the worker processes it. Use GET /jobs/{id}/events to poll structured progress events in real time.
- protea.api.routers.jobs.delete_job(job_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Permanently delete a job and its event log. Running jobs cannot be deleted (409).
- protea.api.routers.jobs.get_amqp_url(request: Request) str¶
- protea.api.routers.jobs.get_job(job_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve full details for a single job including its payload, meta, and progress counters.
- protea.api.routers.jobs.get_job_events(job_id: UUID, limit: int = Query(200), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
Return the structured event log for a job (newest first).
Events include progress milestones, warnings, HTTP retries, and errors. Useful for monitoring long-running operations such as compute_embeddings or predict_go_terms.
- protea.api.routers.jobs.get_session_factory(request: Request) sessionmaker[Session]¶
- protea.api.routers.jobs.list_jobs(status: str | None = Query(None), operation: str | None = Query(None), include_children: bool = Query(False), parent_job_id: UUID | None = Query(None), limit: int = Query(50), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List jobs with optional filtering.
By default only top-level jobs (no parent) are returned. Set include_children=true or filter by parent_job_id to see batch sub-jobs from distributed pipelines.
Proteins router¶
The /proteins router provides read access to the protein and sequence
catalogue. Proteins are not created directly through this router — they are
inserted asynchronously by the insert_proteins operation. The router
exposes list and detail endpoints with filtering by organism and review
status.
- protea.api.routers.proteins.get_protein(accession: str, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Full details for one protein: core fields, UniProt functional metadata, embedding count, GO annotation count, and accessions of known isoforms (if canonical).
- protea.api.routers.proteins.get_protein_annotations(accession: str, annotation_set_id: str | None = Query(None), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
Return all GO term annotations for a protein, joined with term details and annotation set source. Optionally filter to a specific annotation set by UUID.
- protea.api.routers.proteins.get_protein_stats(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Return aggregate counts: total proteins, canonical vs isoforms, reviewed, and how many have metadata, embeddings, or GO annotations.
- protea.api.routers.proteins.get_session_factory(request: Request) sessionmaker[Session]¶
- protea.api.routers.proteins.list_proteins(search: str | None = Query(None), reviewed: bool | None = Query(None), canonical_only: bool = Query(True), limit: int = Query(50), offset: int = Query(0), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Paginated protein listing with optional full-text search across accession, entry name, gene name, and organism.
Annotations router¶
The /annotations router exposes the GO ontology and annotation set data.
It provides:
- Ontology snapshot listing and detail, including GO term counts per aspect.
- Annotation set listing and detail.
- A BFS ancestor subgraph endpoint (GET /annotations/snapshots/{id}/subgraph) that returns the ancestor closure for a given set of GO term IDs within a snapshot. Used by the frontend to render the GO hierarchy for a prediction result.
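The ancestor-closure computation behind the subgraph endpoint can be sketched as a plain BFS over a parent map; the shape of the parents dict is an assumption for illustration:

```python
from collections import deque


def ancestor_subgraph(parents: dict[str, set[str]], seeds: set[str], depth: int) -> set[str]:
    """BFS up the GO DAG from `seeds`, collecting ancestors up to
    `depth` levels away. `parents` maps each GO ID to its direct
    is_a/part_of parents."""
    visited = set(seeds)
    frontier = deque((go_id, 0) for go_id in seeds)
    while frontier:
        go_id, d = frontier.popleft()
        if d >= depth:
            continue  # do not expand past the depth limit
        for parent in parents.get(go_id, ()):
            if parent not in visited:
                visited.add(parent)
                frontier.append((parent, d + 1))
    return visited
```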
- protea.api.routers.annotations.delete_annotation_set(set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Delete an annotation set and all its annotations. Returns 409 if referenced by a prediction set.
- protea.api.routers.annotations.download_delta_fasta(eval_id: UUID, category: str = Query(all), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Download the amino-acid sequences of delta proteins (NK and/or LK) as FASTA.
Only proteins whose sequence is already stored in the database are included. Header format:
>ACCESSION entry_name OS=organism OX=taxonomy_id (NK|LK)
- protea.api.routers.annotations.download_evaluation_artifacts(eval_id: UUID, result_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), artifacts_dir: Path = Depends(dependency=<function get_artifacts_dir>, use_cache=True, scope=None)) StreamingResponse¶
- protea.api.routers.annotations.download_evaluation_metrics(eval_id: UUID, result_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
- protea.api.routers.annotations.download_gt_lk(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Download Limited-Knowledge ground truth: delta proteins with prior experimental annotations. Format: protein_accession\tgo_id (no header, 2 columns).
- protea.api.routers.annotations.download_gt_nk(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Download No-Knowledge ground truth: delta proteins with zero prior experimental annotations. Format: protein_accession\tgo_id (no header, 2 columns).
- protea.api.routers.annotations.download_gt_pk(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Download Partial-Knowledge ground truth: proteins that gained new terms in a namespace where they already had experimental annotations at t0. Use together with known-terms.tsv passed as -known to the CAFA evaluator. Format: protein_accession\tgo_id (no header, 2 columns).
- protea.api.routers.annotations.download_known_terms(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Download ALL experimental annotations from the OLD set (not delta-filtered). Format: protein_accession\tgo_id (no header, 2 columns). Pass this as -known to the CAFA evaluator to enable PK scoring.
- protea.api.routers.annotations.generate_evaluation_set(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a job that computes the CAFA delta between two annotation sets.
Applies experimental evidence filtering, NOT-qualifier propagation through the GO DAG, and classifies delta proteins into NK/LK. Stats are stored in a new EvaluationSet row; ground-truth TSVs are streamed on demand.
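The NK/LK split described above can be sketched as follows. This is a simplified illustration: it ignores the experimental-evidence filtering and NOT-qualifier propagation the real job performs:

```python
def classify_delta_proteins(
    delta: dict[str, set[str]],               # accession -> newly gained GO terms
    prior_experimental: dict[str, set[str]],  # accession -> experimental terms at t0
) -> dict[str, str]:
    """Classify delta proteins: NK (No-Knowledge) if the protein had
    zero prior experimental annotations at t0, LK (Limited-Knowledge)
    otherwise."""
    return {
        acc: "NK" if not prior_experimental.get(acc) else "LK"
        for acc in delta
    }
```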
- protea.api.routers.annotations.get_amqp_url(request: Request) str¶
- protea.api.routers.annotations.get_annotation_set(set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve a single annotation set with its total annotation count.
- protea.api.routers.annotations.get_artifacts_dir(request: Request) Path¶
- protea.api.routers.annotations.get_evaluation_set(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
- protea.api.routers.annotations.get_go_subgraph(snapshot_id: UUID, go_ids: str = Query(PydanticUndefined), depth: int = Query(3), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Return a subgraph of the GO DAG containing the requested terms and their ancestors up to depth levels.
- protea.api.routers.annotations.get_session_factory(request: Request) sessionmaker[Session]¶
- protea.api.routers.annotations.get_snapshot(snapshot_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve a single ontology snapshot with its GO term count.
- protea.api.routers.annotations.list_annotation_sets(source: str | None = Query(None), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List annotation sets with their annotation counts, newest first. Optionally filter by source.
- protea.api.routers.annotations.list_evaluation_results(eval_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
- protea.api.routers.annotations.list_evaluation_sets(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List all evaluation sets, newest first.
- protea.api.routers.annotations.list_snapshots(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List all loaded GO ontology snapshots with their GO term counts, newest first.
- protea.api.routers.annotations.load_goa_annotations(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a load_goa_annotations job that streams a GAF file (gzip or plain) and upserts GO annotations into an AnnotationSet. Only proteins already in the DB are annotated.
- protea.api.routers.annotations.load_ontology_snapshot(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a load_ontology_snapshot job that downloads and parses a GO OBO file.
The job is idempotent by obo_version: if the snapshot already exists with relationships it will be skipped; if relationships are missing they will be backfilled.
- protea.api.routers.annotations.load_quickgo_annotations(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a load_quickgo_annotations job that streams GO annotations from the QuickGO bulk download API with optional taxon, aspect, and evidence code filtering.
- protea.api.routers.annotations.run_cafa_evaluation(eval_id: UUID, body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None), artifacts_dir: Path = Depends(dependency=<function get_artifacts_dir>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a job that runs the CAFA evaluator (NK / LK / PK) for a prediction set.
Body must contain prediction_set_id (required) and optionally max_distance (float).
Embeddings router¶
The /embeddings router manages embedding configurations and prediction
sets. Embedding configurations are immutable recipes: once created, they
can be referenced by any number of embedding computation and prediction
jobs. Creating a new configuration with different parameters produces a
new UUID, preserving reproducibility.
Prediction sets are created by submitting a predict_go_terms job and
are queryable once the job completes. The
GET /embeddings/prediction-sets/{id}/predictions.tsv endpoint streams
prediction results as a tab-separated file using StreamingResponse with
yield_per(1000), avoiding loading the full result set into memory.
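The streaming pattern can be sketched without the SQLAlchemy machinery. In the real endpoint the rows iterable would be a query result iterated with yield_per(1000); here it is any iterable of tuples:

```python
from typing import Any, Iterable, Iterator


def stream_predictions_tsv(rows: Iterable[tuple[Any, ...]], header: list[str]) -> Iterator[str]:
    """Generator suitable for a StreamingResponse body: yields one
    TSV line at a time so the full result set is never materialised
    in memory."""
    yield "\t".join(header) + "\n"
    for row in rows:
        yield "\t".join(str(v) for v in row) + "\n"
```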
- protea.api.routers.embeddings.create_embedding_config(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Create a new EmbeddingConfig that defines the model, layer selection, pooling strategy, and chunking.
This config is referenced by compute_embeddings jobs and predict_go_terms jobs to ensure query and reference embeddings were produced under identical settings.
- protea.api.routers.embeddings.delete_embedding_config(config_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Delete an EmbeddingConfig and cascade-delete all linked embeddings, prediction sets, and predictions.
- protea.api.routers.embeddings.delete_prediction_set(set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Delete a prediction set and all its GOPrediction rows.
- protea.api.routers.embeddings.download_predictions_cafa(set_id: UUID, eval_id: UUID | None = Query(None), aspect: str | None = Query(None), max_distance: float | None = Query(None), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Stream predictions in CAFA format: protein_accession\tgo_id\tscore. The score is computed as max(0.0, 1.0 - distance) so that closer neighbours receive higher confidence scores in the [0, 1] range expected by the CAFA evaluator. One row per (protein, GO term) pair; duplicate GO terms for the same protein are deduplicated, keeping the highest score (lowest distance). Pass eval_id to restrict output to delta proteins only (NK + LK targets), which is required for a valid CAFA evaluation.
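The score and deduplication rules can be sketched as a pure function; the triple layout is an assumption for illustration:

```python
def cafa_rows(predictions: list[tuple[str, str, float]]) -> list[tuple[str, str, float]]:
    """Convert (accession, go_id, distance) triples to CAFA rows with
    score = max(0.0, 1.0 - distance), deduplicating each
    (protein, GO term) pair by keeping the highest score
    (i.e. the lowest distance)."""
    best: dict[tuple[str, str], float] = {}
    for acc, go_id, distance in predictions:
        score = max(0.0, 1.0 - distance)
        key = (acc, go_id)
        if score > best.get(key, -1.0):
            best[key] = score
    return [(acc, go_id, score) for (acc, go_id), score in best.items()]
```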
- protea.api.routers.embeddings.download_predictions_tsv(set_id: UUID, accession: str | None = Query(None), aspect: str | None = Query(None), max_distance: float | None = Query(None), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) StreamingResponse¶
Stream all GO predictions for a prediction set as a tab-separated file.
Each row is one (protein, GO term, reference protein) triple. Columns include embedding distance, GO term metadata, annotation fields, and optional alignment and taxonomy features (columns are present but empty when not computed).
Optional filters: accession, aspect (F/P/C), max_distance. The response streams rows directly from the database, suitable for large prediction sets without loading everything into memory.
- protea.api.routers.embeddings.get_amqp_url(request: Request) str¶
- protea.api.routers.embeddings.get_embedding_config(config_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve a single EmbeddingConfig with its total stored embedding count.
- protea.api.routers.embeddings.get_go_term_distribution(set_id: UUID, limit: int = 50, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Return the most frequently predicted GO terms grouped by aspect (F/P/C) and the total prediction counts per aspect.
- protea.api.routers.embeddings.get_prediction_set(set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve a prediction set with total prediction count and per-protein GO term counts.
- protea.api.routers.embeddings.get_protein_predictions(set_id: UUID, accession: str, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
Return all predicted GO terms for a protein in a prediction set, sorted by distance (nearest first). Includes GO term details plus optional alignment (NW/SW) and taxonomy fields when computed.
- protea.api.routers.embeddings.get_session_factory(request: Request) sessionmaker[Session]¶
- protea.api.routers.embeddings.list_embedding_configs(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List all embedding configurations with their stored embedding counts, newest first.
- protea.api.routers.embeddings.list_prediction_set_proteins(set_id: UUID, search: str | None = None, limit: int = 50, offset: int = 0, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Paginated list of proteins in a prediction set with their predicted GO count, minimum distance, known annotation count, and how many predictions match known annotations (precision proxy).
- protea.api.routers.embeddings.list_prediction_sets(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List the 100 most recent prediction sets with their GO prediction counts.
- protea.api.routers.embeddings.predict_go_terms(body: dict[str, Any], factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None), amqp_url: str = Depends(dependency=<function get_amqp_url>, use_cache=True, scope=None)) dict[str, Any]¶
Queue a predict_go_terms job that runs KNN-based GO term transfer.
The coordinator partitions query proteins into batches, each dispatched to protea.predictions.batch workers for KNN search (numpy or FAISS) + GO annotation transfer. Results are written to a new PredictionSet via protea.predictions.write workers.
Required body fields: embedding_config_id, annotation_set_id, ontology_snapshot_id. Optional: query_set_id (FASTA upload), limit_per_entry, distance_threshold, batch_size, search_backend, compute_alignments, compute_taxonomy.
Query sets router¶
The /query-sets router handles user-uploaded FASTA files. On
POST /query-sets, the server parses the multipart upload, creates a
QuerySet row, upserts one Sequence row per unique amino-acid string
(deduplicating by MD5 hash), and creates QuerySetEntry rows preserving
the original FASTA headers. The returned query set ID can then be referenced
in compute_embeddings and predict_go_terms job payloads.
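The upload-side deduplication can be sketched as follows; this illustrates the MD5-dedup and duplicate-accession behaviour described above, not the endpoint's actual implementation:

```python
import hashlib


def parse_fasta_dedup(text: str) -> tuple[dict[str, str], dict[str, str]]:
    """Parse FASTA text, returning (entries, sequences): `entries` maps
    each header accession to a sequence MD5, `sequences` maps each MD5
    to its unique amino-acid string. Duplicate accessions raise
    ValueError (the API answers 422)."""
    entries: dict[str, str] = {}
    sequences: dict[str, str] = {}
    accession, chunks = None, []

    def flush():
        # Commit the record accumulated so far, deduplicating by MD5
        if accession is None:
            return
        seq = "".join(chunks).upper()
        digest = hashlib.md5(seq.encode()).hexdigest()
        sequences.setdefault(digest, seq)
        entries[accession] = digest

    for line in text.splitlines():
        if line.startswith(">"):
            flush()
            accession = line[1:].split()[0]
            if accession in entries:
                raise ValueError(f"duplicate accession: {accession}")
            chunks = []
        else:
            chunks.append(line.strip())
    flush()
    return entries, sequences
```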
- async protea.api.routers.query_sets.create_query_set(file: UploadFile, name: str = Form(PydanticUndefined), description: str | None = Form(None), factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Upload a FASTA file and create a QuerySet.
Each sequence in the FASTA is stored (or reused if already present) in the sequence table. A query_set_entry row is created per sequence, preserving the original FASTA accession. Duplicate accessions within the same upload are rejected with 422.
- protea.api.routers.query_sets.delete_query_set(query_set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Delete a query set and all its entries. Sequences are not deleted (they may be shared).
- protea.api.routers.query_sets.get_query_set(query_set_id: UUID, factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) dict[str, Any]¶
Retrieve a query set with its full entry list (accessions and sequence IDs).
- protea.api.routers.query_sets.get_session_factory(request: Request) sessionmaker[Session]¶
- protea.api.routers.query_sets.list_query_sets(factory: sessionmaker[Session] = Depends(dependency=<function get_session_factory>, use_cache=True, scope=None)) list[dict[str, Any]]¶
List all uploaded FASTA query sets with their entry counts, newest first.
Endpoints summary¶

| Method | Path | Description |
|---|---|---|
| POST | /jobs | Create a job and publish its UUID to RabbitMQ. |
| GET | /jobs | List jobs; filter by status, operation, and parent_job_id. |
| GET | /jobs/{id} | Retrieve a single job with full payload and meta. |
| GET | /jobs/{id}/events | Retrieve the event timeline for a job (up to 2 000 events). |
| POST | /jobs/{id}/cancel | Transition a QUEUED or RUNNING job to CANCELLED. |
| DELETE | /jobs/{id} | Delete a job that is not in RUNNING status. |
| GET | /proteins | List proteins with pagination; filter by search, reviewed, and canonical_only. |
| GET | /proteins/{accession} | Retrieve a single protein with its UniProt metadata. |
| GET | /annotations/snapshots | List ontology snapshots with GO term counts per aspect. |
| GET | /annotations/snapshots/{id} | Retrieve a snapshot with its full list of GO terms. |
| GET | /annotations/snapshots/{id}/subgraph | BFS ancestor subgraph for a given set of GO term IDs. |
| GET | /annotations/sets | List annotation sets with protein GO annotation counts. |
| GET | /annotations/sets/{id} | Retrieve a single annotation set with summary statistics. |
| GET | /embeddings/configs | List all embedding configurations. |
| POST | /embeddings/configs | Create a new (immutable) embedding configuration. |
| GET | /embeddings/configs/{id} | Retrieve an embedding configuration by UUID. |
| GET | /embeddings/prediction-sets | List prediction sets with entry counts. |
| GET | /embeddings/prediction-sets/{id} | Retrieve a prediction set with summary statistics. |
| GET | /embeddings/prediction-sets/{id}/predictions | List GO predictions for a set (paginated JSON). |
| GET | /embeddings/prediction-sets/{id}/predictions.tsv | Stream all predictions as a TSV file (27 columns, filtered by accession / aspect / distance). |
| POST | /query-sets | Upload a FASTA file and create a QuerySet. |
| GET | /query-sets | List all query sets with entry counts. |
| GET | /query-sets/{id} | Retrieve a query set with its full entry list. |
| DELETE | /query-sets/{id} | Delete a query set and all its entries. |
Request body for POST /jobs¶
The operation and queue_name fields are required. payload is
passed verbatim to the operation’s execute method after Pydantic
validation; its schema depends on the operation. meta is stored on
the Job row and never interpreted by the API.
{
"operation": "insert_proteins",
"queue_name": "protea.jobs",
"payload": {
"search_criteria": "reviewed:true AND organism_id:9606"
},
"meta": {}
}
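The same validation that CreateJobRequest applies server-side (strip_and_require on operation and queue_name, empty-dict defaults for payload and meta) can be mirrored in a client before posting; a hypothetical helper, not part of the API:

```python
def validate_job_request(body: dict) -> dict:
    """Mirror the server-side CreateJobRequest validation: operation
    and queue_name must be non-empty after stripping whitespace;
    payload and meta default to empty dicts."""
    out = dict(body)
    for field in ("operation", "queue_name"):
        value = str(out.get(field, "")).strip()
        if not value:
            raise ValueError(f"{field} must be a non-empty string")
        out[field] = value
    out.setdefault("payload", {})
    out.setdefault("meta", {})
    return out
```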
Common payload examples by operation:
{ "operation": "fetch_uniprot_metadata", "queue_name": "protea.jobs",
"payload": { "accessions": ["P04637", "P53350"] } }
{ "operation": "compute_embeddings", "queue_name": "protea.embeddings",
"payload": { "embedding_config_id": "<uuid>", "batch_size": 64 } }
{ "operation": "predict_go_terms", "queue_name": "protea.jobs",
"payload": {
"embedding_config_id": "<uuid>",
"annotation_set_id": "<uuid>",
"ontology_snapshot_id": "<uuid>",
"query_set_id": "<uuid>",
"k": 5,
"compute_alignments": false,
"compute_taxonomy": false
}
}