Multimodal Architecture

This project should evolve as a shared platform plus multiple modality-specific applications, not as one giant website that tries to present tabular, text, image, and future modalities through a single product surface.

Use one repository or monorepo with a clear split between platform code and modality apps:

repo/
  packages/
    privsyn_platform/        # auth, storage, jobs, ownership, modality routing contracts
    privsyn_tabular/         # tabular library + CLI
    privsyn_text/            # text library + CLI
    privsyn_image/           # image library + CLI
  apps/
    tabular_web/            # tabular UI + API
    text_web/               # text UI + API
    image_web/              # image UI + API
    hub_web/                # optional landing page / submission gateway
  workers/
    tabular_worker/
    text_worker/
    image_worker/
  deploy/
    shared/
    tabular/
    text/
    image/

If you prefer multiple repositories instead of a monorepo, keep the same conceptual split:

  • one shared privsyn_platform package
  • one app/service per modality
  • one worker/runtime environment per modality

What Should Be Shared

Keep these in privsyn_platform:

  • authentication and SSO integration
  • job creation, polling, cancellation, and ownership rules
  • durable metadata and object storage
  • deployment-facing settings and runner contracts
  • modality routing contracts and submission metadata
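To make the shared job-lifecycle and ownership responsibilities concrete, here is a minimal in-memory sketch. The class name, fields, and methods are illustrative assumptions, not the actual privsyn_platform API; a real store would be backed by durable metadata storage.

```python
# Illustrative in-memory sketch of platform-level job ownership;
# names and fields are assumptions, not the privsyn_platform API.
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    owner: str      # ownership is enforced in the platform layer
    modality: str   # "tabular", "text", "image", ...
    status: str     # "queued" | "running" | "done" | "cancelled"


class InMemoryJobStore:
    """Minimal job creation / polling / cancellation with ownership rules."""

    def __init__(self):
        self._jobs = {}

    def create(self, owner, modality):
        job = Job(job_id=str(len(self._jobs) + 1),
                  owner=owner, modality=modality, status="queued")
        self._jobs[job.job_id] = job
        return job

    def poll(self, job_id):
        return self._jobs[job_id]

    def cancel(self, job_id, requester):
        job = self._jobs[job_id]
        # Ownership rule: only the job's owner may cancel it.
        if requester != job.owner:
            raise PermissionError("only the job owner may cancel")
        job.status = "cancelled"
        return job
```

Because every modality app talks to the same store, job status and ownership behave identically whether the job came from the tabular, text, or image product.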

What Should Stay Modality-Specific

Do not force these into the shared layer:

  • model loading and inference code
  • GPU/runtime dependencies
  • prompt or dataset schemas
  • evaluation metrics
  • frontend workflows and result presentation

Tabular, text, and image products can share a login shell and a job-history page, but the generation forms and result views should remain separate.

UI Recommendation

Do not put every modality into one dense workflow page.

Prefer:

  • tabular.example.edu
  • text.example.edu
  • image.example.edu

Optionally add a thin hub such as studio.example.edu that:

  • lets the user choose a modality
  • shows recent jobs across apps
  • routes to the correct product

Automatic Modality Detection

Automatic detection is useful as a fallback, but it should not be the only control path.

Recommended rule:

  1. If the caller explicitly declares a modality, trust it.
  2. Otherwise infer modality from file type, content type, or request schema.
  3. If inputs span multiple modalities, classify the request as multimodal.
  4. If inference is ambiguous, return unknown and ask the caller or UI to choose.
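The four-step rule above can be sketched as a small function. The enum values and the extension map below are assumptions for illustration, not the actual privsyn_platform.modality contract.

```python
# Sketch of the detection rule; enum values and the extension map
# are assumptions, not the actual privsyn_platform.modality API.
from enum import Enum


class Modality(Enum):
    TABULAR = "tabular"
    TEXT = "text"
    IMAGE = "image"
    MULTIMODAL = "multimodal"
    UNKNOWN = "unknown"


# Assumed extension-to-modality map; extend as formats are added.
_BY_EXTENSION = {
    "csv": Modality.TABULAR,
    "parquet": Modality.TABULAR,
    "txt": Modality.TEXT,
    "jsonl": Modality.TEXT,
    "png": Modality.IMAGE,
    "jpg": Modality.IMAGE,
}


def detect_modality(filenames, declared=None):
    # 1. If the caller explicitly declares a modality, trust it.
    if declared is not None:
        return declared
    # 2. Otherwise infer per-file modalities from file type.
    inferred = set()
    for name in filenames:
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        inferred.add(_BY_EXTENSION.get(ext, Modality.UNKNOWN))
    inferred.discard(Modality.UNKNOWN)
    # 3. Inputs spanning multiple modalities are multimodal.
    if len(inferred) > 1:
        return Modality.MULTIMODAL
    # 4. Ambiguous inference: return UNKNOWN so the caller/UI chooses.
    return inferred.pop() if inferred else Modality.UNKNOWN
```

A real implementation would also consult declared content types and request schemas, not just file extensions.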

The shared package now includes a starter contract in privsyn_platform.modality with:

  • Modality
  • InputDescriptor
  • infer_input_modality(...)
  • detect_modality(...)

This is enough to support a future hub or gateway without prematurely coupling all apps together.

Gateway Pattern

If you want a single submission endpoint later, make it a router, not a monolith.

Example flow:

  1. Client submits files and optional declared modality to hub_web.
  2. hub_web uses privsyn_platform.modality.detect_modality(...).
  3. The gateway creates a platform-level job record.
  4. The request is forwarded to the correct modality app or worker backend.
  5. Status and ownership stay consistent because they all use the same platform contracts.

That keeps the user-facing entry unified while still allowing:

  • different runtime images
  • different model dependencies
  • different GPU/CPU scheduling
  • different result UIs
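The gateway flow can be sketched as a routing function. The backend URLs, the in-memory job store, and the stubbed detection helper below are placeholders; in the real flow, detection comes from privsyn_platform.modality and the job record goes to the platform store.

```python
# Sketch of the gateway's routing step; URLs, job store, and the
# detection stub are placeholders, not the real privsyn_platform code.
import uuid

# Assumed mapping from modality to the backend app's job endpoint.
BACKENDS = {
    "tabular": "https://tabular.example.edu/api/jobs",
    "text": "https://text.example.edu/api/jobs",
    "image": "https://image.example.edu/api/jobs",
}

JOBS = {}  # stand-in for the platform's durable job store


def detect_modality(filenames, declared=None):
    # Stand-in for privsyn_platform.modality.detect_modality(...).
    if declared:
        return declared
    by_ext = {"csv": "tabular", "parquet": "tabular",
              "txt": "text", "png": "image", "jpg": "image"}
    exts = {n.rsplit(".", 1)[-1].lower() for n in filenames if "." in n}
    found = {by_ext.get(e, "unknown") for e in exts}
    found.discard("unknown")
    if len(found) > 1:
        return "multimodal"
    return found.pop() if found else "unknown"


def route_submission(filenames, declared_modality=None):
    """Detect the modality, record a platform-level job, and return
    the backend endpoint the request should be forwarded to."""
    modality = detect_modality(filenames, declared=declared_modality)
    if modality not in BACKENDS:
        raise ValueError(f"cannot route modality: {modality!r}")
    # Create the platform-level job record first, so status and
    # ownership stay consistent across all modality apps.
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"modality": modality, "status": "queued"}
    # The real gateway would now forward the request body to this URL.
    return job_id, BACKENDS[modality]
```

Note that the gateway never loads models or touches GPU dependencies; it only detects, records, and forwards, which is what keeps it a router rather than a monolith.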

Practical Next Step

When a new modality project starts:

  1. Build its library/CLI first.
  2. Reuse privsyn_platform for auth, jobs, storage, and ownership.
  3. Create a separate web app only if that modality needs one.
  4. Add the modality to a hub only after the standalone app works well.