Multimodal Architecture¶
This project should evolve as a shared platform plus multiple modality-specific applications, not as one giant website that tries to present tabular, text, image, and future modalities through a single product surface.
Recommended Shape¶
Use one repository or monorepo with a clear split between platform code and modality apps:
```
repo/
  packages/
    privsyn_platform/   # auth, storage, jobs, ownership, modality routing contracts
    privsyn_tabular/    # tabular library + CLI
    privsyn_text/       # text library + CLI
    privsyn_image/      # image library + CLI
  apps/
    tabular_web/        # tabular UI + API
    text_web/           # text UI + API
    image_web/          # image UI + API
    hub_web/            # optional landing page / submission gateway
  workers/
    tabular_worker/
    text_worker/
    image_worker/
  deploy/
    shared/
    tabular/
    text/
    image/
```
If you prefer multiple repositories instead of a monorepo, keep the same conceptual split:
- one shared `privsyn_platform` package
- one app/service per modality
- one worker/runtime environment per modality
What Should Be Shared¶
Keep these in privsyn_platform:
- authentication and SSO integration
- job creation, polling, cancellation, and ownership rules
- durable metadata and object storage
- deployment-facing settings and runner contracts
- modality routing contracts and submission metadata
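To make the ownership and job rules concrete, here is a minimal sketch of what a platform-level job record and cancellation rule could look like. The field names and `can_cancel` helper are illustrative assumptions, not the actual `privsyn_platform` schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid

# Hypothetical shape of a platform-level job record shared by all
# modality apps; field names are illustrative, not the real schema.
@dataclass
class JobRecord:
    owner: str                 # resolved from auth/SSO
    modality: str              # "tabular", "text", "image", ...
    status: str = "queued"     # queued -> running -> done/failed/cancelled
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

def can_cancel(job: JobRecord, caller: str) -> bool:
    """Ownership rule: only the owner may cancel, and only while
    the job is still queued or running."""
    return caller == job.owner and job.status in {"queued", "running"}
```

Because every modality app reuses the same record and rules, job history and ownership behave identically regardless of which backend ran the job.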
What Should Stay Modality-Specific¶
Do not force these into the shared layer:
- model loading and inference code
- GPU/runtime dependencies
- prompt or dataset schemas
- evaluation metrics
- frontend workflows and result presentation
Tabular, text, and image products can share a login shell and a job-history page, but the generation forms and result views should remain separate.
UI Recommendation¶
Do not put every modality into one dense workflow page.
Prefer:
- tabular.example.edu
- text.example.edu
- image.example.edu
Optionally add a thin hub such as studio.example.edu that:
- lets the user choose a modality
- shows recent jobs across apps
- routes to the correct product
Automatic Modality Detection¶
Automatic detection is useful as a fallback, but it should not be the only control path.
Recommended rule:
- If the caller explicitly declares a modality, trust it.
- Otherwise infer modality from file type, content type, or request schema.
- If inputs span multiple modalities, classify the request as `multimodal`.
- If inference is ambiguous, return `unknown` and ask the caller or UI to choose.
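The rules above can be sketched as a single function. This is not the shipped `detect_modality(...)`; the extension map and signature are assumptions for illustration, and a real implementation would also consider content types and request schemas.

```python
from pathlib import Path

# Hypothetical extension-to-modality map; a real detector would also
# inspect content types and request schemas, not just file suffixes.
_EXTENSION_MODALITY = {
    ".csv": "tabular", ".parquet": "tabular",
    ".txt": "text", ".jsonl": "text",
    ".png": "image", ".jpg": "image", ".jpeg": "image",
}

def detect_modality(filenames, declared=None):
    """Trust an explicitly declared modality; otherwise infer from
    file extensions. Mixed inputs are 'multimodal'; unrecognized
    inputs are 'unknown' so the caller or UI can choose."""
    if declared:
        return declared
    inferred = {
        _EXTENSION_MODALITY.get(Path(name).suffix.lower())
        for name in filenames
    }
    inferred.discard(None)  # drop unrecognized extensions
    if not inferred:
        return "unknown"
    if len(inferred) > 1:
        return "multimodal"
    return inferred.pop()
```

For example, `detect_modality(["a.csv", "b.png"])` yields `multimodal`, while a declared modality always wins over inference.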
The shared package now includes a starter contract in `privsyn_platform.modality` with:

- `ModalityInputDescriptor`
- `infer_input_modality(...)`
- `detect_modality(...)`
This is enough to support a future hub or gateway without prematurely coupling all apps together.
Gateway Pattern¶
If you want a single submission endpoint later, make it a router, not a monolith.
Example flow:
- Client submits files and an optional declared modality to `hub_web`.
- `hub_web` uses `privsyn_platform.modality.detect_modality(...)`.
- The gateway creates a platform-level job record.
- The request is forwarded to the correct modality app or worker backend.
- Status and ownership stay consistent because they all use the same platform contracts.
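The routing step of this flow can be sketched as a small lookup. The backend URLs and the `route_submission` helper are placeholders, not part of the actual deployment.

```python
# Hypothetical routing table mapping a detected modality to the
# backend that should run the job; the URLs are placeholders.
BACKENDS = {
    "tabular": "http://tabular_worker:8000/jobs",
    "text": "http://text_worker:8000/jobs",
    "image": "http://image_worker:8000/jobs",
}

def route_submission(modality: str) -> str:
    """Resolve the backend for a detected modality. 'unknown' and
    'multimodal' fail loudly so the UI can ask the user to choose."""
    try:
        return BACKENDS[modality]
    except KeyError:
        raise ValueError(f"no backend registered for modality {modality!r}")
```

Keeping the gateway this thin is the point: it only detects, records, and forwards, while each backend keeps its own runtime image and dependencies.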
That keeps the user-facing entry unified while still allowing:
- different runtime images
- different model dependencies
- different GPU/CPU scheduling
- different result UIs
Practical Next Step¶
When a new modality project starts:
- Build its library/CLI first.
- Reuse `privsyn_platform` for auth, jobs, storage, and ownership.
- Create a separate web app only if that modality needs one.
- Add the modality to a hub only after the standalone app works well.