Library And CLI
In addition to the web app, this repository now exposes a lightweight layer-1 interface: an importable Python API and a command-line interface (CLI).
Python API
Use the importable API when you want to call the synthesizer from a notebook or another Python project.
import pandas as pd
from privsyn_tabular import synthesize_dataframe
df = pd.read_csv("input.csv")
result = synthesize_dataframe(
    df,
    method="privsyn",
    epsilon=1.0,
    dataset_name="toy",
)
result.synthesized_df.to_csv("synthetic.csv", index=False)
The return value includes:
- synthesized_df
- domain_data
- info_data
- config
If domain_data and info_data are not provided, the library reuses the same metadata inference helpers as the web workflow.
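To persist every part of the result from Python, much as the CLI flags `--write-domain-json` and `--write-info-json` do, a small helper along the following lines may be useful. This is a sketch, not part of the library: `save_result` and the output file names are hypothetical. It assumes the result exposes `domain_data` and `info_data` as attributes (the way `synthesized_df` is used in the example above) and that both are JSON-serializable.

```python
import json
from pathlib import Path


def save_result(result, out_dir):
    """Write a synthesis result to disk: one CSV plus two JSON metadata files.

    Hypothetical helper. Assumes `result` has `synthesized_df`, `domain_data`,
    and `info_data` attributes, and that the metadata is JSON-serializable.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # The synthetic table is written through the DataFrame's own to_csv method.
    csv_path = out / "synthetic.csv"
    result.synthesized_df.to_csv(csv_path, index=False)

    written = {"csv": csv_path}
    for name in ("domain_data", "info_data"):
        path = out / f"{name}.json"
        # default=str is a fallback for values json cannot encode natively.
        path.write_text(json.dumps(getattr(result, name), indent=2, default=str))
        written[name] = path
    return written
```

Calling `save_result(result, "temp_synthesis_output")` after `synthesize_dataframe(...)` would then leave the synthetic CSV and both metadata files in one directory, provided the assumptions above hold.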
CLI
You can run the same flow from the shell:
python -m privsyn_tabular synthesize \
    --input sample_data/adult.csv \
    --output temp_synthesis_output/adult_synth.csv \
    --method privsyn \
    --epsilon 1.0 \
    --write-domain-json temp_synthesis_output/domain.json \
    --write-info-json temp_synthesis_output/info.json
After a packaged or editable install, the console script is also available:
privsyn-tabular synthesize --input input.csv --output synthetic.csv
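When the CLI needs to be driven from a script, for example to sweep several epsilon values, the command can be assembled in Python and handed to `subprocess`. The sketch below uses only the flags shown above. The function names `build_synthesize_command` and `run_synthesize` are hypothetical, and it assumes the package is installed in the interpreter that runs the script.

```python
import subprocess
import sys


def build_synthesize_command(input_csv, output_csv, method="privsyn",
                             epsilon=1.0, domain_json=None, info_json=None):
    """Return the argv list for `python -m privsyn_tabular synthesize ...`."""
    cmd = [
        sys.executable, "-m", "privsyn_tabular", "synthesize",
        "--input", str(input_csv),
        "--output", str(output_csv),
        "--method", method,
        "--epsilon", str(epsilon),
    ]
    # The metadata dumps are optional, so their flags are added only on request.
    if domain_json is not None:
        cmd += ["--write-domain-json", str(domain_json)]
    if info_json is not None:
        cmd += ["--write-info-json", str(info_json)]
    return cmd


def run_synthesize(*args, **kwargs):
    """Build the command and run it; raises CalledProcessError on failure."""
    cmd = build_synthesize_command(*args, **kwargs)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.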
Installation
For local development, the simplest path is:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
python3 -m pip install -e .
That keeps the web app, library, and CLI using the same checked-out source tree.
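One quick way to confirm that the editable install resolves to the checked-out tree is to ask Python where it would import the package from. This is a minimal sketch using the standard library only; `locate_package` is a hypothetical helper name.

```python
import importlib.util


def locate_package(name="privsyn_tabular"):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The package is not importable from the active environment.
        return None
    return spec.origin


if __name__ == "__main__":
    print(locate_package())
```

With the virtual environment active, the printed path should typically point inside the repository checkout rather than into `site-packages`. A result of `None` suggests the install step was skipped or a different environment is active.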
Why This Layer Exists
The library / CLI layer is intentionally separate from the web deployment story:
- it is the easiest prototype to run on local machines, notebooks, or RC software environments,
- it keeps the synthesis code usable even when web hosting is not available,
- it provides the lowest-friction delivery layer for campus or project-team adoption.