DataGen Documentation
DataGen is a synthetic enterprise data generation platform. It procedurally builds realistic enterprise datasets that teams can use for labs, validation, demos, exports, discovery-tool testing, and integration work.
This site is organized around how people actually use the tool:
- author or choose a scenario
- generate a world
- review quality and realism
- inspect or export the output
- use the data to populate a lab or validate a downstream system
- extend the generated dataset safely through plugins
What DataGen is for
DataGen is a good fit when you need believable enterprise structure without hand-curating every user, group, system, application, policy, repository, or CMDB record.
Common use cases include:
- identity and access labs
- Active Directory and Entra validation environments
- CMDB and discovery-tool testing
- collaboration and repository-heavy tenant simulation
- operating-domain validation for ITSM, SecOps, and business operations
- temporal simulation and change-over-time testing
- export-driven integration testing
- plugin-driven realism overlays
What makes the current product notable
- Scenario-first authoring with archetypes, persona presets, overlays, JSON, and a terminal wizard
- Rich generation across identity, infrastructure, repositories, applications, policies, access evidence, CMDB, and observed views
- Built-in bundled packs for ITSM, SecOps, and BusinessOps
- Temporal simulation foundations with event and snapshot export surfaces
- Configurable realism through deviation profiles such as Clean, Realistic, and Aggressive
- Normalized export and quality validation surfaces for downstream tooling and CI
- A plugin model intentionally constrained to extending the synthetic dataset, not tailoring it for consumer-specific import contracts
Recommended path through the docs
- Read Installation.
- Run the First World workflow.
- Pick a Walkthrough that matches your target lab or validation goal.
- Use the Cmdlet Reference when you need exact command surfaces.
- Review What’s New for the latest platform changes and release notes.
- Read the Plugin Architecture guide before extending the dataset.
Documentation philosophy
This site is intentionally user-facing. It draws from the project’s architectural and milestone work, but it does not simply publish internal notes. The goal is to help operators, lab builders, SDK authors, and contributors work effectively with the current tool.
If you want to help improve the docs site or the product, see Contributing.