Catalogs and Generation

DataGen generation is catalog-backed: the module draws on curated runtime data for geography, applications, vendors, patterns, service relationships, and other realism surfaces.

Why catalogs matter

Catalogs make the generated output:

  • richer than purely heuristic generation
  • more repeatable
  • easier to curate without constantly rewriting generator code

Packaged runtime catalog

The PowerShell Gallery package ships a SQLite catalog intended for runtime use:

SyntheticEnterprise.PowerShell/
  catalogs/
    catalogs.sqlite

When you run New-SEEnterpriseWorld without -CatalogRootPath, DataGen searches the module install location, finds that bundled catalogs directory, and loads catalogs.sqlite. Gallery users do not need to download the separate catalogs.sqlite release asset for normal generation.
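The lookup described above can be checked from a session. The following is a minimal sketch, assuming the module is registered under the name SyntheticEnterprise.PowerShell and that the catalogs directory sits directly under the module base, as the package layout suggests. Neither detail is confirmed beyond that layout.

```powershell
# Assumption: the installed module name matches the package directory name.
$moduleName = 'SyntheticEnterprise.PowerShell'

# Pick the newest installed version, if the module is available at all.
$module = Get-Module -ListAvailable -Name $moduleName |
    Sort-Object Version -Descending |
    Select-Object -First 1

if ($module) {
    # Assumption: catalogs/ lives directly under the module base directory.
    $catalogDir = Join-Path $module.ModuleBase 'catalogs'
    $bundled    = Join-Path $catalogDir 'catalogs.sqlite'
    'Bundled catalog: {0} (exists: {1})' -f $bundled, (Test-Path -LiteralPath $bundled)
} else {
    Write-Warning "$moduleName was not found on PSModulePath."
}
```

If the bundled file is reported as present, a plain New-SEEnterpriseWorld call should pick it up. Passing -CatalogRootPath (presumably a directory that contains catalogs.sqlite) points generation at an external catalog instead. Any other parameters the cmdlet may require are not covered here.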

The separate release asset is useful when you want to inspect the catalog directly, compare catalog builds, feed a custom process, or pin an external catalog file independently from the installed module.
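For the "compare catalog builds" case, a byte-level comparison is often enough to tell whether two catalog files are the same build. The helper below is an illustrative sketch, not part of DataGen: Compare-CatalogBuild is a hypothetical function name, and it relies only on the built-in Get-Item and Get-FileHash cmdlets.

```powershell
# Hypothetical helper (not shipped with DataGen): compares two catalog files
# by size and SHA-256 hash.
function Compare-CatalogBuild {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    # Fail early if either file is missing.
    $ref  = Get-Item -LiteralPath $ReferencePath  -ErrorAction Stop
    $diff = Get-Item -LiteralPath $DifferencePath -ErrorAction Stop

    $refHash  = (Get-FileHash -LiteralPath $ref.FullName  -Algorithm SHA256).Hash
    $diffHash = (Get-FileHash -LiteralPath $diff.FullName -Algorithm SHA256).Hash

    [pscustomobject]@{
        ReferenceBytes  = $ref.Length
        DifferenceBytes = $diff.Length
        SizeDeltaBytes  = $diff.Length - $ref.Length
        Identical       = ($refHash -eq $diffHash)
    }
}
```

Usage, with placeholder paths: Compare-CatalogBuild -ReferencePath .\downloads\catalogs.sqlite -DifferencePath .\catalogs\catalogs.sqlite. Note that a hash mismatch only shows the files differ byte for byte; two SQLite files can hold the same rows and still hash differently, so a content-level comparison needs a SQLite client.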

Recent catalog work has focused on:

  • keeping the packaged footprint practical
  • consolidating overlapping runtime tables
  • preferring curated runtime data over build-time-only sources

Generation layers

At a high level, generation flows through these major areas:

  • organizational structure
  • geography and offices
  • identity and directory surfaces
  • applications and services
  • infrastructure and endpoints
  • repositories and collaboration
  • policies, access evidence, and observed views
  • CMDB and source-record views

Regeneration and layering

DataGen also supports selective layer addition or regeneration:

  • Add-SEIdentityLayer
  • Add-SEInfrastructureLayer
  • Add-SERepositoryLayer

These are useful when you want to enrich or reprocess part of an existing generated world without regenerating it from scratch.
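This page does not list the parameters the layer cmdlets accept, so rather than guess at a calling convention, one option is to ask the installed module directly. The sketch below uses only the built-in Get-Command cmdlet; the module name is an assumption taken from the package directory name.

```powershell
# Assumption: the installed module name matches the package directory name.
$moduleName = 'SyntheticEnterprise.PowerShell'

# List the layer cmdlets the installed module actually exports.
$layerCommands = Get-Command -Module $moduleName -Name 'Add-SE*Layer' -ErrorAction SilentlyContinue

foreach ($command in $layerCommands) {
    # Print each cmdlet's syntax to see how an existing world is supplied.
    Get-Command -Name $command.Name -Syntax
}

if (-not $layerCommands) {
    Write-Warning "No Add-SE*Layer cmdlets found; is $moduleName installed?"
}
```

Get-Help Add-SEIdentityLayer -Full gives the same information in longer form, along with any examples the module ships.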

Rebuilding the packaged database

Use New-SECatalogDatabase when you need to regenerate the packaged SQLite database from the curated source material.

Typical reasons include:

  • refreshing seeded runtime content
  • testing catalog changes locally
  • validating packaging size and runtime contents

For source builds, running scripts/build-catalog-artifact.ps1 -InstallToCatalogRoot does two things: it writes the canonical build output to artifacts/catalog/catalogs.sqlite, and it installs the working copy to catalogs/catalogs.sqlite so the module project can include it during packaging.
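After a source build, it can help to confirm that both output files exist and to check their size against your packaging budget. The helper below is a sketch under stated assumptions: Get-CatalogArtifactReport is a hypothetical function, not part of the repository, and it assumes both output paths are relative to the repository root.

```powershell
# Hypothetical helper: reports presence and size of the two catalog outputs
# named above. Assumes the paths are relative to the repository root.
function Get-CatalogArtifactReport {
    param(
        [string] $RepositoryRoot = (Get-Location).Path
    )

    $relativePaths = 'artifacts/catalog/catalogs.sqlite', 'catalogs/catalogs.sqlite'

    foreach ($relative in $relativePaths) {
        $full = Join-Path $RepositoryRoot $relative
        $item = Get-Item -LiteralPath $full -ErrorAction SilentlyContinue

        [pscustomobject]@{
            Path   = $relative
            Exists = [bool]$item
            SizeMB = if ($item) { [math]::Round($item.Length / 1MB, 2) } else { $null }
        }
    }
}
```

A typical sequence from the repository root would be to run ./scripts/build-catalog-artifact.ps1 -InstallToCatalogRoot and then Get-CatalogArtifactReport | Format-Table to see both entries side by side.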