Atlas Plan
Plans002 2026 02 20 Sync and Source Config

Phase 1: Source Config Files

  • Purpose: Create per-year YAML config files encoding all file/sheet/column/header_row variations for 2023–2026

T-001 - Create source/config/ions-2026.yaml

Create the YAML source config for the 2026 xlsx files. This is the most complete year — 2026 LAPORAN EVALUASI adds NO. KWITANSI and NAMA KELAS; DATABASE SISWA splits BULAN/INTAKE into two columns.

  • Status: completed
  • Priority: P0
  • Dependencies: none

Acceptance

  • File exists at source/config/ions-2026.yaml
  • transactions entry: path, all 12 monthly sheets (JANUARIDESEMBER), header_row: 6, column map includes NO. KWITANSIreceipt_number and NAMA KELASvariant_name
  • students entry: path, all 12 monthly sheets (JANUARI26DESEMBER26), header_row: 7, column map reflects split BULAN + INTAKE
  • targets entry: path, sheet EVALUASI DATA SISWA, header_row: 1
  • marketing_activity entry: path, all REKAP JANUARIREKAP DESEMBER sheets, header_row: 7
  • organizations entry: path DATABASE SISWA IONS 2026 TM-WLC.xlsx, sheet REFERENSI

Files

  • source/config/ions-2026.yaml

T-002 - Create source/config/ions-2025.yaml

Create the YAML source config for 2025. Shares schema with 2024 — 15-column monthly LAPORAN EVALUASI, 24-column monthly DATABASE SISWA.

  • Status: completed
  • Priority: P0
  • Dependencies: none

Acceptance

  • File exists at source/config/ions-2025.yaml
  • transactions column map: no NO. KWITANSI, no NAMA KELAS (15-col schema)
  • students column map: BULAN/INTAKE as single combined column
  • Sheet names use 25 suffix pattern: JANUARI25DESEMBER25
  • targets entry: no EVALUASI DATA SISWA sheet (not present in 2025); omit or mark absent

Files

  • source/config/ions-2025.yaml

T-003 - Create source/config/ions-2024.yaml and ions-2023.yaml

Create the YAML source configs for 2024 and 2023. 2023 has distinct differences: space in sheet names (JANUARI 23), no JENIS KELAMIN column, POLA PEMBAYARAN and CARA BAYAR in different column order, and no REFERENSI sheet.

  • Status: completed
  • Priority: P0
  • Dependencies: none

Acceptance

  • ions-2024.yaml exists with 24 suffix sheet names and 2024 column schema
  • ions-2023.yaml exists with space-separated sheet names (JANUARI 23 pattern)
  • 2023 students column map: no JENIS KELAMIN column
  • 2023 students: no REFERENSI sheet (omit organizations entry or mark absent)
  • 2023 LAPORAN EVALUASI: header_row correctly set (row 6 for monthly data sheets)

Files

  • source/config/ions-2024.yaml
  • source/config/ions-2023.yaml

Phase 2: Package Scaffold

  • Purpose: Create the @packages/sync workspace package with all config files

T-004 - Scaffold @packages/sync package

Create all configuration files for @packages/sync: package.json, tsconfig.json, eslint.config.mjs, and add to root workspace package.json and turbo pipeline.

  • Status: completed
  • Priority: P0
  • Dependencies: none

Acceptance

  • @packages/sync/package.json with name @packages/sync, scripts (build/dev/lint/lint:fix/test/test:type), deps via catalog references
  • @packages/sync/tsconfig.json extending @repo/typescript/lib
  • @packages/sync/eslint.config.mjs using base() from @repo/lint/base
  • Root package.json catalogs extended with any new deps (js-yaml, @types/js-yaml, duckdb-async if not present)
  • @packages/sync listed in root package.json workspaces (if not already covered by @packages/** glob)

Notes

  • Check root package.json catalogs before adding — duckdb-async or @duckdb/node-api may already be catalogued

Files

  • @packages/sync/package.json
  • @packages/sync/tsconfig.json
  • @packages/sync/eslint.config.mjs

Phase 3: Core Sync Code

  • Purpose: Implement the sync logic — DuckDB connection, config loader, xlsx runner, and LibSQL seeder

T-005 - Implement SourceConfig TypeScript types

Define the TypeScript types that mirror the YAML config schema. These types are the contract between the YAML files and the sync code.

  • Status: completed
  • Priority: P0
  • Dependencies: T-004

Acceptance

  • SourceConfig type covers: entity, year, files map
  • Each file entry has: path, sheets array (name + period), header_row, columns map (English key → Indonesian source column name)
  • Types exported from @source/config.ts
  • Type-checks cleanly

Files

  • @packages/sync/@source/config.ts

T-006 - Implement duck.ts — DuckDB connection and raw table setup

Create the DuckDB connection module and raw table initialization. Exposes a createRawTable helper that runs CREATE OR REPLACE TABLE raw_X AS SELECT * FROM read_xlsx(...).

  • Status: completed
  • Priority: P0
  • Dependencies: T-005

Acceptance

  • duck.ts opens atlas.db (path configurable via env or arg)
  • createRawTable(conn, tableName, filePath, sheetName, headerRow) function implemented
  • Uses read_xlsx with sheet := sheetName, header := true, and a range derived from headerRow (node binding does not expose header_row)
  • CREATE OR REPLACE TABLE so re-runs are idempotent
  • Exported from module cleanly

Notes

  • DuckDB read_xlsx header_row is 1-indexed
  • For multi-sheet loads (e.g. all months), caller iterates sheets and calls createRawTable for each — data is appended not replaced per month

Files

  • @packages/sync/@source/duck.ts

T-007 - Implement config.ts — YAML config loader

Load and validate a source/config/ions-YYYY.yaml file, resolve file paths relative to the project root, and return a typed SourceConfig.

  • Status: completed
  • Priority: P0
  • Dependencies: T-005

Acceptance

  • loadConfig(entity, year) reads source/config/{entity.toLowerCase()}-{year}.yaml
  • Returns typed SourceConfig
  • Throws a clear error if config file not found
  • File paths in the config are resolved relative to project root (not package directory)

Files

  • @packages/sync/@source/config.ts (extended)

T-008 - Implement xlsx.ts — xlsx sync runner

The main sync runner for xlsx sources. Loads the config, iterates all configured file+sheet entries, and calls createRawTable for each. Supports optional --month filter.

  • Status: completed
  • Priority: P0
  • Dependencies: T-006, T-007

Acceptance

  • syncXlsx(config, options) iterates all file entries in config
  • For each sheet, calls createRawTable with correct path, sheet name, header_row
  • options.month (1–12) filters to only load the matching monthly sheet when provided
  • Logs each table name and row count after creation
  • Skips sheets marked as absent in config gracefully (warning, no error)

Files

  • @packages/sync/@source/adapters/xlsx.ts

T-009 - Implement seed.ts — LibSQL lookup table seeder

Seeds all LibSQL lookup tables from hardcoded registry values. Covers: core_group, core_entity, core_unit, all db_* lookups, catalog_* lookups, commerce_* lookups, finance_* lookups, marketing_channel_type, marketing_channel.

  • Status: completed
  • Priority: P1
  • Dependencies: T-004

Acceptance

  • seed(entityCode) function accepts entity code (e.g. 'IONS')
  • Seeds core_group (GNY), core_entity (IONS/EPN), core_unit (all 9 units from registry)
  • Seeds all lookup tables: db_sex, db_organization_level, db_relation_type, db_account_type, catalog_item_type, commerce_order_type, commerce_customer_type, finance_payment_method, finance_payment_status, target_metric
  • Seeds marketing_channel_type (Conventional, Digital) and marketing_channel (18 channels from registry)
  • All inserts are upsert/replace (idempotent — safe to re-run)
  • Uses @libsql/client connecting to DATABASE_URL env var (defaults to file:atlas-ops.db)

Notes

  • @packages/db (Drizzle schema) is not yet available — seed.ts uses raw SQL strings against LibSQL directly
  • Once @packages/db exists, seed.ts can be refactored to use Drizzle helpers

Files

  • @packages/sync/@source/seed.ts

T-010 - Implement index.ts — CLI entry point

Wire the CLI with --source, --entity, --year, --month flags. Route to syncXlsx or seed based on flags.

  • Status: completed
  • Priority: P0
  • Dependencies: T-008, T-009

Acceptance

  • bun run sync --source xlsx --entity IONS --year 2026 runs full xlsx sync
  • bun run sync --source xlsx --entity IONS --year 2026 --month 2 runs single-month sync
  • bun run sync:seed --entity IONS runs seed.ts
  • Uses process.argv parsing or a lightweight arg parser (no heavy CLI frameworks)
  • Exits 0 on success, non-zero on error with a clear error message

Files

  • @packages/sync/@source/index.ts

Phase 4: Integration

  • Purpose: Wire the package into the workspace, update root scripts, and verify the full sync run

T-011 - Add root package.json scripts and update AGENTS.md

Add sync and sync:seed scripts to the root package.json. Update AGENTS.md with the sync commit scope and CLI usage.

  • Status: completed
  • Priority: P1
  • Dependencies: T-010

Acceptance

  • Root package.json has "sync": "bun run --filter @packages/sync start" (or equivalent)
  • Root package.json has "sync:seed": "bun run --filter @packages/sync seed" (or equivalent)
  • AGENTS.md ## Commands section updated with sync CLI usage
  • AGENTS.md commit scopes list includes sync

Files

  • package.json
  • AGENTS.md

T-012 - bun install and type-check verification

Run bun install to resolve new dependencies. Verify type-check passes.

  • Status: completed
  • Priority: P0
  • Dependencies: T-004, T-005, T-006, T-007, T-008, T-009, T-010, T-011

Acceptance

  • bun install completes without errors
  • bun run test:type --filter @packages/sync passes

Files

  • bun.lock

T-013 - End-to-end sync verification

Run the full sync against real 2026 data. Verify raw tables exist and contain expected row counts.

  • Status: completed
  • Priority: P0
  • Dependencies: T-012

Acceptance

  • bun run sync --source xlsx --entity IONS --year 2026 exits 0
  • atlas.db contains: raw_transactions, raw_students, raw_organizations, raw_targets, raw_marketing_activity
  • Each raw table has > 0 rows
  • Column names in raw_transactions match the 2026 LAPORAN EVALUASI schema (including NO. KWITANSI, NAMA KELAS)
  • Column names in raw_students match the 2026 DATABASE SISWA schema (including split INTAKE column)

Notes

  • Use duckdb atlas.db -c "SHOW TABLES; SELECT COUNT(*) FROM raw_transactions;" to verify

On this page