Phase 1: Python/dbt Scaffold
- Purpose: Set up the Python project, dbt configuration, and directory structure
T-001 - Create @packages/transform scaffold
Create all configuration files: pyproject.toml, requirements.txt, profiles.yml, dbt_project.yml, and directory structure.
- Status: completed
- Priority: P0
- Dependencies: none
Acceptance
-
@packages/transform/pyproject.tomlexists with[project]nameatlas-transform, requiresdbt-coreanddbt-duckdb -
requirements.txtlistsdbt-core>=1.8anddbt-duckdb>=1.8 -
profiles.ymltargetsdev→ DuckDB at../../atlas.db, 4 threads -
dbt_project.yml: nameatlas, staging+materialized: view, intermediate+materialized: view, marts+materialized: table - Directory structure created:
models/staging/,models/intermediate/,models/marts/,tests/ -
uv venv && uv pip install -r requirements.txtsucceeds -
uv run dbt debugconnects toatlas.dbwithout error
Files
- @packages/transform/pyproject.toml
- @packages/transform/requirements.txt
- @packages/transform/profiles.yml
- @packages/transform/dbt_project.yml
Phase 2: Raw Source Declarations
- Purpose: Register raw tables as dbt sources for lineage tracking and ref() usage
T-002 - Create sources.yml
Declare all 5 raw tables as dbt sources so they can be referenced with {{ source('raw', 'raw_transactions') }} in staging models.
- Status: completed
- Priority: P0
- Dependencies: T-001
Acceptance
-
models/staging/sources.ymlexists - Declares source
rawwith databaseatlas, schemamain - Tables declared:
raw_transactions,raw_students,raw_organizations,raw_targets,raw_marketing_activity - Each table has a description
Files
- @packages/transform/models/staging/sources.yml
Phase 3: Staging Models
- Purpose: Clean and rename raw columns to English; handle per-year column variations via COALESCE
T-003 - stg_transactions.sql
Staging model for LAPORAN EVALUASI monthly sheets. Handles 2026 new columns (NO. KWITANSI, NAMA KELAS) and deduplicates on receipt number.
- Status: completed
- Priority: P0
- Dependencies: T-002
Acceptance
- Selects from
{{ source('raw', 'raw_transactions') }} -
receipt_number: usesNO. KWITANSIwhen present, with deterministic fallback key for prior-year rows without receipt numbers -
variant_name:COALESCE("NAMA KELAS", "PROGRAM")— 2026 has NAMA KELAS, prior years use PROGRAM - All other columns mapped per analytics.md
stg_transactionsspec -
TRY_CASTonJUMLAH→amount(DECIMAL 18,2) -
TRY_CASTonTANGGAL→date(DATE) -
WHERE "NO. KWITANSI" IS NOT NULL OR "PROGRAM" IS NOT NULLto filter empty rows -
QUALIFY ROW_NUMBER() OVER (PARTITION BY receipt_number ORDER BY date) = 1deduplication
Files
- @packages/transform/models/staging/stg_transactions.sql
T-004 - stg_students.sql
Staging model for DATABASE SISWA monthly sheets. Handles 2026 split of BULAN/INTAKE → BULAN + INTAKE, renamed TANGGAL column, and added JENIS KELAMIN.
- Status: completed
- Priority: P0
- Dependencies: T-002
Acceptance
- Selects from
{{ source('raw', 'raw_students') }} -
period_label:COALESCE("INTAKE", "BULAN/INTAKE")— 2026 has separate INTAKE col; prior years use combined -
enrollment_date:TRY_CAST(COALESCE("TANGGAL", "TANGGAL/HARI/JAM") AS DATE) -
sex_code:"JENIS KELAMIN"(NULL for 2023 which lacks this column) - All other columns per analytics.md
stg_studentsspec -
WHERE "NAMA" IS NOT NULL
Files
- @packages/transform/models/staging/stg_students.sql
T-005 - stg_targets.sql
Staging model for the EVALUASI DATA SISWA sheet (2026+) and fallback for prior years without this sheet.
- Status: completed
- Priority: P1
- Dependencies: T-002
Acceptance
- Selects from
{{ source('raw', 'raw_targets') }} - All columns per analytics.md
stg_targetsspec withTRY_CASTfor all numeric columns -
WHERE "TENANT" IS NOT NULL
Files
- @packages/transform/models/staging/stg_targets.sql
T-006 - stg_marketing_activity.sql
Staging model for REKAP monthly sheets. Maps channel lead columns with resilient fallbacks; weekly decomposition is deferred until raw REKAP headers are stabilized.
- Status: completed
- Priority: P1
- Dependencies: T-002
Acceptance
- Selects from
{{ source('raw', 'raw_marketing_activity') }} - All columns per analytics.md
stg_marketing_activityspec - Uses tenant fallback when
TENANTis absent in raw sheets and filters non-data rows
Files
- @packages/transform/models/staging/stg_marketing_activity.sql
T-007 - Staging schema tests (schema.yml)
Define dbt schema tests for all staging models: not_null, unique, accepted_values.
- Status: completed
- Priority: P1
- Dependencies: T-003, T-004, T-005, T-006
Acceptance
-
models/staging/schema.ymlexists -
stg_transactions:receipt_numbernot_null;amountnot_null;unit_codeaccepted_values (all known unit codes) -
stg_students:full_namenot_null;unit_codeaccepted_values -
stg_targets:unit_codenot_null, unique -
uv run dbt test --select stagingpasses
Files
- @packages/transform/models/staging/schema.yml
Phase 4: Intermediate Models
- Purpose: Apply business logic — joins, customer type classification, FK resolution, enrollment filtering
T-008 - int_orders.sql
Joins stg_transactions × stg_students to produce one enriched row per transaction. Normalizes unit codes and customer types. Uses receipt number join where available.
- Status: completed
- Priority: P0
- Dependencies: T-003, T-004
Acceptance
- LEFT JOINs
stg_studentsonreceipt_number(where not null) ORstudent_namematch -
unit_code_normalized: CASE mappingWLC INGGRIS→WLC_ENGLISH,WLC NON INGGRIS→WLC_NON_ENGLISH, else as-is -
customer_type: CASE mappingB→New,L→Renewal,A→Alumni -
period_yearandperiod_monthextracted fromperiod_labelusing Indonesian month name CASE - All columns from both staging models included
-
WHERE receipt_number IS NOT NULL
Files
- @packages/transform/models/intermediate/int_orders.sql
T-009 - int_enrollments.sql
Filters int_orders to enrollment-type rows only. Excludes book sales, event fees, zero-amount rows.
- Status: completed
- Priority: P0
- Dependencies: T-008
Acceptance
- Selects from
{{ ref('int_orders') }} - Excludes all non-enrollment
product_categoryvalues per analytics.md exclusion list - Excludes
amount <= 0 - All
int_orderscolumns passed through unchanged
Files
- @packages/transform/models/intermediate/int_enrollments.sql
Phase 5: Mart Models
- Purpose: Pre-aggregate to analysis-ready tables powering the Format layer and dashboard
T-010 - mart_revenue.sql
Revenue totals per unit per period. Joined with targets at query time in the Format layer (not in the mart itself).
- Status: completed
- Priority: P0
- Dependencies: T-009
Acceptance
- Groups by
unit_code_normalized,period_year,period_month -
revenue_actuals:SUM(amount) -
transaction_count:COUNT(DISTINCT receipt_number) -
period: formattedYYYY-MMstring - Selects from
{{ ref('int_orders') }}(not just enrollments — revenue includes all order types)
Files
- @packages/transform/models/marts/mart_revenue.sql
T-011 - mart_program_progress.sql
Student and order counts per program per period, broken down by customer type. Powers Key Comparison and Program Progress slides.
- Status: completed
- Priority: P0
- Dependencies: T-009
Acceptance
- Groups by
unit_code_normalized,product_name,product_category,course_type_name,period_year,period_month,customer_type -
order_count:COUNT(DISTINCT receipt_number) -
student_count:COUNT(DISTINCT student_name) -
revenue:SUM(amount) - Selects from
{{ ref('int_enrollments') }}
Files
- @packages/transform/models/marts/mart_program_progress.sql
T-012 - mart_channel_marketing.sql
Closings and student counts per channel per period with contribution percentage. Powers Channel Marketing slide.
- Status: completed
- Priority: P0
- Dependencies: T-009
Acceptance
- Groups by
unit_code_normalized,channel_name,period_year,period_month,customer_type -
closings:COUNT(DISTINCT receipt_number) -
students:COUNT(DISTINCT student_name) -
contribution_pct: window function over unit × period partition, rounded to 2 decimal places -
WHERE channel_name IS NOT NULL - Selects from
{{ ref('int_enrollments') }}
Files
- @packages/transform/models/marts/mart_channel_marketing.sql
T-013 - mart_school_progress.sql
Student counts per organization per year (annual, not monthly). Powers School Progress slide.
- Status: completed
- Priority: P0
- Dependencies: T-009
Acceptance
- Groups by
unit_code_normalized,organization_name,period_year -
student_count:COUNT(DISTINCT student_name) -
WHERE organization_name IS NOT NULL - No
period_monthgrouping — school analysis is annual
Files
- @packages/transform/models/marts/mart_school_progress.sql
T-014 - Mart schema tests
Define schema tests for mart models.
- Status: completed
- Priority: P1
- Dependencies: T-010, T-011, T-012, T-013
Acceptance
-
models/marts/schema.ymlexists -
mart_revenue:unit_codenot_null,revenue_actualsnot_null -
mart_program_progress:unit_codenot_null,order_countnot_null -
mart_channel_marketing:unit_codenot_null,channel_namenot_null -
mart_school_progress:unit_codenot_null,organization_namenot_null -
uv run dbt test --select martspasses
Files
- @packages/transform/models/marts/schema.yml
Phase 6: Verification
- Purpose: Run the full dbt pipeline against real data and confirm all models and tests pass
T-015 - Full dbt run and test verification
Run the complete dbt pipeline against real 2026 data loaded by Plan 002. Confirm all models compile and run, all tests pass.
- Status: completed
- Priority: P0
- Dependencies: T-007, T-009, T-014
Acceptance
-
uv run dbt runcompletes without errors (all 10 models: 4 staging, 2 intermediate, 4 marts) -
uv run dbt testpasses all tests -
mart_revenuehas rows with real unit codes (TM, WLC_ENGLISH, WLC_NON_ENGLISH, etc.) -
mart_program_progresshas rows with real program names -
mart_channel_marketingrows include KBM, Sosial Media channels -
mart_school_progressrows include organization names from 2026 data
T-016 - Update AGENTS.md with transform commands
Add transform commit scope and dbt CLI usage to AGENTS.md.
- Status: completed
- Priority: P1
- Dependencies: T-015
Acceptance
-
AGENTS.mdcommit scopes list includestransform -
AGENTS.md## Commandssection shows dbt usage (uv run dbt run,uv run dbt test)
Files
- AGENTS.md