
fix(parser): unquote TOML keys and section headers #152

Merged: aksOps merged 1 commit into main from fix/parser-toml-unquote-keys on May 13, 2026

@aksOps aksOps commented May 13, 2026

Summary

Apache Airflow's `.cherry_picker.toml` uses TOML's quoted-key form:

```toml
"check_sha" = "..."
```

`parseTOML` was reading the LHS as the raw text including the literal quotes. The `TomlStructureDetector` then emitted node IDs like `toml:.cherry_picker.toml:"check_sha"` while the CONTAINS edges referenced a different escaping shape, and Kuzu's BulkLoadEdges aborted:

```
Error: enrich: bulk load edges: graph: copy CONTAINS:
Copy exception: Unable to find primary key value
"toml:.cherry_picker.toml:""check_sha"""
```

This blocked end-to-end enrich on real-world repos with quoted TOML keys (caught running enrich on `apache/airflow` after #149/#150/#151 landed).
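The doubled quotes in the error message are consistent with CSV-style escaping of a value that itself contains double quotes. The sketch below only illustrates that escaping rule using Go's standard `encoding/csv` package. It assumes, without confirmation from this PR, that the bulk load passes node IDs through a CSV-like format. `csvField` is a hypothetical helper and not part of codeiq.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// csvField is a hypothetical helper (not from the codeiq codebase) that
// returns the text a standard CSV writer emits for a single field.
func csvField(id string) string {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	_ = w.Write([]string{id}) // writing to a bytes.Buffer does not fail
	w.Flush()
	return strings.TrimRight(buf.String(), "\n")
}

func main() {
	// Node ID built from a key that still carries its literal quotes.
	fmt.Println(csvField(`toml:.cherry_picker.toml:"check_sha"`))
	// Node ID built from the unquoted key needs no escaping at all.
	fmt.Println(csvField(`toml:.cherry_picker.toml:check_sha`))
}
```

Under that assumption, an ID with embedded quotes can be written in more than one textual form, which is how a node table and an edge table could end up disagreeing. An unquoted key avoids the ambiguity.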

Fix

Use the existing `unquote` helper on both the section header and the key before storing in the parsed map. Symmetric fix because `["quoted-section"]` headers were broken the same way.
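A minimal sketch of the idea, not the actual diff: the real `unquote` helper's signature and the internals of `parseTOML` are not shown in this PR, so `unquoteKey` and `parseLine` below are hypothetical stand-ins. The sketch handles only plain `[section]` headers and `key = value` lines, and it assumes the key itself contains no `=`.

```go
package main

import (
	"fmt"
	"strings"
)

// unquoteKey is a hypothetical stand-in for the parser's existing `unquote`
// helper. It strips one matching pair of surrounding " or ' quotes, if any.
func unquoteKey(s string) string {
	s = strings.TrimSpace(s)
	if len(s) >= 2 {
		first, last := s[0], s[len(s)-1]
		if first == last && (first == '"' || first == '\'') {
			return s[1 : len(s)-1]
		}
	}
	return s
}

// parseLine is a simplified illustration, not the real parseTOML. It returns
// the kind of line ("section", "key", or "") and the unquoted name.
func parseLine(line string) (kind, name string) {
	line = strings.TrimSpace(line)
	if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
		// Section header: unquote the text between the brackets.
		return "section", unquoteKey(line[1 : len(line)-1])
	}
	if i := strings.Index(line, "="); i >= 0 {
		// Key/value line: unquote the left-hand side before storing it.
		return "key", unquoteKey(line[:i])
	}
	return "", ""
}

func main() {
	lines := []string{`"check_sha" = "..."`, `'team' = "x"`, `["quoted-section"]`, `plain = 1`}
	for _, l := range lines {
		kind, name := parseLine(l)
		fmt.Printf("%-7s %s\n", kind, name)
	}
}
```

Applying the same helper at both call sites keeps section names and key names in one canonical, quote-free form, so any ID derived from them is built from the same text everywhere.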

Test plan

  • `cd go && CGO_ENABLED=1 go test ./... -count=1`: 882 passed across 45 packages (was 880, +2 new tests)
  • New `TestParseTOMLUnquotesKeys`: both `"`- and `'`-quoted top-level keys produce unquoted map keys
  • New `TestParseTOMLUnquotesSectionHeaders`: `["quoted-section"]` produces an unquoted top-level entry
  • End-to-end on apache/airflow: `codeiq enrich .` now exits 0 (was exit 2 pre-fix), loading 95k nodes, 246k edges, 165 services

🤖 Generated with Claude Code

Apache Airflow's `.cherry_picker.toml` uses TOML's quoted-key form:

    "check_sha" = "..."

`parseTOML` was reading the LHS as the raw text including the literal
quotes. The TomlStructureDetector then emitted node IDs like
`toml:.cherry_picker.toml:"check_sha"` while the CONTAINS edges (and
any downstream lookup) referenced different shapes — Kuzu's BulkLoad
aborted with:

    Copy exception: Unable to find primary key value
    "toml:.cherry_picker.toml:""check_sha"""

Bug was symmetric for `["quoted-section"]` headers. Fix both: call
the existing `unquote` helper on the key/section before storing.

Regression tests added in structured_test.go (new file).

End-to-end: `codeiq enrich ~/projects/polyglot-bench/airflow` now
exits 0 (was exit 2): 95k nodes, 246k edges, 165 services loaded.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aksOps aksOps enabled auto-merge (squash) May 13, 2026 16:23
@aksOps aksOps merged commit cc96e1b into main May 13, 2026
13 of 15 checks passed
@aksOps aksOps deleted the fix/parser-toml-unquote-keys branch May 13, 2026 16:33