When the Schema Does Not Speak Business: Testing Oracle 26ai Annotations for NL2SQL

I have been involved in a project where the original idea sounded great on paper: business questions coming in through WhatsApp, an LLM translating them into SQL, and the database answering back. Then reality showed up. The database had thousands of tables, almost no comments, very little business context anywhere near the schema, and plenty of technical names that made sense only to the people who had lived with that system for years. In that setup, asking an LLM to generate SQL from natural language was chaos. The workaround was even worse: a growing pile of prebuilt endpoints with hand-written queries, while the model was reduced to intent classification and endpoint routing. You can imagine how that scales. It does not.

That is the angle I wanted for this test. Not a polished demo where everything works from the first prompt, but a small schema where the gap between technical structure and business meaning is visible enough to hurt. Oracle describes AI Enrichment as a layer of business-centric metadata added to the schema without changing the underlying data or structure, and the stated goal is exactly this: reduce ambiguity and give LLMs enough context to generate more accurate SQL. The docs also make it clear that SQL Developer for VS Code can enrich schemas, tables, columns, and table groups, and that Select AI can include annotations and constraints in the metadata sent to the model.

I am keeping this demo code-first. The schema already exists. The point is to look at a small finance model, run the same prompts before and after enrichment, and see how much of the problem was never SQL generation in the first place. It was schema meaning. But more specifically here, it was annotation meaning. That extra layer is the whole point of the post. Not comments. Not naming conventions alone. Actual metadata that tells the model how the business talks about the schema.

The full project is on GitHub.

Prerequisites

Before following this demo, I am assuming a few things.

  • You already know the basics of Select AI in Oracle AI Database, including DBMS_CLOUD_AI.GENERATEDBMS_CLOUD_AI.SET_PROFILE, and SELECT AI.
  • You already know how to create a credential for an external provider and how your environment handles outbound access.
  • You are comfortable reading generated SQL critically and checking whether it answers the right business question, not just whether it compiles.
  • You understand that this demo runs outside Oracle APEX and is centered on DBMS_CLOUD and DBMS_CLOUD_AI.

If any of that still feels too compressed, check the Oracle documentation first, or leave a comment. That could easily become a separate post.

The demo schema

The model is small on purpose.

FIN_COMPANY stores the formal company registry.
FIN_COMPANY_ALIAS stores the names the business actually uses in daily language.
FIN_SUPPLIER and FIN_CUSTOMER are the master tables for trading partners.
FIN_ACCOUNTS_PAYABLE stores supplier-facing liabilities.
FIN_ACCOUNTS_RECEIVABLE stores customer-facing receivables.

That is enough to create the kinds of ambiguity that show up in real systems. A formal company name is not always the same as the name people type in a message. “Outstanding” does not mean the same thing as “original amount.” “Due in April” is not the same thing as “issued in April.” A small schema is enough to show all of that.

The two cases

I ended up keeping two prompts.

The first one became the main case because it produced a visible correction in the returned totals.

How much is still outstanding by supplier for Beacon Central?

The second one stayed because it shows a more nuanced result. The generated SQL improved after annotations, but the returned rows stayed broad.

Which supplier invoices are due in April 2026 for Beacon Central?

That contrast made the post better. One case shows annotations correcting a business result. The other shows annotations cleaning up the route through the schema without magically fixing an overly broad question.

Before annotations

I ran both prompts against the same schema before applying 03_annotations.sql.

Case 1 — Outstanding balance by supplier

This was the generated SQL before annotations:

select dbms_cloud_ai.generate(
           prompt       => 'How much is still outstanding by supplier for Beacon Central?',
           profile_name => 'FIN_OPENAI_DEMO',
           action       => 'showsql'
       ) as generated_sql
  from dual;
  
-- The output is:

SELECT
    s."SUPPLIER_NAME" AS "supplier_name",
    SUM(ap."OPEN_AMOUNT") AS "outstanding_amount"
FROM "ADMIN"."FIN_ACCOUNTS_PAYABLE" ap
JOIN "ADMIN"."FIN_SUPPLIER" s
    ON s."SUPPLIER_ID" = ap."SUPPLIER_ID"
JOIN "ADMIN"."FIN_COMPANY" c
    ON c."COMPANY_ID" = ap."COMPANY_ID"
LEFT JOIN "ADMIN"."FIN_COMPANY_ALIAS" ca
    ON ca."COMPANY_ID" = c."COMPANY_ID"
WHERE UPPER(c."COMPANY_NAME") = UPPER('Beacon Central')
   OR UPPER(ca."COMPANY_ALIAS") = UPPER('Beacon Central')
GROUP BY s."SUPPLIER_NAME"
ORDER BY s."SUPPLIER_NAME";

At first glance, it does not look terrible. It already chose FIN_ACCOUNTS_PAYABLE, it already grouped by supplier, and it already summed OPEN_AMOUNT. The problem is subtler. It still treated the company resolution loosely, and it did not filter status at all. So everything that matched that alias path and had an open amount entered the aggregation, including rows that should not participate in a “still outstanding” interpretation.

This was the observed result before annotations:

SupplierOutstanding Amount
BlueRock Parts15500
Granite Utilities7800
Metro Fuel Network5900
North Harbor Bank16600
Orbit Leasing11000
Pioneer Fleet Services5200
Silverline Insurance8700

Case 2 — Supplier invoices due in April 2026

This was the generated SQL before annotations:

select dbms_cloud_ai.generate(
           prompt       => 'Which supplier invoices are due in April 2026 for Beacon Central?',
           profile_name => 'FIN_OPENAI_DEMO',
           action       => 'showsql'
       ) as generated_sql
  from dual;

-- The output is:

SELECT
    s."SUPPLIER_NAME" AS "SUPPLIER_NAME",
    ap."PAYABLE_ID" AS "PAYABLE_ID",
    ap."ISSUE_DATE" AS "ISSUE_DATE",
    ap."DUE_DATE" AS "DUE_DATE",
    ap."ORIGINAL_AMOUNT" AS "ORIGINAL_AMOUNT",
    ap."PAID_AMOUNT" AS "PAID_AMOUNT",
    ap."OPEN_AMOUNT" AS "OPEN_AMOUNT",
    ap."STATUS" AS "STATUS"
FROM "ADMIN"."FIN_ACCOUNTS_PAYABLE" ap
INNER JOIN "ADMIN"."FIN_SUPPLIER" s
    ON s."SUPPLIER_ID" = ap."SUPPLIER_ID"
INNER JOIN "ADMIN"."FIN_COMPANY" c
    ON c."COMPANY_ID" = ap."COMPANY_ID"
LEFT JOIN "ADMIN"."FIN_COMPANY_ALIAS" ca
    ON ca."COMPANY_ID" = c."COMPANY_ID"
WHERE (
        UPPER(c."COMPANY_NAME") = UPPER('Beacon Central')
        OR UPPER(c."COMPANY_CODE") = UPPER('Beacon Central')
        OR UPPER(ca."COMPANY_ALIAS") = UPPER('Beacon Central')
    )
    AND ap."DUE_DATE" >= DATE '2026-04-01'
    AND ap."DUE_DATE" < DATE '2026-05-01'
ORDER BY
    ap."DUE_DATE",
    s."SUPPLIER_NAME",
    ap."PAYABLE_ID";

This one already used DUE_DATE, which is good. The problem here was not the date logic. It was the company resolution again, and the fact that the prompt still left the result set broad enough to include rows with statuses that a finance user might not mentally group under “supplier invoices due in April.”

This was the observed result before annotations:

SupplierPayable IDDue DateOriginal AmountPaid AmountOpen AmountStatus
Granite Utilities902/04/26180001800OPEN
BlueRock Parts105/04/2612000200010000PARTIAL
Metro Fuel Network1107/04/2621001002000PARTIAL
Orbit Leasing2009/04/2627507502000PARTIAL
North Harbor Bank210/04/26500005000OPEN
Pioneer Fleet Services1612/04/26820030005200PARTIAL
BlueRock Parts618/04/26250002500OPEN
Granite Utilities1418/04/26470004700OPEN
North Harbor Bank822/04/26400015002500PARTIAL
North Harbor Bank1924/04/26910009100OPEN
Granite Utilities2127/04/26130001300CANCELED
BlueRock Parts728/04/26900060003000PARTIAL
Pioneer Fleet Services1229/04/26540054000PAID
Orbit Leasing1330/04/26330013002000PARTIAL

That is already enough to show the issue. The schema is technically queryable. The model is not lost. But the business meaning is still too implicit.

Creating the annotations

This is the core of the post.

Annotations in this context are not decorative metadata. They are explicit business hints attached to tables and columns. You can annotate the table itself, or annotate specific columns inside a modify clause. In practice, the syntax that matters most here is this:

For table-level annotations:

alter table table_name annotations (
    DESCRIPTION '...',
    ALIASES '...'
);

For column-level annotations:

alter table table_name modify (
    column_name annotations (
        DESCRIPTION '...',
        ALIASES '...',
        "VALUES" '...',
        "JOIN COLUMN" 'OTHER_TABLE.OTHER_COLUMN',
        UNITS '...'
    )
);

That is what I used in the demo. I focused on the parts that were directly affecting the two cases: company aliases, due date meaning, outstanding balance meaning, and status interpretation.

Here is the actual script shape I used.

alter table fin_company_alias annotations (
    BUSINESS_GROUP 'FIN_CORE',
    DESCRIPTION 'Business aliases and common spoken names used by end users when referring to companies.',
    ALIASES 'company alias, alternate company name, nickname, branch nickname'
);

alter table fin_company_alias modify (
    company_alias annotations (
        DESCRIPTION 'User-facing company name used in daily operational language.',
        ALIASES 'nickname, spoken name, short name, common name',
        "VALUES" 'Aster Downtown; Aster Motors Downtown; Downtown Motors; Beacon Central; Central Beacon'
    )
);

alter table fin_accounts_payable modify (
    due_date annotations (
        DESCRIPTION 'Due date used for open balance analysis, overdue analysis, payment scheduling, and questions such as due this month or due in April.',
        ALIASES 'due date, payment due date, maturity date, due in month, due this week'
    )
);

alter table fin_accounts_payable modify (
    original_amount annotations (
        DESCRIPTION 'Original amount of the payable title at issue time. Do not use this as the remaining balance for still owe or outstanding questions.',
        ALIASES 'gross amount, invoice amount, face amount, booked amount',
        UNITS 'BRL'
    )
);

alter table fin_accounts_payable modify (
    open_amount annotations (
        DESCRIPTION 'Outstanding unpaid balance. Prefer this column for questions such as still owe, unpaid, pending, outstanding, or open payable amount.',
        ALIASES 'open amount, outstanding amount, unpaid balance, pending amount, balance still owed',
        UNITS 'BRL'
    )
);

alter table fin_accounts_payable modify (
    status annotations (
        DESCRIPTION 'Current status of the payable title. Pending or still unpaid payables normally mean OPEN and PARTIAL only. Exclude PAID and CANCELED from unpaid balance questions.',
        "VALUES" 'OPEN; PARTIAL; PAID; CANCELED'
    )
);

That is the heart of it. I am telling the model three things very directly.

COMPANY_ALIAS is where the business-facing company names live.
OPEN_AMOUNT is the right column for “still outstanding.”
PAID and CANCELED should not contaminate unpaid balance questions.

The full script is in the repository, but these are the lines that really move the needle for the cases in this post.

After annotations

I reran the exact same prompts after applying 03_annotations.sql.

Case 1 — Outstanding balance by supplier

This was the generated SQL after annotations:

select dbms_cloud_ai.generate(
           prompt       => 'How much is still outstanding by supplier for Beacon Central?',
           profile_name => 'FIN_OPENAI_DEMO',
           action       => 'showsql'
       ) as generated_sql
  from dual;

-- The output is:

SELECT
    s."SUPPLIER_NAME" AS "supplier_name",
    SUM(ap."OPEN_AMOUNT") AS "outstanding_amount"
FROM "ADMIN"."FIN_ACCOUNTS_PAYABLE" ap
JOIN "ADMIN"."FIN_SUPPLIER" s
    ON s."SUPPLIER_ID" = ap."SUPPLIER_ID"
JOIN "ADMIN"."FIN_COMPANY_ALIAS" ca
    ON ca."COMPANY_ID" = ap."COMPANY_ID"
WHERE UPPER(ca."COMPANY_ALIAS") = UPPER('Beacon Central')
  AND ap."STATUS" IN ('OPEN', 'PARTIAL')
GROUP BY
    s."SUPPLIER_NAME"
ORDER BY
    "outstanding_amount" DESC;

This is exactly the kind of shift I wanted to see. The SQL did not just get cleaner. It got more opinionated in the right direction. Alias resolution stopped wandering through formal company name and code. Status handling stopped being implicit.

This was the observed result after annotations:

SupplierOutstanding Amount
North Harbor Bank16600
BlueRock Parts15500
Orbit Leasing11000
Silverline Insurance8700
Granite Utilities6500
Metro Fuel Network5900
Pioneer Fleet Services5200

Case 2 — Supplier invoices due in April 2026

This was the generated SQL after annotations:

select dbms_cloud_ai.generate(
           prompt       => 'Which supplier invoices are due in April 2026 for Beacon Central?',
           profile_name => 'FIN_OPENAI_DEMO',
           action       => 'showsql'
       ) as generated_sql
  from dual;

-- The output is:

SELECT
    s."SUPPLIER_NAME" AS "Supplier Name",
    ap."PAYABLE_ID" AS "Payable ID",
    ap."ISSUE_DATE" AS "Issue Date",
    ap."DUE_DATE" AS "Due Date",
    ap."ORIGINAL_AMOUNT" AS "Original Amount",
    ap."OPEN_AMOUNT" AS "Open Amount",
    ap."PAID_AMOUNT" AS "Paid Amount",
    ap."STATUS" AS "Status"
FROM "ADMIN"."FIN_ACCOUNTS_PAYABLE" ap
INNER JOIN "ADMIN"."FIN_SUPPLIER" s
    ON s."SUPPLIER_ID" = ap."SUPPLIER_ID"
INNER JOIN "ADMIN"."FIN_COMPANY_ALIAS" ca
    ON ca."COMPANY_ID" = ap."COMPANY_ID"
WHERE UPPER(ca."COMPANY_ALIAS") = UPPER('Beacon Central')
  AND ap."DUE_DATE" >= DATE '2026-04-01'
  AND ap."DUE_DATE" < DATE '2026-05-01'
ORDER BY
    ap."DUE_DATE",
    s."SUPPLIER_NAME",
    ap."PAYABLE_ID";

This version is better than the first one. It resolves the company through alias directly and keeps the date logic anchored to DUE_DATE. The interesting part is that it still does not filter status. That is why I kept this case. It shows improvement, but not total victory.

This was the observed result after annotations:

SupplierPayable IDDue DateOriginal AmountOpen AmountPaid AmountStatus
Granite Utilities902/04/26180018000OPEN
BlueRock Parts105/04/2612000100002000PARTIAL
Metro Fuel Network1107/04/2621002000100PARTIAL
Orbit Leasing2009/04/2627502000750PARTIAL
North Harbor Bank210/04/26500050000OPEN
Pioneer Fleet Services1612/04/26820052003000PARTIAL
BlueRock Parts618/04/26250025000OPEN
Granite Utilities1418/04/26470047000OPEN
North Harbor Bank822/04/26400025001500PARTIAL
North Harbor Bank1924/04/26910091000OPEN
Granite Utilities2127/04/26130013000CANCELED
BlueRock Parts728/04/26900030006000PARTIAL
Pioneer Fleet Services1229/04/26540005400PAID
Orbit Leasing1330/04/26330020001300PARTIAL

Comparing the outcomes

The first case is where annotations started to feel worth the effort.

SupplierBeforeAfter
North Harbor Bank1660016600
BlueRock Parts1550015500
Orbit Leasing1100011000
Silverline Insurance87008700
Granite Utilities78006500
Metro Fuel Network59005900
Pioneer Fleet Services52005200

Granite Utilities dropped from 7800 to 6500. That is the kind of difference a quick demo would easily miss because both numbers look plausible. The annotation-driven SQL got closer to the actual meaning of “still outstanding,” and the result changed.

The second case tells a different story.

AspectBeforeAfter
Company resolutionFormal name, code, or aliasAlias directly
Date logicCorrectly uses due dateCorrectly uses due date
Status filteringNoneNone
Result set size14 rows14 rows

That is not a failure. It is a limit. The model got better at resolving the company, but the prompt still left too much room for a broad reading of “supplier invoices due in April.” If I had asked for “open supplier invoices due in April,” the result set would probably have narrowed down in a more satisfying way.

That is exactly why I wanted both cases in the post. One shows a business correction. The other shows that annotations improve the odds, but they do not replace a precise business prompt.

Closing thought

I do not think annotations are interesting by themselves. What is interesting is what they expose.

A lot of so-called AI architecture problems are really metadata problems that got pushed up the stack because nobody wanted to touch the schema. Then the system grows sideways. More routing. More middleware. More prompt shaping. More code doing semantic work that should have been much closer to the data.

A quiet schema is expensive. It just hides the bill for a while.

If you want to reproduce the whole flow, the scripts are in the project repository: AI_Annotations on GitHub.

Comments

Deixe um comentário

O seu endereço de e-mail não será publicado. Campos obrigatórios são marcados com *