Single-Cluster Duality View 🃏
In DynamoDB, a single-table design stores one-to-many relationships in a single physical block while still following relational-like normal form decomposition. In MongoDB, the Single Collection Pattern unnests relationships from a single document, but goes against the general recommendation as it sacrifices one of MongoDB’s key advantages—keeping a document in a single block. In Oracle Database and MySQL, JSON-relational duality views normalize JSON documents into relational tables that span multiple blocks, also without the data-locality benefits of MongoDB. Can we turn these duality views into a document-database equivalent that keeps together the data accessed together?
Here is SCDV™, the Single-Cluster Duality View 🥁.
DynamoDB
NoSQL started with a simple key‑value API. For example, DynamoDB can get an item using the full key, query multiple items using a partial key, or scan all items without a key. The value, in this key‑value datastore, is atomic, but sometimes you need partial reads or writes. Then you have two options: read or write the full item, which is inefficient if you only need part of it, or change your schema design to split it into multiple items.
The latter was known as the single table design because it not only splits items into multiple entities but also stores them in the same table, sharing a key prefix, to retrieve all items with a single query. The idea is that you can access individual objects while also benefiting from data locality when querying the full aggregate.
The benefit of this design relies heavily on DynamoDB specifics: the storage internals, where items are partitioned and clustered by their key, and the billing model, where you pay per table request unit.
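To make the billing point concrete, here is a minimal Python sketch of the read-capacity arithmetic (my own helper, not an AWS API): a strongly consistent read costs 1 RCU per 4 KB of data read, an eventually consistent read costs half of that, and for a Query the sizes of all returned items are summed before rounding up.

```python
import math

def query_rcu(item_sizes_bytes, consistent=True):
    """Estimate read capacity units for a DynamoDB Query.

    Item sizes are summed first, then rounded up to 4 KB units;
    strongly consistent reads cost 1 RCU per unit, eventually
    consistent reads cost 0.5 RCU per unit.
    """
    units = math.ceil(sum(item_sizes_bytes) / 4096)
    return units if consistent else units / 2

# A department item plus one employee item, well under 4 KB in total:
print(query_rcu([200, 150]))                      # 1
print(query_rcu([5000, 3000], consistent=False))  # 1.0 (2 units, halved)
```

This is why keeping the whole aggregate small and in one partition keeps the query at a single unit, as shown by the `ConsumedCapacity` later in this post.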
Here is an example, using the traditional one-to-many schema with departments and employees:
- One DynamoDB table
- Department and employees stored as separate items
- Same partition key (DEPT#) for both:
```shell
aws dynamodb create-table \
  --table-name scott \
  --attribute-definitions \
    AttributeName=PK,AttributeType=S \
    AttributeName=SK,AttributeType=S \
  --key-schema \
    AttributeName=PK,KeyType=HASH \
    AttributeName=SK,KeyType=RANGE \
  --billing-mode PAY_PER_REQUEST
```
The logical model, where the relationship is materialized by sharing the department number as the same partition key, is also the physical model in which they are stored together, since items are partitioned on this key:
| PK | SK | Meaning |
|---------|----------|----------------|
| DEPT#10 | DEPT | Department row |
| DEPT#10 | EMP#7782 | Employee |
| DEPT#10 | EMP#7839 | Employee |
| DEPT#20 | DEPT | Department row |
| DEPT#20 | EMP#7369 | Employee |
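The effect of this key design can be simulated in a few lines of Python (a toy model of the table above, not the DynamoDB API; the department 20 values are the classic SCOTT schema ones, used for illustration): items kept sorted by (PK, SK) land next to each other, so one range lookup on the partition key returns the whole aggregate as a contiguous slice.

```python
import bisect

# Toy single-table store: items kept sorted by (PK, SK),
# mimicking how DynamoDB clusters items within a partition.
table = sorted([
    (("DEPT#10", "DEPT"),     {"deptno": 10, "dname": "ACCOUNTING"}),
    (("DEPT#10", "EMP#7782"), {"empno": 7782, "ename": "CLARK"}),
    (("DEPT#10", "EMP#7839"), {"empno": 7839, "ename": "KING"}),
    (("DEPT#20", "DEPT"),     {"deptno": 20, "dname": "RESEARCH"}),
    (("DEPT#20", "EMP#7369"), {"empno": 7369, "ename": "SMITH"}),
])

def query(pk):
    """Return all items sharing a partition key: one contiguous slice."""
    keys = [k for k, _ in table]
    lo = bisect.bisect_left(keys, (pk, ""))
    hi = bisect.bisect_right(keys, (pk, "\xff"))
    return [item for _, item in table[lo:hi]]

print([i.get("ename", i.get("dname")) for i in query("DEPT#10")])
# ['ACCOUNTING', 'CLARK', 'KING']
```

The department row and its employees come back together from one contiguous range, which is exactly what the single-table design buys you.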
I insert some data:
```shell
aws dynamodb put-item --table-name scott --item '{
  "PK": {"S": "DEPT#10"}, "SK": {"S": "DEPT"},
  "deptno": {"N": "10"}, "dname": {"S": "ACCOUNTING"}, "loc": {"S": "NEW YORK"}
}'

aws dynamodb put-item --table-name scott --item '{
  "PK": {"S": "DEPT#10"}, "SK": {"S": "EMP#7782"},
  "empno": {"N": "7782"}, "ename": {"S": "CLARK"}, "job": {"S": "MANAGER"}, "sal": {"N": "2450"}
}'
```
Here is a query that retrieves the whole aggregate by its partition key:
```shell
aws dynamodb query \
  --table-name scott \
  --key-condition-expression "PK = :d" \
  --expression-attribute-values '{":d":{"S":"DEPT#10"}}' \
  --consistent-read \
  --return-consumed-capacity TOTAL
```

```json
{
  "Items": [
    {
      "deptno": { "N": "10" },
      "PK": { "S": "DEPT#10" },
      "loc": { "S": "NEW YORK" },
      "dname": { "S": "ACCOUNTING" },
      "SK": { "S": "DEPT" }
    },
    {
      "ename": { "S": "CLARK" },
      "PK": { "S": "DEPT#10" },
      "job": { "S": "MANAGER" },
      "empno": { "N": "7782" },
      "sal": { "N": "2450" },
      "SK": { "S": "EMP#7782" }
    }
  ],
  "Count": 2,
  "ScannedCount": 2,
  "ConsumedCapacity": { "TableName": "scott", "CapacityUnits": 1.0 }
}
```
All items are read from a single partition with 1 consistent read capacity unit (RCU), thanks to physical colocation by key design.
MongoDB
MongoDB doesn't need to split documents to get partial reads and writes. The data modeling objective is to keep aggregates in a single document and use the advanced API to access individual items. For example, you can use covering indexes or search indexes with returnStoredSource to avoid reading the whole document. And you can use $push or $set with arrayFilters to update individual array items.
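Conceptually, such a targeted update touches only the matching array elements and leaves the rest of the document alone. Here is a small Python sketch of that semantic (an illustration of the behavior, not MongoDB code):

```python
def set_with_array_filter(doc, array_field, filter_fn, set_field, value):
    """Emulate { $set: { "<array>.$[e].<field>": value } } with
    arrayFilters: only matching array elements are rewritten,
    the rest of the document is untouched."""
    for element in doc.get(array_field, []):
        if filter_fn(element):
            element[set_field] = value
    return doc

dept = {
    "_id": 10, "dname": "ACCOUNTING",
    "employees": [
        {"empno": 7782, "ename": "CLARK", "sal": 2450},
        {"empno": 7839, "ename": "KING",  "sal": 5000},
    ],
}
set_with_array_filter(dept, "employees", lambda e: e["empno"] == 7782, "sal", 2600)
print(dept["employees"][0]["sal"], dept["employees"][1]["sal"])  # 2600 5000
```

Only CLARK's salary changes; the server can apply this kind of change in place without rewriting the whole document.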
Still, some users preferred to split the documents and, by analogy to DynamoDB, called it the Single Collection Pattern. In truth, it is rarely useful. MongoDB is not DynamoDB. Storing documents in one or multiple collections doesn't impact billing, and using the same key prefix doesn't co‑locate them — except in two cases: clustered collections (special‑purpose, not generally recommended), and sharing a sharding key to co‑locate data on the same shard (but still different blocks in the filesystem).
At the storage block level, documents may be stored together only if they were inserted together.
Although this is not recommended when you can use the full power of the MongoDB API, here is an example using a single collection and embedding only references, following the documentation: one document per department and one document per employee:
```javascript
db.empdept.insertMany([
  {
    _id: "DEPT#10", doc_type: "dept",
    deptno: 10, dname: "ACCOUNTING", loc: "NEW YORK",
    links: [
      { target: "DEPT#10", doc_type: "dept" },
      { target: "EMP#7782", doc_type: "emp" },
      { target: "EMP#7839", doc_type: "emp" }
    ]
  },
  {
    _id: "EMP#7782", doc_type: "emp",
    empno: 7782, ename: "CLARK", job: "MANAGER", sal: 2450, deptno: 10,
    links: [
      { target: "EMP#7782", doc_type: "emp" },
      { target: "DEPT#10", doc_type: "dept" }
    ]
  },
  {
    _id: "EMP#7839", doc_type: "emp",
    empno: 7839, ename: "KING", job: "PRESIDENT", sal: 5000, deptno: 10,
    links: [
      { target: "EMP#7839", doc_type: "emp" },
      { target: "DEPT#10", doc_type: "dept" }
    ]
  }
]);
```
With this schema, the following gets the full aggregate as multiple documents without an aggregation pipeline:
```javascript
db.empdept.find({ "links.target": "DEPT#10" })
```
This can identify the documents with a single index range scan on a multi-key index on "links.target", but it has to fetch multiple small documents, it introduces strong coupling between those documents without any possibility of validating the references, and it requires an explicit transaction and retry logic to update them consistently.
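To make the coupling concrete, here is a hedged Python sketch of what the application now has to do: reassemble the aggregate by following links, and detect dangling references itself, since nothing validates them at write time (the EMP#9999 target is a deliberately broken link for illustration):

```python
def assemble(docs, root_id):
    """Client-side join over Single Collection Pattern documents:
    follow links from the root and flag targets that don't exist."""
    by_id = {d["_id"]: d for d in docs}
    root = by_id[root_id]
    members, dangling = [], []
    for link in root.get("links", []):
        target = by_id.get(link["target"])
        (members if target else dangling).append(link["target"])
    return members, dangling

docs = [
    {"_id": "DEPT#10", "doc_type": "dept",
     "links": [{"target": "DEPT#10"}, {"target": "EMP#7782"}, {"target": "EMP#9999"}]},
    {"_id": "EMP#7782", "doc_type": "emp", "ename": "CLARK"},
]
members, dangling = assemble(docs, "DEPT#10")
print(dangling)  # ['EMP#9999'] -- nothing validated this reference at write time
```

This bookkeeping is exactly what an embedded document makes unnecessary.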
MongoDB is designed to store aggregates as a single document, and the right schema is simply:
```javascript
db.empdept.insertOne({
  _id: 10, dname: "ACCOUNTING", loc: "NEW YORK",
  employees: [
    { empno: 7782, ename: "CLARK", job: "MANAGER", sal: 2450 },
    { empno: 7839, ename: "KING", job: "PRESIDENT", sal: 5000 }
  ]
})
```
With this schema, you can still query the full aggregate with a simple find():
```javascript
db.empdept.find({ _id: 10 })
```
and you can still update a single employee in‑place:
```javascript
db.empdept.updateOne(
  { _id: 10, "employees.empno": 7782 },
  { $set: { "employees.$.sal": 2600 } }
)
```
This updates the first matching array item using the positional $ operator. You can be more precise with arrayFilters:
```javascript
db.empdept.updateOne(
  { _id: 10 },
  { $set: { "employees.$[e].sal": 2600 } },
  { arrayFilters: [{ "e.empno": 7782 }] }
)
```
I've written about this and measured the efficiency in a previous post.
Emulations
A single document per aggregate, with embedded one-to-many relationships, is efficient in MongoDB, not in emulations. For example, in Oracle Autonomous Database, where the document is stored as OSON, the update will rewrite the full document:
```
ora> db.empdept.updateOne(
  { _id: 10, "employees.empno": 7782 },
  { $set: { "employees.$.sal": 2600 } }
)
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0
}

ora> db.aggregate([
  { $sql: " select * from dbms_xplan.display_cursor( format=>'BASIC') " }
]).forEach(row => print(row.PLAN_TABLE_OUTPUT));

EXPLAINED SQL STATEMENT:
------------------------
update "ORA"."empdept" set "DATA" = :1 where ("RESID" = :2 )
returning "RESID", "ETAG" into :3 , :4

Plan hash value: 893016358

------------------------------------------------------
| Id  | Operation                    | Name          |
------------------------------------------------------
|   0 | UPDATE STATEMENT             |               |
|   1 |  UPDATE                      | empdept       |
|   2 |   TABLE ACCESS BY INDEX ROWID| empdept       |
|   3 |    INDEX UNIQUE SCAN         | SYS_C0031043  |
------------------------------------------------------
```
You also have the option to store the collection with JSON-Relational Duality Views (JDV), but this partial update is not supported:
```
arrayFilters option is not supported on duality view collections
```
However, even if the JDV doesn't allow partial updates, the aggregate is stored as individual rows in SQL tables, and users can switch to SQL statements for partial updates.
Duality Views
In relational databases, applications typically work with a logical view of the data model, while the storage layer may transparently cluster or scatter data into fixed-size blocks. Oracle Database (and MySQL) added JSON-Relational Duality Views (JDV) to present a single logical view for JSON documents while splitting them into multiple SQL tables. As each SQL table has its own physical segment(s), aggregates are stored across multiple physical blocks.
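The duality itself can be modeled in a few lines of Python (a toy mapping, not Oracle's implementation): one function shreds the document into normalized rows, and its inverse joins them back, so the same data supports both the relational and the document view.

```python
def shred(doc):
    """Split a department aggregate into normalized 'rows', the way a
    duality view maps one JSON document onto DEPT and EMP tables."""
    dept_row = {k: v for k, v in doc.items() if k != "employees"}
    emp_rows = [{**e, "deptno": doc["_id"]} for e in doc.get("employees", [])]
    return dept_row, emp_rows

def assemble(dept_row, emp_rows):
    """The inverse mapping: join the rows back into one document."""
    return {**dept_row,
            "employees": [{k: v for k, v in e.items() if k != "deptno"}
                          for e in emp_rows if e["deptno"] == dept_row["_id"]]}

doc = {"_id": 10, "dname": "ACCOUNTING",
       "employees": [{"empno": 7782, "ename": "CLARK"},
                     {"empno": 7839, "ename": "KING"}]}
dept_row, emp_rows = shred(doc)
assert assemble(dept_row, emp_rows) == doc  # duality: both views, same data
```

The round trip is lossless, which is the point of the duality. What it does not give you is physical locality: each set of rows lands in its own segment.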
If you are a fan of the single‑table or single‑collection idea, you may create a single view. However, it has no advantage in data locality, as you explicitly normalized to multiple tables:
```sql
CREATE TABLE dept (
  deptno NUMBER CONSTRAINT dept_pk PRIMARY KEY,
  dname  VARCHAR2(14),
  loc    VARCHAR2(13)
);

CREATE TABLE emp (
  empno  NUMBER CONSTRAINT emp_pk PRIMARY KEY,
  ename  VARCHAR2(10),
  job    VARCHAR2(9),
  sal    NUMBER(7,2),
  deptno NUMBER NOT NULL,
  CONSTRAINT emp_dept_fk FOREIGN KEY (deptno) REFERENCES dept(deptno)
);

CREATE INDEX emp_dept_fk ON emp (deptno);

INSERT INTO dept (deptno, dname, loc) VALUES (10, 'ACCOUNTING', 'NEW YORK');
INSERT INTO emp (empno, ename, job, sal, deptno) VALUES (7782, 'CLARK', 'MANAGER', 2450, 10);
INSERT INTO emp (empno, ename, job, sal, deptno) VALUES (7839, 'KING', 'PRESIDENT', 5000, 10);
-- add many departments to check the number of blocks scanned
INSERT INTO dept(deptno) SELECT rownum+1000 FROM xmltable('0 to 999');

-- one view over multiple tables with JDV
CREATE OR REPLACE JSON RELATIONAL DUALITY VIEW empdept_dv AS
SELECT JSON {
  'id'   : d.deptno,
  'dname': d.dname,
  'loc'  : d.loc,
  'employees' : [
    SELECT JSON {
      'empno': e.empno,
      'ename': e.ename,
      'job'  : e.job,
      'sal'  : e.sal
    }
    FROM emp e WITH INSERT UPDATE DELETE
    WHERE e.deptno = d.deptno
  ]
}
FROM dept d WITH INSERT UPDATE DELETE;
```
The view looks like a document collection with an embedded one-to-many relationship, similar to MongoDB’s recommended model, but it’s not truly embedded—the view actually splits it into two tables. We have lost the main advantage of MongoDB: data that's accessed together should be stored together.
I can query it as if it were a single table:
```sql
SELECT data FROM empdept_dv
WHERE JSON_EXISTS("DATA",'$?(@._id.numberOnly() == $B0)' passing 10 as "B0" type(strict));
```
but it actually reads two tables:
```
SQL> select * from dbms_xplan.display_cursor(format=>'ALLSTATS LAST -COST');

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------
SQL_ID gfr8jkdwatdnz, child number 0
-------------------------------------
SELECT data FROM empdept_dv WHERE JSON_EXISTS("DATA",'$?(@._id.numberOnly()
== $B0)' passing 10 as "B0" type(strict))

Plan hash value: 2755083285

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name        | Starts | E-Rows | A-Rows | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |             |      1 |        |      1 |       3 |       |       |          |
|   1 |  SORT GROUP BY                       |             |      1 |      1 |      1 |       2 |  2048 |  2048 | 2048 (0) |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| EMP         |      1 |      1 |      2 |       2 |       |       |          |
|*  3 |    INDEX RANGE SCAN                  | EMP_DEPT_FK |      1 |      1 |      2 |       1 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID        | DEPT        |      1 |      1 |      1 |       3 |       |       |          |
|*  5 |    INDEX UNIQUE SCAN                 | DEPT_PK     |      1 |      1 |      1 |       2 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("E"."DEPTNO"=:B1)
   5 - access("D"."DEPTNO"=10)
```
The access is efficient, using the index on the foreign key, but it doesn't colocate a one-to-many relationship as we expect when storing an aggregate into a single document, or when using the single-table design.
You can update through the view, but the following will update a lot more than what you expect, as it rewrites the full document:
```sql
UPDATE empdept_dv
SET data = JSON_TRANSFORM( data, SET '$.employees[0].sal' = 2600 )
WHERE JSON_EXISTS("DATA",'$?(@._id.numberOnly() == $B0)' passing 10 as "B0" type(strict));
```
Using duality views instead of OSON storage allows you to fall back to SQL to update a single item, similar to a single-table design. However, unlike DynamoDB, you cannot group different item types in the same table. These are SQL tables with a fixed schema and no polymorphism, so you cannot store both department and employee attributes in a single table. JSON-Relational Duality Views don’t replace document databases: they act like an object-relational mapper (ORM) with all logic deployed in the SQL database.
Cluster
To get closer to the performance of a document database, I need a single-table duality view capable of storing documents that are split across multiple SQL tables, yet behave as if they were in a single physical table. In Oracle Database, this corresponds to a CLUSTER, where only the key is declared, like in DynamoDB tables, and an estimated value size is predefined to fill the fixed-size blocks:
```sql
CREATE CLUSTER dept_cluster (
  deptno NUMBER   -- clustering key
) SIZE 1024       -- expected bytes per cluster key
;

CREATE INDEX dept_cluster_idx ON CLUSTER dept_cluster;
```
The logical SQL tables are stored in the cluster under a common clustering key, instead of allocating their own physical segment:
```sql
CREATE TABLE dept (
  deptno NUMBER CONSTRAINT dept_pk PRIMARY KEY,  -- clustering key
  dname  VARCHAR2(14),
  loc    VARCHAR2(13)
) CLUSTER dept_cluster (deptno);

CREATE TABLE emp (
  empno  NUMBER CONSTRAINT emp_pk PRIMARY KEY,
  ename  VARCHAR2(10),
  job    VARCHAR2(9),
  sal    NUMBER(7,2),
  deptno NUMBER NOT NULL,                        -- clustering key
  CONSTRAINT emp_dept_fk FOREIGN KEY (deptno) REFERENCES dept(deptno)
) CLUSTER dept_cluster (deptno);

INSERT INTO dept (deptno, dname, loc) VALUES (10, 'ACCOUNTING', 'NEW YORK');
INSERT INTO emp (empno, ename, job, sal, deptno) VALUES (7782, 'CLARK', 'MANAGER', 2450, 10);
INSERT INTO emp (empno, ename, job, sal, deptno) VALUES (7839, 'KING', 'PRESIDENT', 5000, 10);
-- add many departments to check the number of blocks scanned
INSERT INTO dept(deptno) SELECT rownum+1000 FROM xmltable('0 to 999');
```
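As a mental model of what this DDL sets up, here is a toy Python version (a deliberate simplification, not Oracle internals): rows from both tables that share a clustering key land in the same fixed-size block, so reading the key's block returns the pre-joined aggregate.

```python
class ToyCluster:
    """Rows from any table that share a clustering key go into the
    same fixed-size 'block', emulating an Oracle indexed cluster."""
    def __init__(self, size=1024):
        self.size = size    # expected bytes per cluster key (SIZE clause)
        self.blocks = {}    # cluster key -> list of (table, row)

    def insert(self, table, key, row):
        self.blocks.setdefault(key, []).append((table, row))

    def read(self, key):
        """One block access returns the pre-joined aggregate."""
        return self.blocks.get(key, [])

c = ToyCluster()
c.insert("dept", 10, {"dname": "ACCOUNTING"})
c.insert("emp", 10, {"empno": 7782, "ename": "CLARK"})
c.insert("emp", 10, {"empno": 7839, "ename": "KING"})
print(len(c.read(10)))  # 3
```

One lookup on the clustering key returns the department row and its employee rows together, which is what the real cluster achieves at the block level.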
I created the same duality view as before, as the logical tables are the same, and query it:
```sql
SELECT data FROM empdept_dv
WHERE JSON_EXISTS("DATA",'$?(@._id.numberOnly() == $B0)' passing 10 as "B0" type(strict));
```
Physically, it accesses the logical tables in the pre-joined cluster, via its index:
```
SQL> select * from dbms_xplan.display_cursor(format=>'ALLSTATS LAST');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
SQL_ID 26py2vsmch5kx, child number 0
-------------------------------------
SELECT data FROM empdept_dv WHERE JSON_EXISTS("DATA",'$?(@._id.numberOnly()
== $B0)' passing 10 as "B0" type(strict))

Plan hash value: 176041294

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Starts | A-Rows | Buffers | Reads |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |      1 |      1 |       3 |     2 |       |       |          |
|   1 |  SORT GROUP BY               |                  |      1 |      1 |       3 |     2 |  2048 |  2048 | 2048 (0) |
|   2 |   TABLE ACCESS CLUSTER       | EMP              |      1 |      2 |       3 |     2 |       |       |          |
|*  3 |    INDEX UNIQUE SCAN         | DEPT_CLUSTER_IDX |      1 |      1 |       2 |     2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| DEPT             |      1 |      1 |       3 |     2 |       |       |          |
|*  5 |    INDEX UNIQUE SCAN         | DEPT_PK          |      1 |      1 |       2 |     2 |       |       |          |
----------------------------------------------------------------------------------------------------------------------
```
The execution plan shows access to the cluster using its index on DEPTNO. However, the plan is unclear and appears to use both the DEPT index and the cluster index. I displayed the number of buffer gets, but the math doesn’t add up: a total of three buffers seems reasonable—two index blocks (root and leaf) and one cluster block—yet the detailed output shows three for each index, which should mean six in total.
I confirmed that all my data is in a single block—my goal—by checking the ROWID, which contains the block identifier:
```
select rowid,
       dbms_rowid.rowid_relative_fno(rowid) as file_no,
       dbms_rowid.rowid_block_number(rowid) as block_no,
       to_char(dbms_rowid.rowid_block_number(rowid),'xxxxxxxx') as block_hex,
       dept.*
  from dept where deptno = 10;

ROWID               FILE_NO  BLOCK_NO  BLOCK_HEX  DEPTNO  DNAME       LOC
AAASKBAAAAAAK5hAAA        0     44641       ae61      10  ACCOUNTING  NEW YORK

select rowid,
       dbms_rowid.rowid_relative_fno(rowid) as file_no,
       dbms_rowid.rowid_block_number(rowid) as block_no,
       to_char(dbms_rowid.rowid_block_number(rowid),'xxxxxxxx') as block_hex,
       emp.*
  from emp where deptno = 10;

ROWID               FILE_NO  BLOCK_NO  BLOCK_HEX  EMPNO  ENAME  JOB         SAL  DEPTNO
AAASKBAAAAAAK5hAAA        0     44641       ae61    7782  CLARK  MANAGER    2450      10
AAASKBAAAAAAK5hAAB        0     44641       ae61    7839  KING   PRESIDENT  5000      10
```
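The block number can in fact be read straight out of the ROWID. Here is a Python sketch of the documented extended ROWID layout (six base-64 characters for the data object id, three for the relative file number, six for the block number, three for the row number, over Oracle's A-Za-z0-9+/ alphabet):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode_rowid(rowid):
    """Decode an 18-character extended ROWID into its components:
    OOOOOO = data object id, FFF = relative file number,
    BBBBBB = block number, RRR = row number within the block."""
    def val(chunk):
        n = 0
        for ch in chunk:
            n = n * 64 + ALPHABET.index(ch)
        return n
    return {"objd": val(rowid[0:6]), "file": val(rowid[6:9]),
            "block": val(rowid[9:15]), "row": val(rowid[15:18])}

r = decode_rowid("AAASKBAAAAAAK5hAAA")
print(r["block"], hex(r["block"]), hex(r["objd"]))
# 44641 0xae61 0x12281
```

Decoding the ROWIDs above gives the same block number (44641, 0xae61) and the cluster's data object id, confirming the colocation without any PL/SQL call.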
They are all in block ae61. To understand this better, I traced the consistent reads (with event 10200):
```
ktrgtc2(): started for block <0x0000 : 0x0000ae71> objd: 0x00012288, DirtyRead: 0:0 env [0x7ffff8925fd8]: (scn: 0x00000000004cdc0e xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000000000000000 hi-scn: 0x00000000004cdc0e ma-scn: 0x00000000004cdbea flg: 0x00000662)
ktrgtc2(): completed for block <0x0000 : 0x0000ae71> objd: 0x00012288 ret:0x0 flg:0x7fff
ktrgtc2(): started for block <0x0000 : 0x0000ae72> objd: 0x00012288, DirtyRead: 0:0 env [0x7ffff8925fd8]: (scn: 0x00000000004cdc0e xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000000000000000 hi-scn: 0x00000000004cdc0e ma-scn: 0x00000000004cdbea flg: 0x00000662)
ktrgtc2(): completed for block <0x0000 : 0x0000ae72> objd: 0x00012288 ret:0x0 flg:0x7fff
ktrgtc2(): started for block <0x0000 : 0x0000ae61> objd: 0x00012281, DirtyRead: 0:0 env [0x7ffff8925fd8]: (scn: 0x00000000004cdc0e xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000000000000000 hi-scn: 0x00000000004cdc0e ma-scn: 0x00000000004cdbea flg: 0x00000662)
ktrgtc2(): completed for block <0x0000 : 0x0000ae61> objd: 0x00012281 ret:0x0 flg:0x2f04d
ktrgtc2(): started for block <0x0000 : 0x0000ae69> objd: 0x00012282, DirtyRead: 0:0 env [0x7ffff8925fd8]: (scn: 0x00000000004cdc0e xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000000000000000 hi-scn: 0x00000000004cdc0e ma-scn: 0x00000000004cdbea flg: 0x00000662)
ktrgtc2(): completed for block <0x0000 : 0x0000ae69> objd: 0x00012282 ret:0x0 flg:0x90040
ktrgtc2(): started for block <0x0000 : 0x0000ae6a> objd: 0x00012282, DirtyRead: 0:0 env [0x7ffff8925fd8]: (scn: 0x00000000004cdc0e xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000000000000000 hi-scn: 0x00000000004cdc0e ma-scn: 0x00000000004cdbea flg: 0x00000662)
ktrgtc2(): completed for block <0x0000 : 0x0000ae6a> objd: 0x00012282 ret:0x0 flg:0x90040
```
I checked from DBA_OBJECTS and DBA_EXTENTS:
- blocks 0xae71 and 0xae72 are DEPT_PK (objd: 0x12288)
- block 0xae61 is DEPT_CLUSTER (objd: 0x12281), the cluster's block for deptno=10
- blocks 0xae69 and 0xae6a are DEPT_CLUSTER_IDX (objd: 0x12282)
Note that I've run the query with the same predicate that is generated from the MongoDB emulation, and you can see the full SQL query with:
```sql
variable c clob
set autoprint on long 10000 pagesize 1000
exec DBMS_UTILITY.EXPAND_SQL_TEXT( q'[ SELECT data FROM empdept_dv WHERE JSON_VALUE(data, '$._id') = 10 ]', :c)
```
I got the same execution plan with the simplified syntax:
```sql
SELECT data FROM empdept_dv d WHERE d.data."_id"=10;
```
With the simplest join syntax, on SQL tables, I got a similar plan reading 6 blocks:
```
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
SQL_ID cys0qt7ttbkzd, child number 0
-------------------------------------
select * from dept , emp where emp.deptno=dept.deptno and dept.deptno=10
and emp.deptno=10

Plan hash value: 3732773526

---------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | A-Rows | Buffers |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |      2 |       6 |
|   1 |  NESTED LOOPS                |         |      1 |      2 |       6 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |      1 |      1 |       3 |
|*  3 |    INDEX UNIQUE SCAN         | DEPT_PK |      1 |      1 |       2 |
|*  4 |   TABLE ACCESS CLUSTER       | EMP     |      1 |      2 |       3 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DEPT"."DEPTNO"=10)
   4 - filter(("EMP"."DEPTNO"=10 AND "EMP"."DEPTNO"="DEPT"."DEPTNO"))
```
This NESTED LOOP with TABLE ACCESS CLUSTER is the typical join on pre-joined rows in a cluster. One index entry gets all employee rows together.
This is likely the closest we can get to a true "single-table" design for JSON-relational duality views. An Oracle indexed cluster stores related rows in a single block, but the optimizer, as shown in execution plans, still does not share the cluster index across the two tables and fully exploit the pre-joined one-to-many relationship.
We’ve come full circle. Using features from 1980 (CLUSTER) and 2020 (JDV), we have achieved a Single-Cluster Duality View that preserves physical colocation like a document database, while still declaring multiple SQL tables underneath, with proud respect for normal forms.
Conclusion
This is mainly interesting for fun (note the publication date) and for how it challenges our mental models. Don't apply this design to your application. Adding layers instead of reducing complexity isn’t why you choose a document model.
Data locality has always driven performance because, despite advances in hardware and software, the laws of physics still apply. In the 1980s, when relational databases abstracted away the physical model, Oracle adopted this abstraction with SQL tables but also introduced physical clustering to preserve data colocation and avoid benchmark regressions. The idea that data accessed together should be stored together stayed relevant in the 2000s, as databases scaled horizontally and document databases like MongoDB emerged from this principle. It remains true today: you get the best performance by storing an application's object aggregates in a single block on disk. MongoDB adopted WiredTiger as its storage engine because it offers a flexible block-size B-tree that stores collection documents without fragmentation.
But flexible storage engines like WiredTiger didn't exist when the traditional RDBMSs designed their storage. You should avoid using the clustering technique described here for Oracle tables, as it is not very flexible when data updates occur on fixed-size blocks. The CLUSTER clause in CREATE TABLE is simply ignored by the Autonomous Database managed service. Interestingly, you may be surprised to realize that indexed clusters are actually used daily in Oracle databases to store dictionary tables. Oracle's implementation of the system dictionary treats these tables as logical relational tables, but physically denormalizes them. If you're skeptical, ask your DBA to run:
```sql
SELECT COUNT(*) FROM dba_objects WHERE object_type = 'CLUSTER';
```
You likely have at least nine of them—ten if you created one for EMP/DEPT. If you EXPLAIN PLAN FOR this query on dba_objects, you’ll see a huge UNION-ALL for what today would be stored as a document: a polymorphic aggregate to describe the various database object attributes in the database catalog.
I’ve written this for educational purposes only, and published it on a specific date, when any article can be a joke even when fact-based. When you read recommendations about design patterns, database choices, or data modeling rules, always cross-check how the system really works, and whether data accessed together is actually stored together, or merely presented that way by multiple internal abstraction layers and transformations. In case of doubt, look at the execution plan.
DEV Community
https://dev.to/franckpachot/the-single-duality-view-pattern-can-sqljson-preserve-aggregate-locality-4ifd
