Lessons Summarized by Michael Stonebraker

Tianzhou Chen - 28 Jan 2012

Paper: What Goes Around Comes Around

IMS Era

Physical and logical data independence are highly desirable
Tree structured data models are very restrictive
It is a challenge to provide sophisticated logical reorganizations of tree structured data
A record-at-a-time user interface forces the programmer to do manual query optimization, and this is often hard.

CODASYL Era

Networks are more flexible than hierarchies but more complex
Loading and recovering networks is more complex than hierarchies

Relational Era

Set-at-a-time languages are good, regardless of the data model, since they offer much improved physical data independence.
Logical data independence is easier with a simple data model than with a complex one.
Technical debates are usually settled by the elephants of the marketplace, and often for reasons that have little to do with the technology.
Query optimizers can beat all but the best record-at-a-time DBMS application programmers.
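
To make the record-at-a-time versus set-at-a-time contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The supplier/supply schema, the sample rows, and both function names are assumptions invented for this illustration; they are not taken from the paper. The first function fixes the access path by hand (scan suppliers, then probe supply), while the second only states what is wanted and leaves the plan to the optimizer.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sno INTEGER PRIMARY KEY, sname TEXT, city TEXT);
    CREATE TABLE supply (sno INTEGER, pno INTEGER, qty INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme', 'Boston'), (2, 'Bolt', 'Austin');
    INSERT INTO supply VALUES (1, 10, 100), (1, 11, 50), (2, 10, 300);
""")


def record_at_a_time(conn, pno):
    """Navigate one record at a time: the programmer picks the access path."""
    names = []
    # Outer loop: visit every supplier record.
    for sno, sname in conn.execute("SELECT sno, sname FROM supplier"):
        # Inner probe: does this supplier supply the requested part?
        hit = conn.execute(
            "SELECT 1 FROM supply WHERE sno = ? AND pno = ? LIMIT 1",
            (sno, pno),
        ).fetchone()
        if hit is not None:
            names.append(sname)
    return sorted(names)


def set_at_a_time(conn, pno):
    """State the result set declaratively; the optimizer picks the plan."""
    rows = conn.execute(
        "SELECT DISTINCT s.sname "
        "FROM supplier s JOIN supply y ON s.sno = y.sno "
        "WHERE y.pno = ? ORDER BY s.sname",
        (pno,),
    )
    return [name for (name,) in rows]
```

Both functions return the same supplier names for a given part number. If an index were later added on supply(pno), the second version could benefit without any change to the program, whereas the hand-written loop order in the first version would have to be revisited by the programmer.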

The Entity-Relationship Era

Functional dependencies are too difficult for mere mortals to understand. This is another reason for KISS (Keep It Simple, Stupid).

R++ Era

Unless there is a big performance or functionality advantage, new constructs will go nowhere.

The Semantic Data Model Era (Similar to R++ Era)

OO Era

Packages will not sell to users unless they are in “major pain”
Persistent languages will go nowhere without the support of the programming language community.

The Object-Relational Era

The major benefits of OR are two-fold: putting code in the data base (and thereby blurring the distinction between code and data) and user-defined access methods.
Widespread adoption of new technology requires standards and/or an elephant pushing hard.
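
The first lesson of this era, putting code in the data base, can be sketched with a user-defined function. The example below uses Python's built-in sqlite3 module and its create_function call. SQLite is not an object-relational DBMS and does not offer user-defined access methods, so this is only a rough stand-in for the idea; the landmark table, its rows, and the within_box function are assumptions invented for this illustration.

```python
import sqlite3


def within_box(x, y, x0, y0, x1, y1):
    """Hypothetical user-defined predicate: is point (x, y) inside the box?"""
    return int(x0 <= x <= x1 and y0 <= y <= y1)


conn = sqlite3.connect(":memory:")
# Register the Python function so that SQL statements can call it by name.
conn.create_function("within_box", 6, within_box)

# Hypothetical table and rows, invented for this illustration.
conn.executescript("""
    CREATE TABLE landmark (name TEXT, x REAL, y REAL);
    INSERT INTO landmark VALUES ('cafe', 2, 3), ('dock', 12, 1), ('park', 9, 10);
""")

# The user-defined function runs inside the query, next to the data,
# instead of in application code after the rows have been fetched.
rows = conn.execute(
    "SELECT name FROM landmark "
    "WHERE within_box(x, y, 0, 0, 10, 10) ORDER BY name"
).fetchall()
names = [name for (name,) in rows]
```

Here the predicate is evaluated by the query engine for each row, so the application receives only the matching landmarks. A real object-relational system could additionally index such a predicate with a user-defined access method, which this sketch does not attempt.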

Semi Structured Data

Schema-last is probably a niche market
XQuery is pretty much OR SQL with a different syntax
XML will not solve the semantic heterogeneity problem either inside or outside the enterprise.