The Entity-Relationship Approach

Arif Zainurrohman
Nerd For Tech
Published in
9 min readApr 11, 2021

--

Relationship

Introduction

The focus of this discussion is on ensuring that the data meets business requirements.

Much of the discussion is devoted to the correct use of terminology and diagramming conventions, which provide a bridge between technical and business views of data requirements.

A Diagrammatic Representation

The fact that each operation can be performed by only one surgeon (because each row of the Operation table allows only one surgeon number) is an important constraint imposed by the data model, but is not immediately apparent.

Process modelers solve this sort of problem by using diagrams, such as data flow diagrams and activity diagrams, showing the most important features of their models. We can approach data models the same way.

  1. The Basic Symbols: Boxes and Arrows

We start by presenting our model as a data structure diagram using just two symbols:

a. A “box” (strictly speaking, a rectangle) represents a table.

b. An arrow drawn between two boxes represents a foreign key pointing back to the table where it appears as a primary key.

box

2. Diagrammatic Representation of Foreign Keys

When drawing ER diagrams, we have used the following graphical convention: Label the relationship lines with the foreign key column name(s). This makes it clear which column in the child table is the foreign key to the parent table. Indicating primary key status can be done by underlining the attribute in question.

Foreign Key

3. Interpreting the Diagram

The first two rules would have been obvious from the relational representation, the other two much less so. With the diagram, we have succeeded in summarizing the relationships between tables implied by our
primary and foreign keys, without having to actually list any column names
at all.

4. Optionality

The diagram may also raise the possibility of operations that do not involve
any entity at all: “We don’t usually involve a surgeon when we are treating
a patient with a small cut, but we still need to record whether any drugs were
used”.

5. Verifying the Model

The diagram provides an excellent starting point for verifying the model
with users and business specialists. Intelligent, thorough checking of each arrow on the diagram will often reveal unsound assumptions and misunderstandings or, equally useful, increase stakeholders’ confidence in the
workability of the model.

6. Redundant Arrows

Does this arrow add anything to our knowledge of the business rules
supported by the model? It tells us that each operation must be performed
at one entity. But we can deduce this from the other two arrows, which
specify that each operation must be managed by an entity. The arrow also shows that a program could “navigate” directly from a row in a table to the corresponding.

The Top-Down Approach: Entity-Relationship Modeling

Although the diagram contains little new information, it communicates some of the model’s most important rules so much more clearly that you should never review or present a model without drawing one. It is interesting to prepare a diagram for such a database and show it to programmers and analysts.

  1. Developing the Diagram Top Down

The top-down approach also overcomes most of the limitations of
normalization used by itself. We do not need to start with a formidably
complex single table, nor do we need to accept the tables implicitly defined
by our historical choice of determinants.

2. Terminology

Our terminology for the conceptual model is more business-oriented. Again, there are three basic concepts:

A. Entity classes: categories of things of interest to the business; represented by boxes on the diagram, and generally implemented as tables

B. Attributes: what we want to know about entity classes; not usually shown on the diagram and generally implemented as columns in tables

C. Relationships: represented by lines with crows’ feet (we will drop the term “arrow” now that we are talking about conceptual models), and generally implemented through foreign keys.

Entity Classes

An entity class is a real-world class such as a hospital. We make the distinction between entities, such as “RSUD Cengkareng” and entity classes (sometimes called entity types) such as “Hospital.”

All entity classes will meet the criterion of being “a class of things we need to keep information about,” as long as we are happy for “thing” to include more abstract concepts such as events (e.g., Operation) and classifications (e.g., Operation Type).

  1. Entity Diagramming Convention

Entity classes are represented by boxes with rounded corners.

2. Entity Class Naming

The name of an entity class must be in the singular and refer to a single
instance (in relational terms, a row) not to the whole table. Thus, collective
terms like File, Table, Catalog, History, and Schedule are inappropriate.

We do this for three reasons:
A. Consistency: It is the beginning of a naming standard for entity classes.
B.
Communication: An entity class is “something we want to keep information about,” such as a customer rather than a customer file.
C.
Generating business assertions: if we follow some simple rules in naming the components of an E-R model, we can automatically generate grammatically
sound assertions that can be checked by stakeholders.

3. Entity Class Definitions

Entity class names must be supported by definitions. We cannot overemphasize the importance of good entity class definitions. From time to time, data modelers get stuck in long arguments without much apparent progress. Almost invariably, they have not put adequate
effort into pinning down some working definitions, and they are continually making subtle mental adjustments, which are never recorded. Modelers
frequently (and probably unwittingly) shift definitions in order to support
their own position in the discussion.

Relationships

For example, In the drug expenditure model, the lines between boxes can be interpreted in real-world terms as relationships between entity classes. There are relationships, for example, between hospitals and surgeons and between operations and drug administrations.

  1. Relationship Diagramming Conventions

We have already used a convention for annotating the lines to describe their
meaning (relationship names), cardinality (the crow’s foot can be interpreted as meaning “many,” its absence as meaning “one”), and optionality
(the circles and bars representing “optional” and “mandatory” respectively).

2. Many-to-Many Relationships

Many-to-many relationships crop up regularly in E-R diagrams in practice. A many-to-many relationship is a type of cardinality that refers to the relationship between two entities A and B in which A may contain a parent instance for which there are many children in B and vice versa.

3. One-to-One Relationships

The term one-to-one relationships refer to relationships of two items in which one can only belong with the other.

4. Self-Referencing Relationships

This type of relationship is sometimes called a “head-scratcher,” not
only because of its appearance, but because of the difficulty many people
have in coming to grips with the recursive structure it represents.

5. Relationships Involving Three or More Entity Classes

Whenever you encounter what appears to be a higher degree relationship, you should check that it is not in fact made up of individual many-to-many relationships among the participating entity classes. The two situations are not equivalent, and choosing the wrong representation may lead to normalization problems.

6. Transferability

An important property of relationships that receives less attention than it
should from writers and tool developers is transferability. We suspect
there are two reasons for its neglect.

First, its impact on the design of a relational database is indirect. Changing a relationship from transferable to nontransferable will not affect the automatic part of the conversion of a conceptual model to relational tables.

Second, most diagramming tools do not support a symbol to indicate transferability. However, some do provide for it to be recorded in supporting documentation, and the Chen E-R conventions support the closely
related concept of weak entity classes.

7. Dependent and Independent Entity Classes

An independent entity class is one whose instances can have an independent existence. By contrast, a dependent entity class is one whose instances can only exist in conjunction with instances of another entity class, and cannot be transferred between instances of that other entity. In other words, an entity class is dependent if (and only if) it has a mandatory, nontransferable many-to-one (or one-to-one) relationship with another entity class.

Attributes

We sometimes show a few attributes on the diagram for clarification of entity class meaning (or to illustrate a particular point), and some modeling tools support the inclusion of a nominated subset of attributes. But we do not generally show all of the attributes on the diagram, primarily because we would end up swamping our “big picture” with detail.

  1. Attribute Identification and Definition

We sometimes show a few attributes on the diagram for clarification of
entity class meaning (or to illustrate a particular point), and some modeling tools support the inclusion of a nominated subset of attributes. But we
do not generally show all of the attributes on the diagram, primarily because
we would end up swamping our “big picture” with detail.

2. Primary Keys and the Conceptual Model

Recall that, in a relational model, every table must have a primary key.
In E-R modeling, we can identify entity classes prior to defining their keys.
In some cases, none of the attributes of an entity class (alone or in combination) is suitable as a primary key.

Myths and Folklore

As with any relatively new discipline, data modeling has acquired its own folklore of “guidelines” and “rules.” Some of these can be traced to genuine attempts at encouraging good and consistent practice. Labels a number of situations “impossible” when a more accurate description would be “possible but not very common.” The sensible data modeler will be alerted by such situations, but will not reject a model solely on the basis that it violates some such edict.

  1. Entity Classes without Relationships

It is perfectly possible, though not common, to have an entity class that is
not related to any other entity class. A trivial case that arises occasionally is
a model containing only one entity class. Other counter-examples appear
in models to support management information systems, which may require
data from disparate sources.

2. Allowed Combinations of Cardinality and Optionality

The problem with relationships that are mandatory in both directions
may be the “chicken and egg” question: which comes first? We cannot
record a customer without an account, and we cannot record an account
without a customer. In fact, the problem is illusory, as we create both the
customer and the account within one transaction. The database meets the
stated constraints both at the beginning and the end of the transaction.

Creativity and E-R Modeling

The element of choice is far more apparent in E-R modeling than in normalization, as we would expect. In E-R modeling we are defining our categories of data. In normalization, have been determined (often by someone else) before we start. The process of categorization is so subjective that even the broadest division of data, into entity classes and relationships, offers some choice.

Summary

Data models can be presented diagrammatically by using a box to represent each table and a line for each foreign key relationship. Further diagramming conventions allow the name, cardinality, and optionality of the relationships to be shown. We can view the boxes as representing entity classes — things about which the business needs to keep information — and the lines as representing business relationships between entity classes. The resulting model is often called an Entity-Relationship (E-R) model.

Entity class identification is essentially a process of classifying data, and there is considerable room for choice and creativity in selecting the most useful classification. Entity class naming and definition are critical. To model such a relationship in one of the notations, you must use an intersection entity class.

Much folklore surrounds relationships. Most combinations of optionality, cardinality, transferability, and recursion are possible in some contexts. The modeler should be alert for unusual combinations but examine each case from first principles.

Reference

Data Modeling Theory and Practice Graeme Simsion

--

--

Arif Zainurrohman
Nerd For Tech

Corporate Data Analytics. Enthusiast in all things data, personal finance, and Fintech.