Subtypes and Supertypes


In this discussion, we look at a particular and very important type of choice in data modeling. In fact, it is so important that we introduce a special convention, subtyping, to allow our E-R diagrams to show several different options at the same time. We will also find subtyping useful for concisely representing rules and constraints, and for managing complexity. Our emphasis in this discussion is on the conceptual modeling phase, and we touch only lightly on logical modeling issues.
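As a rough illustration of the idea (not the E-R notation itself), a supertype and its subtypes can be sketched with Python classes. The names `Party`, `Person`, and `Organization`, and all of their attributes, are hypothetical examples rather than anything taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Party:
    """Supertype: attributes shared by every subtype (hypothetical example)."""
    party_id: int
    name: str


@dataclass
class Person(Party):
    """Subtype: adds an attribute that only applies to people."""
    birth_date: str


@dataclass
class Organization(Party):
    """Subtype: adds an attribute that only applies to organizations."""
    registration_no: str


parties = [
    Person(1, "Ann Lee", "1980-04-02"),
    Organization(2, "Acme Ltd", "REG-001"),
]

# Every subtype instance is also an instance of the supertype,
# so a rule stated once for Party covers both subtypes.
all_are_parties = all(isinstance(p, Party) for p in parties)
```

Modeling at the `Party` level or at the `Person`/`Organization` level is exactly the kind of choice that subtyping lets a diagram show side by side.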

Different Levels of Generalization

It is important to recognize that our choice of level of generalization will have a…



The focus of this discussion is on ensuring that the data meets business requirements.

Much of the discussion is devoted to the correct use of terminology and diagramming conventions, which provide a bridge between technical and business views of data requirements.

A Diagrammatic Representation

The fact that each operation can be performed by only one surgeon (because each row of the Operation table allows only one surgeon number) is an important constraint imposed by the data model, but is not immediately apparent.
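To make the hidden constraint visible, here is a minimal sketch using Python's built-in `sqlite3` module. Only the Operation table and its surgeon number come from the text; the other table, the column names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one surgeon_number column per operation row.
conn.execute(
    "CREATE TABLE surgeon (surgeon_number INTEGER PRIMARY KEY, surgeon_name TEXT)"
)
conn.execute("""
    CREATE TABLE operation (
        operation_number INTEGER PRIMARY KEY,
        surgeon_number   INTEGER NOT NULL REFERENCES surgeon (surgeon_number)
    )
""")
conn.executemany(
    "INSERT INTO surgeon VALUES (?, ?)", [(1, "Surgeon A"), (2, "Surgeon B")]
)
conn.execute("INSERT INTO operation VALUES (100, 1)")

# Recording a second surgeon for operation 100 would need a second row
# with the same key, which the primary key rejects.
try:
    conn.execute("INSERT INTO operation VALUES (100, 2)")
    second_surgeon_allowed = True
except sqlite3.IntegrityError:
    second_surgeon_allowed = False
```

Nothing in the table definition says "one surgeon per operation" in words; the rule only follows from the structure, which is why it is easy to miss.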

Process modelers solve this sort of problem by using diagrams, such as data flow diagrams and activity…

Basics of Data Structure


The principal tool is normalization, a set of rules for allocating data to tables in such a way as to eliminate certain types of redundancy and incompleteness.

Normalization is usually one of the later activities in a data modeling project, as we cannot start normalizing until we have established what columns (data items) are required.

Normalization is used in the logical database design stage, following requirements analysis and conceptual modeling.

An Informal Example of Normalization

Normalization is essentially a two-step process:

1. Put the data into a tabular form (by removing repeating groups).

2. Remove duplicated data to separate…
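The two steps can be sketched in plain Python. The order and customer data below are hypothetical, and the sketch only shows the general shape of the process, not the formal normalization rules.

```python
# Unnormalized records: each order carries a repeating group of items
# (hypothetical data).
orders = [
    {"order_no": 1, "customer_no": 10, "customer_name": "Ann",
     "items": [("pen", 2), ("ink", 1)]},
    {"order_no": 2, "customer_no": 10, "customer_name": "Ann",
     "items": [("pad", 5)]},
]

# Step 1: tabular form, one row per item, no repeating group.
flat_rows = [
    (o["order_no"], o["customer_no"], o["customer_name"], product, qty)
    for o in orders
    for product, qty in o["items"]
]

# Step 2: move duplicated facts out of the flat rows.
# The customer name is now stored once per customer,
# and the customer number once per order.
customers = {cust_no: name for _, cust_no, name, _, _ in flat_rows}
order_headers = {order_no: cust_no for order_no, cust_no, _, _, _ in flat_rows}
order_items = [(order_no, product, qty) for order_no, _, _, product, qty in flat_rows]
```

After step 1 the name "Ann" appears on three rows; after step 2 it appears once.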

Data Modeling

What is a data model?

Data Modeling refers to the practice of documenting software and business system design. The “modeling” of these various systems and processes often involves the use of diagrams, symbols, and textual references to represent the way the data flows through a software application or the Data Architecture within an enterprise.

Why Is the Data Model Important?

When designing programs or report layouts (for example), we generally settle for a design that “does the job” even though we recognize that with more time and effort we might be able to develop a more elegant solution.

  • Leverage


Spark & PostgreSQL

Apache Spark is a cluster computing engine designed for fast computation. It builds on the ideas of Hadoop MapReduce and extends the MapReduce model to support more types of computation efficiently, including interactive queries and stream processing.

For this exercise, I want to show how capable Apache Spark is.

1. Import the libraries that we need.

from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
import psycopg2

2. Create Connection to PostgreSQL.

conn = psycopg2.connect(host='localhost', database='postgres', user='postgres', password='postgres')
cur = conn.cursor()

3. Load the data.

sqlctx = SQLContext(sc)
pop_data = sqlctx.read.csv('ratings.csv')  # reader call assumed; the original line is incomplete

4. Create table in PostgreSQL.

cur.execute("""CREATE TABLE…

DAG Visualization

Many of the social applications being developed today generate massive amounts of new data all the time. With millions of users connected at any moment, information is shared whenever they interact with social media or other websites. The question is how this huge amount of data is handled, and which tools are used to process and store it. This is where big data comes in.

So the first question is: what is big data?

Big data is a term that describes the large volume of data — both structured and unstructured — that inundates…


What is Machine Learning?

“The ability of a machine to perform a task normally done by a human without being explicitly programmed to do that task.”

Traditional Computing vs Machine Learning


“We are leaving the age of information and entering the age of recommendation”. — Chris Anderson.

Recommender systems are utilized in a variety of areas, with commonly recognized examples taking the form of playlist generators for video and music services, product recommenders for online stores, or content recommenders for social media platforms and open web content recommenders. These systems can operate using a single input, like music, or multiple inputs within and across platforms like news, books, and search queries. There are also popular recommender systems for specific topics like restaurants and online dating. …
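As a toy illustration of a recommender working from a single input (music), the sketch below uses simple co-occurrence counting. This is one basic technique chosen for brevity, not necessarily the approach the article goes on to describe, and the users and track names are hypothetical.

```python
from collections import Counter

# Hypothetical listening history: user -> set of liked tracks.
likes = {
    "u1": {"track_a", "track_b", "track_c"},
    "u2": {"track_a", "track_b"},
    "u3": {"track_b", "track_d"},
}


def recommend(user, likes, top_n=2):
    """Rank tracks the user has not liked yet.

    Each candidate track is scored by summing, over the other users who
    like it, how many liked tracks those users share with `user`.
    """
    seen = likes[user]
    scores = Counter()
    for other, items in likes.items():
        if other == user:
            continue
        overlap = len(seen & items)
        if overlap == 0:
            continue  # no shared taste, so this user contributes nothing
        for item in items - seen:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


suggestions = recommend("u2", likes)
```

Here `u1` shares two tracks with `u2` and `u3` shares one, so `track_c` ranks above `track_d`.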

Data Modeling

What is data modeling?

A model is a representation of real data that gives us the characteristics, relationships, and rules that apply to our data. It doesn’t actually contain any data.

A data model gives us insight about

The data model helps us design our database. When building a plane, you don’t start by building the engine. You start by creating a blueprint and schematic. Creating a database is just the same: you start by modeling the data.
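A minimal sketch of the blueprint idea, using Python's built-in `sqlite3` module: the model below describes structure, a relationship, and a rule, yet holds no data. The `customer` and `account` tables are hypothetical examples.

```python
import sqlite3

# The model: characteristics (columns), a relationship (foreign key),
# and a rule (CHECK), but no rows. Table names are hypothetical.
MODEL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE account (
    account_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
    balance     REAL NOT NULL CHECK (balance >= 0)
);
"""

db = sqlite3.connect(":memory:")
db.executescript(MODEL)  # build the database from the blueprint

tables = [row[0] for row in db.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
row_count = db.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The database built from the model has both tables in place and zero rows, which is the sense in which the model "doesn’t actually contain any data".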

Arif Zainurrohman

Data Analytics. Enthusiast in all things data, personal finance, and Fintech.
