Databases are the invisible backbone of any modern data science application. These information systems also shape nearly every aspect of our day-to-day lives, from public transport to shopping and social media. Many people associate data science mainly with data mining and machine learning, but it is almost always a database that makes the data available and accessible.

This course introduces students to the key principles of data management, including the basics of modeling, querying, and managing data with a relational database system. Students will learn about data management from the perspective of two roles: first, as a user of a relational database system, and second, as a developer of a relational database engine. The course concludes with discussions of open science, open data principles, and data management plans.

From the perspective of a user of a relational database system, we will cover how to use the system to build an application. This part of the course begins by defining data modeling and explaining why effective data management is essential. It then covers the entity-relationship model, object approaches, the relational data model, relational data modeling theory (normal forms), SQL, and referential integrity. Ultimately, we develop an understanding of user-centric information requirements and data sharing.

From the perspective of a developer of a relational database engine, we will cover how a textbook relational database engine works to support a database user. The fundamental concepts of database management systems (DBMS) and their applications are discussed alongside the design, use, and application of SQL databases, with an introduction to NoSQL databases. The developer perspective covers query processing, query optimization, transactions, concurrency control, recovery, distributed and parallel query processing, replication, and distributed concurrency control. We also aim to understand what databases need to provide in data-driven applications designed for personal-computer, server-based, enterprise-wide, and Internet databases, as well as in distributed data applications.
