[Database Design]
There are three basic data models: High-level, Representational, and Low-level.
High-level, also known as the Conceptual or Semantic data model: Describes the semantics of the data within its problem domain. For example, if the problem domain of the data is a university, the database description will incorporate the things and propositions that are important to the university. Thus, things (entities) like Students, Professors, Classes, Courses, and Grades would be described by this model, as well as how instances relate to each other (relationships). Please note that, in some cases, this model also describes the data entities' characteristics (attributes), but normally the logical data model carries out this task in much more detail.
Representational, also known as the Logical data model: Concerns the detailed attributes of each individual data entity and how these attributes contribute to or affect the data relationships (primary and secondary key groups).
Low-level, or Physical data model: Describes in detail the physical means used to store the data, such as the storage media, data types, data partitioning, processing power, and geographical distribution of the data.
Given the characteristics of each model, it is common to start with the conceptual model (to learn which entities compose the data domain and how they relate), then apply the logical model (to understand the details of the data without yet worrying about its implementation), and complete the whole process with the physical model (determining exactly how the data will be implemented, stored, and distributed).
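As a minimal sketch of how the three levels stack, the university example above can be carried from a conceptual description down to a logical schema and one physical-level choice. The SQLite tables, columns, and index below are illustrative inventions for this post, not any real university schema:

```python
import sqlite3

# Conceptual level: the university domain has Students and Courses, and a
# Student *enrolls in* a Course (a many-to-many relationship with a Grade).
# Logical level: that description becomes tables, attributes, and keys.
# Physical level: storage decisions, such as which column to index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE Course (
        course_id  INTEGER PRIMARY KEY,
        title      TEXT NOT NULL
    );
    -- The enrollment relationship carries its own attribute: the grade.
    CREATE TABLE Enrollment (
        student_id INTEGER REFERENCES Student(student_id),
        course_id  INTEGER REFERENCES Course(course_id),
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    -- A physical-level decision: index the column queried most often.
    CREATE INDEX idx_enrollment_course ON Enrollment(course_id);
""")
```

Note how the relationship with an attribute (Grade) only appears as a table of its own at the logical level, while the index exists purely at the physical level and changes nothing about the data's meaning.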
It is sometimes said that the physical model can be discarded and the logical model implemented directly. While this is true, the statement is easily misunderstood. Discarding the physical data model means that one does not need to worry about the actual implementation of the conceptual and logical models, because a tool absorbs/parses these models and implements them for the user. In this case, the user really does not need to know all the details of the implementation. However, the tool itself does, and it does so by applying data patterns that are well known and understood in the industry.
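To make the idea of such a tool concrete, here is a toy sketch. The dictionary-based model format and the `implement` helper are invented for this example and stand in for a real modeling tool:

```python
import sqlite3

# A toy "tool" that absorbs a logical model and implements it for the user.
# The model format below (entity name -> attribute specs) is invented here.
logical_model = {
    "Student": {"student_id": "INTEGER PRIMARY KEY", "name": "TEXT NOT NULL"},
    "Course":  {"course_id": "INTEGER PRIMARY KEY", "title": "TEXT NOT NULL"},
}

def implement(model, conn):
    """Translate each logical entity into a physical table.

    The user never writes DDL; the tool applies a well-understood
    industry pattern (one entity -> one table) on their behalf.
    """
    for entity, attributes in model.items():
        cols = ", ".join(f"{name} {spec}" for name, spec in attributes.items())
        conn.execute(f"CREATE TABLE {entity} ({cols})")

conn = sqlite3.connect(":memory:")
implement(logical_model, conn)
```

Real tools (ORMs, CASE tools) do far more, but the division of labor is the same: the user supplies the logical model, and the implementation details live inside the tool.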
Works Cited
Carpenter, Tom. (2010). SQL Server 2008 Administration: Real-World Skills for MCITP Certification and Beyond (Exams 70-432 and 70-450). Sybex.
Elmasri, R., & Navathe, S. (2011). Fundamentals of Database Systems (6th ed.). Boston: Addison-Wesley.
The Weaknesses of the Relational Data Model and DBMSs: Unsuitability for Advanced Database Applications
[Database Design]
The entities under a relational model, and relational DBMSs, are data-type and relationship dependent. The data-type dependency limits the kinds of structures that can be logically represented by the database. The relationship dependency, whose hierarchical character translates into a series of logical relationships and constraints created to interconnect entities, can become very costly as the number of entities in the database grows, resulting in a tightly coupled set of objects, redundant data (entity identifiers such as primary and secondary keys), and a very complex, hard-to-maintain iteration path among entities. Even though the relational model is a mathematical model proven to work well, that does not mean it is an easy model for designing, maintaining, and processing entities.
Let’s say, for example, that one wanted to use this model to describe in realistic detail a complex object such as the human body. It is easy to see right away that it would be impossible to have a single entity of this type; there is no way to represent such a complex object as a unique type. It is then necessary to break it into parts and sub-parts (entities, or tables) and describe the characteristics (attributes) of each one. What about the actions performed by each entity? The relational model essentially allows only for entity description, based on a limited set of data types, and there is no way to attach behaviors (events) to these entities.
So, continuing with the human body… As we know, it is composed of a great many structures (entities) that are very rich in details (attributes) and actions (events) and that are thoroughly interconnected. For every one of the body’s entities (e.g., cell, tissue, organ) there should exist a relationship and a counterpart constraint that enforces and guarantees the correct behavior among them. Now, can you imagine maintaining all of these relationships and cascading changes among entities as needed? How much data redundancy would have to be created in the database just to establish the relationships? What about things that cannot be accounted for beforehand, such as a disease that affects the body’s structures and generates the need for changes in the attributes and events? The disease may even affect an entity in such a way that a new data type would be necessary to describe its consequences. Even a very powerful DBMS would have a lot of trouble, and would need a lot of processing power, to iterate through all of these entities to search, index, modify, and return an appropriate result for a query. Of course this is an exaggerated example, but it gives a good idea of why the relational model is not suited for advanced database applications, especially applications that need to model and handle very realistic and complex objects.
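The decomposition and cascading maintenance described above can be sketched in relational terms. Assuming a hypothetical Organ/Tissue/Cell chain of tables linked by foreign keys (all names invented for illustration), a single change ripples through every dependent entity:

```python
import sqlite3

# Parts and sub-parts of one "complex object", chained by foreign keys.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FKs off by default
conn.executescript("""
    CREATE TABLE Organ (
        organ_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE Tissue (
        tissue_id INTEGER PRIMARY KEY,
        organ_id  INTEGER NOT NULL
                  REFERENCES Organ(organ_id) ON DELETE CASCADE,
        kind      TEXT NOT NULL
    );
    CREATE TABLE Cell (
        cell_id   INTEGER PRIMARY KEY,
        tissue_id INTEGER NOT NULL
                  REFERENCES Tissue(tissue_id) ON DELETE CASCADE,
        kind      TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO Organ VALUES (1, 'heart')")
conn.execute("INSERT INTO Tissue VALUES (1, 1, 'muscle')")
conn.execute("INSERT INTO Cell VALUES (1, 1, 'cardiomyocyte')")
# One change cascades through every dependent entity: deleting the organ
# deletes its tissues, which in turn deletes their cells.
conn.execute("DELETE FROM Organ WHERE organ_id = 1")
```

Even in this three-table toy, every level repeats its parent's identifier, and every constraint must be declared and maintained by hand; scale that to hundreds of interrelated parts and the maintenance cost the text describes becomes apparent.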
Works Cited
Elmasri, R., & Navathe, S. (2011). Fundamentals of Database Systems (6th ed.). Boston, MA: Pearson Addison-Wesley.
Ling, Tok Wang, & Dobbie, Gillian. (2010). Semistructured Database Design (Web Information Systems Engineering and Internet Technologies Book Series). Springer.
Cross Reference between Agile Development and Database Design: A Brief Explanation
[Software Engineering]
[Database Design]
The main difference between agile and traditional software development lies in the evolutionary approach to software design.
In the traditional approach, a well-established and detailed design phase exists at the very beginning of a project, which serves as a blueprint for the software that must be implemented and, finally, tested by the developers. This is a tightly coupled approach where "data changes impacts the database, and the effect of that change rippled through the database immediately in accordance with referential integrity rules, based on business rules" (Morien, 2005).
On the other hand, the Agile approach recognizes that it is not possible to establish such a design and foresee all the details of the software requirements up front. The design must be able to scale and to absorb changes as necessary, in an evolutionary manner. This allows for a more natural and iterative process in which developers and stakeholders are empowered to evolve toward the final goal. Note that the Agile process requires a clear goal for what needs to be developed at the beginning of the development process, but it does not require or enforce a "blueprint" specifying most, if not all, development efforts and requirements beforehand.
Agile has also influenced database design toward a more fluid design process "by employing evolutionary procedures and standards while automating processes" (Harriman, Hodgetts, & Leo, 2004). Under agile, data modeling becomes a much more iterative process that adapts to ever-changing requirements. Refactoring of database objects (entities, attributes, relationships, etc.) occurs periodically as necessary, and regression tests are implemented and automated for every new feature or change in the database, allowing for early detection and prevention of future problems.
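A minimal sketch of this evolutionary style, assuming an invented two-step migration list and an in-memory SQLite database: each small schema change is applied and immediately followed by an automated regression check, rather than being frozen in an up-front blueprint:

```python
import sqlite3

# Each migration is one small, scripted schema change. The table, columns,
# and check below are illustrative, not from any real migration framework.
migrations = [
    "CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    # A later iteration absorbs a requirement nobody foresaw up front:
    "ALTER TABLE Customer ADD COLUMN email TEXT",
]

def regression_check(conn):
    """Fail fast if the latest change broke an existing feature."""
    conn.execute("INSERT INTO Customer (name) VALUES ('probe')")
    assert conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0] >= 1
    conn.rollback()  # leave no test data behind

conn = sqlite3.connect(":memory:")
for ddl in migrations:
    conn.execute(ddl)
    conn.commit()
    regression_check(conn)  # runs after *every* change, not once at the end
```

The point is the rhythm, not the tooling: change, test, repeat, so that a refactoring that breaks an existing feature is caught at the migration that introduced it.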
Works cited
Morien, Roy. (2005). Agile Development of the Database: A Focal Entity Prototyping Approach. Retrieved on October 26, 2011, from: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1609809
Harriman, Alan, & Hodgetts, Paul, & Leo, Mike. (2004). Emergent Database Design: Liberating Database Development with Agile Practices. Retrieved on October 26, 2011, from: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1359802