Database systems and database management
Types of databases:
DBMS / Database Management System
A suite of programs used to manage large sets of structured data with ad hoc query capabilities for many types of users.
Database: A collection of data stored in a meaningful way that enables multiple users and applications to access, view and modify data as needed.
– Record: Collection of related data items
– File: Collection of records of the same type
– Database: Cross-referenced collection of files
– DBMS: Manages and controls the database
– Base relation: A table stored in a database
– Tuple: A row in a database
– Attribute: A column in a database
– Primary key: Columns that make each row unique
– View: Virtual relation defined by the database to restrict subjects from viewing certain data
– Foreign key: Attribute of one table that is the primary key of another table
– Cell: Intersection of a row and column
– Schema: Holds data that describes a database
– Data dictionary: Central repository of data elements and their relationships.
– Cardinality: The number of rows in the relation.
– Degree: The number of columns in the relation.
– Domain: Is a set of allowable values that an attribute can take.
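Several of these terms can be made concrete with a small example. The sketch below uses Python's built-in sqlite3 module; the employee table and its columns are hypothetical, chosen only to illustrate the vocabulary:

```python
import sqlite3

# In-memory database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (                         -- base relation
        emp_id INTEGER PRIMARY KEY,                 -- primary key
        name   TEXT NOT NULL,                       -- attribute (column)
        dept   TEXT CHECK (dept IN ('HR', 'IT'))    -- domain constraint
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Alice', 'IT')")  # a tuple (row)
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 'HR')")

cardinality = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]  # rows
degree = len(conn.execute("SELECT * FROM employee").description)           # columns
print(cardinality, degree)  # 2 3
```

The CHECK constraint enforces the domain of the dept attribute, and cardinality/degree are simply the row and column counts of the relation.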
Relational data model – Uses attributes (columns) and tuples (rows) to contain and organize information. A primary key is a field that links all the data within a record to a corresponding value.
Hierarchical data model – Combines records and fields that are related in a logical tree structure. Can have one child, many children, no children. Are useful for mapping one-to-many relationships.
Distributed data model – Has data stored in more than one database, but it is logically connected. Enable different databases to be managed by different administrators, although one person or group must manage the entire logical database.
Relational database components:
DDL / Data Definition Language: Defines the structure and schema of the database.
– Structure: table size, key placement, views and data element relationships.
– Schema: the type of data that will be held and manipulated and their properties.
DML / Data Manipulation Language: All the commands that enable a user to view, manipulate and use the database.
QL / Query Language: Enables users to make requests of the database.
Report Generator: Produces printouts of data in a user-defined manner.
Data dictionary: Is a central repository of data elements and their relationships. Is a collection of data elements, schema objects and reference keys.
Schema objects – Can contain tables, views, indexes, procedures, functions and triggers.
Primary key – Is a unique identifier in the table that unambiguously points to an individual tuple or row in the table. Is a subset of candidate keys within a table.
Foreign key – An attribute (column) in one relation that has values matching the primary key in another relation.
Concurrency problems – Making sure that different subjects receive the most up-to-date information.
Semantic integrity – Makes sure that structural and semantic rules are enforced. These rules pertain to data types, logical values, uniqueness constraints and operations that could adversely affect the structure of the database.
Referential integrity – Mechanism would ensure that no record would contain a reference to a primary key of a nonexisting record or a NULL value.
Entity integrity – Ensures that every tuple has a primary key and that no attribute making up the primary key is NULL.
Rollback – Is a statement that ends the current transaction and cancels all changes it made to the database.
Commit – Terminates a transaction and executes all changes that were just made by the user.
Checkpoint – Is used to make sure that if a system failure occurs or an error is detected, the user can always return to a point in time before the system crashed.
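Commit and rollback can be sketched with Python's sqlite3; the account table and amounts are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()    # commit: terminates the transaction, making the insert permanent

conn.execute("UPDATE account SET balance = balance - 999 WHERE id = 1")
conn.rollback()  # rollback: ends the transaction, cancelling its changes

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100 -- the uncommitted update was undone
```

Only the committed state survives the rollback, which is what makes these statements the recovery primitives of a transaction.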
Database security issues:
Aggregation – When a user does not have the clearance or permission to access specific information, but she does have the permission to access components of this information. She can then figure out the rest and obtain restricted information.
Inference – Happens when a subject deduces information that is restricted from data he has access to. This is seen when data at a lower security level indirectly portrays data at a higher level.
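A classic inference scenario: a subject cleared only for aggregate statistics can still deduce an individual, restricted value by differencing two permitted queries. A minimal sketch with hypothetical data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)",
                 [("Alice", "IT", 70000), ("Bob", "IT", 65000),
                  ("Carol", "HR", 60000)])

# Both queries are aggregates the subject is allowed to run...
total_all = conn.execute("SELECT SUM(salary) FROM staff").fetchone()[0]
total_it  = conn.execute("SELECT SUM(salary) FROM staff "
                         "WHERE dept = 'IT'").fetchone()[0]

# ...yet their difference reveals Carol's individual salary, which the
# subject never queried directly.
print(total_all - total_it)  # 60000
```

Countermeasures such as cell suppression and noise/perturbation, described below, exist precisely to frustrate this kind of deduction.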
Content-dependent access control – Looks at the content of a file when it makes an access control decision. This type of access control increases processing overhead, but it provides more granular control.
Cell suppression – Is a technique used to hide or not show specific cells that contain information that could be used in inference attacks.
Partitioning – Involves dividing the database into different parts, which makes it much harder for an unauthorized individual to find connecting pieces of data that can be brought together and other information that can be deduced or uncovered.
Noise and perturbation – Is a technique of inserting bogus information in the hope of misdirecting an attacker or confusing the matter enough that the actual attack will not be fruitful.
Database views – Permit one group or a specific user to see certain information, while restricting another group from viewing it altogether.
Polyinstantiation – Enables a relation to contain multiple tuples with the same primary keys with each instance distinguished by a security level.
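The view mechanism above can be sketched in a few lines; a restricted group would be granted access to the view rather than to the base relation. The table is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.execute("INSERT INTO staff VALUES ('Alice', 'IT', 70000)")

# The virtual relation exposes only the non-sensitive attributes.
conn.execute("CREATE VIEW staff_public AS SELECT name, dept FROM staff")
row = conn.execute("SELECT * FROM staff_public").fetchone()
print(row)  # ('Alice', 'IT') -- salary is not visible through the view
```

The sensitive salary column never reaches a subject who can only query staff_public.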
OLTP / Online Transaction Processing – Provides mechanisms that watch for problems and deal with them appropriately when they do occur.
– Two-phase commit service: Will make sure that a transaction is not complete until all databases receive and reflect a change.
Data warehousing – Combines data from multiple databases into a large database for the purpose of fuller information retrieval and data analysis.
Data mining – Is the process of massaging the data held in the data warehouse into more useful information.
– Metadata: Data produced by data mining tools to find associations and correlations.
OODB / Object-Oriented Databases – Have the characteristics of ease of reusing code and analysis, reduced maintenance and an easier transition from analysis of the problem to design and implementation. Its main disadvantages are a steep learning curve and high overhead of hardware and software required for development and operation.
Object-Relational Databases – Combines the attributes of object-oriented and relational technologies.
System life cycle phases/software life cycle development process
System Life Cycle Phases:
– Project initiation:
– Conception of project definition
– Proposal and initial study
– Functional design analysis and planning
– Requirements uncovered and defined
– System environment specification determined
– System design specifications
– Functional design review
– Functionality broken down
– Detailed planning put into place
– Code design
– Software development
– Developing and programming software
– Installation / implementation
– Product installation
– Testing and auditing
– Product changes, fixes and minor modifications
– Disposal / Revision and replacement
– Modifying the product with revisions or replacing it altogether
The Waterfall Model:
– System requirements
– Software requirements
– Analysis
– Program design
– Coding
– Testing
– Operations & Maintenance
Modified Waterfall Model incorporating V&V:
– System feasibility -> validation
– Software plans & requirements -> validation
– Product design -> verification
– Detailed design -> verification
– Coding -> unit test
– Integration -> product verification
– Implementation -> system test
– Operations & Maintenance -> revalidation
– Security should be addressed in each phase of system development. Security should not be addressed at the end of development because of the added cost, time, effort and lack of functionality.
– Separation of duties should be practiced in roles, environments and functionality pertaining to development of a product.
– A programmer should not have direct access to code in production.
– Certification deals with testing and assessing the security mechanisms in a system.
– Accreditation pertains to the management formally accepting the system and its security level.
– Changes must be authorized, tested and recorded. The changes must not affect the security level of the system or its capability to enforce the security policy.
Change control sub-phases:
– Request control
– Change control
– Release control
Change control process
– Make a formal request of change
– Analyze the request
– Develop the implementation strategy
– Calculate the costs of this implementation
– Review any security implications
– Record the change request
– Submit the change request for approval
– Develop the change
– Recode segments of the product and add or subtract functionality
– Link these changes in the code to the formal change control
– Submit software for testing and quality approval
– Repeat until quality is adequate
– Make version changes
– Configuration identification
– Configuration control
– Configuration status accounting
– Configuration audit
CMM / Software Capability Maturity Model
– Level 1: Initial – Competent people and heroics; processes are informal and ad hoc
– Level 2: Repeatable – Project management processes; project management practices are institutionalized
– Level 3: Defined – Engineering processes and organizational support; technical practices are integrated with management practices institutionalized
– Level 4: Managed – Product and process improvement; product and process are quantitatively controlled
– Level 5: Optimizing – Continuous process improvement; process improvement is institutionalized
Application Development Methodology
Types of languages
Machine language: Is in a form that the computer and processor can understand and work with directly.
Assembly language: Uses symbolic mnemonics that cannot be understood directly by the system; an assembler translates it into machine code.
High-level language: Uses statements closer to human language that cannot be understood directly by the system; it must be compiled or interpreted into machine code.
Interpreted programs: Have instructions that are read and interpreted by a program one instruction at a time.
Compiled programs: Are written in a high-level language and turned into machine-readable format by a program called a compiler.
OOP / Object-Oriented Programming
Works with classes and objects within those classes. Once the class is defined, the attributes can be reused for each new member or instance of the class that is created. The object encapsulates the attribute values, which means that this information is packaged under one name and can be reused as one entity by other objects.
An object can have a shared portion – The interface that enables it to interact with other components
An object can have a private portion – How it actually works and performs the requested operations. Messages enter through the interface to specify the requested operation or method to be performed.
Information hiding – There is no need for other components to know how each object works internally.
Abstraction – Is the capability to suppress unnecessary details so that the important, inherent properties can be examined and reviewed
Phases of object-orientation
OORA / Object-Oriented Requirements Analysis – Defines classes of objects and their interactions.
OOA / Object-Oriented Analysis – Understanding and modelling a particular problem within a problem domain in terms of object-oriented concepts.
DA / Domain Analysis – Seeks to identify the classes and objects that are common to all applications within a given domain.
OOD / Object-Oriented Design – Object is the basic unit of modularity; objects are instantiations of a class.
OOP / Object-Oriented Programming – Emphasizes the employment of objects and methods rather than types or transformations as in other programming approaches.
Features of OOP
Encapsulation – Hides internal data and operations.
Polymorphism – Objects of different types can respond to the same message in different ways; two objects can receive the same input and produce different outputs.
Polyinstantiation – Maintains multiple distinct instances of the same data, distinguished by security level, to discourage lower-level subjects from learning information at a higher level of security.
Inheritance – Shares properties and attributes.
Multiple inheritance – Is the situation where a class inherits the behavioural characteristics of more than one parent class.
Delegation – Forwarding of a request by an object to another object or delegate. This forwarding is necessitated by the fact that the object receiving the request does not have a method to service the request.
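The features above can be sketched in Python; the class names are hypothetical:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance            # encapsulation: internal state is hidden

    def describe(self):                    # the shared portion: a public interface
        return f"Account({self._balance})"

class SavingsAccount(Account):             # inheritance: shares properties/attributes
    def describe(self):                    # polymorphism: same message, new behavior
        return f"Savings({self._balance})"

class Wallet:
    def __init__(self, account):
        self._account = account

    def describe(self):                    # delegation: forwards the request to the
        return self._account.describe()    # object that has a method to service it

for obj in (Account(10), SavingsAccount(20), Wallet(Account(30))):
    print(obj.describe())  # Account(10) / Savings(20) / Account(30)
```

The caller sends the same describe message to all three objects and never needs to know how each works internally (information hiding).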
Structured analysis approach: Looks at all objects and subjects of an application and maps the interrelationships, communication paths and inheritance properties.
Data modelling: Considers data independently of the way that the data is processed and the components that process the data.
Data Structure: Is a representation of the logical relationship between elements of data.
Cohesive: A cohesive module can perform a single task with little or no help from other modules
– Low Cohesion: Scatterbrained; does several tasks.
– High Cohesion: Focused on one task.
The best programming uses the most cohesive modules possible, but because different modules need to pass data and communicate, they usually cannot be totally cohesive.
Coupling: Is a measure of interconnection among modules in an application.
– Low Coupling: Promotes module independence.
– High Coupling: Depends heavily on other modules.
The lower the coupling, the better the software design, because it promotes module independence. The more independent a component is, the less complex the application is and the easier it is to modify and troubleshoot.
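High cohesion and low coupling can be illustrated with two small functions; the names and data are hypothetical. Each function does one task and communicates with the other only through parameters and return values:

```python
def parse_csv_line(line):
    # Cohesive: a single task, performed with no help from other modules.
    return [field.strip() for field in line.split(",")]

def total_amounts(rows):
    # Loosely coupled: depends only on the data passed in, not on how
    # parse_csv_line works internally or where the rows came from.
    return sum(int(row[1]) for row in rows)

lines = ["alice, 10", "bob, 32"]
rows = [parse_csv_line(line) for line in lines]
print(total_amounts(rows))  # 42
```

Either function could be replaced or tested in isolation, which is exactly the independence low coupling is meant to buy.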
OMA / Object Management Architecture
ORB / Object Request Brokers: Manages all communication between components and enables them to interact in a heterogeneous and distributed environment.
CORBA / Common Object Request Broker Architecture: Provides interoperability among the vast array of different software, platforms and hardware in environments. Enables applications to communicate with one another no matter where the application is located or who developed it. To implement this compatible interchange, a user develops a small amount of initial code and an Interface Definition Language (IDL) file.
COM / Component Object Model: Supports the exchange of objects among programs.
DCOM / Distributed Component Object Model: Defines the standard for sharing objects in a networked environment. Uses a globally unique identifier, GUID, to uniquely identify users, resources and components within an environment.
ODBC / Open Database Connectivity: Provides a standard SQL dialect that can be used to access many types of relational databases.
DDE / Dynamic Data Exchange: Enables different applications to share data by providing IPC. Is a communication mechanism that enables direct conversation between two applications.
DCE / Distributed Computing Environment: Is a set of management services with a communication layer based on RPC. Is a layer of software that sits on top of the network layer and provides services to the applications above it. Uses universal unique identifier, UUID, to uniquely identify users, resources and components within an environment. The RPC function collects the arguments and commands from the sending program and prepares them for transmission over the network.
The DFS / Distributed File Services provides a single integrated file system that all DCE users can use to share files.
Expert systems / knowledge-based systems: Use artificial intelligence to emulate human knowledge to solve problems. Is a computer program containing a knowledge base and a set of algorithms and rules used to infer new facts from knowledge and incoming data.
– Rule-based programming: Is a common way of developing expert systems.
– Pattern matching: Based on if-then logic units.
– Inference engine: A mechanism that automatically matches facts against patterns and determines which rules are applicable.
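A toy forward-chaining inference engine shows how these pieces fit together: rules are if-then units, and the engine matches facts against rule conditions until no new facts can be derived. The facts and rules are invented for illustration:

```python
# Each rule is (set of required facts, fact to conclude).
rules = [
    ({"has_fever", "has_cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]
facts = {"has_fever", "has_cough"}

# Inference engine: repeatedly match facts against rule patterns,
# firing any applicable rule, until a full pass derives nothing new.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))
```

Note how firing the first rule makes the second rule applicable: new facts are chained from existing ones, which is the defining behaviour of an inference engine.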
Artificial Neural Networks: Is an electronic model based on the neural structure of the brain. Tries to replicate the basic functions of neurons and their circuitry to solve problems in a new way.
Java: Is platform independent because it creates intermediate code, bytecode, which is not processor specific. The Java Virtual Machine then converts the bytecode to machine code. Java applets use a security scheme that employs a sandbox to limit the applet’s access to certain specific areas within the user’s system and protects them from malicious or poorly written applets.
ActiveX: Microsoft technology that is used to write controls that Internet users can download to increase their functionality and Internet experience. Practices security by informing the user where the program came from. Uses Authenticode technology that relies on digital certificates and trusted certificate authorities.
Malicious Code: Viruses, worms, trojan horses, logic bombs, etc.
Can be detected by:
– File size increase
– Many unexpected disk accesses
– Change in update or modified timestamps
Virus: Is a program that searches out other programs and infects them by embedding a copy of itself. When the infected program executes, the embedded virus is executed which propagates the infection.
– Boot sector virus: Moves data within the boot sector or overwrites the sector with new information.
– Stealth virus: Hides the modifications that it has made to files or boot records.
– Polymorphic virus: Produces varied but operational copies of itself.
– Multipartite virus: Infects both the boot sector of a hard drive and executable files.
– Self-garbling virus: Attempts to hide from antivirus software by garbling its own code. As the virus spreads, it changes the way its code is encoded.
Worm: Can reproduce on its own with no need for a host application; it is a self-contained program.
Logic bomb: Will execute a program, or string of code, when a certain event happens.
Trojan horse: Is a program disguised as another program.
DoS / Denial of Service: An attack that consumes the victim’s bandwidth or resources, causing the system to crash or stop processing other packets.
Smurf: Requires three players: the attacker, the victim and the amplifying network. The attacker spoofs, or changes the source IP address in a packet header, to make an ICMP ECHO packet seem as though it originated at the victim’s system. This ICMP ECHO message is broadcast to the amplifying network, which replies to the message in full force. The victim’s system and network are overwhelmed.
Fraggle: Uses UDP as its weapon of choice. The attacker broadcasts a spoofed UDP packet to the amplifying network, which in turn replies to the victim’s system.
SYN Flood: Continually sending the victim SYN messages with spoofed packets. The victim commits the necessary resources to set up each communication socket and sends its SYN/ACK message, waiting for an ACK that never arrives, until its connection queue is exhausted.
Teardrop: An attacker sends very small packets that cause a system to freeze or reboot. Caused by the fact that some systems make sure that packets are not too large, but do not check to see if a packet is too small.
DDoS / Distributed Denial of Service: Is a logical extension of the DoS. The attacker creates master controllers that can in turn control slaves / zombie machines.
DNS DoS Attacks: A record at a DNS server is replaced with a new record pointing at a fake/false IP address.
Cache poisoning – The attacker inserts data into the cache of the server instead of replacing the actual records.