Basic Concepts of Important Subjects of Computer Science
Data Structures
What is an algorithm?
The method of solving a problem is known as algorithm.
More precisely, an algorithm is a sequence of instructions that act on some input data to produce some output in a finite number of steps.
An algorithm must have the following properties.
1. Input
2. Output
3. Finiteness
4. Definiteness
5. Effectiveness
What is analysis of algorithms?
An algorithm must not only be able to solve the problem at hand, it must be able to do so in as efficient a manner as possible. Determining which algorithm is efficient than the other involves analysis of algorithms.
Number of operations to be done is to be decided. This is done for significant operations. There are two classes of operations that are typically chosen for the significant operations - comparison and arithmetic.
In arithmetic operations, additive and muliplicative operations are separately counted as multiplication operation takes more time than the addition operation.
Cases to be considered during analysis - Best case, Worst case, Average case
Additional Issues of Analysis
Time efficiency
Space efficiency
Simplicity
Generality
Range of input that an algorithm takes
What are important data structures
1. Arrays 2. Strings 3. Linked lists 4. Sparse matrices 5. Stacks 6. Queues 7. Trees 8. Graphs
Searching and Sorting is an important operation on data structures.
Design of Algorithms
Algorithmic Strategies
1. Naive - Brute force - Attempting to solve a problem without thinking of doing it efficiently.
2. Divide and conquer
3. Greedy method - Optimization at every stage.
4. Dynamic programming - Stage-wise optimization. From every point the optimal path to the goal is found starting with one stage ahead of the goal.
5. Backtracking - In backtracking, all possible alternatives to find a solution are not searched. Thus, the algoirthm gives a solution with less number of steps compared to an exhaustive search that finds all possible alternatives. The efficiency is obtained by eliminating certain paths early by determining that they are not feasible paths or efficient paths.
6. Branch and bound
7. Internet Algorithms - String and pattern matching algorithms - Rabin-Karp algorithm - String matching with finite automata - Boyer Moore algorithm - Knuth-Morris-Pratt algoirthm
Database System Concepts
A Database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data.
The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.
Management of data involves both defining structures for storage of information and providing mechanisms for the manipulation of information.
Database Systems versus File Systems
File systems were used earlier. Data was stored in different individual files and was accessed as required by accessing required files. But the system had drawbacks which were overcome by database systems.
Drawbacks of file-based systems
Data redundancy and inconsistency
Difficulty in accessing data - Multiple files are to be accessed.
Data isolation
Integrity problems
Atomicity problem - Ability to restore all the data up to an instant
Concurrent access anomalies
Security
Instances and Schema
The collection of information stored in the database at a particular moment is called an instance of the database.
The overall design of the database is called the database schema.
Data Models
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.
Entity Relationship Model
The entity relationship (E-R) model is based on the idea that real world consists of a collection of basic objects, called entities and there are relationships among these objects or entities.
Relational Model
The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name.
The relational model is a record based model. Relational database is structured in fixed-format records of several types. Each table contains records of a particular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the recod type,
Object Oriented Data Model
The object oriented model can be seen as extending the E-R model with notions of encapsulation, methods (functions) and object identity.
Database languages
Data definition language (DDL)
Data manipulation language (DML)
Database Access from Application Programs
Application programs of users interact with the database to retrieve data they want. Application programs are usually written in languages such as Cobol, C, C++, or Java.
Database System Structure
A database system is partitioned into modules. The two important modules are storage manager and the query processor module or component.
Storage Manager Components
Authorization and integrity manager
Transaction manager
File manager
Buffer manager
The storage manager implements several data structures as part of the physical system implementation
Data files (tables)
Data dictionary
Indices
The Query Processor Components
DDL Interpreter
DML Compiler
Query evaluation manager.
Application architecture
Two tier - Application at client level - database at server level
Three tier - client front end - application server - database server
Relational Database Design
First Normal Form
A domain is atomic if elements of the domain are considered to be indivisible units. We say that a relation schema R is in first normal form (1NF) if the domains of all attributes of R are atomic.
Boyce-Codd Normal Form (BCNF)
Third Normal Form
Computer Organization and Architecture
Functions of Computer
Data Processing
Data Storage
Data Movement
Control
Structure
There are four main structural components of a computer
1. CPU
2. Main Memory
3. I/O
4. System Interconnection: Some mechanism that provides for communication among CPU, main memory and I/O
Why an IT or Computer Science student has to know computer architecture and organization?
To select the most effective computer for use throughout the organization. The ability to make choice between cache sizes, and clock rates etc. is essential.
Many processor or computer components are part of embedded systems. The IT person must be able to debug them.
Generation
of Computer - Period - Technology - Typical Speed (Operations per second)
First
Second
Third
Fourth
Fifth
Sixth 1991 - Ultra large scale integration - one billion
Memory Hierarchy
Internal Memory
Registers
Cache
Main internal memory
Outboard Storage
Magnetic disc
CD ROM
Cd-RW
DVD-RW
DVD - RAM
Off-line storage
Magnetic tape
MO
WORM
Cache memory has high speed of data transfer like registers, but is less expensive than that of registers and cost is more closer to main memory.
Cache contains a copy of portions of main memory.
Computer Graphics
Raster scan displays
The beam intensity is turned on and off to create a pattern of illuminates spots that give a picture. Each screen point is referred to as pixel or pel (shortened forms of picture elements).
The raster system can have 2 bits for a pixel. But high quality systems have even 24 bits for a pixel.
Refreshing the picture takes place at the rate of 60 to 80 frames for second.
Graphics Functions
A general purpose graphics package provides users with a variety of functions for creating and manipulating pictures.
Output Primitives
Line Drawing Algorithms
DDA Algorithm
Bresenham's Line Algorithm
Parallel Line Algorithm
Circle Generating Algorithms
Midpoint Circle Algorithm
Ellipse Generating Algorithms
Midpoint Ellipse Algorithm
Software Engineering
Based on Book of Roger Pressman, Sixth Edition
Software Engineering Practice
George Polya outlined the essenece of problem solving
1. Understand the problem
2. Plan a solution
3. Carry out the plan
4. Examine the result for accuracy
Core Principles
the dictionary defines the word principle as "an important underlying law or assumption required in a system of thought."
David Hooker has proposed seven core principles that focus on software engineering process as a whole.
1. The reason it all exists.
2. Keep it simple, stupid
3. Maintain the vision
4. What you produce, others will consume
5. Be open to the future
6. Plan ahead for reuse.
7. Think
Communication Principles that apply to customer communication
1. Listen
2. Prepare before you communicate
3. Someone should facilitate the activity
4. face to face communication is best.
5. Take notes and document decisions
6. Strive for collaboration
7. Stay focused, modularize your discussion.
8. If something is unclear, draw a picture
9. Once you agree to something move on; If you can't agree tosomething move on; If a feature or function is unclear and cannot be clarified at the moment, move on.
10. Negotiation is not a contest or a game. It works best when both parties win.
Principles of Planning
1. Understand the scope of the project
2. Involve the customer in the planning activity
3. Recognize that planning is iterative
4. Estimate based on what you know.
5. Consider risk as you define the plan
6. Be realistic
7. Adjust granularity as you define the plan
8. Define how you intend to ensure quality.
9. Describe how you intend to accommodate change
10. Track the plan frequently and make adjustments as required.
Analysis Modeling Principles
1. The information domain of a problem must be represented and understood.
2. The functions that the software performs must be defined.
3. The behavior of the software (as a consequence of external events) must be represented.
4. the models that depict information, function, and behavior must be partitioned in a manner that uncovers detail in a layered(or hierarchical) fashion.
5. The analysis task should move from essential information toward implementation detail.
Software Design Modeling Principles
1 Design should be traceable to the analysis model.
2. Always consider architecture of the system to be built.
3. Design of data is as important as design of processing function
4. Interfaces must be designed with care (both external and internal)
5. User interface design should be designed to the needs of the end user.
6. Component level design should be functionally independent.
7. Components should be loosely coupled to one another and to the external environment.
8. Design representations (models) should be easily understandable.
9. The design should be developed iternatively. With each iteration the designer should strive for greater simplicity.
Coding Principles and Concepts
Preparation Principles
1. Understand the problem you're trying to solve.
2. Understand the basic design principles and concepts.
3. Pick a programming language that meets the needs of the software to be built and the environment in which it will operate.
4. Select a programming environment that provides tools that will make your work easier.
5. Create a set of unit tests that will be applied once the component you code is completed.
Coding Principles
1. Constrain your algorithms by following structured programming (BOH00)
2. Select data structures that will meet the needs of the design.
Understand the software architecture and create interfaces that are consistent with it.
4. Keep conditional logic as simple as possible.
5. create nested loops in a way that makes them easily testable.
6. Select meaningful variable names and follow other local coding standards
7. Write code that is self-documenting.
8. Create visual layout (e.g., indentation and blank lines) that aids understanding.
Validation Principles
1. Conduct a code walkthrough when appropriate.
2. Perform unit tests and correct errors yoou've uncovered.
3. Refactor the code.
Software Testing Principles
Principles developed by Davis
1. All tests should be traceable to customer requirements.
2. tests should be planned long before testing begins.
3. The pareto principle applies to software testing.
4. testing should begin "in the small" and progress toward testing "in the large"
5. Exhaustive testing is not possible
Deployment Principles
1. Customer expectations for the software must be managed.
2. A complete delivery package must be assembled and tested.
3. A support regime must be established before the software is delivered.
4. Appropriate instructional materials must be provided to end users.
5. Buggy software should be fixed first, delivered later.