Data Structures | Murat Can Ganiz

Course Overview

Data Structures is a core computer engineering course that connects mathematical reasoning, algorithm analysis, and practical implementation. The course emphasizes how data organization affects correctness, efficiency, memory usage, and software maintainability.

Students review mathematical background and asymptotic analysis, then study classical linear, tree-based, heap-based, hash-based, and graph-based structures. The course is implementation-oriented: students are expected to understand both the abstract behavior of each data structure and the details needed to implement it correctly.

Algorithm analysis Recursion Linked lists Stacks and queues Trees Heaps Hash tables Graphs Sorting

Learning Outcomes

Analyze Algorithms

Use asymptotic notation, summations, recurrence relations, and worst/average/best-case reasoning to compare algorithms.

Implement Abstract Data Types

Implement stacks, queues, lists, trees, heaps, hash tables, and graphs with clear interfaces and correct edge-case behavior.

Choose the Right Structure

Explain why a data structure is appropriate for a problem based on operation costs, memory needs, update patterns, and access patterns.

Reason About Correctness

Trace operations step by step, maintain invariants, test corner cases, and debug pointer/reference-based implementations.

Textbooks and Background References

The course uses lecture slides, problem-session material, programming assignments, and standard algorithms/data structures references. For mathematical review, the course has recommended the following background readings.

Introduction to Algorithms, Cormen, Leiserson, Rivest, and Stein, 3rd edition, MIT Press, 2009. Suggested review: Appendix VIII, including summations, sets, counting and probability, and matrices.
Discrete Mathematics and Its Applications: With Combinatorics and Graph Theory, Kenneth H. Rosen and Kamala Krithivasan. Suggested review topics: logic/proofs, algorithms, growth of functions, complexity, recursive algorithms, trees, and graphs.

Core Topic Map

Module	Representative Topics	Typical Skills
Mathematical Foundations and Algorithm Analysis	Summations, sets, counting, matrices, recurrence relations, asymptotic notation, growth of functions, sparse matrices.	Estimate running time, solve simple recurrences, compare algorithms, and justify complexity classes.
Recursion	Recursive definitions, recursive algorithms, recursion trees, base cases, stack behavior, divide-and-conquer structure.	Trace recursive calls, derive time complexity, and transform recursive reasoning into correct code.
Linked Lists, Stacks, and Queues	Singly/doubly linked lists, stack ADT, queue ADT, circular queues, expression conversion, infix/postfix processing.	Implement push/pop/enqueue/dequeue, handle empty/full cases, and use stacks for expression processing.
Trees	Tree terminology, traversal, binary search trees, AVL trees, splay trees, B-trees, tree height and balance.	Perform insert/delete/search, apply rotations, trace traversals, and explain balance guarantees.
Heaps and Priority Queues	Binary heaps, min-heaps, max-heaps, min-max heaps, leftist heaps, binomial heaps, heap sort.	Trace percolate-up/down, merge heaps, implement priority queue operations, and analyze operation costs.
Hash Tables	Hash functions, separate chaining, open addressing, linear probing, quadratic probing, load factor, rehashing.	Simulate insert/search/delete, evaluate collisions, and reason about average-case access time.
Graphs	Adjacency matrix/list, directed and undirected graphs, BFS, DFS, topological sorting, minimum spanning trees with Prim and Kruskal.	Run graph traversal algorithms, produce discovery/finish orders, detect topological constraints, and build MSTs.
Sorting	Comparison-based sorting, heap sort, merge sort, quicksort, stability, in-place algorithms, lower-bound intuition.	Trace sorting algorithms, compare complexity, and select sorting strategies for different data properties.

Programming and Project Work

Programming work is designed to connect abstract data type definitions to actual implementation details. Past assignments have included stack implementation, infix-to-postfix conversion, algorithm-analysis exercises, and multi-part projects using several data structures.

Implementation Assignments

Students implement selected structures such as stacks, queues, trees, heaps, hash tables, and graphs. Correctness, edge cases, and interface clarity are emphasized.

Problem Sessions

Problem sessions review recurrence solving, queue/stack operations, tree rotations, heaps, graph traversals, MST algorithms, and sorting traces.

Projects

Projects usually require combining multiple structures and algorithms in a working program. Students are expected to document design decisions and test representative cases.

Assessment Pattern

The exact grading plan may vary by semester. The course commonly combines homework, quizzes, programming assignments, projects, midterm/final exams, and problem-session preparation. Exams often require hand calculation, tracing, complexity analysis, and short implementation reasoning.

Conceptual and Analytical Work

Asymptotic complexity comparisons
Recurrence and summation exercises
Data-structure operation tracing
Graph algorithm and tree-rotation questions

Implementation Work

Programming assignments with strict correctness requirements
Project deliverables using selected data structures
Testing for corner cases such as empty structures, overflow, underflow, duplicate keys, and disconnected graphs

Useful Implementation Checklist

Define the abstract data type operations before writing code.
Write down invariants: for example, heap order, BST ordering, AVL balance factor, queue front/rear relation, or hash-table load factor.
Test empty, single-element, and many-element cases separately.
Trace pointer/reference updates on paper before implementation.
For recursive code, identify the base case, recursive step, and size-reduction argument.
For graph code, state whether the graph is directed/undirected and weighted/unweighted.
Measure complexity for each public operation, not only for the whole program.

Study Questions

The following questions summarize the recurring style of preparation expected in the course. They are intended for review and self-study, not as a fixed exam list.

Algorithm Analysis, Mathematical Background, and Recursion

Define Big-O, Big-Ω, and Big-Θ. Give one example where Big-O alone is less informative than Big-Θ.
Order the following functions by asymptotic growth: log n, n, n log n, n², 2ⁿ, n!, √n.
Use a summation to compute the running time of a nested loop where the inner loop runs from 1 to i for each i from 1 to n.
Solve or estimate the recurrence T(n) = T(n/2) + 1. What algorithmic pattern does it describe?
Solve or estimate the recurrence T(n) = 2T(n/2) + n. What common divide-and-conquer algorithm has this form?
Explain the difference between worst-case, best-case, and average-case analysis using linear search as an example.
Why can constant factors usually be ignored in asymptotic analysis? When might they still matter in practice?
What is a sparse matrix? Which storage strategies can reduce memory usage for sparse matrices?
Trace a recursive factorial or Fibonacci function and show how recursive calls are stored on the call stack.
What is the difference between direct recursion and indirect recursion?
Why must every recursive algorithm have a base case? Give an example of what happens if the base case is wrong.
Explain how recursion is used in tree traversal and graph DFS.

Linked Lists, Stacks, and Queues

Compare arrays and linked lists in terms of access time, insertion/deletion cost, and memory overhead.
Implement insertion at the beginning, insertion at the end, and deletion from a singly linked list. What are the edge cases?
What is a dummy/header node? How can it simplify linked-list operations?
Define the stack ADT. Which real programming mechanism naturally uses a stack?
Trace the stack contents while converting an infix expression such as A + B * C - D into postfix form.
Evaluate a postfix expression step by step using a stack.
What errors should a robust stack implementation detect?
Define the queue ADT. How does a circular array implementation avoid shifting elements?
Explain queue overflow and underflow. How do these cases differ in static-array and linked implementations?
Where are queues used in graph algorithms?

Trees and Search Trees

Define root, leaf, parent, child, sibling, depth, height, subtree, and degree of a node.
Given a binary tree, produce preorder, inorder, postorder, and level-order traversals.
What property distinguishes a binary search tree from an arbitrary binary tree?
Insert a sequence of keys into a binary search tree. What insertion orders produce a degenerate tree?
Delete a node with two children from a binary search tree. Why is the inorder predecessor or successor used?
What is the height of a balanced tree compared with the height of a degenerate tree?
Define the AVL balance factor. When are rotations required?
Perform LL, RR, LR, and RL rotations on small AVL examples.
What is a splay tree? What is the intuition behind moving recently accessed nodes toward the root?
What is a B-tree and why is it useful for disk-oriented or block-oriented storage?
Insert keys into a B-tree of a given order and show when node splitting occurs.
Compare BST, AVL, splay tree, and B-tree in terms of balancing strategy and typical use cases.

Heaps and Priority Queues

Define the heap-order property and the complete-tree shape property of a binary heap.
Show the array representation of a binary heap and compute parent/left/right child indices.
Insert a key into a min-heap and trace the percolate-up operation.
Delete the minimum from a min-heap and trace the percolate-down operation.
What is the time complexity of heap insertion, delete-min/delete-max, and build-heap?
Explain how heap sort works. Is it stable? Is it in-place?
What is a min-max heap and what operations does it support efficiently?
What is the null-path length in a leftist heap? How does it support efficient merging?
Merge two leftist heaps and show the resulting structure.
What is a binomial tree? How does a binomial heap represent multiple tree orders?
Merge two binomial heaps and explain the analogy to binary addition with carries.
When would a mergeable heap be preferable to a standard binary heap?

Hash Tables

What is a hash function? What properties make a hash function suitable for a hash table?
Define collision. Why are collisions unavoidable when the key space is larger than the table size?
Compare separate chaining and open addressing.
Insert a sequence of keys into a hash table using linear probing. Show all collisions.
Repeat the same insertion using quadratic probing. How does the probe sequence change?
What is clustering in hash tables? Why is primary clustering a problem for linear probing?
Define load factor. How does it affect expected search time?
What is rehashing and when should it be triggered?
How should deletion be handled in open addressing so that later searches still work correctly?
Compare expected-case and worst-case performance of hash-table search.

Graphs and Graph Algorithms

Define graph, vertex, edge, degree, path, cycle, connected component, directed graph, weighted graph, and DAG.
Compare adjacency matrix and adjacency list representations in terms of memory and operation costs.
Run BFS from a given start vertex and show the queue contents at each step.
Run DFS from a given start vertex and show discovery order and finish order.
How can DFS be used to detect cycles?
What is topological sorting? Why is it defined only for directed acyclic graphs?
Run topological sorting on a small prerequisite graph.
Define a minimum spanning tree. What assumptions are required for MST algorithms?
Apply Prim's algorithm to a weighted undirected graph and show the selected edges in order.
Apply Kruskal's algorithm to the same graph and compare the selected edge order.
What data structure is useful for efficient Kruskal implementation? Why?
Explain how graphs are used to model real-world systems such as roads, social networks, dependencies, and communication networks.

Sorting

Compare insertion sort, selection sort, merge sort, quicksort, and heap sort in terms of time complexity.
What does it mean for a sorting algorithm to be stable?
What does it mean for a sorting algorithm to be in-place?
Trace merge sort on a short list and show the split/merge stages.
Trace quicksort using a specified pivot rule. How does pivot choice affect performance?
Explain why comparison sorting has an Ω(n log n) lower bound.
When might insertion sort be preferable to more advanced algorithms?
How does heap sort use a heap to repeatedly select the next element?
What is the difference between internal sorting and external sorting?
How can sorting be used as a preprocessing step for searching, duplicate detection, or graph algorithms?

Integrated Exam-Style Practice

Given a sequence of operations on a stack and queue, show the final state of each data structure.
Given an infix expression, convert it to postfix and evaluate the postfix expression for specific variable values.
Given a sequence of insertions, build a BST and then convert it into an AVL tree by applying the required rotations.
Given a min-heap array, perform two insertions and one delete-min operation.
Given a hash table size and collision strategy, insert keys and compute the number of probes for each search.
Given a graph, produce BFS order, DFS order, topological order if possible, and connected components.
Given a weighted graph, compute the MST using both Prim and Kruskal and compare results.
Given pseudocode with nested loops and recursive calls, derive the time complexity.
Given a project scenario, choose appropriate data structures and justify your choices.
Given a buggy implementation of a data structure, identify the violated invariant and propose a correction.

How to Succeed in the Course

Trace by Hand

Most structures become clear only after tracing operations manually. Practice drawing states after every insertion, deletion, traversal, or rotation.

Implement Carefully

Implementation mistakes often occur at boundaries: empty structures, one-element structures, duplicate keys, root deletion, and disconnected graphs.

Connect Cost to Design

Always ask which operation must be fast. A good data-structure choice depends on the workload, not on the structure being popular.