# Lecture 10: File Processing and Data Formats

## Learning Objectives

By the end of this lecture, you will be able to:
1. **Understand file concepts** and persistent data storage principles
2. **Master text file operations**: reading, writing, and updating text files
3. **Work with CSV files** using both manual parsing and the csv module
4. **Handle JSON data**: serialization and deserialization for data interchange
5. **Use context managers** (the `with` statement) for proper resource management
6. **Process structured data** from files into appropriate data structures
7. **Implement file exception handling** for robust file operations
8. **Design file-based applications** following professional patterns

## Setup and Imports

For this lecture, we'll need several modules to work with different file formats. The os module helps us work with file paths and check if files exist. The csv module provides robust CSV file handling. The json module enables us to work with JSON data format. We'll also briefly preview pandas for advanced CSV processing at the end.

In [None]:
# Import required modules for file operations
import os
import csv
import json
from datetime import datetime

# Create a working directory for our examples
if not os.path.exists('lecture10_files'):
    os.makedirs('lecture10_files')
    print("Created lecture10_files directory")

print("Setup complete. Ready for file operations!")

# Part 1: Understanding Files and Persistence

## Why Files Are Essential

Until now, all our programs have stored data in memory - when the program ends, all data disappears. This is like writing notes on a whiteboard that gets erased after class. Files provide persistent storage, like writing in a notebook that you can keep forever. Files allow us to save program state between runs, share data between programs, and process datasets too large for memory. Without files, every program would start fresh with no memory of previous runs, making it impossible to build useful applications like document editors, games with save files, or data analysis tools.

## Demonstrating the Need for Persistence

Let's first see the problem with memory-only storage. When we store data in variables, it exists only while the program runs. This simple example shows how a counter stored in memory resets every time we run the program. In real applications, this would mean losing user preferences, game progress, or any data the user created.

In [None]:
# Problem: Data in memory is temporary
visit_counter = 0  # This always starts at 0
visit_counter += 1

print(f"This is visit number: {visit_counter}")
print("Run this cell multiple times...")
print("The counter always shows 1!")
print("Without files, we can't remember previous values.")
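
As a preview of where this lecture is heading, here is a minimal sketch of the same counter backed by a small text file, so the value survives between runs. The helper name `next_visit` and the file name `visit_count.txt` are illustrative assumptions, and the file operations it uses are explained in Part 2.

```python
import os

def next_visit(counter_path):
    """Return the updated visit count, storing it in a text file between runs."""
    try:
        with open(counter_path, "r") as file:
            count = int(file.read().strip() or 0)
    except FileNotFoundError:
        count = 0  # first run: no file yet

    count += 1
    with open(counter_path, "w") as file:
        file.write(str(count))
    return count

os.makedirs("lecture10_files", exist_ok=True)
visits = next_visit("lecture10_files/visit_count.txt")
print(f"This is visit number: {visits}")
```

Running this cell several times should show the number increasing, because each run reads the value the previous run wrote.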

## File Paths and Locations

Before working with files, we need to understand how to specify where files are located in the file system. A file path is like a street address for a file - it tells Python exactly where to find or create the file. There are two types of paths: absolute paths that specify the complete location from the root of the file system, and relative paths that specify location relative to the current working directory. Understanding paths is crucial because using the wrong path is one of the most common file-related errors.

In [None]:
# Understanding file paths
# Get current working directory
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")

# Relative path example
relative_path = "lecture10_files/data.txt"
print(f"Relative path: {relative_path}")

# Build absolute path
absolute_path = os.path.join(current_dir, relative_path)
print(f"Absolute path: {absolute_path}")
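
The short sketch below shows a few more `os.path` helpers that are useful when a path does not behave as expected. The file `data.txt` is only an example name and does not need to exist.

```python
import os

relative_path = os.path.join("lecture10_files", "data.txt")

# A relative path is resolved against the current working directory
print(os.path.isabs(relative_path))            # False
absolute_path = os.path.abspath(relative_path)
print(os.path.isabs(absolute_path))            # True

# Split a path into its directory part and its file name
directory, name = os.path.split(absolute_path)
print(name)                                    # data.txt

# Checking first helps diagnose wrong-path mistakes before opening the file
print(os.path.exists(absolute_path))
```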

# Part 2: Text File Operations

## The Context Manager Pattern

Python's `with` statement works with context managers: objects, such as the file returned by `open()`, that know how to clean up after themselves. It ensures the file is closed when the block ends, even if an error occurs inside it. Think of it like borrowing a book from the library - you check it out (open), use it (read/write), and must return it (close) when done. The beauty of the `with` statement is that it automatically returns the book even if you forget or if something goes wrong. This prevents resource leaks where files remain open and possibly inaccessible to other programs.

In [None]:
# Writing to a text file using context manager
filename = "lecture10_files/welcome.txt"

# The 'with' statement ensures proper file handling
with open(filename, 'w') as file:
    file.write("Welcome to File Processing!\n")
    file.write("This demonstrates the with statement.\n")
    file.write("Files are automatically closed.\n")
# File is automatically closed here

print(f"Created file: {filename}")
print("The file was automatically closed after the 'with' block")
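
To see what the `with` statement saves us from writing, the sketch below compares it with the manual `try`/`finally` version. The file name `manual_close.txt` is an illustrative assumption.

```python
import os

os.makedirs("lecture10_files", exist_ok=True)
filename = "lecture10_files/manual_close.txt"

# Manual version: we must remember to close, even if write() raises
file = open(filename, "w")
try:
    file.write("Written without a with statement.\n")
finally:
    file.close()
print(file.closed)        # True

# Context-manager version: close() is called for us when the block ends
with open(filename, "a") as file:
    file.write("Written inside a with block.\n")
    inside = file.closed  # still open while the block is running
print(inside, file.closed)
```

Both versions leave the file closed, but the `with` form is shorter and harder to get wrong.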

## Reading Files - Method 1: Read Entire Content

The simplest way to read a file is using the `read()` method, which loads the entire file content into a single string. This is perfect for small files that fit comfortably in memory, like configuration files or short documents. However, for very large files (gigabytes of data), this approach could use too much memory and slow down your program. Always consider file size when choosing how to read files.

In [None]:
# Reading entire file content at once
with open("lecture10_files/welcome.txt", 'r') as file:
    content = file.read()
    
print("File content:")
print("-" * 40)
print(content)
print("-" * 40)
print(f"Total characters: {len(content)}")

## Reading Files - Method 2: Line by Line

For larger files, or when you need to process data line by line, Python provides several methods. The `readlines()` method returns a list where each element is one line from the file, including the newline character; note that it still loads the whole file into memory. Alternatively, you can iterate directly over the file object, which is memory-efficient because it reads one line at a time. This approach is ideal for log files, data files, or any situation where you process records sequentially.

In [None]:
# Create a multi-line file first
with open("lecture10_files/students.txt", 'w') as file:
    file.write("Alice Johnson\n")
    file.write("Bob Smith\n")
    file.write("Carol Davis\n")
    file.write("David Wilson\n")

# Method 2a: readlines() returns a list
with open("lecture10_files/students.txt", 'r') as file:
    lines = file.readlines()
    
print(f"readlines() returned {len(lines)} lines:")
for i, line in enumerate(lines, 1):
    print(f"Line {i}: {line.strip()}")  # strip() removes newline

## Memory-Efficient Line Processing

When working with very large files, loading everything into memory at once can be problematic. Python allows us to iterate over a file object directly, which reads one line at a time. This is like reading a book page by page instead of photocopying the entire book first. This approach uses minimal memory regardless of file size, making it perfect for processing large log files or datasets.

In [None]:
# Method 2b: Iterate directly over file (memory efficient)
print("Processing file line by line:")
print("-" * 40)

with open("lecture10_files/students.txt", 'r') as file:
    for line_number, line in enumerate(file, 1):
        # Process each line as it's read
        cleaned_line = line.strip()
        print(f"{line_number}: {cleaned_line} ({len(cleaned_line)} chars)")
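
The same pattern scales to bigger inputs. As a rough sketch, the cell below builds a 1,000-line stand-in for a log file (the file name and the `INFO`/`ERROR` line format are made up for this example) and then summarizes it in one pass without building a list of all lines.

```python
import os

os.makedirs("lecture10_files", exist_ok=True)
path = "lecture10_files/sample_log.txt"

# Build a small sample file (a stand-in for a much larger log)
with open(path, "w") as file:
    for i in range(1, 1001):
        level = "ERROR" if i % 100 == 0 else "INFO"
        file.write(f"{level} event {i}\n")

# Process the file one line at a time, keeping only running totals
error_count = 0
longest = 0
with open(path, "r") as file:
    for line in file:
        if line.startswith("ERROR"):
            error_count += 1
        longest = max(longest, len(line.rstrip("\n")))

print(f"Errors: {error_count}, longest line: {longest} characters")
```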

## Appending to Files

Sometimes we need to add new content to an existing file without destroying what's already there. The append mode ('a') opens a file for writing but preserves existing content, adding new data at the end. This is perfect for log files where each program run adds new entries, or for any situation where you're accumulating data over time. Think of it like adding new entries to a diary rather than starting a new diary each time.

In [None]:
# Appending to an existing file
log_file = "lecture10_files/activity_log.txt"

# Append mode adds to the end of file
with open(log_file, 'a') as file:
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    file.write(f"{timestamp}: Program started\n")
    file.write(f"{timestamp}: Processing data\n")
    file.write(f"{timestamp}: Operation completed\n")

# Show the accumulated log
print("Log file contents:")
with open(log_file, 'r') as file:
    print(file.read())
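
The difference between write mode and append mode is easiest to see side by side. This is a small sketch using a throwaway file name (`mode_demo.txt`) chosen for the example.

```python
import os

os.makedirs("lecture10_files", exist_ok=True)
path = "lecture10_files/mode_demo.txt"

with open(path, "w") as file:      # 'w' starts from an empty file
    file.write("first\n")

with open(path, "a") as file:      # 'a' keeps "first" and adds after it
    file.write("second\n")

with open(path, "r") as file:
    after_append = file.read().splitlines()

with open(path, "w") as file:      # 'w' again: previous content is discarded
    file.write("third\n")

with open(path, "r") as file:
    after_write = file.read().splitlines()

print(after_append)   # ['first', 'second']
print(after_write)    # ['third']
```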

## Exercise 1: Create a Personal Diary

Create a simple diary application that:
- Prompts the user for a diary entry
- Appends the entry with a timestamp to diary.txt
- Shows all previous entries

In [None]:
# Your solution here
diary_file = "lecture10_files/diary.txt"

# Get user entry
# entry = input("Enter your diary entry: ")

# For demo purposes, using a fixed entry
entry = "Today I learned about file operations in Python!"

In [None]:
# Solution
diary_file = "lecture10_files/diary.txt"

# For demo, using fixed entry instead of input()
entry = "Today I learned about file operations in Python!"

# Append entry with timestamp
with open(diary_file, 'a') as file:
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    file.write(f"\n[{timestamp}]\n")
    file.write(f"{entry}\n")
    file.write("-" * 50 + "\n")

# Display all entries
print("=== My Diary ===")
try:
    with open(diary_file, 'r') as file:
        print(file.read())
except FileNotFoundError:
    print("No diary entries yet.")

# Part 3: Working with CSV Files

## Understanding CSV Format

CSV (Comma-Separated Values) is a simple file format for storing tabular data, like a spreadsheet saved as plain text. Each line in the file represents a row, and commas separate the columns. It's one of the most common formats for data exchange because virtually every spreadsheet program and database can import and export CSV files. Think of CSV as the universal language for sharing structured data between different programs.

## Creating a CSV File Manually

Let's start by creating a CSV file manually to understand its structure. While this works for simple cases, it can become problematic when data contains commas, quotes, or newlines. That's why Python provides the csv module for robust CSV handling. For now, we'll keep it simple to see exactly how CSV files are structured.

In [None]:
# Create a simple CSV file manually
csv_file = "lecture10_files/grades.csv"

with open(csv_file, 'w') as file:
    # Write header row
    file.write("Name,Midterm,Final,Average\n")
    
    # Write data rows
    file.write("Alice,92,88,90.0\n")
    file.write("Bob,78,85,81.5\n")
    file.write("Carol,95,92,93.5\n")
    file.write("David,88,90,89.0\n")

print(f"Created CSV file: {csv_file}")

## Reading CSV Files Manually

We can read CSV files using basic string operations. This approach splits each line by commas and processes the resulting values. While this works for simple CSV files, it has limitations - it can't handle commas within quoted fields, different delimiters, or other CSV complexities. Understanding manual parsing helps you appreciate why the csv module exists.

In [None]:
# Read CSV manually using string operations
students = []

with open("lecture10_files/grades.csv", 'r') as file:
    # Read and process header
    header = file.readline().strip().split(',')
    print(f"Headers: {header}")
    
    # Read data rows
    for line in file:
        values = line.strip().split(',')
        # Create dictionary for each student
        student = dict(zip(header, values))
        students.append(student)

# Display the data
print(f"\nLoaded {len(students)} students:")
for student in students:
    print(f"{student['Name']}: Average = {student['Average']}")
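
To make the limitation concrete, here is a small sketch with one record whose name contains a comma. The data is invented for the example, and `csv.reader` (introduced in the next section) is used only for comparison.

```python
import csv
import io

# One header line and one record whose name contains a comma
raw = 'Name,Midterm,Final\n"Johnson, Alice",92,88\n'

lines = raw.splitlines()
naive = lines[1].split(",")
print(naive)     # ['"Johnson', ' Alice"', '92', '88']  (4 fields, name is broken)

rows = list(csv.reader(io.StringIO(raw)))
print(rows[1])   # ['Johnson, Alice', '92', '88']
```

The manual split produces four fields instead of three, so zipping it with the header would silently misalign the columns.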

## Using the csv Module - Reader

Python's csv module provides robust CSV handling that correctly deals with edge cases like commas in data, quoted fields, and different delimiters. The csv.reader returns each row as a list of values. This module is part of Python's standard library, so it's always available and thoroughly tested. Professional code should always use the csv module rather than manual string parsing.

In [None]:
# Using csv.reader for robust CSV parsing
import csv

with open("lecture10_files/grades.csv", 'r') as file:
    csv_reader = csv.reader(file)
    
    # Read header
    headers = next(csv_reader)
    print(f"Headers: {headers}")
    
    # Read and display data
    print("\nStudent data:")
    for row in csv_reader:
        name, midterm, final, average = row
        print(f"{name}: Midterm={midterm}, Final={final}, Avg={average}")

## Using csv.DictReader for Cleaner Code

The csv.DictReader class makes CSV processing more intuitive by returning each row as a dictionary with column names as keys. This eliminates the need to remember column positions and makes code more readable and maintainable. When you access row['Name'] instead of row[0], the code clearly shows what data you're working with, reducing errors and improving code clarity.

In [None]:
# Using DictReader for more readable code
grade_data = []

with open("lecture10_files/grades.csv", 'r') as file:
    csv_reader = csv.DictReader(file)
    
    # Each row is a dictionary
    for row in csv_reader:
        # Access by column name, not index
        midterm = float(row['Midterm'])
        final = float(row['Final'])
        
        # Add calculated field
        row['Letter'] = 'A' if float(row['Average']) >= 90 else 'B'
        grade_data.append(row)

# Display enriched data
for student in grade_data:
    print(f"{student['Name']}: {student['Average']} ({student['Letter']})")

## Writing CSV Files with csv.writer

The csv.writer class handles the complexities of creating properly formatted CSV files. It automatically handles special cases like values containing commas or quotes, ensuring your CSV files can be read by any standard CSV parser. Always pass newline='' when opening a file for the csv module, on every platform: the module does its own newline handling, and without this argument you can get extra blank lines between rows on Windows.

In [None]:
# Writing CSV with csv.writer
products = [
    ['Product', 'Price', 'Quantity', 'Total'],
    ['Apple', 0.50, 100, 50.00],
    ['Banana', 0.30, 150, 45.00],
    ['Orange', 0.80, 75, 60.00],
    ['Grapes', 2.50, 30, 75.00]
]

with open("lecture10_files/inventory.csv", 'w', newline='') as file:
    writer = csv.writer(file)
    
    # Write all rows at once
    writer.writerows(products)

print("Created inventory.csv")

# Verify the file
with open("lecture10_files/inventory.csv", 'r') as file:
    print(file.read())
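
When your rows are already dictionaries, `csv.DictWriter` is the writing counterpart of `csv.DictReader`. The sketch below uses made-up product data and a hypothetical file name, and includes a value containing a comma to show that quoting is handled for us.

```python
import csv
import os

os.makedirs("lecture10_files", exist_ok=True)
path = "lecture10_files/inventory_dict.csv"

items = [
    {"Product": "Apple", "Price": 0.50, "Quantity": 100},
    {"Product": "Trail mix, salted", "Price": 3.25, "Quantity": 12},
]

with open(path, "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["Product", "Price", "Quantity"])
    writer.writeheader()        # header row comes from fieldnames
    writer.writerows(items)

# Read it back to confirm the comma inside the product name survived
with open(path, "r", newline="") as file:
    loaded = list(csv.DictReader(file))

print(loaded[1]["Product"])   # Trail mix, salted
```

Note that every value comes back as a string, so numeric columns still need `int()` or `float()` after reading.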

## Exercise 2: Student Grade Tracker

Create a program that:
- Reads student names and scores from user input
- Calculates letter grades (90+=A, 80+=B, 70+=C, 60+=D, <60=F)
- Saves data to a CSV file with headers: Name, Score, Letter

In [None]:
# Your solution here
def calculate_letter_grade(score):
    # Add your grade calculation logic here
    pass

In [None]:
# Solution
def calculate_letter_grade(score):
    """Calculate letter grade from numeric score."""
    if score >= 90:
        return 'A'
    elif score >= 80:
        return 'B'
    elif score >= 70:
        return 'C'
    elif score >= 60:
        return 'D'
    else:
        return 'F'

# Sample data (normally from user input)
students = [
    ('Alice Johnson', 92),
    ('Bob Smith', 85),
    ('Carol Davis', 78),
    ('David Wilson', 95),
    ('Eve Brown', 68)
]

# Write to CSV
with open("lecture10_files/final_grades.csv", 'w', newline='') as file:
    writer = csv.writer(file)
    
    # Write header
    writer.writerow(['Name', 'Score', 'Letter'])
    
    # Process and write student data
    for name, score in students:
        letter = calculate_letter_grade(score)
        writer.writerow([name, score, letter])

# Display the results
print("Grade Report:")
print("-" * 40)
with open("lecture10_files/final_grades.csv", 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(f"{row['Name']}: {row['Score']} ({row['Letter']})")

# Part 4: JSON Data Format

## Understanding JSON

JSON (JavaScript Object Notation) is a lightweight, human-readable format for storing and exchanging structured data. Despite its name suggesting JavaScript origins, JSON is language-independent and has become the standard for web APIs and configuration files. JSON maps naturally to Python data structures: JSON objects become Python dictionaries, JSON arrays become Python lists. This natural mapping makes JSON perfect for saving complex Python data structures and loading them back later.

## JSON Serialization - Python to JSON

Serialization is the process of converting Python objects into JSON format. The json.dumps() function converts Python objects to JSON strings, while json.dump() writes directly to files. JSON supports basic data types: dictionaries, lists, strings, numbers, booleans, and None (null in JSON). Complex objects like dates or custom classes need special handling. The indent parameter makes JSON human-readable by adding proper formatting.

In [None]:
# Creating a Python dictionary to save as JSON
student_record = {
    'name': 'Alice Johnson',
    'student_id': 'A12345',
    'age': 20,
    'courses': ['Python Programming', 'Data Structures', 'Calculus'],
    'gpa': 3.85,
    'active': True,
    'advisor': None
}

# Convert to JSON string (for display)
json_string = json.dumps(student_record, indent=2)
print("JSON representation:")
print(json_string)
print(f"\nData type: {type(json_string)}")
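
One common form of the "special handling" mentioned above is the `default` parameter of `json.dumps()`, which is called for any object the encoder cannot serialize on its own. The sketch below stores a `datetime` as an ISO 8601 string; the helper name `encode_extra` and the sample record are assumptions for illustration.

```python
import json
from datetime import datetime

record = {"name": "Alice Johnson", "enrolled": datetime(2024, 1, 15, 9, 30)}

try:
    json.dumps(record)
except TypeError as error:
    print(f"Plain dumps() fails: {error}")

def encode_extra(value):
    """Fallback used by json.dumps for types it cannot serialize itself."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")

json_string = json.dumps(record, default=encode_extra)
print(json_string)   # {"name": "Alice Johnson", "enrolled": "2024-01-15T09:30:00"}

# Loading gives back a plain string, so we convert it ourselves
restored = json.loads(json_string)
enrolled = datetime.fromisoformat(restored["enrolled"])
```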

## Saving JSON to Files

When saving JSON to files, use json.dump() (without the 's') which writes directly to a file object. This is more efficient than creating a string first and then writing it. The indent parameter creates pretty-printed JSON that's easy for humans to read and edit. Without indentation, JSON is compact but harder to read. Choose based on whether humans or only programs will read the file.

In [None]:
# Save complex data structure to JSON file
course_data = {
    'course_name': 'Introduction to Python',
    'course_code': 'CS101',
    'instructor': 'Dr. Smith',
    'schedule': {
        'days': ['Monday', 'Wednesday', 'Friday'],
        'time': '10:00 AM',
        'room': 'Science 201'
    },
    'enrolled_students': 25,
    'assignments': [
        {'name': 'Homework 1', 'due': '2024-02-01', 'points': 100},
        {'name': 'Midterm', 'due': '2024-03-15', 'points': 200}
    ]
}

# Write to JSON file
with open("lecture10_files/course_info.json", 'w') as file:
    json.dump(course_data, file, indent=4)

print("Saved course data to JSON file")

## Loading JSON from Files

Deserialization is the reverse process - converting JSON back to Python objects. The json.load() function reads from a file and returns Python objects. JSON's structure is preserved: objects become dictionaries, arrays become lists. This makes JSON perfect for configuration files, data exchange between programs, and saving application state. The loaded data behaves exactly like any Python dictionary or list.

In [None]:
# Load JSON from file
with open("lecture10_files/course_info.json", 'r') as file:
    loaded_course = json.load(file)

# Access the loaded data
print(f"Course: {loaded_course['course_name']}")
print(f"Instructor: {loaded_course['instructor']}")
print(f"Meets on: {', '.join(loaded_course['schedule']['days'])}")
print(f"\nAssignments:")

for assignment in loaded_course['assignments']:
    print(f"- {assignment['name']}: {assignment['points']} points")
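
A round trip through JSON is not always perfectly symmetric, because JSON has no tuple type and its object keys are always strings. The small sketch below, with invented data, shows the two differences you are most likely to meet.

```python
import json

original = {"point": (3, 4), 1: "one", "advisor": None, "active": True}

text = json.dumps(original)
print(text)       # {"point": [3, 4], "1": "one", "advisor": null, "active": true}

restored = json.loads(text)
print(restored["point"])     # [3, 4]  (the tuple came back as a list)
print("1" in restored)       # True    (the integer key became a string)
print(restored == original)  # False
```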

## Practical JSON Application - Settings Manager

JSON is perfect for application settings because it's human-readable and editable. Users can modify settings with a text editor if needed, and the structured format prevents errors. This example shows a common pattern: load settings at startup, provide defaults if the file doesn't exist, and save changes when users modify settings. This pattern appears in countless applications from games to productivity software.

In [None]:
class AppSettings:
    """Manage application settings with JSON persistence."""
    
    def __init__(self, filename="lecture10_files/settings.json"):
        self.filename = filename
        self.settings = self.load_settings()
    
    def load_settings(self):
        """Load settings or create defaults."""
        try:
            with open(self.filename, 'r') as file:
                return json.load(file)
        except FileNotFoundError:
            # Create default settings
            defaults = {
                'theme': 'light',
                'font_size': 12,
                'auto_save': True,
                'recent_files': [],
                'window_position': {'x': 100, 'y': 100}
            }
            self.save_settings(defaults)
            return defaults

    def save_settings(self, settings=None):
        """Save current settings to file."""
        if settings is not None:
            self.settings = settings
        
        with open(self.filename, 'w') as file:
            json.dump(self.settings, file, indent=2)
    
    def get(self, key, default=None):
        """Get a setting value."""
        return self.settings.get(key, default)
    
    def set(self, key, value):
        """Update a setting."""
        self.settings[key] = value
        self.save_settings()

# Use the settings manager
app = AppSettings()
print(f"Current theme: {app.get('theme')}")
print(f"Font size: {app.get('font_size')}")

# Change settings
app.set('theme', 'dark')
app.set('font_size', 14)
print("\nSettings updated and saved!")

## Exercise 3: Contact Manager with JSON

Create a contact manager that:
- Stores contacts in JSON format
- Each contact has: name, phone, email
- Provides add and search functionality
- Persists data between runs

In [None]:
# Your solution here
class ContactManager:
    def __init__(self, filename="lecture10_files/contacts.json"):
        # Initialize your contact manager
        pass

In [None]:
# Solution
class ContactManager:
    def __init__(self, filename="lecture10_files/contacts.json"):
        self.filename = filename
        self.contacts = self.load_contacts()
    
    def load_contacts(self):
        """Load contacts from JSON file."""
        try:
            with open(self.filename, 'r') as file:
                return json.load(file)
        except FileNotFoundError:
            return []  # Empty list if no file exists
    
    def save_contacts(self):
        """Save contacts to JSON file."""
        with open(self.filename, 'w') as file:
            json.dump(self.contacts, file, indent=2)
    
    def add_contact(self, name, phone, email):
        """Add a new contact."""
        contact = {
            'name': name,
            'phone': phone,
            'email': email
        }
        self.contacts.append(contact)
        self.save_contacts()
        print(f"Added contact: {name}")

    def search(self, query):
        """Search contacts by name."""
        query = query.lower()
        results = []
        
        for contact in self.contacts:
            if query in contact['name'].lower():
                results.append(contact)
        
        return results
    
    def display_all(self):
        """Display all contacts."""
        if not self.contacts:
            print("No contacts found.")
            return
        
        print(f"\nContacts ({len(self.contacts)} total):")
        print("-" * 50)
        for contact in self.contacts:
            print(f"Name: {contact['name']}")
            print(f"Phone: {contact['phone']}")
            print(f"Email: {contact['email']}")
            print("-" * 50)

# Test the contact manager
contacts = ContactManager()

# Add some contacts
contacts.add_contact("Alice Johnson", "555-0101", "alice@email.com")
contacts.add_contact("Bob Smith", "555-0102", "bob@email.com")
contacts.add_contact("Carol Davis", "555-0103", "carol@email.com")

# Display all
contacts.display_all()

# Search
print("\nSearching for 'john':")
results = contacts.search('john')
for contact in results:
    print(f"Found: {contact['name']} - {contact['phone']}")

# Part 5: Exception Handling for Files

## Why File Operations Need Special Care

File operations are inherently risky because they depend on external resources outside our program's control. Files might not exist, we might lack permission to read or write them, disks might be full, or network drives might be disconnected. Professional programs must handle these situations gracefully rather than crashing. Good error handling provides helpful messages to users and recovers when possible, making programs reliable and user-friendly.

In [None]:
# Common file exceptions and how to handle them
def read_config_file(filename):
    """Read configuration with comprehensive error handling."""
    try:
        with open(filename, 'r') as file:
            config = json.load(file)
            print("Configuration loaded successfully")
            return config
            
    except FileNotFoundError:
        print(f"Config file '{filename}' not found. Creating defaults...")
        return {'version': '1.0', 'debug': False}
        
    except PermissionError:
        print(f"Permission denied accessing '{filename}'")
        return None
        
    except json.JSONDecodeError as e:
        print(f"Invalid JSON in config file: {e}")
        return None
        
    except Exception as e:
        print(f"Unexpected error: {e}")
        return None

# Test with non-existent file
config = read_config_file("lecture10_files/config.json")
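
The order of `except` clauses matters because `FileNotFoundError` and `PermissionError` are both subclasses of `OSError`: specific handlers go first, broader ones last. The sketch below is a simplified variant of the function above; the name `describe_failure` and the file `broken_example.json` are assumptions made for the demonstration.

```python
import json
import os

def describe_failure(path):
    """Return a short label for what went wrong while loading JSON from path."""
    try:
        with open(path, "r") as file:
            json.load(file)
        return "ok"
    except FileNotFoundError:
        return "missing file"
    except json.JSONDecodeError:
        return "invalid JSON"
    except OSError as error:   # any other OS-level failure, e.g. PermissionError
        return f"other OS error: {type(error).__name__}"

os.makedirs("lecture10_files", exist_ok=True)
broken_path = "lecture10_files/broken_example.json"
with open(broken_path, "w") as file:
    file.write("{not valid json")

print(describe_failure("lecture10_files/no_such_file.json"))  # missing file
print(describe_failure(broken_path))                          # invalid JSON
print(describe_failure("lecture10_files"))                    # a directory, not a file
```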

## Creating Robust File Operations

Robust file operations anticipate and handle potential problems. This includes checking if files exist before reading, creating directories if needed, making backups before overwriting important files, and providing clear error messages. The goal is to make programs that work reliably in real-world conditions, not just in perfect testing environments.

In [None]:
def safe_save_data(data, filename, create_backup=True):
    """Save data with backup and error recovery."""
    # Ensure directory exists
    directory = os.path.dirname(filename)
    if directory and not os.path.exists(directory):
        os.makedirs(directory)
    
    # Define the backup name up front so the recovery code can always use it
    backup_name = filename + '.backup'
    backup_created = False
    
    # Create backup if requested and file exists
    if create_backup and os.path.exists(filename):
        try:
            with open(filename, 'r') as source:
                with open(backup_name, 'w') as backup:
                    backup.write(source.read())
            backup_created = True
            print(f"Created backup: {backup_name}")
        except Exception as e:
            print(f"Warning: Could not create backup: {e}")
    
    # Save new data
    try:
        with open(filename, 'w') as file:
            json.dump(data, file, indent=2)
        print(f"Data saved successfully to {filename}")
        return True
    except Exception as e:
        print(f"Error saving data: {e}")
        # Try to restore the previous contents from this run's backup
        if backup_created:
            print("Attempting to restore from backup...")
            try:
                with open(backup_name, 'r') as backup:
                    with open(filename, 'w') as target:
                        target.write(backup.read())
                print("Restored previous contents from backup")
            except Exception as restore_error:
                print(f"Restore failed: {restore_error}")
        return False

# Test the function
test_data = {'important': 'data', 'value': 42}
safe_save_data(test_data, "lecture10_files/important_data.json")
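
A different technique from the backup approach above, and one that is often combined with it, is to write to a temporary file and then rename it over the target. The sketch below is one possible version under that assumption; the function name `atomic_save_json` and the output file name are hypothetical. On most systems `os.replace()` swaps the file in a single step, so readers see either the old file or the complete new one.

```python
import json
import os
import tempfile

def atomic_save_json(data, filename):
    """Write JSON to a temporary file, then swap it into place with os.replace."""
    directory = os.path.dirname(filename) or "."
    os.makedirs(directory, exist_ok=True)

    # Keep the temp file in the same directory so the rename stays on one filesystem
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as file:
            json.dump(data, file, indent=2)
        os.replace(temp_path, filename)   # only reached after a complete write
    except Exception:
        if os.path.exists(temp_path):
            os.remove(temp_path)          # clean up the partial temp file
        raise

atomic_save_json({"important": "data", "value": 42},
                 "lecture10_files/important_data_atomic.json")
```

If serialization fails partway through, the original file is left untouched and the exception is re-raised for the caller to handle.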

# Part 6: Integrated File Application

## Building a Complete File-Based System

Let's combine everything we've learned to build a student grade management system. This application demonstrates real-world file usage: JSON for configuration and data storage, CSV for reports, text files for logs, and comprehensive error handling. This is the kind of system you might build for a small school or training center.

In [None]:
class StudentGradeSystem:
    """Complete grade management system with file persistence."""
    
    def __init__(self, data_dir="lecture10_files/grade_system"):
        self.data_dir = data_dir
        self.ensure_directories()
        self.students = self.load_students()
        self.log_action("System started")
    
    def ensure_directories(self):
        """Create necessary directories."""
        dirs = [self.data_dir, 
                os.path.join(self.data_dir, 'reports'),
                os.path.join(self.data_dir, 'backups')]
        
        for directory in dirs:
            if not os.path.exists(directory):
                os.makedirs(directory)
    
    def load_students(self):
        """Load student data from JSON."""
        filename = os.path.join(self.data_dir, 'students.json')
        try:
            with open(filename, 'r') as file:
                return json.load(file)
        except FileNotFoundError:
            return {}

    def save_students(self):
        """Save student data to JSON."""
        filename = os.path.join(self.data_dir, 'students.json')
        with open(filename, 'w') as file:
            json.dump(self.students, file, indent=2)
    
    def log_action(self, action):
        """Log actions to text file."""
        log_file = os.path.join(self.data_dir, 'system.log')
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        
        with open(log_file, 'a') as file:
            file.write(f"{timestamp}: {action}\n")
    
    def add_student(self, student_id, name):
        """Add a new student."""
        if student_id not in self.students:
            self.students[student_id] = {
                'name': name,
                'grades': {},
                'enrollment_date': datetime.now().isoformat()
            }
            self.save_students()
            self.log_action(f"Added student: {name} (ID: {student_id})")
            print(f"Student {name} added successfully")
        else:
            print(f"Student ID {student_id} already exists")

    def add_grade(self, student_id, course, grade):
        """Add a grade for a student."""
        if student_id in self.students:
            self.students[student_id]['grades'][course] = grade
            self.save_students()
            self.log_action(f"Added grade: {student_id} - {course}: {grade}")
            print(f"Grade added: {course} = {grade}")
        else:
            print(f"Student ID {student_id} not found")
    
    def generate_report(self, report_format='csv'):
        """Generate grade report in specified format."""
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        
        if report_format == 'csv':
            filename = os.path.join(self.data_dir, 'reports', 
                                  f'grades_{timestamp}.csv')
            
            with open(filename, 'w', newline='') as file:
                writer = csv.writer(file)
                writer.writerow(['Student ID', 'Name', 'Course', 'Grade'])
                
                for sid, sdata in self.students.items():
                    for course, grade in sdata['grades'].items():
                        writer.writerow([sid, sdata['name'], course, grade])
            
            print(f"Report generated: {filename}")
            self.log_action(f"Generated CSV report: {filename}")

In [None]:
# Use the complete system
grade_system = StudentGradeSystem()

# Add some students
grade_system.add_student("S001", "Alice Johnson")
grade_system.add_student("S002", "Bob Smith")
grade_system.add_student("S003", "Carol Davis")

# Add grades
grade_system.add_grade("S001", "Python", 95)
grade_system.add_grade("S001", "Math", 88)
grade_system.add_grade("S002", "Python", 87)
grade_system.add_grade("S002", "Math", 92)

# Generate report
grade_system.generate_report()

# Show the log
print("\nSystem Log:")
log_file = os.path.join(grade_system.data_dir, 'system.log')
with open(log_file, 'r') as file:
    print(file.read())

# Part 7: Brief Introduction to pandas for CSV

## pandas: The Power Tool for Data Files

While the csv module works well for basic CSV operations, pandas is Python's power tool for data analysis. pandas can read CSV files into DataFrames - powerful data structures that act like in-memory spreadsheets. With pandas, you can filter, sort, group, and analyze data with simple commands. This is just a preview - you'll learn much more about pandas in future courses.

In [None]:
# First, check whether pandas is installed
try:
    import pandas as pd
    print("pandas is already installed")
except ImportError:
    print("pandas not installed")
    print("To install: pip install pandas")
    # For now, we'll skip the pandas demo

In [None]:
# If pandas is available, demonstrate its power
try:
    import pandas as pd
    
    # Read our grades CSV into a DataFrame
    df = pd.read_csv("lecture10_files/final_grades.csv")
    
    print("DataFrame contents:")
    print(df)
    print("\nDataFrame info:")
    df.info()  # info() prints its summary itself and returns None
    
    # Simple analysis
    print("\nGrade distribution:")
    print(df['Letter'].value_counts())
    
    # Filter data
    print("\nStudents with A grades:")
    a_students = df[df['Letter'] == 'A']
    print(a_students)
    
except ImportError:
    print("pandas demo skipped - not installed")

## Final Exercise: Build Your Own Data Application

Create a personal expense tracker that:
1. Stores expenses in JSON format
2. Each expense has: date, category, amount, description
3. Generates CSV reports by category
4. Maintains a text log of all transactions
5. Handles all file errors gracefully

In [None]:
# Your final project here
class ExpenseTracker:
    def __init__(self):
        # Design your expense tracking system
        pass

## Summary and Key Takeaways

Congratulations! You've mastered file operations in Python:

1. **Text Files**: Simple read/write operations for unstructured data
2. **CSV Files**: Structured tabular data with the csv module
3. **JSON Files**: Complex structured data with nested relationships
4. **Context Managers**: The `with` statement ensures proper resource cleanup
5. **Exception Handling**: Robust file operations that handle real-world problems
6. **File Paths**: Understanding absolute vs relative paths
7. **Professional Patterns**: Backups, logging, and error recovery

Files transform your programs from temporary tools to persistent applications. Whether you're building a game that saves progress, a data analysis tool that processes datasets, or a web application that stores user data, file operations are essential. As you continue your programming journey, you'll use these skills in virtually every substantial program you write.

Next lecture, we'll explore NumPy arrays - a powerful tool that will revolutionize how you work with numerical data!