Quinnipiac University

CSC 375/575 Generative AI

About the Course

CSC 375/575 Generative AI is an advanced course that takes students from foundational LLM concepts to the implementation of cutting-edge, large-scale AI systems. Starting with Raschka's hands-on approach to building language models from scratch, students master tokenization, attention mechanisms, and transformer architectures. The course then advances to large-scale training systems using Xiao and Zhu's comprehensive framework, covering distributed training, efficient attention variants, and memory-optimization techniques. Advanced topics include systematic prompt design, chain-of-thought reasoning, retrieval-augmented generation (RAG), reinforcement learning from human feedback (RLHF), constitutional AI, and inference-time scaling. Students complete hands-on projects implementing core LLM components and develop expertise in deploying production-ready generative AI systems.

Course Schedule

Course Materials

📄 Course Syllabus

Available Lectures

Lecture  Topic  Materials
1  Introduction to Generative AI  Slides, Handout; PyTorch: Google Colab (Part 1), Google Colab (Part 2)
2  LLM Foundations & Pre-training  Slides, Handout, Notebook, Google Colab
3  Tokenization & Data Processing  Slides, Handout, Notebook, Google Colab
4  Attention Mechanisms & Transformers  Slides, Handout, Notebook
5  Building GPT Architecture: Implementing Core Model Components  Slides, Handout, Notebook, Google Colab
6  Model Training Pipeline: Pre-training Large Language Models from Scratch  Slides, Handout, Notebook, Google Colab
7  Fine-tuning for Text Classification  Slides, Notebook, Google Colab
8  Instruction Fine-tuning: Aligning Models with Human Instructions  Slides, Handout, Notebook, Google Colab
9  Parameter-Efficient Fine-tuning with LoRA  Notebook, Google Colab
10  Prompting Techniques and Chain of Thought  Notebooks (Parts 1-3), Google Colab (Parts 1-3)
11  Alignment: SFT, RLHF, and DPO  Notebooks (Parts 1-2), Google Colab (Parts 1-2)

Assignments

Assignment  Topic  Materials
1  Building Meta's LLaMA Tokenizer  Download Package, Instructions, Submit
2  Building GPT-2 from Scratch - Enhanced with Advanced Concepts  Download Package, Instructions, Submit
3  Classification and Instruction Fine-Tuning  Download Package, Instructions, Submit
4  Prompt Engineering Experiments  Instructions (Submit with Final Project)
Final  Semiconductor Simulation Code Generation  Download, Dataset, Instructions

Course Structure - 5 Progressive Learning Phases

Phase 1: Foundations (Weeks 1-4) - Raschka Ch.1-6

Build LLMs from scratch: tokenization, attention mechanisms, transformer architecture, pre-training, and supervised fine-tuning
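To illustrate the attention mechanism built in this phase, here is a minimal single-head scaled dot-product attention sketch in NumPy (function and variable names are our own, not taken from the course notebooks):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (seq_q, seq_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # numerically stable row-wise softmax
    return weights @ V                                 # weighted sum of value vectors

# Toy example: 3 query/key/value vectors of dimension 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

The course implementation (following Raschka) builds this in PyTorch with learned projection matrices and causal masking; this sketch shows only the core computation.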

Phase 2: Core Implementation (Week 5) - Raschka Ch.7

Master instruction fine-tuning techniques to align models with human instructions and preferences
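Instruction fine-tuning starts by converting (instruction, input, response) records into a fixed prompt template. A common convention is the Alpaca-style format sketched below; the exact template used in the course materials may differ:

```python
def format_instruction(example):
    """Assemble an Alpaca-style prompt (a common convention; details are illustrative)."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
    )
    if example.get("input"):                       # the optional input field is omitted when empty
        prompt += f"### Input:\n{example['input']}\n\n"
    return prompt + "### Response:\n"

sample = {"instruction": "Rewrite the sentence in passive voice.",
          "input": "The model generated the text."}
print(format_instruction(sample))
```

During training, the model's target completion (the ground-truth response) is appended after the `### Response:` marker.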

Phase 3: Large-Scale Training Systems (Weeks 6-10) - Xiao & Zhu Ch.2

Advanced training infrastructure: scaling laws, distributed training, efficient attention variants, and memory optimization for production systems
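Two widely used rules of thumb from the scaling-law literature can be sketched in a few lines (these are standard approximations, not formulas from the course texts): training compute is roughly 6 FLOPs per parameter per token, and the Chinchilla result suggests a compute-optimal token budget of about 20 tokens per parameter.

```python
def training_flops(n_params, n_tokens):
    """Rule of thumb: forward + backward pass costs ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

def chinchilla_tokens(n_params):
    """Chinchilla heuristic: compute-optimal training uses ~20 tokens per parameter."""
    return 20 * n_params

n = 7e9                       # a hypothetical 7B-parameter model
d = chinchilla_tokens(n)      # ~1.4e11 tokens
print(f"tokens: {d:.2e}, training FLOPs: {training_flops(n, d):.2e}")
```

These estimates are order-of-magnitude tools for budgeting training runs; real costs depend on hardware utilization, precision, and parallelism strategy.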

Phase 4: Prompting & Tool Integration (Weeks 11-12) - Xiao & Zhu Ch.3

Systematic prompt design, chain-of-thought reasoning, RAG systems, and automatic prompt optimization techniques
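Chain-of-thought prompting elicits intermediate reasoning steps by demonstrating them in the prompt. A minimal few-shot template looks like this (the example content is ours, not from the course materials):

```python
# A minimal few-shot chain-of-thought prompt template.
# The worked example demonstrates step-by-step reasoning before the final answer,
# and the "Let's think step by step" cue prompts the model to do the same.
cot_prompt = """Q: A cafeteria had 23 apples. It used 20 and bought 6 more. How many apples now?
A: Let's think step by step. 23 - 20 = 3 apples remain. 3 + 6 = 9. The answer is 9.

Q: {question}
A: Let's think step by step."""

filled = cot_prompt.format(
    question="If a train travels 60 km in 1.5 hours, what is its average speed?")
print(filled)
```

In practice, several demonstrations are included and the model's generated reasoning is parsed to extract the final answer.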

Phase 5: Alignment & Inference Optimization (Weeks 13-14) - Xiao & Zhu Ch.4-5

RLHF implementation, constitutional AI, human preference learning, efficient inference, and inference-time scaling
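Among the alignment methods covered here, Direct Preference Optimization (DPO) has a particularly compact loss: for a chosen response y_w and rejected response y_l, it maximizes the sigmoid of a reward margin defined by policy-vs-reference log-probability ratios. A per-pair sketch (the log-probabilities below are made-up inputs, not real model outputs):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (chosen, rejected) pair:
    -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)]).
    logp_* are summed token log-probs under the policy; ref_logp_* under the frozen reference."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# If the policy prefers the chosen response more strongly than the reference does,
# the margin is positive and the loss drops below log(2) (the value at margin 0).
loss = dpo_loss(logp_w=-10.0, logp_l=-14.0, ref_logp_w=-12.0, ref_logp_l=-12.0)
print(loss)
```

In a real implementation the log-probabilities come from batched forward passes over both models, and the loss is averaged and backpropagated through the policy only.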

Currently Available: Lectures 1-11, covering foundational concepts through GPT architecture, pre-training, fine-tuning, prompting, and alignment. Additional materials will be released progressively throughout the semester.

Required Textbooks

Build a Large Language Model (From Scratch)

Sebastian Raschka

Manning Publications, 2024

Primary textbook for Phases 1-2: hands-on LLM implementation from scratch

Foundations of Large Language Models

Tong Xiao and Jingbo Zhu

NLP Lab, Northeastern University & NiuTrans Research, 2025

Primary textbook for Phases 3-5: large-scale systems, prompting, alignment, and inference optimization

Useful Resources

Resource categories: Papers, Tools, Demos, Historical, Tokenization, Visualization