Quinnipiac University

INF 659 Probability and Data Analysis

Final Project

The final project is an open-ended applied data analysis project. Students choose a practical question, find a real dataset, and use statistical tools from INF 659 to produce an evidence-based conclusion. The emphasis is on asking one good question and answering it well.

What every project needs

  • One focused primary question
  • One documented real-world dataset
  • At least one substantial inferential method from the second half of the course
  • At least two labeled visualizations
  • A clear limitations discussion

Good applied directions

  • Comparing groups
  • Explaining or predicting an outcome
  • Working with categories and proportions
  • Operations, risk, and uncertainty decisions
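As one illustration of the "comparing groups" direction, the sketch below runs a two-sided permutation test for a difference in group means using only the Python standard library. It is a minimal example under stated assumptions, not a required method: the function name permutation_test, the variable names, and the numbers are hypothetical placeholders, and your project should use a method covered in the course and a real dataset.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration; returns (observed_diff, p_value).
    """
    rng = random.Random(seed)  # fixed seed so the notebook is reproducible
    observed = fmean(group_a) - fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        # Small tolerance guards against floating-point rounding in ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up placeholder values for illustration only (NOT real data).
wait_times_site_a = [12.1, 9.8, 14.3, 11.0, 13.5, 10.2, 12.9, 11.7]
wait_times_site_b = [15.2, 13.9, 16.8, 14.1, 17.3, 15.5, 13.2, 16.0]

observed_diff, p_value = permutation_test(wait_times_site_a, wait_times_site_b)
print(f"Difference in means (A - B): {observed_diff:.3f}")
print(f"Estimated two-sided p-value: {p_value:.4f}")
```

A small p-value here would only say that a difference this large is unlikely under random relabeling of the groups. The report should still discuss effect size, how the data were collected, and limitations.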

Deliverables

  • Proposal
  • Checkpoint
  • Final notebook plus report or annotated slides
  • Presentation of about 5 minutes plus questions

Submission method

  • Email project materials directly to rongyu.lin@quinnipiac.edu
  • Use a clear subject line such as "INF 659 Final Project - Your Name"
  • If files are too large, include a shareable cloud link in the email

Applied focus: most projects should use a real dataset and aim at a practical interpretation. Kaggle is a good place to start looking for real datasets, and you may also use other reliable public sources. A narrow, well-executed analysis is better than an ambitious but unfinished project.

Spring 2026 Timeline

Milestone | Purpose | Target Date
Proposal | Confirm the question, dataset, and planned methods before the project grows too broad. Submit by email. | Fri, Apr 24
Checkpoint | Show cleaned data, one draft figure, and one preliminary result. Submit by email. | Mon, Apr 27
Presentation | Share the question, evidence, conclusion, and limitation with the class. | Wed, Apr 29
Final Submission | Email the polished notebook plus report or annotated slides to the instructor. | Wed, May 6

Grading Emphasis

Criterion | Points | What strong work looks like
Question and scope | 10 | A focused, meaningful, course-appropriate question.
Data quality and documentation | 15 | Clear source, understandable variables, and a dataset that supports the analysis.
Method choice and design | 25 | Methods fit the question and are explained rather than only applied.
Accuracy and interpretation | 20 | Careful reasoning, correct calculations, and conclusions that do not overclaim.
Communication and organization | 15 | A coherent notebook, readable figures, and a clear story.
Reproducibility and professionalism | 5 | Reasonable notebook structure and complete references.
Presentation | 10 | A concise talk centered on evidence and conclusions, not just code.