Final Project: Semiconductor Simulation Code Generation

CSC 375/575 - Generative AI | Fall 2025
Prof. Rongyu Lin, Quinnipiac University

Project Overview

Goal: Build a language model system (≤1B parameters) to generate semiconductor device simulation code from natural language circuit design specifications, and design your own benchmark to evaluate model performance.

About the Simulation Platform: This project uses Silvaco TCAD (Technology Computer-Aided Design), an industry-standard semiconductor simulation platform for device modeling and circuit analysis. You will train models to generate SPICE-compatible simulation code that describes semiconductor device structures and electrical characteristics.
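
To make the target output concrete, here is a hypothetical example of an input specification paired with a generated SPICE deck. The schema and netlist style below are illustrative assumptions, not the actual dataset format:

```python
# Hypothetical input/output pair for the task; field names and the netlist
# style are illustrative and may differ from the course dataset.
example = {
    "specification": (
        "Simulate the DC transfer characteristic of an NMOS transistor "
        "with W=10um and L=1um, sweeping the gate voltage from 0 to 3 V."
    ),
    "generated_code": "\n".join([
        "* NMOS DC transfer sweep",
        "M1 drain gate 0 0 NMOD W=10u L=1u",
        "VDS drain 0 3",
        "VGS gate 0 0",
        ".MODEL NMOD NMOS (LEVEL=1 VTO=0.7 KP=110u)",
        ".DC VGS 0 3 0.05",
        ".END",
    ]),
}
```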

Possible Approaches: You can explore various techniques such as fine-tuning (LoRA, QLoRA), prompt engineering, retrieval-augmented generation (RAG), chain-of-thought prompting, or any combination that works best for your solution. The choice of methodology is completely open.
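
As one concrete starting point, the sketch below shows how a LoRA fine-tune of a sub-1B causal language model might be set up with the Hugging Face transformers and peft libraries. The base model name, adapter rank, and target modules are assumptions to adapt:

```python
# Minimal LoRA setup sketch; model choice and hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "Qwen/Qwen2.5-0.5B"  # example of a <=1B-parameter base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float32)

# Only the small low-rank adapter matrices are trained; the base model is frozen.
lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only a small fraction trains
```

QLoRA follows the same pattern but loads the base model in 4-bit precision (via bitsandbytes) to reduce memory further.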

Benchmark Design (New): You are responsible for designing a comprehensive benchmark to evaluate your model's code generation capabilities. This includes creating test cases, defining evaluation metrics, and demonstrating rigorous assessment of your model's strengths and weaknesses.

Format: Individual or team (2-3 students)

Final Presentation: December 3 (10 minutes per team)

Final Submission: December 12, 11:59 PM

Dataset

Download:

Download Dataset (24 MB)

Contents:

Important: You will design your own benchmark to evaluate your model. Focus on creating diverse, challenging test cases that assess generalization, not memorization.

Dataset Usage Restrictions:

  • This dataset is for CSC 375/575 course use only
  • Prohibited: Sharing, distributing, or publishing this dataset outside of this course
  • Prohibited: Using this dataset for other projects, publications, or commercial purposes
  • Violation of these restrictions may result in academic penalties

Model Constraints

CRITICAL: You must use models with ≤1B parameters.
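
A quick way to verify this constraint before committing to a checkpoint, assuming a Hugging Face model (the model name below is only an example):

```python
# Sketch: count parameters to confirm the <=1B limit.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # example
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.2f}B parameters")
assert n_params <= 1e9, "model exceeds the 1B-parameter limit"
```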

Allowed models:

Benchmark Design Requirements (New)

You must design a comprehensive benchmark to evaluate your model's code generation capabilities. Your benchmark should demonstrate thoughtful consideration of what makes good semiconductor simulation code.

Minimum Requirements

Suggested Evaluation Dimensions

Evaluation Metrics Examples

Key Point: The quality of your benchmark design is as important as your model's performance. A well-designed benchmark demonstrates deep understanding of the problem domain and rigorous evaluation methodology.
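
As one illustration of what automatic metrics might look like (a sketch, not an official requirement; the checks and names are assumptions):

```python
# Two illustrative automatic metrics for generated SPICE decks:
# (1) cheap structural validity, (2) whitespace-insensitive exact match.

def is_structurally_valid(deck: str) -> bool:
    """Check that the deck ends with .END and declares at least one device."""
    lines = [ln.strip() for ln in deck.strip().splitlines() if ln.strip()]
    has_end = any(ln.upper() == ".END" for ln in lines)
    has_device = any(
        ln[0].upper() in "RCLMDQVIX"       # common SPICE device prefixes
        for ln in lines
        if not ln.startswith((".", "*"))   # skip directives and comments
    )
    return has_end and has_device

def exact_match(generated: str, reference: str) -> bool:
    """Strict baseline metric: equality after trimming blank lines and spaces."""
    def norm(s: str) -> str:
        return "\n".join(ln.strip() for ln in s.strip().splitlines() if ln.strip())
    return norm(generated) == norm(reference)
```

Pairing a strict metric like exact match with softer ones (structural validity, keyword coverage, simulation success rate) makes it easier to separate syntax errors from semantic errors in your analysis.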

Grading Rubric (100 Points Total)

Component | Points | Description
Benchmark Design & Evaluation (New) | 30 | Quality and rigor of custom benchmark design, evaluation metrics, and results analysis
Implementation & Methodology | 30 | Training approach and technical implementation
Presentation | 20 | Live demonstration and explanation
Documentation & Code Quality | 20 | Technical report and code organization
Total | 100 |

Graduate students (CSC 575): Higher expectations for methodology sophistication, literature review, and analysis depth.

Deliverables

  1. Trained Model: Model weights and tokenizer (Hugging Face format preferred)
  2. Custom Benchmark (New):
    • Test dataset (at least 20 test cases in JSON format; see the format sketch after this list)
    • Evaluation scripts with implemented metrics
    • Benchmark design document explaining rationale and methodology
  3. Code: Training scripts, data preprocessing, evaluation code
  4. Technical Report (maximum 4 pages):
    • Model selection and justification
    • Training methodology and hyperparameters
    • Benchmark design and evaluation metrics
    • Results and analysis
    • Failure case analysis
  5. Presentation (10 minutes): Live demo, methodology, benchmark results, Q&A
  6. README: Setup instructions and usage guide
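
A hypothetical shape for one benchmark test case is sketched below; the field names are assumptions that you should document in your benchmark design document:

```python
# Hypothetical JSON shape for one test case; adapt fields to your own design.
import json

test_case = {
    "id": "nmos_dc_sweep_001",
    "category": "device_characterization",  # evaluation dimension it probes
    "difficulty": "easy",
    "specification": "Sweep VGS of a 10um/1um NMOS from 0 to 3 V and report ID.",
    "reference_code": "...",                # gold SPICE deck (elided here)
    "checks": ["structurally_valid", "exact_match"],
}
print(json.dumps(test_case, indent=2))
```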

Submission

Submit via the course website.

Deadline: December 12, 11:59 PM

Academic Integrity

Allowed:

Not Allowed:
