Application of Linear Regression for Academic Outcome Forecasting: SMC Grade Predictor
Objective and Goals
The purpose of this project is to develop a grade prediction model that estimates a Santa Monica College student’s expected grade in a given math course taught by a specific professor.
The goal is to help students make informed class-selection decisions by using both historical grade data and student academic history, while also assisting instructors and counselors in identifying performance trends.
Requirements
Data Sources
- SMC official grade distribution reports (past 3 years)
- Student-submitted course histories and grades
- Letter-grade-only dataset (A–F)
Technologies & Frameworks
- Python
- Scikit-learn (Linear Regression)
- Flask (backend)
- HTML/CSS/JavaScript (frontend)
- Matplotlib (visualizations)
- GitHub + Render for deployment
Planning
Milestones and Timeline
- Collect and label historical grade data (701 classes, 22,425 grades)
- Convert raw grade distributions into numerical values
- Engineer two core predictors: Instructor Difficulty Score (IDS) and Student Ability Score (SAS)
- Train linear regression model using historical + synthetic data
- Build web application UI and integrate prediction API
- Deploy on Render with live updates via GitHub
- Evaluate model reliability using correlation significance tests
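The conversion milestone above (raw grade distributions into numerical values) could be sketched as follows. This is a minimal illustration, not the project's actual code: the 4.0-scale point values, the function name `mean_grade_points`, and the sample counts are all assumptions, since the plan only states that the dataset contains letter grades A–F.

```python
# Assumed 4.0-scale mapping; the project may use different point values.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def mean_grade_points(distribution):
    """Return the mean grade points for one class section.

    `distribution` maps a letter grade to the number of students
    who received it, e.g. {"A": 10, "B": 8}. Missing letters count as 0.
    """
    total_students = sum(distribution.get(g, 0) for g in GRADE_POINTS)
    if total_students == 0:
        raise ValueError("distribution contains no A-F grades")
    total_points = sum(
        GRADE_POINTS[g] * distribution.get(g, 0) for g in GRADE_POINTS
    )
    return total_points / total_students


# Made-up counts for a single section (25 students, 75 grade points).
section = {"A": 10, "B": 8, "C": 5, "D": 1, "F": 1}
print(mean_grade_points(section))  # 3.0
```

Reducing each section to a single mean keeps the later regression inputs simple, at the cost of discarding the shape of the distribution.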
Roles and Responsibilities
Data Team
- Gather grade distributions and student-provided data
- Convert, clean, and format datasets for regression
- Compute IDS and SAS metrics
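The plan does not define how IDS and SAS are calculated, so the sketch below shows only one plausible formulation. It assumes IDS is the course-wide mean grade points minus the instructor's own mean (positive means the instructor grades below the course average) and that SAS is the plain average of the student's earlier grade points. Both function names and all sample values are hypothetical.

```python
from statistics import mean


def instructor_difficulty_score(instructor_section_means, course_section_means):
    """Assumed IDS: course-wide mean minus this instructor's mean.

    Both arguments are lists of per-section mean grade points.
    A positive result means the instructor grades below the course average.
    """
    return mean(course_section_means) - mean(instructor_section_means)


def student_ability_score(previous_grade_points):
    """Assumed SAS: plain average of the student's earlier grade points."""
    return mean(previous_grade_points)


# Made-up inputs for illustration only.
course_means = [3.0, 2.5, 2.0, 2.5]   # every section of the course
instructor_means = [2.0, 2.5]         # sections taught by one instructor
print(instructor_difficulty_score(instructor_means, course_means))
print(round(student_ability_score([4.0, 3.0, 3.0]), 2))
```

Other definitions (for example, weighting recent courses more heavily in SAS) would fit the same interface.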
Model Development Team
- Implement linear regression pipeline
- Train and validate model coefficients
- Perform correlation significance analysis
- Optimize model for reliability and consistency
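The regression and significance steps above might look roughly like the sketch below. It fits scikit-learn's `LinearRegression` on two predictors (IDS and SAS) and then checks one correlation with a t-test on Pearson's r. The training rows are made up, and the choice of this particular test is an assumption: the plan names "correlation significance tests" without saying which one is used.

```python
import math

import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training rows: [IDS, SAS] -> grade points earned.
X = np.array([
    [0.4, 2.0], [-0.2, 3.5], [0.1, 3.0], [0.3, 3.8],
    [-0.3, 2.5], [0.0, 1.5], [0.2, 2.8], [-0.1, 3.2],
])
y = np.array([1.7, 3.6, 2.9, 3.4, 2.8, 1.6, 2.6, 3.3])

# Ordinary least squares fit: grade ~ intercept + b1 * IDS + b2 * SAS.
model = LinearRegression().fit(X, y)
print("intercept:", model.intercept_)
print("IDS, SAS coefficients:", model.coef_)

# Predict for a hypothetical student (SAS 3.0) and instructor (IDS 0.1).
predicted = model.predict(np.array([[0.1, 3.0]]))[0]
print("predicted grade points:", predicted)

# Significance of the SAS-grade correlation via a t-test on Pearson's r:
# t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom.
n = len(y)
r = np.corrcoef(X[:, 1], y)[0, 1]
t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Two-sided critical value at alpha = 0.05 with 6 degrees of freedom.
T_CRITICAL = 2.447
print("r:", r, "t:", t_stat, "significant:", abs(t_stat) > T_CRITICAL)
```

With a real dataset, the critical value would change with the sample size, and a held-out split would be needed to validate the coefficients rather than only fitting them.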
Frontend Team
- Build user interface using HTML/CSS/JS
- Enable structured input for multiple previous courses
- Display predictions and probability-like outputs
- Implement dynamic features (add/remove courses)
Backend Team
- Integrate Flask with prediction model
- Process student inputs and return grade forecasts
- Render Matplotlib charts (grade distributions, professor difficulty)
- Maintain deployment pipeline through GitHub → Render
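As an illustration of the Flask integration described above, a prediction endpoint could be sketched like this. The `/predict` route, the `ids` and `sas` JSON fields, the response key, and the placeholder coefficients are all hypothetical; a real deployment would load the trained model instead of hard-coding values.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder coefficients (assumed); the real app would load the trained model.
INTERCEPT, IDS_COEF, SAS_COEF = 0.3, -1.0, 0.9


@app.route("/predict", methods=["POST"])
def predict():
    """Return a grade forecast for a JSON body like {"ids": 0.1, "sas": 3.0}."""
    payload = request.get_json(silent=True) or {}
    try:
        ids = float(payload["ids"])
        sas = float(payload["sas"])
    except (KeyError, TypeError, ValueError):
        return jsonify(error="expected numeric 'ids' and 'sas' fields"), 400

    raw = INTERCEPT + IDS_COEF * ids + SAS_COEF * sas
    predicted = max(0.0, min(4.0, raw))  # clamp to the 0.0-4.0 grade-point range
    return jsonify(predicted_grade_points=round(predicted, 2))
```

The endpoint can be exercised without starting a server by using Flask's built-in test client, for example `app.test_client().post("/predict", json={"ids": 0.1, "sas": 3.0})`.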
Execution
Development Process
- Each member contributes weekly updates via GitHub branches
- Python is used for data modeling; Flask connects the model to the web app
- The frontend collects the student's course and professor history and the class to predict
- The backend returns a predicted GPA and the probable letter grade
- Matplotlib graphs display supporting visuals (e.g., baseline grade distributions)
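The step from a predicted GPA to a probable letter outcome could be sketched as a nearest-letter lookup. The cutoffs below are assumptions: they sit halfway between adjacent letters on a 4.0 scale, and the project may choose different boundaries. The function name `probable_letter` is hypothetical.

```python
def probable_letter(predicted_grade_points):
    """Map a predicted grade-point value to the nearest letter grade.

    Uses assumed cutoffs halfway between adjacent letters on a 4.0 scale.
    Values outside 0.0-4.0 are clamped before the lookup.
    """
    clamped = max(0.0, min(4.0, predicted_grade_points))
    for cutoff, letter in ((3.5, "A"), (2.5, "B"), (1.5, "C"), (0.5, "D")):
        if clamped >= cutoff:
            return letter
    return "F"


print(probable_letter(2.9))  # B
print(probable_letter(3.7))  # A
```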
Weekly Meetings
- Online meetings (30–60 minutes)
- Discuss dataset updates, model accuracy, and UI improvements
- Coordinate tasks using GitHub commits and version control