AI Proof Grading

Sep 10, 2025


[Figure: initial results]

Analyzed more than 780 graded mathematical proofs to compare the consistency and accuracy of LLM and human graders. Identified errors in the grading feedback and applied statistical tests to evaluate grading reliability.
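As a rough illustration of what a grader-consistency check can look like, the sketch below computes Cohen's kappa, a chance-corrected agreement statistic, between two sets of scores. This is an assumption for illustration only: the project summary does not say which statistical tests were used, and the function name `cohens_kappa`, the 0-2 score scale, and the sample scores are all hypothetical rather than taken from the dataset.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must grade the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: fraction of items where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores on a 0-2 scale for eight proofs (not data from the project).
human_scores = [2, 1, 0, 2, 1, 1, 0, 2]
llm_scores = [2, 1, 1, 2, 1, 0, 0, 2]

kappa = cohens_kappa(human_scores, llm_scores)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance. For ordinal proof scores, a weighted kappa or a rank correlation may be a better fit than this unweighted version.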

Last updated on Oct 13, 2025

Large Language Models Research Python

Authors

Gabi Friedman

Data Analytics • Data Science • Business Intelligence
