Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
Abstract: Basic Programming Practice (BPP), an introductory course for computer science majors, aims to give students basic programming skills and lay a foundation for subsequent advanced ...