Strategy Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)
solutions for each task, used to train and calibrate an LLM-based grading system that evaluates AI outputs at scale... English (B2+)...
application codebase. Write tests that accept all correct solutions and reject incorrect ones - neither too strict (breaking..., don't miss bad solutions, and don't break on good ones. Review code written by agents, analyze why an agent failed...