LLM Evaluation
Prescriptive Scaling Reveals the Evolution of Language Model Capabilities
Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning
Solving Inequality Proofs with Large Language Models