A 2026 text-to-SQL benchmark methodology: execution accuracy, schema linking, latency, and cost across model tiers, plus where generated SQL goes wrong.
A 2026 benchmark methodology for small language models on edge GPUs: latency, tokens/sec, memory, and cost for Phi, Gemma, and Qwen on Jetson-class hardware.