A stylometric analysis of 3,095 AI responses found 9 near-identical model clusters, with Gemini 2.5 Flash Lite matching the writing style of Claude 3 Opus at 78% similarity for 185x lower cost.
An independent researcher published a dataset of 3,095 standardized AI responses across 43 prompts from 178 models, extracting 32-dimension stylometric fingerprints per response. Analysis revealed 9 clone clusters with >90% cosine similarity, Mistral Large 2 and 3 sharing 84.8% stylistic overlap, and Gemini 2.5 Flash Lite mimicking Claude 3 Opus at a fraction of the cost. Meta was identified as having the strongest distinct house style at a 37.5x distinctiveness ratio. The full analysis script runs ~1,400 lines in Node.js.
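To make the ">90% cosine similarity" criterion concrete, here is a minimal Python sketch of how two fingerprint vectors could be compared and how pairs above the threshold could be flagged. It is not the researcher's Node.js script: the function names, the 4-dimension example vectors, and the model labels are all hypothetical, and the real 32 features (and any normalization applied to them before comparison) are not described in this summary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero fingerprint has no direction to compare
    return dot / (norm_a * norm_b)


def clone_pairs(fingerprints, threshold=0.90):
    """Return (model, model, score) for every pair scoring above the threshold."""
    names = sorted(fingerprints)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = cosine_similarity(fingerprints[first], fingerprints[second])
            if score > threshold:
                pairs.append((first, second, score))
    return pairs


# Hypothetical 4-dimension fingerprints; the dataset uses 32 dimensions.
example = {
    "model_a": [1.0, 2.0, 3.0, 4.0],
    "model_b": [2.0, 4.0, 6.0, 8.0],   # same direction as model_a
    "model_c": [4.0, -3.0, 2.0, -1.0],  # orthogonal to model_a
}

for first, second, score in clone_pairs(example):
    print(f"{first} ~ {second}: {score:.3f}")
```

Cosine similarity ignores vector magnitude, so features on very different scales (word counts versus ratios, for example) would likely need to be standardized first. Whether and how the original analysis does this is an open assumption here.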
This research gives you empirical grounds to consider replacing expensive flagship models with cheaper stylistic near-matches, which turns the swap into an evidence-based architectural decision rather than a guess. The similarity measured here is stylistic, so you still need to confirm that the cheaper model holds up on your own tasks. The 32-dimension stylometric fingerprint methodology is reproducible: you can run the same analysis on your own domain-specific prompts to validate the substitution for your use case. The 'satirical fake news' convergence finding also signals which prompt types flatten model differentiation, which is useful if you're building evals.
Pull your top 5 highest-volume prompts, run them through both Claude 3 Opus and Gemini 2.5 Flash Lite this week, and compare output length, formatting, and sentence structure. If the outputs match on your domain, switching could cut your inference cost by up to the 185x gap the analysis reports.
Install both SDKs: pip install anthropic google-generativeai
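Once you have a pair of responses to the same prompt, the comparison of length, formatting, and sentence structure can be approximated with a few surface features. The sketch below is an assumption-laden stand-in, not the 32-dimension fingerprint from the dataset: the feature set, the function names `style_features` and `relative_gaps`, and the sentence-splitting heuristic are all hypothetical. Fetching the responses through the two SDKs is deliberately left out.

```python
import re

# Lines that start with a bullet marker or "1." style numbering.
BULLET_PATTERN = re.compile(r"^\s*(?:[-*•]|\d+\.)\s+")


def style_features(text):
    """Extract a handful of surface-level style features from one response."""
    words = text.split()
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lines = text.splitlines()
    sentence_count = len(sentences)
    return {
        "word_count": len(words),
        "sentence_count": sentence_count,
        "mean_sentence_words": len(words) / sentence_count if sentence_count else 0.0,
        "bullet_lines": sum(1 for line in lines if BULLET_PATTERN.match(line)),
        "heading_lines": sum(1 for line in lines if line.lstrip().startswith("#")),
    }


def relative_gaps(features_a, features_b):
    """Per-feature relative difference in [0, 1]; 0 means identical values."""
    gaps = {}
    for name, value_a in features_a.items():
        value_b = features_b[name]
        largest = max(abs(value_a), abs(value_b))
        gaps[name] = abs(value_a - value_b) / largest if largest else 0.0
    return gaps


# Placeholder strings standing in for real model outputs to the same prompt.
response_a = "Caching reduces latency. Start with a small TTL and measure."
response_b = "Caching cuts latency a lot! Begin with a short TTL. Then measure."

gaps = relative_gaps(style_features(response_a), style_features(response_b))
for name, gap in gaps.items():
    print(f"{name}: {gap:.2f}")
```

Averaging these gaps across your five prompts gives a rough signal of stylistic drift. Small gaps say the outputs look alike, not that the cheaper model's answers are equally correct, so pair this with a task-level check.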