HeadlinesBriefing.com

AI Model Writing Styles Mapped Across 178 Systems

Hacker News

A post on Hacker News released a dataset of 3,095 AI-generated answers to 43 prompts, each annotated with a 32-dimensional stylometric fingerprint covering lexical richness, sentence structure, punctuation, formatting, and discourse markers. The author built a Node.js pipeline that z-scores each feature and computes cosine similarity between responses. The goal is to map writing-style similarity among modern language models, which could help auditors trace model provenance.
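The two steps named above, z-scoring each feature and then comparing responses by cosine similarity, can be sketched as follows. This is a minimal illustration in Python rather than the author's Node.js code; the function names, the three-feature toy vectors, and the choice of population standard deviation are all assumptions made for the example.

```python
import math

def zscore_columns(rows):
    """Standardize each feature column to mean 0 and unit population std."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # A constant column has zero variance; fall back to 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up fingerprints with 3 features each; the dataset's real vectors have 32.
fingerprints = [
    [0.52, 18.0, 3.0],   # hypothetical response A
    [0.50, 19.0, 3.5],   # hypothetical response B, stylistically close to A
    [0.71, 9.0, 12.0],   # hypothetical response C, stylistically different
]

normalized = zscore_columns(fingerprints)
sim_ab = cosine_similarity(normalized[0], normalized[1])
sim_ac = cosine_similarity(normalized[0], normalized[2])
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

Z-scoring first matters because raw features live on very different scales (a ratio near 0.5 next to a sentence length near 18); without it, the largest-valued feature would dominate the cosine.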

Analysis uncovered nine clone clusters in which models share more than 90% cosine similarity on normalized vectors. Mistral Large 2 and Mistral Large 3 form the tightest pair, while Gemini 2.5 Flash Lite resembles Claude 3 Opus with a 78% similarity score yet costs 185× less. Among providers, Meta shows the strongest "house style," with a distinctiveness ratio of 37.5×. The composite clone score aggregates five independent signals, including prompt-controlled head-to-head similarity and cross-prompt consistency.
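The summary does not say how the clone clusters were formed from pairwise scores. One plausible reading, offered here only as an assumption, is to link every pair of models whose similarity exceeds the 90% threshold and report the connected groups. The sketch below does that with a small union-find; the model names and similarity values are placeholders, not figures from the dataset.

```python
def clone_clusters(similarities, threshold=0.90):
    """Group models into connected components over pairs above the threshold.

    `similarities` maps (model_a, model_b) tuples to a cosine similarity.
    Returns a sorted list of clusters, each a sorted list of model names.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for (a, b), sim in similarities.items():
        root_a, root_b = find(a), find(b)  # registers both models
        if sim > threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for model in list(parent):
        groups.setdefault(find(model), set()).add(model)
    # A "cluster" needs at least two members.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical pairwise scores for illustration only.
pair_scores = {
    ("model_a", "model_b"): 0.95,
    ("model_b", "model_c"): 0.92,
    ("model_a", "model_c"): 0.85,
    ("model_d", "model_e"): 0.78,
    ("model_f", "model_g"): 0.97,
}

clusters = clone_clusters(pair_scores)
print(clusters)
```

Note the design consequence: linking is transitive, so `model_a` and `model_c` land in the same cluster through `model_b` even though their direct score is below the threshold. A stricter definition would require every pair in a cluster to clear it.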

Results show that prompt choice drives convergence: a satirical fake-news prompt pushes models into near-identical prose, whereas a simple letter-count task maximizes divergence. The author also shared a 1,400-line analysis script so that others can reproduce the stylometric extraction and correlation metrics. The dataset offers a practical benchmark for detecting model-specific fingerprints and for evaluating mitigation strategies against plagiarism or model copying.
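One way to quantify how strongly a prompt drives convergence is the mean pairwise cosine similarity of all responses to that prompt: higher means the models wrote more alike. The sketch below is an assumption about how such a ranking could be computed, not the metric used in the shared script; the prompt labels and vectors are invented and are taken to be already z-scored.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def mean_pairwise_similarity(vectors):
    """Average cosine similarity over every unordered pair of response vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two responses to compare")
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical normalized fingerprints, three responses per prompt.
responses_by_prompt = {
    "prompt_convergent": [[1.0, 0.9, -0.5], [0.9, 1.0, -0.4], [1.1, 0.8, -0.6]],
    "prompt_divergent": [[1.0, -0.2, 0.3], [-0.8, 0.9, 0.1], [0.1, -0.7, -1.0]],
}

convergence = {
    prompt: mean_pairwise_similarity(vectors)
    for prompt, vectors in responses_by_prompt.items()
}
# Most convergent prompt first.
ranking = sorted(convergence, key=convergence.get, reverse=True)
print(ranking)
```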