We Measured Six GPAI Model Cards Against Annex XI. None Passed.
By Provectio | April 2026 | ~6 minute read
We took six publicly available model cards, one from each of six of the largest general-purpose AI (GPAI) providers. We measured coverage against the 14 requirements of EU AI Act Annex XI using Provectio's traceable measurement process. Every number in this post is a documented measurement output. The legal significance of those findings remains a matter for qualified counsel and regulators.
The results
| Provider | Model | Covered | Partial | Gap | Completeness |
|---|---|---|---|---|---|
| Meta | Llama 3.1 405B | 12 | 1 | 1 | 89% |
| Anthropic | Claude Opus 4.5 | 9 | 3 | 2 | 75% |
| Mistral | Magistral | 4 | 6 | 4 | 50% |
| Google DeepMind | Gemini 3 Pro | 1 | 11 | 2 | 46% |
| OpenAI | GPT-5 | 0 | 10 | 4 | 36% |
| xAI | Grok 4 | 1 | 8 | 5 | 36% |
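The Completeness column is consistent with a simple weighted score in which a covered requirement counts fully, a partial one counts half, and a gap counts zero, divided by the 14 requirements. The half weight is inferred from the table, not taken from a published Provectio formula, so the Python sketch below is an assumption that happens to reproduce every row. The names `Coverage`, `completeness_percent`, and `passes` are illustrative.

```python
from dataclasses import dataclass

TOTAL_REQUIREMENTS = 14  # Annex XI requirements counted in this post


@dataclass(frozen=True)
class Coverage:
    covered: int
    partial: int
    gap: int

    def completeness_percent(self, partial_weight: float = 0.5) -> int:
        """Weighted coverage as a whole percentage.

        partial_weight=0.5 is an assumption inferred from the table.
        """
        total = self.covered + self.partial + self.gap
        if total != TOTAL_REQUIREMENTS:
            raise ValueError(f"expected {TOTAL_REQUIREMENTS} findings, got {total}")
        score = (self.covered + partial_weight * self.partial) / total
        return round(100 * score)

    def passes(self) -> bool:
        # A pass means every requirement is fully covered.
        return self.covered == TOTAL_REQUIREMENTS


# Counts copied from the results table.
RESULTS = {
    "Llama 3.1 405B": Coverage(12, 1, 1),
    "Claude Opus 4.5": Coverage(9, 3, 2),
    "Magistral": Coverage(4, 6, 4),
    "Gemini 3 Pro": Coverage(1, 11, 2),
    "GPT-5": Coverage(0, 10, 4),
    "Grok 4": Coverage(1, 8, 5),
}

for model, coverage in RESULTS.items():
    print(f"{model}: {coverage.completeness_percent()}% pass={coverage.passes()}")
```

Under this weighting, two models with different covered and partial counts can land on the same percentage, which is why GPT-5 and Grok 4 both show 36%.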
What the numbers mean
The threshold for a pass is 100% coverage. No provider reached it. Meta's Llama 3.1 scores highest at 89%, and it still carries one critical gap. That gap is not a minor disclosure omission. It is a missing Annex XI requirement. Under the Act, a partial disclosure and an absent one are both non-compliant.
Meta's lead reflects the nature of its documentation. The Llama 3.1 technical report is a research paper. It discloses architecture decisions, training data composition, and compute in FLOPs. These are the categories Annex XI was designed to capture. The report was written for scientific audiences, not regulators, but the information happens to satisfy most of what the law requires.
OpenAI and xAI score lowest, at 36% each. Their system cards are safety communication documents. They describe intended use, known limitations, and evaluation results. That is appropriate and useful content. It is not technical documentation under the Act. Annex XI requires both safety evaluation and technical disclosure. A document that covers one and not the other leaves half the obligation unmet.
The universal gaps
Three Annex XI requirements appear as gaps for all six providers, or nearly all of them.
Energy consumption (AXI-2e) is absent from every provider measured. No model card in this set discloses training energy use, inference energy use, or any proxy metric from which consumption could be estimated. None of the six providers measured appears to be disclosing against this requirement.
Computational resources (AXI-2d) are disclosed only by Meta. The remaining five providers give no figure for the compute used to train their models. In some cases this information exists in published research. It does not appear in the model card documentation measured here.
Acceptable use policies (AXI-1b) follow a consistent pattern across all providers. The documentation references an external policy by URL. It does not include the policy text, a structured summary, or a governed extract. A URL is not documentation. It is a pointer to documentation that may change, may be removed, and cannot be versioned within the model card itself.
These three requirements produce the largest gaps across the dataset. They are also the ones least likely to be closed by minor documentation updates.
What this means for the deadline
The GPAI provisions of the EU AI Act became applicable on 2 August 2025. That date has passed. Every provider measured in this study is already inside the enforcement window.
The gaps identified in this analysis are not future compliance risks. They are current exposures. Article 101(1) provides for administrative fines of up to 3% of total worldwide annual turnover in the preceding financial year or EUR 15,000,000, whichever is higher. The six providers in this dataset are among the largest technology companies operating in the EU market. The 3% figure exceeds the fixed amount once turnover passes EUR 500 million, so for providers of that size the turnover figure, not the fixed ceiling, is the relevant number.
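The "whichever is higher" rule reduces to taking a maximum, and the point at which the turnover figure overtakes the fixed amount is simple arithmetic. The Python sketch below works in whole euros. The function name and the two turnover values are hypothetical illustrations, not figures for any provider in this study.

```python
FIXED_CEILING_EUR = 15_000_000  # fixed amount cited from Article 101(1)
TURNOVER_PERCENT = 3            # percentage of worldwide annual turnover


def fine_ceiling_eur(turnover_eur: int) -> int:
    """Upper bound on the fine: the higher of the fixed amount and 3% of turnover."""
    return max(FIXED_CEILING_EUR, turnover_eur * TURNOVER_PERCENT // 100)


# Turnover at which 3% equals the fixed amount.
BREAK_EVEN_TURNOVER_EUR = FIXED_CEILING_EUR * 100 // TURNOVER_PERCENT

# Hypothetical turnovers, chosen only to show both branches of the maximum.
print(BREAK_EVEN_TURNOVER_EUR)           # 500000000
print(fine_ceiling_eur(100_000_000))     # 15000000 (fixed amount applies)
print(fine_ceiling_eur(10_000_000_000))  # 300000000 (3% of turnover applies)
```

This is a ceiling, not a prediction. The amount of any actual fine is a matter for the regulator.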
No enforcement action related to Annex XI GPAI documentation has been publicly reported as of the date of this post. That does not change the legal position of providers whose documentation falls below the requirements. The enforcement window is open. The documentation gaps are measurable. Both facts are in the record.
Methodology note
All measurements in this post were conducted using Provectio's traceable measurement process. Each finding carries an integrity hash and can be independently verified. The process maps documentation against each Annex XI requirement and scores completeness based on present, partial, and absent findings.
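This post does not specify how the integrity hash is computed. The Python sketch below assumes one common approach: serialize the finding to canonical JSON and take its SHA-256 digest. The field names and the functions `finding_hash` and `verify_finding` are hypothetical and do not describe Provectio's actual scheme.

```python
import hashlib
import json


def finding_hash(finding: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(
        finding, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_finding(finding: dict, expected_hash: str) -> bool:
    """Recompute the hash and compare it with the published value."""
    return finding_hash(finding) == expected_hash


# Hypothetical finding record; the field names are assumptions.
finding = {
    "model": "Llama 3.1 405B",
    "requirement": "AXI-2e",
    "status": "gap",
}
published = finding_hash(finding)

print(verify_finding(finding, published))                            # True
print(verify_finding({**finding, "status": "covered"}, published))   # False
```

Sorting the keys makes the hash independent of field order, so anyone holding the same finding can recompute the digest and compare it with the published one.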
The full case studies, including per-requirement coverage matrices for all six providers, are available on the case studies page.
Provectio measures. It does not advise. These findings are structured input for legal, audit, and compliance review.