Use open-source AI models safely

Open-source artificial intelligence (AI) models can be useful, but they should not be used without basic security and governance checks.

Before downloading, testing, or deploying an open-source AI model, you should validate where it came from, how it behaves, what risks it introduces, and whether it is suitable for the intended use case. 

This checklist helps you assess open-source AI models before they are used with University data, connected to internal systems, or embedded into process workflows.

It is designed to support safe experimentation with clear ownership, traceability and proportionate security controls.

At a glance

  • Verify the publisher is a real, credible entity with a traceable history.
  • Confirm file integrity by comparing checksums against the original source.
  • Use the safetensors format wherever available. Avoid pickle-based formats without sandboxing.
  • Scan model files before loading. Treat model weights as untrusted code.
  • Run the first test in an isolated sandbox with no access to production systems, credentials or sensitive data.

 
Check

Before you download

Check whether the model was published by a credible organisation, research group, company or individual.

How to check:

  • Look for verified organisation status on the model hub.
  • Treat an unverified account with no clear history as a warning sign.
  • Cross-check the linked website or domain against the hosting account.
  • Review the account’s previous models and commit history.
  • Look for consistent, legitimate activity over time.
  • Search for the maintainer or organisation name outside the model hub.

 

Check whether the model’s origin and development are clear.

How to check:

  • Read the model card in full.
  • Treat missing or vague model cards as a warning sign.
  • Check which base model it derives from.
  • Verify the provenance of the base model.
  • Confirm whether the training dataset is disclosed.
  • Consider whether undisclosed or unvetted training data creates bias, legal or ethical risk.
  • Follow linked papers, blogs or documentation to validate the stated method.

Pin and record the model version, because a model’s behaviour can change between updates.

How to check:

  • Record the commit hash, revision ID or release tag.
  • Log the approved version in your risk assessment, deployment configuration and model inventory.
  • Do not use a floating reference such as "latest" in production.
  • Review new model versions before adoption, as updates can change behaviour.
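
The version checks above can be partly automated. The following is a minimal sketch, assuming the model hub stores models in Git repositories so that a revision can be a full commit hash. The function name is_pinned_revision and the list of floating names are illustrative assumptions, not part of any hub's API. The sketch is deliberately stricter than the checklist: it accepts only full commit hashes, because release tags can be moved after publication.

```python
import re

# Illustrative list of floating references; extend it for your own hub or registry.
FLOATING_REFS = {"latest", "main", "master", "head"}

# A full Git commit hash (SHA-1) is 40 hexadecimal characters.
_COMMIT_HASH = re.compile(r"[0-9a-f]{40}")


def is_pinned_revision(revision: str) -> bool:
    """Return True only if the revision looks like a full commit hash."""
    ref = revision.strip().lower()
    if ref in FLOATING_REFS:
        return False
    return _COMMIT_HASH.fullmatch(ref) is not None
```

A deployment script could call this on the configured revision and refuse to start if it returns False.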

Use file formats that reduce execution risk.

How to check:

  • Prefer safetensors where available.
  • Treat .pkl, .pt, .pth and .bin files as higher risk, because pickle-based files can run arbitrary code when they are loaded.
  • If you must use a pickle-based format, use sandboxing and scanning before loading it.
  • Document the reason for any exception.
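
As a first-pass triage of a downloaded repository, the file extensions listed above can be sorted into risk groups. This is a minimal sketch: the function name and the three labels are assumptions chosen for illustration. An extension is only a hint about the real format, so this does not replace scanning.

```python
from pathlib import Path

PREFERRED_SUFFIXES = {".safetensors"}
HIGHER_RISK_SUFFIXES = {".pkl", ".pt", ".pth", ".bin"}


def classify_model_file(filename: str) -> str:
    """Label a file by extension: 'preferred', 'higher-risk' or 'review'."""
    suffix = Path(filename).suffix.lower()
    if suffix in PREFERRED_SUFFIXES:
        return "preferred"
    if suffix in HIGHER_RISK_SUFFIXES:
        return "higher-risk"
    # Anything else (configs, tokenizers, scripts) still needs a manual look.
    return "review"
```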

Check whether the licence permits your intended use.

How to check:

  • Confirm whether commercial use, hosted inference and fine-tuning are permitted.
  • Check attribution requirements.
  • Check prohibited use clauses, as some licences restrict specific industries, purposes or applications.
  • Confirm that the licence does not conflict with University obligations, customer agreements, research terms or product commitments.
  • Use the SPDX License List or Open Source Initiative material to clarify unclear licence terms.

Before you test

Check that the downloaded files match the original source.

How to check:

  • Compare hashes or checksums against the values published in the canonical repository.
  • Download from the original source only.
  • Do not use mirrored or re-uploaded copies unless they have been separately validated.
  • Treat a mismatch as a sign of corruption or tampering.
  • If there is a mismatch, discard the file and start again.
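
The comparison above can be scripted. This is a minimal sketch, assuming the canonical repository publishes SHA-256 checksums as hexadecimal strings. The function names are illustrative, and the file is read in chunks so that large weight files do not need to fit in memory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare the local digest with the value copied from the original source."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If matches_published_checksum returns False, treat the file as corrupted or tampered with and discard it, as described above.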

Scan all model files before loading them.

How to check:

  • Run Protect AI ModelScan against all model artefacts before loading.
  • For pickle-based files, also run PickleScan.
  • On Hugging Face, check the model’s malware scan badge.
  • Check any Protect AI Guardian results shown in the repository.
  • Treat any detected payload as untrusted.
  • Do not load a file with a detected payload, even if the publisher appears credible.
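
To illustrate why pickle-based files need scanning, the sketch below lists the opcodes in a pickle stream that import or call objects, using the standard library's pickletools module without unpickling anything. It is an illustration of the idea only and does not replace ModelScan or PickleScan. The function name and the opcode list are assumptions. Legitimate checkpoints also contain these opcodes, so a finding means "review with a dedicated scanner", not "malicious". The sketch expects raw pickle bytes; checkpoint files that are archives wrapping a pickle would need to be unpacked first.

```python
import pickletools

# Opcodes that import objects or call them while a pickle is being loaded.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(data: bytes) -> set[str]:
    """Return the import/call opcodes present in a pickle stream.

    The stream is only disassembled, never unpickled, so no code in it runs.
    """
    found = set()
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            found.add(opcode.name)
    return found
```

A pickle of plain data such as a dictionary of numbers yields an empty set, while a pickle that calls a function on load contains an import opcode followed by REDUCE.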

Review available security warnings and repository activity.

How to check:

  • Check model hub security badges.
  • Review warning banners.
  • Check commit history for suspicious changes.
  • Pay particular attention to recent changes affecting model weights.
  • Look for unresolved security discussions in the Issues tab.
  • Apply extra scrutiny where a repository shows sudden unexplained activity after a long quiet period.

Assess the publisher’s reputation, but do not rely on popularity alone.

How to check:

  • Check download count and account age.
  • Check whether recognised organisations reference or depend on the model.
  • Review issue history and community feedback.
  • Look for active and constructive maintainer engagement.
  • Search for citations in papers, dependent projects or known enterprise use.
  • Do not treat high download numbers as evidence that a model is safe.

Check whether the model, model family or provider has been linked to known incidents.

How to check:

  • Search the AI Incident Database and MIT AI Incident Tracker for the model name, model family and provider.
  • Check GitHub issues and security blogs for disclosed vulnerabilities or misuse reports.
  • Search the National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) records for the provider name and related infrastructure.
  • Consider incidents involving the wider model family, not only the exact version.
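
The NVD search can also be scripted. This is a minimal sketch, assuming the NVD CVE API 2.0 endpoint and its keywordSearch parameter. The function name and the example provider name are hypothetical. The code only builds the query URL; fetching the results and reviewing them against the model family remains a manual step.

```python
from urllib.parse import quote, urlencode

# Assumed endpoint: the NVD CVE API, version 2.0.
NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def nvd_search_url(keyword: str) -> str:
    """Build a keyword search URL for CVE records, e.g. for a provider name."""
    query = urlencode({"keywordSearch": keyword}, quote_via=quote)
    return f"{NVD_CVE_API}?{query}"


# Hypothetical provider name, for illustration only.
print(nvd_search_url("example provider"))
```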

Before you deploy

Run structured testing before approving the model.

How to check:

  • Use a defined evaluation set.
  • Do not rely on ad hoc testing.
  • Test for prompt injection, jailbreak attempts, unsafe output, hallucination, bias and sensitive data leakage.
  • Include misuse cases that reflect the intended use.
  • Save all test prompts and results.
  • Use these records as evidence for the risk assessment.
  • Use the OWASP Top 10 for Large Language Model Applications and MITRE ATLAS to build test cases systematically.
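
A defined evaluation set with saved results could be run along these lines. This is a minimal sketch under stated assumptions: the model is any callable that takes a prompt and returns text, each test case is a dictionary with the hypothetical keys id, category, prompt and must_not_contain, and a case passes when none of its forbidden strings appear in the output. Simple string matching is crude; real evaluations usually need human or model-assisted review as well.

```python
import json
from pathlib import Path
from typing import Callable


def run_evaluation(
    model: Callable[[str], str], cases: list[dict], log_path: Path
) -> list[dict]:
    """Run every case through the model and save prompts and results as JSON Lines."""
    results = []
    with log_path.open("w", encoding="utf-8") as log:
        for case in cases:
            output = model(case["prompt"])
            # A case passes when no forbidden string appears (case-insensitive).
            passed = not any(
                s.lower() in output.lower() for s in case["must_not_contain"]
            )
            record = {
                "id": case["id"],
                "category": case["category"],
                "prompt": case["prompt"],
                "output": output,
                "passed": passed,
            }
            log.write(json.dumps(record) + "\n")
            results.append(record)
    return results
```

The saved file can then be attached to the risk assessment as evidence.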

Treat the first execution environment as untrusted.

How to check:

  • Run the model in a sandbox.
  • Do not provide production credentials; use sandbox credentials.
  • Do not use sensitive datasets; use test datasets instead.
  • Do not allow write access to shared systems.
  • Do not allow outbound network calls to internal infrastructure.
  • Use disposable infrastructure where possible.
  • Destroy the test environment after initial testing where appropriate.
  • Treat the execution environment as hostile until the model’s behaviour has been validated.
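
One way to approximate these restrictions is a locked-down container. This is a minimal sketch, assuming Docker is the sandbox technology. The image name, mount paths and resource limits are hypothetical. The function only builds the command so that it can be reviewed before it is run. A container is weaker isolation than a separate virtual machine, so choose the boundary that matches your risk.

```python
def sandbox_command(image: str, model_dir: str) -> list[str]:
    """Build a `docker run` command for a disposable, network-less test container."""
    return [
        "docker", "run",
        "--rm",                                 # destroy the container after the test
        "--network", "none",                    # no outbound or internal network calls
        "--read-only",                          # root filesystem is not writable
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--memory", "8g",                       # hypothetical resource limits
        "--pids-limit", "256",
        "--tmpfs", "/tmp",                      # scratch space that vanishes on exit
        "--volume", f"{model_dir}:/models:ro",  # model files mounted read-only
        image,
    ]
```

The returned list can be passed to subprocess.run once the flags have been checked against local policy.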

Give the model only the access it needs.

How to check:

  • Restrict filesystem access to the minimum necessary.
  • Restrict outbound network access.
  • Restrict application programming interface (API) permissions.
  • Restrict access to secrets.
  • Treat model execution as untrusted code.
  • Apply the same controls you would apply to an unknown binary.
  • Limit tool-calling capabilities, especially tools that can write to external systems or make network requests.
  • Log model interactions.
  • Send logs to a location that the model cannot access or modify.
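
Limiting tool calls and logging each request could look like the following. This is a minimal sketch: the tool names in the allow-list are hypothetical, and the logger is assumed to be configured elsewhere to forward records to a destination the model cannot access or modify.

```python
import logging

logger = logging.getLogger("model_tool_calls")

# Hypothetical allow-list containing read-only tools only.
ALLOWED_TOOLS = {"search_docs", "read_calendar"}


def authorise_tool_call(tool_name: str, arguments: dict) -> bool:
    """Allow a tool call only if the tool is on the allow-list, and log the decision."""
    allowed = tool_name in ALLOWED_TOOLS
    logger.info("tool=%s allowed=%s args=%r", tool_name, allowed, arguments)
    return allowed
```

The calling application would run the requested tool only when this function returns True. Tools that write to external systems or make network requests stay off the list unless there is a documented need.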

Check the wider software environment, not just the model file.

How to check:

  • Scan Docker images.
  • Scan Python dependencies.
  • Scan inference scripts and example code.
  • Use software composition analysis (SCA) or container scanning tools.
  • Use tools such as Trivy, Grype or Docker Scout for container and software bill of materials (SBOM) vulnerability scanning.
  • Do not assume example notebooks are safe.
  • Pin dependency versions.
  • Monitor for new Common Vulnerabilities and Exposures (CVEs) after deployment.
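
Checking that dependency versions are pinned can be automated as a simple gate. This is a minimal sketch, assuming dependencies are listed in a pip-style requirements file. The function name is illustrative and the pattern is a heuristic: it treats only exact "==" pins as pinned and does not cover every requirement syntax, such as direct URLs.

```python
import re

# Package name, optional extras, an exact "==" pin, then end of line or a marker.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;|$)")


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        # Drop comments and any trailing line-continuation backslash.
        line = raw.split("#", 1)[0].rstrip().rstrip("\\").strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments and pip options such as -r or --hash
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

An empty result means every listed dependency has an exact version. It does not mean those versions are free of known vulnerabilities, so scanning is still required.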

Ongoing governance

Record ownership and review arrangements before the model goes live.

How to check:

  • Record the model owner.
  • Record the approved use case.
  • Record the approved version.
  • Record the risk rating and the next review date.
  • Keep this information in your model inventory.
  • Define a rollback plan before deployment.
  • Set out who receives alerts.
  • Define what thresholds trigger review.
  • Define how new model revisions are assessed before adoption.
  • Define how security notices from the model provider or upstream base model will be handled.
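
The inventory fields listed above could be captured in a small record type. This is a minimal sketch: the class name, field names and example values are assumptions, and a real inventory would normally live in a governed system rather than in code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelRecord:
    """One entry in a model inventory."""
    name: str
    owner: str
    approved_use_case: str
    approved_version: str  # for example, a commit hash
    risk_rating: str
    next_review: date

    def review_due(self, today: date) -> bool:
        """Return True when the review date has been reached or passed."""
        return today >= self.next_review


# Hypothetical entry, for illustration only.
record = ModelRecord(
    name="example-summariser",
    owner="A. Owner",
    approved_use_case="Summarising public web pages",
    approved_version="0" * 40,
    risk_rating="medium",
    next_review=date(2026, 1, 31),
)
```

Calling record.review_due(date.today()) in a scheduled job is one way to trigger the review alert.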

Before you approve

Answer these 3 questions before approving any model.

A “no” answer to any question should pause the process.

Q1. Can you trace where this model came from and what it was trained on?

Provenance gaps are operational risks as well as compliance risks.

If you cannot answer this question, you cannot assess what the model might do.

Fail → Do not proceed

Q2. Has the model behaved acceptably in tests that reflect real use, including adversarial testing?

Informal testing is not enough.

If you have not run a structured evaluation and saved the results, the model has not been tested adequately.

Fail → Return to 'Run structured testing before approving the model' under 'Before you deploy'

Q3. Is there a named owner, documented version and review date on record?

Without these, approval is informal and difficult to enforce.

The model should not move into production without an accountability trail.

Fail → Return to 'Record ownership and review arrangements before the model goes live' under 'Ongoing governance'