Securing NVIDIA’s DGX Spark AI Supercomputer: A Firs...

Securing NVIDIA’s DGX Spark AI Supercomputer:

A First-Look Security Analysis

By Rex M. Lee, Security Advisor, My Smart Privacy

While consulting for a defense contractor, I recently conducted a comprehensive security and capability assessment of NVIDIA’s DGX Spark AI Supercomputer—a compact yet extraordinarily powerful workstation capable of running AI models with up to 400 billion parameters. Although the platform marks a major advancement in on-premise AI innovation, my analysis concludes that it should not be used for development until it undergoes proper security pre-configuration and full air-gapping to mitigate potential threats.

I was alarmed to discover that these computers incorporate Chinese-manufactured components and can run AI software such as DeepSeek AI, which was developed in China.

Additionally, the devices are supported by Meta AI software, and both DeepSeek and Meta are known to develop technologies embedded with surveillance and data-mining capabilities—some operating below the BIOS level, where they cannot be independently audited.

This poses serious privacy, security, safety, and data-sovereignty risks, particularly if these systems are deployed within the defense industry, critical infrastructure, government agencies, or enterprise environments where protecting confidential and classified information—such as intellectual property, copyright content, health data, legal records, sensitive user data, legal information, employment information, or trade secrets—is non-negotiable.

Given the presence of potential Chinese-origin hardware and AI components, as well as supply-chain and model-level surveillance risks, it’s essential that these systems be hardened before deploying them in defense, critical infrastructure, or other sensitive environments.

If you purchased DGX Spark units through a vendor or reseller, I strongly recommend forwarding them this analysis and requesting that they implement the necessary security configurations—ideally at no or minimal cost.

What You’re Getting with the DGX Spark

Each DGX Spark combines NVIDIA’s new GB10 Grace-Blackwell Superchip—delivering up to 1 PFLOP FP4 performance—into a compact chassis about the size of a paperback book. Each unit includes:

128 GB unified memory
200 Gb/s ConnectX-7 networking
4 TB self-encrypting NVMe storage

According to NVIDIA documentation, one Spark supports models up to 200 billion parameters, and two units connected together (“Spark Stacking”) can handle up to 405 billion. This puts enterprise-scale generative-AI capability directly on a desktop—an impressive feat for under $4,000 per unit.

The DeepSeek V3 Risk: Chinese-Origin AI

One model marketed for use on this platform, DeepSeek V3, originates from Hangzhou, China, and is backed by a Chinese hedge fund. Although technically efficient, it introduces serious national-security and supply-chain concerns.

DeepSeek’s model weights constitute Chinese-origin intellectual property, meaning any organization with a “no foreign AI” policy—especially in defense or critical-infrastructure sectors—should avoid it altogether. Even when air-gapped, unverified model artifacts can carry hidden vulnerabilities or backdoors.

Hardware and Supply-Chain Exposure

While the DGX Spark is primarily designed and assembled by NVIDIA, certain components—such as NVMe drives and wireless modules—may be sourced from third-party vendors, potentially outside the U.S. or allied countries.

For organizations enforcing strict supply-chain policies:

Specify Micron (US), Kioxia (Japan), or Samsung (Korea) self-encrypting drives.
Disable or remove Wi-Fi / Bluetooth modules in the BIOS/UEFI.
Use only ConnectX-7 RDMA networking for node-to-node communication.

Meta Llama 3.1: A Capable but Complex Option

Meta’s Llama 3.1 family (8B, 70B, 405B models) offers strong U.S.-based alternatives for local AI development, fully compatible with PyTorch, TensorRT-LLM, and NVIDIA’s NGC ecosystem.

However, these models still introduce significant surveillance and misuse risks when deployed in high-sensitivity environments:

Model exploitation and “agentic” behavior – Large models can act autonomously in unpredictable ways, potentially generating or manipulating outputs that bypass security oversight.
Remote code-execution vulnerabilities – Flaws in the open-source Llama-stack could allow attackers to execute arbitrary code if systems are not fully locked down.
Prompt-injection and data-exfiltration – Adversarial prompts can trick the model into revealing internal configurations or sensitive data.
License modifications and guardrail removal – Open models can be altered, reducing built-in safety controls.
Third-party integrations and supply-chain risk – External tools used with Llama (retrieval pipelines, plug-ins, etc.) can become vectors for data capture or surveillance.
Hidden behaviors in large models – At 405B parameters, full behavioral mapping is nearly impossible; latent vulnerabilities may persist despite safeguards.

Mitigation and Hardening Strategies

For organizations building proprietary IP, classified systems, or critical-infrastructure AI tools:

Air-gap the environment completely and restrict all external I/O.
Harden the OS and container stack with Secure Boot, measured boot, and signed images.
Replace unverified hardware (especially storage) with trusted components.
Monitor model outputs and logs for anomalous or manipulative behavior.
Run smaller, safer models (8B or 70B) for sensitive workloads; reserve large-scale (405B) use for isolated R&D only.
Red-team the AI regularly to test for prompt-injection or exploitation pathways.
Apply FIPS-compliant encryption for all model weights and logs, using TPM/HSM key custody.

OpenAI’s Air-Gapped Solution

For U.S.-origin, enterprise-friendly alternatives, OpenAI’s open-weight models (gpt-oss-20B and 120B) provide a clear path forward.
These Apache-licensed models are designed to run entirely on owned infrastructure—including fully air-gapped systems—and come with robust documentation for offline deployment.

gpt-oss-120B runs comfortably within a single DGX Spark and can scale across two units for higher throughput.
US origin + clean licensing make it ideal for defense, aerospace, or government use cases.
Future scalability: OpenAI’s roadmap supports secure hybrid deployments via Azure Government and air-gapped GPT-4 environments, providing a migration path for classified workloads.

My Recommended Pre-Configuration Plan

Before developing on the DGX Spark platform:

Build a “No-China” Profile
- Replace storage with Micron/Kioxia/Samsung SEDs.
- Disable Wi-Fi/BT; enable UEFI Secure Boot + admin PIN.
- Verify the 200 Gb/s ConnectX-7 link.
Air-Gap the Software Supply Chain
- Create an offline container and model registry.
- Mirror TensorRT-LLM and PyTorch containers after vulnerability scans and checksum verification.
Select Trusted Models
- Start with OpenAI gpt-oss-20B/120B and Meta Llama 3.1-70B for baseline performance testing.
- Avoid DeepSeek models in production unless validated in a quarantined lab with independent artifact verification.
Maintain Ongoing Security Hygiene
- Operate under egress-deny firewall policies.
- Sign and pin all artifacts (container hashes, model checksums).
- Conduct routine prompt-injection and adversarial testing exercises.

Final Thoughts

I advised the defense contractor to forward my security analysis and recommendations to the vendor from whom they purchased the computers, and also to share a copy with NVIDIA for their review, feedback, and recommendations. I will update this blog once I receive their response—if they choose to reply at all.

The NVIDIA DGX Spark represents an exciting step toward affordable, on-premise AI supercomputing, but as with any powerful technology, security must come first. Before writing a single line of code, ensure your systems are air-gapped, hardened, and verified—and when in doubt, seek validation from a qualified IT security expert or vendor partner.

This assessment is based on my independent research and validated public sources. While I specialize in endpoint and connected-device security for Android, iOS, and Windows platforms, this marks my first direct review of high-performance AI hardware—and the same core principle applies: security must be engineered at the foundation, not bolted on later.

For more privacy and security analyses, visit MySmartPrivacy.com or follow my updates on The Electronic Bill of Rights Initiative.

— Rex M. Lee
Security Advisor, My Smart Privacy