[ Home ][ Contact Us ]

>>_

Cover Image for A Comparative Security Evaluation of Local vs Remote LLM Deployments
#VI

I evaluated Qwen3-0.6B from two different deployment perspectives:

  1. Local execution using Ollama (with direct config inspection)
  2. Remote access via an HTTPS chat endpoint

Both setups were red-teamed using:

  • Garak (LLM vulnerability scanner)
  • Promptfoo (adversarial prompt testing framework)

## Key Results

Deployment Safety Pass Rate
Local (Ollama) 54%
Remote (HTTPS endpoint) 8%

A "pass" indicates the model safely handled or rejected malicious prompts.

The remote deployment failed 92% of attacks.


Screenshot From 2026-04-21 14-50-31

## OWASP-Aligned Test Breakdown

Category Description Local Remote
Excessive Agency Unsafe autonomous actions 67% 2%
Hijacking Task redirection attacks 60% 0%
Prompt Extraction System prompt leakage 40% 0%
Overreliance Failure to challenge bad input 27% 0%

## Additional Finding

During testing, the model hallucinated ANSI escape codes in 56% of cases.

Examples included incorrect sequences like:

  • \t
  • 135
  • H

This creates a real risk surface:

  • Terminal injection
  • Log poisoning

Screenshot From 2026-04-21 14-52-25

## Architecture Insights

From the model configuration:

  • 28 hidden layers → deeper prompt influence propagation
  • SiLU + SwiGLU → smoother activations, reduced sparsity
  • RMSNorm → stable normalization
  • RoPE (θ = 1,000,000) → extended context handling
  • Grouped Query Attention (GQA) → shared attention memory

## Key Takeaway

Model behavior is not determined solely by training data.

It is influenced by:

  • Architectural design choices
  • Mathematical functions (activation, normalization, attention)
  • Training interactions between components
  • Deployment wrapper and API structure

Same model weights can produce drastically different security outcomes depending on how they are exposed.


## Implications

>_ For Defenders

  • Model configuration should be part of threat modeling
  • Output handling and downstream usage must be validated

>_ For Builders

  • Activation functions and architecture impact security
  • API design and prompt handling define attack surfaces

## Conclusion

The model is only one part of the system.

The interface and deployment layer define its real-world security posture.


## References

>_ Tooling

>_ Frameworks

>_ Research Papers


#AISecurity #LLM #RedTeaming #CyberSecurity #PromptInjection #MLSecurity