VaultLLM

Frequently Asked Questions

Everything you need to know about VaultLLM — from GDPR compliance to what happens if your hardware fails.

Is VaultLLM GDPR compliant?

Yes — by architecture, not by policy. GDPR compliance for AI tools is difficult when using cloud services because your data is processed on third-party servers outside your control. With VaultLLM, all AI processing happens on hardware installed at your premises. Your data never travels to an external server, so there is no third-party data processor for AI queries, no cross-border data transfer, and no dependency on a vendor's privacy policy to protect your clients' information.

This approach satisfies the ICO's guidance on data minimisation and purpose limitation, and makes your AI use auditable and demonstrable to clients and regulators.

Does my data really stay on my premises?

Absolutely. The VaultLLM node processes every query locally, on hardware physically located in your office. No query, document, prompt, or response is transmitted to any external server — not to VaultLLM, not to any AI company, not to any cloud provider.

Remote management (for monitoring and updates) uses a minimal connection that carries only monitoring telemetry and update packages, never your business data. You can verify this at any time through your own network monitoring tools, and we can provide a network traffic audit on request.
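As an illustration of what such a self-audit can look like, here is a minimal sketch of filtering a firewall or flow log for unexpected outbound traffic from the node. The IP addresses, log format, and allow-list are all hypothetical; adapt them to your own network and tooling.

```python
# Hypothetical example: audit a "source destination bytes" flow log
# for traffic leaving the node to anywhere other than the approved
# management endpoint. All addresses below are illustrative.

NODE_IP = "10.0.0.50"                    # hypothetical LAN address of the node
ALLOWED_DESTINATIONS = {"203.0.113.10"}  # hypothetical management endpoint

def unexpected_flows(log_lines):
    """Return (destination, bytes) for flows that originate from the
    node and go anywhere other than an allowed destination."""
    flagged = []
    for line in log_lines:
        src, dst, byte_count = line.split()
        if src == NODE_IP and dst not in ALLOWED_DESTINATIONS:
            flagged.append((dst, int(byte_count)))
    return flagged

sample_log = [
    "10.0.0.50 203.0.113.10 4096",   # management traffic: expected
    "10.0.0.50 198.51.100.7 12288",  # anything else should be investigated
    "10.0.0.12 198.51.100.7 2048",   # traffic from another machine: ignored
]

print(unexpected_flows(sample_log))  # → [('198.51.100.7', 12288)]
```

A real deployment would feed this from your firewall's export rather than a hard-coded list; the point is that the check is something you can run yourself, without trusting anyone's word for it.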

Can VaultLLM run without an internet connection?

Yes. The VaultLLM node can run in a fully air-gapped environment — no internet connection required for day-to-day AI use. All models are stored locally on the node's storage, and inference runs entirely offline.

For our managed maintenance plan, a periodic internet connection is needed to receive updates (model upgrades, firmware patches). If your compliance environment prohibits any outbound connection, we offer an alternative where updates are delivered on physical media. Please mention this requirement during your discovery call.
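When updates arrive on physical media, a sensible precaution is to verify the archive against a vendor-supplied checksum before applying it. The filenames below are hypothetical stand-ins, not the actual media layout:

```shell
# Illustrative only: filenames are hypothetical. A stand-in file takes
# the place of an update archive copied from the physical media.
echo "model weights" > model-update.tar.gz

# The vendor would generate this checksum file before shipping:
sha256sum model-update.tar.gz > model-update.sha256

# The firm verifies integrity before applying the update:
sha256sum -c model-update.sha256   # prints "model-update.tar.gz: OK"
```

If the check fails, the archive was corrupted or altered in transit and should not be installed.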

Which AI models does VaultLLM support?

We support a range of best-in-class open-source models, selected and configured for professional use. Current recommendations include Llama 3.3 70B (ideal for legal document work), Qwen 2.5 32B (excellent for financial data analysis), Mistral Small 22B (fast everyday use), and Phi-4 14B (structured reasoning and research).

Multiple models can be installed simultaneously so your team can choose the best tool for each task. We keep up with the rapidly evolving open-source AI landscape and push updates to your node as superior models become available.

See the full list of supported models on our Services page →

What happens if the hardware fails?

Our remote monitoring detects most hardware issues before they cause a service interruption. In the event of a failure, contact our support team and we will diagnose the issue remotely. For component failures, we dispatch a replacement part or engineer as quickly as possible — typically within two working days for UK mainland locations.

Major hardware replacement is handled under a separate agreement. We'll discuss your specific recovery time requirements during the discovery call and ensure your service agreement reflects them.

What support do you provide?

Support is available via telephone and email during UK business hours (Monday–Friday, 9am–5:30pm). All managed plan customers receive a dedicated support contact and a direct phone number for their account.

We guarantee a first response within one business day for all support requests. Critical issues (node down, complete service loss) are prioritised and we aim to respond within four business hours. Out-of-hours emergency support is available as an add-on for practices that require it.

How long does installation take?

From initial enquiry to a live node typically takes 3–5 weeks. This includes a discovery call, agreement, hardware build (5–10 working days), delivery, and on-site installation. The installation appointment itself takes approximately half a day — we handle everything, and your team can start using the AI the same afternoon.

If you have a specific go-live deadline, let us know during the discovery call and we'll do our best to accommodate it.

What does the £199/month fee cover?

The £199/month figure is the starting price for our standard managed maintenance plan. Your actual monthly fee depends on the number of nodes installed, the level of support coverage required, and whether you need any add-ons such as out-of-hours emergency support or extended SLAs.

We will provide a fixed, transparent quote before you commit to anything. There are no hidden fees, no per-query charges, and no usage caps. You pay a flat monthly fee regardless of how much your team uses the AI.

Is VaultLLM worth it for a small firm?

Yes. The hardware cost is a one-time investment regardless of team size — a sole practitioner and a 50-person firm pay the same for the node itself. For small firms, the primary value proposition is not cost savings over ChatGPT subscriptions (the ROI table shows this is more compelling at 20+ seats), but rather compliance certainty.
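The break-even arithmetic behind the "20+ seats" point can be sketched as follows. Every figure here except the £199/month starting fee is hypothetical — substitute your actual quote and your current per-seat AI subscription cost:

```python
# Illustrative break-even sketch. HARDWARE_COST and CLOUD_SEAT_PM are
# hypothetical placeholders, not actual VaultLLM pricing.

HARDWARE_COST = 8000   # hypothetical one-off node price, GBP
MAINTENANCE_PM = 199   # standard managed plan starting price, GBP/month
CLOUD_SEAT_PM = 20     # hypothetical cloud AI subscription, GBP/seat/month
HORIZON_MONTHS = 36    # evaluation period

def break_even_seats(hardware, maintenance_pm, seat_pm, months):
    """Smallest team size at which total on-premises cost over the
    horizon falls below the equivalent cloud subscription spend."""
    on_prem_total = hardware + maintenance_pm * months
    seats = 1
    while seats * seat_pm * months < on_prem_total:
        seats += 1
    return seats

print(break_even_seats(HARDWARE_COST, MAINTENANCE_PM,
                       CLOUD_SEAT_PM, HORIZON_MONTHS))  # → 22
```

With these placeholder numbers the crossover lands in the low twenties, consistent with the 20+ seat figure above; the compliance argument, by contrast, applies at any team size.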

For a law firm or accountancy practice of any size, the ability to tell clients "your data is processed entirely on our own hardware, never sent to any cloud service" is a significant professional differentiator — and increasingly, a regulatory expectation.

Can my team use different models for different tasks?

Yes. Multiple models can run simultaneously on your node, and your team can select the most appropriate model through the user interface. For example, you might use Llama 3.3 70B for detailed contract review and Mistral Small 22B for quick email drafting — each optimised for its task.

We configure the interface and model selection during installation, tailored to your team's most common workflows. Additional models can be added at any time.

Won't the hardware become obsolete?

This is the right question to ask — and the honest answer is that the hardware gets more capable over time, not less. Open-source model research moves fast, and the clear trend is that newer models deliver better results on the same hardware. The node you install today will run models in two years that don't exist yet and that outperform today's best.

For the document and text tasks common in legal and accountancy work — summarising contracts, drafting correspondence, extracting data — today's 70B parameter models already outperform what GPT-4 could do at launch. The practical ceiling for your use cases is likely already met.

Model updates are included in the managed maintenance plan — we push improved versions to your node as they become available. You benefit from progress in open-source AI automatically, without buying new hardware.

Cloud AI has its own obsolescence problem: providers retire models with little warning, and moving to a new version often means re-testing your workflows and accepting changed behaviour. With VaultLLM, model updates happen on your schedule, not a provider's.

For clients with a long-term outlook, we also offer a hardware refresh programme — planned upgrades at agreed intervals. Ask about this during your discovery call.

Have a question we haven't answered?

Get in touch and we'll respond within one business day.

Contact Us