Training inside your perimeter, end to end

How Lua trains, serves, and improves your model entirely inside your own infrastructure.

Every enterprise ML project reaches the same meeting. The model works in the pilot, the business case is clear, and then someone from security asks the question that actually decides things: where does our data go?

For most vendors the honest answer is “to us.” Your data is copied to their cloud, their account, their model. That answer is what turns a six-week pilot into a nine-month legal review — and it’s where a lot of good projects quietly die.

Lua’s answer is: nowhere. The data stays exactly where it is. We come to it.

Bring the method to the data, not the data to the method

There are three ways to get a model trained on your data. Two of them move your data; one doesn’t.

  • Send it to an API. Your records leave your perimeter on every call. Simple, and a non-starter the moment the data is regulated or genuinely sensitive.
  • Let a SaaS platform ingest it. Your data is copied into the vendor’s environment and lives there. Now your most sensitive asset sits in someone else’s account, under their controls, for the life of the contract.
  • Keep the data where it lives and bring the training to it. The model is built next to the warehouse it learns from. Nothing is copied out.

Lua is the third. We run inside your cloud account, your VPC, or your on-prem environment — wherever the data already sits — and train against it in place.

The perimeter holds at every stage

“Where does the data go” usually has an uncomfortable answer because the ML lifecycle has several stages, and each one is a chance for data to slip across the boundary. So the useful way to read a vendor’s claim is stage by stage. Here’s ours.

Figure: The whole lifecycle runs inside your perimeter. Only high-level metrics cross the line. (Diagram: a box labeled “your perimeter” containing the full lifecycle of Data, Train, Serve, and Monitor, operated by forward-deployed Lua engineers, with a single arrow labeled “high-level metrics” crossing the boundary outward.)

  • Data: read in place. Lua reads from your warehouse or object store over your own network. No copy is made outside your boundary, and there is no staging bucket in our account.
  • Train: on your compute. Training runs on your hardware: your GPUs, your cloud account, your bill. The stack ships in as containers and runs where you tell it to; it does not call home to train.
  • Serve: in your VPC. The model is deployed for inference inside the same perimeter, next to the systems that consume it. Predictions never require a round-trip to us.
  • Monitor and improve: in-perimeter. This is the stage people assume must leak, because “the vendor improves the model over time” sounds like telemetry flowing out. It isn’t telemetry. The work happens inside your environment (more on exactly how next), and only high-level metrics ever leave.

There are no third-party calls at training or inference time. The model that learns from your data, and the model that serves it, both live entirely within your walls.
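
One way a security team might check a claim like this for themselves is to list every endpoint each stage talks to and flag anything that leaves the perimeter. The sketch below is illustrative only and is not Lua's tooling: the hostnames, the `INTERNAL_SUFFIXES` and `METRICS_HOST` values, and the `Endpoint` type are all assumptions made up for the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical values for illustration; substitute your own network naming.
INTERNAL_SUFFIXES = (".corp.example.internal",)
METRICS_HOST = "metrics.vendor.example"


@dataclass(frozen=True)
class Endpoint:
    stage: str  # "data", "train", "serve", or "monitor"
    url: str


def is_internal(host: str) -> bool:
    """True when the host sits under one of the perimeter's own domains."""
    return host.endswith(INTERNAL_SUFFIXES)


def egress_violations(endpoints):
    """Return every endpoint that crosses the perimeter, except the single
    metrics destination that the monitor stage is allowed to reach."""
    violations = []
    for ep in endpoints:
        host = urlparse(ep.url).hostname or ""
        if is_internal(host):
            continue
        if ep.stage == "monitor" and host == METRICS_HOST:
            continue
        violations.append(ep)
    return violations
```

Run against a deployment's declared endpoints, an empty result corresponds to the picture above: everything in-perimeter, with one outbound arrow for metrics. In practice the same rule would more likely live in firewall or VPC egress policy than in application code.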

The part most vendors won’t say out loud: who can see your data

Egress is the question security teams ask first. It’s not the one that matters most. Even if your data never leaves, you’re letting a vendor operate on it in place — so the real question is who can see it, and under what controls.

We’d rather be plain about this than hide it behind a diagram. Our forward-deployed engineers operate inside your environment, on a Lua-run workspace that you provision in your own infra, under NDA. During a proof-of-concept that means read access to the data in scope, so we can actually build the thing. It does not mean write access to your live production systems.

What makes that safe isn’t a promise. It’s that the controls are yours:

  • Access runs through your identity provider, scoped to the project — not a standing key to everything.
  • Every action lands in your audit logs, not ours.
  • Access is revocable by you, at any time, without our involvement.
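
To make those three properties concrete, here is a minimal sketch of a project-scoped, read-only, revocable grant where every check is written to an audit log. It is a toy model, not a description of any real identity provider or of Lua's setup: `AccessController`, `Grant`, and the principal and project names are hypothetical stand-ins for the customer's own IdP and logging.

```python
from dataclasses import dataclass, field


@dataclass
class Grant:
    principal: str
    project: str
    actions: frozenset
    revoked: bool = False


@dataclass
class AccessController:
    """Toy stand-in for the customer's identity provider and audit log."""
    grants: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def grant_read(self, principal: str, project: str) -> None:
        # Scoped to a single project and to read access only.
        key = (principal, project)
        self.grants[key] = Grant(principal, project, frozenset({"read"}))

    def revoke(self, principal: str, project: str) -> None:
        # The customer can revoke without any vendor involvement.
        grant = self.grants.get((principal, project))
        if grant is not None:
            grant.revoked = True

    def check(self, principal: str, project: str, action: str) -> bool:
        grant = self.grants.get((principal, project))
        allowed = (
            grant is not None
            and not grant.revoked
            and action in grant.actions
        )
        # Every decision, allowed or denied, lands in the customer's log.
        self.audit_log.append((principal, project, action, allowed))
        return allowed
```

The point of the sketch is where the state lives: the grants and the log both belong to the controller, which in this analogy is owned by the customer rather than the vendor.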

“Improve the model over time” then means something concrete and unmysterious: engineers working inside your perimeter, under your access controls — not a service quietly phoning home.

What actually crosses the boundary

To be exact about the one arrow that does cross the line: high-level metrics. Model performance, training progress, the numbers we need to know the system is healthy and getting better. Aggregate, non-sensitive, and reviewable by you.

Your records, your features, and the model’s view of your customers stay put. The thing that leaves is a status report, not your data.
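
As a rough illustration of what “a status report, not your data” could look like in code, the sketch below builds an outbound report from an allowlist of metric names and keeps only finite scalar numbers. The metric names in `ALLOWED_METRICS` and the `outbound_report` function are assumptions invented for this example, not Lua's actual reporting schema.

```python
import math
from numbers import Real

# Hypothetical allowlist; a customer would review and own this list.
ALLOWED_METRICS = frozenset({"val_auc", "train_loss", "epoch"})


def outbound_report(candidate: dict) -> dict:
    """Keep only allowlisted metric names whose values are finite scalars.

    Strings, lists, nested objects, and anything not on the allowlist are
    dropped, so row-level or free-text content cannot ride along.
    """
    report = {}
    for name, value in candidate.items():
        if name not in ALLOWED_METRICS:
            continue
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            continue
        if not math.isfinite(value):
            continue
        report[name] = float(value)
    return report
```

Because the filter is an allowlist rather than a blocklist, a new field added inside the perimeter stays inside until someone deliberately approves it.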

Why this is the only model that works

You could read all of this as a compliance accommodation — a way to get past the security review. It’s deeper than that. It’s the only architecture that makes the underlying idea possible at all.

One model per company only works if it can train on all of your operational data — the full, sensitive, proprietary record of how your business runs. No enterprise is going to copy that into a vendor’s cloud, and they’re right not to. The representations that make the approach work and the data governance that makes it adoptable point to the same answer: the data never moves. The method does.

So the model is built where the data already lives, served where the decisions are made, and improved by people working inside your walls. Your data never has to leave to become the most valuable model you own.