AI is a black box that needs proper guardrails
Artificial intelligence is profoundly changing how we can analyze financial documents. Where a human team would spend weeks combing through thousands of invoices, a well-instrumented model does it in a few hours.
But let’s be honest: these models are operated by third-party providers (OpenAI, Anthropic). Sending them your supplier names, bank account numbers, or contact details in raw form means sharing a commercial map that has no business leaving your company.
That’s why we made a very specific choice: the AI never sees your data as it is.
The principle: anonymization by mapping
Before every call to an AI model, Finareo runs your documents through an automatic anonymization step. Concretely:
| Before (raw data) | After (sent to AI) |
|---|---|
| Carrefour Maroc SARL | SUPP_001 |
| IBAN: 011 780 0000123… | IBAN_017 |
| contact@carrefour.ma | CTC_009 |
| Share capital: 50M MAD | CAPITAL_AMT |
The AI always sees:
- The document structure (header, invoice lines, totals, VAT)
- The amounts (essential to detect anomalies)
- The dates (essential for patterns)
But it never sees who is who.
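The mapping step above can be sketched in a few lines of Python. This is an illustrative sketch, not Finareo’s actual implementation: the `Anonymizer` class, the token prefixes, and the simple string replacement are all assumptions made for clarity.

```python
class Anonymizer:
    """Illustrative sketch: replace sensitive values with stable tokens."""

    def __init__(self):
        self.mapping = {}   # token -> original value; stays on local infrastructure
        self.counters = {}  # per-category counters (SUPP, CTC, ...)

    def tokenize(self, category, value):
        """Return a stable token for a value, e.g. SUPP_001 for a supplier name."""
        # Reuse the existing token if this value was already seen
        for token, original in self.mapping.items():
            if original == value and token.startswith(category):
                return token
        self.counters[category] = self.counters.get(category, 0) + 1
        token = f"{category}_{self.counters[category]:03d}"
        self.mapping[token] = value
        return token

anon = Anonymizer()
doc = "Invoice from Carrefour Maroc SARL, contact contact@carrefour.ma"
doc = doc.replace("Carrefour Maroc SARL", anon.tokenize("SUPP", "Carrefour Maroc SARL"))
doc = doc.replace("contact@carrefour.ma", anon.tokenize("CTC", "contact@carrefour.ma"))
# doc now reads: "Invoice from SUPP_001, contact CTC_001"
```

Note that only `doc` would be sent to the AI provider; `anon.mapping` never leaves the local side.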
The mapping never leaves our infrastructure
The correspondence table between SUPP_001 and “Carrefour Maroc SARL” is kept exclusively on Finareo infrastructure. It never transits to AI providers, and is never logged on their side.
Once the AI’s response comes back, we locally re-inject the real names into the result before showing it to you. You see real data. The third party only saw tokens.
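The re-injection step is the mirror image of anonymization: walk the local mapping and substitute each token back. A minimal sketch, assuming the mapping is a plain token-to-value dictionary (the function name and data shape are illustrative):

```python
def deanonymize(text, mapping):
    """Replace every token in the AI's answer with its original value, locally."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

mapping = {"SUPP_001": "Carrefour Maroc SARL"}  # never sent to the AI provider
answer = "SUPP_001 was overbilled by 12% on three invoices."
result = deanonymize(answer, mapping)
# → "Carrefour Maroc SARL was overbilled by 12% on three invoices."
```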
On top of that: contractual Zero Data Retention
Anonymization alone wasn’t enough for us. We wanted a contractual guarantee too.
Our enterprise contracts with OpenAI and Anthropic include a Zero Data Retention clause:
- No data sent is retained on their side
- No data sent is used to train or improve their models
- Each call is stateless: no “conversation” feature, no memory
Concretely, if an AI provider were breached tomorrow, there’d be nothing to steal about your documents — they’re simply no longer there.
And on the Finareo side?
We keep traces of the calls we make to AI models — but without the content:
- ✅ Kept on our side: model used, call duration, token count, technical identifier
- ❌ Not kept: the content sent, except for occasional debug with your explicit consent
This is how we can monitor service quality without creating a second copy of your data.
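Metadata-only logging of this kind can be sketched as a thin wrapper around the provider call. Everything here is a hypothetical illustration: `call_model`, `fake_provider_call`, and the crude word-count stand-in for real token counting are assumptions, not Finareo’s code.

```python
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_calls")

def fake_provider_call(payload):
    """Stand-in for the real AI provider API call."""
    return f"analyzed {len(payload.split())} tokens"

def call_model(payload, model="example-model"):
    """Log call metadata (model, duration, token count, id) but never the content."""
    call_id = uuid.uuid4().hex[:12]            # technical identifier
    start = time.monotonic()
    response = fake_provider_call(payload)
    log.info(
        "model=%s duration_ms=%d tokens=%d call_id=%s",
        model,
        int((time.monotonic() - start) * 1000),
        len(payload.split()),                  # crude token count for the sketch
        call_id,
    )
    return response                            # payload itself is never logged

result = call_model("SUPP_001 invoice total 120")
```

The key design point is that the log line is built only from derived metadata, so the log store never becomes a second copy of the documents.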
Why this choice matters
An argument we sometimes hear: “LLMs are already secure, you don’t need anonymization.” That’s wrong, for three reasons:
- Defense in depth: several independent layers beat a single excellent one
- Provider uncertainty: their policies can change and their models evolve; anonymization, by contrast, depends only on us
- Separation of concerns: it’s not the AI provider’s job to guarantee the confidentiality of your supplier names — it’s ours
In practice
When you read in a Finareo analysis that “supplier X was overbilled by 12% on contract Y”, know that:
- The AI that detected the gap never saw the name of X or the name of contract Y
- The reconciliation between SUPP_001 and “X” was done locally, on our side
- The contracts we signed with our AI providers are available for inspection under NDA
This is what we call working with AI — without handing it your business.