
/genesis: How /comprendre works

Everything you need to call the controleur, with or without Haiku.

1. What /comprendre does

POST /comprendre is the single entry point for any business interaction. You send raw text — an invoice, an email, a complaint, a quote request, anything — and the controleur:

  1. Extracts structured features from the text (via Claude Haiku or regex fallback)
  2. Classifies the task type via a trained Snake SAT model (7 classes)
  3. Routes to the appropriate sub-classifier in the Monce Suite
  4. Returns the full response: prediction, probabilities, routing decision, XAI audit

It always returns HTTP 200 with a usable JSON payload: no 502s, no 500s. If Haiku is down, it falls back to regex extraction. If Snake fails, it falls back to "autre" with human escalation.
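The degradation chain above can be sketched in a few lines. This is an illustrative Python sketch, not the service's actual code; the helpers (`extract_with_haiku`, `extract_with_regex`, `snake_classify`) are hypothetical stand-ins.

```python
# Illustrative sketch of the controleur's degradation chain.
# All helper functions below are hypothetical stand-ins, not production code.

def extract_with_haiku(text: str) -> dict:
    # Stand-in for the Bedrock call; here it always fails,
    # as it would when no LLM is configured.
    raise RuntimeError("no LLM configured in this sketch")

def extract_with_regex(text: str) -> dict:
    # Simplified keyword fallback.
    return {"document_type": "facture" if "FACTURE" in text.upper() else "email"}

def snake_classify(features: dict) -> tuple[str, dict]:
    # Stand-in for the Snake SAT dispatch model.
    if features.get("document_type") == "facture":
        return "controle_facture_fournisseur", {"controle_facture_fournisseur": 0.82}
    return "classification_email", {"classification_email": 0.60}

def comprendre(text: str, factory_id: int = 3) -> dict:
    # Step 1: feature extraction -- Haiku first, regex on any failure.
    try:
        features, mode = extract_with_haiku(text), "haiku"
    except Exception:
        features, mode = extract_with_regex(text), "regex_fallback"
    # Step 2: dispatch -- "autre" plus human escalation if Snake itself fails.
    try:
        task, probas = snake_classify(features)
        decision = {"action": "router", "escalade": None}
    except Exception:
        task, probas = "autre", {}
        decision = {"action": "escalade_humaine", "escalade": "dispatch_failure"}
    return {
        "factory_id": factory_id,
        "controleur": {"tache_detectee": {"Prediction": task, "Probability": probas}},
        "decision": decision,
        "extraction_mode": mode,
    }
```

The point of the structure is that every failure path still produces a well-formed response, which is why the endpoint never needs to return a 5xx.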

2. How to prompt it

Request

POST /comprendre
Content-Type: application/json

{
  "text": "your raw text here — invoice, email, complaint, anything",
  "factory_id": 3   // optional, defaults to 3. Monce factory ID (1=VIT, 3=Monce, 4=VIP, 9=Euro, 10=TGVI)
}

That's it. Two fields. text is required, factory_id is optional.

What to put in text

| Content type | Example | Expected routing |
|---|---|---|
| Supplier invoice | FACTURE N FA-2026-0891 GlassCorp SA Total HT 12950 EUR | controle_facture_fournisseur → procurement |
| Client email | De: jean@example.fr Objet: Suivi commande Bonjour... | classification_email → email |
| Quote request | Demande devis: 200 pcs trempe 10mm, livraison Lyon | demande_devis → quote |
| Complaint | RECLAMATION: verre casse a la livraison, cmd CMD-2026-412 | reclamation_client → request |
| New prospect | Nouveau contact: Pierre Martin, BatiPro, renovation, Paris | scoring_prospect → salesagent |
| Supplier benchmark | COMPARATIF: AGC 45EUR/u, Saint-Gobain 42EUR/u, Guardian 48EUR/u | benchmark_offre → benchmark |
| Negotiation | NEGOCIATION: GlassCorp, volume 200kEUR, demande remise 10% | negociation_offre → negociation |

You can paste full documents, partial text, structured or unstructured content. The longer and more realistic the text, the better Haiku's extraction. Even a few keywords are enough in regex fallback mode.

curl examples

# Minimal
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"FACTURE GlassCorp Total 12950 EUR"}'

# Full invoice
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"FACTURE N FA-2026-0891\nGlassCorp SA\nClient: Monce SAS\nRef VS-TMP10 Trempe 10mm x 200 pcs 4200 EUR\nTotal HT: 12950 EUR\nTVA: 2590 EUR","factory_id":3}'

# Complaint
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"RECLAMATION urgent: livraison verre casse, ref CMD-2026-412, inacceptable"}'

3. What goes in, what comes out

Input data structure

{
  "text": string,       // REQUIRED — raw text content
  "factory_id": int     // optional — default 3
}

Output data structure (always returned)

{
  "factory_id": 3,
  "version": "v0.1.0410",

  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",  // one of 7 task types
      "Probability": {                          // all 7 probabilities, sum to 1.0
        "controle_facture_fournisseur": 0.85,
        "classification_email": 0.05,
        ...
      },
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 0.85
  },

  "resultat_delegue": { ... },  // downstream classifier result (mocked in v0)

  "decision": {
    "action": "router",                    // router | valider_paiement | bloquer_paiement | escalade_humaine
    "raison": "Interaction routee vers...",
    "escalade": null
  },

  "quality_score": 0.81,      // 0-1, derived from routing confidence
  "latency_ms": 1200,          // end-to-end server time

  "xai": {
    "dispatch_audit": "Snake SAT audit trail...",
    "matching_audit": "...",     // null for non-facture tasks
    "anomaly_audit": "...",      // null for non-facture tasks
    "governance": "Dispatch: Snake SAT model. Features: {...}."
  },

  "extraction_mode": "haiku",  // "haiku" or "regex_fallback"
  "mock_downstream": true      // downstream classifiers are mocked in v0
}
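For quick consumption of this payload, a small helper can pull out the routing essentials. The field names match the structure documented above; the helper itself is illustrative.

```python
# Summarize the routing essentials of a /comprendre response.
# Field names follow the documented payload; this helper is illustrative.

def summarize(response: dict) -> str:
    ctrl = response["controleur"]
    tache = ctrl["tache_detectee"]
    return (
        f"{tache['Prediction']} -> {ctrl['classifier_delegue']} "
        f"(conf={ctrl['confiance_routage']:.2f}, mode={response['extraction_mode']})"
    )
```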

4. Two modes: Haiku vs Regex fallback

Mode 1: Haiku enabled (extraction_mode: "haiku")

Claude Haiku reads the full text and extracts rich structured JSON: document type, line items with refs/quantities/prices, totals, payment conditions, entities, resume. These fields are mapped to 8 Snake features for dispatch classification.

Quality: High. Haiku understands context, handles messy OCR, disambiguates overlapping patterns.

Latency: ~1-3s (Bedrock round-trip).

When: AWS Bedrock credentials are set OR ANTHROPIC_API_KEY=sk-... is set.

Mode 2: Regex fallback (extraction_mode: "regex_fallback")

When no LLM is available, the controleur falls back to regex pattern matching: looks for FACTURE, RECLAMATION, EUR amounts, invoice numbers, company names, product references. Extracts what it can, passes to Snake.

Quality: Lower but viable. Works well on structured documents (invoices, comparatives). Struggles with ambiguous emails.

Latency: ~10-50ms (no network call).

When: No API key, Bedrock auth failure, network timeout, quota exceeded. Any Haiku failure triggers fallback silently.

The response always tells you which mode was used via "extraction_mode". Check this field if you need to know whether the classification was backed by LLM or regex.
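A simple guard on that field, for callers that want to treat regex-backed, low-confidence answers differently. The 0.5 cut-off is illustrative, not part of the API contract.

```python
# Flag responses worth a human look: regex-backed AND low routing confidence.
# The 0.5 threshold is an illustrative choice, not an API guarantee.

def needs_review(response: dict, min_confidence: float = 0.5) -> bool:
    regex_backed = response.get("extraction_mode") == "regex_fallback"
    confidence = response.get("controleur", {}).get("confiance_routage", 0.0)
    return regex_backed and confidence < min_confidence
```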

Feature mapping comparison

| Feature | Haiku | Regex fallback |
|---|---|---|
| document_type | From document.type — accurate | Pattern match on FACTURE/email headers — decent |
| has_refs_fournisseur | From lignes[].ref_fournisseur | Regex [A-Z]{2,}-[A-Z0-9]+ — catches VS-TMP10 style refs |
| has_montant | From totaux.ht or line amounts | Regex for EUR / patterns |
| has_client_ref | From document.numero | Regex for FA- / CMD- patterns |
| has_produit_mention | From line items or entity list | Keyword match: verre, trempe, feuillete... |
| ton | From text sentiment analysis | Keyword match: urgent/bonjour/cordialement |
| has_objection | From complaint/negotiation context | Keyword match: reclamation, litige, remise... |
| canal | From source metadata | Default: email (no source in raw text) |
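The fallback column can be sketched as one function. The ref regex is quoted verbatim from the table; the other patterns and keyword lists are illustrative simplifications, not the production rules.

```python
# Sketch of the regex-fallback feature extraction. Only the ref regex is
# taken verbatim from the docs; the rest is illustrative simplification.
import re

def regex_features(text: str) -> dict:
    low = text.lower()
    return {
        "document_type": "facture" if "FACTURE" in text.upper() else "email",
        "has_refs_fournisseur": bool(re.search(r"[A-Z]{2,}-[A-Z0-9]+", text)),
        "has_montant": bool(re.search(r"\d+\s*EUR", text)),
        "has_client_ref": bool(re.search(r"\b(FA|CMD)-", text)),
        "has_produit_mention": any(k in low for k in ("verre", "trempe", "feuillete")),
        "ton": "urgent" if "urgent" in low else "neutre",
        "has_objection": any(k in low for k in ("reclamation", "litige", "remise")),
        "canal": "email",  # no source metadata in raw text
    }
```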

5. EC2 setup: enabling Haiku

The controleur runs on EC2 behind nginx+certbot. Haiku calls go through AWS Bedrock (not direct Anthropic API).

Option A: Bedrock via IAM credentials (current setup)

# On EC2, set credentials in the service environment file:
sudo tee /etc/default/businessclassifier > /dev/null <<EOF
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=IVxL...
AWS_REGION=eu-west-3
EOF
sudo systemctl restart businessclassifier

Option B: Bedrock via instance profile (recommended for prod)

# Attach an IAM role with bedrock:InvokeModel to the EC2 instance.
# No env vars needed — the SDK picks up creds from IMDS automatically.
# Just ensure AWS_REGION is set:
echo "AWS_REGION=eu-west-3" | sudo tee /etc/default/businessclassifier
sudo systemctl restart businessclassifier

Option C: Direct Anthropic API key

# If you have a direct sk-... key:
echo "ANTHROPIC_API_KEY=sk-ant-..." | sudo tee /etc/default/businessclassifier
sudo systemctl restart businessclassifier

Option D: Your .zshrc AWS_BEARER_TOKEN

# If your local .zshrc has AWS_BEARER_TOKEN or AWS session credentials,
# forward them to the EC2 environment:

# 1. Check your local creds
echo $AWS_ACCESS_KEY_ID
echo $AWS_REGION

# 2. Copy to EC2
ssh ubuntu@businessclassifier-ip \
  "echo 'AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN
AWS_REGION=${AWS_REGION:-eu-west-3}' | sudo tee /etc/default/businessclassifier"

# 3. Restart
ssh ubuntu@businessclassifier-ip "sudo systemctl restart businessclassifier"
Session tokens (AWS_SESSION_TOKEN / AWS_BEARER_TOKEN) expire. For production, use Option A (long-lived IAM user keys) or Option B (instance profile). Session tokens are fine for testing.

Option E: No key at all

It still works. The controleur falls back to regex extraction. You get "extraction_mode": "regex_fallback" in the response. Snake still classifies. Quality is lower but the endpoint never fails.
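The options above boil down to a simple environment check. A sketch of how the resulting extraction mode can be anticipated from env vars; the resolution order is an assumption, and Option B (instance profile) is invisible to this check because its credentials come from IMDS, not the environment.

```python
# Predict the extraction mode from environment variables (sketch).
# Assumption: env-based creds only; an instance profile (Option B) would
# still enable Haiku without any of these variables being set.
import os

def expected_mode(env: dict) -> str:
    has_bedrock = "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env
    has_anthropic = env.get("ANTHROPIC_API_KEY", "").startswith("sk-")
    return "haiku" if (has_bedrock or has_anthropic) else "regex_fallback"

if __name__ == "__main__":
    print(expected_mode(dict(os.environ)))
```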

6. Outcome comparison

Same input, two modes

Input: "FACTURE N FA-2026-0891 GlassCorp SA Trempe 10mm x 200 pcs 4200 EUR Total HT 12950 EUR"

With Haiku

{
  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",
      "Probability": {"controle_facture_fournisseur": 1.0, ...},
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 1.0
  },
  "resultat_delegue": {
    "type": "controle_facture",
    "fournisseur": {"match": "GlassCorp SA", "confidence": 0.95},
    "lignes": [{"ref_fournisseur": "VS-TMP10", "ref_interne": {...}, "controle_prix": {...}}],
    "controle_global": {"Prediction": "Conforme", ...},
    "verification_arithmetique": {"coherent": true, ...}
  },
  "extraction_mode": "haiku",
  "latency_ms": ~2000
}

Rich: line items extracted, prices checked, arithmetic verified, fournisseur matched.

Without Haiku (regex fallback)

{
  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",
      "Probability": {"controle_facture_fournisseur": 0.82, ...},
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 0.82
  },
  "resultat_delegue": {
    "type": "controle_facture",
    "fournisseur": {"match": "GlassCorp SA", "confidence": 0.95},
    "lignes": [...],       // regex-extracted, may miss some fields
    "controle_global": {...}
  },
  "extraction_mode": "regex_fallback",
  "latency_ms": ~50
}

Lighter: same routing, lower confidence, regex may miss refs or misparse amounts. But viable.

When does it matter?

| Scenario | Haiku | Regex | Difference |
|---|---|---|---|
| Clean invoice with refs | Correct routing, rich extraction | Correct routing, partial extraction | Low — regex handles invoices well |
| Ambiguous email | Correct routing (context understanding) | May misroute (keyword overlap) | High — Haiku disambiguates |
| OCR / messy text | Handles noise, still extracts | Regex breaks on garbled text | High — Haiku is noise-tolerant |
| Short text (< 20 words) | Good | Depends on keywords present | Medium |
| Mixed language | Handles FR/EN/mixed | FR keywords only | Medium |

7. The 7 task types

| Task | Routes to | Trigger signals |
|---|---|---|
| controle_facture_fournisseur | procurementclassifier | document_type=facture, has_montant, has_refs_fournisseur |
| classification_email | emailclassifier | document_type=email, has_greeting, low montant |
| demande_devis | quoteclassifier | has_produit_mention, no montant, has_greeting |
| reclamation_client | requestclassifier | has_objection, ton=urgent, has_client_ref |
| scoring_prospect | salesagentclassifier | no refs, no montant, emetteur_externe |
| benchmark_offre | benchmarkclassifier | has_refs_fournisseur, has_montant, multiple entities |
| negociation_offre | negociationclassifier | has_objection, has_montant, has_refs_fournisseur |

The dispatch model has 0.726 AUROC on 8 features. It's 3.3× random but not perfect. The confiance_routage field tells you how confident it is. Below 0.5: treat with caution.
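The task-to-delegate mapping and the low-confidence guard above can be expressed as a lookup. An illustrative sketch; the 0.5 cut-off mirrors the caution threshold stated in the text.

```python
# Task -> delegate classifier mapping (from the table above), with the
# documented low-confidence guard. Sketch only; thresholds are illustrative.

ROUTING = {
    "controle_facture_fournisseur": "procurementclassifier",
    "classification_email": "emailclassifier",
    "demande_devis": "quoteclassifier",
    "reclamation_client": "requestclassifier",
    "scoring_prospect": "salesagentclassifier",
    "benchmark_offre": "benchmarkclassifier",
    "negociation_offre": "negociationclassifier",
}

def route(task: str, confidence: float) -> str:
    if confidence < 0.5:
        return "escalade_humaine"  # below 0.5: treat with caution
    return ROUTING.get(task, "escalade_humaine")  # unknown task ("autre") -> human
```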