
/genesis: How /comprendre works

Everything you need to call the controleur, with or without Haiku.

1. What /comprendre does

POST /comprendre is the single entry point for any business interaction. You send raw text — an invoice, an email, a complaint, a quote request, anything — and the controleur:

  1. Extracts structured features from the text (via Claude Haiku or regex fallback)
  2. Classifies the task type via a trained Snake SAT model (7 classes)
  3. Routes to the appropriate sub-classifier in the Monce Suite
  4. Returns the full response: prediction, probabilities, routing decision, XAI audit

It always returns HTTP 200 with a usable JSON payload: no 502s, no 500s. If Haiku is down, it falls back to regex extraction. If Snake fails, it falls back to "autre" with human escalation.
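The degradation chain above can be sketched in a few lines. This is an illustrative Python sketch, not the service's actual code; the helpers (`extract_with_haiku`, `extract_with_regex`, `snake_classify`) are hypothetical stand-ins.

```python
# Illustrative sketch of the controleur's degradation chain.
# All helper functions below are hypothetical stand-ins, not production code.

def extract_with_haiku(text: str) -> dict:
    # Stand-in for the Bedrock call; here it always fails,
    # as it would when no LLM is configured.
    raise RuntimeError("no LLM configured in this sketch")

def extract_with_regex(text: str) -> dict:
    # Simplified keyword fallback.
    return {"document_type": "facture" if "FACTURE" in text.upper() else "email"}

def snake_classify(features: dict) -> tuple[str, dict]:
    # Stand-in for the Snake SAT dispatch model.
    if features.get("document_type") == "facture":
        return "controle_facture_fournisseur", {"controle_facture_fournisseur": 0.82}
    return "classification_email", {"classification_email": 0.60}

def comprendre(text: str, factory_id: int = 3) -> dict:
    # Step 1: feature extraction -- Haiku first, regex on any failure.
    try:
        features, mode = extract_with_haiku(text), "haiku"
    except Exception:
        features, mode = extract_with_regex(text), "regex_fallback"
    # Step 2: dispatch -- "autre" plus human escalation if Snake itself fails.
    try:
        task, probas = snake_classify(features)
        decision = {"action": "router", "escalade": None}
    except Exception:
        task, probas = "autre", {}
        decision = {"action": "escalade_humaine", "escalade": "dispatch_failure"}
    return {
        "factory_id": factory_id,
        "controleur": {"tache_detectee": {"Prediction": task, "Probability": probas}},
        "decision": decision,
        "extraction_mode": mode,
    }
```

The point of the structure is that every failure path still produces a well-formed response, which is why the endpoint never needs to return a 5xx.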

2. How to prompt it

Request

POST /comprendre
Content-Type: application/json

{
  "text": "your raw text here — invoice, email, complaint, anything",
  "factory_id": 3   // optional, defaults to 3. Monce factory ID (1=VIT, 3=Monce, 4=VIP, 9=Euro, 10=TGVI)
}

That's it. Two fields. text is required, factory_id is optional.

What to put in text

| Content type | Example | Expected routing |
|---|---|---|
| Supplier invoice | FACTURE N FA-2026-0891 GlassCorp SA Total HT 12950 EUR | controle_facture_fournisseur → procurement |
| Client email | De: jean@example.fr Objet: Suivi commande Bonjour... | classification_email → email |
| Quote request | Demande devis: 200 pcs trempe 10mm, livraison Lyon | demande_devis → quote |
| Complaint | RECLAMATION: verre casse a la livraison, cmd CMD-2026-412 | reclamation_client → request |
| New prospect | Nouveau contact: Pierre Martin, BatiPro, renovation, Paris | scoring_prospect → salesagent |
| Supplier benchmark | COMPARATIF: AGC 45EUR/u, Saint-Gobain 42EUR/u, Guardian 48EUR/u | benchmark_offre → benchmark |
| Negotiation | NEGOCIATION: GlassCorp, volume 200kEUR, demande remise 10% | negociation_offre → negociation |

You can paste full documents, partial text, structured or unstructured content. The longer and more realistic the text, the better Haiku's extraction. Even a few keywords are enough in regex fallback mode.

curl examples

# Minimal
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"FACTURE GlassCorp Total 12950 EUR"}'

# Full invoice
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"FACTURE N FA-2026-0891\nGlassCorp SA\nClient: Monce SAS\nRef VS-TMP10 Trempe 10mm x 200 pcs 4200 EUR\nTotal HT: 12950 EUR\nTVA: 2590 EUR","factory_id":3}'

# Complaint
curl -X POST https://businessclassifier.aws.monce.ai/comprendre \
  -H "Content-Type: application/json" \
  -d '{"text":"RECLAMATION urgent: livraison verre casse, ref CMD-2026-412, inacceptable"}'

3. What goes in, what comes out

Input data structure

{
  "text": string,       // REQUIRED — raw text content
  "factory_id": int     // optional — default 3
}

Output data structure (always returned)

{
  "factory_id": 3,
  "version": "v0.1.0410",

  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",  // one of 7 task types
      "Probability": {                          // all 7 probabilities, sum to 1.0
        "controle_facture_fournisseur": 0.85,
        "classification_email": 0.05,
        ...
      },
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 0.85
  },

  "resultat_delegue": { ... },  // downstream classifier result (mocked in v0)

  "decision": {
    "action": "router",                    // router | valider_paiement | bloquer_paiement | escalade_humaine
    "raison": "Interaction routee vers...",
    "escalade": null
  },

  "quality_score": 0.81,      // 0-1, derived from routing confidence
  "latency_ms": 1200,          // end-to-end server time

  "xai": {
    "dispatch_audit": "Snake SAT audit trail...",
    "matching_audit": "...",     // null for non-facture tasks
    "anomaly_audit": "...",      // null for non-facture tasks
    "governance": "Dispatch: Snake SAT model. Features: {...}."
  },

  "extraction_mode": "haiku",  // "haiku" or "regex_fallback"
  "mock_downstream": true      // downstream classifiers are mocked in v0
}
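For quick consumption of this payload, a small helper can pull out the routing essentials. The field names match the structure documented above; the helper itself is illustrative.

```python
# Summarize the routing essentials of a /comprendre response.
# Field names follow the documented payload; this helper is illustrative.

def summarize(response: dict) -> str:
    ctrl = response["controleur"]
    tache = ctrl["tache_detectee"]
    return (
        f"{tache['Prediction']} -> {ctrl['classifier_delegue']} "
        f"(conf={ctrl['confiance_routage']:.2f}, mode={response['extraction_mode']})"
    )
```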

4. Two modes: Haiku vs Regex fallback

Mode 1: Haiku enabled (extraction_mode: "haiku")

Claude Haiku reads the full text and extracts rich structured JSON: document type, line items with refs/quantities/prices, totals, payment conditions, entities, resume. These fields are mapped to 8 Snake features for dispatch classification.

Quality: High. Haiku understands context, handles messy OCR, disambiguates overlapping patterns.

Latency: ~1-3s (Bedrock round-trip).

When: AWS Bedrock credentials are set OR ANTHROPIC_API_KEY=sk-... is set.

Mode 2: Regex fallback (extraction_mode: "regex_fallback")

When no LLM is available, the controleur falls back to regex pattern matching: looks for FACTURE, RECLAMATION, EUR amounts, invoice numbers, company names, product references. Extracts what it can, passes to Snake.

Quality: Lower but viable. Works well on structured documents (invoices, comparatives). Struggles with ambiguous emails.

Latency: ~10-50ms (no network call).

When: No API key, Bedrock auth failure, network timeout, quota exceeded. Any Haiku failure triggers fallback silently.

The response always tells you which mode was used via "extraction_mode". Check this field if you need to know whether the classification was backed by LLM or regex.
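A simple guard on that field, for callers that want to treat regex-backed, low-confidence answers differently. The 0.5 cut-off is illustrative, not part of the API contract.

```python
# Flag responses worth a human look: regex-backed AND low routing confidence.
# The 0.5 threshold is an illustrative choice, not an API guarantee.

def needs_review(response: dict, min_confidence: float = 0.5) -> bool:
    regex_backed = response.get("extraction_mode") == "regex_fallback"
    confidence = response.get("controleur", {}).get("confiance_routage", 0.0)
    return regex_backed and confidence < min_confidence
```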

Feature mapping comparison

| Feature | Haiku | Regex fallback |
|---|---|---|
| document_type | From document.type — accurate | Pattern match on FACTURE/email headers — decent |
| has_refs_fournisseur | From lignes[].ref_fournisseur | Regex [A-Z]{2,}-[A-Z0-9]+ — catches VS-TMP10 style refs |
| has_montant | From totaux.ht or line amounts | Regex for EUR / patterns |
| has_client_ref | From document.numero | Regex for FA- / CMD- patterns |
| has_produit_mention | From line items or entity list | Keyword match: verre, trempe, feuillete... |
| ton | From text sentiment analysis | Keyword match: urgent/bonjour/cordialement |
| has_objection | From complaint/negotiation context | Keyword match: reclamation, litige, remise... |
| canal | From source metadata | Default: email (no source in raw text) |
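The fallback column can be sketched as one function. The ref regex is quoted verbatim from the table; the other patterns and keyword lists are illustrative simplifications, not the production rules.

```python
# Sketch of the regex-fallback feature extraction. Only the ref regex is
# taken verbatim from the docs; the rest is illustrative simplification.
import re

def regex_features(text: str) -> dict:
    low = text.lower()
    return {
        "document_type": "facture" if "FACTURE" in text.upper() else "email",
        "has_refs_fournisseur": bool(re.search(r"[A-Z]{2,}-[A-Z0-9]+", text)),
        "has_montant": bool(re.search(r"\d+\s*EUR", text)),
        "has_client_ref": bool(re.search(r"\b(FA|CMD)-", text)),
        "has_produit_mention": any(k in low for k in ("verre", "trempe", "feuillete")),
        "ton": "urgent" if "urgent" in low else "neutre",
        "has_objection": any(k in low for k in ("reclamation", "litige", "remise")),
        "canal": "email",  # no source metadata in raw text
    }
```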

5. EC2 setup: enabling Haiku

The controleur runs on EC2 behind nginx+certbot. Haiku calls go through AWS Bedrock (not direct Anthropic API).

Option A: Bedrock via IAM credentials (current setup)

# On EC2, set credentials in the service environment file:
sudo tee /etc/default/businessclassifier > /dev/null <<EOF
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=IVxL...
AWS_REGION=eu-west-3
EOF
sudo systemctl restart businessclassifier

Option B: Bedrock via instance profile (recommended for prod)

# Attach an IAM role with bedrock:InvokeModel to the EC2 instance.
# No env vars needed — the SDK picks up creds from IMDS automatically.
# Just ensure AWS_REGION is set:
echo "AWS_REGION=eu-west-3" | sudo tee /etc/default/businessclassifier
sudo systemctl restart businessclassifier

Option C: Direct Anthropic API key

# If you have a direct sk-... key:
echo "ANTHROPIC_API_KEY=sk-ant-..." | sudo tee /etc/default/businessclassifier
sudo systemctl restart businessclassifier

Option D: Your .zshrc AWS_BEARER_TOKEN

# If your local .zshrc has AWS_BEARER_TOKEN or AWS session credentials,
# forward them to the EC2 environment:

# 1. Check your local creds
echo $AWS_ACCESS_KEY_ID
echo $AWS_REGION

# 2. Copy to EC2
ssh ubuntu@businessclassifier-ip \
  "echo 'AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN
AWS_REGION=${AWS_REGION:-eu-west-3}' | sudo tee /etc/default/businessclassifier"

# 3. Restart
ssh ubuntu@businessclassifier-ip "sudo systemctl restart businessclassifier"
Session tokens (AWS_SESSION_TOKEN / AWS_BEARER_TOKEN) expire. For production, use Option A (long-lived IAM user keys) or Option B (instance profile). Session tokens are fine for testing.

Option E: No key at all

It still works. The controleur falls back to regex extraction. You get "extraction_mode": "regex_fallback" in the response. Snake still classifies. Quality is lower but the endpoint never fails.
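The options above boil down to a simple environment check. A sketch of how the resulting extraction mode can be anticipated from env vars; the resolution order is an assumption, and Option B (instance profile) is invisible to this check because its credentials come from IMDS, not the environment.

```python
# Predict the extraction mode from environment variables (sketch).
# Assumption: env-based creds only; an instance profile (Option B) would
# still enable Haiku without any of these variables being set.
import os

def expected_mode(env: dict) -> str:
    has_bedrock = "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env
    has_anthropic = env.get("ANTHROPIC_API_KEY", "").startswith("sk-")
    return "haiku" if (has_bedrock or has_anthropic) else "regex_fallback"

if __name__ == "__main__":
    print(expected_mode(dict(os.environ)))
```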

6. Outcome comparison

Same input, two modes

Input: "FACTURE N FA-2026-0891 GlassCorp SA Trempe 10mm x 200 pcs 4200 EUR Total HT 12950 EUR"

With Haiku

{
  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",
      "Probability": {"controle_facture_fournisseur": 1.0, ...},
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 1.0
  },
  "resultat_delegue": {
    "type": "controle_facture",
    "fournisseur": {"match": "GlassCorp SA", "confidence": 0.95},
    "lignes": [{"ref_fournisseur": "VS-TMP10", "ref_interne": {...}, "controle_prix": {...}}],
    "controle_global": {"Prediction": "Conforme", ...},
    "verification_arithmetique": {"coherent": true, ...}
  },
  "extraction_mode": "haiku",
  "latency_ms": ~2000
}

Rich: line items extracted, prices checked, arithmetic verified, fournisseur matched.

Without Haiku (regex fallback)

{
  "controleur": {
    "tache_detectee": {
      "Prediction": "controle_facture_fournisseur",
      "Probability": {"controle_facture_fournisseur": 0.82, ...},
      "method": "snake_sat"
    },
    "classifier_delegue": "procurementclassifier.aws.monce.ai",
    "confiance_routage": 0.82
  },
  "resultat_delegue": {
    "type": "controle_facture",
    "fournisseur": {"match": "GlassCorp SA", "confidence": 0.95},
    "lignes": [...],       // regex-extracted, may miss some fields
    "controle_global": {...}
  },
  "extraction_mode": "regex_fallback",
  "latency_ms": ~50
}

Lighter: same routing, lower confidence, regex may miss refs or misparse amounts. But viable.

When does it matter?

| Scenario | Haiku | Regex | Difference |
|---|---|---|---|
| Clean invoice with refs | Correct routing, rich extraction | Correct routing, partial extraction | Low — regex handles invoices well |
| Ambiguous email | Correct routing (context understanding) | May misroute (keyword overlap) | High — Haiku disambiguates |
| OCR / messy text | Handles noise, still extracts | Regex breaks on garbled text | High — Haiku is noise-tolerant |
| Short text (< 20 words) | Good | Depends on keywords present | Medium |
| Mixed language | Handles FR/EN/mixed | FR keywords only | Medium |

7. The 7 task types

| Task | Routes to | Trigger signals |
|---|---|---|
| controle_facture_fournisseur | procurementclassifier | document_type=facture, has_montant, has_refs_fournisseur |
| classification_email | emailclassifier | document_type=email, has_greeting, low montant |
| demande_devis | quoteclassifier | has_produit_mention, no montant, has_greeting |
| reclamation_client | requestclassifier | has_objection, ton=urgent, has_client_ref |
| scoring_prospect | salesagentclassifier | no refs, no montant, emetteur_externe |
| benchmark_offre | benchmarkclassifier | has_refs_fournisseur, has_montant, multiple entities |
| negociation_offre | negociationclassifier | has_objection, has_montant, has_refs_fournisseur |

The dispatch model has 0.726 AUROC on 8 features. It's 3.3× random but not perfect. The confiance_routage field tells you how confident it is. Below 0.5: treat with caution.
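The task-to-delegate mapping and the low-confidence guard above can be expressed as a lookup. An illustrative sketch; the 0.5 cut-off mirrors the caution threshold stated in the text.

```python
# Task -> delegate classifier mapping (from the table above), with the
# documented low-confidence guard. Sketch only; thresholds are illustrative.

ROUTING = {
    "controle_facture_fournisseur": "procurementclassifier",
    "classification_email": "emailclassifier",
    "demande_devis": "quoteclassifier",
    "reclamation_client": "requestclassifier",
    "scoring_prospect": "salesagentclassifier",
    "benchmark_offre": "benchmarkclassifier",
    "negociation_offre": "negociationclassifier",
}

def route(task: str, confidence: float) -> str:
    if confidence < 0.5:
        return "escalade_humaine"  # below 0.5: treat with caution
    return ROUTING.get(task, "escalade_humaine")  # unknown task ("autre") -> human
```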