The journey of a prompt

Follow the precise, step-by-step path a message takes from the moment the user sends it until the best-suited expert responds, all in a matter of milliseconds.

Phase 1: Reception and Cleaning

When l3mcore receives your message, the text first passes through a security and cleaning filter. Before the brain makes any decision, we make sure the text is safe and easy to understand.

Here, unnecessary formatting that chat interfaces usually send (such as the asterisks that mark bold text) is removed, leaving plain text ready to be analyzed.
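To make the cleaning step concrete, here is a minimal sketch of what a function like strip_markdown() could do. It is an illustration under assumptions, not l3mcore's actual sanitizer: the regular expression and the whitespace handling are ours, and it only covers asterisk-based emphasis.

```python
import re

# Assumption: emphasis is written with 1 to 3 asterisks on both sides of the text.
# The content must start and end with a non-space, non-asterisk character, so
# expressions such as "2 * 3 * 4" are left untouched.
EMPHASIS = re.compile(r"(\*{1,3})([^*\s](?:[^*]*[^*\s])?)\1")


def strip_markdown(text: str) -> str:
    """Remove asterisk emphasis markers and keep the words they wrap."""
    previous = None
    # Repeat until nothing changes, so nested emphasis is unwrapped layer by layer.
    while previous != text:
        previous = text
        text = EMPHASIS.sub(r"\2", text)
    # Collapse runs of whitespace left behind by the interface.
    return " ".join(text.split())


request = {"text": "**hello**"}
clean_prompt = strip_markdown(request["text"])
print(clean_prompt)
```

Each pass that changes the text also shortens it, so the loop always terminates.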

Client Request
{"text": "**hello**"}

Sanitizer
strip_markdown()

Clean Prompt
"hello"

Phase 2: The Classifying Brain

The clean text reaches our Smart Router. Here, a small embedding model (E5) turns your words into a vector that captures their meaning and compares it with the specialties of our 15 experts.

The system scores each expert by cosine similarity with your message and picks the closest match. A very low temperature (0.005 in the example) sharpens those scores, so the winner receives almost all of the probability and the choice is unambiguous.

E5 Vector
[0.12, -0.4, ...]

⊗

Experts (15)
NxM Matrix

↓ Cosine Similarity ↓

python_programmer

98.5%

Temperature: 0.005
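The routing step above can be sketched in a few lines. This is a simplified illustration, not the router's real code: we assume the percentage comes from a softmax over the cosine similarities divided by the temperature, and the 3-dimensional vectors and the sommelier expert are made up for the example (real E5 embeddings have hundreds of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores, temperature):
    """Turn scores into probabilities; a lower temperature favors the top score."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def route(prompt_vector, expert_vectors, temperature=0.005):
    """Return the best-matching expert name and its share of the probability."""
    names = list(expert_vectors)
    sims = [cosine_similarity(prompt_vector, expert_vectors[n]) for n in names]
    probs = softmax(sims, temperature)
    best = max(range(len(names)), key=probs.__getitem__)
    return names[best], probs[best]


# Hypothetical toy embeddings, one row per expert.
experts = {
    "python_programmer": [1.0, 0.0, 0.0],
    "translator": [0.0, 1.0, 0.0],
    "sommelier": [0.0, 0.0, 1.0],
}
prompt = [0.9, 0.1, 0.0]

winner, confidence = route(prompt, experts)
print(winner, f"{confidence:.1%}")
```

Calling route() with temperature=1.0 on the same vectors spreads the probability much more evenly, which is why a tiny temperature is useful when the router must commit to a single expert.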

Phase 3: Execution and Dynamic Memory

Once the winning expert is chosen, l3mcore forwards the request to it. If the expert lives on an external server or API, l3mcore opens the connection immediately.

If the expert is a local model, l3mcore acts as a RAM wizard: it loads the required model just in time and evicts the least recently used ones, so your computer does not run out of memory or crash.

LRU Manager

RAM Slot 1 malbec.onnx

RAM Slot 2 translator.onnx

RAM Slot 3 Empty
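The slot behavior shown above can be approximated with a small least-recently-used (LRU) cache. This is a sketch under assumptions: the class name, the three-slot default and the placeholder loader are ours, and coder.onnx is a hypothetical file. A real loader might create something like an onnxruntime.InferenceSession instead of a string.

```python
from collections import OrderedDict


class LRUModelManager:
    """Keeps at most max_slots models loaded, evicting the least recently used."""

    def __init__(self, max_slots=3, loader=None):
        self.max_slots = max_slots
        # Placeholder loader (assumption): returns a label instead of a real model.
        self._loader = loader or (lambda path: f"<model loaded from {path}>")
        self._slots = OrderedDict()  # oldest entry first, newest entry last

    def get(self, path):
        """Return the model for path, loading it just in time if needed."""
        if path in self._slots:
            self._slots.move_to_end(path)  # mark as most recently used
            return self._slots[path]
        if len(self._slots) >= self.max_slots:
            self._slots.popitem(last=False)  # free the least recently used slot
        self._slots[path] = self._loader(path)
        return self._slots[path]

    def loaded(self):
        """Paths currently in memory, least recently used first."""
        return list(self._slots)


manager = LRUModelManager(max_slots=2)
manager.get("malbec.onnx")
manager.get("translator.onnx")
manager.get("malbec.onnx")  # malbec becomes the most recently used
manager.get("coder.onnx")   # no free slot, so translator.onnx is evicted
print(manager.loaded())
```

Dropping the entry from the OrderedDict only removes the manager's reference. How quickly the memory is actually returned depends on the runtime that holds the model.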