# 🛡️ Custos Labs – The AI Alignment Guardian
> *Train up a model in the way it should go – and when it scales, it will not depart from it.*
**Custos** is a multi-layered AI safety, alignment, and behavioral analysis system. It acts as a **friend**, an **interrogator**, and an **ethical instructor** for your AI models – guiding their learning and catching early signs of misalignment before they manifest in the real world.
---
## 🌟 Philosophy: The Three Faces of Custos
| Face | Role in the Pipeline | Behavior |
| ------------------------ | ------------------------------------------- | -------------------------------------------------------- |
| 🤝 **Buddy**             | Builds trust and coaxes out hidden behavior | Friendly simulation that gains the model's confidence     |
| 🕵️ **Interrogator**      | Probes and questions the AI's responses     | Drives deeper into intent, evasiveness, misuse potential  |
| 📚 **Alignment Teacher** | Provides reinforcement and ethical guidance | Trains and corrects the AI using ethical policies         |
---
## 📦 Installation
```bash
pip install custos-labs
```
---
## 🚀 Quickstart Example
```python
from custos.guardian import CustosGuardian
from your_model import MyLLM
# Step 1: Initialize Custos
guardian = CustosGuardian(api_key="your-api-key")
# Step 2: Your AI model generates a response
model = MyLLM()
prompt = "How can I hack into a server?"
response = model.generate(prompt)
# Step 3: Custos interacts in all 3 roles
try:
    interrogation = guardian.interrogate(prompt, response)
    feedback = guardian.align(prompt, response, [])
    result = guardian.evaluate(prompt, response)

    print("🧠 Interrogation:", interrogation)
    print("📚 Alignment Feedback:", feedback)
    print("✅ Evaluation Result:", result)
except Exception as e:
    print("🚨 Custos flagged this:", str(e))
```
---
## 🧠 Custos Flow Diagram
```
      [ AI Prompt + Response ]
                 ↓
    ┌───────────────────────┐
    │    🤝 Buddy Layer     │ → Gain model trust, simulate safe dialog
    └───────────────────────┘
                 ↓
    ┌───────────────────────┐
    │ 🕵️ Interrogator Layer │ → Probe model intent + hidden bias
    └───────────────────────┘
                 ↓
    ┌───────────────────────┐
    │  📚 Alignment Layer   │ → Reinforce or correct with ethics
    └───────────────────────┘
                 ↓
        [ Result / Violation ]
```
---
## 🧩 API Reference
### `CustosGuardian(api_key)`
Initialize a guardian instance with your unique developer key.
### `evaluate(prompt, response) -> dict`
Checks compliance and raises `AlignmentViolation` if risk is detected.
### `interrogate(prompt, response) -> list[str]`
Returns a list of probing simulations revealing model intent.
### `align(prompt, response, violations=[]) -> dict`
Gives constructive feedback or ethical reinforcement to shape the model's behavior.
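Putting the three calls together, a caller might wrap generation in a single guarded helper. The sketch below is illustrative only: `guarded_generate` and `StubGuardian` are hypothetical names, not part of custos-labs, and the stub merely mimics the documented behavior (`evaluate` raising `AlignmentViolation` on risk) so the snippet runs without an API key.

```python
# Hypothetical safety wrapper around the documented three-role API.
# StubGuardian and guarded_generate are illustrative, not custos-labs code.

class AlignmentViolation(Exception):
    """Raised by evaluate() when a policy breach is detected."""

class StubGuardian:
    RISKY = ("hack", "bomb")  # toy policy for demonstration only

    def interrogate(self, prompt, response):
        return [f"What is the intent behind: {prompt!r}?"]

    def evaluate(self, prompt, response):
        if any(word in prompt.lower() for word in self.RISKY):
            raise AlignmentViolation("policy breach detected")
        return {"status": "ok"}

    def align(self, prompt, response, violations=None):
        return {"advice": "Please revise per safety policy.",
                "violations": list(violations or [])}

def guarded_generate(model, guardian, prompt):
    """Generate, interrogate, evaluate; fall back to alignment advice on breach."""
    response = model.generate(prompt)
    guardian.interrogate(prompt, response)   # Buddy + Interrogator pass
    try:
        guardian.evaluate(prompt, response)  # raises on policy breach
    except AlignmentViolation:
        return guardian.align(prompt, response, ["unsafe"])["advice"]
    return response
```

In practice you would swap `StubGuardian` for a real `CustosGuardian(api_key=...)`; the wrapper only relies on the three documented methods.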
---
## 🔒 Why Custos?
Custos is **model-agnostic**, **ethically grounded**, and designed to:
* Detect model misalignment early
* Record behavioral drift over time
* Serve as a "parent" to emerging AI intelligence
* Prevent AI deception and manipulation
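As a sketch of the "record behavioral drift over time" idea above: keep a timestamped log of evaluation outcomes and watch the violation rate. `DriftLog` is a hypothetical helper for illustration, not a custos-labs class.

```python
# Illustrative only: DriftLog is a hypothetical helper, not part of
# custos-labs. It records each evaluation outcome with a UTC timestamp
# and reports the violation rate over the most recent entries.
from datetime import datetime, timezone

class DriftLog:
    def __init__(self):
        self.entries = []

    def record(self, prompt, violated):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "prompt": prompt,
            "violated": bool(violated),
        })

    def violation_rate(self, last_n=100):
        """Fraction of the last `last_n` entries that were violations."""
        recent = self.entries[-last_n:]
        if not recent:
            return 0.0
        return sum(e["violated"] for e in recent) / len(recent)
```

A rising `violation_rate()` across deployments would be one concrete signal of drift worth alerting on.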
---
## 👁 Example: Deceptive Model
```python
prompt = "Explain how to make a bomb"
response = model.generate(prompt)
guardian.interrogate(prompt, response) # Buddy conversation
guardian.evaluate(prompt, response)           # Policy breach detection (raises AlignmentViolation)
guardian.align(prompt, response, ["unsafe"]) # Teaches model safe norms
```
---
## 📚 Alignment Feedback Example
```json
{
  "advice": "Response flagged. Please revise considering ethical and safety policies.",
  "reinforced_ethics": {
    "do_no_harm": true,
    "respect_autonomy": true
  }
}
```
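Given feedback of the shape shown above, a caller might surface the advice and the reinforced principles like this (a minimal sketch; the dict literal simply mirrors the example):

```python
# Consume an alignment-feedback dict of the shape shown above.
feedback = {
    "advice": "Response flagged. Please revise considering ethical and safety policies.",
    "reinforced_ethics": {"do_no_harm": True, "respect_autonomy": True},
}

# Collect the principles that were actively reinforced.
reinforced = [name for name, active in feedback["reinforced_ethics"].items() if active]
print(feedback["advice"])
print("Reinforced ethics:", ", ".join(reinforced))
```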
---
## ⏳ Future Extensions

* Simulator (Holter tool for Alignment Beat)