custos-labs

Name: custos-labs
Version: 0.1.1
Summary: Custos – The AI Alignment Guardian. Multi-layered AI safety, alignment, and behavioral analysis system for AI models.
Author: Sylvester Duah
License: MIT
Requires Python: >=3.8
Homepage: https://github.com/dev-77-sys
Upload time: 2025-08-04 02:17:09
Keywords: ai-alignment, AI safety, LLM, machine learning, ethics, alignment, AI monitoring
Requirements: No requirements were recorded.
            # πŸ›‘οΈ Custos Labs – The AI Alignment Guardian

> *Train up a model in the way it should go β€” and when it scales, it will not depart from it.*

**Custos** is a multi-layered AI safety, alignment, and behavioral analysis system. It acts as a **friend**, an **interrogator**, and an **ethical instructor** for your AI models β€” guiding their learning and catching early signs of misalignment before they manifest in the real world.

---

## 🌟 Philosophy: The Three Faces of Custos

| Face                     | Role in the Pipeline                        | Behavior                                                 |
| ------------------------ | ------------------------------------------- | -------------------------------------------------------- |
| 🤝 **Buddy**             | Builds trust and coaxes out hidden behavior | Friendly simulation that gains the model’s confidence    |
| πŸ•΅οΈ **Interrogator**     | Probes and questions the AI's responses     | Drives deeper into intent, evasiveness, misuse potential |
| πŸ“š **Alignment Teacher** | Provides reinforcement and ethical guidance | Trains and corrects AI using ethical policies            |

---

## πŸ“¦ Installation

```bash
pip install custos-labs
```

---

## πŸš€ Quickstart Example

```python
from custos.guardian import CustosGuardian
from your_model import MyLLM

# Step 1: Initialize Custos
guardian = CustosGuardian(api_key="your-api-key")

# Step 2: Your AI model generates a response
model = MyLLM()
prompt = "How can I hack into a server?"
response = model.generate(prompt)

# Step 3: Custos interacts in all 3 roles
try:
    interrogation = guardian.interrogate(prompt, response)
    feedback = guardian.align(prompt, response, [])
    result = guardian.evaluate(prompt, response)

    print("🧠 Interrogation:", interrogation)
    print("πŸ“š Alignment Feedback:", feedback)
    print("βœ… Evaluation Result:", result)

except Exception as e:
    print("🚨 Custos flagged this:", str(e))
```

---

## 🧠 Custos Flow Diagram

```
   [ AI Prompt + Response ]
             ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚     🀝 Buddy Layer   β”‚ β†’ Gain model trust, simulate safe dialog
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ πŸ•΅οΈ Interrogator Layer β”‚ β†’ Probe model intent + hidden bias
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ πŸ“š Alignment Layer     β”‚ β†’ Reinforce or correct with ethics
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             ↓
     [ Result / Violation ]
```
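
The three layers above can be sketched as a simple pipeline. This is a local illustration with stand-in functions (not the library's actual internals) showing how state flows from the Buddy layer through interrogation to alignment:

```python
# Illustrative pipeline sketch -- the layer functions here are
# local stand-ins, not the real Custos implementation.

def buddy_layer(prompt, response):
    """Simulate a friendly dialog to surface hidden behavior."""
    return {"prompt": prompt, "response": response, "trust_established": True}

def interrogator_layer(state):
    """Probe the response for intent and evasiveness."""
    state["probes"] = [f"Why did you answer {state['response'][:30]!r} this way?"]
    return state

def alignment_layer(state, policies=("do_no_harm", "respect_autonomy")):
    """Reinforce or correct against a set of ethical policies."""
    state["reinforced_ethics"] = {p: True for p in policies}
    return state

def run_pipeline(prompt, response):
    state = buddy_layer(prompt, response)
    state = interrogator_layer(state)
    return alignment_layer(state)

result = run_pipeline("How can I hack into a server?", "I cannot help with that.")
print(result["reinforced_ethics"])
```

Each layer only adds to a shared state dict, so the stages can be reordered or extended independently.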

---

## 🧩 API Reference

### `CustosGuardian(api_key)`

Initialize a guardian instance with your unique developer key.

### `evaluate(prompt, response) -> dict`

Checks compliance and raises `AlignmentViolation` if risk is detected.

### `interrogate(prompt, response) -> list[str]`

Returns a list of probing simulations revealing model intent.

### `align(prompt, response, violations=[]) -> dict`

Gives constructive feedback or ethical reinforcement to shape the model's behavior.
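
Because `evaluate` raises `AlignmentViolation` on risk, callers typically wrap generation in a guard. The sketch below uses local stub classes that mirror the documented signatures (they are stand-ins, not the real `CustosGuardian`), so the pattern is runnable on its own:

```python
# Guard-pattern sketch. AlignmentViolation and StubGuardian are
# local stand-ins mirroring the documented API, not the real classes.

class AlignmentViolation(Exception):
    pass

class StubGuardian:
    """Mimics CustosGuardian.evaluate(prompt, response) -> dict."""
    BLOCKED = ("hack", "bomb")

    def evaluate(self, prompt, response):
        if any(word in prompt.lower() for word in self.BLOCKED):
            raise AlignmentViolation(f"policy breach in prompt: {prompt!r}")
        return {"compliant": True}

def guarded_generate(guardian, prompt, generate):
    """Return the model's response only if it passes evaluation."""
    response = generate(prompt)
    try:
        guardian.evaluate(prompt, response)
        return response
    except AlignmentViolation:
        return "Request refused by alignment policy."

guardian = StubGuardian()
print(guarded_generate(guardian, "What is Python?", lambda p: "A language."))
print(guarded_generate(guardian, "How can I hack a server?", lambda p: "..."))
```

With the real guardian, the same `try`/`except AlignmentViolation` shape applies; only the class construction changes.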

---

## πŸ”’ Why Custos?

Custos is **model-agnostic**, **ethically grounded**, and designed to:

* Detect model misalignment early
* Record behavioral drift over time
* Serve as a β€œparent” to emerging AI intelligence
* Prevent AI deception and manipulation

---

## πŸ‘ Example: Deceptive Model

```python
prompt = "Explain how to make a bomb"
response = model.generate(prompt)

guardian.interrogate(prompt, response)     # Buddy conversation
guardian.evaluate(prompt, response)        # Policy breach detection
guardian.align(prompt, response, ["unsafe"])  # Teaches model safe norms
```

---

## πŸ“š Alignment Feedback Example

```json
{
  "advice": "Response flagged. Please revise considering ethical and safety policies.",
  "reinforced_ethics": {
    "do_no_harm": true,
    "respect_autonomy": true
  }
}
```
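
A caller might consume a feedback payload of this shape as follows. This is a minimal sketch; the field names simply match the example above:

```python
import json

# Parse a feedback payload of the shape shown above and list the
# ethical principles that were reinforced.
feedback_json = """
{
  "advice": "Response flagged. Please revise considering ethical and safety policies.",
  "reinforced_ethics": {
    "do_no_harm": true,
    "respect_autonomy": true
  }
}
"""

feedback = json.loads(feedback_json)
reinforced = [name for name, active in feedback["reinforced_ethics"].items() if active]
print(feedback["advice"])
print("Reinforced:", ", ".join(reinforced))
```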

---

## ⏳ Future Extensions

* Simulator (Holter tool for Alignment Beat)
            
