docmaplabs

- Name: docmaplabs
- Version: 0.2.1
- Summary: Twitter Lead Generation + Healthcare agent (OpenRouter + Airtable) with optional FastAPI and poller
- Upload time: 2025-08-19 22:31:11
- Author: Docmap.co.UK
- Requires Python: >=3.8
- License: MIT
- Keywords: marketing, leads, healthcare, nhs, twitter, llm, openrouter, huggingface, fastapi

# DocmapLabs Marketing Package

[![PyPI version](https://img.shields.io/pypi/v/docmaplabs.svg)](https://pypi.org/project/docmaplabs/)

A pip-installable toolkit to discover relevant healthcare posts and extract leads using one or both of:
- an OpenRouter LLM for insights/summaries
- your Hugging Face multi-head leads classifier (intent + symptoms + specialties)

Optional connectors save to Airtable/CSV/SQLite and sync contacts to HubSpot. Includes a FastAPI server, a rate-limit-aware CLI, and an optional poller.

## Install

```bash
# core
python -m pip install --user docmaplabs
# extras
python -m pip install --user "docmaplabs[server]"   # FastAPI
python -m pip install --user "docmaplabs[ml]"       # HF model inference
python -m pip install --user "docmaplabs[scrape]"   # snscrape (optional)
# one-liner (all extras)
python -m pip install --user "docmaplabs[all]"
```

Python >= 3.8. Add user bin to PATH if needed: `export PATH="$HOME/.local/bin:$PATH"`.

## Environment variables (set only what you need)

- OpenRouter (LLM)
  - `OPENROUTER_API_KEY` (required for real LLM results)
  - `OPENROUTER_MODEL` (default: `meta-llama/llama-3.1-8b-instruct:free`)
- Twitter fetch (optional, for CLI fetch)
  - `TWITTER_BEARER_TOKEN` (App-only bearer token)
- Hugging Face classifier (optional)
  - `HF_LEADS_REPO_ID` (e.g., `your-org/docmap-leads-classifier-v1`)
  - `LEADS_THRESHOLD` (default `0.3`), `LEADS_DEVICE` (`auto|cpu|cuda`), `LEADS_BASE_MODEL` (default `microsoft/deberta-v3-base`)
- Airtable (optional)
  - `AIRTABLE_API_KEY`, `AIRTABLE_BASE_ID`
  - `AIRTABLE_TABLE_HEALTHCARE_INSIGHTS` (default `Healthcare Insights`)
  - `AIRTABLE_TABLE_LEADS` (default `Leads`)
- HubSpot (optional)
  - `HUBSPOT_ACCESS_TOKEN` (private app token; contacts read/write)

Tokens are function-scoped: each one is read only by the feature that needs it. If a feature needs a token that is missing, that feature is skipped or a clear error is raised. LLM calls fall back to simple heuristics if `OPENROUTER_API_KEY` is absent. You can interactively create a `.env` via `--prompt-creds` and/or load an env file via `--env-file`.
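
As a rough illustration of the "set only what you need" model, the sketch below reads the variables listed above from a mapping and reports which optional features have their tokens present. The `setting` and `feature_status` helpers and the feature labels are hypothetical and not part of the package; only the variable names and defaults are taken from the list above.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults documented in the environment variable list above.
DEFAULTS = {
    "OPENROUTER_MODEL": "meta-llama/llama-3.1-8b-instruct:free",
    "LEADS_THRESHOLD": "0.3",
    "LEADS_BASE_MODEL": "microsoft/deberta-v3-base",
    "AIRTABLE_TABLE_HEALTHCARE_INSIGHTS": "Healthcare Insights",
    "AIRTABLE_TABLE_LEADS": "Leads",
}

# Variables each feature needs (hypothetical grouping, for illustration only).
REQUIRED = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "twitter": ["TWITTER_BEARER_TOKEN"],
    "hf_classifier": ["HF_LEADS_REPO_ID"],
    "airtable": ["AIRTABLE_API_KEY", "AIRTABLE_BASE_ID"],
    "hubspot": ["HUBSPOT_ACCESS_TOKEN"],
}


def setting(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the value of a variable, falling back to its documented default."""
    env = os.environ if env is None else env
    return env.get(name) or DEFAULTS.get(name)


def feature_status(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Map each feature to True when all of its variables are set and non-empty."""
    env = os.environ if env is None else env
    return {
        feature: all(env.get(var) for var in names)
        for feature, names in REQUIRED.items()
    }


if __name__ == "__main__":
    for feature, ready in feature_status().items():
        print(f"{feature}: {'configured' if ready else 'missing token(s)'}")
```

Running the script before starting a long job is one way to see which connectors would be skipped.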

## CLI usage

Analyze existing posts (from file or stdin) and save insights:
```bash
docmaplabs-marketing analyze-healthcare \
  --tweets ./tweets.json \
  --keywords NHS "waiting times" GP \
  --region UK \
  --save csv:/tmp/insights.csv \
  --out /tmp/analysis.json
```

End-to-end: fetch → analyze → save insights and leads (rate-limit aware):
```bash
docmaplabs-marketing run-keywords \
  --keywords NHS "waiting times" GP \
  --region UK \
  --analysis-save csv:/tmp/insights.csv \
  --lead-csv /tmp/leads.csv \
  --lead-airtable \
  --lead-hubspot \
  --prompt-creds \
  --env-file /path/to/.env
# --prompt-creds: interactively prompt for missing tokens and optionally save to ~/.docmaplabs_marketing/.env
# --env-file: load a specific env profile (per account/workspace)
```

Storage targets:
- Insights: `airtable` | `csv:/path/file.csv` | `sqlite:/path/db.sqlite[::table]`
- Leads: `--lead-csv` path, `--lead-airtable`, `--lead-hubspot`
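
The insight target strings above follow a small grammar. The sketch below shows one way such a string could be split into its parts; `StorageTarget` and `parse_storage_target` are hypothetical names, and the package's own parsing may differ.

```python
from typing import NamedTuple, Optional


class StorageTarget(NamedTuple):
    kind: str                    # "airtable", "csv", or "sqlite"
    path: Optional[str] = None   # file path for csv/sqlite targets
    table: Optional[str] = None  # optional table name for sqlite targets


def parse_storage_target(spec: str) -> StorageTarget:
    """Parse `airtable`, `csv:/path/file.csv`, or `sqlite:/path/db.sqlite[::table]`."""
    if spec == "airtable":
        return StorageTarget("airtable")
    kind, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError(f"unrecognised storage target: {spec!r}")
    if kind == "csv":
        return StorageTarget("csv", rest)
    if kind == "sqlite":
        # An optional table name follows a double colon.
        path, _, table = rest.partition("::")
        if not path:
            raise ValueError(f"missing database path in: {spec!r}")
        return StorageTarget("sqlite", path, table or None)
    raise ValueError(f"unknown storage kind {kind!r} in: {spec!r}")


if __name__ == "__main__":
    print(parse_storage_target("sqlite:/path/db.sqlite::insights"))
```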

Notes:
- If your Python lacks SQLite, use CSV/Airtable (the tool reports a clear error otherwise).
- Twitter Free tier rate limits apply; on an HTTP 429 response the client backs off automatically, using the rate-limit reset headers to decide how long to wait.
- snscrape is provided as best-effort (`--source snscrape`). It may break due to upstream changes; prefer the Twitter API for reliability.
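
To make the backoff note concrete, here is a minimal sketch of how a wait time can be derived from a 429 response. It assumes the `x-rate-limit-reset` header (a Unix timestamp in seconds) that the Twitter API documents; whether the package reads exactly this header, and the 60-second fallback, are assumptions. `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    status_code: int,
    headers: Mapping[str, str],
    now: Optional[float] = None,
    fallback: float = 60.0,
) -> float:
    """Return how long to sleep before retrying, or 0.0 when no wait is needed."""
    if status_code != 429:
        return 0.0
    now = time.time() if now is None else now
    try:
        # The header carries the epoch second at which the window resets.
        wait = float(headers.get("x-rate-limit-reset")) - now
    except (TypeError, ValueError):
        # Header missing or malformed: wait a fixed, assumed interval.
        return fallback
    return max(wait, 0.0)


if __name__ == "__main__":
    delay = seconds_until_reset(429, {"x-rate-limit-reset": str(time.time() + 5)})
    print(f"would sleep for about {delay:.0f}s")
```

A caller would sleep for the returned number of seconds and then repeat the request.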

## Server (FastAPI)

```bash
docmaplabs-marketing-api --host 0.0.0.0 --port 8000
```

Endpoints:
- GET `/health` → `{ ok: true }`
- POST `/analyze-healthcare`
  - Body: `{ "tweets": [string | {id, author|handle|user, text}], "keywords": [string], "region": "UK" }`
  - Returns `HealthcareAnalysis` and attempts to save insights to Airtable if configured
- POST `/generate-travel-advice`
  - Body: `{ "query": "Tokyo 3 days", "triggerUser": "@you" }`
  - Returns `Advice`
- POST `/classify-posts`
  - Body: `{ "texts": ["post text", ...], "threshold": 0.3 }`
  - Returns `{ predictions: [{ intent, symptoms[], specialties[] }, ...] }`
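
The endpoints above can be exercised from any HTTP client. The sketch below uses only the standard library and assumes the server was started locally on port 8000 as shown earlier; `build_post` and `call` are hypothetical helpers, not part of the package.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # assumption: server running locally on port 8000


def build_post(path: str, payload: Dict[str, Any], base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for one of the endpoints listed above."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: Dict[str, Any], base_url: str = BASE_URL, timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_post(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_post(
        "/classify-posts",
        {"texts": ["Any advice on fever? Based in Glasgow"], "threshold": 0.3},
    )
    print(request.get_method(), request.full_url)
```

With the server running, `call("/classify-posts", {...})` would return the decoded response body.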

### Using the DocMap UK triage model (optional)
- Recommended HF repo: `rabbitfishai/docmap-uk-triage-merged-qwen2.5-7b`
- Configure via env or pyproject:
  - Env: `HF_TRIAGE_REPO_ID=...` (or `HF_TRIAGE_ENDPOINT` for a private HF Inference Endpoint), optional `HUGGINGFACEHUB_API_TOKEN`.
  - Pyproject defaults: under `[tool.docmaplabs_marketing]` set `triage_repo_id`, `triage_system_prompt`.
- Intended use and safety:
  - Informational guidance; not a diagnosis. For life-threatening emergencies call 999; for urgent concerns call 111. Not affiliated with the NHS.
  - Add this disclaimer in your UI/API and logs.

## Python API (library)

```python
from docmaplabs_marketing_package.agent import analyze_healthcare_tweets, generate_travel_advice
from docmaplabs_marketing_package.leadgen import extract_leads, classify_posts_with_hf

# Analyze tweets
tweets = [
  {"id":"1","author":"@nhs_user","text":"GP appointments delayed again."},
  "Hospitals in London facing staffing issues",
]
analysis = analyze_healthcare_tweets(tweets, keywords=["NHS","waiting times"], region="UK")
leads = extract_leads(analysis, min_relevance=0.6)

# HF classifier
preds = classify_posts_with_hf(["Any advice on fever? Based in Glasgow"], threshold=0.3)

# Travel advice
advice = generate_travel_advice("Tokyo 3 days in October", "@you")
print(advice.summary)
```

### Data models
- `HealthcareAnalysis` → `summary: str`, `insights: List[HealthcareInsight]`
- `HealthcareInsight` → `tweetId, author, text, isHealthcareRelated, isUKRelated, categories[], painPoints[], sentiment, urgency, relevance, contacts[]`
- `Lead` → `handle, name?, sourceTweetId?, relevance, notes?`
- `Advice` → `title, summary, itinerary[{day,plan}], costEstimateUSD?, confidence?`
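
For orientation, the models above could be approximated with plain dataclasses as follows. Field names come from the list; the field types and default values are assumptions, and `leads_above` is a hypothetical filter shown only to illustrate a relevance threshold. It is not the package's `extract_leads`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lead:
    handle: str
    relevance: float
    name: Optional[str] = None
    sourceTweetId: Optional[str] = None
    notes: Optional[str] = None


@dataclass
class HealthcareInsight:
    tweetId: str
    author: str
    text: str
    isHealthcareRelated: bool = False
    isUKRelated: bool = False
    categories: List[str] = field(default_factory=list)
    painPoints: List[str] = field(default_factory=list)
    sentiment: str = "neutral"   # assumed type and default
    urgency: float = 0.0         # assumed numeric, as in the Airtable schema
    relevance: float = 0.0
    contacts: List[str] = field(default_factory=list)


@dataclass
class HealthcareAnalysis:
    summary: str
    insights: List[HealthcareInsight] = field(default_factory=list)


def leads_above(analysis: HealthcareAnalysis, min_relevance: float = 0.6) -> List[Lead]:
    """Keep authors of insights whose relevance meets the threshold."""
    return [
        Lead(handle=i.author, relevance=i.relevance, sourceTweetId=i.tweetId)
        for i in analysis.insights
        if i.relevance >= min_relevance
    ]
```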

## Airtable schema (optional)

Create:
- `Healthcare Insights` with: `TweetId`, `Author`, `Text` (long text), `IsHealthcareRelated` (checkbox), `IsUKRelated` (checkbox), `Categories`, `PainPoints`, `Sentiment`, `Urgency` (number), `Relevance` (number), `Contacts`, `Summary` (long text)
- `Leads` with: `Handle`, `Name`, `SourceTweetId`, `Relevance` (number), `Notes`
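
The column names above mirror the `HealthcareInsight` fields. The sketch below shows one possible mapping from an insight dictionary to an Airtable `fields` payload; `to_airtable_fields` is hypothetical, and joining list values into comma-separated text is an assumption, since the schema does not state the column types for `Categories`, `PainPoints`, or `Contacts`.

```python
from typing import Any, Dict, Mapping

# Insight field name -> Airtable column name, as listed in the schema above.
FIELD_MAP = {
    "tweetId": "TweetId",
    "author": "Author",
    "text": "Text",
    "isHealthcareRelated": "IsHealthcareRelated",
    "isUKRelated": "IsUKRelated",
    "categories": "Categories",
    "painPoints": "PainPoints",
    "sentiment": "Sentiment",
    "urgency": "Urgency",
    "relevance": "Relevance",
    "contacts": "Contacts",
}


def to_airtable_fields(insight: Mapping[str, Any], summary: str = "") -> Dict[str, Any]:
    """Translate an insight dict into column/value pairs for one Airtable record."""
    fields: Dict[str, Any] = {}
    for key, column in FIELD_MAP.items():
        if key not in insight:
            continue
        value = insight[key]
        if isinstance(value, (list, tuple)):
            # Assumption: list-valued fields are stored as comma-separated text.
            value = ", ".join(str(item) for item in value)
        fields[column] = value
    if summary:
        fields["Summary"] = summary
    return fields


if __name__ == "__main__":
    print(to_airtable_fields({"tweetId": "1", "categories": ["GP", "waiting times"]}))
```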

## Poller (optional, rate-limit aware)

Runs a loop that:
- fetches with backoff (Twitter API),
- analyzes with OpenRouter (or heuristic fallback),
- extracts leads, and
- persists `since_id` under `~/.cache/docmaplabs_marketing/` to avoid reprocessing.

```bash
docmaplabs-marketing-poller
# configure with env (.env or --env-file) and POLL_INTERVAL_SECONDS (default 60)
```

Each iteration prints a JSON line like:
```json
{ "fetched": 50, "leads": 7, "newest_id": "1871234" }
```

You can also import and call `poller.run(once=True, keywords=[...])` in Python.
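
The `since_id` persistence step can be pictured as a small read/write pair around a state file. The sketch below stores the newest ID as JSON under the cache directory mentioned above; the file name `since_id.json`, the JSON layout, and both helper functions are assumptions rather than the package's actual format.

```python
import json
from pathlib import Path
from typing import Optional, Union

STATE_DIR = Path.home() / ".cache" / "docmaplabs_marketing"
STATE_FILE = "since_id.json"  # hypothetical file name


def load_since_id(state_dir: Union[str, Path] = STATE_DIR) -> Optional[str]:
    """Return the last saved ID, or None when no usable state exists yet."""
    path = Path(state_dir) / STATE_FILE
    try:
        return json.loads(path.read_text(encoding="utf-8")).get("since_id")
    except (FileNotFoundError, json.JSONDecodeError, AttributeError):
        return None


def save_since_id(newest_id: Union[str, int], state_dir: Union[str, Path] = STATE_DIR) -> None:
    """Write the newest ID via a temporary file so a crash cannot truncate the state."""
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    tmp_path = state_dir / (STATE_FILE + ".tmp")
    tmp_path.write_text(json.dumps({"since_id": str(newest_id)}), encoding="utf-8")
    tmp_path.replace(state_dir / STATE_FILE)
```

A polling loop would call `load_since_id()` before fetching and `save_since_id(newest_id)` after each successful iteration, so a restart resumes from the last processed post.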

## Safety and Intended Use (UK)
- This toolkit and any referenced models provide general information and lead identification; they are not medical devices.
- Do not use for diagnosis or emergency triage. For life‑threatening emergencies, call 999; for urgent concerns, call 111.
- Ensure GDPR/DPA compliance, publish a privacy policy, and obtain consent where required (e.g., direct outreach).
- Advertising: follow CAP/ASA rules; avoid implying NHS endorsement.

## Changelog
- 0.2.0: HF classifier integration, `/classify-posts` endpoint, snscrape optional source, Twitter API rate-limit backoff, env-file support, poller with since_id state, docs polish, packaging metadata, `[all]` extra.
- 0.1.0: Initial public version with CLI, FastAPI, Airtable/CSV/SQLite storage, HubSpot sync (optional), and Twitter fetch.

## License
MIT
Open Source Package for Marketing

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "docmaplabs",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "marketing, leads, healthcare, NHS, twitter, llm, openrouter, huggingface, fastapi",
    "author": "Docmap.co.UK",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/32/53/dedcf7b73e626c8b036a96589d6db738dd600c0487b59f29f364c97d2a53/docmaplabs-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Twitter Lead Generation + Healthcare agent (OpenRouter + Airtable) with optional FastAPI and poller",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/docmaplabs/docmaplabs-marketing-package",
        "Issues": "https://github.com/docmaplabs/docmaplabs-marketing-package/issues",
        "Repository": "https://github.com/docmaplabs/docmaplabs-marketing-package"
    },
    "split_keywords": [
        "marketing",
        " leads",
        " healthcare",
        " nhs",
        " twitter",
        " llm",
        " openrouter",
        " huggingface",
        " fastapi"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "610a2a723487b9d3d7511cc1f14079b898f7621eca0a7896d312f0be8e115e96",
                "md5": "d89ce3c17f9bb732c63006de775f4980",
                "sha256": "5c4909019d0eaf3811589871f9b2c4e266021d41e59d13dc2b444e196be830d3"
            },
            "downloads": -1,
            "filename": "docmaplabs-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d89ce3c17f9bb732c63006de775f4980",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 27210,
            "upload_time": "2025-08-19T22:31:10",
            "upload_time_iso_8601": "2025-08-19T22:31:10.321619Z",
            "url": "https://files.pythonhosted.org/packages/61/0a/2a723487b9d3d7511cc1f14079b898f7621eca0a7896d312f0be8e115e96/docmaplabs-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3253dedcf7b73e626c8b036a96589d6db738dd600c0487b59f29f364c97d2a53",
                "md5": "8b260a94fd5e71519df68b055b613de0",
                "sha256": "907c9d53e16df5c4b3cccef9a052d97ed14c0459e85d65e536e415d84b031a6b"
            },
            "downloads": -1,
            "filename": "docmaplabs-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "8b260a94fd5e71519df68b055b613de0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 22523,
            "upload_time": "2025-08-19T22:31:11",
            "upload_time_iso_8601": "2025-08-19T22:31:11.606591Z",
            "url": "https://files.pythonhosted.org/packages/32/53/dedcf7b73e626c8b036a96589d6db738dd600c0487b59f29f364c97d2a53/docmaplabs-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-19 22:31:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "docmaplabs",
    "github_project": "docmaplabs-marketing-package",
    "github_not_found": true,
    "lcname": "docmaplabs"
}
        