osintgraph


Nameosintgraph JSON
Version 0.1.1 PyPI version JSON
download
home_pageNone
SummaryOsintgraph is a tool that maps Instagram targets, revealing social connections, posts, and interactions for OSINT investigations.
upload_time2025-09-06 17:13:22
maintainerNone
docs_urlNone
authorXD-MHLOO
requires_python>=3.9
licenseNone
keywords instagram langchain langgraph neo4j osint social network analysis
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Osintgraph (Open Source Intelligence Graph)

![osintgraph_banner](https://github.com/user-attachments/assets/04a46de3-8f0e-40fa-83f6-2a9ff811a667)

**Osintgraph** is a tool for deep social analysis and OSINT investigations focused on Instagram targets.
It uses Neo4j to map a target’s network — revealing connections, interests, and affiliations — and an interactive AI Agent to speed up investigations and simplify analysis.

## ⚡ What OSINTGraph Does
**OSINTGraph CLI** gathers all public Instagram data from a target and maps their social connections, including **profiles**, **followers**, **followees**, **posts**, **comments**, and **likes**. It helps you thoroughly examine your target by gathering all relevant data and analyzing it for investigations.

[See how it works ↗](#-how-osintgraph-works)

### Data collection via CLI:
| ![osintgrah_cli](https://github.com/user-attachments/assets/131fca5d-a0ac-4193-bf7c-af52bafc75b1) |
|-----------------|
| *Overview of CLI Interface for data collection* |

### Explore and analyze your target's data via two ways:

### 1. **Osintgraph AI Agent**
Use natural language to query about your target.
The AI Agent supports data retrieval, keyword and semantic searches, relationship queries, and template-driven analyses — helping you get focused answers without manually digging through data.
| [![asciicast](https://asciinema.org/a/732693.svg)](https://asciinema.org/a/732693) |
|-----------------|
| *Overview of interacting with the agent performing data retrieval, keyword and semantic searches, and template-based analyses.* |

### 2. **Neo4j Visualization**
Visualize your target’s social network, trace interactions, and query relationships directly.  

[![video](https://github.com/user-attachments/assets/71a6c81c-655e-4831-83e8-585e9d270b5a)](https://github.com/user-attachments/assets/71a6c81c-655e-4831-83e8-585e9d270b5a)

| *Example of tracing a target user’s close connection through their most commented post, then investigating mutual followers and all interactions between them.* |
|-----------------|




## 📚 Table of Contents

* [✨ About OSINTGraph](#osintgraph-open-source-intelligence-graph)  
* [⚡ What OSINTGraph Does](#-what-osintgraph-does)  
* [🚀 Getting Started](#-getting-started)  
  * [1. Install OSINTGraph](#1-install-osintgraph)  
  * [2. Setup Configuration](#2-setup-configuration)  
  * [3. Start Collecting Instagram Data](#3-start-collecting-instagram-data)  
  * [4. Analyze & Investigate](#4-analyze--investigate)  
  * [5. Visualize in Neo4j](#5-visualize-in-neo4j)
* [⚡ How OSINTGraph Works](#-how-osintgraph-works)  
  * [Phase 1: Reconnaissance](#phase-1-reconnaissance)  
  * [Phase 2: Analysis & Investigation](#phase-2-analysis--investigation)  
* [⚙ Commands Reference](#-commands-reference)  
  * [`setup`](#-setup-option)  
  * [`reset`](#-reset-option)  
  * [`discover`](#-discover-username)  
  * [`explore`](#-explore-username)  
  * [`agent`](#-agent)  
* [🧩 Data Model (Neo4j Schema)](#-data-model-neo4j-schema)  
  * [👤 Person Node](#-person---represents-an-instagram-account)
  * [📷 Post Node](#-post---represents-an-instagram-post)
  * [💬 Comment Node](#-comment---represents-a-comment-on-a-post) 
  * [🕸️ Relationships](#-relationships)
* [🕵️ OSINTGraph AI Agent – Getting Started Guide](#-osintgraph-ai-agent--getting-started-guide)  
  * [1. 🔧 Data Retrieval](#1--data-retrieval)  
    * [Approach 1: Basic Data Retrieval](#approach-1-basic-data-retrieval)  
    * [Approach 2: Relationship Traversal](#approach-2-relationship-traversal)  
    * [Approach 3: Content Search](#approach-3-content-search)  
    * [Combining Approaches](#combining-approaches)  
    * [Best Practices – How to Ask Questions for Best Results](#-best-practices--how-to-ask-questions-for-best-results)  
  * [2. 📝 Template-Based Analysis](#2--template-based-analysis)  
    * [⚡How Templates Work](#-how-templates-work)  
    * [🛠 How to Create Custom Templates](#-how-to-create-your-own-custom-template)  
* [🚫 How to Avoid Account Suspension](#-how-to-avoid-account-suspension)  
* [📦 Dependencies](#-dependencies)  

## 🚀 Getting Started
### 1. Install OSINTGraph
```bash
pipx install osintgraph
```
or
```bash
pip install osintgraph
```
> [!NOTE]
> When using pip, it’s recommended to install inside a Python virtual environment to avoid dependency conflicts.

### 2. Setup Configuration 
Before running `osintgraph setup`, make sure you have the following ready:

- **Instagram Account:** Preferably not your main account

- **Neo4j Database:** For storing and visualizing data.
  
  (Sign up at [Neo4j](https://neo4j.com) → Create an instance for free → Download admin credentials) — you’ll need these for connection.

- **Gemini API Key:** Enables data pre-analyses and the AI agent.
  
  (Sign up at [Google AI Studio](https://aistudio.google.com) → Create or select a Google Cloud project → Get API Key for free)

- **User Agent (Optional):** Helps reduce Instagram detection risk.
  (Open your Firefox browser where you log in to Instagram, search “my user agent” on Google, and copy it)

Then run 
```bash
osintgraph setup
```

### 3. Start collecting Instagram data
Start gathering data on your target:
```bash
osintgraph discover TARGET_INSTAGRAM_USERNAME --limit follower=100 followee=100 post=2 
```
### 4. Analyze & Investigate
Launch the AI Agent to explore and analyze collected data:
```bash
osintgraph agent
```
Once the agent starts, try asking it:
```Show the target user’s profile info```

### 5. Visualize in Neo4j
Explore your target’s network graph interactively.
- Go to the [Neo4j Console](https://console-preview.neo4j.io/tools/explore).
- Click the **Explore tab**, then **Connect**.
- In the search bar, type "Show me a graph".
- You should now see the person you just collected, along with their relationships.


## ⚡ How OSINTGraph Works

**OSINTGraph run in two main phases: [Reconnaissance](#phase-1-reconnaissance) and [Analysis & Investigation](#phase-2-analysis--investigation).**



```bash
   ⚡PHASE 1: RECONNAISSANCE                                           ⚡PHASE 2: ANALYSIS & INVESTIGATION
   ──────────────────────────                                           ───────────────────────────────────
   [ Data Collection ] (osintgraph discover <target>)                    [ Investigation ] 
     ├─ Profile Metadata                                                   ├─ [AI Agent] (osintgraph agent)
     ├─ Followers                                                          │    • Retrieve Data    
     ├─ Followees                                                          │    • Keyword Search
     └─ Posts (with Comments)                                              │    • Semantic Search
           ↓                                                               │    • Graph Relationship Search
   Posts Pre-Analysis                                                      │    • Run Template Analyses
     ├─ Uses:                                                              └─ [Neo4j Visualization]
     │    • Post Metadata
     │    • Comments
     │    • Image Pre-Analyses
     │         ├─ Uses:
     │         │    • Post media (thumbnails & images)
     │         └─ Generates:
     │              • Structured Image Analysis Report
     └─ Generates:
          • Structured Post Analysis Report
            ↓
    Account Pre-Analysis
      ├─ Uses:
      │    • All Post Analyses
      │    • Profile Metadata
      └─ Generates:
           • Structured Account Analysis Report

```

### Phase 1: Reconnaissance
In this phase, you **collect all public Instagram data** for a target and their network.
You’re building the raw intelligence database that you’ll investigate later.

**What you do:**

Run one of these commands to collect all public Instagram data for a target and their network:

- `osintgraph discover <target>` — Collect and (optionally) pre-analyze the target account’s data.

- `osintgraph explore <target>` — Recursively run `discover` on each followee of the target, prioritizing followees with the largest follower base in the Neo4j database.

**What OSINTGraph does in the background:**
1. Scrapes the target’s profile, followers, followees, posts, and comments.
2. If Gemini API is enabled, pre-analyzes:
   - Image Analysis: Each post’s media is examined for visual clues and details.
   - Post Analysis: Combines image findings, post metadata, and comments into a structured OSINT report.
   - Account Analysis: Summarizes patterns and behaviors across all posts for the account.
   > Pre-analysis quickly examines posts and account data to give you early insights. It’s also useful for template-based investigations, because templates can use the pre-analyzed data immediately for deeper analysis.
3. Maps all relationships (likes, follows, replies, etc.) into Neo4j. [See how Instagram data is stored in Neo4j ↗](#-data-model-neo4j-schema)


### Phase 2: Analysis & Investigation

In this phase, you **search**, **analyze**, and **visualize** the intelligence gathered in Phase 1.
Now you’re making sense of the network, activities, and patterns.

**What you do:**
- **Query** data using natural language, keyword/semantic search, and graph-relationship queries.
- **Run** analyses using predefined or custom templates.
- **Explore and Visualize** social networks interactively.


**You have two main ways to do this:**

#### 1. AI Agent  `osintgraph agent`

- Ask questions for data retrieval, keyword and semantic searches, graph-relationship based queries, and analyses using predefined or custom templates.
[Learn more about Agent ↗](#-osintgraph-ai-agent--getting-started-guide)

#### 2. Neo4j Visualization ([Neo4j Console Browser](https://console-preview.neo4j.io/tools/explore))

- Explore visualize the social network map interactively.
- See how people, posts, and interactions are connected.

## ⚙ Commands Reference
Below is a breakdown of each command, what it does, and when to use it.

### 🔧 `setup [option]`

<details>
<summary>See Usage & options</summary>
   
**Purpose:**

Configures services and credentials so OSINTGraph can access Instagram, Neo4j, Gemini.

**Options:**

- `all` (default) — configure everything.

- `instagram` — configure Instagram scraping credentials (cookies/session).

- `neo4j` — set up your Neo4j database connection.

- `gemini` — set your Gemini API key for AI analysis.

- `user-agent` — customize the User-Agent string for scraping.

**When to use:**
Run this the first time you install OSINTGraph or to set credentials.


Examples:

```bash
osintgraph setup
osintgraph setup instagram
```

</details>

### 🔧 `reset [option]`

<details>
<summary>See Usage & options</summary>
   
**Purpose:**
Clears stored credentials for the chosen option and immediately re-runs setup for that option.

**Options:**

- `all` (default) — reset everything and reconfigure.

- `instagram` — reset Instagram credentials.

- `neo4j` — reset Neo4j database connection settings.

- `gemini` — reset your Gemini API key.

- `user-agent` — reset the User-Agent string for scraping.
  
**When to use:**
Use this when you need to change or update your credentials (e.g., expired Instagram session, new API key, changed Neo4j password).

**Examples:**

```bash
osintgraph reset
osintgraph reset instagram
```

</details>


### 🔍 `discover <username>`

<details>
<summary>See Usage & options</summary>
   
**Purpose:**
Collects all public data for a single Instagram account.

**What it does:**

- Scrapes followers, followees, and posts (with comments).

- Runs **AI-powered post analysis** (`post_analysis`) ). (if Gemini is configured)

- Runs **AI-powered account analysis** (`account_analysis`) after all posts are analyzed. (if Gemini is configured)

- Saves everything in Neo4j.


> **Resumable runs**  
> - If `discover` cannot finish scraping or analysis in one run (for example, a target has thousands of followers or many posts), the progress is saved automatically.  
> - Running the same command again with the same target will continue from where it left off until all data and analysis are complete.  
> - Finished sections are skipped on later runs, so no duplicate work is done.  
> - Use `--force` if you want to re-fetch or re-analyze any part (e.g., `--force follower`, `--force account-analysis`).
>
> **Limitation**
> - When scraping followers and followees, only username and basic profile info are collected. To get full profiles, posts, and comments, you need to run `discover` on each account separately.
> - When scraping posts, likes and comments are collected, but only a partial amount may be available due to Instagram’s limitations.


**Options:**

- `--skip [parts]` — skip certain steps.

   *(Options: all, follower, followee, post, post-analysis, account-analysis)*  
   Example: `--skip post-analysis` will skip analyzing posts with AI.
- `--limit TYPE=NUMBER` — limit how many items to fetch per type (default: follower=1000, followee=1000, post=10).

   *(Options: follower, followee, post)*  
   Example: `--limit post=5` — fetches only 5 posts.
- `--rate-limit NUMBER` —  pause for 8–10 minutes after every N request to avoid detection.  
   Example: `--rate-limit 500` will wait 8~10 minutes after every 500 Instagram requests.
- `--force [parts]` — re-fetch or re-analyze even if already done.   

  *(Options: all, follower, followee, post, post-analysis, account-analysis)*  
   Example: `--force account-analysis` — **resets the progress** and reruns the AI analysis on the account data

**When to use:**
First step of any investigation — gets all data for your primary target.

Example:

```bash
osintgraph discover "target_user"
osintgraph discover "target_user" --skip post-analysis account-analysis --limit follower=200 post=15 --force follower followee
```

</details>


### 🌐 `explore <username>`

<details>
<summary>See Usage & options</summary>

**Purpose:**

Recursive discovery — goes beyond your target to their network.

**What it does:**

- Runs `discover` on each followee of the target, prioritizing those with the largest follower counts in your Neo4j database.

> Focuses on followees because they often reveal the target’s real interests, communities, and affiliations—such as local groups, news sources, favorite influencers, or close friends. Within these, accounts with larger follower bases in your Neo4j DB are explored first, increasing the chances of uncovering valuable insights.

- Stops after a set number of accounts.

**Options:**

- `--max NUMBER` — how many accounts to discover in total.
   Example: `--max 10` — the agent will `discover` up to 10 followees of the target, then stop.
  
*(The following options work the same way as in `discover`)*
- `--skip [parts]` — skip certain steps (e.g., post-analysis).

   *(Options: all, follower, followee, post, post-analysis, account-analysis)*
- `--limit TYPE=NUMBER` — limit how many items to fetch per type (default: follower=1000, followee=1000, post=10).

   *(Options: follower, followee, post)*
- `--rate-limit NUMBER` —  pause for 8–10 minutes after every N request to avoid detection.

- `--force [parts]` — re-fetch or re-analyze even if already done.

  *(Options: all, follower, followee, post, post-analysis, account-analysis)*


**When to use:**
To expand your investigation into the wider social network.

Example:

```bash
osintgraph explore "target_user"
osintgraph explore "target_user" --max 10 --limit follower=1000 followee=500
```

</details>


### 🤖 `agent`

<details>
<summary>See Usage & options</summary>

**Purpose:**

Launches the OSINTGraph AI Agent for natural language investigations.

**What it can do:**

- Keyword search across your Neo4j database.

- Semantic search using AI embeddings.

- Auto-generate and run Cypher queries.

- Execute prebuilt or custom YAML investigation templates.

**Key options:**

- `--debug` — store detailed debug output for template.

**When to use:**

After you’ve collected data use the agent to ask questions, run analysis, or execute templates.

Example:

```bash
osintgraph agent --debug
```

</details>


## 🧩 Data Model (Neo4j Schema)
After scraping, OSINTGraph stores Instagram data in Neo4j as interconnected nodes and relationships.
<img width="710" height="447.8" alt="OsintgraphNeo4j" src="https://github.com/user-attachments/assets/dc34d94b-fa2b-43c4-8435-a898c8a4dcb1" />

*OSINTGraph Data Model (All Entities & Relationships)*

### 👤 Person - Represents an Instagram account.

<details>
<summary>See all properties</summary>
   
| Property                          | Type    | Description                                          |
| --------------------------------- | ------- | ---------------------------------------------------- |
| **id**                            | INTEGER | Unique identifier for the person node.               |
| **username**                      | STRING  | Instagram username.                                  |
| **fullname**                      | STRING  | Full display name from profile.                      |
| **bio**                           | STRING  | Profile biography text.                              |
| **followers**                     | INTEGER | Number of followers.                                 |
| **followees**                     | INTEGER | Number of accounts followed.                         |
| **mediacount**                    | INTEGER | Number of posts uploaded.                            |
| **external\_url**                 | STRING  | External link in profile bio.                        |
| **business\_category\_name**      | STRING  | Business category if a business account.             |
| **is\_verified**                  | BOOLEAN | True if the account has a verification badge.        |
| **is\_business\_account**         | BOOLEAN | True if the account is marked as a business account. |
| **has\_highlight\_reels**         | BOOLEAN | True if the user has highlight stories.              |
| **has\_public\_story**            | BOOLEAN | True if the account has public stories.              |
| **is\_private**                   | BOOLEAN | True if the account is private.                      |
| **profile\_pic\_url**             | STRING  | Profile picture URL.                                 |
| **profile\_pic\_url\_no\_iphone** | STRING  | Alternate profile picture URL.                       |
| **biography\_hashtags**           | LIST    | Hashtags used in the bio.                            |
| **biography\_mentions**           | LIST    | Usernames mentioned in the bio.                      |

#### Analysis Fields
| Property              | Type   | Description                           |
| --------------------- | ------ | ------------------------------------- |
| **account\_analysis** | STRING | AI-generated analysis of the account. (stringified JSON)|

<details>
  <summary>Show account_analysis structure</summary>
  <pre><code class="json">
  {
  "account_summary": {
    "who_runs_this_account": {
      "summary": "",
      "confidence": ""
    },
    "what_type_of_account": {
      "label": "",
      "reasoning": "",
      "confidence": ""
    },
    "why_this_account_exists": {
      "main_purpose": "",
      "supporting_signals": []
    },
    "who_is_the_target_audience": {
      "summary": "",
      "reasoning": ""
    },
    "what_it_posts_about": {
      "topic_distribution": [
        {
          "topic": "",
          "percentage": ""
        }
      ]
    },
    "how_often_it_posts": {
      "avg_posts_per_month": "",
      "most_active_days": [],
      "seasonal_patterns": ""
    },
    "who_comments_on_it": {
      "audience_profile": {
        "likely_age_range": "",
        "languages_used": [],
        "comment_style": "",
        "emotional_tone": ""
      },
      "relationship_to_owner": ""
    },
    "how_comments_look": {
      "comment_quality": "",
      "reply_behavior": "",
      "engagement_style": "",
      "detected_bots_or_fake_activity": false
    },
    "notable_flags_or_anomalies": {
      "inconsistencies": [],
      "suspicious_behavior": [],
      "possible_account_switch_history": false
    },
    "language_and_text_patterns": {
      "caption_language": [],
      "common_caption_themes": [],
      "hashtags_usage": "",
      "emoji_usage": "",
      "comment_language_distribution": [],
      "comment_length": ""
    },
    "summary_notes": ""
  }
}
  </code></pre>
</details>


#### Semantic Search Fields
| Property                      | Type | Description                                               |
| ----------------------------- | ---- | --------------------------------------------------------- |
| **username\_vector**          | LIST | Vector embedding of username for semantic search.         |
| **bio\_vector**               | LIST | Vector embedding of biography for semantic search.        |
| **fullname\_vector**          | LIST | Vector embedding of full name for semantic search.        |
| **account\_analysis\_vector** | LIST | Vector embedding of account analysis for semantic search. |

#### Internal Fields
| Property                          | Type    | Description                                    |
| --------------------------------- | ------- | ---------------------------------------------- |
| **\_profile\_complete**           | BOOLEAN | Internal flag: profile scrape completed.       |
| **\_followers\_complete**         | BOOLEAN | Internal flag: follower list scrape completed. |
| **\_followees\_complete**         | BOOLEAN | Internal flag: followee list scrape completed. |
| **\_posts\_complete**             | BOOLEAN | Internal flag: posts scrape completed.         |
| **\_posts\_analysis\_complete**   | BOOLEAN | Internal flag: post analysis completed.        |
| **\_account\_analysis\_complete** | BOOLEAN | Internal flag: account analysis completed.     |
| **\_followers\_resume\_hash**     | STRING  | Internal resume state for follower scraping.   |
| **\_followees\_resume\_hash**     | STRING  | Internal resume state for followee scraping.   |
| **\_posts\_resume\_hash**         | STRING  | Internal resume state for posts scraping.      |

</details>


### 📷 Post - Represents an Instagram post.

<details>
<summary>See all properties</summary>

| Property                   | Type       | Description                                      |
| -------------------------- | ---------- | ------------------------------------------------ |
| **id**                     | INTEGER    | Unique identifier for the post node.             |
| **shortcode**              | STRING     | Instagram post shortcode (URL-friendly ID).      |
| **caption**                | STRING     | Post caption text.                               |
| **pcaption**               | STRING     | Preprocessed caption text (cleaned).             |
| **title**                  | STRING     | Post title (if available).                       |
| **likes**                  | INTEGER    | Number of likes on the post.                     |
| **comments**               | INTEGER    | Number of comments on the post.                  |
| **is\_video**              | BOOLEAN    | True if the post is a video.                     |
| **video\_duration**        | INTEGER    | Video length in seconds.                         |
| **video\_view\_count**     | INTEGER    | Number of video views.                           |
| **is\_pinned**             | BOOLEAN    | True if the post is pinned on profile.           |
| **is\_sponsored**          | BOOLEAN    | True if the post is marked as sponsored content. |
| **typename**               | STRING     | Instagram media type name.                       |
| **mediacount**             | INTEGER    | Number of media items (for carousel posts).      |
| **accessibility\_caption** | STRING     | Alt-text or accessibility caption.               |
| **tagged\_users**          | LIST       | Usernames tagged in the post.                    |
| **caption\_hashtags**      | LIST       | Hashtags used in the post caption.               |
| **caption\_mentions**      | LIST       | Mentions in the post caption.                    |
| **date\_utc**              | DATE\_TIME | UTC timestamp of post creation.                  |
| **date\_local**            | DATE\_TIME | Local timestamp of post creation.                |

#### Analysis Fields
| Property            | Type   | Description                               |
| ------------------- | ------ | ----------------------------------------- |
| **post\_analysis**  | STRING | AI-generated analysis of the post. (stringified JSON)|
| **image\_analysis** | STRING | AI-generated image analysis for the post. (stringified JSON array)|
<details>
  <summary>Show post_analysis structure</summary>
  <pre><code class="json">
    {
  "post_metadata_summary": {
    "post_type": "",
    "post_tone": "",
    "post_intent": "",
    "poster_role_or_affiliation": "",
    "target_audience": "",
    "posting_motivation": "",
    "date_context": "",
    "sponsored_or_promotional": false
  },
  "visual_analysis_summary": {
    "key_findings": "",
    "notable_objects_or_symbols": "",
    "people_or_groups_shown": "",
    "locations_or_geo_clues": "",
    "emotion_or_energy_level": "",
    "forensic_red_flags": []
  },
  "comment_section_analysis": {
    "overall_sentiment": "",
    "common_comment_behaviors": "",
    "dominant_tones_or_emotions": "",
    "top_words_or_emojis": [],
    "interaction_patterns": "",
    "bot_or_coordinated_activity": false,
    "cultural_or_linguistic_signals": ""
  },
  "behavioral_and_social_insight": {
    "likely_poster_motivation": "",
    "social_group_affiliations": "",
    "influence_or_recruitment_signs": "",
    "propaganda_or_polarization_signals": "",
    "deception_or_misinfo_signs": ""
  },
  "osint_value": {
    "intelligence_usefulness": "",
    "recommended_followup": "",
    "confidence_level": "",
    "summary_takeaways": ""
  }
}
  </code></pre>
</details>
<details>
  <summary>Show image_analysis structure</summary>
  <pre><code class="json">
{
  "image_type": "",
  "image_tone": "",
  "image_scenario": "",
  "image_intent": "",
  "people_count_visible": "",
  "people_visibility_level": "",
  "people_gender": "",
  "people_age_range": "",
  "people_ethnicity": "",
  "people_clothing": "",
  "people_accessories": "",
  "people_hair_description": "",
  "people_facial_hair": "",
  "people_face_features": "",
  "people_body_type": "",
  "people_skin_tone": "",
  "people_posture": "",
  "people_actions": "",
  "people_dominant_hand": "",
  "people_walking_style": "",
  "people_emotions": "",
  "people_interaction": "",
  "people_possible_role": "",
  "people_items_carried": "",
  "people_visible_tech": "",
  "people_tattoos_piercings": "",
  "people_symbols_or_badges": "",
  "people_identity_clues": "",
  "people_eye_color": "",
  "people_glasses_or_contacts": "",
  "people_mouth_expression": "",
  "people_visible_injuries": "",
  "people_makeup_or_face_paint": "",
  "people_body_language": "",
  "people_proximity": "",
  "people_group_behavior": "",
  "people_footwear": "",
  "people_carry_method": "",
  "people_visible_tattoos": "",
  "people_eye_contact": "",
  "people_accessory_details": "",
  "people_disabilities_or_devices": "",
  "people_behavior_notes": "",
  "text_present": false,
  "text_transcribed": "",
  "text_language": "",
  "text_font_style": "",
  "text_meaning": "",
  "clothing_style": "",
  "clothing_colors": "",
  "clothing_symbols_or_logos": "",
  "facial_expressions": "",
  "group_mood": "",
  "scene_location_type": "",
  "scene_background": "",
  "scene_time_weather": "",
  "notable_objects": "",
  "tech_or_tools": "",
  "vehicles_or_props": "",
  "visible_text_on_objects": "",
  "uniforms_or_insignia": "",
  "environment_signs": "",
  "editing_or_staging_signs": "",
  "license_plate_number": "",
  "license_plate_region": "",
  "brands_or_product_names": "",
  "unique_identifiers": "",
  "safety_gear": "",
  "weapon_type": "",
  "vehicle_type_or_model": "",
  "unusual_objects": "",
  "animals_seen": "",
  "activity_signs": "",
  "time_displayed": "",
  "image_quality": "",
  "visual_style": "",
  "filters_or_watermarks": "",
  "geo_clues": "",
  "primary_language_seen": "",
  "regional_indicators": "",
  "slang_or_dialect_detected": "",
  "cultural_or_religious_signs": "",
  "group_affiliations": "",
  "flags_uniforms_gestures": "",
  "deception_signs": "",
  "hashtags_or_keywords": "",
  "geo_political_relevance": "",
  "game_detected": false,
  "game_name": "",
  "exif_device": "",
  "watermark_found": false,
  "original_image_source": "",
  "poster_intent": "",
  "target_audience": "",
  "engagement_tricks": "",
  "psychological_triggers": "",
  "radical_language_or_symbols": "",
  "call_to_action": "",
  "recruiting_or_polarizing_content": "",
  "misinfo_or_agenda_signals": "",
  "summary_type": "",
  "key_takeaways": "",
  "cultural_or_geo_significance": "",
  "poster_purpose": "",
  "osint_value": "",
  "confidence_in_analysis": ""
}
  </code></pre>
</details>


#### Semantic Search Fields
| Property                    | Type | Description                         |
| --------------------------- | ---- | ----------------------------------- |
| **caption\_vector**         | LIST | Vector embedding of caption text for semantic search..   |
| **title\_vector**           | LIST | Vector embedding of title text for semantic search..     |
| **post\_analysis\_vector**  | LIST | Vector embedding of post analysis for semantic search..  |
| **image\_analysis\_vector** | LIST | Vector embedding of image analysis for semantic search.. |

</details>

### 💬 Comment - Represents a comment on a post.

<details>
<summary>See all properties</summary>
   
| Property             | Type       | Description                             |
| -------------------- | ---------- | --------------------------------------- |
| **id**               | INTEGER    | Unique identifier for the comment node. |
| **text**             | STRING     | Comment text.                           |
| **likes\_count**     | INTEGER    | Number of likes on the comment.         |
| **created\_at\_utc** | DATE\_TIME | UTC timestamp of comment creation.      |

#### Semantic Search Fields
| Property         | Type | Description                                           |
| ---------------- | ---- | ----------------------------------------------------- |
| **text\_vector** | LIST | Vector embedding of comment text for semantic search. |

</details>


### 🕸 Relationships

| Relationship                            | Description                                  |
| --------------------------------------- | -------------------------------------------- |
| 👤 Person - **Follows** -> 👤 Person    | A person **follows** another person.         |
| 👤 Person - **Posted** -> 📷 Post       | A person **created** the post.               |
| 👤 Person - **Liked** -> 📷 Post        | A person **liked** a specific post.          |
| 👤 Person - **Commented** -> 💬 Comment | A person **authored** the comment.           |
| 💬 Comment - **On** -> 📷 Post          | The comment is **made on** a specific post.  |
| 💬 Comment - **Reply To** -> 💬 Comment | A comment is a **reply to** another comment. |
| 👤 Person - **Liked** -> 💬 Comment     | A person **liked** a comment.                |


## 🕵 OSINTGraph AI Agent – Getting Started Guide
The OSINTGraph Agent helps you **explore, retrieve, and analyze your OSINT data in Neo4j.**
It works in two main ways:

- **Data Retrieval & Simple Analysis** – Fetch accounts, posts, comments, and relationships using filters, graph queries, and searches. You can also ask for quick insights (summaries, counts, highlights) on the retrieved data.

- **Template-Based Analysis** – For deeper investigations, use pre-built or custom templates. Templates guide the agent to retrieve the right data and apply structured analysis for more controlled , focused, and repeatable investigations.

This guide shows the two main ways to interact with the OSINTGraph AI Agent - **Data Retrieval** for quick questions, and **Template-Based Analysis** for deeper investigations. It also explains how to ask clear questions so you get the most accurate results.

> [!NOTE]
> These example questions are just a guide — you can ask the agent in your own words, and it will understand.


### 1. 🔧 Data Retrieval
Data Retrieval is best for **direct queries** and **simple analyses questions**
You can use it to fetch data based on **filters**, **relationships**, or **searches**.

#### Approach 1: Basic Data Retrieval  
Get data by filtering on straightforward criteria (e.g., usernames or dates).

**Example:**  
- “Get John’s comments from 2025”  
  *(Returns all comments made by John during 2025)*

- “How many comments has John made in 2025”  
  *(Returns the total number of comments John made during 2025)*

---

#### Approach 2: Relationship Traversal  
Include social connections in your query — followers, likers, commenters, etc.

**Example:**  
- “Find followers of John who commented on his posts in 2025”  
  *(Returns users who follow John and commented on his posts during 2025)*

---

#### Approach 3: Content Search

You can search data using two methods:

- **Keyword Search (literal word match):**
  Finds exact matches of words or phrases.  
  *Example:* “Find John’s comments from 2025 with the word ‘conference’”  
  *(Returns John’s 2025 comments containing the exact word “conference”)*

- **Semantic Search (meaning-based):**
  Finds content based on related meanings, including synonyms or related terms.
  
   Supported fields include:

   - Person: `username`, `fullname`, `bio`, `account_analysis`
   
   - Post: `caption`, `title`, `post_analysis`, `image_analysis`
   
   - Comment: `text`
     
  *Example:* “Show John’s comments from 2025 about startups”  
  *(Returns John’s 2025 comments'text related to “startups,” such as “new companies” or “ventures”)*

---

#### Combining Approaches
You can mix filters, relationships, and content search for precise results:

- “Find followers of John who liked his posts about startups in 2025”

> - Filters posts by date (2025)  
> - Traverses relationships to get John’s followers who liked those posts  
> - Apply semantic search on post content to find those about startups  

- “Find followers of John who liked his posts with the word ‘conference’ in 2025”  
> - Filters posts by date (2025)  
> - Traverses relationships to get John’s followers who liked those posts  
> - Apply keyword search on post content for the exact word “conference”

--- 

### 🎯 Best Practices – How to Ask Questions for Best Results

Being precise makes your results more accurate and useful. Here are key ways to improve your queries:

**Examples of precision:**

#### Precision in Searching Method  
- **Vague:** "Find posts about aura farming"  
- **Precise:** "**Use semantic search**, find posts about aura farming."

#### Precision in Targeting Data Fields  
- **Vague:** "Search for aura farming"  
- **Precise:** "Use semantic search on **post captions** about aura farming."

#### Precision in Context and Entities  
- **Vague:** "Where is John?"  
- **Precise:** "Which location might John be at **based on post captions, post analysis, and person bio**?"

#### Precision in Getting Results  
- **Vague:** "Tell me about John"  
- **Precise:** "Give John’s account analysis and follower count."

💡 **Tip:** When asking, think about:  
- What searching method should be applied if needed? (semantic search, keyword search)  
- Which data fields should be checked? (person bio, post analysis, post captions, etc.)  
- What exactly do you want back? (summary, detailed context, related entities, relationships, etc.)

This will speed up your investigation and ensure the Agent looks in the right places.

---

### 2. 🧩 Template-Based Analysis

Templates are **blueprints that tell the AI how to analyze your data**. Instead of manually going through posts, comments, likes, and social connections—which can take days—a template lets the OSINTGraph agent **gather all the needed data, feed it into a fresh AI, and get clear answers**.

**Example scenario:** You want to figure out where a person might be located. Doing it manually would take hours or days—looking through every post, comment, and followee. With a template, the AI can **analyze all this data** and **summarize likely locations**, saving you time and effort.

Each template run:

- Spawns a **new AI instance** with no memory of previous runs.

- Uses a **system prompt** (the AI’s “brain”) to guide reasoning.

- Injects the gathered data into a **user prompt** for analysis.

Templates are great because they let you:

1. **Control** how the AI thinks and reasons.

2. **Get consistent, repeatable results.**

3. **Analyze large datasets quickly** without doing manual work.

4. **Reuse the same template** across different targets or investigations.


### 📝 Template Structure

Templates are written as `.yaml` files with the following structure:

```yaml
name: <unique_template_name>
# Example: liked_post_analysis
# A unique identifier for the template. Used to select and run this template.

description: |
  <Brief explanation of what the template does, what kind of data it processes, and the type of output it produces.>
  # Example:
  #    Analyze liked posts to infer user interests and personality traits.

input_fields:
  # List of placeholders that will be replaced by actual data when running the template.
  # Each field defines a unique placeholder name and what data should be injected by OSINTGraph agent into that placeholder.
  - name: placeholder1
    description: |
      <Explain clearly what data this field should contain, and the exact format required.>
      # The agent will read these descriptions to automatically choose the correct Cypher queries, run them, and inject the results in the requested format.
      # Example:
      #    Provide User profile info including Person.username and Person.bio.
      #    Give results in this format:
      #       Username: ...
      #       Bio: ...

  - name: placeholder2
    description: |
      <Explain what this second input field should contain and its format.>
      # Describe what kind of data should be injected into this second placeholder when the template runs.
      # Example:
      #    A list of posts liked by the user, each with Post.caption and Post.post_analysis.
      #    Format in this way:
      #    Post:
      #       Catpion: ...
      #       Post analysis: ...     

system_prompt: |
  <Instructions defining the AI’s role, behavior, reasoning style, and output format>
  # Defines the LLM style, tone, rules, how to reason, what to infer, and how to format results
  # Example:
  #   You are a social media analyst. Review the user's liked posts and infer behavioral patterns or thematic interests based on post content.

user_prompt: |
  <Task description with placeholders for injected data>
  # The task request, with special placeholders `{placeholders}` for injected data
  # Example:
  #    Analyze the following profile and liked posts:
  #    Profile Info:
  #    {placeholder1}
  #
  #    Posts liked by the user:
  #    {placeholder2}
```

See an example template here: [location_analysis.yaml](https://github.com/XD-MHLOO/osintgraph-templates/blob/main/templates/location_analysis.yaml)

### 📦 Predefined Templates

OSINTGraph comes with several ready-to-use templates that cover common OSINT investigations. You can run them immediately without creating your own.

Examples include:

- **location_analysis** – Determine possible locations of the target user by analyzing posts, comments, likes, and their social graph.

- **contact_info_extraction** – Scan bios, captions, comments, and images for potential leaks of emails, phone numbers, or addresses.

- **interests_hobbies_lifestyle_analysis** – Uncover the target user’s interests, hobbies, and lifestyle preferences with supporting evidence from posts, likes, and network connections.

All predefined templates are maintained in this repository: https://github.com/XD-MHLOO/osintgraph-templates

**👉 To see the full list of predefined templates**:

  Ask the agent to list all templates in the folder. 

  > "list all templates"

**👉 To view details of a specific one:**

  Ask the agent to show a template by name, or you can view the YAML file directly in your templates folder (`osintgraph -h` to see the folder path).

  > "show template location_analysis"

**👉 To run a predefined template:**  
Ask the agent to execute the template.

> "Run location_analysis on target_username"


### ⚡ How Templates Work

1. **You request a template to run**  
    Example template with required additional context (e.g., username):  

   > "Run location_analysis template on JohnDoe"

   Choose the template you want to run and provide the agent with any required context.
  
   If you're not sure what to provide, simply ask the agent(e.g. "How to use \<the template\>") — it will guide you.

2. **Agent collects required data automatically**

   Based on the template’s input field descriptions, the agent automatically runs Cypher queries on your Neo4j database. It retrieves all required fields, formats the results, and fills the `{placeholders}` in the template's user prompt.

3. **Run Template and Get Output**

   A new LLM instance is created internally, using the template’s system and user prompts to analyze the data, then returns the output (e.g., analysis, summaries, or explanations) depending on the template's system prompt.

> [!NOTE]  
> OSINTGraph is primarily built using free services (e.g. Gemini API), therefore template runs are **rate-limited internally** to ensure stability.

### 🛠 How to Create Your Own Custom Template

**You can create a custom template by defining a `.yaml` file that controls how the AI analyzes your data.**

#### 🧠 Example Use Case

Let’s say you want to analyze a user's **bio**, **post captions**, and **comment texts** to extract any possible of contact details (such as emails, phone numbers, addresses, etc.) You can build a custom template like this:

```yaml
name: contact_info_extraction

description: |
  Analyze a user's profile bio, post captions, comment texts and image analysis
  (OCR and visual text) to detect any possible leaks of contact details such as emails, phone numbers, or addresses, and return them in a structured Markdown list with supporting context.

input_fields:
  - name: bio
    description: |
      The user’s Person.bio.

      Format:
      Bio:
        Text: ...

  - name: posts
    description: |
      List of all posts made by the user. Each post must include:
        - Post.shortcode
        - Post.caption
        - Post.image_analysis

      Format (One post per entry):
        User Post:
          Post Url: https://www.instagram.com/p/<Post.shortcode>/
          Caption: ...
          Image Analysis:

          Image 1:
          - People: [...]
          - Text/OCR: [...]
          - Summary: [...]

          Image 2:
          ...

  - name: comments
    description: |
      A list of Comment.text authored by the user.

      Format:
      Comment:
        Text: ...

system_prompt: |
  You are a digital privacy analyst. Your task is to carefully analyze the provided data to identify any possible leaks of contact details, including but not limited to:
  - Email addresses
  - Phone numbers
  - Addresses
  - Social media handles, usernames, or IDs
  - Any other identifiers that may reveal contact information
  - Use pattern recognition and contextual reasoning to flag potential contact details.
  - If detected, report each type of possible contact detail (email, phone, address, ..) in a structured Markdown format.
  - For each match, include:
    - The type of contact detail (Email, Phone, Address, etc.)
    - The exact string detected
    - The source field (bio, caption, comment, image OCR) (cite Post Url for Post image OCR )
    - Context / Possible Use — Based on surrounding information, what the contact might be
    - A brief reasoning (if the match is inferred and not explicit)
    - A confidence level (High / Medium / Low), with justification for the confidence
  - If nothing is found, return: "No possible contact details detected."

user_prompt: |
  Review the following content and extract any possible contact-related information:
  
  User Bio:
  {bio}
  
  List of User Posts:
  {posts}
  
  List of User Comments:
  {comments}


```
**Steps to Create Your Template**

1. `name`
Choose a unique name to identify your template. This will be used to select and run the template.

2. `description`
Briefly describe what your template does and the kind of output it produces.
   > This helps the OSINTGraph agent better understand the intent and use of the template.

3. `input_fields`
Define what data the agent should inject at runtime. Each input field includes:

- `name`: Used as `{placeholder}` in the user prompt.

- `description`: Explain exactly what data should be injected here and how it should be formatted.

> [!NOTE]
> - For direct schema attributes (e.g., `Person.bio`, `Post.caption`), mention them explicitly so the agent knows to fetch them directly from the database.

4. `System Prompt`
Write clear instructions defining the AI’s role, behavior, how to reason, and how to format its output. This controls how the AI thinks and processes the data.

5. `User Prompt`
Write the actual task description, with `{placeholder_name}` tags for runtime data injection.


#### 📂 Add Your Custom Template

1. Place your custom `.yaml` template file into your templates folder.
(Run osintgraph -h to see where the folder is located.)

2. Validate Your Template:
   
   > "list all templates including invalid ones"
   
   The agent will display all templates in the folder. If your custom template has errors, it will show where; if no errors appear, your template is valid and ready to use. (No need to restart `osintgraph agent` if it’s already running — simply ask to "refresh and list all templates" again.)

## 🚫 **How to Avoid Account Suspension**

1. <mark>**Use Your Browser Session**</mark>  
   When running `osintgraph setup instagram`, choose login via Firefox session to make the login look natural. 🌐

2. <mark>**Use Your Real User-Agent**</mark>  
   When running `osingraph setup user-agent`, provide the exact user-agent from the browser you use to log in to your Instagram account. 🖥️

3. <mark>**Enable 2FA**</mark>  
   Turn on 2FA for your Instagram account. It’s simple: just use an authenticator app, and it helps Instagram recognize that your account is legitimate. 🔒

4. <mark>**Build Account Reputation**</mark>  
   Use your Instagram account normally (like posts, comment, watch stories) for a few days or weeks before scraping. 📈

5. <mark>**Warm Up Your Session**</mark>  
   Spend time using Instagram before scraping, like a normal user, to avoid looking suspicious. ⏳

6. **Avoid VPNs**  
   Don’t use VPNs. Instagram may flag accounts with mismatched or suspicious locations. 🚫🌍

7. **Don’t Use the Account for Other Activities While Scraping**  
   When using this tool to collect data, avoid using the same Instagram account for any other activities. 🛑

8. **Limit Scraping Time**  
   Don’t scrape for more than 6 hours straight. ⏰
### Credit:  
- Thanks to [@ahmdrz](https://github.com/ahmdrz) for these valuable insights on avoiding account suspension. 🙏
- Also see [this useful comment](https://github.com/instaloader/instaloader/issues/2391#issuecomment-2400987481) on Instaloader's GitHub for more tips.

---

## 📦 Dependencies:
- **[Instaloader](https://github.com/instaloader/instaloader)** – Used to collect Instagram profile data, followers, and followees.
- **[Neo4j](https://neo4j.com/)** – Graph database used to store and visualize the Instagram social network.
- **[LangGraph](https://github.com/langgraph/langgraph)** – Handles structured multi-step LLM reasoning and ReAct-style agent execution.
- **[Gemini / Google Generative AI](https://developers.google.com/)** – Provides the LLM model used for AI-powered analysis and powers the AI agent.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "osintgraph",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "Instagram, LangChain, LangGraph, Neo4j, OSINT, Social Network Analysis",
    "author": "XD-MHLOO",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/49/99/59a03d9217131a1a7e5f3b49f797a6d3dfabece80f628e9b4ec0eb93c314/osintgraph-0.1.1.tar.gz",
    "platform": null,
    "description": "# Osintgraph (Open Source Intelligence Graph)\n\n![osintgraph_banner](https://github.com/user-attachments/assets/04a46de3-8f0e-40fa-83f6-2a9ff811a667)\n\n**Osintgraph** is a tool for deep social analysis and OSINT investigations focused on Instagram targets.\nIt uses Neo4j to map a target\u2019s network \u2014 revealing connections, interests, and affiliations \u2014 and an interactive AI Agent to speed up investigations and simplify analysis.\n\n## \u26a1 What OSINTGraph Does\n**OSINTGraph CLI** gathers all public Instagram data from a target and maps their social connections, including **profiles**, **followers**, **followees**, **posts**, **comments**, and **likes**. It helps you thoroughly examine your target by gathering all relevant data and analyzing it for investigations.\n\n[See how it works \u2197](#-how-osintgraph-works)\n\n### Data collection via CLI:\n| ![osintgrah_cli](https://github.com/user-attachments/assets/131fca5d-a0ac-4193-bf7c-af52bafc75b1) |\n|-----------------|\n| *Overview of CLI Interface for data collection* |\n\n### Explore and analyze your target's data via two ways:\n\n### 1. **Osintgraph AI Agent**\nUse natural language to query about your target.\nThe AI Agent supports data retrieval, keyword and semantic searches, relationship queries, and template-driven analyses \u2014 helping you get focused answers without manually digging through data.\n| [![asciicast](https://asciinema.org/a/732693.svg)](https://asciinema.org/a/732693) |\n|-----------------|\n| *Overview of interacting with the agent performing data retrieval, keyword and semantic searches, and template-based analyses.* |\n\n### 2. **Neo4j Visualization**\nVisualize your target\u2019s social network, trace interactions, and query relationships directly.  \n\n[![video](https://github.com/user-attachments/assets/71a6c81c-655e-4831-83e8-585e9d270b5a)](https://github.com/user-attachments/assets/71a6c81c-655e-4831-83e8-585e9d270b5a)\n\n| *Example of tracing a target user\u2019s close connection through their most commented post, then investigating mutual followers and all interactions between them.* |\n|-----------------|\n\n\n\n\n## \ud83d\udcda Table of Contents\n\n* [\u2728 About OSINTGraph](#osintgraph-open-source-intelligence-graph)  \n* [\u26a1 What OSINTGraph Does](#-what-osintgraph-does)  \n* [\ud83d\ude80 Getting Started](#-getting-started)  \n  * [1. Install OSINTGraph](#1-install-osintgraph)  \n  * [2. Setup Configuration](#2-setup-configuration)  \n  * [3. Start Collecting Instagram Data](#3-start-collecting-instagram-data)  \n  * [4. Analyze & Investigate](#4-analyze--investigate)  \n  * [5. Visualize in Neo4j](#5-visualize-in-neo4j)\n* [\u26a1 How OSINTGraph Works](#-how-osintgraph-works)  \n  * [Phase 1: Reconnaissance](#phase-1-reconnaissance)  \n  * [Phase 2: Analysis & Investigation](#phase-2-analysis--investigation)  \n* [\u2699 Commands Reference](#-commands-reference)  \n  * [`setup`](#-setup-option)  \n  * [`reset`](#-reset-option)  \n  * [`discover`](#-discover-username)  \n  * [`explore`](#-explore-username)  \n  * [`agent`](#-agent)  \n* [\ud83e\udde9 Data Model (Neo4j Schema)](#-data-model-neo4j-schema)  \n  * [\ud83d\udc64 Person Node](#-person---represents-an-instagram-account)\n  * [\ud83d\udcf7 Post Node](#-post---represents-an-instagram-post)\n  * [\ud83d\udcac Comment Node](#-comment---represents-a-comment-on-a-post) \n  * [\ud83d\udd78\ufe0f Relationships](#-relationships)\n* [\ud83d\udd75\ufe0f OSINTGraph AI Agent \u2013 Getting Started Guide](#-osintgraph-ai-agent--getting-started-guide)  \n  * [1. \ud83d\udd27 Data Retrieval](#1--data-retrieval)  \n    * [Approach 1: Basic Data Retrieval](#approach-1-basic-data-retrieval)  \n    * [Approach 2: Relationship Traversal](#approach-2-relationship-traversal)  \n    * [Approach 3: Content Search](#approach-3-content-search)  \n    * [Combining Approaches](#combining-approaches)  \n    * [Best Practices \u2013 How to Ask Questions for Best Results](#-best-practices--how-to-ask-questions-for-best-results)  \n  * [2. \ud83d\udcdd Template-Based Analysis](#2--template-based-analysis)  \n    * [\u26a1How Templates Work](#-how-templates-work)  \n    * [\ud83d\udee0 How to Create Custom Templates](#-how-to-create-your-own-custom-template)  \n* [\ud83d\udeab How to Avoid Account Suspension](#-how-to-avoid-account-suspension)  \n* [\ud83d\udce6 Dependencies](#-dependencies)  \n\n## \ud83d\ude80 Getting Started\n### 1. Install OSINTGraph\n```bash\npipx install osintgraph\n```\nor\n```bash\npip install osintgraph\n```\n> [!NOTE]\n> When using pip, it\u2019s recommended to install inside a Python virtual environment to avoid dependency conflicts.\n\n### 2. Setup Configuration \nBefore running `osintgraph setup`, make sure you have the following ready:\n\n- **Instagram Account:** Preferably not your main account\n\n- **Neo4j Database:** For storing and visualizing data.\n  \n  (Sign up at [Neo4j](https://neo4j.com) \u2192 Create an instance for free \u2192 Download admin credentials) \u2014 you\u2019ll need these for connection.\n\n- **Gemini API Key:** Enables data pre-analyses and the AI agent.\n  \n  (Sign up at [Google AI Studio](https://aistudio.google.com) \u2192 Create or select a Google Cloud project \u2192 Get API Key for free)\n\n- **User Agent (Optional):** Helps reduce Instagram detection risk.\n  (Open your Firefox browser where you log in to Instagram, search \u201cmy user agent\u201d on Google, and copy it)\n\nThen run \n```bash\nosintgraph setup\n```\n\n### 3. Start collecting Instagram data\nStart gathering data on your target:\n```bash\nosintgraph discover TARGET_INSTAGRAM_USERNAME --limit follower=100 followee=100 post=2 \n```\n### 4. Analyze & Investigate\nLaunch the AI Agent to explore and analyze collected data:\n```bash\nosintgraph agent\n```\nOnce the agent starts, try asking it:\n```Show the target user\u2019s profile info```\n\n### 5. Visualize in Neo4j\nExplore your target\u2019s network graph interactively.\n- Go to the [Neo4j Console](https://console-preview.neo4j.io/tools/explore).\n- Click the **Explore tab**, then **Connect**.\n- In the search bar, type \"Show me a graph\".\n- You should now see the person you just collected, along with their relationships.\n\n\n## \u26a1 How OSINTGraph Works\n\n**OSINTGraph run in two main phases: [Reconnaissance](#phase-1-reconnaissance) and [Analysis & Investigation](#phase-2-analysis--investigation).**\n\n\n\n```bash\n   \u26a1PHASE 1: RECONNAISSANCE                                           \u26a1PHASE 2: ANALYSIS & INVESTIGATION\n   \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500                                           \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n   [ Data Collection ] (osintgraph discover <target>)                    [ Investigation ] \n     \u251c\u2500 Profile Metadata                                                   \u251c\u2500 [AI Agent] (osintgraph agent)\n     \u251c\u2500 Followers                                                          \u2502    \u2022 Retrieve Data    \n     \u251c\u2500 Followees                                                          \u2502    \u2022 Keyword Search\n     \u2514\u2500 Posts (with Comments)                                              \u2502    \u2022 Semantic Search\n           \u2193                                                               \u2502    \u2022 Graph Relationship Search\n   Posts Pre-Analysis                                                      \u2502    \u2022 Run Template Analyses\n     \u251c\u2500 Uses:                                                              \u2514\u2500 [Neo4j Visualization]\n     \u2502    \u2022 Post Metadata\n     \u2502    \u2022 Comments\n     \u2502    \u2022 Image Pre-Analyses\n     \u2502         \u251c\u2500 Uses:\n     \u2502         \u2502    \u2022 Post media (thumbnails & images)\n     \u2502         \u2514\u2500 Generates:\n     \u2502              \u2022 Structured Image Analysis Report\n     \u2514\u2500 Generates:\n          \u2022 Structured Post Analysis Report\n            \u2193\n    Account Pre-Analysis\n      \u251c\u2500 Uses:\n      \u2502    \u2022 All Post Analyses\n      \u2502    \u2022 Profile Metadata\n      \u2514\u2500 Generates:\n           \u2022 Structured Account Analysis Report\n\n```\n\n### Phase 1: Reconnaissance\nIn this phase, you **collect all public Instagram data** for a target and their network.\nYou\u2019re building the raw intelligence database that you\u2019ll investigate later.\n\n**What you do:**\n\nRun one of these commands to collect all public Instagram data for a target and their network:\n\n- `osintgraph discover <target>` \u2014 Collect and (optionally) pre-analyze the target account\u2019s data.\n\n- `osintgraph explore <target>` \u2014 Recursively run `discover` on each followee of the target, prioritizing followees with the largest follower base in the Neo4j database.\n\n**What OSINTGraph does in the background:**\n1. Scrapes the target\u2019s profile, followers, followees, posts, and comments.\n2. If Gemini API is enabled, pre-analyzes:\n   - Image Analysis: Each post\u2019s media is examined for visual clues and details.\n   - Post Analysis: Combines image findings, post metadata, and comments into a structured OSINT report.\n   - Account Analysis: Summarizes patterns and behaviors across all posts for the account.\n   > Pre-analysis quickly examines posts and account data to give you early insights. It\u2019s also useful for template-based investigations, because templates can use the pre-analyzed data immediately for deeper analysis.\n3. Maps all relationships (likes, follows, replies, etc.) into Neo4j. [See how Instagram data is stored in Neo4j \u2197](#-data-model-neo4j-schema)\n\n\n### Phase 2: Analysis & Investigation\n\nIn this phase, you **search**, **analyze**, and **visualize** the intelligence gathered in Phase 1.\nNow you\u2019re making sense of the network, activities, and patterns.\n\n**What you do:**\n- **Query** data using natural language, keyword/semantic search, and graph-relationship queries.\n- **Run** analyses using predefined or custom templates.\n- **Explore and Visualize** social networks interactively.\n\n\n**You have two main ways to do this:**\n\n#### 1. AI Agent  `osintgraph agent`\n\n- Ask questions for data retrieval, keyword and semantic searches, graph-relationship based queries, and analyses using predefined or custom templates.\n[Learn more about Agent \u2197](#-osintgraph-ai-agent--getting-started-guide)\n\n#### 2. Neo4j Visualization ([Neo4j Console Browser](https://console-preview.neo4j.io/tools/explore))\n\n- Explore visualize the social network map interactively.\n- See how people, posts, and interactions are connected.\n\n## \u2699 Commands Reference\nBelow is a breakdown of each command, what it does, and when to use it.\n\n### \ud83d\udd27 `setup [option]`\n\n<details>\n<summary>See Usage & options</summary>\n   \n**Purpose:**\n\nConfigures services and credentials so OSINTGraph can access Instagram, Neo4j, Gemini.\n\n**Options:**\n\n- `all` (default) \u2014 configure everything.\n\n- `instagram` \u2014 configure Instagram scraping credentials (cookies/session).\n\n- `neo4j` \u2014 set up your Neo4j database connection.\n\n- `gemini` \u2014 set your Gemini API key for AI analysis.\n\n- `user-agent` \u2014 customize the User-Agent string for scraping.\n\n**When to use:**\nRun this the first time you install OSINTGraph or to set credentials.\n\n\nExamples:\n\n```bash\nosintgraph setup\nosintgraph setup instagram\n```\n\n</details>\n\n### \ud83d\udd27 `reset [option]`\n\n<details>\n<summary>See Usage & options</summary>\n   \n**Purpose:**\nClears stored credentials for the chosen option and immediately re-runs setup for that option.\n\n**Options:**\n\n- `all` (default) \u2014 reset everything and reconfigure.\n\n- `instagram` \u2014 reset Instagram credentials.\n\n- `neo4j` \u2014 reset Neo4j database connection settings.\n\n- `gemini` \u2014 reset your Gemini API key.\n\n- `user-agent` \u2014 reset the User-Agent string for scraping.\n  \n**When to use:**\nUse this when you need to change or update your credentials (e.g., expired Instagram session, new API key, changed Neo4j password).\n\n**Examples:**\n\n```bash\nosintgraph reset\nosintgraph reset instagram\n```\n\n</details>\n\n\n### \ud83d\udd0d `discover <username>`\n\n<details>\n<summary>See Usage & options</summary>\n   \n**Purpose:**\nCollects all public data for a single Instagram account.\n\n**What it does:**\n\n- Scrapes followers, followees, and posts (with comments).\n\n- Runs **AI-powered post analysis** (`post_analysis`) ). (if Gemini is configured)\n\n- Runs **AI-powered account analysis** (`account_analysis`) after all posts are analyzed. (if Gemini is configured)\n\n- Saves everything in Neo4j.\n\n\n> **Resumable runs**  \n> - If `discover` cannot finish scraping or analysis in one run (for example, a target has thousands of followers or many posts), the progress is saved automatically.  \n> - Running the same command again with the same target will continue from where it left off until all data and analysis are complete.  \n> - Finished sections are skipped on later runs, so no duplicate work is done.  \n> - Use `--force` if you want to re-fetch or re-analyze any part (e.g., `--force follower`, `--force account-analysis`).\n>\n> **Limitation**\n> - When scraping followers and followees, only username and basic profile info are collected. To get full profiles, posts, and comments, you need to run `discover` on each account separately.\n> - When scraping posts, likes and comments are collected, but only a partial amount may be available due to Instagram\u2019s limitations.\n\n\n**Options:**\n\n- `--skip [parts]` \u2014 skip certain steps.\n\n   *(Options: all, follower, followee, post, post-analysis, account-analysis)*  \n   Example: `--skip post-analysis` will skip analyzing posts with AI.\n- `--limit TYPE=NUMBER` \u2014 limit how many items to fetch per type (default: follower=1000, followee=1000, post=10).\n\n   *(Options: follower, followee, post)*  \n   Example: `--limit post=5` \u2014 fetches only 5 posts.\n- `--rate-limit NUMBER` \u2014  pause for 8\u201310 minutes after every N request to avoid detection.  \n   Example: `--rate-limit 500` will wait 8~10 minutes after every 500 Instagram requests.\n- `--force [parts]` \u2014 re-fetch or re-analyze even if already done.   \n\n  *(Options: all, follower, followee, post, post-analysis, account-analysis)*  \n   Example: `--force account-analysis` \u2014 **resets the progress** and reruns the AI analysis on the account data\n\n**When to use:**\nFirst step of any investigation \u2014 gets all data for your primary target.\n\nExample:\n\n```bash\nosintgraph discover \"target_user\"\nosintgraph discover \"target_user\" --skip post-analysis account-analysis --limit follower=200 post=15 --force follower followee\n```\n\n</details>\n\n\n### \ud83c\udf10 `explore <username>`\n\n<details>\n<summary>See Usage & options</summary>\n\n**Purpose:**\n\nRecursive discovery \u2014 goes beyond your target to their network.\n\n**What it does:**\n\n- Runs `discover` on each followee of the target, prioritizing those with the largest follower counts in your Neo4j database.\n\n> Focuses on followees because they often reveal the target\u2019s real interests, communities, and affiliations\u2014such as local groups, news sources, favorite influencers, or close friends. Within these, accounts with larger follower bases in your Neo4j DB are explored first, increasing the chances of uncovering valuable insights.\n\n- Stops after a set number of accounts.\n\n**Options:**\n\n- `--max NUMBER` \u2014 how many accounts to discover in total.\n   Example: `--max 10` \u2014 the agent will `discover` up to 10 followees of the target, then stop.\n  \n*(The following options work the same way as in `discover`)*\n- `--skip [parts]` \u2014 skip certain steps (e.g., post-analysis).\n\n   *(Options: all, follower, followee, post, post-analysis, account-analysis)*\n- `--limit TYPE=NUMBER` \u2014 limit how many items to fetch per type (default: follower=1000, followee=1000, post=10).\n\n   *(Options: follower, followee, post)*\n- `--rate-limit NUMBER` \u2014  pause for 8\u201310 minutes after every N request to avoid detection.\n\n- `--force [parts]` \u2014 re-fetch or re-analyze even if already done.\n\n  *(Options: all, follower, followee, post, post-analysis, account-analysis)*\n\n\n**When to use:**\nTo expand your investigation into the wider social network.\n\nExample:\n\n```bash\nosintgraph explore \"target_user\"\nosintgraph explore \"target_user\" --max 10 --limit follower=1000 followee=500\n```\n\n</details>\n\n\n### \ud83e\udd16 `agent`\n\n<details>\n<summary>See Usage & options</summary>\n\n**Purpose:**\n\nLaunches the OSINTGraph AI Agent for natural language investigations.\n\n**What it can do:**\n\n- Keyword search across your Neo4j database.\n\n- Semantic search using AI embeddings.\n\n- Auto-generate and run Cypher queries.\n\n- Execute prebuilt or custom YAML investigation templates.\n\n**Key options:**\n\n- `--debug` \u2014 store detailed debug output for template.\n\n**When to use:**\n\nAfter you\u2019ve collected data use the agent to ask questions, run analysis, or execute templates.\n\nExample:\n\n```bash\nosintgraph agent --debug\n```\n\n</details>\n\n\n## \ud83e\udde9 Data Model (Neo4j Schema)\nAfter scraping, OSINTGraph stores Instagram data in Neo4j as interconnected nodes and relationships.\n<img width=\"710\" height=\"447.8\" alt=\"OsintgraphNeo4j\" src=\"https://github.com/user-attachments/assets/dc34d94b-fa2b-43c4-8435-a898c8a4dcb1\" />\n\n*OSINTGraph Data Model (All Entities & Relationships)*\n\n### \ud83d\udc64 Person - Represents an Instagram account.\n\n<details>\n<summary>See all properties</summary>\n   \n| Property                          | Type    | Description                                          |\n| --------------------------------- | ------- | ---------------------------------------------------- |\n| **id**                            | INTEGER | Unique identifier for the person node.               |\n| **username**                      | STRING  | Instagram username.                                  |\n| **fullname**                      | STRING  | Full display name from profile.                      |\n| **bio**                           | STRING  | Profile biography text.                              |\n| **followers**                     | INTEGER | Number of followers.                                 |\n| **followees**                     | INTEGER | Number of accounts followed.                         |\n| **mediacount**                    | INTEGER | Number of posts uploaded.                            |\n| **external\\_url**                 | STRING  | External link in profile bio.                        |\n| **business\\_category\\_name**      | STRING  | Business category if a business account.             |\n| **is\\_verified**                  | BOOLEAN | True if the account has a verification badge.        |\n| **is\\_business\\_account**         | BOOLEAN | True if the account is marked as a business account. |\n| **has\\_highlight\\_reels**         | BOOLEAN | True if the user has highlight stories.              |\n| **has\\_public\\_story**            | BOOLEAN | True if the account has public stories.              |\n| **is\\_private**                   | BOOLEAN | True if the account is private.                      |\n| **profile\\_pic\\_url**             | STRING  | Profile picture URL.                                 |\n| **profile\\_pic\\_url\\_no\\_iphone** | STRING  | Alternate profile picture URL.                       |\n| **biography\\_hashtags**           | LIST    | Hashtags used in the bio.                            |\n| **biography\\_mentions**           | LIST    | Usernames mentioned in the bio.                      |\n\n#### Analysis Fields\n| Property              | Type   | Description                           |\n| --------------------- | ------ | ------------------------------------- |\n| **account\\_analysis** | STRING | AI-generated analysis of the account. (stringified JSON)|\n\n<details>\n  <summary>Show account_analysis structure</summary>\n  <pre><code class=\"json\">\n  {\n  \"account_summary\": {\n    \"who_runs_this_account\": {\n      \"summary\": \"\",\n      \"confidence\": \"\"\n    },\n    \"what_type_of_account\": {\n      \"label\": \"\",\n      \"reasoning\": \"\",\n      \"confidence\": \"\"\n    },\n    \"why_this_account_exists\": {\n      \"main_purpose\": \"\",\n      \"supporting_signals\": []\n    },\n    \"who_is_the_target_audience\": {\n      \"summary\": \"\",\n      \"reasoning\": \"\"\n    },\n    \"what_it_posts_about\": {\n      \"topic_distribution\": [\n        {\n          \"topic\": \"\",\n          \"percentage\": \"\"\n        }\n      ]\n    },\n    \"how_often_it_posts\": {\n      \"avg_posts_per_month\": \"\",\n      \"most_active_days\": [],\n      \"seasonal_patterns\": \"\"\n    },\n    \"who_comments_on_it\": {\n      \"audience_profile\": {\n        \"likely_age_range\": \"\",\n        \"languages_used\": [],\n        \"comment_style\": \"\",\n        \"emotional_tone\": \"\"\n      },\n      \"relationship_to_owner\": \"\"\n    },\n    \"how_comments_look\": {\n      \"comment_quality\": \"\",\n      \"reply_behavior\": \"\",\n      \"engagement_style\": \"\",\n      \"detected_bots_or_fake_activity\": false\n    },\n    \"notable_flags_or_anomalies\": {\n      \"inconsistencies\": [],\n      \"suspicious_behavior\": [],\n      \"possible_account_switch_history\": false\n    },\n    \"language_and_text_patterns\": {\n      \"caption_language\": [],\n      \"common_caption_themes\": [],\n      \"hashtags_usage\": \"\",\n      \"emoji_usage\": \"\",\n      \"comment_language_distribution\": [],\n      \"comment_length\": \"\"\n    },\n    \"summary_notes\": \"\"\n  }\n}\n  </code></pre>\n</details>\n\n\n#### Semantic Search Fields\n| Property                      | Type | Description                                               |\n| ----------------------------- | ---- | --------------------------------------------------------- |\n| **username\\_vector**          | LIST | Vector embedding of username for semantic search.         |\n| **bio\\_vector**               | LIST | Vector embedding of biography for semantic search.        |\n| **fullname\\_vector**          | LIST | Vector embedding of full name for semantic search.        |\n| **account\\_analysis\\_vector** | LIST | Vector embedding of account analysis for semantic search. |\n\n#### Internal Fields\n| Property                          | Type    | Description                                    |\n| --------------------------------- | ------- | ---------------------------------------------- |\n| **\\_profile\\_complete**           | BOOLEAN | Internal flag: profile scrape completed.       |\n| **\\_followers\\_complete**         | BOOLEAN | Internal flag: follower list scrape completed. |\n| **\\_followees\\_complete**         | BOOLEAN | Internal flag: followee list scrape completed. |\n| **\\_posts\\_complete**             | BOOLEAN | Internal flag: posts scrape completed.         |\n| **\\_posts\\_analysis\\_complete**   | BOOLEAN | Internal flag: post analysis completed.        |\n| **\\_account\\_analysis\\_complete** | BOOLEAN | Internal flag: account analysis completed.     |\n| **\\_followers\\_resume\\_hash**     | STRING  | Internal resume state for follower scraping.   |\n| **\\_followees\\_resume\\_hash**     | STRING  | Internal resume state for followee scraping.   |\n| **\\_posts\\_resume\\_hash**         | STRING  | Internal resume state for posts scraping.      |\n\n</details>\n\n\n### \ud83d\udcf7 Post - Represents an Instagram post.\n\n<details>\n<summary>See all properties</summary>\n\n| Property                   | Type       | Description                                      |\n| -------------------------- | ---------- | ------------------------------------------------ |\n| **id**                     | INTEGER    | Unique identifier for the post node.             |\n| **shortcode**              | STRING     | Instagram post shortcode (URL-friendly ID).      |\n| **caption**                | STRING     | Post caption text.                               |\n| **pcaption**               | STRING     | Preprocessed caption text (cleaned).             |\n| **title**                  | STRING     | Post title (if available).                       |\n| **likes**                  | INTEGER    | Number of likes on the post.                     |\n| **comments**               | INTEGER    | Number of comments on the post.                  |\n| **is\\_video**              | BOOLEAN    | True if the post is a video.                     |\n| **video\\_duration**        | INTEGER    | Video length in seconds.                         |\n| **video\\_view\\_count**     | INTEGER    | Number of video views.                           |\n| **is\\_pinned**             | BOOLEAN    | True if the post is pinned on profile.           |\n| **is\\_sponsored**          | BOOLEAN    | True if the post is marked as sponsored content. |\n| **typename**               | STRING     | Instagram media type name.                       |\n| **mediacount**             | INTEGER    | Number of media items (for carousel posts).      |\n| **accessibility\\_caption** | STRING     | Alt-text or accessibility caption.               |\n| **tagged\\_users**          | LIST       | Usernames tagged in the post.                    |\n| **caption\\_hashtags**      | LIST       | Hashtags used in the post caption.               |\n| **caption\\_mentions**      | LIST       | Mentions in the post caption.                    |\n| **date\\_utc**              | DATE\\_TIME | UTC timestamp of post creation.                  |\n| **date\\_local**            | DATE\\_TIME | Local timestamp of post creation.                |\n\n#### Analysis Fields\n| Property            | Type   | Description                               |\n| ------------------- | ------ | ----------------------------------------- |\n| **post\\_analysis**  | STRING | AI-generated analysis of the post. (stringified JSON)|\n| **image\\_analysis** | STRING | AI-generated image analysis for the post. (stringified JSON array)|\n<details>\n  <summary>Show post_analysis structure</summary>\n  <pre><code class=\"json\">\n    {\n  \"post_metadata_summary\": {\n    \"post_type\": \"\",\n    \"post_tone\": \"\",\n    \"post_intent\": \"\",\n    \"poster_role_or_affiliation\": \"\",\n    \"target_audience\": \"\",\n    \"posting_motivation\": \"\",\n    \"date_context\": \"\",\n    \"sponsored_or_promotional\": false\n  },\n  \"visual_analysis_summary\": {\n    \"key_findings\": \"\",\n    \"notable_objects_or_symbols\": \"\",\n    \"people_or_groups_shown\": \"\",\n    \"locations_or_geo_clues\": \"\",\n    \"emotion_or_energy_level\": \"\",\n    \"forensic_red_flags\": []\n  },\n  \"comment_section_analysis\": {\n    \"overall_sentiment\": \"\",\n    \"common_comment_behaviors\": \"\",\n    \"dominant_tones_or_emotions\": \"\",\n    \"top_words_or_emojis\": [],\n    \"interaction_patterns\": \"\",\n    \"bot_or_coordinated_activity\": false,\n    \"cultural_or_linguistic_signals\": \"\"\n  },\n  \"behavioral_and_social_insight\": {\n    \"likely_poster_motivation\": \"\",\n    \"social_group_affiliations\": \"\",\n    \"influence_or_recruitment_signs\": \"\",\n    \"propaganda_or_polarization_signals\": \"\",\n    \"deception_or_misinfo_signs\": \"\"\n  },\n  \"osint_value\": {\n    \"intelligence_usefulness\": \"\",\n    \"recommended_followup\": \"\",\n    \"confidence_level\": \"\",\n    \"summary_takeaways\": \"\"\n  }\n}\n  </code></pre>\n</details>\n<details>\n  <summary>Show image_analysis structure</summary>\n  <pre><code class=\"json\">\n{\n  \"image_type\": \"\",\n  \"image_tone\": \"\",\n  \"image_scenario\": \"\",\n  \"image_intent\": \"\",\n  \"people_count_visible\": \"\",\n  \"people_visibility_level\": \"\",\n  \"people_gender\": \"\",\n  \"people_age_range\": \"\",\n  \"people_ethnicity\": \"\",\n  \"people_clothing\": \"\",\n  \"people_accessories\": \"\",\n  \"people_hair_description\": \"\",\n  \"people_facial_hair\": \"\",\n  \"people_face_features\": \"\",\n  \"people_body_type\": \"\",\n  \"people_skin_tone\": \"\",\n  \"people_posture\": \"\",\n  \"people_actions\": \"\",\n  \"people_dominant_hand\": \"\",\n  \"people_walking_style\": \"\",\n  \"people_emotions\": \"\",\n  \"people_interaction\": \"\",\n  \"people_possible_role\": \"\",\n  \"people_items_carried\": \"\",\n  \"people_visible_tech\": \"\",\n  \"people_tattoos_piercings\": \"\",\n  \"people_symbols_or_badges\": \"\",\n  \"people_identity_clues\": \"\",\n  \"people_eye_color\": \"\",\n  \"people_glasses_or_contacts\": \"\",\n  \"people_mouth_expression\": \"\",\n  \"people_visible_injuries\": \"\",\n  \"people_makeup_or_face_paint\": \"\",\n  \"people_body_language\": \"\",\n  \"people_proximity\": \"\",\n  \"people_group_behavior\": \"\",\n  \"people_footwear\": \"\",\n  \"people_carry_method\": \"\",\n  \"people_visible_tattoos\": \"\",\n  \"people_eye_contact\": \"\",\n  \"people_accessory_details\": \"\",\n  \"people_disabilities_or_devices\": \"\",\n  \"people_behavior_notes\": \"\",\n  \"text_present\": false,\n  \"text_transcribed\": \"\",\n  \"text_language\": \"\",\n  \"text_font_style\": \"\",\n  \"text_meaning\": \"\",\n  \"clothing_style\": \"\",\n  \"clothing_colors\": \"\",\n  \"clothing_symbols_or_logos\": \"\",\n  \"facial_expressions\": \"\",\n  \"group_mood\": \"\",\n  \"scene_location_type\": \"\",\n  \"scene_background\": \"\",\n  \"scene_time_weather\": \"\",\n  \"notable_objects\": \"\",\n  \"tech_or_tools\": \"\",\n  \"vehicles_or_props\": \"\",\n  \"visible_text_on_objects\": \"\",\n  \"uniforms_or_insignia\": \"\",\n  \"environment_signs\": \"\",\n  \"editing_or_staging_signs\": \"\",\n  \"license_plate_number\": \"\",\n  \"license_plate_region\": \"\",\n  \"brands_or_product_names\": \"\",\n  \"unique_identifiers\": \"\",\n  \"safety_gear\": \"\",\n  \"weapon_type\": \"\",\n  \"vehicle_type_or_model\": \"\",\n  \"unusual_objects\": \"\",\n  \"animals_seen\": \"\",\n  \"activity_signs\": \"\",\n  \"time_displayed\": \"\",\n  \"image_quality\": \"\",\n  \"visual_style\": \"\",\n  \"filters_or_watermarks\": \"\",\n  \"geo_clues\": \"\",\n  \"primary_language_seen\": \"\",\n  \"regional_indicators\": \"\",\n  \"slang_or_dialect_detected\": \"\",\n  \"cultural_or_religious_signs\": \"\",\n  \"group_affiliations\": \"\",\n  \"flags_uniforms_gestures\": \"\",\n  \"deception_signs\": \"\",\n  \"hashtags_or_keywords\": \"\",\n  \"geo_political_relevance\": \"\",\n  \"game_detected\": false,\n  \"game_name\": \"\",\n  \"exif_device\": \"\",\n  \"watermark_found\": false,\n  \"original_image_source\": \"\",\n  \"poster_intent\": \"\",\n  \"target_audience\": \"\",\n  \"engagement_tricks\": \"\",\n  \"psychological_triggers\": \"\",\n  \"radical_language_or_symbols\": \"\",\n  \"call_to_action\": \"\",\n  \"recruiting_or_polarizing_content\": \"\",\n  \"misinfo_or_agenda_signals\": \"\",\n  \"summary_type\": \"\",\n  \"key_takeaways\": \"\",\n  \"cultural_or_geo_significance\": \"\",\n  \"poster_purpose\": \"\",\n  \"osint_value\": \"\",\n  \"confidence_in_analysis\": \"\"\n}\n  </code></pre>\n</details>\n\n\n#### Semantic Search Fields\n| Property                    | Type | Description                         |\n| --------------------------- | ---- | ----------------------------------- |\n| **caption\\_vector**         | LIST | Vector embedding of caption text for semantic search..   |\n| **title\\_vector**           | LIST | Vector embedding of title text for semantic search..     |\n| **post\\_analysis\\_vector**  | LIST | Vector embedding of post analysis for semantic search..  |\n| **image\\_analysis\\_vector** | LIST | Vector embedding of image analysis for semantic search.. |\n\n</details>\n\n### \ud83d\udcac Comment - Represents a comment on a post.\n\n<details>\n<summary>See all properties</summary>\n   \n| Property             | Type       | Description                             |\n| -------------------- | ---------- | --------------------------------------- |\n| **id**               | INTEGER    | Unique identifier for the comment node. |\n| **text**             | STRING     | Comment text.                           |\n| **likes\\_count**     | INTEGER    | Number of likes on the comment.         |\n| **created\\_at\\_utc** | DATE\\_TIME | UTC timestamp of comment creation.      |\n\n#### Semantic Search Fields\n| Property         | Type | Description                                           |\n| ---------------- | ---- | ----------------------------------------------------- |\n| **text\\_vector** | LIST | Vector embedding of comment text for semantic search. |\n\n</details>\n\n\n### \ud83d\udd78 Relationships\n\n| Relationship                            | Description                                  |\n| --------------------------------------- | -------------------------------------------- |\n| \ud83d\udc64 Person - **Follows** -> \ud83d\udc64 Person    | A person **follows** another person.         |\n| \ud83d\udc64 Person - **Posted** -> \ud83d\udcf7 Post       | A person **created** the post.               |\n| \ud83d\udc64 Person - **Liked** -> \ud83d\udcf7 Post        | A person **liked** a specific post.          |\n| \ud83d\udc64 Person - **Commented** -> \ud83d\udcac Comment | A person **authored** the comment.           |\n| \ud83d\udcac Comment - **On** -> \ud83d\udcf7 Post          | The comment is **made on** a specific post.  |\n| \ud83d\udcac Comment - **Reply To** -> \ud83d\udcac Comment | A comment is a **reply to** another comment. |\n| \ud83d\udc64 Person - **Liked** -> \ud83d\udcac Comment     | A person **liked** a comment.                |\n\n\n## \ud83d\udd75 OSINTGraph AI Agent \u2013 Getting Started Guide\nThe OSINTGraph Agent helps you **explore, retrieve, and analyze your OSINT data in Neo4j.**\nIt works in two main ways:\n\n- **Data Retrieval & Simple Analysis** \u2013 Fetch accounts, posts, comments, and relationships using filters, graph queries, and searches. You can also ask for quick insights (summaries, counts, highlights) on the retrieved data.\n\n- **Template-Based Analysis** \u2013 For deeper investigations, use pre-built or custom templates. Templates guide the agent to retrieve the right data and apply structured analysis for more controlled , focused, and repeatable investigations.\n\nThis guide shows the two main ways to interact with the OSINTGraph AI Agent - **Data Retrieval** for quick questions, and **Template-Based Analysis** for deeper investigations. It also explains how to ask clear questions so you get the most accurate results.\n\n> [!NOTE]\n> These example questions are just a guide \u2014 you can ask the agent in your own words, and it will understand.\n\n\n### 1. \ud83d\udd27 Data Retrieval\nData Retrieval is best for **direct queries** and **simple analyses questions**\nYou can use it to fetch data based on **filters**, **relationships**, or **searches**.\n\n#### Approach 1: Basic Data Retrieval  \nGet data by filtering on straightforward criteria (e.g., usernames or dates).\n\n**Example:**  \n- \u201cGet John\u2019s comments from 2025\u201d  \n  *(Returns all comments made by John during 2025)*\n\n- \u201cHow many comments has John made in 2025\u201d  \n  *(Returns the total number of comments John made during 2025)*\n\n---\n\n#### Approach 2: Relationship Traversal  \nInclude social connections in your query \u2014 followers, likers, commenters, etc.\n\n**Example:**  \n- \u201cFind followers of John who commented on his posts in 2025\u201d  \n  *(Returns users who follow John and commented on his posts during 2025)*\n\n---\n\n#### Approach 3: Content Search\n\nYou can search data using two methods:\n\n- **Keyword Search (literal word match):**\n  Finds exact matches of words or phrases.  \n  *Example:* \u201cFind John\u2019s comments from 2025 with the word \u2018conference\u2019\u201d  \n  *(Returns John\u2019s 2025 comments containing the exact word \u201cconference\u201d)*\n\n- **Semantic Search (meaning-based):**\n  Finds content based on related meanings, including synonyms or related terms.\n  \n   Supported fields include:\n\n   - Person: `username`, `fullname`, `bio`, `account_analysis`\n   \n   - Post: `caption`, `title`, `post_analysis`, `image_analysis`\n   \n   - Comment: `text`\n     \n  *Example:* \u201cShow John\u2019s comments from 2025 about startups\u201d  \n  *(Returns John\u2019s 2025 comments'text related to \u201cstartups,\u201d such as \u201cnew companies\u201d or \u201cventures\u201d)*\n\n---\n\n#### Combining Approaches\nYou can mix filters, relationships, and content search for precise results:\n\n- \u201cFind followers of John who liked his posts about startups in 2025\u201d\n\n> - Filters posts by date (2025)  \n> - Traverses relationships to get John\u2019s followers who liked those posts  \n> - Apply semantic search on post content to find those about startups  \n\n- \u201cFind followers of John who liked his posts with the word \u2018conference\u2019 in 2025\u201d  \n> - Filters posts by date (2025)  \n> - Traverses relationships to get John\u2019s followers who liked those posts  \n> - Apply keyword search on post content for the exact word \u201cconference\u201d\n\n--- \n\n### \ud83c\udfaf Best Practices \u2013 How to Ask Questions for Best Results\n\nBeing precise makes your results more accurate and useful. Here are key ways to improve your queries:\n\n**Examples of precision:**\n\n#### Precision in Searching Method  \n- **Vague:** \"Find posts about aura farming\"  \n- **Precise:** \"**Use semantic search**, find posts about aura farming.\"\n\n#### Precision in Targeting Data Fields  \n- **Vague:** \"Search for aura farming\"  \n- **Precise:** \"Use semantic search on **post captions** about aura farming.\"\n\n#### Precision in Context and Entities  \n- **Vague:** \"Where is John?\"  \n- **Precise:** \"Which location might John be at **based on post captions, post analysis, and person bio**?\"\n\n#### Precision in Getting Results  \n- **Vague:** \"Tell me about John\"  \n- **Precise:** \"Give John\u2019s account analysis and follower count.\"\n\n\ud83d\udca1 **Tip:** When asking, think about:  \n- What searching method should be applied if needed? (semantic search, keyword search)  \n- Which data fields should be checked? (person bio, post analysis, post captions, etc.)  \n- What exactly do you want back? (summary, detailed context, related entities, relationships, etc.)\n\nThis will speed up your investigation and ensure the Agent looks in the right places.\n\n---\n\n### 2. \ud83e\udde9 Template-Based Analysis\n\nTemplates are **blueprints that tell the AI how to analyze your data**. Instead of manually going through posts, comments, likes, and social connections\u2014which can take days\u2014a template lets the OSINTGraph agent **gather all the needed data, feed it into a fresh AI, and get clear answers**.\n\n**Example scenario:** You want to figure out where a person might be located. Doing it manually would take hours or days\u2014looking through every post, comment, and followee. With a template, the AI can **analyze all this data** and **summarize likely locations**, saving you time and effort.\n\nEach template run:\n\n- Spawns a **new AI instance** with no memory of previous runs.\n\n- Uses a **system prompt** (the AI\u2019s \u201cbrain\u201d) to guide reasoning.\n\n- Injects the gathered data into a **user prompt** for analysis.\n\nTemplates are great because they let you:\n\n1. **Control** how the AI thinks and reasons.\n\n2. **Get consistent, repeatable results.**\n\n3. **Analyze large datasets quickly** without doing manual work.\n\n4. **Reuse the same template** across different targets or investigations.\n\n\n### \ud83d\udcdd Template Structure\n\nTemplates are written as `.yaml` files with the following structure:\n\n```yaml\nname: <unique_template_name>\n# Example: liked_post_analysis\n# A unique identifier for the template. Used to select and run this template.\n\ndescription: |\n  <Brief explanation of what the template does, what kind of data it processes, and the type of output it produces.>\n  # Example:\n  #    Analyze liked posts to infer user interests and personality traits.\n\ninput_fields:\n  # List of placeholders that will be replaced by actual data when running the template.\n  # Each field defines a unique placeholder name and what data should be injected by OSINTGraph agent into that placeholder.\n  - name: placeholder1\n    description: |\n      <Explain clearly what data this field should contain, and the exact format required.>\n      # The agent will read these descriptions to automatically choose the correct Cypher queries, run them, and inject the results in the requested format.\n      # Example:\n      #    Provide User profile info including Person.username and Person.bio.\n      #    Give results in this format:\n      #       Username: ...\n      #       Bio: ...\n\n  - name: placeholder2\n    description: |\n      <Explain what this second input field should contain and its format.>\n      # Describe what kind of data should be injected into this second placeholder when the template runs.\n      # Example:\n      #    A list of posts liked by the user, each with Post.caption and Post.post_analysis.\n      #    Format in this way:\n      #    Post:\n      #       Catpion: ...\n      #       Post analysis: ...     \n\nsystem_prompt: |\n  <Instructions defining the AI\u2019s role, behavior, reasoning style, and output format>\n  # Defines the LLM style, tone, rules, how to reason, what to infer, and how to format results\n  # Example:\n  #   You are a social media analyst. Review the user's liked posts and infer behavioral patterns or thematic interests based on post content.\n\nuser_prompt: |\n  <Task description with placeholders for injected data>\n  # The task request, with special placeholders `{placeholders}` for injected data\n  # Example:\n  #    Analyze the following profile and liked posts:\n  #    Profile Info:\n  #    {placeholder1}\n  #\n  #    Posts liked by the user:\n  #    {placeholder2}\n```\n\nSee an example template here: [location_analysis.yaml](https://github.com/XD-MHLOO/osintgraph-templates/blob/main/templates/location_analysis.yaml)\n\n### \ud83d\udce6 Predefined Templates\n\nOSINTGraph comes with several ready-to-use templates that cover common OSINT investigations. You can run them immediately without creating your own.\n\nExamples include:\n\n- **location_analysis** \u2013 Determine possible locations of the target user by analyzing posts, comments, likes, and their social graph.\n\n- **contact_info_extraction** \u2013 Scan bios, captions, comments, and images for potential leaks of emails, phone numbers, or addresses.\n\n- **interests_hobbies_lifestyle_analysis** \u2013 Uncover the target user\u2019s interests, hobbies, and lifestyle preferences with supporting evidence from posts, likes, and network connections.\n\nAll predefined templates are maintained in this repository: https://github.com/XD-MHLOO/osintgraph-templates\n\n**\ud83d\udc49 To see the full list of predefined templates**:\n\n  Ask the agent to list all templates in the folder. \n\n  > \"list all templates\"\n\n**\ud83d\udc49 To view details of a specific one:**\n\n  Ask the agent to show a template by name, or you can view the YAML file directly in your templates folder (`osintgraph -h` to see the folder path).\n\n  > \"show template location_analysis\"\n\n**\ud83d\udc49 To run a predefined template:**  \nAsk the agent to execute the template.\n\n> \"Run location_analysis on target_username\"\n\n\n### \u26a1 How Templates Work\n\n1. **You request a template to run**  \n    Example template with required additional context (e.g., username):  \n\n   > \"Run location_analysis template on JohnDoe\"\n\n   Choose the template you want to run and provide the agent with any required context.\n  \n   If you're not sure what to provide, simply ask the agent(e.g. \"How to use \\<the template\\>\") \u2014 it will guide you.\n\n2. **Agent collects required data automatically**\n\n   Based on the template\u2019s input field descriptions, the agent automatically runs Cypher queries on your Neo4j database. It retrieves all required fields, formats the results, and fills the `{placeholders}` in the template's user prompt.\n\n3. **Run Template and Get Output**\n\n   A new LLM instance is created internally, using the template\u2019s system and user prompts to analyze the data, then returns the output (e.g., analysis, summaries, or explanations) depending on the template's system prompt.\n\n> [!NOTE]  \n> OSINTGraph is primarily built using free services (e.g. Gemini API), therefore template runs are **rate-limited internally** to ensure stability.\n\n### \ud83d\udee0 How to Create Your Own Custom Template\n\n**You can create a custom template by defining a `.yaml` file that controls how the AI analyzes your data.**\n\n#### \ud83e\udde0 Example Use Case\n\nLet\u2019s say you want to analyze a user's **bio**, **post captions**, and **comment texts** to extract any possible of contact details (such as emails, phone numbers, addresses, etc.) You can build a custom template like this:\n\n```yaml\nname: contact_info_extraction\n\ndescription: |\n  Analyze a user's profile bio, post captions, comment texts and image analysis\n  (OCR and visual text) to detect any possible leaks of contact details such as emails, phone numbers, or addresses, and return them in a structured Markdown list with supporting context.\n\ninput_fields:\n  - name: bio\n    description: |\n      The user\u2019s Person.bio.\n\n      Format:\n      Bio:\n        Text: ...\n\n  - name: posts\n    description: |\n      List of all posts made by the user. Each post must include:\n        - Post.shortcode\n        - Post.caption\n        - Post.image_analysis\n\n      Format (One post per entry):\n        User Post:\n          Post Url: https://www.instagram.com/p/<Post.shortcode>/\n          Caption: ...\n          Image Analysis:\n\n          Image 1:\n          - People: [...]\n          - Text/OCR: [...]\n          - Summary: [...]\n\n          Image 2:\n          ...\n\n  - name: comments\n    description: |\n      A list of Comment.text authored by the user.\n\n      Format:\n      Comment:\n        Text: ...\n\nsystem_prompt: |\n  You are a digital privacy analyst. Your task is to carefully analyze the provided data to identify any possible leaks of contact details, including but not limited to:\n  - Email addresses\n  - Phone numbers\n  - Addresses\n  - Social media handles, usernames, or IDs\n  - Any other identifiers that may reveal contact information\n  - Use pattern recognition and contextual reasoning to flag potential contact details.\n  - If detected, report each type of possible contact detail (email, phone, address, ..) in a structured Markdown format.\n  - For each match, include:\n    - The type of contact detail (Email, Phone, Address, etc.)\n    - The exact string detected\n    - The source field (bio, caption, comment, image OCR) (cite Post Url for Post image OCR )\n    - Context / Possible Use \u2014 Based on surrounding information, what the contact might be\n    - A brief reasoning (if the match is inferred and not explicit)\n    - A confidence level (High / Medium / Low), with justification for the confidence\n  - If nothing is found, return: \"No possible contact details detected.\"\n\nuser_prompt: |\n  Review the following content and extract any possible contact-related information:\n  \n  User Bio:\n  {bio}\n  \n  List of User Posts:\n  {posts}\n  \n  List of User Comments:\n  {comments}\n\n\n```\n**Steps to Create Your Template**\n\n1. `name`\nChoose a unique name to identify your template. This will be used to select and run the template.\n\n2. `description`\nBriefly describe what your template does and the kind of output it produces.\n   > This helps the OSINTGraph agent better understand the intent and use of the template.\n\n3. `input_fields`\nDefine what data the agent should inject at runtime. Each input field includes:\n\n- `name`: Used as `{placeholder}` in the user prompt.\n\n- `description`: Explain exactly what data should be injected here and how it should be formatted.\n\n> [!NOTE]\n> - For direct schema attributes (e.g., `Person.bio`, `Post.caption`), mention them explicitly so the agent knows to fetch them directly from the database.\n\n4. `System Prompt`\nWrite clear instructions defining the AI\u2019s role, behavior, how to reason, and how to format its output. This controls how the AI thinks and processes the data.\n\n5. `User Prompt`\nWrite the actual task description, with `{placeholder_name}` tags for runtime data injection.\n\n\n#### \ud83d\udcc2 Add Your Custom Template\n\n1. Place your custom `.yaml` template file into your templates folder.\n(Run osintgraph -h to see where the folder is located.)\n\n2. Validate Your Template:\n   \n   > \"list all templates including invalid ones\"\n   \n   The agent will display all templates in the folder. If your custom template has errors, it will show where; if no errors appear, your template is valid and ready to use. (No need to restart `osintgraph agent` if it\u2019s already running \u2014 simply ask to \"refresh and list all templates\" again.)\n\n## \ud83d\udeab **How to Avoid Account Suspension**\n\n1. <mark>**Use Your Browser Session**</mark>  \n   When running `osintgraph setup instagram`, choose login via Firefox session to make the login look natural. \ud83c\udf10\n\n2. <mark>**Use Your Real User-Agent**</mark>  \n   When running `osingraph setup user-agent`, provide the exact user-agent from the browser you use to log in to your Instagram account. \ud83d\udda5\ufe0f\n\n3. <mark>**Enable 2FA**</mark>  \n   Turn on 2FA for your Instagram account. It\u2019s simple: just use an authenticator app, and it helps Instagram recognize that your account is legitimate. \ud83d\udd12\n\n4. <mark>**Build Account Reputation**</mark>  \n   Use your Instagram account normally (like posts, comment, watch stories) for a few days or weeks before scraping. \ud83d\udcc8\n\n5. <mark>**Warm Up Your Session**</mark>  \n   Spend time using Instagram before scraping, like a normal user, to avoid looking suspicious. \u23f3\n\n6. **Avoid VPNs**  \n   Don\u2019t use VPNs. Instagram may flag accounts with mismatched or suspicious locations. \ud83d\udeab\ud83c\udf0d\n\n7. **Don\u2019t Use the Account for Other Activities While Scraping**  \n   When using this tool to collect data, avoid using the same Instagram account for any other activities. \ud83d\uded1\n\n8. **Limit Scraping Time**  \n   Don\u2019t scrape for more than 6 hours straight. \u23f0\n### Credit:  \n- Thanks to [@ahmdrz](https://github.com/ahmdrz) for these valuable insights on avoiding account suspension. \ud83d\ude4f\n- Also see [this useful comment](https://github.com/instaloader/instaloader/issues/2391#issuecomment-2400987481) on Instaloader's GitHub for more tips.\n\n---\n\n## \ud83d\udce6 Dependencies:\n- **[Instaloader](https://github.com/instaloader/instaloader)** \u2013 Used to collect Instagram profile data, followers, and followees.\n- **[Neo4j](https://neo4j.com/)** \u2013 Graph database used to store and visualize the Instagram social network.\n- **[LangGraph](https://github.com/langgraph/langgraph)** \u2013 Handles structured multi-step LLM reasoning and ReAct-style agent execution.\n- **[Gemini / Google Generative AI](https://developers.google.com/)** \u2013 Provides the LLM model used for AI-powered analysis and powers the AI agent.\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Osintgraph is a tool that maps Instagram targets, revealing social connections, posts, and interactions for OSINT investigations.",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/XD-MHLOO/Osintgraph",
        "Repository": "https://github.com/XD-MHLOO/Osintgraph"
    },
    "split_keywords": [
        "instagram",
        " langchain",
        " langgraph",
        " neo4j",
        " osint",
        " social network analysis"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ab8b184ac9e964c2daef893537b127ac14fde75cb99ac5b6ff702ac8c5f100ca",
                "md5": "407cd4ddc9089d4ab3b5ea90b657ab60",
                "sha256": "889107cc43a3ad7d66dc7d5c5f1f4dc281479c06b1dc9f1f025daaf25cde1b00"
            },
            "downloads": -1,
            "filename": "osintgraph-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "407cd4ddc9089d4ab3b5ea90b657ab60",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 90202,
            "upload_time": "2025-09-06T17:13:19",
            "upload_time_iso_8601": "2025-09-06T17:13:19.280051Z",
            "url": "https://files.pythonhosted.org/packages/ab/8b/184ac9e964c2daef893537b127ac14fde75cb99ac5b6ff702ac8c5f100ca/osintgraph-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "499959a03d9217131a1a7e5f3b49f797a6d3dfabece80f628e9b4ec0eb93c314",
                "md5": "fd0ca8bbfdddc189639c8ca3d919451b",
                "sha256": "31f1ff489648037d26d616427b79b2a5c4dfe9a6aefb90f23f958dcc41893473"
            },
            "downloads": -1,
            "filename": "osintgraph-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "fd0ca8bbfdddc189639c8ca3d919451b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 94636,
            "upload_time": "2025-09-06T17:13:22",
            "upload_time_iso_8601": "2025-09-06T17:13:22.073673Z",
            "url": "https://files.pythonhosted.org/packages/49/99/59a03d9217131a1a7e5f3b49f797a6d3dfabece80f628e9b4ec0eb93c314/osintgraph-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-06 17:13:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "XD-MHLOO",
    "github_project": "Osintgraph",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "osintgraph"
}
        
Elapsed time: 3.24306s