atomic-search


- Name: atomic-search
- Version: 0.1.0
- Home page: https://github.com/aflinxh/atomic_search
- Summary: A Python package for extracting and detecting malicious JavaScript syntax through atomic and molecule search.
- Upload time: 2024-10-29 01:49:20
- Author: Alfin Gusti Alamsyah
- Requires Python: >=3.7
- License: MIT
- Keywords: malicious code detection, JavaScript analysis, obfuscation, feature extraction
- Requirements: No requirements were recorded.

# Atomic Search

**Atomic Search** is a Python package for detecting malicious JavaScript syntax through an atomic and molecule search approach. It is designed for JavaScript that has been obfuscated with techniques such as concatenation and syntax splitting, so it can find target syntax even when the code is heavily obfuscated.

## Features

- **Atomic Extraction**: Extracts relevant syntax fragments (atoms) from obfuscated JavaScript.
- **Molecule Search**: Combines these atoms to form specific target syntax using a brute-force approach, enabling the detection of malicious JavaScript syntax.
- **Logging and Debugging**: Logs the extraction and molecule formation process for debugging purposes.
- **Automated Task Management**: Simplifies development tasks with `invoke` commands.
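
To make the atom and molecule idea concrete, here is a toy sketch. It is not the package's implementation: the helper names `extract_atoms_toy` and `can_form_molecule`, the regex-based extraction of quoted string literals, and the in-order joining rule are all assumptions made for illustration.

```python
import re
from itertools import combinations


def extract_atoms_toy(js_code, min_atom_size=2):
    """Toy atom extraction (assumption): keep quoted string literals
    that are at least min_atom_size characters long."""
    literals = re.findall(r"""(["'])(.*?)\1""", js_code)
    return [text for _quote, text in literals if len(text) >= min_atom_size]


def can_form_molecule(atoms, target):
    """Brute force: try every in-order selection of atoms and check
    whether joining the selection yields the target exactly."""
    for size in range(1, len(atoms) + 1):
        for combo in combinations(atoms, size):
            if "".join(combo) == target:
                return True
    return False


# Hypothetical obfuscated input: the method name is split into three pieces.
obfuscated = 'var el = document["get" + "Element" + "ById"]("login");'

atoms = extract_atoms_toy(obfuscated)
print(atoms)
print(can_form_molecule(atoms, "getElementById"))
```

The brute-force loop grows exponentially with the number of atoms, which is acceptable for a handful of fragments but is one reason a minimum atom size is useful for pruning short, noisy fragments.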

## Installation

Ensure you are using Python 3.7 or newer.

1. Clone the repository:
   ```bash
   git clone https://github.com/aflinxh/atomic_search.git
   cd atomic_search
   ```

2. Install the package using `pip`:
   ```bash
   pip install .
   ```

3. For development, install additional dependencies:
   ```bash
   pip install ".[dev]"
   ```

## Usage

Here’s an example of using **Atomic Search** to detect JavaScript syntax:

```python
from atomic_search import atomic_search

# List of target words to detect
target_words = ["getElementById", "addEventListener"]

# Example search space, which is obfuscated JavaScript code
search_space = "some obfuscated JavaScript code"

# Define minimum atom size and molecule similarity
min_atom_size = 2  # minimum atom size
molecule_similarity = {"getElementById": "90%", "addEventListener": "-2"}  # tolerance or similarity level

# Run the atomic search
results = atomic_search(target_words, search_space, min_atom_size, molecule_similarity, logs=True)

# Display the results
print("Search Results:", results)
```
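
The `search_space` value in the example is only a placeholder. As a rough illustration of the kind of input the package targets, the snippet below builds a hypothetical obfuscated string (the variable names and the JavaScript itself are made up) and shows that a plain substring check does not find the target.

```python
# Hypothetical obfuscated JavaScript: the target name is split into pieces
# and reassembled at runtime through string concatenation.
search_space = (
    'var a = "getEle"; var b = "mentBy"; var c = "Id";\n'
    'var el = document[a + b + c]("login");\n'
)

target = "getElementById"

# A naive substring search misses the target because it never appears whole.
found_by_substring = target in search_space
print(found_by_substring)  # False

# Every fragment of the target is still present in the source, though.
fragments_present = all(part in search_space for part in ("getEle", "mentBy", "Id"))
print(fragments_present)  # True
```

A string like this can be passed as `search_space` in place of the placeholder.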

### `atomic_search` Function Parameters

- **`target_words`**: List of strings representing the target syntax to detect.
- **`search_space`**: The JavaScript string to analyze.
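- **`min_atom_size`**: Minimum size an extracted atom must have to be considered valid.
- **`molecule_similarity`**: Dictionary mapping each target word to its similarity level or tolerance, given as a string (for example `"90%"` or `"-2"`, as in the example above).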
- **`logs`**: Set to `True` to display logs.
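
The README does not spell out how the `molecule_similarity` strings are interpreted. The sketch below shows one plausible reading, purely as an assumption: a value such as `"90%"` is treated as a percentage of the target length, and a value such as `"-2"` as the number of characters allowed to be missing. The function `required_length` is hypothetical and is not part of the package.

```python
import math


def required_length(spec, target):
    """Return how many characters of target must be matched.

    Assumed reading (not confirmed by the package documentation):
      "90%" -> at least 90% of the target's characters, rounded up
      "-2"  -> the full target length minus 2 characters
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = float(spec[:-1])
        if not 0 <= percent <= 100:
            raise ValueError(f"percentage out of range: {spec!r}")
        return math.ceil(len(target) * percent / 100)
    tolerance = int(spec)
    if tolerance > 0:
        raise ValueError(f"tolerance must be zero or negative: {spec!r}")
    return max(len(target) + tolerance, 0)


molecule_similarity = {"getElementById": "90%", "addEventListener": "-2"}
thresholds = {
    target: required_length(spec, target)
    for target, spec in molecule_similarity.items()
}
print(thresholds)
```

Checking the package source is the only way to confirm which interpretation actually applies.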

## Directory Structure

The project has the following structure:

```
atomic_search/
├── atomic_search.py        # Main function for atom and molecule search
├── extract_atoms.py        # Module for atom extraction
├── form_molecule.py        # Module to form molecules from atoms
└── __init__.py             # Package initializer
tasks.py                    # Task automation with Invoke
utils/                      # Utility scripts for managing logs and datasets
tests/                      # Test directory
README.md                   # This documentation
pyproject.toml              # Project metadata
setup.py                    # Installation configuration
```

## Utility Commands

This project uses `invoke` to manage development tasks, which are defined in `tasks.py`. Here are some commonly used commands:

- **Clear Logs**: Removes all log files from the logs directory.
  ```bash
  invoke clear-logs
  ```

- **Clear Datasets**: Removes all datasets from the dataset directory.
  ```bash
  invoke clear-datasets
  ```

- **Generate Datasets**: Generates datasets; the optional `--num-samples` argument sets how many samples to generate.
  ```bash
  invoke generate-datasets --num-samples=100
  ```
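
The contents of `tasks.py` are not shown here, so the following is only a sketch of what a task such as `clear-logs` might do under the hood. The function name `clear_logs`, the default `logs` directory, and the `*.log` pattern are assumptions; the example runs against a temporary directory so it does not touch real files.

```python
import tempfile
from pathlib import Path


def clear_logs(log_dir="logs", pattern="*.log"):
    """Delete files matching pattern in log_dir and return how many were removed."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return 0  # nothing to clean up
    removed = 0
    for path in directory.glob(pattern):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run1.log", "run2.log", "keep.txt"):
        (Path(tmp) / name).write_text("example\n")
    removed = clear_logs(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(removed, remaining)
```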

## Testing

This project uses `pytest` for running tests and `invoke` to manage and simplify test execution. Here are the available test commands using `invoke`:

- **Run Atom Tests**: Runs tests for `extract_atoms.py` located in `tests/test_extract_atoms.py`. You can optionally specify a particular file to test and enable logs.

  ```bash
  invoke test-atoms --file-name="sample.js" --show-logs
  ```
  - `--file-name`: Specifies the JavaScript file to use for testing.
  - `--show-logs`: Enables detailed logging during the test.

- **Run Molecule Tests**: Runs tests for `form_molecule.py` located in `tests/test_form_molecule.py`. You can optionally specify a file name and enable logs.

  ```bash
  invoke test-molecule --file-name="sample.js" --show-logs
  ```
  - `--file-name`: Specifies the JavaScript file to use for testing.
  - `--show-logs`: Enables detailed logging during the test.

- **Run Atomic Search Tests**: Runs tests for the `atomic_search` function located in `tests/test_atomic_search.py`. You can specify a file name and enable logs, similar to the other test commands.

  ```bash
  invoke test-atomic --file-name="sample.js" --show-logs
  ```
  - `--file-name`: Specifies the JavaScript file to use for testing.
  - `--show-logs`: Enables detailed logging during the test.
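
When these test commands are driven from a script, it can help to build the command lines programmatically. The helper below is a hypothetical convenience, not part of the project; it only assembles the `invoke` task names and flags listed above.

```python
import shlex


def build_invoke_command(task, file_name=None, show_logs=False):
    """Assemble an invoke command as an argument list."""
    command = ["invoke", task]
    if file_name is not None:
        command.append(f"--file-name={file_name}")
    if show_logs:
        command.append("--show-logs")
    return command


test_tasks = ["test-atoms", "test-molecule", "test-atomic"]
commands = [
    build_invoke_command(task, file_name="sample.js", show_logs=True)
    for task in test_tasks
]

for command in commands:
    # shlex.quote keeps the printed line safe to paste into a shell.
    print(" ".join(shlex.quote(part) for part in command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` from the repository root, provided `invoke` is installed.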

### Running All Tests

To run all tests in the `tests/` directory, you can use `pytest` directly:

```bash
pytest tests/
```

Use `pytest tests/` to run the whole suite; the `invoke` commands above run targeted tests with specific options, which gives more control during development and debugging.


## Contribution

Contributions are welcome! Follow these steps to contribute:

1. Fork this repository.
2. Create a branch for your feature or fix (`git checkout -b new-feature`).
3. Commit your changes (`git commit -m 'Add new feature'`).
4. Push to the branch (`git push origin new-feature`).
5. Create a Pull Request.

## License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

            
