| Field | Value |
| --- | --- |
| Name | logdelta |
| Version | 1.0.0.post1 |
| home_page | None |
| Summary | LogDelta - Go Beyond Grepping with NLP-based Log File Analysis |
| upload_time | 2024-12-13 16:59:03 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <3.13,>=3.9 |
| license | None |
| keywords | logs, anomaly detection, log parsing |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# LogDelta
LogDelta - Go Beyond Grepping with NLP-based Log Analysis!
See the [YouTube playlist](https://www.youtube.com/playlist?list=PLTUjKYPvVhe6JhHBlkJN_yPhVDR5w2ej2) demonstrating the tool in action.
## Installation and Example
We recommend using a virtual environment to ensure smooth operations.
```bash
conda create -n logdelta python=3.11
conda activate logdelta
```
Install logdelta.
```bash
pip install logdelta
```
Download the source code and navigate to the demo folder.
```bash
git clone https://github.com/EvoTestOps/LogDelta.git
cd LogDelta/demo
```
Get data
```bash
wget -O Hadoop.zip "https://zenodo.org/records/8196385/files/Hadoop.zip?download=1"
unzip Hadoop.zip -d Hadoop
```
Run analysis
```bash
python -m logdelta.config_runner -c config.yml
```
Observe the results in `LogDelta/demo/Output`. For more examples, see `LogDelta/demo/label_investigation` and `LogDelta/demo/full`.
LogDelta assumes your folders represent a collection of software logs of interest. LogDelta performs a comparison between two or more folders using matching file names. A **target run** represents a software run we are interested in analyzing. LogDelta uses **comparison runs** as a baseline. For example, the "My_passing_logs1", "My_passing_logs2", "My_passing_logs3" folders can be comparison runs, while "My_failing_logs" would be your target run that you want to analyze with respect to comparison runs.
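
The pairing of files by name can be pictured with a small sketch. This is not LogDelta's own code; the helper name `matching_files` and the one-folder-per-run layout are assumptions made for illustration only.

```python
from pathlib import Path


def matching_files(target_run: Path, comparison_runs: list[Path]) -> dict[str, list[Path]]:
    """Map each file name in the target run to same-named files in the comparison runs.

    Hypothetical helper: it only illustrates name-based matching between run folders.
    """
    matches: dict[str, list[Path]] = {}
    for target_file in sorted(p for p in target_run.iterdir() if p.is_file()):
        # A comparison run contributes a baseline file only if the name exists there too.
        candidates = [run / target_file.name for run in comparison_runs]
        matches[target_file.name] = [c for c in candidates if c.is_file()]
    return matches
```

For example, calling `matching_files(Path("My_failing_logs"), [Path("My_passing_logs1"), Path("My_passing_logs2")])` would list, for every log file in the target run, the baseline files that share its name.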
## Types of Analysis
In LogDelta, three types of analysis are available:
1. **Visualize**
- Multiple log files or runs with UMAP, based on a two-dimensional scaling of the log contents.
- Individual log files with log anomaly scoring (see item 3 for the supported anomaly detection models).
2. **Measure the distance between two logs or sets of logs** using:
- Jaccard distance
- Cosine distance
- Containment distance
- Compression distance
3. **Build an anomaly detection model** from a set of logs and use it to score anomalies in a log file (higher scores are more anomalous) using:
- KMeans (kmeans)
- IsolationForest (IF)
- RarityModel (RM)
- Out-of-Vocabulary Detector (OOVD)
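
To make the four distance measures from item 2 concrete, here is a minimal sketch using their common textbook definitions over whitespace tokens. It is an illustration, not LogDelta's implementation: the function names, the tokenization, the direction of the containment measure, and the use of zlib as the compressor are all assumptions.

```python
import math
import zlib
from collections import Counter


def jaccard_distance(a: set, b: set) -> float:
    """1 - |A and B| / |A or B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def containment_distance(a: set, b: set) -> float:
    """Share of A's tokens that are missing from B (asymmetric by design)."""
    return 1.0 - len(a & b) / len(a) if a else 0.0


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity of two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def compression_distance(x: bytes, y: bytes) -> float:
    """Normalized compression distance, with zlib standing in as the compressor."""
    cx, cy, cxy = (len(zlib.compress(data)) for data in (x, y, x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy logs; real runs would be read from files in the compared folders.
passing = "INFO task started\nINFO task finished\n"
failing = "INFO task started\nERROR task failed\n"
a_tokens, b_tokens = passing.split(), failing.split()

print("jaccard    ", jaccard_distance(set(a_tokens), set(b_tokens)))
print("containment", containment_distance(set(a_tokens), set(b_tokens)))
print("cosine     ", cosine_distance(Counter(a_tokens), Counter(b_tokens)))
print("compression", compression_distance(passing.encode(), failing.encode()))
```

All four return 0 (or close to it) for near-identical inputs and grow as the logs diverge, which is what makes them usable for ranking files or runs by how much they differ from a baseline.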
## Levels of Analysis
Analysis can be done at four different levels:
1. **Run (folder) level**, investigating the names of files without looking at their contents.
2. **Run (folder) level**, investigating run contents (this is slower than what is done in 1).
3. **File level**, investigating file contents (matched with the same names between runs).
4. **Line level**, investigating line contents (matched with the same names between runs).
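
As an illustration of line-level scoring, the sketch below builds a vocabulary from baseline lines and scores each target line by the fraction of its tokens that were never seen. It only conveys the idea behind an out-of-vocabulary detector; the function names and the scoring formula are assumptions and do not describe how LogDelta's OOVD is implemented.

```python
def build_vocabulary(baseline_lines: list[str]) -> set[str]:
    """Collect every whitespace-separated token seen in the baseline lines."""
    return {token for line in baseline_lines for token in line.split()}


def oov_scores(target_lines: list[str], vocabulary: set[str]) -> list[float]:
    """Score each line by the fraction of its tokens absent from the vocabulary."""
    scores = []
    for line in target_lines:
        tokens = line.split()
        unseen = sum(1 for token in tokens if token not in vocabulary)
        # Empty lines carry no tokens, so they are scored as not anomalous.
        scores.append(unseen / len(tokens) if tokens else 0.0)
    return scores


# Toy data; in practice the baseline would come from same-named files in comparison runs.
baseline = ["INFO task started", "INFO task finished"]
target = ["INFO task started", "ERROR task failed"]
vocab = build_vocabulary(baseline)
for line, score in zip(target, oov_scores(target, vocab)):
    print(f"{score:.2f}  {line}")
```

Lines made up entirely of known tokens score 0, while lines containing unseen tokens score higher, matching the convention that higher scores are more anomalous.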
LogDelta is built on top of [LogLead](https://pypi.org/project/LogLead/)[^1].
Log line level anomaly detection visualized. Which one is the anomaly?

[^1]: Mäntylä MV, Wang Y, Nyyssölä J. LogLead - fast and integrated log loader, enhancer, and anomaly detector. In: 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2024 Mar 12 (pp. 395-399). IEEE.
Raw data
{
"_id": null,
"home_page": null,
"name": "logdelta",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.9",
"maintainer_email": null,
"keywords": "logs, anomaly detection, log parsing",
"author": null,
"author_email": "Mika M\u00e4ntyl\u00e4 <mika.mantyla@helsinki.fi>",
"download_url": "https://files.pythonhosted.org/packages/e3/23/b818313fdb43baa86fe165b845b69c281d6f82ed34de67a57a9ffb96b0da/logdelta-1.0.0.post1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "LogDelta - Go Beyond Grepping with NLP-based Log File Analysis",
"version": "1.0.0.post1",
"project_urls": null,
"split_keywords": [
"logs",
" anomaly detection",
" log parsing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "3495d381a77993dbeea76d40e9671a63ba63254e81af6342e8785686d3e670de",
"md5": "a071084f42b4239b890015259d9bf153",
"sha256": "cf31bd46bb420e8e3720ee72e4e94efac7227842c26824a72d2b47fc9230f1d4"
},
"downloads": -1,
"filename": "logdelta-1.0.0.post1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a071084f42b4239b890015259d9bf153",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.9",
"size": 23048,
"upload_time": "2024-12-13T16:59:01",
"upload_time_iso_8601": "2024-12-13T16:59:01.409024Z",
"url": "https://files.pythonhosted.org/packages/34/95/d381a77993dbeea76d40e9671a63ba63254e81af6342e8785686d3e670de/logdelta-1.0.0.post1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e323b818313fdb43baa86fe165b845b69c281d6f82ed34de67a57a9ffb96b0da",
"md5": "63166326e2b13da51af81fe2f2f8fb4e",
"sha256": "91bac023461fde48acd1652cdd68f6e06dfb53df7649fdd97bbb139c19c2667e"
},
"downloads": -1,
"filename": "logdelta-1.0.0.post1.tar.gz",
"has_sig": false,
"md5_digest": "63166326e2b13da51af81fe2f2f8fb4e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.9",
"size": 23300,
"upload_time": "2024-12-13T16:59:03",
"upload_time_iso_8601": "2024-12-13T16:59:03.957868Z",
"url": "https://files.pythonhosted.org/packages/e3/23/b818313fdb43baa86fe165b845b69c281d6f82ed34de67a57a9ffb96b0da/logdelta-1.0.0.post1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-13 16:59:03",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "logdelta"
}