tcbench


* Name: tcbench
* Version: 0.0.22
* Summary: A ML/DL framework for Traffic Classification
* Upload time: 2023-10-21 19:22:23
* Requires Python: >=3.9
* License: MIT License Copyright (c) 2023 tcbenchstack Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
* Keywords: machine learning, deep learning, traffic classification, time series

<div align="center">
  <img src="https://tcbenchstack.github.io/tcbench/tcbench_logo.svg" width="400px"/>
  <h3>An ML/DL framework for Traffic Classification (TC)</h3>
  <a href="https://tcbenchstack.github.io/tcbench">
  <img width="24" height="24" src="https://img.icons8.com/fluency/48/domain.png" alt="domain"/>
  <b>Documentation</b>
  </a>
</div>

<br>

The design of tcbench centers on the following objectives:

* Making the results of ML/DL model training and testing easy to replicate.
* Tight integration with public TC datasets, with easy data installation and curation.
* Model tracking via [AIM](https://github.com/aimhubio/aim).
* A rich command line for executing modeling campaigns and collecting performance reports.


## ...wait, what is Traffic Classification?
    
A computer network is formed by hosts that exchange
information, namely *packets*, according
to standardized protocols (e.g., [HTTP](https://en.wikipedia.org/wiki/HTTP) is the protocol used for the web).
To properly operate and manage a network, one needs to monitor
this flow of information and react accordingly. For instance,
in an office/enterprise environment, one might want to prioritize video meeting traffic
while limiting social media traffic.

[__Traffic classification__](https://en.wikipedia.org/wiki/Traffic_classification)
is the act of labeling an exchange of packets
between network hosts based on the application that generated it.
For instance, you may want to identify traffic related to Zoom/Webex/Skype/etc. calls
or traffic related to Twitter/Instagram/Facebook/Mastodon
out of all the traffic flowing through the network.
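
To make the input and output of the task concrete, here is a minimal sketch in plain Python. It groups packets into bidirectional flows and attaches a label to each flow. The packet records, the `PORT_LABELS` table, and the helper names are all hypothetical, and the port lookup is a deliberately naive stand-in for the ML/DL models that tcbench trains; it is not how tcbench classifies traffic.

```python
from collections import defaultdict

# Hypothetical packet records:
# (src_ip, src_port, dst_ip, dst_port, protocol, size_bytes)
packets = [
    ("10.0.0.5", 50432, "203.0.113.7", 443, "TCP", 1200),
    ("203.0.113.7", 443, "10.0.0.5", 50432, "TCP", 1460),
    ("10.0.0.5", 50433, "198.51.100.9", 53, "UDP", 72),
]


def flow_key(pkt):
    """Return a direction-independent key so both directions share one flow."""
    src_ip, src_port, dst_ip, dst_port, proto, _size = pkt
    endpoint_a, endpoint_b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (endpoint_a, endpoint_b, proto)


def group_into_flows(pkts):
    """Group packets by flow key, preserving packet order within each flow."""
    flows = defaultdict(list)
    for pkt in pkts:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


# Illustrative lookup table (an assumption for this example only).
PORT_LABELS = {53: "dns", 443: "https"}


def label_flow(key):
    """Naive rule-based labeler: a trained classifier would replace this."""
    (_ip_a, port_a), (_ip_b, port_b), _proto = key
    for port in (port_a, port_b):
        if port in PORT_LABELS:
            return PORT_LABELS[port]
    return "unknown"


flows = group_into_flows(packets)
labels = {key: label_flow(key) for key in flows}
for key, label in labels.items():
    print(key, "->", label, f"({len(flows[key])} packets)")
```

A port number only hints at the protocol, not at the application behind it, which is one reason the task is usually approached with learned models instead of fixed rules.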


## Motivations

The academic literature is rife with methods and proposals for TC.
Yet code artifacts are scarce, and public datasets
do not share common conventions of use.

We designed tcbench with the following goals in mind:

| Goal | State of the art | tcbench |
|:-----|:-----------------|:--------|
| __Data curation__ | There are a few public datasets for TC, yet no common format/schema, cleaning process, or standard train/val/test folds. | An (opinionated) curation of datasets to create easy-to-use parquet files with associated train/val/test folds. |
| __Code__ | TC literature has no reference code base for ML/DL modeling. | tcbench is [open source](https://github.com/tcbenchstack/tcbench) with an easy-to-use CLI based on [click](https://click.palletsprojects.com/en/8.1.x/). |
| __Model tracking__ | Most ML frameworks require integration with cloud environments and subscription services. | tcbench uses [aimstack](https://aimstack.io/) to save training metrics on local servers; the metrics can later be explored via its web UI or aggregated into report summaries using tcbench. |

## Install

Create a conda environment and install tcbench from PyPI:

```
conda create -n tcbench python=3.10 pip
conda activate tcbench
python -m pip install tcbench
```

For the developer version (the quotes keep shells such as zsh from interpreting the square brackets):

```
python -m pip install "tcbench[dev]"
```
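
To confirm that the package landed in the active environment, a small check based on the standard library's `importlib.metadata` can help. This is a sketch, not part of tcbench; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tcbench")
if version is None:
    print("tcbench is not installed in this environment")
else:
    print(f"tcbench {version} is installed")
```

Running it inside the activated `tcbench` conda environment should report the version that pip installed.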

## Features and roadmap

tcbench is still under development, but (as suggested by its name) ultimately aims
to be a reference framework for benchmarking multiple ML/DL solutions 
related to TC.

At the current stage, tcbench offers:

* Integration with 4 datasets, namely `ucdavis-icdm19`, `mirage19`, `mirage22` and `utmobilenet21`.
You can use these datasets and their curated versions independently of tcbench.
Check out the [dataset install](https://tcbenchstack.github.io/tcbench/datasets/install) process and the [dataset loading tutorial](https://tcbenchstack.github.io/tcbench/datasets/guides/tutorial_load_datasets/).

* Good support for flowpic input representation.

* Initial support for a 1D packet time series input representation (based on network packet properties).

* Data augmentation functionality for flowpic input representation.

* Modeling via XGBoost, vanilla DL supervision and contrastive learning (via SimCLR or SupCon).
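
For readers unfamiliar with the flowpic representation mentioned in the list above: a flowpic is commonly described as a 2D histogram of a flow's packets, with arrival time on one axis and packet size on the other. The sketch below builds one in plain Python. It is not tcbench's implementation; the function name and the default resolution, time window, and maximum packet size are illustrative assumptions.

```python
def flowpic(packets, resolution=32, duration=15.0, max_size=1500):
    """Build a resolution x resolution histogram from (timestamp, size) pairs.

    `packets` holds (seconds_since_flow_start, size_bytes) tuples.
    Rows index packet size bins, columns index time bins.
    All default values are assumptions chosen for illustration.
    """
    grid = [[0] * resolution for _ in range(resolution)]
    for timestamp, size in packets:
        if not 0 <= timestamp < duration:
            continue  # ignore packets outside the observation window
        col = min(int(timestamp / duration * resolution), resolution - 1)
        row = min(int(size / max_size * resolution), resolution - 1)
        grid[row][col] += 1
    return grid


# Hypothetical flow: the last packet falls outside the 15-second window.
example_packets = [(0.0, 100), (0.5, 1500), (7.5, 750), (14.9, 40), (20.0, 500)]
pic = flowpic(example_packets, resolution=4)
for row in pic:
    print(row)
```

A small grid keeps the output readable here; a larger resolution trades memory for finer time and size detail.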

More exciting features, including more datasets and algorithms, will come in the coming months.

Stay tuned ;)!

## Papers

* ["Replication: Contrastive Learning and Data Augmentation in Traffic Classification Using a Flowpic Input Representation"](https://arxiv.org/abs/2309.09733) __preprint__<br>
A. Finamore, C. Wang, J. Krolikowski, J. M. Navarro, F. Chen, D. Rossi<br>
ACM Internet Measurement Conference (IMC), 2023


            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "tcbench",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "machine learning,deep learning,traffic classification,time series",
    "author": "",
    "author_email": "Alessandro Finamore <alessandro.finamore@huawei.com>",
    "download_url": "https://files.pythonhosted.org/packages/f2/ab/09cc49069893c1241e7fd6cc1516e99e4f469294dab89ac2989e43a22873/tcbench-0.0.22.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 tcbenchstack  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "A ML/DL framework for Traffic Classification",
    "version": "0.0.22",
    "project_urls": {
        "Homepage": "https://tcbenchstack.github.io/tcbench/"
    },
    "split_keywords": [
        "machine learning",
        "deep learning",
        "traffic classification",
        "time series"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ad691efc51e4044b22a48e046a28a20c673dfca957059d64215ee98c26b375f6",
                "md5": "b27e29a632b44482d8acc8130106711a",
                "sha256": "72b3b1b8d421eca6c8c1c6dee826ec7564a924a7c2e633fdb884089c7986fa9a"
            },
            "downloads": -1,
            "filename": "tcbench-0.0.22-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b27e29a632b44482d8acc8130106711a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 115145,
            "upload_time": "2023-10-21T19:22:21",
            "upload_time_iso_8601": "2023-10-21T19:22:21.190924Z",
            "url": "https://files.pythonhosted.org/packages/ad/69/1efc51e4044b22a48e046a28a20c673dfca957059d64215ee98c26b375f6/tcbench-0.0.22-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f2ab09cc49069893c1241e7fd6cc1516e99e4f469294dab89ac2989e43a22873",
                "md5": "2ff0beaa776ac63828de953c269267a4",
                "sha256": "3d4cbf9e4403d3abf42534517002a3344d4a18ee564e5a95c6dcdbbee4cfb8ed"
            },
            "downloads": -1,
            "filename": "tcbench-0.0.22.tar.gz",
            "has_sig": false,
            "md5_digest": "2ff0beaa776ac63828de953c269267a4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 96550,
            "upload_time": "2023-10-21T19:22:23",
            "upload_time_iso_8601": "2023-10-21T19:22:23.032458Z",
            "url": "https://files.pythonhosted.org/packages/f2/ab/09cc49069893c1241e7fd6cc1516e99e4f469294dab89ac2989e43a22873/tcbench-0.0.22.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-21 19:22:23",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "tcbench"
}
        