pyscreeningfs


| Field | Value |
| --- | --- |
| Name | pyscreeningfs |
| Version | 0.1.1 |
| Home page | https://github.com/tbonewmy/Online-Feature-Screening-for-Datastream-with-Sparsity-Concept-Drifting |
| Summary | This is a Python implementation by the authors of the paper 'Online Feature Screening for Data Streams With Concept Drift' by Dr. Mingyuan Wang and Dr. Adrian Barbu. Contains various feature selection methods. |
| Upload time | 2025-08-04 00:15:07 |
| Maintainer | None |
| Docs URL | None |
| Author | Mingyuan Wang |
| Requires Python | >=3.10 |
| License | Apache-2.0 |
| Keywords | feature selection, feature screening, variable screening, online learning, online feature selection, concept drift, data drift, machine learning, artificial intelligence, statistics |
| Requirements | numpy |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# Online-Feature-Screening-for-Datastream-with-Sparsity-Concept-Drifting

This is the authors' Python implementation of the paper **"Online Feature Screening for Data Streams With Concept Drift"** by Dr. Mingyuan Wang and Dr. Adrian Barbu.

Please cite this paper if you use or build on our method. [doi.org/10.1109/TKDE.2022.3232752](https://doi.org/10.1109/TKDE.2022.3232752)

This project enables well-known feature screening methods (Gini index, chi-square score, mutual information, Fisher score, and T-score) to handle streaming data, batch data, data with concept drift, and sparse data. It currently supports only binary classification data.
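To give a feel for what screening a stream can look like, here is a minimal sketch of a T-score maintained from running sums, so that each sample is seen once and then discarded. It is a generic illustration and not the package's API or the paper's algorithm: the class name `RunningTScore` and its methods are hypothetical, it does no drift handling, and it is written in plain Python (rather than `numpy`) only so that it runs without dependencies.

```python
import math


class RunningTScore:
    """Hypothetical helper: per-class running sums for a two-sample T-score."""

    def __init__(self, n_features):
        # Index 1 holds the class labelled 1, index 0 holds the other class.
        self.count = [0, 0]
        self.total = [[0.0] * n_features for _ in range(2)]
        self.total_sq = [[0.0] * n_features for _ in range(2)]

    def update(self, x, y):
        """Absorb one sample x (sequence of floats) with numeric label y."""
        k = 1 if y == 1 else 0
        self.count[k] += 1
        for j, v in enumerate(x):
            self.total[k][j] += v
            self.total_sq[k][j] += v * v

    def scores(self):
        """Return |mean1 - mean0| / sqrt(var1/n1 + var0/n0) for each feature."""
        n0, n1 = self.count
        if n0 < 2 or n1 < 2:
            raise ValueError("need at least two samples per class")
        out = []
        for j in range(len(self.total[0])):
            m0 = self.total[0][j] / n0
            m1 = self.total[1][j] / n1
            # Unbiased variance from the sums; clamp tiny negatives from rounding.
            v0 = max((self.total_sq[0][j] - n0 * m0 * m0) / (n0 - 1), 0.0)
            v1 = max((self.total_sq[1][j] - n1 * m1 * m1) / (n1 - 1), 0.0)
            denom = math.sqrt(v0 / n0 + v1 / n1)
            if denom > 0:
                out.append(abs(m1 - m0) / denom)
            else:
                # Zero variance: perfectly separated means score infinity.
                out.append(math.inf if m1 != m0 else 0.0)
        return out


# Toy stream: feature 0 separates the classes, feature 1 does not.
stream = [([1.0, 5.0], 1), ([1.2, 4.0], 1), ([3.0, 5.0], 0), ([3.4, 4.0], 0)]
scorer = RunningTScore(n_features=2)
for x, y in stream:
    scorer.update(x, y)

scores = scorer.scores()
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranked)
```

Only counts, sums, and sums of squares are stored, so memory does not grow with the length of the stream. The sum-of-squares form is numerically naive; a production implementation would likely use a more stable update.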

## Installation

### Prerequisites

* `Python` 3.10 or newer
* `pip`
* `numpy` 2.2.4 or newer
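The version requirements above can be checked from Python before installing. The following is a small sketch using only the standard library; the helper names `parse_version` and `check_prerequisites` are made up for this example, and the version parsing is deliberately simple (it ignores suffixes such as `rc1`).

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '2.2.4' into (2, 2, 4); trailing non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_prerequisites():
    """Return a list of problems; an empty list means the prerequisites are met."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    try:
        numpy_version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        problems.append("numpy is not installed")
    else:
        if parse_version(numpy_version) < (2, 2, 4):
            problems.append(f"numpy {numpy_version} is older than 2.2.4")
    return problems


print(check_prerequisites() or "prerequisites satisfied")
```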

### Note
Although the package is designed to be OS-independent, it has only been tested on Windows. On other systems, you may need to use one of the installation methods listed below instead of `pip install pyscreeningfs`.

**For users installing from source (e.g., if no pre-built wheels are available for your system):**
You will need a C++ compiler compatible with your Python installation:
* **Windows:** Microsoft Visual C++ Build Tools (part of Visual Studio, or standalone).
* **Linux:** `gcc` and `g++` (usually included or easily installed via your package manager, e.g., `sudo apt-get install build-essential`).
* **macOS:** Xcode Command Line Tools (install with `xcode-select --install`).
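As a quick sanity check before building from source, the sketch below looks for common compiler executables on `PATH`. The executable names are typical defaults and an assumption; your toolchain may use different ones. On Windows in particular, `cl` is usually visible only inside a Visual Studio developer prompt, so "not found" here does not necessarily mean the build will fail.

```python
import shutil

# Assumed executable names; adjust to match your toolchain.
CANDIDATES = {"gcc": "gcc", "g++": "g++", "clang++": "clang++", "MSVC cl": "cl"}


def find_compilers():
    """Map each compiler label to its full path, or None if it is not on PATH."""
    return {label: shutil.which(exe) for label, exe in CANDIDATES.items()}


for label, path in find_compilers().items():
    print(f"{label}: {path or 'not found on PATH'}")
```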

### Install via git clone
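1. Clone the repository
``` bash
git clone https://github.com/tbonewmy/Online-Feature-Screening-for-Datastream-with-Sparsity-Concept-Drifting.git
```
2. Navigate into the cloned repository directory
``` bash
cd Online-Feature-Screening-for-Datastream-with-Sparsity-Concept-Drifting
```
3. Install
``` bash
pip install .
```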

### Install via download
1. Download the repository
2. Unpack it into a folder of your choice, e.g. `your_folder/repo_name`
3. Navigate into the unpacked repository directory
``` bash
cd repo_name  
```
4. Install
``` bash
pip install .
```
### Install via pip (Currently unavailable)

If pre-built wheels are available for your system on PyPI (coming soon!), you can install directly:
```
pip install pyscreeningfs
```
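Whichever installation route you use, you can confirm that the distribution is registered in the current environment. This sketch relies only on the standard library and the distribution name `pyscreeningfs`; the helper name `installed_version` is made up for this example, and the name of the importable module is not assumed.

```python
from importlib import metadata


def installed_version(distribution="pyscreeningfs"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"pyscreeningfs {version}" if version else "pyscreeningfs is not installed")
```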

## Data
For sparse data in `.svm` format, visit [https://www.sysnet.ucsd.edu/projects/url/](https://www.sysnet.ucsd.edu/projects/url/), then download the data and place it in `data/url_svmlight/`.
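If you want to inspect such files yourself, the sketch below reads the usual SVMlight text layout, where each line is `label index:value index:value ...`. This is an assumption about the file format, not the package's own loader; the function names are hypothetical, and the optional `qid:` field of the format is not handled.

```python
def parse_svmlight_line(line):
    """Parse 'label index:value ...' into (label, {index: value}), or None if blank."""
    # Drop an optional trailing '# comment', then split on whitespace.
    body = line.split("#", 1)[0].split()
    if not body:
        return None
    label = float(body[0])
    features = {}
    for token in body[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def iter_svmlight(path):
    """Yield (label, features) one sample at a time, without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parsed = parse_svmlight_line(line)
            if parsed is not None:
                yield parsed


# Hypothetical usage; the file name below is an assumption:
# for label, features in iter_svmlight("data/url_svmlight/Day0.svm"):
#     ...
print(parse_svmlight_line("+1 4:0.5 17:1 # example"))
```

Storing each sample as a dictionary of non-zero entries keeps memory proportional to the number of non-zeros, which matters for very wide sparse data.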

For any input data or data file, the label (Y/class) vector must contain only numeric values, and one of the labels must be 1.
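The label rule can be checked before passing data on. The following is a minimal sketch, assuming the labels arrive as any iterable of Python numbers; `validate_labels` is a hypothetical helper and not part of the package. The two-class limit reflects the binary classification restriction stated earlier.

```python
from numbers import Real


def validate_labels(labels):
    """Raise ValueError unless labels are numeric, binary, and include the value 1."""
    values = list(labels)
    for v in values:
        if not isinstance(v, Real):
            raise ValueError(f"non-numeric label: {v!r}")
    distinct = set(values)
    if 1 not in distinct:
        raise ValueError("one of the labels must be 1")
    if len(distinct) > 2:
        raise ValueError(f"expected at most two classes, got {sorted(distinct)}")
    return distinct


print(validate_labels([1, -1, 1, -1]))
```

Labels such as `{1, -1}` and `{1, 0}` both pass, while `{0, 2}` or string labels are rejected.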

## Demo
For a demo, see `testing.py` in the root directory of the repository.

            
