# fsam
`fsam` is a Python module for feature selection in additive models. The smooth
components to be estimated are assumed to be defined through a reduced-rank basis
(B-splines) and fitted via a penalized splines approach (P-splines). Our variable
selection approach selects the best subset of features of a given size, taking into
account that each feature can enter the model as a linear term, a non-linear term, or
both. This cardinality-constrained problem is stated as a mixed-integer quadratic
programming (MIQP) model. We develop a framework that extends the computation of tight
bounds for the regression coefficients to the case of additive models. We also develop a
heuristic based on the large neighborhood search metaheuristic that exploits the exact
formulation of the problem, thus yielding a _matheuristic_. Moreover, a warm-start
solution is built by combining additive models and group lasso.

The optimization problems are solved using the [GUROBI](https://www.gurobi.com/)
optimization software.
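
To make the idea of a cardinality constraint concrete, the following is a minimal sketch of best subset selection by brute-force enumeration on a plain linear model. It is not the MIQP formulation used by `fsam`: it involves no splines, no bounds on the coefficients and no GUROBI, and the function names (`solve_linear_system`, `residual_sum_of_squares`, `best_subset`) are hypothetical helpers written only for this illustration.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_sum_of_squares(X, y, subset):
    """Least-squares fit of y on the columns in `subset`; return the RSS."""
    cols = [[row[j] for row in X] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(gram, rhs)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def best_subset(X, y, k):
    """Enumerate every subset of exactly k features and keep the lowest RSS."""
    p = len(X[0])
    return min(
        combinations(range(p), k),
        key=lambda subset: residual_sum_of_squares(X, y, subset),
    )


# Toy data: the response depends only on features 0 and 2.
X = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 2],
    [0, 1, 1, 0],
]
y = [2 * row[0] - 3 * row[2] for row in X]

print(best_subset(X, y, 2))  # (0, 2)
```

Enumeration scales combinatorially with the number of features, which is why an exact MIQP formulation, a heuristic and a good warm start become relevant for problems of realistic size.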
## Project structure
The current version of the project is structured as follows:
* **fsam**: the main directory of the project, which consists of:
    * **fsam_fit**: contains the feature selection algorithm.
    * **penalized_group_lasso**: contains our warm-start approach.
    * **sop**: contains the methodology for estimating the smoothing parameters.
* **data**: a folder containing the CSV files used in the real-data numerical
  experiments.
* **examples**: a directory containing multiple numerical experiments.
* **img**: contains some images.
* **tests**: a folder including tests for the main methods of the project.
## Package dependencies
`fsam` mainly depends on the following packages:
* [cpsplines](https://github.com/ManuelNavarroGarcia/cpsplines).
* [gurobipy](https://www.gurobi.com). **License Required**
* [matplotlib](https://matplotlib.org/).
* [numpy](https://numpy.org/).
* [pandas](https://pandas.pydata.org/).
* [scikit-learn](https://scikit-learn.org/).
* [scipy](https://www.scipy.org/).
* [statsmodels](https://www.statsmodels.org/).
* [tqdm](https://tqdm.github.io/).
* [typer](https://typer.tiangolo.com/).

GUROBI requires a license. For research or educational purposes, the company offers a
free [academic license](https://www.gurobi.com/academia/academic-program-and-licenses/)
that is valid for one year and can be renewed.
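
As a quick sanity check after installation, the sketch below reports which of the packages listed above cannot be found in the current environment. The import names are assumptions taken from the list (note that scikit-learn is imported as `sklearn`), and `missing_dependencies` is a hypothetical helper, not part of `fsam`. It only checks that the packages are installed; it does not verify that a valid GUROBI license is present.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
DEPENDENCIES = [
    "cpsplines",
    "gurobipy",
    "matplotlib",
    "numpy",
    "pandas",
    "sklearn",  # import name of scikit-learn
    "scipy",
    "statsmodels",
    "tqdm",
    "typer",
]


def missing_dependencies(names=DEPENDENCIES):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```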
## Installation
1. To clone the repository on your own device, use
```{bash}
git clone https://github.com/ManuelNavarroGarcia/fsam.git
cd fsam
```
2. To install the dependencies, there are two options according to your
installation preferences:
* Create and activate a virtual environment with `conda` (recommended)
```{bash}
conda env create -f env.yml
conda activate fsam
```
* Install the setuptools dependencies via `pip`
```{bash}
pip install -r requirements.txt
pip install -e .[dev]
```
3. If necessary, add version requirements to existing dependencies or add new
   ones in `setup.py`. Then, update the `requirements.txt` file using
```{bash}
pip-compile --extra dev > requirements.txt
```
and update the environment with `pip-sync`. Afterwards, reinstall the package in
editable mode:
```{bash}
pip install -e .[dev]
```
## Testing
The repository contains a folder with unit tests to check that the main methods
meet their design and behave as intended. To launch the test suite, simply run
`pytest`. To run a single test file, use
```{bash}
pytest tests/test_<file_name>.py
```
## Contributing
Contributions to the repository are welcome! Whether it is a small fix to the
documentation or a notable new feature, I encourage you to develop your ideas
and make this project greater. Even suggestions about the code structure are
highly appreciated. Furthermore, users who take part in these submissions will
be listed as contributors on the main page of the repository.

There are many ways you can contribute to this repository:
* [Discussions](https://github.com/ManuelNavarroGarcia/fsam/discussions).
To ask questions you are wondering about or share ideas, you can enter an
existing discussion or open a new one.
* [Issues](https://github.com/ManuelNavarroGarcia/fsam/issues). If you
  detect a bug or want to propose an enhancement of the current version of
  the code, an issue with reproducible code and/or a detailed description is
  highly appreciated.
* [Pull Requests](https://github.com/ManuelNavarroGarcia/fsam/pulls). If
  you feel I am missing an important feature, either in the code or in the
  documentation, I encourage you to start a pull request developing this idea.
  Nevertheless, before starting any major new feature work, I suggest you
  open an issue or start a discussion describing what you are planning to do.
  Recall that, before starting a pull request, all unit tests must pass on your
  local repository.
## Contact Information and Citation
If you run into any problem or have any questions while using `fsam`, please feel free
to let me know by email:
* Name: Manuel Navarro García (he/his)
* Email: <manuelnavarrogithub@gmail.com>
## Acknowledgements
Throughout the development of this project I have received strong support from
various individuals. I would like to thank my PhD supervisors, Professor [Vanesa
Guerrero](https://github.com/vanesaguerrero) and Professor [María
Durbán](https://github.com/MariaDurban), whose insightful comments and
invaluable expertise have given rise to many of the current functionalities of
the repository.