# feature-space
> A modular framework for constructing a network of features that build on one another, in which each feature is calculated only once.
## Installation
```
pip install feature-space
```
## Examples
Basic creation of features and datasets with multiple dependencies.
```python
import pandas as pd
import numpy as np
df = pd.DataFrame(
    {
        key: np.random.random(100)
        for key in ('Open', 'High', 'Low', 'Close', 'Volume')
    }
)
print(df)
```
Output:
```
        Open      High       Low     Close    Volume
0   0.306962  0.090669  0.957007  0.382841  0.331181
1   0.668492  0.233647  0.601794  0.533531  0.761473
2   0.582980  0.765049  0.453987  0.989116  0.439396
3   0.053769  0.512395  0.763573  0.589263  0.886496
4   0.690432  0.372401  0.960555  0.202977  0.133927
..       ...       ...       ...       ...       ...
95  0.469604  0.591768  0.590435  0.138835  0.217345
96  0.304976  0.521499  0.006687  0.545035  0.974107
97  0.816594  0.639280  0.702651  0.942868  0.681855
98  0.387333  0.232820  0.563151  0.123126  0.051621
99  0.930279  0.657109  0.620474  0.794123  0.134324
```
Create the indicators and the relationships between them.
```python
from feature_space import Column, RSI, Change, Momentum
high = Column('High')
low = Column('Low')
close = Column('Close')
close_change = Change(close)
close_rsi_14 = RSI(close_change, 14)
close_momentum = Momentum(close_change, 35)
```
Create a dataset to contain and control the features.
The Dataset object is a convenience: everything it does can also be done
by interacting with each feature individually.
```python
from feature_space import Dataset
change_indicators = Dataset(
    name='Change_Features',
    features=[close_change, close_rsi_14, close_momentum]
)
change_indicators.calculate(df)
df.dropna(inplace=True)
print(df)
```
Output: all features are present in the DataFrame,
and each one was calculated only once,
even though some features are required by more than one other feature.
```
        Open      High       Low     Close    Volume  Close_Change  Close_RSI_14  Close_Momentum_35
13  0.762020  0.053808  0.079920  0.061354  0.169120     -0.514332     45.932592          -0.321487
14  0.683689  0.948868  0.291903  0.461534  0.557272      0.400181     50.904070           0.078693
15  0.729113  0.352819  0.267228  0.923362  0.331447      0.461828     54.179768           0.540522
16  0.633024  0.931491  0.092854  0.910211  0.164508     -0.013152     49.065301           0.527370
17  0.321494  0.662967  0.253199  0.643929  0.810552     -0.266282     50.668724           0.261088
..       ...       ...       ...       ...       ...           ...           ...                ...
95  0.469604  0.591768  0.590435  0.138835  0.217345     -0.635993     43.876190          -0.447753
96  0.304976  0.521499  0.006687  0.545035  0.974107      0.406200     51.201515          -0.293790
97  0.816594  0.639280  0.702651  0.942868  0.681855      0.397833     57.625151           0.194520
98  0.387333  0.232820  0.563151  0.123126  0.051621     -0.819742     45.336996           0.061992
99  0.930279  0.657109  0.620474  0.794123  0.134324      0.670997     48.895303          -0.160722
```
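The compute-once behaviour can be pictured with a small sketch. This is not the feature-space implementation: `SketchFeature`, its `calls` counter, and the helper functions are hypothetical names, and plain lists stand in for pandas series. Each node caches its result, so a dependency shared by two features runs only once.

```python
class SketchFeature:
    """Hypothetical stand-in for a feature node; not the feature_space API."""

    def __init__(self, name, func, dependencies=()):
        self.name = name
        self.func = func
        self.dependencies = list(dependencies)
        self.cache = None  # cached result, filled on first calculation
        self.calls = 0     # how many times func actually ran

    def calculate(self, table):
        if self.cache is None:
            # Resolve dependencies first; each returns its own cached result.
            inputs = [dep.calculate(table) for dep in self.dependencies]
            self.cache = self.func(table, *inputs)
            self.calls += 1
            table[self.name] = self.cache
        return self.cache


def change(table, values):
    # Difference between consecutive values, padded with 0.0 at the start.
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


def rolling_sum(window):
    def compute(table, values):
        return [
            sum(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))
        ]
    return compute


table = {"Close": [1.0, 3.0, 2.0, 5.0]}
close = SketchFeature("Close", lambda t: t["Close"])
close_change = SketchFeature("Close_Change", change, [close])
sum_2 = SketchFeature("Close_Change_Sum_2", rolling_sum(2), [close_change])
sum_3 = SketchFeature("Close_Change_Sum_3", rolling_sum(3), [close_change])

for feature in (close_change, sum_2, sum_3):
    feature.calculate(table)

print(close_change.calls)  # 1, although two features depend on it
```

Caching on the node rather than in the caller means the order in which features are requested does not matter: whichever feature asks first triggers the shared calculation.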
To recalculate the features, for example after a change in the original source data,
call the `.clear()` method on a Feature or a Dataset object.
This does not remove any data from the DataFrame; it only clears the cached references to the
series held by the features.
To overwrite data that already exists in the DataFrame, pass `override=True` to `calculate`.
```python
change_indicators.clear()
change_indicators.calculate(df, override=True)
```
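One way to read the difference between clearing and overriding is sketched below. The semantics here are an assumption, not confirmed behaviour of feature-space: `CachedColumn` is a hypothetical class, plain dicts and lists stand in for a DataFrame, and `override=False` is assumed to reuse a column that already exists.

```python
class CachedColumn:
    """Hypothetical feature that caches a reference to its column."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.cache = None

    def clear(self):
        # Drop only the cached reference; the table keeps its column.
        self.cache = None

    def calculate(self, table, override=False):
        if self.cache is not None:
            return self.cache
        if self.name in table and not override:
            # Assumed behaviour: reuse the existing column as-is.
            self.cache = table[self.name]
        else:
            # Recompute and overwrite whatever the table held before.
            self.cache = self.func(table)
            table[self.name] = self.cache
        return self.cache


table = {"Close": [1.0, 2.0, 4.0]}
doubled = CachedColumn("Close_Doubled", lambda t: [2 * v for v in t["Close"]])
doubled.calculate(table)

# The source data changes after the first calculation.
table["Close"] = [10.0, 20.0, 40.0]

doubled.clear()
stale = list(doubled.calculate(table))                  # old column reused
doubled.clear()
fresh = list(doubled.calculate(table, override=True))   # recomputed
```

Under these assumptions, clearing alone is not enough to refresh a column that is still present in the table, which is why the two calls appear together.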
Save and load a whole dataset, including the dependencies between its features:
```python
change_indicators.save('dataset.pkl')
change_indicators = Dataset.load('dataset.pkl')
```
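The package metadata lists dill as a requirement, which suggests pickle-style serialization, though that is an assumption about the internals. The sketch below uses the standard-library `pickle` module and hypothetical dict-based nodes to show why this kind of serialization suits a dependency graph: a dependency shared by two features is still one shared object after loading.

```python
import os
import pickle
import tempfile

# Hypothetical dict-based nodes standing in for feature objects.
change = {"name": "Close_Change", "dependencies": []}
rsi = {"name": "Close_RSI_14", "dependencies": [change]}
momentum = {"name": "Close_Momentum_35", "dependencies": [change]}

path = os.path.join(tempfile.mkdtemp(), "graph.pkl")

# Dump both features in a single call so pickle can track shared objects.
with open(path, "wb") as file:
    pickle.dump([rsi, momentum], file)

with open(path, "rb") as file:
    loaded_rsi, loaded_momentum = pickle.load(file)

# Both loaded features point at the same restored dependency object.
shared = loaded_rsi["dependencies"][0] is loaded_momentum["dependencies"][0]
print(shared)  # True
```

Sharing survives only within a single dump; pickling each feature to its own file would restore separate copies of the dependency.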
Raw data
{
"_id": null,
"home_page": "https://github.com/Shahaf-F-S/feature-space",
"name": "feature-space",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Shahaf Frank-Shapir",
"author_email": "shahaffrs@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/cb/f6/24697c8c3b0c56ca9b2bc86bf5afc483faf433e970dfa24a87ad209794c8/feature_space-0.0.5.tar.gz",
"platform": null,
"description": "# feature-space\r\n\r\n> A module framework for constructing a network of features supporting each other, yet each feature is calculated only once.\r\n\r\n## Installation\r\n\r\n```\r\npip install feature-space\r\n```\r\n\r\n## examples\r\n\r\nbasic creation of features and datasets with multiple dependencies.\r\n\r\n```python\r\nimport pandas as pd\r\nimport numpy as np\r\n\r\ndf = pd.DataFrame(\r\n {\r\n key: np.random.random(100)\r\n for key in ('Open', 'High', 'Low', 'Close', 'Volume')\r\n }\r\n)\r\n\r\nprint(df)\r\n```\r\n\r\noutput\r\n```\r\n Open High Low Close Volume\r\n0 0.306962 0.090669 0.957007 0.382841 0.331181\r\n1 0.668492 0.233647 0.601794 0.533531 0.761473\r\n2 0.582980 0.765049 0.453987 0.989116 0.439396\r\n3 0.053769 0.512395 0.763573 0.589263 0.886496\r\n4 0.690432 0.372401 0.960555 0.202977 0.133927\r\n.. ... ... ... ... ...\r\n95 0.469604 0.591768 0.590435 0.138835 0.217345\r\n96 0.304976 0.521499 0.006687 0.545035 0.974107\r\n97 0.816594 0.639280 0.702651 0.942868 0.681855\r\n98 0.387333 0.232820 0.563151 0.123126 0.051621\r\n99 0.930279 0.657109 0.620474 0.794123 0.134324\r\n```\r\n\r\ncreating the indicators with their relationships.\r\n```python\r\nfrom feature_space import Column, RSI, Change, Momentum\r\n\r\nhigh = Column('High')\r\nlow = Column('Low')\r\nclose = Column('Close')\r\n\r\nclose_change = Change(close)\r\nclose_rsi_14 = RSI(close_change, 14)\r\nclose_momentum = Momentum(close_change, 35)\r\n```\r\n\r\ncreating a dataset to contain and control the features.\r\nusing the dataset object is simple, but everything it does can be done \r\nwith individual interactions with each feature.\r\n```python\r\nfrom feature_space import Dataset\r\n\r\nchange_indicators = Dataset(\r\n name='Change_Features',\r\n features=[close_change, close_rsi_14, close_momentum]\r\n)\r\n\r\nchange_indicators.calculate(df)\r\n\r\ndf.dropna(inplace=True)\r\n\r\nprint(df)\r\n```\r\n\r\noutput - all features are present in the dataframe, 
\r\nand each one was calculated only once, \r\neven though some features are required by more than one feature.\r\n```\r\noutput\r\nOpen High Low Close Volume Close_Change Close_RSI_14 Close_Momentum_35\r\n13 0.762020 0.053808 0.079920 0.061354 0.169120 -0.514332 45.932592 -0.321487\r\n14 0.683689 0.948868 0.291903 0.461534 0.557272 0.400181 50.904070 0.078693\r\n15 0.729113 0.352819 0.267228 0.923362 0.331447 0.461828 54.179768 0.540522\r\n16 0.633024 0.931491 0.092854 0.910211 0.164508 -0.013152 49.065301 0.527370\r\n17 0.321494 0.662967 0.253199 0.643929 0.810552 -0.266282 50.668724 0.261088\r\n.. ... ... ... ... ... ... ... ...\r\n95 0.469604 0.591768 0.590435 0.138835 0.217345 -0.635993 43.876190 -0.447753\r\n96 0.304976 0.521499 0.006687 0.545035 0.974107 0.406200 51.201515 -0.293790\r\n97 0.816594 0.639280 0.702651 0.942868 0.681855 0.397833 57.625151 0.194520\r\n98 0.387333 0.232820 0.563151 0.123126 0.051621 -0.819742 45.336996 0.061992\r\n99 0.930279 0.657109 0.620474 0.794123 0.134324 0.670997 48.895303 -0.160722\r\n```\r\n\r\nto recalculate each feature one again, maby after a change in the original source data,\r\nsimply call the .clear() method on a Feature or a Dataset object.\r\nThis will not remove any data from the dataframe, just clear the cached referenses to the \r\nseries in the features.\r\n\r\nif you wish to override existing data, you can specify.\r\n\r\n```python\r\nchange_indicators.clear()\r\nchange_indicators.calculate(df, override=True)\r\n```\r\n\r\nSave and load a whole dataset with its inter-dependency of features:\r\n```python\r\nchange_indicators.save('dataset.pkl')\r\nchange_indicators = Dataset.load('dataset.pkl')\r\n```\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A module framework for constructing a network of features supporting each other, yet each feature is calculated only once.",
"version": "0.0.5",
"project_urls": {
"Homepage": "https://github.com/Shahaf-F-S/feature-space"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "cbf624697c8c3b0c56ca9b2bc86bf5afc483faf433e970dfa24a87ad209794c8",
"md5": "eda242c8a9ccb05a01a2259ae92775ef",
"sha256": "814e449e092e771b965749146083df726c00ae4c0d3481c387c77a517cfc2ada"
},
"downloads": -1,
"filename": "feature_space-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "eda242c8a9ccb05a01a2259ae92775ef",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10714,
"upload_time": "2024-04-21T14:23:30",
"upload_time_iso_8601": "2024-04-21T14:23:30.121701Z",
"url": "https://files.pythonhosted.org/packages/cb/f6/24697c8c3b0c56ca9b2bc86bf5afc483faf433e970dfa24a87ad209794c8/feature_space-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-21 14:23:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Shahaf-F-S",
"github_project": "feature-space",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pandas",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "dill",
"specs": []
}
],
"lcname": "feature-space"
}