tf_idf
================
A short set of functions meant to help analyze cosine similarity between
texts.
## Install
``` sh
pip install tf_idf_cosimm
```
## How to use
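The example below compares two short passages: it preprocesses each text, computes TF-IDF weights for the stemmed tokens, and reports the cosine similarity between the documents.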
``` python
import tf_idf.core as tf_idf
import pandas as pd
```
``` python
AI = 'For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety'
ME = 'For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness'
# word_tokenize(AI.lower().split())
# preprocess_text(AI)
```
``` python
# Preprocess the first text; each pipeline stage is kept as its own column
compare = tf_idf.preprocess_text(AI)
```
``` python
# Preprocess the second text and append it as a new row
compare = pd.concat([compare, tf_idf.preprocess_text(ME)], ignore_index=True)
compare
```
<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>DOCUMENT</th>
<th>LOWERCASE</th>
<th>CLEANING</th>
<th>TOKENIZATION</th>
<th>STOP-WORDS</th>
<th>STEMMING</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety</td>
<td>for instance, in the design phase of a structural engineering project, monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety</td>
<td>for instance in the design phase of a structural engineering project monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties providing valuable insights into its reliability and safety</td>
<td>[for, instance, in, the, design, phase, of, a, structural, engineering, project, monte, carlo, simulations, can, help, evaluate, the, performance, of, a, proposed, design, under, different, loading, conditions, and, material, properties, providing, valuable, insights, into, its, reliability, and, safety]</td>
<td>[instance, design, phase, structural, engineering, project, monte, carlo, simulations, evaluate, performance, proposed, design, different, loading, conditions, material, properties, providing, valuable, insights, reliability, safety]</td>
<td>[instanc, design, phase, structur, engin, project, mont, carlo, simul, evalu, perform, propos, design, differ, load, condit, materi, properti, provid, valuabl, insight, reliabl, safeti]</td>
</tr>
<tr>
<th>1</th>
<td>For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>for instance, monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>for instance monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>[for, instance, monte, carlo, simulations, can, simulate, hundreds, or, thousands, of, different, combinations, of, loading, conditions, and, material, properties, to, create, statistical, predictions, of, structure, stiffness]</td>
<td>[instance, monte, carlo, simulations, simulate, hundreds, thousands, different, combinations, loading, conditions, material, properties, create, statistical, predictions, structure, stiffness]</td>
<td>[instanc, mont, carlo, simul, simul, hundr, thousand, differ, combin, load, condit, materi, properti, creat, statist, predict, structur, stiff]</td>
</tr>
</tbody>
</table>
</div>
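The column names in the table suggest the stages that `preprocess_text` applies. A minimal sketch of the first four stages, using only the standard library, might look like the following. `preprocess_sketch` and `STOP_WORDS` are hypothetical names, the stop-word list is an illustrative subset chosen to cover the words dropped in the table (not the list the package uses), and the stemming stage is left out because it needs a stemmer from an external library.

```python
import re

# Illustrative stop-word list (an assumption): just enough to cover the words
# dropped in the STOP-WORDS column above, not the list the package uses.
STOP_WORDS = {"for", "in", "the", "of", "a", "can", "help", "under",
              "and", "into", "its", "or", "to"}


def preprocess_sketch(text):
    """Return the intermediate result of each stage, keyed by column name."""
    lowercase = text.lower()
    # Drop every character that is neither a word character nor whitespace.
    cleaning = re.sub(r"[^\w\s]", "", lowercase)
    # Split on runs of whitespace.
    tokens = cleaning.split()
    # Keep only tokens that are not in the stop-word list.
    kept = [t for t in tokens if t not in STOP_WORDS]
    return {
        "DOCUMENT": text,
        "LOWERCASE": lowercase,
        "CLEANING": cleaning,
        "TOKENIZATION": tokens,
        "STOP-WORDS": kept,
    }


ME = ("For instance, Monte Carlo simulations can simulate hundreds or "
      "thousands of different combinations of loading conditions and material "
      "properties to create statistical predictions of structure stiffness")
stages = preprocess_sketch(ME)
print(stages["STOP-WORDS"])
```

Returning every intermediate stage mirrors the table layout, which makes it easy to see what each step removed.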
``` python
# Add one TF-IDF column per stemmed term to the preprocessed table
compare_tfidf = tf_idf.calculate_tfidf(compare)
compare_tfidf
```
<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>DOCUMENT</th>
<th>LOWERCASE</th>
<th>CLEANING</th>
<th>TOKENIZATION</th>
<th>STOP-WORDS</th>
<th>STEMMING</th>
<th>carlo</th>
<th>combin</th>
<th>condit</th>
<th>creat</th>
<th>...</th>
<th>propos</th>
<th>provid</th>
<th>reliabl</th>
<th>safeti</th>
<th>simul</th>
<th>statist</th>
<th>stiff</th>
<th>structur</th>
<th>thousand</th>
<th>valuabl</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety</td>
<td>for instance, in the design phase of a structural engineering project, monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety</td>
<td>for instance in the design phase of a structural engineering project monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties providing valuable insights into its reliability and safety</td>
<td>[for, instance, in, the, design, phase, of, a, structural, engineering, project, monte, carlo, simulations, can, help, evaluate, the, performance, of, a, proposed, design, under, different, loading, conditions, and, material, properties, providing, valuable, insights, into, its, reliability, and, safety]</td>
<td>[instance, design, phase, structural, engineering, project, monte, carlo, simulations, evaluate, performance, proposed, design, different, loading, conditions, material, properties, providing, valuable, insights, reliability, safety]</td>
<td>[instanc, design, phase, structur, engin, project, mont, carlo, simul, evalu, perform, propos, design, differ, load, condit, materi, properti, provid, valuabl, insight, reliabl, safeti]</td>
<td>0.158850</td>
<td>0.000000</td>
<td>0.158850</td>
<td>0.000000</td>
<td>...</td>
<td>0.223259</td>
<td>0.223259</td>
<td>0.223259</td>
<td>0.223259</td>
<td>0.158850</td>
<td>0.000000</td>
<td>0.000000</td>
<td>0.158850</td>
<td>0.000000</td>
<td>0.223259</td>
</tr>
<tr>
<th>1</th>
<td>For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>for instance, monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>for instance monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>[for, instance, monte, carlo, simulations, can, simulate, hundreds, or, thousands, of, different, combinations, of, loading, conditions, and, material, properties, to, create, statistical, predictions, of, structure, stiffness]</td>
<td>[instance, monte, carlo, simulations, simulate, hundreds, thousands, different, combinations, loading, conditions, material, properties, create, statistical, predictions, structure, stiffness]</td>
<td>[instanc, mont, carlo, simul, simul, hundr, thousand, differ, combin, load, condit, materi, properti, creat, statist, predict, structur, stiff]</td>
<td>0.193068</td>
<td>0.271351</td>
<td>0.193068</td>
<td>0.271351</td>
<td>...</td>
<td>0.000000</td>
<td>0.000000</td>
<td>0.000000</td>
<td>0.000000</td>
<td>0.386137</td>
<td>0.271351</td>
<td>0.271351</td>
<td>0.193068</td>
<td>0.271351</td>
<td>0.000000</td>
</tr>
</tbody>
</table>
<p>2 rows × 35 columns</p>
</div>
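The weights in the table are consistent with a smoothed inverse document frequency, ln((1 + n) / (1 + df)) + 1, multiplied by the raw term count and then L2-normalized per document. That is an inference from the numbers, not a statement about how `calculate_tfidf` is implemented. A dependency-free sketch under that assumption (`tfidf_sketch` is a hypothetical helper):

```python
import math
from collections import Counter


def tfidf_sketch(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (assumed weighting): ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        raw = {t: counts[t] * idf[t] for t in counts}
        # L2-normalize so that each document vector has unit length.
        norm = math.sqrt(sum(w * w for w in raw.values()))
        rows.append({t: w / norm for t, w in raw.items()})
    return rows


# The two STEMMING lists from the table above.
doc0 = ("instanc design phase structur engin project mont carlo simul evalu "
        "perform propos design differ load condit materi properti provid "
        "valuabl insight reliabl safeti").split()
doc1 = ("instanc mont carlo simul simul hundr thousand differ combin load "
        "condit materi properti creat statist predict structur stiff").split()

weights = tfidf_sketch([doc0, doc1])
print(round(weights[0]["carlo"], 6), round(weights[1]["simul"], 6))
```

Under this weighting, a stem that occurs in both documents (such as `carlo`) gets a lower weight than one that occurs in only one (such as `propos`), which matches the pattern in the table.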
``` python
tf_idf.cosineSimilarity(compare)
```
<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>DOCUMENT</th>
<th>STEMMING</th>
<th>COSIM</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety</td>
<td>[instanc, design, phase, structur, engin, project, mont, carlo, simul, evalu, perform, propos, design, differ, load, condit, materi, properti, provid, valuabl, insight, reliabl, safeti]</td>
<td>1.000000</td>
</tr>
<tr>
<th>1</th>
<td>For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness</td>
<td>[instanc, mont, carlo, simul, simul, hundr, thousand, differ, combin, load, condit, materi, properti, creat, statist, predict, structur, stiff]</td>
<td>0.337359</td>
</tr>
</tbody>
</table>
</div>
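The `COSIM` column appears to hold each document's cosine similarity to the first document, which is why row 0 is 1.0. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it for sparse vectors stored as dictionaries. `cosine_sketch` is a hypothetical helper, and the weights assume a term count multiplied by a smoothed IDF of ln(3/2) + 1 for stems found in only one of the two documents and 1 for stems found in both; that weighting is an assumption that happens to agree with the values shown above.

```python
import math


def cosine_sketch(u, v):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


# Assumed weighting: smoothed IDF for a stem found in one of two documents.
idf = math.log(3 / 2) + 1
shared = "instanc mont carlo simul differ load condit materi properti structur".split()
only0 = ("phase engin project evalu perform propos provid valuabl insight "
         "reliabl safeti").split()
only1 = "hundr thousand combin creat statist predict stiff".split()

# "design" occurs twice in document 0 and "simul" twice in document 1.
doc0 = {**{t: 1.0 for t in shared}, **{t: idf for t in only0}, "design": 2 * idf}
doc1 = {**{t: 1.0 for t in shared}, **{t: idf for t in only1}, "simul": 2.0}

print(round(cosine_sketch(doc0, doc1), 6))
```

Because cosine similarity ignores vector length, the vectors do not need to be normalized first; only the stems the two documents share contribute to the dot product.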
Raw data
{
"_id": null,
"home_page": "https://github.com/cooperrc/tf_idf_cosimm",
"name": "tf-idf-cosimm",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "nbdev jupyter notebook python",
"author": "Ryan C. Cooper",
"author_email": "ryan.c.cooper@uconn.edu",
"download_url": "https://files.pythonhosted.org/packages/de/da/f1897e332602ef43985bac7c301285a94e371c7ccdfd84935835037cc94b/tf_idf_cosimm-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache Software License 2.0",
"summary": "This is a short set of functions meant to help analyze cosine similarity between texts",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/cooperrc/tf_idf_cosimm"
},
"split_keywords": [
"nbdev",
"jupyter",
"notebook",
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c6a0b83d5cd1985bc46ccab65f313708019a370deed2deee39de7e017f06cefd",
"md5": "055f30b238d0e168dc78460686e06c91",
"sha256": "daac75b3065830310aa19fb844109e622f138a809e2d7d958545ce4a2e8cd667"
},
"downloads": -1,
"filename": "tf_idf_cosimm-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "055f30b238d0e168dc78460686e06c91",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 9200,
"upload_time": "2024-05-28T18:03:18",
"upload_time_iso_8601": "2024-05-28T18:03:18.368588Z",
"url": "https://files.pythonhosted.org/packages/c6/a0/b83d5cd1985bc46ccab65f313708019a370deed2deee39de7e017f06cefd/tf_idf_cosimm-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "dedaf1897e332602ef43985bac7c301285a94e371c7ccdfd84935835037cc94b",
"md5": "a01ba3d2cea0953d7717ed50a3c0a26e",
"sha256": "a3e9a38c4cd53e5720bca687215abdc273a71d5a39f7e59ae659f9abc4e69c96"
},
"downloads": -1,
"filename": "tf_idf_cosimm-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "a01ba3d2cea0953d7717ed50a3c0a26e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 10424,
"upload_time": "2024-05-28T18:03:19",
"upload_time_iso_8601": "2024-05-28T18:03:19.363731Z",
"url": "https://files.pythonhosted.org/packages/de/da/f1897e332602ef43985bac7c301285a94e371c7ccdfd84935835037cc94b/tf_idf_cosimm-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-28 18:03:19",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cooperrc",
"github_project": "tf_idf_cosimm",
"github_not_found": true,
"lcname": "tf-idf-cosimm"
}