m2metricforge


Name: m2metricforge
Version: 0.0.4
Home page: https://github.com/HDD-Team/metricforge
Summary: Library for generating validation dataset and evaluation metrics
Upload time: 2024-06-11 20:51:23
Maintainer: None
Docs URL: None
Author: m2syndicate
Requires Python: >=3.7
License: None
Keywords: metric, validation, generate
Requirements: no requirements were recorded
# **MetricForge**
## Library for generating validation datasets and evaluation metrics

### Use  `import metricforge.main as mf`  
### List of functions:

##### script_valid

Creates a validation dataset (or a new column) using a model. Either pass in your own answer-generating function, or select the model you need and a prompt; a new CSV file is then generated.

###### variables
**file_base** - your CSV file with the training data, e.g. `"your_csv.csv"` <br />
**df_name** - name of your new dataframe <br />
**column_name** - name of the column used for validation generation <br />
**file_new** - name of the new CSV file, e.g. `"new_csv.csv"` <br />
**model_name** - name of the model you want to use <br />
**prompt** - prompt for the LLM that generates the validation dataset <br />

**Example usage of function:**

```python
mf.script_valid(
    file_base="dataset.csv",
    df_name="generated_answer",
    column_name="data/dictionary",
    prompt="You are a validation generator dataset bot. "
           "You are creating a validation dataset based on a training dataset. "
           "Based on the given query, generate a similar query.",
    file_new="file_new.csv",
    model_name="mistral:instruct",
)
```

##### calculate_metrics

Calculates the Accuracy and F1 Score metrics; the calculations use the scikit-learn library. `column_one` and `column_two` may each be given either as a column name or as a column number.

###### variables
**csv_file** - your CSV file with the data, e.g. `"your_csv.csv"` <br />
**column_one** - number or name of the first column, containing your original data <br />
**column_two** - number or name of the second column, containing the generated or predicted data <br />

**Example usage of function:**

```python
csv_file = r"validated_dataset.csv"
accuracy, f1 = mf.calculate_metrics(csv_file, 3, 4)
print("Accuracy:", accuracy)
print("F1 Score:", f1)
```
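For intuition, a minimal sketch of what a metric calculation like this could look like, written here with pandas and scikit-learn. The function name `calculate_metrics_sketch` and the `average="weighted"` choice are illustrative assumptions, not the library's actual implementation:

```python
# Hypothetical sketch of an accuracy/F1 calculation over two CSV columns,
# accepting columns by position (int) or by name (str), as described above.
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score

def calculate_metrics_sketch(csv_file, column_one, column_two):
    df = pd.read_csv(csv_file)
    # Resolve each column either by number or by name.
    col1 = df.iloc[:, column_one] if isinstance(column_one, int) else df[column_one]
    col2 = df.iloc[:, column_two] if isinstance(column_two, int) else df[column_two]
    accuracy = accuracy_score(col1, col2)
    # "weighted" averaging handles multi-class labels; an assumption here.
    f1 = f1_score(col1, col2, average="weighted")
    return accuracy, f1
```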

##### script_generate

Applies your RAG function `model_query` to the provided dataset to generate answers for `column_name` in your dataset. By default, `model_query` is expected to return its answer as a `str`.

###### variables

**csv_file** - your CSV file with the provided data, e.g. `"your_csv.csv"` <br />
**column_name** - name of the column used for answer generation <br />
**dfnew_name** - name of the new dataframe (output CSV) <br />
**model_query** - your function wrapping the RAG chain, whose return value is the chain's output <br />

**Example usage of function:**

```python
mf.script_generate(
    csv_file=r"datavalid.csv",
    column_name="data/dictionary",
    model_query=model_query,
    dfnew_name="testing.csv",
)
```
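`model_query` is a function you supply. A minimal sketch of its expected shape, where the retrieval and generation steps are placeholders (a real chain would call your retriever and LLM, e.g. a LangChain chain's `.invoke`):

```python
# Hypothetical model_query to pass to mf.script_generate: it takes one
# query string and returns the RAG chain's answer as a plain str.
def model_query(query: str) -> str:
    # Placeholder retrieval step; a real chain would search your vector store.
    context = "retrieved context for: " + query
    # Placeholder generation step; a real chain would call your LLM here.
    answer = f"Answer based on [{context}]"
    return answer
```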

##### script_generate_json

Applies your RAG function `model_query` to the provided dataset to generate answers for `column_name` in your dataset. This function assumes your LLM returns its answer as JSON, so you must select which JSON field to extract via the `desired_data` variable.

###### variables

**csv_file** - your CSV file with the provided data, e.g. `"your_csv.csv"` <br />
**column_name** - name of the column used for answer generation <br />
**dfnew_name** - name of the new dataframe (output CSV) <br />
**model_query** - your function wrapping the RAG chain, whose return value is the chain's output <br />
**desired_data** - name of the JSON field whose value is extracted from the JSON and written to the resulting CSV column <br />

**Example usage of function:**

```python
mf.script_generate_json(
    csv_file=r"datavalid.csv",
    column_name="data/dictionary",
    model_query=model_query,
    dfnew_name="testing.csv",
    desired_data="data/url",
)
```
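A sketch of the JSON variant: here `model_query` returns a JSON string, and the final two lines show the kind of extraction `script_generate_json` presumably performs for `desired_data`. The `"data/url"` key mirrors the example above; the schema and URL are assumptions:

```python
# Hypothetical JSON-returning model_query for mf.script_generate_json.
import json

def model_query(query: str) -> str:
    # Stand-in for a RAG chain whose LLM answers in JSON form.
    return json.dumps({"data/url": "https://example.com/doc", "answer": query})

# Extracting the field named by desired_data from the JSON answer:
result = json.loads(model_query("find the doc"))
desired = result["data/url"]
```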

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/HDD-Team/metricforge",
    "name": "m2metricforge",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "metric, validation, generate",
    "author": "m2syndicate",
    "author_email": "mtwosyndicate@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/65/19/c2c6e4ebeb7b94741acd060b58df7a246b4eea1ddb87203e5960007259c2/m2metricforge-0.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Library for generating validation dataset and evaluation metrics",
    "version": "0.0.4",
    "project_urls": {
        "Documentation": "https://github.com/HDD-Team/metricforge/blob/main/README.MD",
        "Homepage": "https://github.com/HDD-Team/metricforge"
    },
    "split_keywords": [
        "metric",
        " validation",
        " generate"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9788e97366b9da0235fdabb3de24bb7f832aa118ffa02294de485a22101a04e6",
                "md5": "728b46f67c68d8f398236b6ebe6a7f4e",
                "sha256": "6d0a860c35814983f2724093879275b000ae1f105380a3ccba4a500e392735e1"
            },
            "downloads": -1,
            "filename": "m2metricforge-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "728b46f67c68d8f398236b6ebe6a7f4e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 8350,
            "upload_time": "2024-06-11T20:51:21",
            "upload_time_iso_8601": "2024-06-11T20:51:21.716372Z",
            "url": "https://files.pythonhosted.org/packages/97/88/e97366b9da0235fdabb3de24bb7f832aa118ffa02294de485a22101a04e6/m2metricforge-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6519c2c6e4ebeb7b94741acd060b58df7a246b4eea1ddb87203e5960007259c2",
                "md5": "5e833af359a06c193cd11f6afb836e08",
                "sha256": "8ab2ca94d2c72622a2c36fda6788746830ac468485f161776008ce3b0b44cb59"
            },
            "downloads": -1,
            "filename": "m2metricforge-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "5e833af359a06c193cd11f6afb836e08",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 7502,
            "upload_time": "2024-06-11T20:51:23",
            "upload_time_iso_8601": "2024-06-11T20:51:23.057315Z",
            "url": "https://files.pythonhosted.org/packages/65/19/c2c6e4ebeb7b94741acd060b58df7a246b4eea1ddb87203e5960007259c2/m2metricforge-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-11 20:51:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "HDD-Team",
    "github_project": "metricforge",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "m2metricforge"
}
        