[![License: Apache 2](https://img.shields.io/badge/License-apache2-green.svg)](LICENSE)
[![TraceML](https://github.com/polyaxon/traceml/actions/workflows/traceml.yml/badge.svg)](https://github.com/polyaxon/traceml/actions/workflows/traceml.yml)
[![Slack](https://img.shields.io/badge/chat-on%20slack-aadada.svg?logo=slack&longCache=true)](https://polyaxon.com/slack/)
[![Docs](https://img.shields.io/badge/docs-stable-brightgreen.svg?style=flat)](https://polyaxon.com/docs/)
[![GitHub](https://img.shields.io/badge/issue_tracker-github-blue?logo=github)](https://github.com/polyaxon/polyaxon/issues)
[![GitHub](https://img.shields.io/badge/roadmap-github-blue?logo=github)](https://github.com/polyaxon/polyaxon/milestones)
<a href="https://polyaxon.com"><img src="https://raw.githubusercontent.com/polyaxon/polyaxon/master/artifacts/packages/traceml.svg" width="125" height="125" align="right" /></a>
# TraceML
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
## Install
```bash
pip install traceml
```
If you would like to use the tracking features, you need to install `polyaxon` as well:
```bash
pip install polyaxon traceml
```
## [WIP] Local sandbox
> Coming soon
## Offline usage
You can enable the offline mode to track runs without an API:
```bash
export POLYAXON_OFFLINE="true"
```
Or pass the offline flag when initializing the run:
```python
from traceml import tracking
tracking.init(..., is_offline=True, ...)
```
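If the variable has to be set from inside a Python process rather than from the shell, something like the following sketch may work. It assumes the variable is read when the run is initialized, so it is set before any tracking call. The `offline_requested` helper is hypothetical and only illustrates the check; it is not how the library parses the flag.

```python
import os

# Set the flag early, before any tracking code runs (assumption: the
# variable is read when the run is initialized).
os.environ["POLYAXON_OFFLINE"] = "true"


def offline_requested(env=os.environ):
    """Hypothetical helper: report whether the offline flag is set to "true"."""
    return env.get("POLYAXON_OFFLINE", "").lower() == "true"


print(offline_requested())
```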
## Simple usage in a Python script
```python
import random

from traceml import tracking

tracking.init(
    is_offline=True,
    project='quick-start',
    name="my-new-run",
    description="trying TraceML",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

# Track some data refs (X_train and y_train are assumed to be loaded beforehand)
tracking.log_data_ref(content=X_train, name='x_train')
tracking.log_data_ref(content=y_train, name='y_train')

# Track inputs
tracking.log_inputs(
    batch_size=64,
    dropout=0.2,
    learning_rate=0.001,
    optimizer="Adam"
)


def get_loss(step):
    # Synthetic loss that decays with the step, plus some random noise
    result = 10 / (step + 1)
    noise = (random.random() - 0.5) * 0.5 * result
    return result + noise


# Track metrics
for step in range(100):
    loss = get_loss(step)
    tracking.log_metrics(
        loss=loss,
        accuracy=(100 - loss) / 100.0,
    )

# Track some one-time results
tracking.log_outputs(validation_score=0.66)

# Optionally stop the tracking process manually
tracking.stop()
```
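To get a feel for the values that the loop above sends to `log_metrics`, the synthetic loss can be run on its own, without any tracking backend. This is only a sketch: the `history` list and the fixed seed are additions for illustration and are not part of TraceML.

```python
import random


def get_loss(step):
    # Same synthetic loss as in the example above
    result = 10 / (step + 1)
    noise = (random.random() - 0.5) * 0.5 * result
    return result + noise


random.seed(0)  # fixed seed so repeated runs produce the same numbers

# Collect what would be logged at each step instead of sending it anywhere
history = []
for step in range(100):
    loss = get_loss(step)
    history.append({"step": step, "loss": loss, "accuracy": (100 - loss) / 100.0})

print(history[0])
print(history[-1])
```

Since the noise term stays within a quarter of the noiseless value, the loss at each step lies between 75% and 125% of `10 / (step + 1)`.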
## Integration with deep learning and machine learning libraries and frameworks
### Keras
You can use TraceML's callback to automatically save all metrics and collect outputs and models. You can also track additional information using the logging methods:
```python
from traceml import tracking
from traceml.integrations.keras import Callback

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="keras-run",
    description="trying TraceML & Keras",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

tracking.log_inputs(
    batch_size=64,
    dropout=0.2,
    learning_rate=0.001,
    optimizer="Adam"
)
tracking.log_data_ref(content=x_train, name='x_train')
tracking.log_data_ref(content=y_train, name='y_train')
tracking.log_data_ref(content=x_test, name='x_test')
tracking.log_data_ref(content=y_test, name='y_test')

# ...

model.fit(
    x_train,
    y_train,
    validation_data=(x_test, y_test),
    epochs=epochs,
    batch_size=100,
    callbacks=[Callback()],
)
```
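Conceptually, a framework callback receives the per-epoch logs from the training loop and forwards the numeric values to a logging function. The sketch below illustrates that pattern without any framework installed. It is not TraceML's implementation: `MetricsForwarder` and `fake_log_metrics` are hypothetical names, and only the `on_epoch_end(epoch, logs)` hook mirrors the Keras callback signature.

```python
class MetricsForwarder:
    """Hypothetical stand-in for a framework callback (illustration only)."""

    def __init__(self, log_fn):
        self.log_fn = log_fn

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Forward only numeric entries; anything else is skipped
        metrics = {
            key: float(value)
            for key, value in logs.items()
            if isinstance(value, (int, float))
        }
        if metrics:
            self.log_fn(step=epoch, **metrics)


recorded = []


def fake_log_metrics(step=None, **metrics):
    # Stand-in for a real logging call; it just records what it receives
    recorded.append((step, metrics))


callback = MetricsForwarder(fake_log_metrics)
for epoch, loss in enumerate([0.9, 0.5, 0.3]):
    callback.on_epoch_end(epoch, logs={"loss": loss, "note": "ignored"})

print(recorded)
```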
### PyTorch
You can log metrics, inputs, and outputs of PyTorch experiments using the tracking module:
```python
import torch
import torch.nn.functional as F

from traceml import tracking

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="pytorch-run",
    description="trying TraceML & PyTorch",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

tracking.log_inputs(
    batch_size=64,
    dropout=0.2,
    learning_rate=0.001,
    optimizer="Adam"
)

# Metrics (model, optimizer, and train_loader are assumed to be defined elsewhere)
for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()  # clear gradients left over from the previous batch
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    # Log a plain Python float rather than a tensor
    tracking.log_metrics(loss=loss.item())

asset_path = tracking.get_outputs_path('model.ckpt')
torch.save(model.state_dict(), asset_path)

# log model
tracking.log_artifact_ref(asset_path, framework="pytorch", ...)
```
### TensorFlow
You can log metrics, outputs, and models of TensorFlow experiments and distributed TensorFlow experiments using the tracking module:
```python
from traceml import tracking
from traceml.integrations.tensorflow import Callback

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="tf-run",
    description="trying TraceML & Tensorflow",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

tracking.log_inputs(
    batch_size=64,
    dropout=0.2,
    learning_rate=0.001,
    optimizer="Adam"
)

# log model
estimator.train(hooks=[Callback(log_image=True, log_histo=True, log_tensor=True)])
```
### Fastai
You can log metrics, outputs, and models of Fastai experiments using the tracking module:
```python
from traceml import tracking
from traceml.integrations.fastai import Callback

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="fastai-run",
    description="trying TraceML & Fastai",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

# Log model metrics
learn.fit(..., cbs=[Callback()])
```
### PyTorch Lightning
You can log metrics, outputs, and models of PyTorch Lightning experiments using the tracking module:
```python
import pytorch_lightning as pl

from traceml import tracking
from traceml.integrations.pytorch_lightning import Callback

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="pytorch-lightning-run",
    description="trying TraceML & Lightning",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

...
trainer = pl.Trainer(
    gpus=0,
    progress_bar_refresh_rate=20,
    max_epochs=2,
    logger=Callback(),
)
```
### HuggingFace
You can log metrics, outputs, and models of HuggingFace experiments using the tracking module:
```python
from transformers import Trainer

from traceml import tracking
from traceml.integrations.hugging_face import Callback

tracking.init(
    is_offline=True,
    project='tracking-project',
    name="hg-run",
    description="trying TraceML & HuggingFace",
    tags=["examples"],
    artifacts_path="path/to/artifacts/repo"
)

...
# model, training_args, and the datasets are assumed to be defined elsewhere
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    callbacks=[Callback],
    # ...
)
```
## Tracking artifacts
```python
import altair as alt
import matplotlib.pyplot as plt
import numpy as np
import plotly.express as px
from bokeh.plotting import figure
from vega_datasets import data

from traceml import tracking


def plot_mpl_figure(step):
    np.random.seed(19680801)
    data = np.random.randn(2, 100)

    figure, axs = plt.subplots(2, 2, figsize=(5, 5))
    axs[0, 0].hist(data[0])
    axs[1, 0].scatter(data[0], data[1])
    axs[0, 1].plot(data[0], data[1])
    axs[1, 1].hist2d(data[0], data[1])

    tracking.log_mpl_image(figure, 'mpl_image', step=step)


def log_bokeh(step):
    factors = ["a", "b", "c", "d", "e", "f", "g", "h"]
    x = [50, 40, 65, 10, 25, 37, 80, 60]

    dot = figure(title="Categorical Dot Plot", tools="", toolbar_location=None,
                 y_range=factors, x_range=[0, 100])

    dot.segment(0, factors, x, factors, line_width=2, line_color="green", )
    dot.circle(x, factors, size=15, fill_color="orange", line_color="green", line_width=3, )

    factors = ["foo 123", "bar:0.2", "baz-10"]
    x = ["foo 123", "foo 123", "foo 123", "bar:0.2", "bar:0.2", "bar:0.2", "baz-10", "baz-10",
         "baz-10"]
    y = ["foo 123", "bar:0.2", "baz-10", "foo 123", "bar:0.2", "baz-10", "foo 123", "bar:0.2",
         "baz-10"]
    colors = [
        "#0B486B", "#79BD9A", "#CFF09E",
        "#79BD9A", "#0B486B", "#79BD9A",
        "#CFF09E", "#79BD9A", "#0B486B"
    ]

    hm = figure(title="Categorical Heatmap", tools="hover", toolbar_location=None,
                x_range=factors, y_range=factors)

    hm.rect(x, y, color=colors, width=1, height=1)

    tracking.log_bokeh_chart(name='confusion-bokeh', figure=hm, step=step)


def log_altair(step):
    source = data.cars()

    brush = alt.selection(type='interval')

    points = alt.Chart(source).mark_point().encode(
        x='Horsepower:Q',
        y='Miles_per_Gallon:Q',
        color=alt.condition(brush, 'Origin:N', alt.value('lightgray'))
    ).add_selection(
        brush
    )

    bars = alt.Chart(source).mark_bar().encode(
        y='Origin:N',
        color='Origin:N',
        x='count(Origin):Q'
    ).transform_filter(
        brush
    )

    chart = points & bars

    tracking.log_altair_chart(name='altair_chart', figure=chart, step=step)


def log_plotly(step):
    df = px.data.tips()

    fig = px.density_heatmap(df, x="total_bill", y="tip", facet_row="sex", facet_col="smoker")
    tracking.log_plotly_chart(name="2d-hist", figure=fig, step=step)


plot_mpl_figure(100)
log_bokeh(100)
log_altair(100)
log_plotly(100)
```
## Tracking DataFrames
### Summary
An extension of the [pandas](http://pandas.pydata.org/) DataFrame `describe()` function.
The module contains a `DataFrameSummary` object that extends `describe()` with:
- **properties**
  - dfs.columns_stats: counts, uniques, missing, missing_perc, and type per column
  - dfs.columns_types: a count of the types of columns
  - dfs[column]: a more in-depth summary of the column
- **function**
  - summary(): extends the `describe()` function with the values from `columns_stats`

`DataFrameSummary` expects a pandas `DataFrame` to summarise.
```python
from traceml.summary.df import DataFrameSummary
dfs = DataFrameSummary(df)
```
Getting the column types:
```python
dfs.columns_types

numeric        9
bool           3
categorical    2
unique         1
date           1
constant       1
dtype: int64
```
Getting the column stats:
```python
dfs.columns_stats

                   A            B        C        D        E
counts          5802         5794     5781     5781     4617
uniques         5802            3     5771      128      121
missing            0            8       21       21     1185
missing_perc      0%        0.14%    0.36%    0.36%   20.42%
types         unique  categorical  numeric  numeric  numeric
```
Getting a single column summary, e.g. for a numeric column:
```python
# a column can also be accessed by its position number instead of its name
dfs['A']
std 0.2827146
max 1.072792
min 0
variance 0.07992753
mean 0.5548516
5% 0.1603367
25% 0.3199776
50% 0.4968588
75% 0.8274732
95% 1.011255
iqr 0.5074956
kurtosis -1.208469
skewness 0.2679559
sum 3207.597
mad 0.2459508
cv 0.5095319
zeros_num 11
zeros_perc 0,1%
deviating_of_mean 21
deviating_of_mean_perc 0.36%
deviating_of_median 21
deviating_of_median_perc 0.36%
top_correlations {u'D': 0.702240243124, u'E': -0.663}
counts 5781
uniques 5771
missing 21
missing_perc 0.36%
types numeric
Name: A, dtype: object
```
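Several of the derived statistics in the output above can be cross-checked from the other reported values. The sketch below redoes that arithmetic with the numbers printed in this README. The formulas (interquartile range as the 75% quantile minus the 25% quantile, coefficient of variation as std divided by mean, and `missing_perc` as missing divided by total rows) are assumptions inferred from the numbers, not taken from the library's source.

```python
import math

# Values copied from the column summary printed above
std, mean = 0.2827146, 0.5548516
q25, q75 = 0.3199776, 0.8274732
counts, missing = 5781, 21

iqr = q75 - q25                    # interquartile range
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance as the square of std
total_rows = counts + missing      # non-missing plus missing values
missing_perc = 100 * missing / total_rows

print(f"iqr          {iqr:.7f}")
print(f"cv           {cv:.7f}")
print(f"variance     {variance:.8f}")
print(f"missing_perc {missing_perc:.2f}%")
```

The results agree with the reported `iqr`, `cv`, `variance`, and `missing_perc` entries up to rounding.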
### [WIP] Summaries
* [ ] Add summary analysis between columns, i.e. `dfs[[1, 2]]`
### [WIP] Visualizations
* [ ] Add summary visualization with matplotlib.
* [ ] Add summary visualization with plotly.
* [ ] Add summary visualization with altair.
* [ ] Add predefined profiling.
### [WIP] Catalog and Versions
* [ ] Add possibility to persist summary and link to a specific version.
* [ ] Integrate with quality libraries.
Raw data
{
"_id": null,
"home_page": "https://github.com/polyaxon/traceml",
"name": "traceml",
"maintainer": "Polyaxon, Inc.",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "contact@polyaxon.com",
"keywords": "polyaxon, aws, s3, microsoft, azure, google cloud storage, gcs, deep-learning, machine-learning, data-science, neural-networks, artificial-intelligence, ai, reinforcement-learning, kubernetes, aws, microsoft, azure, google cloud, tensorFlow, pytorch, matplotlib, plotly, visualization, analytics",
"author": "Polyaxon, Inc.",
"author_email": "contact@polyaxon.com",
"download_url": "https://files.pythonhosted.org/packages/fd/c3/c65ea5a8b2410ffec30b3d25d18f3998f0394c7ed5e9cc6c6d10ad751233/traceml-1.1.5.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.",
"version": "1.1.5",
"project_urls": {
"Homepage": "https://github.com/polyaxon/traceml"
},
"split_keywords": [
"polyaxon",
" aws",
" s3",
" microsoft",
" azure",
" google cloud storage",
" gcs",
" deep-learning",
" machine-learning",
" data-science",
" neural-networks",
" artificial-intelligence",
" ai",
" reinforcement-learning",
" kubernetes",
" aws",
" microsoft",
" azure",
" google cloud",
" tensorflow",
" pytorch",
" matplotlib",
" plotly",
" visualization",
" analytics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1e567458d713e5bf1c5c9b2064203c4573f7935b6d38f6514962def0a934a6c3",
"md5": "375372d037fa6af375eea4618ce715ff",
"sha256": "42a4b052ba5ec6e29e6a66056d1000b9deb73504b9a8dc208b5e0b96df1644a8"
},
"downloads": -1,
"filename": "traceml-1.1.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "375372d037fa6af375eea4618ce715ff",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 134158,
"upload_time": "2024-09-02T20:54:57",
"upload_time_iso_8601": "2024-09-02T20:54:57.497453Z",
"url": "https://files.pythonhosted.org/packages/1e/56/7458d713e5bf1c5c9b2064203c4573f7935b6d38f6514962def0a934a6c3/traceml-1.1.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "fdc3c65ea5a8b2410ffec30b3d25d18f3998f0394c7ed5e9cc6c6d10ad751233",
"md5": "28ecbdc2cc7db69d1ae6e558526546cc",
"sha256": "b094df0cbabda5b839c290939ab54848712b3094eb714453a9bf6c0b34269795"
},
"downloads": -1,
"filename": "traceml-1.1.5.tar.gz",
"has_sig": false,
"md5_digest": "28ecbdc2cc7db69d1ae6e558526546cc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 106104,
"upload_time": "2024-09-02T20:55:00",
"upload_time_iso_8601": "2024-09-02T20:55:00.261193Z",
"url": "https://files.pythonhosted.org/packages/fd/c3/c65ea5a8b2410ffec30b3d25d18f3998f0394c7ed5e9cc6c6d10ad751233/traceml-1.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-02 20:55:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "polyaxon",
"github_project": "traceml",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "traceml"
}