# Xetrack

xetrack is a lightweight package to track experiments and benchmarks, and to monitor structured data, using [duckdb](https://duckdb.org) and [sqlite](https://sqlite.org/index.html).   
It is focused on simplicity and flexibility.

You create a "Tracker" and let it track data. You can retrieve the data later as a pandas DataFrame, or connect to it as a database.   
Each instance of the tracker has a "track_id", a unique identifier for a single run.

## Features

* Simple
* Embedded
* Fast
* Pandas-like
* SQL-like
* Object store with deduplication
* CLI for basic functions
* Multiprocessing reads and writes
* Loguru logs integration
* Experiment tracking
* Model monitoring

## Installation

```bash
pip install xetrack
```

## Quickstart

```python
from xetrack import Tracker

tracker = Tracker('database.db', 
                  params={'model': 'resnet18'}
                  )
tracker.log({"accuracy":0.9, "loss":0.1, "epoch":1}) # All you really need

tracker.latest
{'accuracy': 0.9, 'loss': 0.1, 'epoch': 1, 'model': 'resnet18', 'timestamp': '18-08-2023 11:02:35.162360',
 'track_id': 'cd8afc54-5992-4828-893d-a4cada28dba5'}


tracker.to_df(all=True)  # Retrieve all the runs as a DataFrame
                    timestamp                              track_id     model  loss  epoch  accuracy
0  26-09-2023 12:17:00.342814  398c985a-dc15-42da-88aa-6ac6cbf55794  resnet18   0.1      1       0.9
```

**Params** are values which are added to every future row:

```python
tracker.set_params({'model': 'resnet18', 'dataset': 'cifar10'})
tracker.log({"accuracy":0.9, "loss":0.1, "epoch":2})

{'accuracy': 0.9, 'loss': 0.1, 'epoch': 2, 'model': 'resnet18', 'dataset': 'cifar10', 
 'timestamp': '26-09-2023 12:18:40.151756', 'track_id': '398c985a-dc15-42da-88aa-6ac6cbf55794'}

```

You can also set a value retroactively for the entire run with *set_value* ("back in time"):

```python
tracker.set_value('test_accuracy', 0.9) # Only known at the end of the experiment
tracker.to_df()

                    timestamp                              track_id     model  loss  epoch  accuracy  dataset  test_accuracy
0  26-09-2023 12:17:00.342814  398c985a-dc15-42da-88aa-6ac6cbf55794  resnet18   0.1      1       0.9      NaN            0.9
2  26-09-2023 12:18:40.151756  398c985a-dc15-42da-88aa-6ac6cbf55794  resnet18   0.1      2       0.9  cifar10            0.9

```

## Track functions

You can track any function.

* The return value is logged before it is returned

```python
tracker = Tracker('database.db', 
    log_system_params=True, 
    log_network_params=True, 
    measurement_interval=0.1)
image = tracker.track(read_image, *args, **kwargs)
tracker.latest
{'result': 571084, 'name': 'read_image', 'time': 0.30797290802001953, 'error': '', 'disk_percent': 0.6,
 'p_memory_percent': 0.496507, 'cpu': 0.0, 'memory_percent': 32.874608, 'bytes_sent': 0.0078125,
 'bytes_recv': 0.583984375}
```
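
`read_image`, `args`, and `kwargs` above are placeholders. Here is a self-contained sketch under the same setup, using a hypothetical `process_batch` function (extra arguments can also be forwarded, as in the call above):

```python
import time

from xetrack import Tracker

tracker = Tracker('database.db', log_system_params=True,
                  log_network_params=True, measurement_interval=0.1)

def process_batch(items=(1, 2, 3), delay=0.05):
    """Hypothetical stand-in for read_image: sleeps to simulate work."""
    time.sleep(delay)
    return len(items)

count = tracker.track(process_batch)  # runs the function, logs result + runtime
print(count)           # 3 - the original return value is passed through
print(tracker.latest)  # includes 'name', 'time', and the system metrics
```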

Or with a wrapper:

```python

@tracker.wrap(params={'name':'foofoo'})
def foo(a: int, b: str):
    return a + len(b)

result = foo(1, 'hello')
tracker.latest
{'function_name': 'foo', 'args': "[1, 'hello']", 'kwargs': '{}', 'error': '', 'function_time': 4.0531158447265625e-06, 
 'function_result': 6, 'name': 'foofoo', 'timestamp': '26-09-2023 12:21:02.200245', 'track_id': '398c985a-dc15-42da-88aa-6ac6cbf55794'}
```

## Track assets (oriented toward ML models)

When you track a non-primitive value that is not a list or a dict, xetrack saves it as an asset with deduplication and logs the object's hash:

* Tip: If you plan to log the same object many times, log it once and insert its hash in future rows to save time on encoding and hashing (see the sketch after the example below).

```python
tracker = Tracker('database.db', params={'model': 'logistic regression'})
lr = LogisticRegression().fit(X_train, y_train)
tracker.log({'accuracy': float(lr.score(X_test, y_test)), 'lr': lr})
{'accuracy': 0.9777777777777777, 'lr': '53425a65a40a49f4',  # <-- this is the model hash
    'dataset': 'iris', 'model': 'logistic regression', 'timestamp': '2023-12-27 12:21:00.727834', 'track_id': 'wisteria-turkey-4392'}

model = tracker.get('53425a65a40a49f4') # retrieve an object
model.score(X_test, y_test)
0.9777777777777777
```
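
A minimal sketch of the tip above, reusing `tracker` and `lr` from the example (the hash comes from `tracker.latest`):

```python
tracker.log({'accuracy': 0.98, 'lr': lr})  # first log: the object is encoded and hashed
lr_hash = tracker.latest['lr']             # e.g. '53425a65a40a49f4'

# later rows: log the hash string instead of the object,
# skipping the encoding and hashing work
tracker.log({'accuracy': 0.97, 'lr': lr_hash})
```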

You can also retrieve the model from the CLI, which is handy when you only need the model in production and don't want to carry the rest of the file around:

```bash
# bash
xt assets export database.db 53425a65a40a49f4 model.cloudpickle
```

```python
# python
import cloudpickle
with open("model.cloudpickle", 'rb') as f:
    model = cloudpickle.loads(f.read())
# LogisticRegression()
```


### Tips and tricks

* ```Tracker(Tracker.IN_MEMORY)``` lets you run entirely in memory - great for debugging or for working with logs only (see the sketch below)
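
A minimal in-memory sketch (nothing is written to disk):

```python
from xetrack import Tracker

tracker = Tracker(Tracker.IN_MEMORY, params={'run': 'debug'})
tracker.log({'accuracy': 0.5})
print(tracker.to_df())  # the data lives only for this process
```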

### Pandas-like

```python
print(tracker)
                                    _id                              track_id                 date    b    a  accuracy
0  48154ec7-1fe4-4896-ac66-89db54ddd12a  fd0bfe4f-7257-4ec3-8c6f-91fe8ae67d20  16-08-2023 00:21:46  2.0  1.0       NaN
1  8a43000a-03a4-4822-98f8-4df671c2d410  fd0bfe4f-7257-4ec3-8c6f-91fe8ae67d20  16-08-2023 00:24:21  NaN  NaN       1.0

tracker['accuracy'] # get accuracy column
tracker.to_df() # get pandas dataframe of current run

```

### SQL-like
You can filter the data using SQL syntax via [duckdb](https://duckdb.org/docs):

* The sqlite database is attached as **db** and the events live in the **events** table. Assets are in the **assets** table.

#### Python
```python
tracker.conn.execute("SELECT * FROM db.events WHERE accuracy > 0.8").fetchall()
```
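
Assuming `tracker.conn` is a duckdb connection (as the duckdb link above suggests), query results can also be pulled straight into a pandas DataFrame:

```python
df = tracker.conn.execute(
    "SELECT model, AVG(accuracy) AS avg_accuracy FROM db.events GROUP BY model"
).df()  # duckdb's .df() materialises the result as a pandas DataFrame
```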

#### DuckDB CLI
```bash
duckdb
D ATTACH 'database.db' AS db (TYPE sqlite);
D SELECT * FROM db.events;
┌────────────────────────────┬──────────────────┬──────────┬───────┬──────────┬────────┐
│         timestamp          │     track_id     │  model   │ epoch │ accuracy │  loss  │
│          varchar           │     varchar      │ varchar  │ int64 │  double  │ double │
├────────────────────────────┼──────────────────┼──────────┼───────┼──────────┼────────┤
│ 2023-12-27 11:25:59.244003 │ fierce-pudu-1649 │ resnet18 │     1 │      0.9 │    0.1 │
└────────────────────────────┴──────────────────┴──────────┴───────┴──────────┴────────┘
```

### Logger integration
This is very useful in environments where you already have regular logs and don't want to manage a separate logger or file.   
One great use case is **model monitoring**.

* `logs_stdout=True` prints every tracked event to stdout
* `logs_path='logs'` writes the logs to a file

```python
Tracker(db=Tracker.IN_MEMORY, logs_path='logs', logs_stdout=True).log({"accuracy": 0.9})
2023-12-14 21:46:55.290 | TRACKING | xetrack.logging:log:69!📁!{"accuracy": 0.9, "timestamp": "2023-12-14 21:46:55.290098", "track_id": "marvellous-stork-4885"}

Reader.read_logs(path='logs')
   accuracy                   timestamp                track_id
0       0.9  2023-12-14 21:47:48.375258  unnatural-polecat-1380
```

## Analysis
To get the data of all runs in the database, for further analysis and plotting:

* This works even while another process is writing to the database.

```python
from xetrack import Reader
df = Reader('database.db').to_df() 
```
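
From here on it is plain pandas. A minimal sketch of comparing runs, assuming an `accuracy` column as in the examples above:

```python
from xetrack import Reader

df = Reader('database.db').to_df()

# best accuracy per run (track_id), best run first
best = df.groupby('track_id')['accuracy'].max().sort_values(ascending=False)
print(best.head())
```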

### Model Monitoring

Here is how you can save logs on any server and monitor them with xetrack:   
We print logs to a file or *stdout* so they are captured by the normal logging pipeline.   
We save space by skipping the database insert (though inserting is also fine).
Later we can read the logs for visualisation, online/offline analysis, dashboards, etc.

```python
tracker = Tracker(db=Tracker.SKIP_INSERT, logs_path='logs', logs_stdout=True)
tracker.logger.monitor("<dict or pandas DataFrame>") # -> write to logs in a structured way, consistent by schema, no database file needed


df = Reader.read_logs(path='logs')
"""
Run drift analysis and outlier detection on your logs: 
"""
```
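
For example, a toy drift check over the logs, assuming a numeric `accuracy` column as above (real monitoring would use a proper drift test):

```python
import pandas as pd

from xetrack import Reader

df = Reader.read_logs(path='logs')
df['timestamp'] = pd.to_datetime(df['timestamp'])

# flag days whose mean accuracy deviates from the overall mean by > 2 std
daily = df.set_index('timestamp')['accuracy'].resample('D').mean()
print(daily[(daily - daily.mean()).abs() > 2 * daily.std()])
```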

### ML tracking

```python
tracker.logger.experiment(<model evaluation and params>) # -> prettily writes to logs

df = Reader.read_logs(path='logs')
"""
Run fancy visualisation, online/offline analysis, build dashboards etc.
"""
```

## CLI

For basic and repetitive needs.

```bash
$ xt head database.db --n=2
|    | timestamp                  | track_id                 | model    |   accuracy | data   | params           |
|---:|:---------------------------|:-------------------------|:---------|-----------:|:-------|:-----------------|
|  0 | 2023-12-27 11:36:45.859668 | crouching-groundhog-5046 | xgboost  |        0.9 | mnist  | 1b5b2294fc521d12 |
|  1 | 2023-12-27 11:36:45.863888 | crouching-groundhog-5046 | xgboost  |        0.9 | mnist  | 1b5b2294fc521d12 |
...


$ xt tail database.db --n=1
|    | timestamp                  | track_id        | model    |   accuracy | data   | params           |
|---:|:---------------------------|:----------------|:---------|-----------:|:-------|:-----------------|
|  0 | 2023-12-27 11:37:30.627189 | ebony-loon-6720 | lightgbm |        0.9 | mnist  | 1b5b2294fc521d12 |

$ xt set accuracy 0.8 --where-key params --where-value 1b5b2294fc521d12 --track-id ebony-loon-6720

$ xt delete database.db ebony-loon-6720 # delete experiments with a given track_id

# run any other SQL in a one-liner
$ xt sql database.db "SELECT * FROM db.events;"

# retrieve a model (any object) from the assets into a file using cloudpickle
$ xt assets export database.db hash output 

# remove an object from the assets
$ xt assets delete database.db hash 

# If you have two databases and want to merge one into the other
$ xt copy source.db target.db --assets/--no-assets

# Stats
$ xt describe database.db --columns=x,y,z

$ xt stats top/bottom database.db x # print the entry with the top/bottom result of a value

# bashplotlib (`pip install bashplotlib` is required)
$ xt plot hist database.db x
    ----------------------
    |    x histogram     |
    ----------------------

 225|      o
 200|     ooo
 175|     ooo
 150|     ooo
 125|     ooo
 100|    ooooo
  75|    ooooo
  50|    ooooo
  25|   ooooooo
   1| oooooooooo
     ----------

-----------------------------------
|             Summary             |
-----------------------------------
|        observations: 1000       |
|      min value: -56.605967      |
|         mean : 2.492545         |
|       max value: 75.185944      |
-----------------------------------
$ xt plot scatter database.db x y

```

            
