time-split-app

Name	time-split-app
Version	0.7.2
home_page	None
Summary	Companion app for the `time-split` library.
upload_time	2025-07-17 10:17:37
maintainer	None
docs_url	None
author	Richard Sundqvist
requires_python	>=3.11
license	MIT

# Time Split  <!-- omit in toc -->
Time-based k-fold validation splits for heterogeneous data.

-----------------
[![PyPI - Version](https://img.shields.io/pypi/v/time-split.svg)](https://pypi.python.org/pypi/time-split)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/time-split.svg)](https://pypi.python.org/pypi/time-split)
[![Tests](https://github.com/rsundqvist/time-split/workflows/tests/badge.svg)](https://github.com/rsundqvist/time-split/actions?workflow=tests)
[![Codecov](https://codecov.io/gh/rsundqvist/time-split/branch/master/graph/badge.svg)](https://codecov.io/gh/rsundqvist/time-split)
[![Read the Docs](https://readthedocs.org/projects/time-split/badge/)](https://time-split.readthedocs.io/)
[![PyPI - License](https://img.shields.io/pypi/l/time-split.svg)](https://pypi.python.org/pypi/time-split)
[![Docker Image Size (tag)](https://img.shields.io/docker/image-size/rsundqvist/time-split/latest?logo=docker&label=time-split)](https://hub.docker.com/r/rsundqvist/time-split/)

<div align="center">
  <img alt="Plotted folds on a two-by-two grid." 
       title="Examples" height="300" width="1200" 
  src="https://raw.githubusercontent.com/rsundqvist/time-split/master/docs/2x2-examples.jpg"><br>
</div>

Folds plotted on a two-by-two grid. See the
[examples](https://time-split.readthedocs.io/en/stable/auto_examples/index.html) page for more.

# About this image

The **Time Split** application
(available [here](https://time-split.streamlit.app/?data=1554942900-1557610200&schedule=0+0+%2A+%2A+MON%2CFRI&n_splits=2&step=2&show_removed=True))
is designed to help evaluate the effects of different
[parameters](https://time-split.readthedocs.io/en/stable/#parameter-overview).
To start it locally, run
```sh
docker run -p 8501:8501 rsundqvist/time-split
```
or 
```bash
pip install "time-split[app]"
python -m time_split app start
```
in the terminal. You may use
[`create_explorer_link()`](https://time-split.readthedocs.io/en/stable/api/time_split.app.html#time_split.app.create_explorer_link)
to build application URLs with preselected splitting parameters.
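
The hosted link above carries its splitting parameters as ordinary query-string arguments. As a rough illustration of that URL format only, and not a reimplementation of `create_explorer_link()` (whose signature is not shown here), the following sketch rebuilds the same link with the standard library. The helper name `build_explorer_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the hosted application, taken from the link above.
APP_URL = "https://time-split.streamlit.app/"


def build_explorer_url(base_url: str = APP_URL, **params: object) -> str:
    """Hypothetical helper: append URL-encoded query parameters to the app URL."""
    return f"{base_url}?{urlencode(params)}"


url = build_explorer_url(
    data="1554942900-1557610200",  # value copied verbatim from the hosted link
    schedule="0 0 * * MON,FRI",  # spaces, '*' and ',' are percent/plus-encoded
    n_splits=2,
    step=2,
    show_removed=True,
)
print(url)
```

For real use, prefer `create_explorer_link()`, since it is the documented way to build these URLs.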

# Documentation
Click [here](https://time-split.readthedocs.io/en/stable/api/time_split.app.reexport.html) for documentation of the most
important types, functions and classes used by the application.

# Custom dataset loaders
Dataset loaders are a flexible way to load or create datasets that require user input. Existing images (`>=0.7.0`)
can be extended to use custom loaders:

```Dockerfile
FROM python:3.13

RUN pip install --no-cache --compile time-split[app]
RUN pip install --no-cache --compile your-dependencies

ENV DATASET_LOADER=custom_dataset_loader:CustomDatasetLoader
COPY custom_dataset_loader.py .

# Entrypoint etc.
```

Loaders must implement the `DataLoaderWidget` interface.

# Custom datasets
To bundle datasets, mount a configuration file at the path given by
[`DATASETS_CONFIG_PATH='/home/streamlit/datasets.toml'`](https://time-split.readthedocs.io/en/stable/generated/time_split.streamlit.config.html#time_split.streamlit.config.DATASETS_CONFIG_PATH).
The `DatasetConfig` struct has the following keys:

| Key                    | Type             | Required | Description                                                                   |
|------------------------|------------------|----------|-------------------------------------------------------------------------------|
| `label`                | `string`         |          | Name shown in the UI. Defaults to section header (i.e. *"my-dataset"* below). |
| `path`                 | `string`         | Required | First argument to the `pandas` read function.                                 |
| `index`                | `string`         | Required | Datetime-like column. Will be converted using [pandas.to_datetime()].         |
| `aggregations`         | `dict[str, str]` |          | Determines function to use in the `📈 Aggregations per fold` tab.             |
| `description`          | `string`         |          | Markdown. The first line will be used as the summary in the UI.               |
| `read_function_kwargs` | `dict[str, Any]` |          | Keyword arguments for the `pandas` read function used.                        |

[pandas.to_datetime()]: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

The read function is chosen automatically based on the path.
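
How the application maps paths to read functions is not spelled out here. A minimal sketch of one plausible approach, assuming the choice is driven by the file suffix, is shown below. Both the `READERS` mapping and the `guess_read_function` helper are hypothetical; the formats the app actually supports may differ.

```python
from pathlib import PurePosixPath

# Assumed suffix-to-reader mapping; the values are names of real pandas functions,
# but which formats the app supports is an assumption.
READERS = {
    ".csv": "read_csv",
    ".json": "read_json",
    ".parquet": "read_parquet",
    ".feather": "read_feather",
}


def guess_read_function(path: str) -> str:
    """Hypothetical helper: pick a pandas reader name from the file suffix."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"No reader known for suffix {suffix!r} in {path!r}")
    return READERS[suffix]


print(guess_read_function("s3://my-bucket/data/title_basics.csv"))  # read_csv
```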

> ℹ️ Additional dependencies are required for remote filesystems.
> You may use `EXTRA_PIP_PACKAGES=s3fs` to install dependencies for the S3 paths used below.

```toml
[my-dataset]
label = "Dataset name"
path = "s3://my-bucket/data/title_basics.csv"
index = "from"
aggregations = { runtimeMinutes = "min", isAdult = "mean" }
description = """This is the summary.

Simplified version of the
[Title basics](https://developer.imdb.com/non-commercial-datasets/#titlebasicstsvgz) IMDB
dataset. The description supports Markdown syntax.

Last updated: `2019-05-11T20:30:00+00:00`
"""
[my-dataset.read_function_kwargs]
# Valid options depend on the read function used (pandas.read_csv, in this case).
```

Multiple datasets may be configured in their own top-level sections. Labels must be unique.

## Mounted datasets
A convenient way to keep datasets up-to-date without relying on network storage is to mount a dataset folder on a local
machine, using e.g. a CRON job to update the data. To start the image with datasets mounted, run:
```bash
docker run \
  -p 8501:8501 \
  -v ./data:/home/streamlit/data:ro \
  -v ./datasets.toml:/home/streamlit/datasets.toml:ro \
  -e REQUIRE_DATASETS=true \
  rsundqvist/time-split
```
in the terminal. The [tomli-w](https://pypi.org/project/tomli-w/) package may be used to emit TOML files if using Python.

* The dataframes returned by the dataset loader are cached for `config.DATASET_CACHE_TTL` seconds (default = 12 hours).
* The dataset configuration file is read every `config.DATASET_CONFIG_CACHE_TTL` seconds (default = 30 seconds).

All datasets are reloaded immediately if the configuration changes, ignoring comments and formatting.

# Environment variables
See [config.py](src/time_split_app/config.py) for configurable values. Use `true|false` for boolean variables. 
Documentation for the underlying framework (Streamlit) is available 
[here](https://docs.streamlit.io/develop/concepts/configuration/options/).

## User choice
Users may *lower* some configured values by using the Performance tweaker widget in the `❔ About` tab of the application.
To set a lower default, add a `DEFAULT_` prefix to the regular name.
```bash
PLOT_AGGREGATIONS_PER_FOLD=true
DEFAULT_PLOT_AGGREGATIONS_PER_FOLD=false
```
This disables the (expensive) per-column fold aggregation figures by default, but users who need them can turn them back on.
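
The relationship between a regular variable and its `DEFAULT_`-prefixed counterpart can be sketched as follows for boolean settings. This is an illustration under stated assumptions, not the code in `config.py`: the helper names are hypothetical, an unset variable is assumed to mean `true`, and values are assumed to be matched case-insensitively.

```python
import os
from collections.abc import Mapping


def parse_bool(value: str) -> bool:
    """Accept the documented `true|false` spellings (case-insensitive by assumption)."""
    lowered = value.lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"expected 'true' or 'false', got {value!r}")
    return lowered == "true"


def resolve_flag(name: str, environ: Mapping[str, str] = os.environ) -> tuple[bool, bool]:
    """Hypothetical helper: return (allowed, initial) for a boolean setting."""
    allowed = parse_bool(environ.get(name, "true"))  # assumption: unset means allowed
    raw_default = environ.get(f"DEFAULT_{name}")
    initial = allowed if raw_default is None else parse_bool(raw_default)
    # The default can only lower the configured value, never raise it.
    return allowed, allowed and initial


env = {
    "PLOT_AGGREGATIONS_PER_FOLD": "true",
    "DEFAULT_PLOT_AGGREGATIONS_PER_FOLD": "false",
}
print(resolve_flag("PLOT_AGGREGATIONS_PER_FOLD", env))  # (True, False)
```

With the values from the example above, the feature is allowed but starts switched off, which matches the described behaviour.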

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "time-split-app",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": null,
    "author": "Richard Sundqvist",
    "author_email": "richard.sundqvist@live.se",
    "download_url": "https://files.pythonhosted.org/packages/90/c4/301bde68b9c38b7d971736449c6439ecd1f9e036b6a8c1edaa1c1213c9a1/time_split_app-0.7.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Companion app for the `time-split` library.",
    "version": "0.7.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/rsundqvist/time-split-app/issues",
        "Changelog": "https://github.com/rsundqvist/time-split-app/blob/master/CHANGELOG.md",
        "Documentation": "https://time-fold.readthedocs.io",
        "Homepage": "https://github.com/rsundqvist/time-split-app",
        "Repository": "https://github.com/rsundqvist/time-split-app"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6a19c8878c3ae57b330e91dfaa3ddf9ae08ff689d1ceb8aa7610a7f1628c0336",
                "md5": "26d3b12695489f184325e3fabe68c4cb",
                "sha256": "54ec16232e085fc001d9b66e959dcbcfee178177f776b4fc5bf2e58fac743b61"
            },
            "downloads": -1,
            "filename": "time_split_app-0.7.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "26d3b12695489f184325e3fabe68c4cb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 63261,
            "upload_time": "2025-07-17T10:17:36",
            "upload_time_iso_8601": "2025-07-17T10:17:36.113179Z",
            "url": "https://files.pythonhosted.org/packages/6a/19/c8878c3ae57b330e91dfaa3ddf9ae08ff689d1ceb8aa7610a7f1628c0336/time_split_app-0.7.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "90c4301bde68b9c38b7d971736449c6439ecd1f9e036b6a8c1edaa1c1213c9a1",
                "md5": "95aade43ccc03b71115e0ebbb643ae99",
                "sha256": "8f85c4e484f18c29336496332cbc8df651982b5e15d3a06197f12a3c4a73059c"
            },
            "downloads": -1,
            "filename": "time_split_app-0.7.2.tar.gz",
            "has_sig": false,
            "md5_digest": "95aade43ccc03b71115e0ebbb643ae99",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 46627,
            "upload_time": "2025-07-17T10:17:37",
            "upload_time_iso_8601": "2025-07-17T10:17:37.865939Z",
            "url": "https://files.pythonhosted.org/packages/90/c4/301bde68b9c38b7d971736449c6439ecd1f9e036b6a8c1edaa1c1213c9a1/time_split_app-0.7.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-17 10:17:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rsundqvist",
    "github_project": "time-split-app",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "time-split-app"
}
