# Neptune Fetcher

- Package: `neptune-fetcher`
- Version: 0.12.0
- Summary: Neptune Fetcher
- Upload time: 2025-01-21 08:22:02
- Author: neptune.ai
- Requires Python: `<4.0,>=3.8`
- License: Apache-2.0
- Keywords: mlops, ml experiment tracking, ml model registry, ml model store, ml metadata store

> [!NOTE]
> This package is experimental and only works with Neptune Scale, which is in beta.
>
> You can't use this package with `neptune<2.0` or the currently available Neptune app version. For the corresponding
> Python API, see [neptune-client](https://github.com/neptune-ai/neptune-client).

Neptune Fetcher is designed to separate data retrieval capabilities from the regular `neptune` package. This separation
makes data fetching more efficient and improves performance.

## Installation

```bash
pip install neptune-fetcher
```

## Usage

1. Set your Neptune API token and project name as environment variables:

    ```bash
    export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ=="
    ```

    ```bash
    export NEPTUNE_PROJECT="workspace-name/project-name"
    ```

    For help, see https://docs-beta.neptune.ai/setup.

1. In your Python code, create a [`ReadOnlyProject`](#readonlyproject) instance:

    ```python
    from neptune_fetcher import ReadOnlyProject

    my_project = ReadOnlyProject()
    ```

Now you have a Neptune project to operate on.

> If you don't set the Neptune environment variables, you can pass your credentials through arguments when creating a project or run object.

To fetch experiments in bulk, call a fetching method on the project:

```python
experiments_df = my_project.fetch_experiments_df(
    names_regex="tree/.*",
    columns=["sys/custom_run_id", "sys/modification_time"],
    query='(last(`accuracy`:floatSeries) > 0.88) AND (`learning_rate`:float < 0.01)',
)
```

To fetch metadata from an individual experiment or run, create and use a [`ReadOnlyRun`](#readonlyrun) object:

```python
from neptune_fetcher import ReadOnlyRun

run = ReadOnlyRun(
    read_only_project=my_project,
    experiment_name="seagull-flying-skills",
)

# Fetch value
print(run["parameters/optimizer"].fetch())

# Fetch the last value of a metric
print(run["metrics/loss"].fetch_last())

# Fetch all metric values, with optional pre-fetching to speed up subsequent access to the field
run.prefetch_series_values(["metrics/accuracy"])
print(run["metrics/accuracy"].fetch_values())
```

For details, see the Neptune documentation:

- [Fetch runs or experiments](https://docs-beta.neptune.ai/fetch_runs)
- [Fetch metadata from a run or experiment](https://docs-beta.neptune.ai/fetch_run_data)
- [Neptune Query Language (NQL)](https://docs-beta.neptune.ai/nql)

## Examples

### Listing runs of a project

```python
from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject("workspace/project")

for run in project.list_runs():
    print(run)  # dicts with identifiers
```

### Listing experiments of a project

```python
from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject("workspace/project")

for experiment in project.list_experiments():
    print(experiment)  # dicts with identifiers
```

### Fetching runs data frame with specific columns

```python
from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject("workspace/project")

runs_df = project.fetch_runs_df(
    columns=["sys/custom_run_id", "sys/modification_time"],
    columns_regex="tree/.*",  # added to columns specified with the "columns" parameter
)
```

### Fetching data from specified runs

```python
from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject("workspace/project")

for run in project.fetch_read_only_runs(with_ids=["RUN-1", "RUN-2"]):
    run.prefetch(["parameters/optimizer", "parameters/init_lr"])

    print(run["parameters/optimizer"].fetch())
    print(run["parameters/init_lr"].fetch())
```

### Fetching data from a single run

```python
from neptune_fetcher import ReadOnlyProject, ReadOnlyRun

project = ReadOnlyProject("workspace/project")
run = ReadOnlyRun(project, with_id="TES-1")

run.prefetch(["parameters/optimizer", "parameters/init_lr"])
run.prefetch_series_values(["metrics/loss", "metrics/accuracy"], use_threads=True)

print(run["parameters/optimizer"].fetch())
print(run["parameters/init_lr"].fetch())
print(run["metrics/loss"].fetch_values())
print(run["metrics/accuracy"].fetch_values())
```

## API reference

### Supported regular expressions

Neptune uses the [RE2](https://github.com/google/re2) regular expression library. For supported regex features and limitations, see the official [syntax guide](https://github.com/google/re2/wiki/syntax).
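
Python's built-in `re` module accepts some constructs that RE2 does not support, notably lookaround assertions and backreferences, so a pattern that works locally can still be rejected. The sketch below is a rough heuristic pre-check, not a full RE2 parser. `find_re2_incompatibilities` is a hypothetical helper written for this illustration and is not part of `neptune_fetcher`.

```python
import re
from typing import List

# Unescaped "(?=", "(?!", "(?<=" or "(?<!" (lookahead and lookbehind assertions).
_LOOKAROUND = re.compile(r"(?<!\\)(?:\\\\)*\(\?(?:=|!|<=|<!)")
# Unescaped backslash followed by a digit 1-9 (numbered backreference).
_BACKREFERENCE = re.compile(r"(?<!\\)(?:\\\\)*\\[1-9]")


def find_re2_incompatibilities(pattern: str) -> List[str]:
    """Return the names of constructs in `pattern` that RE2 is known not to support.

    Heuristic only: it does not parse character classes or other edge cases.
    """
    problems = []
    if _LOOKAROUND.search(pattern):
        problems.append("lookaround")
    if _BACKREFERENCE.search(pattern):
        problems.append("backreference")
    return problems


print(find_re2_incompatibilities(r"tree/.*"))         # []
print(find_re2_incompatibilities(r"tree_(?!old).*"))  # ['lookaround']
print(find_re2_incompatibilities(r"(a|b)\1"))         # ['backreference']
```

An empty result does not guarantee that RE2 accepts the pattern. The syntax guide linked above remains the reference.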


### `ReadOnlyProject`

Representation of a Neptune project in a limited read-only mode.

#### Initialization

Initialize with the `ReadOnlyProject` class constructor:

```python
project = ReadOnlyProject("workspace/project", api_token="...")
```
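
When `project` or `api_token` is left as `None`, the `NEPTUNE_PROJECT` and `NEPTUNE_API_TOKEN` environment variables are used instead. A minimal sketch of that fallback order follows. `resolve_credentials` is a hypothetical helper written for this illustration, not part of `neptune_fetcher`; only the environment variable names come from this README.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    project: Optional[str] = None,
    api_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (project, api_token), preferring explicit arguments over the environment.

    Hypothetical helper for illustration; it is not part of neptune_fetcher.
    """
    env = os.environ if env is None else env
    project = project if project is not None else env.get("NEPTUNE_PROJECT")
    api_token = api_token if api_token is not None else env.get("NEPTUNE_API_TOKEN")
    if not project:
        raise ValueError("Pass a project name or set NEPTUNE_PROJECT")
    if not api_token:
        raise ValueError("Pass an API token or set NEPTUNE_API_TOKEN")
    return project, api_token


# Stand-in environment with made-up values, so no real token is needed
fake_env = {
    "NEPTUNE_PROJECT": "workspace-name/project-name",
    "NEPTUNE_API_TOKEN": "token-from-env",
}
print(resolve_credentials(env=fake_env))
print(resolve_credentials(project="other-workspace/other-project", env=fake_env))
```

Failing early with a clear message can be easier to debug than an authentication error raised later by the first API call.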

> [!TIP]
> Find your API token in your user menu, in the bottom-left corner of the Neptune app.

__Parameters:__

| Name        | Type             | Default | Description                                                                                                                                                                                                                                       |
|-------------|------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `project`   | `str`, optional  | `None`  | Name of a project in the form `workspace-name/project-name`. If `None`, the value of the `NEPTUNE_PROJECT` environment variable is used.                                                                                                          |
| `api_token` | `str`, optional  | `None`  | Your Neptune API token (or a service account's API token). If `None`, the value of the `NEPTUNE_API_TOKEN` environment variable is used. To keep your token secure, avoid placing it in source code. Instead, save it as an environment variable. |
| `proxies`   | `dict`, optional | `None`  | Dictionary of proxy settings, if needed. This argument is passed to HTTP calls made via the Requests library. For details on proxies, see the [Requests documentation](https://requests.readthedocs.io/).                                         |

---

#### `list_runs()`

Lists all runs of a project.

Each run is identified by Neptune ID (`sys/id`), custom ID (`sys/custom_run_id`) and, if set, name (`sys/name`).

__Returns:__ `Iterator` of dictionaries with Neptune run identifiers, custom identifiers and names.

__Example:__

```python
project = ReadOnlyProject()

for run in project.list_runs():
    print(run)
```
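
Because the method returns an iterator of dictionaries, the results are easy to post-process with plain Python. The sketch below builds a lookup from custom ID to Neptune ID. It assumes that the dictionary keys are the field names listed above (`sys/id`, `sys/custom_run_id`, `sys/name`). The sample data is made up and stands in for a real `list_runs()` call.

```python
from typing import Dict, Iterable, Optional


def index_by_custom_id(runs: Iterable[Dict[str, Optional[str]]]) -> Dict[str, str]:
    """Map each custom run ID to its Neptune ID, skipping entries without a custom ID."""
    return {
        run["sys/custom_run_id"]: run["sys/id"]
        for run in runs
        if run.get("sys/custom_run_id") is not None
    }


# Made-up stand-in for the dictionaries yielded by project.list_runs()
sample_runs = [
    {"sys/id": "RUN-1", "sys/custom_run_id": "nostalgic_shockley", "sys/name": None},
    {"sys/id": "RUN-2", "sys/custom_run_id": "high_albattani", "sys/name": "tree_32"},
]

print(index_by_custom_id(iter(sample_runs)))
```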

---

#### `list_experiments()`

Lists all experiments of a project.

Each experiment is identified by:

- Neptune ID: `sys/id`
- (If set) Custom ID: `sys/custom_run_id`
- Name: `sys/name`

__Example:__

```python
for experiment in project.list_experiments():
    print(experiment)
```

__Returns:__ `Iterator` of dictionaries with Neptune experiment identifiers, custom identifiers and names.

---

#### `fetch_runs()`

Fetches a table containing Neptune IDs, custom run IDs and names of runs in the project.

__Returns:__ `pandas.DataFrame` with three columns (`sys/id`, `sys/name` and `sys/custom_run_id`)
and one row for each run.

__Example:__

```python
project = ReadOnlyProject()
df = project.fetch_runs()
```

---

#### `fetch_experiments()`

Fetches a table containing Neptune IDs, custom IDs and names of experiments in the project.

__Example__:

```python
df = project.fetch_experiments()
```

__Returns__:
`pandas.DataFrame` with three columns (`sys/id`, `sys/custom_run_id`, `sys/name`) and one row for each experiment.

---

#### `fetch_runs_df()`

Fetches the runs' metadata and returns them as a pandas DataFrame.

__Parameters:__

| Name              | Type                  | Default             | Description                                                                                                                                                                                                                                                                                                          |
|-------------------|-----------------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `columns`         | `List[str]`, optional | `None`              | Names of columns to include in the table, as a list of field names. The sorting column, custom run identifier (`sys/custom_run_id`), and experiment name (`sys/name`) are always included. `None` results in returning only the default columns. |
| `columns_regex`   | `str`, optional       | `None`              | A regex pattern to filter columns by name. Use this parameter to include columns in addition to the ones specified by the `columns` parameter. |
| `names_regex`     | `str`, optional       | `None`              | A regex pattern to filter the runs by name.                                                                                                                                                                                                                                                                          |
| `custom_id_regex` | `str`, optional       | `None`              | A regex pattern to filter the runs by custom ID.                                                                                                                                                                                                                                                                     |
| `with_ids`        | `List[str]`, optional | `None`              | List of multiple Neptune IDs. Example: `["NLU-1", "NLU-2"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                                                   |
| `custom_ids`      | `List[str]`, optional | `None`              | List of multiple custom IDs. Example: `["nostalgic_shockley", "high_albattani"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                              |
| `states`          | `List[str]`, optional | `None`              | List of states. Possible values: `"inactive"`, `"active"`. "Active" means that at least one process is connected to the run. Matching any element of the list is sufficient to pass the criterion.                                                                                                                   |
| `owners`          | `List[str]`, optional | `None`              | List of multiple owners. Example:  `["frederic", "josh"]`. The owner is the user who created the run. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                          |
| `tags`            | `List[str]`, optional | `None`              | A list of tags. Example: `"lightGBM"` or `["pytorch", "cycleLR"]`. **Note:** Only runs that have all specified tags will pass this criterion.                                                                                                                                                                        |
| `trashed`         | `bool`, optional      | `False`             | Whether to retrieve trashed runs. If `True`, only trashed runs are retrieved. If `False`, only non-trashed runs are retrieved. If `None`, all runs are retrieved, including trashed ones. |
| `limit`           | `int`, optional       | `None`              | Maximum number of runs to fetch. If `None`, all runs are fetched.                                                                                                                                                                                                                                                    |
| `sort_by`         | `str`, optional       | `sys/creation_time` | Name of the field to sort the results by. The field must represent a simple type (string, float, integer).                                                                                                                                                                                                           |
| `ascending`       | `bool`, optional      | `False`             | Whether to sort the entries in ascending order of the sorting column values.                                                                                                                                                                                                                                         |
| `progress_bar`    | `bool`                | `True`              | Set to `False` to disable the download progress bar. |
| `query`           | `str`, optional       | `None`              | NQL query string. Example: `"(accuracy: float > 0.88) AND (loss: float < 0.2)"`. The query is applied on top of other criteria, such as `custom_ids` and `tags`, using the logical AND operator. See the examples below. For syntax, see [Neptune Query Language](https://docs.neptune.ai/usage/nql/) in the Neptune docs. |

__Returns:__ `pandas.DataFrame`: A pandas DataFrame containing metadata of the fetched runs.

> [!IMPORTANT]
> The following fields are always included:
>
> - `sys/custom_run_id`: the custom run identifier.
> - The field to sort by. That is, the field name passed to the `sort_by` argument.
>
> The maximum number of runs that can be returned is 5000.


__Examples:__

Fetch all runs, with specific columns:

```python
project = ReadOnlyProject()

runs_df = project.fetch_runs_df(
    columns=["sys/modification_time", "training/lr"]
)
```

Fetch all runs, with specific columns and extra columns that match a regex pattern:

```python
runs_df = project.fetch_runs_df(
    columns=["sys/modification_time"],
    columns_regex="tree/.*",
)
```

Fetch runs by custom ID:

```python
specific_runs_df = project.fetch_runs_df(
    custom_ids=["nostalgic_shockley", "high_albattani"]
)
```

Fetch runs by names that match a regex pattern:

```python
specific_runs_df = project.fetch_runs_df(
    names_regex="tree_3[2-4]+"
)
```

Fetch runs with a complex query using NQL:

```python
runs_df = project.fetch_runs_df(
    query='(last(`accuracy`:floatSeries) > 0.88) AND (`learning_rate`:float < 0.01)'
)
```
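
The list-valued filters in the parameter table do not all match the same way: for `custom_ids` and `owners`, matching any element is enough, while `tags` requires every listed tag. The sketch below models those rules locally on a made-up dictionary. It is an illustration of the documented semantics, not how the server evaluates filters. The dictionary keys and the assumption that separate criteria are combined with AND are choices made for this example.

```python
from typing import Dict, List, Optional


def passes_filters(
    run: Dict[str, object],
    custom_ids: Optional[List[str]] = None,
    owners: Optional[List[str]] = None,
    tags: Optional[List[str]] = None,
) -> bool:
    """Local model of the documented matching rules (hypothetical helper)."""
    # custom_ids and owners: matching ANY element of the list is sufficient.
    if custom_ids is not None and run["custom_id"] not in custom_ids:
        return False
    if owners is not None and run["owner"] not in owners:
        return False
    # tags: the run must carry ALL of the requested tags.
    if tags is not None and not set(tags) <= set(run["tags"]):
        return False
    return True


# Made-up run description used only for this example
sample_run = {
    "custom_id": "high_albattani",
    "owner": "josh",
    "tags": ["pytorch", "cycleLR", "baseline"],
}

print(passes_filters(sample_run, owners=["frederic", "josh"]))   # True
print(passes_filters(sample_run, tags=["pytorch", "cycleLR"]))   # True
print(passes_filters(sample_run, tags=["pytorch", "lightGBM"]))  # False
```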

---

#### `fetch_experiments_df()`

Fetches the experiments' metadata and returns them as a pandas DataFrame.

__Parameters__:

| Name                  | Type                  | Default             | Description                                                                                                                                                                                                                                                                                                          |
|-----------------------|-----------------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `columns`             | `List[str]`, optional | `None`              | Names of columns to include in the table, as a list of field names. The sorting column, custom run identifier (`sys/custom_run_id`), and experiment name (`sys/name`) are always included. `None` results in returning only the default columns. |
| `columns_regex`       | `str`, optional       | `None`              | A regex pattern to filter columns by name. Use this parameter to include columns in addition to the ones specified by the `columns` parameter. |
| `names_regex`         | `str` or `List[str]`, optional | `None`     | A regex pattern or a list of regex patterns to filter the experiments by name. Multiple patterns are combined with AND logic. |
| `names_exclude_regex` | `str` or `List[str]`, optional | `None`     | A regex pattern or a list of regex patterns to exclude experiments by name. Multiple patterns are combined with AND logic. |
| `custom_id_regex`     | `str`, optional       | `None`              | A regex pattern to filter the experiments by custom ID.                                                                                                                                                                                                                                                              |
| `with_ids`            | `List[str]`, optional | `None`              | List of multiple Neptune IDs. Example: `["NLU-1", "NLU-2"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                                                   |
| `custom_ids`          | `List[str]`, optional | `None`              | List of multiple custom IDs. Example: `["nostalgic_shockley", "high_albattani"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                              |
| `states`              | `List[str]`, optional | `None`              | List of states. Possible values: `"inactive"`, `"active"`. "Active" means that at least one process is connected to the experiment. Matching any element of the list is sufficient to pass the criterion.                                                                                                            |
| `owners`              | `List[str]`, optional | `None`              | List of multiple owners. Example: `["frederic", "josh"]`. The owner is the user who created the experiment. Matching any element of the list is sufficient to pass the criterion. |
| `tags`                | `List[str]`, optional | `None`              | A list of tags. Example: `"lightGBM"` or `["pytorch", "cycleLR"]`. **Note:** Only experiments that have all specified tags will pass this criterion.                                                                                                                                                                 |
| `trashed`             | `bool`, optional      | `False`             | Whether to retrieve trashed experiments. If `True`, only trashed experiments are retrieved. If `False`, only non-trashed experiments are retrieved. If `None`, all experiments are retrieved, including trashed ones. |
| `limit`               | `int`, optional       | `None`              | Maximum number of experiments to fetch. If `None`, all experiments are fetched.                                                                                                                                                                                                                                      |
| `sort_by`             | `str`, optional       | `sys/creation_time` | Name of the field to sort the results by. The field must represent a simple type (string, float, integer).                                                                                                                                                                                                           |
| `ascending`           | `bool`, optional      | `False`             | Whether to sort the entries in ascending order of the sorting column values.                                                                                                                                                                                                                                         |
| `progress_bar`        | `bool`                | `True`              | Set to `False` to disable the download progress bar. |
| `query`               | `str`, optional       | `None`              | NQL query string. Example: `"(accuracy: float > 0.88) AND (loss: float < 0.2)"`. The query is applied on top of other criteria, such as `custom_ids` and `tags`, using the logical AND operator. See the examples below. For syntax, see [Neptune Query Language](https://docs.neptune.ai/usage/nql/) in the Neptune docs. |

__Returns:__ `pandas.DataFrame`: A pandas DataFrame containing metadata of the fetched experiments.

> [!IMPORTANT]
> The following fields are always included:
>
> - `sys/custom_run_id`: the custom run identifier.
> - `sys/name`: the experiment name.
> - The field to sort by. That is, the field name passed to the `sort_by` argument.
>
> The maximum number of experiments that can be returned is 5000.

__Examples:__

Fetch all experiments with specific columns:

```python
experiments_df = project.fetch_experiments_df(
    columns=["sys/custom_run_id", "sys/modification_time", "training/lr"]
)
```

Fetch all experiments with specific columns and extra columns that match a regex pattern:

```python
experiments_df = project.fetch_experiments_df(
    columns=["sys/custom_run_id", "sys/modification_time"],
    columns_regex="tree/.*",
)
```

Fetch experiments by custom IDs:

```python
specific_experiments_df = project.fetch_experiments_df(
    custom_ids=["nostalgic_shockley", "high_albattani"]
)
```

Use the Neptune Query Language to fetch experiments with a complex query. Note that for regular strings, the `\` character needs to be escaped:

```python
experiments_df = project.fetch_experiments_df(
    query='(`learning_rate`:float < 0.01) AND (`sys/name`:string MATCHES "experiment-\\\\d+")'
)
```

As a less cluttered alternative, pass a raw Python string to the `query` argument:

```python
experiments_df = project.fetch_experiments_df(
    query=r'(`learning_rate`:float < 0.01) AND (`sys/name`:string MATCHES "experiment-\\d+")'
)
```
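
The regular and raw literals above denote exactly the same string. A quick check in plain Python, using only the `MATCHES` clause from the two examples:

```python
regular = '(`sys/name`:string MATCHES "experiment-\\\\d+")'
raw = r'(`sys/name`:string MATCHES "experiment-\\d+")'

# Both literals contain the same characters, including two backslashes before "d+".
assert regular == raw
print(raw)
```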

---

#### `fetch_read_only_runs()`

Lists runs of the project as `ReadOnlyRun` objects.

__Parameters:__

| Name         | Type                  | Default | Description                       |
|--------------|-----------------------|---------|-----------------------------------|
| `with_ids`   | `Optional[List[str]]` | `None`  | List of Neptune run IDs to fetch. |
| `custom_ids` | `Optional[List[str]]` | `None`  | List of custom run IDs to fetch.  |

__Returns:__ Iterator of `ReadOnlyRun` objects.

__Example:__

```python
project = ReadOnlyProject()

for run in project.fetch_read_only_runs(custom_ids=["nostalgic_shockley", "high_albattani"]):
    ...
```

---

#### `fetch_read_only_experiments()`

Lists experiments of the project as `ReadOnlyRun` objects.

__Parameters:__

| Name    | Type                  | Default | Description                        |
|---------|-----------------------|---------|------------------------------------|
| `names` | `Optional[List[str]]` | `None`  | List of experiment names to fetch. |

__Returns:__ Iterator of `ReadOnlyRun` objects.

__Example:__

```python
project = ReadOnlyProject()

for run in project.fetch_read_only_experiments(names=["yolo-v2", "yolo-v3"]):
    ...
```

---

### `ReadOnlyRun`

Representation of a Neptune run in a limited read-only mode.

#### Initialization

You can create a `ReadOnlyRun` object in one of two ways:

- with the class constructor:

    ```python
    project = ReadOnlyProject()
    run = ReadOnlyRun(project, with_id="TES-1")
    ```

- or as a result of the [`fetch_read_only_runs()`](#fetch_read_only_runs) method of the `ReadOnlyProject` class:

    ```python
    for run in project.fetch_read_only_runs(
        custom_ids=["nostalgic_shockley", "high_albattani"]):
        ...
    ```

__Parameters:__

| Name                               | Type              | Default | Description                                                                                                                        |
|------------------------------------|-------------------|---------|------------------------------------------------------------------------------------------------------------------------------------|
| `read_only_project`                | `ReadOnlyProject` | -       | Project from which the run is fetched.                                                                                             |
| `with_id`                          | `Optional[str]`   | `None`  | ID of the Neptune run to fetch. Example: `RUN-1`. Exclusive with the `custom_id` and `experiment_name` parameters.                 |
| `custom_id`                        | `Optional[str]`   | `None`  | Custom ID of the Neptune run to fetch. Example: `high_albattani`. Exclusive with the `with_id` and `experiment_name` parameters.   |
| `experiment_name`                  | `Optional[str]`   | `None`  | Name of the Neptune experiment to fetch. Example: `high_albattani`. Exclusive with the `with_id` and `custom_id` parameters.       |
| `eager_load_fields`                | `Optional[bool]`  | `True`  | Whether to eagerly load the run's field definitions. If `False`, individual fields are loaded only when accessed.                  |

__Example:__

```python
from neptune_fetcher import ReadOnlyProject, ReadOnlyRun

project = ReadOnlyProject("workspace-name/project-name", api_token="...")
run = ReadOnlyRun(project, custom_id="high_albattani")
```
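
The `with_id`, `custom_id`, and `experiment_name` parameters are mutually exclusive. The sketch below shows one way to check this before constructing a `ReadOnlyRun`. `check_exclusive_identifiers` is a hypothetical helper, independent of the library, and requiring exactly one identifier is an assumption made for this example.

```python
from typing import Optional


def check_exclusive_identifiers(
    with_id: Optional[str] = None,
    custom_id: Optional[str] = None,
    experiment_name: Optional[str] = None,
) -> str:
    """Return the name of the single identifier that was provided.

    Hypothetical pre-check for illustration; it is not part of neptune_fetcher.
    """
    candidates = {
        "with_id": with_id,
        "custom_id": custom_id,
        "experiment_name": experiment_name,
    }
    provided = [name for name, value in candidates.items() if value is not None]
    if len(provided) != 1:
        raise ValueError(
            "Pass exactly one of with_id, custom_id, experiment_name; "
            f"got {provided or 'none'}"
        )
    return provided[0]


print(check_exclusive_identifiers(custom_id="high_albattani"))  # custom_id
```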

---

#### `.field_names`

List of run field names.

A _field_ is the location where a piece of metadata is stored in the run.

__Returns:__ Iterator of run fields as strings.

__Example:__

```python
for run in project.fetch_read_only_runs(custom_ids=["nostalgic_shockley", ...]):
    print(list(run.field_names))
```

---

#### Field lookup: `run[field_name]`

Used to access a specific field of a run. See [Available types](#available-types).

__Returns:__ An internal object used to operate on a specific field.

__Example:__

```python
run = ReadOnlyRun(...)
custom_id = run["sys/custom_run_id"].fetch()
```

---

#### `prefetch()`

Pre-fetches a batch of fields to the internal cache.

Improves the performance of access to consecutive field values.

Supported Neptune field types:

- [`Boolean`](#boolean)
- [`Datetime`](#datetime)
- [`Float`](#float)
- [`FloatSeries`](#floatseries)
- [`Integer`](#integer)
- [`ObjectState`](#objectstate)
- [`String`](#string)
- [`StringSet`](#stringset)

__Parameters:__

| Name    | Type        | Default | Description                                |
|---------|-------------|---------|--------------------------------------------|
| `paths` | `List[str]` | -       | List of field paths to fetch to the cache. |

__Example:__

```python
run = ReadOnlyRun(...)
run.prefetch(["parameters/optimizer", "parameters/init_lr"])
# No more calls to the API
print(run["parameters/optimizer"].fetch())
print(run["parameters/init_lr"].fetch())
```
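
The prefetch-then-fetch pattern can be pictured as a small cache that is filled by one batched request. The toy class below illustrates that idea with a fake backend that counts requests. `FieldCache`, `fake_backend`, and the stored values are made up for this example and do not reflect the library's internals.

```python
from typing import Callable, Dict, List


class FieldCache:
    """Toy cache that mirrors the prefetch-then-fetch pattern."""

    def __init__(self, fetch_batch: Callable[[List[str]], Dict[str, object]]) -> None:
        self._fetch_batch = fetch_batch
        self._cache: Dict[str, object] = {}

    def prefetch(self, paths: List[str]) -> None:
        # One batched request fills the cache for all requested paths.
        self._cache.update(self._fetch_batch(paths))

    def fetch(self, path: str) -> object:
        # Cached paths are served locally; others trigger a single-path request.
        if path not in self._cache:
            self._cache.update(self._fetch_batch([path]))
        return self._cache[path]


# Stand-in backend with made-up values that records every request it receives
requests: List[List[str]] = []
stored = {"parameters/optimizer": "Adam", "parameters/init_lr": 0.001}


def fake_backend(paths: List[str]) -> Dict[str, object]:
    requests.append(list(paths))
    return {path: stored[path] for path in paths}


cache = FieldCache(fake_backend)
cache.prefetch(["parameters/optimizer", "parameters/init_lr"])
print(cache.fetch("parameters/optimizer"), cache.fetch("parameters/init_lr"))
print(len(requests))  # 1: both values were served from the cache
```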

---

#### `prefetch_series_values()`

Pre-fetches a batch of series to the internal cache. Attributes that don't exist are skipped.

Improves the performance of access to consecutive field values. Works only for series ([`FloatSeries`](#floatseries)).

To speed up the fetching process, this method can use multithreading.
To enable it, set the `use_threads` parameter to `True`.

By default, the maximum number of workers is 10. You can change this number by setting the `NEPTUNE_FETCHER_MAX_WORKERS`
environment variable.
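
As a rough picture of how a worker cap read from an environment variable can bound a thread pool, consider the sketch below. It uses only the standard library. `max_workers_from_env`, `fetch_all`, and `fake_fetch` are hypothetical names for this illustration, and the code is not the library's implementation.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def max_workers_from_env(default: int = 10) -> int:
    """Read the worker cap from NEPTUNE_FETCHER_MAX_WORKERS, falling back to the default."""
    return int(os.environ.get("NEPTUNE_FETCHER_MAX_WORKERS", default))


def fetch_all(
    paths: List[str], fetch_one: Callable[[str], List[float]]
) -> Dict[str, List[float]]:
    """Fetch each series in its own task, using at most max_workers_from_env() threads."""
    with ThreadPoolExecutor(max_workers=max_workers_from_env()) as pool:
        # pool.map returns results in the same order as the input paths.
        return dict(zip(paths, pool.map(fetch_one, paths)))


# Made-up stand-in for a per-series download
def fake_fetch(path: str) -> List[float]:
    return [float(len(path)), 0.0]


os.environ["NEPTUNE_FETCHER_MAX_WORKERS"] = "4"
print(max_workers_from_env())
print(fetch_all(["metrics/loss", "metrics/accuracy"], fake_fetch))
```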

__Parameters__:

| Name                | Type                  | Default      | Description                                                                                                                                                                                                                                                                                      |
|---------------------|-----------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `paths`             | `List[str]`, required | -            | List of paths to prefetch to the internal cache. |
| `use_threads`       | `bool`, optional      | `False`      | Whether to use threads to fetch the data. |
| `progress_bar`      | `bool`                | `True`       | Set to `False` to disable the download progress bar. |
| `include_inherited` | `bool`, optional      | `True`       | If `True` (default), values inherited from ancestor runs are included. To only fetch values from the current run, set to `False`. |
| `step_range`        | `tuple[float, float]` | `(None, None)` | Limits the range of steps to fetch. This must be a 2-tuple: <br>- `left`: The left boundary of the range (inclusive). If `None`, the range extends indefinitely on the left.<br>- `right`: The right boundary of the range (inclusive). If `None`, the range extends indefinitely on the right. |

__Example:__

```python
run.prefetch_series_values(["metrics/loss", "metrics/accuracy"])
# The values are now cached, so the calls below make no further API requests
print(run["metrics/loss"].fetch_values())
print(run["metrics/accuracy"].fetch_values())
```
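
The effect of prefetching can be pictured as a cache-first lookup. The sketch below is not the library's implementation: `CachedSeriesReader` and the `fake_api` callback are hypothetical stand-ins that only illustrate why reads after a prefetch need no further API requests.

```python
from typing import Callable, Dict, Iterable, List


class CachedSeriesReader:
    """Hypothetical stand-in for a run object; illustrates cache-first reads only."""

    def __init__(self, api_fetch: Callable[[str], List[float]]) -> None:
        self._api_fetch = api_fetch  # performs the (slow) remote call
        self._cache: Dict[str, List[float]] = {}

    def prefetch_series_values(self, paths: Iterable[str]) -> None:
        # Load every requested series into the cache up front.
        for path in paths:
            self._cache[path] = self._api_fetch(path)

    def fetch_values(self, path: str) -> List[float]:
        # Serve from the cache when possible; fall back to the API otherwise.
        if path not in self._cache:
            self._cache[path] = self._api_fetch(path)
        return self._cache[path]


calls: List[str] = []


def fake_api(path: str) -> List[float]:
    calls.append(path)  # record each simulated API request
    return [0.9, 0.5, 0.3]


reader = CachedSeriesReader(fake_api)
reader.prefetch_series_values(["metrics/loss", "metrics/accuracy"])
reader.fetch_values("metrics/loss")      # served from the cache
reader.fetch_values("metrics/accuracy")  # served from the cache
print(len(calls))  # 2: only the prefetch reached the simulated API
```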

## Available types

This section lists the available field types and data retrieval operations.

---

### `Boolean`

#### `fetch()`

Retrieves a `bool` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
status = run["sys/failed"].fetch()
```

---

### `Datetime`

#### `fetch()`

Retrieves a `datetime.datetime` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
created_at = run["sys/creation_time"].fetch()
```

---

### `Float`

#### `fetch()`

Retrieves a `float` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
f1 = run["scores/f1"].fetch()
```

---

### `FloatSeries`

#### `fetch()` or `fetch_last()`

Retrieves the last value of a series, either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Returns:__ `Optional[float]`

__Example:__

```python
loss = run["loss"].fetch_last()
```
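
Because the return type is `Optional[float]`, the result can be `None`, presumably when no value is available for the series. A minimal sketch of guarding against that case follows. `format_last` is a hypothetical helper, and `run` is assumed to be any object that supports the field lookup shown above.

```python
from typing import Any, Optional


def format_last(run: Any, path: str) -> str:
    """Hypothetical helper: format the last value of a series, or a placeholder."""
    last: Optional[float] = run[path].fetch_last()
    if last is None:
        # No value was returned for this series.
        return f"{path}: no values"
    return f"{path}: {last:.4f}"
```

With a real run object, usage would look like `print(format_last(run, "loss"))`.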

#### `fetch_values()`

Retrieves all series values either from the internal cache (see [`prefetch_series_values()`](#prefetch_series_values))
or from the API.

__Parameters:__

| Name                | Type                  | Default        | Description                                                                                                                                               |
|---------------------|-----------------------|----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
| `include_timestamp` | `bool`                | `True`         | Whether the fetched data should include the timestamp field.                                                                                              |
| `include_inherited` | `bool`                | `True`         | If `True` (default), values inherited from ancestor runs are included. To only fetch values from the current run, set to `False`.                         |
| `progress_bar`      | `bool`                | `True`         | Set to `False` to disable the download progress bar.                                                                                                      |
| `step_range`        | `tuple[float, float]` | `(None, None)` | Limits the range of steps to fetch, as a 2-tuple `(left, right)`. Both boundaries are inclusive. If a boundary is `None`, the range is open on that side. |

__Returns:__ `pandas.DataFrame`

__Example:__

```python
values = run["loss"].fetch_values()
```
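
Both `step_range` boundaries are inclusive, and `None` leaves that side open. The following sketch restates the rule in plain Python. `in_step_range` is a hypothetical helper written for illustration and is not part of `neptune-fetcher`.

```python
from typing import Optional, Tuple

StepRange = Tuple[Optional[float], Optional[float]]


def in_step_range(step: float, step_range: StepRange = (None, None)) -> bool:
    """Return True if `step` lies within the inclusive range; None means unbounded."""
    left, right = step_range
    if left is not None and step < left:
        return False
    if right is not None and step > right:
        return False
    return True


steps = [0, 50, 100, 150, 200, 250]
print([s for s in steps if in_step_range(s, (100, 200))])   # [100, 150, 200]
print([s for s in steps if in_step_range(s, (None, 100))])  # [0, 50, 100]
```

Going by the parameter table, the corresponding call on a run would be `run["loss"].fetch_values(step_range=(100, 200))`.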

---

### `Integer`

#### `fetch()`

Retrieves an `int` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
batch_size = run["batch_size"].fetch()
```

---

### `ObjectState`

#### `fetch()`

Retrieves the state of a run either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Returns:__ `str`

> [!NOTE]
> The state can be **active** or **inactive**. It refers to whether new data was recently logged to the run.
> To learn more about this field, see [System namespace: State](https://docs.neptune.ai/api/sys/#state) in the Neptune
> docs.

__Example:__

```python
state = run["sys/state"].fetch()
```
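
Since the state is returned as a plain string, it can be used to filter runs on the client side. A minimal sketch under stated assumptions: `inactive_runs` is a hypothetical helper, and the lowercase comparison assumes the returned value matches the `"inactive"` spelling used in this document.

```python
from typing import Any, Iterable, Iterator


def inactive_runs(runs: Iterable[Any]) -> Iterator[Any]:
    """Hypothetical helper: yield only the runs whose state is reported as inactive."""
    for run in runs:
        state: str = run["sys/state"].fetch()
        # Compare case-insensitively, since the exact casing is an assumption here.
        if state.lower() == "inactive":
            yield run
```

For example, `inactive_runs(project.fetch_read_only_runs(with_ids=["RUN-1", "RUN-2"]))` would yield only the runs that are no longer active.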

---

### `String`

#### `fetch()`

Retrieves a `str` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
token = run["token"].fetch()
```

---

### `StringSet`

#### `fetch()`

Retrieves a `set` of `str` values either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.

__Example:__

```python
groups = run["sys/group_tags"].fetch()
```

## License

This project is licensed under the Apache License Version 2.0. For more details,
see [Apache License Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "neptune-fetcher",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "MLOps, ML Experiment Tracking, ML Model Registry, ML Model Store, ML Metadata Store",
    "author": "neptune.ai",
    "author_email": "contact@neptune.ai",
    "download_url": "https://files.pythonhosted.org/packages/59/44/5ff1b11f5b55cd973e9bcdeec8821e083c0dab700749fb79dda703c98f20/neptune_fetcher-0.12.0.tar.gz",
    "platform": null,
    "description": "# Neptune Fetcher\n\n> [!NOTE]\n> This package is experimental and only works with Neptune Scale, which is in beta.\n>\n> You can't use this package with `neptune<2.0` or the currently available Neptune app version. For the corresponding\n> Python API, see [neptune-client](https://github.com/neptune-ai/neptune-client).\n\nNeptune Fetcher is designed to separate data retrieval capabilities from the regular `neptune` package. This separation\nmakes data fetching more efficient and improves performance.\n\n## Installation\n\n```bash\npip install neptune-fetcher\n```\n\n## Usage\n\n1. Set your Neptune API token and project name as environment variables:\n\n    ```bash\n    export NEPTUNE_API_TOKEN=\"h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ==\"\n    ```\n\n    ```bash\n    export NEPTUNE_PROJECT=\"workspace-name/project-name\"\n    ```\n\n    For help, see https://docs-beta.neptune.ai/setup.\n\n1. In your Python code, create a [`ReadOnlyProject`](#readonlyproject) instance:\n\n    ```python\n    from neptune_fetcher import ReadOnlyProject\n\n    my_project = ReadOnlyProject()\n    ```\n\nNow you have a Neptune project to operate on.\n\n> If you don't set the Neptune environment variables, you can pass your credentials through arguments when creating a project or run object.\n\nTo fetch experiments in bulk, call a fetching method on the project:\n\n```python\nexperiments_df = my_project.fetch_experiments_df(\n    names_regex=\"tree/.*\",\n    columns=[\"sys/custom_run_id\", \"sys/modification_time\"],\n    query='(last(`accuracy`:floatSeries) > 0.88) AND (`learning_rate`:float < 0.01)',\n)\n```\n\nTo fetch metadata from an individual experiment or run, create and use a [`ReadOnlyRun`](#readonlyrun) object:\n\n```python\nfrom neptune_fetcher import ReadOnlyRun\n\nrun = ReadOnlyRun(\n    read_only_project=my_project,\n    experiment_name=\"seagull-flying-skills\",\n)\n\n# Fetch value\nprint(run[\"parameters/optimizer\"].fetch())\n\n# Fecth last value of 
metric\nprint(run[\"metrics/loss\"].fetch_last())\n\n# Fetch all metric values, with optional pre-fetching to speed up subsequent access to the field\nrun.prefetch_series_values([\"metrics/accuracy\"])\nprint(run[\"metrics/accuracy\"].fetch_values())\n```\n\nFor details, see the Neptune documentation:\n\n- [Fetch runs or experiments](https://docs-beta.neptune.ai/fetch_runs)\n- [Fetch metadata from a run or experiment](https://docs-beta.neptune.ai/fetch_run_data)\n- [Neptune Query Language (NQL)](https://docs-beta.neptune.ai/nql)\n\n## Examples\n\n### Listing runs of a project\n\n```python\nfrom neptune_fetcher import ReadOnlyProject\n\nproject = ReadOnlyProject(\"workspace/project\")\n\nfor run in project.list_runs():\n    print(run)  # dicts with identifiers\n```\n\n### Listing experiments of a project\n\n```python\nfrom neptune_fetcher import ReadOnlyProject\n\nproject = ReadOnlyProject(\"workspace/project\")\n\nfor experiment in project.list_experiments():\n    print(experiment)  # dicts with identifiers\n```\n\n### Fetching runs data frame with specific columns\n\n```python\nfrom neptune_fetcher import ReadOnlyProject\n\nproject = ReadOnlyProject(\"workspace/project\")\n\nruns_df = project.fetch_runs_df(\n    columns=[\"sys/custom_run_id\", \"sys/modification_time\"],\n    columns_regex=\"tree/.*\",  # added to columns specified with the \"columns\" parameter\n)\n```\n\n### Fetching data from specified runs\n\n```python\nfrom neptune_fetcher import ReadOnlyProject\n\nproject = ReadOnlyProject(\"workspace/project\")\n\nfor run in project.fetch_read_only_runs(with_ids=[\"RUN-1\", \"RUN-2\"]):\n    run.prefetch([\"parameters/optimizer\", \"parameters/init_lr\"])\n\n    print(run[\"parameters/optimizer\"].fetch())\n    print(run[\"parameters/init_lr\"].fetch())\n```\n\n### Fetching data from a single run\n\n```python\nfrom neptune_fetcher import ReadOnlyProject, ReadOnlyRun\n\nproject = ReadOnlyProject(\"workspace/project\")\nrun = ReadOnlyRun(project, 
with_id=\"TES-1\")\n\nrun.prefetch([\"parameters/optimizer\", \"parameters/init_lr\"])\nrun.prefetch_series_values([\"metrics/loss\", \"metrics/accuracy\"], use_threads=True)\n\nprint(run[\"parameters/optimizer\"].fetch())\nprint(run[\"parameters/init_lr\"].fetch())\nprint(run[\"metrics/loss\"].fetch_values())\nprint(run[\"metrics/accuracy\"].fetch_values())\n```\n\n## API reference\n\n### Supported regular expressions\n\nNeptune uses the [RE2](https://github.com/google/re2) regular expression library. For supported regex features and limitations, see the official [syntax guide](https://github.com/google/re2/wiki/syntax).\n\n\n### `ReadOnlyProject`\n\nRepresentation of a Neptune project in a limited read-only mode.\n\n#### Initialization\n\nInitialize with the ReadOnlyProject class constructor:\n\n```python\nproject = ReadOnlyProject(\"workspace/project\", api_token=\"...\")\n```\n\n> [!TIP]\n> Find your API token in your user menu, in the bottom-left corner of the Neptune app.\n\n__Parameters:__\n\n| Name        | Type             | Default | Description                                                                                                                                                                                                                                       |\n|-------------|------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `project`   | `str`, optional  | `None`  | Name of a project in the form `workspace-name/project-name`. If `None`, the value of the `NEPTUNE_PROJECT` environment variable is used.                                                                                                          |\n| `api_token` | `str`, optional  | `None`  | Your Neptune API token (or a service account's API token). 
If `None`, the value of the `NEPTUNE_API_TOKEN` environment variable is used. To keep your token secure, avoid placing it in source code. Instead, save it as an environment variable. |\n| `proxies`   | `dict`, optional | `None`  | Dictionary of proxy settings, if needed. This argument is passed to HTTP calls made via the Requests library. For details on proxies, see the [Requests documentation](https://requests.readthedocs.io/).                                         |\n\n---\n\n#### `list_runs()`\n\nLists all runs of a project.\n\nEach run is identified by Neptune ID (`sys/id`), custom ID (`sys/custom_run_id`) and, if set, name (`sys/name`).\n\n__Returns:__ `Iterator` of dictionaries with Neptune run identifiers, custom identifiers and names.\n\n__Example:__\n\n```python\nproject = ReadOnlyProject()\n\nfor run in project.list_runs():\n    print(run)\n```\n\n---\n\n#### `list_experiments()`\n\nLists all experiments of a project.\n\nEach experiment is identified by:\n\n- Neptune ID: `sys/id`\n- (If set) Custom ID: `sys/custom_run_id`\n- Name: `sys/name`\n\n__Example:__\n\n```python\nfor experiment in project.list_experiments():\n    print(experiment)\n```\n\n__Returns:__ `Iterator` of dictionaries with Neptune experiment identifiers, custom identifiers and names.\n\n---\n\n#### `fetch_runs()`\n\nFetches a table containing Neptune IDs, custom run IDs and names of runs in the project.\n\n__Returns:__ `pandas.DataFrame` `pandas.DataFrame` with three columns (`sys/id`, `sys/name` and `sys/custom_run_id`)\nand one row for each run.\n\n__Example:__\n\n```python\nproject = ReadOnlyProject()\ndf = project.fetch_runs()\n```\n\n---\n\n#### `fetch_experiments()`\n\nFetches a table containing Neptune IDs, custom IDs and names of experiments in the project.\n\n__Example__:\n\n```python\ndf = project.fetch_experiments()\n```\n\n__Returns__:\n`pandas.DataFrame` with three columns (`sys/id`, `sys/custom_run_id`, `sys/name`) and one row for each experiment.\n\n---\n\n#### 
`fetch_runs_df()`\n\nFetches the runs' metadata and returns them as a pandas DataFrame.\n\n__Parameters:__\n\n| Name              | Type                  | Default             | Description                                                                                                                                                                                                                                                                                                          |\n|-------------------|-----------------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `columns`         | `List[str]`, optional | `None`              | Names of columns to include in the table, as a list of field names. The sorting column, custom run identifier (`sys/custom_run_id`), and experiment name (`sys/name`) are always included. `None` results in returning only the default columns.\n| `columns_regex`   | `str`, optional       | `None`              | A regex pattern to filter columns by name. Use this parameter to include columns in addition to the ones specified by the `columns` parameter.\n| `names_regex`     | `str`, optional       | `None`              | A regex pattern to filter the runs by name.                                                                                                                                                                                                                                                                          |\n| `custom_id_regex` | `str`, optional       | `None`              | A regex pattern to filter the runs by custom ID.                                                                                                                
                                                                                                                                                     |\n| `with_ids`        | `List[str]`, optional | `None`              | List of multiple Neptune IDs. Example: `[\"NLU-1\", \"NLU-2\"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                                                   |\n| `custom_ids`      | `List[str]`, optional | `None`              | List of multiple custom IDs. Example: `[\"nostalgic_shockley\", \"high_albattani\"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                              |\n| `states`          | `List[str]`, optional | `None`              | List of states. Possible values: `\"inactive\"`, `\"active\"`. \"Active\" means that at least one process is connected to the run. Matching any element of the list is sufficient to pass the criterion.                                                                                                                   |\n| `owners`          | `List[str]`, optional | `None`              | List of multiple owners. Example:  `[\"frederic\", \"josh\"]`. The owner is the user who created the run. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                          |\n| `tags`            | `List[str]`, optional | `None`              | A list of tags. Example: `\"lightGBM\"` or `[\"pytorch\", \"cycleLR\"]`. **Note:** Only runs that have all specified tags will pass this criterion.                                                                                               
                                                                         |\n| `trashed`         | `bool`, optional      | `False`             | Whether to retrieve trashed runs. If `True`, only trashed runs are retrieved. If `False`, only non-trashed runs are retrieved. If `None` or left empty, all run objects are retrieved, including trashed ones.                                                                                                       |\n| `limit`           | `int`, optional       | `None`              | Maximum number of runs to fetch. If `None`, all runs are fetched.                                                                                                                                                                                                                                                    |\n| `sort_by`         | `str`, optional       | `sys/creation_time` | Name of the field to sort the results by. The field must represent a simple type (string, float, integer).                                                                                                                                                                                                           |\n| `ascending`       | `bool`, optional      | `False`             | Whether to sort the entries in ascending order of the sorting column values.                                                                                                                                                                                                                                         |\n| `progress_bar`    | `bool`                | `True`              | Set to `False `to disable the download progress bar.                                                                                                                                                                                                                                                                 
|\n| `query`           | `str`, optional       | `None`              | NQL query string. Example: `\"(accuracy: float > 0.88) AND (loss: float < 0.2)\"`. The query is applied on top of other criteria like, `custom_ids`, `tags` etc, using the logical AND operator. See examples below. For syntax, see [Neptune Query Language](https://docs.neptune.ai/usage/nql/) in the Neptune docs. |\n\n__Returns:__ `pandas.DataFrame`: A pandas DataFrame containing metadata of the fetched runs.\n\n> [!IMPORTANT]\n> The following fields are always included:\n>\n> - `sys/custom_run_id`: the custom run identifier.\n> - The field to sort by. That is, the field name passed to the `sort_by` argument.\n>\n> The maximum number of runs that can be returned is 5000.\n\n\n__Examples:__\n\nFetch all runs, with specific columns:\n\n```python\nproject = ReadOnlyProject()\n\nruns_df = project.fetch_runs_df(\n    columns=[\"sys/modification_time\", \"training/lr\"]\n)\n```\n\nFetch all runs, with specific columns and extra columns that match a regex pattern:\n\n```python\nruns_df = project.fetch_runs_df(\n    columns=[\"sys/modification_time\"],\n    columns_regex=\"tree/.*\",\n)\n```\n\nFetch runs by specific ID:\n\n```python\nspecific_runs_df = my_project.fetch_runs_df(custom_ids=[\"nostalgic_shockley\", \"high_albattani\"])\n```\n\nFetch runs by names that match a regex pattern:\n\n```python\nspecific_runs_df = my_project.fetch_runs_df(\n    names_regex=\"tree_3[2-4]+\"\n)\n```\n\nFetch runs with a complex query using NQL.\n\n```python\nruns_df = my_project.fetch_runs_df(\n    query='(last(`accuracy`:floatSeries) > 0.88) AND (`learning_rate`:float < 0.01)'\n)\n```\n\n---\n\n#### `fetch_experiments_df()`\n\nFetches the experiments' metadata and returns them as a pandas DataFrame.\n\n__Parameters__:\n\n| Name                  | Type                  | Default             | Description                                                                                                                      
                                                                                                                                                                                    |\n|-----------------------|-----------------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `columns`             | `List[str]`, optional | `None`              | Names of columns to include in the table, as a list of field names. The sorting column, custom run identifier (`sys/custom_run_id`), and experiment name (`sys/name`) are always included. `None` results in returning only the default columns.\n| `columns_regex`       | `str`, optional       | `None`              | A regex pattern to filter columns by name. Use this parameter to include columns in addition to the ones specified by the `columns` parameter.\n| `names_regex`         | `str`, optional       | `None`              | A regex pattern to filter the experiments by name.                                                                                                                                                                                                                                                                   |\n| `names_regex`         | `str`, optional       | `None`              | A regex pattern or a list of regex patterns to filter the experiments by name. Multiple patterns will be connected by AND logic.                                                                                                                                                                                     
|\n| `names_exclude_regex` | `str`, optional       | `None`              | A regex pattern or a list of regex patterns to exclude experiments by name. Multiple patterns will be connected by AND logic.                                                                                                                                                                                        |\n| `custom_id_regex`     | `str`, optional       | `None`              | A regex pattern to filter the experiments by custom ID.                                                                                                                                                                                                                                                              |\n| `with_ids`            | `List[str]`, optional | `None`              | List of multiple Neptune IDs. Example: `[\"NLU-1\", \"NLU-2\"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                                                   |\n| `custom_ids`          | `List[str]`, optional | `None`              | List of multiple custom IDs. Example: `[\"nostalgic_shockley\", \"high_albattani\"]`. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                                              |\n| `states`              | `List[str]`, optional | `None`              | List of states. Possible values: `\"inactive\"`, `\"active\"`. \"Active\" means that at least one process is connected to the experiment. Matching any element of the list is sufficient to pass the criterion.                                                                                                            
|\n| `owners`              | `List[str]`, optional | `None`              | List of multiple owners. Example:  `[\"frederic\", \"josh\"]`. The owner is the user who created the experiement. Matching any element of the list is sufficient to pass the criterion.                                                                                                                                  |\n| `tags`                | `List[str]`, optional | `None`              | A list of tags. Example: `\"lightGBM\"` or `[\"pytorch\", \"cycleLR\"]`. **Note:** Only experiments that have all specified tags will pass this criterion.                                                                                                                                                                 |\n| `trashed`             | `bool`, optional      | `False`             | Whether to retrieve trashed experiments. If `True`, only trashed experiments are retrieved. If `False`, only non-trashed experiments are retrieved. If `None` or left empty, all experiment objects are retrieved, including trashed ones.                                                                           |\n| `limit`               | `int`, optional       | `None`              | Maximum number of experiments to fetch. If `None`, all experiments are fetched.                                                                                                                                                                                                                                      |\n| `sort_by`             | `str`, optional       | `sys/creation_time` | Name of the field to sort the results by. The field must represent a simple type (string, float, integer).                                                                                                                                                                                                           
|\n| `ascending`           | `bool`, optional      | `False`             | Whether to sort the entries in ascending order of the sorting column values.                                                                                                                                                                                                                                         |\n| `progress_bar`        | `bool`                | `True`              | Set to `False `to disable the download progress bar.                                                                                                                                                                                                                                                                 |\n| `query`               | `str`, optional       | `None`              | NQL query string. Example: `\"(accuracy: float > 0.88) AND (loss: float < 0.2)\"`. The query is applied on top of other criteria like, `custom_ids`, `tags` etc, using the logical AND operator. See examples below. For syntax, see [Neptune Query Language](https://docs.neptune.ai/usage/nql/) in the Neptune docs. |\n\n__Returns:__ `pandas.DataFrame`: A pandas DataFrame containing metadata of the fetched experiments.\n\n> [!IMPORTANT]\n> The following fields are always included:\n>\n> - `sys/custom_run_id`: the custom run identifier.\n> - `sys/name`: the experiment name.\n> - The field to sort by. 
That is, the field name passed to the `sort_by` argument.\n>\n> The maximum number of runs that can be returned is 5000.\n\n__Examples:__\n\nFetch all experiments with specific columns:\n\n```python\nexperiments_df = project.fetch_experiments_df(\n    columns=[\"sys/custom_run_id\", \"sys/modification_time\", \"training/lr\"]\n)\n```\n\nFetch all experiments with specific columns and extra columns that match a regex pattern:\n\n```python\nexperiments_df = project.fetch_experiments_df(\n    columns=[\"sys/custom_run_id\", \"sys/modification_time\"],\n    columns_regex=\"tree/.*\",\n)\n```\n\nFetch experiments by specific IDs:\n\n```python\nspecific_experiments_df = my_project.fetch_experiments_df(\n    custom_ids=[\"nostalgic_shockley\", \"high_albattani\"]\n)\n```\n\nUse the Neptune Query Language to fetch experiments with a complex query. Note that for regular strings, the `\\` character needs to be escaped:\n\n```python\nexperiments_df = my_project.fetch_experiments_df(\n    query='(`learning_rate`:float < 0.01) AND (`sys/name`:string MATCHES \"experiment-\\\\\\\\d+\")'\n)\n```\n\nAs a less cluttered alternative, pass a raw Python string to the `query` argument:\n\n```python\nexperiments_df = my_project.fetch_experiments_df(\n    query=r'(`learning_rate`:float < 0.01) AND (`sys/name`:string MATCHES \"experiment-\\\\d+\")'\n)\n```\n\n---\n\n#### `fetch_read_only_runs()`\n\nList runs of the project in the form of ReadOnlyRun.\n\n__Parameters:__\n\n| Name         | Type                  | Default | Description                       |\n|--------------|-----------------------|---------|-----------------------------------|\n| `with_ids`   | `Optional[List[str]]` | `None`  | List of Neptune run IDs to fetch. |\n| `custom_ids` | `Optional[List[str]]` | `None`  | List of custom run IDs to fetch.  
|\n\n__Returns:__ Iterator of ReadOnlyRun objects.\n\n__Example:__\n\n```python\nproject = ReadOnlyProject()\n\nfor run in project.fetch_read_only_runs(custom_ids=[\"nostalgic_shockley\", \"high_albattani\"]):\n    ...\n```\n\n---\n\n#### `fetch_read_only_experiments()`\n\nLists experiments of the project in the form of ReadOnlyRun.\n\n__Parameters:__\n\n| Name    | Type                  | Default | Description                        |\n|---------|-----------------------|---------|------------------------------------|\n| `names` | `Optional[List[str]]` | `None`  | List of experiment names to fetch. |\n\n__Returns:__ Iterator of ReadOnlyRun objects.\n\n__Example:__\n\n```python\nproject = ReadOnlyProject()\n\nfor run in project.fetch_read_only_experiments(names=[\"yolo-v2\", \"yolo-v3\"]):\n    ...\n```\n\n---\n\n### `ReadOnlyRun`\n\nRepresentation of a Neptune run in a limited read-only mode.\n\n#### Initialization\n\nCan be created\n\n- with the class constructor:\n\n    ```python\n    project = ReadOnlyProject()\n    run = ReadOnlyRun(project, with_id=\"TES-1\")\n    ```\n\n- or as a result of the [`fetch_read_only_runs()`](#fetch_read_only_runs) method of the `ReadOnlyProject` class:\n\n    ```python\n    for run in project.fetch_read_only_runs(\n        custom_ids=[\"nostalgic_shockley\", \"high_albattani\"]):\n        ...\n    ```\n\n__Parameters:__\n\n| Name                               | Type              | Default | Description                                                                                                                        |\n|------------------------------------|-------------------|---------|------------------------------------------------------------------------------------------------------------------------------------|\n| `read_only_project`                | `ReadOnlyProject` | -       | Project from which the run is fetched.                                                                                             
|\n| `with_id`                          | `Optional[str]`   | `None`  | ID of the Neptune run to fetch. Example: `RUN-1`. Exclusive with the `custom_id` and `experiment_name` parameters.                 |\n| `custom_id`                        | `Optional[str]`   | `None`  | Custom ID of the Neptune run to fetch. Example: `high_albattani`. Exclusive with the `with_id` and `experiment_name` parameters.   |\n| `experiment_name`                  | `Optional[str]`   | `None`  | Name of the Neptune experiment to fetch. Example: `high_albattani`. Exclusive with the `with_id` and `custom_id` parameters.       |\n| `eager_load_fields` | `Optional[bool]`  | `True`  | Whether to eagerly load the run fields definitions. If `False`, individual fields are loaded only when accessed. Default is `True`. |\n\n__Example:__\n\n```python\nfrom neptune_fetcher import ReadOnlyProject, ReadOnlyRun\n\nproject = ReadOnlyProject(\"workspace-name/project-name\", api_token=\"...\")\nrun = ReadOnlyRun(project, custom_id=\"high_albattani\")\n```\n\n---\n\n#### `.field_names`\n\nList of run field names.\n\nA _field_ is the location where a piece of metadata is stored in the run.\n\n__Returns:__ Iterator of run fields as strings.\n\n__Example:__\n\n```python\nfor run in project.fetch_read_only_runs(custom_ids=[\"nostalgic_shockley\", ...]):\n    print(list(run.field_names))\n```\n\n---\n\n#### Field lookup: `run[field_name]`\n\nUsed to access a specific field of a run. 
See [Available types](#available-types).\n\n__Returns:__ An internal object used to operate on a specific field.\n\n__Example:__\n\n```python\nrun = ReadOnlyRun(...)\ncustom_id = run[\"sys/custom_run_id\"].fetch()\n```\n\n---\n\n#### `prefetch()`\n\nPrefetches a batch of fields to the internal cache.\n\nThis speeds up subsequent access to the values of those fields.\n\nSupported Neptune field types:\n\n- [`Boolean`](#boolean)\n- [`Datetime`](#datetime)\n- [`Float`](#float)\n- [`FloatSeries`](#floatseries)\n- [`Integer`](#integer)\n- [`ObjectState`](#objectstate)\n- [`String`](#string)\n- [`StringSet`](#stringset)\n\n__Parameters:__\n\n| Name    | Type        | Default | Description                                |\n|---------|-------------|---------|--------------------------------------------|\n| `paths` | `List[str]` | -       | List of field paths to fetch to the cache. |\n\n__Example:__\n\n```python\nrun = ReadOnlyRun(...)\nrun.prefetch([\"parameters/optimizer\", \"parameters/init_lr\"])\n# No more calls to the API: the values below are read from the cache\nprint(run[\"parameters/optimizer\"].fetch())\nprint(run[\"parameters/init_lr\"].fetch())\n```\n\n---\n\n#### `prefetch_series_values()`\n\nPrefetches a batch of series to the internal cache. Attributes that don't exist are skipped.\n\nThis speeds up subsequent access to the series values. Works only for series ([`FloatSeries`](#floatseries)).\n\nTo speed up the fetching process, this method can use multithreading.\nTo enable it, set the `use_threads` parameter to `True`.\n\nBy default, the maximum number of workers is 10. 
You can change this number by setting the `NEPTUNE_FETCHER_MAX_WORKERS`\nenvironment variable.\n\n__Parameters:__\n\n| Name                | Type                  | Default        | Description                                                                                                                                                                                                                                                                                      |\n|---------------------|-----------------------|----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `paths`             | `List[str]`           | -              | List of paths to prefetch to the internal cache.                                                                                                                                                                                                                                                 |\n| `use_threads`       | `bool`                | `False`        | Whether to use threads to fetch the data.                                                                                                                                                                                                                                                        |\n| `progress_bar`      | `bool`                | `True`         | Set to `False` to disable the download progress bar.                                                                                                                                                                                                                                             
|\n| `include_inherited` | `bool`                | `True`         | If `True` (default), values inherited from ancestor runs are included. To fetch values only from the current run, set to `False`.                                                                                                                                                                |\n| `step_range`        | `tuple[float, float]` | `(None, None)` | Limits the range of steps to fetch. This must be a 2-tuple: <br> - `left`: The left boundary of the range (inclusive). If `None`, the range extends indefinitely on the left.<br>- `right`: The right boundary of the range (inclusive). If `None`, the range extends indefinitely on the right. |\n\n__Example:__\n\n```python\nrun.prefetch_series_values([\"metrics/loss\", \"metrics/accuracy\"])\n# No more calls to the API: the values below are read from the cache\nprint(run[\"metrics/loss\"].fetch_values())\nprint(run[\"metrics/accuracy\"].fetch_values())\n```\n\n## Available types\n\nThis section lists the available field types and data retrieval operations.\n\n---\n\n### `Boolean`\n\n#### `fetch()`\n\nRetrieves a `bool` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\nstatus = run[\"sys/failed\"].fetch()\n```\n\n---\n\n### `Datetime`\n\n#### `fetch()`\n\nRetrieves a `datetime.datetime` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\ncreated_at = run[\"sys/creation_time\"].fetch()\n```\n\n---\n\n### `Float`\n\n#### `fetch()`\n\nRetrieves a `float` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\nf1 = run[\"scores/f1\"].fetch()\n```\n\n---\n\n### `FloatSeries`\n\n#### `fetch()` or `fetch_last()`\n\nRetrieves the last value of a series, either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Returns:__ `Optional[float]`\n\n__Example:__\n\n```python\nloss = 
run[\"loss\"].fetch_last()\n```\n\n#### `fetch_values()`\n\nRetrieves all series values either from the internal cache (see [`prefetch_series_values()`](#prefetch_series_values))\nor from the API.\n\n__Parameters:__\n\n| Name                | Type                  | Default        | Description                                                                                                                                                                                                                                                                                      |\n|---------------------|-----------------------|----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `include_timestamp` | `bool`                | `True`         | Whether the fetched data should include the timestamp field.                                                                                                                                                                                                                                     |\n| `include_inherited` | `bool`                | `True`         | If `True` (default), values inherited from ancestor runs are included. To fetch values only from the current run, set to `False`.                                                                                                                                                                |\n| `progress_bar`      | `bool`                | `True`         | Set to `False` to disable the download progress bar.                                                                                                                                                                                                                                             |\n| `step_range`        | `tuple[float, float]` | `(None, None)` | Limits the range of steps to fetch. This must be a 2-tuple: <br> - `left`: The left boundary of the range (inclusive). If `None`, the range extends indefinitely on the left.<br>- `right`: The right boundary of the range (inclusive). If `None`, the range extends indefinitely on the right. 
|\n\n__Returns:__ `pandas.DataFrame`\n\n__Example:__\n\n```python\nvalues = run[\"loss\"].fetch_values()\n```\n\n---\n\n### `Integer`\n\n#### `fetch()`\n\nRetrieves an `int` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\nbatch_size = run[\"batch_size\"].fetch()\n```\n\n---\n\n### `ObjectState`\n\n#### `fetch()`\n\nRetrieves the state of a run either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Returns:__ `str`\n\n> [!NOTE]\n> The state can be **active** or **inactive**. It indicates whether new data was recently logged to the run.\n> To learn more about this field, see [System namespace: State](https://docs.neptune.ai/api/sys/#state) in the Neptune\n> docs.\n\n__Example:__\n\n```python\nstate = run[\"sys/state\"].fetch()\n```\n\n---\n\n### `String`\n\n#### `fetch()`\n\nRetrieves a `str` value either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\ntoken = run[\"token\"].fetch()\n```\n\n---\n\n### `StringSet`\n\n#### `fetch()`\n\nRetrieves a `set` of `str` values either from the internal cache (see [`prefetch()`](#prefetch)) or from the API.\n\n__Example:__\n\n```python\ngroups = run[\"sys/group_tags\"].fetch()\n```\n\n## License\n\nThis project is licensed under the Apache License, Version 2.0. For more details,\nsee [Apache License Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).\n\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Neptune Fetcher",
    "version": "0.12.0",
    "project_urls": {
        "Documentation": "https://docs.neptune.ai/",
        "Homepage": "https://neptune.ai/",
        "Repository": "https://github.com/neptune-ai/neptune-fetcher",
        "Tracker": "https://github.com/neptune-ai/neptune-fetcher/issues"
    },
    "split_keywords": [
        "mlops",
        " ml experiment tracking",
        " ml model registry",
        " ml model store",
        " ml metadata store"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "18dfda88d6d9a4752c4f719e2e5dc4832f74ae14aaefab951449370f266cc33e",
                "md5": "f210b29bc755447a48a96018030ab0ef",
                "sha256": "24a752bffc7cef09470864114bbccf7eb85882f8a2d9bf93318ab54dc455f94b"
            },
            "downloads": -1,
            "filename": "neptune_fetcher-0.12.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f210b29bc755447a48a96018030ab0ef",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 36343,
            "upload_time": "2025-01-21T08:22:01",
            "upload_time_iso_8601": "2025-01-21T08:22:01.107581Z",
            "url": "https://files.pythonhosted.org/packages/18/df/da88d6d9a4752c4f719e2e5dc4832f74ae14aaefab951449370f266cc33e/neptune_fetcher-0.12.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "59445ff1b11f5b55cd973e9bcdeec8821e083c0dab700749fb79dda703c98f20",
                "md5": "0f45feec8f9f567fb1cdc9402f21357f",
                "sha256": "71b450aeabe7a874a876bfc41eb3f51c64ac6fa67f66edc1626a2093b13637cf"
            },
            "downloads": -1,
            "filename": "neptune_fetcher-0.12.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0f45feec8f9f567fb1cdc9402f21357f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 35637,
            "upload_time": "2025-01-21T08:22:02",
            "upload_time_iso_8601": "2025-01-21T08:22:02.662069Z",
            "url": "https://files.pythonhosted.org/packages/59/44/5ff1b11f5b55cd973e9bcdeec8821e083c0dab700749fb79dda703c98f20/neptune_fetcher-0.12.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-21 08:22:02",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "neptune-ai",
    "github_project": "neptune-fetcher",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "neptune-fetcher"
}
        