fasttext-numpy2-wheel


:Name: fasttext-numpy2-wheel
:Version: 0.9.2
:Home page: https://github.com/facebookresearch/fastText
:Summary: fasttext Python bindings
:Upload time: 2024-11-05 13:54:19
:Author: Onur Celebi
:License: MIT

fastText |CircleCI|
===================

`fastText <https://fasttext.cc/>`__ is a library for efficient learning
of word representations and sentence classification.

This document describes how to use fastText from Python.

Table of contents
-----------------

-  `Requirements <#requirements>`__
-  `Installation <#installation>`__
-  `Usage overview <#usage-overview>`__
-  `Word representation model <#word-representation-model>`__
-  `Text classification model <#text-classification-model>`__
-  `IMPORTANT: Preprocessing data / encoding
   conventions <#important-preprocessing-data-encoding-conventions>`__
-  `More examples <#more-examples>`__
-  `API <#api>`__
-  `train_unsupervised parameters <#train_unsupervised-parameters>`__
-  `train_supervised parameters <#train_supervised-parameters>`__
-  `model object <#model-object>`__

Requirements
============

`fastText <https://fasttext.cc/>`__ builds on modern Mac OS and Linux
distributions. Since it uses C++11 features, it requires a compiler with
good C++11 support. You will need `Python <https://www.python.org/>`__
(version 2.7 or ≥ 3.4), `NumPy <http://www.numpy.org/>`__ &
`SciPy <https://www.scipy.org/>`__ and
`pybind11 <https://github.com/pybind/pybind11>`__.

Installation
============

To install the latest release, run:

.. code:: bash

    $ pip install fasttext

Alternatively, to get the latest development version of fastText, you can
install it from our GitHub repository:

.. code:: bash

    $ git clone https://github.com/facebookresearch/fastText.git
    $ cd fastText
    $ sudo pip install .
    $ # or :
    $ sudo python setup.py install

Usage overview
==============

Word representation model
-------------------------

In order to learn word vectors, as `described
here <https://fasttext.cc/docs/en/references.html#enriching-word-vectors-with-subword-information>`__,
we can use the ``fasttext.train_unsupervised`` function like this:

.. code:: py

    import fasttext

    # Skipgram model :
    model = fasttext.train_unsupervised('data.txt', model='skipgram')

    # or, cbow model :
    model = fasttext.train_unsupervised('data.txt', model='cbow')

where ``data.txt`` is a training file containing UTF-8 encoded text.

The returned ``model`` object represents your learned model, and you can
use it to retrieve information.

.. code:: py

    print(model.words)   # list of words in dictionary
    print(model['king']) # get the vector of the word 'king'

Saving and loading a model object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can save your trained model object by calling the function
``save_model``.

.. code:: py

    model.save_model("model_filename.bin")

You can retrieve it later with the ``load_model`` function:

.. code:: py

    model = fasttext.load_model("model_filename.bin")

For more information about word representation usage of fasttext, you
can refer to our `word representations
tutorial <https://fasttext.cc/docs/en/unsupervised-tutorial.html>`__.

Text classification model
-------------------------

In order to train a text classifier using the method `described
here <https://fasttext.cc/docs/en/references.html#bag-of-tricks-for-efficient-text-classification>`__,
we can use the ``fasttext.train_supervised`` function like this:

.. code:: py

    import fasttext

    model = fasttext.train_supervised('data.train.txt')

where ``data.train.txt`` is a text file containing one training sentence
per line along with its labels. By default, we assume that labels are
words prefixed by the string ``__label__``.
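As a rough illustration of this format, the sketch below builds training lines from (labels, text) pairs. The helper name ``format_example`` and the sample labels are hypothetical; only the ``__label__`` prefix comes from the description above.

```python
def format_example(labels, text, prefix="__label__"):
    """Return one training line: prefixed labels followed by the sentence."""
    # Newlines delimit examples, so collapse any whitespace runs (including
    # newlines) inside the text into single spaces.
    sentence = " ".join(text.split())
    tagged = " ".join(prefix + label for label in labels)
    return f"{tagged} {sentence}"


# Hypothetical labeled examples, for illustration only.
examples = [
    (["baking"], "Which baking dish is best to bake a banana bread ?"),
    (["equipment", "cleaning"], "Why not put knives in the dishwasher?"),
]
lines = [format_example(labels, text) for labels, text in examples]
print("\n".join(lines))
```

Writing ``lines`` to a file, one per line, should give a file in the shape ``train_supervised`` expects.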

Once the model is trained, we can retrieve the list of words and labels:

.. code:: py

    print(model.words)
    print(model.labels)

To evaluate our model by computing the precision at 1 (P@1) and the
recall on a test set, we use the ``test`` function:

.. code:: py

    def print_results(N, p, r):
        print("N\t" + str(N))
        print("P@{}\t{:.3f}".format(1, p))
        print("R@{}\t{:.3f}".format(1, r))

    print_results(*model.test('test.txt'))
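To make the two metrics concrete, here is a minimal sketch of how precision and recall at ``k`` can be computed from ranked predictions. It assumes the usual micro-averaged definitions (correct predictions divided by the number of predictions, and by the number of true labels); the function name and the sample data are hypothetical and this is not taken from the fastText source.

```python
def precision_recall_at_k(predicted, gold, k=1):
    """predicted: ranked label lists; gold: sets of true labels, one per example."""
    n_correct = n_predicted = n_gold = 0
    for ranked, truth in zip(predicted, gold):
        top_k = ranked[:k]
        n_correct += sum(1 for label in top_k if label in truth)
        n_predicted += len(top_k)
        n_gold += len(truth)
    return n_correct / n_predicted, n_correct / n_gold


# Made-up predictions and gold labels for three examples.
predicted = [["__label__a"], ["__label__b"], ["__label__a"]]
gold = [{"__label__a"}, {"__label__a", "__label__b"}, {"__label__c"}]

p, r = precision_recall_at_k(predicted, gold, k=1)
print("P@1\t{:.3f}".format(p))
print("R@1\t{:.3f}".format(r))
```

With multi-label data, recall at 1 is typically lower than precision at 1, because a single prediction cannot cover several true labels.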

We can also predict labels for a specific text:

.. code:: py

    model.predict("Which baking dish is best to bake a banana bread ?")

By default, ``predict`` returns only one label: the one with the
highest probability. You can also predict more than one label by
specifying the parameter ``k``:

.. code:: py

    model.predict("Which baking dish is best to bake a banana bread ?", k=3)
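The result of ``predict`` is a pair: the labels and their corresponding probabilities. The sketch below pairs them up and strips the label prefix. The helper ``top_labels`` is hypothetical, and ``StubModel`` with its made-up scores only stands in for a trained model so that the example runs on its own.

```python
def top_labels(model, text, k=1, prefix="__label__"):
    """Return [(label_without_prefix, probability), ...] for one text."""
    labels, probs = model.predict(text, k=k)
    pairs = []
    for label, prob in zip(labels, probs):
        if label.startswith(prefix):
            label = label[len(prefix):]
        pairs.append((label, float(prob)))
    return pairs


class StubModel:
    """Stand-in for a trained model; the labels and scores are invented."""

    def predict(self, text, k=1):
        labels = ("__label__baking", "__label__equipment", "__label__bread")
        probs = (0.7, 0.2, 0.1)
        return labels[:k], probs[:k]


result = top_labels(StubModel(), "Which baking dish is best ?", k=2)
print(result)
```

With a real model, you would pass the object returned by ``train_supervised`` or ``load_model`` instead of ``StubModel()``.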

If you want to predict labels for more than one sentence, you can pass a
list of strings:

.. code:: py

    model.predict(["Which baking dish is best to bake a banana bread ?", "Why not put knives in the dishwasher?"], k=3)

Of course, you can also save and load a model to/from a file as `in the
word representation usage <#saving-and-loading-a-model-object>`__.

For more information about text classification usage of fasttext, you
can refer to our `text classification
tutorial <https://fasttext.cc/docs/en/supervised-tutorial.html>`__.

Compress model files with quantization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When you want to save a supervised model file, fastText can compress it
to produce a much smaller model file while sacrificing only a little
bit of performance.

.. code:: py

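    # With the previously trained `model` object, call:
    model.quantize(input='data.train.txt', retrain=True)

    # Then display results and save the new model. `valid_data` is the path
    # to a held-out validation file, and `print_results` is the helper
    # defined in the evaluation example above.
    print_results(*model.test(valid_data))
    model.save_model("model_filename.ftz")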

``model_filename.ftz`` will have a much smaller size than
``model_filename.bin``.
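One simple way to check the effect is to compare the two files on disk. The sketch below uses only the standard library; the helper ``size_ratio`` is hypothetical, and the demo writes dummy files of arbitrary sizes instead of real models.

```python
import os
import tempfile


def size_ratio(original_path, compressed_path):
    """Return compressed size divided by original size."""
    return os.path.getsize(compressed_path) / os.path.getsize(original_path)


# Dummy files stand in for real model files; the sizes are arbitrary.
with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "model_filename.bin")
    ftz_path = os.path.join(tmp, "model_filename.ftz")
    with open(bin_path, "wb") as f:
        f.write(b"\0" * 1000)
    with open(ftz_path, "wb") as f:
        f.write(b"\0" * 100)
    ratio = size_ratio(bin_path, ftz_path)

print("compressed file is {:.0%} of the original".format(ratio))
```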

For further reading on quantization, you can refer to `this paragraph
from our blog
post <https://fasttext.cc/blog/2017/10/02/blog-post.html#model-compression>`__.

IMPORTANT: Preprocessing data / encoding conventions
----------------------------------------------------

In general, it is important to properly preprocess your data. In
particular, our example scripts in the `root
folder <https://github.com/facebookresearch/fastText>`__ do this.

fastText assumes UTF-8 encoded text. All text must be `unicode for
Python2 <https://docs.python.org/2/library/functions.html#unicode>`__
and `str for
Python3 <https://docs.python.org/3.5/library/stdtypes.html#textseq>`__.
The passed text will be `encoded as UTF-8 by
pybind11 <https://pybind11.readthedocs.io/en/master/advanced/cast/strings.html?highlight=utf-8#strings-bytes-and-unicode-conversions>`__
before being passed to the fastText C++ library. This means it is important to
use UTF-8 encoded text when building a model. On Unix-like systems you
can convert text using `iconv <https://en.wikipedia.org/wiki/Iconv>`__.
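If ``iconv`` is not available, the same conversion can be sketched in Python. The helper ``convert_to_utf8`` is hypothetical, and the source encoding (Latin-1 here) is an assumption you would replace with the actual encoding of your data.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, src_encoding):
    """Re-encode a text file to UTF-8, leaving line endings untouched."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Demo on a small Latin-1 file created on the fly.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "latin1.txt")
    dst = os.path.join(tmp, "utf8.txt")
    with open(src, "wb") as f:
        f.write("café crème\n".encode("latin-1"))
    convert_to_utf8(src, dst, "latin-1")
    with open(dst, "rb") as f:
        converted = f.read()

print(converted.decode("utf-8"), end="")
```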

fastText will tokenize (split text into pieces) based on the following
ASCII characters (bytes). In particular, it is not aware of UTF-8
whitespace. We advise the user to convert UTF-8 whitespace / word
boundaries into one of the following symbols as appropriate.

-  space
-  tab
-  vertical tab
-  carriage return
-  formfeed
-  the null character
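A minimal sketch of such a conversion is shown below. It replaces any character that Python considers whitespace, but that is not in the separator set, with a plain ASCII space. The helper name is hypothetical, and ``str.isspace`` is used here only as a convenient approximation of "Unicode whitespace".

```python
# The separators listed above, plus the newline used to delimit lines.
ASCII_SEPARATORS = {" ", "\t", "\v", "\r", "\f", "\0", "\n"}


def normalize_whitespace(text):
    """Replace whitespace outside the separator set with an ASCII space."""
    return "".join(
        " " if ch.isspace() and ch not in ASCII_SEPARATORS else ch
        for ch in text
    )


# U+00A0 is a no-break space, U+3000 is an ideographic space.
sample = "banana\u00a0bread\u3000recipe"
print(normalize_whitespace(sample).split(" "))
```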

The newline character is used to delimit lines of text. In particular,
the EOS token is appended to a line of text if a newline character is
encountered. The only exception is if the number of tokens exceeds the
MAX\_LINE\_SIZE constant as defined in the `Dictionary
header <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.h>`__.
This means that if you have text that is not separated by newlines, such as
the `fil9 dataset <http://mattmahoney.net/dc/textdata>`__, it will be
broken into chunks of MAX\_LINE\_SIZE tokens and the EOS token will
not be appended.
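The behaviour described above can be approximated in a few lines of Python. This is a simplified sketch, not the actual C++ logic: the function name is hypothetical, the EOS string ``</s>`` and the default limit of 1024 are assumptions about the constants in the Dictionary sources, and the exact boundary condition may differ from the real implementation.

```python
import re

# Separators: space, tab, vertical tab, carriage return, formfeed, null.
SEPARATORS = re.compile("[ \t\x0b\r\x0c\x00]+")
EOS = "</s>"  # assumed value of the EOS token


def read_lines(text, max_line_size=1024):
    """Yield lists of tokens, one list per line or per chunk."""
    tokens = []
    for piece in re.split(r"(\n)", text):
        if piece == "\n":
            # A newline ends the line and contributes the EOS token.
            tokens.append(EOS)
            yield tokens
            tokens = []
            continue
        for tok in SEPARATORS.split(piece):
            if not tok:
                continue
            if len(tokens) == max_line_size:
                # Chunk is full: emit it without an EOS token.
                yield tokens
                tokens = []
            tokens.append(tok)
    if tokens:
        yield tokens


for chunk in read_lines("a b\nc d e f", max_line_size=2):
    print(chunk)
```

In the demo, the first line ends with a newline and so receives the EOS token, while the remaining text is cut into two-token chunks with no EOS.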

The length of a token is counted in UTF-8 characters rather than bytes:
the `leading two bits of a
byte <https://en.wikipedia.org/wiki/UTF-8#Description>`__ are used to identify
`subsequent bytes of a multi-byte
sequence <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.cc>`__.
Knowing this is especially important when choosing the minimum and
maximum length of subwords. Further, the EOS token (as specified in the
`Dictionary
header <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.h>`__)
is considered a character and will not be broken into subwords.
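The following sketch shows why this matters for ``minn`` and ``maxn``. It counts characters by skipping UTF-8 continuation bytes (those whose leading two bits are ``10``) and extracts character n-grams on character boundaries. Both function names are hypothetical and the n-gram extraction is deliberately simplified compared with the C++ code.

```python
def utf8_length(token):
    """Count UTF-8 characters by ignoring continuation bytes (10xxxxxx)."""
    return sum(1 for b in token.encode("utf-8") if (b & 0xC0) != 0x80)


def char_ngrams(word, minn, maxn):
    """Return character n-grams of length minn..maxn, cut on character starts."""
    data = word.encode("utf-8")
    # Byte offsets where a character starts, plus the end of the buffer.
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    starts.append(len(data))
    n_chars = len(starts) - 1
    ngrams = []
    for i in range(n_chars):
        for n in range(minn, maxn + 1):
            if i + n > n_chars:
                break
            ngrams.append(data[starts[i]:starts[i + n]].decode("utf-8"))
    return ngrams


word = "naïve"
print(utf8_length(word), len(word.encode("utf-8")))
print(char_ngrams("<é>", 2, 3))
```

The word ``naïve`` has five characters but six bytes, so subword lengths chosen in bytes would not match what fastText counts.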

More examples
-------------

To learn more about fastText models, please see
the main
`README <https://github.com/facebookresearch/fastText/blob/master/README.md>`__
and in particular `the tutorials on our
website <https://fasttext.cc/docs/en/supervised-tutorial.html>`__.

You can find further Python examples in `the doc
folder <https://github.com/facebookresearch/fastText/tree/master/python/doc/examples>`__.

As with any package you can get help on any Python function using the
help function.

For example:

::

    >>> import fasttext
    >>> help(fasttext.FastText)

    Help on module fasttext.FastText in fasttext:

    NAME
        fasttext.FastText

    DESCRIPTION
        # Copyright (c) 2017-present, Facebook, Inc.
        # All rights reserved.
        #
        # This source code is licensed under the MIT license found in the
        # LICENSE file in the root directory of this source tree.

    FUNCTIONS
        load_model(path)
            Load a model given a filepath and return a model object.

        tokenize(text)
            Given a string of text, tokenize it and return a list of tokens
    [...]

API
===

``train_unsupervised`` parameters
---------------------------------

.. code:: python

        input             # training file path (required)
        model             # unsupervised fasttext model {cbow, skipgram} [skipgram]
        lr                # learning rate [0.05]
        dim               # size of word vectors [100]
        ws                # size of the context window [5]
        epoch             # number of epochs [5]
        minCount          # minimal number of word occurrences [5]
        minn              # min length of char ngram [3]
        maxn              # max length of char ngram [6]
        neg               # number of negatives sampled [5]
        wordNgrams        # max length of word ngram [1]
        loss              # loss function {ns, hs, softmax, ova} [ns]
        bucket            # number of buckets [2000000]
        thread            # number of threads [number of cpus]
        lrUpdateRate      # change the rate of updates for the learning rate [100]
        t                 # sampling threshold [0.0001]
        verbose           # verbose [2]
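One way to work with these parameters is to keep the documented defaults in a dictionary and override only what you need. The sketch below is a convenience wrapper, not part of fastText: the names ``UNSUPERVISED_DEFAULTS`` and ``unsupervised_args`` are hypothetical, and ``thread`` is left out of the defaults because its default depends on the machine.

```python
# Defaults copied from the parameter list above (thread omitted on purpose).
UNSUPERVISED_DEFAULTS = {
    "model": "skipgram", "lr": 0.05, "dim": 100, "ws": 5, "epoch": 5,
    "minCount": 5, "minn": 3, "maxn": 6, "neg": 5, "wordNgrams": 1,
    "loss": "ns", "bucket": 2000000, "lrUpdateRate": 100, "t": 0.0001,
    "verbose": 2,
}


def unsupervised_args(**overrides):
    """Merge overrides into the defaults, rejecting misspelled names."""
    unknown = set(overrides) - set(UNSUPERVISED_DEFAULTS) - {"thread"}
    if unknown:
        raise ValueError("unknown parameter(s): " + ", ".join(sorted(unknown)))
    args = dict(UNSUPERVISED_DEFAULTS)
    args.update(overrides)
    return args


args = unsupervised_args(model="cbow", dim=300, minn=2)
print(args["model"], args["dim"], args["minn"], args["lr"])
```

The resulting dictionary could then be passed as keyword arguments, for example ``fasttext.train_unsupervised('data.txt', **args)``.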

``train_supervised`` parameters
-------------------------------

.. code:: python

        input             # training file path (required)
        lr                # learning rate [0.1]
        dim               # size of word vectors [100]
        ws                # size of the context window [5]
        epoch             # number of epochs [5]
        minCount          # minimal number of word occurrences [1]
        minCountLabel     # minimal number of label occurrences [1]
        minn              # min length of char ngram [0]
        maxn              # max length of char ngram [0]
        neg               # number of negatives sampled [5]
        wordNgrams        # max length of word ngram [1]
        loss              # loss function {ns, hs, softmax, ova} [softmax]
        bucket            # number of buckets [2000000]
        thread            # number of threads [number of cpus]
        lrUpdateRate      # change the rate of updates for the learning rate [100]
        t                 # sampling threshold [0.0001]
        label             # label prefix ['__label__']
        verbose           # verbose [2]
        pretrainedVectors # pretrained word vectors (.vec file) for supervised learning []

``model`` object
----------------

The ``train_supervised``, ``train_unsupervised`` and ``load_model``
functions return an instance of the ``_FastText`` class, which we generally
call the ``model`` object.

This object exposes the training arguments as properties: ``lr``,
``dim``, ``ws``, ``epoch``, ``minCount``, ``minCountLabel``, ``minn``,
``maxn``, ``neg``, ``wordNgrams``, ``loss``, ``bucket``, ``thread``,
``lrUpdateRate``, ``t``, ``label``, ``verbose``, ``pretrainedVectors``.
So ``model.wordNgrams`` will give you the max length of word ngram used
for training this model.

In addition, the object exposes several functions:

.. code:: python

        get_dimension           # Get the dimension (size) of a lookup vector (hidden layer).
                                # This is equivalent to `dim` property.
        get_input_vector        # Given an index, get the corresponding vector of the Input Matrix.
        get_input_matrix        # Get a copy of the full input matrix of a Model.
        get_labels              # Get the entire list of labels of the dictionary
                                # This is equivalent to `labels` property.
        get_line                # Split a line of text into words and labels.
        get_output_matrix       # Get a copy of the full output matrix of a Model.
        get_sentence_vector     # Given a string, get a single vector representation. This function
                                # assumes to be given a single line of text. We split words on
                                # whitespace (space, newline, tab, vertical tab) and the control
                                # characters carriage return, formfeed and the null character.
        get_subword_id          # Given a subword, return the index (within input matrix) it hashes to.
        get_subwords            # Given a word, get the subwords and their indices.
        get_word_id             # Given a word, get the word id within the dictionary.
        get_word_vector         # Get the vector representation of word.
        get_words               # Get the entire list of words of the dictionary
                                # This is equivalent to `words` property.
        is_quantized            # whether the model has been quantized
        predict                 # Given a string, get a list of labels and a list of corresponding probabilities.
        quantize                # Quantize the model, reducing the size of the model and its memory footprint.
        save_model              # Save the model to the given path
        test                    # Evaluate supervised model using file given by path
        test_label              # Return the precision and recall score for each label.

The ``words`` and ``labels`` properties return the words and labels from
the dictionary:

.. code:: py

    model.words         # equivalent to model.get_words()
    model.labels        # equivalent to model.get_labels()

The object implements the ``__getitem__`` and ``__contains__`` methods in
order to return the representation of a word and to check whether a word is
in the vocabulary.

.. code:: py

    model['king']       # equivalent to model.get_word_vector('king')
    'king' in model     # equivalent to `'king' in model.get_words()`
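To illustrate the delegation pattern behind these two operators, here is a toy class that mimics it. ``ToyModel`` is purely hypothetical and is not the ``_FastText`` implementation; in particular, it raises ``KeyError`` for unknown words, which a real model may not do.

```python
class ToyModel:
    """Toy stand-in showing how indexing and `in` can delegate to methods."""

    def __init__(self, vectors):
        self._vectors = dict(vectors)

    def get_words(self):
        return list(self._vectors)

    def get_word_vector(self, word):
        return self._vectors[word]

    def __getitem__(self, word):
        # model[word] delegates to get_word_vector(word).
        return self.get_word_vector(word)

    def __contains__(self, word):
        # `word in model` delegates to a lookup in get_words().
        return word in self.get_words()


toy = ToyModel({"king": [0.1, 0.2], "queen": [0.3, 0.4]})
print(toy["king"], "king" in toy, "jester" in toy)
```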

Join the fastText community
---------------------------

-  `Facebook page <https://www.facebook.com/groups/1174547215919768>`__
-  `Stack
   overflow <https://stackoverflow.com/questions/tagged/fasttext>`__
-  `Google
   group <https://groups.google.com/forum/#!forum/fasttext-library>`__
-  `GitHub <https://github.com/facebookresearch/fastText>`__

.. |CircleCI| image:: https://circleci.com/gh/facebookresearch/fastText/tree/master.svg?style=svg
   :target: https://circleci.com/gh/facebookresearch/fastText/tree/master

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/facebookresearch/fastText",
    "name": "fasttext-numpy2-wheel",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Onur Celebi",
    "author_email": "celebio@fb.com",
    "download_url": "https://files.pythonhosted.org/packages/c0/8a/8c72c92cd91146ce9948421f3967bc2775deac5215eb8fa3b0487020881f/fasttext_numpy2_wheel-0.9.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "fasttext Python bindings",
    "version": "0.9.2",
    "project_urls": {
        "Homepage": "https://github.com/facebookresearch/fastText"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "38e4e9b6407538100ced67ddde5941a071ef01482badd023ab8ebcc0d5e6797b",
                "md5": "b4ee9c28dc03cd0d91f37698873f49ed",
                "sha256": "5432ebdbac8cb90d91305f9d985bfd64fbd0d30a8f27de9a508c926e47d33051"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl",
            "has_sig": false,
            "md5_digest": "b4ee9c28dc03cd0d91f37698873f49ed",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 4557717,
            "upload_time": "2024-11-05T13:54:10",
            "upload_time_iso_8601": "2024-11-05T13:54:10.179665Z",
            "url": "https://files.pythonhosted.org/packages/38/e4/e9b6407538100ced67ddde5941a071ef01482badd023ab8ebcc0d5e6797b/fasttext_numpy2_wheel-0.9.2-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5c45bd33da5fbcbff711dcbd1d20f28329ece7a95b707ac32709df22a8a02349",
                "md5": "51d8e412dcb42aa00faf3976c3b3ba88",
                "sha256": "d47fc7b0f7a4410eb45437397803f9be217240baba660c1cdc8b493365dd1421"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "51d8e412dcb42aa00faf3976c3b3ba88",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 4646860,
            "upload_time": "2024-11-05T13:54:07",
            "upload_time_iso_8601": "2024-11-05T13:54:07.444153Z",
            "url": "https://files.pythonhosted.org/packages/5c/45/bd33da5fbcbff711dcbd1d20f28329ece7a95b707ac32709df22a8a02349/fasttext_numpy2_wheel-0.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "92a44dce97deb92057c2a71fc5f0d26a28507ecf8c6c64cd93f9edbd410a4caa",
                "md5": "6c57fdf76b91e2cfa78459bc4027685d",
                "sha256": "ef2f5caf30f7186eb08d3fc3ef598895aa534fc3a1cc31150904aaacee8a3c80"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl",
            "has_sig": false,
            "md5_digest": "6c57fdf76b91e2cfa78459bc4027685d",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 4578625,
            "upload_time": "2024-11-05T13:54:15",
            "upload_time_iso_8601": "2024-11-05T13:54:15.645928Z",
            "url": "https://files.pythonhosted.org/packages/92/a4/4dce97deb92057c2a71fc5f0d26a28507ecf8c6c64cd93f9edbd410a4caa/fasttext_numpy2_wheel-0.9.2-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "34e7c664619e42a38f0064f8d9100b76b3bb57bf2679360dece5cf245a5b980e",
                "md5": "747d8f0d157d0f182219729b9872dbfe",
                "sha256": "21ca228aad6c75349e1ccb231fa454b4ec5bdc2f8a510718d2b74b5742c65f20"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "747d8f0d157d0f182219729b9872dbfe",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 4669758,
            "upload_time": "2024-11-05T13:54:09",
            "upload_time_iso_8601": "2024-11-05T13:54:09.381068Z",
            "url": "https://files.pythonhosted.org/packages/34/e7/c664619e42a38f0064f8d9100b76b3bb57bf2679360dece5cf245a5b980e/fasttext_numpy2_wheel-0.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "df13f8a21e5aa6c0a3e382c2346322b6d70fe1435c85cd57c8913b56fc72441d",
                "md5": "f59ccb782bd7d0cb4c6093906cf2553f",
                "sha256": "e1d50a174457a11d9317c5eaef007d0971a1baed4b15b8f5c404d69ba7ed2346"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl",
            "has_sig": false,
            "md5_digest": "f59ccb782bd7d0cb4c6093906cf2553f",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": null,
            "size": 4583847,
            "upload_time": "2024-11-05T13:54:19",
            "upload_time_iso_8601": "2024-11-05T13:54:19.266121Z",
            "url": "https://files.pythonhosted.org/packages/df/13/f8a21e5aa6c0a3e382c2346322b6d70fe1435c85cd57c8913b56fc72441d/fasttext_numpy2_wheel-0.9.2-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d49da78dfd3e996f5d3c91bcd14a4e249a90da038b99287fda43211281724fdb",
                "md5": "8812b53c138906c38aa192507c4c21df",
                "sha256": "3e70c79c9ca670f36fcf639eea87bf44adff4082a966c62053747a7dee0f2898"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "8812b53c138906c38aa192507c4c21df",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": null,
            "size": 4686609,
            "upload_time": "2024-11-05T13:54:11",
            "upload_time_iso_8601": "2024-11-05T13:54:11.194367Z",
            "url": "https://files.pythonhosted.org/packages/d4/9d/a78dfd3e996f5d3c91bcd14a4e249a90da038b99287fda43211281724fdb/fasttext_numpy2_wheel-0.9.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7b0656b18b5f7e5d75497d2a5db654078384b8ca25caffbb22ab827f7904e540",
                "md5": "8dceb8ab54ad09f421fafad722033a4a",
                "sha256": "74e6f4d6c5453a7cd71ac61766c70797165e42de99458c7ce5b3310ce661531a"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl",
            "has_sig": false,
            "md5_digest": "8dceb8ab54ad09f421fafad722033a4a",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": null,
            "size": 4558804,
            "upload_time": "2024-11-05T13:54:23",
            "upload_time_iso_8601": "2024-11-05T13:54:23.524985Z",
            "url": "https://files.pythonhosted.org/packages/7b/06/56b18b5f7e5d75497d2a5db654078384b8ca25caffbb22ab827f7904e540/fasttext_numpy2_wheel-0.9.2-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "12eab9d18381f9c27fa2a3e052b8737dde1d53960dfbb650e489be0f77d19993",
                "md5": "86db755bfca18aa023520665f47232a6",
                "sha256": "5c7baab6f9419c99c4a69b01c06886314d147ea547c7be87e1ba68d37316bb1c"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "86db755bfca18aa023520665f47232a6",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": null,
            "size": 4650883,
            "upload_time": "2024-11-05T13:54:15",
            "upload_time_iso_8601": "2024-11-05T13:54:15.584987Z",
            "url": "https://files.pythonhosted.org/packages/12/ea/b9d18381f9c27fa2a3e052b8737dde1d53960dfbb650e489be0f77d19993/fasttext_numpy2_wheel-0.9.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "826d10b91f802316ae538a4cc781e6f4896139728a9975ca7c8bbd02c335ba71",
                "md5": "e73bef591aadbe79ba84928c20af2ff0",
                "sha256": "7f612905192f23be4ff883ece3c1ee44fca52f0c308c2bc4f2e4b6c0e86d99a2"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl",
            "has_sig": false,
            "md5_digest": "e73bef591aadbe79ba84928c20af2ff0",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 4552775,
            "upload_time": "2024-11-05T13:54:26",
            "upload_time_iso_8601": "2024-11-05T13:54:26.259077Z",
            "url": "https://files.pythonhosted.org/packages/82/6d/10b91f802316ae538a4cc781e6f4896139728a9975ca7c8bbd02c335ba71/fasttext_numpy2_wheel-0.9.2-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "75fa917d1437a70f6a4ef69c44025d6ab1f13c40903956ed203993fe1709a48e",
                "md5": "4a06420c8969acf413c9ff7f96b966fe",
                "sha256": "e50ab4f2ddd050a42fe7d634fb6237e855739e6d186aa9e6f879f0606bff8256"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "4a06420c8969acf413c9ff7f96b966fe",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 4645625,
            "upload_time": "2024-11-05T13:54:16",
            "upload_time_iso_8601": "2024-11-05T13:54:16.943668Z",
            "url": "https://files.pythonhosted.org/packages/75/fa/917d1437a70f6a4ef69c44025d6ab1f13c40903956ed203993fe1709a48e/fasttext_numpy2_wheel-0.9.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c08a8c72c92cd91146ce9948421f3967bc2775deac5215eb8fa3b0487020881f",
                "md5": "2257b327976fe58e958ba087648f1762",
                "sha256": "484bb7efb0d07c5b6235c8ab44d5e7ddcde5727a25dbc04d18175f777c1e799c"
            },
            "downloads": -1,
            "filename": "fasttext_numpy2_wheel-0.9.2.tar.gz",
            "has_sig": false,
            "md5_digest": "2257b327976fe58e958ba087648f1762",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 73494,
            "upload_time": "2024-11-05T13:54:19",
            "upload_time_iso_8601": "2024-11-05T13:54:19.159373Z",
            "url": "https://files.pythonhosted.org/packages/c0/8a/8c72c92cd91146ce9948421f3967bc2775deac5215eb8fa3b0487020881f/fasttext_numpy2_wheel-0.9.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-05 13:54:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "facebookresearch",
    "github_project": "fastText",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "fasttext-numpy2-wheel"
}
        