bert-for-tf2-e
==============

:Name: bert-for-tf2-e
:Version: 0.14.11
:Summary: A TensorFlow 2.11.0 Keras implementation of BERT.
:Author: Esa Krissa
:Upload time: 2023-01-22 11:39:23
:Requires Python: >=3.6
:License: MIT
:Keywords: tensorflow, keras, bert

BERT for TensorFlow 2.11.0
==========================

|Build Status| |Coverage Status| |Version Status| |Python Versions| |Downloads|

This repo contains a `TensorFlow 2.11.0`_ `Keras`_ implementation of `google-research/bert`_
with support for loading the original `pre-trained weights`_,
producing activations **numerically identical** to those calculated by the original model.

`ALBERT`_ and `adapter-BERT`_ are also supported by setting the corresponding
configuration parameters (``shared_layer=True`` and ``embedding_size`` for `ALBERT`_,
``adapter_size`` for `adapter-BERT`_). Setting both results in an adapter-ALBERT:
the BERT parameters are shared across all layers, while every layer is adapted with a layer-specific adapter.
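
For illustration, a minimal adapter-ALBERT configuration could look like the following
sketch (the parameter values here are purely illustrative, not taken from any published checkpoint):

.. code:: python

  from bert import BertModelLayer

  # illustrative adapter-ALBERT configuration: ALBERT-style weight sharing
  # combined with adapter-BERT style per-layer adapters
  l_bert = BertModelLayer(**BertModelLayer.Params(
    vocab_size     = 30000,
    shared_layer   = True,    # ALBERT: share the transformer weights across all layers
    embedding_size = 128,     # ALBERT: factorized wordpiece embedding size
    adapter_size   = 64,      # adapter-BERT: bottleneck size of the per-layer adapters
    name           = "adapter_albert",
  ))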

The implementation is built from scratch using only basic TensorFlow operations,
following the code in `google-research/bert/modeling.py`_
(but skipping dead code and applying some simplifications). It also uses `kpe/params-flow`_ to reduce
common Keras boilerplate code (related to passing model and layer configuration arguments).

`bert-for-tf2-e`_ should work with `TensorFlow 2.11.0`_ as well as `TensorFlow 1.14`_ or newer.

Install
-------

``bert-for-tf2-e`` is on the Python Package Index (PyPI):

::

    pip install bert-for-tf2-e
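
After installation the package is imported as ``bert`` (not ``bert_for_tf2_e``); a quick
sanity check could look like:

.. code:: python

  # verify that the package imports and the main Keras layer class is available
  import bert
  from bert import BertModelLayer

  print(BertModelLayer.__name__)   # -> BertModelLayer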


Usage
-----

BERT in ``bert-for-tf2-e`` is implemented as a Keras layer. You can instantiate it like this:

.. code:: python

  from bert import BertModelLayer

  l_bert = BertModelLayer(**BertModelLayer.Params(
    vocab_size               = 16000,        # embedding params
    use_token_type           = True,
    use_position_embeddings  = True,
    token_type_vocab_size    = 2,

    num_layers               = 12,           # transformer encoder params
    hidden_size              = 768,
    hidden_dropout           = 0.1,
    intermediate_size        = 4*768,
    intermediate_activation  = "gelu",

    adapter_size             = None,         # see arXiv:1902.00751 (adapter-BERT)

    shared_layer             = False,        # True for ALBERT (arXiv:1909.11942)
    embedding_size           = None,         # None for BERT, wordpiece embedding size for ALBERT

    name                     = "bert"        # any other Keras layer params
  ))

or by using the ``bert_config.json`` from a `pre-trained google model`_:

.. code:: python

  import bert

  model_dir = ".models/uncased_L-12_H-768_A-12"

  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")


now you can use the BERT layer in your Keras model like this:

.. code:: python

  from tensorflow import keras

  max_seq_len = 128
  l_input_ids      = keras.layers.Input(shape=(max_seq_len,), dtype='int32')
  l_token_type_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32')

  # using the default token_type/segment id 0
  output = l_bert(l_input_ids)                              # output: [batch_size, max_seq_len, hidden_size]
  model = keras.Model(inputs=l_input_ids, outputs=output)
  model.build(input_shape=(None, max_seq_len))

  # provide a custom token_type/segment id as a layer input
  output = l_bert([l_input_ids, l_token_type_ids])          # [batch_size, max_seq_len, hidden_size]
  model = keras.Model(inputs=[l_input_ids, l_token_type_ids], outputs=output)
  model.build(input_shape=[(None, max_seq_len), (None, max_seq_len)])
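
For sentence-pair inputs the token_type (segment) ids are 0 for tokens of the first
sentence and 1 for tokens of the second. A small sketch of feeding such inputs to the
two-input model above (with dummy ids, just to illustrate the shapes):

.. code:: python

  import numpy as np

  # dummy token ids and segment ids for a batch with a single sequence
  input_ids      = np.zeros((1, max_seq_len), dtype=np.int32)
  token_type_ids = np.zeros((1, max_seq_len), dtype=np.int32)
  token_type_ids[0, 10:20] = 1        # pretend tokens 10..19 belong to the second sentence

  output = model([input_ids, token_type_ids])   # shape: [1, max_seq_len, hidden_size]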

if you choose to use `adapter-BERT`_ by setting the ``adapter_size`` parameter,
you will probably also want to freeze all the original BERT layers by calling:

.. code:: python

  l_bert.apply_adapter_freeze()

and once the model has been built or compiled, the original pre-trained weights
can be loaded into the BERT layer:

.. code:: python

  import os
  import bert

  bert_ckpt_file   = os.path.join(model_dir, "bert_model.ckpt")
  bert.load_stock_weights(l_bert, bert_ckpt_file)

**N.B.** see `tests/test_bert_activations.py`_ for a complete example.
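
Putting the snippets above together, a minimal end-to-end sketch (assuming the pre-trained
checkpoint has already been downloaded to ``model_dir``) could look like:

.. code:: python

  import os
  import bert
  from tensorflow import keras

  model_dir = ".models/uncased_L-12_H-768_A-12"   # assumed local checkpoint directory

  # build the BERT layer from the checkpoint's bert_config.json
  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

  # wrap it in a small classification model
  max_seq_len = 128
  model = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(max_seq_len,)),
    l_bert,
    keras.layers.Lambda(lambda x: x[:, 0, :]),    # take the [CLS] token output
    keras.layers.Dense(2)
  ])
  model.build(input_shape=(None, max_seq_len))

  # load the pre-trained weights (after model.build())
  bert.load_stock_weights(l_bert, os.path.join(model_dir, "bert_model.ckpt"))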

FAQ
---
0. In all the examples below, **please note** the line:

.. code:: python

  # use in a Keras Model here, and call model.build()

for a quick test, you can replace it with something like:

.. code:: python

  model = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(128,)),
    l_bert,
    keras.layers.Lambda(lambda x: x[:, 0, :]),
    keras.layers.Dense(2)
  ])
  model.build(input_shape=(None, 128))
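
Such a quick-test model behaves like any other Keras model; for instance, feeding a batch
of random token ids (a sketch just to verify the output shape):

.. code:: python

  import numpy as np

  # random token ids (values must stay below the model's vocab_size)
  dummy_ids = np.random.randint(0, 1000, size=(2, 128), dtype=np.int32)
  logits = model(dummy_ids)
  print(logits.shape)              # expected: (2, 2) -- batch of 2, Dense(2) outputs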


1. How to use BERT with the `google-research/bert`_ pre-trained weights?

.. code:: python

  model_name = "uncased_L-12_H-768_A-12"
  model_dir = bert.fetch_google_bert_model(model_name, ".models")
  model_ckpt = os.path.join(model_dir, "bert_model.ckpt")

  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

  # use in a Keras Model here, and call model.build()

  bert.load_bert_weights(l_bert, model_ckpt)      # should be called after model.build()

2. How to use ALBERT with the `google-research/ALBERT`_ pre-trained weights (fetching from TFHub)?

see `tests/nonci/test_load_pretrained_weights.py <https://github.com/kpe/bert-for-tf2/blob/master/tests/nonci/test_load_pretrained_weights.py>`_:

.. code:: python

  model_name = "albert_base"
  model_dir    = bert.fetch_tfhub_albert_model(model_name, ".models")
  model_params = bert.albert_params(model_name)
  l_bert = bert.BertModelLayer.from_params(model_params, name="albert")

  # use in a Keras Model here, and call model.build()

  bert.load_albert_weights(l_bert, model_dir)       # should be called after model.build()

3. How to use ALBERT with the `google-research/ALBERT`_ pre-trained weights (non TFHub)?

see `tests/nonci/test_load_pretrained_weights.py <https://github.com/kpe/bert-for-tf2/blob/master/tests/nonci/test_load_pretrained_weights.py>`_:

.. code:: python

  model_name = "albert_base_v2"
  model_dir    = bert.fetch_google_albert_model(model_name, ".models")
  model_ckpt   = os.path.join(model_dir, "model.ckpt-best")

  model_params = bert.albert_params(model_dir)
  l_bert = bert.BertModelLayer.from_params(model_params, name="albert")

  # use in a Keras Model here, and call model.build()

  bert.load_albert_weights(l_bert, model_ckpt)      # should be called after model.build()

4. How to use ALBERT with the `brightmart/albert_zh`_ pre-trained weights?

see `tests/nonci/test_albert.py <https://github.com/kpe/bert-for-tf2/blob/master/tests/nonci/test_albert.py>`_:

.. code:: python

  model_name = "albert_base"
  model_dir = bert.fetch_brightmart_albert_model(model_name, ".models")
  model_ckpt = os.path.join(model_dir, "albert_model.ckpt")

  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

  # use in a Keras Model here, and call model.build()

  bert.load_albert_weights(l_bert, model_ckpt)      # should be called after model.build()

5. How to tokenize the input for the `google-research/bert`_ models?

.. code:: python

  do_lower_case = not (model_name.find("cased") == 0 or model_name.find("multi_cased") == 0)
  bert.bert_tokenization.validate_case_matches_checkpoint(do_lower_case, model_ckpt)
  vocab_file = os.path.join(model_dir, "vocab.txt")
  tokenizer = bert.bert_tokenization.FullTokenizer(vocab_file, do_lower_case)
  tokens = tokenizer.tokenize("Hello, BERT-World!")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)
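
Before feeding them to the model, the ``token_ids`` still need to be padded (or truncated)
to the model's ``max_seq_len``; a sketch, assuming a Keras ``model`` built as shown above
with ``max_seq_len = 128``:

.. code:: python

  import numpy as np

  max_seq_len = 128
  # pad with 0 (the [PAD] id in the google-research/bert vocab) up to max_seq_len;
  # note: a complete pipeline would also add the [CLS]/[SEP] special tokens
  padded_ids = token_ids[:max_seq_len] + [0] * (max_seq_len - len(token_ids))
  input_ids  = np.array([padded_ids], dtype=np.int32)   # batch with a single sequence

  output = model(input_ids)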

6. How to tokenize the input for `brightmart/albert_zh`_?

.. code:: python

  import bert
  import params_flow as pf

  # fetch the vocab file
  albert_zh_vocab_url = "https://raw.githubusercontent.com/brightmart/albert_zh/master/albert_config/vocab.txt"
  vocab_file = pf.utils.fetch_url(albert_zh_vocab_url, model_dir)

  tokenizer = bert.albert_tokenization.FullTokenizer(vocab_file)
  tokens = tokenizer.tokenize("你好世界")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)

7. How to tokenize the input for the `google-research/ALBERT`_ models?

.. code:: python

  import sentencepiece as spm

  spm_model = os.path.join(model_dir, "assets", "30k-clean.model")
  sp = spm.SentencePieceProcessor()
  sp.load(spm_model)
  do_lower_case = True

  processed_text = bert.albert_tokenization.preprocess_text("Hello, World!", lower=do_lower_case)
  token_ids = bert.albert_tokenization.encode_ids(sp, processed_text)

8. How to tokenize the input for the Chinese `google-research/ALBERT`_ models?

.. code:: python

  import bert

  vocab_file = os.path.join(model_dir, "vocab.txt")
  tokenizer = bert.albert_tokenization.FullTokenizer(vocab_file=vocab_file)
  tokens = tokenizer.tokenize(u"你好世界")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)

Resources
---------
- `ORIGINAL`_ - https://github.com/kpe/bert-for-tf2

            
