Fine-tune a BERT model using a Colab TPU.

First, some information about how BERT builds its input representation; a short tokenizer sketch follows the list below.

  • Token embeddings: A [CLS] token is added to the input word tokens at the beginning of the first sentence and a [SEP] token is inserted at the end of each sentence.
  • Segment embeddings: A marker indicating Sentence A or Sentence B is added to each token. This allows the encoder to distinguish between sentences.
  • Positional embeddings: A positional embedding is added to each token to indicate its position in the sentence.
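A quick way to see these pieces is to run a sentence pair through the tokenizer. Below is a small sketch using the same Greek BERT checkpoint that is loaded later in the walkthrough; the two Greek sentences are just placeholder examples:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")

# Encode a sentence pair: [CLS] sentence A [SEP] sentence B [SEP]
encoded = tokenizer("καλημέρα κόσμε", "τι κάνεις σήμερα")
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))  # shows where [CLS] and [SEP] are placed
print(encoded["token_type_ids"])  # segment markers: 0 for sentence A, 1 for sentence B
# Positional embeddings are added internally by the model, one per input position.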

Colab walkthrough

# Install transformers and download the Greek-specific model and tokenizer files
# This is done here only to test some inputs
# For fine-tuning the BERT model we have to load the model again when we build it inside strategy.scope()
!pip install transformers
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
model = TFAutoModel.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
# Encode the raw texts into model inputs (bert_encode and bert_preprocess_model are helpers defined outside this excerpt)
input_x = bert_encode(text_list, bert_preprocess_model)
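The code below builds and compiles the model inside strategy.scope(), but the creation of strategy itself is not shown in this excerpt. A minimal sketch of the usual Colab TPU setup, assuming TensorFlow 2.x and a TPU runtime selected in Colab:

import tensorflow as tf

# Detect and initialize the Colab TPU, then create the distribution strategy
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)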
# Creating the model inside the TPUStrategy scope places its variables on the TPU
with strategy.scope():
    model = build_model()
    model.compile(
        tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss='categorical_crossentropy',
        metrics=['accuracy'],
        steps_per_execution=32)
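build_model() is not defined in this excerpt. A minimal sketch of what it could look like, assuming a softmax classification head on top of the Greek BERT encoder; NUM_CLASSES and MAX_LEN are placeholders that must match your labels and the encoding produced by bert_encode:

import tensorflow as tf
from transformers import TFAutoModel

NUM_CLASSES = 3  # placeholder: number of labels in the dataset
MAX_LEN = 128    # placeholder: sequence length used when encoding the texts

def build_model():
    # Reload the pretrained encoder here so that, inside strategy.scope(), its weights are placed on the TPU
    encoder = TFAutoModel.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
    input_ids = tf.keras.layers.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
    attention_mask = tf.keras.layers.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")
    sequence_output = encoder(input_ids, attention_mask=attention_mask)[0]  # last hidden states
    cls_token = sequence_output[:, 0, :]  # [CLS] token representation
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(cls_token)
    return tf.keras.Model(inputs=[input_ids, attention_mask], outputs=outputs)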

model.summary()

train_history = model.fit(
    input_x, train_labels,
    validation_split=0.2,
    epochs=5,
    batch_size=32,
    verbose=1)

Model saving/loading on TPUs

# Save in TensorFlow's SavedModel format, routing file I/O to the local host
save_locally = tf.saved_model.SaveOptions(experimental_io_device='/job:localhost')
model.save('./model', options=save_locally)

# Load back in TensorFlow's SavedModel format
with strategy.scope():
    load_locally = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')
    model = tf.keras.models.load_model('./model', options=load_locally)
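After reloading, the model can be used for inference as usual. A minimal sketch, assuming new texts are encoded the same way as the training data:

import numpy as np

# Predict with the reloaded model; inputs must be encoded exactly like the training data
probabilities = model.predict(input_x)
predicted_classes = np.argmax(probabilities, axis=-1)
print(predicted_classes[:10])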
