Classification of sounds using an Android mobile phone and the YAMNet ML model

  • Part 1: Architecture of the ML model, conversion to TensorFlow Lite (TFLite), and benchmarking of the model
  • Part 2: Android implementation
Computing the mel spectrogram
  • A stabilized log mel spectrogram is computed as log(mel-spectrum + 0.001), where the 0.001 offset avoids taking the logarithm of zero.
Computing stabilized log mel spectrogram
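To make the stabilization step concrete, here is a minimal sketch in plain Python. It assumes the mel spectrogram is already available as a list of frames, each a list of non-negative mel-band values. The function name and the toy input are hypothetical, and a real pipeline would normally do this with TensorFlow or NumPy rather than nested lists.

```python
import math

LOG_OFFSET = 0.001  # offset quoted in the text; keeps log() away from log(0)


def stabilized_log_mel(mel_spectrogram):
    """Apply log(mel + offset) element-wise to a list of frames."""
    return [
        [math.log(value + LOG_OFFSET) for value in frame]
        for frame in mel_spectrogram
    ]


# Toy input: 2 frames x 3 mel bands (the real features use 64 bands per frame).
toy_mel = [[0.0, 0.5, 1.0], [2.0, 0.0, 0.25]]
log_mel = stabilized_log_mel(toy_mel)
```

Without the offset, the zero entries in the toy input would raise a math domain error, which is exactly the case the offset guards against.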
  • These features are then framed into 50%-overlapping examples of 0.96 seconds, where each example covers 64 mel bands and 96 frames of 10 ms each.
Generating the batch of input features
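The framing step can be sketched the same way. The window of 96 frames and the 64 mel bands come from the text; the hop of 48 frames is derived from the 50% overlap. The function name and the dummy input are assumptions made for illustration, not the model's actual preprocessing code.

```python
FRAMES_PER_EXAMPLE = 96                # 0.96 s at 10 ms per frame
HOP_FRAMES = FRAMES_PER_EXAMPLE // 2   # 50% overlap, so each example starts 48 frames later


def frame_examples(log_mel, window=FRAMES_PER_EXAMPLE, hop=HOP_FRAMES):
    """Slice a (frames x bands) list into overlapping examples of `window` frames."""
    examples = []
    # Stop once a full window no longer fits; trailing frames are dropped.
    for start in range(0, len(log_mel) - window + 1, hop):
        examples.append(log_mel[start:start + window])
    return examples


# Dummy features: 240 frames x 64 bands, where frame i is filled with the value i.
dummy_log_mel = [[float(i)] * 64 for i in range(240)]
batch = frame_examples(dummy_log_mel)
```

With 240 input frames, the examples start at frames 0, 48, 96 and 144, so the batch holds four examples of 96 frames by 64 bands each.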
  • Scores, a float32 Tensor of shape (N, 521) containing the per-frame predicted scores for each of the 521 classes in the AudioSet ontology that are supported by YAMNet.
  • Embeddings, a float32 Tensor of shape (N, 1024) containing per-frame embeddings, where the embedding vector is the average-pooled output that feeds into the final classifier layer.
  • log_mel_spectrogram, a float32 Tensor representing the log mel spectrogram of the entire waveform. These are the audio features passed into the model.
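One common way to turn the (N, 521) scores into a single prediction for a clip is to average them over the N frames and pick the highest-scoring class. The sketch below assumes that approach and uses a toy matrix with three classes instead of 521. The function name, the scores, and the class names are hypothetical.

```python
def top_class(scores, class_names):
    """Average per-frame scores over all frames and return the best class."""
    num_frames = len(scores)
    num_classes = len(scores[0])
    mean_scores = [
        sum(frame[c] for frame in scores) / num_frames
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=lambda c: mean_scores[c])
    return class_names[best], mean_scores[best]


# Toy scores: N = 2 frames, 3 classes (the real model outputs 521 classes).
toy_scores = [[0.1, 0.7, 0.2], [0.2, 0.5, 0.3]]
toy_names = ["Speech", "Dog", "Music"]
label, score = top_class(toy_scores, toy_names)
```

Averaging smooths out frame-level noise. Other aggregations, such as taking the per-class maximum, are equally possible and may suit short, transient sounds better.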
  • Initialization time
  • Inference time in the warmup state
  • Inference time in the steady state
  • Memory usage during initialization time
  • Overall memory usage
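To illustrate the difference between the warmup and steady-state timings listed above, here is a rough sketch of a timing harness. It is not the benchmark tool used for the article. The `benchmark` function, its parameters, and the stand-in `fake_inference` workload are all hypothetical, and both run counts are assumed to be at least 1.

```python
import time


def benchmark(run_inference, warmup_runs=1, steady_runs=50):
    """Return the average latency in seconds of warmup runs and of later runs."""

    def average_latency(runs):
        durations = []
        for _ in range(runs):
            start = time.perf_counter()
            run_inference()
            durations.append(time.perf_counter() - start)
        return sum(durations) / len(durations)

    # The first calls often pay one-off costs (allocation, caching),
    # so they are timed separately from the steady-state calls.
    warmup_avg = average_latency(warmup_runs)
    steady_avg = average_latency(steady_runs)
    return {"warmup_s": warmup_avg, "steady_s": steady_avg}


def fake_inference():
    """Stand-in workload; a real test would invoke the model here."""
    sum(i * i for i in range(1000))


result = benchmark(fake_inference, warmup_runs=1, steady_runs=5)
```

Reporting the two figures separately keeps one-off startup costs from inflating the latency a user would see during continuous classification.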
Error: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors.


I am a pharmacist turned Android developer and machine learning engineer. Right now I’m a senior Android developer at Invisalign and an ML GDE.

George Soloupis
