AI. MLOps. Neural Network. Daycare (huh?)

(Written June 2021)

Ok I admit it...I threw together some buzzwords in this title, as I wasn't sure how this project would materialize.

All I knew in the beginning was that I wanted to incorporate ideas I had in my head, ranging from the Korean drama series 'Startup' (eh?) to Artificial Intelligence, baby monitors (huh??), MLOps, and daycare (what???).

The end result was a computer vision model that identifies from pictures of my boys whether they are sleeping, eating, or performing some other activity.

The model's accuracy is so-so, but what was more important to me was to build+deploy a pipeline that would lend itself to continuous delivery, aka MLOps. That, and being able to scratch the surface of implementing a neural network and live out my fantasy of "doing" AI.

Steps:

1. Create an account with Microsoft Azure Machine Learning.

2. Use Azure ML Data Labeling to label the pictures used for training the model.

3. Use Azure ML Designer to create a pipeline that trains, scores, and evaluates the model. Rinse and repeat as necessary for better accuracy.

4. Deploy an inference pipeline and have the model predict/label new pictures.

Code not hosted on GitHub (this was done with minimal coding, using Azure ML)

Korean drama 'Startup' + baby monitor + daycare

This seems like a haphazard group of words. Bear with me as I try to explain.

Random thought #1: Glamourizing AI

With evenings free whilst our little ones slept, my wife and I indulged in Korean dramas. One series in particular, 'Startup', immediately grabbed my attention as it's a rom-com whose backstory has friends forming a startup company using Artificial Intelligence / Computer Vision / Neural Networks. One of their initial products used a Convolutional Neural Network as the backbone to identify and label objects or people from streaming video. This then sent me down a rabbit hole of various ideas...

One of the scenes has the main protagonists feeling demeaned as they are asked to do a computer vision labeling exercise. We've probably all seen those reCAPTCHA widgets on websites that prove you're a human and not some bot. Identify the trains, cars, signs, etc. That gives you a glimpse of how mind-numbing computer vision labeling can be.

Random thought #2: Baby monitors to track noise/movement

When our son was just months old, we had a baby monitor set up to detect noises or movements, should we need to rush in from another room.

Random thought #3: Real-time updates at daycare

Now that our son is older and in daycare, we see they have monitoring available to parents in the form of a cam that's constantly on in the room, plus real-time updates provided by the educators on food intake, hours napped, diapers changed, and activities performed. I see the educators on their tablets updating information throughout the day, and I wonder whether there might be a different way.

"Yes dumb dumb, I am awake. Do you really need a camera to hear me cry?"

With these random thoughts, I surmised:

Wouldn't it be cool if we could free daycare educators from having to take notes? This technology already exists, doesn't it?

As I read on, I found many examples of cool use cases where AI (or more specifically, Computer Vision) could be used to track calories by analyzing photos/video of food, or count the pushups a person was performing to track exercise...basically measure, derive, and/or label something based on what could be seen.

This led to my objective:

Can I use AI to help daycare educators track what and when children are eating, sleeping, or any other activity?

An example of using Computer Vision for counting pushups

Definitions

Ok, before I forge on, check out this article.

It helped me demystify AI vs ML vs Deep Learning vs Neural Networks.

Data labeling

Using Microsoft Azure ML, I dove into a data labeling exercise. Using a sample of 100+ pictures, which I registered as a dataset, I began to label each picture as eating, sleeping, or other activity.

This activity is best accompanied with a beer in hand.
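
For reference, here's a minimal sketch of what registering those pictures as a dataset looks like with the Azure ML Python SDK (v1, which was current at the time). The datastore path and dataset name are placeholders, not artifacts from my project:

```python
# Minimal sketch (Azure ML SDK v1): register a folder of pictures as a
# file dataset. The folder path and dataset name below are placeholders.
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()  # reads config.json downloaded from the portal
datastore = ws.get_default_datastore()

# Grab everything under the uploaded pictures folder on the datastore
pictures = Dataset.File.from_files(path=(datastore, 'kids-pictures/**'))

# Register it so labeling projects and pipelines can reference it by name
pictures = pictures.register(workspace=ws,
                             name='kids-activity-pictures',
                             description='Pictures to label as eating/sleeping/other',
                             create_new_version=True)
```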


To code or not to code

Prior to this project I had completed the Azure Data Scientist Associate certification. I had learned that everything required in a data science exercise (compute, dataset, pipeline, model, etc.) could be done via code and/or through the graphical designer.

I wanted to do as much of this as possible through the designer, to prove that a quick-and-dirty MLOps + Computer Vision model could be built with minimal to no coding.
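
That said, for anyone curious about the code route, here's roughly what kicking off a training run looks like with the Azure ML Python SDK (v1). 'train.py' and 'cpu-cluster' are placeholder names, not pieces of this project:

```python
# Sketch of the code path: submit a training script as an experiment run.
# 'train.py' and 'cpu-cluster' are placeholders.
from azureml.core import Workspace, Experiment, ScriptRunConfig

ws = Workspace.from_config()
config = ScriptRunConfig(source_directory='.',
                         script='train.py',
                         compute_target='cpu-cluster')

run = Experiment(ws, 'kids-activity-classifier').submit(config)
run.wait_for_completion(show_output=True)  # stream logs until done
```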

The model

I set up a DenseNet model (Dense Convolutional Network for image classification) and went through many many many iterations, running it against my training set of labeled pictures in the hopes of creating a model that would accurately label/predict from new pictures whether someone was eating, sleeping, or doing some other activity.

I worked on and off for a few weeks, as I would lazily launch an experiment, grab a coffee, and come back a half hour later only to find my pipeline had errored out or that the model's accuracy scored terribly.

This is the part that can really s-u-c-k.

Why the hell am I doing this? What small mistake did I make? Why won't that little box turn green instead of red?

When it comes to coding errors, you most often find some other poor soul who has been stuck in the exact same place you are, has posted on Stack Overflow, and has the community pulling together to provide a dizzying amount of expertise and help.

When I got stuck in Azure ML Designer, I didn't feel that same experience...I found that there just weren't enough users yet, that some functionality was only recently out of preview (aka beta), or that the help was confusing and convoluted (LOL, bad jk).

and then...


...model with 90% accuracy

Finally I had some results I could work with.

My next step was to create an inference pipeline. This came in the form of a REST endpoint that, when presented with new pictures, could be called upon to label/predict whether the subject was eating, sleeping, or doing some other activity.
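
To give a flavor of what calling it looks like, here's a hedged sketch. The URI, key, and especially the payload schema are placeholders; the real schema depends on how the inference pipeline was deployed:

```python
# Sketch of scoring a new picture against the deployed REST endpoint.
# URI, key, and payload schema are placeholders.
import base64
import json
import requests

scoring_uri = 'https://<your-endpoint>/score'  # from the endpoint's detail page
api_key = '<endpoint-key>'

with open('new_picture.jpg', 'rb') as f:
    image_b64 = base64.b64encode(f.read()).decode('utf-8')

headers = {'Content-Type': 'application/json',
           'Authorization': f'Bearer {api_key}'}
resp = requests.post(scoring_uri,
                     data=json.dumps({'image': image_b64}),
                     headers=headers)
print(resp.json())  # e.g. predicted label + probability per class
```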

I now had a pipeline whereby, if I were to update a step (such as revising the dataset used for modeling, re-training the model, or predicting against a new test set), I could easily redeploy without having to rebuild all the steps in my pipeline. Furthermore, Azure ML would keep track of all the versioning and results across experiments, models, datasets, etc.

My model was sure of itself...92% certain that this was a person eating...

What did I learn?


  1. Turn off the lights.

Using Azure compute, I often failed to stop the servers when not in use and was unpleasantly surprised with a $100 bill at the end of the month. I could have avoided this by incorporating programmatic ways of stopping idle servers (as sketched below) or setting a combination of thresholds and overage notifications.
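
For example, a compute cluster can be provisioned to scale down to zero nodes on its own. A sketch with the Azure ML Python SDK (v1); names and sizes are examples:

```python
# Sketch: provision an AmlCompute cluster that releases its nodes when idle,
# so forgetting to hit "stop" costs nothing. Names/sizes are examples.
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()
config = AmlCompute.provisioning_configuration(
    vm_size='STANDARD_DS3_V2',
    min_nodes=0,                         # scale all the way down when idle
    max_nodes=2,
    idle_seconds_before_scaledown=1800)  # release nodes after 30 idle minutes

cluster = ComputeTarget.create(ws, 'cpu-cluster', config)
cluster.wait_for_completion(show_output=True)
```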


  2. MLOps.

Performing data science experiments with changing parameters on the data, models, compute power, deployments, etc. requires appropriate versioning and tracking of these assets. In the spirit of continuous delivery, MLOps should be part and parcel of the setup; without it, things can quickly and easily become unwieldy.
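
A small taste of what that versioning looks like in the SDK (model name, path, and tag are placeholders): every registration under the same name automatically bumps the version.

```python
# Sketch (Azure ML SDK v1): registering a model creates a new version under
# the same name, so every experiment's output stays tracked. Placeholders used.
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()
model = Model.register(workspace=ws,
                       model_name='kids-activity-densenet',
                       model_path='outputs/model.pkl',  # local file or folder
                       tags={'val_accuracy': '0.90'})

print(model.name, model.version)  # same name, auto-incremented version
```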


  3. Neural network.

My model's accuracy was high, but when it came to predicting from new pictures, it was so-so. One of the interesting things about deep learning is that, like a fine wine, it will only get better with time (provided you feed it more data...the model, that is, not the wine).


  4. In Azure ML, Data Labeling does not work with Designer.

This pissed me off. How can you create a feature that doesn't play nice with another feature in the same product? C'mon...


  5. Transfer learning.

My data labeling exercise was against only 100 pictures, but I can imagine that if you had to do this for thousands or millions of pictures, the cost and resources would be immense. Transfer learning, whereby a model is reused as the starting point for a model on another task, would be an efficient way to kick-start your experiment by leveraging the work of others (see the sketch below). Transfer learning, to me, is indicative of the trends around open-source software and OpenML...collaborative and public, bringing efficiencies and savings.
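
Here's a sketch of the idea in PyTorch, using the same DenseNet family my pipeline used (illustrative only, not the designer's internals): start from ImageNet weights, freeze the feature extractor, and train only a small head for the three labels.

```python
# Sketch of transfer learning: reuse a DenseNet pretrained on ImageNet and
# retrain only a new classifier head for eating / sleeping / other.
import torch.nn as nn
from torchvision import models

model = models.densenet121(pretrained=True)  # someone else's training, reused

# Freeze the pretrained feature extractor so its weights stay put
for param in model.parameters():
    param.requires_grad = False

# Swap in a head sized for our three classes; only these weights get trained,
# which is why a small set of labeled pictures can be enough to get started
model.classifier = nn.Linear(model.classifier.in_features, 3)
```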


  6. The future.

My little model (tries to) label and predict whether the subject is eating, sleeping, or doing some other activity. This could be extrapolated and applied to streaming video. Imagine taking this further and applying features such as food item detection to determine the volume/weight of food and estimate nutrition. Adding more and more features could make for some interesting solutions in monitoring the health of children, adults, and senior citizens in settings such as daycares, schools, work, and retirement homes.