Top Secret Revealed! GCP Architecture in Action! - AI Learning Applications

When people think of Google, they often think of cutting-edge research and innovation. Google has pioneered or contributed to many open-source technologies, from Kubernetes to TensorFlow, and is active across a wide range of open-source projects. It is especially well regarded for its advances in Artificial Intelligence (AI), and many products on the market today are built on the AI capabilities of Google Cloud Platform (GCP). But what exactly does GCP offer in terms of AI technology and products?

This article will explore what AI technology entails, discuss the categories and differences of GCP AI services, and delve into the implementation of GCP's "Vision API."

What is AI?
AI, or Artificial Intelligence, refers to the development of systems capable of performing tasks that typically require human intelligence, such as problem-solving, prediction, speech and image recognition, and translation. Common applications include chatbots, Google Translate, image recognition, and facial recognition, which drew particular attention during the pandemic. One of the best-known examples is AlphaGo, developed by Google DeepMind, defeating a world champion at the game of Go. But can AI truly become a sentient being with emotions, as depicted in movies?

American philosopher John Searle divides AI into "Strong AI" and "Weak AI." Strong AI would possess self-awareness and emotions akin to a human's, while Weak AI demonstrates specific capabilities such as image recognition and speech recognition. All AI developed so far falls under Weak AI. A true artificial mind in the Strong AI sense, as portrayed in the movie "Free Guy," has yet to be achieved.

The Learning Process of AI

The core foundation of Artificial Intelligence (AI) is "Machine Learning" (ML), a term often heard in the field. Machine learning teaches a machine to recognize features through training: an algorithm fits a model to the training data, and the trained model is then used to make predictions. There are two main learning methods: Supervised Learning and Unsupervised Learning.

In Supervised Learning, if we want machines to recognize images of dogs and cats, we first need to manually select images of dogs and cats and label them accordingly. Then, we train the machine using these labeled images. On the other hand, Unsupervised Learning involves feeding a large amount of data to the machine and allowing it to interpret and learn the differences between various images. Sometimes, during Unsupervised Learning, the AI might even learn to distinguish between fur colors, even if we only intended it to differentiate between cats and dogs.
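The difference between the two methods can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two numeric features (weight and ear length) and all sample values are hypothetical stand-ins for real images, the supervised part is a simple nearest-centroid classifier, and the unsupervised part is a basic k-means loop.

```python
from statistics import mean

# Hypothetical toy features (weight_kg, ear_length_cm) standing in for images.
LABELED = [
    ((4.0, 6.0), "cat"),
    ((4.5, 6.5), "cat"),
    ((20.0, 10.0), "dog"),
    ((25.0, 12.0), "dog"),
]


def centroid(points):
    """Average each feature across a list of points."""
    return tuple(mean(axis) for axis in zip(*points))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_supervised(examples):
    """Supervised: labels are given, so we learn one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the new sample."""
    return min(model, key=lambda label: squared_distance(model[label], features))


def cluster_unsupervised(points, k=2, rounds=10):
    """Unsupervised: basic k-means groups the points without seeing any label."""
    centers = list(points[:k])
    groups = []
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for point in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: squared_distance(centers[i], point),
            )
            groups[nearest].append(point)
        # Keep the old center if a group ends up empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups


model = train_supervised(LABELED)
unlabeled = [features for features, _ in LABELED]
groups = cluster_unsupervised(unlabeled)
```

Note that the clustering step returns groups without names: it is up to a person to decide afterwards whether a group means "cat," "dog," or something unintended such as fur color.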

The Three Major Categories of AI

The three major categories of AI can be understood as follows:

AI is the broadest category: it contains Machine Learning, which in turn contains Deep Learning. AI is software written to perceive, recognize, and imitate human-like behavior.
Machine Learning underpins AI and rests primarily on algorithms and data.
Deep Learning is a branch of Machine Learning. It uses neural networks, loosely modeled on the brain, that are organized into an input layer, one or more hidden layers, and an output layer. Because of these hidden layers, deep learning typically needs more data than traditional machine learning models. It extracts and classifies features from large datasets on its own and converges on the answer closest to the correct one. AlphaGo, mentioned earlier, is an AI based on Deep Learning: by learning from a large number of game records and then training against itself, it ultimately defeated the world champion.
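To make the input, hidden, and output layers concrete, here is a minimal sketch of a forward pass through a tiny network in plain Python. The layer sizes, the sigmoid activation, and every weight value are assumptions chosen for illustration. A real deep learning model learns its weights from data and has far more units and layers.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(inputs, hidden_w, hidden_b)
    return dense(hidden, out_w, out_b)


# Hypothetical weights: 2 inputs, 2 hidden units, 1 output unit.
HIDDEN_W = [[0.5, -0.2], [0.1, 0.4]]
HIDDEN_B = [0.0, 0.1]
OUT_W = [[1.0, -1.0]]
OUT_B = [0.0]

output = forward([1.0, 2.0], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
```

Training consists of adjusting these weights until the output matches the labeled answers, which is why more hidden layers generally call for more data.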

GCP AI Services

Vertex AI:
Vertex AI is an integrated AI platform suited to those who have some experience with AI and understand its benefits. With this product, users can quickly build their own trained models without needing expertise in algorithm development. Vertex AI is the successor to GCP's AI Platform. The AI Platform interface is still supported, but most of its functionality has been integrated into Vertex AI, so users can perform the related tasks there directly.

Target Audience:
Data scientists, engineers interested in AI, data analysts

Natural Language API:
The Natural Language API is an AI service for natural language analysis. It takes unstructured or semi-structured text and returns in-depth analysis results, including sentiment analysis, grammatical structure, word combinations, and more. It is suitable for tasks that require extensive text mining.
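As a rough illustration of what a sentiment analysis call looks like, the sketch below builds the JSON body for the API's v1 REST method `documents:analyzeSentiment`. It only constructs the request and does not send it. Authentication (an API key or OAuth token) is left out, and the helper name and sample sentence are hypothetical.

```python
import json

# REST endpoint for sentiment analysis in the Natural Language API (v1).
SENTIMENT_ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text):
    """Hypothetical helper: wrap plain text in the request body the API expects."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


body = build_sentiment_request("The new clinic was clean and the staff were kind.")
payload = json.dumps(body)  # this string would be sent with an HTTP POST
```

The response carries a sentiment score and magnitude for the whole document and for each sentence, which is the kind of result the text-mining use cases above rely on.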

Target Audience:
Researchers conducting extensive literature reviews, healthcare professionals (for psychological therapy-related applications), data scientists

Speech-to-Text API:
The Speech-to-Text API performs speech recognition, a capability most people already know. In Google Maps, for example, speech recognition lets drivers dictate an address instead of typing it. It is also useful in the growing market for short video content, where converting speech to text quickly produces subtitles, a real boon for content creators.
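A minimal sketch of a recognition request follows, assuming the v1 REST method `speech:recognize` and a short clip of 16 kHz LINEAR16 (raw PCM) audio. The helper name, default values, and placeholder bytes are hypothetical, and the request is built but not sent.

```python
import base64

# REST endpoint for synchronous recognition in the Speech-to-Text API (v1).
SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hz=16000):
    """Hypothetical helper: describe the audio and attach it as base64 text."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


# Placeholder bytes standing in for real audio data.
body = build_recognize_request(b"\x00\x01\x02\x03")
```

The audio settings in `config` must match the actual recording, otherwise recognition quality suffers or the request is rejected.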

Target Audience:
Developers in navigation, speech-related service industries

Translation API:
As the name suggests, the Translation API utilizes AI for text translation. Over the years, Google Translate has become increasingly accurate due to the accumulation of large amounts of data. Google has released its extensively trained models as the Translation API, enabling users to perform AI translation tasks quickly.
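The sketch below shows one way translation might be chained after speech recognition: it pulls transcripts out of a Speech-to-Text style response and builds a request body for the Translation API's v2 REST endpoint. The sample response, the helper names, and the target language are assumptions for illustration, and nothing is sent over the network.

```python
# REST endpoint for the Cloud Translation API (Basic, v2).
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def transcripts(speech_response):
    """Take the top alternative from each result of a recognition response."""
    return [
        result["alternatives"][0]["transcript"]
        for result in speech_response.get("results", [])
        if result.get("alternatives")
    ]


def build_translate_request(texts, target):
    """Hypothetical helper: request body asking for plain-text translation."""
    return {"q": list(texts), "target": target, "format": "text"}


# Hypothetical recognition response, shaped like the Speech-to-Text v1 output.
sample_speech_response = {
    "results": [
        {"alternatives": [{"transcript": "turn left at the station", "confidence": 0.93}]}
    ]
}

body = build_translate_request(transcripts(sample_speech_response), target="ja")
```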

Target Audience:
Language translation applications, cross-platform development, speech-related applications combined with Speech-to-Text

Video Intelligence:
With the rise of video platforms such as YouTube, Netflix, and TikTok, AI technologies dedicated to analyzing videos have become mature. Video Intelligence is an AI service used to analyze video content, extract different segments and scenes, and improve the viewer's experience by creating various tags.

Target Audience:
Professionals in the video industry, video data analysts

Vision API:
Vision API, the focus of this walkthrough, primarily identifies objects and faces in images and can also detect handwritten text. Common applications include facial recognition systems and reading handwritten content in forms, which are popular in the fintech industry, as well as product quality inspection in manufacturing.
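For the pre-trained side of the Vision API, a request names the image and the features to detect. The sketch below builds such a body for the v1 REST method `images:annotate`, asking for labels, faces, and dense or handwritten text. The helper name and placeholder image bytes are hypothetical, and authentication is omitted.

```python
import base64

# REST endpoint for image annotation in the Vision API (v1).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, max_results=5):
    """Hypothetical helper: one image, three detection features."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results},
                    {"type": "FACE_DETECTION", "maxResults": max_results},
                    # Dense text and handwriting use document text detection.
                    {"type": "DOCUMENT_TEXT_DETECTION"},
                ],
            }
        ]
    }


# Placeholder bytes standing in for a real JPEG or PNG file.
body = build_annotate_request(b"fake-image-bytes")
```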

Target Audience:
Fintech industry, facial recognition, manufacturing (product quality inspection), and more.

Vision API Implementation Objective:
Input images of cats and dogs, add tags, train the model, and test if the model can accurately identify other cat and dog images.

Source: Kaggle Cats & Dogs Dataset (https://www.microsoft.com/en-us/download/details.aspx?id=54765)

To access Vision API, follow the steps on the main page to enable it.

Select "Add Dataset."

Choose the model objective according to your needs.

Select "Upload from computer" and upload the zip file containing the cat and dog images prepared for this exercise.

Please wait while the dataset is being imported.

After the data import is complete, you can add labels. You can also click on "Label Statistics" to view label data.

Select "Train" and check that you have enough images: each label requires a minimum of 10. Once everything looks fine, start the training. This demonstration uses a total of 24 cat and dog images for training.
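The per-label minimum can be checked before uploading. The sketch below counts labels and reports any that fall short of the 10 images mentioned above. The function name and the sample label list are hypothetical; in practice the list would come from your own folder names or annotation file.

```python
from collections import Counter

MIN_IMAGES_PER_LABEL = 10  # minimum per label stated in the walkthrough


def labels_below_minimum(labels, minimum=MIN_IMAGES_PER_LABEL):
    """Return {label: count} for every label with too few images."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < minimum}


# Hypothetical dataset: one label per image.
image_labels = ["cat"] * 12 + ["dog"] * 9
short = labels_below_minimum(image_labels)
```

An empty result means every label meets the minimum and training can start.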

Choose whether to train the model for deployment on the cloud or on edge devices.

Choose the node-hour budget for training. A larger budget gives the training job more compute, but it also costs more. Once confirmed, you can start the training.

After training is completed, click on "Evaluate" to view the training results.

Since the model has not been deployed yet, choose "Test & Use," then deploy the model.

The number of nodes you deploy determines how many prediction requests can be served at the same time. If you anticipate a large number of end users needing image recognition, you may want to increase the number of nodes to handle the load.

After deployment is complete, you can use the model directly by clicking on "UPLOAD IMAGES".

Uploading a photo of a cute cat works without any problem: the model reports a 0.99 probability that it is a cat.

After uploading an adorable dog photo, the probability of it being a cat drops to 0.69. With more training images, the predictions should become more accurate.
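Scores like these are usually turned into a decision by applying a confidence threshold. The sketch below is one simple way to do that. The threshold of 0.8, the function name, and the dog scores are assumptions for illustration, while the cat scores of 0.99 and 0.69 come from the two tests above.

```python
def decide(scores, threshold=0.8):
    """Pick the highest-scoring label, or 'uncertain' if it is below the threshold."""
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= threshold else "uncertain"


cat_photo = decide({"cat": 0.99, "dog": 0.01})  # dog score is hypothetical
dog_photo = decide({"cat": 0.69, "dog": 0.31})  # dog score is hypothetical
```

With this threshold the cat photo is accepted as a cat, while the dog photo is flagged as uncertain instead of being reported as a cat, which is a safer outcome for a model trained on only a few images.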

Finally, if necessary, the deployed model can be called through the REST API or from Python code!
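As a hedged sketch of what such a call might look like, the code below builds the URL and JSON body for an online prediction. It assumes the model was trained with AutoML Vision and is served through the AutoML v1 REST `predict` method in the `us-central1` location. The console shows the exact request for your own model, and that should be treated as authoritative. The project ID, model ID, helper names, and image bytes are all hypothetical placeholders, and authentication is omitted.

```python
import base64


def predict_url(project_id, model_id, location="us-central1"):
    """Assumed AutoML v1 REST URL for online prediction."""
    return (
        "https://automl.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/models/{model_id}:predict"
    )


def build_predict_request(image_bytes):
    """Assumed request body: the image is sent as base64 text."""
    return {
        "payload": {
            "image": {"imageBytes": base64.b64encode(image_bytes).decode("ascii")}
        }
    }


# Hypothetical identifiers and placeholder image data.
url = predict_url("my-project", "ICN0000000000")
body = build_predict_request(b"fake-image-bytes")
```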

Even conventional software engineers who are not proficient in algorithm development can swiftly deploy an AI model using GCP's AI services. The predictions are also optimized and tuned for them, saving the time and money that developing AI algorithms from scratch would cost.

Author

Solution Architecture
吳祐德 Ted Wu
