Generative AI: An Introduction and an Exploration of Its Many Facets

Artificial Intelligence, commonly called AI, describes systems that exhibit behaviour which could be identified or interpreted as human intelligence. Whether AI capabilities should be classified as intelligence at all has long been debated among industry experts. A well-known thought experiment (John Searle's "Chinese Room") imagines a system that responds to requests in Mandarin Chinese with answers in Mandarin Chinese, not by understanding the language, but by matching the requests against phrasebooks of Mandarin phrases and terms fed into its database. The point it makes is that such systems aren't exactly intelligent, just excellent at observing, recognizing, and matching patterns. Big players and sponsors of AI research include OpenAI, NVIDIA, Google, Meta, UC Berkeley, and LMU Munich, amongst many others.

There are various types of AI systems, including two broad classes: Discriminative AI (DAI) and Generative AI (GAI). Discriminative AI focuses on classifying or identifying content based on pre-existing data, while Generative AI is primarily for generating content and powers applications such as image generation, video synthesis, language generation, and music composition. Other subcategories of AI include:

  • Reactive Machines – used in self-driving cars

  • Limited Memory – used in weather forecasts

  • Theory of mind – used in virtual customer assistants

  • Narrow AI – generates customized product suggestions for e-commerce sites

  • Supervised learning – identifies objects from things like images and video

  • Unsupervised learning – can detect fraudulent bank transactions

  • Reinforcement learning – can teach a machine how to play a game

Applications of GAI

It is important to note that GAI can be either private or public (open source). Many AI companies and creators have made their systems' source code available to the public, to be used as-is or tweaked to personal preference. AI source code is typically written and run in tools called notebooks; examples include Google Colab and Jupyter Notebook. Google Colab requires a paid subscription if you want faster hardware to reduce generation time, but you can always stick with the free default. The beauty of working with notebooks is that you can tweak and personalize the outcomes. There are different levels of users when it comes to AI and its architecture, namely: Basic users, Intermediate users, and Professional users.

  • A Basic user simply interacts with an AI system's interface to generate or create content for regular everyday uses like research and recreation.

  • An Intermediate user goes further, interacting with the source code of existing AI systems to tweak it to personal preference, as mentioned earlier.

  • A Professional user, on the other hand, goes even further, either writing the source code for an AI system or emulating an existing one, and then creating their own notebook for writing and running the code.

The following terms are used in AI regularly:

  • Model – this is a set of algorithms that have been trained on a specific dataset

  • Notebook – this is a tool for writing and running AI source code

  • Application – this is an example of how a model can be used

  • Outcome – this is what the end user produces using GAI or a notebook that houses a model

Natural Language Models (NLMs)

These are the fundamental components on which Natural Language Processing features (discussed in detail later) are built. They are employed in a variety of applications, including spelling auto-correction, translation, speech recognition, and summarization. Most text-based GAI systems use NLMs in their operation. A commonly used one is GPT-3; GPT is an acronym for Generative Pre-trained Transformer. GPT is notable for its large scale, its transformer architecture, and its ability to generate human-like text. GPT powers GAIs like the popular ChatGPT, produced and released by OpenAI, and the new Bing search from Microsoft. When ChatGPT was released, thanks to the adaptability of its core architecture, it reached over 1 million users in less than a week (a minimal usage sketch follows the list below). GPT-3, however, has its limitations. These include:

  • Lack of common sense

  • Lack of creativity

  • Lack of understanding of generated text

  • Biased databases, and

  • Normalization of mediocrity
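As promised above, here is a minimal sketch of generating text with a pre-trained transformer. GPT-3 itself is only reachable through OpenAI's paid API, so this sketch substitutes the freely downloadable GPT-2 via the Hugging Face transformers library; the prompt and settings are illustrative.

```python
# A minimal text-generation sketch with a small pre-trained
# transformer (GPT-2), using the Hugging Face `transformers` library.
# GPT-3 is API-only, so GPT-2 stands in as a freely downloadable relative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Generative AI is",
    max_new_tokens=40,        # cap the length of the continuation
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```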

Text-to-Image applications

Text-to-image applications, as the name implies, are GAI systems that interpret your text input and convert it into images or photos to the best of their understanding. Some of these include:

  • Midjourney, which has a closed API and an art-centric approach to its architecture;

  • DALL-E, which has an open API and, when released by OpenAI, had the most advanced machine learning algorithm of its kind. It favours technical superiority over design;

  • Stable Diffusion, which is open source, and therefore continues to receive improvements from its developer community.

The quality of images generated by text-to-image models depends on the quality of the algorithms and the datasets used to train them (a minimal generation sketch follows the list below). Some industry applications include:

  • Streaming production of movie backgrounds with Cuberic (demonstrated in an alpha-version walkthrough).

  • Suggesting garments by mixing real clothes with clothes generated by DALL-E, at Stitch Fix.

  • Ideation for concepts, films, and storyboards.
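For the curious, here is a minimal sketch of what driving one of these systems from code can look like, using Stable Diffusion through the open-source diffusers library. The model name, prompt, and GPU assumption are illustrative, not taken from the text.

```python
# A minimal text-to-image sketch using Stable Diffusion via the
# open-source `diffusers` library (model name and prompt are
# illustrative; a CUDA GPU is assumed for reasonable generation times).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```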

Generative Adversarial Networks (GANs)

GAN models pair two components, a generator and a discriminator, that work together to improve the generator's ability to create realistic data. The generator produces a guess or prediction of the data, and the discriminator critiques that prediction for accuracy against the real data, pushing the generator to produce a more accurate prediction each round. This back-and-forth between generator and discriminator repeats, almost like a toggle, until the data produced by the generator is almost indistinguishable from the real data. In GANs, you input one type of data and output the same type of data, a replica of the input.
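To make the generator/discriminator toggle concrete, here is a toy GAN sketch in PyTorch. The network sizes, the target data distribution, and all hyperparameters are invented for illustration.

```python
# A toy GAN in PyTorch: the generator learns to mimic samples from a
# Gaussian, while the discriminator critiques its guesses.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # discriminator

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0   # the "correct" data: N(3, 0.5)
    fake = G(torch.randn(64, 8))            # the generator's guess

    # Discriminator turn: learn to tell real data from the generator's output.
    opt_d.zero_grad()
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Generator turn: try to fool the discriminator into labelling fakes real.
    opt_g.zero_grad()
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())  # should drift toward 3.0
```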

Some real-world applications of GANs include:

  • Audi using GANs for their wheel designs

  • Beko using GANs in their sustainability stand film

  • GANs being used to produce synthetic versions of fraudulent transactions

Variational Auto-Encoders (VAE) and Anomaly detection

Variational Auto-Encoders are used for anomaly detection by training the model on datasets of normal data and then using it to flag situations that deviate from that norm (a minimal sketch follows the list below). Applications include:

  • Detection of financial fraud, manufacturing flaws, and network security breaches

  • Detection of anomalies in healthcare, using medical imaging (CT scans, MRI) alongside electronic records, vital signs, and demographic info

  • Detection of scratches, dents, and misalignments in industrial quality control
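Here is a minimal sketch of the reconstruction-error workflow described above. For brevity it uses a plain (non-variational) autoencoder rather than a full VAE, but the idea is the same: train only on normal data, then flag inputs the model reconstructs poorly. The shapes, training data, and threshold are illustrative.

```python
# Anomaly detection by reconstruction error, using a simplified
# (non-variational) autoencoder in PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(20, 4), nn.ReLU(),   # encoder: squeeze 20 features into 4
    nn.Linear(4, 20),              # decoder: rebuild the original 20
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

normal = torch.randn(500, 20)      # stand-in for "normal" training data
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(normal), normal)
    loss.backward()
    opt.step()

def is_anomaly(x, threshold=2.0):
    # High reconstruction error means the input deviates from normal.
    err = nn.functional.mse_loss(model(x), x)
    return err.item() > threshold

print(is_anomaly(torch.randn(1, 20)))        # likely normal
print(is_anomaly(torch.randn(1, 20) * 10))   # likely flagged as anomalous
```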

How AI works

As mentioned earlier, when matching patterns, computers are just using high-tech phrasebooks. AI works by using a process called machine learning to observe and memorize data, and then using that data to analyze and interpret real-world scenarios. There are two classifications of AI: Strong AI, which exhibits most or all person-like behaviours, and Weak AI, which is confined to very narrow tasks, like Apple's Siri and Microsoft's Cortana. Machine learning uses data the way we use our five senses, and so requires access to large amounts of it to function and to give results as accurate as possible. It has become the dominant form of AI over the years because of its ability to find patterns in large datasets.

Artificial Neural Networks (ANNs)

A type of machine learning system in wide use is the Artificial Neural Network (ANN). An ANN is an AI system that mimics the structure of the human brain, and it is one of the most common approaches to machine learning. It uses hundreds (or millions) of numerical dials, its weights, that enable it to make much more specific guesses. For example, when interpreting an image as text, it doesn't see the image as you do, but rather as a recognizable pattern of dots. Since ANNs are still machine learning systems, they still need access to huge amounts of data.
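A toy sketch of what those "numerical dials" look like in code; the layer sizes here are illustrative.

```python
# A tiny artificial neural network in PyTorch. The "dials" the text
# mentions are the network's learnable weights.
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(784, 32),  # e.g. a 28x28 image flattened into 784 dots
    nn.ReLU(),
    nn.Linear(32, 10),   # scores for 10 possible labels
)

n_dials = sum(p.numel() for p in net.parameters())
print(f"Even this small network has {n_dials} adjustable weights.")
```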

Natural Language Processing (NLP)

Communication between machines can be presumed to be more accurate than communication between humans because of the straightforward, unbiased way in which machines exchange data, from a specifically designed source to a destination. AI strives to interact just as efficiently using human language, and this is the goal of NLP: interacting with the machine in your natural language. Modern NLP also uses machine learning and ANNs, and so likewise requires large amounts of data to learn; it looks through millions of conversations to identify patterns and communication trends. In NLP, it isn't just about the machine understanding the words, but also the context and meaning of their usage.

How AI systems learn from data

Machines learn from data in two main ways:

Supervised learning: In supervised learning, a data scientist acts as a tutor for the machine. The machine is fed small chunks of labelled data, called the training data, which it then uses to classify much larger chunks of real-world data, called the test data, into human-created categories; a data scientist creates these categories. The system uses machine learning algorithms that rely on statistics to find relationships within the data. An example is binary classification, where there can only be two possible outcomes or guesses.
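A minimal supervised-learning sketch with scikit-learn, using a synthetic dataset and binary classification (two possible outcomes); the dataset and model choice are illustrative.

```python
# Supervised learning: labelled training data teaches a binary
# classifier, which is then scored on held-out test data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)     # learn from the labels
print("test accuracy:", clf.score(X_test, y_test))   # evaluate on unseen data
```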

Unsupervised learning: Here, you let the machine make all the observations on its own. It still needs to be fed huge amounts of training data, as that is the only way for it to see the patterns; only here, the data is unlabeled. The machine therefore creates its own groups of data, called clusters. One of the biggest advantages of clustering is that there is far more unlabeled data in the world for the machine to learn from.

Types of Machine learning algorithms

Many machine learning algorithms, as earlier stated, are borrowed from statistics. Some also work for both supervised and unsupervised learning techniques. Some types of machine learning algorithms include the following:

Reinforcement learning: This is a machine learning algorithm that uses rewards as incentives for the system to find new patterns. You can look at it as an algorithm that encourages the system to enhance its pattern-matching capabilities by giving it a sort of digital gift or point every time it makes a successful match. A typical example is telling Amazon Alexa to find and play more songs in the same genre as the one currently playing; when Alexa does so successfully, a reward is issued internally in its algorithm that helps it make even more accurate matches in the future. Reinforcement learning is also used to teach systems how to play games, where they improve over time through a series of goals and rewards. Q-learning is a reinforcement learning technique that finds the best course of action given the current state of the agent; in Q-learning, the machine is required to improve the quality (the "Q") of its outcomes with each subsequent prediction or output.
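A toy Q-learning sketch on an invented five-state corridor, where the agent earns a reward for reaching the rightmost state; all states, actions, and hyperparameters are illustrative.

```python
# Tabular Q-learning: the agent starts at state 0 and earns a reward
# of 1 for reaching state 4 on the right.
import numpy as np

n_states, n_actions = 5, 2             # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))    # the "quality" table being learned
alpha, gamma = 0.1, 0.9                # learning rate and discount factor

rng = np.random.default_rng(0)
for episode in range(1000):
    s = 0
    while s != 4:
        a = rng.integers(n_actions)    # explore randomly (Q-learning is off-policy)
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        reward = 1.0 if s_next == 4 else 0.0
        # Nudge Q[s, a] toward the reward plus the best discounted future value.
        Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned policy: all 1s, i.e. "always go right"
```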

K-Nearest Neighbour (KNN): KNN is an algorithm that plots new data and compares it to existing data. It finds the existing data points with the fewest differences and most similarities to the new point and groups them together. As opposed to binary classification, KNN performs multi-class classification, where there can be more than two possible outcomes. Minimizing the distance between data points is the key component, in this case the Euclidean distance, a mathematical formula for the straight-line distance between data points. KNN uses classification predictors to make its groupings while noting the differences between the data.
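A minimal KNN sketch with scikit-learn on a synthetic three-class dataset (multi-class, not binary); the dataset and the number of neighbours are illustrative.

```python
# KNN classification. The Euclidean distance between points p and q,
# sqrt(sum_i (p_i - q_i)^2), is the library's default metric.
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # compare against the 5 nearest points
knn.fit(X, y)

print(knn.predict([[0.0, 2.0]]))  # grouped with its most similar neighbours
```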

K-Means Clustering (KMC): KMC is an unsupervised machine learning algorithm used to create clusters based on what the machine sees in the data. The system analyses the data and makes its own groups and classifications, called clusters. KMC is a good way for a system to organize or group people into clusters by looking at hundreds of variables.
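A minimal K-Means sketch; note that the labels produced by the data generator are deliberately discarded, since the algorithm must discover the groups on its own. The dataset and number of clusters are illustrative.

```python
# K-means clustering: grouping unlabeled points into k clusters.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)  # labels unused

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])        # cluster assigned to each point
print(kmeans.cluster_centers_)    # the centre of each discovered group
```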

Regression: Regression, or regression analysis, is a supervised machine learning algorithm that looks at the relationship between predictors and an outcome. Predictors are also referred to as regressors, input variables, or independent variables. Regression works by taking the training data, labelling the correct output beforehand, and then letting the system apply what it learned from the labelled data to the test data. Regression analysis uses trends to determine outcomes, which makes it a valuable tool for businesses and organizations. Some experts dispute the idea that regression is really machine learning, because in actuality it isn't learning anything, just identifying trends and predicting outcomes. Either way, it is very powerful for predicting future behaviour.
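A minimal regression sketch that recovers a simulated trend; the slope, intercept, and noise level are invented for illustration.

```python
# Linear regression: fitting the relationship between a predictor
# (independent variable) and an outcome, then using the trend to predict.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))               # predictor / regressor
y = 3.0 * X.ravel() + 2.0 + rng.normal(0, 1, 100)   # outcome with noise

reg = LinearRegression().fit(X, y)
print(reg.coef_, reg.intercept_)   # should be close to 3.0 and 2.0
print(reg.predict([[12.0]]))       # extrapolating the learned trend
```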

Naive Bayes: The Naive Bayes algorithm makes the unusual assumption that all the predictors are independent of each other. It classifies items based on many features in the data, not just those that are closely correlated or similar. It computes a class predictor probability: it looks at each predictor independently and creates a probability that an item belongs to a certain class. Banks can use Naive Bayes to check for fraud by looking at each banking predictor independently and measuring the likelihood that it indicates fraud, then combining these class predictor probabilities to classify the transaction as fraudulent or not. Because Naive Bayes makes so few assumptions about relationships between predictors, it can analyze large numbers of predictors, which ultimately makes its classifications more accurate.
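A minimal Naive Bayes sketch on a synthetic, fraud-like dataset in which positive cases are deliberately rare; all parameters are illustrative.

```python
# Gaussian Naive Bayes: each predictor contributes an independent
# probability that an item belongs to a class (the "class predictor
# probability" idea from the text).
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.95], random_state=0)  # rare positives

nb = GaussianNB().fit(X, y)
print(nb.predict_proba(X[:3]))  # per-class probabilities for each transaction
```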

How to select the best algorithm

In selecting the best algorithm for your AI system, you can use the following tips:

  • Ensemble modelling, which includes:

      • Bagging: using several versions of the same machine-learning algorithm

      • Stacking: using different machine learning algorithms and stacking them on top of each other

Mixing and matching your machine-learning algorithms will give you different insights with different results.
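A minimal sketch of both flavours of ensemble modelling in scikit-learn; the base models and the synthetic dataset are illustrative.

```python
# Bagging vs. stacking: bagging repeats one algorithm on resampled
# data; stacking layers different algorithms on top of each other.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25)  # many copies of one model
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()), ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(),  # learns how to combine the base models
)

for name, model in [("bagging", bag), ("stacking", stack)]:
    print(name, model.fit(X, y).score(X, y))
```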

  • Follow the data: In machine learning, systems must always follow the data to reach the best or most accurate outcomes, but this is not easy. One of the biggest challenges in machine learning is successfully balancing the bias and variance in the data.

  • Bias is a value that indicates the gap or difference between the predicted value and the actual outcome, while,

  • Variance is a measure of the degree of scatter of the predicted values

One of the commonest challenges in machine learning in this respect is the bias-variance tradeoff: if the system attempts to reduce the impact of one, say the bias, it must also watch the impact of the other, the variance. This is why it must follow the data, continually fine-tuning both until it arrives at the best balance between them, which becomes the prediction. The best outcome is always a prediction with low bias and low variance.
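For readers who want the underlying math, the textbook decomposition of a model's expected squared error makes the tradeoff explicit (this standard result is added here for context, not taken from the text):

```latex
% Bias-variance decomposition of expected squared error for a model
% \hat{f} estimating a true function f, with irreducible noise
% variance \sigma^2:
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \sigma^2
```

Lowering one term typically raises the other, so the system tunes both until their sum is as small as possible.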

Factors to consider when building an AI system

Two very defining factors to be considered when building an AI system are:

  • Figuring out what you want from the data and,

  • Determining the type of machine learning model you need: standard machine learning algorithms or ANNs

Ethics in AI

Just as with every other technological implementation, there are ethical steps to be taken regarding AI usage and its interaction with humans. These include:

  • Assessing whether the generated result fits your quality and satisfaction parameters

  • Organizing a board as an ethical foundation for the integration of GAI in your company

  • Providing employees with guidance and education

  • Ensuring that humans remain the sole decision-makers

  • Focusing on highlighting the role that humans play in the creation and use of AI

GAI, and AI in general, are here to stay, considering the many aspects of life they have been integrated into. They make day-to-day operations easier for us, even while we maintain human autonomy, and they are only projected to get better at these operations as we develop and improve their architecture in future versions.