Deep Learning 101: Lesson 7: Perceptron
This article is part of the “Deep Learning 101” series. Explore the full series for more insights and in-depth learning here.
A perceptron is a fundamental building block of artificial intelligence and machine learning. Think of it as a simplified model of a neuron in our brain. It takes inputs, multiplies them by their respective weights, adds them up, and applies a threshold to determine its output. This output is then compared to the desired output, and the perceptron adjusts its weights to minimize the difference between the actual and desired outputs (the classic perceptron uses the perceptron learning rule; differentiable variants are trained with gradient descent). In essence, the perceptron learns from data to make predictions or classify new examples, just as we learn from examples to make decisions in our daily lives. It’s an important concept in AI and machine learning, and it’s the foundation for more complex models.
The threshold concept we talked about earlier is also known as the activation function. The activation function is a mathematical function that takes the weighted sum of the inputs and applies a transformation to produce the output of the perceptron. The threshold can be thought of as a specific value of the activation function that determines when the perceptron should “fire” or produce an output.
A simple activation function, such as a step function, can be used to explain the perceptron. The step function compares the weighted sum of the inputs to the threshold: if the sum is greater than or equal to the threshold, the activation function outputs 1; otherwise, it outputs 0. There are other activation functions used in AI and machine learning, such as the sigmoid and softmax functions. These functions introduce non-linearity and allow neural networks to learn more complex patterns and relationships in the data.
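To make this concrete, here is a minimal sketch in Python of a perceptron with a step activation trained on AND-gate data. It uses the classic perceptron learning rule rather than gradient descent (the step function is not differentiable), and the learning rate and epoch count are illustrative choices, not values from this article:

```python
import numpy as np

def step(z):
    """Step activation: fire (1) when the weighted sum reaches the threshold 0."""
    return 1 if z >= 0 else 0

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Perceptron learning rule: nudge weights and bias after each mistake."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = step(np.dot(w, xi) + b)
            error = target - pred      # 0 when correct, +1 or -1 when wrong
            w += lr * error * xi       # adjust weights in proportion to the input
            b += lr * error            # adjust the bias
    return w, b

# AND-gate training data: output is 1 only when both inputs are 1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
print("learned weights:", w, "bias:", b)
for xi in X:
    print(xi, "->", step(np.dot(w, xi) + b))
```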
Understanding the Perceptron Through Logic Gates
Logic gates are the simplest forms of perceptrons and are essential to understanding how neural networks work. These gates process inputs to produce a binary output based on specific rules, similar to simple decision-making. By studying logic gates such as AND and OR with different activation functions, we can elucidate the inner workings of a perceptron.
AND Gate with Step Function
As depicted in the diagram above, the perceptron consists of several key components:
Inputs (x₁ and x₂): These represent the feature data that we feed into the perceptron. Each input is associated with a weight which signifies its relative importance.
Weights (w₁ and w₂): Weights are applied to the inputs and express the strength of the connection between the input and the neuron. The perceptron learns by adjusting these weights based on the error of its predictions.
Weighted Sum (Σ): The formula below calculates the weighted sum, which is the linear combination of the inputs and their respective weights, adjusted by a bias term:
z = w₁x₁ + w₂x₂ + b
Bias (b): The bias allows us to shift the activation function to the left or right, which helps with fine-tuning the output of the perceptron.
Activation Function: The step function used in the perceptron model is defined as:
f(z) = 1 if z ≥ 0, otherwise f(z) = 0
This function activates the neuron (outputs a 1) if the weighted sum is greater than or equal to zero, and deactivates it (outputs a 0) otherwise.
Output: The final binary result of the perceptron’s processing, determined by the activation function.
Example Calculations:
In the table below, we see the perceptron in action with specific weights and bias:
- Weights are set as w₁ = 0.6 and w₂ = 0.5
- The bias is set as b = -0.8

| x₁ | x₂ | z = w₁x₁ + w₂x₂ + b | Output |
|----|----|----------------------|--------|
| 0  | 0  | -0.8                 | 0      |
| 0  | 1  | -0.3                 | 0      |
| 1  | 0  | -0.2                 | 0      |
| 1  | 1  | 0.3                  | 1      |

The table shows how different combinations of input values x₁ and x₂ affect the weighted sum and the resulting output after applying the step function. For example, if both x₁ and x₂ are 1, the weighted sum is (1 × 0.6) + (1 × 0.5) − 0.8 = 0.3. Since the sum is greater than or equal to 0, the output is 1.
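The numbers in the table can be reproduced with a few lines of Python. This sketch simply evaluates the weighted sum and the step function for each input pair, using the weights and bias given above:

```python
def step(z):
    return 1 if z >= 0 else 0

w1, w2, b = 0.6, 0.5, -0.8  # weights and bias from the example above

for x1 in (0, 1):
    for x2 in (0, 1):
        z = w1 * x1 + w2 * x2 + b   # weighted sum
        print(f"x1={x1}, x2={x2}: z={z:+.1f} -> output {step(z)}")
```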
The simplicity of the perceptron belies its power. By adjusting weights and biases during the training process, a perceptron can make complex decisions by finding the right balance to correctly map inputs to the desired output. This section has illustrated the basic computations and operations within a single perceptron. Subsequent sections will build on this knowledge and demonstrate how multiple perceptrons can be combined into a network to tackle complex tasks.
OR Gate with Step Function
In contrast, an OR-gate perceptron outputs a 1 if at least one of its inputs is 1. This behavior can be explored by adjusting the weights and bias and following the same process as with the AND gate, as sketched below.
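The article does not fix specific OR-gate parameters, so the values in this sketch are one illustrative choice; any weights and bias that satisfy the OR truth table would work:

```python
def step(z):
    return 1 if z >= 0 else 0

# Illustrative OR-gate parameters (assumed, not from the article):
# any single active input must push the sum past the threshold.
w1, w2, b = 0.6, 0.6, -0.5

for x1 in (0, 1):
    for x2 in (0, 1):
        z = w1 * x1 + w2 * x2 + b
        print(f"x1={x1}, x2={x2}: z={z:+.1f} -> output {step(z)}")
```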
Transition to Sigmoid Activation
In the previous section, we discussed the architecture of a basic neural network model, using the step function as our activation function. Now, we will delve into a more advanced and commonly used activation function known as the Sigmoid function.
The Sigmoid function, often represented by the symbol σ(x), is a mathematical function that has a characteristic “S”-shaped curve or sigmoid curve. This function is widely utilized in the field of machine learning, particularly in the context of neural networks.
The Sigmoid function maps any input value to a value between 0 and 1, making it an ideal activation function for binary classification problems, where the output is required to be a probability value. The function’s equation is given by:
σ(x) = 1 / (1 + e^(−x))
where:
- e is the base of the natural logarithm,
- x is the input to the function.
The Sigmoid function provides a smooth gradient and is differentiable at every point, which is an essential property that allows for efficient backpropagation during the neural network training process.
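A small sketch helps show why this smoothness matters: the sigmoid’s gradient, σ′(x) = σ(x)(1 − σ(x)), is defined at every point, unlike the step function’s:

```python
import numpy as np

def sigmoid(x):
    """Maps any real input to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Gradient of the sigmoid, the property backpropagation relies on."""
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}: sigmoid={sigmoid(x):.4f}, gradient={sigmoid_derivative(x):.4f}")
```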
AND Gate with Sigmoid Function
As shown in the diagram above, to calculate the output, we first compute the weighted sum of the inputs plus a bias term using the equation below:
z = w₁x₁ + w₂x₂ + b
The bias allows us to shift the activation function to the left or right, which is critical for learning complex patterns. Once we have the weighted sum, we apply the Sigmoid function to this sum to get the neuron’s output.
Example Calculations:
Let’s consider an example where we have two inputs to our Sigmoid-based perceptron, x₁ and x₂, with respective weights w₁ = 5.470 and w₂ = 5.470, and bias b = -8.30. The table below shows the output of our neuron for all combinations of binary inputs (0 or 1):

| x₁ | x₂ | z = w₁x₁ + w₂x₂ + b | σ(z)     |
|----|----|----------------------|----------|
| 0  | 0  | -8.30                | ≈ 0.0002 |
| 0  | 1  | -2.83                | ≈ 0.056  |
| 1  | 0  | -2.83                | ≈ 0.056  |
| 1  | 1  | 2.64                 | ≈ 0.933  |
As seen in the table, when both inputs are 0, the weighted sum is simply the bias b, which after being passed through the Sigmoid function gives an output close to 0. Conversely, when both inputs are 1, the sum is positive, and the output is close to 1, indicating a high probability scenario.
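These outputs can be verified directly. This sketch evaluates the sigmoid neuron on all four input pairs using the weights and bias from the example:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w1, w2, b = 5.470, 5.470, -8.30  # parameters from the example above

for x1 in (0, 1):
    for x2 in (0, 1):
        z = w1 * x1 + w2 * x2 + b
        print(f"x1={x1}, x2={x2}: z={z:+.2f} -> sigmoid output {sigmoid(z):.4f}")
```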
Through these simple but illustrative examples, we gain insight into the role of the perceptron as a building block in neural networks. The perceptron’s ability to perform logical operations with different activation functions demonstrates its flexibility and power. The step function provides binary decisions reminiscent of traditional digital circuits, while the sigmoid function introduces the nuance and complexity needed to model more complex, continuous responses. Understanding these dynamics is critical to appreciating the broader capabilities and applications of neural networks in AI.
Summary
The perceptron is a fundamental element of neural networks and represents a simplified model of a neuron. By adjusting weights and biases, perceptrons can perform logical operations and make binary decisions. Through examples using AND and OR gates with step and sigmoid functions, we see how perceptrons work and the importance of activation functions in capturing complex patterns. Understanding perceptrons is critical to appreciating the broader applications of neural networks in AI.
4 Ways to Learn
1. Read the article: Perceptron
2. Play with the visual tool: Perceptron
3. Watch the video: Perceptron
4. Practice with the code: Perceptron
Previous Article: Tensors
Next Article: Backpropagation