8000 GitHub - raj-kariya/CNN_tut
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

raj-kariya/CNN_tut

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Kaggle Dataset: MNIST in CSV

For image recognition and classification tasks, Convolutional Neural Networks (CNNs) are a powerful tool in neural networks. They excel in areas like object detection, facial recognition, and more.

CNN Image Classification

A CNN takes an image as input, processes it, and assigns it to a specific category (e.g., dog, cat, tiger, lion). Computers perceive images as arrays of pixels, with the dimensions (height, width, and depth) depending on the resolution. For example, a 6x6x3 image represents a color image with RGB values (3 channels), while a 4x4x1 image represents a grayscale image.

Deep Learning with CNNs

To train and test deep learning CNN models, each input image goes through a series of steps:

  1. Convolution Layers with Filters (Kernels): These layers extract features from the image. Convolution preserves the spatial relationships between pixels by learning features using small squares of input data. It's a mathematical operation involving the image matrix and a filter/kernel.

    • Example: A 5x5 image matrix multiplied by a 3x3 filter matrix produces a "feature map" as output.
  2. Pooling Layers: These layers reduce the number of parameters, particularly when dealing with large images. Spatial pooling (also known as subsampling or downsampling) reduces the dimensionality of each map while retaining important information. Pooling can be of different types:

    • Max Pooling: Takes the largest element from the rectified feature map.
    • Average Pooling: Takes the average of elements in the feature map.
    • Sum Pooling: Takes the sum of all elements in the feature map.
  3. Fully Connected Layers (FC Layers): After processing through convolutional and pooling layers, the data is flattened into a vector and fed into fully connected layers, similar to traditional neural networks. These layers further refine the extracted features for classification.

  4. Softmax Function: This function assigns probabilities (between 0 and 1) to each class, allowing the model to predict the image's class with a degree of confidence.

The following diagram depicts the complete flow of a CNN for processing and classifying an image:

Key Components of a CNN

  • Convolution: The first layer to extract features, preserving spatial relationships between pixels.

  • Filters (Kernels): Small squares of weights that slide over the input image to extract features. Convolution of the image with different filters can perform operations like edge detection, blurring, and sharpening.

  • Stride: The number of pixels to shift the filter over the input matrix. A stride of 1 moves the filter by one pixel at a time, while a stride of 2 moves it by two pixels.

  • Padding: A technique to address filters not perfectly fitting the input image. Options include adding zeros (zero-padding) or dropping the non-fitting parts (valid padding).

  • ReLU (Rectified Linear Unit): A non-linear activation function that introduces non-linearity into the network. It helps the network learn more complex features by taking the maximum between 0 and the input value.

    • Importance of ReLU: It avoids the vanishing gradient problem that can occur with other activation functions like sigmoid and tanh. ReLU is generally preferred for its computational efficiency and performance.

Summary of the CNN Classification Process

  1. Provide the input image to the convolutional layer.
  2. Choose parameters, filters with strides, and padding (if needed).
  3. Perform convolution on the image and apply ReLU activation.
  4. Perform pooling to reduce dimensionality.
  5. Add as many convolutional layers as necessary.
  6. Flatten the output and feed it into fully connected layers.
  7. Output the class using an activation function (often Softmax) to classify the image.

In the next post, we'll explore popular CNN architectures like AlexNet, VGGNet, GoogLeNet, and ResNet.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0