Computer Vision Architectures – From Scratch Implementations

Overview

This repository is a structured exploration of classical and modern computer vision architectures implemented in PyTorch.

The goal of this project is to understand the evolution of convolutional and attention-based models, from AlexNet to Vision Transformers, by implementing each one modularly and analyzing its architectural trade-offs.

Architectures Implemented

AlexNet
VGG
ResNet
DenseNet
GoogLeNet (Inception v1)
MobileNet
SqueezeNet
Vision Transformer (ViT)

Additional Modules

Convolution operations (from-scratch implementation)
Regularization techniques
Transfer learning strategies
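
As an illustration of what a from-scratch convolution involves, here is a minimal sketch in pure Python. It is not taken from the repository's notebooks: the function name `conv2d`, its arguments, and the use of nested lists instead of tensors are assumptions made for this example. Like the convolution layers in deep learning frameworks, it computes a cross-correlation (the kernel is not flipped).

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Cross-correlate a 2D image with a 2D kernel (hypothetical helper).

    image and kernel are lists of lists of numbers. Zero padding is applied
    symmetrically on all four sides before the kernel is slid over the image.
    """
    if padding:
        padded_width = len(image[0]) + 2 * padding
        border = [[0.0] * padded_width for _ in range(padding)]
        body = [[0.0] * padding + list(row) + [0.0] * padding for row in image]
        image = border + body + [row[:] for row in border]

    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])

    # Standard output-size formula: floor((in - k) / stride) + 1
    out_h = (in_h - k_h) // stride + 1
    out_w = (in_w - k_w) // stride + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Sum of element-wise products over the current window
            for u in range(k_h):
                for v in range(k_w):
                    acc += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
# Diagonal-difference kernel: each output is image[i][j] - image[i+1][j+1]
kernel = [[1, 0], [0, -1]]

print(conv2d(image, kernel))            # 3x3 output, every entry is -5.0
print(conv2d(image, kernel, stride=2))  # 2x2 output
```

A real layer would add input and output channel dimensions, a bias term, and a vectorized implementation. The quadruple loop here is only meant to make the sliding-window arithmetic explicit.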

Research Motivation

This repository was created to:

Understand architectural innovations in deep learning
Analyze parameter efficiency vs performance trade-offs
Compare convolution-based and attention-based approaches
Explore generalization techniques in deep neural networks
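
To make the parameter-efficiency question concrete, the sketch below counts the weights of one 7x7 convolution against a stack of three 3x3 convolutions, the comparison commonly used to motivate VGG-style designs. Both options cover a 7x7 receptive field. The helper `conv_params` and the channel width of 64 are assumptions chosen for illustration, not values taken from this repository.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Number of learnable parameters in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


channels = 64  # illustrative width, assumed for this example

# One large kernel vs. three stacked small kernels (same 7x7 receptive field)
single_7x7 = conv_params(7, channels, channels, bias=False)
stacked_3x3 = 3 * conv_params(3, channels, channels, bias=False)

print(single_7x7)   # 200704
print(stacked_3x3)  # 110592
print(round(stacked_3x3 / single_7x7, 2))  # 0.55
```

The stacked version uses roughly 55% of the weights and also interleaves two extra nonlinearities. Counting parameters alone says nothing about accuracy, which still has to be measured.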

Key Insights from Implementation

Residual connections mitigate vanishing gradients
Dense connectivity encourages feature reuse
Depthwise separable convolutions reduce computational cost
Vision Transformers lack the built-in spatial locality bias of convolutions, so they typically need larger datasets or pretraining to perform well
Regularization techniques reduce overfitting and improve generalization
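
The cost reduction from depthwise separable convolutions can be checked with simple arithmetic. The sketch below compares a standard convolution with its depthwise plus pointwise factorization, ignoring biases. The function names and the layer shape (3x3 kernel, 64 input channels, 128 output channels, 56x56 output map) are assumptions chosen for illustration.

```python
def standard_conv_cost(k, c_in, c_out, out_h, out_w):
    """Parameters and multiply-accumulates (MACs) of a standard k x k conv."""
    params = k * k * c_in * c_out
    macs = params * out_h * out_w  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, out_h, out_w):
    """Cost of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    params = depthwise + pointwise
    macs = params * out_h * out_w
    return params, macs


std_params, std_macs = standard_conv_cost(3, 64, 128, 56, 56)
sep_params, sep_macs = depthwise_separable_cost(3, 64, 128, 56, 56)

print(std_params, sep_params)            # 73728 8768
print(round(std_params / sep_params, 2)) # 8.41
```

The ratio of separable to standard cost works out to 1/c_out + 1/k^2, so with 3x3 kernels the saving approaches 9x as the number of output channels grows.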

Tech Stack

Python
PyTorch
NumPy

Future Work

Hybrid CNN-Transformer architectures
3D CNNs for hyperspectral imagery
Vision-Language Models (VLMs)
Self-Supervised Pretraining methods

This repository was created as a structured study of deep learning architectures to understand their mathematical foundations and architectural evolution.

