VeRA: Vector-based Random Matrix Adaptation
Tony Shin
1:16
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Tony Shin
0:50
HyperAttention: Long-context Attention in Near-Linear Time
Tony Shin
1:25
Fast Feedforward Networks
Tony Shin
1:15
Nougat: Neural Optical Understanding for Academic Documents
Tony Shin
1:14
Retentive Network: A Successor to Transformer for Large Language Models
Tony Shin
1:05
LLaVA: Visual Instruction Tuning
Tony Shin
1:09
DeepReader Live Stream
Tony Shin
BloombergGPT: A Large Language Model for Finance
Tony Shin
1:56
ImageBind: One Embedding Space To Bind Them All
Tony Shin
3:02
Segment Anything
Tony Shin
2:00
Are Emergent Abilities of Large Language Models a Mirage?
Tony Shin
2:17
Synthetic Data Boosts ImageNet Classification
Tony Shin
1:12
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Tony Shin
0:47
[Tutorial] Image Super Resolution without Photoshop
Tony Shin
23:34
YOLO9000: Better, Faster, Stronger
Tony Shin
10:32
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Tony Shin
15:10
Florence: A New Foundation Model for Computer Vision
Tony Shin
10:27
DSSD: Deconvolutional Single Shot Detector
Tony Shin
8:03
MAE: Masked Autoencoders Are Scalable Vision Learners
Tony Shin
8:02
PVANet: Deep but Lightweight Neural Networks for Real-time Object Detection
Tony Shin
5:01
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Tony Shin
5:36
R-FCN: Object Detection via Region-based Fully Convolutional Networks
Tony Shin
6:32
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Tony Shin
5:28
Pix2Seq: A Language Modeling Framework for Object Detection
Tony Shin
7:33
Improved Regularization of Convolutional Neural Networks with Cutout
Tony Shin
2:41
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
Tony Shin
7:13
SSD: Single Shot MultiBox Detector
Tony Shin
3:23
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Tony Shin
4:33
MLP-Mixer: An all-MLP Architecture for Vision
Tony Shin
5:22
YOLO: Unified, Real-Time Object Detection
Tony Shin
4:09
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Tony Shin
4:26
OHEM: Training Region-based Object Detectors with Online Hard Example Mining
Tony Shin
3:41
Swin Transformer Object Detection Demo
Tony Shin
6:31
Faster R-CNN
Tony Shin
6:30
Fast R-CNN
Tony Shin
5:27
AttentionNet: Aggregating Weak Directions for Accurate Object Detection
Tony Shin
6:33
DeepBox: Learning Objectness with Convolutional Networks
Tony Shin
8:59
MR-CNN: Object detection via a multi-region & semantic segmentation-aware CNN model
Tony Shin
8:09
[DeepReader] SPP-Net: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Tony Shin
6:07
[DeepReader] MultiBox: Scalable Object Detection using Deep Neural Networks
Tony Shin
4:17
[DeepReader] OverFeat: Integrated Recognition, Localization and Detection using Conv. Networks
Tony Shin
5:15
[DeepReader] R-CNN
Tony Shin
4:50
[Tutorial] Training End-to-end Object Detection with Transformer (DETR) model on custom dataset
Tony Shin
25:16
[DeepReader] Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
Tony Shin
6:36
[DeepReader] DeLighT: Very Deep and Light-weight Transformer
Tony Shin
6:45
[DeepReader] Contrastive Learning for Unpaired Image-to-Image Translation
Tony Shin
4:40
[DeepReader] Big Bird: Transformers for Longer Sequences
Tony Shin
6:03
[DeepReader] MiCo: Mixup Co-Training for Semi-Supervised Domain Adaptation
Tony Shin
6:34
[DeepReader] PP-YOLO: An Effective and Efficient Implementation of Object Detector
Tony Shin
7:12
[DeepReader] DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation
Tony Shin
7:10
[DeepReader] Transformers are RNNs
Tony Shin
5:30
RepPoints: Point Set Representation for Object Detection
Tony Shin
5:14
RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
Tony Shin
5:52
PointRend: Image Segmentation as Rendering
Tony Shin
6:04
Neural Architecture Design for GPU Efficient Networks
Tony Shin
9:11
Locally Masked Convolution for Auto-regressive Models
Tony Shin
4:43
Rethinking the Truly Unsupervised Image-to-Image Translation
Tony Shin
6:43
Generative Pretraining from Pixels
Tony Shin
6:13
Disentangled Non-local Neural Networks
Tony Shin
8:13
DETR: End-to-End Object Detection with Transformers
Tony Shin
6:14
CornerNet: Detecting Objects as Paired Keypoints
Tony Shin
5:03
EfficientDet: Scalable and Efficient Object Detection
Tony Shin
7:24