quarrying/quarrying-paper-notes

Incomplete Record of Paper Reading

Records papers read for at least one hour on a given day (not necessarily finished); a checked box means notes were taken.

2025

  • 20250213 [2020] End-to-End Object Detection with Transformers
  • 20250212 [2017] Attention is All You Need
  • 20250121 [2022] SVTR_ Scene Text Recognition with a Single Visual Model

2024

  • 20241111 [2022 IJCAI] SVTR_ Scene Text Recognition with a Single Visual Model
  • 20241111 [2020 AAAI] Real-time Scene Text Detection with Differentiable Binarization
  • 20240408 [2023] Improved Baselines with Visual Instruction Tuning
  • 20240227 Skimmed literature on large-model compression
  • 20240227 [2022 ICLR] Finetuned Language Models Are Zero-Shot Learners
  • 20240223 [2015] Cross Modal Distillation for Supervision Transfer
  • 20240222 Skimmed literature on large-model compression
  • 20240222 [2022] Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
  • 20240222 [2023] Multimodal Chain-of-Thought Reasoning in Language Models
  • 20240221 Skimmed literature on large-model compression
  • 20240221 [2018] Improving language understanding by generative pre-training
  • 20240220 [2023 ACL] Distilling Step-by-Step_ Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
  • 20240220 [2018] Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
  • 20240220 [2020] DistilBERT, a distilled version of BERT_ smaller, faster, cheaper and lighter
  • 20240204 [2023 CVPR] Micron-BERT_ BERT-based Facial Micro-Expression Recognition

2023

  • 20231120 [2023] Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
  • 20231116 [2023] Shikra_ Unleashing Multimodal LLM’s Referential Dialogue Magic
  • 20231116 [2023] Ferret_ Refer and Ground Anything Anywhere at Any Granularity
  • 20231107 [2023] Visual Instruction Tuning
  • 20231107 [2023] What Makes for Good Visual Instructions_ Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning
  • 20231019 [2021] Improving Calibration for Long-Tailed Recognition
  • 20231017 [2023 NeurIPS] Multi-modal Queried Object Detection in the Wild
  • 20231017 [2019] Objects365_ A Large-scale, High-quality Dataset for Object Detection
  • 20230829 [2021] RegionCLIP_ Region-based Language-Image Pretraining
  • 20230829 [2021] OpenPrompt_ An Open-source Framework for Prompt-learning
  • 20230630 Skimmed literature on fine-tuning for multimodal tasks
  • 20230627 [2023] Segment Anything
  • 20230627 [2020] End-to-End Object Detection with Transformers
  • 20230621 [2022 ECCV] Visual Prompt Tuning
  • 20230621 [2023] Segment Anything in High Quality
  • 20230510 Skimmed literature on vision-language pre-training
  • 20230427 [2021] Swin Transformer_ Hierarchical Vision Transformer using Shifted Windows
  • 20230427 [2022] Expanding Language-Image Pretrained Models for General Video Recognition
  • 20230422 [2020] An image is worth 16x16 words_ Transformers for image recognition at scale
  • 20230422 [2021] Align before Fuse_ Vision and Language Representation Learning with Momentum Distillation
  • 20230418 [2021] BEiT_ BERT Pre-Training of Image Transformers
  • 20230418 [2021] iBOT_ Image BERT Pre-Training with Online Tokenizer
  • 20230418 [2023] DINOv2_ Learning Robust Visual Features without Supervision
  • 20230417 [2021] Emerging Properties in Self-Supervised Vision Transformers
  • 20230416 [2023] Scaling Vision Transformers to 22 Billion Parameters
  • 20230416 [2022] LiT_ Zero-Shot Transfer with Locked-image text Tuning
  • 20230416 [2021 ICML] Scaling up visual and vision-language representation learning with noisy text supervision
  • 20230404 [2022] Confident Learning_ Estimating Uncertainty in Dataset Labels
  • 20230206 [2021 NeurIPS] SegFormer_ Simple and Efficient Design for Semantic Segmentation with Transformers

2022

  • 20221212 Skimmed literature on image editing
  • 20221122 [2018 ECCV] Learning to Navigate for Fine-grained Classification
  • 20221121 [2022 CVPR] Fine-Grained Object Classification via Self-Supervised Pose Alignment
  • 20221121 [2018 CVPR] Cascaded Pyramid Network for Multi-Person Pose Estimation
  • 20221019 [2021 CVPR] Benchmarking Representation Learning for Natural World Image Collections
  • 20220906 [2019 CVPR] Feature Selective Anchor-Free Module for Single-Shot Object Detection
  • 20220729 [2022] Language Models are General-Purpose Interfaces
  • 20220728 [2022 ICML] OFA_ Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
  • 20220621 [2020] Rethinking of Pedestrian Attribute Recognition_ Realistic Datasets and A Strong Baseline
  • 20220614 [2021] Are Large-scale Datasets Necessary for Self-Supervised Pre-training
  • 20220430 Skimmed literature on self-supervised learning
  • 20220429 Skimmed literature on self-supervised learning
  • 20220418 [2021] DataPerf: Benchmarking Data for Better ML
  • 20220413 [2018] Arbitrary-Oriented Scene Text Detection via Rotation Proposals
  • 20220413 [2017] Attention is All You Need
  • 20220412 [2022] Towards Online Domain Adaptive Object Detection
  • 20220411 [2020] Channel Distillation_ Channel-Wise Attention for Knowledge Distillation
  • 20220312 [2021] GAN inversion: A survey
  • 20220311 [2022 WACV] Latent to Latent_ A Learned Mapper for Identity Preserving Editing of Multiple Face Attributes in StyleGAN-generated Images
  • 20220308 [2021] Pivotal Tuning for Latent-based Editing of Real Images
  • 20220307 [2020] LSUN-Stanford Car Dataset_ Enhancing Large-Scale Car Image Datasets Using Deep Learning for Usage in GAN Training
  • 20220307 [2022] Self-Distilled StyleGAN_ Towards Generation from Internet Photos
  • 20220307 [2016] Image-to-Image Translation with Conditional Adversarial Networks
  • 20220307 [2017 ICCV] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
  • 20220304 [2019] Interpreting the Latent Space of GANs for Semantic Face Editing
  • 20220303 [2020 NeurIPS] Training generative adversarial networks with limited data
  • 20220301 [2021] Lite-HRNet_ A Lightweight High-Resolution Network
  • 20220228 [2021] SimMIM_ A Simple Framework for Masked Image Modeling
  • 20220224 [2021] End-to-End Object Detection with Fully Convolutional Network
  • 20220223 [2018] Unsupervised Feature Learning via Non-Parametric Instance Discrimination
  • 20220222 [2020] Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network
  • 20220217 [2017 ICCV] Arbitrary style transfer in real-time with adaptive instance normalization
  • 20220217 [2019 ICCV] Image2StyleGAN_ How to Embed Images Into the StyleGAN Latent Space
  • 20220217 [2021] Encoding in Style_ a StyleGAN Encoder for Image-to-Image Translation
  • 20220216 [2020 CVPR] Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection
  • 20220216 [2017] An Implementation of Faster RCNN with Study for Region Sampling
  • 20220215 [2018 ECCV] Acquisition of Localization Confidence for Accurate Object Detection
  • 20220215 [2019] FCOS_ Fully Convolutional One-Stage Object Detection
  • 20220214 [2021] Swin Transformer_ Hierarchical Vision Transformer using Shifted Windows
  • 20220210 [2021 CVPR] Exploring Simple Siamese Representation Learning
  • 20220210 [2017] Large batch training of convolutional networks
  • 20220209 [2016] Perceptual Losses for Real-Time Style Transfer and Super-Resolution
  • 20220127 [2021] Sample and Computation Redistribution for Efficient Face Detection
  • 20220107 [2021] Residual Attention_ A Simple but Effective Method for Multi-Label Recognition
  • 20220105 [2021] PP-YOLOv2_ A Practical Object Detector
  • 20220104 [2016] DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations
  • 20220104 [2019 CVPR] DeepFashion2_ A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images

2021

  • 20211230 [2020] ELF_ An Early-Exiting Framework for Long-Tailed Classification
  • 20211118 [2019] M2Det_ A single-shot object detector based on multi-level feature pyramid network
  • 20211118 [2018 CVPR] Scale-Transferrable Object Detection
  • 20211017 [2021] You Only Look One-level Feature
  • 20211015 [2021] Hand Image Understanding via Deep Multi-Task Learning
  • 20210913 [2021] YOLO5Face_ Why Reinventing a Face Detector
  • 20210907 [2021] You Only Look One-level Feature
  • 20210824 [2020] EfficientDet_ Scalable and Efficient Object Detection
  • 20210824 [2021] Revisiting Classification Perspective on Scene Text Recognition
  • 20210809 [2020] Attention_ A Lightweight 2D Hand Pose Estimation Approach
  • 20210730 [2021 CVPR] Exploring Simple Siamese Representation Learning
  • 20210729 [2019] RetinaFace_ Single-stage Dense Face Localisation in the Wild
  • 20210728 [2021 CVPR] Multi-Scale Aligned Distillation for Low-Resolution Detection
  • 20210725 [2018 CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation
  • 20210724 [2019] Bag of Freebies for Training Object Detection Neural Networks
  • 20210723 [2021] YOLOX_ Exceeding YOLO Series in 2021
  • 20210719 [2020 CVPR] CurricularFace_ Adaptive Curriculum Learning Loss For Deep Face Recognition
  • 20210713 [2016] Simple Online And Realtime Tracking
  • 20210713 [2017] Simple Online and Realtime Tracking with a Deep Association Metric
  • 20210628 [2018] The iNaturalist Species Classification and Detection Dataset
  • 20210627 [2021] TinaFace_ Strong but Simple Baseline for Face Detection
  • 20210627 [2021 CVPR] EPSANet_ An Efficient Pyramid Split Attention Block on Convolutional Neural Network
  • 20210617 [2018] CrowdHuman_ A Benchmark for Detecting Human in a Crowd
  • 20210218 [2019 CVPR] Look More Than Once_ An Accurate Detector for Text of Arbitrary Shapes
  • 20210218 [2021] Pushing the Envelope of Thin Crack Detection
  • 20210202 [2018] Shape Robust Text Detection with Progressive Scale Expansion Network
  • 20210202 [2017 CVPR] EAST_ An Efficient and Accurate Scene Text Detector
  • 20210201 [2019 CVPR] Look More Than Once_ An Accurate Detector for Text of Arbitrary Shapes
  • 20210112 [2021] Pushing the Envelope of Thin Crack Detection
  • 20210112 [2016] DeepText_ A unified framework for text proposal generation and text detection in natural images
  • 20210112 [2021] Research on Fast Text Recognition Method for Financial Ticket Image
  • 20210111 [2020 CVPRW] CSPNet_ A new backbone that can enhance learning capability of CNN

2020

  • 20201229 [2020] Scene Text Detection with Scribble Lines
  • 20201217 [2020] Group Masked Autoencoder Based Density Estimator For Audio Anomaly Detection
  • 20201217 [2019] Real-time Scene Text Detection with Differentiable Binarization
  • 20201208 [2020] MAAD-Face_ A Massively Annotated Attribute Dataset for Face Images
  • 20201208 [2020] OneNet_ End-to-End One-Stage Object Detection by Classification Cost
  • 20201207 [2020] Cc-Loss_ Channel Correlation Loss For Image Classification
  • 20201203 [2020 ECCV] PIoU Loss_ Towards Accurate Oriented Object Detection in Complex Environments
  • 20201114 [2020] TResNet_ High Performance GPU-Dedicated Architecture
  • 20201109 [2020] Attentional Feature Fusion
  • 20201103 [2020 ECCV] Dive Deeper Into Box for Object Detection
  • 20200816 [2020] Prime-Aware Adaptive Distillation
  • 20200816 [2019 CVPR] C3AE: Exploring the Limits of Compact Model for Age Estimation
  • 20200813 [2020] Incomplete Descriptor Mining with Elastic Loss for Person Re-Identification
  • 20200812 [2019] Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning
  • 20200804 [2020] PP-YOLO_ An Effective and Efficient Implementation of Object Detector
  • 20200700 [2011] Blind/Referenceless Image Spatial Quality Evaluator
  • 20200616 [2020] Rethinking ImageNet Pre-training
  • 20200105 [2019] MRCNet_ Crowd Counting and Density Map Estimation in Aerial and Ground Imagery

2014-2017

  • 20170813 [2013 CVPR] Saliency Detection via Graph-Based Manifold Ranking
  • 20170214 [2016] Understanding and Improving Convolutional Neural Networks via CReLU
  • 20170122 [2016] Deep Learning without Poor Local Minima
  • 20170120 [2016] Large-Margin Softmax Loss for Convolutional Neural Networks
  • 20170000 [2016 ICLR] Deep Compression_ Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
  • 20161230 [2011] Fast coordinate descent methods with variable
  • 20150000 [2012 NPAR] Combining Sketch and Tone for Pencil Drawing Production
  • 20140000 [2010 CVPR] Detecting Text in Natural Scenes with Stroke Width Transform

About

Personal paper notes
