hwchung's Blog
All Posts (9)
U-Net: Convolutional Networks for Biomedical Image Segmentation (MICCAI 2015)
Paper link: https://arxiv.org/abs/1505.04597
0. Abstract: In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. → Data augmentation is applied when training U-Net, so the model can learn more effectively from the available annotated samples.
1. Introduction: While convolutional networks have..
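The augmentation idea above can be sketched in a few lines. This is a minimal illustration of flip/rotation augmentation for segmentation (the paper additionally relies on elastic deformations, not shown here); the key point is that every spatial transform must be applied identically to the image and its label mask:

```python
import numpy as np

def augment(image, mask, rng):
    """Apply the same random rotation/flip to an image and its mask.

    For segmentation, spatial augmentations must hit the input and its
    label mask identically, or the pair falls out of alignment.
    """
    k = rng.integers(0, 4)               # random number of 90-degree rotations
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:               # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    return image.copy(), mask.copy()

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)
msk = (img > 7).astype(np.int64)
aug_img, aug_msk = augment(img, msk, rng)
```

Whatever transform is drawn, the augmented mask still labels exactly the same pixels of the augmented image.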
An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale (ICLR 2021)
Paper link: https://arxiv.org/abs/2010.11929
GitHub link: https://github.com/google-research/vision_transformer
0. Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in ..
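The title's "16x16 words" can be made concrete with a small numpy sketch: split the image into 16x16 patches and flatten each one into a vector, which a learned linear projection would then map to the Transformer's token embeddings (the projection itself is omitted here):

```python
import numpy as np

def patchify(img, p=16):
    """Split an (H, W, C) image into flattened p x p patches.

    Each patch becomes one 'word': a vector of length p*p*C that ViT's
    linear projection maps to the Transformer embedding dimension.
    """
    h, w, c = img.shape
    assert h % p == 0 and w % p == 0
    img = img.reshape(h // p, p, w // p, p, c)
    return img.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * c)

img = np.arange(224 * 224 * 3).reshape(224, 224, 3)
tokens = patchify(img)   # 14 x 14 = 196 patch tokens of length 768
```

A 224x224 RGB image becomes a sequence of 196 tokens, the same shape of input a text Transformer consumes.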
Learning Deep Features for Discriminative Localization (CVPR 2016)
Paper link: https://arxiv.org/abs/1512.04150
0. Abstract: Shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels. → Read this as rediscovering the structural meaning of global average pooling (GAP). Since the paper itself says "remarkable localization," understand it as something that reveals where things are within the image. ..
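The class activation map (CAM) construction behind that localization ability is short enough to sketch directly. A minimal version, assuming random stand-ins for the conv feature maps and the post-GAP classifier weights:

```python
import numpy as np

def class_activation_map(features, weights, cls):
    """CAM: weight each spatial feature map by the classifier weight for
    the target class, then sum over channels.

    features: (C, H, W) conv feature maps just before global average pooling
    weights:  (num_classes, C) final linear layer applied after GAP
    """
    return np.tensordot(weights[cls], features, axes=1)  # (H, W) heatmap

rng = np.random.default_rng(0)
feats = rng.random((512, 7, 7))
w = rng.random((10, 512))
cam = class_activation_map(feats, w, cls=3)
```

Because GAP and the weighted sum commute, the spatial mean of the CAM equals the class score the network itself computes, which is exactly why the heatmap is faithful to the classifier's decision.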
Segment Anything (ICCV 2023)
Paper link: https://arxiv.org/abs/2304.02643
Project link: https://aidemos.meta.com/segment-anything
0. Abstract: First, the paper defines the promptable segmentation task. Segmentation matters for the medical segmentation work to come, so it is important to understand its structure and how it proceeds. In one sentence, promptable segmentation is not "given an image, predict the ground-truth mask," but "given a prompt such as a point, box, or mask, segment the region that matches that intent." → According to the user's intent, flexibly ..
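The interface of the promptable task can be illustrated with a deliberately toy stand-in (this is not SAM's model, just the task shape): a point prompt selects which region the user means, and the "model" (here, a flood fill over a label grid) returns that region's mask:

```python
from collections import deque

def segment_from_point(grid, seed):
    """Toy 'promptable segmentation': given a point prompt, return the
    binary mask of the connected region containing the seed pixel.

    A stand-in for SAM's interface: the prompt expresses WHICH region
    the user means; the model produces the matching mask.
    """
    h, w = len(grid), len(grid[0])
    target = grid[seed[0]][seed[1]]
    mask = [[0] * w for _ in range(h)]
    q = deque([seed])
    while q:
        r, c = q.popleft()
        if 0 <= r < h and 0 <= c < w and not mask[r][c] and grid[r][c] == target:
            mask[r][c] = 1
            q.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return mask

grid = [[0, 0, 1],
        [0, 1, 1],
        [0, 0, 1]]
mask = segment_from_point(grid, (0, 2))   # point prompt lands on the '1' region
```

Clicking a different point would return a different mask from the same image, which is the flexibility the abstract emphasizes.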
Deep Unsupervised Learning using Nonequilibrium Thermodynamics (ICML 2015)
Paper link: https://arxiv.org/abs/1503.03585
GitHub link: https://github.com/Sohl-Dickstein/Diffusion-Probabilistic-Models
0. Abstract: The reason for picking up this 2015 paper first is that it is the foundation of DDPM (Denoising Diffusion Probabilistic Models). The DDPM authors themselves state that they build on "Deep Unsupervised Learning using Nonequilibrium Thermodynamics." Therefore, to understand DDPM clearly, ..
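The forward diffusion process this paper introduced (and DDPM inherits) can be sampled in closed form. A minimal sketch in DDPM notation, using the linear beta schedule from the DDPM paper:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form (DDPM notation):
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t is the cumulative product of (1 - beta_s).
    """
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)   # linear schedule, as in DDPM
x0 = rng.standard_normal((8,))
x_T = forward_diffuse(x0, t=999, betas=betas, rng=rng)
```

By the final step alpha_bar is essentially zero, so x_T is approximately a standard Gaussian: the data has been fully diffused into the simple distribution the reverse process starts from.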
Attention Is All You Need (NeurIPS 2017)
Paper link: https://arxiv.org/abs/1706.03762
GitHub link: https://github.com/tensorflow/tensor2tensor
1. Introduction: Recurrent neural networks, long short-term memory and gated recurrent neural networks in particular, have been firmly established as state of the art approaches in sequence modeling and transduction problems such as language modeling and machine transla..
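The mechanism that replaces recurrence in this paper, scaled dot-product attention, fits in a few lines of numpy. A minimal single-head version (no masking, no multi-head projections):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)              # rows sum to 1
    return w @ V, w

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out, weights = attention(Q, K, V)
```

Unlike an RNN, every position attends to every other position in one step, which is exactly the property that lets the Transformer drop recurrence entirely.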
Flow Matching for Generative Modeling (ICLR 2023)
Paper link: https://arxiv.org/abs/2210.02747
0. Key idea: Better define the path (e.g., the forward diffusion process) along which a data distribution is transformed into a simple distribution (e.g., a standard Gaussian), so that its inverse (image generation via the diffusion model) also works better. Figure: comparison of noise-adding schemes. (Top) diffusion forward process, (Middle) flow matching, (Bottom) optimal transport ..
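The "better-defined path" can be sketched concretely. A minimal version of the conditional optimal-transport path the paper favors, assuming a single noise sample x0 and data sample x1: the path is a straight line, and its velocity (the regression target in flow matching) is constant:

```python
import numpy as np

def ot_path(x0, x1, t):
    """Conditional optimal-transport path: a straight line from noise x0
    to data x1. Its velocity, the flow-matching regression target, is
    constant along the path: u_t = x1 - x0.
    """
    return (1.0 - t) * x0 + t * x1

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8,))   # sample from the simple distribution
x1 = rng.standard_normal((8,))   # data sample
mid = ot_path(x0, x1, 0.5)
velocity = x1 - x0               # same at every t
```

Compared with the curved diffusion path, the straight path gives the network a simpler, constant velocity field to regress, which is one reason the inverse (generation) behaves better.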
Generative Adversarial Nets (NeurIPS 2014)
Paper link: https://arxiv.org/abs/1406.2661
0. Abstract: We simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. → One is the generative model G: it mimics the data distribution by producing fake samples that are hard to tell apart from the real training data. → The other is the discriminative model D: it estimates whether an input sample came from the real training d..
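The two-player objective behind G and D can be written down directly. A minimal Monte-Carlo sketch of the paper's value function (the networks themselves are omitted; d_real and d_fake stand in for D's outputs):

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of the GAN minimax objective
    V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].
    d_real: D's probabilities on real samples; d_fake: on generated ones.
    D maximizes V; G minimizes it.
    """
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# At the game's optimum G matches the data distribution, D(x) = 1/2
# everywhere, and the value function reaches -log 4.
v_opt = gan_value(np.full(4, 0.5), np.full(4, 0.5))
```

The -log 4 optimum is the paper's own theoretical result: it is attained exactly when D can no longer distinguish real samples from generated ones.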
Learning Transferable Visual Models From Natural Language Supervision (PMLR 2021)
Paper link: https://arxiv.org/abs/2103.00020
GitHub link: https://github.com/OpenAI/CLIP
0. Abstract: Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. → Learn directly from raw text. The model transfers non-trivially to most tasks and is often competitive with a full..
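The image-text supervision above works through a contrastive objective, which can be sketched with the similarity matrix CLIP trains on. A minimal version, assuming pre-computed embedding stand-ins and the 0.07 temperature value used in the paper:

```python
import numpy as np

def clip_logits(img_emb, txt_emb, temperature=0.07):
    """CLIP-style contrastive logits: cosine similarity between every
    image/text pair in the batch, scaled by a temperature. Training
    pushes the diagonal (matched pairs) up and everything else down.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)
    return img @ txt.T / temperature

rng = np.random.default_rng(0)
imgs = rng.standard_normal((3, 16))
txts = imgs.copy()   # identical embeddings stand in for matched pairs
logits = clip_logits(imgs, txts)
```

Each row of the logits is then trained with cross-entropy against its own index, so every image learns to pick out its own caption from the batch, and vice versa.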