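Intuitive summary: Whereas the earlier work argued that explaining a model's predictions with attention weights is risky and unverified, this paper… takes direct aim at that paper. It first points out the flaws in the experiments the earlier paper ran. It then reruns the experiments and att...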

attention-is-not-explanation-review

2 minute read

Intuitive summary

A Summary of Transformer Questions

7 minute read

A summary of questions about the Transformer

A GAN Review That Digs All the Way to the Bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

implement_DPR

less than 1 minute read

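A hard negative is a passage from outside the batch that scores high on BM25 similarity but does not contain the answer. One is built per query and applied as a negative sample to the entire current batch. So if the batch size is 8, the 8 new negatives are each ...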
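
The batching idea in this excerpt can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the post's implementation: the helper `build_candidates` and the placeholder passage names are hypothetical, and the sketch assumes that every query is scored against all positives plus all BM25 hard negatives in the batch.

```python
def build_candidates(positives, hard_negatives):
    """Return the shared candidate pool and the gold index for each query.

    Assumption: one positive and one BM25 hard negative per query, and all
    of them are shared across the batch as candidates.
    """
    if len(positives) != len(hard_negatives):
        raise ValueError("expected exactly one hard negative per query")
    # Every query is compared with all positives and all hard negatives.
    candidates = list(positives) + list(hard_negatives)
    # Query i's gold passage sits at index i; every other candidate is a negative.
    targets = list(range(len(positives)))
    return candidates, targets


# Batch size 8: 8 positives plus 8 hard negatives give 16 candidates per query.
positives = [f"pos_{i}" for i in range(8)]
hard_negatives = [f"bm25_neg_{i}" for i in range(8)]
candidates, targets = build_candidates(positives, hard_negatives)
```

Under these assumptions each query sees 1 gold passage and 15 negatives, and the targets can be fed directly to a cross-entropy loss over the similarity scores.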

Trainer_API_QA_task_Log

2 minute read

Problem: in the Hugging Face QA task example…

Trainer_API

6 minute read

Background: Trainer: instead of writing native PyTorch code, you simply pass in the training arguments, the metric function to use, and the dataset, and it runs training for you. Internally it loops over epochs, loops over steps, computes the loss, gr...

Skim-RoBERTa

less than 1 minute read

Skim-RoBERTa RoBERTa (https://arxiv.org/pdf/1907.11692.pdf) uses dynamic masking: the masking differs from epoch to epoch. N...
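
A rough sketch of what dynamic masking means in practice: the mask is re-sampled every time a sequence is served, so each epoch sees a different masked view. The function name `dynamic_mask`, the 15% masking rate, and the example sentence are assumptions for illustration, and the sketch omits the random-replacement and keep-as-is cases used in real BERT-style masking.

```python
import random

MASK = "[MASK]"


def dynamic_mask(tokens, mask_prob=0.15, rng=None):
    """Return a freshly masked copy of tokens; call it once per epoch."""
    if rng is None:
        rng = random
    # Each position is masked independently, so repeated calls differ.
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


tokens = "the quick brown fox jumps over the lazy dog".split()
rng = random.Random(0)  # seeded only to make the sketch reproducible
# One masked view per epoch, instead of one fixed view made at preprocessing time.
epoch_views = [dynamic_mask(tokens, rng=rng) for _ in range(3)]
```

Static masking would instead call the masking function once during preprocessing and reuse that single view in every epoch.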

RAG-review

less than 1 minute read

PDF.

MRC_Retrieval_Sparse_Embedding

2 minute read

Finding the documents in a DB or on the web that contain the needed piece of information; a system that fetches documents.

MRC_Retrieval_Dense_Embedding

1 minute read

Dense Embedding Retrieval

Link_MRC_Retrieval

less than 1 minute read

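Linking MRC and Retrieval. Introduction to ODQA: no passage is given; the source is the whole web or Wikipedia. The documents must be searched first, and then MRC is run. The input and output are the same: a question and an answer.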

KegNet-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

DPR-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

GPT

1 minute read

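GPT: BERT is an embedding model; GPT is a generative model. BERT uses the encoder; GPT uses the decoder. Given a word, which word is the most appropriate one to come next? It is an autoregressive model. It came out before BERT. Natural-language sentences...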

2021-huggingface-trainer-resume-wandb

less than 1 minute read

When using wandb together with the Hugging Face Trainer API.

1006-BERT-model-size

less than 1 minute read

BERT's maximum input length is 512 tokens. What happens when a longer sentence comes in?
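
The excerpt only poses the question, so the following is a hedged sketch of one common answer rather than the post's own: besides plain truncation, a long input can be split into overlapping windows that each fit the 512-token limit. The function `sliding_windows` and the overlap of 128 tokens are assumptions for illustration, and special tokens such as `[CLS]` and `[SEP]` are ignored.

```python
def sliding_windows(token_ids, max_len=512, stride=128):
    """Split token_ids into windows of at most max_len tokens.

    Consecutive windows overlap by `stride` tokens so that text near a
    window boundary is still seen with some context.
    """
    if not 0 < stride < max_len:
        raise ValueError("stride must be between 1 and max_len - 1")
    step = max_len - stride
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the input
        start += step
    return windows


ids = list(range(1000))  # stand-in for 1000 token ids
windows = sliding_windows(ids)
```

Each window can then be run through the model separately and the per-window predictions merged afterwards.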

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

paper-review

DIFFERENTIAL TRANSFORMER review

5 minute read

DIFFERENTIAL TRANSFORMER

Not All LLM Reasoners Are Created Equal skimming

1 minute read

Not All LLM Reasoners Are Created Equal

RATIONALYST:-Pre-training-Process-Supervision-for-Improving-Reasoning-review

2 minute read

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning review

LLMS KNOW MORE THAN THEY SHOW: ON THE INTRINSIC REPRESENTATION OF LLM HALLUCINATIONS review

1 minute read

Recommended for people doing this kind of research:

DPO review

7 minute read

DPO

mathBERT review

3 minute read

mathBERT review

INCORPORATING BERT INTO NEURAL MACHINE TRANSLATION review

1 minute read

INCORPORATING BERT INTO NEURAL MACHINE TRANSLATION (ICLR 2020) review

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(ACL 2021) review

2 minute read

Contrastive Learning(ACL 2021) review

attention-is-not-not-explanation-review

2 minute read

attention-is-not-explanation-review

2 minute read

Intuitive summary

A Summary of Transformer Questions

7 minute read

A summary of questions about the Transformer

A GAN Review That Digs All the Way to the Bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

Skim-RoBERTa

less than 1 minute read

Skim-RoBERTa RoBERTa (https://arxiv.org/pdf/1907.11692.pdf) uses dynamic masking: the masking differs from epoch to epoch. N...

RAG-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

DPR-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

paper-review-Focal-Loss

1 minute read

focal loss

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

YOLO Net

7 minute read

Paper_Review_YOLO_NET

ResNet

6 minute read

Paper_Review_ResNet

Inception Net

7 minute read

Paper_Review_InceptionNet=GoogLeNet Going Deeper with Convolutions AKA Inception by Szegedy et al.

boostcamp

1108_data_creation

3 minute read

Brief OT: the importance of data creation; the data construction process and its design; basics of NLP data

MRC_Retrieval_Sparse_Embedding

2 minute read

Finding the documents in a DB or on the web that contain the needed piece of information; a system that fetches documents.

MRC_Retrieval_Dense_Embedding

1 minute read

Dense Embedding Retrieval

MRC-INTRO-AND-PYTHON-BASICS

2 minute read

MRC-INTRO-AND-PYTHON-BASICS

Link_MRC_Retrieval

less than 1 minute read

GPT

1 minute read

08-20-multi-gpu-trouble-shppting

2 minute read

torch multi-GPU

0819-pre-trained-save-model

2 minute read

Trend: the dominant approach is to take a backbone model and retrain it on our own data.

0818-torch-network-autograd-dataset

3 minute read

Assignment 1 question: backward hooks and forward hooks both receive the input and the output. Why both? A forward pre-hook receives only the input.

0817-torch-breif

2 minute read

The Torch framework

0813_GAN

2 minute read

week 2 Fri 0813 Generative Model: the lecture notes were reportedly put together from the Stanford deep generative models (io) material. Generating sentences or images is not all there is to generative models. The known gen mode...

0812_rnn

1 minute read

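week 2 Thu 0812 Sequence Data and RNN: most everyday data is sequential. What we want is simple. We do not know when a sequence will end, so a fixed-size conv cannot be used, because we do not know up to what length to accept. Language model: from the previous data, the next...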

0811_cnn

3 minute read

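week 2 Wed 0811 Intuition for convolution: "stamping" means stamping with the filter. Different filter shapes give different results. An averaging filter: blur, and so on. The output has as many channels as there are filters. After one conv, activation...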

0810_Optim

2 minute read

week 2 Tue 0810

0809_MLP

1 minute read

week 2 Mon 0809

0806_math

7 minute read

CNN

0804_math

2 minute read

Deep learning training

0803_py

3 minute read

Python data structures: Tuple. Why use a tuple?

0805_log

4 minute read

py 5-1 Exception try: ... except ZeroDivisionError: ... except IndexError as i: ... print(i) print("Index Error Occurs!") else: ... ...
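
The flattened `try` / `except` / `else` excerpt above can be read as a complete example along these lines. This is a minimal sketch: the function `safe_divide` and its return strings are hypothetical, and only the exception types and the two `print` calls come from the excerpt.

```python
def safe_divide(values, index, divisor):
    """Divide values[index] by divisor, handling the two expected failures."""
    try:
        result = values[index] / divisor
    except ZeroDivisionError:
        # Reached when divisor is 0.
        return "division by zero"
    except IndexError as i:
        # Reached when index is out of range; `i` holds the exception object.
        print(i)
        print("Index Error Occurs!")
        return "bad index"
    else:
        # Runs only when the try block raised nothing.
        return result
```

Calling `safe_divide([10, 20], 1, 4)` takes the `else` branch, while an out-of-range index or a zero divisor is routed to the matching `except` clause.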

0802_py_math

4 minute read

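py 1-1 Basic computer class for newbies. Operating system: it connects user programs (applications) with the user interface and handles the work on their behalf. E.g., you only type a command such as data.to_excel(…) and the actual operation is taken care of for you.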

deep-learning

optim_and_learning_rate

7 minute read

optim

A GAN Review That Digs All the Way to the Bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

dataloader

less than 1 minute read

dataloader output dimension: the tuple for each single sample is split apart and concatenated per batch.
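
The regrouping described here can be shown without any framework. This is a plain-Python sketch of the idea, not PyTorch's actual collate function: the name `collate` and the toy samples are assumptions, and lists stand in for tensors.

```python
def collate(batch):
    """Turn a list of per-sample tuples into one list per field."""
    # zip(*batch) regroups [(x0, y0), (x1, y1), ...] into (x0, x1, ...), (y0, y1, ...).
    return tuple(list(field) for field in zip(*batch))


# Three samples, each a (features, label) tuple.
samples = [([1, 2], 0), ([3, 4], 1), ([5, 6], 0)]
features, labels = collate(samples)
```

With tensors in place of lists, each regrouped field would be stacked along a new leading batch dimension, which is why the loader's output gains one dimension over a single sample.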

Trainer_API_QA_task_Log

2 minute read

Problem: in the Hugging Face QA task example…

Trainer_API

6 minute read

2021-huggingface-trainer-resume-wandb

less than 1 minute read

When using wandb together with the Hugging Face Trainer API.

1006-BERT-model-size

less than 1 minute read

BERT's maximum input length is 512 tokens. What happens when a longer sentence comes in?

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

paper-review-Focal-Loss

1 minute read

focal loss

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

2021-04-18-Metrics-Review

4 minute read

Metrics summary

Binary-Classification-Cross-Enropy-Implementation

2 minute read

Binary Classification Cross Entropy Implementation

YOLO Net

7 minute read

Paper_Review_YOLO_NET

ResNet

6 minute read

Paper_Review_ResNet

Inception Net

7 minute read

Paper_Review_InceptionNet=GoogLeNet Going Deeper with Convolutions AKA Inception by Szegedy et al.

BackPropagation-step-by-step

12 minute read

Stanford CS229 DNN dW: partial derivative with respect to a single element.

Blog

Post: Link Permalink

less than 1 minute read

This theme supports link posts, made famous by John Gruber. To use, just add link: http://url-you-want-linked to the post’s YAML front matter and you’re done.

Post: Quote

less than 1 minute read

Only one thing is impossible for God: To find any sense in any copyright law on the planet. Mark Twain

Post: Notice

1 minute read

A notice displays information that explains nearby content. Often used to call attention to a particular detail.

Post: Chat

2 minute read

Abbott: Strange as it may seem, they give ball players nowadays very peculiar names.

Post: Standard

4 minute read

All children, except one, grow up. They soon know that they will grow up, and the way Wendy knew was this. One day when she was two years old she was playing...

Post: Modified Date

less than 1 minute read

This post has been updated and should show a modified date if used in a layout.

machine-learning

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

2021-04-18-Metrics-Review

4 minute read

Metrics summary

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

cv

YOLO Net

7 minute read

Paper_Review_YOLO_NET

ResNet

6 minute read

Paper_Review_ResNet

Inception Net

7 minute read

Paper_Review_InceptionNet=GoogLeNet Going Deeper with Convolutions AKA Inception by Szegedy et al.

project

implement_DPR

less than 1 minute read

0818_autograd

1 minute read

autograd of Torch

DeepLearning

2022-06-22-mixed-precision

3 minute read

PyTorch mixed precision

ML-model-debugginng

3 minute read

debugging models

blog

Welcome to Jekyll!

less than 1 minute read

You’ll find this post in your _posts directory. Go ahead and edit it and re-build the site to see your changes. You can rebuild the site in many different wa...

data-structure

Heap-Proof

7 minute read

I was curious why the complexity is said to be $O(n)$… a summary of heaps from the algorithms textbook
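
Assuming the $O(n)$ bound here refers to building a heap from an unsorted array (the excerpt does not say which operation), the bottom-up construction can be sketched as follows. The function names are illustrative. The linear bound comes from the fact that roughly $n / 2^{h+1}$ nodes sit at height $h$ and each costs $O(h)$ to sift down, and the sum of $h / 2^{h}$ over all heights converges to a constant.

```python
def sift_down(heap, i, n):
    """Move heap[i] down until neither child is smaller (min-heap)."""
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest


def build_heap(items):
    """Build a min-heap bottom-up; total work is O(n), not O(n log n)."""
    heap = list(items)
    n = len(heap)
    # Leaves are already heaps, so start from the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(heap, i, n)
    return heap


heap = build_heap([9, 4, 7, 1, 8, 2, 6, 3, 5])
```

Inserting the elements one at a time would instead cost $O(\log n)$ each, which gives the looser $O(n \log n)$ bound.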

Posts by Category

nlp

paper-review

boostcamp

deep-learning

Blog

machine-learning

cv

recap

project

DeepLearning

blog

data-structure