Posts by Tag

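Intuitive summary. In short: where earlier work argued that explaining a model's predictions with attention weights is risky and unvalidated, this paper… takes direct aim at that work. It first points out flaws in the experiments the earlier paper ran. It then reruns the experiments and att...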

attention-is-not-explanation-review

2 minute read

Intuitive summary

Open questions about the Transformer

7 minute read

Open questions about the Transformer

A GAN review that digs all the way to the bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

implement_DPR

less than 1 minute read

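A hard negative is a passage from outside the batch that BM25 scores as highly similar but that does not contain the answer. One is created per query and applied as a negative sample across the entire current batch. So if the batch size is 8, the 8 new negatives each ...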
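The batching scheme described in this excerpt can be sketched roughly as follows. This is a minimal illustration with hypothetical toy vectors and a hypothetical `score_matrix` helper, not the post's actual implementation: every query is scored against all in-batch positives plus all hard negatives, and the gold passage for query `i` sits in column `i`.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def score_matrix(queries, positives, hard_negatives):
    """Score every query against all in-batch positives plus all hard negatives."""
    passages = positives + hard_negatives  # B positives followed by B hard negatives
    return [[dot(q, p) for p in passages] for q in queries]

# Hypothetical 2-d embeddings; in practice these come from the two encoders.
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.2, 0.8]]
hard_negatives = [[0.6, 0.4], [0.5, 0.5]]  # e.g. high-BM25 passages lacking the answer

scores = score_matrix(queries, positives, hard_negatives)  # shape: B x 2B
labels = list(range(len(queries)))  # target column for each query
```

With a batch of size B, each query therefore sees 2B candidates: its own positive, B - 1 positives of other queries, and B shared hard negatives.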

Skim-RoBERTa

less than 1 minute read

Skim-RoBERTa RoBERTa (https://arxiv.org/pdf/1907.11692.pdf) uses dynamic masking: the masking changes every epoch. N...

RAG-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

DPR-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

paper-review-Focal-Loss

1 minute read

focal loss

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

NLP

DIFFERENTIAL TRANSFORMER review

5 minute read

DIFFERENTIAL TRANSFORMER

Not All LLM Reasoners Are Created Equal skimming

1 minute read

Not All LLM Reasoners Are Created Equal

RATIONALYST:-Pre-training-Process-Supervision-for-Improving-Reasoning-review

2 minute read

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning review

LLMS KNOW MORE THAN THEY SHOW: ON THE INTRINSIC REPRESENTATION OF LLM HALLUCINATIONS review

1 minute read

Recommended for people doing this kind of research:

DPO review

7 minute read

DPO

mathBERT review

3 minute read

mathBERT review

INCORPORATING BERT INTO NEURAL MACHINE TRANSLATION review

1 minute read

INCORPORATING BERT INTO NEURAL MACHINE TRANSLATION (ICLR 2020) review

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(ACL 2021) review

2 minute read

Contrastive Learning(ACL 2021) review

marginal rank loss

3 minute read

marginal rank loss

attention-is-not-not-explanation-review

2 minute read

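Intuitive summary. In short: where earlier work argued that explaining a model's predictions with attention weights is risky and unvalidated, this paper… takes direct aim at that work. It first points out flaws in the experiments the earlier paper ran. It then reruns the experiments and att...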

attention-is-not-explanation-review

2 minute read

Intuitive summary

Open questions about the Transformer

7 minute read

Open questions about the Transformer

implement_DPR

less than 1 minute read

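A hard negative is a passage from outside the batch that BM25 scores as highly similar but that does not contain the answer. One is created per query and applied as a negative sample across the entire current batch. So if the batch size is 8, the 8 new negatives each ...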

RAG-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

DPR-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

GPT

1 minute read

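GPT BERT is an embedding model; GPT is a generative model. BERT uses the encoder; GPT uses the decoder. Given the words so far, which word is the most appropriate one to come next? It is an autoregressive model. It came out before BERT. Natural-language sentences...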

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

camp

1108_data_제작

3 minute read

brief OT The importance of data creation The data construction process and design Basic NLP data

08-20-multi-gpu-trouble-shppting

2 minute read

torch multi gpu

0819-pre-trained-save-model

2 minute read

Trend: taking a backbone model and retraining it on our own data is the dominant approach.

0818-torch-network-autograd-dataset

3 minute read

Assignment 1 question: backward hooks and forward hooks both receive the input and the output. Is that really necessary? A forward pre-hook receives only the input.

0818_autograd

1 minute read

autograd of Torch

0817-torch-breif

2 minute read

The Torch framework

0813_GAN

2 minute read

week 2 Fri 0813 Generative Model The lecture notes were reportedly built from the Stanford deep generative models site (io). Producing sentences or images is not all there is to generative modeling. Known gen mode...

0812_rnn

1 minute read

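week 2 Thu 0812 Sequence Data and RNN Most everyday data is sequential. What we want is simple. We do not know when the sequence will end, so a fixed-size conv cannot be used, because there is no telling what length it must accept. lang model: from the previous data, the next...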

0811_cnn

3 minute read

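week 2 Wed 0811 Intuition for convolution: stamping. Meaning: the filter is stamped onto the input. Different filter shapes give different results. Averaging filter: blur, and so on. The output has as many channels as there are filters. After one conv, activation...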

0810_Optim

2 minute read

week 2 Tue 0810

0809_MLP

1 minute read

week 2 Mon 0809

0806_math

7 minute read

CNN

0804_math

2 minute read

Deep learning training

0803_py

3 minute read

python data structure Tuple Why use tuples?

0805_log

4 minute read

py 5-1 Exception try: ... except ZeroDivisionError: ... except IndexError as i: ... print(i) print("Index Error Occurs!") else: ... ...
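The flattened excerpt above follows Python's `try` / `except` / `else` layout. A runnable sketch of that shape is given here; the `safe_divide` function and its arguments are hypothetical and only serve to trigger each branch.

```python
def safe_divide(values, index, divisor):
    """Return values[index] / divisor, or None when the lookup or the division fails."""
    try:
        result = values[index] / divisor
    except ZeroDivisionError:
        print("Cannot divide by zero")
        return None
    except IndexError as e:
        print(e)
        print("Index Error Occurs!")
        return None
    else:
        # The else branch runs only when the try block raised nothing.
        return result
```

Calling `safe_divide([10, 20], 1, 5)` takes the `else` branch, while an out-of-range index or a zero divisor is caught by the matching `except` clause.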

0802_py_math

4 minute read

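py 1-1 Basic computer class for newbies Operating system Operating system: connects user programs (applications) with the user interface and handles the work on their behalf. E.g., you only type a command such as data.to_excel(…) and the actual operation is taken care of for you.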

Post Formats

Post: Link Permalink

less than 1 minute read

This theme supports link posts, made famous by John Gruber. To use, just add link: http://url-you-want-linked to the post’s YAML front matter and you’re done.

Post: Quote

less than 1 minute read

Only one thing is impossible for God: To find any sense in any copyright law on the planet. Mark Twain

Post: Notice

1 minute read

A notice displays information that explains nearby content. Often used to call attention to a particular detail.

Post: Chat

2 minute read

Abbott: Strange as it may seem, they give ball players nowadays very peculiar names.

Post: Standard

4 minute read

All children, except one, grow up. They soon know that they will grow up, and the way Wendy knew was this. One day when she was two years old she was playing...

Post: Modified Date

less than 1 minute read

This post has been updated and should show a modified date if used in a layout.

DeepLearning

2022-06-22-mixed-precision

3 minute read

PyTorch mixed precision

ML-model-debugginng

3 minute read

debugging models

A GAN review that digs all the way to the bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

Binary-Classification-Cross-Enropy-Implementation

2 minute read

Binary Classification Cross Entropy Implementation

pytorch

2022-06-22-mixed-precision

3 minute read

PyTorch mixed precision

dataloader

less than 1 minute read

dataloader output dimension: the tuple for each sample is split apart and the pieces are concatenated per batch.
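The split-and-regroup behavior described here can be imitated in plain Python. This is only a sketch of the idea with a hypothetical `collate` helper and toy samples; a real PyTorch `DataLoader` would additionally stack each field into a tensor.

```python
def collate(samples):
    """Transpose a list of per-sample tuples into one list per field."""
    return tuple(list(field) for field in zip(*samples))

# Hypothetical (features, label) pairs, one tuple per sample.
samples = [([1, 2], 0), ([3, 4], 1), ([5, 6], 0)]
features, labels = collate(samples)
```

After the call, `features` holds all feature lists in batch order and `labels` holds all labels, so the leading dimension of every field becomes the batch size.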

pytorch-recap

1 minute read

PyTorch summary

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

0818_autograd

1 minute read

autograd of Torch

python

python-recap

5 minute read

python recap

0803_py

3 minute read

python data structure Tuple Why use tuples?

0805_log

4 minute read

py 5-1 Exception try: ... except ZeroDivisionError: ... except IndexError as i: ... print(i) print("Index Error Occurs!") else: ... ...

0802_py_math

4 minute read

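py 1-1 Basic computer class for newbies Operating system Operating system: connects user programs (applications) with the user interface and handles the work on their behalf. E.g., you only type a command such as data.to_excel(…) and the actual operation is taken care of for you.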

project

dataloader

less than 1 minute read

dataloader output dimension: the tuple for each sample is split apart and the pieces are concatenated per batch.

Trainer_API_QA_task_Log

2 minute read

Problem: in the huggingface QA task example…

Trainer_API

6 minute read

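Background: Trainer: rather than writing native PyTorch code, you simply hand over the training arguments, the metric function to use, and the dataset, and it runs training for you. Internally it then loops over epochs, loops over steps, computes the loss, gr...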

2021-huggingface-trainer-resume-wandb

less than 1 minute read

When using wandb together with the huggingface Trainer API.

review

A GAN review that digs all the way to the bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

review ODQR

4 minute read

ODQR paper

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

Tokenizer

2021-huggingface-trainer-resume-wandb

less than 1 minute read

When using wandb together with the huggingface Trainer API.

1006-BERT-model-size

less than 1 minute read

BERT accepts at most 512 tokens per input. What happens when a longer sentence comes in?
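One common way to handle inputs beyond that limit is to split the token ids into overlapping windows. The sketch below assumes that approach; the post itself may choose plain truncation or something else. The `chunk_ids` helper and its `stride` default are hypothetical, and room for special tokens such as [CLS] and [SEP] is not reserved here.

```python
def chunk_ids(token_ids, max_len=512, stride=128):
    """Split token ids into windows of at most max_len that overlap by stride tokens."""
    if stride >= max_len:
        raise ValueError("stride must be smaller than max_len")
    step = max_len - stride  # how far each window advances
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # the last window already reaches the end
    return chunks

ids = list(range(1000))  # stand-in for real token ids
chunks = chunk_ids(ids)
```

The overlap means a span that straddles a window boundary still appears whole in at least one window, at the cost of encoding some tokens twice.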

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

Preprocessing

2021-huggingface-trainer-resume-wandb

less than 1 minute read

When using wandb together with the huggingface Trainer API.

1006-BERT-model-size

less than 1 minute read

BERT accepts at most 512 tokens per input. What happens when a longer sentence comes in?

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

recap

huggingface-recap

less than 1 minute read

Reverting encoded values with decode from transformers import AutoTokenizer, AutoModelForSequenceClassification model_name = "nlptown/bert-base-multilingual-uncased-sentiment"...

python-recap

5 minute read

python recap

pytorch-recap

1 minute read

PyTorch summary

MRC

MRC_Retrieval_Sparse_Embedding

2 minute read

Finding the documents in a DB or on the web that contain the needed piece of information. A system that fetches documents.

MRC_Retrieval_Dense_Embedding

1 minute read

Dense Embedding Retrieval

Link_MRC_Retrieval

less than 1 minute read

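# Connecting MRC and Retrieval Introduction to ODQA The passage is not given; the source is the whole web or a wiki. The documents have to be searched first, and then MRC is run. The input and output stay the same: a question and an answer.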

Retrieval

MRC_Retrieval_Sparse_Embedding

2 minute read

Finding the documents in a DB or on the web that contain the needed piece of information. A system that fetches documents.

MRC_Retrieval_Dense_Embedding

1 minute read

Dense Embedding Retrieval

Link_MRC_Retrieval

less than 1 minute read

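# Connecting MRC and Retrieval Introduction to ODQA The passage is not given; the source is the whole web or a wiki. The documents have to be searched first, and then MRC is run. The input and output stay the same: a question and an answer.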

transformer

attention-is-not-not-explanation-review

2 minute read

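Intuitive summary. In short: where earlier work argued that explaining a model's predictions with attention weights is risky and unvalidated, this paper… takes direct aim at that work. It first points out flaws in the experiments the earlier paper ran. It then reruns the experiments and att...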

attention-is-not-explanation-review

2 minute read

Intuitive summary

Open questions about the Transformer

7 minute read

Open questions about the Transformer

readability

Post: Standard

4 minute read

All children, except one, grow up. They soon know that they will grow up, and the way Wendy knew was this. One day when she was two years old she was playing...

Post: Modified Date

less than 1 minute read

This post has been updated and should show a modified date if used in a layout.

standard

Post: Standard

4 minute read

All children, except one, grow up. They soon know that they will grow up, and the way Wendy knew was this. One day when she was two years old she was playing...

Post: Modified Date

less than 1 minute read

This post has been updated and should show a modified date if used in a layout.

implementation

implement_DPR

less than 1 minute read

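A hard negative is a passage from outside the batch that BM25 scores as highly similar but that does not contain the answer. One is created per query and applied as a negative sample across the entire current batch. So if the batch size is 8, the 8 new negatives each ...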

Binary-Classification-Cross-Enropy-Implementation

2 minute read

Binary Classification Cross Entropy Implementation

machine-learning

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

2021-04-18-Metrics-Review

4 minute read

Metrics summary

math

0806_math

7 minute read

CNN

0804_math

2 minute read

Deep learning training

loss

marginal rank loss

3 minute read

marginal rank loss

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

bert

KegNet-review

less than 1 minute read

PDF.

1006-BERT-model-size

less than 1 minute read

BERT accepts at most 512 tokens per input. What happens when a longer sentence comes in?

kagnet

RAG-review

less than 1 minute read

PDF.

KegNet-review

less than 1 minute read

PDF.

blog

MRC_Retrieval_Sparse_Embedding

2 minute read

Finding the documents in a DB or on the web that contain the needed piece of information. A system that fetches documents.

MRC_Retrieval_Dense_Embedding

1 minute read

Dense Embedding Retrieval

Huggingface

Trainer_API_QA_task_Log

2 minute read

Problem: in the huggingface QA task example…

Trainer_API

6 minute read

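Background: Trainer: rather than writing native PyTorch code, you simply hand over the training arguments, the metric function to use, and the dataset, and it runs training for you. Internally it then loops over epochs, loops over steps, computes the loss, gr...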

Trainer

Trainer_API_QA_task_Log

2 minute read

Problem: in the huggingface QA task example…

Trainer_API

6 minute read

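Background: Trainer: rather than writing native PyTorch code, you simply hand over the training arguments, the metric function to use, and the dataset, and it runs training for you. Internally it then loops over epochs, loops over steps, computes the loss, gr...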

issue

Trainer_API_QA_task_Log

2 minute read

Problem: in the huggingface QA task example…

Trainer_API

6 minute read

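Background: Trainer: rather than writing native PyTorch code, you simply hand over the training arguments, the metric function to use, and the dataset, and it runs training for you. Internally it then loops over epochs, loops over steps, computes the loss, gr...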

chat

Post: Chat

2 minute read

Abbott: Strange as it may seem, they give ball players nowadays very peculiar names.

notice

Post: Notice

1 minute read

A notice displays information that explains nearby content. Often used to call attention to a particular detail.

quote

Post: Quote

less than 1 minute read

Only one thing is impossible for God: To find any sense in any copyright law on the planet. Mark Twain

link

Post: Link Permalink

less than 1 minute read

This theme supports link posts, made famous by John Gruber. To use, just add link: http://url-you-want-linked to the post’s YAML front matter and you’re done.

Jekyll

Welcome to Jekyll!

less than 1 minute read

You’ll find this post in your _posts directory. Go ahead and edit it and re-build the site to see your changes. You can rebuild the site in many different wa...

update

Welcome to Jekyll!

less than 1 minute read

You’ll find this post in your _posts directory. Go ahead and edit it and re-build the site to see your changes. You can rebuild the site in many different wa...

data-structure

Heap-Proof

7 minute read

The complexity is said to be $O(n)$, and I was curious why… Heap notes from an algorithms textbook
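Assuming the $O(n)$ bound refers to bottom-up heap construction (the excerpt does not say so explicitly), the claim can be checked empirically by counting swaps. The helper names below are hypothetical. The usual argument is that a heap has at most $\lceil n / 2^{h+1} \rceil$ nodes of height $h$ and each sifts down at most $h$ levels, so the total work is bounded by $n \sum_{h \ge 0} h / 2^{h+1} = n$.

```python
def sift_down(a, i, n):
    """Move a[i] down until the max-heap property holds; return the number of swaps."""
    swaps = 0
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return swaps
        a[i], a[largest] = a[largest], a[i]
        i = largest
        swaps += 1

def build_max_heap(a):
    """Heapify in place from the last internal node upward; return total swaps."""
    n = len(a)
    return sum(sift_down(a, i, n) for i in range(n // 2 - 1, -1, -1))

data = list(range(1000))  # ascending input, a hard case for a max-heap
total_swaps = build_max_heap(data)
```

Even on this ascending input, `total_swaps` stays below `len(data)`, in contrast to the $O(n \log n)$ cost of inserting the elements one at a time.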

heap

Heap-Proof

7 minute read

The complexity is said to be $O(n)$, and I was curious why… Heap notes from an algorithms textbook

cs229

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

ExponentialFamily

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

GeneralizedLinearModel

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

LogisticRegression

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

GDA

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

Proof

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

ps1

cs229-ps1

less than 1 minute read

Stanford CS229 PS1 Solution

deep-learning

BackPropagation-step-by-step

12 minute read

Stanford CS229 DNN dW: partial derivative with respect to a single element.

BackPropagation

BackPropagation-step-by-step

12 minute read

Stanford CS229 DNN dW: partial derivative with respect to a single element.

cross entropy

Binary-Classification-Cross-Enropy-Implementation

2 minute read

Binary Classification Cross Entropy Implementation

keras

Binary-Classification-Cross-Enropy-Implementation

2 minute read

Binary Classification Cross Entropy Implementation

metric

2021-04-18-Metrics-Review

4 minute read

Metrics summary

BatchNormalization

Paper_Review_Batch-Normalization

4 minute read

Review of Batch Normalization (Sergey Ioffe et al.)

autograd

0818_autograd

1 minute read

autograd of Torch

focal loss

paper-review-Focal-Loss

1 minute read

focal loss

data imbalance

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

Statistics

Maximum-weighted-liklihood-estimation

1 minute read

Maximum-weighted-likelihood-estimation review

tokenizer

1005-Implementing-NLP-tokenizer

1 minute read

Implementing Tokenizer

GPT

1 minute read

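GPT BERT is an embedding model; GPT is a generative model. BERT uses the encoder; GPT uses the decoder. Given the words so far, which word is the most appropriate one to come next? It is an autoregressive model. It came out before BERT. Natural-language sentences...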

DPR

DPR-review

less than 1 minute read

PDF.

LUKE

KegNet-review

less than 1 minute read

PDF.

RoBERTa

Skim-RoBERTa

less than 1 minute read

Skim-RoBERTa RoBERTa (https://arxiv.org/pdf/1907.11692.pdf) uses dynamic masking: the masking changes every epoch. N...

dataloader

less than 1 minute read

dataloader output dimension: the tuple for each sample is split apart and the pieces are concatenated per batch.

data

1108_data_제작

3 minute read

brief OT The importance of data creation The data construction process and design Basic NLP data

GAN

A GAN review that digs all the way to the bottom

9 minute read

Paper Review GAN Generative Adversarial Nets

rank

marginal rank loss

3 minute read

marginal rank loss

huggingface

huggingface-recap

less than 1 minute read

Reverting encoded values with decode from transformers import AutoTokenizer, AutoModelForSequenceClassification model_name = "nlptown/bert-base-multilingual-uncased-sentiment"...

math-domain

mathBERT review

3 minute read

mathBERT review

debugging

ML-model-debugginng

3 minute read

debugging models
