Wei Yi – Medium

Wei Yi

Pinned

Wei Yi
in
Towards Data Science

Understanding the Denoising Diffusion Probabilistic Model, the Socratic Way

A deep dive into the motivation behind the denoising diffusion model and detailed derivations for the loss function

Feb 25, 2023

Pinned

Wei Yi
in
Towards Data Science

Understand REINFORCE, Actor-Critic and PPO in one go

Use the loss function of the Policy Gradient algorithm to understand REINFORCE, Actor-Critic, and Proximal Policy Optimization (PPO).

2d ago

Pinned

Wei Yi
in
Towards Data Science

How Does an Image-Text Multimodal Foundation Model Work

Learn how an image-text multi-modality model can perform image classification, image retrieval, and image captioning

Jun 1

Pinned

Wei Yi
in
Towards Data Science

How Does the Segment-Anything Model’s (SAM’s) Encoder Work?

A deep dive into how image content embedding, sine and cosine positional embedding, guidance click embedding, and dense mask embedding are…

May 14

Pinned

Wei Yi
in
Towards Data Science

How does the Segment-Anything Model’s (SAM’s) decoder work?

A deep dive into the Segment-Anything model’s decoding procedure, with a focus on how its self-attention and cross-attention mechanism…

Mar 24

Wei Yi
in
Towards Data Science

Speeding up vision transformer prediction by 9 times faster with PyTorch, ONNX and TensorRT

How to use 16-bit floats, TensorRT, network rewriting, and multi-threading to dramatically speed up deep learning model prediction

Jun 4, 2023

Wei Yi
in
Towards Data Science

How Decision Trees Split Nodes, from Loss Function Perspective

Learn how a decision tree splits nodes only to minimize its loss function

May 15, 2023

Wei Yi
in
Towards Data Science

Distributed data parallel and distributed model parallel in PyTorch

How distributed data parallel (DDP) and distributed model parallel (DMP) work in stochastic gradient descent with large models and huge datasets

May 8, 2023

Wei Yi
in
Towards Data Science

The Input-output Attention Mechanism from “Neural Machine Translation by Jointly Learning…

Learn the math and intuition behind the input-output attention mechanism in an RNN-based language-to-language translation model

Mar 18, 2022

Wei Yi
in
Towards Data Science

Can We Use Stochastic Gradient Descent (SGD) on a Linear Regression Model?

Learn why it is valid to use SGD on a linear regression model for parameter learning, see why SGD can nevertheless be inefficient, and appreciate…

Aug 5, 2021

Wei Yi

I am a principal data scientist at AstraZeneca. Previously, I worked at SecondMind and Microsoft Research, and was CTO of the hedge fund EQB.
