Concepts Tracker (ML Fundamentals)

Use this to track coverage across the fundamentals. Check items as you master them.

Math

Calculus

  • MATH-011 Chain Rule (High, Not Started, Week 1) :: Formula: d/dx f(g(x)) = f'(g(x))g'(x) :: Interview: How is chain rule used in backprop? :: Resource: https://www.youtube.com/watch?v=YG15m2VwSjA :: Notes: Backpropagation foundation
  • MATH-012 Gradient (High, Not Started, Week 1) :: Formula: ∇f = [∂f/∂x₁ ... ∂f/∂xₙ] :: Interview: What is the gradient? :: Resource: https://www.youtube.com/watch?v=GkB4vW16QHI :: Notes: Direction of steepest ascent
  • MATH-013 Convexity (Medium, Not Started, Week 1) :: Formula: f(λx+(1-λ)y) ≤ λf(x)+(1-λ)f(y) :: Interview: Why is convexity important for optimization? :: Resource: https://www.youtube.com/watch?v=kcOodzDGV4c :: Notes: Guarantees global minimum
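
The chain rule above can be sanity-checked numerically with a finite difference; a minimal sketch in Python (the test function sin(x)², the point x = 0.7, and the step size h are arbitrary choices for illustration):

```python
import math

# Chain rule check for f(g(x)) with f(u) = u², g(x) = sin(x):
# d/dx sin(x)² = 2 sin(x) cos(x), i.e. exactly f'(g(x)) g'(x)
def composed(x):
    return math.sin(x) ** 2

def analytic_grad(x):
    return 2 * math.sin(x) * math.cos(x)

def numeric_grad(f, x, h=1e-6):
    # Central difference approximates the derivative to O(h²)
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
assert abs(analytic_grad(x) - numeric_grad(composed, x)) < 1e-6
```

The same finite-difference trick generalizes to gradients: perturb one coordinate at a time and compare against each ∂f/∂xᵢ.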

Linear Algebra

  • MATH-001 Dot Product (High, Not Started, Week 1) :: Formula: a·b = Σ(aᵢ×bᵢ) = |a||b|cos(θ) :: Interview: What is the geometric interpretation of dot product? :: Resource: https://www.youtube.com/watch?v=LyGKycYT2v0 :: Notes: Measures similarity between vectors
  • MATH-002 Cosine Similarity (High, Not Started, Week 1) :: Formula: cos(θ) = (a·b)/(|a|×|b|) :: Interview: Why use cosine similarity for embeddings? :: Resource: https://www.youtube.com/watch?v=e9U0QAFbfLI :: Notes: Ignores magnitude; focuses on direction
  • MATH-003 Matrix Multiplication (High, Not Started, Week 1) :: Formula: (AB)ᵢⱼ = Σₖ Aᵢₖ × Bₖⱼ :: Interview: Explain matrix multiplication complexity :: Resource: https://www.youtube.com/watch?v=XkY2DOUCWMU :: Notes: O(n³) naive - foundation of neural networks
  • MATH-004 Eigenvalues/Eigenvectors (Medium, Not Started, Week 1) :: Formula: Av = λv :: Interview: What are eigenvalues used for in ML? :: Resource: https://www.youtube.com/watch?v=PFDu9oVAE-g :: Notes: PCA dimensionality reduction
  • MATH-005 SVD (High, Not Started, Week 1) :: Formula: A = UΣVᵀ :: Interview: Explain SVD applications in ML :: Resource: https://www.youtube.com/watch?v=nbBvuuNVfco :: Notes: Recommendations, matrix completion
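
Most of these linear-algebra facts are one-liners in NumPy; a small sketch (the vectors and the matrix A are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude

# Dot product and cosine similarity: parallel vectors give cos(θ) = 1
dot = a @ b
cos_sim = dot / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(cos_sim, 1.0)

# SVD: A = UΣVᵀ reconstructs A exactly
A = np.array([[3.0, 1.0], [1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Eigen-decomposition: Av = λv holds for each eigenpair
vals, vecs = np.linalg.eig(A)
assert np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0])
```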

Probability

  • MATH-006 Bayes Theorem (High, Not Started, Week 1) :: Formula: P(A|B) = P(B|A)P(A)/P(B) :: Interview: Derive Naive Bayes classifier :: Resource: https://www.youtube.com/watch?v=9wCnvr7Xw4E :: Notes: Foundation of probabilistic ML
  • MATH-007 Expectation (High, Not Started, Week 1) :: Formula: E[X] = Σ xᵢP(xᵢ) :: Interview: What is the expectation of X²? :: Resource: https://www.youtube.com/watch?v=KLs_7b7SKi4 :: Notes: Mean of random variable
  • MATH-008 Variance (High, Not Started, Week 1) :: Formula: Var(X) = E[X²] - (E[X])² :: Interview: Derive variance formula :: Resource: https://www.youtube.com/watch?v=Qf3RMGXR-h8 :: Notes: Spread of distribution
  • MATH-009 Normal Distribution (High, Not Started, Week 1) :: Formula: f(x) = (1/√(2πσ²)) exp(-(x-μ)²/(2σ²)) :: Interview: Why is normal distribution important? :: Resource: https://www.youtube.com/watch?v=rzFX5NWojp0 :: Notes: Central limit theorem
  • MATH-010 MLE (High, Not Started, Week 1) :: Formula: θ_MLE = argmax Π P(xᵢ|θ) :: Interview: Derive MLE for Gaussian mean :: Resource: https://www.youtube.com/watch?v=XepXtl9YKwc :: Notes: Training objective foundation
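
The MLE and variance entries can be demonstrated on simulated Gaussian data; a quick sketch (the true mean 5, std 2, and sample size are arbitrary):

```python
import random

random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(10_000)]

# MLE for a Gaussian mean is the sample mean (argmax of the log-likelihood)
mu_mle = sum(xs) / len(xs)
assert abs(mu_mle - 5.0) < 0.1

# Var(X) = E[X²] - (E[X])² recovers σ² = 4
e_x2 = sum(x * x for x in xs) / len(xs)
var = e_x2 - mu_mle ** 2
assert abs(var - 4.0) < 0.3
```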

Classical ML

Ensemble

Metrics

Optimization

  • ML-003 Gradient Descent (High, Not Started, Week 2) :: Formula: θ = θ - α∇L(θ) :: Interview: Explain SGD vs batch GD :: Resource: https://www.youtube.com/watch?v=sDv4f4s2SB8 :: Notes: Core optimization algorithm
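
The update rule θ = θ - α∇L(θ) can be watched converging on a toy quadratic; a minimal sketch (the loss, learning rate, and iteration count are arbitrary):

```python
# Minimize L(θ) = (θ - 3)², whose gradient is ∇L = 2(θ - 3)
theta, alpha = 0.0, 0.1
for _ in range(100):
    grad = 2 * (theta - 3.0)
    theta -= alpha * grad           # θ = θ - α∇L(θ)

assert abs(theta - 3.0) < 1e-6      # converges to the minimizer θ* = 3
```

On SGD vs batch GD: here `grad` is the exact (full-batch) gradient; SGD would replace it with a noisy estimate computed from a sampled subset of the data.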

Regularization

Supervised

  • ML-001 Linear Regression (High, Not Started, Week 2) :: Formula: ŷ = Xw + b; L = (1/n)Σ(y-ŷ)² :: Interview: Derive gradient for linear regression :: Resource: https://www.youtube.com/watch?v=nk2CQITm_eo :: Notes: Foundation regression model
  • ML-002 Logistic Regression (High, Not Started, Week 2) :: Formula: ŷ = σ(Xw+b); L = -Σ[y log(ŷ) + (1-y) log(1-ŷ)] :: Interview: Why cross-entropy not MSE for classification? :: Resource: https://www.youtube.com/watch?v=yIYKR4sgzI8 :: Notes: Foundation classification model
  • ML-007 Decision Trees (High, Not Started, Week 3) :: Formula: IG = H(S) - Σ(|Sᵥ|/|S|)H(Sᵥ) :: Interview: How does a decision tree split? :: Resource: https://www.youtube.com/watch?v=_L39rN6gz7Y :: Notes: Interpretable model
  • ML-014 SVM (Medium, Not Started, Week 3) :: Formula: min ||w||² + CΣξᵢ :: Interview: Explain kernel trick :: Resource: https://www.youtube.com/watch?v=efR1C6CvhmE :: Notes: Maximum margin classifier
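
For ML-001's interview question, it helps to verify that the derived gradients ∇w L = (2/n)Xᵀ(ŷ - y) and ∇b L = (2/n)Σ(ŷ - y) actually recover known parameters; a sketch on synthetic noiseless data (true w = 2, b = 1, learning rate, and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = 2.0 * X[:, 0] + 1.0                    # noiseless data from w = 2, b = 1

w, b, lr = np.zeros(1), 0.0, 0.3
for _ in range(1000):
    err = X @ w + b - y                    # ŷ - y
    w -= lr * (2 / len(y)) * (X.T @ err)   # ∇w of L = (1/n)Σ(y - ŷ)²
    b -= lr * (2 / len(y)) * err.sum()     # ∇b

assert abs(w[0] - 2.0) < 1e-3 and abs(b - 1.0) < 1e-3
```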

Theory

  • ML-006 Bias-Variance Tradeoff (High, Not Started, Week 2) :: Formula: Error = Bias² + Variance + Noise :: Interview: Explain bias-variance tradeoff :: Resource: https://www.youtube.com/watch?v=EuBBz3bI-aA :: Notes: Fundamental ML concept
  • ML-008 Entropy (High, Not Started, Week 3) :: Formula: H(S) = -Σ pᵢ log₂(pᵢ) :: Interview: What is entropy in ML? :: Resource: https://www.youtube.com/watch?v=YtebGVx-Fxw :: Notes: Impurity measure
  • ML-009 Gini Impurity (Medium, Not Started, Week 3) :: Formula: Gini = 1 - Σ pᵢ² :: Interview: Gini vs Entropy? :: Resource: https://www.youtube.com/watch?v=u4IxOk2ijSs :: Notes: Faster to compute than entropy (no logarithm)
  • ML-015 Kernel Trick (Medium, Not Started, Week 3) :: Formula: K(x,x') = φ(x)·φ(x') :: Interview: Why is kernel trick useful? :: Resource: https://www.youtube.com/watch?v=OmTu0fqUsQk :: Notes: Implicit high-dim mapping
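
Entropy and Gini (ML-008/ML-009) are a few lines each; a sketch comparing them on class-probability vectors:

```python
import math

def entropy(ps):
    # H(S) = -Σ pᵢ log₂(pᵢ); zero-probability classes contribute nothing
    return -sum(p * math.log2(p) for p in ps if p > 0)

def gini(ps):
    # Gini = 1 - Σ pᵢ²
    return 1.0 - sum(p * p for p in ps)

assert entropy([0.5, 0.5]) == 1.0       # maximally impure binary node
assert entropy([1.0, 0.0]) == 0.0       # pure node
assert abs(gini([0.5, 0.5]) - 0.5) < 1e-12
assert gini([1.0, 0.0]) == 0.0
```

Both measures peak at the uniform distribution and vanish on pure nodes; Gini skips the logarithm, which is why tree libraries often default to it.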

Unsupervised

Deep Learning

Activations

CNN

Fundamentals

  • DL-001 Forward Propagation (High, Not Started, Week 4) :: Formula: z=Wx+b; a=activation(z) :: Interview: Explain forward pass :: Resource: https://www.youtube.com/watch?v=aircAruvnKk :: Notes: NN computation flow
  • DL-002 Backpropagation (High, Not Started, Week 4) :: Formula: ∂L/∂W = ∂L/∂a × ∂a/∂z × ∂z/∂W :: Interview: Derive backprop for 2-layer NN :: Resource: https://www.youtube.com/watch?v=Ilg3gGewQ5U :: Notes: Training algorithm
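
DL-002's interview question (derive backprop for a 2-layer net) can be checked against a finite difference; a sketch with a tanh hidden layer and squared-error loss (the layer sizes, target, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), 1.5
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))

def forward(W1, W2):
    a1 = np.tanh(W1 @ x)                 # hidden layer: z = W1 x, a = tanh(z)
    y_hat = (W2 @ a1)[0]                 # linear output
    return a1, y_hat, 0.5 * (y_hat - y) ** 2

a1, y_hat, loss = forward(W1, W2)

# Chain rule: ∂L/∂W2 = (ŷ - y) a1ᵀ ; ∂L/∂W1 = [(ŷ - y) W2 ⊙ (1 - a1²)] xᵀ
dW2 = (y_hat - y) * a1[None, :]
dW1 = ((y_hat - y) * W2[0] * (1 - a1 ** 2))[:, None] * x[None, :]

# Verify one entry of dW1 against a finite difference
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
assert abs(dW1[0, 0] - (forward(W1p, W2)[2] - loss) / eps) < 1e-4
```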

Loss

  • DL-006 Cross-Entropy Loss (High, Not Started, Week 4) :: Formula: L = -Σ yᵢlog(ŷᵢ) :: Interview: Derive cross-entropy gradient :: Resource: https://www.youtube.com/watch?v=6ArSys5qHAU :: Notes: Classification loss
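
For DL-006's gradient derivation: with a softmax output, ∂L/∂z simplifies to p - y, which a finite-difference check confirms; a minimal sketch (the logits and one-hot target are arbitrary):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())              # shift by max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])            # logits
y = np.array([1.0, 0.0, 0.0])            # one-hot target
p = softmax(z)
loss = -np.sum(y * np.log(p))            # L = -Σ yᵢ log(ŷᵢ)
grad = p - y                             # softmax + cross-entropy gradient

# Finite-difference check on the first logit
eps = 1e-6
zp = z.copy()
zp[0] += eps
assert abs(grad[0] - (-np.sum(y * np.log(softmax(zp))) - loss) / eps) < 1e-4
```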

Optimization

  • DL-011 Adam Optimizer (High, Not Started, Week 4) :: Formula: m=β₁m+(1-β₁)g; v=β₂v+(1-β₂)g² :: Interview: Why does Adam work well? :: Resource: https://www.youtube.com/watch?v=JXQT_vxqwIs :: Notes: Adaptive learning rates
  • DL-012 Learning Rate Schedule (Medium, Not Started, Week 4) :: Formula: Warmup + decay :: Interview: When should you use LR warmup? :: Resource: https://www.youtube.com/watch?v=UoFxCN2ROag :: Notes: Training stability
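
The Adam update (DL-011), including the bias-corrected moments m̂ and v̂, fits in a few lines; a sketch minimizing θ² with the textbook constants β₁ = 0.9, β₂ = 0.999 (the learning rate, starting point, and step count are arbitrary):

```python
import math

# Minimize L(θ) = θ² with Adam
theta, m, v = 5.0, 0.0, 0.0
beta1, beta2, lr, eps = 0.9, 0.999, 0.01, 1e-8
for t in range(1, 2001):
    g = 2.0 * theta                        # ∇L(θ)
    m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)

assert abs(theta) < 0.5                    # has moved from 5.0 to near 0
```

Note the effective step lr·m̂/√v̂ has roughly constant magnitude regardless of gradient scale, which is the "adaptive learning rate" behavior.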

Problems

  • DL-007 Vanishing Gradient (High, Not Started, Week 4) :: Formula: Gradient → 0 in deep nets :: Interview: How to solve vanishing gradient? :: Resource: https://www.youtube.com/watch?v=qhXZsFVxGKo :: Notes: Deep network issue
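
The vanishing-gradient entry can be made concrete: σ'(z) = σ(z)(1 - σ(z)) peaks at 0.25, so each sigmoid layer shrinks the backpropagated gradient by at least 4x. A sketch (20 layers is an arbitrary depth):

```python
import math

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
dsigmoid = lambda z: sigmoid(z) * (1.0 - sigmoid(z))

grad = 1.0
for _ in range(20):                # 20 layers, best case z = 0 everywhere
    grad *= dsigmoid(0.0)          # multiply by σ'(0) = 0.25 per layer

assert grad == 0.25 ** 20          # ≈ 9.1e-13: effectively vanished
```

This is why ReLU (derivative 1 on the active region), residual connections, and careful initialization are the standard fixes.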

RNN

Regularization

Training

  • DL-008 Weight Initialization (High, Not Started, Week 4) :: Formula: Xavier: N(0, 2/(n_in+n_out)) :: Interview: Why is initialization important? :: Resource: https://www.youtube.com/watch?v=1PGLj-uKT1w :: Notes: Prevent vanishing/exploding
  • DL-013 Gradient Clipping (Medium, Not Started, Week 4) :: Formula: g = g × threshold/||g|| if ||g|| > threshold :: Interview: When to use gradient clipping? :: Resource: https://www.youtube.com/watch?v=8zJMxkghjZU :: Notes: Prevent exploding gradients
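
Both training tricks above are short; a sketch of Xavier initialization and norm-based clipping (the layer sizes and threshold are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 256, 128

# Xavier/Glorot: W ~ N(0, 2/(n_in + n_out)) keeps signal variance stable
W = rng.normal(0.0, np.sqrt(2.0 / (n_in + n_out)), size=(n_out, n_in))
assert abs(W.var() - 2.0 / (n_in + n_out)) < 1e-3

# Gradient clipping by norm: rescale only when ||g|| exceeds the threshold
def clip_by_norm(g, threshold=1.0):
    norm = np.linalg.norm(g)
    return g * threshold / norm if norm > threshold else g

g = np.array([30.0, 40.0])                   # ||g|| = 50
assert np.isclose(np.linalg.norm(clip_by_norm(g)), 1.0)
assert np.array_equal(clip_by_norm(np.array([0.3, 0.4])),
                      np.array([0.3, 0.4]))  # small gradients pass through
```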

Transformer

  • TRANS-001 Self-Attention (High, Not Started, Week 6) :: Formula: softmax(QKᵀ/√dₖ)V :: Interview: Implement self-attention :: Resource: https://www.youtube.com/watch?v=PSs6nxngL6k :: Notes: Core transformer mechanism
  • TRANS-002 Multi-Head Attention (High, Not Started, Week 6) :: Formula: Concat(head₁...headₕ)Wᴼ :: Interview: Why multiple heads? :: Resource: https://www.youtube.com/watch?v=mMa2PmYJlCo :: Notes: Different attention patterns
  • TRANS-003 Positional Encoding (High, Not Started, Week 6) :: Formula: sin/cos at different frequencies :: Interview: Why sinusoidal encoding? :: Resource: https://www.youtube.com/watch?v=1biZfFLPRSY :: Notes: Position information
  • TRANS-004 Transformer Architecture (High, Not Started, Week 6) :: Formula: Encoder/Decoder with attention :: Interview: Explain transformer architecture :: Resource: https://www.youtube.com/watch?v=zxQyTK8quyY :: Notes: Foundation of modern NLP
  • TRANS-005 BERT (High, Not Started, Week 6) :: Formula: Masked LM + NSP :: Interview: BERT vs GPT? :: Resource: https://www.youtube.com/watch?v=xI0HHN5XKDo :: Notes: Bidirectional encoder
  • TRANS-006 GPT (High, Not Started, Week 6) :: Formula: Autoregressive LM :: Interview: How does GPT generate text? :: Resource: https://www.youtube.com/watch?v=kCc8FmEb1nY :: Notes: Causal decoder
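
TRANS-001's interview question ("implement self-attention") is worth doing from scratch; a minimal single-head sketch of softmax(QKᵀ/√dₖ)V in NumPy (the sequence length, model dim, and head dim are arbitrary):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # QKᵀ/√d_k
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)          # row-wise softmax
    return w @ V, w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                        # 5 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)

assert out.shape == (5, 4)                         # one d_k-dim vector per token
assert np.allclose(attn.sum(axis=-1), 1.0)         # each row is a distribution
```

Multi-head attention (TRANS-002) runs h copies of this with separate projections and concatenates the outputs before the final Wᴼ projection.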

NLP

Decoding

Embeddings

Metrics

Preprocessing

RecSys

Methods

  • RECSYS-001 Collaborative Filtering (High, Not Started, Week 7) :: Formula: User-based or Item-based :: Interview: How do you handle the cold-start problem? :: Resource: https://www.youtube.com/watch?v=h9gpufJFF-0 :: Notes: Similarity-based recs
  • RECSYS-002 Matrix Factorization (High, Not Started, Week 7) :: Formula: R ≈ PQᵀ :: Interview: How to handle implicit feedback? :: Resource: https://www.youtube.com/watch?v=ZspR5PZemcs :: Notes: Latent factors
  • RECSYS-003 Two-Tower Model (High, Not Started, Week 8) :: Formula: User tower + Item tower :: Interview: Explain two-tower architecture :: Resource: https://www.youtube.com/watch?v=Jnll9TYxsVM :: Notes: Deep learning recs
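
For RECSYS-002, one classical way to fit R ≈ PQᵀ is alternating least squares (ALS): fix Q and solve for P in closed form, then fix P and solve for Q. A sketch on a tiny, fully observed, exactly rank-2 matrix (the factor matrices below are arbitrary examples; real systems handle missing entries and add regularization):

```python
import numpy as np

# Build an exactly rank-2 "ratings" matrix from known factors, then recover it
P_true = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.3, 0.9]])
Q_true = np.array([[5.0, 1.0], [3.0, 1.0], [1.0, 5.0]])
R = P_true @ Q_true.T                      # 4 users x 3 items

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 2))                # random item factors to start
for _ in range(20):
    P = R @ Q @ np.linalg.inv(Q.T @ Q)     # closed-form best P given Q
    Q = R.T @ P @ np.linalg.inv(P.T @ P)   # closed-form best Q given P

assert np.abs(P @ Q.T - R).max() < 1e-6    # rank-2 factorization recovered
```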

Serving

  • RECSYS-004 ANN/FAISS (Medium, Not Started, Week 8) :: Formula: Approximate nearest neighbor :: Interview: How to serve recs at scale? :: Resource: https://www.youtube.com/watch?v=sKyvsdEv6rk :: Notes: Fast similarity search
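
ANN libraries like FAISS approximate the exact retrieval below, trading a little recall for sublinear query time; a brute-force baseline sketch (the catalog size and embedding dim are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
items = rng.normal(size=(10_000, 32))      # item embedding table
user = rng.normal(size=32)                 # query (user) embedding

scores = items @ user                      # one dot product per item: O(n)
top_k = np.argsort(-scores)[:10]           # exact top-10 by inner product

assert top_k.shape == (10,)
assert scores[top_k[0]] == scores.max()    # best item really is the argmax
```

At serving scale the O(n) scan per query is the bottleneck, which is what ANN indexes (e.g. inverted files, HNSW graphs, product quantization) are built to avoid.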

ML Systems

Deployment

Evaluation

Infrastructure

  • MLSYS-001 Feature Store (Medium, Not Started, Week 8) :: Formula: Centralized feature management :: Interview: What is a feature store? :: Resource: https://www.youtube.com/watch?v=OVWW93IGlnk :: Notes: Feature management

Monitoring

Optimization
