Concepts Tracker (ML Fundamentals)

Use this to track coverage across the fundamentals. Check items as you master them.

Math

Calculus

  • MATH-011 Chain Rule (High, Not Started, Week 1) :: Formula: d/dx f(g(x)) = f'(g(x))g'(x) :: Interview: How is chain rule used in backprop? :: Resource: https://www.youtube.com/watch?v=YG15m2VwSjA :: Notes: Backpropagation foundation
  • MATH-012 Gradient (High, Not Started, Week 1) :: Formula: ∇f = [∂f/∂x₁ ... ∂f/∂xₙ] :: Interview: What is the gradient? :: Resource: https://www.youtube.com/watch?v=GkB4vW16QHI :: Notes: Direction of steepest ascent
  • MATH-013 Convexity (Medium, Not Started, Week 1) :: Formula: f(λx+(1-λ)y) ≤ λf(x)+(1-λ)f(y) :: Interview: Why is convexity important for optimization? :: Resource: https://www.youtube.com/watch?v=kcOodzDGV4c :: Notes: Guarantees global minimum
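
The chain rule above can be sanity-checked numerically with a finite difference; a minimal sketch in Python (the test function sin(x)², the point x = 0.7, and the step size h are arbitrary choices for illustration):

```python
import math

# Chain rule check for f(g(x)) with f(u) = u², g(x) = sin(x):
# d/dx sin(x)² = 2 sin(x) cos(x), i.e. exactly f'(g(x)) g'(x)
def composed(x):
    return math.sin(x) ** 2

def analytic_grad(x):
    return 2 * math.sin(x) * math.cos(x)

def numeric_grad(f, x, h=1e-6):
    # Central difference approximates the derivative to O(h²)
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
assert abs(analytic_grad(x) - numeric_grad(composed, x)) < 1e-6
```

The same finite-difference trick generalizes to gradients: perturb one coordinate at a time and compare against each ∂f/∂xᵢ.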

Linear Algebra

  • MATH-001 Dot Product (High, Not Started, Week 1) :: Formula: a·b = Σ(aᵢ×bᵢ) = |a||b|cos(θ) :: Interview: What is the geometric interpretation of dot product? :: Resource: https://www.youtube.com/watch?v=LyGKycYT2v0 :: Notes: Measures similarity between vectors
  • MATH-002 Cosine Similarity (High, Not Started, Week 1) :: Formula: cos(θ) = (a·b)/(|a|×|b|) :: Interview: Why use cosine similarity for embeddings? :: Resource: https://www.youtube.com/watch?v=e9U0QAFbfLI :: Notes: Ignores magnitude; focuses on direction
  • MATH-003 Matrix Multiplication (High, Not Started, Week 1) :: Formula: (AB)ᵢⱼ = Σₖ Aᵢₖ × Bₖⱼ :: Interview: Explain matrix multiplication complexity :: Resource: https://www.youtube.com/watch?v=XkY2DOUCWMU :: Notes: O(n³) naive - foundation of neural networks
  • MATH-004 Eigenvalues/Eigenvectors (Medium, Not Started, Week 1) :: Formula: Av = λv :: Interview: What are eigenvalues used for in ML? :: Resource: https://www.youtube.com/watch?v=PFDu9oVAE-g :: Notes: PCA dimensionality reduction
  • MATH-005 SVD (High, Not Started, Week 1) :: Formula: A = UΣVᵀ :: Interview: Explain SVD applications in ML :: Resource: https://www.youtube.com/watch?v=nbBvuuNVfco :: Notes: Recommendations, matrix completion
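
Most of these linear-algebra facts are one-liners in NumPy; a small sketch (the vectors and the matrix A are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude

# Dot product and cosine similarity: parallel vectors give cos(θ) = 1
dot = a @ b
cos_sim = dot / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(cos_sim, 1.0)

# SVD: A = UΣVᵀ reconstructs A exactly
A = np.array([[3.0, 1.0], [1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Eigen-decomposition: Av = λv holds for each eigenpair
vals, vecs = np.linalg.eig(A)
assert np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0])
```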

Probability

  • MATH-006 Bayes Theorem (High, Not Started, Week 1) :: Formula: P(A|B) = P(B|A)P(A)/P(B) :: Interview: Derive Naive Bayes classifier :: Resource: https://www.youtube.com/watch?v=9wCnvr7Xw4E :: Notes: Foundation of probabilistic ML
  • MATH-007 Expectation (High, Not Started, Week 1) :: Formula: E[X] = Σ xᵢP(xᵢ) :: Interview: What is the expectation of X²? :: Resource: https://www.youtube.com/watch?v=KLs_7b7SKi4 :: Notes: Mean of random variable
  • MATH-008 Variance (High, Not Started, Week 1) :: Formula: Var(X) = E[X²] - (E[X])² :: Interview: Derive variance formula :: Resource: https://www.youtube.com/watch?v=Qf3RMGXR-h8 :: Notes: Spread of distribution
  • MATH-009 Normal Distribution (High, Not Started, Week 1) :: Formula: f(x) = (1/√(2πσ²)) exp(-(x-μ)²/(2σ²)) :: Interview: Why is normal distribution important? :: Resource: https://www.youtube.com/watch?v=rzFX5NWojp0 :: Notes: Central limit theorem
  • MATH-010 MLE (High, Not Started, Week 1) :: Formula: θ_MLE = argmax Π P(xᵢ|θ) :: Interview: Derive MLE for Gaussian mean :: Resource: https://www.youtube.com/watch?v=XepXtl9YKwc :: Notes: Training objective foundation
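
The MLE and variance entries can be demonstrated on simulated Gaussian data; a quick sketch (the true mean 5, std 2, and sample size are arbitrary):

```python
import random

random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(10_000)]

# MLE for a Gaussian mean is the sample mean (argmax of the log-likelihood)
mu_mle = sum(xs) / len(xs)
assert abs(mu_mle - 5.0) < 0.1

# Var(X) = E[X²] - (E[X])² recovers σ² = 4
e_x2 = sum(x * x for x in xs) / len(xs)
var = e_x2 - mu_mle ** 2
assert abs(var - 4.0) < 0.3
```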

Classical ML

Ensemble

Metrics

Optimization

  • ML-003 Gradient Descent (High, Not Started, Week 2) :: Formula: θ = θ - α∇L(θ) :: Interview: Explain SGD vs batch GD :: Resource: https://www.youtube.com/watch?v=sDv4f4s2SB8 :: Notes: Core optimization algorithm
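
The update rule θ = θ - α∇L(θ) can be watched converging on a toy quadratic; a minimal sketch (the loss, learning rate, and iteration count are arbitrary):

```python
# Minimize L(θ) = (θ - 3)², whose gradient is ∇L = 2(θ - 3)
theta, alpha = 0.0, 0.1
for _ in range(100):
    grad = 2 * (theta - 3.0)
    theta -= alpha * grad           # θ = θ - α∇L(θ)

assert abs(theta - 3.0) < 1e-6      # converges to the minimizer θ* = 3
```

On SGD vs batch GD: here `grad` is the exact (full-batch) gradient; SGD would replace it with a noisy estimate computed from a sampled subset of the data.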

Regularization

Supervised

  • ML-001 Linear Regression (High, Not Started, Week 2) :: Formula: ŷ = Xw + b; L = (1/n)Σ(y-ŷ)² :: Interview: Derive gradient for linear regression :: Resource: https://www.youtube.com/watch?v=nk2CQITm_eo :: Notes: Foundation regression model
  • ML-002 Logistic Regression (High, Not Started, Week 2) :: Formula: ŷ = σ(Xw+b); L = -Σ[y log(ŷ) + (1-y) log(1-ŷ)] :: Interview: Why cross-entropy not MSE for classification? :: Resource: https://www.youtube.com/watch?v=yIYKR4sgzI8 :: Notes: Foundation classification model
  • ML-007 Decision Trees (High, Not Started, Week 3) :: Formula: IG = H(S) - Σ(|Sᵥ|/|S|)H(Sᵥ) :: Interview: How does a decision tree split? :: Resource: https://www.youtube.com/watch?v=_L39rN6gz7Y :: Notes: Interpretable model
  • ML-014 SVM (Medium, Not Started, Week 3) :: Formula: min ||w||² + CΣξᵢ :: Interview: Explain kernel trick :: Resource: https://www.youtube.com/watch?v=efR1C6CvhmE :: Notes: Maximum margin classifier
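
For ML-001's interview question, it helps to verify that the derived gradients ∇w L = (2/n)Xᵀ(ŷ - y) and ∇b L = (2/n)Σ(ŷ - y) actually recover known parameters; a sketch on synthetic noiseless data (true w = 2, b = 1, learning rate, and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = 2.0 * X[:, 0] + 1.0                    # noiseless data from w = 2, b = 1

w, b, lr = np.zeros(1), 0.0, 0.3
for _ in range(1000):
    err = X @ w + b - y                    # ŷ - y
    w -= lr * (2 / len(y)) * (X.T @ err)   # ∇w of L = (1/n)Σ(y - ŷ)²
    b -= lr * (2 / len(y)) * err.sum()     # ∇b

assert abs(w[0] - 2.0) < 1e-3 and abs(b - 1.0) < 1e-3
```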

Theory

  • ML-006 Bias-Variance Tradeoff (High, Not Started, Week 2) :: Formula: Error = Bias² + Variance + Noise :: Interview: Explain bias-variance tradeoff :: Resource: https://www.youtube.com/watch?v=EuBBz3bI-aA :: Notes: Fundamental ML concept
  • ML-008 Entropy (High, Not Started, Week 3) :: Formula: H(S) = -Σ pᵢ log₂(pᵢ) :: Interview: What is entropy in ML? :: Resource: https://www.youtube.com/watch?v=YtebGVx-Fxw :: Notes: Impurity measure
  • ML-009 Gini Impurity (Medium, Not Started, Week 3) :: Formula: Gini = 1 - Σ pᵢ² :: Interview: Gini vs Entropy? :: Resource: https://www.youtube.com/watch?v=u4IxOk2ijSs :: Notes: Faster to compute than entropy (no logarithm)
  • ML-015 Kernel Trick (Medium, Not Started, Week 3) :: Formula: K(x,x') = φ(x)·φ(x') :: Interview: Why is kernel trick useful? :: Resource: https://www.youtube.com/watch?v=OmTu0fqUsQk :: Notes: Implicit high-dim mapping
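
Entropy and Gini (ML-008/ML-009) are a few lines each; a sketch comparing them on class-probability vectors:

```python
import math

def entropy(ps):
    # H(S) = -Σ pᵢ log₂(pᵢ); zero-probability classes contribute nothing
    return -sum(p * math.log2(p) for p in ps if p > 0)

def gini(ps):
    # Gini = 1 - Σ pᵢ²
    return 1.0 - sum(p * p for p in ps)

assert entropy([0.5, 0.5]) == 1.0       # maximally impure binary node
assert entropy([1.0, 0.0]) == 0.0       # pure node
assert abs(gini([0.5, 0.5]) - 0.5) < 1e-12
assert gini([1.0, 0.0]) == 0.0
```

Both measures peak at the uniform distribution and vanish on pure nodes; Gini skips the logarithm, which is why tree libraries often default to it.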

Unsupervised

Deep Learning

Activations

CNN

Fundamentals

  • DL-001 Forward Propagation (High, Not Started, Week 4) :: Formula: z=Wx+b; a=activation(z) :: Interview: Explain forward pass :: Resource: https://www.youtube.com/watch?v=aircAruvnKk :: Notes: NN computation flow
  • DL-002 Backpropagation (High, Not Started, Week 4) :: Formula: ∂L/∂W = ∂L/∂a × ∂a/∂z × ∂z/∂W :: Interview: Derive backprop for 2-layer NN :: Resource: https://www.youtube.com/watch?v=Ilg3gGewQ5U :: Notes: Training algorithm
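
DL-002's interview question (derive backprop for a 2-layer net) can be checked against a finite difference; a sketch with a tanh hidden layer and squared-error loss (the layer sizes, target, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), 1.5
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))

def forward(W1, W2):
    a1 = np.tanh(W1 @ x)                 # hidden layer: z = W1 x, a = tanh(z)
    y_hat = (W2 @ a1)[0]                 # linear output
    return a1, y_hat, 0.5 * (y_hat - y) ** 2

a1, y_hat, loss = forward(W1, W2)

# Chain rule: ∂L/∂W2 = (ŷ - y) a1ᵀ ; ∂L/∂W1 = [(ŷ - y) W2 ⊙ (1 - a1²)] xᵀ
dW2 = (y_hat - y) * a1[None, :]
dW1 = ((y_hat - y) * W2[0] * (1 - a1 ** 2))[:, None] * x[None, :]

# Verify one entry of dW1 against a finite difference
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
assert abs(dW1[0, 0] - (forward(W1p, W2)[2] - loss) / eps) < 1e-4
```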

Loss

  • DL-006 Cross-Entropy Loss (High, Not Started, Week 4) :: Formula: L = -Σ yᵢlog(ŷᵢ) :: Interview: Derive cross-entropy gradient :: Resource: https://www.youtube.com/watch?v=6ArSys5qHAU :: Notes: Classification loss
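
For DL-006's gradient derivation: with a softmax output, ∂L/∂z simplifies to p - y, which a finite-difference check confirms; a minimal sketch (the logits and one-hot target are arbitrary):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())              # shift by max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])            # logits
y = np.array([1.0, 0.0, 0.0])            # one-hot target
p = softmax(z)
loss = -np.sum(y * np.log(p))            # L = -Σ yᵢ log(ŷᵢ)
grad = p - y                             # softmax + cross-entropy gradient

# Finite-difference check on the first logit
eps = 1e-6
zp = z.copy()
zp[0] += eps
assert abs(grad[0] - (-np.sum(y * np.log(softmax(zp))) - loss) / eps) < 1e-4
```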

Optimization

  • DL-011 Adam Optimizer (High, Not Started, Week 4) :: Formula: m=β₁m+(1-β₁)g; v=β₂v+(1-β₂)g² :: Interview: Why does Adam work well? :: Resource: https://www.youtube.com/watch?v=JXQT_vxqwIs :: Notes: Adaptive learning rates
  • DL-012 Learning Rate Schedule (Medium, Not Started, Week 4) :: Formula: Warmup + decay :: Interview: When should you use LR warmup? :: Resource: https://www.youtube.com/watch?v=UoFxCN2ROag :: Notes: Training stability
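
The Adam update (DL-011), including the bias-corrected moments m̂ and v̂, fits in a few lines; a sketch minimizing θ² with the textbook constants β₁ = 0.9, β₂ = 0.999 (the learning rate, starting point, and step count are arbitrary):

```python
import math

# Minimize L(θ) = θ² with Adam
theta, m, v = 5.0, 0.0, 0.0
beta1, beta2, lr, eps = 0.9, 0.999, 0.01, 1e-8
for t in range(1, 2001):
    g = 2.0 * theta                        # ∇L(θ)
    m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)

assert abs(theta) < 0.5                    # has moved from 5.0 to near 0
```

Note the effective step lr·m̂/√v̂ has roughly constant magnitude regardless of gradient scale, which is the "adaptive learning rate" behavior.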

Problems

  • DL-007 Vanishing Gradient (High, Not Started, Week 4) :: Formula: Gradient → 0 in deep nets :: Interview: How to solve vanishing gradient? :: Resource: https://www.youtube.com/watch?v=qhXZsFVxGKo :: Notes: Deep network issue
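
The vanishing-gradient entry can be made concrete: σ'(z) = σ(z)(1 - σ(z)) peaks at 0.25, so each sigmoid layer shrinks the backpropagated gradient by at least 4x. A sketch (20 layers is an arbitrary depth):

```python
import math

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
dsigmoid = lambda z: sigmoid(z) * (1.0 - sigmoid(z))

grad = 1.0
for _ in range(20):                # 20 layers, best case z = 0 everywhere
    grad *= dsigmoid(0.0)          # multiply by σ'(0) = 0.25 per layer

assert grad == 0.25 ** 20          # ≈ 9.1e-13: effectively vanished
```

This is why ReLU (derivative 1 on the active region), residual connections, and careful initialization are the standard fixes.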

RNN

Regularization

Training

  • DL-008 Weight Initialization (High, Not Started, Week 4) :: Formula: Xavier: N(0, 2/(n_in+n_out)) :: Interview: Why is initialization important? :: Resource: https://www.youtube.com/watch?v=1PGLj-uKT1w :: Notes: Prevent vanishing/exploding
  • DL-013 Gradient Clipping (Medium, Not Started, Week 4) :: Formula: g = g × threshold/||g|| if ||g|| > threshold :: Interview: When to use gradient clipping? :: Resource: https://www.youtube.com/watch?v=8zJMxkghjZU :: Notes: Prevent exploding gradients
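
Both training tricks above are short; a sketch of Xavier initialization and norm-based clipping (the layer sizes and threshold are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 256, 128

# Xavier/Glorot: W ~ N(0, 2/(n_in + n_out)) keeps signal variance stable
W = rng.normal(0.0, np.sqrt(2.0 / (n_in + n_out)), size=(n_out, n_in))
assert abs(W.var() - 2.0 / (n_in + n_out)) < 1e-3

# Gradient clipping by norm: rescale only when ||g|| exceeds the threshold
def clip_by_norm(g, threshold=1.0):
    norm = np.linalg.norm(g)
    return g * threshold / norm if norm > threshold else g

g = np.array([30.0, 40.0])                   # ||g|| = 50
assert np.isclose(np.linalg.norm(clip_by_norm(g)), 1.0)
assert np.array_equal(clip_by_norm(np.array([0.3, 0.4])),
                      np.array([0.3, 0.4]))  # small gradients pass through
```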

Transformer

  • TRANS-001 Self-Attention (High, Not Started, Week 6) :: Formula: softmax(QKᵀ/√dₖ)V :: Interview: Implement self-attention :: Resource: https://www.youtube.com/watch?v=PSs6nxngL6k :: Notes: Core transformer mechanism
  • TRANS-002 Multi-Head Attention (High, Not Started, Week 6) :: Formula: Concat(head₁...headₕ)Wᴼ :: Interview: Why multiple heads? :: Resource: https://www.youtube.com/watch?v=mMa2PmYJlCo :: Notes: Different attention patterns
  • TRANS-003 Positional Encoding (High, Not Started, Week 6) :: Formula: sin/cos at different frequencies :: Interview: Why sinusoidal encoding? :: Resource: https://www.youtube.com/watch?v=1biZfFLPRSY :: Notes: Position information
  • TRANS-004 Transformer Architecture (High, Not Started, Week 6) :: Formula: Encoder/Decoder with attention :: Interview: Explain transformer architecture :: Resource: https://www.youtube.com/watch?v=zxQyTK8quyY :: Notes: Foundation of modern NLP
  • TRANS-005 BERT (High, Not Started, Week 6) :: Formula: Masked LM + NSP :: Interview: BERT vs GPT? :: Resource: https://www.youtube.com/watch?v=xI0HHN5XKDo :: Notes: Bidirectional encoder
  • TRANS-006 GPT (High, Not Started, Week 6) :: Formula: Autoregressive LM :: Interview: How does GPT generate text? :: Resource: https://www.youtube.com/watch?v=kCc8FmEb1nY :: Notes: Causal decoder
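
TRANS-001's interview question ("implement self-attention") is worth doing from scratch; a minimal single-head sketch of softmax(QKᵀ/√dₖ)V in NumPy (the sequence length, model dim, and head dim are arbitrary):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # QKᵀ/√d_k
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)          # row-wise softmax
    return w @ V, w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                        # 5 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)

assert out.shape == (5, 4)                         # one d_k-dim vector per token
assert np.allclose(attn.sum(axis=-1), 1.0)         # each row is a distribution
```

Multi-head attention (TRANS-002) runs h copies of this with separate projections and concatenates the outputs before the final Wᴼ projection.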

NLP

Decoding

Embeddings

Metrics

Preprocessing

RecSys

Methods

  • RECSYS-001 Collaborative Filtering (High, Not Started, Week 7) :: Formula: User-based or Item-based :: Interview: How do you handle the cold-start problem? :: Resource: https://www.youtube.com/watch?v=h9gpufJFF-0 :: Notes: Similarity-based recs
  • RECSYS-002 Matrix Factorization (High, Not Started, Week 7) :: Formula: R ≈ PQᵀ :: Interview: How to handle implicit feedback? :: Resource: https://www.youtube.com/watch?v=ZspR5PZemcs :: Notes: Latent factors
  • RECSYS-003 Two-Tower Model (High, Not Started, Week 8) :: Formula: User tower + Item tower :: Interview: Explain two-tower architecture :: Resource: https://www.youtube.com/watch?v=Jnll9TYxsVM :: Notes: Deep learning recs
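
For RECSYS-002, one classical way to fit R ≈ PQᵀ is alternating least squares (ALS): fix Q and solve for P in closed form, then fix P and solve for Q. A sketch on a tiny, fully observed, exactly rank-2 matrix (the factor matrices below are arbitrary examples; real systems handle missing entries and add regularization):

```python
import numpy as np

# Build an exactly rank-2 "ratings" matrix from known factors, then recover it
P_true = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.3, 0.9]])
Q_true = np.array([[5.0, 1.0], [3.0, 1.0], [1.0, 5.0]])
R = P_true @ Q_true.T                      # 4 users x 3 items

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 2))                # random item factors to start
for _ in range(20):
    P = R @ Q @ np.linalg.inv(Q.T @ Q)     # closed-form best P given Q
    Q = R.T @ P @ np.linalg.inv(P.T @ P)   # closed-form best Q given P

assert np.abs(P @ Q.T - R).max() < 1e-6    # rank-2 factorization recovered
```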

Serving

  • RECSYS-004 ANN/FAISS (Medium, Not Started, Week 8) :: Formula: Approximate nearest neighbor :: Interview: How to serve recs at scale? :: Resource: https://www.youtube.com/watch?v=sKyvsdEv6rk :: Notes: Fast similarity search
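
ANN libraries like FAISS approximate the exact retrieval below, trading a little recall for sublinear query time; a brute-force baseline sketch (the catalog size and embedding dim are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
items = rng.normal(size=(10_000, 32))      # item embedding table
user = rng.normal(size=32)                 # query (user) embedding

scores = items @ user                      # one dot product per item: O(n)
top_k = np.argsort(-scores)[:10]           # exact top-10 by inner product

assert top_k.shape == (10,)
assert scores[top_k[0]] == scores.max()    # best item really is the argmax
```

At serving scale the O(n) scan per query is the bottleneck, which is what ANN indexes (e.g. inverted files, HNSW graphs, product quantization) are built to avoid.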

ML Systems

Deployment

Evaluation

Infrastructure

  • MLSYS-001 Feature Store (Medium, Not Started, Week 8) :: Formula: Centralized feature management :: Interview: What is a feature store? :: Resource: https://www.youtube.com/watch?v=OVWW93IGlnk :: Notes: Feature management

Monitoring

Optimization
