February 20, 2026
- 1:26 PM – 2:26 PM (1h): Updated this site
- 11:11 AM – 11:41 AM (30m): Anki
- 7:58 AM – 9:58 AM (2h): test entry with new site
- 9:00 AM – 9:10 AM (10m): test json write
- 7:14 AM – 8:08 AM (54m): Clues by Sam and DailyIntegral logic puzzle
February 19, 2026
- 9:50 AM – 12:13 PM (2h 23m): Continued PDE homework, mainly practiced higher dimensional weak derivatives
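- For reference, the standard definition I was practicing (a recap, not something recorded in the session): \(v = D^\alpha u\) weakly on \(U\) when
\[
\int_U u \, D^\alpha \varphi \, dx = (-1)^{|\alpha|} \int_U v \, \varphi \, dx \quad \text{for every } \varphi \in C_c^\infty(U).
\]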
February 18, 2026
- 8:15 AM – 11:04 AM (2h 49m): PDE homework which mainly covered weak derivatives and more practice solving 1d wave equations
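- For reference, the standard d'Alembert formula for the 1d problem \(u_{tt} = c^2 u_{xx}\) with \(u(x,0) = g(x)\), \(u_t(x,0) = h(x)\) (a recap, not from the session):
\[
u(x,t) = \tfrac{1}{2}\big(g(x+ct) + g(x-ct)\big) + \frac{1}{2c}\int_{x-ct}^{x+ct} h(s)\,ds.
\]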
February 17, 2026
- 11:20 AM – 12:59 PM (1h 39m): More asymptotics of RL
- Notes
- Went over convergence in the finite-time case and discovered why the result holds for all discount factors there
- 10:55 AM – 11:20 AM (25m): Logic puzzle and integral
- Sources
- dailyintegral.com
- 7:57 AM – 8:57 AM (1h): Read more of High-Dimensional Probability by Vershynin. Did the first exercise in Chapter 2
- Notes
- I got the lower bound by induction and the upper bound using Markov's inequality after taking logs and exponentiating; see the sketch below
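- The upper-bound step in generic form (this is the standard exponentiate-then-Markov trick; the exercise's exact statement isn't reproduced here): for any \(\lambda > 0\),
\[
\mathbb{P}(X \ge t) = \mathbb{P}\big(e^{\lambda X} \ge e^{\lambda t}\big) \le e^{-\lambda t}\,\mathbb{E}\,e^{\lambda X},
\]
then optimize over \(\lambda\).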
February 16, 2026
- 9:30 AM – 11:00 AM (1h 30m): Cleared Anki deck, roughly 250 cards.
- Notes
- Still struggling with colors; blues always give me a tough time
- 7:30 AM – 9:00 AM (1h 30m): Reviewed the Asymptotics of RL paper and talked a bit with Gemini and NotebookLM
- Notes
- Mainly focused on overall proof flow. The trickiest step is the stochastic decomposition where the martingale terms pop out; a generic sketch is below.
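- A generic sketch of the kind of decomposition I mean (standard stochastic-approximation form, not necessarily the paper's notation): write the update as
\[
\theta_{t+1} = \theta_t + \alpha_t H(\theta_t, X_{t+1}) = \theta_t + \alpha_t h(\theta_t) + \alpha_t M_{t+1},
\]
where \(h(\theta) = \mathbb{E}[H(\theta, X)]\) is the mean drift and \(M_{t+1} = H(\theta_t, X_{t+1}) - h(\theta_t)\) is a martingale-difference term, so the drift and the martingale noise can be controlled separately.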
February 15, 2026
- Notes
- The measure for the weight distribution is frozen in time, but the solution still evolves in time. Roughly, the network is so wide that each update is tiny enough that the distribution doesn't change, yet the accumulated changes over all weights are \(O(1)\). The learning is driven by the kernel \(A\) and the TD error given by the environment. The network acts like a fixed feature space instead of learning to represent new features (sketch below).
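- A sketch of the lazy-training picture above (standard NTK-style linearization; the symbols \(f\) and \(\theta\) are my notation, only \(A\) comes from the notes): in the wide limit,
\[
f(x;\theta_t) \approx f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^\top (\theta_t - \theta_0),
\]
so a TD update moves predictions through the frozen kernel \(A(x,x') = \nabla_\theta f(x;\theta_0)^\top \nabla_\theta f(x';\theta_0)\), scaled by the TD error \(\delta_t = r_t + \gamma f(x_{t+1};\theta_t) - f(x_t;\theta_t)\): each individual weight moves by \(o(1)\) while the function change accumulates to \(O(1)\).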
- 8:28 AM – 9:13 AM (45m): Built this app, made some git repos and pushed changes
- 8:00 AM – 8:30 AM (30m): Continued learning transformers
- Sources
- Karpathy's tutorial on YouTube
- Notes
- Biggest takeaway from this session was learning the difference between encoder and decoder, which he doesn't really explain until the end; see the sketch below
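- A minimal sketch of the mechanical difference (my illustration, not code from the tutorial): a decoder applies a causal mask so each position attends only to earlier positions, while an encoder attends bidirectionally.

```python
import torch

T = 8                       # sequence length (illustrative)
scores = torch.randn(T, T)  # raw attention scores for a single head

# Decoder-style (causal) self-attention: token i attends only to tokens <= i.
causal = torch.tril(torch.ones(T, T, dtype=torch.bool))
decoder_attn = torch.softmax(scores.masked_fill(~causal, float("-inf")), dim=-1)

# Encoder-style self-attention: no mask, every token sees the full sequence.
encoder_attn = torch.softmax(scores, dim=-1)
```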
- 7:30 AM – 8:00 AM (30m): Relearned comparative advantage
- Notes
- So the only reason this works is relative opportunity cost. Even if party A has an absolute advantage in efficiency over party B, party A incurs an opportunity cost in choosing to spend time making something it isn't most efficient at. Thus, they leave it to the scrubs to make that thing (even though party A is better at it).
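- A made-up worked example (hypothetical numbers): per hour, A can make 6 widgets or 3 gadgets, while B can make 1 widget or 2 gadgets. A has the absolute advantage in both goods, but A's opportunity cost of one gadget is 2 widgets while B's is only 1/2 widget, so B has the comparative advantage in gadgets. Total output is highest when A specializes in widgets and B in gadgets, and any trade at a price between 1/2 and 2 widgets per gadget leaves both better off.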
