<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>nzxhuong'log</title><link>https://nzxhuong.github.io/</link><description>Just a bunch of learning notes.</description><atom:link href="https://nzxhuong.github.io/rss.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><lastBuildDate>Sat, 12 Apr 2025 19:08:45 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Variational Autoencoders</title><link>https://nzxhuong.github.io/posts/variational-autoencoders/</link><dc:creator>Ngo Truong</dc:creator><description>&lt;div&gt;&lt;p&gt;This is my learning note from &lt;a class="reference external" href="https://www.youtube.com/watch?v=FMuvUZXMzKM"&gt;L4 Latent Variable Models&lt;/a&gt; by Pieter Abbeel. The idea of being able to generate (potentially new) images, songs, or any data type you want with generative models always amazes me. And I just want to share my thoughts on it (mostly Latent Variable Models, LVMs).&lt;/p&gt;
&lt;p&gt;&lt;a href="https://nzxhuong.github.io/posts/variational-autoencoders/"&gt;Read more…&lt;/a&gt; (11 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>generative-models</category><category>mathjax</category><category>variational-autoencoders</category><guid>https://nzxhuong.github.io/posts/variational-autoencoders/</guid><pubDate>Sat, 12 Apr 2025 18:35:36 GMT</pubDate></item><item><title>Prioritized Experience Replay and Importance Sampling</title><link>https://nzxhuong.github.io/posts/prioritized-experience-replay-and-importance-sampling/</link><dc:creator>Ngo Truong</dc:creator><description>&lt;div&gt;&lt;p&gt;While learning about Deep Q-Learning (DQN), I kept stumbling on one thought: with experience replay, we sample transitions from the buffer uniformly. But the way we append &lt;em&gt;every&lt;/em&gt; transition means it's likely that most of the transitions are 'bad' experiences, especially early on. Like in the cliff walking problem, early in training we're just running around randomly and falling off the cliff a lot. This leads to the buffer being mostly filled with bad outcomes. Even when we &lt;em&gt;do&lt;/em&gt; finally reach the goal and get a good transition, it's drowned out by the huge amount of bad stuff we already stored. This means most of the time we're sampling 'bad' experiences during training updates, and that doesn't feel very efficient.&lt;/p&gt;
&lt;p&gt;So I started thinking… is there a way to sample more of the 'good' stuff? That's when I found Prioritized Experience Replay (PER), proposed by DeepMind.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://nzxhuong.github.io/posts/prioritized-experience-replay-and-importance-sampling/"&gt;Read more…&lt;/a&gt; (5 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>importance sampling</category><category>mathjax</category><category>prioritized experience replay</category><category>reinforcement learning</category><guid>https://nzxhuong.github.io/posts/prioritized-experience-replay-and-importance-sampling/</guid><pubDate>Wed, 02 Apr 2025 13:31:56 GMT</pubDate></item><item><title>Understanding Shannon Information and Entropy</title><link>https://nzxhuong.github.io/posts/understanding-shannon-information-and-entropy/</link><dc:creator>Ngo Truong</dc:creator><description>&lt;div&gt;&lt;p&gt;Many materials on this topic start with Claude Shannon’s concept of information. So let’s start with that. &lt;br&gt;
Information, in Shannon's theory, is defined in the context of transferring a message from a source (transmitter) to a receiver over a channel. Imagine tossing a coin. In this scenario, the coin toss outcome acts as the source (transmitter), and you, observing the outcome, are the receiver.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://nzxhuong.github.io/posts/understanding-shannon-information-and-entropy/"&gt;Read more…&lt;/a&gt; (1 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>entropy</category><category>information theory</category><category>mathjax</category><category>probability</category><guid>https://nzxhuong.github.io/posts/understanding-shannon-information-and-entropy/</guid><pubDate>Tue, 01 Apr 2025 17:11:30 GMT</pubDate></item></channel></rss>