December 5, 2014

What if a Zombie outbreak happened in Spain?

I found an interesting article on Max Berggren’s blog, where he implements an SIR model, a Susceptible - Infected - Removed model that simulates the spread of diseases. Max did it for Norway, but I wanted to see what would happen if a Zombie outbreak occurred in my home country, Spain. I wanted to check how much of an impact the location of patient zero would have on the spread of the disease. Read more
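To give a feel for what an SIR model does, here is a minimal sketch of the non-spatial, discrete-time version. It is not the model from Max's article or from the full post (those work on a population map); the function name `simulate_sir` and the `beta` and `gamma` values are assumptions picked only for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR sketch: returns a list of (S, I, R) tuples, one per day.

    beta  -- assumed daily infection rate (hypothetical value, not from the post)
    gamma -- assumed daily removal rate (hypothetical value, not from the post)
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections are proportional to contacts between S and I,
        # capped so that S never goes negative.
        new_infections = min(beta * s * i / population, s)
        # A fixed fraction of the infected is removed each day.
        new_removals = min(gamma * i, i)
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Made-up numbers, just to show the shape of the outbreak.
    for day, (s, i, r) in enumerate(simulate_sir(1000, 1, 0.5, 0.1, 60)):
        if day % 10 == 0:
            print(day, round(s), round(i), round(r))
```

The total S + I + R stays constant, which is a quick sanity check for any variant of the model.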

December 1, 2014

Building a Recommendation Engine for Reddit. Part 4

In this final part of my series about building a recommendation engine for Reddit, I will explain how to use the similarity engine in a web application. We left Part 3 with a fully functional similarity engine that, given a set of subreddits for a Redditor, returns the top N subreddits most similar to that initial set.

Step 4. Building the web application

To build the web application, we need to decide how to implement it. Read more
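As a rough illustration of the "top N most similar subreddits" idea, here is a small sketch. It assumes the similarity table is available as a dictionary keyed by subreddit pairs, and it ranks candidates by summing their similarity to the subreddits the Redditor already follows. Both the `recommend` function and the summing rule are assumptions of mine, not necessarily how the engine from Part 3 works.

```python
from collections import defaultdict


def recommend(user_subs, similarity, n=5):
    """Return up to n subreddits most similar to the ones in user_subs.

    similarity -- dict mapping (sub1, sub2) pairs to a score; each pair is
                  assumed to be stored once (hypothetical layout).
    """
    seen = set(user_subs)
    scores = defaultdict(float)
    for (sub1, sub2), score in similarity.items():
        # Only score subreddits the Redditor is not already part of.
        if sub1 in seen and sub2 not in seen:
            scores[sub2] += score
        elif sub2 in seen and sub1 not in seen:
            scores[sub1] += score
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [sub for sub, _ in ranked[:n]]


if __name__ == "__main__":
    # Made-up scores, just for the example.
    table = {
        ("funny", "aww"): 0.4,
        ("funny", "IAmA"): 0.3,
        ("aww", "pics"): 0.5,
    }
    print(recommend(["funny"], table, n=2))
```

A web application would only need to call something like this with the subreddits entered by the visitor and render the returned list.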

November 17, 2014

Building a Recommendation Engine for Reddit. Part 3

We left Part 2 with a dataset of Redditors and the Subreddits they comment on. We also defined which similarity index we are going to use to measure the similarity between Subreddits. So, let’s continue!

Step 3. Calculating Subreddit similarity

Let’s recall what we want the Subreddit similarity table to look like:

sub1 | sub2 | similarity
funny | aww | similarity(funny-aww)
funny | Iama | similarity(funny-Iama)

Read more

November 13, 2014

What I learned today

Life

Today I learned that hackers can find out which sites you have visited by using CSS selectors.
Today I learned that I belong to a select group with a genetic mutation that makes us hate cilantro (I HATE IT!).

Work

Today I learned about the Flask command-line interface, a new feature of Flask that allows running view functions directly from the command line without having a context. Read more

November 12, 2014

Building a Recommendation Engine for Reddit. Part 2

Step 2. Building the dataset

In Part 1 of this tutorial we laid out the project in detail and decided that, in order to calculate the similarity between two subs, we just need to find the list of users on each one. However, this approach becomes a computational problem when you consider the magnitude of Reddit. Reddit has more than 300,000 subreddits and more than 100 million users (6% of the US population), so we need to narrow down the data a little. Read more
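To make the "list of users on each sub" idea concrete, here is a minimal sketch that compares two subs by the overlap of their user lists. It uses the Jaccard index (shared users divided by all users of either sub), which is an assumption on my part; this excerpt does not say which similarity index the series settles on, and the user names are made up.

```python
def jaccard(users_a, users_b):
    """Jaccard index between two collections of user names (assumed metric)."""
    a, b = set(users_a), set(users_b)
    union = a | b
    if not union:
        # Two empty subs have nothing in common; avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical commenters on each sub.
    funny = ["ann", "bob", "cat"]
    aww = ["bob", "cat", "dan"]
    print(jaccard(funny, aww))
```

Doing this for every pair of subs is what gets expensive at Reddit's scale, hence the need to narrow down the data first.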
