Using TaskPipes to Parse a Reddit RSS Feed

Ever since Yahoo Pipes shut down recently, users have been looking for an alternative way to parse and edit RSS feeds.

Thankfully, we’ve built TaskPipes to pull in data from any location, including RSS and Atom feeds.

An Example

For this example, let’s extract posts from the homepage of Reddit, keep only the ones we are interested in, clean them up, and send ourselves a daily summary of the results by email.

Interestingly, you can get an RSS feed for any Reddit page by simply appending “.rss” to the end of the URL. Therefore, the first thing we do in TaskPipes is define the input data by referencing an “External Link”:

Screen Shot 2015-10-07 at 13.58.00
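If you want to see the “.rss” trick outside of TaskPipes, here is a minimal Python sketch. The function name reddit_rss_url is made up for this post, and normalising the trailing slash before appending is our own choice rather than anything Reddit requires.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_rss_url(page_url: str) -> str:
    """Return the feed URL for a Reddit page by appending ".rss" to its path.

    Hypothetical helper: the trailing slash is normalised first so that
    "/r/python" and "/r/python/" both end up as "/r/python/.rss".
    """
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/") + "/.rss"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(reddit_rss_url("https://www.reddit.com/"))
print(reddit_rss_url("https://www.reddit.com/r/python"))
```

The resulting URL is what goes into the “External Link” input.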

This gives us data in the following format:

Screen Shot 2015-10-07 at 13.59.48
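To give a rough idea of what that table holds, the sketch below turns a feed into one row per post, with one column per field. The sample XML is invented for illustration (it is not real Reddit output), and we assume a plain RSS 2.0 layout with title, link and description fields.

```python
import xml.etree.ElementTree as ET

# Invented sample feed, assuming a plain RSS 2.0 layout.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>example feed</title>
    <item>
      <title>An example post</title>
      <link>https://www.reddit.com/r/example/comments/abc123/an_example_post/</link>
      <description>&lt;a href="https://example.com/article"&gt;[link]&lt;/a&gt;</description>
    </item>
  </channel>
</rss>"""


def rss_items_to_rows(xml_text):
    """Return one dict per <item>, keyed by the item's child tag names."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        rows.append({child.tag: (child.text or "") for child in item})
    return rows


for row in rss_items_to_rows(SAMPLE_RSS):
    print(row["title"], "|", row["description"])
```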

Now let’s say we want to pull out the subreddit, the title of the post and the link to the content. Firstly, let’s extract the link from the “description” column.

We do this using two “Extract Text” steps in TaskPipes. The first will pull out the text between the start of the description and the first occurrence of ">[link].

Screen Shot 2015-10-07 at 14.17.16

Now we have all the text from the start of the description to the end of the link. Next, we pull out the text between the first occurrence of the " character and the end of this text:

Screen Shot 2015-10-07 at 14.17.26
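The same two-step extraction can be sketched in Python. The helper names and the sample description are made up for this post. We also assume that the first marker is the literal characters ">[link] and that the link is the first quoted value in the description. If an earlier quoted attribute appeared, the second step would need to search from the last quote instead.

```python
def text_before_first(text: str, marker: str) -> str:
    """Text from the start up to the first occurrence of marker ("" if absent)."""
    head, found, _ = text.partition(marker)
    return head if found else ""


def text_after_first(text: str, marker: str) -> str:
    """Text from just after the first occurrence of marker to the end ("" if absent)."""
    _, found, tail = text.partition(marker)
    return tail if found else ""


def extract_link(description: str) -> str:
    # Step 1: keep everything before the marker that closes the link.
    up_to_link = text_before_first(description, '">[link]')
    # Step 2: keep everything after the first double quote.
    return text_after_first(up_to_link, '"')


# Invented description, shaped like the assumed feed format.
description = (
    '<a href="https://example.com/article">[link]</a> '
    '<a href="https://www.reddit.com/r/example/comments/abc123/">[comments]</a>'
)
print(extract_link(description))  # https://example.com/article
```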

We now just strip out the columns we don’t need using the “Remove Columns” method. Finally, let’s say that we don’t want to be sent links to images, so let’s apply a filter to remove entries that contain “” within them:

Screen Shot 2015-10-07 at 14.21.40
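As a rough equivalent of these two steps, the sketch below drops unwanted columns and then removes rows whose link contains a given piece of text. The helper names, the sample rows and the “.jpg” filter text are all assumptions for illustration; use whatever text identifies image links in your own data.

```python
def remove_columns(rows, columns):
    """Return copies of the rows without the named columns."""
    drop = set(columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


def exclude_containing(rows, column, needle):
    """Keep only rows whose value in `column` does not contain `needle`."""
    return [row for row in rows if needle not in row.get(column, "")]


# Invented sample rows.
rows = [
    {"title": "A picture", "link": "https://example.com/a.jpg", "description": "..."},
    {"title": "An article", "link": "https://example.com/article", "description": "..."},
]

cleaned = remove_columns(rows, ["description"])
cleaned = exclude_containing(cleaned, "link", ".jpg")  # ".jpg" is a placeholder filter
print(cleaned)
```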

The final step in the process is to send this data as an email. We set our email address, the subject and a message if required. The final process looks like this:

Screen Shot 2015-10-07 at 14.24.35
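For comparison, here is a small sketch of what the email step amounts to, using only Python’s standard library. It builds the message with the cleaned rows attached as a CSV but does not send it. The function names, addresses and attachment filename are placeholders, not anything TaskPipes exposes.

```python
import csv
import io
from email.message import EmailMessage


def rows_to_csv(rows):
    """Serialise a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_summary_email(rows, to_addr, subject, message=""):
    """Build (but do not send) an email with the rows attached as a CSV file."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(message or "Daily summary attached.")
    msg.add_attachment(
        rows_to_csv(rows).encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="reddit_summary.csv",  # placeholder filename
    )
    return msg


msg = build_summary_email(
    [{"title": "An article", "link": "https://example.com/article"}],
    to_addr="me@example.com",
    subject="Daily Reddit summary",
)
print(msg["Subject"])
```

Actually sending the message would need something like smtplib with your own mail server details, which we leave out here.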

We can run this on a schedule using the “Triggers” functionality:

Screen Shot 2015-10-07 at 14.31.22

We’ll be sent an email every day at 4pm with a CSV of the cleaned data.
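The daily 4pm schedule boils down to working out the next run time. A minimal sketch follows; next_daily_run is a hypothetical helper that works in naive local time, and it says nothing about how TaskPipes implements its triggers.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now, run_at=time(16, 0)):
    """Return the next datetime strictly after `now` that falls on `run_at`."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        # Today's slot has already passed, so move to tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run(datetime(2015, 10, 7, 14, 31)))  # 2015-10-07 16:00:00
print(next_daily_run(datetime(2015, 10, 7, 16, 30)))  # 2015-10-08 16:00:00
```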

To use this pipe, head over to the TaskPipes Examples Page and make a copy of the Reddit process.
