Ever since Yahoo Pipes shut down recently, users have been looking for an alternative way to parse and edit RSS feeds.
Thankfully, we’ve built TaskPipes with the objective of being able to pull in data from any location, including RSS and Atom feeds.
For this example, let’s extract posts from the homepage of Reddit, filter them down to the ones we are interested in, clean them up, and send ourselves a daily summary of the results by email.
Interestingly, you can get an RSS feed for any Reddit page by simply appending “.rss” to the end of the URL. Therefore, the first thing we do in TaskPipes is define the input data by referencing an “External Link”:
This gives us data in the following format:
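For readers who want to see the same idea outside TaskPipes, here is a minimal Python sketch of both steps: building the “.rss” URL and turning feed items into rows. The helper names “to_rss_url” and “parse_items” are our own, and the sample feed is hand-written for illustration, not real Reddit output, so treat the column values as assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit


def to_rss_url(page_url):
    """Append ".rss" to the path of a page URL, keeping any query string."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if not path.endswith(".rss"):
        path += ".rss"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hand-written sample feed (an assumption, NOT real Reddit output).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>sample feed</title>
    <item>
      <title>An example post</title>
      <link>https://www.reddit.com/r/example/comments/1/</link>
      <description>&lt;a href="https://example.com/story"&gt;[link]&lt;/a&gt;</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Turn each <item> into a dict with one key per child element (one row per post)."""
    root = ET.fromstring(feed_xml)
    return [
        {child.tag: (child.text or "") for child in item}
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    print(to_rss_url("https://www.reddit.com/"))
    for row in parse_items(SAMPLE_FEED):
        print(row)
```

Each row ends up with one column per feed element, including the “description” column we work with next.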
Now let’s say we want to pull out the subreddit, the title of the post and the link to the content. Firstly, let’s extract the link from the “description” column.
We do this using two “Extract Text” steps in TaskPipes. The first will pull out the text between the start of the description and the first occurrence of “>[link]”.
Now we have all the text between the start and the end of the link. We now pull out the text between the first occurrence of the double-quote character (") and the end of this text:
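The two extraction steps can be sketched in plain Python as follows. This is only an approximation of what “Extract Text” does, not TaskPipes code: the helper names are hypothetical, the sample description is simplified and hand-written, and we assume the first delimiter includes the quote that closes the href attribute, so that the first step ends exactly at the end of the URL.

```python
def text_before(text, marker):
    """Return the text from the start up to the first occurrence of marker."""
    head, found, _ = text.partition(marker)
    if not found:
        raise ValueError(f"marker {marker!r} not found")
    return head


def text_after(text, marker):
    """Return the text from just after the first occurrence of marker to the end."""
    _, found, tail = text.partition(marker)
    if not found:
        raise ValueError(f"marker {marker!r} not found")
    return tail


def extract_link(description):
    """Apply the two steps in order to isolate the content URL."""
    # Step 1: everything before the end of the link's opening tag (assumed delimiter).
    up_to_link = text_before(description, '">[link]')
    # Step 2: everything after the first double-quote character.
    return text_after(up_to_link, '"')


# Simplified, hand-written description (an assumption, not real Reddit output).
SAMPLE_DESCRIPTION = (
    '<a href="https://example.com/story">[link]</a> '
    '<a href="https://www.reddit.com/r/example/comments/1/">[comments]</a>'
)

if __name__ == "__main__":
    print(extract_link(SAMPLE_DESCRIPTION))
```

If a description carried other quoted attributes ahead of the link, taking the text after the last double-quote instead (for example with “str.rpartition”) would be the safer variant of step 2.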
We now just strip out the columns we don’t need using the “Remove Columns” method. Finally, let’s say that we don’t want to be sent links to images, so let’s apply a filter to remove entries that contain “imgur.com” within them:
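As a rough equivalent in Python, the column removal and the filter might look like the sketch below. The column names “subreddit”, “title” and “link” follow the fields we said we wanted; the function names and the sample rows are made up for illustration.

```python
KEEP_COLUMNS = ("subreddit", "title", "link")


def remove_columns(rows, keep=KEEP_COLUMNS):
    """Keep only the wanted columns in each row, dropping everything else."""
    return [{key: row[key] for key in keep if key in row} for row in rows]


def drop_matching(rows, needle="imgur.com"):
    """Remove rows in which any value contains the given text."""
    return [
        row for row in rows
        if not any(needle in str(value) for value in row.values())
    ]


# Hypothetical rows, as they might look after the link has been extracted.
SAMPLE_ROWS = [
    {"subreddit": "news", "title": "A story", "link": "https://example.com/story",
     "description": "..."},
    {"subreddit": "pics", "title": "A picture", "link": "https://i.imgur.com/abc.jpg",
     "description": "..."},
]

if __name__ == "__main__":
    cleaned = drop_matching(remove_columns(SAMPLE_ROWS))
    print(cleaned)
```

Running the column removal first means the filter only looks at the fields we actually keep.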
The final step in the process is to send this data as an email. We set our email address, the subject and a message if required. The final process looks like this:
We can run this on a schedule, by using the “Triggers” functionality:
We’ll be sent an email every day at 4pm with a CSV of the cleaned data.
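To make the last two steps concrete, here is a hedged Python sketch that builds an email with the cleaned rows attached as a CSV and works out when the next 4pm run would be. It uses only the standard library and stops short of actually sending anything; the function names, the attachment filename and the default message are our own assumptions, and this is not how TaskPipes implements its email step or its triggers.

```python
import csv
import io
from datetime import datetime, time, timedelta
from email.message import EmailMessage

COLUMNS = ("subreddit", "title", "link")


def rows_to_csv(rows, columns=COLUMNS):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_summary_email(rows, to_address, subject, message=""):
    """Build (but do not send) an email with the rows attached as a CSV file."""
    email = EmailMessage()
    email["To"] = to_address
    email["Subject"] = subject
    email.set_content(message or "Daily summary attached.")
    email.add_attachment(
        rows_to_csv(rows).encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="reddit_summary.csv",  # hypothetical filename
    )
    return email


def next_run(now, run_at=time(16, 0)):
    """Return the next daily run time strictly after now (4pm by default)."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    rows = [{"subreddit": "news", "title": "A story", "link": "https://example.com/story"}]
    email = build_summary_email(rows, "me@example.com", "Daily Reddit summary")
    print(email["Subject"], "| next run:", next_run(datetime.now()))
```

Sending the message would additionally need a mail server, for instance via “smtplib”, which is deliberately left out here.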
To use this pipe, head over to the TaskPipes Examples Page and make a copy of the Reddit process.