Extracting Data from the Body of an Email using TaskPipes

One of the most powerful uses of TaskPipes is pulling data straight from the body of regular emails.

I personally receive dozens of email updates, ranging from property alerts via Zoopla/Zillow to payment emails sent by PayPal.

But what if you want to automatically pull out the square footage of every apartment, or extract the payment information from every transaction, whenever a new email arrives?

Copying and pasting is not the solution.

An Example

In this specific example, we introduce a fictional supplier of international SMS messages who sends us frequent pricing updates by email.

The emails look like this:

TaskPipes Incoming Email

Most of the data we are interested in is within that ugly green table. I wish “Supplier Five” hadn’t let go of their in-house designer.

To receive emails in this format in TaskPipes, we create a new pipe with a “Read from Email Body” step. Defined within this step is a unique TaskPipes preview email address.

TaskPipes - Read from Email Body

If we forward our original message to that email address, TaskPipes will then display a preview of the incoming data. You will notice that TaskPipes cleverly pulls out tables and new lines of text like so:

TaskPipes Data
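
To give a feel for what pulling tables out of an email body involves, here is a minimal Python sketch using only the standard library. It is not how TaskPipes works internally (the post doesn't say); the class name, the helper function and the sample email body, including its countries and prices, are all invented for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the cell text of every <tr> found in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table_rows(html):
    """Return every table row in the HTML as a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical email body; the real supplier email is only shown as an image.
SAMPLE_BODY = """
<p>Latest SMS pricing:</p>
<table>
  <tr><th>Country</th><th>Price</th></tr>
  <tr><td>France</td><td>0.045</td></tr>
  <tr><td>Brazil</td><td>0.031</td></tr>
</table>
"""

rows = extract_table_rows(SAMPLE_BODY)
for row in rows:
    print(row)
```

Each row comes back as a plain list of strings, which is roughly the tabular shape the preview above displays.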

I can now define a few intermediate data manipulation steps, to get the data in the right format:

Screen Shot 2015-09-29 at 12.44.40

To run this process, I simply assign a custom email address to the pipe.

TaskPipes Email Trigger

Whenever an email gets sent to this address, the body of the email will be parsed and processed through each of these steps.
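
Conceptually, this is a chain of steps applied to the parsed body, one after another. The Python sketch below illustrates that idea with the standard library's email package. The raw message, the step functions and `run_pipe` are hypothetical names made up for this example and are not part of TaskPipes.

```python
from email import message_from_string
from email.policy import default

# Hypothetical raw message; addresses and prices are invented for illustration.
RAW_MESSAGE = (
    "From: supplier@example.com\n"
    "To: my-pipe@example.com\n"
    "Subject: Pricing update\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "France 0.045\n"
    "Brazil 0.031\n"
)


def body_lines(raw_message):
    """Return the non-empty lines of the message's text body."""
    message = message_from_string(raw_message, policy=default)
    body = message.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_columns(rows):
    """Step 1: split each line of text into whitespace-separated cells."""
    return [row.split() for row in rows]


def prices_to_float(rows):
    """Step 2: convert the second cell of each row to a number."""
    return [[country, float(price)] for country, price in rows]


def run_pipe(raw_message, steps):
    """Parse the body, then pass the rows through each step in order."""
    rows = body_lines(raw_message)
    for step in steps:
        rows = step(rows)
    return rows


result = run_pipe(RAW_MESSAGE, [split_columns, prices_to_float])
print(result)
```

Keeping each step as a small function that takes rows and returns rows makes it easy to reorder or add steps, which is the same idea as stacking manipulation steps in a pipe.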

Exporting

To send this to a final location, I can use one of the many output steps, including Google Spreadsheets, API webhooks or direct database writing.

TaskPipes Pipe

You can use the above process via the TaskPipes Examples page, where it’s called “Parse Data From Body of Email”.


Scraping Hacker News on a Schedule with TaskPipes

Background

TaskPipes is a tool to turn any data into a spreadsheet. But why?

Well, spreadsheets are flexible. They’re accessible to the full spectrum of users, from completely non-technical to Turing Award winners. Tabular data is just really easy to manipulate, modify and get into the format you need.

An Example

If you’d like to follow along with this, please go to taskpipes.com/examples and make a copy of the Hacker News Pipe.

Let’s use TaskPipes to scrape the front page of Hacker News every day, pull out any stories from GitHub, and email me the results.

Although you can pull in data from a range of sources with TaskPipes, we want to use the “External Link” option. Let’s set this to HN:

Screen Shot 2015-09-22 at 19.27.23

TaskPipes extracts any tables that are present in the HTML and, if you view the source on news.ycombinator.com, you’ll notice that there are three columns. We only want the third column, so let’s remove the first two.

Screen Shot 2015-09-22 at 19.27.45

Next, we want to extract the number of points of each submission. We use the “Extract Text” functionality to get the text between the start position and the first occurrence of the word “points”.

Screen Shot 2015-09-22 at 19.29.52
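
For readers who think in code, the same "text before a marker" idea can be sketched in a few lines of Python. The function names and the sample string are invented for this example; the real Hacker News markup may differ, and this is not TaskPipes' own implementation.

```python
def extract_before(text, marker):
    """Return the text from the start up to the first occurrence of marker.

    Returns None when the marker does not appear at all.
    """
    head, found, _ = text.partition(marker)
    return head.strip() if found else None


def extract_between(text, start, end):
    """Return the text between the first start marker and the next end marker."""
    _, found, rest = text.partition(start)
    if not found:
        return None
    return extract_before(rest, end)


# Hypothetical cell contents, loosely modelled on a Hacker News subtext line.
cell = "142 points by someuser 3 hours ago | 57 comments"

points = extract_before(cell, "points")
comments = extract_between(cell, "| ", "comments")
print(points, comments)
```

`str.partition` splits on the first occurrence only, which matches the "first occurrence of the word" behaviour described above.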

No more regex!

We do a similar thing to pull out the headline, domain and the number of comments, to end up with the data in this format:

Screen Shot 2015-09-22 at 17.38.46

Now, let’s apply a filter to extract only the stories from github.com, and our pipe is set up.
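
As a rough equivalent in Python, the filter step amounts to keeping only the rows whose domain matches. The rows below are invented sample data, and `from_domain` is a hypothetical helper rather than anything TaskPipes exposes.

```python
# Invented sample rows in the headline / domain / points / comments format.
stories = [
    {"headline": "Show HN: A tiny parser", "domain": "github.com", "points": 142, "comments": 57},
    {"headline": "Why spreadsheets win", "domain": "example.com", "points": 88, "comments": 12},
    {"headline": "A new CLI tool", "domain": "GitHub.com", "points": 35, "comments": 4},
]


def from_domain(rows, domain):
    """Keep only the rows whose domain equals the given one, ignoring case."""
    wanted = domain.lower()
    return [row for row in rows if row["domain"].lower() == wanted]


github_stories = from_domain(stories, "github.com")
for story in github_stories:
    print(story["points"], story["headline"])
```

The comparison is case-insensitive on the assumption that domains may arrive with inconsistent capitalisation; drop the `.lower()` calls if an exact match is preferred.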

We can set this process to run on a schedule, and we will be emailed a CSV file with the results.

Screen Shot 2015-09-22 at 17.40.57
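
The CSV part of that output is easy to picture in code. This Python sketch turns a list of rows into CSV text with the standard csv module; the column names and the sample row are assumptions made for illustration, and the scheduling and emailing that TaskPipes handles are deliberately left out.

```python
import csv
import io

# Assumed column order for the exported file.
FIELDNAMES = ["headline", "domain", "points"]


def rows_to_csv(rows):
    """Serialise a list of row dictionaries to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample row standing in for the filtered Hacker News results.
sample_rows = [
    {"headline": "Example repo, with a comma", "domain": "github.com", "points": 142},
]

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

Using the csv module rather than joining strings by hand means headlines containing commas or quotes are escaped correctly.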

Alternatively, you can send this data to an external API, a database, a Google spreadsheet, or elsewhere.

Wrapping Up

You can use the above example by visiting taskpipes.com/examples.

TaskPipes can pull data from almost anywhere, including web pages, the body of emails or even email attachments.

You can then clean and manipulate the data, and send it to a range of different destinations.

Sign up for a free TaskPipes account at taskpipes.com
