Friday, September 19, 2008

More fun with Yahoo! Pipes

Summary: A few advanced techniques for building Yahoo! Pipes.

In the previous post, I explained how to use Yahoo! Pipes for monitoring hot deals. In this post, I'll cover several techniques which can help you get the best out of your pipes.

Before I get to the good parts, let me share a couple of problems I discovered. Understanding these issues can help you avoid some frustration.

Problem #1: Changes appear not to be saved.
If you create a new pipe (or change an existing one), save it, and then try to run it (or open it for editing), you may get an error saying that the pipe was deleted. If you re-open the My Pipes page, the new pipes may be missing, and if you open a modified pipe in the editor, you may not see the changes you just made. Don't panic yet: your new pipes (or modifications) are there. Just refresh the page until the error message goes away (you will need to refresh it a few times). I'm not sure what causes the problem, but to me it looks like a synchronization issue on the Yahoo! side (maybe your changes are saved in one database, but when you switch pages, you hit a different server to which your changes have not yet been replicated). After a couple of hours the problem goes away and you will no longer see the errors. I reported this problem to Yahoo! (and others did too), so hopefully it will be fixed (no word on this yet).

Problem #2: RSS reader is slow to report new posts.
I do not know if it affects all RSS readers, but I saw this problem in both Bloglines and Google Reader: matching posts returned by your pipes may appear in the RSS reader several hours after they were posted in the original feed. This is not a big deal for regular posts (like your favorite blog subscriptions), but if you want to hear about a product returned by your pipe sooner, you may want to use the pipes' built-in notification mechanisms instead of relying on the RSS reader.

Update: A few weeks ago, Bloglines stopped retrieving RSS posts from almost all of my Yahoo! Pipes. Due to multiple issues with Bloglines (also reported by others), I no longer recommend using it (I switched to Google Reader).

And now, the good stuff.

Hint #1: Reuse your pipes.
Say you want to build several pipes with something in common. Instead of duplicating the common functionality in every pipe, you can build a special pipe containing the shared logic, and then pick this pipe like any other module from the module library (you will find your pipe under the My pipes heading in the library panel of the editor). For example, I created the following pipe defining the input sources and transformation rules for all of my shopping-related pipes:

My other pipes use this pipe as input, and apply additional filters to specify pipe-specific conditions. Now, if I find a better way to define input sources and transformation rules, I just need to change one pipe and all pipes that use it will pick up the modifications automatically.
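
For readers who think in code, the same idea looks like one shared function that several callers build on. The Python sketch below is only an analogy and says nothing about how Pipes works internally; the sample items and the names `shared_deals_source`, `baby_deals`, and `discount_deals` are hypothetical, made up for illustration.

```python
# Analogy only: Yahoo! Pipes is a visual editor, so none of these names exist in Pipes.
# The sample items stand in for whatever the real input feeds would return.
SAMPLE_ITEMS = [
    {"title": "Diamond   ring 50% off"},
    {"title": "Baby stroller clearance"},
    {"title": "Free ringtones this week"},
]


def shared_deals_source():
    """Play the role of the shared pipe: one place that defines inputs and cleanup."""
    # Hypothetical transformation rule: collapse repeated whitespace in every title.
    return [{"title": " ".join(item["title"].split())} for item in SAMPLE_ITEMS]


def baby_deals():
    """A pipe-specific filter layered on top of the shared source."""
    return [i for i in shared_deals_source() if "baby" in i["title"].lower()]


def discount_deals():
    """Another consumer of the same source with its own condition."""
    return [i for i in shared_deals_source() if "% off" in i["title"]]


print([i["title"] for i in baby_deals()])
print([i["title"] for i in discount_deals()])
```

Changing the cleanup rule inside `shared_deals_source` changes what both filters see, which is the benefit described above.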

Hint #2: Use regular expressions for more precise matches.
Say you are shopping for a ring, so you define a filter with the following rule:
item.title contains ring
What you may not realize is that this pipe will also return posts whose titles contain words such as spring, boring, ringtones, and so on. To make sure that your filter matches only the whole word ring, use the Matches regex condition instead of Contains.
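
The difference is easy to try outside Pipes. The following is a minimal sketch using Python's `re` module rather than Pipes itself; the sample titles are invented, and it is an assumption that the Pipes regex engine treats `\b` (word boundary) and `(?i)` the same way Python does.

```python
import re

# Invented sample titles; only the first one is really about a ring.
titles = [
    "Diamond ring sale",
    "Spring cleaning deals",
    "Boring but cheap socks",
    "Free ringtones",
]

# Rough equivalent of the Contains condition: a plain substring test.
contains_hits = [t for t in titles if "ring" in t.lower()]

# Whole-word test: \b is a word boundary, (?i) makes the match case-insensitive.
whole_word = re.compile(r"(?i)\bring\b")
regex_hits = [t for t in titles if whole_word.search(t)]

print(contains_hits)  # all four titles
print(regex_hits)     # only "Diamond ring sale"
```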

Tips:
For non-programmers: Regex stands for regular expression, a sophisticated mechanism for finding (and replacing) more complex matches.
For programmers: In case you wonder, Yahoo! Pipes uses a Perl-like regular expression syntax with certain caveats. For example, to look for case-insensitive matches, you need to prefix the search string with (?i) (I expected /text/i to work).
For all: Here are some great resources for learning, building, and testing regular expressions:
And here is some information specific to Yahoo! Pipes:
To search for an exact word match, you need to use the following pattern:
As you may have noticed, the \W tokens surrounding the search string mark the word boundaries (each one matches a single non-word character, such as a space or a punctuation mark), while (?i) makes the search case-insensitive. Here is an example of a search filter one may use for baby-related products (notice that it uses another pipe as input):

Once you get a grip on regular expressions (if you are so inclined), you will be able to build more efficient filters, but in the meantime, the information I provided should get you one step past the novice level. If you stumble upon a problem you cannot solve, check the Message Boards for Pipes (this is where I found out how to implement case-insensitive search).
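
One caveat worth testing if you surround a word with \W tokens: \W has to consume an actual character, so a title that starts or ends with the word may be skipped. The sketch below checks this with Python's `re` module (an assumption; Pipes may behave differently), using invented titles and a hypothetical alternative pattern that also accepts the start or end of the title as a delimiter.

```python
import re

# Invented titles covering the word at the start, at the end, and in the middle.
titles = ["Ring sizer tool", "Gold ring", "Gold ring, size 7", "Spring sale"]

# \W must match one real non-word character on each side of the word.
surrounded = re.compile(r"(?i)\Wring\W")

# Alternative: treat the start (^) or end ($) of the title as a delimiter too.
anchored = re.compile(r"(?i)(?:^|\W)ring(?:\W|$)")

results = {
    title: (bool(surrounded.search(title)), bool(anchored.search(title)))
    for title in titles
}

for title, (with_w, with_anchors) in results.items():
    print(f"{title!r}: surrounded={with_w}, anchored={with_anchors}")
```

In this sketch only "Gold ring, size 7" matches the first pattern, while the second also catches "Ring sizer tool" and "Gold ring"; neither matches "Spring sale".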

Additional references:
Yahoo! Pipes Documentation
Yahoo! Pipes Tutorials
