Currently browsing xml
Sun, 24 Feb 2008 18:10 UTC
As you may have already noticed, I’ve started focusing a lot of my attention on XML, specifically with regard to the Atom Publishing Protocol. Now, please don’t confuse this attention with the giddy excitement of a newbie encountering XML for the first time and thinking of it as the greatest thing since sliced bread. On the contrary, that moment happened for me back in 2001 when I bought the first editions of Learning XML and XML in a Nutshell. Rather, I’ve simply come full circle to a renewed appreciation for the syntax. In these days when people are singing the praises of JSON and other data interchange formats, I still think XML is the best format for exchanging data between disparate systems.
This year marks the tenth anniversary—or birthday, if you will—of the Extensible Markup Language. Version 1.0 of the W3C recommendation was published on February 10, 1998. In a press release covering this occasion, Tim Bray said:
There is essentially no computer in the world, desk-top, hand-held, or back-room, that doesn’t process XML sometimes. This is a good thing, because it shows that information can be packaged and transmitted and used in a way that’s independent of the kinds of computer and software that are involved. XML won’t be the last neutral information-wrapping system; but as the first, it’s done very well.
According to the press release, there will be “a variety of activities and events” planned throughout the year “to recognize and thank the dedicated communities and individuals responsible for XML for their contributions.” Since this is also the tenth anniversary of open source and the tenth anniversary of OSCON, I wonder if there will be any XML “birthday” festivities at OSCON this year. It already sounds like it’s going to be one big 10th anniversary party; I can’t wait to see who shows up for the fun.
Tags: birthday, oscon, oscon08, w3c, xml
Tue, 31 Oct 2006 23:12 UTC
I’ve just finished giving my presentation on XML & Web Services with PHP (An Overview). Overall, I think the presentation went quite well, though I had entirely too much material to cover in a very short period of time, so it was impossible to go into much depth on any one type of Web Service. This was unfortunate, but I think the “overview” nature of the presentation allowed for this top-level approach.
As promised, here are the slides from the presentation (PDF, 2 MB), and what follows is the list of links for further reading:
Further Reading
XML-RPC
Tags: conferences, php, talks, webservices, xml, zend, zendconference2006
Fri, 19 May 2006 1:52 UTC
I had some time to kill and a silly problem to solve, which means here’s some more SimpleXML fun for you:
The Problem: It’s not really a “problem,” but FeedBurner’s FeedCount™ image is a rigid 88 pixels wide, and I wanted to include it on my homepage under the “syndicate” heading, an area that I’ve defined in my template as having only 80 pixels in width. The 88 pixels were throwing things off, so I used the width attribute of the HTML img tag to solve the problem. Unfortunately, it just squeezes the image, making the text in it appear fuzzy.
The Solution: FeedBurner conveniently provides what they call their “Awareness API,” which is a RESTful interface to retrieve (as XML data) the same exact information displayed in the FeedCount™ image. Since I wanted to maintain the same kind of image (because it’s a recognized look-and-feel for FeedBurner feeds), I simply fired up an image editing program, shuffled things around a bit until the image was a nice, clean 80 pixels wide, and saved it as the base image (shown to the right) I would use for generating an image similar to the one FeedBurner provides.
Then, I wrote a script to grab the FeedBurner data (using SimpleXML) and used PHP’s image functions to open the base image, write the FeedBurner circulation data to it, and save it to use on my site. I simply have a cron job that runs this script once a day to keep my image updated.
Here’s the code. Enjoy!
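<?php
// Fetch the feed statistics from the Awareness API; the third argument
// (TRUE) tells SimpleXMLElement that the first argument is a URL
$sxe = new SimpleXMLElement('http://api.feedburner.com/awareness/1.0/GetFeedData?uri=ramsey', NULL, TRUE);
// Read the circulation count, falling back to '0' if it is missing or malformed
$readers = (string) $sxe->feed->entry['circulation'];
$readers = is_numeric($readers) ? $readers : '0';
// Shift the starting x coordinate left by 6 pixels per character
$xpos = 39 - (strlen($readers) * 6);
// Draw the count onto the base image and save the result
$img = imagecreatefromgif('feedburner-base.gif');
$color = imagecolorallocate($img, 0x00, 0x66, 0xCC);
imagestring($img, 2, $xpos, 2, $readers, $color);
imagegif($img, 'feedburner-readers.gif');
?>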
Note that I determined the numbers used in lining up the text in the image by trial and error. I played around with it a bit to get things right. The $xpos variable is my attempt to right-align the text because you can only specify the left edge of the text with the x coordinate.
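The right-alignment arithmetic can be pulled out into a small function, which makes it easier to see what the magic numbers stand for. This is only a sketch: right_aligned_x() is a hypothetical helper, not part of the script above, and the 39 (right edge) and 6 (pixels per character) are simply the values that script uses.

```php
<?php
// Hypothetical helper: the x coordinate at which $text has to start so that
// it ends at $rightEdge, assuming a fixed-width font of $charWidth pixels.
function right_aligned_x(int $rightEdge, string $text, int $charWidth): int
{
    return $rightEdge - strlen($text) * $charWidth;
}

// Mirrors the calculation in the script above. With GD loaded, the width
// could be looked up with imagefontwidth(2) instead of hard-coding 6.
$xpos = right_aligned_x(39, '1234', 6);
```

Longer counts start further to the left, so the last digit always lands at the same spot in the image.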
Tags: feedburner, images, php, simplexml, xml
Thu, 11 May 2006 3:05 UTC
I was very excited today while glancing through the code in ext/simplexml/simplexml.c to find some as-yet-undocumented methods in PHP’s SimpleXMLElement class. This discovery came after I’d spent several hours over the last couple of nights banging my head against the desk to figure out a way to create a class that extends SimpleXMLElement and adds a new method for adding a child, which would have to use DOM in order to work (or so I thought).
During this process, I discovered two major limitations of SimpleXML:
- You cannot override the SimpleXMLElement constructor; it is a “final” method in the internals
- You can extend SimpleXMLElement and add new methods, but looping through child nodes creates new SimpleXMLElement objects that do not use your class and, thus, cannot use your methods
Nevertheless, I can get over these limitations after discovering these new SimpleXMLElement methods in PHP 5.1.4. In particular, the last three are what interest me most (these are not yet documented in the PHP manual):
bool SimpleXMLElement->registerXPathNamespace(string prefix, string uri)
array SimpleXMLElement->getNamespaces([bool recursive])
array SimpleXMLElement->getDocNamespaces([bool recursive])
string SimpleXMLElement->getName()
object SimpleXMLElement->addChild(string name [, string value [, string ns]])
void SimpleXMLElement->addAttribute(string name, string value [,string ns])
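To give a feel for the inspection methods in that list, here is a minimal sketch that calls getName(), getDocNamespaces(), and getNamespaces() on a small document. The Atom-like XML is made up for the example, and the call to children() with a namespace URI is an extra that is not in the list above.

```php
<?php
// Illustrative document; the content is invented for this sketch.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Example</title>
  <dc:creator>Jane Doe</dc:creator>
</feed>
XML;

$feed = new SimpleXMLElement($xml);

$rootName = $feed->getName();            // name of the root element
$declared = $feed->getDocNamespaces();   // namespaces declared on the root
$used     = $feed->getNamespaces(true);  // namespaces used by the root and its descendants

// Namespaced children are reached by passing the namespace URI to children().
$creator = (string) $feed->children('http://purl.org/dc/elements/1.1/')->creator;
```

Both namespace methods return arrays keyed by prefix, with the default namespace stored under an empty-string key.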
Previously, one of the biggest limitations of SimpleXML was that you could not add new child nodes to an XML document. Adding attributes to an element was no problem, but in order to add a child element, you had to go a roundabout way through importing the SimpleXML object to DOM with dom_import_simplexml(), adding the child element, and importing the DOM object back to SimpleXML with simplexml_import_dom(). This was messy and anything but simple.
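For comparison, the DOM round trip might have looked roughly like the following. It is a sketch of the technique rather than code from any real project, and the one-book document is just a placeholder.

```php
<?php
$sxe = new SimpleXMLElement('<books><book title="Fahrenheit 451" author="Ray Bradbury"/></books>');

// Get a DOM view of the same element.
$dom = dom_import_simplexml($sxe);

// Build the new child with DOM calls and append it.
$book = $dom->ownerDocument->createElement('book');
$book->setAttribute('title', '1984');
$book->setAttribute('author', 'George Orwell');
$dom->appendChild($book);

// Import the result back into SimpleXML.
$sxe = simplexml_import_dom($dom);
echo $sxe->asXML();
```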
Now, it’s clean and simple, as SimpleXML should be:
<?php
$xml = <<<XML
<books>
<book title="Fahrenheit 451" author="Ray Bradbury"/>
<book title="Stranger in a Strange Land" author="Robert Heinlein"/>
</books>
XML;
$sxe = new SimpleXMLElement($xml);
$new_book = $sxe->addChild('book');
$new_book->addAttribute('title', '1984');
$new_book->addAttribute('author', 'George Orwell');
echo $sxe->asXML();
?>
UPDATE: I’ve added all of these (except for registerXPathNamespace(), which I can’t seem to figure out) to the documentation for SimpleXML in the PHP Manual. The documentation for each new method will be available the next time the scripts run to update the manual.
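For what it’s worth, registerXPathNamespace() seems to exist mainly for documents that use a default namespace, which XPath cannot address without a prefix. The sketch below assumes a tiny Atom-like document invented for the example; the prefix "a" is an arbitrary choice and only has to match what the query uses.

```php
<?php
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>
XML;

$feed = new SimpleXMLElement($xml);

// XPath has no notion of a default namespace, so bind a prefix of our own
// choosing to the Atom namespace URI before querying.
$feed->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');

$titles = [];
foreach ($feed->xpath('//a:entry/a:title') as $title) {
    $titles[] = (string) $title;
}
```

Without the registration, a query such as //entry/title matches nothing here, because the elements live in the Atom namespace.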
Tags: php, simplexml, xml
Mon, 8 May 2006 14:35 UTC
It seems I’ve been focused on OPML for the past few posts, and why not? OPML appears to be gaining a lot of attention lately. There’s even an OPML Camp in Boston later this month.
So, what’s all the fuss about? Adam Green summarizes his vision for the future of OPML on the OPML Camp blog:
I see OPML as an extremely flexible container for many types of XML and HTML data. Its simplicity and extensibility means that it can be used to bring together many types of structured content that are currently being developed on the Web, such as microcontent, attention data, identity data, and data that will eventually comprise the Semantic Web. OPML won’t replace these many efforts, it will help integrate them, by providing a common packaging mechanism.
Adam Green, OPML Camp blog
As of right now, many sites are using RSS as a common content delivery format. While RSS works for syndication, though, I think that it lacks the extensibility and context to fill the role of this “flexible container.” OPML won’t replace RSS, but it will aid in its delivery. So, I’m a little more than casually interested in seeing the application and future direction of OPML.
Along these lines, Dave Winer launched his new Share Your OPML service today, the purpose of which is to “gather a community of subscription lists, in OPML format, and aggregate them in interesting ways.” The practical value of this service is unclear to me (right now), since I was unable to discover how to actually generate OPML from the site. If the site were to provide OPML feeds for each data display, then its value would be readily apparent. Right now, I can only find an OPML feed for the Top 100 shared feeds.
Nevertheless, in the spirit of signing up for every new service out there, you can view my shared feeds here.
Tags: opml, rss, semantic-web, web-2.0, xml
Sat, 6 May 2006 15:40 UTC
UPDATE: Please note that del.icio.us has updated their API in the time since I wrote this. I have updated the URLs in my examples accordingly. The authentication is much safer since they are now using SSL.
The other day, I mentioned that I generate my blogroll from an OPML file. Until recently, this OPML file was generated manually by me whenever I updated my blogroll in NewsFire.
Now, there are some problems with NewsFire, one of which is that you cannot generate OPML for a specific group or selection of blogs (NetNewsWire excels in this area, however). Instead, the OPML file generated is a dump of all your feeds. This was not ideal for me, since I then had to go through and clean up the OPML before uploading it to my site. Needless to say, I updated my OPML as little as possible.
After my last blog post, I posted a comment to Chris’s blog saying that he should do something similar. That is, I suggested he generate his blogroll from an OPML file instead of del.icio.us, which has the unfortunate limitation of only 30 items per RSS feed. He replied, saying that he would then have to update two places (his feed reader and del.icio.us) instead of only one. In short, it would create more work. After assessing my own practices, I could not agree more.
So, I investigated a way to get around the 30-feed limitation in del.icio.us, and I found that their API allows you to do just this, albeit with a few restrictions of its own, which I’ll explain in a few moments. Using the del.icio.us API, instead of their RSS feeds, I was able to use the following code to first check whether my del.icio.us account has been updated since I last cached their data, and, if so, to grab all of the links for a particular tag and cache the data to a file for later use:
<?php
$username = 'your_username';
$password = 'your_password';
$cache_file = '/tmp/blogroll.xml';
// Ask del.icio.us when the account was last updated
$update = simplexml_load_file("https://{$username}:{$password}@api.del.icio.us/v1/posts/update");
// Refresh the cache if it does not exist yet or is older than the last update
if (!file_exists($cache_file) || strtotime((string) $update['time']) > filemtime($cache_file))
{
    $data = file_get_contents("https://{$username}:{$password}@api.del.icio.us/v1/posts/all?tag=blogroll");
    file_put_contents($cache_file, $data);
}
// Build the blogroll links from the cached posts
$blogroll = simplexml_load_file($cache_file);
foreach ($blogroll->post as $blog)
{
    echo '<a href="' . htmlentities((string) $blog['href']) . '">';
    echo htmlentities((string) $blog['description']);
    echo "</a><br />\n";
}
?>
This code sample uses SimpleXML to traverse the nodes of the XML documents retrieved from del.icio.us. It first polls del.icio.us with a request to see the update timestamp. Checking this before executing a query to see all posts saves bandwidth for del.icio.us and can help us determine whether our cached file needs updating. This is also the recommended practice (from del.icio.us).
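The freshness check is easier to try out offline when it is separated from the HTTP calls. In the sketch below, cache_is_stale() is a hypothetical helper, and the response string is an assumption about what /posts/update returns (a single update element with a time attribute, which is what the code above reads).

```php
<?php
// Hypothetical helper: decide whether the cache is stale, given the raw XML
// of an update response and the cache file's modification time (null when
// the file does not exist yet).
function cache_is_stale(string $updateXml, ?int $cacheMtime): bool
{
    if ($cacheMtime === null) {
        return true; // nothing cached yet
    }
    $update = new SimpleXMLElement($updateXml);
    $lastUpdate = strtotime((string) $update['time']);
    return $lastUpdate !== false && $lastUpdate > $cacheMtime;
}

// Assumed response shape, with a made-up timestamp.
$response = '<update time="2006-05-06T15:40:00Z"/>';
$stale = cache_is_stale($response, strtotime('2006-05-01T00:00:00Z'));
```

In a real script, the second argument would come from filemtime() on the cache file, guarded by a file_exists() check.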
This approach can help you grab all of your links (or all links for a specific tag) from del.icio.us. I’ve noted several issues, however:
- You can specify a tag to the /posts/all command with ?tag=tagname, but you cannot specify a combination of tags to retrieve
- You cannot specify a tag to the /posts/update command, so the update timestamp received is for your last overall update to del.icio.us; when specifying a tag to /posts/all, it would be beneficial to be able to specify a tag to this command as well, saving del.icio.us even more bandwidth
- This method requires HTTP authentication (you’ll notice how I passed a username and password through the URL to retrieve the data), so you can retrieve only your own links; using RSS, you may retrieve anyone’s links, but you are limited to their 30 most recent links
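On the authentication point, one alternative to embedding the credentials in the URL is to send them in an Authorization header through a stream context. This is a sketch of that approach, not what my script does; basic_auth_context() is a hypothetical helper, and the request itself is left commented out so that nothing here touches the network.

```php
<?php
// Hypothetical helper: build a stream context that sends HTTP Basic
// credentials in an Authorization header rather than in the URL itself.
function basic_auth_context(string $username, string $password)
{
    return stream_context_create([
        'http' => [
            'header' => 'Authorization: Basic ' . base64_encode("{$username}:{$password}"),
        ],
    ]);
}

$context = basic_auth_context('your_username', 'your_password');

// The request would then look something like this (not executed here):
// $data = file_get_contents('https://api.del.icio.us/v1/posts/all?tag=blogroll', false, $context);
```

This keeps the password out of any place where the full URL might be logged or displayed.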
Finally, I have implemented this approach using my blogroll tag on del.icio.us. My OPML file is now generated nightly (if the script detects updates to del.icio.us). This will end up saving me quite a bit of time, since I can use the del.icio.us plugin for Firefox to add any site I visit to my blogroll, and, then, I can import the generated OPML into my feed reader at my leisure, with little to no headache.
You can see my script, which is executed by a nightly cron job, here. You may notice that it’s using a getRSSLocation() function to auto-discover the RSS feed from each site for inclusion in the OPML file. I grabbed this function from Keith Devens’s site, and you can see my implementation of it here.
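The heart of such a script, turning cached del.icio.us posts into OPML, can be sketched with SimpleXML’s addChild() and addAttribute(). This is a simplified illustration and not the linked script: the sample post is made up, the post shape (href and description attributes) is taken from the loop above, and feed auto-discovery is left out.

```php
<?php
// Assumed shape of the cached data: <post> elements carrying "href" and
// "description" attributes. The single post here is a placeholder.
$posts = new SimpleXMLElement(
    '<posts><post href="http://example.org/" description="Example Blog"/></posts>'
);

$opml = new SimpleXMLElement('<opml version="1.1"><head/><body/></opml>');
$opml->head->addChild('title', 'Blogroll');

foreach ($posts->post as $post) {
    $outline = $opml->body->addChild('outline');
    $outline->addAttribute('text', (string) $post['description']);
    $outline->addAttribute('htmlUrl', (string) $post['href']);
    // A usable reading list would also need an xmlUrl attribute pointing at
    // the feed itself, which is what feed auto-discovery would supply.
}

echo $opml->asXML();
```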
Tags: delicious, netnewswire, newsfire, opml, rss, xml
Thu, 4 May 2006 14:53 UTC
The other day, I came across Scott Johnson’s PHP OPML Reading List. Offering an OPML reading list for others to download is a great idea, and, since I’ve not yet blogged about it, I wanted to point out that I’ve been doing this for a long while now. On my home page, under the “syndicate” heading, is a link to my OPML blogroll. Feel free to import my OPML into your feed reader; that’s what it’s there for. (Please also note that I use SimpleXML to generate the blogroll on my home page from this list.)
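Generating a blogroll from an OPML file with SimpleXML takes only a few lines. The following is a rough sketch and not the code that runs on my home page; the inline OPML is invented for the example, and a real script would more likely load the file with simplexml_load_file().

```php
<?php
// Made-up OPML for illustration.
$xml = <<<XML
<opml version="1.1">
  <head><title>Blogroll</title></head>
  <body>
    <outline text="Example &amp; Co." htmlUrl="http://example.org/" xmlUrl="http://example.org/feed"/>
  </body>
</opml>
XML;

$opml = new SimpleXMLElement($xml);

// Turn each outline into a link, escaping the values for HTML output.
$html = '';
foreach ($opml->body->outline as $outline) {
    $html .= '<a href="' . htmlspecialchars((string) $outline['htmlUrl']) . '">'
           . htmlspecialchars((string) $outline['text'])
           . "</a><br />\n";
}
echo $html;
```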
There’s a lot of overlap between Scott’s and my list; however, I think both lists provide a good overview of some of the great PHP bloggers out there—and I know there are many more than are on these lists, as pointed out by Chris.
In other news, the PHP development team released PHP 5.1.3 earlier this week, and, shortly thereafter, a user reported a critical bug with data in the $_POST array when submitting from a multi-part form. As of right now, there is no official release update posted on the PHP downloads page, but Chris provides a link to download the 5.1.4 source from a mirror.
I’ve already compiled and updated from the new source, and I encourage you to do the same.
UPDATE (5/5/06 4:30 UTC): The PHP team has made an official announcement concerning the 5.1.4 release on their Web site and has updated the downloads page.
Tags: blogging, community, feed-reader, opml, php, xml