curl-library

Re: URL Parsing libraries

From: Joe Halpin <j.p.h_at_comcast.net>
Date: Tue, 27 Jan 2004 17:34:51 -0600

codemastr wrote:
> Does anyone out there know of any good URL parsing libraries? I'm currently
> aware of two such libraries, w3c's libwww, and senga's uri lib. Libwww is
> clearly bloated (it should probably be at least 3-4 smaller libraries) and
> so I'd rather not include it with my program for the small portion that I
> actually need. Senga looked really promising. It's small, and it does just
> what I need. Unfortunately, it relies on libintl (which it doesn't even
> bother to search for) and not all systems have that. Not to mention, the
> project seems dead, I tried contacting the author regarding adding ftps
> support but he never responded.
>
> What I really need is a library that runs under both *nix and Win32 and can:
> 1.) Split a URL into its components
> 2.) Convert a URL to canonical form from "user" form (e.g. www.blah.com/some
> dir/file.txt -> http://www.blah.com/some%20dir/file.txt, etc.)

RFC 2396 gives a regular expression (in Appendix B) that is claimed to
parse URIs correctly. Given a compliant URI, this regex should let you
extract, add, or substitute components at will.
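
As a rough illustration of the Appendix B approach, here is a minimal
sketch in Python. The regular expression is the one printed in RFC 2396;
the function name split_uri and the dictionary keys are only names picked
for this example. The sample URI is the one the RFC itself uses.

```python
import re

# Regular expression from RFC 2396, Appendix B. The RFC maps the
# numbered groups to components: 2 = scheme, 4 = authority, 5 = path,
# 7 = query, 9 = fragment.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')

def split_uri(uri):
    """Split a URI reference into its five components.

    A component that does not appear in the input is returned as None.
    Every part of the pattern is optional, so the match always succeeds.
    """
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
print(parts)
```

Note that the regex only splits the string; it does not check that each
piece is valid, so it will happily "parse" input that is not a legal URI.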

I haven't worked with it very much, but from what little I have done it
looks correct. The RFC also gives a BNF grammar for URIs in Appendix A.
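
For the second item in the question (going from "user" form to canonical
form), something along these lines might be a starting point. This is only
a sketch under some assumptions of mine: input without "scheme://" is
taken to be a host followed by a path, the default scheme is http, and
only the path and query are percent-encoded. The name canonicalize and the
SAFE character set are made up for this example; the set is meant to
approximate the characters RFC 2396 allows unescaped in a path.

```python
import re
from urllib.parse import quote

# Regular expression from RFC 2396, Appendix B.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')

# Characters left unescaped (assumption: path delimiters, "%" so that
# existing escapes are not encoded twice, and the RFC 2396 pchar extras).
SAFE = "/%:@&=+$,;!*'()"

def canonicalize(user_url, default_scheme="http"):
    """Turn a loosely typed URL into an escaped, fully qualified one."""
    # Assumption: anything lacking "scheme://" is "host/path".
    if not re.match(r'^[A-Za-z][A-Za-z0-9+.-]*://', user_url):
        user_url = default_scheme + "://" + user_url

    m = URI_RE.match(user_url)
    scheme, authority, path, query, fragment = m.group(2, 4, 5, 7, 9)

    # Rebuild the URL, escaping characters such as spaces in the path.
    result = scheme.lower() + "://" + authority + quote(path, safe=SAFE)
    if query is not None:
        result += "?" + quote(query, safe=SAFE + "?")
    if fragment is not None:
        result += "#" + fragment
    return result

print(canonicalize("www.blah.com/some dir/file.txt"))
```

With the example from the question, this prints
http://www.blah.com/some%20dir/file.txt. It does not handle things like
"mailto:" style URIs or non-ASCII host names, which a real library would
have to deal with.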

Joe

Received on 2004-01-28