I'm downloading a web page (tag soup HTML) with XMLHttpRequest and I want to take the output and turn it into a DOM object that I can then run XPATH queries on. How do I convert from a string into DOM object?
It appears that the general solution is to create a hidden iframe and throw the contents of the string into that. There has been talk of updating DOMParser to support text/html but as of Firefox 3.0.1 you still get an NS_ERROR_NOT_IMPLEMENTED
if you try.
Is there any option besides using the hidden iframe trick? And if not, what is the best way to do the iframe trick so that your code works outside the context of any currently open tabs (so that closing tabs won't screw up the code, etc)?
This is an example of why I'm looking for a solution other than the iframe hack, if I have to write all that code to have a robust solution, then I'd rather keep looking for something else.
Ajaxian actually had a post on inserting / retrieving html from an iframe today. You can probably use the js snippet they have posted there.
As for handling closing of a browser / tab, you can attach to the onbeforeunload (http://msdn.microsoft.com/en-us/library/ms536907(VS.85).aspx) event and do whatever you need to do.
Try this:
Notice the overrideMimeType and responseXML.
The
readyState == 4
is 'completed'.Try creating a div
And then set the tag soup HTML to the innerHTML of the div. The browser should process that into XML, which then you can parse.
So you want to download a webpage as an XML object using javascript, but you don't want to use a webpage? Since you have no control over what the user will do (closing tabs or windows or whatnot) you would need to do this in like a OSX Dashboard widget or some separate application. A Firefox extension would also work, unless you have to worry about the user closing the browser.
Unfortunately, no, not now. Otherwise the microsummary code you point to would use it instead.
The code you quoted uses the recent browser window, so closing tabs won't affect parsing. Closing that browser window will abort your load, but you can deal with it (detect that the load is aborted and restart it in another window for example) and it doesn't happen very often.
You need a DOM window for the iframe to work properly, so there's no clean solution at the moment (if you're keen on using the mozilla parser).