UI-Less Parser
This Delphi sample demonstrates the use of MSHTML as a
UI-less HTML parser. It is based on the Microsoft sample "Walkall".
To successfully walk the HTML Scripting Object Model that
the parser exposes after loading the specified document, the host
application must wait until MSHTML has finished loading the document. To
track MSHTML's READYSTATE, the host implements a simple COM object
that exposes IPropertyNotifySink. The host application connects to MSHTML
using the standard connection point interfaces. As MSHTML loads the
document, its ready state changes. To notify the host of these changes,
it calls IPropertyNotifySink.OnChanged, passing along the DISPID of the
property that has changed (DISPID_READYSTATE). The host uses MSHTML's
automation interface obtained at creation time to retrieve the current
value of this property. When the value equals READYSTATE_COMPLETE, MSHTML
has finished loading the document.
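
The ready-state tracking described above can be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the unit name, TReadyStateSink, its Complete property, and AdviseSink are hypothetical names. The two constants carry their standard OLE values and are declared locally so the unit does not depend on which Delphi unit happens to define them. Interface signatures follow Delphi's ActiveX unit and may differ slightly between Delphi versions.

```pascal
unit ReadyStateSink;

interface

uses
  Windows, ActiveX;

const
  // Standard OLE values, declared locally so the sketch is self-contained.
  DISPID_READYSTATE   = -525;
  READYSTATE_COMPLETE = 4;

type
  // Hypothetical sink class; only OnChanged does real work.
  TReadyStateSink = class(TInterfacedObject, IPropertyNotifySink)
  private
    FDoc: IDispatch;     // MSHTML's automation interface, obtained at creation
    FComplete: Boolean;  // set once READYSTATE_COMPLETE has been seen
  public
    constructor Create(const ADoc: IDispatch);
    function OnChanged(dispid: TDispID): HResult; stdcall;
    function OnRequestEdit(dispid: TDispID): HResult; stdcall;
    property Complete: Boolean read FComplete;
  end;

// Connects Sink to Doc's IPropertyNotifySink connection point.
// Cookie identifies the connection and is needed later for Unadvise.
function AdviseSink(const Doc: IUnknown; const Sink: IPropertyNotifySink;
  out Cookie: Longint): HResult;

implementation

constructor TReadyStateSink.Create(const ADoc: IDispatch);
begin
  inherited Create;
  FDoc := ADoc;
end;

function TReadyStateSink.OnChanged(dispid: TDispID): HResult;
var
  NoArgs: TDispParams;
  Value: OleVariant;
begin
  Result := S_OK;
  if dispid <> DISPID_READYSTATE then
    Exit;
  // Property get with no arguments: ask MSHTML for the current ready state.
  FillChar(NoArgs, SizeOf(NoArgs), 0);
  if Succeeded(FDoc.Invoke(DISPID_READYSTATE, GUID_NULL, LOCALE_SYSTEM_DEFAULT,
       DISPATCH_PROPERTYGET, NoArgs, @Value, nil, nil)) and
     (TVarData(Value).VType = varInteger) and
     (TVarData(Value).VInteger = READYSTATE_COMPLETE) then
    FComplete := True;  // document loaded; the object model can now be walked
end;

function TReadyStateSink.OnRequestEdit(dispid: TDispID): HResult;
begin
  Result := S_OK;  // this sketch never vetoes a change
end;

function AdviseSink(const Doc: IUnknown; const Sink: IPropertyNotifySink;
  out Cookie: Longint): HResult;
var
  Container: IConnectionPointContainer;
  Point: IConnectionPoint;
begin
  Cookie := 0;
  Result := Doc.QueryInterface(IConnectionPointContainer, Container);
  if Succeeded(Result) then
    Result := Container.FindConnectionPoint(IPropertyNotifySink, Point);
  if Succeeded(Result) then
    Result := Point.Advise(Sink, Cookie);
end;

end.
```

The host should hold an interface reference to the sink for as long as the connection exists, and call IConnectionPoint.Unadvise with the cookie before releasing MSHTML. Until then the sink and the document hold references to each other. A real host would probably signal completion through an event or callback instead of the polled Complete flag used here.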
If the loaded document is an HTML page that contains scripts, Java
applets, and/or ActiveX controls, and those scripts are coded for immediate
execution, MSHTML will execute them by default. To disable this behavior,
the host must implement
two additional interfaces: IOleClientSite and IDispatch. At initialization
time, after MSHTML is instantiated, the host should query MSHTML for its
IOleObject interface, pass MSHTML a reference to its IOleClientSite
interface, query MSHTML for its IOleControl
interface, and call OnAmbientPropertyChange(DISPID_AMBIENT_DLCONTROL).
MSHTML will in turn query the host's IOleClientSite interface for
IDispatch and then request the value of this property from the host. This
is the host's opportunity to control MSHTML, disabling the execution of
scripts, etc.
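
A rough sketch of such a host site is shown below. It is an illustration under stated assumptions, not the sample's actual code: the unit name, THostSite, and AttachSite are hypothetical names. The constants carry their Windows SDK values and are declared locally. The particular combination of DLCTL flags is only one plausible choice. Method signatures follow Delphi's ActiveX unit and may differ slightly between Delphi versions.

```pascal
unit NoScriptSite;

interface

uses
  Windows, ActiveX, ComObj;

const
  // Windows SDK values (mshtmdid.h, mshtmhst.h), declared locally.
  DISPID_AMBIENT_DLCONTROL = -5512;
  DLCTL_NO_SCRIPTS         = $00000080;
  DLCTL_NO_JAVA            = $00000100;
  DLCTL_NO_RUNACTIVEXCTLS  = $00000200;
  DLCTL_NO_DLACTIVEXCTLS   = $00000400;

type
  // Hypothetical host site; only Invoke does real work.
  THostSite = class(TInterfacedObject, IOleClientSite, IDispatch)
  public
    // IOleClientSite
    function SaveObject: HResult; stdcall;
    function GetMoniker(dwAssign, dwWhichMoniker: Longint;
      out mk: IMoniker): HResult; stdcall;
    function GetContainer(out container: IOleContainer): HResult; stdcall;
    function ShowObject: HResult; stdcall;
    function OnShowWindow(fShow: BOOL): HResult; stdcall;
    function RequestNewObjectLayout: HResult; stdcall;
    // IDispatch
    function GetTypeInfoCount(out Count: Integer): HResult; stdcall;
    function GetTypeInfo(Index, LocaleID: Integer; out TypeInfo): HResult; stdcall;
    function GetIDsOfNames(const IID: TGUID; Names: Pointer;
      NameCount, LocaleID: Integer; DispIDs: Pointer): HResult; stdcall;
    function Invoke(DispID: Integer; const IID: TGUID; LocaleID: Integer;
      Flags: Word; var Params; VarResult, ExcepInfo, ArgErr: Pointer): HResult; stdcall;
  end;

// Performs the initialization sequence: set the client site, then
// announce that the download-control ambient property has changed.
procedure AttachSite(const Doc: IUnknown; const Site: IOleClientSite);

implementation

function THostSite.SaveObject: HResult;
begin Result := E_NOTIMPL; end;

function THostSite.GetMoniker(dwAssign, dwWhichMoniker: Longint; out mk: IMoniker): HResult;
begin mk := nil; Result := E_NOTIMPL; end;

function THostSite.GetContainer(out container: IOleContainer): HResult;
begin container := nil; Result := E_NOINTERFACE; end;

function THostSite.ShowObject: HResult;
begin Result := S_OK; end;

function THostSite.OnShowWindow(fShow: BOOL): HResult;
begin Result := S_OK; end;

function THostSite.RequestNewObjectLayout: HResult;
begin Result := E_NOTIMPL; end;

function THostSite.GetTypeInfoCount(out Count: Integer): HResult;
begin Count := 0; Result := S_OK; end;

function THostSite.GetTypeInfo(Index, LocaleID: Integer; out TypeInfo): HResult;
begin Pointer(TypeInfo) := nil; Result := E_NOTIMPL; end;

function THostSite.GetIDsOfNames(const IID: TGUID; Names: Pointer;
  NameCount, LocaleID: Integer; DispIDs: Pointer): HResult;
begin Result := E_NOTIMPL; end;

function THostSite.Invoke(DispID: Integer; const IID: TGUID; LocaleID: Integer;
  Flags: Word; var Params; VarResult, ExcepInfo, ArgErr: Pointer): HResult;
begin
  Result := DISP_E_MEMBERNOTFOUND;
  if (DispID = DISPID_AMBIENT_DLCONTROL) and (VarResult <> nil) then
  begin
    // Answer MSHTML's ambient-property request with a VT_I4 bit mask.
    PVariantArg(VarResult)^.vt := VT_I4;
    PVariantArg(VarResult)^.lVal := DLCTL_NO_SCRIPTS or DLCTL_NO_JAVA or
      DLCTL_NO_RUNACTIVEXCTLS or DLCTL_NO_DLACTIVEXCTLS;
    Result := S_OK;
  end;
end;

procedure AttachSite(const Doc: IUnknown; const Site: IOleClientSite);
var
  OleObj: IOleObject;
  OleCtl: IOleControl;
begin
  OleCheck(Doc.QueryInterface(IOleObject, OleObj));
  OleCheck(OleObj.SetClientSite(Site));
  OleCheck(Doc.QueryInterface(IOleControl, OleCtl));
  // Prompts MSHTML to query the site for IDispatch and read the property.
  OleCheck(OleCtl.OnAmbientPropertyChange(DISPID_AMBIENT_DLCONTROL));
end;

end.
```

AttachSite should be called after MSHTML is instantiated and before the document is loaded, so that the restrictions are in place when parsing begins. The host should keep an interface reference to the site for as long as MSHTML uses it.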
See also IEPARSER HTML-parser Component