OLE / COM / DCOM / Automation / ActiveX - Delphi

Posted: 2005-02-18 00:29:07
http://delphi.about.com/od/comoleactivex/
 Automating Applications Without OLE or DDE
Julian V Moss

A common requirement of the MIS programmer is to write software that can control other applications. Delphi is provided with components for using DDE, while Delphi 2 has excellent support for OLE automation. But often, the programs you want to control support neither of these methods. This article shows you the secret for controlling any application.

In an ideal world we would create a custom software solution to meet each new user requirement. But in reality, that isn't practical. Time and budgetary constraints are two obvious reasons why we can't design every solution from scratch. But there are often other reasons why we need to harness existing software to get information or carry out functions required by our applications, instead of doing it directly.

As Delphi developers we are already accustomed to the idea of building solutions using ready made software components. Automation is a way of using other applications as components. Under Windows, DDE is one method that can be used by programs to allow them to be automated by other software, though few applications support it. OLE Automation is a far superior mechanism which has more recently been introduced. This, too, is not yet widely supported, though hopefully the situation will change as new development tools make it easier to create OLE servers. Delphi 2 leads the way in this area.

What do you do when you need to automate another application, but it doesn't support DDE or OLE? You must resort to the somewhat crude method of keystroke stuffing: running the program as if you were sitting at the keyboard typing text into it. Microsoft's Visual Basic has the SendKeys command to allow you to do this. But Delphi has nothing equivalent to SendKeys.

A First Stab At The Problem

Faced with the lack of this vital capability, the first place to turn is the Windows API reference. You probably know that all input into Windows applications arrives in the form of messages.
The Windows API routines for producing messages are SendMessage and PostMessage. These functions can be used to send messages like WM_KEYDOWN, WM_KEYUP and WM_CHAR.

Unfortunately, although this path looks promising, it is a blind alley. Although you can generate keystroke messages for other windows this way, the code needed to generate the correct parameters for the WM_KEYDOWN and WM_KEYUP messages, and to ensure that the messages arrive at the right time and are translated correctly by the receiving application, makes this by far the most complex method to implement. It is also the least reliable method, as there is no way to ensure that a message is not ignored, which will happen if it arrives before the receiving application is ready to deal with it. So this is definitely not a good way to proceed.

Another Stab

A better way to send keystrokes from one program to another is via the use of journal hooks. As the name implies, journal hooks are a low level interface that enables you to hook into the stream of Windows messages. Under Windows 3.1 you install a journal hook using the SetWindowsHook function, and remove it using UnhookWindowsHook. In the 32-bit Windows environment you must use the very similar functions SetWindowsHookEx and UnhookWindowsHookEx.

Journal hook procedures are useful because they allow you to get information about what is going on in the system that you could not obtain any other way. Debugging tools such as Delphi's WinSight use them, as do shell programs and other software that needs to know what is going on in the system as a whole.

Two types of hook procedure are useful from the point of view of automating other applications. A JournalRecordProc hook lets you monitor the stream of keyboard and mouse messages so that they can be recorded in some data structure. You can then install a JournalPlaybackProc hook to play back the sequence of messages at a later time.
This is essentially what the Windows 3.1 Recorder accessory does.

Using journal hooks has some disadvantages, though. Because they hook into the system at a low level, journal hooks have the potential to adversely affect the stability of Windows. A JournalPlaybackProc procedure disables normal keyboard and mouse input while it is installed, so an unexpected event causing the hook procedure to fail to terminate and remove the hook would result in a locked system.

Under Windows 3.1, which has a single message queue and where multitasking only occurs when one program yields to another, the most likely unexpected events are fatal ones that would probably cause a system crash anyway, so journal playback hooks present less of a risk. In the multitasking, multiple input queue 32-bit environment, however, it is less easy to predict what will happen while the messages are playing back. Also, anything that compromises system integrity is undesirable in the 32-bit environment, which is often chosen because of its stability. Microsoft has chosen not to provide a 32-bit version of Recorder for precisely this reason.

Third Time's A Charm!

Fortunately there is a third method of generating keystrokes, though it is the one that you are least likely to discover for yourself. Indeed, under Windows 3.1 it is completely undocumented. This method does not require the installation of alien code into Windows' messaging system, so it presents no risk to system integrity. It is also much simpler to use than trying to generate WM_KEYDOWN, WM_KEYUP and WM_CHAR messages within your application. This is the method I will describe here.

Activating The Right Window

No matter which method you use to send keystrokes to an application, one thing you have to ensure is that the program you want to send keystrokes to is the active one: the one in the foreground with the input focus. When typing at the keyboard, one window must be active to receive your input.
If none is, Windows just beeps. Exactly the same applies when sending keystrokes from another program. Using the method to be described, keystrokes enter the messaging system in just the same way as if they originated from the keyboard. Your Delphi program must ensure, as you would do when operating the program yourself, that its 'typing' goes to the right window.

Note that what you can't do is activate the application by sending mouse movements. Due to the nature of mouse activity it is difficult to know what to send without recording mouse movements using a journal hook. Playing mouse movements back is also prone to error, since application windows may not always appear in the same place as before: this was always the limitation of Windows Recorder. There were ways to overcome some of these limitations, which have been exploited successfully by some third party automation tools. However, the process is made much more difficult under 32-bit Windows, which provides each program with its own input message queue so that one program can't easily see another's messages. It's simplest to accept that using mouse input is impractical, and to control your target program using only keyboard input.

All of the Windows API functions for controlling other windows use a window handle, hWnd, to identify the target. You have a choice of methods for finding out what this is. If you launch the target application from within your Delphi program, perhaps using the ShellExec function from Delphi Developer June 1996, then you know that once the program has initialised itself, its window will be the active one. Your program could simply wait a sufficient time for the program to initialise, and then use the function GetActiveWindow to obtain the handle of the currently active window. Having got the handle, you can check it belongs to the window you want by getting the title bar text using GetWindowText or the window class name using GetClassName.

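As a rough illustration of that check, the sketch below reads the title of the window that currently has the foreground and compares it with the text you expect. It is written for 32-bit Delphi, where GetForegroundWindow stands in for GetActiveWindow when the window belongs to another program. The function name ForegroundTitleStartsWith and the Notepad title are illustrative assumptions, not part of the original code.

```pascal
program CheckForeground;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

{ Hypothetical helper: True if the foreground window's title bar text
  begins with Prefix. An empty Prefix never matches. }
function ForegroundTitleStartsWith(const Prefix: string): Boolean;
var
  Buf: array[0..255] of Char;
  H: HWND;
begin
  Result := False;
  H := GetForegroundWindow;   { 0 if no window has the foreground }
  if H = 0 then
    Exit;
  { The third argument is the buffer size in characters, including the
    terminating null; a return value of 0 means no title was read }
  if GetWindowText(H, Buf, High(Buf) + 1) = 0 then
    Exit;
  Result := Pos(Prefix, StrPas(Buf)) = 1;
end;

begin
  { 'Untitled - Notepad' assumes an English-language Notepad }
  if ForegroundTitleStartsWith('Untitled - Notepad') then
    WriteLn('Notepad is in the foreground')
  else
    WriteLn('Some other window is in the foreground');
end.
```

Comparing only the start of the title is a deliberate choice here: many programs append a document name or other changing text to the title bar, so an exact match would be brittle.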
Note that there are a few differences between 16-bit API functions and their 32-bit namesakes. In the 32-bit API the scope of GetActiveWindow is confined to the windows that are running on the calling process's thread. In effect, it will return the handle of the active window within your Delphi application. If some other program is currently active, the function will return a null value, since none of your Delphi forms will be active. So when using Delphi 2 you will want to use the new API function GetForegroundWindow instead.

If you don't know whether the program you want is active or not, you can use FindWindow to find it. FindWindow takes two PChar parameters: the first is the window class name (such as 'TMyForm') and the second is the full window title bar text. Generally you pass FindWindow one parameter or the other, leaving the unused parameter as a null pointer (PChar(0)). If a matching window is found, the function returns its handle; if not it returns zero.

Once you've got the handle of the window you want to automate, the next thing you should do before sending keystrokes to it is make sure it's active. The window might be iconized, which you can test using the IsIconic function. If it is, you should restore it by calling the ShowWindow procedure, supplying as parameters the window handle and the value SW_RESTORE. This has exactly the same effect as when you restore a window from an icon using the mouse, and leaves the window active and ready to receive keyboard input.

If the window is not iconic then for 16-bit programs you use the BringWindowToTop function to make sure it is active. Once again, in the 32-bit environment BringWindowToTop affects only your own application's windows (or more precisely, those on the same thread) and so you should use SetForegroundWindow instead.

WinCtl Makes It All So Easy...

I have refrained from giving actual examples of using these functions in your code, because a better solution is to create a Delphi unit, which I have called WinCtl, containing the functions you need to control an application. You can then call the functions in this unit from your program instead of calling the API directly. This has three benefits.

1. The unit takes care of converting between strings and the zero terminated PChars that all API functions expect, so you can use Delphi strings and nothing but Delphi strings.
2. You need only call one function where using the raw API you might need several lines of code.
3. The unit can hide API differences such as those I have already mentioned. When the time comes to recompile your program with Delphi 2, a slightly modified WinCtl unit will be all that you require.

The routines in WinCtl are explained below. In the code that follows, the 16-bit version is shown. A 32-bit version is also provided on your companion disk. The disk also has a short test project which opens a copy of Notepad, sends some text to it and then closes the program after quitting out of the "Text has changed, do you want to save" dialog. Thanks to WinCtl hiding the API differences only one version of the test program is needed: a compiler conditional is used to ensure the correct version of the unit is used.

WinCtl's Non-Automation Methods

WinCtl includes a couple of functions and procedures that are not solely of use for the purpose of automating other applications, but which are certainly useful to have when doing so. For example, when executing a command or option of the automated program, you will often need to wait for a given interval to allow the program to carry out some processing, display another window or whatever. The Delay procedure allows you to do this.
You specify the length of the delay in milliseconds, and your program will wait for that length of time, regardless of the speed of the processor it is running on.

    { Delay program execution by n milliseconds }
    procedure Delay(n: Integer);
    var
      start: LongInt;
    begin
      start := GetTickCount;
      repeat
        { keep the application responsive while waiting }
        Application.ProcessMessages;
      until (GetTickCount - start) >= n;
    end;

In the June issue of Delphi Developer, I wrote an article called Launching Applications From Delphi. ShellExec is the function I demonstrated in that article which launches applications. The version here omits the parameter that lets you specify whether to wait for the executed program to finish, since in this case you will want to be returned to your program as soon as the program has been launched, in order to automate it.

ShellExec takes five parameters. The op parameter is the shell operation to run, which to launch a program is 'open'. The parameters fn, par and dir are the executable file path, the command line parameters (if any) and the initial starting directory (if not the default), in that order. The show parameter is one of the SW_ constants defined in unit WinTypes. SW_NORMAL would usually be the best choice. The function returns the value False if it fails.

    { Execute a command, such as to run a program }
    function ShellExec(const op, fn, par, dir: String; show: Word): Boolean;

To launch an instance of Notepad, you would simply use:

    ShellExec('open', 'notepad.exe', '', '', SW_NORMAL);

The Wrappers

The next group of functions are simple wrappers round API functions that make them easier to use. The purpose of each one should be obvious from its name. They let you find the handle of a window if you know either that it is currently active (because you just launched the program, for example) or you know its title bar text or window class name. You can use WinSight to find out a window's class name.

    { Functions to obtain handle of desired window }

    function GetHandleFromWindowTitle(const titletext: string): hWnd;
    var
      strbuf: Array[0..255] of Char;
    begin
      { null class name: match on the title bar text only }
      result := FindWindow(PChar(0), StrPCopy(strbuf, titletext));
    end;

    function GetHandleFromWindowClass(const classname: string): hWnd;
    var
      strbuf: Array[0..255] of Char;
    begin
      { null title: match on the window class name only }
      result := FindWindow(StrPCopy(strbuf, classname), PChar(0));
    end;

    function GetHandleOfActiveWindow: hWnd;
    begin
      result := GetActiveWindow;
    end;

The next two functions are complementary to the previous ones. They let you obtain the title bar text or class name of a window if you know its handle. If you want to make sure that the currently active window is what you think it is, one or other of these functions will help you.

    { Functions to obtain identification of a window }

    function GetWindowTitleText(whandle: hWnd): string;
    var
      strbuf: Array[0..255] of Char;
    begin
      GetWindowText(whandle, strbuf, 255);
      result := StrPas(strbuf);
    end;

    function GetWindowClassName(whandle: hWnd): string;
    var
      strbuf: Array[0..255] of Char;
    begin
      GetClassName(whandle, strbuf, 255);
      result := StrPas(strbuf);
    end;

MakeWindowActive

The procedure MakeWindowActive is very important. As I have already said, keystrokes from whatever source are acted upon by whichever window, and whichever control within the window, currently has the input focus. All sorts of things (user interference and a background process suddenly displaying a message, to list but two) could conspire to change the active window when you aren't expecting it, so it's a good idea to call MakeWindowActive before sending any keystrokes from your program, to make sure they go to the right place.

    { Procedure to make a specific window the active one }
    procedure MakeWindowActive(whandle: hWnd);
    begin
      if IsIconic(whandle) then
        ShowWindow(whandle, SW_RESTORE)
      else
        BringWindowToTop(whandle);
    end;

SendKeys

Now I come to the SendKeys procedure itself. This couldn't be simpler to use. Its one parameter is a string containing the keys that you want to be sent. If you want to send ASCII text, you just enter the ASCII text in the string.

    { Send key strokes to active window }
    procedure SendKeys(const text: String);

The trouble is, not all the keys that you might want to send are printable ASCII characters. Some of the keys which you will need to use when automating an application have no ASCII equivalents at all. Visual Basic's SendKeys command uses a metalanguage in which special symbols and keynames within curly brackets are used to specify non-ASCII key codes. In this implementation, I have adopted the simpler expedient of defining a set of constants corresponding to the most commonly needed keys, and mapping these into the byte range 228 .. 255 (see Listing 1). This makes processing the key string simpler, but means that there is no way to send any of the character codes in this range. Few people are likely to be inconvenienced by this; however if you are, it would not be too difficult to write a more sophisticated string parser that used an escape character in front of special key codes, allowing all 255 character codes to be sent if you wished.

Listing 1. Special key constants for use with SendKeys

    const
      SK_BKSP = #8;
      SK_TAB = #9;
      SK_ENTER = #13;
      SK_ESC = #27;
      SK_F1 = #228;
      SK_F2 = #229;
      SK_F3 = #230;
      SK_F4 = #231;
      SK_F5 = #232;
      SK_F6 = #233;
      SK_F7 = #234;
      SK_F8 = #235;
      SK_F9 = #236;
      SK_F10 = #237;
      SK_F11 = #238;
      SK_F12 = #239;
      SK_HOME = #240;
      SK_END = #241;
      SK_UP = #242;
      SK_DOWN = #243;
      SK_LEFT = #244;
      SK_RIGHT = #245;
      SK_PGUP = #246;
      SK_PGDN = #247;
      SK_INS = #248;
      SK_DEL = #249;
      SK_SHIFT_DN = #250;
      SK_SHIFT_UP = #251;
      SK_CTRL_DN = #252;
      SK_CTRL_UP = #253;
      SK_ALT_DN = #254;
      SK_ALT_UP = #255;

Although SendKeys is easy to use, it was not quite as easy to implement. You can see the code in Listing 2.

Listing 2: The SendKeys procedure

    procedure SendKeys(const text: String);
    var
      i: Integer;
      shift: Boolean;
      vk, scancode: Word;
      ch: Char;
      c, s: Byte;
    const
      vk_keys: Array[0..9] of Byte =
        (VK_HOME, VK_END, VK_UP, VK_DOWN, VK_LEFT, VK_RIGHT, VK_PRIOR,
         VK_NEXT, VK_INSERT, VK_DELETE);
      vk_shft: Array[0..2] of Byte = (VK_SHIFT, VK_CONTROL, VK_MENU);
      flags: Array[false..true] of Integer = (KEYEVENTF_KEYUP, 0);
    begin
      shift := false;
      for i := 1 to Length(text) do
      begin
        ch := text[i];
        if ch >= #250 then
        begin
          { Shift, Ctrl or Alt: even codes are key down, odd codes key up }
          s := Ord(ch) - 250;
          shift := not Odd(s);
          c := vk_shft[s shr 1];
          scancode := MapVirtualKey(c, 0);
          Keybd_Event(c, scancode, flags[shift], 0);
        end
        else
        begin
          vk := 0;
          if ch >= #240 then
            c := vk_keys[Ord(ch) - 240]
          else if ch >= #228 then
            c := Ord(ch) - 116 {228 (F1) ==> $70 (vk_F1)}
          else if ch < #32 then
            c := Ord(ch)
          else
          begin
            vk := VkKeyScan(Word(ch));
            c := LoByte(vk);
          end;
          scancode := MapVirtualKey(c, 0);
          { Wrap the key in a Shift press unless a shift key is already held }
          if not shift and (Hi(vk) > 0) then
            Keybd_Event(VK_SHIFT, $2A, 0, 0); { $2A = scancode of VK_SHIFT }
          Keybd_Event(c, scancode, 0, 0);
          Keybd_Event(c, scancode, KEYEVENTF_KEYUP, 0);
          if not shift and (Hi(vk) > 0) then
            Keybd_Event(VK_SHIFT, $2A, KEYEVENTF_KEYUP, 0);
        end;
        Application.ProcessMessages;
      end;
    end;

As users we tend to think of a single keypress as being one event. However, SendKeys feeds information into the system at the same point as a keyboard driver. From the driver's point of view, a keypress is in fact two events, a key down and a key up. SendKeys generates both events for each keypress.

The keyboard driver knows nothing about ASCII codes. What it receives from the keyboard is a scan code: a number which corresponds to the physical position of the key on the keyboard. SendKeys derives the scan code from the ASCII code by first calling the API function VkKeyScan, which translates the ASCII byte code to a 'virtual key code': an internal Windows representation of the key. For the special SK_ key values and the lower 32 characters of the ASCII set this step is not necessary, and the virtual key code can be derived directly from the byte value. The virtual key code is then converted to a scan code using the MapVirtualKey function. Both the scan code and the virtual key code are required by the Keybd_Event procedure.

The scan code of a particular letter key is constant whether the character is lower or upper case, so the keyboard driver also needs to tell Windows whether the shift key is down. If an ASCII character is a shifted key the high byte of the word returned by the VkKeyScan function is 1: SendKeys uses this fact to generate additional key down and key up events for the shift key, surrounding the key events for the character key.

Sometimes, though, it is important for you to control when the shift key is pressed and released. For example, you might hold down Shift while moving the cursor arrows to select some text in an edit control. The same applies to the Alt and Ctrl keys which are usually used in combination with other keys. To allow for this, there are separate down and up SK_ values for all three shift keys.
To close an application you might use the following to send Alt+F4:

    SendKeys(SK_ALT_DN + SK_F4 + SK_ALT_UP);

It is important, when you use one of these shift key down values, to remember to use the corresponding key up value, otherwise Windows will think that the key is stuck down!

The keystrokes are entered into the message chain by calling the Keybd_Event function. This takes four parameters: the Windows virtual key code, the scan code, a flags doubleword which indicates whether the key is being pressed or released, and an extra information doubleword which we can set to zero. Because Keybd_Event does not exist in the 16-bit API, WinCtl16 includes a private Keybd_Event procedure which takes the same parameters as the public Win32 version, and passes the data to the undocumented Kbd_Event procedure. This keeps the SendKeys procedure itself free of any version dependent code.

Tips On Putting Automation To Work

Automating other applications is a process that entails a certain amount of trial and error. First, you have to get the sequence of keystrokes exactly right. Second, you must fine tune the timing and the window handle management to get your code to work reliably every time. Remember, when you operate a program from the keyboard you are receiving visual feedback that tells you whether the program has responded to your input and is ready for the next command. Your Delphi program has to do the same thing blind, using only the functions described here.

Use the Delay procedure liberally to give your target program time to respond. Check the identity of the active window frequently, and make sure you activate the target window before sending any keystrokes that might cause a disaster if received by the wrong program. Follow those three rules and you should find it easy, with the help of my WinCtl unit, to control other applications from Delphi.
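
To tie the pieces together, here is a minimal sketch of the kind of Notepad test that the companion disk is described as containing. It is not the author's test project. It assumes a unit named WinCtl that exports the routines and SK_ constants described in this article, assumes Notepad's main window class is 'Notepad', and assumes an English-language save prompt that accepts N for No. The names NotepadDemo and AutomateNotepad are made up for illustration, and the delay lengths are guesses that would need tuning.

```pascal
unit NotepadDemo; { hypothetical unit name }

interface

procedure AutomateNotepad;

implementation

uses
  {$IFDEF WIN32} Windows, {$ELSE} WinTypes, WinProcs, {$ENDIF}
  WinCtl; { assumed name of the article's unit }

procedure AutomateNotepad;
var
  h: HWND;
begin
  { Launch Notepad and give it time to initialise }
  if not ShellExec('open', 'notepad.exe', '', '', SW_NORMAL) then
    Exit;
  Delay(2000);

  { Find the window by class name rather than trusting the active window }
  h := GetHandleFromWindowClass('Notepad'); { assumed class name }
  if h = 0 then
    Exit;

  { Always activate the target before typing into it }
  MakeWindowActive(h);
  SendKeys('Hello from Delphi' + SK_ENTER);
  Delay(500);

  { Alt+F4 closes Notepad; the key up must follow the key down }
  MakeWindowActive(h);
  SendKeys(SK_ALT_DN + SK_F4 + SK_ALT_UP);

  { Wait for the save prompt, then answer No (assumes English UI) }
  Delay(500);
  SendKeys('n');
end;

end.
```

The sketch follows the three rules above: it delays after each step that makes the target do work, looks the window up again by class name, and calls MakeWindowActive before every group of keystrokes.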
Oops, I posted this in the wrong section. Posted by conundrum on 2005/02/18 00:33:19