About URL Sequences and Dynamic Content

Web pages that include client-side programming or dynamically generated content can present problems when constructing SiteScope URL Sequence monitors. Client-side programs might include Java applets, ActiveX controls, JavaScript, or VBScript. Web pages generated by server-side programming (Perl/CGI, ASP, CFM, SSI, and so forth) can also present a problem if link references or form attributes change frequently.

SiteScope does not interpret JavaScript, VBScript, Java applets, or ActiveX controls embedded in HTML files. This may not be a problem when the client-side program only produces visual effects on the page where it is embedded. Problems can arise when the client-side program code controls links to other URLs or modifies data submitted to a server-side program. Because SiteScope does not interpret client-side programs, actions or event handlers made available by scripts or applets are invisible to the URL Sequence wizard.

Some Web sites use dynamically generated link references on pages generated by server-side programming. Although these Web pages do not contain client-side programs, frequently changing link references or "cookie" data can make it difficult to set up and maintain a URL Sequence Monitor.

Dynamic Content Workarounds

There are several ways to make a SiteScope URL Sequence monitor perform actions controlled by client-side programs and other dynamic content. Several of these workarounds are presented below. They generally require knowledge of the principles of Web page construction, CGI programming, Perl-style regular expressions, and the programming used to support the Web site being monitored.
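To see why links controlled by client-side code are invisible to a tool that reads HTML without running scripts, consider the minimal sketch below. It is not SiteScope's parser; it uses Python's standard `html.parser` module as a stand-in, and the page content, the `go()` function, and the URLs are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: one ordinary link and one link driven by JavaScript.
page = """
<html><body>
<a href="/catalog.html">Catalog</a>
<a href="#" onclick="go()">Checkout</a>
<script type="text/javascript">
function go() { window.location = "/checkout.asp?cart=" + cartId; }
</script>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags without executing any script."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(page)

# Only the static href values are found; the checkout URL that the script
# builds at run time never appears in the list.
print(collector.links)
```

The static parse finds `/catalog.html` and the placeholder `#`, but not the checkout URL, because that URL exists only inside script text that is never executed.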
The figure below illustrates several of the principles of constructing a URL Sequence Monitor using regular expressions. The regular expressions shown in the figure can be used to extract URLs from JavaScript or other Web page content. As indicated, content matches for a given step are performed on the content returned for that step. Parentheses in a regular expression cause the value matched by the expression inside them to be retained. This retained value can be passed on to the next step of the sequence by using the {$n} variable. Because a regular expression can contain more than one set of parentheses, the n in {$n} identifies the set: {$n} holds the value matched by the nth set of parentheses. The example in the figure uses only one set of parentheses and thus references the retained value as {$1}.

Web pages containing code that performs the following present additional challenges:
Web pages with dynamically generated link and form content will probably not be parsed correctly by the SiteScope URL Sequence Monitor wizard.
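The retained-value mechanism described above (a parenthesized match in one step, referenced as {$1} in the next) can be sketched outside SiteScope as follows. This is an illustration only, assuming Python's `re` module as a stand-in for Perl-style regular expressions; the page content, the host `www.example.com`, and the helper names `extract_groups` and `substitute` are hypothetical and are not part of SiteScope.

```python
import re

# Hypothetical content returned by one step of a sequence.
step1_content = """
<script type="text/javascript">
function openOrder() { window.location = "/orders/view.asp?session=8f3a21"; }
</script>
"""

# One set of parentheses, so the retained value corresponds to {$1}.
pattern = re.compile(r'window\.location\s*=\s*"([^"]+)"')


def extract_groups(pattern, content):
    """Return the values retained by each set of parentheses, or () if no match."""
    match = pattern.search(content)
    return match.groups() if match else ()


def substitute(template, groups):
    """Replace each {$n} in template with the value from the nth set of parentheses."""
    def repl(m):
        index = int(m.group(1))
        return groups[index - 1]  # {$1} maps to the first retained value
    return re.sub(r"\{\$(\d+)\}", repl, template)


groups = extract_groups(pattern, step1_content)

# The next step's URL is built from the value retained in the previous step.
next_url = substitute("http://www.example.com{$1}", groups)
print(next_url)
```

Because the match is performed again on each run, the next step follows whatever URL the page currently contains, which is the point of using a retained value instead of a hard-coded link.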