WizTom for the Web - Extracting Texts

While the capture mode is well suited to
small projects and occasional updates of
the Multilingual Database (for instance,
if the Web site was only slightly modified),
it is not practical for a larger
project.

Extracting the texts of a Web site consists
of scanning the files and any databases
making up the Web site. WizTom Studio provides
several specialized extractors together
with extensions in order to make it possible
to retrieve texts directly from the Web
site's HTML and ASP code, JavaScript and
XML files, etc.

Setting up the Extractor

Open the Extractor window by clicking
on the Extractor button, then click the Options
icon on the toolbar to select the extractors
for your project.

The Options dialog box shows the list of available
extractors, but only those that are checked will
be used. Make sure the HTML and JavaScript Extractors
are selected and active. If your project uses
VBScript or server-processed pages (such as ASP
on Microsoft IIS sites), check those
extractors as well.

You may want to customize each Extractor by clicking
on the Properties button. This opens a dialog
box that displays the file type associations for
the current Extractor. For instance, the HTML
Extractor is configured to extract texts from
.HTM and .HTML files. If your Web site uses Server
Side Includes, you will need to add .SHTML and/or
.SSI to the file associations. It is also recommended
that you disable text normalization for Web projects.

The next step is to select the files for extraction.
Click on the Open Directory icon to select the
files for your project.
By default, WizTom Studio scans entire directories,
but it is possible to select single files only.
Alternatively, you may create a temporary folder
and copy only the necessary files into it. Please
note that you must have access to your Web site's
original files to use the Extractor; if you do
not have access to a Web site's source files,
you may scan the files of this Step by Step Guide
as an example. They are located in D:\wizstart, where
D is your CD-ROM drive letter.
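As a rough illustration of what file type associations and a directory
scan amount to, the Python sketch below walks a folder and keeps only the
files whose extensions are associated with an extractor. It is not WizTom
Studio code: the ASSOCIATIONS table, the extractor names used as keys and
the select_files function are hypothetical names chosen for this example.

```python
from pathlib import Path

# Hypothetical association table: extractor name -> file extensions.
# The .shtml and .ssi entries mirror the Server Side Includes case.
ASSOCIATIONS = {
    "HTML": {".htm", ".html", ".shtml", ".ssi"},
    "JavaScript": {".js"},
}


def select_files(root, associations=ASSOCIATIONS):
    """Map each extractor name to the matching files found under root."""
    selected = {name: [] for name in associations}
    for path in sorted(Path(root).rglob("*")):  # scans subdirectories too
        if not path.is_file():
            continue
        ext = path.suffix.lower()  # treat .HTM and .htm alike
        for name, extensions in associations.items():
            if ext in extensions:
                selected[name].append(path)
    return selected
```

Calling select_files on a copy of the site's source folder returns, for
each extractor, the list of files it would be asked to scan; files with
other extensions (images, for instance) are simply ignored.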
In this example, the original files of WizArt's
Web site have been used as the source for extraction.

When you close the Extractor window, these settings
will be saved in the project's thesaurus, so that
you will not have to set up the Extractor again
the next time you use it.

Extracting Texts from the Web Site

To extract the texts from the Web site, simply
click on the Extract Texts icon. The selected
files will be scanned and texts will be retrieved
and displayed in the Extractor's right-hand window.

Extracted texts may be reviewed before inclusion
in the Multilingual Database. This is especially
useful when scanning Java applet source files,
for instance, where undesirable strings such as
SQL query strings might be extracted. To
exclude a particular string, simply uncheck it
or use a filter with the Excluded Texts dialog.
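The idea of filtering out undesirable strings can be sketched in Python as
follows. This is only an illustration under stated assumptions, not the
behaviour of the Excluded Texts dialog: the SQL_PATTERN heuristic and the
filter_texts function are hypothetical, and the pattern only recognizes a
few common SQL statement shapes.

```python
import re

# Hypothetical exclusion rule: strings shaped like common SQL statements.
SQL_PATTERN = re.compile(
    r"^\s*(?:SELECT\b.+\bFROM|INSERT\s+INTO|UPDATE\b.+\bSET|DELETE\s+FROM)\b",
    re.IGNORECASE,
)


def filter_texts(texts, excluded_patterns=(SQL_PATTERN,)):
    """Split texts into (kept, excluded) using the exclusion patterns."""
    kept, excluded = [], []
    for text in texts:
        if any(pattern.search(text) for pattern in excluded_patterns):
            excluded.append(text)
        else:
            kept.append(text)
    return kept, excluded
```

A heuristic of this kind can still misjudge a string, which is why a
manual review of the extracted texts remains useful.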
Once texts have been reviewed, they may be sent
to the Multilingual Database. Click on the
Send Texts icon to include the selected texts
in the Multilingual Database.
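For readers curious about what scanning an HTML file for texts involves,
the following Python sketch collects the text nodes of a page while
skipping script and style content. It only illustrates the general idea
of text extraction using Python's standard html.parser module; it does
not describe how WizTom Studio's extractors work, and TextExtractor and
extract_texts are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.texts = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()  # whitespace-only nodes are dropped
        if text and self._skip_depth == 0:
            self.texts.append(text)


def extract_texts(html):
    """Return the list of text strings found in an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.texts
```

Each returned string corresponds to one candidate text that could then
be reviewed before being sent on for translation.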