Interview with Ron Deardorff on Document Generation

Ron Deardorff is North American Technical Support Manager at Inventive Designers, the developer of Scriptura XBOS, a product that supports easy generation of business output. Deardorff spoke to INsights in an in-depth interview covering important aspects of document generation and the Scriptura XBOS solution.

INsights: What is XSLT?
Deardorff: XSLT, which stands for eXtensible Stylesheet Language: Transformations, is a language which is primarily designed for transforming one XML document into another. However, XSLT is capable of transforming XML to many other text-based formats, so a more general definition might be appropriate: XSLT is a language for transforming the structure of an XML document. For more information, go here.
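
As a rough illustration of the kind of structural transformation described above, the following sketch uses the XSLT processor bundled with the Java platform (javax.xml.transform). The invoice and letter element names, the stylesheet, and the class name XsltSketch are all made up for this example; it is not taken from Scriptura XBOS.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical input document.
    static final String XML =
        "<invoice><customer>Acme</customer></invoice>";

    // Hypothetical stylesheet: rebuilds the invoice as a differently structured document.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/invoice'>"
      + "<letter><to><xsl:value-of select='customer'/></to></letter>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the input and returns the result as a string.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        // Leave out the <?xml ...?> declaration so only the transformed tree is returned.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSLT));
    }
}
```

The customer name from the source document ends up inside a to element of the new letter document, which is the sense in which XSLT changes structure rather than content.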

INsights: What is XPath?
Deardorff: XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations and XPointer. The primary purpose of XPath is to address parts of an XML document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and Booleans. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document. Detailed information can be found here.
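
To make the path notation and the string and number facilities concrete, here is a small sketch that evaluates a few XPath 1.0 expressions with the javax.xml.xpath API from the Java platform. The order document and the class name XPathSketch are assumptions invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical document used only for this illustration.
    static final String XML =
        "<order><item price='10'>pen</item><item price='25'>book</item></order>";

    // Evaluates an XPath expression against XML and returns the result as a string.
    static String eval(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        // Addressing a part of the document with a path.
        System.out.println(eval("/order/item[2]"));
        // Number facilities: counting nodes and summing attribute values.
        System.out.println(eval("count(/order/item)"));
        System.out.println(eval("sum(/order/item/@price)"));
        // String facilities: concatenating an element value and an attribute value.
        System.out.println(eval("concat(/order/item[1], '-', /order/item[1]/@price)"));
    }
}
```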

INsights: Can the XSLT code describing the XSL-FO be edited directly from within the Scriptura Designer?
Deardorff: No. The Scriptura Designer is, on purpose, a purely graphical design environment.

INsights: What is XSL?
Deardorff: The Extensible Stylesheet Language (XSL) is a language for expressing style sheets. It consists of two parts: a language for transforming XML documents (XSLT) and an XML vocabulary for specifying formatting semantics (XSL-FO). Go here for more information.

INsights: What is XSL-FO?
Deardorff: XSL-FO is an XML vocabulary for describing block layout, inline formatting, and other presentational characteristics unambiguously and independently of the output device. More info can be found here.

INsights: What is XML?
Deardorff: From the XML FAQ: XML is the "Extensible Markup Language" (extensible because it is not a fixed format like HTML). It is designed to enable the use of SGML on the World Wide Web. XML is not a single, predefined markup language: it's a metalanguage--a language for describing other languages--which lets you design your own markup. (A predefined markup language like HTML defines a way to describe information in one specific class of documents only: XML lets you define your own customized markup languages for limitless different classes of document.) It can do this because it's written in SGML, the international standard metalanguage for text markup systems.

INsights: What is XHTML?
Deardorff: The Extensible HyperText Markup Language (XHTML) is a family of current and future document types and modules that reproduce, subset, and extend HTML, reformulated in XML. XHTML Family document types are all XML-based, and ultimately are designed to work in conjunction with XML-based user agents. XHTML is the successor of HTML, and a series of specifications has been developed for XHTML. Detailed information can be found here.

INsights: Which version of the JDBC specification is used by Scriptura XBOS?
Deardorff: Scriptura XBOS uses the JDBC 2.0 specification. In order to make Scriptura XBOS work with a custom JDBC driver, you will need a JDBC 2.0-compliant driver.

INsights: What is SQL?
Deardorff: Structured Query Language (SQL) is used to communicate with a database. According to ANSI (American National Standards Institute), it is the standard language for relational database management systems. SQL statements are used to perform tasks such as updating data in a database or retrieving data from it. More information can be found here.

INsights: What is ODBC?
Deardorff: Open DataBase Connectivity (ODBC) is a standard database access method developed by Microsoft Corporation. The goal of ODBC is to make it possible to access any data from any application, regardless of which database management system (DBMS) is handling the data. ODBC manages this by inserting a middle layer, called a database driver, between an application and the DBMS. The purpose of this layer is to translate the application's data queries into commands that the DBMS understands. For this to work, both the application and the DBMS must be ODBC-compliant--that is, the application must be capable of issuing ODBC commands and the DBMS must be capable of responding to them. More information can be found here.

INsights: What is JDBC?
Deardorff: Java Database Connectivity (JDBC) is the standard for communication between a Java application and a relational database. The JDBC API has been released in three versions: version 1.22 (released with JDK 1.1.X in package java.sql), version 2.0 (released with Java platform 2 in packages java.sql and javax.sql) and, starting from Java 1.4.0, version 3.0. It is a simple, powerful and largely database-independent way of extracting data from, and inserting data into, any database. Additional information about JDBC is available from the Sun Developer Network site. More information is also available here.

INsights: What is PDF?
Deardorff: PDF (Portable Document Format) is a file format developed by Adobe Systems for distributing documents cross-platform, with original formatting intact and independent of the application software used to create them. Reading PDF files requires a viewing program, which can be set up for use within a graphical Web browser.

INsights: Borders of objects do not look pretty in a PDF document. How can I improve this?
Deardorff: Even though borders of objects are actually correct, by default Adobe Acrobat Reader does not render them smoothly. When the same document is printed or displayed in the Designer (using a large zoom factor) you will see that the borders are really smooth. Acrobat Reader allows lines to be rendered smoothly by selecting "Preferences" from the "Edit" menu and selecting the "Smooth Line Art" checkbox in the "Display" submenu.

INsights: Why is the border of a text shape in a table incorrect in the output?
Deardorff: When you want to draw a border around a cell in a table, select the cell (by using the content outline or by right-clicking on the cell), and set the border on that cell. Don't put the border on the text shape in the cell because then the borders will not extend to the full size of the cell.

INsights: Which image formats are supported by Scriptura?
Deardorff: Scriptura supports GIF, JPEG, PNG and SVG images. Transparent images (GIF and PNG) are also supported.

INsights: Which fonts are supported by Scriptura?
Deardorff: Depending on the output format, Scriptura supports different sets of fonts. Warning: some fonts are protected by licenses that prohibit you from copying or embedding them. You are solely responsible for checking the licenses of the fonts you try to convert. (On Microsoft Windows you can view the font properties, including license information, by right-clicking a font in the File Explorer.)

INsights: When text is inserted into a document, does the other content on the page flow?
Deardorff: In Scriptura, there are two ways of inserting objects (such as text boxes) on a page: absolute positioning and relative positioning. Absolute positioned objects have a fixed position, even if other objects on the page change in size, such as a text box that includes a variable-length string or a table with a dynamic number of rows. This implies that if you mark an object as absolute positioned, it may be overwritten by other objects that have grown. To prevent this overwriting, mark objects as relative positioned. A relative positioned object keeps the same distance from the previous relative positioned object. (By previous we mean closer to the top of the page.) Relative objects cannot overlap; if they would, they are positioned directly underneath each other on the output page. You can change the default position for each of the regions (left, right, header, footer and body region). At installation time, only the body has a default relative position; the other regions have a default absolute position. When creating forms where everything stays in the same position, it is probably best to set the default for the body to "absolute." When creating dynamic tables in the body, "relative" is probably the best value for position. You are free to change this default in Preferences.

INsights: When I preview a document in the Scriptura Server everything looks fine. But when I ask the Scriptura Server to generate XSL-FO, make some modifications to it and then generate the PDF, the whole layout looks bad. What is wrong?
Deardorff: Make sure that, when you manipulate the XSL-FO generated by the Scriptura Server, no whitespace is added to the original document. A lot of XML tools and XSLT transformers allow you to indent XML files to make them look nicer. Adding whitespace (blanks, tabs and newlines) impacts the formatter when rendering the output document.
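
The effect of indentation can be seen at the XML level with a small sketch: the same two inline elements are parsed once in compact form and once indented, and the indented version contains extra whitespace-only text nodes that a formatter may treat as content. The sample fragments and the class name WhitespaceSketch are assumptions made for this illustration, and the example only counts nodes; it does not run an XSL-FO formatter.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WhitespaceSketch {
    // Hypothetical XSL-FO fragment as a tool might emit it without indentation.
    static final String COMPACT =
        "<fo:block xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<fo:inline>a</fo:inline><fo:inline>b</fo:inline>"
      + "</fo:block>";

    // The same fragment after a pretty-printer has added newlines and spaces.
    static final String INDENTED =
        "<fo:block xmlns:fo='http://www.w3.org/1999/XSL/Format'>\n"
      + "  <fo:inline>a</fo:inline>\n"
      + "  <fo:inline>b</fo:inline>\n"
      + "</fo:block>";

    // Counts the direct child nodes (elements and text nodes) of the root element.
    static int childNodeCount(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("compact:  " + childNodeCount(COMPACT));   // two elements
        System.out.println("indented: " + childNodeCount(INDENTED));  // two elements plus three text nodes
    }
}
```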

INsights: When I position an object at the absolute position (5cm,5cm) and I print the generated PDF file, the result on paper is not printed at the correct position. What is wrong?
Deardorff: The PDF file generated by Scriptura does contain the object at the correct position (5cm,5cm). You can verify this by opening the PDF file in Adobe Acrobat, displaying the grid (Menu -> View -> Grid) and setting the grid size to the position of your object (5cm) (Menu -> Edit -> Preferences -> General -> Layoutgrid). You will see that the object is at the correct position. Unfortunately, printing a PDF file does not always respect the positioning of the contents of the document. These differences are probably caused by the printer driver installed on your local machine. Experience shows that PostScript printer drivers result in better positioning than PCL printer drivers.

INsights: What is the use of page margins in the Scriptura Designer?
Deardorff: The page margins displayed by the Scriptura Designer are just a hint to the user that maybe some of the objects placed on the edge of the page cannot be printed. The actual printable area on a page is different for each printer, so you should look this up for the printer you will be printing the document to and set the page margins correctly. If you are creating PDF files that are not going to be printed, you can simply ignore the page margins or set them to "0" (which would make them disappear) in the page setup (see "File -> Page setup" in the Scriptura Designer). The Scriptura Designer does not take these page margins into account anywhere. They are only displayed on the page for your convenience.

INsights: My absolute positioned text object does not show up in the result document. What is wrong?
Deardorff: An absolute positioned object uses an exact specification for its width and height. Depending on the font and output medium, the text you inserted might be too large to render inside the provided area. In this case, no text will appear. Please enlarge your text object to accommodate the text.

INsights: How can I design, in a dynamic table, alternating background colors for each row?
Deardorff: This can be accomplished by specifying a condition on the property "Fill Color" of the table row. The condition to add is "[position() mod 2] = 1"; this condition must be created using the advanced wizard.
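
Outside the wizard, the same odd-row test can be tried in plain XSLT 1.0, where it is written position() mod 2 = 1 (the brackets in the answer are assumed here to be wizard notation). The sketch below is not what Scriptura generates; the rows document, the shaded and plain labels, and the class name AlternatingRowsSketch are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AlternatingRowsSketch {
    // Hypothetical data for a dynamic table with three rows.
    static final String XML = "<rows><row>a</row><row>b</row><row>c</row></rows>";

    // Labels each row "shaded" or "plain" depending on whether its position is odd.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/rows'>"
      + "<xsl:for-each select='row'>"
      + "<xsl:value-of select='.'/>"
      + "<xsl:choose>"
      + "<xsl:when test='position() mod 2 = 1'>:shaded;</xsl:when>"
      + "<xsl:otherwise>:plain;</xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the sample rows and returns the text output.
    static String render() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

Rows one and three satisfy the condition and row two does not, which is what produces the alternating pattern.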

INsights: Does Scriptura support vertically positioning content in table cells?
Deardorff: Yes. Text objects inside table cells have a "vertical align" property.

INsights: Does Scriptura support nesting of tables?
Deardorff: Yes.

INsights: Does Scriptura support line drawing?
Deardorff: Yes. Scriptura supports both horizontal and vertical lines in documents. The XSL-FO standard only supports horizontal lines and they are drawn using the line object in the Scriptura Designer. Vertical lines can be simulated using a rectangle where you only set the left (or right) border.

INsights: Does Scriptura support complex tables using column- or row-spanning, nesting, running headers and footers and so on?
Deardorff: Yes.

INsights: Does Scriptura support barcodes?
Deardorff: Yes. Scriptura supports barcodes using either static or dynamic data. The following types of barcodes are supported: Code 39, Code 93, Code 39 Extended, Code 93 Extended, US Postal Service Code (PostNet), Universal Product Code A (UPC-A), Universal Product Code E (UPC-E), European Article Numbering 8 (EAN-8), European Article Numbering 13 (EAN-13), Code 128, Codabar (Code 2 of 7), Interleaved 2 of 5, Standard 2 of 5, and Code 11 (or USD-8).

INsights: Can I use different page settings within a single document?
Deardorff: Yes. You can apply the page-specific settings to the current page only or to the whole document. Page settings can easily be reached by going to File -> Page Setup or by selecting the page and selecting "Properties" from the right-click menu. This way you can create a document where page one is portrait and all others are landscape.

INsights: Can I insert hyperlinks?
Deardorff: Yes. You can mark selected text as a hyperlink.

INsights: What is a ".lic" file, and what should I do with it?
Deardorff: The Scriptura License File is a file that has the ".lic" extension and that allows you to use Scriptura. It is a special file that can only be opened by the Scriptura License Wizard. This file is strictly personal.

INsights: On which platforms does the Scriptura Designer run?
Deardorff: Any Java 1.4.2 platform.

INsights: How do I tweak the amount of memory the Scriptura XBOS programs use?
Deardorff: If you are not satisfied with the default settings, you can customize them in the corresponding .ja files for the commands. These files include JVM-neutral settings for specifying the JVM's memory usage. The line %IF_EXISTS%("INIT_JAVA_HEAP", "@INIT_JAVA_HEAP@<size>m") specifies the minimum memory the JVM will use, and the line %IF_EXISTS%("MAX_JAVA_HEAP", "@MAX_JAVA_HEAP@<size>m") specifies the maximum memory the JVM will use. Replace the <size> placeholder with a number expressed in megabytes to specify the custom memory settings.
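
One generic way to check what heap limits a JVM is actually running with is to ask the runtime itself. The sketch below uses only the standard java.lang.Runtime API and is independent of Scriptura; the class name HeapSketch is made up, and whether the values reflect the settings above depends on how the launcher passes them to the JVM, which is an assumption here.

```java
class HeapSketch {
    // Converts a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Heap currently reserved by the JVM, in megabytes.
    static long currentHeapMb() {
        return toMegabytes(Runtime.getRuntime().totalMemory());
    }

    // Upper limit the JVM will try to use for the heap, in megabytes.
    static long maxHeapMb() {
        return toMegabytes(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("current heap (MB): " + currentHeapMb());
        System.out.println("maximum heap (MB): " + maxHeapMb());
    }
}
```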