4.2 Activities and work products


Business language analysis produces several modeling components and formal documentation to make these modeling components accessible.  The work products range from documents and graphic pattern depictions to complex multi-dimensional semantic networks in appropriate repository technology.

Because business language analysis is essentially a bottom-up analysis of an existing corpus of specific business language, the work is very detail-oriented.  It starts with a large mass of language material that is provided or found in the environment.  By determining definitions, applying existing patterns, and filling in new patterns of abstraction, we add detail to a higher-level framework to clarify domain-specific language and reduce its ambiguity.

The following section describes the activities of business language analysis and their related work products.  It employs a small sample of language from a hypothetical insurance company to illustrate some of the steps and results of a typical business language analysis.

Gather language sources - There are several sources of business language from which we can derive the patterns of language.  Some can be proactively developed sources:  interviews, facilitated sessions, and questionnaires.  The advantage of these techniques is that they involve people from the business, fostering discussion, raising issues, and moving the group toward consensus.  However, they make time demands on people who are already overworked, and they are limited by the memory and biases of a small group of individuals constrained by a time-box.

“Found” sources, on the other hand, are documents and other materials produced by the business for its own use.  They range from public pronouncements to proprietary items, and from formal to ad hoc documents.  Examples include:  requirements documents, business plans, product specifications, catalogs, training materials, regulatory filings, methods & procedures, process models, forms, charts of accounts, organization charts, QIT and BPR models, contracts, and mission or vision statements.  Often existing business documents prove to be the best sources of raw material for models because, in many cases, the material is not raw at all;  it is already quite refined.  Some existing information sources are well on their way to being models, worked over by many business minds in an attempt to reach consensus.

The example below is a single fragment of an insurance policy form, scanned from a paper copy and converted to text via OCR.  It is a section of the policy informing the policyholder of certain conditions of the contract related to designating and changing beneficiaries:


You may designate or change a beneficiary. Your request must be in writing and in a form that meets our needs. It will take effect only when we file it at our Home Office; this will be after you send the contract to us to be endorsed, if we ask you to do so. Then any previous beneficiary's interest will end as of the date of the request. It will end then even if the Insured is not living when we file the request. Any beneficiary's interest is subject to the rights of any assignee of whom we know.
When a beneficiary is designated, any relationship shown is to the Insured, unless otherwise stated. To show priority, we may use numbered classes, so that the class with first priority is called class 1, the class with next priority is called class 2, and so on. When we use numbered classes, these statements apply to beneficiaries unless the form states otherwise:
1. One who survives the Insured will have the right to be paid only if no one in a prior class survives the Insured.
2. One who has the right to be paid will be the only one paid if no one else in the same class survives the Insured.
3. Two or more in the same class who have the right to be paid will be paid in equal shares.
4. If none survives the insured, we will pay in one sum to the Insured's estate.
Before we make a payment, we have the right to decide what proof we need of the identity, age or any other fact about any persons designated as beneficiaries. If beneficiaries are not designated by name and we make payment(s) based on that proof, we will not have to make the payment(s) again.



Extract business terms - The next step after obtaining the sources of language is to identify the business terms they contain.  Recognition of a business term becomes a matter of intuitive feel for business language analysts.  The search through the files and documents produces a list of terms.  A fragment of such a list is shown in Table 1.


Table 1


As we extract terms from the original source document, we can eat up the file by replacing found terms with surrogates, such as “**”.  What we end up with looks like the skeletal remains below:





At this stage it is still possible to identify terms that may have been previously missed.  In the stripped down text above, we can identify at least two interesting terms that hadn’t yet found their way into our list: “apply to” and “terms”.
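The surrogate-replacement step described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions:  the function name `strip_terms`, the case-insensitive whole-word matching, and the longest-term-first ordering are our own choices, not a prescribed part of the method.

```python
import re

def strip_terms(text, terms, surrogate="**"):
    """Replace each already-extracted term in text with a surrogate marker.

    Assumptions (ours, for illustration): matching is case-insensitive,
    respects word boundaries, and tries longer terms first so that
    "previous beneficiary" is consumed before "beneficiary".
    """
    ordered = sorted(set(terms), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(surrogate, text)

# One sentence from the sample policy fragment, with three terms already listed.
sentence = "You may designate or change a beneficiary."
print(strip_terms(sentence, ["beneficiary", "designate", "change"]))
# prints: You may ** or ** a **.
```

Whatever words survive the stripping are the candidates to inspect for terms that were missed on the first pass.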

Build glossary - After alphabetizing and removing duplicate terms from the list, we can create a glossary with definitions.  While building the glossary, it is particularly important to involve business experts - those who actually know how terms are used, and can identify and differentiate among different uses of the same word.  Often glossaries that provide the raw material for the language analysis already exist in source documents.  The following is a sample of glossary entries:

Beneficiary -- A person or other entity designated to receive benefits from an insurance policy upon the death of the insured.
Proceeds -- The total amount paid out of an insurance policy upon termination of the agreement.
Assignee -- A person or other party to whom benefits from an insurance policy are contractually assigned.
Interest -- The type and quantity of benefit from a policy that are allocated to a particular party, as in “beneficiary’s interest”.
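As a minimal sketch of the mechanical part of this step (alphabetizing, removing duplicates, and attaching definitions), consider the following Python fragment.  The function name `build_glossary` and the choice to compare terms case-insensitively are assumptions made for illustration;  the definitions themselves must still come from business experts.

```python
def build_glossary(raw_terms, definitions):
    """Return alphabetized (term, definition) pairs with duplicates removed.

    Terms are compared case-insensitively (an assumption of this sketch).
    A term with no definition yet is kept with None so that it can be
    flagged for review by business experts.
    """
    unique = {}
    for term in raw_terms:
        key = term.strip().lower()
        if key and key not in unique:
            unique[key] = definitions.get(key)
    return sorted(unique.items())

raw_terms = ["Beneficiary", "assignee", "beneficiary", "interest"]
definitions = {
    "beneficiary": "A person or other entity designated to receive benefits "
                   "from an insurance policy upon the death of the insured.",
    "assignee": "A person or other party to whom benefits from an insurance "
                "policy are contractually assigned.",
}
for term, definition in build_glossary(raw_terms, definitions):
    print(term, "--", definition or "(definition needed)")
```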

Classify terms - Classification of terms begins to determine the basic shape of the information requirements that will need to be met by information systems.  Areas of key importance will exhibit long lists of terms.  This is a business-oriented demonstration of the Whorfian principle that language shapes the thinking of its users.  The concepts that are provided by a generic business ontology form the basis of this classification, but they will most likely need to be extended by concepts that are relevant and possibly unique to the particular business domain.  An analyst can take a first cut at classifying terms, but business experts need to validate this work.  The following is a set of terms extracted from the sample document above, classified by a very generic business ontology.


Table 2
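A first-cut classification can be held as a simple mapping from term to concept and then inverted to see which concepts attract the longest lists of terms.  The sketch below is illustrative only:  the concept names “external role”, “event”, and “situation” are taken from the pattern discussion in this section, but the particular assignments (especially that of “assignee”) are the analyst's assumptions and would need validation by business experts.

```python
from collections import defaultdict

def classify(term_to_concept):
    """Group terms under the ontology concept each one was assigned to."""
    by_concept = defaultdict(list)
    for term, concept in term_to_concept.items():
        by_concept[concept].append(term)
    return {concept: sorted(terms) for concept, terms in by_concept.items()}

# Analyst's first cut (hypothetical assignments, not validated).
first_cut = {
    "beneficiary": "external role",
    "assignee": "external role",   # assumption made for this sketch
    "request": "event",
    "death claim": "situation",
}

classified = classify(first_cut)
# Concepts with the longest term lists are listed first.
ranked = sorted(classified, key=lambda c: len(classified[c]), reverse=True)
for concept in ranked:
    print(concept, "->", ", ".join(classified[concept]))
```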

Link terms - Linkage among business terms sets up the meaning structures that help to build business object models (class hierarchies, object composition, variables, collaborations among objects). There is a set of relationships that can be articulated for business terms including linkage of terms to business concepts, linkage of terms to each other via semantic relationships, and linkage of terms to sources in which they were found.

The following set of figures provides an indication of how semantic linkage evolves in our thinking about a set of terms from a business source.  It suggests the types of questions to be asked about each term that will allow us to understand the patterns of meaning in the business.

Figure 5 is a generic conceptual pattern.  It says there is such a thing as an external role that we may expect to find.  Any external role is likely to be either a source or a recipient, may be formal or informal, is played by an individual or organization, is involved in situations, and generates events.

Figure 5

In Figure 6, we have filled the slot in the center of the pattern with one of the terms that we found in our analysis of the document fragment.  This directs attention to a set of questions, based on the fact that we have classified “beneficiary” as an “external role”.  These questions cause us to go back to our term list to see if we can find terms to fill the refinement, subtype, individual/organization, situation, and event slots that are indicated by the question marks in the figure.

Figure 6

Figure 7 shows the slots in the template filled in.  Among the terms, there are clear-cut subtypes of beneficiaries, called “class 1 beneficiary” and “class 2 beneficiary”, and a refinement, “previous”.  The beneficiary role can be played by a person or by an estate.  A term “request” may fill the event slot in this pattern, but we’re going to go out on a limb and suggest that maybe it is a “claim request”.  We have also invented a term “death claim” to represent a situation that a beneficiary would be involved in.  These suggestions by the analyst will need to be validated by the business user, and may lead us to additional terminology that we haven’t discovered in the document.

Figure 7
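The slot-filling idea behind Figures 5 through 7 can also be expressed as a small data structure.  The following Python sketch is one possible representation, not the notation used in the figures:  the class name `ExternalRolePattern` and its field names are hypothetical, chosen to mirror the refinement, subtype, individual/organization, situation, and event slots discussed above.

```python
from dataclasses import dataclass, field

@dataclass
class ExternalRolePattern:
    """Slots of the generic external-role pattern; an empty slot is an open question."""
    term: str
    refinements: list = field(default_factory=list)
    subtypes: list = field(default_factory=list)
    played_by: list = field(default_factory=list)
    situations: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def open_slots(self):
        """Names of slots that still have no term assigned."""
        return [name for name, value in vars(self).items()
                if name != "term" and not value]

# Placing "beneficiary" in the center of the pattern leaves five questions.
beneficiary = ExternalRolePattern(term="beneficiary")
print(beneficiary.open_slots())

# Filling the slots with terms from the document and the analyst's suggestions.
beneficiary.subtypes += ["class 1 beneficiary", "class 2 beneficiary"]
beneficiary.refinements.append("previous")
beneficiary.played_by += ["person", "estate"]
beneficiary.events.append("claim request")    # analyst's suggestion, to be validated
beneficiary.situations.append("death claim")  # term invented by the analyst
print(beneficiary.open_slots())
```

Any slot still reported as open is a prompt to return to the term list or to the business user.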

Ideally, every term would be diagrammed to create semantic patterns like the one above.  Realistically, it is most important to create these diagrams for certain key terms that provide high leverage for understanding the domain of interest.

Load semantic database - It is easy to see from the small sample outlined above that analysis of business language leads to a complex, multidimensional network of terms, concepts, and meaning.  Every way we try to portray this on two-dimensional paper seems somehow inadequate.  In the original text, terms can be easily overlooked.  A simple list of terms is just a start.  A glossary is more helpful, but suffers from the circularity of definitions, and the restriction of considering only one term at a time.  Graphic linkages, drawn according to predefined patterns, help give more of a sense of the overall language, and appeal to the visually oriented.  They are, however, laborious to create, and, in a large vocabulary, become overwhelming by their sheer numbers.

A highly linked database can overcome most of these paper-oriented limitations by representing the terms, definitions, sources, linkage to concepts, and linkage to each other.  There are many products or technologies that can support this requirement, including object-oriented databases, hypertext, or proprietary flat-file access methods.  There is also a class of database management system that specializes in capturing and maintaining multidimensional semantic networks.  Once the material is in database format, a multidimensional browsing tool can mirror the multidimensional data structure, so that all links from a specific term can be followed and displayed at the same time.
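To make the idea of following all links from a term concrete, here is a small in-memory sketch in Python.  It is a stand-in for illustration, not any particular database product:  the class `SemanticNetwork`, its methods, and the relation labels (“classified as”, “subtype of”, “found in”) are all hypothetical names.

```python
from collections import defaultdict

class SemanticNetwork:
    """A tiny in-memory stand-in for a linked repository of terms."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, source, relation, target):
        """Record a labeled link and its inverse, so it can be followed from either end."""
        self._links[source].add((relation, target))
        self._links[target].add(("inverse of " + relation, source))

    def links_from(self, node):
        """Return every link leaving a node, sorted for stable display."""
        return sorted(self._links.get(node, set()))

net = SemanticNetwork()
net.link("beneficiary", "classified as", "external role")        # term to concept
net.link("class 1 beneficiary", "subtype of", "beneficiary")     # term to term
net.link("beneficiary", "found in", "policy form fragment")      # term to source

# All links from one term, displayed together.
for relation, other in net.links_from("beneficiary"):
    print("beneficiary", "--", relation, "->", other)
```

Storing the inverse of each link is a design choice of this sketch;  it lets a browsing tool start from a concept or a source as easily as from a term.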

A repository can hold business terms, business concepts, definitions, sources, inter-term linkages, concept-to-term linkages, and linkages between terms and design artifacts (object classes, database tables, etc.), all of which can be maintained dynamically as the models evolve.  It is important to establish a data administration function to make sure that updates, backups, and data consistency matters are attended to.

Overall documentation of results - Throughout the process of creation and maintenance of the business language model, there are periodic points where it is useful to report results.  A number of documents can serve this reporting requirement.  Issues lists are working documents for the team that is performing the business language analysis.  A team member should be assigned to each issue, so that there is responsibility for its resolution.  A findings document is a simple listing of conclusions and implications that have emerged during the course of the analysis.  Descriptive papers embed parts of the model in explanatory text.