Introduction
1. Critical components under QA Testable category
When testing data-intensive systems, about 70-90% of QA effort is concentrated on one component: the “data processing/transformation engine”, which can be identified as the “common critical component” of all these systems. This lion's share of the testing effort falls on the engine because it forms the core of such systems, which exist to process streamed data into usable information.
From observation, any “data processing/transformation engine” is governed by a set of data transformation rules whose complexity ranges from low to extremely high, depending on the system design.
The common test automation approach for such projects is to automate the verification and validation of the transformations applied to the data, since this is where most of the testing effort goes. A common yet wrong way to do this is to mimic the end-to-end transformation rules governing each module in test scripts, perform a “parallel feed” of the candidate test data through both the QA setup and the test script, and compare the two outputs.
QUESTION: If QA personnel can mimic the end-to-end transformation rules governing a module in order to test it, and can certify the test script as bug free, can't that code simply be ported into the application for a bug-free rollout?
2. Proposed Approach
An alternative approach to automating these tests, referred to henceforth as the reverse tree parse methodology, inverts the parsing of the transformation rule algorithms so that QA undertakes a bottom-up parsing technique.
In this approach the relevant part of a transformation engine is converted into a flow diagram. The flow diagram is then reverse parsed to generate and codify an optimum number of reverse parse trees, which are then predictively parsed to validate the transformation scenario and arrive at a subset of “base states” that denote a passed test case. QA thus adopts an “inverted logic flow” to validate the scenario. The advantage is that the test scripts not only determine the validity of a test case but also point to the “base scenarios” that would generate the current output.
3. Where do we start???
The following section walks through adopting the approach step by step.
1. Modularize the transformation rules: Every transformation engine or the design specification governing them can be classified into modules such that for every module we have a finite set of rules governing the transformation of the data relevant to the module.
2. Define the finite set of relevant rules: Depending on the complexity of the transformation rules, the optimum way to maintain the modularity of the automation effort is to identify and isolate every single rule governing the relevant module. These rules form the superset of all rules within the transformation logic.
** Rules can be classified into the following broad categories:
- Rules that are independent of all other rules.
- Rules which act as the “invoker” of other rules.
- Rules which are “exclusively invoked” by other rules. They don’t have independent existence.
- Rules which act as “invoker” at times and as being “invoked” at other times.
Irrespective of its classification, every rule is then converted into a flow diagram.
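The four rule categories above can be derived mechanically once the invocation relationships between rules are written down. The following is a minimal sketch, assuming the relationships are available as a simple mapping from each rule to the rules it invokes. The names `RuleKind`, `classify_rules` and the rule names `R1` to `R4` are hypothetical and used only for illustration. The sketch also assumes that a rule which is invoked by another rule and invokes nothing itself is “exclusively invoked”; a static mapping cannot tell whether such a rule is ever run on its own.

```python
from enum import Enum


class RuleKind(Enum):
    INDEPENDENT = "independent"
    INVOKER = "invoker"
    INVOKED = "exclusively invoked"
    BOTH = "invoker and invoked"


def classify_rules(invokes):
    """Classify each rule from a mapping of rule name -> rules it invokes."""
    # Every rule that appears as a callee of some other rule.
    invoked = {callee for callees in invokes.values() for callee in callees}
    all_rules = set(invokes) | invoked
    kinds = {}
    for rule in sorted(all_rules):
        calls_others = bool(invokes.get(rule))
        is_called = rule in invoked
        if calls_others and is_called:
            kinds[rule] = RuleKind.BOTH
        elif calls_others:
            kinds[rule] = RuleKind.INVOKER
        elif is_called:
            kinds[rule] = RuleKind.INVOKED
        else:
            kinds[rule] = RuleKind.INDEPENDENT
    return kinds


# Hypothetical rule names, for illustration only.
example = {
    "R1": [],        # calls nothing, called by nothing
    "R2": ["R3"],    # only invokes
    "R3": ["R4"],    # invoked by R2 and invokes R4
    "R4": [],        # only invoked
}
kinds = classify_rules(example)
```

A table like `kinds` gives a quick view of which rules need a standalone flow diagram and which only make sense in the context of their invoker.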
4. Process of creating flow diagrams for codifying
Terminology:
The following are the major components of parse trees:
Nodes: These are the finite set of all states in which the transformation process can reside while parsing the rule. Each node is associated with a set of data and a finite set of conditions that must be verified before the algorithm can move out of that node to the next one.
Start Nodes: This is the set of states from which a parse can start. The set of start nodes depends on the data that needs to be validated.
End Nodes: This is the finite set of nodes in which the parse algorithm can reside at the end of the parsing process. The test case is said to have passed when the algorithm ends in any one of these nodes.
Process Arrow: This is a unit of processing in the parse algorithm that takes the algorithm from one state to another. Every process arrow is attached to a node, so process arrows are in effect the conditions that move the algorithm from one valid node to another.
** The algorithm can never reside at the following states:
- A state which is not pre-defined and whose conditions are not known.
- A state which does not have a valid outgoing process arrow attached to it.
If execution takes the algorithm to a node that is not an end node and that has no valid process arrow to follow, the test case has failed.
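The terminology above maps onto a small data structure and a walking loop. The sketch below is one possible reading, not a definitive implementation: `Node`, `run_parse` and the node names are hypothetical, each process arrow is modelled as a (condition, target) pair, and the cycle guard is an added assumption to keep the loop finite.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    name: str
    is_end: bool = False
    # Each process arrow: (condition on the data, name of the target node).
    arrows: List[Tuple[Callable[[dict], bool], str]] = field(default_factory=list)


def run_parse(nodes: Dict[str, Node], start: str, data: dict) -> bool:
    """Walk from a start node; True means the test case passed."""
    current = nodes[start]
    visited = set()
    while not current.is_end:
        if current.name in visited:
            return False  # assumption: revisiting a node counts as a failure
        visited.add(current.name)
        # Pick the first process arrow whose condition holds for this data.
        target = next((t for cond, t in current.arrows if cond(data)), None)
        if target is None or target not in nodes:
            # No valid outgoing arrow, or an undefined state: failed test case.
            return False
        current = nodes[target]
    return True


# Hypothetical two-node tree: move to the end node only if x is positive.
nodes = {
    "start": Node("start", arrows=[(lambda d: d["x"] > 0, "end")]),
    "end": Node("end", is_end=True),
}
passed = run_parse(nodes, "start", {"x": 1})
failed = run_parse(nodes, "start", {"x": -1})
```

Here `passed` is `True` because the walk reaches an end node, and `failed` is `False` because the start node has no valid process arrow for that data.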
5. Creation of parse trees
Creating the parse tree means representing the process flow of the rule in terms of start nodes, process arrows and end nodes. Since we reverse parse this tree for validation, the nodes that define the beginning of the transformation rule become the end nodes (the nodes that denote success when the tree is codified), and the leaf nodes become the start nodes for the algorithm when the tree is coded. Consider the following sample transformation rule to illustrate the process of creating a parse tree.
“If modified date (MD) is present and is greater than update date (UD) then output 1, else if modified date (MD) is not present or is less than update date (UD) then output 0”
This simple rule can be represented as a parse tree as follows:
In the above parse tree the nodes marked in gold will be the start nodes from where the algorithm can be initiated. The green node is the end node which signals a passed test case.
6. Coding for nodes
To convert the parse tree to an executable script, each node in the tree is defined as an object of a custom class. The class definition for each node contains the variables that hold the data relevant to the node, e.g. variables MD and UD for node number 4, MD alone for node number 2, etc. The class definition also contains functions that represent each process arrow and perform the validation check attached to it. Each process arrow also passes relevant data to its target node if needed.
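As a rough illustration, the sample MD/UD rule could be coded along these lines. This is a sketch under stated assumptions rather than the original scripts: the class names `RuleEntry`, `OutputOne` and `OutputZero` and the function `validate` are hypothetical, the node numbering of the parse tree figure is not reproduced, and UD is assumed to be always present. The rule as written does not say what happens when MD equals UD, so the sketch provides no process arrow for that case.

```python
from datetime import date


class RuleEntry:
    """End node of the reverse parse (the beginning of the transformation rule)."""
    is_end = True


class OutputOne:
    """Start node for an observed output of 1; holds MD and UD."""
    is_end = False

    def __init__(self, md, ud):
        self.md = md
        self.ud = ud

    def arrow_md_after_ud(self):
        # Process arrow: valid only if MD is present and later than UD.
        if self.md is not None and self.md > self.ud:
            return RuleEntry()
        return None

    def arrows(self):
        return [self.arrow_md_after_ud]


class OutputZero:
    """Start node for an observed output of 0; holds MD and UD."""
    is_end = False

    def __init__(self, md, ud):
        self.md = md
        self.ud = ud

    def arrow_md_missing(self):
        # Process arrow: valid only if MD is not present.
        return RuleEntry() if self.md is None else None

    def arrow_md_before_ud(self):
        # Process arrow: valid only if MD is present and earlier than UD.
        if self.md is not None and self.md < self.ud:
            return RuleEntry()
        return None

    def arrows(self):
        return [self.arrow_md_missing, self.arrow_md_before_ud]


def validate(output, md, ud):
    """Reverse parse from the observed output back to the rule entry."""
    start_nodes = {1: OutputOne, 0: OutputZero}
    if output not in start_nodes:
        return False  # a state that is not pre-defined
    node = start_nodes[output](md, ud)
    while not node.is_end:
        targets = [arrow() for arrow in node.arrows()]
        node = next((t for t in targets if t is not None), None)
        if node is None:
            return False  # no valid process arrow: failed test case
    return True


ok_one = validate(1, date(2024, 5, 2), date(2024, 5, 1))
ok_zero = validate(0, None, date(2024, 5, 1))
bad_one = validate(1, None, date(2024, 5, 1))
```

Note that the script never recomputes the output from the inputs. It starts from the observed output and checks whether any base state could have produced it, which is what distinguishes it from the “parallel feed” approach.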
7. Building that “Parse Director”
The parse director module is a wrapper script which directs the calling of parse trees and performs the actual reverse parse of the transformation rule.
Need for parse director:
Every transformation has a set of many rules. The parse director does the following:
- Holds the structure that defines which rules can be called after one rule has been evaluated in the reverse parse process.
- Directs the parsing by calling the appropriate parse tree to validate the test case.
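A parse director with these two responsibilities could be sketched as below. Everything here is an assumption made for illustration: the class `ParseDirector`, its `register` and `validate` methods, the rule names `flag` and `dates`, and the choice to model each parse tree as a callable that returns `True` when its reverse parse succeeds. The sketch also assumes that every follow-up rule registered for a rule must pass.

```python
from typing import Callable, Dict, List


class ParseDirector:
    def __init__(self):
        # Rule name -> callable that reverse parses one record.
        self._trees: Dict[str, Callable[[dict], bool]] = {}
        # Rule name -> rules that can be called after it is evaluated.
        self._next_rules: Dict[str, List[str]] = {}

    def register(self, rule, tree, next_rules=()):
        self._trees[rule] = tree
        self._next_rules[rule] = list(next_rules)

    def validate(self, rule, record, _seen=None):
        """Reverse parse `rule`, then every rule registered to follow it."""
        seen = set() if _seen is None else _seen
        if rule in seen:
            return True  # already evaluated during this run
        seen.add(rule)
        if not self._trees[rule](record):
            return False
        return all(
            self.validate(nxt, record, seen) for nxt in self._next_rules[rule]
        )


# Hypothetical rules, for illustration only.
director = ParseDirector()
director.register("flag", lambda r: r["flag"] in (0, 1), next_rules=["dates"])
director.register("dates", lambda r: r["md"] is None or r["ud"] is not None)

good = director.validate("flag", {"flag": 1, "md": None, "ud": None})
bad = director.validate("flag", {"flag": 2, "md": None, "ud": None})
```

Keeping the “what can follow what” structure inside the director means individual parse trees stay independent of each other and can be added or replaced one module at a time.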
8. Benefits of using parse tree oriented automation
- Once developed, this automation framework enables the QA team to predict the finite set of permutations that are possible for the process flow in each module.
- This will enable them to finalize the minimum number of test cases required to ensure test coverage of all valid process paths.
- This can serve as the direct replacement for orthogonal arrays to enhance and streamline the testing of each module.
** Note that this minimizes the number of test cases needed to achieve coverage at a “function call level” and not at a “code branching level”. This is justifiable for the black-box testing relevant to QA.
- The ratio of the number of test cases executed to the number of defects identified will be minimized, enhancing the productivity of the QA team and facilitating cuts in QA schedules.
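To make the coverage claim concrete, the valid process paths of a parse tree can be enumerated, with one test case planned per path. The sketch below assumes the tree is available as a mapping from each node to its target nodes. The function `enumerate_paths` and the node names, which loosely follow the sample MD/UD rule, are hypothetical.

```python
def enumerate_paths(arrows, start_nodes, end_nodes):
    """Return every path from a start node to an end node."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if node in end_nodes:
            paths.append(path)
            return
        for target in arrows.get(node, []):
            if target not in path:  # assumption: ignore arrows that loop back
                walk(target, path)

    for start in start_nodes:
        walk(start, [])
    return paths


# Hypothetical node names for the sample MD/UD rule.
arrows = {
    "output_1": ["md_after_ud"],
    "output_0": ["md_missing", "md_before_ud"],
    "md_after_ud": ["rule_entry"],
    "md_missing": ["rule_entry"],
    "md_before_ud": ["rule_entry"],
}
paths = enumerate_paths(arrows, ["output_1", "output_0"], {"rule_entry"})
minimum_test_cases = len(paths)
```

For this tree there are three valid paths, so three test cases cover every process path at the “function call level”.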
9. Requirements to adopt the approach
- In depth understanding of the transformation engine under test.
- QA testers with experience in building automation scripts using object-oriented languages such as Java or Ruby.
- Testing teams with expertise in refining and analyzing test scripts developed in the above languages.