A Virtual Deployment Testing Environment for Enterprise Software Systems∗

Jian Yu, Jun Han, Jean-Guy Schneider, Cameron Hine
Faculty of Information and Communication Technologies, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
{jianyu, jhan, jschneider, chine}@swin.edu.au

Steve Versteeg
CA Labs, 380 St. Kilda Rd, Melbourne, VIC 3004, Australia
steve.versteeg@ca.com

ABSTRACT
Modern enterprise software systems often need to interact with a large number of heterogeneous systems in an enterprise IT environment. The distribution, scale, and heterogeneity of such an environment make it difficult to test a system's quality attributes, such as performance and scalability, before it is actually deployed in the environment. In this paper, we present a Coloured Petri nets (CPN) based system behaviour emulation approach and a lightweight virtual testing framework for provisioning the deployment testing environment of an enterprise system, so that its quality attributes, especially scalability, can be evaluated without physically connecting to the real production environment. This testing environment is scalable and has a flexible, pluggable architecture to support the emulation of the behaviour of heterogeneous systems in the environment. To validate the feasibility of this approach, a CPN emulation model for LDAP has been developed and applied in testing the scalability of a real-life identity management system. An in-lab performance study has been conducted to demonstrate the effectiveness of this approach.

∗This work is supported by the ARC Linkage Project LP100100622 Large-Scale Emulation for Enterprise Software Systems.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
QoSA'12, June 25–28, 2012, Bertinoro, Italy.
Copyright 2012 ACM 978-1-4503-1346-9/12/06 ...$10.00.

Categories and Subject Descriptors
D.2.5 [Software Engineering]: Testing and Debugging—Testing tools; D.2.2 [Software Engineering]: Design Tools and Techniques—Petri nets

General Terms
Design, Measurement, Verification

Keywords
Enterprise software systems, Deployment testing, System emulation, Petri nets

1. INTRODUCTION
Modern enterprise software systems usually operate in a complex environment where up to tens of thousands of diverse systems interact and cooperate to support the daily operation of a large enterprise such as, for example, a multinational financial institution. From the perspective of a single system as a node in this networked architecture, its operating environment, that is, the collection of systems it needs to interact with, is inherently distributed, heterogeneous, and large-scale.
In a typical scenario, the systems in the environment are not only distributed globally across various physical locations, but also vary in functionality, type, and communication protocol. Furthermore, an enterprise-class system is expected to concurrently serve or interact with a large number of systems.

For example, CA's Identity Manager [2, 6] is an enterprise-class software system used to coordinate and manage the many thousands of user accounts and computational resources present in a large organization. Identity Manager is capable of managing heterogeneous enterprise environments and thus can apply access control policies to a wide range of resources, such as Lightweight Directory Access Protocol (LDAP) directories [22], email servers, Windows NT and Unix machines, etc. In practice, an Identity Manager system may need to manage and control up to 10,000 computer systems in an organization.

Quality assurance is a major concern for enterprise software systems [5]. However, the complex operating environment of an enterprise system makes it difficult to examine and test its quality attributes, such as performance and scalability, before the system is actually deployed in its target environment. Physically provisioning the production environment for testing purposes is generally impractical because of both the large number of systems involved and the geographical distribution of these systems. Furthermore, the diversity in the types and configurations of systems in the environment brings another layer of complexity to testing. Over the years, a number of approaches have been proposed to provide executable, interactive representations of deployment environments (cf. Section 2). However, all of these approaches fall short in providing software engineers with a flexible tool-set for examining and testing software systems' quality attributes.

To address these shortcomings, we propose a novel Coloured Petri nets (CPN) [14] based system behaviour emulation approach and framework for provisioning a quality testing environment for enterprise software systems. As illustrated in Figure 1, we use a single virtual testing environment (VTE) to provision a system-under-test (SUT). A system residing in the testing environment that the SUT needs to interact with (in the context of this paper, we call such a system an endpoint system, or EP) is replaced by a virtual endpoint system comprising a dedicated CPN model that emulates the behaviour of a real endpoint system, and an emulation node that executes this model.

Figure 1: Conceptual model of the virtual testing environment.

The main benefits of our emulation-based virtual testing environment include: (i) The heterogeneity of enterprise software environments is accommodated by providing different behaviour models that emulate the behaviour of different endpoint systems and executing these models in generic emulation nodes. A behaviour model can either be specified according to the functionality of the EP and/or the communication protocol between the SUT and the EP, or be derived from interaction traces between the SUT and the EP. (ii) The approach is expected to scale to the number of endpoint systems typically found in large-scale enterprise system environments. On the one hand, an emulation node has the potential to emulate the behaviour of a large set of EPs that have the same functionality (e.g., up to 10,000 LDAP servers) instead of just one. On the other hand, the behaviour model can be a simplification of the real EP, as long as it is consistent with a given testing scenario. This means that a VTE running on a physical machine may use far fewer resources than tens of thousands of real EPs running on physical or virtual machines. (iii) The VTE provides a closed testing environment for the SUT, because the physical unavailability or inaccessibility of real EPs is avoided by behaviour emulation.
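To make the conceptual model of Figure 1 concrete, the following minimal Java sketch pairs a behaviour model with a virtual endpoint of the kind an emulation node would execute. The names (BehaviourModel, VirtualEndpoint) and the trivial echo model are illustrative assumptions only, not the VTE's actual API.

```java
// Hypothetical sketch of the Figure 1 concept, not the VTE's actual API:
// a virtual endpoint pairs a behaviour model with an identity of its own,
// and one model instance can back many virtual endpoints.
import java.util.ArrayList;
import java.util.List;

interface BehaviourModel {
    // Compute the response the emulated endpoint sends for a request.
    byte[] onRequest(byte[] request);
}

final class VirtualEndpoint {
    private final String id;
    private final BehaviourModel model;

    VirtualEndpoint(String id, BehaviourModel model) {
        this.id = id;
        this.model = model;
    }

    String id() { return id; }

    byte[] handle(byte[] request) { return model.onRequest(request); }
}

public class VteConcept {
    public static void main(String[] args) {
        BehaviourModel echo = request -> request; // trivial stand-in model
        List<VirtualEndpoint> endpoints = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            endpoints.add(new VirtualEndpoint("ep-" + i, echo));
        }
        byte[] reply = endpoints.get(0).handle("ping".getBytes());
        System.out.println(endpoints.get(0).id() + " replied: " + new String(reply));
    }
}
```

In the real VTE the model is a CPN executed by CPN Tools rather than a Java object, but the division of labour is the same: the model defines behaviour, and the node executes it.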
In this paper, we illustrate our approach to systematically building a CPN emulation component for the VTE, from behaviour modelling and model pruning through operation modelling to model integration. Although CPN has been widely used in distributed systems modelling and analysis [14], to the best of our knowledge this is the first effort to apply CPN to the emulation of endpoint systems for the purpose of deployment testing. A case study in building a CPN emulation model for LDAP servers has been conducted in order to demonstrate the efficiency and scalability of this approach.

The rest of the paper is organized as follows: in Section 2, we discuss related work, followed by an introduction to the architecture and implementation of the virtual testing environment in Section 3. Section 4 introduces the CPN-based approach to endpoint system emulation, using LDAP server emulation as an illustrating example. In Sections 5 and 6, we present and interpret the results of our scalability experiments. Finally, Section 7 concludes the paper with a summary of the main observations and a discussion of future work.

2. RELATED WORK
Testing distributed systems is a complex problem [28]. For example, Ghosh and Mathur raise nine issues in testing distributed components, pointing out that "components that interact with a heterogeneous environment can be more easily tested by utilizing an emulation environment capable of representing a range of different components, and scalability and performance testing for components can be enabled through using scalable models." [7]

To provision a testing environment for a distributed system or component, physical replication is clearly the most primitive approach; in many cases it cannot handle the heterogeneity and scale of a testing environment well. Hardware virtualization tools such as VMWare [23] and VirtualBox [26] certainly provide better management and control over virtual testing servers, and they are capable of hosting heterogeneous endpoint systems in the testing environment. But this approach has two major limitations: (i) for a large-scale environment, provisioning the whole environment through hardware virtualization alone is costly; (ii) it is not possible to provision an endpoint system that is not available to be hosted on a virtual machine (e.g., one hosted in another organization).

Specifically designed for evaluating the performance and scalability of server software, load generation tools such as HP's LoadRunner [12], the SLAMD Distributed Load Generation Engine [24] and the Apache Software Foundation's JMeter [1] are capable of representing many thousands of concurrent clients transmitting requests to the server under test. This approach is well suited to performance diagnosis of the system-under-test and to identifying the critical bottlenecks in the system. One of the limitations of load generation tools is that they are mainly used to generate scalable, autonomous client load against a reactive server system; they are not able to represent the complex interactions between the system-under-test and the diverse endpoint systems.
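To make the one-way character of this approach explicit, below is a minimal sketch of autonomous client load generation using only the JDK (it is not LoadRunner, SLAMD, or JMeter code); the host name, port, payload, and thread counts are illustrative placeholders.

```java
// A minimal load-generation sketch: many autonomous clients transmit
// requests at a server under test. Note the one-way style: the clients
// drive load but do not emulate reactive endpoint behaviour themselves.
import java.io.OutputStream;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LoadSketch {
    public static void main(String[] args) throws InterruptedException {
        final int clients = 1000;   // number of emulated concurrent clients
        ExecutorService pool = Executors.newFixedThreadPool(64);
        for (int i = 0; i < clients; i++) {
            pool.submit(() -> {
                try (Socket s = new Socket("server-under-test", 8080)) {
                    OutputStream out = s.getOutputStream();
                    out.write("REQUEST\n".getBytes());
                    out.flush();
                } catch (Exception e) {
                    // a real tool would count this as a failed request
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```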
Method stubs and mock objects [4, 8] have been used to programmatically emulate the behaviour of remote endpoint systems. With the rise of the test-driven development paradigm, a number of language-specific mocking frameworks have become available, such as Mockito [18] for Java, RSpec [21] for Ruby, and Mockery [17] for PHP. The main limitations of the stub/mock approach include: (i) the testing framework is usually language-specific and thus not suitable for provisioning a generic testing environment; (ii) the behaviour of a stub/mock is programmed in an ad hoc fashion, as the sketch below illustrates. It can be challenging to configure the behaviour of a stub or mock at a high level, and any change to the test plan may lead to changing the code of the stub/mock.
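As an illustration of limitation (ii), the following hedged sketch uses Mockito [18] to program an endpoint's behaviour directly in Java; the LdapEndpoint interface and its methods are hypothetical application code, not part of Mockito.

```java
// Stub/mock style endpoint emulation with Mockito: behaviour is coded
// case by case, so any change to the test plan means changing this code.
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

interface LdapEndpoint {                 // hypothetical application interface
    boolean bind(String dn, String password);
    String search(String filter);
}

public class MockExample {
    public static void main(String[] args) {
        LdapEndpoint ep = mock(LdapEndpoint.class);
        when(ep.bind("cn=admin,dc=example,dc=com", "secret")).thenReturn(true);
        when(ep.search("(uid=jdoe)")).thenReturn("cn=John Doe");

        System.out.println(ep.bind("cn=admin,dc=example,dc=com", "secret")); // true
        System.out.println(ep.search("(uid=jdoe)"));                         // cn=John Doe
    }
}
```

Being ordinary Java, such a mock is tied to one language and one test; there is no high-level configuration that could be reused across heterogeneous endpoints.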
In Table 1, the features of the different approaches discussed above are summarized from the perspective of provisioning a distributed deployment testing environment. The following features are considered in the table:

• Scalability. The number of systems that can be represented using a single physical host.
• Dealing with Heterogeneity. The ability to provision diverse endpoint systems.
• Behaviour Abstraction. The underlying abstraction model on which the interactions are based.
• Interaction Mode. Whether the approach can handle the two-way interactions between a client and a server.
• Configurability. The ability to provide changeable parameters to the user so that certain characteristics of the approach can be easily changed.

Table 1: Capability comparison of different approaches.

Approach                      | Scalability | Dealing with Heterogeneity | Behaviour Abstraction | Interaction Mode | Configurability
Physical Replication          | None        | Weak                       | None                  | Two-Way          | Weak
Hardware Virtualization       | Weak        | Good                       | Machine               | Two-Way          | Good
Load Generation Tools         | Good        | Weak                       | User                  | One-Way          | Good
Stubs and Mocks               | Good        | Weak                       | Interface             | Two-Way          | Weak
Virtual Emulation Environment | Good        | Good                       | Endpoint system       | Two-Way          | Good

A more detailed discussion of the limitations of existing approaches to enabling enterprise software system analysis can be found in [11].

From the perspective of system behaviour emulation, the emulation logic can be expressed directly in a given host programming language. For example, in our previous work [9, 10], the Haskell programming language [16] was used to approximate the behaviour of LDAP server endpoints. In contrast to directly using a programming language, CPN has the following benefits in specifying the emulation logic: (i) CPN has a graphical representation, leading to easier specification. (ii) The true concurrency property of Petri nets makes it suitable for directly modelling and emulating the behaviour of distributed systems [14]. (iii) CPN is a formal modelling language, which makes it possible to directly verify correctness properties of the behaviour model, such as deadlock freedom and boundedness; it is also possible to model check the behaviour model against a formal specification of the emulated system.

A potential weakness of using CPN in the emulation environment is that models defined using CPN may not execute as efficiently as endpoints directly encoded in a programming language. However, the results of a case study in emulating LDAP endpoint systems, presented in Section 5, demonstrate that the run-time performance of the CPN-based approach is sufficient for simultaneously emulating the behaviour of up to 10,000 LDAP endpoint systems.

3. ARCHITECTURE OF THE VIRTUAL TESTING ENVIRONMENT
As a proof of concept, we have implemented a prototype of a virtual testing environment (VTE). In this section, we briefly introduce the main elements of the underlying architecture. The interested reader is referred to [11] for further details.

The architecture of the virtual testing environment is illustrated in Figure 2. The VTE has four main components: (i) the network interface component for encoding and decoding native communication, (ii) the engine for executing the emulated behaviour of endpoint systems, (iii) the monitor component for observing and publishing the behaviour of the nodes emulated by the engine, and (iv) the configuration component for managing configurations. In the following, we discuss the important aspects of each of these components in more detail.

Figure 2: Architecture of the Virtual Testing Environment.

Network Interface
The heterogeneous nature of enterprise software environments means that messages exchanged on the communication channels linking nodes may be packaged and encoded on a native channel in numerous different ways. LDAP, for example, encodes its messages using the basic encoding rules (BER) of ASN.1, with some restrictions. The network interface module of an emulator allows the encoding and decoding concerns of native communication to be isolated from the models used by the engine to emulate node behaviour.

The network interface consists of four entities. The native service allows external nodes to establish new channels with the emulator. The native conduits are responsible for facilitating the interaction between external nodes and the emulator's engine along previously established communication channels. The interface scheduler is responsible for scheduling when the various services and conduits get an opportunity to perform some activity. The interface monitor is responsible for observing and reporting the behaviour of the interface.
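As an illustration of the decoding concern that the network interface isolates, the following minimal Java sketch reads a single BER tag-length header of the kind that frames every LDAP message. It is an explanatory sketch, not the VTE's actual decoder, and error handling is omitted for brevity.

```java
// Reading one ASN.1 BER tag-length header (as used to frame LDAP messages).
public class BerHeader {
    // Returns {tag, length, headerSize} for the TLV starting at offset.
    static int[] read(byte[] buf, int offset) {
        int tag = buf[offset] & 0xFF;
        int first = buf[offset + 1] & 0xFF;
        if ((first & 0x80) == 0) {            // short form: length fits in one byte
            return new int[] { tag, first, 2 };
        }
        int numOctets = first & 0x7F;         // long form: next n octets hold the length
        int length = 0;
        for (int i = 0; i < numOctets; i++) {
            length = (length << 8) | (buf[offset + 2 + i] & 0xFF);
        }
        return new int[] { tag, length, 2 + numOctets };
    }

    public static void main(String[] args) {
        // 0x30 = SEQUENCE (every LDAPMessage starts with one);
        // 0x82 0x01 0x00 encodes the length 256 in long form.
        byte[] framed = { 0x30, (byte) 0x82, 0x01, 0x00 };
        int[] h = read(framed, 0);
        System.out.printf("tag=0x%02X length=%d header=%d bytes%n", h[0], h[1], h[2]);
    }
}
```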
Engine
The engine hosts the Virtual Endpoint Systems, the Node Scheduler, and the Engine Monitor. Every Virtual Endpoint System has an emulation node for executing the actual emulation logic specified by the behaviour models. Services and channels are responsible for interacting with the network interface. The Node Scheduler selects and triggers the execution of emulated nodes at particular points in time, and the Engine Monitor observes and publishes the behaviour of the emulation nodes.
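One plausible shape of this node scheduling is sketched below; EmulationNode and RoundRobinScheduler are hypothetical illustrations rather than the VTE's actual classes, and a simple round-robin policy stands in for whatever policy the Node Scheduler actually applies.

```java
// Hypothetical sketch of engine-side node scheduling: each emulation node
// is given a turn to perform one unit of emulation work.
import java.util.ArrayDeque;
import java.util.Deque;

interface EmulationNode {
    boolean step();   // perform one unit of work; returns false when idle
}

final class RoundRobinScheduler {
    private final Deque<EmulationNode> nodes = new ArrayDeque<>();

    void register(EmulationNode node) { nodes.addLast(node); }

    // Trigger every registered node once, rotating through the queue.
    void runOnce() {
        for (int i = 0, n = nodes.size(); i < n; i++) {
            EmulationNode node = nodes.removeFirst();
            node.step();
            nodes.addLast(node);
        }
    }
}

public class SchedulerSketch {
    public static void main(String[] args) {
        RoundRobinScheduler scheduler = new RoundRobinScheduler();
        scheduler.register(() -> { System.out.println("node A stepped"); return true; });
        scheduler.register(() -> { System.out.println("node B stepped"); return true; });
        scheduler.runOnce();
    }
}
```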
Monitoring and Configuration
The monitoring component is responsible for processing the data published by the different internal monitors of the emulator, including the interface and engine monitors. The emulator monitor registry listens to the information published by the other monitors and stores and organizes it. Other modules interested in emulation data can then extract the specific data they are interested in from this registry for further processing and use. The visualization component extracts the data required to visualize an emulation from the registry and uses these data to produce useful visualizations both during and after an emulation. The logging component extracts the information required for logging an emulation. This logged information can be used for post-emulation analysis, helping to pinpoint the root cause of an observed failure in an external software system under test. Finally, the configuration component is responsible for managing the configuration of an emulator. This includes specifying which native services are going to be provided and how they map to specific engine services. The configuration module also provides a means to select and refine specific node models for an enterprise software tester's purposes.

4. EMULATING ENDPOINT SYSTEMS WITH PETRI NETS
In this section, we discuss our CPN-based approach to modelling and emulating the behaviour of endpoint systems.

Petri nets are a formalism suitable for modelling the behaviour of systems characterized by concurrency, communication and synchronization, such as network protocols, distributed systems, and workflow systems [19, 20]. Coloured Petri nets (CPN) are a backwards-compatible extension of Petri nets and are often viewed as a graphical language for constructing executable models of concurrent software systems [13]. Coloured Petri nets combine Petri nets with the functional programming language CPN ML. CPN Tools [14] is an environment for editing, executing, and verifying properties of Coloured Petri nets, and it has been widely used in both academia and industry.

There are several benefits in using CPN Tools as the behaviour modelling and emulation environment: (i) the concurrency feature of CPNs makes it possible to model the behaviour of a set of endpoint systems that have the same functionality using a number of individual tokens running on just one Petri net structure; this may contribute significantly to the scalability of the VTE. (ii) With CPN ML as a full-fledged programming language, we are able to model the detailed message exchange between the SUT and the VTE. (iii) CPN Tools can execute CPN models efficiently, as we will further demonstrate in Section 5. Last but not least, the graphical features and verification capabilities of CPN Tools facilitate both the design and the verification of the CPN-based emulation models.

In general, our approach to emulating a specific endpoint system can be divided into four steps: (i) use CPNs to describe the states and operations/transitions of endpoints to obtain a high-level behaviour model, (ii) prune the high-level behaviour model to obtain the subset of the functionality that is pertinent to the testing, (iii) model the behaviour of each operation, and (iv) connect the model to auxiliary I/O models (also in CPN) responsible for interacting with the VTE, and then deploy it in the VTE as a behaviour model for the engine component.

In the following, we use an industry-motivated scenario to explain the details of these four steps. The scenario is to emulate the behaviour of a variable number n (up to 10,000) of LDAP servers [22]. It is derived from the requirements of CA's quality assurance team for the purpose of testing Identity Manager. As shown in Figure 3, we want to emulate the behaviour of n LDAP servers with one node in the engine of the virtual testing environment, where the behaviour of this node is specified by a single CPN model.

Figure 3: Emulating the behaviour of n LDAP server endpoints.

4.1 High Level Behaviour Modelling
In general, a system model comprises two types of elements: elements that represent state and elements that represent change. For example, programming languages use variables to represent state and assignment statements to represent change [29]. Petri nets use places (graphically represented as circles) to represent state and transitions (graphically represented as rectangles) to represent change. A high-level behaviour model of a system can be constructed by identifying the key operations of the system and then specifying the state changes caused by the execution of these operations.

The Lightweight Directory Access Protocol (LDAP) [22] is a communication protocol widely used in enterprise environments, in particular to manage access to the computational resources of an enterprise. In general, an LDAP server is a communication end-point which allows a client to search and modify a persistent directory (tree) structure hosted on that server.

A CPN model of LDAP that captures the key LDAP operations¹ used by Identity Manager, and the state changes related to these operations, is shown in Figure 4. A regular interaction generally begins with the client establishing a connection to an LDAP server and transmitting a bind request with some authentication details. The LDAP server then issues a bind response to the client which indicates either the success or failure of the corresponding bind operation.

¹In order to enhance the presentation of the modelling steps, we have omitted some of the "administrative" LDAP operations from the model.

Figure 4: A high level Petri net model of LDAP.

Once bound, a client can issue a number of different requests to search and/or modify the LDAP server with add or delete operations. The majority of the requests a client can make result in just a single response from the server, indicating either success or failure of the request. A search request, on the other hand, can result in zero or more search result entries which match the search criteria received by the server. After zero or more result entries have been transmitted, the completion of the search is indicated by the transmission of a search result done message. Finally, an LDAP session is usually closed by the client issuing an unbind request.

It is worth noting that (i) because behaviour modelling is done manually against application-specific requirements, modelling the same protocol may well yield different models depending on the situation, and (ii) properties of the created models, such as liveness and boundedness, can be verified with CPN Tools [14].
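For concreteness, the operation sequence captured by the model corresponds to the following client-side view, sketched here with the standard JNDI LDAP provider; the server address, credentials, and directory entries are placeholders.

```java
// The bind -> add -> search -> delete -> unbind sequence as an LDAP client
// performs it via JNDI (placeholder host, credentials, and entries).
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;

public class LdapClientSketch {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");
        env.put(Context.SECURITY_PRINCIPAL, "cn=admin,dc=example,dc=com");
        env.put(Context.SECURITY_CREDENTIALS, "secret");

        DirContext ctx = new InitialDirContext(env);               // bind request/response

        Attributes attrs = new BasicAttributes("cn", "jdoe", true);
        ctx.createSubcontext("cn=jdoe,dc=example,dc=com", attrs);  // add request/response

        ctx.search("dc=example,dc=com", "(cn=jdoe)",
                   new SearchControls());                          // search entries + done

        ctx.destroySubcontext("cn=jdoe,dc=example,dc=com");        // delete request/response
        ctx.close();                                               // unbind
    }
}
```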
There- fore, in order to enhance readability of this section, we define the pruning action based on the net structure definition of Petri nets and not given the full definition of Coloured Petri nets. Definition 1 (Petri net Structure). A Petri net struc- ture N = (S, T ;F ) is a 3-tuple where S ∩ T = ∅; S ∪ T %= ∅; F ⊆ S × T ∪ T × S; dom(F ) ∪ cod(F ) = S ∪ T , where 105 ee e e eee e e e e e e e e e e e e e e e e e e e e t3 search t2t1 unbind deleteadd bind More Entries UNIT Search Failed UNIT Not Found UNIT Found UNIT Delete Failed UNIT Delete Success UNIT Add Failed UNIT Add Success UNIT Authentication Failed UNIT Bound UNIT Unbound () UNIT Figure 4: A high level Petri net model of LDAP. e e e e e e e ee e e e e e e e t2t1 unbind deleteadd bind UNIT Delete Success UNIT Add Success UNIT Bound UNIT Unbound () UNIT t3 Found search Figure 5: Pruned LDAP model. dom(F ) = {x | ∃y : (x, y) ∈ F}, cod(F ) = {y | ∃x : (x, y) ∈ F}. Definition 2 (Pruning). Pruning a Petri net structure N1 = (S1, T1;F1) by removing a set of elements X ⊂ S1∪T1 from N1 produces another net N2 = (S2, T2;F2) where S2 = S1−X, T2 = T1−X, and F2 = F1− ((S2×T1)∪ (T1×S1)). We call N2 a subnet of N1. Because a Petri net is a state transition system, its be- haviour can be defined by its reachable marking set: Definition 3 (Reachable Marking Set). For a Petri net structure N = (S, T ;F ), its reachable marking set [M0〉 is the smallest set that satisfies the following two conditions: • M0 ∈ [M0〉; and • if M ′ ∈ [M0〉, t ∈ T,M ′[t〉M , then M ∈ [M0〉. M : S -→ V is denoted as the marking function from S to a multiset V , M0 as the initial marking of N , and “〉” as the corresponding firing function. Behavioural consistency is defined based on the coverabil- ity between two Petri net structures [19]: Definition 4 (Behavioural Consistency). For two Petri net structures N1 and N2, N1 is behavioural consis- tent with N2 iff: • MN10 ≤MN20 ; and • ∀Mi ∈ [MN10 〉, there is a Mj ∈ [MN20 〉 where Mi ≤Mj . According to Definition 4, if N1 is behaviourally consis- tent with N2, then any of the markings of N1 is covered value n-1n (key, value) key key (key, value) (key, value) n n-1 (key, value) (key, value) n+1n x x Clear Remove Contains Add Found Fusion Found VALUE Remove Parameter Fusion DelParam KEY Removed KEYxVALUE Size 0 INT Contains Parameter Fusion SearchParam KEY Container KEYxVALUE Add Parameter Fusion AddParam KEYxVALUE (a) The generic collection model sid (sid, ADD, s2) (s, value) sid (sid, msg) Add input (msg, sid); output (s2, s, value); action let val x = extractKeyValue(msg); val key = sid^"+"^(#1 x); val y = extractMsgID (msg); val y = genAddResponse (y); in (y, key, #2 x) end; Response Fusion Response RESPONSE KeyValue Fusion AddParam KEYxVALUE Add Request Fusion AddReq SIDxMSG Binded Fusion Binded SID (b) The Add operation model Figure 6: The generic collection model. by a marking of N2. It is worth noting that behavioural consistency can be checked using CPN Tools. Usually, pruning can be done by removing the alternative flows of a function whilst keeping the main flow. A similar technique is also used in defining the main sequence flow of a use case in requirements engineering [3]. For example, in the LDAP Petri net model in Figure 4, we have grayed the states that belong to exceptions and alternative flows. If we prune these states, we get a subnet as shown in Figure 5. This subnet is behaviourally consistent with its parent net if the initial marking of the parent net only includes a number of units in the unbound place. 
4.3 Operation modelling
After a subnet has been derived by (possibly) pruning the initial high-level model, the next step is to define the detailed CPN model for each operation in the subnet. Because the subnet only keeps the main function flow that is pertinent to the specific testing scenarios (as illustrated in Figure 5), the detailed modelling of the operations can also be simplified. For LDAP, we have removed all exception handling functionality, as well as the multiple entry results of the search operation. Therefore, for any keyword-based query, only a single entry is returned on a match. We use a generic key-value map data structure to emulate the modification and search operations. Figures 6(a) and 6(b) show the Petri net models for the generic map data structure and the add operation, respectively. In Figure 6(a), the Container place is used to store (key, value) pair entries, and the Contains and Remove transitions are used to implement keyword-based entry search and deletion, respectively. In Figure 6(b), when a new LDAP add request message arrives, the Add transition checks whether the endpoint (identified by sid) has been bound; if so, the new entry is added to the container and the corresponding response message is generated.

Figure 6: (a) The generic collection model; (b) the Add operation model.

Besides these two models, we have also defined CPN models for the LDAP operations bind, unbind, delete, and search.² For modularity purposes, we use fusion sets [14] to interface between Petri net models.

²The complete CPN package can be downloaded at: http://www.ict.swin.edu.au/personal/jianyu/cpn/Ldap.cpn
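The behaviour that the generic collection model encodes with tokens and transitions is essentially that of a keyed map. As an explanatory analogue (not part of the implementation), the following Java sketch shows the same add/contains/remove semantics, including the session-scoped key prefix used in the Add operation model.

```java
// Plain-Java analogue of the generic collection model: a keyed map with
// add, contains, and remove, mirroring the Container place and the Add,
// Contains, and Remove transitions of Figure 6(a).
import java.util.HashMap;
import java.util.Map;

public class CollectionSketch {
    private final Map<String, String> container = new HashMap<>();

    void add(String key, String value)   { container.put(key, value); }         // Add
    boolean contains(String key)         { return container.containsKey(key); } // Contains
    void remove(String key)              { container.remove(key); }             // Remove
    int size()                           { return container.size(); }           // Size place

    public static void main(String[] args) {
        CollectionSketch c = new CollectionSketch();
        // Keys are namespaced by the session id, as in the Add operation model.
        c.add("sid42+uid=jdoe", "cn=John Doe");
        System.out.println(c.contains("sid42+uid=jdoe"));  // true: single entry on match
        c.remove("sid42+uid=jdoe");
        System.out.println(c.size());                      // 0
    }
}
```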
4.4 Model integration
After the complete Petri net-based emulation model of an endpoint has been constructed, we need to integrate this model into the VTE. Because the CPN models are executed by the CPN Tools Simulator, and CPN Tools does not support direct interaction with its simulator, we incorporate Access/CPN [27] as middleware between the VTE engine and the CPN Tools Simulator. Access/CPN is a framework that allows external applications to integrate CPN models and to interact with the CPN Tools Simulator. It provides both Standard ML and Java interfaces. In our case, the Java interfaces of Access/CPN have been used to interact with the CPN Tools Simulator where the CPN models are executed.

As shown in Figure 7, the channel component in the engine is responsible for communicating with the Access/CPN component, and the Access/CPN component communicates with the CPN Tools Simulator. We use the session identifier to uniquely identify an endpoint, and the session identifier plus the message identifier to uniquely identify a message in the CPN emulation model. In addition to the core emulation logic model, we also have two supporting models: a Request Reader for parsing incoming messages and distributing them to the appropriate operation model, and a Response Writer for generating the correct outgoing messages.

Figure 7: Integrating the CPN emulation models.

The details of the Request Reader CPN model for LDAP are shown in Figure 8, where LDAP request messages are distributed to the corresponding operation models based on their operation type.

Figure 8: The Request Reader CPN model.

In order to verify that the CPN model introduced in this section is behaviourally consistent with the LDAP specification, we performed both unit and integration tests. Unit testing was conducted by triggering the Read Request transition to read an actual LDAP request message that was generated by Identity Manager, and then checking whether the model generated a correct response message. The request message types covered include bind, unbind, add, delete, and search. Integration testing was conducted by sending a sequence of request messages to the model and checking whether all response messages were generated correctly in the corresponding sequence, and whether the model was in a consistent state after executing the sequence.

5. EVALUATION
In order to quantify the scalability of CPN-based endpoint models, we performed an initial in-lab performance study of the CPN-based LDAP emulation node (CPN-LDAP). All experiments were conducted on a Windows XP system with an Intel Core2 Duo CPU at 3.00 GHz and 3 GB of memory.

The purpose of the experiments was to collect data on how fast CPN-LDAP, running on a specific hardware configuration, can respond when emulating a large number of endpoint systems. The results can be used to pinpoint the threshold number of endpoint systems that a single physical machine can hold when running CPN-LDAP; if this threshold is less than the required number of endpoints, more than one physical machine may be provisioned to run multiple instances of the VTE. It is worth noting that the data collected are based on the response times of the CPN emulation engine, not of the whole VTE.

Two scenarios were devised as the sequence of messages exchanged between the system under test (Identity Manager) and each virtual endpoint:
1. Bind→Add→Search→Delete→Unbind
2. Bind→Add*3→Search→Delete→Add→Search*2→Delete*3→Unbind

The first sequence is a simple iteration over the core LDAP operations, while the second sequence reflects the message interaction patterns we observed when Identity Manager manages an LDAP server endpoint. The system under test uses n channels to interact with CPN-LDAP simultaneously. That is, if CPN-LDAP is emulating n endpoint systems, then a maximum of n requests may be received by CPN-LDAP at a time.³ In order to measure the actual response time of CPN-LDAP without including delays on the system-under-test side, the request for the next operation is sent right after the response to the current request is received.

³This is the worst-case scenario because of the simultaneous arrival of messages. We have also tested a scenario in which the n requests were issued sequentially, which resulted in marginally better performance.
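The measurement discipline described above can be summarized by the following Java sketch: each emulated endpoint is driven over its own channel, the next request is issued only once the current response has arrived, and scenario times are collected across n concurrent channels. The Channel interface and the stub endpoint are placeholders, not our actual harness.

```java
// Sketch of the timing discipline: per channel, requests are strictly
// sequential (next request only after the current response); n channels
// run concurrently, as in the worst-case simultaneous-arrival setup.
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TimingSketch {
    interface Channel { String call(String request) throws Exception; }

    static long runScenarioMillis(Channel ch, String[] ops) throws Exception {
        long start = System.nanoTime();
        for (String op : ops) {
            ch.call(op);                  // block until the response arrives
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        String[] scenario1 = { "Bind", "Add", "Search", "Delete", "Unbind" };
        int n = 100;                      // emulated endpoints, one channel each
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CompletionService<Long> done = new ExecutorCompletionService<>(pool);
        for (int i = 0; i < n; i++) {
            done.submit(() -> runScenarioMillis(req -> req /* stub endpoint */, scenario1));
        }
        long longest = 0;
        for (int i = 0; i < n; i++) {
            longest = Math.max(longest, done.take().get());
        }
        System.out.println("longest scenario time: " + longest + " ms");
        pool.shutdown();
    }
}
```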
The results of the experiments are given in Figure 9. Figure 9(a) shows the total execution time for a given number of endpoint systems (i.e., the time to execute all steps of the given scenarios for all endpoint systems), whilst Figure 9(b) shows the corresponding average execution times.

Figure 9: Performance of the CPN-LDAP engine with regard to the number of endpoints: (a) total time; (b) average time per endpoint.

The results reveal that the execution time of CPN-LDAP with regard to the number of endpoint systems is not linear for either scenario. Further experiments are needed, though, to identify whether the execution times increase exponentially (as Figure 9(a) may suggest) and what specifically causes this non-linear increase.

For the first scenario, CPN-LDAP used 52 seconds to execute the scenario for 4,000 endpoints (on average approx. 13 ms per endpoint), whilst it took 325 seconds to execute the scenario for 10,000 endpoint systems (on average approx. 32.5 ms per endpoint). For the second, more realistic scenario, CPN-LDAP used 140 seconds to execute the scenario for 4,000 endpoints (on average approx. 35 ms per endpoint), whilst 2,115 seconds were needed to execute the scenario for 10,000 endpoints (on average 211.5 ms per endpoint). Discussions with quality assurance engineers at CA revealed that this performance is considered acceptable in an industrial setting, but further experiments will be needed to get a better understanding of the run-time performance and scalability of endpoint models.

Based on the encouraging results of the CPN-LDAP performance experiments, we used the VTE to validate the scalability of CA's Identity Manager (IM) in managing up to 10,000 LDAP endpoints. IM was scripted to acquire each endpoint, explore the endpoint, add a new user at the endpoint, and modify an existing user, very similar (but not identical) to the second scenario used for the CPN-LDAP performance evaluation. The memory usage, total CPU usage and response time of IM were recorded during the experiments (for further details of the IM scalability experiment, refer to [25]). The experiment confirmed that IM is able to satisfactorily scale to manage this number of endpoints. It also confirmed that a VTE can indeed scale up to 10,000 endpoint systems and be effectively used for quality assurance purposes in an industrial setting.

6. DISCUSSION
Using an emulation-based approach to provision a virtual deployment testing environment for distributed enterprise software systems brings several benefits. Firstly, the diverse physical endpoint systems are abstracted as virtual endpoint software components in the VTE, which can potentially save both resources and effort in implementing and maintaining a testing environment. Secondly, the emulation-based approach is inherently scalable, considering that a virtual endpoint is an abstraction (or more specifically, a simplification) of a real endpoint system, and a physical machine running the VTE may be able to host a large number of virtual endpoints. Thirdly, the heterogeneity of endpoint systems can be accommodated by providing a dedicated behaviour model for each type of system in an environment. Finally, physically unavailable or inaccessible endpoint systems can also be emulated as virtual endpoints in the VTE. We are currently exploring scenarios with heterogeneous and/or temporarily unavailable endpoints as part of our further work.

An emulation-based approach also has its limitations. Firstly, extra effort is needed to implement/model a virtual endpoint, either manually or automatically (e.g., based on recordings of network traces). Secondly, extra effort may be needed to test or verify the correctness of a virtual endpoint model itself, considering that, unlike most real endpoint systems, it is not a mature commercial product.
As to our current implementation of the CPN-based approach to modelling the behaviour of endpoint systems, besides the benefits of CPN's graphical notation, inherent concurrency, and formality, it also has some limitations. Firstly, the current implementation relies on the simulator component of the CPN Tools software to execute the endpoint model, and each simulator can only run a single CPN model at a time, which means we need to run one simulator instance for each CPN model in the VTE. Secondly, only the Microsoft Windows version of the CPN Tools software is updated and maintained, which may restrict the platforms on which the VTE can run.

Finally, the architecture of the VTE has been designed to be scalable itself. For example, if the run-time performance of a given endpoint model does not allow for sufficient scale on a single host, then with little extra effort the VTE can be deployed on multiple physical hosts, enabling the emulation of an increased number of endpoint systems.

7. CONCLUSIONS AND FUTURE WORK
In this paper, we have presented a Coloured Petri net-based virtual testing environment for effectively provisioning the complex testing environment of an enterprise software system. A CPN endpoint system behaviour emulation approach has been developed and successfully applied to provisioning the scalability testing environment of an industrial-scale identity management suite. The proposed approach is capable of emulating the behaviour of endpoint systems at a desired level of abstraction with acceptable performance, and a model pruning technique is proposed to reduce the complexity of the emulation models.

We are currently working on the virtual testing environment along several directions, including evaluating the effort required to create an emulation model, (semi-)automatically generating behaviour models from interaction sequence traces in order to alleviate the effort of manually creating an emulation model, deriving formal temporal logic specifications from the protocol specification and then model checking the consistency between the emulation model and the specification, and optimizing the performance of the CPN emulation component. We are also exploring the applicability of the approach to additional enterprise-scale protocols and systems.

Acknowledgments
We would like to thank the Australian Research Council, CA Technologies and in particular CA Labs for their ongoing support of, and contributions to, this research. We would also like to thank Dr. Michael Westergaard of Eindhoven University of Technology for sharing the Access/CPN source code and for his help in setting up the Access/CPN development environment in Eclipse.
8. REFERENCES
[1] Apache Software Foundation. Apache JMeter. http://jakarta.apache.org/jmeter, 2011.
[2] CA. CA Identity Manager One Hundred Million User Test: Results and Analysis. http://www.ca.com/us/collateral/white-papers/, 2011.
[3] A. Cockburn. Writing Effective Use Cases. Addison-Wesley, 2000.
[4] S. Freeman, T. Mackinnon, N. Pryce, and J. Walnes. Mock Roles, not Objects. In Companion to the 19th Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 236–246, 2004.
[5] J. Gao, H.-S. Tsao, and Y. Wu. Testing and Quality Assurance for Component-Based Software. Artech House, 2003.
[6] M. Gardiner. CA Identity Manager. White Paper on CA Identity Manager, 2006.
[7] S. Ghosh and A. P. Mathur. Issues in Testing Distributed Component-Based Systems. In Proceedings of the 1st International ICSE Workshop on Testing Distributed Component-Based Systems, 1999.
[8] P. Gibbons. A Stub Generator for Multilanguage RPC in Heterogeneous Environments. IEEE Transactions on Software Engineering, 13(1):77–87, 1987.
[9] C. Hine, J.-G. Schneider, J. Han, and S. Versteeg. Scalable Emulation of Enterprise Systems. In Proceedings of the 20th Australian Software Engineering Conference, pages 142–151, Gold Coast, Australia, 2009.
[10] C. Hine, J.-G. Schneider, J. Han, and S. Versteeg. Modelling Enterprise System Protocols and Trace Conformance. In Proceedings of the 21st Australian Software Engineering Conference, pages 35–44, Auckland, New Zealand, 2010.
[11] C. Hine. Emulation of Enterprise Software Environments. PhD thesis, Swinburne University of Technology, 2012. Under examination; available at http://quoll.ict.swin.edu.au/doc/chine-phd-thesis-submission.pdf.
[12] HP. HP LoadRunner Software Data Sheet. www8.hp.com/us/en/software/software-product.html, 2007.
[13] K. Jensen. Coloured Petri Nets. Basic Concepts, Analysis Methods and Practical Use. Volume 2: Analysis Methods. Springer, 1994.
[14] K. Jensen, L. M. Kristensen, and L. Wells. Coloured Petri Nets and CPN Tools for Modelling and Validation of Concurrent Systems. International Journal on Software Tools for Technology Transfer, 9(3-4):213–254, 2007.
[15] K. Jensen and L. M. Kristensen. Coloured Petri Nets: Modeling and Validation of Concurrent Systems. Springer, 2009.
[16] S. Peyton Jones, editor. Haskell 98 Language and Libraries: The Revised Report. Cambridge University Press, 2003.
[17] Mockery. https://github.com/padraic/mockery, 2011.
[18] Mockito. http://mockito.org/, 2011.
[19] W. Reisig. Petri Nets: An Introduction. Springer, 1985.
[20] W. Reisig. Petri Nets and Algebraic Specifications. Theoretical Computer Science, 80(1):1–34, March 1991.
[21] RSpec. http://rspec.info, 2011.
[22] J. Sermersheim. Lightweight Directory Access Protocol (LDAP): The Protocol. http://www.ietf.org/rfc/rfc4511.txt, 2006.
[23] J. Sugerman, G. Venkitachalam, and B.-H. Lim. Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor. In Proceedings of the General Track: 2001 USENIX Annual Technical Conference, pages 1–14, 2001.
[24] Sun Microsystems. SLAMD Distributed Load Generation Engine: Release Notes, 2006.
[25] S. Versteeg and C. Hine. Scaling to the Sky (Reacto). CA Technology Exchange (CATX), Issue 4, May 2012.
[26] J. Watson. VirtualBox: Bits and Bytes Masquerading as Machines. Linux Journal, 166(1), 2008.
[27] M. Westergaard and L. M. Kristensen. The Access/CPN Framework: A Tool for Interacting with the CPN Tools Simulator. In Applications and Theory of Petri Nets (Petri Nets 2009), LNCS, pages 313–322, 2009.
[28] E. J. Weyuker and F. I. Vokolos. Experience with Performance Testing of Software Systems: Issues, an Approach, and Case Study. IEEE Transactions on Software Engineering, 26(12):1147–1156, 2000.
[29] C. Yuan. Petri Nets: Theory and Applications. Publishing House of Electronics Industry, 2005.