Neuma White Paper:
CM: THE NEXT GENERATION of Tool Integration and Toolkits
Tool integrations have been going on ever since the initial days of JCL
(IBM's Job Control Language). JCL actually made things a lot simpler.
But as tools have become more complex and diverse, tool integration
presents many challenges. How do you integrate user interfaces, and
simplify the corresponding training? What about administration? How
do you deal with varying scalability capabilities, and varying server
requirements? What about Multiple Site operation? Successful tool
integrations must effectively address these issues. And they must do so
by starting from a process-centric view of the world.
There have
been many "backplane" recommendations over the years for plugging
tools into a common framework. How successful have these been, and what
success can be expected in the future? So far, not a lot.
Eclipse is an open source effort to push this forward, and
it has had some success. Like other companies, Neuma has embraced the effort
with its own Eclipse plug-in, but with an order of magnitude more
effort than at first anticipated. In my opinion, this complexity will
make it difficult for this solution to succeed in the long term as a
full solution. No doubt it has had, and will continue to have, more
specific partial successes.
The work of tool integration is not easy.
Some say it should not be approached with overly ambitious expectations.
However, if you find you have to limit your approach, you're losing
before you get going. Instead, a process-centric understanding of the
goal is necessary.
To start with, every company, every
organization, wants to do CM/ALM/Development differently. Looking
first at the technical side of things, in the '70s, diversity was in
computer platform. In the '80s that moved to programming language. As
user interfaces cropped up in the '90s, a new level of diversity
emerged, with GUI toolkits and languages popping up to support these.
From the management side of things, there were a number of factors that
caused diversity: small and large projects, hardware and software
projects, embedded and end-user projects, business and engineering
projects, and so forth.
Initial Success Stories
All
this diversity is no good if you're trying to integrate tools into a
common framework. My first attempt at an "integrated" solution was in
the 1970s. Putting together "full-screen" editors, compilers/linkers,
version control, change management and build management into a single
tool that could support automation and give project control was a
definite challenge. The project initially involved hundreds of developers,
but that grew quickly to thousands. It was an IBM mainframe based
environment in a telecom setting. "Network" didn't apply to the data
side of telecom back then - in fact there was very little, if any, data
side of telecom. But, surprisingly, things like Virtual Machines (VM)
were an integral part of IBM's mainframe capability. And so, even
though each developer didn't even have his/her own keyboard and monitor
(i.e. there were shared resources in the "computer" room), each did
have his/her own virtual machine. This first tool, named PLS,
understood the source language enough to identify dependencies, and
understood the target OS enough to automate builds. It understood and
worked with the full-screen editor so that it could effectively do
version control. And it was a very successful project, still in use
to this day. Although it was ahead of its time, it still
barely addressed things like integration with project activities/tasks
and with the problem reporting system - which was just moving from a
paper-based to a mainframe-based application.
Still, it enjoyed success. Here are a few factors that contributed.
- It was designed for a particular computer system.
- It had a small, fixed set of development tools that had to be integrated.
- There was a single data repository that held all of the CM data.
- We had full control over how the tools would work together.
- It was designed to support a single common configuration management process.
Before
we explore further, let's look at the second CM tool I put together,
this time in the 1980's at another telecom company. It again was a
single computer platform (initially at least, on VAXes, eventually
expanding to SUN workstations as well). There was a great deal of
experience from the first go around. It covered more than just CM,
from system requirements to test suites, and supported a more diverse
set of compilers and editors. So it had a generic name: Software
Management System (SMS). The concept of a network was just being established,
initially as a small network of VAXes and then with workstations added
in. With a wider management scope, there was a more concentrated
effort at process focus. As a result, there needed to be more
flexibility to support a changing process, for Project Management,
Problem Tracking, Test Case Management, Document Management and
Configuration Management. It supported a more flexible array of
editors, documentation tools, compilers and linkers - a more flexible
build process. Again SMS was a very successful project, so much so
that we tried to export it to other companies. Although these exports
were successful, there were challenges, as the CM/ALM processes differed
from company to company, as did the build tools. Some key factors here were:
- It was designed to have a configurable process, at least somewhat.
- All of the management tools shared the common data repository.
- It had a portability layer for migrating from VAX to other platforms.
- We had full control over how the tools would work together.
- All of the management tools shared a single common user interface.
- All of the development plug-in tools (editors, compilers, etc.) had their own interfaces.
From
these two projects, I learned that it is a lot easier to integrate
tools when all of the information is in the same repository. If you
look at today's solutions, the most complex ones are those that try to
work across multiple repositories, and the simplest ones are those
where the repository spans the solution (and often that's why the
solution is not wider in scope).
I also learned that things like
traceability and reporting were easier to learn and to do when they
worked the same way for all of the management components. In the first case
(PLS), we had separate tools for problem reporting, for project
management and for change/configuration/build management. That simply
meant that we didn't, as developers, use the project management or the
problem reporting tools. Activities were assigned by word of mouth
from the manager. Problems were reported, initially on paper, and
eventually through an on-line form. But that was the extent of our
exposure to the rest of the process. There was a whole separate world
of testing and verification results that we only touched at CRB
meetings when we were invited.
But using SMS, with everything
under one roof, everyone was aware of the work breakdown structure,
could search the problem report data, knew about new documents and
changed documents on a regular basis. Even though we were still in the
days of command line tools, at least the query capability was
consistent and provided for an easy way to navigate traceability links.
I
also learned that although we thought that we were building a flexible
system from a process perspective, there was a long way still to go.
Fortunately, we owned the source code and could adjust the tools as
necessary to meet our process requirements.
Drawing the Integration Lines
Tool
integration isn't just a matter of taking a bunch of tools and putting
them together. First of all, we must understand that certain tools
"belong" together. They have to be designed to live together.
Otherwise, integration glue is going to provide only a partial
capability.
You will get a much better integration of an IDE if
the editor(s), compiler(s), linker, debugger and run-time monitoring
tools are provided/architected by a single vendor, or by a common (e.g.
open source) project. Why? Because there is so much data that needs to
be shared by the various tools, some of which create the data and some
of which consume it. Taking ready-made tools and trying to integrate
them with a lot of glue, and without a common architecture, may result
in a system that works, but it will be less functional and will yield
lower productivity. The linker defines the format for the
compiler output. The debugger identifies data that must be available
at run-time. The editors can be smarter if they're tied to the
language. And so forth.
The key is that someone has to specify
a framework to orchestrate the effort, and all of the tools have to
sing to the same tune. A single vendor will generally make sure that
this happens, but not always. For example, when a vendor creates a
solution by acquiring component tools, there will be a lot of glue and
rework before they can sing together. The game plan for having a
common framework is often abandoned in favor of gluing together tools
that already exist but don't share the same architectural base. This
glue integration approach is usually a mistake.
When can an
integration approach work? It can work for very well understood
applications. Computer languages are compiled into one of many object
formats, whether through a GNU effort or a cross-platform single-vendor
effort; this is a well understood application. The expectations
for integration are well known, and so all tools are designed to the
expected standards. New vendors will produce point tools that fit into
this framework. Glue will only be needed for perimeter requirements,
such as the conventions used in the commands/user interface.
Application
Lifecycle Management is not well understood, by comparison. And even
if it were, there are quite a number of factors that need to be
addressed. For one, there is a lot of management data that needs to
persist beyond a simple build operation. Problem reports,
configuration lineups, file history, etc. There are management
processes for each type of object, and workflow spanning types of
objects.
Common Engines
Traceability and cross-life-cycle reporting require that a common repository
be used to reduce complexity. This makes it easy to specify
traceability links and to query them. Ideally, your data query language
goes beyond relational so that you can directly model the real world
without having to first go through a relational data mapping.
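As a rough illustration of the idea (not any vendor's actual schema), the
sketch below shows how a single store for every object type makes
traceability links trivial to record and to follow. The Repository class,
object identifiers and link semantics are all invented for the example.

    # A minimal sketch (hypothetical classes and identifiers) of a single
    # repository holding every ALM object type, with traceability links
    # stored and queried in one place.
    from collections import defaultdict

    class Repository:
        def __init__(self):
            self.objects = {}              # id -> object record
            self.links = defaultdict(set)  # "traces to" links: id -> set of ids

        def add(self, obj_id, obj_type, **fields):
            self.objects[obj_id] = {"type": obj_type, **fields}

        def link(self, from_id, to_id):
            self.links[from_id].add(to_id)

        def trace(self, obj_id):
            """Follow traceability links transitively from one object."""
            seen, stack = set(), [obj_id]
            while stack:
                for target in self.links[stack.pop()] - seen:
                    seen.add(target)
                    stack.append(target)
            return seen

    repo = Repository()
    repo.add("PR-17", "problem", title="Crash on startup")
    repo.add("CHG-42", "change", summary="Fix null pointer in init")
    repo.add("BLD-7", "build", label="release-2.3")
    repo.link("PR-17", "CHG-42")   # problem is fixed by a change
    repo.link("CHG-42", "BLD-7")   # change is included in a build
    print(repo.trace("PR-17"))     # -> {'CHG-42', 'BLD-7'} (in some order)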
Use of a single process engine
allows state-flow and workflow to be specified across the lifecycle in
a common way. You should be able to identify object states, and
transitions between states. You should be able to put rules/triggers
and permissions on these transitions. But it's vital that you don't
have to use separate workflow tools for each different type of object
(i.e. problems, requirements, tasks, changes, etc.).
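To make that concrete, here is a minimal sketch of a single process engine
shared by every object type. The Workflow class, states, roles and trigger
are invented for illustration; the point is that problems, requirements,
tasks and changes all flow through the same machinery.

    # A hypothetical shared process engine: states, transitions, permissions
    # and triggers are plain data, so one engine serves every object type.
    class Workflow:
        def __init__(self, transitions):
            # transitions: (from_state, to_state) -> {"roles": [...], optional "trigger": callable}
            self.transitions = transitions

        def advance(self, obj, to_state, role):
            rule = self.transitions.get((obj["state"], to_state))
            if rule is None:
                raise ValueError(f"illegal transition {obj['state']} -> {to_state}")
            if role not in rule["roles"]:
                raise PermissionError(f"role '{role}' may not make this transition")
            obj["state"] = to_state
            if "trigger" in rule:
                rule["trigger"](obj)

    def notify_verifier(obj):
        print(f"{obj['id']}: ready for verification")

    problem_flow = Workflow({
        ("open", "in-progress"):  {"roles": ["developer"]},
        ("in-progress", "fixed"): {"roles": ["developer"], "trigger": notify_verifier},
        ("fixed", "closed"):      {"roles": ["verifier"]},
    })

    pr = {"id": "PR-17", "state": "open"}
    problem_flow.advance(pr, "in-progress", role="developer")
    problem_flow.advance(pr, "fixed", role="developer")   # fires the trigger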
Start
with these two premises and your chances for success improve
dramatically. You'll also begin to realize that it's through the
common engines that your Multiple Site solutions will have to evolve if
those solutions are to deal with all of the life cycle data in a
consistent manner.
Horizontal and Vertical Integration
Another
key factor is understanding that there are two types of integration:
horizontal, or management information integration, and vertical.
Vertical integration involves, on the one hand, tools that
provide/gather/consume data, and on the other hand, data mining/metric
capabilities. Data gathering tools include editors, word processors,
diagramming tools, resource editors, data entry tools, compilers,
linkers, make/build tools, testing engines, etc. These are varied.
It's
fine that Visual Studio has tried to integrate its environment with
various CM tools. It's fine that Eclipse has tried an even more
generic approach. However, these tools have not done enough to provide
definitive boundaries for vertical integration. Visual Studio has
assumed a file-based CM tool instead of a change-based CM tool, for
example. Ever tried to configure it so that you can check in, or run a
delta report, on a change instead of a file? It's actually possible, but
difficult and somewhat "forced". Eclipse has the opposite problem: no
real CM framework - just a generic framework. So each CM tool
integration will operate completely differently from an Eclipse
perspective.
CM tools and processes are sufficiently varied that
it is not obvious what the API should look like that ties together the
IDE and the CM/ALM tool. Hence Microsoft decided it should try to put
its own CM tools into its IDEs. Now I've got a good idea of what I
might recommend for this API, and it's not that complex. But it looks
at a wider, ALM view of CM, whereas some want a more narrow view.
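That API itself isn't spelled out here, so the sketch below is only my
guess at its flavor: a change-based interface in which the IDE opens a
change against a task or problem and leaves the rest to the CM/ALM tool.
The class and method names are hypothetical, not any vendor's actual API.

    # A hypothetical change-based interface an IDE plug-in could code against.
    from abc import ABC, abstractmethod

    class ChangeBasedCM(ABC):
        @abstractmethod
        def open_change(self, owner: str, reason_id: str) -> str:
            """Start a change against a task or problem; return a change id."""

        @abstractmethod
        def add_file(self, change_id: str, path: str) -> None:
            """Attach a modified file to an open change."""

        @abstractmethod
        def check_in(self, change_id: str, description: str) -> None:
            """Commit the whole change as one unit, not file by file."""

        @abstractmethod
        def delta_report(self, change_id: str) -> str:
            """Return the differences introduced by the change as a whole."""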
The
problem with not taking the Microsoft approach is: Where is the user
interface? Is it in the IDE or in the CM/ALM tool? If you're using
several different IDEs, you'd probably be happy with the latter. Or if
you're a developer working in a single IDE, you might want to do all of
your CM right from the IDE.
What about the version control
component - is it a management component or a separate vertical
plug-in? The function of version control - saving and identifying
versions of a file - is vertical in nature. However, management data,
including branching and history information, is horizontal. It is
needed by the broader management functions. Ideally, your version
control component manages versions and identifies them by a single
(numeric) identifier, which can be used to reference any version of the
file. The management tool should then be responsible for storing
history and branching information in terms of these identifiers. If
not, you'll find that simple operations are inefficient, or possibly
not fully catered to because of the need for a separate mechanism to
query and manipulate such data.
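A minimal sketch of that split follows, with both classes invented for
illustration: the vertical store knows nothing but content and numeric
identifiers, while the horizontal management layer records history and
branching purely in terms of those identifiers.

    # Hypothetical classes illustrating the vertical/horizontal split.
    class VersionStore:
        """Vertical component: save content, hand back a numeric identifier."""
        def __init__(self):
            self._blobs, self._next = {}, 1

        def save(self, content: bytes) -> int:
            vid, self._next = self._next, self._next + 1
            self._blobs[vid] = content
            return vid

        def fetch(self, vid: int) -> bytes:
            return self._blobs[vid]

    class FileHistory:
        """Horizontal management data: branch and predecessor per file,
        expressed only in terms of the store's identifiers."""
        def __init__(self):
            self.records = []   # (path, branch, version_id, predecessor_id)

        def record(self, path, branch, vid, predecessor=None):
            self.records.append((path, branch, vid, predecessor))

    store, history = VersionStore(), FileHistory()
    v1 = store.save(b"int main() { return 0; }")
    history.record("main.c", "main", v1)
    v2 = store.save(b"int main() { return 1; }")
    history.record("main.c", "main", v2, predecessor=v1)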
When we have a clear vision of
horizontal versus vertical integration, we start to see that it makes
sense that horizontal tools should share the same repository data, the
same process engine and perhaps the same user interface. We don't have
to decide between a monolithic system and individual tools. There's a
middle ground. Horizontal integration will blossom when
its tools share a common platform.
The CM Toolkit
So
one approach is to focus first on the toolkit we'll use to build our
ALM solution. It must start with a common repository. But that
doesn't mean the documentation tools, compilers and code dependency
tools need to share this repository. I'm sure there's extra benefit if
they do, but it's just not always practical. The management suites are
dealing with process, reporting, persistent data, quality and
traceability. All of them need these components. And to spend time
gluing from one tool to another isn't going to get the job done. Look
at the tools that have integrated repositories: generally there's lower
complexity and a smaller learning curve. In fact, many CM tools won't
grow into ALM tools because their repositories are CM-specific.
But
if you start with a toolkit that understands process, has a central
repository, and even moves towards a common user interface, you have a
much better chance of broad success. No glue. No worrying about the
effect on the integration when one tool is upgraded - the glue that,
once fixed, is broken again by the next release of one of the other tools.
Both MKS and Neuma, two Canadian companies, are taking this approach.
And in each case the approach allows rapid expansion of the suite of CM
tools into ALM suites, and beyond.
Integrated User Interface
So
what about the approach to CM/ALM user interfaces? Here, more than
anywhere in the solution, the focus has to be on process. The user
interface is what the users use to execute the process. If it's
unnatural or if it's overly complex, your process has no hope. It's
critical that CM vendors build in the flexibility for their customers
to easily tailor the CM solutions. One process does not fit all. And
although there are some prepackaged processes and tools (e.g.
Rational's UCM), if they don't quite fit your process, you're in a lot
of trouble because they are not generally flexible.
Ideally your
CM tool interface is the same across ALM functions. If you have to pop
in and out of different tools for Problem Tracking, Change Management,
Building, etc., you're going to lose a great deal of functionality -
developers, and others, won't do tool hopping if they can avoid it.
So
you want to have a flexible way of navigating your data as your process
changes, a means of prompting for exactly the information you expect
your users to understand for each operation, a way of presenting custom
to-do lists, in-boxes or whatever. A prepackaged GUI is not
sufficient. It is either too general or too specific. You need an
exact match to your process, and that needs to change as your process
continues to grow and improve. Minimize clicks. Improve
object-oriented operation and organization. The ability to handle the
common cases trivially, and to guide users through the more complex
cases, is essential.
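To make the idea concrete, here is a purely illustrative sketch (not any
vendor's configuration format) in which role-based to-do lists are data
the customer can tailor, rather than behaviour baked into the GUI.

    # Hypothetical, declarative to-do list definitions, one per role.
    TODO_LISTS = {
        "developer": {"type": "change",  "state": "open",  "owner": "$me"},
        "verifier":  {"type": "problem", "state": "fixed"},
    }

    def todo_list(role, current_user, objects):
        """Evaluate a role's to-do query against a flat list of repository objects."""
        spec = {k: (current_user if v == "$me" else v)
                for k, v in TODO_LISTS[role].items()}
        return [o for o in objects if all(o.get(k) == v for k, v in spec.items())]

    sample = [{"type": "change", "state": "open", "owner": "joe", "id": "CHG-42"}]
    print(todo_list("developer", "joe", sample))   # -> the one open change owned by joe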
Back to IDEs, the CM tool interface
is going to be important as a cross-IDE interface, and also for users
who are not IDE-centric. Furthermore, because CM can view the data
differently, developers are going to find that it's easier to do some
things from the CM interface. These operations might include (the first
is sketched after the list):
- Searching for strings across revisions of a file
- Reviewing changes made to a product
- Comparing the content of builds from a problem tracking perspective
- Identifying changes performed by a particular user or group
- Navigating traceability links
- Performing merges on files and workspaces
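As an illustration, here is a minimal sketch of the first operation in the
list above; the in-memory revision dictionary is purely hypothetical,
standing in for whatever the CM repository actually stores.

    # Search for a string across every stored revision of a file.
    def grep_revisions(revisions, needle):
        """Return the revision numbers whose content contains `needle`."""
        return [rev for rev, content in sorted(revisions.items()) if needle in content]

    main_c = {
        1: "int main() { return 0; }",
        2: "int main() { return 1; }",
        3: "int main() { log_start(); return 1; }",
    }
    print(grep_revisions(main_c, "return 1"))   # -> [2, 3]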
In
broader tools, you'll find that a significant proportion of your users
are not developers. So the CM tool interface will be even more
crucial, and will have to be role based.
There are other areas
of User Interface that are important. How does the OS view your files
and data? Do you have a window into your CM tool directly from the OS,
and how difficult is it to use? There are a lot of non-development,
non-technical users who could benefit from an OS-transparent view of
CM - at least as much as it can be transparent: lawyers, accountants,
etc. would love to be able to easily version their spreadsheets and
documents. One of the reasons that Atria's ClearCase was so rapidly
adopted was that it allowed users to look at and use their files from
the operating system. Even if administration overhead crept in, it's
still an enviable capability.
How far have we come?
So
where are we today? I have not done a recent inventory of tools. But
progress is being made. The industry understands that it's not a
choice between a monolithic system and a set of glued together tools -
and as a result, there are fewer home-grown systems. Instead there are
guiding principles that I believe the successful vendors will embrace:
- Central repository for management (horizontal)
- Common process engine (horizontal)
- Easily customized user interface (horizontal)
- Vertical tool integration through vertical standards (APIs)
How
much has the industry embraced these concepts? I'd say there has been
definite recognition of these requirements, but there's a long way to
go. You may have your own opinions and I'd like to hear them.
Joe Farah is the President and CEO of Neuma Technology. Prior to co-founding Neuma in 1990 and directing the development of CM+, Joe was Director of Software Architecture and Technology at Mitel, and in the 1970s a Development Manager at Nortel (Bell-Northern Research), where he developed the Program Library System (PLS), still heavily in use by Nortel's largest projects. A software developer since the late 1960s, Joe holds a B.A.Sc. degree in Engineering Science from the University of Toronto. You can contact Joe by email at farah@neuma.com