Neuma White Paper:
CM: THE NEXT GENERATION of Tool Integration and Toolkits
Tool integrations have been going on ever since the initial days of JCL
(IBM's Job Control Language). JCL actually made things a lot simpler.
But as tools have become more complex and diverse, tool integration
presents many challenges. How do you integrate user interfaces, and
simplify the corresponding training? What about administration? How
do you deal with varying scalability capabilities, and varying server
requirements? What about Multiple Site operation? Successful tool
integrations must effectively address these issues. And they must do so
by starting from a process-centric view of the world.
There have
been many "backplane" recommendations over the years for plugging
tools into a common framework. How successful have these been, and what
success can be expected in the future? So far, not a lot.
Eclipse is an open source effort to push this forward, and
it has had some success. Like other companies, Neuma has embraced the effort
with its own Eclipse plug-in, but with an order of magnitude more
effort than at first anticipated. In my opinion, this complexity will
make it difficult for this solution to succeed in the long term as a
full solution. No doubt it has had, and will continue to have, more
specific partial successes.
The work of tool integration is not easy.
Some say it should not be approached with overly ambitious expectations.
However, if you find you have to limit your approach, you're losing
before you get going. Instead, a process-centric understanding of the
goal is necessary.
To start with, every company, every
organization, wants to do CM/ALM/Development differently. Looking
first at the technical side of things, in the '70s, diversity was in
computer platform. In the '80s that moved to programming language. As
user interfaces cropped up in the '90s, a new level of diversity
emerged, with GUI toolkits and languages popping up to support these.
From the management side of things, there were a number of factors that
caused diversity: small and large projects, hardware and software
projects, embedded and end-user projects, business and engineering
projects, and so forth.
Initial Success Stories
All
this diversity is no good if you're trying to integrate tools into a
common framework. My first attempt at an "integrated" solution was in
the 1970s. Putting together "full-screen" editors, compilers/linkers,
version control, change management and build management into a single
tool that could support automation and give project control was a
definite challenge. The project initially involved hundreds of developers,
but that grew quickly to thousands. It was an IBM mainframe based
environment in a telecom setting. "Network" didn't apply to the data
side of telecom back then - in fact there was very little, if any, data
side of telecom. But, surprisingly, things like Virtual Machines (VM)
were an integral part of IBM's mainframe capability. And so, even
though each developer didn't even have his/her own keyboard and monitor
(i.e. there were shared resources in the "computer" room), each did
have his/her own virtual machine. This first tool, named PLS,
understood the source language enough to identify dependencies, and
understood the target OS enough to automate builds. It understood and
worked with the full-screen editor so that it could effectively do
version control. And it was a very successful project, still in use
to this day. Although it was ahead of its time, it still
barely addressed things like integration with project activities/tasks
and with the problem reporting system - which was just moving from a
paper-based to a mainframe-based application.
Still, it enjoyed success. Here are a few factors that contributed.
- It was designed for a particular computer system.
- It had a small, fixed set of development tools that had to be integrated.
- There was a single data repository that held all of the CM data.
- We had full control over how the tools would work together.
- It was designed to support a single common configuration management process.
Before
we explore further, let's look at the second CM tool I put together,
this time in the 1980's at another telecom company. It again was a
single computer platform (initially at least, on VAXes, eventually
expanding to SUN workstations as well). There was a great deal of
experience from the first go around. It covered more than just CM,
from system requirements to test suites, and supported a more diverse
set of compilers and editors. So it had a generic name: Software
Management System (SMS). The concept of a network was just being established,
initially as a small network of VAXes and then with workstations added
in. With a wider management scope, there was a more concentrated
effort at process focus. As a result, there needed to be more
flexibility to support a changing process, for Project Management,
Problem Tracking, Test Case Management, Document Management and
Configuration Management. It supported a more flexible array of
editors, documentation tools, compilers and linkers - a more flexible
build process. Again SMS was a very successful project, so much so
that we tried to export it to other companies. Although these exports
were successful, there were challenges, as the CM/ALM processes differed
from company to company, as did the build tools. Some key factors here were:
- It was designed to have a configurable process, at least somewhat.
- All of the management tools shared the common data repository.
- It had a portability layer for migrating from VAX to other platforms.
- We had full control over how the tools would work together.
- All of the management tools shared a single common user interface.
- All of the development plug-in tools (editors, compilers, etc.) had their own interfaces.
From
these two projects, I learned that it is a lot easier to integrate
tools when all of the information is in the same repository. If you
look at today's solutions, the most complex ones are those that try to
work across multiple repositories, and the simplest ones are those
where the repository spans the solution (and often that's why the
solution is not wider in scope).
I also learned that things like
traceability and reporting were easier to learn and to do when they
worked the same way for all of the management components. In the first case
(PLS), we had separate tools for problem reporting, for project
management and for change/configuration/build management. That simply
meant that we didn't, as developers, use the project management or the
problem reporting tools. Activities were assigned by word of mouth
from the manager. Problems were reported, initially on paper, and
eventually through an on-line form. But that was the extent of our
exposure to the rest of the process. There was a whole separate world
of testing and verification results that we only touched at CRB
meetings when we were invited.
But using SMS, with everything
under one roof, everyone was aware of the work breakdown structure,
could search the problem report data, knew about new documents and
changed documents on a regular basis. Even though we were still in the
days of command line tools, at least the query capability was
consistent and provided for an easy way to navigate traceability links.
I
also learned that although we thought that we were building a flexible
system from a process perspective, there was a long way still to go.
Fortunately, we owned the source code and could adjust the tools as
necessary to meet our process requirements.
Drawing the Integration Lines
Tool
integration isn't just a matter of taking a bunch of tools and putting
them together. First of all, we must understand that certain tools
"belong" together. They have to be designed to live together.
Otherwise, integration glue is going to provide only a partial
capability.
You will get a much better integration of an IDE if
the editor(s), compiler(s), linker, debugger and run-time monitoring
tools are provided/architected by a single vendor, or by a common (e.g.
open source) project. Why? Because there is so much data that needs to
be shared by the various tools, some of which create the data and some
of which consume it. Taking ready-made tools and trying to integrate
them with a lot of glue, and without a common architecture, may result
in a system that works, but it will be less functional and will yield
lower productivity. The linker defines the format for the
compiler output. The debugger identifies data that must be available
at run-time. The editors can be smarter if they're tied to the
language. And so forth.
The key is that someone has to specify
a framework to orchestrate the effort, and all of the tools have to
sing to the same tune. A single vendor will generally make sure that
this happens, but not always. For example, when a vendor creates a
solution by acquiring component tools, there will be a lot of glue and
rework before they can sing together. The game plan for having a
common framework is often abandoned in favor of gluing together tools
that already exist but don't share the same architectural base. This
glue integration approach is usually a mistake.
When can an
integration approach work? It can work for very well understood
applications. Computer languages are compiled into one of many object
formats, whether through a GNU effort or a cross-platform single-vendor
effort; this is a well understood application. The expectations
for integration are well known, and so all tools are designed to the
expected standards. New vendors will produce point tools that fit into
this framework. Glue will only be needed for perimeter requirements,
such as the conventions used in the commands/user interface.
Application
Lifecycle Management is not well understood, by comparison. And even
if it were, there are quite a number of factors that need to be
addressed. For one, there is a lot of management data that needs to
persist beyond a simple build operation. Problem reports,
configuration lineups, file history, etc. There are management
processes for each type of object, and workflow spanning types of
objects.
Common Engines
Traceability and cross-life-cycle reporting require that a common repository
be used to reduce complexity. This makes it easy to specify
traceability links and to query them. Ideally, your data query language
goes beyond relational so that you can directly model the real world
without having to first go through a relational data mapping.
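As a rough illustration of the idea (not any vendor's actual schema), the
sketch below shows how a single store for every object type makes
traceability links trivial to record and to follow. The Repository class,
object identifiers and link semantics are all invented for the example.

    # A minimal sketch (hypothetical classes and identifiers) of a single
    # repository holding every ALM object type, with traceability links
    # stored and queried in one place.
    from collections import defaultdict

    class Repository:
        def __init__(self):
            self.objects = {}              # id -> object record
            self.links = defaultdict(set)  # "traces to" links: id -> set of ids

        def add(self, obj_id, obj_type, **fields):
            self.objects[obj_id] = {"type": obj_type, **fields}

        def link(self, from_id, to_id):
            self.links[from_id].add(to_id)

        def trace(self, obj_id):
            """Follow traceability links transitively from one object."""
            seen, stack = set(), [obj_id]
            while stack:
                for target in self.links[stack.pop()] - seen:
                    seen.add(target)
                    stack.append(target)
            return seen

    repo = Repository()
    repo.add("PR-17", "problem", title="Crash on startup")
    repo.add("CHG-42", "change", summary="Fix null pointer in init")
    repo.add("BLD-7", "build", label="release-2.3")
    repo.link("PR-17", "CHG-42")   # problem is fixed by a change
    repo.link("CHG-42", "BLD-7")   # change is included in a build
    print(repo.trace("PR-17"))     # -> {'CHG-42', 'BLD-7'} (in some order)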
Use of a single process engine
allows state-flow and workflow to be specified across the lifecycle in
a common way. You should be able to identify object states, and
transitions between states. You should be able to put rules/triggers
and permissions on these transitions. But it's vital that you don't
have to use separate workflow tools for each different type of object
(i.e. problems, requirements, tasks, changes, etc.).
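To make that concrete, here is a minimal sketch of a single process engine
shared by every object type. The Workflow class, states, roles and trigger
are invented for illustration; the point is that problems, requirements,
tasks and changes all flow through the same machinery.

    # A hypothetical shared process engine: states, transitions, permissions
    # and triggers are plain data, so one engine serves every object type.
    class Workflow:
        def __init__(self, transitions):
            # transitions: (from_state, to_state) -> {"roles": [...], optional "trigger": callable}
            self.transitions = transitions

        def advance(self, obj, to_state, role):
            rule = self.transitions.get((obj["state"], to_state))
            if rule is None:
                raise ValueError(f"illegal transition {obj['state']} -> {to_state}")
            if role not in rule["roles"]:
                raise PermissionError(f"role '{role}' may not make this transition")
            obj["state"] = to_state
            if "trigger" in rule:
                rule["trigger"](obj)

    def notify_verifier(obj):
        print(f"{obj['id']}: ready for verification")

    problem_flow = Workflow({
        ("open", "in-progress"):  {"roles": ["developer"]},
        ("in-progress", "fixed"): {"roles": ["developer"], "trigger": notify_verifier},
        ("fixed", "closed"):      {"roles": ["verifier"]},
    })

    pr = {"id": "PR-17", "state": "open"}
    problem_flow.advance(pr, "in-progress", role="developer")
    problem_flow.advance(pr, "fixed", role="developer")   # fires the trigger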
Start
with these two premises and your chances for success improve
dramatically. You'll also begin to realize that it's through the
common engines that your Multiple Site solutions will have to evolve if
those solutions are to deal with all of the life cycle data in a
consistent manner.
Horizontal and Vertical Integration
Another
key factor is understanding that there are two types of integration:
horizontal, or management information integration, and vertical.
Vertical integration involves, on the one hand, tools that
provide/gather/consume data, and on the other hand, data mining/metric
capabilities. Data gathering tools include editors, word processors,
diagramming tools, resource editors, data entry tools, compilers,
linkers, make/build tools, testing engines, etc. These are varied.
It's
fine that Visual Studio has tried to integrate its environment with
various CM tools. It's fine that Eclipse has tried an even more
generic approach. However, these tools have not done enough to provide
definitive boundaries for vertical integration. Visual Studio has
assumed a file-based CM tool instead of a change-based CM tool, for
example. Ever tried to configure it so that you can check in, or run a
delta report, on a change instead of a file? It's actually possible, but
difficult and somewhat "forced". Eclipse has the opposite problem: no
real CM framework - just a generic framework. So each CM tool
integration will operate completely differently from an Eclipse
perspective.
CM tools and processes are sufficiently varied that
it is not obvious what the API should look like that ties together the
IDE and the CM/ALM tool. Hence Microsoft decided it should try to put
its own CM tools into its IDEs. Now I've got a good idea of what I
might recommend for this API, and it's not that complex. But it looks
at a wider, ALM view of CM, whereas some want a more narrow view.
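That API itself isn't spelled out here, so the sketch below is only my
guess at its flavor: a change-based interface in which the IDE opens a
change against a task or problem and leaves the rest to the CM/ALM tool.
The class and method names are hypothetical, not any vendor's actual API.

    # A hypothetical change-based interface an IDE plug-in could code against.
    from abc import ABC, abstractmethod

    class ChangeBasedCM(ABC):
        @abstractmethod
        def open_change(self, owner: str, reason_id: str) -> str:
            """Start a change against a task or problem; return a change id."""

        @abstractmethod
        def add_file(self, change_id: str, path: str) -> None:
            """Attach a modified file to an open change."""

        @abstractmethod
        def check_in(self, change_id: str, description: str) -> None:
            """Commit the whole change as one unit, not file by file."""

        @abstractmethod
        def delta_report(self, change_id: str) -> str:
            """Return the differences introduced by the change as a whole."""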
The
problem with not taking the Microsoft approach is: Where is the user
interface? Is it in the IDE or in the CM/ALM tool? If you're using
several different IDEs, you'd probably be happy with the latter. Or if
you're a developer working in a single IDE, you might want to do all of
your CM right from the IDE.
What about the version control
component - is it a management component or a separate vertical
plug-in? The function of version control - saving and identifying
versions of a file - is vertical in nature. However, management data,
including branching and history information, is horizontal. It is
needed by the broader management functions. Ideally, your version
control component manages versions and identifies them by a single
(numeric) identifier, which can be used to reference any version of the
file. The management tool should then be responsible for storing
history and branching information in terms of these identifiers. If
not, you'll find that simple operations are inefficient, or possibly
not fully catered to because of the need for a separate mechanism to
query and manipulate such data.
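A minimal sketch of that split follows, with both classes invented for
illustration: the vertical store knows nothing but content and numeric
identifiers, while the horizontal management layer records history and
branching purely in terms of those identifiers.

    # Hypothetical classes illustrating the vertical/horizontal split.
    class VersionStore:
        """Vertical component: save content, hand back a numeric identifier."""
        def __init__(self):
            self._blobs, self._next = {}, 1

        def save(self, content: bytes) -> int:
            vid, self._next = self._next, self._next + 1
            self._blobs[vid] = content
            return vid

        def fetch(self, vid: int) -> bytes:
            return self._blobs[vid]

    class FileHistory:
        """Horizontal management data: branch and predecessor per file,
        expressed only in terms of the store's identifiers."""
        def __init__(self):
            self.records = []   # (path, branch, version_id, predecessor_id)

        def record(self, path, branch, vid, predecessor=None):
            self.records.append((path, branch, vid, predecessor))

    store, history = VersionStore(), FileHistory()
    v1 = store.save(b"int main() { return 0; }")
    history.record("main.c", "main", v1)
    v2 = store.save(b"int main() { return 1; }")
    history.record("main.c", "main", v2, predecessor=v1)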
When we have a clear vision of
horizontal versus vertical integration, we start to see that it makes
sense that horizontal tools should share the same repository data, the
same process engine and perhaps the same user interface. We don't have
to decide between a monolithic system and individual tools. There's a
middle ground. Horizontal integration will blossom when
its tools share a common platform.
The CM Toolkit
So
one approach is to focus first on the toolkit we'll use to build our
ALM solution. It must start with a common repository. But that
doesn't mean the documentation tools, compilers and code dependency
tools need to share this repository. I'm sure there's extra benefit if
they do, but it's just not always practical. The management suites are
dealing with process, reporting, persistent data, quality and
traceability. All of them need these components. And to spend time
gluing from one tool to another isn't going to get the job done. Look
at the tools that have integrated repositories: generally there's lower
complexity and a smaller learning curve. In fact, many CM tools won't
grow into ALM tools because their repositories are CM-specific.
But
if you start with a toolkit that understands process, has a central
repository, and even moves towards a common user interface, you have a
much better chance of broad success. No glue. No worrying about the
effect on the integration when one tool is upgraded - the glue that,
once fixed, is broken again by the next release of one of the other tools.
Both MKS and Neuma, two Canadian companies, are taking this approach.
And in each case the approach allows rapid expansion of the suite of CM
tools into ALM suites, and beyond.
Integrated User Interface
So
what about the approach to CM/ALM user interfaces? Here, more than
anywhere in the solution, the focus has to be on process. The user
interface is what the users use to execute the process. If it's
unnatural or if it's overly complex, your process has no hope. It's
critical that CM vendors build in the flexibility for their customers
to easily tailor the CM solutions. One process does not fit all. And
although there are some prepackaged processes and tools (e.g.
Rational's UCM), if they don't quite fit your process, you're in a lot
of trouble because they are not generally flexible.
Ideally your
CM tool interface is the same across ALM functions. If you have to pop
in and out of different tools for Problem Tracking, Change Management,
Building, etc., you're going to lose a great deal of functionality -
developers, and others, won't do tool hopping if they can avoid it.
So
you want to have a flexible way of navigating your data as your process
changes, a means of prompting for exactly the information you expect
your users to understand for each operation, a way of presenting custom
to-do lists, in-boxes or whatever. A prepackaged GUI is not
sufficient. It is either too general or too specific. You need an
exact match to your process, and that needs to change as your process
continues to grow and improve. Minimize clicks. Improve
object-oriented operation and organization. The ability to handle the
common cases trivially, and to guide users through the more complex
cases, is essential.
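To make the idea concrete, here is a purely illustrative sketch (not any
vendor's configuration format) in which role-based to-do lists are data
the customer can tailor, rather than behaviour baked into the GUI.

    # Hypothetical, declarative to-do list definitions, one per role.
    TODO_LISTS = {
        "developer": {"type": "change",  "state": "open",  "owner": "$me"},
        "verifier":  {"type": "problem", "state": "fixed"},
    }

    def todo_list(role, current_user, objects):
        """Evaluate a role's to-do query against a flat list of repository objects."""
        spec = {k: (current_user if v == "$me" else v)
                for k, v in TODO_LISTS[role].items()}
        return [o for o in objects if all(o.get(k) == v for k, v in spec.items())]

    sample = [{"type": "change", "state": "open", "owner": "joe", "id": "CHG-42"}]
    print(todo_list("developer", "joe", sample))   # -> the one open change owned by joe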
Back to IDEs, the CM tool interface
is going to be important as a cross-IDE interface, and also for users
who are not IDE-centric. Furthermore, because CM can view the data
differently, developers are going to find that it's easier to do some
things from the CM interface. These operations might include (the first
is sketched after the list):
- Searching for strings across revisions of a file
- Reviewing changes made to a product
- Comparing the content of builds from a problem tracking perspective
- Identifying changes performed by a particular user or group
- Navigating traceability links
- Performing merges on files and workspaces
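As an illustration, here is a minimal sketch of the first operation in the
list above; the in-memory revision dictionary is purely hypothetical,
standing in for whatever the CM repository actually stores.

    # Search for a string across every stored revision of a file.
    def grep_revisions(revisions, needle):
        """Return the revision numbers whose content contains `needle`."""
        return [rev for rev, content in sorted(revisions.items()) if needle in content]

    main_c = {
        1: "int main() { return 0; }",
        2: "int main() { return 1; }",
        3: "int main() { log_start(); return 1; }",
    }
    print(grep_revisions(main_c, "return 1"))   # -> [2, 3]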
In
broader tools, you'll find that a significant proportion of your users
are not developers. So the CM tool interface will be even more
crucial, and will have to be role based.
There are other areas
of User Interface that are important. How does the OS view your files
and data? Do you have a window into your CM tool directly from the OS,
and how difficult is it to use? There are a lot of non-development,
non-technical users who could benefit from an OS-transparent view of
CM - at least as much as it can be transparent: lawyers, accountants,
etc. would love to be able to easily version their spreadsheets and
documents. One of the reasons that Atria's ClearCase was so rapidly
adopted was that it allowed users to look at and use their files from
the operating system. Even if administration overhead crept in, it's
still an enviable capability.
How far have we come?
So
where are we today? I have not done a recent inventory of tools. But
progress is being made. The industry understands that it's not a
choice between a monolithic system and a set of glued together tools -
and as a result, there are fewer home-grown systems. Instead there are
guiding principles that I believe the successful vendors will embrace:
- Central repository for management (horizontal)
- Common process engine (horizontal)
- Easily customized user interface (horizontal)
- Vertical tool integration through vertical standards (APIs)
How
much has the industry embraced these concepts? I'd say there has been
definite recognition of these requirements, but there's a long way to
go. You may have your own opinions and I'd like to hear them.
Joe Farah is the President and CEO of Neuma Technology. Prior to co-founding Neuma in 1990 and directing the development of CM+, Joe was Director of Software Architecture and Technology at Mitel, and in the 1970s a Development Manager at Nortel (Bell-Northern Research), where he developed the Program Library System (PLS), still heavily in use by Nortel's largest projects. A software developer since the late 1960s, Joe holds a B.A.Sc. degree in Engineering Science from the University of Toronto. You can contact Joe by email at farah@neuma.com