Tuesday, November 16, 2010

Subject: Distributed vs Centralised SCM? - by: joefarah

Here are my thoughts...

1. If your code is distributed across many sites/depots/servers/whatever, does that mean that your ability to do a full build is affected whenever one of them, or the network to it, is down?

2. With distributed code, how do you do a consistent backup of your entire product?

3. I like multiple site solutions, as long as there is no need to do partitioning and re-synchronization operations. This seems feasible only if the entire product exists at all sites, or at least at some subset of the sites.

4. When you distribute code across sites, it's probably more important that you have an at-rest encryption capability, because the data is typically under less control. So if someone gets access to the files, they still don't have the goods.

5. A centralized repository with good remote access is nice, but it doesn't cut it once you start doing things like full product delta reports (against your local workspace), full builds, etc., unless you have high-speed connectivity all the way through. (A sketch of such a delta report follows this list.)
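To make question 5 concrete, here is a minimal sketch of a full-product delta report done client-side: hash every file in the local workspace and compare it against a manifest of path-to-hash entries served by the repository. The manifest shape and the idea of shipping hashes instead of file contents are my assumptions for illustration, not a description of any particular tool.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def delta_report(workspace: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare a local workspace against a repository manifest
    (relative path -> content hash). Hypothetical manifest format."""
    local = {
        str(p.relative_to(workspace)): file_digest(p)
        for p in workspace.rglob("*") if p.is_file()
    }
    return {
        "added":   sorted(set(local) - set(manifest)),
        "missing": sorted(set(manifest) - set(local)),
        "changed": sorted(p for p in local.keys() & manifest.keys()
                          if local[p] != manifest[p]),
    }
```

Even with fast hashing, this report is only cheap if just the manifest crosses the wire rather than the files themselves, which is why connectivity all the way through matters.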

These are all real concerns that have to be addressed by an SCM tool, and indeed by an ALM tool. I don't like tools that make me administer a separate multiple site solution for each component of the ALM solution (and usually multiple site support isn't even available for all of them).

In CM+, here's what we do:

1. Centralized repository
2. Multiple Site option allows replication of all transactions at all sites in real time, so each site looks and feels like a single-site centralized repository. (A sketch of this style of replication, with outage recovery, follows this list.)
3. At-rest encryption option, applied at library creation time. (See the encryption sketch after this list.)
4. Ability to restrict certain files (or file types, products, etc.) to specific sites.
5. Access to files is controlled by user roles/permissions, not by the location of the file. So change sites and you have the same data and the same permissions, based on your user id.
6. Apply multiple site capability across all ALM functions, not just source code.
7. Use multiple site feature to provide warm-standby disaster recovery and live, up-to-date on-line backups.
8. Allow you to disconnect a site and have full read access and limited write access to the repository, so you can take it to the space station, on a flight, or out to sea.
9. Allow automatic recovery from network outages so that if you're connected to the network on the space station and you lose connectivity in some parts of your orbit, you are automatically resynched when you regain connectivity.
10. Allow remote access either through a native interface with intelligent caching, or through a web interface.
11. Allow near-zero administration for the multiple site solution (CM+ MultiSite).
12. Ensure that schema changes (for your metadata) and process changes are automatically propagated across sites in near real time.
13. Provide an option for automatic propagation of user interface customization (by default, this can be site specific).
14. Ensure that all inter-site traffic is encrypted.
15. Use the multiple site framework to monitor synchronization for any potential problems.
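To illustrate points 2, 8, and 9 above, here is a minimal sketch of transaction replication with automatic outage recovery: every committed transaction is queued for every peer site, and each queue is drained in commit order, so a site that drops off the network simply catches up when connectivity returns. The SiteReplicator class and its transport interface are hypothetical illustrations, not CM+ internals.

```python
from collections import deque

class SiteReplicator:
    """Sketch of per-site transaction replication with outage recovery.
    Hypothetical transport: transport.send(peer, txn) returns True on
    successful delivery, False if the peer is unreachable."""

    def __init__(self, peers, transport):
        self.transport = transport
        self.outbound = {p: deque() for p in peers}

    def record(self, txn):
        """Called for every committed local transaction."""
        for queue in self.outbound.values():
            queue.append(txn)
        self.flush()

    def flush(self):
        """Try to drain each peer's queue; stop at the first failure so
        transactions are always delivered in commit order."""
        for peer, queue in self.outbound.items():
            while queue:
                if not self.transport.send(peer, queue[0]):
                    break  # peer unreachable; retry on next flush
                queue.popleft()
```

Driving flush() from a periodic timer or a connectivity callback gives the automatic-resync behavior described in point 9.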
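And here is a hedged sketch of at-rest encryption as in point 3, using Python's cryptography package. This is purely illustrative and an assumption on my part, not how CM+ implements it: a symmetric key is generated when the library is created, and content is encrypted on every write and decrypted on every read.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Key generated once, at library creation time, and stored outside
# the repository (e.g., in a key management service).
key = Fernet.generate_key()
cipher = Fernet(key)

def store(path: str, data: bytes) -> None:
    """Write repository content encrypted at rest."""
    with open(path, "wb") as fp:
        fp.write(cipher.encrypt(data))

def load(path: str) -> bytes:
    """Decrypt repository content on read."""
    with open(path, "rb") as fp:
        return cipher.decrypt(fp.read())
```

Someone who copies the library files off a remote site's disk then holds only ciphertext, which is the "they still don't have the goods" property from question 4.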

For a commercial organization, centralization is important. I don't buy the suggestion that distributing data minimizes backup times, server delays, etc. If that is the case, you're using old "BIG-IT" technology, which is generally server-centric, instead of using smart clients.

I really don't think cloud computing should apply to CM/ALM, unless it's a pseudo-cloud (e.g. having IBM host your repository for you).

So those are my thoughts, along with how we have integrated these thoughts into the CM+ product.

