Content Management System Discovery Project Report

September 30, 2004

Table of Contents

Executive Summary
Introduction
    CMS definition
    Project Charter
    Team
Assumptions
The CMS "Space"
The Process
    Gathering data from the community
        The Survey
        The Focus Groups
        Conclusions based on community input
    Product Review
        Functional Requirements
        Technical Requirements
            Products Eliminated from Consideration
            Products which merited further investigation
            In-depth Testing
                OpenACS – the First Runner Up
                Lenya
                Atomz
                Macromedia Contribute
The CMS Discovery Team Recommendation
    Large site recommendation
    For any size site -- An alternative business model to #1
    Small site recommendation
The Task to IS&T
Appendices
    Project Charter
    The Survey
    Product Matrix
    SiteMaker Evaluation and SiteMaker Issues List
    Plone Evaluation
    NCSU Lenya Questions and Answers
    CMS Consultation Guidelines

I. Executive Summary

Based on community and product research, the team has three primary CMS recommendations that will serve the widest range of possible customers.

Recommendation #1: For large sites, particularly those that want a CMS that can repurpose content, the team recommends the further development of the open source product, Lenya. Apache Lenya, based on the Apache Cocoon content management framework, is an open source, full-featured content management system that is currently under active development. Lenya is programmed in cross-platform Java and its content is stored in an XML repository. The team believes that XML is a forward-looking technology whose predicted long life and flexibility will justify the long-term commitment of resources to development and support. Adopting Lenya would require significant development and ongoing customer support on the part of IS&T. Nevertheless, given the scale and call for CMS services within the MIT community, it is appropriate for IS&T to develop in-house CMS expertise to serve those needs.

Recommendation #2: For those sites that want the flexibility of XML and an easy interface, but choose not to wait for development of Lenya, the team recommends Atomz Publish. Atomz Publish is a mature, commercial content management system based on a hosted Application Service Provider (ASP) business model. Development and asset content is stored in an XML repository on Atomz servers. Published content resides on the site owner’s web server or Athena locker. This model is appropriate for those DLCs able to support the ongoing subscription costs of the ASP business model. Contracts with Atomz should be negotiated and managed through a centralized IS&T service, so that the MIT community as a whole spends its Atomz dollars efficiently, getting more service through coordinated volume. IS&T should first determine whether Atomz can scale to the potential volume that the MIT community could bring to it.

Recommendation #3: For DLCs that do not support their own web site infrastructure and do not require an enterprise-level CMS, we recommend Macromedia Contribute as a workgroup-level content authoring tool. Contribute's interface and tools will be familiar to current Macromedia Dreamweaver users, but are considerably less complicated. Additionally, expanding AFS services and tools will alleviate some of the burden on administrators of Athena-hosted static sites and offer more interactive functionality. Therefore, the recommendations for small sites are as follows:

1. Volume license of Contribute, pending go-ahead from the SWRT process.
2. Develop additional Athena services, and promote them aggressively to the community.
3. Training: Athena Web-Site Hosting Information Technology certificate program.

See Small site recommendation for details of recommended Athena services and training. See The Task to IS&T for specifics on IS&T’s role in delivering these recommendations.

II. Introduction

1. CMS definition

The team agreed upon this definition for a content management system, based on the projected needs of the MIT community: A content management system is a process and/or software application that allows groups to effectively plan, create, manage, store and distribute content. Content can be anything from:

• published documents (web or print),
• images,
• archived communications,
• presentations,
• or streaming media.

For a CMS to be an effective system it needs to have the following traits:

• Provide a method of content creation and editing that a non-technical user is comfortable using; this is usually done with templates.
• Provide a system for describing the content (metadata) so that it may later be found using a search engine.
• Provide a workflow mechanism for resource assignment, review, and tracking throughout the content's development cycle.
• Provide a method of tracking and storing various versions of content to provide the ability to revert to a previous version of a content item or to review the changes that occurred.
• Provide for the redistribution of content in various forms, whether that is via the web, print, disk, or other media.
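As an illustration of the metadata and versioning traits listed above, the following is a minimal sketch, not a description of any product reviewed in this report. The class name `ContentItem`, its fields, and the `search` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of content with descriptive metadata and a version history."""
    title: str
    metadata: dict = field(default_factory=dict)  # e.g. {"keywords": [...]}
    versions: list = field(default_factory=list)  # oldest version first

    def save(self, body: str) -> int:
        """Store a new version and return its 1-based version number."""
        self.versions.append(body)
        return len(self.versions)

    def current(self) -> str:
        """Return the body of the latest version."""
        return self.versions[-1]

    def revert(self, version: int) -> int:
        """Make an earlier version current by saving a copy of it as a new version.

        Nothing is deleted, so the full history stays available for review.
        """
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"no such version: {version}")
        return self.save(self.versions[version - 1])


def search(items, keyword):
    """Return the items whose 'keywords' metadata contains the keyword."""
    return [item for item in items if keyword in item.metadata.get("keywords", [])]
```

Reverting by appending a copy, rather than discarding later versions, is one possible design choice; it keeps the audit trail intact at the cost of extra storage.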

2. Project Charter

See Appendix A.

3. Team

Project sponsor:
• Susan Minai-Azary, Director, IT Architecture and Infrastructure

Project manager:
• Rich Garcia, Discovery Process Project Manager

Team members:
• Cecilia Marra, Office of Academic Services (team leader)
• Mark Begley, Department of Economics
• Sean Brown, Web Communications Services
• Tim Boyden, Facilities
• Roberta Crumrine, Student Services Information Technology (expert resource)
• Tim Griffin, Information Services and Technology
• Carl Jones, Libraries
• Larry Stone, Information Services and Technology
• Johanna Purcell, Technology Review (expert resource)

III. Assumptions

In reviewing products that might suit the varied needs of the many community customers, the team made several baseline, universal assumptions:

1. Conforming to a certain level of accessibility standards is a requirement. This may mean, however, that site templates can be managed to contain and retain accessible features such as image tagging, skip links, etc. Difficulties arise when new elements are introduced within content being authored by end-users. In such cases, the burden of properly tagging elements for accessibility falls either on the author or site manager as a manual process. This is clumsy, but appears unavoidable.

2. Whatever the team recommends must be better than the existing tools and services. We will be asking customers to learn new skills and devote both human and financial resources toward working with a new product. The end results have to justify the cost and effort to the customer.

3. Whatever the proposed recommendations, there will be a significant investment of IS&T resources, both human and financial. This investment may involve the development of new technical expertise among in-house staff or the outlay of financial resources to external developers and vendors. It will certainly involve in-house training and documentation resources. The team endeavors to make responsible recommendations that promise the most value to the customer in return for the commitment of IS&T and community resources.

IV. The CMS "Space"

1. In the Academic Community in General

In addition to homegrown CMS solutions and integrated LMS/CMS/portal solutions, many CMS products are in use throughout the higher education sector. Still, no clear CMS market leaders have yet emerged. Based on information gathered over the last two years from the archives of the well-subscribed University Web Developers listserv and the Educause Web User Group archives, the following chart shows a selection of which products are in use at which schools:

Product: Estrada
School: University of Alabama at Birmingham, Virginia Military Institute
Customer comments: UAB: since 1996 with wonderful results. Most of our campus, approximately 36,000 pages and over 700 authors, uses this software. We do have some departments that prefer to use their own methods and no campus mandate prevents this. VMI: We have been very pleased with this commercial product.

Product: OmniUpdate
School: Dartmouth
Customer comments: We started phasing out Frontier and Manila in late 2002, when we conducted a discovery project on content management. (http://www.dartmouth.edu/goto/webcmsdiscovery/) We made the decision to move forward with OmniUpdate for our clients. (http://www.dartmouth.edu/goto/webcms/) We now have fifty sub-sites under the management of our four-person department, and nearly 100 content developers. (http://www.dartmouth.edu/goto/webpubsites/) Right now, all the sites are supporting a standard template from the Office of Public Affairs. When we next move forward, into 100% CSS design, each site will be able to have a distinct look.

Product: Atomz Publish
School: Southern Oregon University
Customer comments: We'll be rolling it out over the summer (2002). I think we've found an excellent product at a good price.
Team comments: http://www.sou.edu/access/. The Atomz logo is used as a link to enter the authoring environment.

Product: CommonSpot
School: Ohio University, Kent State
Customer comments: OU: It is a VERY robust system that runs on a cold fusion platform. Kent: This application has been a tremendously successful program for us.

Product: Midgard
School: The University of the South Sewanee
Customer comments: It works pretty well and could work MUCH better if I had the time to write more custom code.

Product: LiquidMatrix
School: Canisius College, Wilkes University
Customer comments: CC: It is pretty robust -- ask them about version 1.9 -- and the interface is simple enough for those without any HTML knowledge. It will cost you a few $$ but is worth it. Wilkes: The CMS has its quirks…

Product: Roxen CMS
School: University of Alaska
Customer comments: Cost: $50-100K. Low level template development is handled by 2 FTE (technical). Template implementation is done by the hosted site's campus web coordinator (semi-technical). Most everyone else is a content provider (no technical knowledge necessary, including HTML). Upcoming portal project will require an additional 3+ technical FTEs in this area.

Product: Typo3
School: Univ Missouri-Rolla
Customer comments: a lot of pre built functionality, a robust user management, active real world user groups

Product: Zope/Plone
School: University of Calgary
Customer comments: used consultants to help us get it up (in a record 3 months from start to finish) and are now in the process of learning the language. We have about 25 staff doing updates to the site and the CMS has made life much simpler for all of us. Previously, our site was maintained with Dreamweaver library items and templates - much greater learning curve for the users who are generally area secretaries with no Web background.
Team comments: www.haskayne.ucalgary.ca

Product: Ektron eMpower
School: Rice, Emerson
Customer comments: Rice: We've rolled this out to about 65 campus departments, research centers, and other misc organizations. It has been a very robust system for us, but it is completely lacking in cross platform compatibility.
Team comments: Other schools just use the Ektron editor.

2. At MIT in particular

Here at MIT, in the absence of an enterprise-wide CMS product, various DLCs have licensed or developed their own solutions.

DLC                                                    Product
OpenCourseWare                                         MS CMS
Technology Review                                      Was RedDot; now migrating to Atomz
MIT Home page                                          Template Toolkit
SloanSpace                                             built on OpenACS
Curriculum Information System and other SSIT systems   built on Oracle with SQR
Economics                                              customized Zope solution
MIT News Office                                        custom Filemaker solution
MIT World                                              custom LAMP solution

V. The Process

1. Gathering data from the community

The CMS team gathered information from the MIT community on their various CMS experiences and needs, and on web site management in general. The team took a multi-pronged approach to gathering data, in the hope of gaining as much input as possible. A round-table discussion was offered during the Fall 2003 IT Partners conference; a web survey was advertised through TechTalk, yielding 40 respondents; three focus groups were held; and, finally, a hands-on demo of the two leading products in contention was hosted.

A. The Survey

See Appendix B.

B. The Focus Groups

It became clear to the team early in the data gathering process that, from a CMS perspective, the MIT web site hosting community fell into three broad groups: small sites with no CMS, large sites with a CMS, and large sites without a CMS. The small sites typically had a single site administrator receiving and publishing content from scattered authors with varying web skill levels. They also typically did not have a structured business process for publishing. The large sites may have had more structure to their process in common, and may have had similar issues in terms of needs, but the fact that some of the large site owners had already experienced publishing with a CMS provided a natural split in the course of focus group exploration of the topic. The experienced CMS users were in a position to discuss what features and aspects of a CMS had really paid off, which had not, and which of their needs were still unserved.

C. Conclusions based on community input

1. The Product:

• No single product will suit all community needs. Therefore, the natural breakdown of customer types is by site size and the customer’s ability to provide its own technical (hardware, software, programming) support. We may also need to slice the customer base out by purpose of content/organization (academic vs. publication vs. administrative documentation).
• Standards compliance matters. This encompasses Section 508 accessibility as well as HTML, XHTML (if applicable), and CSS validation. Compliance matters both to our customers and to MIT as institutional policy.
• The product should have the ability to integrate with other MIT systems in some fashion. This may be accomplished elegantly or not, but is an important consideration in selection and development of the product.
• Templates must be easy to use and revise. Administrators are typically expected to anticipate all possible scenarios at the time of development, and have little flexibility afterwards to make changes or enhancements. This has forced users to go outside the system for any pages that do not fit the template.
• CMS templating may force customers to give up some of the JavaScript bells and whistles that they already have and like in their existing site’s look and feel. "Vanilla" CMS templates are an emphatic no-go for the large sites, and possibly for some of the small sites.
• If the product separates content from format, the authors must somehow still be able to preview when creating or editing content. This may be through a staging server, temp files, or downloads into templates.
• A shared resource is only feasible when different customers can have their own instance of the application, allowing them full control over their own instance. If given this control, customers are more than willing to share the cost of services. They are also willing to have their content stored externally, especially if there is a mechanism for data dumps in a usable format like XML.
• Many of the open source products will have similar capabilities. Choosing the right open source product will depend on the Institute’s code preference, based on in-house expertise and belief in the language’s longevity.

2. The Customers:

• No product can eliminate author personalities, nor can it make content magically appear. One of the biggest frustrations noted throughout all the focus groups, surveys, and round tables had to do with aspects of the authors themselves, whether it was their varying skill levels, their willingness to conform to format requirements, or their ability to deliver content in a timely manner.
• The learning curve must be shallow. ‘Nuff said.
• There is no product that does not require a knowledgeable web site administrator, no matter how large or small the site, no matter how primitive or sophisticated the product.

3. The Process:

• A CMS is not a substitute for a business process, unless the customer is a one-person shop. A CMS can support a process but cannot force compliance or consistency where an ill-defined business process exists. If there is not consensus, investment, and enforcement in making the tool effective, it will fail. Well-defined business processes offer the most promise for success when implementing a CMS. On the other hand, highly defined processes bordering on excessively complex will be so idiosyncratic that a custom-built tool will be required, and all IS&T can do is offer guidelines rather than product recommendations.
• Workflow must be highly adaptable, or workarounds ensue. Customers need to be able to remove steps/layers of approval as needed.
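The requirement that customers be able to remove approval steps can be pictured with a short sketch. This is an assumption-laden illustration, not the workflow engine of any product reviewed here; the `Workflow` class, its method names, and the step names are all hypothetical.

```python
class Workflow:
    """An ordered list of approval steps that a site owner can edit."""

    def __init__(self, steps):
        # Copy the list so later edits do not affect the caller's data.
        self.steps = list(steps)

    def remove_step(self, name):
        """Drop one approval layer, e.g. for a one-person shop."""
        self.steps.remove(name)

    def next_step(self, completed):
        """Return the first step not yet completed, or None when ready to publish."""
        for step in self.steps:
            if step not in completed:
                return step
        return None
```

For example, a site that starts with the steps "author", "editor", and "legal review" could call `remove_step("legal review")`, after which content that has passed "author" and "editor" is ready to publish.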

2. Product Review

A. Functional Requirements

The team found that these functional requirements listed by the Technology Review publishing staff were representative of community customer needs:

Need:
• To be able to tag content with categories -- on multiple levels
• Authorization (having an admin account that sets the level of access others will get for the project)
• Flag items on an element basis (i.e., an article is made up of the elements: deck, title, author, date, body, etc.)
• Global elements
• Global files (i.e., using the same article over again in different locations in the project)
• Publishing out different versions (e.g., print version)
• Publishing to a dev server before the page goes live to the site
• Ability to preview before publishing
• Integration with rich media (e.g., Flash) and other vendors (e.g., Dart)
• Ability to publish our 3x weekly newsletter
• Internal searchable directory
• Ability to have dynamic pages
• Proprietary database
• Ability to access CMS from the office and from outside the office easily
• Versioning control (e.g., templates)
• Compatibility with browsers
• Customization
• Publishing XML content
• Speedy and accurate publishing
• Notification of publishing (e.g., when it’s finished, if there were errors)

Want:
• Ability to code in a program like Dreamweaver rather than an unusual template that is part of CMS software
• Cross browser testing
• Cross platform testing
• Ability to have entire site in content management system
• Better integration with Search tool
• Intuitive UI
• Easy upgrades/maintenance
• Small learning curve

B. Technical Requirements

The team compiled a matrix of products and how those products stacked up against the technical requirements of serving the MIT community enterprise-wide. There was separate consideration for small sites whose hosting and infrastructure could either be managed by the site owner or through AFS services. Both proprietary and open source technologies were reviewed.

1. Products Eliminated from Consideration

The team decided that since many open source products had much in common, it was not necessary to review all of them, but rather to take a few that were representative of their technologies and features. Platform constraints were also a weeding factor. Platforms and technologies such as IIS/ASP or Cold Fusion are not ideally or easily suited to the MIT environment. The team does recognize that platform issues may not be a showstopper for MIT (e.g., mod_asp would run ASP on Apache) and suggests that consideration be given to proprietary products, should the right candidate come down the pike in the future. At this time, we did not find any proprietary product that offered dramatically superior features to the open source products we reviewed.

In addition to the above considerations, the team decided not to move forward with testing several of the products on our matrix for various reasons, which are summarized here:

• Oracle: Enticing due to the existing volume site license, but considered an "elephant gun" for development and therefore only appropriate for enterprise-wide applications, none of which are currently outstanding.

• SSIT Toolkit: The SSIT toolkit is not a contender as a product unless it were to be rewritten in Java. The IAP and CIS deployments, which were developed with the Toolkit, do have workflow schemes. These may be useful as models of workflow for comparison to other products.

• Manila and other blogging products: While Manila is perhaps a lightweight tool, not sufficiently feature-rich for full-scale CMS use, it could be a handy add-on for site managers that want blog-type community discussion. Userland puts out a fuller CMS product, in addition to Manila, called Frontier that might be more appropriate as a real CMS. Pros include low cost; cons include questionable performance capability, particularly of concern due to the MIT network's susceptibility to being "slashdotted." Also, the showstopper for Manila is that secure FTP is not currently available, although it is expected in the next release. The team agreed in general that blogging tools are not sufficient to the community's CMS needs.

• MS CMS: While we did not want to eliminate products for platform reasons alone, we did not find that MS CMS had sufficient functionality to compensate for the platform and security concerns around running the product. We cannot expect that our customers can or will be willing to give up their existing cross-platform user base. The primary community user of this product (OCW) is in the process of looking for a technology to replace MS CMS.

• RTFM: This product is currently bare bones, and would require development almost from scratch. However, it bears watching, particularly as an implementation of RTFM is already underway as part of the Casetracker-to-RT development and deployment. RTFM will host Stock Answers. How well RTFM serves Stock Answers will better inform us as to its value as a CMS service within the MIT community.

• Template Toolkit and MetaDot (TT application): Template Toolkit lies behind MIT's home and upper level pages, but for larger sites, much more development would be required. For the amount of labor invested in development, other open source products offer more functionality on which to build.

For the complete matrix of products discarded from contention, see https://web.mit.edu/is/discovery/content-mgmt/internal/matrix_rej.shtml

2. Products which merited further investigation

As a consequence of completing the product matrix, several products stood out as worthy of more in-depth evaluation, for either large or small sites. Of these, the team chose to devote significant effort to assessing the viability of the following products within the MIT environment: GVC SiteMaker, OpenACS, Apache Lenya, Macromedia Contribute, and Zope/Plone.

For the complete matrix of products considered the strongest contenders, see Appendix C.

3. In-depth Testing

Because no polished demo versions of these open source products already exist, the team took up the challenge of implementing our own demo installations for evaluation and testing purposes. Carl Jones implemented test sites for SiteMaker, Lenya, and Plone. Larry Stone implemented a test site for OpenACS. Larry and Carl also implemented templates in their respective sites, while the rest of the team simulated various authoring tasks and levels of experience. Tim Griffin took on the task of in-depth evaluation of Contribute, bringing to bear his roles as a Contribute Beta Project tester for Macromedia and a member of IS&T’s Web Communications Services team. See Appendix D for evaluation of SiteMaker. See Appendix F for evaluation of Plone.

A. OpenACS – the First Runner Up

The team found OpenACS to be a very strong contender that could certainly be developed to serve the community's needs, should IS&T decide not to pursue Lenya’s particular set of technologies. The integrated database, the available authoring and CMS modules, and the existing MIT/Sloan expertise in its technologies all make OpenACS a viable candidate. The disadvantages noted below (standards compliance and lack of WYSIWYG authoring) point to where further development should be prioritized. Because of its potential to serve the MIT community’s needs, the team’s evaluation of OpenACS is included here.

Summary

OpenACS could form the basis of an excellent enterprise-wide CMS, although it would require a substantial investment in development, customization, and documentation. It already has virtually all of the features we require. We can also leverage the experience and efforts of SloanSpace, a major OpenACS installation already on campus.

Overview

OpenACS is a server-based open source Web application framework. It consists of a suite of core services and a large collection of optional application modules, including several different types of content management system (CMS). These application modules are easily modified, extended, and integrated into new services. OpenACS has its roots in the ArsDigita Community System (ACS). Version 3 of the ACS was released to the open source community to become OpenACS, while the Java reimplementation is now with Red Hat.

Features

Some relevant features that exist in the current (5.0) release of OpenACS:

• User registration and login.
• User groups, sophisticated ACL-type permission system.
• Supports multiple independent top-level websites, each with independent administration, look-and-feel, etc.
• Driven by relational database for performance, reliability, and ability to manage structured data easily.
• Hierarchical "Content Repository" is the basis of OpenACS' CMS-like applications. It supports versioning, audit trails, access control.
• A powerful, hierarchical template engine makes it easy to separate content and data from presentation. Lets you customize all Web pages, even the administrative UI.
• "File Storage" module maintains "assets" like images and opaque (Word, PDF) documents. Upload and download individual files or whole ZIP archives. (WebDAV coming soon.)
• Simple but highly user-friendly "Edit This Page" CMS, suitable for maintaining simple pages and structured data.
• "News" application specialized for managing news items, including automatic posting and removal.

Technology and Platform

OpenACS is built on the following components:

• Any common Unix operating system such as Linux (e.g. Athena Linux 9.2) or Solaris 8/9.
• AOLserver open-source Web server. "AOLserver is the backbone of the largest and busiest production environments in the world. AOLserver is a multithreaded, Tcl-enabled web server used for large scale, dynamic web sites."
• Relational database, choice of PostgreSQL or Oracle.
• TCL, an interpreted scripting language. OpenACS is primarily implemented in TCL, with some functions in the database scripting language (PL/SQL and PostgreSQL's equivalent) for performance.

Although AOLserver is not as widely deployed as Apache, it is superior in some ways; it is multithreaded, fast, and efficient. Some of the busiest sites on the internet run on it. It is open source and has an active development community with good support. TCL is an unusual choice for a development language, but at least one potential developer in IS&T is fond of it. TCL is clean, simple, and easy to learn; anyone exposed to Scheme or LISP will find it familiar. It is also less prone to bugs and pitfalls than, e.g., Perl. Although some of the technologies behind OpenACS seem exotic if you are accustomed to LAMP (Linux + Apache + MySQL + PHP/Perl) servers, they are all well-proven, mature, and actively supported.

Community

There is a large and active OpenACS community. See http://openacs.org for references to large-scale users, such as Greenpeace. A major OpenACS-based project, DotLRN (http://www.dotlrn.org/), has its roots in the Sloan School here at MIT.

Advantages

• Scalability: A small site (e.g., a departmental site) with minimal customization is easy and cheap to implement, while a sophisticated site is also possible with more time and effort.
• Very low learning curve to maintain a site built with Edit-This-Page. Small updates and changes are easy.
• Mature software that is unlikely to change in drastic ways, so it is less effort to maintain a central service than if the software were still evolving.
• Can collaborate with other MIT projects (Sloan, DotLRN).
• Additional useful services are also available (news, calendar, workflow, conference registration, ecommerce, RSS feeds).

Disadvantages

• No HTML validation of content (might be possible to add it).
• Poor and incomplete documentation.
• No tools to extract and repurpose content e.g. as XML or print (could be added).
• No WYSIWYG HTML editing (might be added with in-browser editors).

Additional Work Needed

If OpenACS is adopted, it will need at least this much additional work to be deployed at MIT:

• Finish adapting OpenACS to login automatically with MIT client certificates (mostly done).
• Finish implementation of user-editable templates in EditThisPage.
• Write documentation for end users (webmasters, site maintainers).
• Add or improve an asset-management module (like FileStorage), for e.g. images in web pages. Try adding WebDAV support.
• Improve tools for importing an existing site into CMS.
• Add EditThisPage sub-applications for common cases of structured data (e.g. faculty and staff lists, publications, etc.).

Extras

Some unexpected extra features and benefits of OpenACS:

• Workflow module available; we didn't investigate it.
• Calendar (scheduling) module, also untried.
• DotLRN project is implementing WebDAV server module that can be used by itself.
• Ecommerce module and event registration module could be combined to automate conference registration and payment.
• Can be configured to allow outside users to register and join communities, so they can be given administrative privileges or just see restricted material.
• OpenACS has other, more sophisticated (and less mature or documented) CMS modules that should be investigated.

B. Lenya – The “Winning” Product

Lenya could fulfill the role of an enterprise-wide CMS at MIT, although it would require a substantial investment in development, customization, and documentation. It has many of the features we require, although some will need enhancement. One of Lenya's core concepts is to reinforce the separation of formatting from content creation.

Overview

Lenya is a server-based open source Web application framework. Lenya has its roots in the Wyona content management system, which is based upon the Apache Cocoon web publishing framework. Wyona was then released as open source and quickly adopted by the Apache Cocoon community. One of the early adopters was the University of Zurich. Wyona.com continues to supply consulting and support.

Features

Some relevant features that exist in the current (1.2) release of Lenya:

• XML-centric architecture
• Robust XML/XSLT/CSS support enforces separation of content from formatting
• WYSIWYG XHTML/XML editor support from within web browser
• XHTML/XML form editor
• User registration and login
• User groups, sophisticated ACL-type permission system.
• Supports multiple independent top-level websites, each with independent administration, look-and-feel, etc.
• Customizable workflow engine, with audit trail
• Asset management to keep track of images and documents that belong to a page
• Revision control
• Flexible deployment options
• Security: SSL, LDAP authentication, IP address range

See http://cocoon.apache.org/lenya/roadmap.html for more detail on current and future release features.

Lenya is built on the following components:

• Unix operating system such as Linux (e.g. Athena Linux 9.2), Mac OSX, or Solaris
• Apache webserver, the backbone of the largest and busiest production environments in the world.
• Servlet container, choice of Tomcat, JBoss, etc. Tomcat was chosen for the test installation, for simplicity.
• Cocoon 1.2.5
• Java (jdk 1.4.* and above recommended)
• Content stored as XML documents
• Uses XSLT and CSS for formatting and page presentation
• WebDAV
• Lenya is officially an Apache incubation project. Incubation normally lasts for about 1 year. The latest official release is version 1.2, June 27, 2004. See http://forrestbot.cocoondev.org/sites/incubator-site/process.html


Performance

· Servlet containers. There have been reports of stability issues with Tomcat (depending on release). Further stress testing with servlet containers (Tomcat vs. JBoss, etc.) is recommended.

Community

There is a small but growing and active Lenya community. See http://cocoon.apache.org/lenya/community/ for links to live sites.

Needed

· Browser-based editing (e.g., Kupu, Bxeng), with its heavy reliance on JavaScript, can be erratic, and validation is sometimes unpredictable; it needs to be more consistent. Some of this is setup and training-related.
· Add web-editable CSS to the file menu, instead of editing files on the server.
· Better end-user documentation as well as documentation for maintainer/admin. Sample of existing documentation: http://cocoon.apache.org/lenya/docu.html#docs/components/
· Better image/asset selection in authoring mode.
· Improve browser-based HTML editors.
· Import pre-loaded MIT users and groups.
· Finish MIT cert integration, automatic login. Allow passwords too.
· Examples for integrating Cocoon dynamic components and Lenya.

Extras

· Lucene search engine integrated into Lenya.
· Workflow management included.
· Dynamic component integration through Java-based Cocoon frameworks.
· Wide variety of technologies, both legacy and cutting edge (e.g. JSPs, Velocity templates, Cocoon forms, Flowscript, relational and XML database connectivity, portal management, etc.) may be used.
· WebDAV fosters offline editing with Dreamweaver or other desktop editing tools, including MSWord, OpenOffice, etc.
· Repository JSR 170 will offer significant ease-of-use improvements for navigating the file systems -- due out in the fall. This, coupled with the ability to edit CSS files, will provide a substantial boost in functionality. See http://wiki.apache.org/incubator/JcrProposal.

Additional work to do:

· Configure Apache to accept MIT certs and allow login without a password.
· Test using XIncludes for content aggregation (does not require command-line access to the server to edit XSLT files).
· Prototyped a simple import of the IS&T sample site.

Advantages

· Lenya is built upon the Apache open-source software stack.
· Lenya is young enough that MIT could have substantial input into the direction of future development.
· Static or "dynamic" publication is possible.
· May integrate components developed with Cocoon (e.g. forms, database access, customized pages, personalization, etc.).
· Scalable to both large and small sites.
· Standards-based XML editing (validation may be customized on a per-publication basis).
· Custom "document types" support (customizable XML formats).
· May optionally output to non-HTML formats, such as PDF, with minimal changes.

Disadvantages

· Relatively new technology with limited penetration at MIT (e.g., Apache Cocoon with its reliance on the pipeline architecture, XML, XSLT, WebDAV, etc.).
· Some development needed using an "unfamiliar" XML-centered framework.
· Documentation needs to be more complete. Administration and development documents can be inconsistent, and end-user documentation leaves much room for improvement (but is improving).

Other points to make:

See North Carolina State University questions (Appendix G) on Lenya usage.

Lenya outputs XHTML, using a combination of XML, XSLT, and CSS. The architecture used by most Lenya publications is based on the idea of aggregated content, whereby the output from multiple XSLT transformations is combined to form the rendered document. Each transform will construct one part of the overall document (for example, the masthead). This is a simple approach that is similar to that provided by other CMS templating systems.

An alternate approach is to use a single XHTML template document that uses XInclude to aggregate content. In some ways, this is even simpler than the first approach in that no knowledge of XSLT is required.
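The single-template approach can be illustrated with a short Python sketch that uses the standard library's XInclude support. This is an illustration only, not Lenya or Cocoon code: the fragment names, the in-memory `FRAGMENTS` store, and the `load_fragment` helper are hypothetical stand-ins for content that a real publication would keep in its repository.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical content fragments; a real site would load these from its repository.
FRAGMENTS = {
    "masthead.xml": "<div id='masthead'>Example Department</div>",
    "body.xml": "<div id='body'><p>Welcome.</p></div>",
}

# A single template that aggregates content with XInclude; no XSLT is involved.
TEMPLATE = """<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="masthead.xml"/>
    <xi:include href="body.xml"/>
  </body>
</html>"""


def load_fragment(href, parse, encoding=None):
    """Loader callback: return the parsed fragment named by href."""
    return ET.fromstring(FRAGMENTS[href])


def render(template):
    """Parse the template and replace each xi:include with its fragment."""
    root = ET.fromstring(template)
    ElementInclude.include(root, loader=load_fragment)
    return root


page = render(TEMPLATE)
section_ids = [child.get("id") for child in page.find("body")]
```

After `render` runs, each `xi:include` element has been replaced by the fragment it names, so a template author only needs to know which fragments exist, not how they are produced.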

C. Atomz – A Viable Product with an Alternative Model

Atomz came to the forefront of the team’s attention late in the project. Based on very strong positive feedback from Technology Review, the team felt that there was a customer niche that could be well served by the easy user interface and excellent customer service offered by this particular vendor through the ASP business model. Atomz Publish is a well-established commercial content management system product that could serve as a vendor hosted and supported service for the MIT community. Atomz Publish will integrate with existing MIT resources (Athena web lockers, Macromedia Dreamweaver editor) and has been recently implemented by the MIT Technology Review for their web content management and publication needs. Atomz Publish has an extensive customer list including high profile customers such as NASA, AOL/Time Warner, and the Harvard Medical School.

Overview

Atomz Publish was introduced in 2001 as a hosted application solution to web content management. As a hosted application, Atomz Publish runs on servers located in Atomz's network operations center, where it is maintained and updated. The customer's data remains on their own servers, where it is accessed from Atomz Publish by SFTP or direct folder access over a VPN. Atomz updates the Publish application on a quarterly basis, taking into account customer comments and feature requests.

Features

• Open standards: XML, XHTML, XSLT, CSS
• Integrated XML content repository - import, export, archive XML
• Cross-platform in-browser WYSIWYG editing with spellcheck or integrated development with Macromedia Dreamweaver or Adobe GoLive
• Form-based editing using rich text
• Manage and upload assets directly using web folders on Windows and Macs
• Integrated and customizable task-based workflow with email notifications
• Check-in, check-out and unlimited versioning of content, assets and templates, change tracking
• Scheduled publishing and content expiration
• Metadata management
• Automatic hyperlink checking
• User and publishing activity reporting tools

Advantages

• Hosted commercial application - no further programming development, software or hardware required to run application/service.

• End-users will easily be able to create/edit content and templates using in browser editing tools or Macromedia Dreamweaver and Adobe GoLive.

• Well documented and supported by established entity.

• Works with existing Athena web lockers or departmental web server.

• Web standards/accessibility compliant.

• XML content could be repurposed for other publishing mediums.

Disadvantages

• Hosted commercial application - annual support/subscription costs, application not under local control.

• Windows/Macintosh-centric, limited support under Linux.

• Not Open Source.


D. Macromedia Contribute – For Smaller Sites

Overview

Macromedia Contribute 2 (2.1 for Windows) is likely the best candidate for a low-end, low-cost content management solution. A user can browse to web pages and then edit and publish those pages all from within Contribute. Additionally, Contribute can import Microsoft Office content (Windows only in v2). This feature alone is a big plus for faculty and administrative staff at MIT. Further, Contribute integrates with Macromedia Dreamweaver MX for site administration and development purposes. And the price is right at $79 per user license (education) and likely less with a site license.

Features

Publishing Process

Contribute offers a "Word-like" WYSIWYG experience that allows content contributors to browse to a web page, click on the edit button, make changes, and click on the publish button. Users can easily update content, insert images or add new pages to a site. Users of Contribute never see the code. This prevents any deviation from an established style guide set by site administrators and spares the contributor the undue stress that a feature-rich tool such as Dreamweaver might create.

Site Management

Site management is done by a site administrator through Dreamweaver, Contribute and/or at the file level on the server. The site administrator creates the file structure, sets site permissions, and develops site templates. A site connection key is created and emailed to the client or stored on a local server for download by the Contribute user. The Contribute user then simply double-clicks on the connection key, which launches Contribute, asks for appropriate passwords, and makes a connection to the site. All site settings and permissions are stored in an encrypted XML file at the root of each site.

Integration with Dreamweaver MX

Contribute is integrated with Dreamweaver MX. This integration allows for easy site administration and development using DW MX and MX Templates. Contribute uses the Dreamweaver MX authoring engine and so offers support for CSS, XHTML, server-side code and Dreamweaver MX templates.

Dreamweaver MX Templates

Dreamweaver templates can be designed with Contribute in mind. The site administrator can dictate which templates must be used when a Contribute user wants to create a new page on the site. This ensures adherence to design standards and protects the code from unintended hacking.

Community and Support

Currently there are a few DLCs looking at rolling out Contribute to their content contributors. Among these are Facilities, OCW, and MIT Libraries. There is a large user community in education and the private sector as well as significant documentation provided by the vendor.


Advantages

• Uses Dreamweaver rendering engine
• Intuitive interface
• Server connection wizard is easy
• Creating new pages is easy
• Creating links is very easy
• Publishing pages is easy
• No local copy needed or created
• Site Administrator sets permissions
• Uses check-in/check-out for version control
• Supports Word files with drag-and-drop (Windows only)
• Clean code
• Many schools are deploying it as part of a content management solution (BU, Notre Dame, Indiana, USC, ASU)
• Most users are publishing after a 15-minute overview

Issues and Concerns

Note: Most of the issues noted below should be remedied in the next version of the application.

Security

• The shared settings files, which hold all permissions, site passwords (not SFTP passwords) and permission group definitions for website access, are kept in a directory called "_MM." The directory name and files cannot be altered or changed in any way other than with DW or CT. These are encrypted XML files. By default, these files are visible on Apache servers. For these files to be secure and not visible to the world, the server would need to be reconfigured so as not to show files or folders that begin with an underscore.

• The IS&T Net-Security Team states that the algorithm used to encrypt these settings files, MD5, isn't truly encryption and that the algorithm doesn't provide much security at all. However, on the spectrum of risk we needn't consider it a show-stopper.
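The Net-Security Team's point about MD5 can be illustrated with a minimal Python sketch. MD5 is a one-way hash function rather than an encryption algorithm: there is no key, and the same input always produces the same digest, so a weak value can be recovered simply by hashing guesses and comparing. How Contribute actually applies MD5 to its settings files is not described in this report; the helper names and the sample values below are hypothetical and only demonstrate the general property.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of text as a 32-character hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def guess_value(stored_digest, candidates):
    """Return the first candidate whose MD5 digest matches, or None.

    No key is needed: anyone who can read the digest can run this check.
    """
    for candidate in candidates:
        if md5_hex(candidate) == stored_digest:
            return candidate
    return None


# Hypothetical example: a weak value is recovered from a short guess list.
stored = md5_hex("hello")
recovered = guess_value(stored, ["password", "letmein", "hello"])
```

This is why keeping the "_MM" directory out of public view matters more than the hashing itself: once the files are readable, weak passwords offer little resistance.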

Permissions

Site permissions setup is going to be tedious.

• Only one admin per folder on the server.
• To establish new editing rules for the site you have to generate a new connection key. Once that is done the old one is overwritten. Supposedly there is a way to have site permissions updated automatically from the config file at the site root, but we have not tried this yet. See http://web/tgriffin/Public/contribute/howto/contrib_setup.pdf

Platform Differences


There are significant differences in features/functionality between the Mac and PC clients.

• Contribute for Macintosh has the Opera 6 browser built into the program. Unlike IE on the Windows client, Opera is embedded in the program and cannot be updated. The Contribute installer installs Opera automatically. It does not need to be installed separately.

• Windows users need the 2.01 updater and the FlashPaper updater.

• Microsoft Office cut and paste is only available on Windows.

FlashPaper

• FlashPaper development only available from Windows client.
• FlashPaper is a new way of making Flash documents. It installs as a printer driver, which means that you can create a .swf file of any document. This .swf can be put in a web page and viewed in a browser, using the Flash plugin. FlashPaper allows users to zoom in and out and page through the document. The document can be printed directly from the FlashPaper web page.
• Not 508 compliant.
• Needs Flash Player 5 or higher to work. 98% or higher of browsers [industry stat] have v5 or higher.
• Interesting alternative to PDF, but we don't know if it will be developed further.

CSS-P

• CSS-P rendering in Edit View does not conform to IS&T's guidelines for compliance with web standards. This is a major problem.
• The Contribute editor chokes when displaying negative margin widths, which are used a lot in creating column-based liquid layouts with CSS.
• Contribute uses the Dreamweaver MX 6 display engine, which is highly flaky with CSS-only designs. We've found no way of hiding styles from Contribute. One university has deployed Contribute and uses JavaScript to sniff out Contribute and bypass the CSS file. This means that the page looks fine when viewed, but when editing the user sees only structured HTML, with no styles.

Workflow

There needs to be a defined workflow/business process in place for contributors and administrators per DLC. This is needed if the DLC is going to manage their own site or if IS&T develops a service offering based on the Contribute-Dreamweaver web management model.

Other Issues

1. Sometimes when updating Mac OS the user is asked to input the CT serial number again. This is actually a minor security problem. See http://macromedia.com/devnet/security/security_zone/mpsb04-03.html for details and a patch.

2. What happens when images are converted from Word documents? Do they come in as JPG files only?

3. CT auto uploads files/images on the contributor's desktop to the same directory as the file being edited unless the user knows to use the choose button.

4. Cannot edit php include files and php includes do not render in the edit mode, only in view mode.

5. We may want to recommend the disabling of "rollback files" that CT creates. These are backups of previously published pages. CT by default creates 3 backups of every file and stores them all in a "_baks" directory. Having potentially 4 copies of whole sites on the server will eat up file space pretty quickly.

6. Is there such thing as a site that is too large for Contribute? Many current users say that it's meant for smaller web sites.

7. When logging in CT auto connects to all sites. Why?

8. Need training offerings from IS&T on using CT, managing CT users, and developing and managing sites for CT.
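The storage concern in item 5 can be made concrete with a small Python sketch. It assumes, for illustration only, that each rollback copy is about the same size as the live file and that a file published n times keeps at most three earlier versions; the function names and the sample sizes are hypothetical and are not taken from Contribute itself.

```python
def backup_copies(times_published, max_rollbacks=3):
    """Number of rollback copies kept for one file (assumed: one per earlier publish, capped)."""
    return max(0, min(times_published - 1, max_rollbacks))


def site_footprint(files, max_rollbacks=3):
    """Total bytes on the server for (size_bytes, times_published) pairs.

    Each file counts once for the live copy plus once per rollback copy.
    """
    total = 0
    for size_bytes, times_published in files:
        total += size_bytes * (1 + backup_copies(times_published, max_rollbacks))
    return total


# Hypothetical site: three files with different publishing histories.
example_files = [(1000, 1), (2000, 5), (500, 2)]
with_rollbacks = site_footprint(example_files)                      # default of 3 rollbacks
without_rollbacks = site_footprint(example_files, max_rollbacks=0)  # rollbacks disabled
```

In the worst case, where every file has been republished at least three times, the footprint approaches four times the size of the live site, which matches the "4 copies" estimate in item 5.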

Addendum: August 31, 2004

In August of 2004, Macromedia released Contribute 3 as part of the unveiling of its new Web Publishing System (WPS). The WPS is a content management suite comprising Studio MX (Dreamweaver, Fireworks, Flash, Freehand and Cold Fusion), FlashPaper 2, Contribute 3 and Contribute Publishing Services (CPS). These 8 applications constitute a content management system. While the WPS is designed to handle large site content and publishing workflow, IS&T does not have the needed infrastructure in place to support the system. However, Contribute 3 functions as a standalone content publishing application. Therefore, the CMS team's recommendation to release Contribute stands. Macromedia has fixed many of the problems outlined in the above report. See the provided vendor links and the Contribute 3 feature comparison at http://www.macromedia.com/software/contribute/productinfo/features/comparison/ct2_vs_ct3-wps.pdf for more information.


VI. The CMS Discovery Team Recommendation

1. Large site recommendation

Lenya – See product evaluation above

For large sites, particularly those that want a CMS that can repurpose content or communicate in some fashion with other systems, the team recommends the further development of the open source product, Lenya. Apache Lenya, based on the Apache Cocoon content management framework and several other Apache projects (such as Forrest, Lucene, Jakarta Tomcat), is an open source, full-featured content management system that is currently under active development. Apache Lenya is programmed in cross-platform Java and its content is stored in an XML repository. Apache Lenya also features customizable workflow, inline WYSIWYG content editing, and versioning of content.

Apache Lenya's main advantages are its open source code, strong user community, and standards-based authoring. As an XML product, Lenya meets the user requirements of standards compliance. XML also offers publishers the ability to structure data in such a way that it can be repurposed for multiple formats. Finally, XML data can be imported readily into databases, for a measure of integration with other MIT systems. The team also feels that, while Cocoon itself may be a new environment for IS&T, Apache and Java are solid technologies and within MIT’s existing expertise.

Adopting Lenya would require significant development and ongoing customer support on the part of IS&T. Nevertheless, in recognizing the scale and importance of CMS needs within the MIT community, it is appropriate for IS&T to develop expertise in CMS technology in-house so that the varied and special needs of the community may be better served. Priority tasks for development of Lenya include smoothing the user interface and attaching a database as a content repository for potential high-performance dynamic sites. The team believes that XML is a forward-looking technology, whose predicted long life and flexibility will justify the long-term commitment of resources to development and support.

Should IS&T desire the services of an outside vendor to collaborate in the development of Lenya, Quoin, located in Boston (http://www.quoininc.com), stands ready to work with MIT toward that goal.

2. For any size site -- An alternative business model to #1

Atomz Publish

For those sites who want the flexibility of XML and an easy interface, but choose not to wait for development of Lenya, the team recommends Atomz Publish from Atomz Corporation. Atomz Publish is a mature, commercial content management system based on a hosted Application Service Provider business model. It is a full-featured system with customizable workflow, "browse to edit" and inline WYSIWYG page editing, unlimited versioning of content and cross-platform compatibility. Development and asset content is stored in an XML repository on Atomz servers, while published content resides on the site owner’s web server or Athena locker. Atomz Publish's main advantages are its ease of use and low learning curve, excellent documentation, and superior customer support.

This model is appropriate for those community sites able to support the ongoing subscription costs of the ASP business model. Pricing is based on a user account and page count formula. Contracts with Atomz should be negotiated and managed through a centralized IS&T service, so that the MIT community as a whole spends its Atomz dollars efficiently, getting more service through coordinated volume.

An area requiring further investigation is whether Atomz can scale to the volume that the MIT community could potentially bring to it. There are performance issues with the instance currently in production for the Technology Review, though it is unknown whether the performance slowdown is due to TR’s network infrastructure or to some aspect of Atomz’ service delivery. This must be determined, and if indicated and feasible, the team recommends that MIT explore, as part of a contract with Atomz, the option of hosting Atomz web servers on the MIT network, with Atomz retaining all ownership and management of the Publish software itself. Securing the rights to the source code should Atomz not survive would be a wise precaution as well.

Because there are contingencies and unknowns around the viability of Atomz as an enterprise-wide CMS solution for the MIT environment, we make our second recommendation conditional on further research by IS&T. Regardless of the outcome, we urge IS&T to recommend Atomz as an option to individual community customers as indicated in initial consultation of customers’ business needs.

3. Small site recommendation

A. Tools for sites that use Athena lockers

Five issues stand out for those DLC’s currently not using a CMS:

• Absence of site administrator and/or content author skill sets
• Ability to support site hosting equipment and software
• Flexibility in site templates
• Flexible workflow
• Dynamic site hosting

The first and foremost burden weighing down the administrators of DLC web sites is the difficulty of getting content from their authors in any kind of web-ready format. Frequently, this is a consequence of authors who are inexperienced in web publishing, both in terms of writing style as well as appropriate file format. Lack of web expertise among the authors and business process decision-makers often leads to the unfortunate combination of undercooked content being handed off to overextended administrators who have to learn on the fly how to be webmasters in an unstructured environment. Both the web site administrator and content authoring skill sets have been undervalued in the MIT community, leading to web sites that are managed by insufficiently supported, often frustrated staff.

Many site administrators in the MIT community are constrained to host their web sites in an Athena locker. They have neither the human nor financial resources to maintain their own web server equipment and software. Therefore, they are limited to whatever services are offered by IS&T. However, IS&T has a lot to offer, and many of its tools go underutilized. Additionally, IS&T can introduce more features, making Athena/AFS an even more viable service, filling some of the community’s CMS needs which are currently going unmet.

Based on feedback from the various CMS focus groups, potential IS&T customers are seeking some CMS features more than others, and in fact, would reject some of the constraints that would be imposed by a CMS. This is an opportunity to make Athena/AFS more attractive as a hosting service.

Many DLC’s who can’t afford to purchase, implement and support a CMS have chosen instead to sink the resources they do have into attractive site templates designed by external vendors, often containing scripting that would be incompatible with many CMS applications. These template designs offer great visual value for the expense and satisfy the business owners, and in some cases meet the business requirement for web publications to match the look and feel of print publications. Giving up the templates in which DLC’s have already invested significant resources is a firm no-go for many site owners.

Another common CMS feature leaving much to be desired by many site owners is rigid workflow functionality. Since many small sites have only one or a few site administrators, a complex approval system is counter-productive.
Even those sites with a larger team of editors, managers, authors, and approvers would require a workflow that allows flexibility and end-runs around the system. Rather than a complex workflow, these sites’ business processes seek either a simple supporting technical process, or one that is infinitely adaptable to their own site-specific processes.

While many of the major issues for this particular customer base can be resolved by enhancing existing Athena services, hosting a dynamic site on Athena is still not possible. Those customers requiring a dynamic site must commit resources to supporting their own network or cost-sharing a CMS service offered by IS&T, upon IS&T CMS consultation.

Hosting a site on Athena can be made more attractive to this customer base by compiling a toolbox of functionalities, and training site administrators and authors on their effective use. A small amount of development would go a long way in streamlining site maintenance, particularly in offering web interfaces to Athena tools currently only available through command line interfaces. Additionally, expanded training and support on selected topics would reduce the learning curve for both content authors and site administrators. With an effective, intuitive authoring tool, authors can be less dependent on administrators to clean up their content, and administrators can be freed up to perform more appropriate site management tasks.

Recommendations for serving small budget or static sites:

Based on our findings, the CMS Discovery Project team recommends the following:

Volume License:

• Contribute 3 (see above for product evaluation)

Develop or enable, and promote:

• AFS web GUI for managing ACLs, files, and some versioning -- explore a suitable WebDAV product, such as Xythos, or develop a web interface for CVS or RCS
• Test and offer newer cgiemail version and other new approved CGI scripts
• Custom error pages to reduce redirect proliferation
• IS&T server with blogging tool
• An effective link checker, such as Xenu or Dreamweaver's built-in site link checker utility
• Web developer's utilities, such as the AIS IE toolbar, Checky for Mozilla, and the Firefox web developers toolbar
• RSS

Training:

Athena Web-Site Hosting Information Technology certificate, encompassing sessions on the following topics:

1. Assessing and Improving Your Web Publishing Business Process
2. Authoring with Contribute 3
3. Managing Contribute 3 Users and Web Sites
4. Implementing a Custom Events Calendar
5. CSS: proper table-free page layout, efficient formatting, printer-friendly versioning
6. Accessibility and X/HTML code standards
7. Web writing style: bullets vs. narrative, inverted pyramid style, succinct language
8. Using DW MX 7 templates and Contribute 3 effectively for site maintenance
9. Server Side Includes for easy maintenance of footers, navigation, and content repurposing.
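To show why Server Side Includes ease the maintenance of shared footers and navigation, the following Python sketch imitates what a web server does when it expands an `<!--#include virtual="..." -->` directive. It is a simplified illustration, not the server's implementation: the `expand_includes` helper, the in-memory `fragments` dictionary, and the sample paths are hypothetical.

```python
import re

# Matches the common SSI form: <!--#include virtual="/path/to/file" -->
INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page, fragments):
    """Replace each include directive in page with the fragment it names.

    fragments maps a virtual path to its content; a missing path raises KeyError.
    """
    def replace(match):
        return fragments[match.group(1)]

    return INCLUDE_RE.sub(replace, page)


# Hypothetical shared footer, maintained once and reused by every page.
fragments = {"/includes/footer.html": "<p>Contact: webmaster</p>"}
page = '<body><p>Hi</p><!--#include virtual="/includes/footer.html" --></body>'
rendered = expand_includes(page, fragments)
```

Because every page refers to the same fragment, a change to the footer file appears on all pages without editing any of them individually.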


VII. The Task to IS&T

This is an opportunity for IS&T to improve and expand its current web-related services. Some of these services are a fix to what's already in place as free offerings to the community; others can become part of a tiered system of fee-paid services.

1. Create IS&T CMS Services team that includes web development and business process consultation, Lenya product development, template design, training, product support, and hosting/network management. Because there is a great deal of overlap (and significant distinctions) between content management and knowledgebase technology, this team could productively share resources and collaborative services with a Knowledgebase team.

2. Consultation services. For all size sites and DLC's, IS&T should provide consultation and recommendations on the right product or service for the customers. A set of guidelines (See Appendix H) for discussion and decision-making should be provided to facilitate this crucial first step.

3. Put Contribute 3 through the SWRT and volume licensing processes.

4. Develop Lenya functionalities. The following development work will be required:

• Editing can be erratic and validation unpredictable; needs to be more consistent; some of this is training-related.
• Add web-editable CSS to file menu, instead of editing files on server.
• Documentation for end-user, maintainer/admin
• Better image/asset selection in authoring mode.
• Improve browser-based HTML editors
• Import pre-loaded MIT users and groups.
• Finish MIT cert integration, automatic login. Allow passwords, too.

Desirable additional development for Lenya:

• Lucene search engine integrated into Lenya
• Workflow management
• Cocoon framework allows development of dynamic components incorporating JSPs, database connectivity, Velocity templates, portal functionality, etc.
• WebDAV fosters offline editing with Dreamweaver or other desktop editing tools, including MSWord, OpenOffice, etc.


5. For both Lenya and Contribute, training must be part of the services offered by IS&T, running the full spectrum of step-by-step how-to’s with screen shots, quickstarts, hands-on training, and custom consulting. Creating documentation is an absolute must for both Lenya and Contribute. Quickstarts can be offered for authoring with both products. Fee-paid services and training can include proper design and implementation of templates. Additional revenue can be garnered through hosting services, data migration assistance, and consultation for DLC's who choose to do their own hosting.

6. Investigate the feasibility of Atomz as an enterprise offering, while maintaining its ASP model and excellent customer service. IS&T should explore Atomz performance capabilities, considering the possibility of supporting an Atomz implementation on the MIT network while preserving the application ownership by Atomz. Should Atomz prove a viable service, IS&T should cost-effectively manage the community contracts with Atomz, allowing the greatest value for the cumulative MIT dollars. IS&T should also work out a contractual arrangement that grants rights to the source code in case of Atomz' failure to thrive.

7. Develop new tools for Athena locker owners and promote them. Making Athena easier to use for site administrators who are unfamiliar with Linux will allow users to take fuller advantage of the great service that Athena is and will give the small site owners with small budgets CMS-like functionalities without the CMS investment. A few examples:

• WebDAV offers the prospect of more intuitive file management and could provide some measure of versioning. If WebDAV is not the answer, then developing a web interface for RCS or CVS may be the right path.

• Review and release a more recent version of cgiemail. There are added functionalities and improved security in the later versions.

For more detail, see Small site recommendation.

8. The WebPub user group and listserv must become a much more proactive service to the MIT community. Many of the small site recommendations would make excellent topics for monthly meetings and would promote existing services and features that many site owners do not know are already available, such as SSI or integrating the event calendar. Additionally, it would provide a venue for promoting newly developed tools, like newly approved perl scripts. Focus group participants stated that, had they known that applications like Tech Time and the Event Calendar were in development, they would not have tried to build their own comparable applications. Keeping the community informed of development efforts can save DLC labor and would gain much goodwill in the community. The demand for improved service and communication is out there. The webpub list is large, and is just the right vehicle to meet that demand.


Appendices


Appendix A. Project Charter

a. Project Justification

A content management system (CMS) is a set of tools that integrate and automate the various phases of a publishing enterprise. It allows content authors to go online to create and update their own sections of a collaborative publication, accommodates quality control by editorial staff, and provides tools for workflow management, including version tracking, messaging, and varying levels of content approval. Once entered into the system and validated, new and updated content can be edited and prepared for publication in appropriate media, including print, Web, and CD-ROM. A CMS might also facilitate layout and production by automatically formatting content for the target medium or by preparing content for autoformatting (to the fullest possible extent) in allied applications. A number of administrative offices and academic departments across the MIT campus have a need for such a system. OpenCourseWare has adopted one, at least for the near term; the Reference Publications Office has another; and various Web sites (e.g., MIT World) are also using content management systems. The purpose of this project is to determine whether there is value in deploying such a system or systems at an enterprise level.

b. Discovery Questions

• How widespread is the need for or interest in a CMS?
• What would a CMS need to do in order to accommodate the workflow of the various administrative and academic publishers?
• Are there sufficient common needs that one or more systems deployed at an enterprise level would be more efficient than present arrangements?
• Can currently deployed systems be expanded or adapted to accommodate other users?
• How would an enterprise-level CMS be financed, if recommended?

c. Approach to the Work

It is unlikely that a single product can meet the workflow and publishing requirements of all potential users. The team should therefore consider a range of products that address a range of needs. This can be done by surveying the needs of various departments and offices within the MIT community through interviews and focus groups; by consulting with those offices that have already adopted a CMS, to build on what they have learned; and by evaluating commercial and open-source CMS products to identify the solutions that most closely meet, or can most readily be adapted to meet, Institute requirements.

d. Expected outcomes

• A survey of current CMSs on campus and their types and use.
• One or more lists of functional and technical requirements for a CMS.
• An evaluation of costs and benefits, in terms of potential time and workload savings for the various offices that would use a CMS.
• A recommendation, with cost estimate, for purchasing or building products that meet these requirements, or can be modified to meet them.

If an enterprise CMS or CMSs would add value, then:

• A plan for funding the purchase of an appropriate CMS.
• A plan for documentation and training.
• A plan for maintenance and support.


Appendix B. The Survey

About your existing content:

1. What kind of data do you need to publish? Check all that apply:

Narrative copy

Table-style data (directory listings, etc.)

Forms

Brochures

Other (specify):

2. In what file formats does your content typically originate?

Paper documents

MS Word docs

Quark, Pagemaker, or other Desktop Publishing format

Image files

Existing web pages

Database (dynamic or export files)

Portable Document Format (pdf)

Other (specify):

3a. Do you publish some content in both web and print versions?

Yes No

3b. If "Yes," which ones?

MS Word docs

Quark, Pagemaker, or other Desktop Publishing format

Portable Document Format (pdf)

Other (specify):

4. What is your web site publication cycle? (How often might a document be updated?)

Daily/Constantly

Weekly

Monthly

Academic Term or Semi-Annually

Annually

Tell us about your primary web site structure and support:

5a. Approx. how many pages: 5b. URL:

6. How much staffing to maintain? Please enter number of individuals; does not have to be full-time.

Web site administrator System administrator Content authors

7. Approximate person-hours per week to maintain:

8. Is your site mostly static or dynamic or both?

Dynamic (database driven) Static Both

9. How is your web site hosted?


Maintain own web server hardware and software

Athena locker

Hosted by I/S services

Other:

10. Would you be willing to invest resources in web site management or hosting services?

Yes No

About Content Management (definition of CMS):

11. Do you currently use a Content Management System product?

Yes No

12. If yes, rate how much you like it on a scale of 1-5, where 1 is yucky and 5 is awesome:

1 2 3 4 5 Product Name:

13. Do you have someone who centrally approves content before it's published or do your individual authors write and publish on their own authority? Describe your content approval process:

14. What is currently the most difficult aspect of managing your web site?

15. What benefit do you most hope a content management system could provide for you?

16. Many CMS products require the use of templates. You may be able to customize the overall look and feel of the site, but the flexibility an author has in the layout of individual pages may be limited to varying degrees. How important to you is the ability to control content layout at the page level?

1 Not very important 2 3 4 5 Extremely important

17. What else would you like to tell us about your web publishing needs?


Whether or not you currently use a CMS, please tell us a little about yourself:

First Name:

Last Name:

Email:

Title:

Department:

May we contact you for further discussion?

Please check for Yes:

Your web site role/responsibilities:

System Administrator

Programmer/Coder

Content Author/Writer

Graphic Designer

Content Approver

Other role(s):

Submit


Appendix C. Product Matrix

Open Source (more or less)

zope/plone lenya OpenACS SiteMaker RTFM

Platform

python; apache; Zope's object database (ZODB), but can connect to other databases through ODBC.

Version 1.0 RC1, java-based; apache cocoon; uses xml and xslt extensively; formerly known as "Wyona CMS"; Should work well on any platform that can support apache, tomcat, cocoon, etc. Can be deployed on Unix or Windows.

Requires AOLserver (open source webserver), TCL (open source) and either PostgreSQL (OSS) or Oracle; runs on most any Unix platform including Solaris and Linux.

java, apache, WebObjects, any db

Almost any Perl platform: Unix, windoze, MacOS X. Solaris works well.

Licensing/ Business Model

GPL. Zope.com can be consulted for a charge. Plone is also under the GPL, but can be licensed if desired. Zope4edu, a zope.com/Duke project, does not seem to have a public license, but I am not sure. It may be worth inquiring about its licensing and progress.

Apache License v.1.1/Main office is located in Switzerland; Support available through wyona.com

OpenACS source is under GPL. AOLserver is under Mozilla license; and PostgreSQL is under BSD license.

Developed by UMich, but licensing sold to GVC who does all development and resells the product. Development here would have strings, though could be done if only for our own use. Also offers ASP.

GNU General Public License, version 2. Consulting/Custom development available.

Single Site vs. Enterprise

Zope can do this without a problem. It is easy enough to assign groups to different sites. Most of this is done through a GUI. If you want to use vhosts, this is best done through Apache, in my experience.

Enterprise. The software components can handle high loads, and OpenACS easily supports multiple independent "subsites" and hostname-based "virtual servers" (all administered through the GUI).

Either. Enterprise. Easy to isolate "classes" and even multiple instances running different versions, on one server host.

Integration with MS Office

There are a few modules out there for this. The two that seem to be the best and leading are MSWord Document and wvWare. Neither has been updated in some time. I have never used them, but there seem to be a lot of people who have. A person by the name of Ross Lazerus over at Harvard worked with it a while back; he may be a good contact.

None. None, except documents can be stored for downloading as opaque format.

None. None. (this is a good thing.)

Authoring browser platforms

Uses DHTML and Python. There are a few WYSIWYG editors available, and you can configure external editors to interface with Plone/Zope. The one with Plone is rumored to be decent, but again, I have never tried to install it or use it. Java authoring add-on allows cross-platform.

Works with any modern browser; supports inline WYSIWYG editors. Site editors do not have to learn XML. For developers, a plugin is available for the Eclipse Java IDE.

Any modern browser. Enter HTML or plain text in TEXTAREA boxes, or upload local files. EditThisPage module allows previewing.

Works with any modern browser (requires CSS).

Ease of templating

The beauty of Zope. Changing, updating, and building templates is really easy. You do need to get used to the structure of Zope to find it easy. Python is occasionally necessary if you want to get detailed, but you can get by with DHTML, which should not be difficult for someone who is accustomed to maintaining a site. Templates can use JavaScript, if desired.

Unknown

Extensive template language similar to Server-Side Includes (SSI); supports variable substitution, conditionals, iteration, etc. Templates are hierarchical, so look & feel can be dictated by one top-level template. The only downside is that most modules require editing templates on the server.

Can use templates, including those with JavaScript. Occasional workarounds are required, particularly with navigational elements, as some aspects of navigation are "dynamically" controlled through the authoring interface.

Hard right now; requires authoring Mason pages. (Mason is like JSP for Perl.) Could be as simple as we want to make it.

Code base consistent with MIT community skill sets

I would imagine that this would not be a problem. The learning curve with Zope and Plone is pretty intimidating at first, but after spending a few hours with it, it becomes pretty straightforward. I do not know if there are any experts in Python here at MIT, but I truly believe you can get by just on DHTML.

Makes heavy use of cocoon xml publishing framework;built around off-the-shelf components from the apache software stack (e.g. cocoon, tomcat 1.4, java, Xalan, Xerces, etc.), so should be within MIT's reach; curious to know if anyone is already using Cocoon on campus?

OpenACS evolved from ArsDigita Community System (ACS) created by MIT community member (Greenspun). Although some components are a bit exotic (AOLserver, TCL) there are groups using it now on campus. SloanSpace is major OpenACS developer.

Can sit on almost any database, and the rendering is through java, but the framework is proprietary WebObjects.

There is considerable local expertise in RT (the companion tracking system to RTFM). Large and active worldwide user community is also helpful.

How much code mucking / maintenance needed

I believe quite a bit at first. It is pretty labor intensive to get a production product. By no means a ready out-of-the-box product. But with templates provided to the customer from a service provider, I believe this is an easily maintained product.

XML-centric architecture; startup may be complicated. What's the pace of development (bug-fixes, adding new features) in the Lenya open source community?

Probably half-FTE devoted to maintenance, but it would be fun :-). Some initial development and documentation needed.

Lots of development and maintenance needed in the near future since RTFM is "young". Should be more stable in a year or so. It is used in production now in places.

Access control

If we can tap it into Moira, then simple. We authenticate it against LDAP/Active Directory in Econ, so I do not think this would be a problem or very difficult.

Administrative interface allows advanced users to monitor the CMS and perform configuration tasks

Sophisticated fine-grained hierarchical permission system designed to make it easy to build communities.

Can create user groups, but not fine-grained control. Permissions are site wide, with the exception of data tables.

Very good; Fine-grained permissions and ability to configure groups as roles.

Security reputation

Works through Apache...

Built on Apache software stack

Excellent. In production use in many academic environments. (see DotLRN.org)

Quite good; also it is built on a quality foundation (Apache, Perl 5.8, rdbms).

Output to non-html formats

PDF None in general; some in certain modules.

Whatever you want to code in Perl and Mason.

Performance

On my small site we have had great performance. I have read about no performance-related issues.

Demo sites are fast; my kludged untuned demo is very quick.

Reasonably fast at serving pages. Its foundation, RT, has proven robust in very large installations.

Integration with kerb/certs (single sign on)

can do Other than ssl support through apache, unknown; will probably require development by MIT

AOLserver has module to integrate OpenSSL (like MIT Apache) but a little work is needed. Looks quite possible (and we can get help from Sloan if they haven't done it already).

Already developed to use kerberos authentication.

Good; integrates cleanly with MIT Personal Web Certificates.

Cost (ballpark)

free unless Zope.com is consulted.

Free Free. Nothing but our time. Some cost for MySQL accessories (hot backup tool) if we choose to use them.

Comments

No sftp; would have to build interface using SSL.

Java wysiwyg add-on meets platform needs

Can work with any programming/scripting language plug-in, so not locked into python.

Templates not constrained to vanilla look and feel.

special feature: output to PDF (benefit of extensive xml support)

Features: Revision Control, Scheduling, a built-in Search Engine, separate Staging Areas, and Workflow; uses XML/XSLT throughout

Concerns: barriers to entry; may be initially complex to support. How much of Cocoon do we need to know to support/extend Lenya?

How big is the Lenya community? Does it have momentum, or is it relatively small-scale?

In use at the University of Switzerland

Much more than a CMS. OpenACS has a large number of optional modules including several different kinds of CMS, workflow, surveys, ecommerce, etc. It would be a valuable resource to have on campus.

OpenACS has a sizable and active user/developer community.

Biggest weakness is the terrible documentation. We would need to provide better end-user documentation to deploy it here.

Seems to work best as an umbrella for small sites, e.g., faculty sites within a department that could inherit the template.

See BP's demo of a simple presentation front-end. Currently RTFM is mostly a powerful framework and toolkit. Has a pretty good administrative UI for forms-based authoring and editing of "documents". The "customer" UI must be developed (custom) for each site but a generalized solution is possible if we develop it.

Small Site Tools

Contribute Manila (Frontier?)

Platform Win 98, SE, 2000, XP; Mac OS X 10.1.5 and later

Frontier is built around an integrated object database. Frontier has a built-in Web server, but can also work with IIS, Apache and other Web servers using static rendering in Manila.

Licensing/ Business Model The licensing includes an auto-updater for product updates.

Single Site vs. Enterprise single or site license license is per URL


Integration with MS Office yes, but different for Mac and PC. Mac may only attach docs as files or links

None. But you can upload Office (or any other) files.

Authoring browser platforms

Mac functions limited WYSIWYG authoring, except for Netscape on PC, and Safari and IE on Mac.

Ease of templating supposedly Select from "themes" or paste in your own html/javascript/etc.

Code base consistent with MIT community skill sets

yes

How much code mucking / maintenance needed

remains to be seen

Access control yes

Security reputation fair: uses SFTP; password protection now in place on app start up because Contribute auto connects to all sites at start up

Output to non-html formats no; Contribute will open a doc in another app which can then be uploaded

None, except a machine-readable export file.

Performance Offers full text search, but it degrades performance. Not used at HLS. Searching is done by date instead. Slashdot can crash it. To check performance you can check the daily hit rankings to get a feel for hit rate it can sustain (check blogs.salon.com).

Integration with kerb/certs (single sign on)

yes; exception is no cert support in Mac client

Cost (ballpark) $79 per single academic license The price is $899 per year for each subscription; academic pricing for qualified academic users. According to HLS, cost is $300 academic pricing per site/URL.

Comments May be good for interim use till fuller product is purchased or developed.

blogging tool grown into cms; can't be used with Athena unless you enable static rendering (sftp?).

Offers RSS syndication.


Appendix D. SiteMaker Evaluation

SiteMaker is a Java/WebObjects/SQL database product that is not quite open-source, not quite proprietary. It was developed at the University of Michigan, and licensing rights were sold to GVC. Purchase of a license runs in the $15,000 neighborhood. There is a provision that allows educational institutions to buy a license that includes source code access (in the $50K ballpark). This would essentially grant all rights to the source code, although no royalty rights.

Site Management: This product lends itself to the management of small sites. The site author/administrator can add new “Sections”, which are the equivalents of site pages. The flat directory structure, however, quickly makes managing anything but the smallest site a cumbersome process. Uploaded files, including site images, are displayed laundry-list style, forcing the user to scroll through all files associated with the site whenever editing or managing the site. The site architecture has no conceptual or actual hierarchy, so the product is suited only to very small sites. The only large site configuration for which this tool might be appropriate would be a collection of small subsites that can inherit a style from a parent umbrella site.

The developer has indicated that a virtual directory structure will be part of the next product upgrade, and that file management capabilities will be offered through the incorporation of WebDAV. This might enable the product to handle sites with more than a handful of pages. Incorporating a template and hosting the sites might fall under fee-paid services offered by IS&T. For those small site owners or departments without technical human resources, such a service at a reasonable cost may be desired. SiteMaker offers virtual hosting and could be configured for static rendering to Athena lockers, so that URLs for existing sites may be preserved.
Content Authoring: SiteMaker has an authoring limitation that is shared to some degree by all products we reviewed, with the exception of Contribute: the WYSIWYG vs. cut-and-paste approach for marking up content. In SiteMaker, there is a WYSIWYG authoring interface that does not require the user to understand HTML. However, the user who is not HTML-savvy is limited to simple content: straight narrative only. SiteMaker will insert <p> tags and will allow the user to select the font style and color for highlighted text. In fact, changing the font attributes results in non-compliant HTML, inserting deprecated font tags which will override any associated style sheet. Therefore, SiteMaker’s WYSIWYG editor is actually counterproductive to good HTML authoring. It should be noted that the developer himself does not recommend the built-in editor, and recommends cutting and pasting tagged content from another editor. Authoring with SiteMaker should be done in conjunction with Dreamweaver, or with the incorporation of an in-line third-party editing tool, such as EditLive.

Another difficulty for the content author is the lack of an intuitive interface for referencing site files within anchored hyperlinks. Determining the path to include in the link’s URL is at best a workaround that requires right-clicking, copying, toggling, and pasting between windows. Clumsy methods of capturing URLs for links are not exclusive to SiteMaker, but this is an issue that must be addressed for any CMS product to consider itself user-friendly.

With all these limitations, why would a site owner in the MIT community choose this product over authoring a site with Dreamweaver and hosting it in an Athena locker? SiteMaker does offer one unique feature that might compensate for other lacks, especially if those lacks will be adequately addressed in future product releases. The unique feature is the incorporation of data tables that can be served to site visitors. This allows for interactivity and can be used to great effect, e.g., students viewing a faculty member’s real-time office hours schedule, and signing up for open slots on the spot. Users seeking to reserve meeting rooms could do so from any location institute-wide. Please refer to the Issues document (Appendix E) for further details on requested features for future releases. Many of these suggested features have already been incorporated into the development plans for version 3.5 of SiteMaker.

Conclusion: SiteMaker has much to offer as a tool, but like all CMS-type products, it has its limitations as well. The authoring interface needs to be enhanced with either more functionality or access to a third-party HTML editor. And until the developers can offer a hierarchical site structure, or at least the semblance of one, it is not practical for any but the smallest sites. SiteMaker is probably best suited for those sites that wish to umbrella many small sites that share a look and feel, such as a department web site that houses individual faculty sites. On the plus side, it offers significant value to those small sites that might wish to serve structured content in an interactive manner, through use of its data table feature.
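The font-tag problem described under Content Authoring is easy to check for mechanically. The following is a minimal Python sketch, using only the standard library's `html.parser`, that reports deprecated presentational tags in a fragment of authored content. It is not part of SiteMaker; the tag list and the function name `find_deprecated_tags` are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Illustrative subset of presentational tags deprecated in HTML 4.01.
DEPRECATED_TAGS = {"font", "center", "basefont"}


class DeprecatedTagFinder(HTMLParser):
    """Collect (tag, line_number) pairs for deprecated start tags."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in DEPRECATED_TAGS:
            line, _column = self.getpos()
            self.found.append((tag, line))


def find_deprecated_tags(html):
    """Return a list of (tag, line_number) for each deprecated tag found."""
    finder = DeprecatedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    sample = '<p>Office hours</p>\n<p><FONT color="red">Moved to 3pm</FONT></p>'
    for tag, line in find_deprecated_tags(sample):
        print(f"line {line}: <{tag}> will override the site style sheet")
```

A check of this kind could be run on content before it is pasted into any CMS, regardless of which editor produced it.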

Page 43: Content Management System Discovery Project Report

- 43 -

Appendix E. SiteMaker Issues List

Priority Issue Response

High File Enhancement Package (FEP) in next release will include:

• HTTP file access using WebDAV will allow authors to mount the site volume.
• Directory structure for improved file management.
• A new sub-type for the Links section type will display files through hyperlinks.
• New section type which allows the author to specify an html file that holds tagged content (allowing authoring in DW without needing to copy and paste back into SM).

Can this be authenticated through Kerberos?

Authoring: High – Accessibility issue.

On the upload image page, could “File Description” be used for alt text?

This can be added.

High On the upload file pages, it would be nice to designate the applicable section.

Accomplished in FEP.

Display - Medium; Edit - Low

On the revise file screen, it would be nice if the filename of the file you are editing were displayed, and could be edited.

Could display, but editing the name could create problems.

High How can a team preview a site? This can be added by giving the author the ability to enter a list of email addresses. Recipients will receive a link with a token that will allow them to view the page for an author-specified length of time.

? Given that content may end up being tagged and pasted from another application, can an html/javascript/? debugger utility be added to detect product-specific (e.g., MM) tags and paths?

No such utility exists in any CMS.

Medium to Low How to meta-tag? Can we incorporate a search engine beyond what’s built in for data tables?

Meta-tagging can be added, which will make MIT’s googling effective.

High – user-friendly authoring in addition to DW is a necessity.

How to easily attach an html editor? Could the WYSIWYG be expanded to include un/ordered lists and bolding? And can the WYSIWYG elements be turned off if we don’t want authors overriding style rules?

The existing WYSIWYG does not work very well and it does use font tags. A more effective approach might be to incorporate a third-party in-place editor (like Lenya does). All browser-based editors will introduce browser/platform issues.

High The authoring textarea box must be much larger, given that there’s only one place to enter content.

Site Organization: High – user-friendly authoring and file management is a necessity.

The path is not obvious for putting links to site files into content. Maybe have the “File Description” field specify hyperlink text that you can later reference in content, and not have to worry about the path (a la Manila)?

Currently, the workaround is to right click the file from the main editing page, and then paste the path into the section content in question. One possibility worth pursuing is adding the ability to select from a pull-down list of site files to the WYSIWYG which will then insert the anchor tag wherever the cursor is positioned in the content. Another option is exploring the Manila model.

High Organize uploaded files into their appropriate sections, with subdirs for images. The flat laundry list is very difficult to work with.

FEP

High How to make a new page (not tied to navigation) within a section is not intuitive. This is really about having subordinate pages within a section.

Going to stew on this one.


High How to use more than one template per site (not counting data table templates)?

This can be added by providing a pop-up list of templates on the section editing form.

High Can you use CSS? Currently you can incorporate in-line style tags, but you could conceivably have an external style file that is referenced in the template.

It would be nice to be able to change the section type after it’s been created because you often have to set the template/navigation before or in parallel with developing content.

No way (understandably). This becomes a training issue.

How large a site can SiteMaker handle? What about performance?

This is a hardware issue, not a limitation of SM. The only hit to performance would be serving the data tables.

High Can the pages be rendered statically to a locker or other server off the SiteMaker web app server? If not, is there another way to simulate/preserve existing sites’ URL’s?

Yes, for all site pages except data table sections. Can use virtual hosting to preserve URLs.

Medium-High Horizontal navigation.

Already proposed.

Tables:

It’s hard to map fields for importing if you can’t see the order of fields coming in.

Date-time input mask hint would be nice. In fact, importing dates is a misery. Can there be a simple date type?

Having the Import button at the top doesn’t send the message that you must click at the end. In fact the order of tasks is exactly opposite to the display order, as the file path disappears if you enter it before the other things.

How do I display table records within a published section page after the section has already been created? Do I have to delete and recreate the section to preserve the nav?

Editing table rows with restricted access can only be accomplished through “submit and view” from the appropriate section. Is that correct? Can there be a “hidden” navigation displayed for groups?

Thoughts: “Send message” features can be used as phony workflow for approval/lifecycling.

Table data types include “file” which is useful for managing news releases, etc., or is that better served by the links section type?

I like the appointments example of the data table. So, would this be suitable for academic departments with lists of events, publications, and faculty? Not for the unsavvy administrator. Would the FL model work for IS service?
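The response to the team preview issue above proposes emailing reviewers a link with a token that is valid for an author-specified length of time. The following Python sketch shows one common way such a token could work, using an HMAC signature from the standard library. It is an assumption about a possible design, not a description of how SiteMaker implements or would implement the feature; `SECRET_KEY`, `make_preview_token`, and `check_preview_token` are hypothetical names.

```python
import hashlib
import hmac
import time

# Hypothetical server-side secret; a real deployment would load this securely.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(payload):
    """Return a hex HMAC-SHA256 signature for the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_preview_token(page_id, lifetime_seconds, now=None):
    """Build a token that grants preview access to one page until it expires."""
    issued = time.time() if now is None else now
    expires = int(issued + lifetime_seconds)
    payload = f"{page_id}:{expires}"
    return f"{payload}:{_sign(payload)}"


def check_preview_token(token, page_id, now=None):
    """Return True only if the token is untampered, unexpired, and for this page."""
    try:
        token_page, expires_text, signature = token.rsplit(":", 2)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{token_page}:{expires}")
    current = time.time() if now is None else now
    return (
        hmac.compare_digest(signature.encode(), expected.encode())
        and token_page == page_id
        and current < expires
    )


if __name__ == "__main__":
    token = make_preview_token("section-12", lifetime_seconds=3600)
    print("valid now:", check_preview_token(token, "section-12"))
```

Because the expiry time is part of the signed payload, a reviewer cannot extend the preview period by editing the link.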

Page 45: Content Management System Discovery Project Report

- 45 -

Appendix F. Plone Evaluation

Overview

Plone is built atop the Zope web application development framework. Plone leverages the existing content management framework (CMF) within Zope and adds numerous features and a user-friendly interface to make it a very capable out-of-the-box CMS solution. By default, Plone comes ready to run as a web portal (like Yahoo) with built-in calendars, feedback forms, public logins, etc. Plone outputs valid XHTML and tries to adhere to the latest web usability standards as much as possible. Beyond this, one may customize (or "skin") the Plone UI to better suit any kind of content environment (portal or non-portal). Functionality that is not needed for a more static site may be turned on or off. Many of the appearance configuration options are XML-driven. In addition, all of the normal web application development tools available through Zope may be used to add or extend functionality in Plone.

Zope includes its own application server, so there is no need to use the Apache web server, although Zope/Plone is fully compatible with Apache. Zope also includes an integrated database, ZODB, which is used for revision control by the content management framework.

For administrative tasks Plone makes use of the underlying Zope Management Interface (ZMI) to offer a suite of advanced tools for site, user account, and style management. There are many options for customizing the look and feel of Plone sites. As open-source projects go, the overall Plone administrative user interface seems more complete than most. However, the number of administrative and configuration options can be difficult to navigate. It leaves one with the impression of having too many features grafted on top of each other in a somewhat uncontrolled fashion. In other words, the learning curve for the administrative and configuration options seems fairly steep.

As Zope is a relatively mature open source project, the sheer number of administrative and configuration options available through the ZMI looks impressive, but using them may be another story. The administrative interface could be more consistent and easier to use. Zope is no doubt a powerful framework that has matured over time with a large number of contributed modules and new technologies, but the ZMI has a somewhat bloated, legacy feel. Since the ZMI is the primary management interface, this leaves us somewhat less enthused about our ability to effectively manage Zope in the MIT environment.

On the more positive side, there appears to be a very good in-browser editor called Epoz, which works surprisingly well. The validation is set up more loosely than with Lenya, and performance was more than satisfactory. Zope, as is well known, is based on Python, which may be used to extend Plone as well as to write general-purpose web applications. Finally, Plone benefits from a large and active user community that has led to a feature-rich and rapidly evolving environment.


Features

• Good support for large or small sites
• Personalization and community features are there if needed, but may easily be disabled if not
• WYSIWYG (e.g. Epoz) or simple form-based content editing
• XML-based UI tools (controls view, forms, portlets, etc.)
• Version control through integrated ZODB database; can also use Oracle or another database if desired; XML and compressed data format import/export available
• Compatible with Apache web server
• Python programming environment for development; other languages may also be used (e.g. Perl)
• Supports wide range of authentication and security options (user ACLs, LDAP, other)
• Ability to create own directory structure through menu options, e.g. CSS, JS, etc. files of the content author's choosing
• WebDAV support
• Static and dynamic content publishing
• Supports Zope page templating language
• Workflow
• Scalability: load balancing, caching, ZEO (Zope Enterprise Objects) and other techniques to improve performance

Presentation/Templating

From the markup contract (http://plone.org/development/teams/ui/p2uicookbook/TheMarkupContract):

Component developers can create all manner of views, forms, and portlets for their content-types and tools that can be "slotted" into any Plone 2 site. At the same time, designers can take a design and produce a skin using just CSS which will also work on all Plone 2 sites. The aspect of the Plone 2 UI that makes this possible is the Markup Contract. The Markup Contract is a standardized structure for the XML that makes up the Plone UI. This standard consists of not just tag nesting but also IDs and classes. The contract specifies what information is present in the XML, and how this information is structured. Included in the standard are global constructs, such as the portal logo or columns, and local widgets such as the document description or a simple form field.

Documentation

Plone documentation is well organized and, while more extensive than that of many open source alternatives, it is still incomplete. One must regularly consult a mix of online documentation (e.g. the Plone Book), published material, and listserv archives. The good news is that the community is large and there are likely answers to most questions, but it can take some effort to piece together complete answers.


Additional Work

• There was no time to fully convert one of our sample MIT sites to Plone.
• Create page templates
• Turn off portal features to get a better idea of styling options in non-portal mode
• Integration with certificates, probably done the standard Apache way

Comments (Advantages)

• Easy to install
• In-browser content editing is easy using Epoz (browser specific)
• Many add-on modules which can be installed to extend functionality
• Plone seems more "complete" and polished at this point than many of the other systems we have reviewed so far.
• Plone is more than Zope
• Python for development

Disadvantages

• Python, an object-oriented scripting language, is little used by IS&T, although it is not difficult to learn.
• Zope Management Interface needs improvement.
• Learning curve may be high

Questions:

• Review underlying architecture. Is the underlying architecture sound?
• Future directions: is Plone being developed in a coherent way?
• Zope templates: how far do they enforce the separation of concerns?

Sites: Plone sites: (http://plone.org/about/sites)


Appendix G. NCSU Lenya Questions and Answers

Systems and Operations

In our operational environment, we have applications in Perl, ColdFusion, PHP, and JSP as well as databases using Oracle, MySQL, and Postgres. Can Lenya handle them, and if so, how?

There are two parts to this answer. How does Cocoon/Lenya CMS let you write documents that interface to these applications at authoring time? Can the underlying Cocoon platform interface to existing applications written in a variety of languages and databases at run time?

Lenya is only a CMS tool and is not used directly for serving pages at run time. Thus the question of whether Lenya can handle these existing applications reduces to whether Lenya allows the author to insert the user interface components into a page at authoring time. For example, can an author insert into a document the required PHP tags for a registration application written in PHP? The answer is yes though the degree of integration depends on how XML friendly the interface components are. For example, JSPs can be written to not be well formed XML. Lenya can certainly insert a JSP into a document, but if it is not well formed, additional processing by a Cocoon pipeline may not be possible.

Once a site has been authored, it can be deployed statically or, if there is dynamic functionality, on any number of server platforms. There is no requirement that the deployed site use Cocoon as the server platform. One compelling reason for using Cocoon as the deployment server is its ability to interoperate with a wide variety of existing applications. If the existing application can produce XML (and most web applications can), it is relatively straightforward to incorporate the output of the application into the page composition facilities of Cocoon. Interfacing to existing applications that capture user input can also be handled, though there are issues with transaction semantics and session control that must be addressed. Cocoon also has a range of out-of-the-box techniques for interfacing to databases, which are explained below.

How can one query an external RDBMS through Lenya? How is the RDBMS results set handled by Lenya?

As with the previous question, this question can be answered for both the authoring phase and the run time deployment phase.

Lenya natively uses the operating system's file system as its document repository. Each document is stored as a single XML file, including the meta information associated with the document. Currently Lenya does not use an RDBMS for any persistence. However, since Lenya is based on Cocoon, there is a wide range of options available for querying an external RDBMS. The simplest solution is the use of ESQL (http://cocoon.apache.org/2.1/userdocs/xsp/esql.html) within an XSP Logicsheet (http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html). ESQL is a wrapper around JDBC, so all databases supported by your JDBC driver are supported. The next level up would be to use native JDBC directly in XSP or in Java transformers you write yourself. At the most sophisticated level, you can use persistence mapping tools such as Hibernate (http://www.hibernate.org/) that provide enterprise-scale functionality and performance. In all these cases, result sets are handled by the facilities of the persistence tool used.

If Cocoon is chosen as the run time application server, you have this same set of database access methods to implement your dynamic behavior.
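As a rough illustration of what a wrapper such as ESQL does conceptually (run a query, then hand the result set onward as XML), here is a minimal Python sketch. It is not Cocoon or ESQL code: the in-memory SQLite database, the `courses` table, and the `rows_to_xml` helper are all assumptions made up for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(connection, query, root_tag="rowset", row_tag="row"):
    """Run a query and wrap each result row in elements named after its columns."""
    cursor = connection.execute(query)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for column, value in zip(columns, record):
            ET.SubElement(row, column).text = "" if value is None else str(value)
    return root


# Hypothetical sample data standing in for an existing departmental database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE courses (number TEXT, title TEXT)")
connection.executemany(
    "INSERT INTO courses VALUES (?, ?)",
    [("6.001", "Structure and Interpretation of Computer Programs"),
     ("18.06", "Linear Algebra")],
)

result = rows_to_xml(connection, "SELECT number, title FROM courses ORDER BY number")
print(ET.tostring(result, encoding="unicode"))
```

Once the rows are expressed as XML, they can be styled and aggregated like any other document content, which is the property the answer above relies on.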

Implementation issues - LDAP authentication.

Lenya can manage authentication itself or use an LDAP server.

Structure

What is the file system used?

Lenya uses the standard Java IO library to abstract the file system operations. Therefore, it uses the native file system provided by the operating system.

How do I use Lenya to create and manage my web site?

This is a large question that will be addressed in the seminar. A brief synopsis is: a development team will construct a new publication that defines the document types, styles, work flow and functionality for your web site as part of the setup of Lenya; administrators will add users and manage their access privileges using Lenya; authors will create document content using Lenya; and reviewers will reject or publish changes to documents using Lenya.
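The author and reviewer roles in that synopsis can be pictured as a small state machine. The following Python sketch is only an illustration of the idea, not Lenya's workflow engine: the state names, action names, and the `Document` class are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical workflow table:
# (current state, action) -> (role allowed to act, next state)
TRANSITIONS = {
    ("authoring", "submit"): ("author", "review"),
    ("review", "reject"): ("reviewer", "authoring"),
    ("review", "publish"): ("reviewer", "live"),
}


@dataclass
class Document:
    title: str
    state: str = "authoring"
    history: list = field(default_factory=list)

    def apply(self, action, role):
        """Move the document to its next state if the action and role are allowed."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"'{action}' is not allowed in state '{self.state}'")
        allowed_role, next_state = TRANSITIONS[key]
        if role != allowed_role:
            raise PermissionError(f"role '{role}' may not perform '{action}'")
        self.history.append((self.state, action, next_state))
        self.state = next_state
        return self.state


page = Document("Admissions FAQ")
page.apply("submit", "author")     # author hands the draft to a reviewer
page.apply("reject", "reviewer")   # reviewer sends it back
page.apply("submit", "author")
page.apply("publish", "reviewer")  # reviewer approves; the page goes live
print(page.state, len(page.history))
```

Keeping the transitions in a table makes it easy to see how many approval stages a site actually needs before configuring them in a real CMS.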


How do I use Lenya to put content into my web site?

Lenya provides a number of editors to allow you to put or edit content in your web site. This is one of the distinguishing characteristics of Lenya compared to other CMS products, which provide only a single editor. Probably the most common approach is to use one of the browser-based WYSIWYG editors, which allow you to edit content in a manner similar to that of a word processor. We will be using the BXE editor as an example of this type of editor in the seminar. Lenya also provides a forms-based editor that allows simple entry of structured content. Lastly, Lenya provides some facilities for uploading content authored outside of Lenya.

Should I be modifying the Cocoon sitemap to use Lenya?

Well, it depends on what your objective is. First, Lenya uses a number of sitemap files to implement its functionality. The base sitemap file is the one provided by Cocoon. Flow of control is then passed to a number of Lenya-specific sitemap files that provide standard CMS functionality such as work flow, revision control, UI, etc. Finally, each publication will have its own set of sitemap files that define the functionality and look and feel for each site. Typically, only the publication sitemap files will be written, and that happens when the publication is initially constructed. Users of Lenya will be unaware of sitemap files.

Can you please help me read and understand your sitemap?

Yes, we will walk through the various sitemap files used by Lenya. To help you understand sitemap processing, we suggest that you enable debugging-level logging in Lenya by modifying the WEB-INF/logkit.xconf file and examining the logging output in the WEB-INF/logs/sitemap.log file. There is also a sequence diagram describing the flow of control through the various sitemap files at the Lenya wiki.

How do I tap into all that "PageEnvelope" stuff? Example, how do I put a value into the "publication-id"? How do I get a value out of the "publication-id"?

The PageEnvelope stuff is implemented with the Cocoon input module facility. Modules are an alternative to URI parsing for passing information, typically session information, to a pipeline.

Cocoon input modules are defined in the cocoon.xconf file. For example, the declaration of the page-envelope module is:

<input-modules>
  …
  <component-instance
      name="page-envelope"
      class="org.apache.lenya.cms.cocoon.components.modules.input.PageEnvelopeModule"
      logger="sitemap.modules.input.page-envelope"/>
  …
</input-modules>

Input modules can be referenced in a sitemap file using the {…} syntax. For example, to access the id of the requested document:

<map:parameter name="documentid" value="{page-envelope:document-id}"/>
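To make the `{module:attribute}` notation concrete, the Python sketch below mimics how such a placeholder could be expanded. It is only an illustration of the lookup idea and not Cocoon's implementation: the `resolve` function, the dictionary of modules, and the sample values `default` and `/faq/index` are hypothetical, and the sketch covers reading values only.

```python
import re

# Matches placeholders of the form {module-name:attribute-name}.
PLACEHOLDER = re.compile(r"\{([A-Za-z0-9_-]+):([A-Za-z0-9_-]+)\}")


def resolve(value, modules):
    """Replace every {module:attribute} placeholder with the module's value."""
    def lookup(match):
        module_name, attribute = match.group(1), match.group(2)
        return str(modules[module_name][attribute])
    return PLACEHOLDER.sub(lookup, value)


# Hypothetical values a page-envelope module might expose for one request.
modules = {
    "page-envelope": {
        "publication-id": "default",
        "document-id": "/faq/index",
    }
}

resolved = resolve("{page-envelope:document-id}", modules)
print(resolved)
```

The same helper expands placeholders embedded in longer strings, for example `resolve("pubs/{page-envelope:publication-id}/content", modules)`.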

Let's say I need to make a small subset of content elements required on a subset of webpages (e.g. FAQs pages). Do I need to create a new Document Type for each class of webpage? Or is there a way to specify a "subtype" of XHTML with required elements?

XHTML 1.1 specifically supports the ability to define subsets of the overall functionality using its modularization facilities. Lenya supports this modularization using the RELAX NG XML schema language. The example default publication that ships with Lenya defines a subset of XHTML.
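The effect of such a restricted schema on an FAQ page can be approximated with a much simpler check. The Python sketch below is a stand-in for the idea of "required elements" only; it does not use RELAX NG, and the `FAQ_REQUIRED` rule set and the `missing_elements` helper are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical rule for an "faq" document type: tag name -> minimum occurrences.
FAQ_REQUIRED = {"h1": 1, "dl": 1, "dt": 1, "dd": 1}


def missing_elements(xml_text, required):
    """Return the required tags that occur fewer times than the rule demands."""
    root = ET.fromstring(xml_text)
    missing = []
    for tag, minimum in required.items():
        found = len(list(root.iter(f"{{{XHTML_NS}}}{tag}")))
        if found < minimum:
            missing.append(tag)
    return missing


GOOD = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h1>FAQ</h1><dl><dt>Where do I park?</dt><dd>In the garage.</dd></dl>
</body></html>"""

BAD = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h1>FAQ</h1><p>No questions yet.</p>
</body></html>"""

print(missing_elements(GOOD, FAQ_REQUIRED))
print(missing_elements(BAD, FAQ_REQUIRED))
```

A real schema also constrains nesting and ordering, which this count-based check does not attempt.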

Is schema validation the only way to implement form validation in Lenya?


Schema validation is used only for validating that a document is valid according to the XML schema it is declared to be an instance of. Lenya uses a RELAX NG validator for both the BXE WYSIWYG and the forms-based editors.

Your question seems to be related to a completely different issue: how user input is validated in a dynamic application that uses XHTML forms. This has nothing to do with XML schemas, and thus schema validation is not applicable. Instead, the dynamic application will perform input validation either in the browser with JavaScript or on the server using whatever language the server supports. Cocoon has several form processing components which include input validation.

Search engine

How is the Lenya search tool configured? For example, how would I create a search index of only a specific Document Type?

Lenya is distributed with the Lucene search library, but any other search application can be used. Lucene is not used directly by Lenya, but is instead used as part of the production delivery system.

Lucene is highly configurable. It can be easily configured to search only documents that are an instance of a specific XML schema. Lucene also supports non-XML documents such as Microsoft Word and PDF using external applications.

What search options (e.g. stemming, phrasing, stop words) can I configure? Where is this done? How and when are search indices updated?

Lucene uses its own Query Parser to define matches.

Indexing would normally be done whenever the live site is modified. Lucene has a complete API for controlling the created index. More detail on indexing can be found in the Lucene FAQ indexing section.
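To show what options such as stop words and per-document-type indexing mean in practice, here is a toy inverted index in Python. It is not Lucene and does not use Lucene's API; the stop-word list, the sample documents, and the `build_index` and `search` helpers are hypothetical.

```python
import re
from collections import defaultdict

# A deliberately tiny, hypothetical stop-word list.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def build_index(documents, doctype=None):
    """Map each token to the ids of the documents containing it.

    If doctype is given, only documents of that type are indexed.
    """
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        if doctype is not None and doc["doctype"] != doctype:
            continue
        for token in tokenize(doc["text"]):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


documents = {
    "/faq/parking": {"doctype": "faq", "text": "Where to park on campus"},
    "/news/parking": {"doctype": "news", "text": "The parking garage reopens"},
    "/faq/email": {"doctype": "faq", "text": "How to set up campus email"},
}

faq_index = build_index(documents, doctype="faq")
print(sorted(search(faq_index, "campus")))
```

Because this sketch does no stemming, a query for "parking" does not match the FAQ page that only says "park", which is the kind of behavior a stemming option would change. Rebuilding the index whenever the live site changes corresponds to the indexing schedule described above.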

Lenya supports multi-channel publishing (e.g. publishing an XHTML document as a PDF). What is the process for creating a new publishing output type, for example an Atom XML feed? Schema validation is currently supported with RELAX NG XML syntax. Is RELAX NG compact syntax supported as well?

Document types are configured for each publication. The following steps are required:

• The desired document types for the publication are declared in the configuration file <pub>/config/doctypes/doctypes.xconf.

• For each declared document type, a RELAX NG schema is written and placed in the directory <pub>/config/doctypes/schemes/.

Currently, the BXE editor does not support RELAX NG compact syntax.

Editor

Rich text editor - to plug in htmlArea instead of Kupu

Lenya is architected to allow the use of multiple editors. Currently there is support for the BXE, Kupu, and Xopus in-browser editors, of which BXE is the best supported. It is possible to integrate htmlArea, but this will require some effort. There are also issues about the range of supported XML schemas and the guarantees of validity when using editors other than BXE.

Metadata - need to increase the number of metadata elements

Lenya ships with direct support for the Dublin Core metadata standard. It is straightforward to support other standards in your specific publications.

If I'm a web page content provider, how can I build a publication that retrieves customized content from an existing database-driven tool? Can I do this, or must I also involve a web page designer? Is there a real example in Lenya?

Again, we must first consider if you are referring to accessing the content from the existing database at authoring time or production time.

For authoring time, Lenya does not provide an out of the box solution to interfacing to existing databases that would allow an author to insert content into a document they are authoring.


For production time, any application server can be used to serve existing content from a database. Lenya does not get involved at all. You can use Cocoon as the application server. Cocoon has extensive facilities for interfacing to databases. Generally, you will need a web graphic designer to design the page layout and look and feel. Cocoon has several examples that show database access.

Basic Templating - any documentation?

Templating is provided by Cocoon. All the Cocoon documentation is relevant here.

Explanation of template file structure and how it relates to Lenya as a whole.

Lenya does not enforce a fixed template system. Each publication defines its own templating approach, including the file structure used. Please see the next question for more on this.

Templating options - I know we can use XHTML and XML/XSLT. Are there any additional templating options?

XHTML, XML, and XSLT are simply underlying technologies. An architecture must be selected first that employs these technologies to define a templating system. Lenya does not impose any particular templating approach.

The architecture used by most Lenya publications is based on Cocoon Aggregators. Generally, the outputs from multiple XSLT transformers are combined to form the overall document. Each transform constructs one part of the overall document (for example, the masthead). This is a simple approach that is similar to that provided by other CMS templating systems.

An alternate approach is to use a single XHTML template document that uses XInclude to aggregate content. In some ways, this is even simpler than the first approach in that no knowledge of XSLT is required.
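The XInclude approach can be tried out with nothing more than the Python standard library, whose `xml.etree.ElementInclude` module expands `xi:include` elements. The sketch below is independent of Lenya and Cocoon: the template, the fragment names `masthead.xml` and `body.xml`, and the in-memory `FRAGMENTS` store are hypothetical stand-ins for files in a publication.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical page fragments; in a real site these would be separate files.
FRAGMENTS = {
    "masthead.xml": "<div id='masthead'>Department of Examples</div>",
    "body.xml": "<div id='content'><p>Welcome.</p></div>",
}

# A single template document that pulls the fragments together.
TEMPLATE = """<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="masthead.xml"/>
    <xi:include href="body.xml"/>
  </body>
</html>"""


def load_fragment(href, parse, encoding=None):
    """Loader used by ElementInclude: return the parsed fragment for an href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(FRAGMENTS[href])


def render(template):
    """Parse the template and replace each xi:include with its fragment."""
    root = ET.fromstring(template)
    ElementInclude.include(root, loader=load_fragment)
    return root


page = render(TEMPLATE)
print(ET.tostring(page, encoding="unicode"))
```

The template author only needs to know which fragments exist and where to place them, which is why this route asks for no XSLT knowledge.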

Best way to include files in templates - is it a special procedure, or a standard type of include in the template (e.g., a different DHTML navigation menu)?

Hmmm, what is the best way is subjective. Lenya's approach is to allow the developer to use whatever templating approach they feel is most effective. The answer to the previous question provides two alternatives. Many more are possible.

Features

The following are features listed as available in Lenya. We would like clarification and/or more information on how we would implement/access them, etc.

• Pluggable Authentication - clarification
• Session Management
• Asset Management
• CGI-mode Support
• Content Reuse - to have the Kinderspital piece in the default publication?
• Content Scheduling
• Email To Discussion
• Internationalization - not priority but of interest
• Macro Language
• Server Page Language
• Sub-sites / Roots
• Template Language
• Themes / Skins
• Blog - is the content here integrated with the application, reusable content?
• Document Management - how is this implemented?
• FAQ Management - we plan on implementing an FAQ service on our website
• File Distribution
• Link Management - this is listed as "Limited" but would be useful for us to see it
• Mail Form


Appendix H. CMS Consultation Guidelines

Regardless of the size of your web site or the number of authors and site administrators you have, your site management team must address these issues in the decision-making process of choosing a Content Management System.

Do you need Content Management or a Knowledgebase?

The first question the customer should consider is whether or not a content management system is really the right tool for the job. Is the customer more concerned with search and retrieval of data nuggets? If so, a knowledgebase may be in order. Is content presentation and publishing more important than indexing? In that case, a CMS may be what the customer needs.

Platform matters.

Every CMS product has platform issues to be taken into consideration, both at the hosting and authoring levels. While system specs will tell you in a straightforward manner what your hosting constraints are, the capabilities of authoring tools blur very quickly. While many of the most sophisticated browser-based WYSIWYG tools require the PC version of Internet Explorer, there are Java-based interfaces that have a wider range of application. Additionally, there are third-party HTML editors that may be used in place of whatever less sophisticated tool comes with the product you select. You must become familiar with exactly which features are available to which browsers on what platforms. Macromedia Contribute is cross-platform, but in its current version its Mac functionality is very different from its PC capabilities. Interfaces that separate data from presentation are the most flexible in terms of platform, but are less intuitive to the author who is not HTML-savvy and may require development. Be aware both of what's available in the product and what operating systems your authors will be using, as well as their level of web expertise.

Do you require print publishing? Do you want PDF output of your content?

Then you may want to pay attention to those products, like Lenya, that house data in a structured format, with the ability to provide content output in XML format for easier repurposing from web to print. Lenya also offers direct PDF output.

Do you ever intend to use your content as single source but with different templated layouts?

For example, do you display spotlight articles that later become FAQs? If so, you may not want to include HTML markup in your authoring interface, because any later changes to content formatting will require manual stripping and re-editing of HTML tags. Your best bet may be a structured database product that separates presentation from content, with form-based authoring and no tagging stored with the content.

Do you need tight integration with MS Office or easy WYSIWYG authoring?

Then Contribute may be the right product for you. If your authors are completely unversed in HTML tagging, or are uncomfortable copying and pasting source code from Dreamweaver into an authoring form, then Contribute provides the easiest authoring interface of any product we reviewed.

Content migration will be a painful issue for any CMS product.

If you already have a web site, chances are that you have a lot of static pages, all tagged up. There is no easy way around stripping out HTML tags for importing the content into the database of a CMS. If you decide to go with a product that does not separate content from format, you may have an easier time migrating your data. Either way you must consider the complications of migrating your data and plan your resource allocation accordingly. If you are planning to start a web site from scratch, migration is less of an issue than the authoring process.

What are the technical resources available to you?

Do you choose to support your own web hosting infrastructure, or do you prefer to receive hosting services from IS&T? Are you constrained to Athena for any reason? Hosting your own CMS solution will require the acquisition, installation, and ongoing maintenance of hardware and software. It may also involve acquiring new technical skill sets within your departmental technical staff.

Do you already have a content management business process?

If not, get one before attempting to implement a CMS solution. A CMS is a supporting technology for a business process, not a substitute for one. Things to think about in your business process in terms of how they will translate to your CMS solution: How much workflow do you need, i.e., how many stages of approval do you need? Do you have agreed-upon authoring conventions? What about versioning? Is there top-level buy-in in your department for adapting your current publishing process to incorporate a CMS? If your authors do not use your CMS, you've wasted a lot of resources implementing it.

Do you need a dynamic site that generates pages on the fly, or can your needs be met with static rendering of system-housed content?

If your content does not change frequently, you will get better performance with static pages. Some or all of your site may need to be database-driven, but don't automatically make that assumption. Your needs may be well served by static pages generated from a database repository or XML structured data.
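Two of the guidelines above, single-source content with different layouts and static rendering from a repository, can be illustrated together. The Python sketch below is a minimal, product-neutral illustration: the `ARTICLE` record, the two layouts, and the `render` helper are hypothetical, and a real CMS would read content from its repository and write the results to files.

```python
from html import escape
from string import Template

# One content record stored without any presentation markup (hypothetical).
ARTICLE = {
    "title": "Using Athena & WebDAV",
    "summary": "A short spotlight on publishing to an Athena locker.",
    "body": "Mount the locker, then copy the generated pages into it.",
}

# Two layouts for the same record: a spotlight box and an FAQ entry.
LAYOUTS = {
    "spotlight": Template('<div class="spotlight"><h2>$title</h2><p>$summary</p></div>'),
    "faq": Template("<dt>$title</dt><dd>$body</dd>"),
}


def render(record, layout_name):
    """Escape the plain-text fields and pour them into the chosen layout."""
    safe = {key: escape(value) for key, value in record.items()}
    return LAYOUTS[layout_name].substitute(safe)


# "Static rendering": generate every page once, ahead of any visitor request.
static_pages = {name: render(ARTICLE, name) for name in LAYOUTS}
for name, markup in static_pages.items():
    print(name, markup)
```

Because the record carries no HTML of its own, moving the article from a spotlight to an FAQ entry is a change of layout rather than a manual re-edit of tags.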