ILCTA Electronic Logbooks

Woodshed, WH7X (FNAL)


Description
Discuss the homework from last week. Develop follow-on questions. Decide if any products can be eliminated at this stage. We might not finish this in one hour, but we will do our best.
Minutes of Meeting, May 26, 2006 1:00 CDT
Present: Rob, Suzanne, Jim, Wally
By Phone: Claude

Minutes written by Rob. I have added some information that I learned after the meeting. Corrections and comments are welcome. [ My editorial comments in square brackets. ]

Our next meeting is Friday June 2, 2006 at 1:00 PM CDT, in the Woodshed, WH7X.

Thanks to everyone who prepared summaries of the architecture choices for the various candidate log/notebooks. This information is now summarized in ILC-doc-292. The information provided by Wally was added to that document after the meeting. The main topic of the meeting was to discuss the information summarized in this document.

1) About the AD Elog:
   a) All 4 ILC test areas at Fermilab are already using it at some level. Two new instances were recently created for A0.
   b) Wally does not believe that the architecture is robust enough to serve all our needs for 10 years: security, searchability ...
   c) We also discussed that we would like to avoid Perl if possible (see 8 below).
2) How do we react to 1a)? If we decide to recommend a different product then we need to explicitly address the fact that there is already an investment in the AD Elog. In particular we need to acknowledge the retraining costs and include those costs in the tradeoffs.
3) Some of the logbook products require Oracle, yet they say that they are no cost. Jim tells us that the lab has a site wide Oracle license and that the cost to add new Oracle deployments is small. Moreover the ILC, at present, has a project wide Oracle license in its spec.
4) We agreed to take one feature off of the use case list: we are not looking for a product that people can use to deploy a new logbook on their own laptop. We are only considering a "centrally" deployed server which people can access via a browser. This simplifies the backup issue and any Oracle license issues.
5) Two logbooks are off our list:
   a) The DESY-IHEP elog is too immature to be considered for something we will start to use soon.
   b) The PSI Elog appears to be much too hard to maintain.
6) Even though something is off our list, we should still understand its features so that we know what we are missing.
7) What about the SNS E-log?
   - Claude will look into it.
8) There seems to be a consensus that Perl has had its day:
   a) The features of Perl that were once unique, such as regexps, are now available in many languages, e.g. Java.
   b) We have lots of horror stories that Perl code can be hard to maintain, although we acknowledge that well written Perl could be easy to maintain.
   c) New products are less likely to have a Perl API than a Java API.
   This is a strike against the AD Elog. In the JLAB/SLAC elog, the use of Perl is sufficiently restricted that it is not a serious problem.
9) About PHP. Is it an acceptable language? We did not reach a conclusion. So far we know:
   a) It is popular at ANL and is expected to have a future.
   b) In the past there were security concerns. No one knew if this was still a concern.
10) So far as we know, none of the products is currently being used as a notebook. All are used only as logbooks.
11) What would it take to add notebook functionality?
   - CRL currently has annotation but not editing. One could add editing of entries, with an edit on/off switch on a per topic basis.
   - AD Elog currently has a "repair" feature that can only be invoked by administrators. Both the original and the modified text are saved. If desired, this could be extended to a general editing feature.
   - Need to learn about others.
12) The JLAB/SLAC product creates entries by making a temporary XML file that is periodically swept into the db. The other products talk directly to the db. Do we care which we choose? The answer is that both are OK.
13) About security. None of the products has state of the art security.
   We should anticipate that, in the future, the requirement of state of the art security may be imposed on us. For example, the computing division has been told to migrate the various docdb instances to certificate based access. Here is a summary of what we know about security:
   a) The AD elog distinguishes by address: people inside the lab firewall do not need a password to read, but people outside the firewall do.
   b) There is a Java library to support PKI certificates (PKI = Public Key Infrastructure, aka X.509), so it should be relatively straightforward to add this feature to any of the Java based products.
   c) The LHC community failed with a certificate only policy.
   d) We need to understand if the effort required to get a certificate has been reduced to the point that we have moved past the LHC experience. In any case we need to account for this effort in our judgements.
   e) Suzanne thinks it would be relatively easy to add PKI authentication to CRL.
14) A use case to remember:
   - People come and go, so they need to be added and subtracted from the security system.
15) About searching. Most of the products do searches of their database data, but not of their attachments. In some cases it would be straightforward to extend the searches to include attachments. Searching is usually not indexed, so searching can take a long time if there are many entries. We need to understand the scope of our problem to know if this is an issue for us.
   The one exception is the DESY TTF logbook, which uses Apache Lucene:
      http://lucene.apache.org/java/docs/
   This is a tool that builds an index of the things searched; it can do incremental builds of the index as well as batch builds. According to the FAQ there are tools to pull the text out of .doc, .pdf, .xls and many other formats. These are not formally part of Lucene but can be used to index documents in these formats.
   Wally says that searching is a weak spot in the AD elog.
   HepBook (aka KBook, for Knowledge Book) also uses Lucene.
16) After the meeting Suzanne was curious if we could use Google searching. There is a syntax to tell Google to restrict its searches to particular sites. This only works if Google has a button we can push to tell it to update its indices (otherwise it will not find newly created entries). For example, the syntax to find references on the Fermilab site to tonight's lecture on the Gathering Storm report is:
      "Gathering Storm" site:fnal.gov
17) An aside on history. The AD elog started life as a notebook at ORNL. It was taken to FNAL and converted to a logbook on a separate development path. The SNS logbook is informed by the experience at ORNL with their earlier product, but it is a fresh start, not an evolution of the old product.
18) Claude reported that he had heard rumors that the CRL made "unorthodox" use of its database. Suzanne replied that the CRL puts a minimum amount of info in the db for each entry. The rest of the information is stored in the HTML or XML file that contains the body of the entry. One useful side effect of this strategy is that the db can be 100% rebuilt by scanning the entries. This has been done in the past when some users were careless with backups.

Homework:
1) Questions for Claude to ask Jerzy:
   - Does the TD weblog support multiple logbooks?
2) All:
   a) Find out if the product can be used as a notebook (i.e. with editable entries). Has it been used that way now or in the past? If not, what would it take to make some class of entries editable?
   b) How hard would it be to add certificate based security if not already present? Does the architecture allow a plug replacement of the security mechanism or does it have tentacles throughout the system?
   c) Please clarify the present security system:
      - individual or group accounts (or both)?
      - can someone log in as an individual but have permissions as a member of one or more groups?
      - does the product maintain its own username/password database or does it use some service provided, for example, by the lab?
   d) Is the architecture such that adding something like Lucene is straightforward or is it an ugly hack?
3) Claude will learn something about the SNS elog.
4) Rob will learn about HepBook.
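
Illustration for item 15 (indexed searching). The point made there is that an unindexed search has to rescan every entry, while an indexed one looks words up directly and can be updated as entries arrive. The sketch below shows that idea only; it is not Lucene and does not use Lucene's API. The names EntryIndex, add_entry, build and search are hypothetical, and the sample entry text is made up.

```python
import re
from collections import defaultdict


class EntryIndex:
    """Toy inverted index: maps each word to the ids of entries containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Lowercase alphanumeric words; a real indexer does far more than this.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add_entry(self, entry_id, text):
        """Incremental build: index one new entry without rescanning old ones."""
        for word in self._tokens(text):
            self._postings[word].add(entry_id)

    def build(self, entries):
        """Batch build: index every (entry_id, text) pair in one pass."""
        for entry_id, text in entries:
            self.add_entry(entry_id, text)

    def search(self, query):
        """Return the sorted ids of entries that contain every word in the query."""
        words = self._tokens(query)
        if not words:
            return []
        hits = None
        for word in words:
            ids = self._postings.get(word, set())
            hits = set(ids) if hits is None else hits & ids
        return sorted(hits)


# Made-up sample entries, for illustration only.
index = EntryIndex()
index.build([(1, "Cavity quench at 2 K"), (2, "Cryo plant restart")])
index.add_entry(3, "Second cavity test, no quench")
print(index.search("cavity quench"))
```

Attachments would fit the same scheme once their text has been extracted, which is the role of the external tools mentioned in the Lucene FAQ.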
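
Illustration for item 18 (rebuilding the db by scanning the entries). This is a sketch of the general strategy Suzanne described, not of the CRL's actual schema or file format, neither of which was discussed. It assumes one XML file per entry with hypothetical "author" and "created" attributes on the root element, and uses SQLite only as a stand-in database. The function name rebuild_index is also hypothetical.

```python
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def rebuild_index(entry_dir, conn):
    """Recreate the minimal per-entry table by scanning the entry files on disk."""
    conn.execute("DROP TABLE IF EXISTS entries")
    conn.execute(
        "CREATE TABLE entries (path TEXT PRIMARY KEY, author TEXT, created TEXT)"
    )
    count = 0
    for path in sorted(Path(entry_dir).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        # Assumed layout: <entry author="..." created="...">body text</entry>
        conn.execute(
            "INSERT INTO entries VALUES (?, ?, ?)",
            (path.name, root.get("author"), root.get("created")),
        )
        count += 1
    conn.commit()
    return count


# Demonstration with two made-up entry files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "0001.xml").write_text(
        '<entry author="rob" created="2006-05-26">First entry</entry>'
    )
    Path(demo_dir, "0002.xml").write_text(
        '<entry author="suzanne" created="2006-05-27">Second entry</entry>'
    )
    demo_conn = sqlite3.connect(":memory:")
    print(rebuild_index(demo_dir, demo_conn), "entries indexed")
    demo_conn.close()
```

Because the table is derived entirely from the files, losing the database costs only the time of a rescan, provided the entry files themselves are backed up.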
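
Illustration for item 13a (password rule based on the firewall). The rule as stated covers reading only, so this sketch assumes that any other action always needs a password; that part is an assumption, not something reported at the meeting. The network range is a placeholder taken from the address blocks reserved for documentation, not the lab's real network, and the function name needs_password is hypothetical.

```python
import ipaddress

# Placeholder range (reserved for documentation), standing in for the lab network.
LAB_NETWORK = ipaddress.ip_network("192.0.2.0/24")


def needs_password(client_ip, action):
    """Reads from inside the lab network are open; everything else needs a password."""
    inside = ipaddress.ip_address(client_ip) in LAB_NETWORK
    return not (inside and action == "read")


print(needs_password("192.0.2.10", "read"))    # inside the placeholder range
print(needs_password("198.51.100.7", "read"))  # outside the placeholder range
```

A rule like this is keyed to network location rather than to a person, so it does not address the use case in item 14 of adding and removing individuals.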