Index: openacs-4/packages/file-storage/www/doc/design.adp
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/file-storage/www/doc/design.adp,v
diff -u -r1.1.2.3 -r1.1.2.4
--- openacs-4/packages/file-storage/www/doc/design.adp 9 Jun 2016 13:03:12 -0000 1.1.2.3
+++ openacs-4/packages/file-storage/www/doc/design.adp 3 Jul 2016 18:15:32 -0000 1.1.2.4
@@ -29,8 +29,8 @@
any risk that a cracker-uploaded file will be executed as code
  • Retrieve historical versions of a file
  • We want something that is relatively secure, and can be extended
-and maintained by any ArsDigita programmer, i.e., something
-that requires only AOLserver Tcl and Oracle skills.
+and maintained by any ArsDigita programmer, i.e.,
+something that requires only AOLserver Tcl and Oracle skills.

    In ACS 4, File Storage can be implemented on top of the Content Repository. Thus, there is no data model associated with File Storage. It is only a UI and a small set of Tcl and PL/SQL library
@@ -50,7 +50,7 @@
and passwords are transmitted in the clear. A cracker might sniff a password, upload Perl scripts or ADP pages, then grab those URLs from a Web browser. The cracker is now executing arbitrary code on
-your server with all the privileges that you've given your Web
+your server with all the privileges that you've given your Web
server.

    The File Storage application is not a web-based file system, and can not be fairly compared against such systems. The role of File
@@ -65,9 +65,9 @@
trees.) The reasons for this were to simplify the code and the user experience. However, this system actually caused some confusion (e.g., explicitly granting permission to an outsider on a
-file in a group's private tree did not actually give that person
-access to the file) and was not as flexible as people desired. The
-ACS 4 version includes folder read, write and delete
+file in a group's private tree did not actually give that
+person access to the file) and was not as flexible as people
+desired. The ACS 4 version includes folder read, write and delete
permissions.

    Note that this can create some funny results. For example, a user might have write permission on a folder, but not on some of
@@ -76,27 +76,28 @@

    Deletion of Files

    Previous versions of File Storage allowed only administrators to actually delete content (although users could mark content as
-"deleted" using a toggle in the data model, deleted_p.) However,
-the proper use of versioning should allow users to avoid
+"deleted" using a toggle in the data model, deleted_p.)
+However, the proper use of versioning should allow users to avoid
accidentally losing their files. So, in this version, if a person asks to delete a version or a file, we really delete it.

    Use of Content Repository

    Basing this system on the Content Repository provides a wealth of useful functionality for File Storage with no additional development costs. However, it may also constrain the system somewhat.

    -

    The Content Repository's datamodel has been extended to include -an attibute to store the filesize. Unfortunately, the Content -Repository does not automatically do this, since files may be -stored on the filesystem (the Content Repository thus serving as a -catalog to keep track of file location and some metadata, but not +

    The Content Repository's datamodel has been extended to +include an attibute to store the filesize. Unfortunately, the +Content Repository does not automatically do this, since files may +be stored on the filesystem (the Content Repository thus serving as +a catalog to keep track of file location and some metadata, but not the filesize). The filesize is therefore calculated whenever a file is inserted in the Content Repository by the external program (the -webserver's database driver) doing the insertion into the +webserver's database driver) doing the insertion into the database..

    -

    The content_revision is subtyped as a "file-storage-item" to -allow site-wide search to distinguish file storage objects in its -search results. This feature is not implemented yet, however.

    +

    The content_revision is subtyped as a +"file-storage-item" to allow site-wide search to +distinguish file storage objects in its search results. This +feature is not implemented yet, however.

    Permissions Design

    Permissions were chosen to make as much use as possible of the predefined privileges while keeping the connotative value of each @@ -134,31 +135,34 @@ this is to abstract from the Content Repository datamodel and naming conventions, due to the different way File Storage labels its objects.

    -

    The main objects of File Storage are "folders" and "files". A -"folder" is analogous to a subdirectory in the Unix/Windows-world -filesystem. Folder objects are stored as Content Repostory folders, -thus folders are stored "as is" in the Content Repository.

    -

    "Files", however, can cause some confusion when stored in the -Content Repository. A "file" in File Storage consists of meta-data, -and possibly multiple versions of the file's contents. The main -meta-data of a "file" is its "title", which is stored in the -Content Repository's "name" attribute of the cr_items table. The -"title" of a file should be unique within a subdirectory, although -a directory may contain a file and a folder with the same -"title".

    +

    The main objects of File Storage are "folders" and +"files". A "folder" is analogous to a +subdirectory in the Unix/Windows-world filesystem. Folder objects +are stored as Content Repostory folders, thus folders are stored +"as is" in the Content Repository.

    +

    "Files", however, can cause some confusion when stored +in the Content Repository. A "file" in File Storage +consists of meta-data, and possibly multiple versions of the +file's contents. The main meta-data of a "file" is +its "title", which is stored in the Content +Repository's "name" attribute of the cr_items table. +The "title" of a file should be unique within a +subdirectory, although a directory may contain a file and a folder +with the same "title".

    Each version of a file is stored as a revision in cr_revisions table of Content Repository. The Content Repository also allows some meta-data about a version to be stored in this table, and indeed File Storage uses attributes of the cr_revisions table are used. However, this is where the confusion is created. The name of -the filename uploaded from the client's computer, as a version of -the file, is stored in the "title" attribute of cr_revisions. Note -that "title" is also used as the (unique within a folder) -identifier of the file stored in cr_items. Thus, wrappers to the -Content Repository API makes sure that the naming convention is -corect: cr_items.name attribute stores the title of a file and all -its versions, while the cr_revisions.title attribute stores the -filename of the version uploaded into the Content Repository.

    +the filename uploaded from the client's computer, as a version +of the file, is stored in the "title" attribute of +cr_revisions. Note that "title" is also used as the +(unique within a folder) identifier of the file stored in cr_items. +Thus, wrappers to the Content Repository API makes sure that the +naming convention is corect: cr_items.name attribute stores the +title of a file and all its versions, while the cr_revisions.title +attribute stores the filename of the version uploaded into the +Content Repository.

    Meta-data about a version of a file stored in Content Repository are the size of the version (stored in cr_revisions.content_length) and version notes (stored in cr_revisions.description).
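The naming convention described above can be illustrated with a small in-memory model. This is only a sketch in Python, not File Storage code: the classes `Item` and `Revision` and the helper `add_version` are hypothetical names chosen for illustration, and only the column names in the comments (`cr_items.name`, `cr_revisions.title`, `cr_revisions.content_length`, `cr_revisions.description`) come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    """Stands in for one row of cr_revisions: a single uploaded version."""
    title: str             # cr_revisions.title: filename as uploaded by the client
    content_length: int    # cr_revisions.content_length: size of this version
    description: str = ""  # cr_revisions.description: version notes


@dataclass
class Item:
    """Stands in for one row of cr_items: the "file" shared by all versions."""
    name: str  # cr_items.name: the file's title, unique among files in a folder
    revisions: List[Revision] = field(default_factory=list)


def add_version(item: Item, uploaded_filename: str, data: bytes, notes: str = "") -> Revision:
    """Hypothetical wrapper: record a new version without touching the file title."""
    revision = Revision(title=uploaded_filename,
                        content_length=len(data),
                        description=notes)
    item.revisions.append(revision)
    return revision


report = Item(name="Q3 cost estimate")
add_version(report, "estimate-draft.xls", b"abc", "first draft")
add_version(report, "estimate-final.xls", b"abcdef")
```

The file keeps one title (`report.name`) across all versions, while each version remembers the filename it was uploaded under.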

    @@ -205,7 +209,7 @@
    Sends the requested file to the user. Note that the path has the original file name, so the browser will have a sensible name if you save the file. Version downloads are supported
-by looking for the form variable version_id. We don't actually
+by looking for the form variable version_id. We don't actually
check that the version_id matches the path, we just serve it up.
    Parameters:
    @@ -280,45 +284,46 @@

    Inserting a row into the table fs_root_folders occurs the first time the package instance is visited. The reason is that there is no facility in APM to insert a row in the database everytime a -package instance is created (technically, there is no "on insert" -trigger imposed by APM on Content Repository, since they are -separate packages even though they are both part of the core). The -solution to this deficiency is a bit hack-ish, but seems to be the -only solution available (unless APM allows trigger functions to be -registered, to be caled at package instance creation). Whenever the -package instance is first visited, it calls a PL/SQL function that -calculated the "root folder" of the File Storage. If this function -detects that there is no "root folder" yet for this instance (as -would be the case when the instance is first visited), it inserts -the package id and a unique folder_id into the fs_root_folder table -to serve as the root folder identifier. It also inserts meta-data -information about this folder in cr_items table. Finally, it -returns the newly created folder identifier as the root folder for -this package instance. Subsequent visits to the package instance -will detect the root folder, and will then return the root folder -identifier.

    -

    There is an "on delete cascade" constraint imposed on the -package_id attribute of fs_root_folders. The reason for this is +package instance is created (technically, there is no "on +insert" trigger imposed by APM on Content Repository, since +they are separate packages even though they are both part of the +core). The solution to this deficiency is a bit hack-ish, but seems +to be the only solution available (unless APM allows trigger +functions to be registered, to be caled at package instance +creation). Whenever the package instance is first visited, it calls +a PL/SQL function that calculated the "root folder" of +the File Storage. If this function detects that there is no +"root folder" yet for this instance (as would be the case +when the instance is first visited), it inserts the package id and +a unique folder_id into the fs_root_folder table to serve as the +root folder identifier. It also inserts meta-data information about +this folder in cr_items table. Finally, it returns the newly +created folder identifier as the root folder for this package +instance. Subsequent visits to the package instance will detect the +root folder, and will then return the root folder identifier.

    +

    There is an "on delete cascade" constraint imposed on +the package_id attribute of fs_root_folders. The reason for this is that whenever the package instance is deleted by the site administrator, it automatically deletes the mapping between APM and the Content Repository (i.e, the package identifier and the root folder identified), and presumably the particular instance of File Storage. Unfortunately this has an undesirable effect. There is no -corresponding "on delete cascade" on the Content Repository objects -so that deleting the root folder will cause deletion of everything -under the root folder. Left on its own, the "on delete cascade" on -the package identifier attribute of fs_root_folders will cause all -objects belonging to the instance of File Storage deleted to be -orphaned in the database, since the root folder is the crucial link -from which all content is referenced!

    -

    The solution is (hopefully) more elegant: an "before on delete" -trigger that first cleans up all contents under the root folder -identifier before the root folder identifier is deleted by APM. -This trigger walks through all the contents of the instance of File -Storage, and starts deleting from the "leaves" or end nodes of the -file tree up to the root folder. Later improvements in Content -Repository will allow archiving of the contents instaed of actually -deleting them from the database.

    +corresponding "on delete cascade" on the Content +Repository objects so that deleting the root folder will cause +deletion of everything under the root folder. Left on its own, the +"on delete cascade" on the package identifier attribute +of fs_root_folders will cause all objects belonging to the instance +of File Storage deleted to be orphaned in the database, since the +root folder is the crucial link from which all content is +referenced!

    +

    The solution is (hopefully) more elegant: an "before on +delete" trigger that first cleans up all contents under the +root folder identifier before the root folder identifier is deleted +by APM. This trigger walks through all the contents of the instance +of File Storage, and starts deleting from the "leaves" or +end nodes of the file tree up to the root folder. Later +improvements in Content Repository will allow archiving of the +contents instaed of actually deleting them from the database.
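Deleting from the leaves up to the root folder corresponds to a post-order walk of the file tree. A minimal Python sketch of that ordering follows; the `children` mapping and the item names are made up for illustration, and the real cleanup is done by a database trigger rather than application code.

```python
def deletion_order(children, item_id):
    """Return ids in an order where every item comes after all of its descendants."""
    order = []
    for child in children.get(item_id, []):
        order.extend(deletion_order(children, child))
    order.append(item_id)  # the folder itself goes last
    return order


# Hypothetical tree: a root folder holding one subfolder and one file.
tree = {
    "root": ["docs", "logo.png"],
    "docs": ["contract.doc", "budget.xls"],
}
order = deletion_order(tree, "root")
```

Because a folder is only reached after its contents, nothing under the root folder is left orphaned when the root itself is finally removed.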

    VIII. User Interface

    The user interface attempts to replicate the file system metaphors familiar to most computer users, with folders containing @@ -333,15 +338,15 @@ Storage. The first parameter MaximumFileSize is the maximum size of uploaded files, which should be self-explanatory. The other parameter is a flag that indicates to the package whether -files are stored in the database or in the webserver's filesystem. -This second parameter StoreFilesInDatabaseP uses the new -capability in Content Repository to use the Content Repository as a -mere catalog to store file information while the actual file -contents are stored in the webserver's filesystem. Note that when -files are stored in the filesystem, backups of the database will -only store the catalog, but not the contents. Thus, it is important -for the site administrator to store the entire directory containing -the Content Repository files (in particular, +files are stored in the database or in the webserver's +filesystem. This second parameter StoreFilesInDatabaseP +uses the new capability in Content Repository to use the Content +Repository as a mere catalog to store file information while the +actual file contents are stored in the webserver's filesystem. +Note that when files are stored in the filesystem, backups of the +database will only store the catalog, but not the contents. Thus, +it is important for the site administrator to store the entire +directory containing the Content Repository files (in particular, pageroot/content-repository-content-files) when storing files in the fiesystem.

    When a file is stored in the Content Repository, it first @@ -375,11 +380,11 @@ that actually serves the files after you create a new site-node. This is similar to the previous issue and will probably be dealt with similarly.

  • We automatically add MIME types to cr_mime_types -if they aren't there already. However, we don't currently have a -way of entering the description at the same time, so we have to -display "application/msword" instead of "MS Word Document", for -example. We could use a method of determining the canonical long -form of a MIME type.
  • +if they aren't there already. However, we don't currently +have a way of entering the description at the same time, so we have +to display "application/msword" instead of "MS Word +Document", for example. We could use a method of determining +the canonical long form of a MIME type.

    XI. Authors

    • System creator:
Index: openacs-4/packages/file-storage/www/doc/requirements.adp
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/file-storage/www/doc/requirements.adp,v
diff -u -r1.1.2.4 -r1.1.2.5
--- openacs-4/packages/file-storage/www/doc/requirements.adp 9 Jun 2016 13:03:12 -0000 1.1.2.4
+++ openacs-4/packages/file-storage/www/doc/requirements.adp 3 Jul 2016 18:15:32 -0000 1.1.2.5
@@ -20,26 +20,28 @@
determine which individuals or groups should be allowed to read particular items and who should be allowed to upload new versions.

      -

      Since information is only useful if you can find what you're +

      Since information is only useful if you can find what you're looking for, files in the file storage system should be searchable, both from within the application and through any site-wide search facilities.

      III. System/Application Overview

      The File-Storage application will consist primarily of a user interface that allows individuals to manage their file-storage
-folder(s) and to see other people's publicly accessible files.
+folder(s) and to see other people's publicly accessible
+files.

      IV. Use Case and User Scenarios

      Using File-Storage to Run a Project

      -

      In the course of her job at Acme Publishing Company, Ursula -User is working with people from several different offices with -whom she needs to exchange pictures and Excel spreadsheets -detailing cost estimates, and collaboratively write contracts using -Word. At any time, she and the other people she works with need to -be able to find the current copy of each of these documents - and -be able to look at older versions if need be to track the evolution -of the project. If the project is large, Ursula will also need to -be able to find all the documents pertaining to a particular issue -- so she will need a full-text search feature.

      +

      In the course of her job at Acme Publishing Company, +Ursula User is working with people from several +different offices with whom she needs to exchange pictures and +Excel spreadsheets detailing cost estimates, and collaboratively +write contracts using Word. At any time, she and the other people +she works with need to be able to find the current copy of each of +these documents - and be able to look at older versions if need be +to track the evolution of the project. If the project is large, +Ursula will also need to be able to find all the documents +pertaining to a particular issue - so she will need a full-text +search feature.

      For each project, Ursula makes a folder on the file-storage system and gives read, write, and edit permission to the group of people she is working with for that project. Then she makes
@@ -53,22 +55,22 @@
outside the group their opinion so she gives them read access to just one version of a file so that they can download it and take a look. Sometimes production tasks change; if so, Ursula can
-rearrange the project's sub-folder hierarchy to make it more
+rearrange the project's sub-folder hierarchy to make it more
closely reflect the new organizational scheme. When a project is completed, if Ursula is considerate of the maintainers of the site and of other users, she will clean-up after herself, downloading the canonical version of all the documents to her local machine and deleting the files from the server.

      Administer File-Storage

      -Annie Admin primarily has the job of periodically -cleaning up after users. If disk space is tight on the server, she -may want to look for files that haven't been accessed in a long -time and either encourage the owners of those files to delete -anything they don't need on the server anymore or delete files -herself if the user can't be contacted or is unresponsive. -Depending on the precise permissions implementation, Annie may -occasionally need to intercede when the owner of a file +Annie Admin primarily has the job of +periodically cleaning up after users. If disk space is tight on the +server, she may want to look for files that haven't been +accessed in a long time and either encourage the owners of those +files to delete anything they don't need on the server anymore +or delete files herself if the user can't be contacted or is +unresponsive. Depending on the precise permissions implementation, +Annie may occasionally need to intercede when the owner of a file accidentally revokes their own permission to access the file.

      V. Related Links

        @@ -77,58 +79,62 @@

        VI.A. Requirements: Data Model

        10 The Data Model

        -10.1 each file should have a unique identifier

        -

        -10.2 each version of a file should have a unique +10.1 each file should have a unique identifier

        -10.3 each file should have an associated owner

        +10.2 each version of a file should have a +unique identifier

        -10.4 each version should have an associated owner

        +10.3 each file should have an associated +owner

        -10.5 files will be organized in a hierarchical set of -folders

        +10.4 each version should have an associated +owner

        -10.6 each version of each file will have individual read, -write, delete, comment, and administer permissions associated with -it

        +10.5 files will be organized in a hierarchical +set of folders

        +

        +10.6 each version of each file will have +individual read, write, delete, comment, and administer permissions +associated with it

        VI.B. Requirements: Administrator Interface

        20 Administrator Interface

        -20.1 the administrator should be able to view all files -in the file-storage system

        +20.1 the administrator should be able to view +all files in the file-storage system

        -20.2 the administrator should be able to edit, delete, or -alter permissions for any file belonging to any user

        +20.2 the administrator should be able to edit, +delete, or alter permissions for any file belonging to any user

        VI.C. Requirements: User Interface

        30 User Interface

        -30.1 a user should be able to create folders and -subfolders in which he can place his files

        +30.1 a user should be able to create folders +and subfolders in which he can place his files

        -30.2 a user should be able to add new files and new -versions of files

        +30.2 a user should be able to add new files and +new versions of files

        -30.3 a user should be able to move files to different -folders or sub-folders

        +30.3 a user should be able to move files to +different folders or sub-folders

        -30.4 a user should be able to delete folders and -individual files

        +30.4 a user should be able to delete folders +and individual files

        -30.5 a user should be able to specify permissions for any -user or group on any folder, file, or version.

        +30.5 a user should be able to specify +permissions for any user or group on any folder, file, or +version.

        -30.6 a user should be able to download any version which -is accessible to him

        +30.6 a user should be able to download any +version which is accessible to him

        -30.7 a user should be able to view and/or edit other -user's files if the user has been granted individual or group -permission with access to the files

        +30.7 a user should be able to view and/or edit +other user's files if the user has been granted individual or +group permission with access to the files

        -30.8 a user should be able to search the text of the -documents stored in the file-storage system (requires full-text -search capability from the database - in the case of Oracle, -requires InterMedia)

        +30.8 a user should be able to search the text +of the documents stored in the file-storage system (requires +full-text search capability from the database - in the case of +Oracle, requires InterMedia)

        VII. Revision History