Index: openacs-4/packages/file-storage/www/doc/design.adp
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/file-storage/www/doc/design.adp,v
diff -u -r1.1.2.1 -r1.1.2.2
--- openacs-4/packages/file-storage/www/doc/design.adp 20 Aug 2015 17:47:47 -0000 1.1.2.1
+++ openacs-4/packages/file-storage/www/doc/design.adp 25 Aug 2015 18:02:21 -0000 1.1.2.2
@@ -2,84 +2,110 @@
 {/doc/file-storage {File Storage}} {File Storage Design Document} File Storage Design Document - -

File Storage Design Document

-by Kevin Scaldeferri, + +by Kevin Scaldeferri +, modified by Jowell S. -Sabino for OpenACS. -

I. Essentials

+

II. Introduction

+

We have our own file-storage application because we want all users to be able to collaboratively maintain a set of documents. Specifically, users can save their files to our server so that they -may:

+

We want something that is relatively secure, and can be extended and maintained by any ArsDigita programmer, i.e., something -that requires only AOLserver Tcl and Oracle skills.

In ACS 4, File Storage can be implemented on top of the Content +that requires only AOLserver Tcl and Oracle skills.

+

In ACS 4, File Storage can be implemented on top of the Content Repository. Thus, there is no data model associated with File Storage. It is only a UI and a small set of Tcl and PL/SQL library procedures. The actual storage and versioning is relegated to the -Content Repository.

III. Historical Considerations

File Storage was created to provide a mechanism for +Content Repository.

+

III. Historical Considerations

+

File Storage was created to provide a mechanism for non-technical users to collaborate on a wide range of documents, with minimum sysadmin overhead. Specifically, it allowed clients to exchange design documents (often MS Word, Adobe PDF, or other proprietary desktop file formats) that changed frequently without -having to get bogged down by sifting through multiple versions.

IV. Competitive Analysis

Why is a file-storage application useful?

If you simply give everyone FTP access to a Web-accessible +having to get bogged down by sifting through multiple versions.

+

IV. Competitive Analysis

+

Why is a file-storage application useful?

+

If you simply give everyone FTP access to a Web-accessible directory, you are running some big security risks. FTP is insecure and passwords are transmitted in the clear. A cracker might sniff a password, upload Perl scripts or ADP pages, then grab those URLs from a Web browser. The cracker is now executing arbitrary code on your server with all the privileges that you've given your Web -server.

The File Storage application is not a web-based file system, and +server.

+

The File Storage application is not a web-based file system, and cannot be fairly compared against such systems. The role of File Storage is to provide a simple web location where users can share a versioned document. It does not allow much functionality with respect to aggregate file administration (e.g., selecting all files -of a given type, or searching through specified file types).

V. Design Tradeoffs

Folder Permissions

Previous versions of File Storage have not included folder +of a given type, or searching through specified file types).

+

V. Design Tradeoffs

+

Folder Permissions

+

Previous versions of File Storage have not included folder permissions. (However they did have a concept of private group trees.) The reasons for this were to simplify the code and the user experience. However, this system actually caused some confusion (e.g., explicitly granting permission to an outsider on a file in a group's private tree did not actually give that person access to the file) and was not as flexible as people desired. The ACS 4 version includes folder read, write and delete -permissions.

Note that this can create some funny results. For example, a +permissions.

+

Note that this can create some funny results. For example, a user might have write permission on a folder, but not on some of its parent folders. This can cause the select box provided for -moving and copying files to look odd or misleading.

Deletion of Files

Previous versions of File Storage allowed only administrators to +moving and copying files to look odd or misleading.

+

Deletion of Files

+

Previous versions of File Storage allowed only administrators to actually delete content (although users could mark content as "deleted" using a toggle in the data model, deleted_p.) However, the proper use of versioning should allow users to avoid accidentally losing their files. So, in this version, if a person -asks to delete a version or a file, we really delete it.

Use of Content Repository

Basing this system on the Content Repository provides a wealth +asks to delete a version or a file, we really delete it.

+

Use of Content Repository

+

Basing this system on the Content Repository provides a wealth of useful functionality for File Storage with no additional development costs. However, it may also constrain the system -somewhat.

The Content Repository's datamodel has been extended to include +somewhat.

+

The Content Repository's datamodel has been extended to include an attribute to store the filesize. Unfortunately, the Content Repository does not automatically do this, since files may be stored on the filesystem (the Content Repository thus serving as a catalog to keep track of file location and some metadata, but not the filesize). The filesize is therefore calculated whenever a file is inserted in the Content Repository by the external program (the webserver's database driver) doing the insertion into the -database.

The content_revision is subtyped as a "file-storage-item" to +database.

+

The content_revision is subtyped as a "file-storage-item" to allow site-wide search to distinguish file storage objects in its -search results. This feature is not implemented yet, however.

Permissions Design

Permissions were chosen to make as much use as possible of the +search results. This feature is not implemented yet, however.

+

Permissions Design

+

Permissions were chosen to make as much use as possible of the predefined privileges while keeping the connotative value of each privilege clear. The permissions scheme is vaguely modeled off Unix file permissions, with somewhat less overloading. In particular, we define a delete privilege rather than overloading the write permission. Also, execute privileges have no meaning in this -context.

+context.

+
@@ -92,27 +118,35 @@ -
FolderFileVersion
adminmodify permission grants and read, write and delete privileges

Some notes: the admin privilege implies the read, write and + +

Some notes: the admin privilege implies the read, write and delete privileges. It may be the case that a user has delete permission on a folder or file, but not on some of its child items. This will block attempts to delete the parent item. Finally, the -write permission does not have any meaning for versions.

VI. API

For the most part, File Storage will provide wrappers to the -Content Repository APIs.

PL/SQL API

File Storage provides public PL/SQL APIs either as wrappers to +write permission does not have any meaning for versions.

+

VI. API

+

For the most part, File Storage will provide wrappers to the +Content Repository APIs.

+

PL/SQL API

+

File Storage provides public PL/SQL APIs either as wrappers to the Content Repository API, or as more involved functions that call multiple Content Repository PL/SQL functions. One reason for doing this is to abstract from the Content Repository datamodel and naming conventions, due to the different way File Storage labels -its objects.

The main objects of File Storage are "folders" and "files". A +its objects.

+

The main objects of File Storage are "folders" and "files". A "folder" is analogous to a subdirectory in the Unix/Windows-world filesystem. Folder objects are stored as Content Repository folders, -thus folders are stored "as is" in the Content Repository.

"Files", however, can cause some confusion when stored in the +thus folders are stored "as is" in the Content Repository.

+

"Files", however, can cause some confusion when stored in the Content Repository. A "file" in File Storage consists of meta-data, and possibly multiple versions of the file's contents. The main meta-data of a "file" is its "title", which is stored in the Content Repository's "name" attribute of the cr_items table. The "title" of a file should be unique within a subdirectory, although a directory may contain a file and a folder with the same -"title".

Each version of a file is stored as a revision in cr_revisions +"title".

+

Each version of a file is stored as a revision in cr_revisions table of Content Repository. The Content Repository also allows some meta-data about a version to be stored in this table, and indeed File Storage uses attributes of the cr_revisions table are @@ -124,13 +158,17 @@ Content Repository API makes sure that the naming convention is correct: cr_items.name attribute stores the title of a file and all its versions, while the cr_revisions.title attribute stores the -filename of the version uploaded into the Content Repository.

Meta-data about a version of a file stored in Content Repository +filename of the version uploaded into the Content Repository.

+

Meta-data about a version of a file stored in Content Repository are the size of the version (stored in cr_revisions.content_length) -and version notes (stored in cr_revisions.description).

There are two internal PL/SQL functions that do not call the +and version notes (stored in cr_revisions.description).

+

There are two internal PL/SQL functions that do not call the Content Repository API, however: get_root_folder and new_root_folder, defined in the file_storage PL/SQL package -

Tcl API

+

+

Tcl API

+

children_have_permission_p

 children_have_permission_p [ -user_id user_id ] item_id privilege
 
This procedure, given a content item and a privilege, @@ -144,7 +182,8 @@
-
+
+

fs_context_bar_list

 fs_context_bar_list [ -final final ] item_id
 
Constructs the list to be fed to ad_context_bar @@ -159,7 +198,8 @@
-
+
+

fs_file_downloader

 fs_file_downloader connkey
 
Sends the requested file to the user. Note that the @@ -173,7 +213,8 @@
-
+
+

fs_file_p

 fs_file_p file_id
 
Returns 1 if the file_id corresponds to a file in the @@ -184,7 +225,8 @@
-
+
+

fs_folder_p

 fs_folder_p folder_id
 
Returns 1 if the folder_id corresponds to a folder in @@ -195,7 +237,8 @@
-
+
+

fs_get_folder_name

 fs_get_folder_name folder_id
 
Returns the name of a folder. @@ -205,7 +248,8 @@
-
+
+

fs_root_folder

 fs_root_folder [ -package_id package_id ]
 
Returns the root folder for the file storage system. @@ -215,7 +259,8 @@
-
+
+

fs_version_p

 fs_version_p version_id
 
Returns 1 if the version_id corresponds to a version in @@ -226,10 +271,13 @@
-

VII. Data Model Discussion

File Storage uses only the Content Repository data model. There +

+

VII. Data Model Discussion

+

File Storage uses only the Content Repository data model. There is one additional table, fs_root_folders, which maps between package instances and the corresponding root folders in the -Content Repository.

Inserting a row into the table fs_root_folders occurs the first +Content Repository.

+

Inserting a row into the table fs_root_folders occurs the first time the package instance is visited. The reason is that there is no facility in APM to insert a row in the database every time a package instance is created (technically, there is no "on insert" @@ -248,7 +296,8 @@ returns the newly created folder identifier as the root folder for this package instance. Subsequent visits to the package instance will detect the root folder, and will then return the root folder -identifier.

There is an "on delete cascade" constraint imposed on the +identifier.

+

There is an "on delete cascade" constraint imposed on the package_id attribute of fs_root_folders. The reason for this is that whenever the package instance is deleted by the site administrator, it automatically deletes the mapping between APM and @@ -261,21 +310,26 @@ the package identifier attribute of fs_root_folders will cause all objects belonging to the instance of File Storage deleted to be orphaned in the database, since the root folder is the crucial link -from which all content is referenced!

The solution is (hopefully) more elegant: a "before on delete" +from which all content is referenced!

+

The solution is (hopefully) more elegant: a "before on delete" trigger that first cleans up all contents under the root folder identifier before the root folder identifier is deleted by APM. This trigger walks through all the contents of the instance of File Storage, and starts deleting from the "leaves" or end nodes of the file tree up to the root folder. Later improvements in Content Repository will allow archiving of the contents instead of actually -deleting them from the database.

VIII. User Interface

The user interface attempts to replicate the file system +deleting them from the database.

+

VIII. User Interface

+

The user interface attempts to replicate the file system metaphors familiar to most computer users, with folders containing files. Adding files and folders are hyperlinked options, and a web form is used to handle the search function. Files and folders are presented with size, type, and modification date, alongside hyperlinks to the appropriate actions for a given file. Admin functions will be presented alongside the normal user action when -appropriate.

IX. Configuration/Parameters

There are two configuration parameters in this version of File +appropriate.

+

IX. Configuration/Parameters

+

There are two configuration parameters in this version of File Storage. The first parameter MaximumFileSize is the maximum size of uploaded files, which should be self-explanatory. The other parameter is a flag that indicates to the package whether @@ -289,7 +343,8 @@ for the site administrator to store the entire directory containing the Content Repository files (in particular, pageroot/content-repository-content-files) when storing -files in the filesystem.

When a file is stored in the Content Repository, it first +files in the filesystem.

+

When a file is stored in the Content Repository, it first queries the parameter StoreFilesInDatabaseP to determine how the new file will be stored. Thus, it is important that this parameter should be changed only at package instance creation, or @@ -299,8 +354,11 @@ the file is uploaded. Although all functionality provided by File Storage will continue to work (copy, move, delete, etc.), backing up the contents will be more complicated if the parameter is -changed.

All of the other parameters in previous versions have been made -obsolete by ACS 4 features like site-nodes and templating.

X. Future Improvements/Areas of Likely Change

+

XI. Authors

+ +
3.x : David Hill and Aurelius Prochazka
4.x : Kevin Scaldeferri -
Kevin -Scaldeferri
Kevin -Scaldeferri

XII. Revision History

+ + +
Kevin +Scaldeferri
+ +
Kevin +Scaldeferri
+

XII. Revision History

+
@@ -339,5 +406,6 @@ -
Document Revision # | Action Taken, Notes | When? | By Whom?
0.2 | Revised after review by Josh | 11/16/2000 | Kevin Scaldeferri, Josh Finkler

kevin\@arsdigita.com - + +
+kevin\@arsdigita.com