Index: openacs-4/packages/search/www/doc/guidelines.adp
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/search/www/doc/guidelines.adp,v
diff -u -N
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ openacs-4/packages/search/www/doc/guidelines.adp 20 Aug 2015 17:47:50 -0000 1.1.2.1
@@ -0,0 +1,146 @@
+
+{/doc/search {Search}} {How to make an object type searchable?}
+How to make an object type searchable?
+
+
+

How to make an object type searchable?

+by Neophytos Demetriou (k2pts\@cytanet.com.cy) +
+Making an object type searchable involves three steps: +

Choose the object type

+In most cases, choosing the object type is straightforward. +However, if your object type uses the content repository, then you +should make sure that your object type is a subclass of the +"content_revision" class. You should also make sure all content is +created using that subclass, rather than simply creating content with +the "content_revision" type. +

Implement FtsContentProvider

+FtsContentProvider consists of two abstract operations, namely +datasource and url. The specification for +these operations can be found in +packages/search/sql/postgresql/search-sc-create.sql. +You have to implement these operations for your object type by +writing concrete functions that follow the specification. For +example, the implementation of datasource for the +object type note looks like this: +
ad_proc notes__datasource {
+    object_id
+} {
+    \@author Neophytos Demetriou
+} {
+    db_0or1row notes_datasource {
+        select n.note_id as object_id, 
+               n.title as title, 
+               n.body as content,
+               'text/plain' as mime,
+               '' as keywords,
+               'text' as storage_type
+        from notes n
+        where note_id = :object_id
+    } -column_array datasource
+
+    return [array get datasource]
+}
+
+When you are done implementing the +FtsContentProvider operations, you should let the +system know about your implementation. This is accomplished by an SQL +file which associates the implementation with a contract name. The +implementation of FtsContentProvider for the object +type note looks like this: +
select acs_sc_impl__new(
+           'FtsContentProvider',                -- impl_contract_name
+           'note',                              -- impl_name
+           'notes'                              -- impl_owner_name
+);
+
+You should adapt this association to reflect your implementation. +That is, change impl_name to your object type and +impl_owner_name to the package key. Next, you have +to create associations between the operations of +FtsContentProvider and your concrete functions. Here's +what an association between an operation and a concrete function +looks like: +
select acs_sc_impl_alias__new(
+           'FtsContentProvider',                -- impl_contract_name
+           'note',                              -- impl_name
+           'datasource',                        -- impl_operation_name
+           'notes__datasource',                 -- impl_alias
+           'TCL'                                -- impl_pl
+);
+
+Again, you have to make some changes. Change the +impl_name from note to your object type +and the impl_alias from notes__datasource +to the name that you gave to the function that implements the +operation datasource. +
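+The url operation needs the same kind of association. A minimal sketch, assuming the Tcl proc that implements it is named notes__url (a hypothetical name; the text above only shows notes__datasource):

```sql
-- Sketch only: 'notes__url' is an assumed proc name, not taken from the notes package.
select acs_sc_impl_alias__new(
           'FtsContentProvider',                -- impl_contract_name
           'note',                              -- impl_name
           'url',                               -- impl_operation_name
           'notes__url',                        -- impl_alias (assumed)
           'TCL'                                -- impl_pl
);
```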

Add triggers

+If your object type uses the content repository to store its items, +then you are done. If not, an extra step is required to inform the +search_observer_queue of new content items, updates or deletions. +We do this by adding triggers on the table that stores the content +items of your object type. Here's what that part looks like for +note. +
create function notes__itrg ()
+returns trigger as $$
+begin
+    -- queue the newly inserted note for indexing
+    perform search_observer__enqueue(new.note_id,'INSERT');
+    return new;
+end;
+$$ language plpgsql;
+
+create function notes__dtrg ()
+returns trigger as $$
+begin
+    -- queue the deleted note for removal from the index
+    perform search_observer__enqueue(old.note_id,'DELETE');
+    return old;
+end;
+$$ language plpgsql;
+
+create function notes__utrg ()
+returns trigger as $$
+begin
+    -- queue the updated note for re-indexing
+    perform search_observer__enqueue(old.note_id,'UPDATE');
+    return old;
+end;
+$$ language plpgsql;
+
+
+create trigger notes__itrg after insert on notes
+for each row execute procedure notes__itrg (); 
+
+create trigger notes__dtrg after delete on notes
+for each row execute procedure notes__dtrg (); 
+
+create trigger notes__utrg after update on notes
+for each row execute procedure notes__utrg (); 
+
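+If the package also ships a drop script, the triggers and their functions have to be removed again. A sketch of what that could look like for note, assuming the names created above. Whether and where your package needs this is an assumption, not something this guide prescribes:

```sql
-- Sketch only: undo the trigger setup shown above for the notes table.
-- Drop each trigger first, then the function it executes.
drop trigger notes__utrg on notes;
drop function notes__utrg ();

drop trigger notes__dtrg on notes;
drop function notes__dtrg ();

drop trigger notes__itrg on notes;
drop function notes__itrg ();
```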

Questions & Answers

    +
  1. Q: If content is some binary file (like a pdf file stored in +file storage, for example), will the content still be +indexable/searchable?

    +A: For each mime type we require a handler. Once the +handler is available, e.g. pdf2txt, it is very easy to incorporate +support for that mime type into the search package. Content items +with unsupported mime types will be ignored by the indexer.

    +
  2. Q: Can the search package handle lobs and files?

    +A: Yes, the search package will convert everything into text based +on the content and storage_type attributes. Here is the convention +to use while writing the implementation of datasource:

      +
    • Content is a filename when storage_type='file'.
    • Content is a lob id when storage_type='lob'.
    • Content is text when storage_type='text'.
+
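+To illustrate the convention, here is a sketch of the query a datasource implementation might run for file-backed content. The table docs and its columns doc_id, title, file_path and mime_type are hypothetical; :object_id is the bind variable supplied by the surrounding Tcl proc, as in notes__datasource above:

```sql
-- Sketch only: 'docs' and its columns are assumed names for illustration.
select d.doc_id as object_id,
       d.title as title,
       d.file_path as content,      -- a filename, because storage_type is 'file'
       d.mime_type as mime,         -- e.g. 'application/pdf'
       '' as keywords,
       'file' as storage_type
from docs d
where d.doc_id = :object_id
```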